[0/4] [AArch64] Add SVE support

public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed

* [0/4] [AArch64] Add SVE support
@ 2017-11-03 17:45 Richard Sandiford
  2017-11-03 17:48 ` [1/4] [AArch64] SVE backend support Richard Sandiford
                   ` (4 more replies)
  0 siblings, 5 replies; 18+ messages in thread
From: Richard Sandiford @ 2017-11-03 17:45 UTC (permalink / raw)
  To: gcc-patches; +Cc: richard.earnshaw, james.greenhalgh, marcus.shawcroft

This series adds support for ARM's Scalable Vector Extension.
More details on SVE can be found here:

  https://developer.arm.com/products/architecture/a-profile/docs/arm-architecture-reference-manual-supplement-armv8-a

There are four parts for ease of review, but it probably makes
sense to commit them as one patch.

The series plugs SVE into the current vectorisation framework without
adding any new features to the framework itself.  This means for example
that vector loops still handle full vectors, with a scalar epilogue loop
being needed for the rest.  Later patches add support for other features
like fully-predicated loops.

The patches build on top of the various series that I've already posted.
Sorry that there were so many, and thanks again for all the reviews.

Tested on aarch64-linux-gnu without SVE and aarch64-linux-gnu with SVE
(in the default vector-length agnostic mode).  Also tested with
-msve-vector-bits=256 and -msve-vector-bits=512 to select 256-bit
and 512-bit SVE registers.

Thanks,
Richard

^ permalink raw reply	[flat|nested] 18+ messages in thread

* [1/4] [AArch64] SVE backend support
  2017-11-03 17:45 [0/4] [AArch64] Add SVE support Richard Sandiford
@ 2017-11-03 17:48 ` Richard Sandiford
  2018-01-05 11:41   ` Richard Sandiford
  2017-11-03 17:50 ` [2/4] [AArch64] Testsuite markup for SVE Richard Sandiford
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 18+ messages in thread
From: Richard Sandiford @ 2017-11-03 17:48 UTC (permalink / raw)
  To: gcc-patches; +Cc: richard.earnshaw, james.greenhalgh, marcus.shawcroft

[-- Attachment #1: Type: text/plain, Size: 15840 bytes --]

This patch adds support for ARM's Scalable Vector Extension.
The patch just contains the core features that work with the
current vectoriser framework; later patches will add extra
capabilities to both the target-independent code and AArch64 code.
The patch doesn't include:

- support for unwinding frames whose size depends on the vector length
- modelling the effect of __tls_get_addr on the SVE registers

These are handled by later patches instead.

Some notes:

- The copyright years for aarch64-sve.md start at 2009 because some of
  the code is based on aarch64.md, which also starts from then.

- The patch inserts spaces between items in the AArch64 section
  of sourcebuild.texi.  This matches at least the surrounding
  architectures and looks a little nicer in the info output.

- aarch64-sve.md includes a pattern:

    while_ult<GPI:mode><PRED_ALL:mode>

  A later patch adds a matching "while_ult" optab, but the pattern
  is also needed by the predicate vec_duplicate expander.


2017-11-03  Richard Sandiford  <richard.sandiford@linaro.org>
	    Alan Hayward  <alan.hayward@arm.com>
	    David Sherwood  <david.sherwood@arm.com>

gcc/
	* doc/invoke.texi (-msve-vector-bits=): Document new option.
	(sve): Document new AArch64 extension.
	* doc/md.texi (w): Extend the description of the AArch64
	constraint to include SVE vectors.
	(Upl, Upa): Document new AArch64 predicate constraints.
	* config/aarch64/aarch64-opts.h (aarch64_sve_vector_bits_enum): New
	enum.
	* config/aarch64/aarch64.opt (sve_vector_bits): New enum.
	(msve-vector-bits=): New option.
	* config/aarch64/aarch64-option-extensions.def (fp, simd): Disable
	SVE when these are disabled.
	(sve): New extension.
	* config/aarch64/aarch64-modes.def: Define SVE vector and predicate
	modes.  Adjust their number of units based on aarch64_sve_vg.
	(MAX_BITSIZE_MODE_ANY_MODE): Define.
	* config/aarch64/aarch64-protos.h (ADDR_QUERY_ANY): New
	aarch64_addr_query_type.
	(aarch64_const_vec_all_same_in_range_p, aarch64_sve_pred_mode)
	(aarch64_sve_cnt_immediate_p, aarch64_sve_addvl_addpl_immediate_p)
	(aarch64_sve_inc_dec_immediate_p, aarch64_add_offset_temporaries)
	(aarch64_split_add_offset, aarch64_output_sve_cnt_immediate)
	(aarch64_output_sve_addvl_addpl, aarch64_output_sve_inc_dec_immediate)
	(aarch64_output_sve_mov_immediate, aarch64_output_ptrue): Declare.
	(aarch64_simd_imm_zero_p): Delete.
	(aarch64_check_zero_based_sve_index_immediate): Declare.
	(aarch64_sve_index_immediate_p, aarch64_sve_arith_immediate_p)
	(aarch64_sve_bitmask_immediate_p, aarch64_sve_dup_immediate_p)
	(aarch64_sve_cmp_immediate_p, aarch64_sve_float_arith_immediate_p)
	(aarch64_sve_float_mul_immediate_p): Likewise.
	(aarch64_classify_symbol): Take the offset as a HOST_WIDE_INT
	rather than an rtx.
	(aarch64_sve_ld1r_operand_p, aarch64_sve_ldr_operand_p): Declare.
	(aarch64_expand_mov_immediate): Take a gen_vec_duplicate callback.
	(aarch64_emit_sve_pred_move, aarch64_expand_sve_mem_move): Declare.
	(aarch64_expand_sve_vec_cmp_int, aarch64_expand_sve_vec_cmp_float)
	(aarch64_expand_sve_vcond, aarch64_expand_sve_vec_perm): Declare.
	(aarch64_regmode_natural_size): Likewise.
	* config/aarch64/aarch64.h (AARCH64_FL_SVE): New macro.
	(AARCH64_FL_V8_3, AARCH64_FL_RCPC, AARCH64_FL_DOTPROD): Shift
	left one place.
	(AARCH64_ISA_SVE, TARGET_SVE): New macros.
	(FIXED_REGISTERS, CALL_USED_REGISTERS, REGISTER_NAMES): Add entries
	for VG and the SVE predicate registers.
	(V_ALIASES): Add a "z"-prefixed alias.
	(FIRST_PSEUDO_REGISTER): Change to P15_REGNUM + 1.
	(AARCH64_DWARF_VG, AARCH64_DWARF_P0): New macros.
	(PR_REGNUM_P, PR_LO_REGNUM_P): Likewise.
	(PR_LO_REGS, PR_HI_REGS, PR_REGS): New reg_classes.
	(REG_CLASS_NAMES): Add entries for them.
	(REG_CLASS_CONTENTS): Likewise.  Update ALL_REGS to include VG
	and the predicate registers.
	(aarch64_sve_vg): Declare.
	(BITS_PER_SVE_VECTOR, BYTES_PER_SVE_VECTOR, BYTES_PER_SVE_PRED)
	(SVE_BYTE_MODE, MAX_COMPILE_TIME_VEC_BYTES): New macros.
	(REGMODE_NATURAL_SIZE): Define.
	* config/aarch64/aarch64-c.c (aarch64_update_cpp_builtins): Handle
	SVE macros.
	* config/aarch64/aarch64.c: Include cfgrtl.h.
	(simd_immediate_info): Add a constructor for series vectors,
	and an associated step field.
	(aarch64_sve_vg): New variable.
	(aarch64_dbx_register_number): Handle VG and the predicate registers.
	(aarch64_vect_struct_mode_p, aarch64_vector_mode_p): Delete.
	(VEC_ADVSIMD, VEC_SVE_DATA, VEC_SVE_PRED, VEC_STRUCT, VEC_ANY_SVE)
	(VEC_ANY_DATA, VEC_STRUCT): New constants.
	(aarch64_advsimd_struct_mode_p, aarch64_sve_pred_mode_p)
	(aarch64_classify_vector_mode, aarch64_vector_data_mode_p)
	(aarch64_sve_data_mode_p, aarch64_pred_mode, aarch64_get_mask_mode):
	New functions.
	(aarch64_hard_regno_nregs): Handle SVE data modes for FP_REGS
	and FP_LO_REGS.  Handle PR_REGS, PR_LO_REGS and PR_HI_REGS.
	(aarch64_hard_regno_mode_ok): Handle VG.  Also handle the SVE
	predicate modes and predicate registers.  Explicitly restrict
	GPRs to modes of 16 bytes or smaller.  Only allow FP registers
	to store a vector mode if it is recognized by
	aarch64_classify_vector_mode.
	(aarch64_regmode_natural_size): New function.
	(aarch64_hard_regno_caller_save_mode): Return the original mode
	for predicates.
	(aarch64_sve_cnt_immediate_p, aarch64_output_sve_cnt_immediate)
	(aarch64_sve_addvl_addpl_immediate_p, aarch64_output_sve_addvl_addpl)
	(aarch64_sve_inc_dec_immediate_p, aarch64_output_sve_inc_dec_immediate)
	(aarch64_add_offset_1_temporaries, aarch64_offset_temporaries): New
	functions.
	(aarch64_add_offset): Add a temp2 parameter.  Assert that temp1
	does not overlap dest if the function is frame-related.  Handle
	SVE constants.
	(aarch64_split_add_offset): New function.
	(aarch64_add_sp, aarch64_sub_sp): Add temp2 parameters and pass
	them aarch64_add_offset.
	(aarch64_allocate_and_probe_stack_space): Add a temp2 parameter
	and update call to aarch64_sub_sp.
	(aarch64_add_cfa_expression): New function.
	(aarch64_expand_prologue): Pass extra temporary registers to the
	functions above.  Handle the case in which we need to emit new
	DW_CFA_expressions for registers that were originally saved
	relative to the stack pointer, but now have to be expressed
	relative to the frame pointer.
	(aarch64_output_mi_thunk): Pass extra temporary registers to the
	functions above.
	(aarch64_expand_epilogue): Likewise.  Prevent inheritance of
	IP0 and IP1 values for SVE frames.
	(aarch64_expand_vec_series): New function.
	(aarch64_expand_mov_immediate): Add a gen_vec_duplicate parameter.
	Handle SVE constants.
	(aarch64_emit_sve_pred_move, aarch64_expand_sve_mem_move)
	(aarch64_get_reg_raw_mode, offset_4bit_signed_scaled_p)
	(offset_6bit_unsigned_scaled_p, aarch64_offset_7bit_signed_scaled_p)
	(offset_9bit_signed_scaled_p): New functions.
	(aarch64_replicate_bitmask_imm): New function.
	(aarch64_bitmask_imm): Use it.
	(aarch64_cannot_force_const_mem): Reject expressions involving
	a CONST_POLY_INT.  Update call to aarch64_classify_symbol.
	(aarch64_classify_index): Handle SVE indices, by requiring
	a plain register index with a scale that matches the element size.
	(aarch64_classify_address): Handle SVE addresses.  Assert that
	the mode of the address is VOIDmode or an integer mode.
	Update call to aarch64_classify_symbol.
	(aarch64_classify_symbolic_expression): Update call to
	aarch64_classify_symbol.
	(aarch64_const_vec_all_same_in_range_p): Extend to VEC_DUPLICATE
	constants by using const_vec_duplicate_p.
	(aarch64_const_vec_all_in_range_p): New function.
	(aarch64_print_vector_float_operand): Likewise.
	(aarch64_print_operand): Handle 'N' and 'C'.  Use "zN" rather than
	"vN" for FP registers with SVE modes.  Handle (const ...) vectors
	and the FP immediates 1.0 and 0.5.
	(aarch64_print_operand_address): Use ADDR_QUERY_ANY.  Handle
	SVE addresses.
	(aarch64_regno_regclass): Handle predicate registers.
	(aarch64_secondary_reload): Handle big-endian reloads of SVE
	data modes.
	(aarch64_class_max_nregs): Handle SVE modes and predicate registers.
	(aarch64_rtx_costs): Check for ADDVL and ADDPL instructions.
	(aarch64_convert_sve_vector_bits): New function.
	(aarch64_override_options): Use it to handle -msve-vector-bits=.
	(aarch64_classify_symbol): Take the offset as a HOST_WIDE_INT
	rather than an rtx.
	(aarch64_legitimate_constant_p): Use aarch64_classify_vector_mode.
	Handle SVE vector and predicate modes.  Accept VL-based constants
	that need only one temporary register.  Only call
	aarch64_constant_address_p if the constant is a scalar integer.
	(aarch64_conditional_register_usage): Mark the predicate registers
	as fixed if SVE isn't available.
	(aarch64_vector_mode_supported_p): Use aarch64_classify_vector_mode.
	Return true for SVE vector and predicate modes.
	(aarch64_simd_container_mode): Take the number of bits as a poly_int64
	rather than an unsigned int.  Handle SVE modes.
	(aarch64_preferred_simd_mode): Update call accordingly.  Handle
	SVE modes.
	(aarch64_autovectorize_vector_sizes): Add BYTES_PER_SVE_VECTOR
	if SVE is enabled.
	(aarch64_sve_index_immediate_p, aarch64_sve_arith_immediate_p)
	(aarch64_sve_bitmask_immediate_p, aarch64_sve_dup_immediate_p)
	(aarch64_sve_cmp_immediate_p, aarch64_sve_float_arith_immediate_p)
	(aarch64_sve_float_mul_immediate_p): New functions.
	(aarch64_sve_valid_immediate): New function.
	(aarch64_simd_valid_immediate): Use it as the fallback for SVE vectors.
	Explicitly reject structure modes.  Check for INDEX constants.
	Handle PTRUE and PFALSE constants.
	(aarch64_check_zero_based_sve_index_immediate): New function.
	(aarch64_simd_imm_zero_p): Delete.
	(aarch64_mov_operand_p): Use aarch64_simd_valid_immediate for
	vector modes.  Accept constants in the range of CNT[BHWD].
	(aarch64_simd_scalar_immediate_valid_for_move): Explicitly
	ask for an Advanced SIMD mode.
	(aarch64_sve_ld1r_operand_p, aarch64_sve_ldr_operand_p): New functions.
	(aarch64_simd_vector_alignment): Handle SVE predicates.
	(aarch64_vectorize_preferred_vector_alignment): New function.
	(aarch64_simd_vector_alignment_reachable): Use it instead of
	the vector size.
	(aarch64_shift_truncation_mask): Use aarch64_vector_data_mode_p.
	(aarch64_output_sve_mov_immediate, aarch64_output_ptrue): New
	functions.
	(MAX_VECT_LEN): Delete.
	(expand_vec_perm_d): Add a vec_flags field.
	(emit_unspec2, aarch64_expand_sve_vec_perm): New functions.
	(aarch64_evpc_trn, aarch64_evpc_uzp, aarch64_evpc_zip)
	(aarch64_evpc_ext): Don't apply a big-endian lane correction
	for SVE modes.
	(aarch64_evpc_rev): Use a predicated operation for SVE.
	(aarch64_evpc_dup): Enforce a 64-byte range for SVE DUP.
	(aarch64_evpc_tbl): Use MAX_COMPILE_TIME_VEC_BYTES instead of
	MAX_VECT_LEN.
	(aarch64_evpc_sve_tbl): New function.
	(aarch64_expand_vec_perm_const_1): Handle SVE permutes too,
	using aarch64_evpc_sve_tbl rather than aarch64_evpc_tbl.
	(aarch64_expand_vec_perm_const): Initialize vec_flags.
	(aarch64_vectorize_vec_perm_const_ok): Likewise.
	(aarch64_sve_cmp_operand_p, aarch64_unspec_cond_code)
	(aarch64_gen_unspec_cond, aarch64_expand_sve_vec_cmp_int)
	(aarch64_emit_unspec_cond, aarch64_emit_unspec_cond_or)
	(aarch64_emit_inverted_unspec_cond, aarch64_expand_sve_vec_cmp_float)
	(aarch64_expand_sve_vcond): New functions.
	(aarch64_modes_tieable_p): Use aarch64_vector_data_mode_p instead
	of aarch64_vector_mode_p.
	(aarch64_dwarf_poly_indeterminate_value): New function.
	(aarch64_compute_pressure_classes): Likewise.
	(aarch64_can_change_mode_class): Likewise.
	(TARGET_GET_RAW_RESULT_MODE, TARGET_GET_RAW_ARG_MODE): Redefine.
	(TARGET_VECTORIZE_PREFERRED_VECTOR_ALIGNMENT): Likewise.
	(TARGET_VECTORIZE_GET_MASK_MODE): Likewise.
	(TARGET_DWARF_POLY_INDETERMINATE_VALUE): Likewise.
	(TARGET_COMPUTE_PRESSURE_CLASSES): Likewise.
	(TARGET_CAN_CHANGE_MODE_CLASS): Likewise.
	* config/aarch64/constraints.md (Upa, Upl, Uad, Ual, Utr, Utw, Di)
	(Dm, Dv, vsa, vsc, vsd, vsi, vsn, vsl, vsm, vfa, vfm, vfn): New
	constraints.
	(Dn, Dl, Dr): Accept const as well as const_vector.
	(Dz): Likewise.  Compare against CONST0_RTX.
	* config/aarch64/iterators.md: Refer to "Advanced SIMD" instead
	of "vector" where appropriate.
	(SVE_ALL, SVE_BH, SVE_BHS, SVE_BHSI, SVE_HSDI, SVE_HSF, SVE_SD)
	(SVE_SDI, SVE_I, SVE_F, PRED_ALL, PRED_BHS): New mode iterators.
	(UNSPEC_SEL, UNSPEC_ANDF, UNSPEC_IORF, UNSPEC_XORF, UNSPEC_COND_LT)
	(UNSPEC_COND_LE, UNSPEC_COND_EQ, UNSPEC_COND_NE, UNSPEC_COND_GE)
	(UNSPEC_COND_GT, UNSPEC_COND_LO, UNSPEC_COND_LS, UNSPEC_COND_HS)
	(UNSPEC_COND_HI, UNSPEC_COND_UO): New unspecs.
	(Vetype, VEL, Vel, VWIDE, Vwide, vw, vwcore, V_INT_EQUIV)
	(v_int_equiv): Extend to SVE modes.
	(Vesize, V128, v128, Vewtype, V_FP_EQUIV, v_fp_equiv, VPRED): New
	mode attributes.
	(LOGICAL_OR, SVE_INT_UNARY, SVE_FP_UNARY): New code iterators.
	(optab): Handle popcount, smin, smax, umin, umax, abs and sqrt.
	(logical_nn, lr, sve_int_op, sve_fp_op): New code attributs.
	(LOGICALF, OPTAB_PERMUTE, UNPACK, UNPACK_UNSIGNED, SVE_COND_INT_CMP)
	(SVE_COND_FP_CMP): New int iterators.
	(perm_hilo): Handle the new unpack unspecs.
	(optab, logicalf_op, su, perm_optab, cmp_op, imm_con): New int
	attributes.
	* config/aarch64/predicates.md (aarch64_sve_cnt_immediate)
	(aarch64_sve_addvl_addpl_immediate, aarch64_split_add_offset_immediate)
	(aarch64_pluslong_or_poly_operand, aarch64_nonmemory_operand)
	(aarch64_equality_operator, aarch64_constant_vector_operand)
	(aarch64_sve_ld1r_operand, aarch64_sve_ldr_operand): New predicates.
	(aarch64_sve_nonimmediate_operand): Likewise.
	(aarch64_sve_general_operand): Likewise.
	(aarch64_sve_dup_operand, aarch64_sve_arith_immediate): Likewise.
	(aarch64_sve_sub_arith_immediate, aarch64_sve_inc_dec_immediate)
	(aarch64_sve_logical_immediate, aarch64_sve_mul_immediate): Likewise.
	(aarch64_sve_dup_immediate, aarch64_sve_cmp_vsc_immediate): Likewise.
	(aarch64_sve_cmp_vsd_immediate, aarch64_sve_index_immediate): Likewise.
	(aarch64_sve_float_arith_immediate): Likewise.
	(aarch64_sve_float_arith_with_sub_immediate): Likewise.
	(aarch64_sve_float_mul_immediate, aarch64_sve_arith_operand): Likewise.
	(aarch64_sve_add_operand, aarch64_sve_logical_operand): Likewise.
	(aarch64_sve_lshift_operand, aarch64_sve_rshift_operand): Likewise.
	(aarch64_sve_mul_operand, aarch64_sve_cmp_vsc_operand): Likewise.
	(aarch64_sve_cmp_vsd_operand, aarch64_sve_index_operand): Likewise.
	(aarch64_sve_float_arith_operand): Likewise.
	(aarch64_sve_float_arith_with_sub_operand): Likewise.
	(aarch64_sve_float_mul_operand): Likewise.
	(aarch64_sve_vec_perm_operand): Likewise.
	(aarch64_pluslong_operand): Include aarch64_sve_addvl_addpl_immediate.
	(aarch64_mov_operand): Accept const_poly_int and const_vector.
	(aarch64_simd_lshift_imm, aarch64_simd_rshift_imm): Accept const
	as well as const_vector.
	(aarch64_simd_imm_zero, aarch64_simd_imm_minus_one): Move earlier
	in file.  Use CONST0_RTX and CONSTM1_RTX.
	(aarch64_simd_reg_or_zero): Accept const as well as const_vector.
	Use aarch64_simd_imm_zero.
	* config/aarch64/aarch64-sve.md: New file.
	* config/aarch64/aarch64.md: Include it.
	(VG_REGNUM, P0_REGNUM, P7_REGNUM, P15_REGNUM): New register numbers.
	(UNSPEC_REV, UNSPEC_LD1_SVE, UNSPEC_ST1_SVE, UNSPEC_MERGE_PTRUE)
	(UNSPEC_PTEST_PTRUE, UNSPEC_UNPACKSHI, UNSPEC_UNPACKUHI)
	(UNSPEC_UNPACKSLO, UNSPEC_UNPACKULO, UNSPEC_PACK)
	(UNSPEC_FLOAT_CONVERT, UNSPEC_WHILE_LO): New unspec constants.
	(movqi, movhi): Pass CONST_POLY_INT operaneds through
	aarch64_expand_mov_immediate.
	(*mov<mode>_aarch64, *movsi_aarch64, *movdi_aarch64): Handle
	CNT[BHSD] immediates.
	(movti): Split CONST_POLY_INT moves into two halves.
	(add<mode>3): Accept aarch64_pluslong_or_poly_operand.
	Split additions that need a temporary here if the destination
	is the stack pointer.
	(*add<mode>3_aarch64): Handle ADDVL and ADDPL immediates.
	(*add<mode>3_poly_1): New instruction.
	(set_clobber_cc): New expander.


[-- Attachment #2: sve-01-main.diff.gz --]
[-- Type: application/gzip, Size: 55524 bytes --]

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [1/4] [AArch64] SVE backend support
  2017-11-03 17:48 ` [1/4] [AArch64] SVE backend support Richard Sandiford
@ 2018-01-05 11:41   ` Richard Sandiford
  2018-01-10 19:19     ` James Greenhalgh
  0 siblings, 1 reply; 18+ messages in thread
From: Richard Sandiford @ 2018-01-05 11:41 UTC (permalink / raw)
  To: gcc-patches; +Cc: richard.earnshaw, james.greenhalgh, marcus.shawcroft

[-- Attachment #1: Type: text/plain, Size: 16539 bytes --]

Here's the patch updated to apply on top of the v8.4 and
__builtin_load_no_speculate support.  It also handles the new
vec_perm_indices and CONST_VECTOR encoding and uses VNx... names
for the SVE modes.

Richard Sandiford <richard.sandiford@linaro.org> writes:
> This patch adds support for ARM's Scalable Vector Extension.
> The patch just contains the core features that work with the
> current vectoriser framework; later patches will add extra
> capabilities to both the target-independent code and AArch64 code.
> The patch doesn't include:
>
> - support for unwinding frames whose size depends on the vector length
> - modelling the effect of __tls_get_addr on the SVE registers
>
> These are handled by later patches instead.
>
> Some notes:
>
> - The copyright years for aarch64-sve.md start at 2009 because some of
>   the code is based on aarch64.md, which also starts from then.
>
> - The patch inserts spaces between items in the AArch64 section
>   of sourcebuild.texi.  This matches at least the surrounding
>   architectures and looks a little nicer in the info output.
>
> - aarch64-sve.md includes a pattern:
>
>     while_ult<GPI:mode><PRED_ALL:mode>
>
>   A later patch adds a matching "while_ult" optab, but the pattern
>   is also needed by the predicate vec_duplicate expander.

2018-01-05  Richard Sandiford  <richard.sandiford@linaro.org>
	    Alan Hayward  <alan.hayward@arm.com>
	    David Sherwood  <david.sherwood@arm.com>

gcc/
	* doc/invoke.texi (-msve-vector-bits=): Document new option.
	(sve): Document new AArch64 extension.
	* doc/md.texi (w): Extend the description of the AArch64
	constraint to include SVE vectors.
	(Upl, Upa): Document new AArch64 predicate constraints.
	* config/aarch64/aarch64-opts.h (aarch64_sve_vector_bits_enum): New
	enum.
	* config/aarch64/aarch64.opt (sve_vector_bits): New enum.
	(msve-vector-bits=): New option.
	* config/aarch64/aarch64-option-extensions.def (fp, simd): Disable
	SVE when these are disabled.
	(sve): New extension.
	* config/aarch64/aarch64-modes.def: Define SVE vector and predicate
	modes.  Adjust their number of units based on aarch64_sve_vg.
	(MAX_BITSIZE_MODE_ANY_MODE): Define.
	* config/aarch64/aarch64-protos.h (ADDR_QUERY_ANY): New
	aarch64_addr_query_type.
	(aarch64_const_vec_all_same_in_range_p, aarch64_sve_pred_mode)
	(aarch64_sve_cnt_immediate_p, aarch64_sve_addvl_addpl_immediate_p)
	(aarch64_sve_inc_dec_immediate_p, aarch64_add_offset_temporaries)
	(aarch64_split_add_offset, aarch64_output_sve_cnt_immediate)
	(aarch64_output_sve_addvl_addpl, aarch64_output_sve_inc_dec_immediate)
	(aarch64_output_sve_mov_immediate, aarch64_output_ptrue): Declare.
	(aarch64_simd_imm_zero_p): Delete.
	(aarch64_check_zero_based_sve_index_immediate): Declare.
	(aarch64_sve_index_immediate_p, aarch64_sve_arith_immediate_p)
	(aarch64_sve_bitmask_immediate_p, aarch64_sve_dup_immediate_p)
	(aarch64_sve_cmp_immediate_p, aarch64_sve_float_arith_immediate_p)
	(aarch64_sve_float_mul_immediate_p): Likewise.
	(aarch64_classify_symbol): Take the offset as a HOST_WIDE_INT
	rather than an rtx.
	(aarch64_sve_ld1r_operand_p, aarch64_sve_ldr_operand_p): Declare.
	(aarch64_expand_mov_immediate): Take a gen_vec_duplicate callback.
	(aarch64_emit_sve_pred_move, aarch64_expand_sve_mem_move): Declare.
	(aarch64_expand_sve_vec_cmp_int, aarch64_expand_sve_vec_cmp_float)
	(aarch64_expand_sve_vcond, aarch64_expand_sve_vec_perm): Declare.
	(aarch64_regmode_natural_size): Likewise.
	* config/aarch64/aarch64.h (AARCH64_FL_SVE): New macro.
	(AARCH64_FL_V8_3, AARCH64_FL_RCPC, AARCH64_FL_DOTPROD): Shift
	left one place.
	(AARCH64_ISA_SVE, TARGET_SVE): New macros.
	(FIXED_REGISTERS, CALL_USED_REGISTERS, REGISTER_NAMES): Add entries
	for VG and the SVE predicate registers.
	(V_ALIASES): Add a "z"-prefixed alias.
	(FIRST_PSEUDO_REGISTER): Change to P15_REGNUM + 1.
	(AARCH64_DWARF_VG, AARCH64_DWARF_P0): New macros.
	(PR_REGNUM_P, PR_LO_REGNUM_P): Likewise.
	(PR_LO_REGS, PR_HI_REGS, PR_REGS): New reg_classes.
	(REG_CLASS_NAMES): Add entries for them.
	(REG_CLASS_CONTENTS): Likewise.  Update ALL_REGS to include VG
	and the predicate registers.
	(aarch64_sve_vg): Declare.
	(BITS_PER_SVE_VECTOR, BYTES_PER_SVE_VECTOR, BYTES_PER_SVE_PRED)
	(SVE_BYTE_MODE, MAX_COMPILE_TIME_VEC_BYTES): New macros.
	(REGMODE_NATURAL_SIZE): Define.
	* config/aarch64/aarch64-c.c (aarch64_update_cpp_builtins): Handle
	SVE macros.
	* config/aarch64/aarch64.c: Include cfgrtl.h.
	(simd_immediate_info): Add a constructor for series vectors,
	and an associated step field.
	(aarch64_sve_vg): New variable.
	(aarch64_dbx_register_number): Handle VG and the predicate registers.
	(aarch64_vect_struct_mode_p, aarch64_vector_mode_p): Delete.
	(VEC_ADVSIMD, VEC_SVE_DATA, VEC_SVE_PRED, VEC_STRUCT, VEC_ANY_SVE)
	(VEC_ANY_DATA, VEC_STRUCT): New constants.
	(aarch64_advsimd_struct_mode_p, aarch64_sve_pred_mode_p)
	(aarch64_classify_vector_mode, aarch64_vector_data_mode_p)
	(aarch64_sve_data_mode_p, aarch64_sve_pred_mode)
	(aarch64_get_mask_mode): New functions.
	(aarch64_hard_regno_nregs): Handle SVE data modes for FP_REGS
	and FP_LO_REGS.  Handle PR_REGS, PR_LO_REGS and PR_HI_REGS.
	(aarch64_hard_regno_mode_ok): Handle VG.  Also handle the SVE
	predicate modes and predicate registers.  Explicitly restrict
	GPRs to modes of 16 bytes or smaller.  Only allow FP registers
	to store a vector mode if it is recognized by
	aarch64_classify_vector_mode.
	(aarch64_regmode_natural_size): New function.
	(aarch64_hard_regno_caller_save_mode): Return the original mode
	for predicates.
	(aarch64_sve_cnt_immediate_p, aarch64_output_sve_cnt_immediate)
	(aarch64_sve_addvl_addpl_immediate_p, aarch64_output_sve_addvl_addpl)
	(aarch64_sve_inc_dec_immediate_p, aarch64_output_sve_inc_dec_immediate)
	(aarch64_add_offset_1_temporaries, aarch64_offset_temporaries): New
	functions.
	(aarch64_add_offset): Add a temp2 parameter.  Assert that temp1
	does not overlap dest if the function is frame-related.  Handle
	SVE constants.
	(aarch64_split_add_offset): New function.
	(aarch64_add_sp, aarch64_sub_sp): Add temp2 parameters and pass
	them aarch64_add_offset.
	(aarch64_allocate_and_probe_stack_space): Add a temp2 parameter
	and update call to aarch64_sub_sp.
	(aarch64_add_cfa_expression): New function.
	(aarch64_expand_prologue): Pass extra temporary registers to the
	functions above.  Handle the case in which we need to emit new
	DW_CFA_expressions for registers that were originally saved
	relative to the stack pointer, but now have to be expressed
	relative to the frame pointer.
	(aarch64_output_mi_thunk): Pass extra temporary registers to the
	functions above.
	(aarch64_expand_epilogue): Likewise.  Prevent inheritance of
	IP0 and IP1 values for SVE frames.
	(aarch64_expand_vec_series): New function.
	(aarch64_expand_sve_widened_duplicate): Likewise.
	(aarch64_expand_sve_const_vector): Likewise.
	(aarch64_expand_mov_immediate): Add a gen_vec_duplicate parameter.
	Handle SVE constants.  Use emit_move_insn to move a force_const_mem
	into the register, rather than emitting a SET directly.
	(aarch64_emit_sve_pred_move, aarch64_expand_sve_mem_move)
	(aarch64_get_reg_raw_mode, offset_4bit_signed_scaled_p)
	(offset_6bit_unsigned_scaled_p, aarch64_offset_7bit_signed_scaled_p)
	(offset_9bit_signed_scaled_p): New functions.
	(aarch64_replicate_bitmask_imm): New function.
	(aarch64_bitmask_imm): Use it.
	(aarch64_cannot_force_const_mem): Reject expressions involving
	a CONST_POLY_INT.  Update call to aarch64_classify_symbol.
	(aarch64_classify_index): Handle SVE indices, by requiring
	a plain register index with a scale that matches the element size.
	(aarch64_classify_address): Handle SVE addresses.  Assert that
	the mode of the address is VOIDmode or an integer mode.
	Update call to aarch64_classify_symbol.
	(aarch64_classify_symbolic_expression): Update call to
	aarch64_classify_symbol.
	(aarch64_const_vec_all_in_range_p): New function.
	(aarch64_print_vector_float_operand): Likewise.
	(aarch64_print_operand): Handle 'N' and 'C'.  Use "zN" rather than
	"vN" for FP registers with SVE modes.  Handle (const ...) vectors
	and the FP immediates 1.0 and 0.5.
	(aarch64_print_address_internal): Handle SVE addresses.
	(aarch64_print_operand_address): Use ADDR_QUERY_ANY.
	(aarch64_regno_regclass): Handle predicate registers.
	(aarch64_secondary_reload): Handle big-endian reloads of SVE
	data modes.
	(aarch64_class_max_nregs): Handle SVE modes and predicate registers.
	(aarch64_rtx_costs): Check for ADDVL and ADDPL instructions.
	(aarch64_convert_sve_vector_bits): New function.
	(aarch64_override_options): Use it to handle -msve-vector-bits=.
	(aarch64_classify_symbol): Take the offset as a HOST_WIDE_INT
	rather than an rtx.
	(aarch64_legitimate_constant_p): Use aarch64_classify_vector_mode.
	Handle SVE vector and predicate modes.  Accept VL-based constants
	that need only one temporary register, and VL offsets that require
	no temporary registers.
	(aarch64_conditional_register_usage): Mark the predicate registers
	as fixed if SVE isn't available.
	(aarch64_vector_mode_supported_p): Use aarch64_classify_vector_mode.
	Return true for SVE vector and predicate modes.
	(aarch64_simd_container_mode): Take the number of bits as a poly_int64
	rather than an unsigned int.  Handle SVE modes.
	(aarch64_preferred_simd_mode): Update call accordingly.  Handle
	SVE modes.
	(aarch64_autovectorize_vector_sizes): Add BYTES_PER_SVE_VECTOR
	if SVE is enabled.
	(aarch64_sve_index_immediate_p, aarch64_sve_arith_immediate_p)
	(aarch64_sve_bitmask_immediate_p, aarch64_sve_dup_immediate_p)
	(aarch64_sve_cmp_immediate_p, aarch64_sve_float_arith_immediate_p)
	(aarch64_sve_float_mul_immediate_p): New functions.
	(aarch64_sve_valid_immediate): New function.
	(aarch64_simd_valid_immediate): Use it as the fallback for SVE vectors.
	Explicitly reject structure modes.  Check for INDEX constants.
	Handle PTRUE and PFALSE constants.
	(aarch64_check_zero_based_sve_index_immediate): New function.
	(aarch64_simd_imm_zero_p): Delete.
	(aarch64_mov_operand_p): Use aarch64_simd_valid_immediate for
	vector modes.  Accept constants in the range of CNT[BHWD].
	(aarch64_simd_scalar_immediate_valid_for_move): Explicitly
	ask for an Advanced SIMD mode.
	(aarch64_sve_ld1r_operand_p, aarch64_sve_ldr_operand_p): New functions.
	(aarch64_simd_vector_alignment): Handle SVE predicates.
	(aarch64_vectorize_preferred_vector_alignment): New function.
	(aarch64_simd_vector_alignment_reachable): Use it instead of
	the vector size.
	(aarch64_shift_truncation_mask): Use aarch64_vector_data_mode_p.
	(aarch64_output_sve_mov_immediate, aarch64_output_ptrue): New
	functions.
	(MAX_VECT_LEN): Delete.
	(expand_vec_perm_d): Add a vec_flags field.
	(emit_unspec2, aarch64_expand_sve_vec_perm): New functions.
	(aarch64_evpc_trn, aarch64_evpc_uzp, aarch64_evpc_zip)
	(aarch64_evpc_ext): Don't apply a big-endian lane correction
	for SVE modes.
	(aarch64_evpc_rev): Rename to...
	(aarch64_evpc_rev_local): ...this.  Use a predicated operation for SVE.
	(aarch64_evpc_rev_global): New function.
	(aarch64_evpc_dup): Enforce a 64-byte range for SVE DUP.
	(aarch64_evpc_tbl): Use MAX_COMPILE_TIME_VEC_BYTES instead of
	MAX_VECT_LEN.
	(aarch64_evpc_sve_tbl): New function.
	(aarch64_expand_vec_perm_const_1): Update after rename of
	aarch64_evpc_rev.  Handle SVE permutes too, trying
	aarch64_evpc_rev_global and using aarch64_evpc_sve_tbl rather
	than aarch64_evpc_tbl.
	(aarch64_vectorize_vec_perm_const): Initialize vec_flags.
	(aarch64_sve_cmp_operand_p, aarch64_unspec_cond_code)
	(aarch64_gen_unspec_cond, aarch64_expand_sve_vec_cmp_int)
	(aarch64_emit_unspec_cond, aarch64_emit_unspec_cond_or)
	(aarch64_emit_inverted_unspec_cond, aarch64_expand_sve_vec_cmp_float)
	(aarch64_expand_sve_vcond): New functions.
	(aarch64_modes_tieable_p): Use aarch64_vector_data_mode_p instead
	of aarch64_vector_mode_p.
	(aarch64_dwarf_poly_indeterminate_value): New function.
	(aarch64_compute_pressure_classes): Likewise.
	(aarch64_can_change_mode_class): Likewise.
	(TARGET_GET_RAW_RESULT_MODE, TARGET_GET_RAW_ARG_MODE): Redefine.
	(TARGET_VECTORIZE_PREFERRED_VECTOR_ALIGNMENT): Likewise.
	(TARGET_VECTORIZE_GET_MASK_MODE): Likewise.
	(TARGET_DWARF_POLY_INDETERMINATE_VALUE): Likewise.
	(TARGET_COMPUTE_PRESSURE_CLASSES): Likewise.
	(TARGET_CAN_CHANGE_MODE_CLASS): Likewise.
	* config/aarch64/constraints.md (Upa, Upl, Uav, Uat, Usv, Usi, Utr)
	(Uty, Dm, vsa, vsc, vsd, vsi, vsn, vsl, vsm, vsA, vsM, vsN): New
	constraints.
	(Dn, Dl, Dr): Accept const as well as const_vector.
	(Dz): Likewise.  Compare against CONST0_RTX.
	* config/aarch64/iterators.md: Refer to "Advanced SIMD" instead
	of "vector" where appropriate.
	(SVE_ALL, SVE_BH, SVE_BHS, SVE_BHSI, SVE_HSDI, SVE_HSF, SVE_SD)
	(SVE_SDI, SVE_I, SVE_F, PRED_ALL, PRED_BHS): New mode iterators.
	(UNSPEC_SEL, UNSPEC_ANDF, UNSPEC_IORF, UNSPEC_XORF, UNSPEC_COND_LT)
	(UNSPEC_COND_LE, UNSPEC_COND_EQ, UNSPEC_COND_NE, UNSPEC_COND_GE)
	(UNSPEC_COND_GT, UNSPEC_COND_LO, UNSPEC_COND_LS, UNSPEC_COND_HS)
	(UNSPEC_COND_HI, UNSPEC_COND_UO): New unspecs.
	(Vetype, VEL, Vel, VWIDE, Vwide, vw, vwcore, V_INT_EQUIV)
	(v_int_equiv): Extend to SVE modes.
	(Vesize, V128, v128, Vewtype, V_FP_EQUIV, v_fp_equiv, VPRED): New
	mode attributes.
	(LOGICAL_OR, SVE_INT_UNARY, SVE_FP_UNARY): New code iterators.
	(optab): Handle popcount, smin, smax, umin, umax, abs and sqrt.
	(logical_nn, lr, sve_int_op, sve_fp_op): New code attributs.
	(LOGICALF, OPTAB_PERMUTE, UNPACK, UNPACK_UNSIGNED, SVE_COND_INT_CMP)
	(SVE_COND_FP_CMP): New int iterators.
	(perm_hilo): Handle the new unpack unspecs.
	(optab, logicalf_op, su, perm_optab, cmp_op, imm_con): New int
	attributes.
	* config/aarch64/predicates.md (aarch64_sve_cnt_immediate)
	(aarch64_sve_addvl_addpl_immediate, aarch64_split_add_offset_immediate)
	(aarch64_pluslong_or_poly_operand, aarch64_nonmemory_operand)
	(aarch64_equality_operator, aarch64_constant_vector_operand)
	(aarch64_sve_ld1r_operand, aarch64_sve_ldr_operand): New predicates.
	(aarch64_sve_nonimmediate_operand): Likewise.
	(aarch64_sve_general_operand): Likewise.
	(aarch64_sve_dup_operand, aarch64_sve_arith_immediate): Likewise.
	(aarch64_sve_sub_arith_immediate, aarch64_sve_inc_dec_immediate)
	(aarch64_sve_logical_immediate, aarch64_sve_mul_immediate): Likewise.
	(aarch64_sve_dup_immediate, aarch64_sve_cmp_vsc_immediate): Likewise.
	(aarch64_sve_cmp_vsd_immediate, aarch64_sve_index_immediate): Likewise.
	(aarch64_sve_float_arith_immediate): Likewise.
	(aarch64_sve_float_arith_with_sub_immediate): Likewise.
	(aarch64_sve_float_mul_immediate, aarch64_sve_arith_operand): Likewise.
	(aarch64_sve_add_operand, aarch64_sve_logical_operand): Likewise.
	(aarch64_sve_lshift_operand, aarch64_sve_rshift_operand): Likewise.
	(aarch64_sve_mul_operand, aarch64_sve_cmp_vsc_operand): Likewise.
	(aarch64_sve_cmp_vsd_operand, aarch64_sve_index_operand): Likewise.
	(aarch64_sve_float_arith_operand): Likewise.
	(aarch64_sve_float_arith_with_sub_operand): Likewise.
	(aarch64_sve_float_mul_operand): Likewise.
	(aarch64_sve_vec_perm_operand): Likewise.
	(aarch64_pluslong_operand): Include aarch64_sve_addvl_addpl_immediate.
	(aarch64_mov_operand): Accept const_poly_int and const_vector.
	(aarch64_simd_lshift_imm, aarch64_simd_rshift_imm): Accept const
	as well as const_vector.
	(aarch64_simd_imm_zero, aarch64_simd_imm_minus_one): Move earlier
	in file.  Use CONST0_RTX and CONSTM1_RTX.
	(aarch64_simd_or_scalar_imm_zero): Likewise.  Add match_codes.
	(aarch64_simd_reg_or_zero): Accept const as well as const_vector.
	Use aarch64_simd_imm_zero.
	* config/aarch64/aarch64-sve.md: New file.
	* config/aarch64/aarch64.md: Include it.
	(VG_REGNUM, P0_REGNUM, P7_REGNUM, P15_REGNUM): New register numbers.
	(UNSPEC_REV, UNSPEC_LD1_SVE, UNSPEC_ST1_SVE, UNSPEC_MERGE_PTRUE)
	(UNSPEC_PTEST_PTRUE, UNSPEC_UNPACKSHI, UNSPEC_UNPACKUHI)
	(UNSPEC_UNPACKSLO, UNSPEC_UNPACKULO, UNSPEC_PACK)
	(UNSPEC_FLOAT_CONVERT, UNSPEC_WHILE_LO): New unspec constants.
	(sve): New attribute.
	(enabled): Disable instructions with the sve attribute unless
	TARGET_SVE.
	(movqi, movhi): Pass CONST_POLY_INT operaneds through
	aarch64_expand_mov_immediate.
	(*mov<mode>_aarch64, *movsi_aarch64, *movdi_aarch64): Handle
	CNT[BHSD] immediates.
	(movti): Split CONST_POLY_INT moves into two halves.
	(add<mode>3): Accept aarch64_pluslong_or_poly_operand.
	Split additions that need a temporary here if the destination
	is the stack pointer.
	(*add<mode>3_aarch64): Handle ADDVL and ADDPL immediates.
	(*add<mode>3_poly_1): New instruction.
	(set_clobber_cc): New expander.


[-- Attachment #2: sve-01-main.diff.gz --]
[-- Type: application/gzip, Size: 58140 bytes --]

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [1/4] [AArch64] SVE backend support
  2018-01-05 11:41   ` Richard Sandiford
@ 2018-01-10 19:19     ` James Greenhalgh
  2018-01-10 19:55       ` Richard Sandiford
  0 siblings, 1 reply; 18+ messages in thread
From: James Greenhalgh @ 2018-01-10 19:19 UTC (permalink / raw)
  To: Richard Sandiford; +Cc: gcc-patches, Richard Earnshaw, Marcus Shawcroft, nd

On Fri, Jan 05, 2018 at 11:41:25AM +0000, Richard Sandiford wrote:
> Here's the patch updated to apply on top of the v8.4 and
> __builtin_load_no_speculate support.  It also handles the new
> vec_perm_indices and CONST_VECTOR encoding and uses VNx... names
> for the SVE modes.
> 
> Richard Sandiford <richard.sandiford@linaro.org> writes:
> > This patch adds support for ARM's Scalable Vector Extension.
> > The patch just contains the core features that work with the
> > current vectoriser framework; later patches will add extra
> > capabilities to both the target-independent code and AArch64 code.
> > The patch doesn't include:
> >
> > - support for unwinding frames whose size depends on the vector length
> > - modelling the effect of __tls_get_addr on the SVE registers
> >
> > These are handled by later patches instead.
> >
> > Some notes:
> >
> > - The copyright years for aarch64-sve.md start at 2009 because some of
> >   the code is based on aarch64.md, which also starts from then.
> >
> > - The patch inserts spaces between items in the AArch64 section
> >   of sourcebuild.texi.  This matches at least the surrounding
> >   architectures and looks a little nicer in the info output.
> >
> > - aarch64-sve.md includes a pattern:
> >
> >     while_ult<GPI:mode><PRED_ALL:mode>
> >
> >   A later patch adds a matching "while_ult" optab, but the pattern
> >   is also needed by the predicate vec_duplicate expander.

I'm keen to take this. The code is good quality overall, I'm confident in your
reputation and implementation. There are some parts of the design that I'm
less happy about, but pragmatically, we should take this now to get the
behaviour correct, and look to optimise, refactor, and clean-up in future.

Sorry it took me a long time to get to the review. I've got no meaningful
design concerns here, and certainly nothing so critical that we couldn't
fix it after the fact in GCC 9 and up.

That said...

> 	(aarch64_sve_cnt_immediate_p, aarch64_sve_addvl_addpl_immediate_p)

I'm not a big fan of these sorts of functions which return a char* where
we've dumped the text we want to print out in the short term. The interface
(fill a static char[] we can then leak on return) is pretty ugly.

One consideration for future work would be refactoring out aarch64.c - it is
getting to be too big for my liking (near 18,000 lines).

> 	(aarch64_expand_sve_mem_move)

Do we have a good description of how SVE big-endian vectors work, <snip more
comments - I found the detailed comment at the top of aarch64-sve.md> 

The sort of comment you write later ("see the comment at the head of
aarch64-sve.md for details") would also be useful here as a reference.

> aarch64_get_reg_raw_mode

Do we assert/warn anywhere for users of __builtin_apply that they are
fundamentally unsupported?

> offset_4bit_signed_scaled_p

So much code duplication here and in similair functions. Would a single
interface (unsigned bits, bool signed, bool scaled) let you avoid the many
identical functions?

> aarch64_evpc_rev_local 

I'm likely missing something obvious, but what is the distinction you're
drawing between global and local? Could you comment it?

> aarch64-sve.md - scheduling types

None of the instructions here have types for scheduling. That's going to
make for a future headache. Adding them to the existing scheduling types
is going to cause all sorts of trouble when building GCC (we already have
too many types for some compilers to handle the structure!). We'll need
to finds a solution to how we'll direct scheduling for SVE.

> aarch64-sve.md - predicated operands

It is a shame this ends up being so ugly and requiring UNSPEC_MERGE_PTRUE
everywhere. That will block a lot of useful optimisation.

Otherwise, this is OK for trunk. I'm happy to take it as is, and have the
above suggestions applied as follow-ups if you think they are worth doing.

Thanks,
James

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [1/4] [AArch64] SVE backend support
  2018-01-10 19:19     ` James Greenhalgh
@ 2018-01-10 19:55       ` Richard Sandiford
  0 siblings, 0 replies; 18+ messages in thread
From: Richard Sandiford @ 2018-01-10 19:55 UTC (permalink / raw)
  To: James Greenhalgh; +Cc: gcc-patches, Richard Earnshaw, Marcus Shawcroft, nd

Thanks for the review!

James Greenhalgh <james.greenhalgh@arm.com> writes:
> On Fri, Jan 05, 2018 at 11:41:25AM +0000, Richard Sandiford wrote:
>> Here's the patch updated to apply on top of the v8.4 and
>> __builtin_load_no_speculate support.  It also handles the new
>> vec_perm_indices and CONST_VECTOR encoding and uses VNx... names
>> for the SVE modes.
>> 
>> Richard Sandiford <richard.sandiford@linaro.org> writes:
>> > This patch adds support for ARM's Scalable Vector Extension.
>> > The patch just contains the core features that work with the
>> > current vectoriser framework; later patches will add extra
>> > capabilities to both the target-independent code and AArch64 code.
>> > The patch doesn't include:
>> >
>> > - support for unwinding frames whose size depends on the vector length
>> > - modelling the effect of __tls_get_addr on the SVE registers
>> >
>> > These are handled by later patches instead.
>> >
>> > Some notes:
>> >
>> > - The copyright years for aarch64-sve.md start at 2009 because some of
>> >   the code is based on aarch64.md, which also starts from then.
>> >
>> > - The patch inserts spaces between items in the AArch64 section
>> >   of sourcebuild.texi.  This matches at least the surrounding
>> >   architectures and looks a little nicer in the info output.
>> >
>> > - aarch64-sve.md includes a pattern:
>> >
>> >     while_ult<GPI:mode><PRED_ALL:mode>
>> >
>> >   A later patch adds a matching "while_ult" optab, but the pattern
>> >   is also needed by the predicate vec_duplicate expander.
>
> I'm keen to take this. The code is good quality overall, I'm confident in your
> reputation and implementation. There are some parts of the design that I'm
> less happy about, but pragmatically, we should take this now to get the
> behaviour correct, and look to optimise, refactor, and clean-up in future.
>
> Sorry it took me a long time to get to the review. I've got no meaningful
> design concerns here, and certainly nothing so critical that we couldn't
> fix it after the fact in GCC 9 and up.
>
> That said...
>
>> 	(aarch64_sve_cnt_immediate_p, aarch64_sve_addvl_addpl_immediate_p)
>
> I'm not a big fan of these sorts of functions which return a char* where
> we've dumped the text we want to print out in the short term. The interface
> (fill a static char[] we can then leak on return) is pretty ugly.

Yeah, it's not pretty, but I think the various possible ways of doing
the addition do justify using output functions here.  The distinction
between INC[BHWD], DEC[BHWD], ADDVL and ADDPL doesn't really affect
anything other than the final output, so it isn't something that
should be exposed as different constraints (for example).

We should probably "just" have a nicer interface for target code
to construct instruction format strings.

> One consideration for future work would be refactoring out aarch64.c - it is
> getting to be too big for my liking (near 18,000 lines).
>
>> 	(aarch64_expand_sve_mem_move)
>
> Do we have a good description of how SVE big-endian vectors work, <snip more
> comments - I found the detailed comment at the top of aarch64-sve.md> 
>
> The sort of comment you write later ("see the comment at the head of
> aarch64-sve.md for details") would also be useful here as a reference.

Ah, yeah, will add a reference there too.

>> aarch64_get_reg_raw_mode
>
> Do we assert/warn anywhere for users of __builtin_apply that they are
> fundamentally unsupported?

Not as far as I know.  FWIW, this doesn't affect SVE (yet), because we
don't yet support any types that would be passed in the SVE-specific
part of the registers.

>> offset_4bit_signed_scaled_p
>
> So much code duplication here and in similair functions. Would a single
> interface (unsigned bits, bool signed, bool scaled) let you avoid the many
> identical functions?

We just kept to the existing style here.  I agree it might be a good idea
to consolidate them, but personally I'd prefer to keep the signed/scaled
distinction in the function name, since it's more readable than booleans
and shorter than a new enum.

>> aarch64_evpc_rev_local 
>
> I'm likely missing something obvious, but what is the distinction you're
> drawing between global and local? Could you comment it?

"global" reverses the whole vector: the first and last elements switch
places.  "local" reverses within groups of N consecutive elements but
not between them.

But yet again names are probably my downfall here. :-)  I'm happy to
call them something else instead.  Either way I'll expand the comments.

>> aarch64-sve.md - scheduling types
>
> None of the instructions here have types for scheduling. That's going to
> make for a future headache. Adding them to the existing scheduling types
> is going to cause all sorts of trouble when building GCC (we already have
> too many types for some compilers to handle the structure!). We'll need
> to finds a solution to how we'll direct scheduling for SVE.

Yeah.  I didn't want to add scheduling attributes now without scheduling
descriptions to go with them, since there's no way of knowing what the
division should be.

>> aarch64-sve.md - predicated operands
>
> It is a shame this ends up being so ugly and requiring UNSPEC_MERGE_PTRUE
> everywhere. That will block a lot of useful optimisation.

I don't think it blocks many in practice (at least, not the kind that
really do belong in RTL rather than gimple).  Most instructions map
directly to an optab and those that don't do combine OK in the
UNSPEC_MERGE_PTRUE form (e.g. AND + NOT -> BIC).

> Otherwise, this is OK for trunk. I'm happy to take it as is, and have the
> above suggestions applied as follow-ups if you think they are worth doing.

Thanks.  If we can reach quick agreement about the offset checks then
I'll roll in that change.

Richard

^ permalink raw reply	[flat|nested] 18+ messages in thread

* [2/4] [AArch64] Testsuite markup for SVE
  2017-11-03 17:45 [0/4] [AArch64] Add SVE support Richard Sandiford
  2017-11-03 17:48 ` [1/4] [AArch64] SVE backend support Richard Sandiford
@ 2017-11-03 17:50 ` Richard Sandiford
  2018-01-06 17:58   ` James Greenhalgh
  2017-11-03 17:51 ` [3/4] [AArch64] SVE tests Richard Sandiford
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 18+ messages in thread
From: Richard Sandiford @ 2017-11-03 17:50 UTC (permalink / raw)
  To: gcc-patches; +Cc: richard.earnshaw, james.greenhalgh, marcus.shawcroft

This patch adds new target selectors for SVE and updates existing
selectors accordingly.  It also XFAILs some tests that don't yet
work for some SVE modes; most of these go away with follow-on
vectorisation enhancements.


2017-11-03  Richard Sandiford  <richard.sandiford@linaro.org>
	    Alan Hayward  <alan.hayward@arm.com>
	    David Sherwood  <david.sherwood@arm.com>

gcc/testsuite/
	* lib/target-supports.exp (check_effective_target_aarch64_sve)
	(aarch64_sve_bits, check_effective_target_aarch64_sve_hw)
	(aarch64_sve_hw_bits, check_effective_target_aarch64_sve256_hw):
	New procedures.
	(check_effective_target_vect_perm): Handle SVE.
	(check_effective_target_vect_perm_byte): Likewise.
	(check_effective_target_vect_perm_short): Likewise.
	(check_effective_target_vect_widen_sum_hi_to_si_pattern): Likewise.
	(check_effective_target_vect_widen_mult_qi_to_hi): Likewise.
	(check_effective_target_vect_widen_mult_hi_to_si): Likewise.
	(check_effective_target_vect_element_align_preferred): Likewise.
	(check_effective_target_vect_align_stack_vars): Likewise.
	(check_effective_target_vect_load_lanes): Likewise.
	(check_effective_target_vect_masked_store): Likewise.
	(available_vector_sizes): Use aarch64_sve_bits for SVE.
	* gcc.dg/vect/tree-vect.h (VECTOR_BITS): Define appropriately
	for SVE.
	* gcc.dg/tree-ssa/ssa-dom-cse-2.c: Add SVE XFAIL.
	* gcc.dg/vect/bb-slp-pr69907.c: Likewise.
	* gcc.dg/vect/no-vfa-vect-depend-2.c: Likewise.
	* gcc.dg/vect/no-vfa-vect-depend-3.c: Likewise.
	* gcc.dg/vect/slp-23.c: Likewise.
	* gcc.dg/vect/slp-25.c: Likewise.
	* gcc.dg/vect/slp-perm-5.c: Likewise.
	* gcc.dg/vect/slp-perm-6.c: Likewise.
	* gcc.dg/vect/slp-perm-9.c: Likewise.
	* gcc.dg/vect/slp-reduc-3.c: Likewise.
	* gcc.dg/vect/vect-114.c: Likewise.
	* gcc.dg/vect/vect-119.c: Likewise.
	* gcc.dg/vect/vect-cselim-1.c: Likewise.
	* gcc.dg/vect/vect-live-slp-1.c: Likewise.
	* gcc.dg/vect/vect-live-slp-2.c: Likewise.
	* gcc.dg/vect/vect-live-slp-3.c: Likewise.
	* gcc.dg/vect/vect-mult-const-pattern-1.c: Likewise.
	* gcc.dg/vect/vect-mult-const-pattern-2.c: Likewise.
	* gcc.dg/vect/vect-over-widen-1-big-array.c: Likewise.
	* gcc.dg/vect/vect-over-widen-1.c: Likewise.
	* gcc.dg/vect/vect-over-widen-3-big-array.c: Likewise.
	* gcc.dg/vect/vect-over-widen-4-big-array.c: Likewise.
	* gcc.dg/vect/vect-over-widen-4.c: Likewise.

Index: gcc/testsuite/lib/target-supports.exp
===================================================================
--- gcc/testsuite/lib/target-supports.exp	2017-11-03 17:22:13.533564036 +0000
+++ gcc/testsuite/lib/target-supports.exp	2017-11-03 17:24:09.475993817 +0000
@@ -3350,6 +3350,35 @@ proc check_effective_target_aarch64_litt
     }]
 }
 
+# Return 1 if this is an AArch64 target supporting SVE.
+proc check_effective_target_aarch64_sve { } {
+    if { ![istarget aarch64*-*-*] } {
+	return 0
+    }
+    return [check_no_compiler_messages aarch64_sve assembly {
+	#if !defined (__ARM_FEATURE_SVE)
+	#error FOO
+	#endif
+    }]
+}
+
+# Return the size in bits of an SVE vector, or 0 if the size is variable.
+proc aarch64_sve_bits { } {
+    return [check_cached_effective_target aarch64_sve_bits {
+	global tool
+
+	set src dummy[pid].c
+	set f [open $src "w"]
+	puts $f "int bits = __ARM_FEATURE_SVE_BITS;"
+	close $f
+	set output [${tool}_target_compile $src "" preprocess ""]
+	file delete $src
+
+	regsub {.*bits = ([^;]*);.*} $output {\1} bits
+	expr { $bits }
+    }]
+}
+
 # Return 1 if this is a compiler supporting ARC atomic operations
 proc check_effective_target_arc_atomic { } {
     return [check_no_compiler_messages arc_atomic assembly {
@@ -4275,6 +4304,49 @@ proc check_effective_target_arm_neon_hw
     } [add_options_for_arm_neon ""]]
 }
 
+# Return true if this is an AArch64 target that can run SVE code.
+
+proc check_effective_target_aarch64_sve_hw { } {
+    if { ![istarget aarch64*-*-*] } {
+	return 0
+    }
+    return [check_runtime aarch64_sve_hw_available {
+	int
+	main (void)
+	{
+	  asm volatile ("ptrue p0.b");
+	  return 0;
+	}
+    }]
+}
+
+# Return true if this is an AArch64 target that can run SVE code and
+# if its SVE vectors have exactly BITS bits.
+
+proc aarch64_sve_hw_bits { bits } {
+    if { ![check_effective_target_aarch64_sve_hw] } {
+	return 0
+    }
+    return [check_runtime aarch64_sve${bits}_hw [subst {
+	int
+	main (void)
+	{
+	  int res;
+	  asm volatile ("cntd %0" : "=r" (res));
+	  if (res * 64 != $bits)
+	    __builtin_abort ();
+	  return 0;
+	}
+    }]]
+}
+
+# Return true if this is an AArch64 target that can run SVE code and
+# if its SVE vectors have exactly 256 bits.
+
+proc check_effective_target_aarch64_sve256_hw { } {
+    return [aarch64_sve_hw_bits 256]
+}
+
 proc check_effective_target_arm_neonv2_hw { } {
     return [check_runtime arm_neon_hwv2_available {
 	#include "arm_neon.h"
@@ -5531,7 +5603,8 @@ proc check_effective_target_vect_perm {
     } else {
 	set et_vect_perm_saved($et_index) 0
         if { [is-effective-target arm_neon]
-	     || [istarget aarch64*-*-*]
+	     || ([istarget aarch64*-*-*]
+		 && ![check_effective_target_vect_variable_length])
 	     || [istarget powerpc*-*-*]
              || [istarget spu-*-*]
 	     || [istarget i?86-*-*] || [istarget x86_64-*-*]
@@ -5636,7 +5709,8 @@ proc check_effective_target_vect_perm_by
         if { ([is-effective-target arm_neon]
 	      && [is-effective-target arm_little_endian])
 	     || ([istarget aarch64*-*-*]
-		 && [is-effective-target aarch64_little_endian])
+		 && [is-effective-target aarch64_little_endian]
+		 && ![check_effective_target_vect_variable_length])
 	     || [istarget powerpc*-*-*]
 	     || [istarget spu-*-*]
 	     || ([istarget mips-*.*]
@@ -5675,7 +5749,8 @@ proc check_effective_target_vect_perm_sh
         if { ([is-effective-target arm_neon]
 	      && [is-effective-target arm_little_endian])
 	     || ([istarget aarch64*-*-*]
-		 && [is-effective-target aarch64_little_endian])
+		 && [is-effective-target aarch64_little_endian]
+		 && ![check_effective_target_vect_variable_length])
 	     || [istarget powerpc*-*-*]
 	     || [istarget spu-*-*]
 	     || ([istarget mips*-*-*]
@@ -5735,7 +5810,8 @@ proc check_effective_target_vect_widen_s
     } else {
 	set et_vect_widen_sum_hi_to_si_pattern_saved($et_index) 0
         if { [istarget powerpc*-*-*]
-             || [istarget aarch64*-*-*]
+             || ([istarget aarch64*-*-*]
+		 && ![check_effective_target_aarch64_sve])
 	     || [is-effective-target arm_neon]
              || [istarget ia64-*-*] } {
 	    set et_vect_widen_sum_hi_to_si_pattern_saved($et_index) 1
@@ -5847,7 +5923,8 @@ proc check_effective_target_vect_widen_m
 	    set et_vect_widen_mult_qi_to_hi_saved($et_index) 0
 	}
         if { [istarget powerpc*-*-*]
-              || [istarget aarch64*-*-*]
+              || ([istarget aarch64*-*-*]
+		  && ![check_effective_target_aarch64_sve])
               || [is-effective-target arm_neon]
 	      || ([istarget s390*-*-*]
 		  && [check_effective_target_s390_vx]) } {
@@ -5885,7 +5962,8 @@ proc check_effective_target_vect_widen_m
         if { [istarget powerpc*-*-*]
 	     || [istarget spu-*-*]
 	     || [istarget ia64-*-*]
-	     || [istarget aarch64*-*-*]
+	     || ([istarget aarch64*-*-*]
+		 && ![check_effective_target_aarch64_sve])
 	     || [istarget i?86-*-*] || [istarget x86_64-*-*]
 	     || [is-effective-target arm_neon]
 	     || ([istarget s390*-*-*]
@@ -6347,12 +6425,16 @@ proc check_effective_target_vect_natural
 # alignment during vectorization.
 
 proc check_effective_target_vect_element_align_preferred { } {
-    return [check_effective_target_vect_variable_length]
+    return [expr { [check_effective_target_aarch64_sve]
+		   && [check_effective_target_vect_variable_length] }]
 }
 
 # Return 1 if we can align stack data to the preferred vector alignment.
 
 proc check_effective_target_vect_align_stack_vars { } {
+    if { [check_effective_target_aarch64_sve] } {
+	return [check_effective_target_vect_variable_length]
+    }
     return 1
 }
 
@@ -6424,7 +6506,8 @@ proc check_effective_target_vect_load_la
     } else {
 	set et_vect_load_lanes 0
 	if { ([istarget arm*-*-*] && [check_effective_target_arm_neon_ok])
-	     || [istarget aarch64*-*-*] } {
+	     || ([istarget aarch64*-*-*]
+		 && ![check_effective_target_aarch64_sve]) } {
 	    set et_vect_load_lanes 1
 	}
     }
@@ -6436,7 +6519,7 @@ proc check_effective_target_vect_load_la
 # Return 1 if the target supports vector masked stores.
 
 proc check_effective_target_vect_masked_store { } {
-    return 0
+    return [check_effective_target_aarch64_sve]
 }
 
 # Return 1 if the target supports vector conditional operations, 0 otherwise.
@@ -6704,6 +6787,9 @@ foreach N {2 3 4 8} {
 proc available_vector_sizes { } {
     set result {}
     if { [istarget aarch64*-*-*] } {
+	if { [check_effective_target_aarch64_sve] } {
+	    lappend result [aarch64_sve_bits]
+	}
 	lappend result 128 64
     } elseif { [istarget arm*-*-*]
 		&& [check_effective_target_arm_neon_ok] } {
Index: gcc/testsuite/gcc.dg/vect/tree-vect.h
===================================================================
--- gcc/testsuite/gcc.dg/vect/tree-vect.h	2017-11-03 17:21:09.761094925 +0000
+++ gcc/testsuite/gcc.dg/vect/tree-vect.h	2017-11-03 17:24:09.472993942 +0000
@@ -76,4 +76,12 @@ check_vect (void)
   signal (SIGILL, SIG_DFL);
 }
 
-#define VECTOR_BITS 128
+#if defined (__ARM_FEATURE_SVE)
+#  if __ARM_FEATURE_SVE_BITS == 0
+#    define VECTOR_BITS 1024
+#  else
+#    define VECTOR_BITS __ARM_FEATURE_SVE_BITS
+#  endif
+#else
+#  define VECTOR_BITS 128
+#endif
Index: gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-cse-2.c
===================================================================
--- gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-cse-2.c	2017-02-23 19:54:09.000000000 +0000
+++ gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-cse-2.c	2017-11-03 17:24:09.471993983 +0000
@@ -25,4 +25,4 @@ foo ()
    but the loop reads only one element at a time, and DOM cannot resolve these.
    The same happens on powerpc depending on the SIMD support available.  */
 
-/* { dg-final { scan-tree-dump "return 28;" "optimized" { xfail { { alpha*-*-* hppa*64*-*-* } || { lp64 && { powerpc*-*-* sparc*-*-* riscv*-*-* } } } } } } */
+/* { dg-final { scan-tree-dump "return 28;" "optimized" { xfail { { alpha*-*-* hppa*64*-*-* } || { { lp64 && { powerpc*-*-* sparc*-*-* riscv*-*-* } } || aarch64_sve } } } } } */
Index: gcc/testsuite/gcc.dg/vect/bb-slp-pr69907.c
===================================================================
--- gcc/testsuite/gcc.dg/vect/bb-slp-pr69907.c	2017-11-03 17:21:09.758095046 +0000
+++ gcc/testsuite/gcc.dg/vect/bb-slp-pr69907.c	2017-11-03 17:24:09.471993983 +0000
@@ -17,4 +17,6 @@ void foo(unsigned *p1, unsigned short *p
     p1[n] = p2[n * 2];
 }
 
-/* { dg-final { scan-tree-dump "BB vectorization with gaps at the end of a load is not supported" "slp1" } } */
+/* Disable for SVE because for long or variable-length vectors we don't
+   get an unrolled epilogue loop.  */
+/* { dg-final { scan-tree-dump "BB vectorization with gaps at the end of a load is not supported" "slp1" { target { ! aarch64_sve } } } } */
Index: gcc/testsuite/gcc.dg/vect/no-vfa-vect-depend-2.c
===================================================================
--- gcc/testsuite/gcc.dg/vect/no-vfa-vect-depend-2.c	2015-06-02 23:53:38.000000000 +0100
+++ gcc/testsuite/gcc.dg/vect/no-vfa-vect-depend-2.c	2017-11-03 17:24:09.471993983 +0000
@@ -51,4 +51,7 @@ int main (void)
 }
 
 /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" {xfail { vect_no_align && { ! vect_hw_misalign } } } } } */
-/* { dg-final { scan-tree-dump-times "dependence distance negative" 1 "vect"  } } */
+/* Requires reverse for variable-length SVE, which is implemented for
+   by a later patch.  Until then we report it twice, once for SVE and
+   once for 128-bit Advanced SIMD.  */
+/* { dg-final { scan-tree-dump-times "dependence distance negative" 1 "vect" { xfail { aarch64_sve && vect_variable_length } } } } */
Index: gcc/testsuite/gcc.dg/vect/no-vfa-vect-depend-3.c
===================================================================
--- gcc/testsuite/gcc.dg/vect/no-vfa-vect-depend-3.c	2015-06-02 23:53:38.000000000 +0100
+++ gcc/testsuite/gcc.dg/vect/no-vfa-vect-depend-3.c	2017-11-03 17:24:09.472993942 +0000
@@ -183,4 +183,7 @@ int main ()
 }
 
 /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 4 "vect" {xfail { vect_no_align && { ! vect_hw_misalign } } } } } */
-/* { dg-final { scan-tree-dump-times "dependence distance negative" 4 "vect"  } } */
+/* f4 requires reverse for SVE, which is implemented by a later patch.
+   Until then we report it twice, once for SVE and once for 128-bit
+   Advanced SIMD.  */
+/* { dg-final { scan-tree-dump-times "dependence distance negative" 4 "vect" { xfail { aarch64_sve && vect_variable_length } } } } */
Index: gcc/testsuite/gcc.dg/vect/slp-23.c
===================================================================
--- gcc/testsuite/gcc.dg/vect/slp-23.c	2017-11-03 17:21:09.742095692 +0000
+++ gcc/testsuite/gcc.dg/vect/slp-23.c	2017-11-03 17:24:09.472993942 +0000
@@ -107,6 +107,8 @@ int main (void)
 
 /* { dg-final { scan-tree-dump-times "vectorized 2 loops" 1 "vect" { target { vect_strided8 && { ! { vect_no_align} } } } } } */
 /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { target { ! { vect_strided8 || vect_no_align } } } } } */
-/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 1 "vect" { target { ! vect_perm } } } } */
+/* We fail to vectorize the second loop with variable-length SVE but
+   fall back to 128-bit vectors, which does use SLP.  */
+/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 1 "vect" { target { ! vect_perm } xfail aarch64_sve } } } */
 /* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 2 "vect" { target vect_perm } } } */
   
Index: gcc/testsuite/gcc.dg/vect/slp-25.c
===================================================================
--- gcc/testsuite/gcc.dg/vect/slp-25.c	2017-11-03 17:21:09.814092784 +0000
+++ gcc/testsuite/gcc.dg/vect/slp-25.c	2017-11-03 17:24:09.472993942 +0000
@@ -57,4 +57,6 @@ int main (void)
 
 /* { dg-final { scan-tree-dump-times "vectorized 2 loops" 1 "vect"  } } */
 /* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 0 "vect" } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 2 "vect" { xfail { { ! vect_unaligned_possible } || { ! vect_natural_alignment } } } } } */
+/* Needs store_lanes for SVE, otherwise falls back to Advanced SIMD.
+   Will be fixed when SVE LOAD_LANES support is added.  */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 2 "vect" { xfail { { { ! vect_unaligned_possible } || { ! vect_natural_alignment } } && { ! { aarch64_sve && vect_variable_length } } } } } } */
Index: gcc/testsuite/gcc.dg/vect/slp-perm-5.c
===================================================================
--- gcc/testsuite/gcc.dg/vect/slp-perm-5.c	2017-11-03 17:21:09.792093672 +0000
+++ gcc/testsuite/gcc.dg/vect/slp-perm-5.c	2017-11-03 17:24:09.472993942 +0000
@@ -104,7 +104,9 @@ int main (int argc, const char* argv[])
 }
 
 /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect"  { target vect_perm } } } */
-/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 2 "vect" { target { vect_perm3_int && {! vect_load_lanes } } } } } */
+/* Fails for variable-length SVE because we fall back to Advanced SIMD
+   and use LD3/ST3.  Will be fixed when SVE LOAD_LANES support is added.  */
+/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 2 "vect" { target { vect_perm3_int && {! vect_load_lanes } } xfail { aarch64_sve && vect_variable_length } } } } */
 /* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 0 "vect" { target vect_load_lanes } } } */
 /* { dg-final { scan-tree-dump "note: Built SLP cancelled: can use load/store-lanes" "vect" { target { vect_perm3_int && vect_load_lanes } } } } */
 /* { dg-final { scan-tree-dump "LOAD_LANES" "vect" { target vect_load_lanes } } } */
Index: gcc/testsuite/gcc.dg/vect/slp-perm-6.c
===================================================================
--- gcc/testsuite/gcc.dg/vect/slp-perm-6.c	2017-11-03 17:21:09.792093672 +0000
+++ gcc/testsuite/gcc.dg/vect/slp-perm-6.c	2017-11-03 17:24:09.472993942 +0000
@@ -103,7 +103,9 @@ int main (int argc, const char* argv[])
 }
 
 /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect"  { target vect_perm } } } */
-/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 2 "vect" { target { vect_perm3_int && {! vect_load_lanes } } } } } */
+/* Fails for variable-length SVE because we fall back to Advanced SIMD
+   and use LD3/ST3.  Will be fixed when SVE LOAD_LANES support is added.  */
+/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 2 "vect" { target { vect_perm3_int && {! vect_load_lanes } } xfail { aarch64_sve && vect_variable_length } } } } */
 /* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 1 "vect" { target vect_load_lanes } } } */
 /* { dg-final { scan-tree-dump "note: Built SLP cancelled: can use load/store-lanes" "vect" { target { vect_perm3_int && vect_load_lanes } } } } */
 /* { dg-final { scan-tree-dump "LOAD_LANES" "vect" { target vect_load_lanes } } } */
Index: gcc/testsuite/gcc.dg/vect/slp-perm-9.c
===================================================================
--- gcc/testsuite/gcc.dg/vect/slp-perm-9.c	2017-11-03 17:21:09.792093672 +0000
+++ gcc/testsuite/gcc.dg/vect/slp-perm-9.c	2017-11-03 17:24:09.472993942 +0000
@@ -57,10 +57,11 @@ int main (int argc, const char* argv[])
   return 0;
 }
 
-/* { dg-final { scan-tree-dump-times "vectorized 0 loops" 2 "vect" { target { ! { vect_perm_short || vect_load_lanes } } } } } */
+/* Fails for variable-length SVE because we fall back to Advanced SIMD
+   and use LD3/ST3.  Will be fixed when SVE LOAD_LANES support is added.  */
+/* { dg-final { scan-tree-dump-times "vectorized 0 loops" 2 "vect" { target { ! { vect_perm_short || vect_load_lanes } } xfail { aarch64_sve && vect_variable_length } } } } */
 /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { target { vect_perm_short || vect_load_lanes } } } } */
 /* { dg-final { scan-tree-dump-times "permutation requires at least three vectors" 1 "vect" { target { vect_perm_short && { ! vect_perm3_short } } } } } */
 /* { dg-final { scan-tree-dump-not "permutation requires at least three vectors" "vect" { target vect_perm3_short } } } */
 /* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 0 "vect" { target { { ! vect_perm3_short } || vect_load_lanes } } } } */
 /* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 1 "vect" { target { vect_perm3_short && { ! vect_load_lanes } } } } } */
-
Index: gcc/testsuite/gcc.dg/vect/slp-reduc-3.c
===================================================================
--- gcc/testsuite/gcc.dg/vect/slp-reduc-3.c	2015-06-02 23:53:38.000000000 +0100
+++ gcc/testsuite/gcc.dg/vect/slp-reduc-3.c	2017-11-03 17:24:09.472993942 +0000
@@ -58,4 +58,7 @@ int main (void)
 /* The initialization loop in main also gets vectorized.  */
 /* { dg-final { scan-tree-dump-times "vect_recog_dot_prod_pattern: detected" 1 "vect" { xfail *-*-* } } } */
 /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 2 "vect" { target { vect_short_mult && { vect_widen_sum_hi_to_si  && vect_unpack } } } } } */ 
-/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 1 "vect" { xfail { vect_widen_sum_hi_to_si_pattern ||  { ! vect_unpack } } } } } */
+/* We can't yet create the necessary SLP constant vector for variable-length
+   SVE and so fall back to Advanced SIMD.  This means that we repeat each
+   analysis note.  */
+/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 1 "vect" { xfail { vect_widen_sum_hi_to_si_pattern || { { ! vect_unpack } || { aarch64_sve && vect_variable_length } } } } } } */
Index: gcc/testsuite/gcc.dg/vect/vect-114.c
===================================================================
--- gcc/testsuite/gcc.dg/vect/vect-114.c	2015-06-02 23:53:35.000000000 +0100
+++ gcc/testsuite/gcc.dg/vect/vect-114.c	2017-11-03 17:24:09.473993900 +0000
@@ -34,6 +34,9 @@ int main (void)
   return main1 ();
 }
 
-/* { dg-final { scan-tree-dump-times "vectorized 0 loops" 1 "vect" { target { ! { vect_perm  } } } } } */
+/* Requires reverse for SVE, which is implemented by a later patch.
+   Until then we fall back to Advanced SIMD and successfully vectorize
+   the loop.  */
+/* { dg-final { scan-tree-dump-times "vectorized 0 loops" 1 "vect" { target { ! vect_perm } xfail { aarch64_sve && vect_variable_length } } } } */
 /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { target vect_perm } } } */
 
Index: gcc/testsuite/gcc.dg/vect/vect-119.c
===================================================================
--- gcc/testsuite/gcc.dg/vect/vect-119.c	2015-08-05 22:28:34.000000000 +0100
+++ gcc/testsuite/gcc.dg/vect/vect-119.c	2017-11-03 17:24:09.473993900 +0000
@@ -25,4 +25,7 @@ unsigned int foo (const unsigned int x[O
   return sum;
 }
 
-/* { dg-final { scan-tree-dump-times "Detected interleaving load of size 2" 1 "vect" } } */
+/* Requires load-lanes for SVE, which is implemented by a later patch.
+   Until then we report it twice, once for SVE and once for 128-bit
+   Advanced SIMD.  */
+/* { dg-final { scan-tree-dump-times "Detected interleaving load of size 2" 1 "vect" { xfail { aarch64_sve && vect_variable_length } } } } */
Index: gcc/testsuite/gcc.dg/vect/vect-cselim-1.c
===================================================================
--- gcc/testsuite/gcc.dg/vect/vect-cselim-1.c	2017-11-03 17:21:09.849091370 +0000
+++ gcc/testsuite/gcc.dg/vect/vect-cselim-1.c	2017-11-03 17:24:09.473993900 +0000
@@ -83,4 +83,6 @@ main (void)
 }
 
 /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { target { ! vect_masked_store } xfail { { vect_no_align && { ! vect_hw_misalign } } || { ! vect_strided2 } } } } } */
-/* { dg-final { scan-tree-dump-times "vectorized 2 loops" 1 "vect" { target { vect_masked_store } } } } */
+/* Fails for variable-length SVE because we can't yet handle the
+   interleaved load.  This is fixed by a later patch.  */
+/* { dg-final { scan-tree-dump-times "vectorized 2 loops" 1 "vect" { target vect_masked_store xfail { aarch64_sve && vect_variable_length } } } } */
Index: gcc/testsuite/gcc.dg/vect/vect-live-slp-1.c
===================================================================
--- gcc/testsuite/gcc.dg/vect/vect-live-slp-1.c	2016-11-22 21:16:10.000000000 +0000
+++ gcc/testsuite/gcc.dg/vect/vect-live-slp-1.c	2017-11-03 17:24:09.473993900 +0000
@@ -69,4 +69,7 @@ main (void)
 
 /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 4 "vect" } } */
 /* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 4 "vect" } } */
-/* { dg-final { scan-tree-dump-times "vec_stmt_relevant_p: stmt live but not relevant" 4 "vect" } } */
+/* We can't yet create the necessary SLP constant vector for variable-length
+   SVE and so fall back to Advanced SIMD.  This means that we repeat each
+   analysis note.  */
+/* { dg-final { scan-tree-dump-times "vec_stmt_relevant_p: stmt live but not relevant" 4 "vect" { xfail { aarch64_sve && vect_variable_length } } } }*/
Index: gcc/testsuite/gcc.dg/vect/vect-live-slp-2.c
===================================================================
--- gcc/testsuite/gcc.dg/vect/vect-live-slp-2.c	2016-11-22 21:16:10.000000000 +0000
+++ gcc/testsuite/gcc.dg/vect/vect-live-slp-2.c	2017-11-03 17:24:09.473993900 +0000
@@ -63,4 +63,7 @@ main (void)
 
 /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 2 "vect" } } */
 /* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 2 "vect" } } */
-/* { dg-final { scan-tree-dump-times "vec_stmt_relevant_p: stmt live but not relevant" 2 "vect" } } */
+/* We can't yet create the necessary SLP constant vector for variable-length
+   SVE and so fall back to Advanced SIMD.  This means that we repeat each
+   analysis note.  */
+/* { dg-final { scan-tree-dump-times "vec_stmt_relevant_p: stmt live but not relevant" 2 "vect" { xfail { aarch64_sve && vect_variable_length } } } } */
Index: gcc/testsuite/gcc.dg/vect/vect-live-slp-3.c
===================================================================
--- gcc/testsuite/gcc.dg/vect/vect-live-slp-3.c	2016-11-22 21:16:10.000000000 +0000
+++ gcc/testsuite/gcc.dg/vect/vect-live-slp-3.c	2017-11-03 17:24:09.473993900 +0000
@@ -70,4 +70,7 @@ main (void)
 
 /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 4 "vect" } } */
 /* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 4 "vect" } } */
-/* { dg-final { scan-tree-dump-times "vec_stmt_relevant_p: stmt live but not relevant" 4 "vect" } } */
+/* We can't yet create the necessary SLP constant vector for variable-length
+   SVE and so fall back to Advanced SIMD.  This means that we repeat each
+   analysis note.  */
+/* { dg-final { scan-tree-dump-times "vec_stmt_relevant_p: stmt live but not relevant" 4 "vect" { xfail { aarch64_sve && vect_variable_length } } } } */
Index: gcc/testsuite/gcc.dg/vect/vect-mult-const-pattern-1.c
===================================================================
--- gcc/testsuite/gcc.dg/vect/vect-mult-const-pattern-1.c	2016-11-22 21:16:10.000000000 +0000
+++ gcc/testsuite/gcc.dg/vect/vect-mult-const-pattern-1.c	2017-11-03 17:24:09.473993900 +0000
@@ -37,5 +37,5 @@ main (void)
   return 0;
 }
 
-/* { dg-final { scan-tree-dump-times "vect_recog_mult_pattern: detected" 2 "vect"  { target aarch64*-*-* } } } */
+/* { dg-final { scan-tree-dump-times "vect_recog_mult_pattern: detected" 2 "vect" { target aarch64*-*-* xfail aarch64_sve } } } */
 /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect"  { target aarch64*-*-* } } } */
Index: gcc/testsuite/gcc.dg/vect/vect-mult-const-pattern-2.c
===================================================================
--- gcc/testsuite/gcc.dg/vect/vect-mult-const-pattern-2.c	2016-11-22 21:16:10.000000000 +0000
+++ gcc/testsuite/gcc.dg/vect/vect-mult-const-pattern-2.c	2017-11-03 17:24:09.473993900 +0000
@@ -36,5 +36,5 @@ main (void)
   return 0;
 }
 
-/* { dg-final { scan-tree-dump-times "vect_recog_mult_pattern: detected" 2 "vect"  { target aarch64*-*-* } } } */
+/* { dg-final { scan-tree-dump-times "vect_recog_mult_pattern: detected" 2 "vect"  { target aarch64*-*-* xfail aarch64_sve } } } */
 /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect"  { target aarch64*-*-* } } } */
Index: gcc/testsuite/gcc.dg/vect/vect-over-widen-1-big-array.c
===================================================================
--- gcc/testsuite/gcc.dg/vect/vect-over-widen-1-big-array.c	2016-01-13 13:48:42.000000000 +0000
+++ gcc/testsuite/gcc.dg/vect/vect-over-widen-1-big-array.c	2017-11-03 17:24:09.473993900 +0000
@@ -59,6 +59,8 @@ int main (void)
 
 /* { dg-final { scan-tree-dump-times "vect_recog_widen_shift_pattern: detected" 2 "vect" { target vect_widen_shift } } } */
 /* { dg-final { scan-tree-dump-times "vect_recog_over_widening_pattern: detected" 2 "vect" { target vect_widen_shift } } } */
-/* { dg-final { scan-tree-dump-times "vect_recog_over_widening_pattern: detected" 4 "vect" { target { ! vect_widen_shift } } } } */
+/* Requires LD4 for variable-length SVE.  Until that's supported we fall
+   back to Advanced SIMD, which does have widening shifts.  */
+/* { dg-final { scan-tree-dump-times "vect_recog_over_widening_pattern: detected" 4 "vect" { target { ! vect_widen_shift } xfail { aarch64_sve && vect_variable_length } } } } */
 /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" } } */
 
Index: gcc/testsuite/gcc.dg/vect/vect-over-widen-1.c
===================================================================
--- gcc/testsuite/gcc.dg/vect/vect-over-widen-1.c	2017-11-03 17:21:09.762094884 +0000
+++ gcc/testsuite/gcc.dg/vect/vect-over-widen-1.c	2017-11-03 17:24:09.474993858 +0000
@@ -63,7 +63,9 @@ int main (void)
 
 /* { dg-final { scan-tree-dump-times "vect_recog_widen_shift_pattern: detected" 2 "vect" { target vect_widen_shift } } } */
 /* { dg-final { scan-tree-dump-times "vect_recog_over_widening_pattern: detected" 2 "vect" { target vect_widen_shift } } } */
-/* { dg-final { scan-tree-dump-times "vect_recog_over_widening_pattern: detected" 4 "vect" { target { { ! vect_sizes_32B_16B } && { ! vect_widen_shift } } } } } */
+/* Requires LD4 for variable-length SVE.  Until that's supported we fall
+   back to Advanced SIMD, which does have widening shifts.  */
+/* { dg-final { scan-tree-dump-times "vect_recog_over_widening_pattern: detected" 4 "vect" { target { { ! vect_sizes_32B_16B } && { ! vect_widen_shift } } xfail { aarch64_sve && vect_variable_length } } } } */
 /* { dg-final { scan-tree-dump-times "vect_recog_over_widening_pattern: detected" 8 "vect" { target vect_sizes_32B_16B } } } */
 /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" } } */
 
Index: gcc/testsuite/gcc.dg/vect/vect-over-widen-3-big-array.c
===================================================================
--- gcc/testsuite/gcc.dg/vect/vect-over-widen-3-big-array.c	2017-11-03 17:21:09.763094844 +0000
+++ gcc/testsuite/gcc.dg/vect/vect-over-widen-3-big-array.c	2017-11-03 17:24:09.474993858 +0000
@@ -59,7 +59,9 @@ int main (void)
   return 0;
 }
 
-/* { dg-final { scan-tree-dump-times "vect_recog_over_widening_pattern: detected" 2 "vect" { target { ! vect_widen_shift } } } } */
+/* Requires LD4 for variable-length SVE.  Until that's supported we fall
+   back to Advanced SIMD, which does have widening shifts.  */
+/* { dg-final { scan-tree-dump-times "vect_recog_over_widening_pattern: detected" 2 "vect" { target { ! vect_widen_shift } xfail { aarch64_sve && vect_variable_length } } } } */
 /* { dg-final { scan-tree-dump-times "vect_recog_over_widening_pattern: detected" 1 "vect" { target vect_widen_shift } } } */
 /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" } } */
 
Index: gcc/testsuite/gcc.dg/vect/vect-over-widen-4-big-array.c
===================================================================
--- gcc/testsuite/gcc.dg/vect/vect-over-widen-4-big-array.c	2016-01-13 13:48:42.000000000 +0000
+++ gcc/testsuite/gcc.dg/vect/vect-over-widen-4-big-array.c	2017-11-03 17:24:09.474993858 +0000
@@ -63,6 +63,8 @@ int main (void)
 
 /* { dg-final { scan-tree-dump-times "vect_recog_widen_shift_pattern: detected" 2 "vect" { target vect_widen_shift } } } */
 /* { dg-final { scan-tree-dump-times "vect_recog_over_widening_pattern: detected" 2 "vect" { target vect_widen_shift } } } */
-/* { dg-final { scan-tree-dump-times "vect_recog_over_widening_pattern: detected" 4 "vect" { target { ! vect_widen_shift } } } } */
+/* Requires LD4 for variable-length SVE.  Until that's supported we fall
+   back to Advanced SIMD, which does have widening shifts.  */
+/* { dg-final { scan-tree-dump-times "vect_recog_over_widening_pattern: detected" 4 "vect" { target { ! vect_widen_shift } xfail { aarch64_sve && vect_variable_length } } } } */
 /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" } } */
 
Index: gcc/testsuite/gcc.dg/vect/vect-over-widen-4.c
===================================================================
--- gcc/testsuite/gcc.dg/vect/vect-over-widen-4.c	2017-11-03 17:21:09.763094844 +0000
+++ gcc/testsuite/gcc.dg/vect/vect-over-widen-4.c	2017-11-03 17:24:09.474993858 +0000
@@ -67,7 +67,9 @@ int main (void)
 
 /* { dg-final { scan-tree-dump-times "vect_recog_widen_shift_pattern: detected" 2 "vect" { target vect_widen_shift } } } */
 /* { dg-final { scan-tree-dump-times "vect_recog_over_widening_pattern: detected" 2 "vect" { target vect_widen_shift } } } */
-/* { dg-final { scan-tree-dump-times "vect_recog_over_widening_pattern: detected" 4 "vect" { target { { ! vect_sizes_32B_16B } && { ! vect_widen_shift } } } } } */
+/* Requires LD4 for variable-length SVE.  Until that's supported we fall
+   back to Advanced SIMD, which does have widening shifts.  */
+/* { dg-final { scan-tree-dump-times "vect_recog_over_widening_pattern: detected" 4 "vect" { target { { ! vect_sizes_32B_16B } && { ! vect_widen_shift } } xfail { aarch64_sve && vect_variable_length } } } } */
 /* { dg-final { scan-tree-dump-times "vect_recog_over_widening_pattern: detected" 8 "vect" { target vect_sizes_32B_16B } } } */
 /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" } } */
 

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [2/4] [AArch64] Testsuite markup for SVE
  2017-11-03 17:50 ` [2/4] [AArch64] Testsuite markup for SVE Richard Sandiford
@ 2018-01-06 17:58   ` James Greenhalgh
  0 siblings, 0 replies; 18+ messages in thread
From: James Greenhalgh @ 2018-01-06 17:58 UTC (permalink / raw)
  To: gcc-patches, richard.earnshaw, marcus.shawcroft, richard.sandiford; +Cc: nd

On Fri, Nov 03, 2017 at 05:49:56PM +0000, Richard Sandiford wrote:
> This patch adds new target selectors for SVE and updates existing
> selectors accordingly.  It also XFAILs some tests that don't yet
> work for some SVE modes; most of these go away with follow-on
> vectorisation enhancements.

OK.

Thanks,
James

^ permalink raw reply	[flat|nested] 18+ messages in thread

* [3/4] [AArch64] SVE tests
  2017-11-03 17:45 [0/4] [AArch64] Add SVE support Richard Sandiford
  2017-11-03 17:48 ` [1/4] [AArch64] SVE backend support Richard Sandiford
  2017-11-03 17:50 ` [2/4] [AArch64] Testsuite markup for SVE Richard Sandiford
@ 2017-11-03 17:51 ` Richard Sandiford
  2018-01-06 18:06   ` James Greenhalgh
  2017-11-03 17:52 ` [4/4] SVE unwinding Richard Sandiford
  2017-11-24 16:34 ` [0/4] [AArch64] Add SVE support Richard Sandiford
  4 siblings, 1 reply; 18+ messages in thread
From: Richard Sandiford @ 2017-11-03 17:51 UTC (permalink / raw)
  To: gcc-patches; +Cc: richard.earnshaw, james.greenhalgh, marcus.shawcroft

[-- Attachment #1: Type: text/plain, Size: 8897 bytes --]

This patch adds gcc.target/aarch64 tests for SVE, and forces some
existing Advanced SIMD tests to use -march=armv8-a.


2017-11-03  Richard Sandiford  <richard.sandiford@linaro.org>
	    Alan Hayward  <alan.hayward@arm.com>
	    David Sherwood  <david.sherwood@arm.com>

gcc/testsuite/
	* gcc.target/aarch64/bic_imm_1.c: Force -march=armv8-a.
	* gcc.target/aarch64/fmaxmin.c: Likewise.
	* gcc.target/aarch64/fmul_fcvt_2.c: Likewise.
	* gcc.target/aarch64/orr_imm_1.c: Likewise.
	* gcc.target/aarch64/pr62178.c: Likewise.
	* gcc.target/aarch64/pr71727-2.c: Likewise.
	* gcc.target/aarch64/saddw-1.c: Likewise.
	* gcc.target/aarch64/saddw-2.c: Likewise.
	* gcc.target/aarch64/uaddw-1.c: Likewise.
	* gcc.target/aarch64/uaddw-2.c: Likewise.
	* gcc.target/aarch64/uaddw-3.c: Likewise.
	* gcc.target/aarch64/vect-add-sub-cond.c: Likewise.
	* gcc.target/aarch64/vect-compile.c: Likewise.
	* gcc.target/aarch64/vect-faddv-compile.c: Likewise.
	* gcc.target/aarch64/vect-fcm-eq-d.c: Likewise.
	* gcc.target/aarch64/vect-fcm-eq-f.c: Likewise.
	* gcc.target/aarch64/vect-fcm-ge-d.c: Likewise.
	* gcc.target/aarch64/vect-fcm-ge-f.c: Likewise.
	* gcc.target/aarch64/vect-fcm-gt-d.c: Likewise.
	* gcc.target/aarch64/vect-fcm-gt-f.c: Likewise.
	* gcc.target/aarch64/vect-fmax-fmin-compile.c: Likewise.
	* gcc.target/aarch64/vect-fmaxv-fminv-compile.c: Likewise.
	* gcc.target/aarch64/vect-fmovd-zero.c: Likewise.
	* gcc.target/aarch64/vect-fmovd.c: Likewise.
	* gcc.target/aarch64/vect-fmovf-zero.c: Likewise.
	* gcc.target/aarch64/vect-fmovf.c: Likewise.
	* gcc.target/aarch64/vect-fp-compile.c: Likewise.
	* gcc.target/aarch64/vect-ld1r-compile-fp.c: Likewise.
	* gcc.target/aarch64/vect-ld1r-compile.c: Likewise.
	* gcc.target/aarch64/vect-movi.c: Likewise.
	* gcc.target/aarch64/vect-mull-compile.c: Likewise.
	* gcc.target/aarch64/vect-reduc-or_1.c: Likewise.
	* gcc.target/aarch64/vect-vaddv.c: Likewise.
	* gcc.target/aarch64/vect_saddl_1.c: Likewise.
	* gcc.target/aarch64/vect_smlal_1.c: Likewise.
	* gcc.target/aarch64/vector_initialization_nostack.c: XFAIL for
	fixed-length SVE.
	* gcc.target/aarch64/sve_arith_1.c: New test.
	* gcc.target/aarch64/sve_const_pred_1.C: Likewise.
	* gcc.target/aarch64/sve_const_pred_2.C: Likewise.
	* gcc.target/aarch64/sve_const_pred_3.C: Likewise.
	* gcc.target/aarch64/sve_const_pred_4.C: Likewise.
	* gcc.target/aarch64/sve_cvtf_signed_1.c: Likewise.
	* gcc.target/aarch64/sve_cvtf_signed_1_run.c: Likewise.
	* gcc.target/aarch64/sve_cvtf_unsigned_1.c: Likewise.
	* gcc.target/aarch64/sve_cvtf_unsigned_1_run.c: Likewise.
	* gcc.target/aarch64/sve_dup_imm_1.c: Likewise.
	* gcc.target/aarch64/sve_dup_imm_1_run.c: Likewise.
	* gcc.target/aarch64/sve_dup_lane_1.c: Likewise.
	* gcc.target/aarch64/sve_ext_1.c: Likewise.
	* gcc.target/aarch64/sve_ext_2.c: Likewise.
	* gcc.target/aarch64/sve_extract_1.c: Likewise.
	* gcc.target/aarch64/sve_extract_2.c: Likewise.
	* gcc.target/aarch64/sve_extract_3.c: Likewise.
	* gcc.target/aarch64/sve_extract_4.c: Likewise.
	* gcc.target/aarch64/sve_fabs_1.c: Likewise.
	* gcc.target/aarch64/sve_fcvtz_signed_1.c: Likewise.
	* gcc.target/aarch64/sve_fcvtz_signed_1_run.c: Likewise.
	* gcc.target/aarch64/sve_fcvtz_unsigned_1.c: Likewise.
	* gcc.target/aarch64/sve_fcvtz_unsigned_1_run.c: Likewise.
	* gcc.target/aarch64/sve_fdiv_1.c: Likewise.
	* gcc.target/aarch64/sve_fdup_1.c: Likewise.
	* gcc.target/aarch64/sve_fdup_1_run.c: Likewise.
	* gcc.target/aarch64/sve_fmad_1.c: Likewise.
	* gcc.target/aarch64/sve_fmla_1.c: Likewise.
	* gcc.target/aarch64/sve_fmls_1.c: Likewise.
	* gcc.target/aarch64/sve_fmsb_1.c: Likewise.
	* gcc.target/aarch64/sve_fmul_1.c: Likewise.
	* gcc.target/aarch64/sve_fneg_1.c: Likewise.
	* gcc.target/aarch64/sve_fnmad_1.c: Likewise.
	* gcc.target/aarch64/sve_fnmla_1.c: Likewise.
	* gcc.target/aarch64/sve_fnmls_1.c: Likewise.
	* gcc.target/aarch64/sve_fnmsb_1.c: Likewise.
	* gcc.target/aarch64/sve_fp_arith_1.c: Likewise.
	* gcc.target/aarch64/sve_frinta_1.c: Likewise.
	* gcc.target/aarch64/sve_frinti_1.c: Likewise.
	* gcc.target/aarch64/sve_frintm_1.c: Likewise.
	* gcc.target/aarch64/sve_frintp_1.c: Likewise.
	* gcc.target/aarch64/sve_frintx_1.c: Likewise.
	* gcc.target/aarch64/sve_frintz_1.c: Likewise.
	* gcc.target/aarch64/sve_fsqrt_1.c: Likewise.
	* gcc.target/aarch64/sve_fsubr_1.c: Likewise.
	* gcc.target/aarch64/sve_index_1.c: Likewise.
	* gcc.target/aarch64/sve_index_1_run.c: Likewise.
	* gcc.target/aarch64/sve_ld1r_1.c: Likewise.
	* gcc.target/aarch64/sve_load_const_offset_1.c: Likewise.
	* gcc.target/aarch64/sve_load_scalar_offset_1.c: Likewise.
	* gcc.target/aarch64/sve_logical_1.c: Likewise.
	* gcc.target/aarch64/sve_loop_add_1.c: Likewise.
	* gcc.target/aarch64/sve_loop_add_1_run.c: Likewise.
	* gcc.target/aarch64/sve_mad_1.c: Likewise.
	* gcc.target/aarch64/sve_maxmin_1.c: Likewise.
	* gcc.target/aarch64/sve_maxmin_1_run.c: Likewise.
	* gcc.target/aarch64/sve_maxmin_strict_1.c: Likewise.
	* gcc.target/aarch64/sve_maxmin_strict_1_run.c: Likewise.
	* gcc.target/aarch64/sve_mla_1.c: Likewise.
	* gcc.target/aarch64/sve_mls_1.c: Likewise.
	* gcc.target/aarch64/sve_mov_rr_1.c: Likewise.
	* gcc.target/aarch64/sve_msb_1.c: Likewise.
	* gcc.target/aarch64/sve_mul_1.c: Likewise.
	* gcc.target/aarch64/sve_neg_1.c: Likewise.
	* gcc.target/aarch64/sve_nlogical_1.c: Likewise.
	* gcc.target/aarch64/sve_nlogical_1_run.c: Likewise.
	* gcc.target/aarch64/sve_pack_1.c: Likewise.
	* gcc.target/aarch64/sve_pack_1_run.c: Likewise.
	* gcc.target/aarch64/sve_pack_fcvt_signed_1.c: Likewise.
	* gcc.target/aarch64/sve_pack_fcvt_signed_1_run.c: Likewise.
	* gcc.target/aarch64/sve_pack_fcvt_unsigned_1.c: Likewise.
	* gcc.target/aarch64/sve_pack_fcvt_unsigned_1_run.c: Likewise.
	* gcc.target/aarch64/sve_pack_float_1.c: Likewise.
	* gcc.target/aarch64/sve_pack_float_1_run.c: Likewise.
	* gcc.target/aarch64/sve_popcount_1.c: Likewise.
	* gcc.target/aarch64/sve_popcount_1_run.c: Likewise.
	* gcc.target/aarch64/sve_reduc_1.c: Likewise.
	* gcc.target/aarch64/sve_reduc_1_run.c: Likewise.
	* gcc.target/aarch64/sve_reduc_2.c: Likewise.
	* gcc.target/aarch64/sve_reduc_2_run.c: Likewise.
	* gcc.target/aarch64/sve_reduc_3.c: Likewise.
	* gcc.target/aarch64/sve_revb_1.c: Likewise.
	* gcc.target/aarch64/sve_revh_1.c: Likewise.
	* gcc.target/aarch64/sve_revw_1.c: Likewise.
	* gcc.target/aarch64/sve_shift_1.c: Likewise.
	* gcc.target/aarch64/sve_single_1.c: Likewise.
	* gcc.target/aarch64/sve_single_2.c: Likewise.
	* gcc.target/aarch64/sve_single_3.c: Likewise.
	* gcc.target/aarch64/sve_single_4.c: Likewise.
	* gcc.target/aarch64/sve_store_scalar_offset_1.c: Likewise.
	* gcc.target/aarch64/sve_subr_1.c: Likewise.
	* gcc.target/aarch64/sve_trn1_1.c: Likewise.
	* gcc.target/aarch64/sve_trn2_1.c: Likewise.
	* gcc.target/aarch64/sve_unpack_fcvt_signed_1.c: Likewise.
	* gcc.target/aarch64/sve_unpack_fcvt_signed_1_run.c: Likewise.
	* gcc.target/aarch64/sve_unpack_fcvt_unsigned_1.c: Likewise.
	* gcc.target/aarch64/sve_unpack_fcvt_unsigned_1_run.c: Likewise.
	* gcc.target/aarch64/sve_unpack_float_1.c: Likewise.
	* gcc.target/aarch64/sve_unpack_float_1_run.c: Likewise.
	* gcc.target/aarch64/sve_unpack_signed_1.c: Likewise.
	* gcc.target/aarch64/sve_unpack_signed_1_run.c: Likewise.
	* gcc.target/aarch64/sve_unpack_unsigned_1.c: Likewise.
	* gcc.target/aarch64/sve_unpack_unsigned_1_run.c: Likewise.
	* gcc.target/aarch64/sve_uzp1_1.c: Likewise.
	* gcc.target/aarch64/sve_uzp1_1_run.c: Likewise.
	* gcc.target/aarch64/sve_uzp2_1.c: Likewise.
	* gcc.target/aarch64/sve_uzp2_1_run.c: Likewise.
	* gcc.target/aarch64/sve_vcond_1.C: Likewise.
	* gcc.target/aarch64/sve_vcond_1_run.C: Likewise.
	* gcc.target/aarch64/sve_vcond_2.c: Likewise.
	* gcc.target/aarch64/sve_vcond_2_run.c: Likewise.
	* gcc.target/aarch64/sve_vcond_3.c: Likewise.
	* gcc.target/aarch64/sve_vcond_4.c: Likewise.
	* gcc.target/aarch64/sve_vcond_4_run.c: Likewise.
	* gcc.target/aarch64/sve_vcond_5.c: Likewise.
	* gcc.target/aarch64/sve_vcond_5_run.c: Likewise.
	* gcc.target/aarch64/sve_vcond_6.c: Likewise.
	* gcc.target/aarch64/sve_vcond_6_run.c: Likewise.
	* gcc.target/aarch64/sve_vec_init_1.c: Likewise.
	* gcc.target/aarch64/sve_vec_init_1_run.c: Likewise.
	* gcc.target/aarch64/sve_vec_init_2.c: Likewise.
	* gcc.target/aarch64/sve_vec_perm_1.c: Likewise.
	* gcc.target/aarch64/sve_vec_perm_1_run.c: Likewise.
	* gcc.target/aarch64/sve_vec_perm_1_overrange_run.c: Likewise.
	* gcc.target/aarch64/sve_vec_perm_const_1.c: Likewise.
	* gcc.target/aarch64/sve_vec_perm_const_1_overrun.c: Likewise.
	* gcc.target/aarch64/sve_vec_perm_const_1_run.c: Likewise.
	* gcc.target/aarch64/sve_vec_perm_const_single_1.c: Likewise.
	* gcc.target/aarch64/sve_vec_perm_const_single_1_run.c: Likewise.
	* gcc.target/aarch64/sve_vec_perm_single_1.c: Likewise.
	* gcc.target/aarch64/sve_vec_perm_single_1_run.c: Likewise.
	* gcc.target/aarch64/sve_zip1_1.c: Likewise.
	* gcc.target/aarch64/sve_zip2_1.c: Likewise.


[-- Attachment #2: sve-03-tests.diff.gz --]
[-- Type: application/gzip, Size: 31572 bytes --]

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [3/4] [AArch64] SVE tests
  2017-11-03 17:51 ` [3/4] [AArch64] SVE tests Richard Sandiford
@ 2018-01-06 18:06   ` James Greenhalgh
  2018-01-06 19:13     ` Richard Sandiford
  0 siblings, 1 reply; 18+ messages in thread
From: James Greenhalgh @ 2018-01-06 18:06 UTC (permalink / raw)
  To: gcc-patches, richard.earnshaw, marcus.shawcroft, richard.sandiford; +Cc: nd

On Fri, Nov 03, 2017 at 05:50:54PM +0000, Richard Sandiford wrote:
> This patch adds gcc.target/aarch64 tests for SVE, and forces some
> existing Advanced SIMD tests to use -march=armv8-a.

I'm going to assume that these new testcases are broadly sensible, and not
spend any significant time looking at them.

I'm not completely happy forcing the architecture to Armv8-a - it would be
useful for our testing coverage if users which have configured with other
architecture variants had this test execute in those environments. That
way we'd check we still do the right thing once we have an implicit
-march=armv8.2-a .

However, as we don't have a good way to make that happen (other than maybe
only forcing the arch if we are in a configuration wired for SVE?) I'm
happy with this patch as a compromise for now.

OK, but a modification to cover the above point would make me happier.

Thanks,
James

> 
> 
> 2017-11-03  Richard Sandiford  <richard.sandiford@linaro.org>
> 	    Alan Hayward  <alan.hayward@arm.com>
> 	    David Sherwood  <david.sherwood@arm.com>
> 
> gcc/testsuite/
> 	* gcc.target/aarch64/bic_imm_1.c: Force -march=armv8-a.
> 	* gcc.target/aarch64/fmaxmin.c: Likewise.
> 	* gcc.target/aarch64/fmul_fcvt_2.c: Likewise.
> 	* gcc.target/aarch64/orr_imm_1.c: Likewise.
> 	* gcc.target/aarch64/pr62178.c: Likewise.
> 	* gcc.target/aarch64/pr71727-2.c: Likewise.
> 	* gcc.target/aarch64/saddw-1.c: Likewise.
> 	* gcc.target/aarch64/saddw-2.c: Likewise.
> 	* gcc.target/aarch64/uaddw-1.c: Likewise.
> 	* gcc.target/aarch64/uaddw-2.c: Likewise.
> 	* gcc.target/aarch64/uaddw-3.c: Likewise.
> 	* gcc.target/aarch64/vect-add-sub-cond.c: Likewise.
> 	* gcc.target/aarch64/vect-compile.c: Likewise.
> 	* gcc.target/aarch64/vect-faddv-compile.c: Likewise.
> 	* gcc.target/aarch64/vect-fcm-eq-d.c: Likewise.
> 	* gcc.target/aarch64/vect-fcm-eq-f.c: Likewise.
> 	* gcc.target/aarch64/vect-fcm-ge-d.c: Likewise.
> 	* gcc.target/aarch64/vect-fcm-ge-f.c: Likewise.
> 	* gcc.target/aarch64/vect-fcm-gt-d.c: Likewise.
> 	* gcc.target/aarch64/vect-fcm-gt-f.c: Likewise.
> 	* gcc.target/aarch64/vect-fmax-fmin-compile.c: Likewise.
> 	* gcc.target/aarch64/vect-fmaxv-fminv-compile.c: Likewise.
> 	* gcc.target/aarch64/vect-fmovd-zero.c: Likewise.
> 	* gcc.target/aarch64/vect-fmovd.c: Likewise.
> 	* gcc.target/aarch64/vect-fmovf-zero.c: Likewise.
> 	* gcc.target/aarch64/vect-fmovf.c: Likewise.
> 	* gcc.target/aarch64/vect-fp-compile.c: Likewise.
> 	* gcc.target/aarch64/vect-ld1r-compile-fp.c: Likewise.
> 	* gcc.target/aarch64/vect-ld1r-compile.c: Likewise.
> 	* gcc.target/aarch64/vect-movi.c: Likewise.
> 	* gcc.target/aarch64/vect-mull-compile.c: Likewise.
> 	* gcc.target/aarch64/vect-reduc-or_1.c: Likewise.
> 	* gcc.target/aarch64/vect-vaddv.c: Likewise.
> 	* gcc.target/aarch64/vect_saddl_1.c: Likewise.
> 	* gcc.target/aarch64/vect_smlal_1.c: Likewise.
> 	* gcc.target/aarch64/vector_initialization_nostack.c: XFAIL for
> 	fixed-length SVE.
> 	* gcc.target/aarch64/sve_arith_1.c: New test.
> 	* gcc.target/aarch64/sve_const_pred_1.C: Likewise.
> 	* gcc.target/aarch64/sve_const_pred_2.C: Likewise.
> 	* gcc.target/aarch64/sve_const_pred_3.C: Likewise.
> 	* gcc.target/aarch64/sve_const_pred_4.C: Likewise.
> 	* gcc.target/aarch64/sve_cvtf_signed_1.c: Likewise.
> 	* gcc.target/aarch64/sve_cvtf_signed_1_run.c: Likewise.
> 	* gcc.target/aarch64/sve_cvtf_unsigned_1.c: Likewise.
> 	* gcc.target/aarch64/sve_cvtf_unsigned_1_run.c: Likewise.
> 	* gcc.target/aarch64/sve_dup_imm_1.c: Likewise.
> 	* gcc.target/aarch64/sve_dup_imm_1_run.c: Likewise.
> 	* gcc.target/aarch64/sve_dup_lane_1.c: Likewise.
> 	* gcc.target/aarch64/sve_ext_1.c: Likewise.
> 	* gcc.target/aarch64/sve_ext_2.c: Likewise.
> 	* gcc.target/aarch64/sve_extract_1.c: Likewise.
> 	* gcc.target/aarch64/sve_extract_2.c: Likewise.
> 	* gcc.target/aarch64/sve_extract_3.c: Likewise.
> 	* gcc.target/aarch64/sve_extract_4.c: Likewise.
> 	* gcc.target/aarch64/sve_fabs_1.c: Likewise.
> 	* gcc.target/aarch64/sve_fcvtz_signed_1.c: Likewise.
> 	* gcc.target/aarch64/sve_fcvtz_signed_1_run.c: Likewise.
> 	* gcc.target/aarch64/sve_fcvtz_unsigned_1.c: Likewise.
> 	* gcc.target/aarch64/sve_fcvtz_unsigned_1_run.c: Likewise.
> 	* gcc.target/aarch64/sve_fdiv_1.c: Likewise.
> 	* gcc.target/aarch64/sve_fdup_1.c: Likewise.
> 	* gcc.target/aarch64/sve_fdup_1_run.c: Likewise.
> 	* gcc.target/aarch64/sve_fmad_1.c: Likewise.
> 	* gcc.target/aarch64/sve_fmla_1.c: Likewise.
> 	* gcc.target/aarch64/sve_fmls_1.c: Likewise.
> 	* gcc.target/aarch64/sve_fmsb_1.c: Likewise.
> 	* gcc.target/aarch64/sve_fmul_1.c: Likewise.
> 	* gcc.target/aarch64/sve_fneg_1.c: Likewise.
> 	* gcc.target/aarch64/sve_fnmad_1.c: Likewise.
> 	* gcc.target/aarch64/sve_fnmla_1.c: Likewise.
> 	* gcc.target/aarch64/sve_fnmls_1.c: Likewise.
> 	* gcc.target/aarch64/sve_fnmsb_1.c: Likewise.
> 	* gcc.target/aarch64/sve_fp_arith_1.c: Likewise.
> 	* gcc.target/aarch64/sve_frinta_1.c: Likewise.
> 	* gcc.target/aarch64/sve_frinti_1.c: Likewise.
> 	* gcc.target/aarch64/sve_frintm_1.c: Likewise.
> 	* gcc.target/aarch64/sve_frintp_1.c: Likewise.
> 	* gcc.target/aarch64/sve_frintx_1.c: Likewise.
> 	* gcc.target/aarch64/sve_frintz_1.c: Likewise.
> 	* gcc.target/aarch64/sve_fsqrt_1.c: Likewise.
> 	* gcc.target/aarch64/sve_fsubr_1.c: Likewise.
> 	* gcc.target/aarch64/sve_index_1.c: Likewise.
> 	* gcc.target/aarch64/sve_index_1_run.c: Likewise.
> 	* gcc.target/aarch64/sve_ld1r_1.c: Likewise.
> 	* gcc.target/aarch64/sve_load_const_offset_1.c: Likewise.
> 	* gcc.target/aarch64/sve_load_scalar_offset_1.c: Likewise.
> 	* gcc.target/aarch64/sve_logical_1.c: Likewise.
> 	* gcc.target/aarch64/sve_loop_add_1.c: Likewise.
> 	* gcc.target/aarch64/sve_loop_add_1_run.c: Likewise.
> 	* gcc.target/aarch64/sve_mad_1.c: Likewise.
> 	* gcc.target/aarch64/sve_maxmin_1.c: Likewise.
> 	* gcc.target/aarch64/sve_maxmin_1_run.c: Likewise.
> 	* gcc.target/aarch64/sve_maxmin_strict_1.c: Likewise.
> 	* gcc.target/aarch64/sve_maxmin_strict_1_run.c: Likewise.
> 	* gcc.target/aarch64/sve_mla_1.c: Likewise.
> 	* gcc.target/aarch64/sve_mls_1.c: Likewise.
> 	* gcc.target/aarch64/sve_mov_rr_1.c: Likewise.
> 	* gcc.target/aarch64/sve_msb_1.c: Likewise.
> 	* gcc.target/aarch64/sve_mul_1.c: Likewise.
> 	* gcc.target/aarch64/sve_neg_1.c: Likewise.
> 	* gcc.target/aarch64/sve_nlogical_1.c: Likewise.
> 	* gcc.target/aarch64/sve_nlogical_1_run.c: Likewise.
> 	* gcc.target/aarch64/sve_pack_1.c: Likewise.
> 	* gcc.target/aarch64/sve_pack_1_run.c: Likewise.
> 	* gcc.target/aarch64/sve_pack_fcvt_signed_1.c: Likewise.
> 	* gcc.target/aarch64/sve_pack_fcvt_signed_1_run.c: Likewise.
> 	* gcc.target/aarch64/sve_pack_fcvt_unsigned_1.c: Likewise.
> 	* gcc.target/aarch64/sve_pack_fcvt_unsigned_1_run.c: Likewise.
> 	* gcc.target/aarch64/sve_pack_float_1.c: Likewise.
> 	* gcc.target/aarch64/sve_pack_float_1_run.c: Likewise.
> 	* gcc.target/aarch64/sve_popcount_1.c: Likewise.
> 	* gcc.target/aarch64/sve_popcount_1_run.c: Likewise.
> 	* gcc.target/aarch64/sve_reduc_1.c: Likewise.
> 	* gcc.target/aarch64/sve_reduc_1_run.c: Likewise.
> 	* gcc.target/aarch64/sve_reduc_2.c: Likewise.
> 	* gcc.target/aarch64/sve_reduc_2_run.c: Likewise.
> 	* gcc.target/aarch64/sve_reduc_3.c: Likewise.
> 	* gcc.target/aarch64/sve_revb_1.c: Likewise.
> 	* gcc.target/aarch64/sve_revh_1.c: Likewise.
> 	* gcc.target/aarch64/sve_revw_1.c: Likewise.
> 	* gcc.target/aarch64/sve_shift_1.c: Likewise.
> 	* gcc.target/aarch64/sve_single_1.c: Likewise.
> 	* gcc.target/aarch64/sve_single_2.c: Likewise.
> 	* gcc.target/aarch64/sve_single_3.c: Likewise.
> 	* gcc.target/aarch64/sve_single_4.c: Likewise.
> 	* gcc.target/aarch64/sve_store_scalar_offset_1.c: Likewise.
> 	* gcc.target/aarch64/sve_subr_1.c: Likewise.
> 	* gcc.target/aarch64/sve_trn1_1.c: Likewise.
> 	* gcc.target/aarch64/sve_trn2_1.c: Likewise.
> 	* gcc.target/aarch64/sve_unpack_fcvt_signed_1.c: Likewise.
> 	* gcc.target/aarch64/sve_unpack_fcvt_signed_1_run.c: Likewise.
> 	* gcc.target/aarch64/sve_unpack_fcvt_unsigned_1.c: Likewise.
> 	* gcc.target/aarch64/sve_unpack_fcvt_unsigned_1_run.c: Likewise.
> 	* gcc.target/aarch64/sve_unpack_float_1.c: Likewise.
> 	* gcc.target/aarch64/sve_unpack_float_1_run.c: Likewise.
> 	* gcc.target/aarch64/sve_unpack_signed_1.c: Likewise.
> 	* gcc.target/aarch64/sve_unpack_signed_1_run.c: Likewise.
> 	* gcc.target/aarch64/sve_unpack_unsigned_1.c: Likewise.
> 	* gcc.target/aarch64/sve_unpack_unsigned_1_run.c: Likewise.
> 	* gcc.target/aarch64/sve_uzp1_1.c: Likewise.
> 	* gcc.target/aarch64/sve_uzp1_1_run.c: Likewise.
> 	* gcc.target/aarch64/sve_uzp2_1.c: Likewise.
> 	* gcc.target/aarch64/sve_uzp2_1_run.c: Likewise.
> 	* gcc.target/aarch64/sve_vcond_1.C: Likewise.
> 	* gcc.target/aarch64/sve_vcond_1_run.C: Likewise.
> 	* gcc.target/aarch64/sve_vcond_2.c: Likewise.
> 	* gcc.target/aarch64/sve_vcond_2_run.c: Likewise.
> 	* gcc.target/aarch64/sve_vcond_3.c: Likewise.
> 	* gcc.target/aarch64/sve_vcond_4.c: Likewise.
> 	* gcc.target/aarch64/sve_vcond_4_run.c: Likewise.
> 	* gcc.target/aarch64/sve_vcond_5.c: Likewise.
> 	* gcc.target/aarch64/sve_vcond_5_run.c: Likewise.
> 	* gcc.target/aarch64/sve_vcond_6.c: Likewise.
> 	* gcc.target/aarch64/sve_vcond_6_run.c: Likewise.
> 	* gcc.target/aarch64/sve_vec_init_1.c: Likewise.
> 	* gcc.target/aarch64/sve_vec_init_1_run.c: Likewise.
> 	* gcc.target/aarch64/sve_vec_init_2.c: Likewise.
> 	* gcc.target/aarch64/sve_vec_perm_1.c: Likewise.
> 	* gcc.target/aarch64/sve_vec_perm_1_run.c: Likewise.
> 	* gcc.target/aarch64/sve_vec_perm_1_overrange_run.c: Likewise.
> 	* gcc.target/aarch64/sve_vec_perm_const_1.c: Likewise.
> 	* gcc.target/aarch64/sve_vec_perm_const_1_overrun.c: Likewise.
> 	* gcc.target/aarch64/sve_vec_perm_const_1_run.c: Likewise.
> 	* gcc.target/aarch64/sve_vec_perm_const_single_1.c: Likewise.
> 	* gcc.target/aarch64/sve_vec_perm_const_single_1_run.c: Likewise.
> 	* gcc.target/aarch64/sve_vec_perm_single_1.c: Likewise.
> 	* gcc.target/aarch64/sve_vec_perm_single_1_run.c: Likewise.
> 	* gcc.target/aarch64/sve_zip1_1.c: Likewise.
> 	* gcc.target/aarch64/sve_zip2_1.c: Likewise.
> 


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [3/4] [AArch64] SVE tests
  2018-01-06 18:06   ` James Greenhalgh
@ 2018-01-06 19:13     ` Richard Sandiford
       [not found]       ` <20180107165948.GA13800@arm.com>
  0 siblings, 1 reply; 18+ messages in thread
From: Richard Sandiford @ 2018-01-06 19:13 UTC (permalink / raw)
  To: James Greenhalgh; +Cc: gcc-patches, richard.earnshaw, marcus.shawcroft, nd

James Greenhalgh <james.greenhalgh@arm.com> writes:
> On Fri, Nov 03, 2017 at 05:50:54PM +0000, Richard Sandiford wrote:
>> This patch adds gcc.target/aarch64 tests for SVE, and forces some
>> existing Advanced SIMD tests to use -march=armv8-a.
>
> I'm going to assume that these new testcases are broadly sensible, and not
> spend any significant time looking at them.
>
> I'm not completely happy forcing the architecture to Armv8-a - it would be
> useful for our testing coverage if users which have configured with other
> architecture variants had this test execute in those environments. That
> way we'd check we still do the right thing once we have an implicit
> -march=armv8.2-a .
>
> However, as we don't have a good way to make that happen (other than maybe
> only forcing the arch if we are in a configuration wired for SVE?) I'm
> happy with this patch as a compromise for now.

Would something like LLVM's -mattr be useful?  Then we could have
-mattr=+nosve without having to change the base architecture.

I suppose we'd need to be careful about how it interacts with -march
though, so it probably isn't GCC 8 material.  I'll try only forcing
the arch when we're compiling for SVE, like you say.

Not strictly related, but do you think it's OK to require binutils 2.28+
when testing GCC (rather than simply building it)?  When trying with an
older OS the other day, I realised that the SVE dg-do assemble tests
would fail for 2.27 and earlier.  We'd need something like:

  /* { dg-do assemble { aarch64_sve_asm } } */

if we wanted to support older binutils.

Thanks,
Richard

^ permalink raw reply	[flat|nested] 18+ messages in thread

[parent not found: <20180107165948.GA13800@arm.com>]

* Re: [3/4] [AArch64] SVE tests
       [not found]       ` <20180107165948.GA13800@arm.com>
@ 2018-01-07 17:10         ` James Greenhalgh
  2018-01-12 15:44           ` Richard Sandiford
  0 siblings, 1 reply; 18+ messages in thread
From: James Greenhalgh @ 2018-01-07 17:10 UTC (permalink / raw)
  To: gcc-patches, Richard Earnshaw, Marcus Shawcroft, richard.sandiford; +Cc: nd

On Sat, Jan 06, 2018 at 07:13:22PM +0000, Richard Sandiford wrote:
> James Greenhalgh <james.greenhalgh@arm.com> writes:
> > On Fri, Nov 03, 2017 at 05:50:54PM +0000, Richard Sandiford wrote:
> >> This patch adds gcc.target/aarch64 tests for SVE, and forces some
> >> existing Advanced SIMD tests to use -march=armv8-a.
> >
> > I'm going to assume that these new testcases are broadly sensible, and not
> > spend any significant time looking at them.
> >
> > I'm not completely happy forcing the architecture to Armv8-a - it would be
> > useful for our testing coverage if users which have configured with other
> > architecture variants had this test execute in those environments. That
> > way we'd check we still do the right thing once we have an implicit
> > -march=armv8.2-a .
> >
> > However, as we don't have a good way to make that happen (other than maybe
> > only forcing the arch if we are in a configuration wired for SVE?) I'm
> > happy with this patch as a compromise for now.
> 
> Would something like LLVM's -mattr be useful?  Then we could have
> -mattr=+nosve without having to change the base architecture.
> 
> I suppose we'd need to be careful about how it interacts with -march
> though, so it probably isn't GCC 8 material.  I'll try only forcing
> the arch when we're compiling for SVE, like you say.

(Sorry if you took a duplicate of this - I mistakenly sent with a disclaimer)

We also could do this with Target pragmas:

  #pragma GCC target ("+nosve")

Should work here I think.

> Not strictly related, but do you think it's OK to require binutils 2.28+
> when testing GCC (rather than simply building it)?  When trying with an
> older OS the other day, I realised that the SVE dg-do assemble tests
> would fail for 2.27 and earlier.  We'd need something like:
> 
>   /* { dg-do assemble { aarch64_sve_asm } } */
> 
> if we wanted to support older binutils.

Personally I think this is OK. We have the same problem with other
new instructions we add and want assemble tests for.

Thanks,
James

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [3/4] [AArch64] SVE tests
  2018-01-07 17:10         ` James Greenhalgh
@ 2018-01-12 15:44           ` Richard Sandiford
  0 siblings, 0 replies; 18+ messages in thread
From: Richard Sandiford @ 2018-01-12 15:44 UTC (permalink / raw)
  To: James Greenhalgh; +Cc: gcc-patches, Richard Earnshaw, Marcus Shawcroft, nd

James Greenhalgh <james.greenhalgh@arm.com> writes:
> On Sat, Jan 06, 2018 at 07:13:22PM +0000, Richard Sandiford wrote:
>> James Greenhalgh <james.greenhalgh@arm.com> writes:
>> > On Fri, Nov 03, 2017 at 05:50:54PM +0000, Richard Sandiford wrote:
>> >> This patch adds gcc.target/aarch64 tests for SVE, and forces some
>> >> existing Advanced SIMD tests to use -march=armv8-a.
>> >
>> > I'm going to assume that these new testcases are broadly sensible, and not
>> > spend any significant time looking at them.
>> >
>> > I'm not completely happy forcing the architecture to Armv8-a - it would be
>> > useful for our testing coverage if users which have configured with other
>> > architecture variants had this test execute in those environments. That
>> > way we'd check we still do the right thing once we have an implicit
>> > -march=armv8.2-a .
>> >
>> > However, as we don't have a good way to make that happen (other than maybe
>> > only forcing the arch if we are in a configuration wired for SVE?) I'm
>> > happy with this patch as a compromise for now.
>> 
>> Would something like LLVM's -mattr be useful?  Then we could have
>> -mattr=+nosve without having to change the base architecture.
>> 
>> I suppose we'd need to be careful about how it interacts with -march
>> though, so it probably isn't GCC 8 material.  I'll try only forcing
>> the arch when we're compiling for SVE, like you say.
>
> (Sorry if you took a duplicate of this - I mistakenly sent with a disclaimer)
>
> We also could do this with Target pragmas:
>
>   #pragma GCC target ("+nosve")
>
> Should work here I think.

Yeah, it does, thanks.  I switched to that instead of forcing -march=.

>> Not strictly related, but do you think it's OK to require binutils 2.28+
>> when testing GCC (rather than simply building it)?  When trying with an
>> older OS the other day, I realised that the SVE dg-do assemble tests
>> would fail for 2.27 and earlier.  We'd need something like:
>> 
>>   /* { dg-do assemble { aarch64_sve_asm } } */
>> 
>> if we wanted to support older binutils.
>
> Personally I think this is OK. We have the same problem with other
> new instructions we add and want assemble tests for.

I later saw that we have aarch64_asm_<foo>_ok, so I added "sve" to the
list and used it to protect dg-do assemble tests.

In another message you said:

> Additionally, I'd be happy with the whole AArch64 testsuite being organised
> in to folders (doing this might also make it easier for us to turn all SVE
> tests off with older assemblers, by using the skipping the exp file if we
> can't find support).

I agree organising it into directories is nicer, and it means that we
can add the -march= option in the harness rather than each individual
dg-options line.  That in turn makes it easy to avoid overriding -march=
if the setting that the tester chose (or the toolchain's default setting)
already includes SVE.

Here's what I plan to commit (without reposting the new tests).

Thanks,
Richard


2018-01-12  Richard Sandiford  <richard.sandiford@linaro.org>
	    Alan Hayward  <alan.hayward@arm.com>
	    David Sherwood  <david.sherwood@arm.com>

gcc/testsuite/
	* lib/target-supports.exp (check_effective_target_aarch64_asm_sve_ok):
	New proc.
	* gcc.target/aarch64/bic_imm_1.c: Use #pragma GCC target "+nosve".
	* gcc.target/aarch64/fmaxmin.c: Likewise.
	* gcc.target/aarch64/fmul_fcvt_2.c: Likewise.
	* gcc.target/aarch64/orr_imm_1.c: Likewise.
	* gcc.target/aarch64/pr62178.c: Likewise.
	* gcc.target/aarch64/pr71727-2.c: Likewise.
	* gcc.target/aarch64/saddw-1.c: Likewise.
	* gcc.target/aarch64/saddw-2.c: Likewise.
	* gcc.target/aarch64/uaddw-1.c: Likewise.
	* gcc.target/aarch64/uaddw-2.c: Likewise.
	* gcc.target/aarch64/uaddw-3.c: Likewise.
	* gcc.target/aarch64/vect-add-sub-cond.c: Likewise.
	* gcc.target/aarch64/vect-compile.c: Likewise.
	* gcc.target/aarch64/vect-faddv-compile.c: Likewise.
	* gcc.target/aarch64/vect-fcm-eq-d.c: Likewise.
	* gcc.target/aarch64/vect-fcm-eq-f.c: Likewise.
	* gcc.target/aarch64/vect-fcm-ge-d.c: Likewise.
	* gcc.target/aarch64/vect-fcm-ge-f.c: Likewise.
	* gcc.target/aarch64/vect-fcm-gt-d.c: Likewise.
	* gcc.target/aarch64/vect-fcm-gt-f.c: Likewise.
	* gcc.target/aarch64/vect-fmax-fmin-compile.c: Likewise.
	* gcc.target/aarch64/vect-fmaxv-fminv-compile.c: Likewise.
	* gcc.target/aarch64/vect-fmovd-zero.c: Likewise.
	* gcc.target/aarch64/vect-fmovd.c: Likewise.
	* gcc.target/aarch64/vect-fmovf-zero.c: Likewise.
	* gcc.target/aarch64/vect-fmovf.c: Likewise.
	* gcc.target/aarch64/vect-fp-compile.c: Likewise.
	* gcc.target/aarch64/vect-ld1r-compile-fp.c: Likewise.
	* gcc.target/aarch64/vect-ld1r-compile.c: Likewise.
	* gcc.target/aarch64/vect-movi.c: Likewise.
	* gcc.target/aarch64/vect-mull-compile.c: Likewise.
	* gcc.target/aarch64/vect-reduc-or_1.c: Likewise.
	* gcc.target/aarch64/vect-vaddv.c: Likewise.
	* gcc.target/aarch64/vect_saddl_1.c: Likewise.
	* gcc.target/aarch64/vect_smlal_1.c: Likewise.
	* gcc.target/aarch64/vector_initialization_nostack.c: XFAIL for
	fixed-length SVE.
	* gcc.target/aarch64/sve/aarch64-sve.exp: New file.
	* gcc.target/aarch64/sve/arith_1.c: New test.
	* gcc.target/aarch64/sve/const_pred_1.C: Likewise.
	* gcc.target/aarch64/sve/const_pred_2.C: Likewise.
	* gcc.target/aarch64/sve/const_pred_3.C: Likewise.
	* gcc.target/aarch64/sve/const_pred_4.C: Likewise.
	* gcc.target/aarch64/sve/cvtf_signed_1.c: Likewise.
	* gcc.target/aarch64/sve/cvtf_signed_1_run.c: Likewise.
	* gcc.target/aarch64/sve/cvtf_unsigned_1.c: Likewise.
	* gcc.target/aarch64/sve/cvtf_unsigned_1_run.c: Likewise.
	* gcc.target/aarch64/sve/dup_imm_1.c: Likewise.
	* gcc.target/aarch64/sve/dup_imm_1_run.c: Likewise.
	* gcc.target/aarch64/sve/dup_lane_1.c: Likewise.
	* gcc.target/aarch64/sve/ext_1.c: Likewise.
	* gcc.target/aarch64/sve/ext_2.c: Likewise.
	* gcc.target/aarch64/sve/extract_1.c: Likewise.
	* gcc.target/aarch64/sve/extract_2.c: Likewise.
	* gcc.target/aarch64/sve/extract_3.c: Likewise.
	* gcc.target/aarch64/sve/extract_4.c: Likewise.
	* gcc.target/aarch64/sve/fabs_1.c: Likewise.
	* gcc.target/aarch64/sve/fcvtz_signed_1.c: Likewise.
	* gcc.target/aarch64/sve/fcvtz_signed_1_run.c: Likewise.
	* gcc.target/aarch64/sve/fcvtz_unsigned_1.c: Likewise.
	* gcc.target/aarch64/sve/fcvtz_unsigned_1_run.c: Likewise.
	* gcc.target/aarch64/sve/fdiv_1.c: Likewise.
	* gcc.target/aarch64/sve/fdup_1.c: Likewise.
	* gcc.target/aarch64/sve/fdup_1_run.c: Likewise.
	* gcc.target/aarch64/sve/fmad_1.c: Likewise.
	* gcc.target/aarch64/sve/fmla_1.c: Likewise.
	* gcc.target/aarch64/sve/fmls_1.c: Likewise.
	* gcc.target/aarch64/sve/fmsb_1.c: Likewise.
	* gcc.target/aarch64/sve/fmul_1.c: Likewise.
	* gcc.target/aarch64/sve/fneg_1.c: Likewise.
	* gcc.target/aarch64/sve/fnmad_1.c: Likewise.
	* gcc.target/aarch64/sve/fnmla_1.c: Likewise.
	* gcc.target/aarch64/sve/fnmls_1.c: Likewise.
	* gcc.target/aarch64/sve/fnmsb_1.c: Likewise.
	* gcc.target/aarch64/sve/fp_arith_1.c: Likewise.
	* gcc.target/aarch64/sve/frinta_1.c: Likewise.
	* gcc.target/aarch64/sve/frinti_1.c: Likewise.
	* gcc.target/aarch64/sve/frintm_1.c: Likewise.
	* gcc.target/aarch64/sve/frintp_1.c: Likewise.
	* gcc.target/aarch64/sve/frintx_1.c: Likewise.
	* gcc.target/aarch64/sve/frintz_1.c: Likewise.
	* gcc.target/aarch64/sve/fsqrt_1.c: Likewise.
	* gcc.target/aarch64/sve/fsubr_1.c: Likewise.
	* gcc.target/aarch64/sve/index_1.c: Likewise.
	* gcc.target/aarch64/sve/index_1_run.c: Likewise.
	* gcc.target/aarch64/sve/ld1r_1.c: Likewise.
	* gcc.target/aarch64/sve/load_const_offset_1.c: Likewise.
	* gcc.target/aarch64/sve/load_const_offset_2.c: Likewise.
	* gcc.target/aarch64/sve/load_const_offset_3.c: Likewise.
	* gcc.target/aarch64/sve/load_scalar_offset_1.c: Likewise.
	* gcc.target/aarch64/sve/logical_1.c: Likewise.
	* gcc.target/aarch64/sve/loop_add_1.c: Likewise.
	* gcc.target/aarch64/sve/loop_add_1_run.c: Likewise.
	* gcc.target/aarch64/sve/mad_1.c: Likewise.
	* gcc.target/aarch64/sve/maxmin_1.c: Likewise.
	* gcc.target/aarch64/sve/maxmin_1_run.c: Likewise.
	* gcc.target/aarch64/sve/maxmin_strict_1.c: Likewise.
	* gcc.target/aarch64/sve/maxmin_strict_1_run.c: Likewise.
	* gcc.target/aarch64/sve/mla_1.c: Likewise.
	* gcc.target/aarch64/sve/mls_1.c: Likewise.
	* gcc.target/aarch64/sve/mov_rr_1.c: Likewise.
	* gcc.target/aarch64/sve/msb_1.c: Likewise.
	* gcc.target/aarch64/sve/mul_1.c: Likewise.
	* gcc.target/aarch64/sve/neg_1.c: Likewise.
	* gcc.target/aarch64/sve/nlogical_1.c: Likewise.
	* gcc.target/aarch64/sve/nlogical_1_run.c: Likewise.
	* gcc.target/aarch64/sve/pack_1.c: Likewise.
	* gcc.target/aarch64/sve/pack_1_run.c: Likewise.
	* gcc.target/aarch64/sve/pack_fcvt_signed_1.c: Likewise.
	* gcc.target/aarch64/sve/pack_fcvt_signed_1_run.c: Likewise.
	* gcc.target/aarch64/sve/pack_fcvt_unsigned_1.c: Likewise.
	* gcc.target/aarch64/sve/pack_fcvt_unsigned_1_run.c: Likewise.
	* gcc.target/aarch64/sve/pack_float_1.c: Likewise.
	* gcc.target/aarch64/sve/pack_float_1_run.c: Likewise.
	* gcc.target/aarch64/sve/popcount_1.c: Likewise.
	* gcc.target/aarch64/sve/popcount_1_run.c: Likewise.
	* gcc.target/aarch64/sve/reduc_1.c: Likewise.
	* gcc.target/aarch64/sve/reduc_1_run.c: Likewise.
	* gcc.target/aarch64/sve/reduc_2.c: Likewise.
	* gcc.target/aarch64/sve/reduc_2_run.c: Likewise.
	* gcc.target/aarch64/sve/reduc_3.c: Likewise.
	* gcc.target/aarch64/sve/rev_1.c: Likewise.
	* gcc.target/aarch64/sve/revb_1.c: Likewise.
	* gcc.target/aarch64/sve/revh_1.c: Likewise.
	* gcc.target/aarch64/sve/revw_1.c: Likewise.
	* gcc.target/aarch64/sve/shift_1.c: Likewise.
	* gcc.target/aarch64/sve/single_1.c: Likewise.
	* gcc.target/aarch64/sve/single_2.c: Likewise.
	* gcc.target/aarch64/sve/single_3.c: Likewise.
	* gcc.target/aarch64/sve/single_4.c: Likewise.
	* gcc.target/aarch64/sve/spill_1.c: Likewise.
	* gcc.target/aarch64/sve/store_scalar_offset_1.c: Likewise.
	* gcc.target/aarch64/sve/subr_1.c: Likewise.
	* gcc.target/aarch64/sve/trn1_1.c: Likewise.
	* gcc.target/aarch64/sve/trn2_1.c: Likewise.
	* gcc.target/aarch64/sve/unpack_fcvt_signed_1.c: Likewise.
	* gcc.target/aarch64/sve/unpack_fcvt_signed_1_run.c: Likewise.
	* gcc.target/aarch64/sve/unpack_fcvt_unsigned_1.c: Likewise.
	* gcc.target/aarch64/sve/unpack_fcvt_unsigned_1_run.c: Likewise.
	* gcc.target/aarch64/sve/unpack_float_1.c: Likewise.
	* gcc.target/aarch64/sve/unpack_float_1_run.c: Likewise.
	* gcc.target/aarch64/sve/unpack_signed_1.c: Likewise.
	* gcc.target/aarch64/sve/unpack_signed_1_run.c: Likewise.
	* gcc.target/aarch64/sve/unpack_unsigned_1.c: Likewise.
	* gcc.target/aarch64/sve/unpack_unsigned_1_run.c: Likewise.
	* gcc.target/aarch64/sve/uzp1_1.c: Likewise.
	* gcc.target/aarch64/sve/uzp1_1_run.c: Likewise.
	* gcc.target/aarch64/sve/uzp2_1.c: Likewise.
	* gcc.target/aarch64/sve/uzp2_1_run.c: Likewise.
	* gcc.target/aarch64/sve/vcond_1.C: Likewise.
	* gcc.target/aarch64/sve/vcond_1_run.C: Likewise.
	* gcc.target/aarch64/sve/vcond_2.c: Likewise.
	* gcc.target/aarch64/sve/vcond_2_run.c: Likewise.
	* gcc.target/aarch64/sve/vcond_3.c: Likewise.
	* gcc.target/aarch64/sve/vcond_4.c: Likewise.
	* gcc.target/aarch64/sve/vcond_4_run.c: Likewise.
	* gcc.target/aarch64/sve/vcond_5.c: Likewise.
	* gcc.target/aarch64/sve/vcond_5_run.c: Likewise.
	* gcc.target/aarch64/sve/vcond_6.c: Likewise.
	* gcc.target/aarch64/sve/vcond_6_run.c: Likewise.
	* gcc.target/aarch64/sve/vec_init_1.c: Likewise.
	* gcc.target/aarch64/sve/vec_init_1_run.c: Likewise.
	* gcc.target/aarch64/sve/vec_init_2.c: Likewise.
	* gcc.target/aarch64/sve/vec_perm_1.c: Likewise.
	* gcc.target/aarch64/sve/vec_perm_1_run.c: Likewise.
	* gcc.target/aarch64/sve/vec_perm_1_overrange_run.c: Likewise.
	* gcc.target/aarch64/sve/vec_perm_const_1.c: Likewise.
	* gcc.target/aarch64/sve/vec_perm_const_1_overrun.c: Likewise.
	* gcc.target/aarch64/sve/vec_perm_const_1_run.c: Likewise.
	* gcc.target/aarch64/sve/vec_perm_const_single_1.c: Likewise.
	* gcc.target/aarch64/sve/vec_perm_const_single_1_run.c: Likewise.
	* gcc.target/aarch64/sve/vec_perm_single_1.c: Likewise.
	* gcc.target/aarch64/sve/vec_perm_single_1_run.c: Likewise.
	* gcc.target/aarch64/sve/zip1_1.c: Likewise.
	* gcc.target/aarch64/sve/zip2_1.c: Likewise.

Index: gcc/testsuite/lib/target-supports.exp
===================================================================
--- gcc/testsuite/lib/target-supports.exp	2018-01-12 15:13:42.557888273 +0000
+++ gcc/testsuite/lib/target-supports.exp	2018-01-12 15:13:44.050826874 +0000
@@ -8590,7 +8590,7 @@ proc check_effective_target_aarch64_tiny
 # Create functions to check that the AArch64 assembler supports the
 # various architecture extensions via the .arch_extension pseudo-op.
 
-foreach { aarch64_ext } { "fp" "simd" "crypto" "crc" "lse" "dotprod"} {
+foreach { aarch64_ext } { "fp" "simd" "crypto" "crc" "lse" "dotprod" "sve"} {
     eval [string map [list FUNC $aarch64_ext] {
 	proc check_effective_target_aarch64_asm_FUNC_ok { } {
 	  if { [istarget aarch64*-*-*] } {
Index: gcc/testsuite/gcc.target/aarch64/bic_imm_1.c
===================================================================
--- gcc/testsuite/gcc.target/aarch64/bic_imm_1.c	2018-01-12 15:13:42.557888273 +0000
+++ gcc/testsuite/gcc.target/aarch64/bic_imm_1.c	2018-01-12 15:13:44.034827531 +0000
@@ -1,6 +1,8 @@
 /* { dg-do assemble } */
 /* { dg-options "-O2 --save-temps -ftree-vectorize" } */
 
+#pragma GCC target "+nosve"
+
 /* Each function uses the correspoding 'CLASS' in
    Marco CHECK (aarch64_simd_valid_immediate).  */
 
Index: gcc/testsuite/gcc.target/aarch64/fmaxmin.c
===================================================================
--- gcc/testsuite/gcc.target/aarch64/fmaxmin.c	2018-01-12 15:13:42.557888273 +0000
+++ gcc/testsuite/gcc.target/aarch64/fmaxmin.c	2018-01-12 15:13:44.034827531 +0000
@@ -1,6 +1,7 @@
 /* { dg-do run } */
 /* { dg-options "-O2 -ftree-vectorize -fno-inline -fno-vect-cost-model -save-temps" } */
 
+#pragma GCC target "+nosve"
 
 extern void abort (void);
 double fmax (double, double);
Index: gcc/testsuite/gcc.target/aarch64/fmul_fcvt_2.c
===================================================================
--- gcc/testsuite/gcc.target/aarch64/fmul_fcvt_2.c	2018-01-12 15:13:42.557888273 +0000
+++ gcc/testsuite/gcc.target/aarch64/fmul_fcvt_2.c	2018-01-12 15:13:44.034827531 +0000
@@ -1,6 +1,8 @@
 /* { dg-do run } */
 /* { dg-options "-save-temps -O2 -ftree-vectorize -fno-inline -fno-vect-cost-model" } */
 
+#pragma GCC target "+nosve"
+
 #define N 1024
 
 #define FUNC_DEF(__a)		\
Index: gcc/testsuite/gcc.target/aarch64/orr_imm_1.c
===================================================================
--- gcc/testsuite/gcc.target/aarch64/orr_imm_1.c	2018-01-12 15:13:42.557888273 +0000
+++ gcc/testsuite/gcc.target/aarch64/orr_imm_1.c	2018-01-12 15:13:44.034827531 +0000
@@ -1,6 +1,8 @@
 /* { dg-do assemble } */
 /* { dg-options "-O2 --save-temps -ftree-vectorize" } */
 
+#pragma GCC target "+nosve"
+
 /* Each function uses the correspoding 'CLASS' in
    Marco CHECK (aarch64_simd_valid_immediate).  */
 
Index: gcc/testsuite/gcc.target/aarch64/pr62178.c
===================================================================
--- gcc/testsuite/gcc.target/aarch64/pr62178.c	2018-01-12 15:13:42.557888273 +0000
+++ gcc/testsuite/gcc.target/aarch64/pr62178.c	2018-01-12 15:13:44.034827531 +0000
@@ -1,6 +1,8 @@
 /* { dg-do compile } */
 /* { dg-options "-O3" } */
 
+#pragma GCC target "+nosve"
+
 int a[30 +1][30 +1], b[30 +1][30 +1], r[30 +1][30 +1];
 
 void foo (void) {
Index: gcc/testsuite/gcc.target/aarch64/pr71727-2.c
===================================================================
--- gcc/testsuite/gcc.target/aarch64/pr71727-2.c	2018-01-12 15:13:42.557888273 +0000
+++ gcc/testsuite/gcc.target/aarch64/pr71727-2.c	2018-01-12 15:13:44.034827531 +0000
@@ -1,6 +1,8 @@
 /* { dg-do compile } */
 /* { dg-options "-mstrict-align -O3" } */
 
+#pragma GCC target "+nosve"
+
 unsigned char foo(const unsigned char *buffer, unsigned int length)
 {
   unsigned char sum;
Index: gcc/testsuite/gcc.target/aarch64/saddw-1.c
===================================================================
--- gcc/testsuite/gcc.target/aarch64/saddw-1.c	2018-01-12 15:13:42.557888273 +0000
+++ gcc/testsuite/gcc.target/aarch64/saddw-1.c	2018-01-12 15:13:44.034827531 +0000
@@ -1,6 +1,8 @@
 /* { dg-do compile } */
 /* { dg-options "-O3" } */
 
+#pragma GCC target "+nosve"
+
 int 
 t6(int len, void * dummy, short * __restrict x)
 {
Index: gcc/testsuite/gcc.target/aarch64/saddw-2.c
===================================================================
--- gcc/testsuite/gcc.target/aarch64/saddw-2.c	2018-01-12 15:13:42.557888273 +0000
+++ gcc/testsuite/gcc.target/aarch64/saddw-2.c	2018-01-12 15:13:44.034827531 +0000
@@ -1,6 +1,8 @@
 /* { dg-do compile } */
 /* { dg-options "-O3" } */
 
+#pragma GCC target "+nosve"
+
 int 
 t6(int len, void * dummy, int * __restrict x)
 {
Index: gcc/testsuite/gcc.target/aarch64/uaddw-1.c
===================================================================
--- gcc/testsuite/gcc.target/aarch64/uaddw-1.c	2018-01-12 15:13:42.557888273 +0000
+++ gcc/testsuite/gcc.target/aarch64/uaddw-1.c	2018-01-12 15:13:44.049826915 +0000
@@ -1,6 +1,8 @@
 /* { dg-do compile } */
 /* { dg-options "-O3" } */
 
+#pragma GCC target "+nosve"
+
 int 
 t6(int len, void * dummy, unsigned short * __restrict x)
 {
Index: gcc/testsuite/gcc.target/aarch64/uaddw-2.c
===================================================================
--- gcc/testsuite/gcc.target/aarch64/uaddw-2.c	2018-01-12 15:13:42.557888273 +0000
+++ gcc/testsuite/gcc.target/aarch64/uaddw-2.c	2018-01-12 15:13:44.049826915 +0000
@@ -1,6 +1,8 @@
 /* { dg-do compile } */
 /* { dg-options "-O3" } */
 
+#pragma GCC target "+nosve"
+
 int 
 t6(int len, void * dummy, unsigned short * __restrict x)
 {
Index: gcc/testsuite/gcc.target/aarch64/uaddw-3.c
===================================================================
--- gcc/testsuite/gcc.target/aarch64/uaddw-3.c	2018-01-12 15:13:42.557888273 +0000
+++ gcc/testsuite/gcc.target/aarch64/uaddw-3.c	2018-01-12 15:13:44.049826915 +0000
@@ -1,6 +1,8 @@
 /* { dg-do compile } */
 /* { dg-options "-O3" } */
 
+#pragma GCC target "+nosve"
+
 int 
 t6(int len, void * dummy, char * __restrict x)
 {
Index: gcc/testsuite/gcc.target/aarch64/vect-add-sub-cond.c
===================================================================
--- gcc/testsuite/gcc.target/aarch64/vect-add-sub-cond.c	2018-01-12 15:13:42.557888273 +0000
+++ gcc/testsuite/gcc.target/aarch64/vect-add-sub-cond.c	2018-01-12 15:13:44.049826915 +0000
@@ -3,6 +3,8 @@
 /* { dg-do compile } */
 /* { dg-options "-O2 -ftree-vectorize" } */
 
+#pragma GCC target "+nosve"
+
 #define COUNT1(X) if (X) count += 1
 #define COUNT2(X) if (X) count -= 1
 #define COUNT3(X) count += (X)
Index: gcc/testsuite/gcc.target/aarch64/vect-compile.c
===================================================================
--- gcc/testsuite/gcc.target/aarch64/vect-compile.c	2018-01-12 15:13:42.557888273 +0000
+++ gcc/testsuite/gcc.target/aarch64/vect-compile.c	2018-01-12 15:13:44.049826915 +0000
@@ -1,7 +1,8 @@
-
 /* { dg-do compile } */
 /* { dg-options "-O3" } */
 
+#pragma GCC target "+nosve"
+
 #include "vect.x"
 
 /* { dg-final { scan-assembler "orn\\tv" } } */
Index: gcc/testsuite/gcc.target/aarch64/vect-faddv-compile.c
===================================================================
--- gcc/testsuite/gcc.target/aarch64/vect-faddv-compile.c	2018-01-12 15:13:42.557888273 +0000
+++ gcc/testsuite/gcc.target/aarch64/vect-faddv-compile.c	2018-01-12 15:13:44.049826915 +0000
@@ -1,7 +1,8 @@
-
 /* { dg-do compile } */
 /* { dg-options "-O3 -ffast-math" } */
 
+#pragma GCC target "+nosve"
+
 #include "vect-faddv.x"
 
 /* { dg-final { scan-assembler-times "faddp\\tv" 2} } */
Index: gcc/testsuite/gcc.target/aarch64/vect-fcm-eq-d.c
===================================================================
--- gcc/testsuite/gcc.target/aarch64/vect-fcm-eq-d.c	2018-01-12 15:13:42.557888273 +0000
+++ gcc/testsuite/gcc.target/aarch64/vect-fcm-eq-d.c	2018-01-12 15:13:44.049826915 +0000
@@ -1,6 +1,8 @@
 /* { dg-do run } */
 /* { dg-options "-O2 -ftree-vectorize -fdump-tree-vect-all -fno-unroll-loops --save-temps -fno-inline -fno-vect-cost-model" } */
 
+#pragma GCC target "+nosve"
+
 #define FTYPE double
 #define ITYPE long
 #define OP ==
Index: gcc/testsuite/gcc.target/aarch64/vect-fcm-eq-f.c
===================================================================
--- gcc/testsuite/gcc.target/aarch64/vect-fcm-eq-f.c	2018-01-12 15:13:42.557888273 +0000
+++ gcc/testsuite/gcc.target/aarch64/vect-fcm-eq-f.c	2018-01-12 15:13:44.049826915 +0000
@@ -1,6 +1,8 @@
 /* { dg-do run } */
 /* { dg-options "-O2 -ftree-vectorize -fdump-tree-vect-all -fno-unroll-loops --save-temps -fno-inline" } */
 
+#pragma GCC target "+nosve"
+
 #define FTYPE float
 #define ITYPE int
 #define OP ==
Index: gcc/testsuite/gcc.target/aarch64/vect-fcm-ge-d.c
===================================================================
--- gcc/testsuite/gcc.target/aarch64/vect-fcm-ge-d.c	2018-01-12 15:13:42.557888273 +0000
+++ gcc/testsuite/gcc.target/aarch64/vect-fcm-ge-d.c	2018-01-12 15:13:44.049826915 +0000
@@ -1,6 +1,8 @@
 /* { dg-do run } */
 /* { dg-options "-O2 -ftree-vectorize -fdump-tree-vect-all -fno-unroll-loops --save-temps -fno-inline -fno-vect-cost-model" } */
 
+#pragma GCC target "+nosve"
+
 #define FTYPE double
 #define ITYPE long
 #define OP >=
Index: gcc/testsuite/gcc.target/aarch64/vect-fcm-ge-f.c
===================================================================
--- gcc/testsuite/gcc.target/aarch64/vect-fcm-ge-f.c	2018-01-12 15:13:42.557888273 +0000
+++ gcc/testsuite/gcc.target/aarch64/vect-fcm-ge-f.c	2018-01-12 15:13:44.049826915 +0000
@@ -1,6 +1,8 @@
 /* { dg-do run } */
 /* { dg-options "-O2 -ftree-vectorize -fdump-tree-vect-all -fno-unroll-loops --save-temps -fno-inline" } */
 
+#pragma GCC target "+nosve"
+
 #define FTYPE float
 #define ITYPE int
 #define OP >=
Index: gcc/testsuite/gcc.target/aarch64/vect-fcm-gt-d.c
===================================================================
--- gcc/testsuite/gcc.target/aarch64/vect-fcm-gt-d.c	2018-01-12 15:13:42.557888273 +0000
+++ gcc/testsuite/gcc.target/aarch64/vect-fcm-gt-d.c	2018-01-12 15:13:44.049826915 +0000
@@ -1,6 +1,8 @@
 /* { dg-do run } */
 /* { dg-options "-O2 -ftree-vectorize -fdump-tree-vect-all -fno-unroll-loops --save-temps -fno-inline -fno-vect-cost-model" } */
 
+#pragma GCC target "+nosve"
+
 #define FTYPE double
 #define ITYPE long
 #define OP >
Index: gcc/testsuite/gcc.target/aarch64/vect-fcm-gt-f.c
===================================================================
--- gcc/testsuite/gcc.target/aarch64/vect-fcm-gt-f.c	2018-01-12 15:13:42.557888273 +0000
+++ gcc/testsuite/gcc.target/aarch64/vect-fcm-gt-f.c	2018-01-12 15:13:44.049826915 +0000
@@ -1,6 +1,8 @@
 /* { dg-do run } */
 /* { dg-options "-O2 -ftree-vectorize -fdump-tree-vect-all -fno-unroll-loops --save-temps -fno-inline" } */
 
+#pragma GCC target "+nosve"
+
 #define FTYPE float
 #define ITYPE int
 #define OP >
Index: gcc/testsuite/gcc.target/aarch64/vect-fmax-fmin-compile.c
===================================================================
--- gcc/testsuite/gcc.target/aarch64/vect-fmax-fmin-compile.c	2018-01-12 15:13:42.557888273 +0000
+++ gcc/testsuite/gcc.target/aarch64/vect-fmax-fmin-compile.c	2018-01-12 15:13:44.049826915 +0000
@@ -1,6 +1,8 @@
 /* { dg-do compile } */
 /* { dg-options "-O3 -ffast-math" } */
 
+#pragma GCC target "+nosve"
+
 #include "vect-fmax-fmin.x"
 
 /* { dg-final { scan-assembler "fmaxnm\\tv" } } */
Index: gcc/testsuite/gcc.target/aarch64/vect-fmaxv-fminv-compile.c
===================================================================
--- gcc/testsuite/gcc.target/aarch64/vect-fmaxv-fminv-compile.c	2018-01-12 15:13:42.557888273 +0000
+++ gcc/testsuite/gcc.target/aarch64/vect-fmaxv-fminv-compile.c	2018-01-12 15:13:44.050826874 +0000
@@ -1,7 +1,8 @@
-
 /* { dg-do compile } */
 /* { dg-options "-O3 -ffast-math -fno-vect-cost-model" } */
 
+#pragma GCC target "+nosve"
+
 #include "vect-fmaxv-fminv.x"
 
 /* { dg-final { scan-assembler "fminnmv" } } */
Index: gcc/testsuite/gcc.target/aarch64/vect-fmovd-zero.c
===================================================================
--- gcc/testsuite/gcc.target/aarch64/vect-fmovd-zero.c	2018-01-12 15:13:42.557888273 +0000
+++ gcc/testsuite/gcc.target/aarch64/vect-fmovd-zero.c	2018-01-12 15:13:44.050826874 +0000
@@ -1,6 +1,8 @@
 /* { dg-do compile } */
 /* { dg-options "-O2 -ftree-vectorize -fdump-tree-vect-all -fno-vect-cost-model" } */
 
+#pragma GCC target "+nosve"
+
 #define N 32
 
 void
Index: gcc/testsuite/gcc.target/aarch64/vect-fmovd.c
===================================================================
--- gcc/testsuite/gcc.target/aarch64/vect-fmovd.c	2018-01-12 15:13:42.557888273 +0000
+++ gcc/testsuite/gcc.target/aarch64/vect-fmovd.c	2018-01-12 15:13:44.050826874 +0000
@@ -1,6 +1,8 @@
 /* { dg-do compile } */
 /* { dg-options "-O2 -ftree-vectorize -fdump-tree-vect-all -fno-vect-cost-model" } */
 
+#pragma GCC target "+nosve"
+
 #define N 32
 
 void
Index: gcc/testsuite/gcc.target/aarch64/vect-fmovf-zero.c
===================================================================
--- gcc/testsuite/gcc.target/aarch64/vect-fmovf-zero.c	2018-01-12 15:13:42.557888273 +0000
+++ gcc/testsuite/gcc.target/aarch64/vect-fmovf-zero.c	2018-01-12 15:13:44.050826874 +0000
@@ -1,6 +1,8 @@
 /* { dg-do compile } */
 /* { dg-options "-O2 -ftree-vectorize -fdump-tree-vect-all -fno-vect-cost-model" } */
 
+#pragma GCC target "+nosve"
+
 #define N 32
 
 void
Index: gcc/testsuite/gcc.target/aarch64/vect-fmovf.c
===================================================================
--- gcc/testsuite/gcc.target/aarch64/vect-fmovf.c	2018-01-12 15:13:42.557888273 +0000
+++ gcc/testsuite/gcc.target/aarch64/vect-fmovf.c	2018-01-12 15:13:44.050826874 +0000
@@ -1,6 +1,8 @@
 /* { dg-do compile } */
 /* { dg-options "-O2 -ftree-vectorize -fdump-tree-vect-all -fno-vect-cost-model" } */
 
+#pragma GCC target "+nosve"
+
 #define N 32
 
 void
Index: gcc/testsuite/gcc.target/aarch64/vect-fp-compile.c
===================================================================
--- gcc/testsuite/gcc.target/aarch64/vect-fp-compile.c	2018-01-12 15:13:42.557888273 +0000
+++ gcc/testsuite/gcc.target/aarch64/vect-fp-compile.c	2018-01-12 15:13:44.050826874 +0000
@@ -1,8 +1,8 @@
-
-
 /* { dg-do compile } */
 /* { dg-options "-O3" } */
 
+#pragma GCC target "+nosve"
+
 #include "vect-fp.x"
 
 /* { dg-final { scan-assembler "fadd\\tv" } } */
Index: gcc/testsuite/gcc.target/aarch64/vect-ld1r-compile-fp.c
===================================================================
--- gcc/testsuite/gcc.target/aarch64/vect-ld1r-compile-fp.c	2018-01-12 15:13:42.557888273 +0000
+++ gcc/testsuite/gcc.target/aarch64/vect-ld1r-compile-fp.c	2018-01-12 15:13:44.050826874 +0000
@@ -1,6 +1,8 @@
 /* { dg-do compile } */
 /* { dg-options "-O3 -fno-vect-cost-model" } */
 
+#pragma GCC target "+nosve"
+
 #include "stdint.h"
 #include "vect-ld1r.x"
 
Index: gcc/testsuite/gcc.target/aarch64/vect-ld1r-compile.c
===================================================================
--- gcc/testsuite/gcc.target/aarch64/vect-ld1r-compile.c	2018-01-12 15:13:42.557888273 +0000
+++ gcc/testsuite/gcc.target/aarch64/vect-ld1r-compile.c	2018-01-12 15:13:44.050826874 +0000
@@ -1,6 +1,8 @@
 /* { dg-do compile } */
 /* { dg-options "-O3 -fno-vect-cost-model" } */
 
+#pragma GCC target "+nosve"
+
 #include "stdint.h"
 #include "vect-ld1r.x"
 
Index: gcc/testsuite/gcc.target/aarch64/vect-movi.c
===================================================================
--- gcc/testsuite/gcc.target/aarch64/vect-movi.c	2018-01-12 15:13:42.557888273 +0000
+++ gcc/testsuite/gcc.target/aarch64/vect-movi.c	2018-01-12 15:13:44.050826874 +0000
@@ -1,6 +1,8 @@
 /* { dg-do run } */
 /* { dg-options "-O3 --save-temps -fno-inline" } */
 
+#pragma GCC target "+nosve"
+
 extern void abort (void);
 
 #define N 16
Index: gcc/testsuite/gcc.target/aarch64/vect-mull-compile.c
===================================================================
--- gcc/testsuite/gcc.target/aarch64/vect-mull-compile.c	2018-01-12 15:13:42.557888273 +0000
+++ gcc/testsuite/gcc.target/aarch64/vect-mull-compile.c	2018-01-12 15:13:44.050826874 +0000
@@ -1,7 +1,8 @@
-
 /* { dg-do compile } */
 /* { dg-options "-O3" } */
 
+#pragma GCC target "+nosve"
+
 #define N 16
 
 #include "vect-mull.x"
Index: gcc/testsuite/gcc.target/aarch64/vect-reduc-or_1.c
===================================================================
--- gcc/testsuite/gcc.target/aarch64/vect-reduc-or_1.c	2018-01-12 15:13:42.557888273 +0000
+++ gcc/testsuite/gcc.target/aarch64/vect-reduc-or_1.c	2018-01-12 15:13:44.050826874 +0000
@@ -2,6 +2,8 @@
 /* { dg-options "-O2 -ftree-vectorize -fdump-tree-vect-all -fno-vect-cost-model" } */
 /* Write a reduction loop to be reduced using whole vector right shift.  */
 
+#pragma GCC target "+nosve"
+
 extern void abort (void);
 
 unsigned char in[8] __attribute__((__aligned__(16)));
Index: gcc/testsuite/gcc.target/aarch64/vect-vaddv.c
===================================================================
--- gcc/testsuite/gcc.target/aarch64/vect-vaddv.c	2018-01-12 15:13:42.557888273 +0000
+++ gcc/testsuite/gcc.target/aarch64/vect-vaddv.c	2018-01-12 15:13:44.050826874 +0000
@@ -1,6 +1,8 @@
 /* { dg-do run } */
 /* { dg-options "-O3 --save-temps -ffast-math" } */
 
+#pragma GCC target "+nosve"
+
 #include <arm_neon.h>
 
 extern void abort (void);
Index: gcc/testsuite/gcc.target/aarch64/vect_saddl_1.c
===================================================================
--- gcc/testsuite/gcc.target/aarch64/vect_saddl_1.c	2018-01-12 15:13:42.557888273 +0000
+++ gcc/testsuite/gcc.target/aarch64/vect_saddl_1.c	2018-01-12 15:13:44.050826874 +0000
@@ -1,6 +1,8 @@
 /* { dg-do run } */
 /* { dg-options "-O3 -fno-inline -save-temps -fno-vect-cost-model -fno-ipa-icf" } */
 
+#pragma GCC target "+nosve"
+
 typedef signed char S8_t;
 typedef signed short S16_t;
 typedef signed int S32_t;
Index: gcc/testsuite/gcc.target/aarch64/vect_smlal_1.c
===================================================================
--- gcc/testsuite/gcc.target/aarch64/vect_smlal_1.c	2018-01-12 15:13:42.557888273 +0000
+++ gcc/testsuite/gcc.target/aarch64/vect_smlal_1.c	2018-01-12 15:13:44.050826874 +0000
@@ -1,6 +1,8 @@
 /* { dg-do run } */
 /* { dg-options "-O3 -fno-inline -save-temps -fno-vect-cost-model -fno-ipa-icf" } */
 
+#pragma GCC target "+nosve"
+
 typedef signed char S8_t;
 typedef signed short S16_t;
 typedef signed int S32_t;
Index: gcc/testsuite/gcc.target/aarch64/vector_initialization_nostack.c
===================================================================
--- gcc/testsuite/gcc.target/aarch64/vector_initialization_nostack.c	2018-01-12 15:13:42.557888273 +0000
+++ gcc/testsuite/gcc.target/aarch64/vector_initialization_nostack.c	2018-01-12 15:13:44.050826874 +0000
@@ -49,5 +49,6 @@ f12 (void)
   return sum;
 }
 
-
-/* { dg-final { scan-assembler-not "sp" } } */
+/* Fails for fixed-length SVE because we lack a vec_init pattern.
+   A later patch fixes this in generic code.  */
+/* { dg-final { scan-assembler-not "sp" { xfail { aarch64_sve && { ! vect_variable_length } } } } } */
Index: gcc/testsuite/gcc.target/aarch64/sve/aarch64-sve.exp
===================================================================
--- /dev/null	2018-01-12 06:40:27.684409621 +0000
+++ gcc/testsuite/gcc.target/aarch64/sve/aarch64-sve.exp	2018-01-12 15:13:44.035827490 +0000
@@ -0,0 +1,52 @@
+#  Specific regression driver for AArch64 SVE.
+#  Copyright (C) 2009-2018 Free Software Foundation, Inc.
+#  Contributed by ARM Ltd.
+#
+#  This file is part of GCC.
+#
+#  GCC is free software; you can redistribute it and/or modify it
+#  under the terms of the GNU General Public License as published by
+#  the Free Software Foundation; either version 3, or (at your option)
+#  any later version.
+#
+#  GCC is distributed in the hope that it will be useful, but
+#  WITHOUT ANY WARRANTY; without even the implied warranty of
+#  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+#  General Public License for more details.
+#
+#  You should have received a copy of the GNU General Public License
+#  along with GCC; see the file COPYING3.  If not see
+#  <http://www.gnu.org/licenses/>.  */
+
+# GCC testsuite that uses the `dg.exp' driver.
+
+# Exit immediately if this isn't an AArch64 target.
+if {![istarget aarch64*-*-*] } then {
+  return
+}
+
+# Load support procs.
+load_lib gcc-dg.exp
+
+# If a testcase doesn't have special options, use these.
+global DEFAULT_CFLAGS
+if ![info exists DEFAULT_CFLAGS] then {
+    set DEFAULT_CFLAGS " -ansi -pedantic-errors"
+}
+
+# Initialize `dg'.
+dg-init
+
+# Force SVE if we're not testing it already.
+if { [check_effective_target_aarch64_sve] } {
+    set sve_flags ""
+} else {
+    set sve_flags "-march=armv8.2-a+sve"
+}
+
+# Main loop.
+dg-runtest [lsort [glob -nocomplain $srcdir/$subdir/*.\[cCS\]]] \
+    $sve_flags $DEFAULT_CFLAGS
+
+# All done.
+dg-finish

^ permalink raw reply	[flat|nested] 18+ messages in thread

* [4/4] SVE unwinding
  2017-11-03 17:45 [0/4] [AArch64] Add SVE support Richard Sandiford
                   ` (2 preceding siblings ...)
  2017-11-03 17:51 ` [3/4] [AArch64] SVE tests Richard Sandiford
@ 2017-11-03 17:52 ` Richard Sandiford
  2017-11-10 10:58   ` James Greenhalgh
  2017-11-24 16:34 ` [0/4] [AArch64] Add SVE support Richard Sandiford
  4 siblings, 1 reply; 18+ messages in thread
From: Richard Sandiford @ 2017-11-03 17:52 UTC (permalink / raw)
  To: gcc-patches; +Cc: richard.earnshaw, james.greenhalgh, marcus.shawcroft

This patch adds support for unwinding frames that use the SVE
pseudo VG register.  We want this register to act like a normal
register if the CFI explicitly sets it, but want to provide a
default value otherwise.  Computing the default value requires
an SVE target, so we only want to compute it on demand.

aarch64_vg uses a hard-coded .inst in order to avoid a build
dependency on binutils 2.28 or later.


2017-11-03  Richard Sandiford  <richard.sandiford@linaro.org>

libgcc/
	* config/aarch64/value-unwind.h (aarch64_vg): New function.
	(DWARF_LAZY_REGISTER_VALUE): Define.
	* unwind-dw2.c (_Unwind_GetGR): Use DWARF_LAZY_REGISTER_VALUE
	to provide a fallback register value.

gcc/testsuite/
	* g++.target/aarch64/aarch64.exp: New harness.
	* g++.target/aarch64/sve_catch_1.C: New test.
	* g++.target/aarch64/sve_catch_2.C: Likewise.
	* g++.target/aarch64/sve_catch_3.C: Likewise.
	* g++.target/aarch64/sve_catch_4.C: Likewise.
	* g++.target/aarch64/sve_catch_5.C: Likewise.
	* g++.target/aarch64/sve_catch_6.C: Likewise.

Index: libgcc/config/aarch64/value-unwind.h
===================================================================
--- libgcc/config/aarch64/value-unwind.h	2017-02-23 19:53:58.000000000 +0000
+++ libgcc/config/aarch64/value-unwind.h	2017-11-03 17:24:20.172023500 +0000
@@ -23,3 +23,19 @@
 #if defined __aarch64__ && !defined __LP64__
 # define REG_VALUE_IN_UNWIND_CONTEXT
 #endif
+
+/* Return the value of the pseudo VG register.  This should only be
+   called if we know this is an SVE host.  */
+static inline int
+aarch64_vg (void)
+{
+  register int vg asm ("x0");
+  /* CNTD X0.  */
+  asm (".inst 0x04e0e3e0" : "=r" (vg));
+  return vg;
+}
+
+/* Lazily provide a value for VG, so that we don't try to execute SVE
+   instructions unless we know they're needed.  */
+#define DWARF_LAZY_REGISTER_VALUE(REGNO, VALUE) \
+  ((REGNO) == AARCH64_DWARF_VG && ((*VALUE) = aarch64_vg (), 1))
Index: libgcc/unwind-dw2.c
===================================================================
--- libgcc/unwind-dw2.c	2017-02-23 19:54:02.000000000 +0000
+++ libgcc/unwind-dw2.c	2017-11-03 17:24:20.172023500 +0000
@@ -216,12 +216,12 @@ _Unwind_IsExtendedContext (struct _Unwin
 	  || (context->flags & EXTENDED_CONTEXT_BIT));
 }
 \f
-/* Get the value of register INDEX as saved in CONTEXT.  */
+/* Get the value of register REGNO as saved in CONTEXT.  */
 
 inline _Unwind_Word
-_Unwind_GetGR (struct _Unwind_Context *context, int index)
+_Unwind_GetGR (struct _Unwind_Context *context, int regno)
 {
-  int size;
+  int size, index;
   _Unwind_Context_Reg_Val val;
 
 #ifdef DWARF_ZERO_REG
@@ -229,7 +229,7 @@ _Unwind_GetGR (struct _Unwind_Context *c
     return 0;
 #endif
 
-  index = DWARF_REG_TO_UNWIND_COLUMN (index);
+  index = DWARF_REG_TO_UNWIND_COLUMN (regno);
   gcc_assert (index < (int) sizeof(dwarf_reg_size_table));
   size = dwarf_reg_size_table[index];
   val = context->reg[index];
@@ -237,6 +237,14 @@ _Unwind_GetGR (struct _Unwind_Context *c
   if (_Unwind_IsExtendedContext (context) && context->by_value[index])
     return _Unwind_Get_Unwind_Word (val);
 
+#ifdef DWARF_LAZY_REGISTER_VALUE
+  {
+    _Unwind_Word value;
+    if (DWARF_LAZY_REGISTER_VALUE (regno, &value))
+      return value;
+  }
+#endif
+
   /* This will segfault if the register hasn't been saved.  */
   if (size == sizeof(_Unwind_Ptr))
     return * (_Unwind_Ptr *) (_Unwind_Internal_Ptr) val;
Index: gcc/testsuite/g++.target/aarch64/aarch64.exp
===================================================================
--- /dev/null	2017-11-03 10:40:07.002381728 +0000
+++ gcc/testsuite/g++.target/aarch64/aarch64.exp	2017-11-03 17:24:20.171023116 +0000
@@ -0,0 +1,38 @@
+#  Specific regression driver for AArch64.
+#  Copyright (C) 2009-2017 Free Software Foundation, Inc.
+#  Contributed by ARM Ltd.
+#
+#  This file is part of GCC.
+#
+#  GCC is free software; you can redistribute it and/or modify it
+#  under the terms of the GNU General Public License as published by
+#  the Free Software Foundation; either version 3, or (at your option)
+#  any later version.
+#
+#  GCC is distributed in the hope that it will be useful, but
+#  WITHOUT ANY WARRANTY; without even the implied warranty of
+#  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+#  General Public License for more details.
+#
+#  You should have received a copy of the GNU General Public License
+#  along with GCC; see the file COPYING3.  If not see
+#  <http://www.gnu.org/licenses/>.  */
+
+# GCC testsuite that uses the `dg.exp' driver.
+
+# Exit immediately if this isn't an AArch64 target.
+if {![istarget aarch64*-*-*] } then {
+  return
+}
+
+# Load support procs.
+load_lib g++-dg.exp
+
+# Initialize `dg'.
+dg-init
+
+# Main loop.
+dg-runtest [lsort [glob -nocomplain $srcdir/$subdir/*.C]] "" ""
+
+# All done.
+dg-finish
Index: gcc/testsuite/g++.target/aarch64/sve_catch_1.C
===================================================================
--- /dev/null	2017-11-03 10:40:07.002381728 +0000
+++ gcc/testsuite/g++.target/aarch64/sve_catch_1.C	2017-11-03 17:24:20.171023116 +0000
@@ -0,0 +1,70 @@
+/* { dg-do run } */
+/* { dg-options "-O3 -fopenmp-simd -fno-omit-frame-pointer" } */
+/* { dg-options "-O3 -fopenmp-simd -fno-omit-frame-pointer -march=armv8-a+sve" { target aarch64_sve_hw } } */
+
+/* Invoke X (P##n) for n in [0, 7].  */
+#define REPEAT8(X, P) \
+  X (P##0) X (P##1) X (P##2) X (P##3) X (P##4) X (P##5) X (P##6) X (P##7)
+
+/* Invoke X (n) for all octal n in [0, 39].  */
+#define REPEAT40(X) \
+  REPEAT8 (X, 0) REPEAT8 (X, 1)  REPEAT8 (X, 2) REPEAT8 (X, 3) REPEAT8 (X, 4)
+
+volatile int testi;
+
+/* Throw to f3.  */
+void __attribute__ ((weak))
+f1 (int x[40][100], int *y)
+{
+  /* A wild write to x and y.  */
+  asm volatile ("" ::: "memory");
+  if (y[testi] == x[testi][testi])
+    throw 100;
+}
+
+/* Expect vector work to be done, with spilling of vector registers.  */
+void __attribute__ ((weak))
+f2 (int x[40][100], int *y)
+{
+  /* Try to force some spilling.  */
+#define DECLARE(N) int y##N = y[N];
+  REPEAT40 (DECLARE);
+  for (int j = 0; j < 20; ++j)
+    {
+      f1 (x, y);
+#pragma omp simd
+      for (int i = 0; i < 100; ++i)
+	{
+#define INC(N) x[N][i] += y##N;
+	  REPEAT40 (INC);
+	}
+    }
+}
+
+/* Catch an exception thrown from f1, via f2.  */
+void __attribute__ ((weak))
+f3 (int x[40][100], int *y, int *z)
+{
+  volatile int extra = 111;
+  try
+    {
+      f2 (x, y);
+    }
+  catch (int val)
+    {
+      *z = val + extra;
+    }
+}
+
+static int x[40][100];
+static int y[40];
+static int z;
+
+int
+main (void)
+{
+  f3 (x, y, &z);
+  if (z != 211)
+    __builtin_abort ();
+  return 0;
+}
Index: gcc/testsuite/g++.target/aarch64/sve_catch_2.C
===================================================================
--- /dev/null	2017-11-03 10:40:07.002381728 +0000
+++ gcc/testsuite/g++.target/aarch64/sve_catch_2.C	2017-11-03 17:24:20.171023116 +0000
@@ -0,0 +1,5 @@
+/* { dg-do run } */
+/* { dg-options "-O3 -fopenmp-simd -fomit-frame-pointer" } */
+/* { dg-options "-O3 -fopenmp-simd -fomit-frame-pointer -march=armv8-a+sve" { target aarch64_sve_hw } } */
+
+#include "sve_catch_1.C"
Index: gcc/testsuite/g++.target/aarch64/sve_catch_3.C
===================================================================
--- /dev/null	2017-11-03 10:40:07.002381728 +0000
+++ gcc/testsuite/g++.target/aarch64/sve_catch_3.C	2017-11-03 17:24:20.171023116 +0000
@@ -0,0 +1,79 @@
+/* { dg-do run } */
+/* { dg-options "-O3 -fopenmp-simd -fno-omit-frame-pointer" } */
+/* { dg-options "-O3 -fopenmp-simd -fno-omit-frame-pointer -march=armv8-a+sve" { target aarch64_sve_hw } } */
+
+/* Invoke X (P##n) for n in [0, 7].  */
+#define REPEAT8(X, P) \
+  X (P##0) X (P##1) X (P##2) X (P##3) X (P##4) X (P##5) X (P##6) X (P##7)
+
+/* Invoke X (n) for all octal n in [0, 39].  */
+#define REPEAT40(X) \
+  REPEAT8 (X, 0) REPEAT8 (X, 1)  REPEAT8 (X, 2) REPEAT8 (X, 3) REPEAT8 (X, 4)
+
+volatile int testi, sink;
+
+/* Take 2 stack arguments and throw to f3.  */
+void __attribute__ ((weak))
+f1 (int x[40][100], int *y, int z1, int z2, int z3, int z4,
+    int z5, int z6, int z7, int z8)
+{
+  /* A wild write to x and y.  */
+  sink = z1;
+  sink = z2;
+  sink = z3;
+  sink = z4;
+  sink = z5;
+  sink = z6;
+  sink = z7;
+  sink = z8;
+  asm volatile ("" ::: "memory");
+  if (y[testi] == x[testi][testi])
+    throw 100;
+}
+
+/* Expect vector work to be done, with spilling of vector registers.  */
+void __attribute__ ((weak))
+f2 (int x[40][100], int *y)
+{
+  /* Try to force some spilling.  */
+#define DECLARE(N) int y##N = y[N];
+  REPEAT40 (DECLARE);
+  for (int j = 0; j < 20; ++j)
+    {
+      f1 (x, y, 1, 2, 3, 4, 5, 6, 7, 8);
+#pragma omp simd
+      for (int i = 0; i < 100; ++i)
+	{
+#define INC(N) x[N][i] += y##N;
+	  REPEAT40 (INC);
+	}
+    }
+}
+
+/* Catch an exception thrown from f1, via f2.  */
+void __attribute__ ((weak))
+f3 (int x[40][100], int *y, int *z)
+{
+  volatile int extra = 111;
+  try
+    {
+      f2 (x, y);
+    }
+  catch (int val)
+    {
+      *z = val + extra;
+    }
+}
+
+static int x[40][100];
+static int y[40];
+static int z;
+
+int
+main (void)
+{
+  f3 (x, y, &z);
+  if (z != 211)
+    __builtin_abort ();
+  return 0;
+}
Index: gcc/testsuite/g++.target/aarch64/sve_catch_4.C
===================================================================
--- /dev/null	2017-11-03 10:40:07.002381728 +0000
+++ gcc/testsuite/g++.target/aarch64/sve_catch_4.C	2017-11-03 17:24:20.171023116 +0000
@@ -0,0 +1,5 @@
+/* { dg-do run } */
+/* { dg-options "-O3 -fopenmp-simd -fomit-frame-pointer" } */
+/* { dg-options "-O3 -fopenmp-simd -fomit-frame-pointer -march=armv8-a+sve" { target aarch64_sve_hw } } */
+
+#include "sve_catch_3.C"
Index: gcc/testsuite/g++.target/aarch64/sve_catch_5.C
===================================================================
--- /dev/null	2017-11-03 10:40:07.002381728 +0000
+++ gcc/testsuite/g++.target/aarch64/sve_catch_5.C	2017-11-03 17:24:20.172023500 +0000
@@ -0,0 +1,82 @@
+/* { dg-do run } */
+/* { dg-options "-O3 -fopenmp-simd -fno-omit-frame-pointer" } */
+/* { dg-options "-O3 -fopenmp-simd -fno-omit-frame-pointer -march=armv8-a+sve" { target aarch64_sve_hw } } */
+
+/* Invoke X (P##n) for n in [0, 7].  */
+#define REPEAT8(X, P) \
+  X (P##0) X (P##1) X (P##2) X (P##3) X (P##4) X (P##5) X (P##6) X (P##7)
+
+/* Invoke X (n) for all octal n in [0, 39].  */
+#define REPEAT40(X) \
+  REPEAT8 (X, 0) REPEAT8 (X, 1)  REPEAT8 (X, 2) REPEAT8 (X, 3) REPEAT8 (X, 4)
+
+volatile int testi, sink;
+volatile void *ptr;
+
+/* Take 2 stack arguments and throw to f3.  */
+void __attribute__ ((weak))
+f1 (int x[40][100], int *y, int z1, int z2, int z3, int z4,
+    int z5, int z6, int z7, int z8)
+{
+  /* A wild write to x and y.  */
+  sink = z1;
+  sink = z2;
+  sink = z3;
+  sink = z4;
+  sink = z5;
+  sink = z6;
+  sink = z7;
+  sink = z8;
+  asm volatile ("" ::: "memory");
+  if (y[testi] == x[testi][testi])
+    throw 100;
+}
+
+/* Expect vector work to be done, with spilling of vector registers.  */
+void __attribute__ ((weak))
+f2 (int x[40][100], int *y)
+{
+  /* Create a true variable-sized frame.  */
+  ptr = __builtin_alloca (testi + 40);
+  /* Try to force some spilling.  */
+#define DECLARE(N) int y##N = y[N];
+  REPEAT40 (DECLARE);
+  for (int j = 0; j < 20; ++j)
+    {
+      f1 (x, y, 1, 2, 3, 4, 5, 6, 7, 8);
+#pragma omp simd
+      for (int i = 0; i < 100; ++i)
+	{
+#define INC(N) x[N][i] += y##N;
+	  REPEAT40 (INC);
+	}
+    }
+}
+
+/* Catch an exception thrown from f1, via f2.  */
+void __attribute__ ((weak))
+f3 (int x[40][100], int *y, int *z)
+{
+  volatile int extra = 111;
+  try
+    {
+      f2 (x, y);
+    }
+  catch (int val)
+    {
+      *z = val + extra;
+    }
+}
+
+static int x[40][100];
+static int y[40];
+static int z;
+
+int
+main (void)
+{
+  f3 (x, y, &z);
+  if (z != 211)
+    __builtin_abort ();
+  return 0;
+}
Index: gcc/testsuite/g++.target/aarch64/sve_catch_6.C
===================================================================
--- /dev/null	2017-11-03 10:40:07.002381728 +0000
+++ gcc/testsuite/g++.target/aarch64/sve_catch_6.C	2017-11-03 17:24:20.172023500 +0000
@@ -0,0 +1,5 @@
+/* { dg-do run } */
+/* { dg-options "-O3 -fopenmp-simd -fomit-frame-pointer" } */
+/* { dg-options "-O3 -fopenmp-simd -fomit-frame-pointer -march=armv8-a+sve" { target aarch64_sve_hw } } */
+
+#include "sve_catch_5.C"

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [4/4] SVE unwinding
  2017-11-03 17:52 ` [4/4] SVE unwinding Richard Sandiford
@ 2017-11-10 10:58   ` James Greenhalgh
  0 siblings, 0 replies; 18+ messages in thread
From: James Greenhalgh @ 2017-11-10 10:58 UTC (permalink / raw)
  To: gcc-patches, richard.earnshaw, marcus.shawcroft, richard.sandiford; +Cc: nd

On Fri, Nov 03, 2017 at 05:52:05PM +0000, Richard Sandiford wrote:
> This patch adds support for unwinding frames that use the SVE
> pseudo VG register.  We want this register to act like a normal
> register if the CFI explicitly sets it, but want to provide a
> default value otherwise.  Computing the default value requires
> an SVE target, so we only want to compute it on demand.
> 
> aarch64_vg uses a hard-coded .inst in order to avoid a build
> dependency on binutils 2.28 or later.

I think the new hook needs documenting in tm.texi , particularly as it
implies a conditional write to VALUE.

I think this is practice we've seen before, for example
DWARF_REG_TO_UNWIND_COLUMN and REG_VALUE_IN_UNWIND_CONTEXT are defined
in libgcc/config and documented in tm.texi.

Otherwise, the AArch64 parts of this are OK. You mind need to wait for
someone to OK the unwind-dw2.c part.

Thanks,
James

Reviewed-by: James Greenhalgh <james.greenhalgh@arm.com>

> 2017-11-03  Richard Sandiford  <richard.sandiford@linaro.org>
> 
> libgcc/
> 	* config/aarch64/value-unwind.h (aarch64_vg): New function.
> 	(DWARF_LAZY_REGISTER_VALUE): Define.
> 	* unwind-dw2.c (_Unwind_GetGR): Use DWARF_LAZY_REGISTER_VALUE
> 	to provide a fallback register value.
> 
> gcc/testsuite/
> 	* g++.target/aarch64/aarch64.exp: New harness.
> 	* g++.target/aarch64/sve_catch_1.C: New test.
> 	* g++.target/aarch64/sve_catch_2.C: Likewise.
> 	* g++.target/aarch64/sve_catch_3.C: Likewise.
> 	* g++.target/aarch64/sve_catch_4.C: Likewise.
> 	* g++.target/aarch64/sve_catch_5.C: Likewise.
> 	* g++.target/aarch64/sve_catch_6.C: Likewise.
>

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [0/4] [AArch64] Add SVE support
  2017-11-03 17:45 [0/4] [AArch64] Add SVE support Richard Sandiford
                   ` (3 preceding siblings ...)
  2017-11-03 17:52 ` [4/4] SVE unwinding Richard Sandiford
@ 2017-11-24 16:34 ` Richard Sandiford
  2018-01-06 18:09   ` James Greenhalgh
  4 siblings, 1 reply; 18+ messages in thread
From: Richard Sandiford @ 2017-11-24 16:34 UTC (permalink / raw)
  To: gcc-patches; +Cc: richard.earnshaw, james.greenhalgh, marcus.shawcroft

[-- Attachment #1: Type: text/plain, Size: 17607 bytes --]

Richard Sandiford <richard.sandiford@linaro.org> writes:
> This series adds support for ARM's Scalable Vector Extension.
> More details on SVE can be found here:
>
>   https://developer.arm.com/products/architecture/a-profile/docs/arm-architecture-reference-manual-supplement-armv8-a
>
> There are four parts for ease of review, but it probably makes
> sense to commit them as one patch.
>
> The series plugs SVE into the current vectorisation framework without
> adding any new features to the framework itself.  This means for example
> that vector loops still handle full vectors, with a scalar epilogue loop
> being needed for the rest.  Later patches add support for other features
> like fully-predicated loops.
>
> The patches build on top of the various series that I've already posted.
> Sorry that there were so many, and thanks again for all the reviews.
>
> Tested on aarch64-linux-gnu without SVE and aarch64-linux-gnu with SVE
> (in the default vector-length agnostic mode).  Also tested with
> -msve-vector-bits=256 and -msve-vector-bits=512 to select 256-bit
> and 512-bit SVE registers.

Here's an update based on an off-list discussion with the maintainers.
Changes since v1:

- Changed the names of the modes from 256-bit vectors to "VNx"
  + a 128-bit mode name, e.g. V32QI -> VNx16QI.

- Added an "sve" attribute and used it in the "enabled" attribute.
  This allows generic aarch64.md patterns to disable things related
  to SVE on non-SVE targets; previously this was implicit through the
  constraints.

- Improved the consistency of the constraint names, specifically:

  Ua?: addition contraints (already used for Uaa)
  Us?: general scalar constraints (already used for various other scalars)
  Ut?: memory constraints (unchanged from v1)
  vs?: vector SVE constraints (mostly unchanged, but now includes FP
       as well as integer constraints)

  There's still the general "Dm" (minus one) constraint, for consistency
  with "Dz" (zero).

- Added missing register descriptions above FIXED_REGISTERS.

- "should"/"is expected to" -> "must".

- Added more commentary to things like regmode_natural_size.

I also did a before and after comparison of the testsuite output
for base AArch64 (but using the new FIRST_PSEUDO_REGISTER definition
to avoid changes to hash values).  There were no differences.

Thanks,
Richard


2017-11-24  Richard Sandiford  <richard.sandiford@linaro.org>
	    Alan Hayward  <alan.hayward@arm.com>
	    David Sherwood  <david.sherwood@arm.com>

gcc/
	* doc/invoke.texi (-msve-vector-bits=): Document new option.
	(sve): Document new AArch64 extension.
	* doc/md.texi (w): Extend the description of the AArch64
	constraint to include SVE vectors.
	(Upl, Upa): Document new AArch64 predicate constraints.
	* config/aarch64/aarch64-opts.h (aarch64_sve_vector_bits_enum): New
	enum.
	* config/aarch64/aarch64.opt (sve_vector_bits): New enum.
	(msve-vector-bits=): New option.
	* config/aarch64/aarch64-option-extensions.def (fp, simd): Disable
	SVE when these are disabled.
	(sve): New extension.
	* config/aarch64/aarch64-modes.def: Define SVE vector and predicate
	modes.  Adjust their number of units based on aarch64_sve_vg.
	(MAX_BITSIZE_MODE_ANY_MODE): Define.
	* config/aarch64/aarch64-protos.h (ADDR_QUERY_ANY): New
	aarch64_addr_query_type.
	(aarch64_const_vec_all_same_in_range_p, aarch64_sve_pred_mode)
	(aarch64_sve_cnt_immediate_p, aarch64_sve_addvl_addpl_immediate_p)
	(aarch64_sve_inc_dec_immediate_p, aarch64_add_offset_temporaries)
	(aarch64_split_add_offset, aarch64_output_sve_cnt_immediate)
	(aarch64_output_sve_addvl_addpl, aarch64_output_sve_inc_dec_immediate)
	(aarch64_output_sve_mov_immediate, aarch64_output_ptrue): Declare.
	(aarch64_simd_imm_zero_p): Delete.
	(aarch64_check_zero_based_sve_index_immediate): Declare.
	(aarch64_sve_index_immediate_p, aarch64_sve_arith_immediate_p)
	(aarch64_sve_bitmask_immediate_p, aarch64_sve_dup_immediate_p)
	(aarch64_sve_cmp_immediate_p, aarch64_sve_float_arith_immediate_p)
	(aarch64_sve_float_mul_immediate_p): Likewise.
	(aarch64_classify_symbol): Take the offset as a HOST_WIDE_INT
	rather than an rtx.
	(aarch64_sve_ld1r_operand_p, aarch64_sve_ldr_operand_p): Declare.
	(aarch64_expand_mov_immediate): Take a gen_vec_duplicate callback.
	(aarch64_emit_sve_pred_move, aarch64_expand_sve_mem_move): Declare.
	(aarch64_expand_sve_vec_cmp_int, aarch64_expand_sve_vec_cmp_float)
	(aarch64_expand_sve_vcond, aarch64_expand_sve_vec_perm): Declare.
	(aarch64_regmode_natural_size): Likewise.
	* config/aarch64/aarch64.h (AARCH64_FL_SVE): New macro.
	(AARCH64_FL_V8_3, AARCH64_FL_RCPC, AARCH64_FL_DOTPROD): Shift
	left one place.
	(AARCH64_ISA_SVE, TARGET_SVE): New macros.
	(FIXED_REGISTERS, CALL_USED_REGISTERS, REGISTER_NAMES): Add entries
	for VG and the SVE predicate registers.
	(V_ALIASES): Add a "z"-prefixed alias.
	(FIRST_PSEUDO_REGISTER): Change to P15_REGNUM + 1.
	(AARCH64_DWARF_VG, AARCH64_DWARF_P0): New macros.
	(PR_REGNUM_P, PR_LO_REGNUM_P): Likewise.
	(PR_LO_REGS, PR_HI_REGS, PR_REGS): New reg_classes.
	(REG_CLASS_NAMES): Add entries for them.
	(REG_CLASS_CONTENTS): Likewise.  Update ALL_REGS to include VG
	and the predicate registers.
	(aarch64_sve_vg): Declare.
	(BITS_PER_SVE_VECTOR, BYTES_PER_SVE_VECTOR, BYTES_PER_SVE_PRED)
	(SVE_BYTE_MODE, MAX_COMPILE_TIME_VEC_BYTES): New macros.
	(REGMODE_NATURAL_SIZE): Define.
	* config/aarch64/aarch64-c.c (aarch64_update_cpp_builtins): Handle
	SVE macros.
	* config/aarch64/aarch64.c: Include cfgrtl.h.
	(simd_immediate_info): Add a constructor for series vectors,
	and an associated step field.
	(aarch64_sve_vg): New variable.
	(aarch64_dbx_register_number): Handle VG and the predicate registers.
	(aarch64_vect_struct_mode_p, aarch64_vector_mode_p): Delete.
	(VEC_ADVSIMD, VEC_SVE_DATA, VEC_SVE_PRED, VEC_STRUCT, VEC_ANY_SVE)
	(VEC_ANY_DATA, VEC_STRUCT): New constants.
	(aarch64_advsimd_struct_mode_p, aarch64_sve_pred_mode_p)
	(aarch64_classify_vector_mode, aarch64_vector_data_mode_p)
	(aarch64_sve_data_mode_p, aarch64_pred_mode, aarch64_get_mask_mode):
	New functions.
	(aarch64_hard_regno_nregs): Handle SVE data modes for FP_REGS
	and FP_LO_REGS.  Handle PR_REGS, PR_LO_REGS and PR_HI_REGS.
	(aarch64_hard_regno_mode_ok): Handle VG.  Also handle the SVE
	predicate modes and predicate registers.  Explicitly restrict
	GPRs to modes of 16 bytes or smaller.  Only allow FP registers
	to store a vector mode if it is recognized by
	aarch64_classify_vector_mode.
	(aarch64_regmode_natural_size): New function.
	(aarch64_hard_regno_caller_save_mode): Return the original mode
	for predicates.
	(aarch64_sve_cnt_immediate_p, aarch64_output_sve_cnt_immediate)
	(aarch64_sve_addvl_addpl_immediate_p, aarch64_output_sve_addvl_addpl)
	(aarch64_sve_inc_dec_immediate_p, aarch64_output_sve_inc_dec_immediate)
	(aarch64_add_offset_1_temporaries, aarch64_offset_temporaries): New
	functions.
	(aarch64_add_offset): Add a temp2 parameter.  Assert that temp1
	does not overlap dest if the function is frame-related.  Handle
	SVE constants.
	(aarch64_split_add_offset): New function.
	(aarch64_add_sp, aarch64_sub_sp): Add temp2 parameters and pass
	them aarch64_add_offset.
	(aarch64_allocate_and_probe_stack_space): Add a temp2 parameter
	and update call to aarch64_sub_sp.
	(aarch64_add_cfa_expression): New function.
	(aarch64_expand_prologue): Pass extra temporary registers to the
	functions above.  Handle the case in which we need to emit new
	DW_CFA_expressions for registers that were originally saved
	relative to the stack pointer, but now have to be expressed
	relative to the frame pointer.
	(aarch64_output_mi_thunk): Pass extra temporary registers to the
	functions above.
	(aarch64_expand_epilogue): Likewise.  Prevent inheritance of
	IP0 and IP1 values for SVE frames.
	(aarch64_expand_vec_series): New function.
	(aarch64_expand_mov_immediate): Add a gen_vec_duplicate parameter.
	Handle SVE constants.  Use emit_move_insn to move a force_const_mem
	into the register, rather than emitting a SET directly.
	(aarch64_emit_sve_pred_move, aarch64_expand_sve_mem_move)
	(aarch64_get_reg_raw_mode, offset_4bit_signed_scaled_p)
	(offset_6bit_unsigned_scaled_p, aarch64_offset_7bit_signed_scaled_p)
	(offset_9bit_signed_scaled_p): New functions.
	(aarch64_replicate_bitmask_imm): New function.
	(aarch64_bitmask_imm): Use it.
	(aarch64_cannot_force_const_mem): Reject expressions involving
	a CONST_POLY_INT.  Update call to aarch64_classify_symbol.
	(aarch64_classify_index): Handle SVE indices, by requiring
	a plain register index with a scale that matches the element size.
	(aarch64_classify_address): Handle SVE addresses.  Assert that
	the mode of the address is VOIDmode or an integer mode.
	Update call to aarch64_classify_symbol.
	(aarch64_classify_symbolic_expression): Update call to
	aarch64_classify_symbol.
	(aarch64_const_vec_all_same_in_range_p): Extend to VEC_DUPLICATE
	constants by using const_vec_duplicate_p.
	(aarch64_const_vec_all_in_range_p): New function.
	(aarch64_print_vector_float_operand): Likewise.
	(aarch64_print_operand): Handle 'N' and 'C'.  Use "zN" rather than
	"vN" for FP registers with SVE modes.  Handle (const ...) vectors
	and the FP immediates 1.0 and 0.5.
	(aarch64_print_operand_address): Use ADDR_QUERY_ANY.  Handle
	SVE addresses.
	(aarch64_regno_regclass): Handle predicate registers.
	(aarch64_secondary_reload): Handle big-endian reloads of SVE
	data modes.
	(aarch64_class_max_nregs): Handle SVE modes and predicate registers.
	(aarch64_rtx_costs): Check for ADDVL and ADDPL instructions.
	(aarch64_convert_sve_vector_bits): New function.
	(aarch64_override_options): Use it to handle -msve-vector-bits=.
	(aarch64_classify_symbol): Take the offset as a HOST_WIDE_INT
	rather than an rtx.
	(aarch64_legitimate_constant_p): Use aarch64_classify_vector_mode.
	Handle SVE vector and predicate modes.  Accept VL-based constants
	that need only one temporary register, and VL offsets that require
	no temporary registers.
	(aarch64_conditional_register_usage): Mark the predicate registers
	as fixed if SVE isn't available.
	(aarch64_vector_mode_supported_p): Use aarch64_classify_vector_mode.
	Return true for SVE vector and predicate modes.
	(aarch64_simd_container_mode): Take the number of bits as a poly_int64
	rather than an unsigned int.  Handle SVE modes.
	(aarch64_preferred_simd_mode): Update call accordingly.  Handle
	SVE modes.
	(aarch64_autovectorize_vector_sizes): Add BYTES_PER_SVE_VECTOR
	if SVE is enabled.
	(aarch64_sve_index_immediate_p, aarch64_sve_arith_immediate_p)
	(aarch64_sve_bitmask_immediate_p, aarch64_sve_dup_immediate_p)
	(aarch64_sve_cmp_immediate_p, aarch64_sve_float_arith_immediate_p)
	(aarch64_sve_float_mul_immediate_p): New functions.
	(aarch64_sve_valid_immediate): New function.
	(aarch64_simd_valid_immediate): Use it as the fallback for SVE vectors.
	Explicitly reject structure modes.  Check for INDEX constants.
	Handle PTRUE and PFALSE constants.
	(aarch64_check_zero_based_sve_index_immediate): New function.
	(aarch64_simd_imm_zero_p): Delete.
	(aarch64_mov_operand_p): Use aarch64_simd_valid_immediate for
	vector modes.  Accept constants in the range of CNT[BHWD].
	(aarch64_simd_scalar_immediate_valid_for_move): Explicitly
	ask for an Advanced SIMD mode.
	(aarch64_sve_ld1r_operand_p, aarch64_sve_ldr_operand_p): New functions.
	(aarch64_simd_vector_alignment): Handle SVE predicates.
	(aarch64_vectorize_preferred_vector_alignment): New function.
	(aarch64_simd_vector_alignment_reachable): Use it instead of
	the vector size.
	(aarch64_shift_truncation_mask): Use aarch64_vector_data_mode_p.
	(aarch64_output_sve_mov_immediate, aarch64_output_ptrue): New
	functions.
	(MAX_VECT_LEN): Delete.
	(expand_vec_perm_d): Add a vec_flags field.
	(emit_unspec2, aarch64_expand_sve_vec_perm): New functions.
	(aarch64_evpc_trn, aarch64_evpc_uzp, aarch64_evpc_zip)
	(aarch64_evpc_ext): Don't apply a big-endian lane correction
	for SVE modes.
	(aarch64_evpc_rev): Rename to...
	(aarch64_evpc_rev_local): ...this.  Use a predicated operation for SVE.
	(aarch64_evpc_rev_global): New function.
	(aarch64_evpc_dup): Enforce a 64-byte range for SVE DUP.
	(aarch64_evpc_tbl): Use MAX_COMPILE_TIME_VEC_BYTES instead of
	MAX_VECT_LEN.
	(aarch64_evpc_sve_tbl): New function.
	(aarch64_expand_vec_perm_const_1): Update after rename of
	aarch64_evpc_rev.  Handle SVE permutes too, trying
	aarch64_evpc_rev_global and using aarch64_evpc_sve_tbl rather
	than aarch64_evpc_tbl.
	(aarch64_expand_vec_perm_const): Initialize vec_flags.
	(aarch64_vectorize_vec_perm_const_ok): Likewise.
	(aarch64_sve_cmp_operand_p, aarch64_unspec_cond_code)
	(aarch64_gen_unspec_cond, aarch64_expand_sve_vec_cmp_int)
	(aarch64_emit_unspec_cond, aarch64_emit_unspec_cond_or)
	(aarch64_emit_inverted_unspec_cond, aarch64_expand_sve_vec_cmp_float)
	(aarch64_expand_sve_vcond): New functions.
	(aarch64_modes_tieable_p): Use aarch64_vector_data_mode_p instead
	of aarch64_vector_mode_p.
	(aarch64_dwarf_poly_indeterminate_value): New function.
	(aarch64_compute_pressure_classes): Likewise.
	(aarch64_can_change_mode_class): Likewise.
	(TARGET_GET_RAW_RESULT_MODE, TARGET_GET_RAW_ARG_MODE): Redefine.
	(TARGET_VECTORIZE_PREFERRED_VECTOR_ALIGNMENT): Likewise.
	(TARGET_VECTORIZE_GET_MASK_MODE): Likewise.
	(TARGET_DWARF_POLY_INDETERMINATE_VALUE): Likewise.
	(TARGET_COMPUTE_PRESSURE_CLASSES): Likewise.
	(TARGET_CAN_CHANGE_MODE_CLASS): Likewise.
	* config/aarch64/constraints.md (Upa, Upl, Uav, Uat, Usv, Usi, Utr)
	(Uty, Dm, vsa, vsc, vsd, vsi, vsn, vsl, vsm, vsA, vsM, vsN): New
	constraints.
	(Dn, Dl, Dr): Accept const as well as const_vector.
	(Dz): Likewise.  Compare against CONST0_RTX.
	* config/aarch64/iterators.md: Refer to "Advanced SIMD" instead
	of "vector" where appropriate.
	(SVE_ALL, SVE_BH, SVE_BHS, SVE_BHSI, SVE_HSDI, SVE_HSF, SVE_SD)
	(SVE_SDI, SVE_I, SVE_F, PRED_ALL, PRED_BHS): New mode iterators.
	(UNSPEC_SEL, UNSPEC_ANDF, UNSPEC_IORF, UNSPEC_XORF, UNSPEC_COND_LT)
	(UNSPEC_COND_LE, UNSPEC_COND_EQ, UNSPEC_COND_NE, UNSPEC_COND_GE)
	(UNSPEC_COND_GT, UNSPEC_COND_LO, UNSPEC_COND_LS, UNSPEC_COND_HS)
	(UNSPEC_COND_HI, UNSPEC_COND_UO): New unspecs.
	(Vetype, VEL, Vel, VWIDE, Vwide, vw, vwcore, V_INT_EQUIV)
	(v_int_equiv): Extend to SVE modes.
	(Vesize, V128, v128, Vewtype, V_FP_EQUIV, v_fp_equiv, VPRED): New
	mode attributes.
	(LOGICAL_OR, SVE_INT_UNARY, SVE_FP_UNARY): New code iterators.
	(optab): Handle popcount, smin, smax, umin, umax, abs and sqrt.
	(logical_nn, lr, sve_int_op, sve_fp_op): New code attributs.
	(LOGICALF, OPTAB_PERMUTE, UNPACK, UNPACK_UNSIGNED, SVE_COND_INT_CMP)
	(SVE_COND_FP_CMP): New int iterators.
	(perm_hilo): Handle the new unpack unspecs.
	(optab, logicalf_op, su, perm_optab, cmp_op, imm_con): New int
	attributes.
	* config/aarch64/predicates.md (aarch64_sve_cnt_immediate)
	(aarch64_sve_addvl_addpl_immediate, aarch64_split_add_offset_immediate)
	(aarch64_pluslong_or_poly_operand, aarch64_nonmemory_operand)
	(aarch64_equality_operator, aarch64_constant_vector_operand)
	(aarch64_sve_ld1r_operand, aarch64_sve_ldr_operand): New predicates.
	(aarch64_sve_nonimmediate_operand): Likewise.
	(aarch64_sve_general_operand): Likewise.
	(aarch64_sve_dup_operand, aarch64_sve_arith_immediate): Likewise.
	(aarch64_sve_sub_arith_immediate, aarch64_sve_inc_dec_immediate)
	(aarch64_sve_logical_immediate, aarch64_sve_mul_immediate): Likewise.
	(aarch64_sve_dup_immediate, aarch64_sve_cmp_vsc_immediate): Likewise.
	(aarch64_sve_cmp_vsd_immediate, aarch64_sve_index_immediate): Likewise.
	(aarch64_sve_float_arith_immediate): Likewise.
	(aarch64_sve_float_arith_with_sub_immediate): Likewise.
	(aarch64_sve_float_mul_immediate, aarch64_sve_arith_operand): Likewise.
	(aarch64_sve_add_operand, aarch64_sve_logical_operand): Likewise.
	(aarch64_sve_lshift_operand, aarch64_sve_rshift_operand): Likewise.
	(aarch64_sve_mul_operand, aarch64_sve_cmp_vsc_operand): Likewise.
	(aarch64_sve_cmp_vsd_operand, aarch64_sve_index_operand): Likewise.
	(aarch64_sve_float_arith_operand): Likewise.
	(aarch64_sve_float_arith_with_sub_operand): Likewise.
	(aarch64_sve_float_mul_operand): Likewise.
	(aarch64_sve_vec_perm_operand): Likewise.
	(aarch64_pluslong_operand): Include aarch64_sve_addvl_addpl_immediate.
	(aarch64_mov_operand): Accept const_poly_int and const_vector.
	(aarch64_simd_lshift_imm, aarch64_simd_rshift_imm): Accept const
	as well as const_vector.
	(aarch64_simd_imm_zero, aarch64_simd_imm_minus_one): Move earlier
	in file.  Use CONST0_RTX and CONSTM1_RTX.
	(aarch64_simd_or_scalar_imm_zero): Likewise.  Add match_codes.
	(aarch64_simd_reg_or_zero): Accept const as well as const_vector.
	Use aarch64_simd_imm_zero.
	* config/aarch64/aarch64-sve.md: New file.
	* config/aarch64/aarch64.md: Include it.
	(VG_REGNUM, P0_REGNUM, P7_REGNUM, P15_REGNUM): New register numbers.
	(UNSPEC_REV, UNSPEC_LD1_SVE, UNSPEC_ST1_SVE, UNSPEC_MERGE_PTRUE)
	(UNSPEC_PTEST_PTRUE, UNSPEC_UNPACKSHI, UNSPEC_UNPACKUHI)
	(UNSPEC_UNPACKSLO, UNSPEC_UNPACKULO, UNSPEC_PACK)
	(UNSPEC_FLOAT_CONVERT, UNSPEC_WHILE_LO): New unspec constants.
	(sve): New attribute.
	(enabled): Disable instructions with the sve attribute unless
	TARGET_SVE.
	(movqi, movhi): Pass CONST_POLY_INT operaneds through
	aarch64_expand_mov_immediate.
	(*mov<mode>_aarch64, *movsi_aarch64, *movdi_aarch64): Handle
	CNT[BHSD] immediates.
	(movti): Split CONST_POLY_INT moves into two halves.
	(add<mode>3): Accept aarch64_pluslong_or_poly_operand.
	Split additions that need a temporary here if the destination
	is the stack pointer.
	(*add<mode>3_aarch64): Handle ADDVL and ADDPL immediates.
	(*add<mode>3_poly_1): New instruction.
	(set_clobber_cc): New expander.


[-- Attachment #2: sve-01-main.diff.gz --]
[-- Type: application/gzip, Size: 56871 bytes --]

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [0/4] [AArch64] Add SVE support
  2017-11-24 16:34 ` [0/4] [AArch64] Add SVE support Richard Sandiford
@ 2018-01-06 18:09   ` James Greenhalgh
  2018-01-06 19:39     ` Richard Sandiford
  0 siblings, 1 reply; 18+ messages in thread
From: James Greenhalgh @ 2018-01-06 18:09 UTC (permalink / raw)
  To: Richard Sandiford; +Cc: gcc-patches, Richard Earnshaw, Marcus Shawcroft, nd

On Fri, Nov 24, 2017 at 03:59:58PM +0000, Richard Sandiford wrote:
> Richard Sandiford <richard.sandiford@linaro.org> writes:
> > This series adds support for ARM's Scalable Vector Extension.
> > More details on SVE can be found here:
> >
> >   https://developer.arm.com/products/architecture/a-profile/docs/arm-architecture-reference-manual-supplement-armv8-a
> >
> > There are four parts for ease of review, but it probably makes
> > sense to commit them as one patch.
> >
> > The series plugs SVE into the current vectorisation framework without
> > adding any new features to the framework itself.  This means for example
> > that vector loops still handle full vectors, with a scalar epilogue loop
> > being needed for the rest.  Later patches add support for other features
> > like fully-predicated loops.
> >
> > The patches build on top of the various series that I've already posted.
> > Sorry that there were so many, and thanks again for all the reviews.
> >
> > Tested on aarch64-linux-gnu without SVE and aarch64-linux-gnu with SVE
> > (in the default vector-length agnostic mode).  Also tested with
> > -msve-vector-bits=256 and -msve-vector-bits=512 to select 256-bit
> > and 512-bit SVE registers.
> 
> Here's an update based on an off-list discussion with the maintainers.
> Changes since v1:
> 
> - Changed the names of the modes from 256-bit vectors to "VNx"
>   + a 128-bit mode name, e.g. V32QI -> VNx16QI.
> 
> - Added an "sve" attribute and used it in the "enabled" attribute.
>   This allows generic aarch64.md patterns to disable things related
>   to SVE on non-SVE targets; previously this was implicit through the
>   constraints.
> 
> - Improved the consistency of the constraint names, specifically:
> 
>   Ua?: addition contraints (already used for Uaa)
>   Us?: general scalar constraints (already used for various other scalars)
>   Ut?: memory constraints (unchanged from v1)
>   vs?: vector SVE constraints (mostly unchanged, but now includes FP
>        as well as integer constraints)
> 
>   There's still the general "Dm" (minus one) constraint, for consistency
>   with "Dz" (zero).
> 
> - Added missing register descriptions above FIXED_REGISTERS.
> 
> - "should"/"is expected to" -> "must".
> 
> - Added more commentary to things like regmode_natural_size.
> 
> I also did a before and after comparison of the testsuite output
> for base AArch64 (but using the new FIRST_PSEUDO_REGISTER definition
> to avoid changes to hash values).  There were no differences.

I seem to have lost 4/4 in my mailer. Would you mind pinging it if I have
any action to take? Also, please ping any other SVE parts I've missed that
you haven't pinged in recent days.

I'll get to 1/4 in good time, but at 5000+ lines, it will need at least
another day! I'd like to OK everything around it which is outstanding, then
build up the courage for the big patch!

Thanks,
James

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [0/4] [AArch64] Add SVE support
  2018-01-06 18:09   ` James Greenhalgh
@ 2018-01-06 19:39     ` Richard Sandiford
       [not found]       ` <20180107210818.GQ6993@arm.com>
  0 siblings, 1 reply; 18+ messages in thread
From: Richard Sandiford @ 2018-01-06 19:39 UTC (permalink / raw)
  To: James Greenhalgh; +Cc: gcc-patches, Richard Earnshaw, Marcus Shawcroft, nd

James Greenhalgh <james.greenhalgh@arm.com> writes:
> On Fri, Nov 24, 2017 at 03:59:58PM +0000, Richard Sandiford wrote:
>> Richard Sandiford <richard.sandiford@linaro.org> writes:
>> > This series adds support for ARM's Scalable Vector Extension.
>> > More details on SVE can be found here:
>> >
>> >   https://developer.arm.com/products/architecture/a-profile/docs/arm-architecture-reference-manual-supplement-armv8-a
>> >
>> > There are four parts for ease of review, but it probably makes
>> > sense to commit them as one patch.
>> >
>> > The series plugs SVE into the current vectorisation framework without
>> > adding any new features to the framework itself.  This means for example
>> > that vector loops still handle full vectors, with a scalar epilogue loop
>> > being needed for the rest.  Later patches add support for other features
>> > like fully-predicated loops.
>> >
>> > The patches build on top of the various series that I've already posted.
>> > Sorry that there were so many, and thanks again for all the reviews.
>> >
>> > Tested on aarch64-linux-gnu without SVE and aarch64-linux-gnu with SVE
>> > (in the default vector-length agnostic mode).  Also tested with
>> > -msve-vector-bits=256 and -msve-vector-bits=512 to select 256-bit
>> > and 512-bit SVE registers.
>> 
>> Here's an update based on an off-list discussion with the maintainers.
>> Changes since v1:
>> 
>> - Changed the names of the modes from 256-bit vectors to "VNx"
>>   + a 128-bit mode name, e.g. V32QI -> VNx16QI.
>> 
>> - Added an "sve" attribute and used it in the "enabled" attribute.
>>   This allows generic aarch64.md patterns to disable things related
>>   to SVE on non-SVE targets; previously this was implicit through the
>>   constraints.
>> 
>> - Improved the consistency of the constraint names, specifically:
>> 
>>   Ua?: addition contraints (already used for Uaa)
>>   Us?: general scalar constraints (already used for various other scalars)
>>   Ut?: memory constraints (unchanged from v1)
>>   vs?: vector SVE constraints (mostly unchanged, but now includes FP
>>        as well as integer constraints)
>> 
>>   There's still the general "Dm" (minus one) constraint, for consistency
>>   with "Dz" (zero).
>> 
>> - Added missing register descriptions above FIXED_REGISTERS.
>> 
>> - "should"/"is expected to" -> "must".
>> 
>> - Added more commentary to things like regmode_natural_size.
>> 
>> I also did a before and after comparison of the testsuite output
>> for base AArch64 (but using the new FIRST_PSEUDO_REGISTER definition
>> to avoid changes to hash values).  There were no differences.
>
> I seem to have lost 4/4 in my mailer. Would you mind pinging it if I have
> any action to take? Also, please ping any other SVE parts I've missed that
> you haven't pinged in recent days.

4/4 was the unwinder support, which you've already reviewed (thanks):

  https://gcc.gnu.org/ml/gcc-patches/2017-11/msg00251.html

There are two other AArch64 patches that I'll ping in a sec.

There are also quite a few patches that add target-independent
support for something and also add corresponding SVE code
to config/aarch64 and/or code quality tests to gcc.target/aarch64.
I think the full list of those is:

  Patches with config/aarch64 code:

    https://gcc.gnu.org/ml/gcc-patches/2017-11/msg02066.html
    https://gcc.gnu.org/ml/gcc-patches/2017-11/msg02068.html
    https://gcc.gnu.org/ml/gcc-patches/2017-11/msg01484.html
    https://gcc.gnu.org/ml/gcc-patches/2017-11/msg01485.html
    https://gcc.gnu.org/ml/gcc-patches/2017-11/msg01491.html
    https://gcc.gnu.org/ml/gcc-patches/2017-11/msg01494.html
    https://gcc.gnu.org/ml/gcc-patches/2017-11/msg01497.html
    https://gcc.gnu.org/ml/gcc-patches/2017-11/msg01506.html
    https://gcc.gnu.org/ml/gcc-patches/2017-11/msg01570.html
    https://gcc.gnu.org/ml/gcc-patches/2017-11/msg01575.html

  Patches with gcc.target/aarch64 tests but no config/aarch64 changes,
  with the tests being in the spirit of the ones added in the original
  SVE patch:

    https://gcc.gnu.org/ml/gcc-patches/2017-11/msg00752.html
    https://gcc.gnu.org/ml/gcc-patches/2017-11/msg01446.html
    https://gcc.gnu.org/ml/gcc-patches/2017-11/msg01489.html
    https://gcc.gnu.org/ml/gcc-patches/2017-11/msg01490.html
    https://gcc.gnu.org/ml/gcc-patches/2017-11/msg01498.html
    https://gcc.gnu.org/ml/gcc-patches/2017-11/msg01499.html
    https://gcc.gnu.org/ml/gcc-patches/2017-11/msg01572.html
    https://gcc.gnu.org/ml/gcc-patches/2017-11/msg01573.html
    https://gcc.gnu.org/ml/gcc-patches/2017-11/msg01577.html

The target-independent pieces have already been reviewed (except where
I'll ping seperately).

Thanks,
Richard

^ permalink raw reply	[flat|nested] 18+ messages in thread

[parent not found: <20180107210818.GQ6993@arm.com>]

* Re: [0/4] [AArch64] Add SVE support
       [not found]       ` <20180107210818.GQ6993@arm.com>
@ 2018-01-07 21:10         ` James Greenhalgh
  0 siblings, 0 replies; 18+ messages in thread
From: James Greenhalgh @ 2018-01-07 21:10 UTC (permalink / raw)
  To: gcc-patches, Richard Earnshaw, Marcus Shawcroft, richard.sandiford; +Cc: nd

(Resending; this bounced)

On Sat, Jan 06, 2018 at 07:39:46PM +0000, Richard Sandiford wrote:
> James Greenhalgh <james.greenhalgh@arm.com> writes:
> > On Fri, Nov 24, 2017 at 03:59:58PM +0000, Richard Sandiford wrote:
> >> Richard Sandiford <richard.sandiford@linaro.org> writes:
> >> > This series adds support for ARM's Scalable Vector Extension.
> >> > More details on SVE can be found here:
> >> >
> >> >   https://developer.arm.com/products/architecture/a-profile/docs/arm-architecture-reference-manual-supplement-armv8-a
> >> >
> >> > There are four parts for ease of review, but it probably makes
> >> > sense to commit them as one patch.
> >> >
> >> > The series plugs SVE into the current vectorisation framework without
> >> > adding any new features to the framework itself.  This means for example
> >> > that vector loops still handle full vectors, with a scalar epilogue loop
> >> > being needed for the rest.  Later patches add support for other features
> >> > like fully-predicated loops.
> >> >
> >> > The patches build on top of the various series that I've already posted.
> >> > Sorry that there were so many, and thanks again for all the reviews.
> >> >
> >> > Tested on aarch64-linux-gnu without SVE and aarch64-linux-gnu with SVE
> >> > (in the default vector-length agnostic mode).  Also tested with
> >> > -msve-vector-bits=256 and -msve-vector-bits=512 to select 256-bit
> >> > and 512-bit SVE registers.
> >> 
> >> Here's an update based on an off-list discussion with the maintainers.
> >> Changes since v1:
> >> 
> >> - Changed the names of the modes from 256-bit vectors to "VNx"
> >>   + a 128-bit mode name, e.g. V32QI -> VNx16QI.
> >> 
> >> - Added an "sve" attribute and used it in the "enabled" attribute.
> >>   This allows generic aarch64.md patterns to disable things related
> >>   to SVE on non-SVE targets; previously this was implicit through the
> >>   constraints.
> >> 
> >> - Improved the consistency of the constraint names, specifically:
> >> 
> >>   Ua?: addition contraints (already used for Uaa)
> >>   Us?: general scalar constraints (already used for various other scalars)
> >>   Ut?: memory constraints (unchanged from v1)
> >>   vs?: vector SVE constraints (mostly unchanged, but now includes FP
> >>        as well as integer constraints)
> >> 
> >>   There's still the general "Dm" (minus one) constraint, for consistency
> >>   with "Dz" (zero).
> >> 
> >> - Added missing register descriptions above FIXED_REGISTERS.
> >> 
> >> - "should"/"is expected to" -> "must".
> >> 
> >> - Added more commentary to things like regmode_natural_size.
> >> 
> >> I also did a before and after comparison of the testsuite output
> >> for base AArch64 (but using the new FIRST_PSEUDO_REGISTER definition
> >> to avoid changes to hash values).  There were no differences.
> >
> > I seem to have lost 4/4 in my mailer. Would you mind pinging it if I have
> > any action to take? Also, please ping any other SVE parts I've missed that
> > you haven't pinged in recent days.
> 
> 4/4 was the unwinder support, which you've already reviewed (thanks):
> 
>   https://gcc.gnu.org/ml/gcc-patches/2017-11/msg00251.html
> 
> There are two other AArch64 patches that I'll ping in a sec.

These, and 1/4 will now have to wait for later this week I'm afraid, I
hope I'll have a chance by midweek. I'll also see how far I can get with
the Armv8.4-A review this week to avoid further sequencing issues with who
goes first. Thanks for the preemptive rebase.

> There are also quite a few patches that add target-independent
> support for something and also add corresponding SVE code
> to config/aarch64 and/or code quality tests to gcc.target/aarch64.
> I think the full list of those is:
> 
>   Patches with config/aarch64 code:
> 
>     https://gcc.gnu.org/ml/gcc-patches/2017-11/msg02066.html
>     https://gcc.gnu.org/ml/gcc-patches/2017-11/msg02068.html
>     https://gcc.gnu.org/ml/gcc-patches/2017-11/msg01484.html
>     https://gcc.gnu.org/ml/gcc-patches/2017-11/msg01485.html
>     https://gcc.gnu.org/ml/gcc-patches/2017-11/msg01491.html
>     https://gcc.gnu.org/ml/gcc-patches/2017-11/msg01494.html
>     https://gcc.gnu.org/ml/gcc-patches/2017-11/msg01497.html
>     https://gcc.gnu.org/ml/gcc-patches/2017-11/msg01506.html
>     https://gcc.gnu.org/ml/gcc-patches/2017-11/msg01570.html
>     https://gcc.gnu.org/ml/gcc-patches/2017-11/msg01575.html

I think I've put an OK on most of these. The review was overall
straightforward - sorry for missing the action left on me earlier.

>   Patches with gcc.target/aarch64 tests but no config/aarch64 changes,
>   with the tests being in the spirit of the ones added in the original
>   SVE patch:
> 
>     https://gcc.gnu.org/ml/gcc-patches/2017-11/msg00752.html
>     https://gcc.gnu.org/ml/gcc-patches/2017-11/msg01446.html
>     https://gcc.gnu.org/ml/gcc-patches/2017-11/msg01489.html
>     https://gcc.gnu.org/ml/gcc-patches/2017-11/msg01490.html
>     https://gcc.gnu.org/ml/gcc-patches/2017-11/msg01498.html
>     https://gcc.gnu.org/ml/gcc-patches/2017-11/msg01499.html
>     https://gcc.gnu.org/ml/gcc-patches/2017-11/msg01572.html
>     https://gcc.gnu.org/ml/gcc-patches/2017-11/msg01573.html
>     https://gcc.gnu.org/ml/gcc-patches/2017-11/msg01577.html
> 
> The target-independent pieces have already been reviewed (except where
> I'll ping seperately).

And these are also OK. With two comments on the overall strategy - as I've
mentioned elsewhere I find the -march=armv8-a+sve to be too restrictive for
our testing efforts. I'd prefer to just add SVE by a target pragma if we can.
Additionally, I'd be happy with the whole AArch64 testsuite being organised
in to folders (doing this might also make it easier for us to turn all SVE
tests off with older assemblers, by using the skipping the exp file if we
can't find support).

For now, I've OK'd the tests. They would still be OK in my mind with either
or both of my suggestions above.

Thanks,
James

^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2018-01-12 15:30 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-11-03 17:45 [0/4] [AArch64] Add SVE support Richard Sandiford
2017-11-03 17:48 ` [1/4] [AArch64] SVE backend support Richard Sandiford
2018-01-05 11:41   ` Richard Sandiford
2018-01-10 19:19     ` James Greenhalgh
2018-01-10 19:55       ` Richard Sandiford
2017-11-03 17:50 ` [2/4] [AArch64] Testsuite markup for SVE Richard Sandiford
2018-01-06 17:58   ` James Greenhalgh
2017-11-03 17:51 ` [3/4] [AArch64] SVE tests Richard Sandiford
2018-01-06 18:06   ` James Greenhalgh
2018-01-06 19:13     ` Richard Sandiford
     [not found]       ` <20180107165948.GA13800@arm.com>
2018-01-07 17:10         ` James Greenhalgh
2018-01-12 15:44           ` Richard Sandiford
2017-11-03 17:52 ` [4/4] SVE unwinding Richard Sandiford
2017-11-10 10:58   ` James Greenhalgh
2017-11-24 16:34 ` [0/4] [AArch64] Add SVE support Richard Sandiford
2018-01-06 18:09   ` James Greenhalgh
2018-01-06 19:39     ` Richard Sandiford
     [not found]       ` <20180107210818.GQ6993@arm.com>
2018-01-07 21:10         ` James Greenhalgh

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).