From: "Richard Earnshaw (lists)" <Richard.Earnshaw@arm.com>
To: Jakub Jelinek <jakub@redhat.com>,
Richard Biener <rguenther@suse.de>,
Uros Bizjak <ubizjak@gmail.com>,
David Edelsohn <dje.gcc@gmail.com>,
Segher Boessenkool <segher@kernel.crashing.org>,
Marcus Shawcroft <marcus.shawcroft@arm.com>,
Andreas Krebbel <Andreas.Krebbel@de.ibm.com>,
Matthew Fortune <matthew.fortune@imgtec.com>,
Eric Botcazou <ebotcazou@libertysurf.fr>,
Andrew Jenner <andrew@codesourcery.com>
Cc: gcc-patches@gcc.gnu.org
Subject: Re: [PATCH] Switch vec_init and vec_extract optabs to 2 mode optab to allow extraction of vector from vector or initialization of vector from smaller vectors (PR target/80846)
Date: Tue, 01 Aug 2017 08:09:00 -0000
Message-ID: <aecf98be-3631-c884-1801-b2849b0db8d4@arm.com>
In-Reply-To: <20170725091432.GQ2123@tucnak>
On 25/07/17 10:14, Jakub Jelinek wrote:
> Hi!
>
> The following patch adjusts the vec_init and vec_extract optabs so that
> the expander names carry not just the vector mode but also a second
> mode: for vec_extract the mode of the result, and for vec_init the mode
> of the elts of the vector passed as second operand.
>
> Without this patch, the second mode has been implicit, GET_MODE_INNER of
> the vector mode, so one could only extract a single element from a vector
> or construct a vector from elements. While that is most common, we allow
> in GIMPLE e.g. construction of V8DImode from 4 V2DImode elements etc.
> and the vectorizer uses them. Having the second mode in the name
> allows the generic code (vectorizer, expansion) to query whether the
> backend supports such vector from vector expansions or inits from vector
> elts and use them if available.
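To make the naming scheme concrete, here is a minimal sketch of how a two-mode pattern name is put together from the "vec_extract$a$b" / "vec_init$a$b" templates in optabs.def. The helper two_mode_pattern_name is hypothetical and is not part of GCC; it only concatenates the lowercase mode names in the order the templates use.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, not GCC code: build the pattern name that
   corresponds to "<base>$a$b", where $a is the lowercase vector mode
   and $b is the lowercase second mode (element or sub-vector mode).  */
void
two_mode_pattern_name (char *buf, size_t len, const char *base,
                       const char *vec_mode, const char *other_mode)
{
  int n = snprintf (buf, len, "%s%s%s", base, vec_mode, other_mode);
  /* The composed name must fit in BUF without truncation.  */
  assert (n >= 0 && (size_t) n < len);
}
```

With this, ("vec_extract", "v4si", "si") gives the scalar-element name vec_extractv4sisi, while ("vec_init", "v8di", "v4di") gives vec_initv8div4di, the kind of name a backend would provide for initialization from half-sized vectors.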
>
> For vec_extract, if we want to extract, say, the high V2SImode from
> V4SImode, the fallback is to try to expand it as a DImode extraction from
> V2DImode. This works well in many cases, but doesn't really work for very
> large vectors: if we want to extract the high V8SImode from V16SImode on
> x86, we'd need an OImode extraction from V2OImode, which is something the
> backend doesn't have any support for.
> For vec_init, the fallback is usually to go through memory, which is slow in
> many cases.
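For the vector from vector case, the index handed to the expander counts whole sub-vectors, not scalar elements. The following is a rough sketch of that arithmetic, modelled on the check in the expmed.c hunk of the quoted patch; subvector_index is a hypothetical stand-alone helper, and plain unsigned long values stand in for GCC's mode size queries.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper, not GCC code: return true if the bit range
   [bitnum, bitnum + bitsize) lies inside a single chunk of SUB_BITS
   bits, and store that chunk's index in *POS.  SUB_BITS plays the
   role of GET_MODE_BITSIZE (tmode) in the patch.  */
bool
subvector_index (unsigned long bitnum, unsigned long bitsize,
                 unsigned long sub_bits, unsigned long *pos)
{
  assert (sub_bits != 0 && bitsize != 0);
  /* The first and last requested bit must land in the same chunk.  */
  if ((bitnum + bitsize - 1) / sub_bits != bitnum / sub_bits)
    return false;
  *pos = bitnum / sub_bits;
  return true;
}
```

Extracting the high V8SImode (256 bits) from V16SImode corresponds to bitnum 256, bitsize 256 and sub_bits 256, which yields index 1; a 256-bit range starting at bit 128 straddles two halves and is rejected.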
>
> This patch only adds new vector from vector extract and init patterns to
> the i386 backend, but I had to change many other targets too, because
> the vec_extract/vec_init expander names now need to include the element
> mode. It seems most of the backends didn't really have a mode attribute
> usable for this, or had it only in uppercase, while for the names we need
> lowercase. Some backends had a convention on how to name lower case
> vs. upper case modes, others didn't have any. So I'm CCing maintainers
> of affected backends to seek advice on what mode attributes they want to
> use.
>
> Bootstrapped/regtested on x86_64-linux and i686-linux, where it improves
> e.g. the code generation for slp-43.c and slp-45.c testcases.
> make cc1 tested in cross-compilers to the remaining targets.
>
> Ok for trunk?
>
> 2017-07-25 Jakub Jelinek <jakub@redhat.com>
>
> PR target/80846
> * optabs.def (vec_extract_optab, vec_init_optab): Change from
> a direct optab to conversion optab.
> * optabs.c (expand_vector_broadcast): Use convert_optab_handler
> with GET_MODE_INNER as last argument instead of optab_handler.
> * expmed.c (extract_bit_field_1): Likewise. Use vector from
> vector extraction if possible and optab is available.
> * expr.c (store_constructor): Use convert_optab_handler instead
> of optab_handler. Use vector initialization from smaller
> vectors if possible and optab is available.
> * tree-vect-stmts.c (vectorizable_load): Likewise.
> * doc/md.texi (vec_extract, vec_init): Document that the optabs
> now have two modes.
> * config/i386/i386.c (ix86_expand_vector_init): Handle expansion
> of vec_init from half-sized vectors with the same element mode.
> * config/i386/sse.md (ssehalfvecmode): Add V4TI case.
> (ssehalfvecmodelower, ssescalarmodelower): New mode attributes.
> (reduc_plus_scal_v8df, reduc_plus_scal_v4df, reduc_plus_scal_v2df,
> reduc_plus_scal_v16sf, reduc_plus_scal_v8sf, reduc_plus_scal_v4sf,
> reduc_<code>_scal_<mode>, reduc_umin_scal_v8hi): Add element mode
> after mode in gen_vec_extract* calls.
> (vec_extract<mode>): Renamed to ...
> (vec_extract<mode><ssescalarmodelower>): ... this.
> (vec_extract<mode><ssehalfvecmodelower>): New expander.
> (rotl<mode>3, rotr<mode>3, <shift_insn><mode>3, ashrv2di3): Add
> element mode after mode in gen_vec_init* calls.
> (VEC_INIT_HALF_MODE): New mode iterator.
> (vec_init<mode>): Renamed to ...
> (vec_init<mode><ssescalarmodelower>): ... this.
> (vec_init<mode><ssehalfvecmodelower>): New expander.
> * config/i386/mmx.md (vec_extractv2sf): Renamed to ...
> (vec_extractv2sfsf): ... this.
> (vec_initv2sf): Renamed to ...
> (vec_initv2sfsf): ... this.
> (vec_extractv2si): Renamed to ...
> (vec_extractv2sisi): ... this.
> (vec_initv2si): Renamed to ...
> (vec_initv2sisi): ... this.
> (vec_extractv4hi): Renamed to ...
> (vec_extractv4hihi): ... this.
> (vec_initv4hi): Renamed to ...
> (vec_initv4hihi): ... this.
> (vec_extractv8qi): Renamed to ...
> (vec_extractv8qiqi): ... this.
> (vec_initv8qi): Renamed to ...
> (vec_initv8qiqi): ... this.
> * config/rs6000/vector.md (VEC_base_l): New mode attribute.
> (vec_init<mode>): Renamed to ...
> (vec_init<mode><VEC_base_l>): ... this.
> (vec_extract<mode>): Renamed to ...
> (vec_extract<mode><VEC_base_l>): ... this.
> * config/rs6000/paired.md (vec_initv2sf): Renamed to ...
> (vec_initv2sfsf): ... this.
> * config/rs6000/altivec.md (splitter, altivec_copysign_v4sf3,
> vec_unpacku_hi_v16qi, vec_unpacku_hi_v8hi, vec_unpacku_lo_v16qi,
> vec_unpacku_lo_v8hi, mulv16qi3, altivec_vreve<mode>2): Add
> element mode after mode in gen_vec_init* calls.
> * config/aarch64/aarch64-simd.md (vec_init<mode>): Renamed to ...
> (vec_init<mode><Vel>): ... this.
> (vec_extract<mode>): Renamed to ...
> (vec_extract<mode><Vel>): ... this.
> * config/aarch64/iterators.md (Vel): New mode attribute.
> * config/s390/s390.c (s390_expand_vec_strlen, s390_expand_vec_movstr):
> Add element mode after mode in gen_vec_extract* calls.
> * config/s390/vector.md (non_vec_l): New mode attribute.
> (vec_extract<mode>): Renamed to ...
> (vec_extract<mode><non_vec_l>): ... this.
> (vec_init<mode>): Renamed to ...
> (vec_init<mode><non_vec_l>): ... this.
> * config/s390/s390-builtins.def (s390_vlgvb, s390_vlgvh, s390_vlgvf,
> s390_vlgvf_flt, s390_vlgvg, s390_vlgvg_dbl): Add element mode after
> vec_extract mode.
> * config/arm/iterators.md (V_elem_l): New mode attribute.
> * config/arm/neon.md (vec_extract<mode>): Renamed to ...
> (vec_extract<mode><V_elem_l>): ... this.
> (vec_extractv2di): Renamed to ...
> (vec_extractv2didi): ... this.
> (vec_init<mode>): Renamed to ...
> (vec_init<mode><V_elem_l>): ... this.
> (reduc_plus_scal_<mode>, reduc_plus_scal_v2di, reduc_smin_scal_<mode>,
> reduc_smax_scal_<mode>, reduc_umin_scal_<mode>,
> reduc_umax_scal_<mode>, neon_vget_lane<mode>, neon_vget_laneu<mode>):
> Add element mode after gen_vec_extract* calls.
> * config/mips/mips-msa.md (vec_init<mode>): Renamed to ...
> (vec_init<mode><unitmode>): ... this.
> (vec_extract<mode>): Renamed to ...
> (vec_extract<mode><unitmode>): ... this.
> * config/mips/loongson.md (vec_init<mode>): Renamed to ...
> (vec_init<mode><unitmode>): ... this.
> * config/mips/mips-ps-3d.md (vec_initv2sf): Renamed to ...
> (vec_initv2sfsf): ... this.
> (vec_extractv2sf): Renamed to ...
> (vec_extractv2sfsf): ... this.
> (reduc_plus_scal_v2sf, reduc_smin_scal_v2sf, reduc_smax_scal_v2sf):
> Add element mode after gen_vec_extract* calls.
> * config/mips/mips.md (unitmode): New mode iterator.
> * config/spu/spu.c (spu_expand_prologue, spu_allocate_stack,
> spu_builtin_extract): Add element mode after gen_vec_extract* calls.
> * config/spu/spu.md (inner_l): New mode attribute.
> (vec_init<mode>): Renamed to ...
> (vec_init<mode><inner_l>): ... this.
> (vec_extract<mode>): Renamed to ...
> (vec_extract<mode><inner_l>): ... this.
> * config/sparc/sparc.md (veltmode): New mode iterator.
> (vec_init<VMALL:mode>): Renamed to ...
> (vec_init<VMALL:mode><VMALL:veltmode>): ... this.
> * config/ia64/vect.md (vec_initv2si): Renamed to ...
> (vec_initv2sisi): ... this.
> (vec_initv2sf): Renamed to ...
> (vec_initv2sfsf): ... this.
> (vec_extractv2sf): Renamed to ...
> (vec_extractv2sfsf): ... this.
> * config/powerpcspe/vector.md (VEC_base_l): New mode attribute.
> (vec_init<mode>): Renamed to ...
> (vec_init<mode><VEC_base_l>): ... this.
> (vec_extract<mode>): Renamed to ...
> (vec_extract<mode><VEC_base_l>): ... this.
> * config/powerpcspe/paired.md (vec_initv2sf): Renamed to ...
> (vec_initv2sfsf): ... this.
> * config/powerpcspe/altivec.md (splitter, altivec_copysign_v4sf3,
> vec_unpacku_hi_v16qi, vec_unpacku_hi_v8hi, vec_unpacku_lo_v16qi,
> vec_unpacku_lo_v8hi, mulv16qi3): Add element mode after mode in
> gen_vec_init* calls.
>
Arm & AArch64 bits are OK.
R.
> --- gcc/optabs.def.jj 2017-07-24 10:57:45.944815535 +0200
> +++ gcc/optabs.def 2017-07-24 16:11:23.066229910 +0200
> @@ -89,6 +89,8 @@ OPTAB_CD(vec_cmpu_optab, "vec_cmpu$a$b")
> OPTAB_CD(vec_cmpeq_optab, "vec_cmpeq$a$b")
> OPTAB_CD(maskload_optab, "maskload$a$b")
> OPTAB_CD(maskstore_optab, "maskstore$a$b")
> +OPTAB_CD(vec_extract_optab, "vec_extract$a$b")
> +OPTAB_CD(vec_init_optab, "vec_init$a$b")
>
> OPTAB_NL(add_optab, "add$P$a3", PLUS, "add", '3', gen_int_fp_fixed_libfunc)
> OPTAB_NX(add_optab, "add$F$a3")
> @@ -294,8 +296,6 @@ OPTAB_D (udot_prod_optab, "udot_prod$I$a
> OPTAB_D (usum_widen_optab, "widen_usum$I$a3")
> OPTAB_D (usad_optab, "usad$I$a")
> OPTAB_D (ssad_optab, "ssad$I$a")
> -OPTAB_D (vec_extract_optab, "vec_extract$a")
> -OPTAB_D (vec_init_optab, "vec_init$a")
> OPTAB_D (vec_pack_sfix_trunc_optab, "vec_pack_sfix_trunc_$a")
> OPTAB_D (vec_pack_ssat_optab, "vec_pack_ssat_$a")
> OPTAB_D (vec_pack_trunc_optab, "vec_pack_trunc_$a")
> --- gcc/optabs.c.jj 2017-07-24 10:57:46.216812275 +0200
> +++ gcc/optabs.c 2017-07-24 16:11:23.067229898 +0200
> @@ -386,7 +386,8 @@ expand_vector_broadcast (machine_mode vm
> /* ??? If the target doesn't have a vec_init, then we have no easy way
> of performing this operation. Most of this sort of generic support
> is hidden away in the vector lowering support in gimple. */
> - icode = optab_handler (vec_init_optab, vmode);
> + icode = convert_optab_handler (vec_init_optab, vmode,
> + GET_MODE_INNER (vmode));
> if (icode == CODE_FOR_nothing)
> return NULL;
>
> --- gcc/expmed.c.jj 2017-07-24 10:57:45.914815894 +0200
> +++ gcc/expmed.c 2017-07-24 16:11:23.071229850 +0200
> @@ -1566,6 +1566,55 @@ extract_bit_field_1 (rtx str_rtx, unsign
> return op0;
> }
>
> + /* First try to check for vector from vector extractions. */
> + if (VECTOR_MODE_P (GET_MODE (op0))
> + && !MEM_P (op0)
> + && VECTOR_MODE_P (tmode)
> + && GET_MODE_SIZE (GET_MODE (op0)) > GET_MODE_SIZE (tmode))
> + {
> + machine_mode new_mode = GET_MODE (op0);
> + if (GET_MODE_INNER (new_mode) != GET_MODE_INNER (tmode))
> + {
> + new_mode = mode_for_vector (GET_MODE_INNER (tmode),
> + GET_MODE_BITSIZE (GET_MODE (op0))
> + / GET_MODE_UNIT_BITSIZE (tmode));
> + if (!VECTOR_MODE_P (new_mode)
> + || GET_MODE_SIZE (new_mode) != GET_MODE_SIZE (GET_MODE (op0))
> + || GET_MODE_INNER (new_mode) != GET_MODE_INNER (tmode)
> + || !targetm.vector_mode_supported_p (new_mode))
> + new_mode = VOIDmode;
> + }
> + if (new_mode != VOIDmode
> + && (convert_optab_handler (vec_extract_optab, new_mode, tmode)
> + != CODE_FOR_nothing)
> + && ((bitnum + bitsize - 1) / GET_MODE_BITSIZE (tmode)
> + == bitnum / GET_MODE_BITSIZE (tmode)))
> + {
> + struct expand_operand ops[3];
> + machine_mode outermode = new_mode;
> + machine_mode innermode = tmode;
> + enum insn_code icode
> + = convert_optab_handler (vec_extract_optab, outermode, innermode);
> + unsigned HOST_WIDE_INT pos = bitnum / GET_MODE_BITSIZE (innermode);
> +
> + if (new_mode != GET_MODE (op0))
> + op0 = gen_lowpart (new_mode, op0);
> + create_output_operand (&ops[0], target, innermode);
> + ops[0].target = 1;
> + create_input_operand (&ops[1], op0, outermode);
> + create_integer_operand (&ops[2], pos);
> + if (maybe_expand_insn (icode, 3, ops))
> + {
> + if (alt_rtl && ops[0].target)
> + *alt_rtl = target;
> + target = ops[0].value;
> + if (GET_MODE (target) != mode)
> + return gen_lowpart (tmode, target);
> + return target;
> + }
> + }
> + }
> +
> /* See if we can get a better vector mode before extracting. */
> if (VECTOR_MODE_P (GET_MODE (op0))
> && !MEM_P (op0)
> @@ -1599,14 +1648,17 @@ extract_bit_field_1 (rtx str_rtx, unsign
> available. */
> if (VECTOR_MODE_P (GET_MODE (op0))
> && !MEM_P (op0)
> - && optab_handler (vec_extract_optab, GET_MODE (op0)) != CODE_FOR_nothing
> + && (convert_optab_handler (vec_extract_optab, GET_MODE (op0),
> + GET_MODE_INNER (GET_MODE (op0)))
> + != CODE_FOR_nothing)
> && ((bitnum + bitsize - 1) / GET_MODE_UNIT_BITSIZE (GET_MODE (op0))
> == bitnum / GET_MODE_UNIT_BITSIZE (GET_MODE (op0))))
> {
> struct expand_operand ops[3];
> machine_mode outermode = GET_MODE (op0);
> machine_mode innermode = GET_MODE_INNER (outermode);
> - enum insn_code icode = optab_handler (vec_extract_optab, outermode);
> + enum insn_code icode
> + = convert_optab_handler (vec_extract_optab, outermode, innermode);
> unsigned HOST_WIDE_INT pos = bitnum / GET_MODE_BITSIZE (innermode);
>
> create_output_operand (&ops[0], target, innermode);
> --- gcc/expr.c.jj 2017-07-24 10:57:45.963815307 +0200
> +++ gcc/expr.c 2017-07-24 16:11:23.073229826 +0200
> @@ -6589,6 +6589,7 @@ store_constructor (tree exp, rtx target,
> rtvec vector = NULL;
> unsigned n_elts;
> alias_set_type alias;
> + bool vec_vec_init_p = false;
>
> gcc_assert (eltmode != BLKmode);
>
> @@ -6596,27 +6597,30 @@ store_constructor (tree exp, rtx target,
> if (REG_P (target) && VECTOR_MODE_P (GET_MODE (target)))
> {
> machine_mode mode = GET_MODE (target);
> + machine_mode emode = eltmode;
>
> - icode = (int) optab_handler (vec_init_optab, mode);
> - /* Don't use vec_init<mode> if some elements have VECTOR_TYPE. */
> - if (icode != CODE_FOR_nothing)
> + if (CONSTRUCTOR_NELTS (exp)
> + && (TREE_CODE (TREE_TYPE (CONSTRUCTOR_ELT (exp, 0)->value))
> + == VECTOR_TYPE))
> {
> - tree value;
> -
> - FOR_EACH_CONSTRUCTOR_VALUE (CONSTRUCTOR_ELTS (exp), idx, value)
> - if (TREE_CODE (TREE_TYPE (value)) == VECTOR_TYPE)
> - {
> - icode = CODE_FOR_nothing;
> - break;
> - }
> + tree etype = TREE_TYPE (CONSTRUCTOR_ELT (exp, 0)->value);
> + gcc_assert (CONSTRUCTOR_NELTS (exp) * TYPE_VECTOR_SUBPARTS (etype)
> + == n_elts);
> + emode = TYPE_MODE (etype);
> }
> + icode = (int) convert_optab_handler (vec_init_optab, mode, emode);
> if (icode != CODE_FOR_nothing)
> {
> - unsigned int i;
> + unsigned int i, n = n_elts;
>
> - vector = rtvec_alloc (n_elts);
> - for (i = 0; i < n_elts; i++)
> - RTVEC_ELT (vector, i) = CONST0_RTX (GET_MODE_INNER (mode));
> + if (emode != eltmode)
> + {
> + n = CONSTRUCTOR_NELTS (exp);
> + vec_vec_init_p = true;
> + }
> + vector = rtvec_alloc (n);
> + for (i = 0; i < n; i++)
> + RTVEC_ELT (vector, i) = CONST0_RTX (emode);
> }
> }
>
> @@ -6634,10 +6638,10 @@ store_constructor (tree exp, rtx target,
>
> FOR_EACH_CONSTRUCTOR_VALUE (CONSTRUCTOR_ELTS (exp), idx, value)
> {
> - int n_elts_here = tree_to_uhwi
> - (int_const_binop (TRUNC_DIV_EXPR,
> - TYPE_SIZE (TREE_TYPE (value)),
> - TYPE_SIZE (elttype)));
> + tree sz = TYPE_SIZE (TREE_TYPE (value));
> + int n_elts_here
> + = tree_to_uhwi (int_const_binop (TRUNC_DIV_EXPR, sz,
> + TYPE_SIZE (elttype)));
>
> count += n_elts_here;
> if (mostly_zeros_p (value))
> @@ -6687,18 +6691,21 @@ store_constructor (tree exp, rtx target,
>
> if (vector)
> {
> - /* vec_init<mode> should not be used if there are VECTOR_TYPE
> - elements. */
> - gcc_assert (TREE_CODE (TREE_TYPE (value)) != VECTOR_TYPE);
> - RTVEC_ELT (vector, eltpos)
> - = expand_normal (value);
> + if (vec_vec_init_p)
> + {
> + gcc_assert (ce->index == NULL_TREE);
> + gcc_assert (TREE_CODE (TREE_TYPE (value)) == VECTOR_TYPE);
> + eltpos = idx;
> + }
> + else
> + gcc_assert (TREE_CODE (TREE_TYPE (value)) != VECTOR_TYPE);
> + RTVEC_ELT (vector, eltpos) = expand_normal (value);
> }
> else
> {
> - machine_mode value_mode =
> - TREE_CODE (TREE_TYPE (value)) == VECTOR_TYPE
> - ? TYPE_MODE (TREE_TYPE (value))
> - : eltmode;
> + machine_mode value_mode
> + = (TREE_CODE (TREE_TYPE (value)) == VECTOR_TYPE
> + ? TYPE_MODE (TREE_TYPE (value)) : eltmode);
> bitpos = eltpos * elt_size;
> store_constructor_field (target, bitsize, bitpos, 0,
> bitregion_end, value_mode,
> @@ -6707,9 +6714,9 @@ store_constructor (tree exp, rtx target,
> }
>
> if (vector)
> - emit_insn (GEN_FCN (icode)
> - (target,
> - gen_rtx_PARALLEL (GET_MODE (target), vector)));
> + emit_insn (GEN_FCN (icode) (target,
> + gen_rtx_PARALLEL (GET_MODE (target),
> + vector)));
> break;
> }
>
> --- gcc/tree-vect-stmts.c.jj 2017-07-24 10:57:46.004814816 +0200
> +++ gcc/tree-vect-stmts.c 2017-07-24 16:11:23.049230114 +0200
> @@ -6996,29 +6996,43 @@ vectorizable_load (gimple *stmt, gimple_
> {
> if (group_size < nunits)
> {
> - /* Avoid emitting a constructor of vector elements by performing
> - the loads using an integer type of the same size,
> - constructing a vector of those and then re-interpreting it
> - as the original vector type. This works around the fact
> - that the vec_init optab was only designed for scalar
> - element modes and thus expansion goes through memory.
> - This avoids a huge runtime penalty due to the general
> - inability to perform store forwarding from smaller stores
> - to a larger load. */
> - unsigned lsize
> - = group_size * TYPE_PRECISION (TREE_TYPE (vectype));
> - machine_mode elmode = mode_for_size (lsize, MODE_INT, 0);
> - machine_mode vmode = mode_for_vector (elmode,
> - nunits / group_size);
> - /* If we can't construct such a vector fall back to
> - element loads of the original vector type. */
> + /* First check if vec_init optab supports construction from
> + vector elts directly. */
> + machine_mode elmode = TYPE_MODE (TREE_TYPE (vectype));
> + machine_mode vmode = mode_for_vector (elmode, group_size);
> if (VECTOR_MODE_P (vmode)
> - && optab_handler (vec_init_optab, vmode) != CODE_FOR_nothing)
> + && (convert_optab_handler (vec_init_optab,
> + TYPE_MODE (vectype), vmode)
> + != CODE_FOR_nothing))
> {
> nloads = nunits / group_size;
> lnel = group_size;
> - ltype = build_nonstandard_integer_type (lsize, 1);
> - lvectype = build_vector_type (ltype, nloads);
> + ltype = build_vector_type (TREE_TYPE (vectype), group_size);
> + }
> + else
> + {
> + /* Otherwise avoid emitting a constructor of vector elements
> + by performing the loads using an integer type of the same
> + size, constructing a vector of those and then
> + re-interpreting it as the original vector type.
> + This avoids a huge runtime penalty due to the general
> + inability to perform store forwarding from smaller stores
> + to a larger load. */
> + unsigned lsize
> + = group_size * TYPE_PRECISION (TREE_TYPE (vectype));
> + elmode = mode_for_size (lsize, MODE_INT, 0);
> + vmode = mode_for_vector (elmode, nunits / group_size);
> + /* If we can't construct such a vector fall back to
> + element loads of the original vector type. */
> + if (VECTOR_MODE_P (vmode)
> + && (convert_optab_handler (vec_init_optab, vmode, elmode)
> + != CODE_FOR_nothing))
> + {
> + nloads = nunits / group_size;
> + lnel = group_size;
> + ltype = build_nonstandard_integer_type (lsize, 1);
> + lvectype = build_vector_type (ltype, nloads);
> + }
> }
> }
> else
> --- gcc/doc/md.texi.jj 2017-07-24 10:57:45.989814996 +0200
> +++ gcc/doc/md.texi 2017-07-24 17:09:55.536882382 +0200
> @@ -4871,15 +4871,22 @@ This pattern is not allowed to @code{FAI
> Set given field in the vector value. Operand 0 is the vector to modify,
> operand 1 is new value of field and operand 2 specify the field index.
>
> -@cindex @code{vec_extract@var{m}} instruction pattern
> -@item @samp{vec_extract@var{m}}
> +@cindex @code{vec_extract@var{m}@var{n}} instruction pattern
> +@item @samp{vec_extract@var{m}@var{n}}
> Extract given field from the vector value. Operand 1 is the vector, operand 2
> -specify field index and operand 0 place to store value into.
> +specify field index and operand 0 place to store value into. The
> +@var{n} mode is the mode of the field or vector of fields that should be
> +extracted, should be either element mode of the vector mode @var{m}, or
> +a vector mode with the same element mode and smaller number of elements.
> +If @var{n} is a vector mode, the index is counted in units of that mode.
>
> -@cindex @code{vec_init@var{m}} instruction pattern
> -@item @samp{vec_init@var{m}}
> +@cindex @code{vec_init@var{m}@var{n}} instruction pattern
> +@item @samp{vec_init@var{m}@var{n}}
> Initialize the vector to given values. Operand 0 is the vector to initialize
> -and operand 1 is parallel containing values for individual fields.
> +and operand 1 is parallel containing values for individual fields. The
> +@var{n} mode is the mode of the elements, should be either element mode of
> +the vector mode @var{m}, or a vector mode with the same element mode and
> +smaller number of elements.
>
> @cindex @code{vec_cmp@var{m}@var{n}} instruction pattern
> @item @samp{vec_cmp@var{m}@var{n}}
> --- gcc/config/i386/i386.c.jj 2017-07-24 10:58:11.831505333 +0200
> +++ gcc/config/i386/i386.c 2017-07-24 16:11:23.060229982 +0200
> @@ -44297,6 +44297,34 @@ ix86_expand_vector_init (bool mmx_ok, rt
> int i;
> rtx x;
>
> + /* Handle first initialization from vector elts. */
> + if (n_elts != XVECLEN (vals, 0))
> + {
> + rtx subtarget = target;
> + x = XVECEXP (vals, 0, 0);
> + gcc_assert (GET_MODE_INNER (GET_MODE (x)) == inner_mode);
> + if (GET_MODE_NUNITS (GET_MODE (x)) * 2 == n_elts)
> + {
> + rtx ops[2] = { XVECEXP (vals, 0, 0), XVECEXP (vals, 0, 1) };
> + if (inner_mode == QImode || inner_mode == HImode)
> + {
> + mode = mode_for_vector (SImode,
> + n_elts * GET_MODE_SIZE (inner_mode) / 4);
> + inner_mode
> + = mode_for_vector (SImode,
> + n_elts * GET_MODE_SIZE (inner_mode) / 8);
> + ops[0] = gen_lowpart (inner_mode, ops[0]);
> + ops[1] = gen_lowpart (inner_mode, ops[1]);
> + subtarget = gen_reg_rtx (mode);
> + }
> + ix86_expand_vector_init_concat (mode, subtarget, ops, 2);
> + if (subtarget != target)
> + emit_move_insn (target, gen_lowpart (GET_MODE (target), subtarget));
> + return;
> + }
> + gcc_unreachable ();
> + }
> +
> for (i = 0; i < n_elts; ++i)
> {
> x = XVECEXP (vals, 0, i);
> --- gcc/config/i386/sse.md.jj 2017-07-24 10:57:45.807817176 +0200
> +++ gcc/config/i386/sse.md 2017-07-24 16:54:35.658088768 +0200
> @@ -658,13 +658,21 @@ (define_mode_attr ssedoublevecmode
>
> ;; Mapping of vector modes to a vector mode of half size
> (define_mode_attr ssehalfvecmode
> - [(V64QI "V32QI") (V32HI "V16HI") (V16SI "V8SI") (V8DI "V4DI")
> + [(V64QI "V32QI") (V32HI "V16HI") (V16SI "V8SI") (V8DI "V4DI") (V4TI "V2TI")
> (V32QI "V16QI") (V16HI "V8HI") (V8SI "V4SI") (V4DI "V2DI")
> (V16QI "V8QI") (V8HI "V4HI") (V4SI "V2SI")
> (V16SF "V8SF") (V8DF "V4DF")
> (V8SF "V4SF") (V4DF "V2DF")
> (V4SF "V2SF")])
>
> +(define_mode_attr ssehalfvecmodelower
> + [(V64QI "v32qi") (V32HI "v16hi") (V16SI "v8si") (V8DI "v4di") (V4TI "v2ti")
> + (V32QI "v16qi") (V16HI "v8hi") (V8SI "v4si") (V4DI "v2di")
> + (V16QI "v8qi") (V8HI "v4hi") (V4SI "v2si")
> + (V16SF "v8sf") (V8DF "v4df")
> + (V8SF "v4sf") (V4DF "v2df")
> + (V4SF "v2sf")])
> +
> ;; Mapping of vector modes ti packed single mode of the same size
> (define_mode_attr ssePSmode
> [(V16SI "V16SF") (V8DF "V16SF")
> @@ -690,6 +698,16 @@ (define_mode_attr ssescalarmode
> (V8DF "DF") (V4DF "DF") (V2DF "DF")
> (V4TI "TI") (V2TI "TI")])
>
> +;; Mapping of vector modes back to the scalar modes
> +(define_mode_attr ssescalarmodelower
> + [(V64QI "qi") (V32QI "qi") (V16QI "qi")
> + (V32HI "hi") (V16HI "hi") (V8HI "hi")
> + (V16SI "si") (V8SI "si") (V4SI "si")
> + (V8DI "di") (V4DI "di") (V2DI "di")
> + (V16SF "sf") (V8SF "sf") (V4SF "sf")
> + (V8DF "df") (V4DF "df") (V2DF "df")
> + (V4TI "ti") (V2TI "ti")])
> +
> ;; Mapping of vector modes to the 128bit modes
> (define_mode_attr ssexmmmode
> [(V64QI "V16QI") (V32QI "V16QI") (V16QI "V16QI")
> @@ -2356,7 +2374,7 @@ (define_expand "reduc_plus_scal_v8df"
> {
> rtx tmp = gen_reg_rtx (V8DFmode);
> ix86_expand_reduc (gen_addv8df3, tmp, operands[1]);
> - emit_insn (gen_vec_extractv8df (operands[0], tmp, const0_rtx));
> + emit_insn (gen_vec_extractv8dfdf (operands[0], tmp, const0_rtx));
> DONE;
> })
>
> @@ -2371,7 +2389,7 @@ (define_expand "reduc_plus_scal_v4df"
> emit_insn (gen_avx_haddv4df3 (tmp, operands[1], operands[1]));
> emit_insn (gen_avx_vperm2f128v4df3 (tmp2, tmp, tmp, GEN_INT (1)));
> emit_insn (gen_addv4df3 (vec_res, tmp, tmp2));
> - emit_insn (gen_vec_extractv4df (operands[0], vec_res, const0_rtx));
> + emit_insn (gen_vec_extractv4dfdf (operands[0], vec_res, const0_rtx));
> DONE;
> })
>
> @@ -2382,7 +2400,7 @@ (define_expand "reduc_plus_scal_v2df"
> {
> rtx tmp = gen_reg_rtx (V2DFmode);
> emit_insn (gen_sse3_haddv2df3 (tmp, operands[1], operands[1]));
> - emit_insn (gen_vec_extractv2df (operands[0], tmp, const0_rtx));
> + emit_insn (gen_vec_extractv2dfdf (operands[0], tmp, const0_rtx));
> DONE;
> })
>
> @@ -2393,7 +2411,7 @@ (define_expand "reduc_plus_scal_v16sf"
> {
> rtx tmp = gen_reg_rtx (V16SFmode);
> ix86_expand_reduc (gen_addv16sf3, tmp, operands[1]);
> - emit_insn (gen_vec_extractv16sf (operands[0], tmp, const0_rtx));
> + emit_insn (gen_vec_extractv16sfsf (operands[0], tmp, const0_rtx));
> DONE;
> })
>
> @@ -2409,7 +2427,7 @@ (define_expand "reduc_plus_scal_v8sf"
> emit_insn (gen_avx_haddv8sf3 (tmp2, tmp, tmp));
> emit_insn (gen_avx_vperm2f128v8sf3 (tmp, tmp2, tmp2, GEN_INT (1)));
> emit_insn (gen_addv8sf3 (vec_res, tmp, tmp2));
> - emit_insn (gen_vec_extractv8sf (operands[0], vec_res, const0_rtx));
> + emit_insn (gen_vec_extractv8sfsf (operands[0], vec_res, const0_rtx));
> DONE;
> })
>
> @@ -2427,7 +2445,7 @@ (define_expand "reduc_plus_scal_v4sf"
> }
> else
> ix86_expand_reduc (gen_addv4sf3, vec_res, operands[1]);
> - emit_insn (gen_vec_extractv4sf (operands[0], vec_res, const0_rtx));
> + emit_insn (gen_vec_extractv4sfsf (operands[0], vec_res, const0_rtx));
> DONE;
> })
>
> @@ -2449,7 +2467,8 @@ (define_expand "reduc_<code>_scal_<mode>
> {
> rtx tmp = gen_reg_rtx (<MODE>mode);
> ix86_expand_reduc (gen_<code><mode>3, tmp, operands[1]);
> - emit_insn (gen_vec_extract<mode> (operands[0], tmp, const0_rtx));
> + emit_insn (gen_vec_extract<mode><ssescalarmodelower> (operands[0], tmp,
> + const0_rtx));
> DONE;
> })
>
> @@ -2461,7 +2480,8 @@ (define_expand "reduc_<code>_scal_<mode>
> {
> rtx tmp = gen_reg_rtx (<MODE>mode);
> ix86_expand_reduc (gen_<code><mode>3, tmp, operands[1]);
> - emit_insn (gen_vec_extract<mode> (operands[0], tmp, const0_rtx));
> + emit_insn (gen_vec_extract<mode><ssescalarmodelower> (operands[0], tmp,
> + const0_rtx));
> DONE;
> })
>
> @@ -2473,7 +2493,8 @@ (define_expand "reduc_<code>_scal_<mode>
> {
> rtx tmp = gen_reg_rtx (<MODE>mode);
> ix86_expand_reduc (gen_<code><mode>3, tmp, operands[1]);
> - emit_insn (gen_vec_extract<mode> (operands[0], tmp, const0_rtx));
> + emit_insn (gen_vec_extract<mode><ssescalarmodelower> (operands[0], tmp,
> + const0_rtx));
> DONE;
> })
>
> @@ -2485,7 +2506,7 @@ (define_expand "reduc_umin_scal_v8hi"
> {
> rtx tmp = gen_reg_rtx (V8HImode);
> ix86_expand_reduc (gen_uminv8hi3, tmp, operands[1]);
> - emit_insn (gen_vec_extractv8hi (operands[0], tmp, const0_rtx));
> + emit_insn (gen_vec_extractv8hihi (operands[0], tmp, const0_rtx));
> DONE;
> })
>
> @@ -7881,7 +7902,7 @@ (define_mode_iterator VEC_EXTRACT_MODE
> (V8DF "TARGET_AVX512F") (V4DF "TARGET_AVX") V2DF
> (V4TI "TARGET_AVX512F") (V2TI "TARGET_AVX")])
>
> -(define_expand "vec_extract<mode>"
> +(define_expand "vec_extract<mode><ssescalarmodelower>"
> [(match_operand:<ssescalarmode> 0 "register_operand")
> (match_operand:VEC_EXTRACT_MODE 1 "register_operand")
> (match_operand 2 "const_int_operand")]
> @@ -7892,6 +7913,19 @@ (define_expand "vec_extract<mode>"
> DONE;
> })
>
> +(define_expand "vec_extract<mode><ssehalfvecmodelower>"
> + [(match_operand:<ssehalfvecmode> 0 "nonimmediate_operand")
> + (match_operand:V_512 1 "register_operand")
> + (match_operand 2 "const_0_to_1_operand")]
> + "TARGET_AVX512F"
> +{
> + if (INTVAL (operands[2]))
> + emit_insn (gen_vec_extract_hi_<mode> (operands[0], operands[1]));
> + else
> + emit_insn (gen_vec_extract_lo_<mode> (operands[0], operands[1]));
> + DONE;
> +})
> +
> ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
> ;;
> ;; Parallel double-precision floating point element swizzling
> @@ -16693,7 +16727,7 @@ (define_expand "rotl<mode>3"
> for (i = 0; i < <ssescalarnum>; i++)
> RTVEC_ELT (vs, i) = op2;
>
> - emit_insn (gen_vec_init<mode> (reg, par));
> + emit_insn (gen_vec_init<mode><ssescalarmodelower> (reg, par));
> emit_insn (gen_xop_vrotl<mode>3 (operands[0], operands[1], reg));
> DONE;
> }
> @@ -16725,7 +16759,7 @@ (define_expand "rotr<mode>3"
> for (i = 0; i < <ssescalarnum>; i++)
> RTVEC_ELT (vs, i) = op2;
>
> - emit_insn (gen_vec_init<mode> (reg, par));
> + emit_insn (gen_vec_init<mode><ssescalarmodelower> (reg, par));
> emit_insn (gen_neg<mode>2 (neg, reg));
> emit_insn (gen_xop_vrotl<mode>3 (operands[0], operands[1], neg));
> DONE;
> @@ -17019,7 +17053,7 @@ (define_expand "<shift_insn><mode>3"
> XVECEXP (par, 0, i) = operands[2];
>
> tmp = gen_reg_rtx (V16QImode);
> - emit_insn (gen_vec_initv16qi (tmp, par));
> + emit_insn (gen_vec_initv16qiqi (tmp, par));
>
> if (negate)
> emit_insn (gen_negv16qi2 (tmp, tmp));
> @@ -17055,7 +17089,7 @@ (define_expand "ashrv2di3"
> for (i = 0; i < 2; i++)
> XVECEXP (par, 0, i) = operands[2];
>
> - emit_insn (gen_vec_initv2di (reg, par));
> + emit_insn (gen_vec_initv2didi (reg, par));
>
> if (negate)
> emit_insn (gen_negv2di2 (reg, reg));
> @@ -18775,7 +18809,7 @@ (define_insn_and_split "avx_<castmode><a
> <ssehalfvecmode>mode);
> })
>
> -;; Modes handled by vec_init patterns.
> +;; Modes handled by vec_init expanders.
> (define_mode_iterator VEC_INIT_MODE
> [(V64QI "TARGET_AVX512F") (V32QI "TARGET_AVX") V16QI
> (V32HI "TARGET_AVX512F") (V16HI "TARGET_AVX") V8HI
> @@ -18785,11 +18819,31 @@ (define_mode_iterator VEC_INIT_MODE
> (V8DF "TARGET_AVX512F") (V4DF "TARGET_AVX") (V2DF "TARGET_SSE2")
> (V4TI "TARGET_AVX512F") (V2TI "TARGET_AVX")])
>
> -(define_expand "vec_init<mode>"
> +;; Likewise, but for initialization from half sized vectors.
> +;; Thus, these are all VEC_INIT_MODE modes except V2??.
> +(define_mode_iterator VEC_INIT_HALF_MODE
> + [(V64QI "TARGET_AVX512F") (V32QI "TARGET_AVX") V16QI
> + (V32HI "TARGET_AVX512F") (V16HI "TARGET_AVX") V8HI
> + (V16SI "TARGET_AVX512F") (V8SI "TARGET_AVX") V4SI
> + (V8DI "TARGET_AVX512F") (V4DI "TARGET_AVX")
> + (V16SF "TARGET_AVX512F") (V8SF "TARGET_AVX") V4SF
> + (V8DF "TARGET_AVX512F") (V4DF "TARGET_AVX")
> + (V4TI "TARGET_AVX512F")])
> +
> +(define_expand "vec_init<mode><ssescalarmodelower>"
> [(match_operand:VEC_INIT_MODE 0 "register_operand")
> (match_operand 1)]
> "TARGET_SSE"
> {
> + ix86_expand_vector_init (false, operands[0], operands[1]);
> + DONE;
> +})
> +
> +(define_expand "vec_init<mode><ssehalfvecmodelower>"
> + [(match_operand:VEC_INIT_HALF_MODE 0 "register_operand")
> + (match_operand 1)]
> + "TARGET_SSE"
> +{
> ix86_expand_vector_init (false, operands[0], operands[1]);
> DONE;
> })
> --- gcc/config/i386/mmx.md.jj 2017-07-24 10:57:45.869816434 +0200
> +++ gcc/config/i386/mmx.md 2017-07-24 16:11:23.065229922 +0200
> @@ -641,7 +641,7 @@ (define_split
> [(set (match_dup 0) (match_dup 1))]
> "operands[1] = adjust_address (operands[1], SFmode, 4);")
>
> -(define_expand "vec_extractv2sf"
> +(define_expand "vec_extractv2sfsf"
> [(match_operand:SF 0 "register_operand")
> (match_operand:V2SF 1 "register_operand")
> (match_operand 2 "const_int_operand")]
> @@ -652,7 +652,7 @@ (define_expand "vec_extractv2sf"
> DONE;
> })
>
> -(define_expand "vec_initv2sf"
> +(define_expand "vec_initv2sfsf"
> [(match_operand:V2SF 0 "register_operand")
> (match_operand 1)]
> "TARGET_SSE"
> @@ -1344,7 +1344,7 @@ (define_insn_and_split "*vec_extractv2si
> operands[1] = adjust_address (operands[1], SImode, INTVAL (operands[2]) * 4);
> })
>
> -(define_expand "vec_extractv2si"
> +(define_expand "vec_extractv2sisi"
> [(match_operand:SI 0 "register_operand")
> (match_operand:V2SI 1 "register_operand")
> (match_operand 2 "const_int_operand")]
> @@ -1355,7 +1355,7 @@ (define_expand "vec_extractv2si"
> DONE;
> })
>
> -(define_expand "vec_initv2si"
> +(define_expand "vec_initv2sisi"
> [(match_operand:V2SI 0 "register_operand")
> (match_operand 1)]
> "TARGET_SSE"
> @@ -1375,7 +1375,7 @@ (define_expand "vec_setv4hi"
> DONE;
> })
>
> -(define_expand "vec_extractv4hi"
> +(define_expand "vec_extractv4hihi"
> [(match_operand:HI 0 "register_operand")
> (match_operand:V4HI 1 "register_operand")
> (match_operand 2 "const_int_operand")]
> @@ -1386,7 +1386,7 @@ (define_expand "vec_extractv4hi"
> DONE;
> })
>
> -(define_expand "vec_initv4hi"
> +(define_expand "vec_initv4hihi"
> [(match_operand:V4HI 0 "register_operand")
> (match_operand 1)]
> "TARGET_SSE"
> @@ -1406,7 +1406,7 @@ (define_expand "vec_setv8qi"
> DONE;
> })
>
> -(define_expand "vec_extractv8qi"
> +(define_expand "vec_extractv8qiqi"
> [(match_operand:QI 0 "register_operand")
> (match_operand:V8QI 1 "register_operand")
> (match_operand 2 "const_int_operand")]
> @@ -1417,7 +1417,7 @@ (define_expand "vec_extractv8qi"
> DONE;
> })
>
> -(define_expand "vec_initv8qi"
> +(define_expand "vec_initv8qiqi"
> [(match_operand:V8QI 0 "register_operand")
> (match_operand 1)]
> "TARGET_SSE"
> --- gcc/config/rs6000/vector.md.jj 2017-06-08 20:50:49.000000000 +0200
> +++ gcc/config/rs6000/vector.md 2017-07-24 17:44:44.699580927 +0200
> @@ -74,6 +74,16 @@ (define_mode_attr VEC_base [(V16QI "QI")
> (V1TI "TI")
> (TI "TI")])
>
> +;; As above, but in lower case
> +(define_mode_attr VEC_base_l [(V16QI "qi")
> + (V8HI "hi")
> + (V4SI "si")
> + (V2DI "di")
> + (V4SF "sf")
> + (V2DF "df")
> + (V1TI "ti")
> + (TI "ti")])
> +
> ;; Same size integer type for floating point data
> (define_mode_attr VEC_int [(V4SF "v4si")
> (V2DF "v2di")])
> @@ -1016,7 +1026,7 @@ (define_expand "fixuns_trunc<mode><VEC_i
>
> \f
> ;; Vector initialization, set, extract
> -(define_expand "vec_init<mode>"
> +(define_expand "vec_init<mode><VEC_base_l>"
> [(match_operand:VEC_E 0 "vlogical_operand" "")
> (match_operand:VEC_E 1 "" "")]
> "VECTOR_MEM_ALTIVEC_OR_VSX_P (<MODE>mode)"
> @@ -1035,7 +1045,7 @@ (define_expand "vec_set<mode>"
> DONE;
> })
>
> -(define_expand "vec_extract<mode>"
> +(define_expand "vec_extract<mode><VEC_base_l>"
> [(match_operand:<VEC_base> 0 "register_operand" "")
> (match_operand:VEC_E 1 "vlogical_operand" "")
> (match_operand 2 "const_int_operand" "")]
> --- gcc/config/rs6000/paired.md.jj 2017-06-08 20:50:49.000000000 +0200
> +++ gcc/config/rs6000/paired.md 2017-07-24 17:48:20.324985029 +0200
> @@ -377,7 +377,7 @@ (define_insn "paired_muls1"
> "ps_muls1 %0, %1, %2"
> [(set_attr "type" "fp")])
>
> -(define_expand "vec_initv2sf"
> +(define_expand "vec_initv2sfsf"
> [(match_operand:V2SF 0 "gpc_reg_operand" "=f")
> (match_operand 1 "" "")]
> "TARGET_PAIRED_FLOAT"
> --- gcc/config/rs6000/altivec.md.jj 2017-07-24 10:58:12.000000000 +0200
> +++ gcc/config/rs6000/altivec.md 2017-07-24 17:48:49.573633038 +0200
> @@ -311,7 +311,7 @@ (define_split
> for (i = 0; i < num_elements; i++)
> RTVEC_ELT (v, i) = constm1_rtx;
>
> - emit_insn (gen_vec_initv4si (dest, gen_rtx_PARALLEL (mode, v)));
> + emit_insn (gen_vec_initv4sisi (dest, gen_rtx_PARALLEL (mode, v)));
> emit_insn (gen_rtx_SET (dest, gen_rtx_ASHIFT (mode, dest, dest)));
> DONE;
> })
> @@ -2267,7 +2267,7 @@ (define_expand "altivec_copysign_v4sf3"
> RTVEC_ELT (v, 2) = GEN_INT (mask_val);
> RTVEC_ELT (v, 3) = GEN_INT (mask_val);
>
> - emit_insn (gen_vec_initv4si (mask, gen_rtx_PARALLEL (V4SImode, v)));
> + emit_insn (gen_vec_initv4sisi (mask, gen_rtx_PARALLEL (V4SImode, v)));
> emit_insn (gen_vector_select_v4sf (operands[0], operands[1], operands[2],
> gen_lowpart (V4SFmode, mask)));
> DONE;
> @@ -3409,7 +3409,7 @@ (define_expand "vec_unpacku_hi_v16qi"
> RTVEC_ELT (v, 14) = gen_rtx_CONST_INT (QImode, be ? 16 : 0);
> RTVEC_ELT (v, 15) = gen_rtx_CONST_INT (QImode, be ? 7 : 16);
>
> - emit_insn (gen_vec_initv16qi (mask, gen_rtx_PARALLEL (V16QImode, v)));
> + emit_insn (gen_vec_initv16qiqi (mask, gen_rtx_PARALLEL (V16QImode, v)));
> emit_insn (gen_vperm_v16qiv8hi (operands[0], operands[1], vzero, mask));
> DONE;
> }")
> @@ -3445,7 +3445,7 @@ (define_expand "vec_unpacku_hi_v8hi"
> RTVEC_ELT (v, 14) = gen_rtx_CONST_INT (QImode, be ? 6 : 17);
> RTVEC_ELT (v, 15) = gen_rtx_CONST_INT (QImode, be ? 7 : 16);
>
> - emit_insn (gen_vec_initv16qi (mask, gen_rtx_PARALLEL (V16QImode, v)));
> + emit_insn (gen_vec_initv16qiqi (mask, gen_rtx_PARALLEL (V16QImode, v)));
> emit_insn (gen_vperm_v8hiv4si (operands[0], operands[1], vzero, mask));
> DONE;
> }")
> @@ -3481,7 +3481,7 @@ (define_expand "vec_unpacku_lo_v16qi"
> RTVEC_ELT (v, 14) = gen_rtx_CONST_INT (QImode, be ? 16 : 8);
> RTVEC_ELT (v, 15) = gen_rtx_CONST_INT (QImode, be ? 15 : 16);
>
> - emit_insn (gen_vec_initv16qi (mask, gen_rtx_PARALLEL (V16QImode, v)));
> + emit_insn (gen_vec_initv16qiqi (mask, gen_rtx_PARALLEL (V16QImode, v)));
> emit_insn (gen_vperm_v16qiv8hi (operands[0], operands[1], vzero, mask));
> DONE;
> }")
> @@ -3517,7 +3517,7 @@ (define_expand "vec_unpacku_lo_v8hi"
> RTVEC_ELT (v, 14) = gen_rtx_CONST_INT (QImode, be ? 14 : 17);
> RTVEC_ELT (v, 15) = gen_rtx_CONST_INT (QImode, be ? 15 : 16);
>
> - emit_insn (gen_vec_initv16qi (mask, gen_rtx_PARALLEL (V16QImode, v)));
> + emit_insn (gen_vec_initv16qiqi (mask, gen_rtx_PARALLEL (V16QImode, v)));
> emit_insn (gen_vperm_v8hiv4si (operands[0], operands[1], vzero, mask));
> DONE;
> }")
> @@ -3758,7 +3758,7 @@ (define_expand "mulv16qi3"
> = gen_rtx_CONST_INT (QImode, BYTES_BIG_ENDIAN ? 2 * i + 17 : 15 - 2 * i);
> }
>
> - emit_insn (gen_vec_initv16qi (mask, gen_rtx_PARALLEL (V16QImode, v)));
> + emit_insn (gen_vec_initv16qiqi (mask, gen_rtx_PARALLEL (V16QImode, v)));
> emit_insn (gen_altivec_vmulesb (even, operands[1], operands[2]));
> emit_insn (gen_altivec_vmulosb (odd, operands[1], operands[2]));
> emit_insn (gen_altivec_vperm_v8hiv16qi (operands[0], even, odd, mask));
> @@ -3804,7 +3804,7 @@ (define_expand "altivec_vreve<mode>2"
> RTVEC_ELT (v, i + j * size)
> = GEN_INT (i + (num_elements - 1 - j) * size);
>
> - emit_insn (gen_vec_initv16qi (mask, gen_rtx_PARALLEL (V16QImode, v)));
> + emit_insn (gen_vec_initv16qiqi (mask, gen_rtx_PARALLEL (V16QImode, v)));
> emit_insn (gen_altivec_vperm_<mode> (operands[0], operands[1],
> operands[1], mask));
> DONE;
> --- gcc/config/aarch64/aarch64-simd.md.jj 2017-07-24 15:01:21.000000000 +0200
> +++ gcc/config/aarch64/aarch64-simd.md 2017-07-24 17:19:05.660170375 +0200
> @@ -5617,9 +5617,9 @@ (define_expand "aarch64_set_qreg<VSTRUCT
> DONE;
> })
>
> -;; Standard pattern name vec_init<mode>.
> +;; Standard pattern name vec_init<mode><Vel>.
>
> -(define_expand "vec_init<mode>"
> +(define_expand "vec_init<mode><Vel>"
> [(match_operand:VALL_F16 0 "register_operand" "")
> (match_operand 1 "" "")]
> "TARGET_SIMD"
> @@ -5674,9 +5674,9 @@ (define_insn "aarch64_urecpe<mode>"
> "urecpe\\t%0.<Vtype>, %1.<Vtype>"
> [(set_attr "type" "neon_fp_recpe_<Vetype><q>")])
>
> -;; Standard pattern name vec_extract<mode>.
> +;; Standard pattern name vec_extract<mode><Vel>.
>
> -(define_expand "vec_extract<mode>"
> +(define_expand "vec_extract<mode><Vel>"
> [(match_operand:<VEL> 0 "aarch64_simd_nonimmediate_operand" "")
> (match_operand:VALL_F16 1 "register_operand" "")
> (match_operand:SI 2 "immediate_operand" "")]
> --- gcc/config/aarch64/iterators.md.jj 2017-03-19 11:57:22.000000000 +0100
> +++ gcc/config/aarch64/iterators.md 2017-07-24 17:17:50.318091273 +0200
> @@ -520,6 +520,17 @@ (define_mode_attr VEL [(V8QI "QI") (V16Q
> (SI "SI") (HI "HI")
> (QI "QI")])
>
> +;; Define element mode for each vector mode (lower case).
> +(define_mode_attr Vel [(V8QI "qi") (V16QI "qi")
> + (V4HI "hi") (V8HI "hi")
> + (V2SI "si") (V4SI "si")
> + (DI "di") (V2DI "di")
> + (V4HF "hf") (V8HF "hf")
> + (V2SF "sf") (V4SF "sf")
> + (V2DF "df") (DF "df")
> + (SI "si") (HI "hi")
> + (QI "qi")])
> +
> ;; 64-bit container modes the inner or scalar source mode.
> (define_mode_attr VCOND [(HI "V4HI") (SI "V2SI")
> (V4HI "V4HI") (V8HI "V4HI")
> --- gcc/config/s390/s390.c.jj 2017-07-17 10:08:39.000000000 +0200
> +++ gcc/config/s390/s390.c 2017-07-24 17:58:24.416715142 +0200
> @@ -5792,7 +5792,7 @@ s390_expand_vec_strlen (rtx target, rtx
> add_int_reg_note (s390_emit_ccraw_jump (8, NE, loop_start_label),
> REG_BR_PROB,
> profile_probability::very_likely ().to_reg_br_prob_note ());
> - emit_insn (gen_vec_extractv16qi (len, result_reg, GEN_INT (7)));
> + emit_insn (gen_vec_extractv16qiqi (len, result_reg, GEN_INT (7)));
>
> /* If the string pointer wasn't aligned we have loaded less then 16
> bytes and the remaining bytes got filled with zeros (by vll).
> @@ -5850,7 +5850,7 @@ s390_expand_vec_movstr (rtx result, rtx
> emit_insn (gen_vlbb (vsrc, src, GEN_INT (6)));
> emit_insn (gen_lcbb (loadlen, src_addr, GEN_INT (6)));
> emit_insn (gen_vfenezv16qi (vpos, vsrc, vsrc));
> - emit_insn (gen_vec_extractv16qi (gpos_qi, vpos, GEN_INT (7)));
> + emit_insn (gen_vec_extractv16qiqi (gpos_qi, vpos, GEN_INT (7)));
> emit_move_insn (gpos, gen_rtx_SUBREG (SImode, gpos_qi, 0));
> /* gpos is the byte index if a zero was found and 16 otherwise.
> So if it is lower than the loaded bytes we have a hit. */
> @@ -5928,7 +5928,7 @@ s390_expand_vec_movstr (rtx result, rtx
> force_expand_binop (Pmode, add_optab, dst_addr_reg, offset, dst_addr_reg,
> 1, OPTAB_DIRECT);
>
> - emit_insn (gen_vec_extractv16qi (gpos_qi, vpos, GEN_INT (7)));
> + emit_insn (gen_vec_extractv16qiqi (gpos_qi, vpos, GEN_INT (7)));
> emit_move_insn (gpos, gen_rtx_SUBREG (SImode, gpos_qi, 0));
>
> emit_insn (gen_vstlv16qi (vsrc, gpos, gen_rtx_MEM (BLKmode, dst_addr_reg)));
> --- gcc/config/s390/vector.md.jj 2017-04-25 15:51:31.000000000 +0200
> +++ gcc/config/s390/vector.md 2017-07-24 17:57:37.665277768 +0200
> @@ -90,6 +90,17 @@ (define_mode_attr non_vec[(V1QI "QI") (V
> (V1DF "DF") (V2DF "DF")
> (V1TF "TF") (TF "TF")])
>
> +; Like above, but in lower case.
> +(define_mode_attr non_vec_l[(V1QI "qi") (V2QI "qi") (V4QI "qi") (V8QI "qi")
> + (V16QI "qi")
> + (V1HI "hi") (V2HI "hi") (V4HI "hi") (V8HI "hi")
> + (V1SI "si") (V2SI "si") (V4SI "si")
> + (V1DI "di") (V2DI "di")
> + (V1TI "ti") (TI "ti")
> + (V1SF "sf") (V2SF "sf") (V4SF "sf")
> + (V1DF "df") (V2DF "df")
> + (V1TF "tf") (TF "tf")])
> +
> ; The instruction suffix for integer instructions and instructions
> ; which do not care about whether it is floating point or integer.
> (define_mode_attr bhfgq[(V1QI "b") (V2QI "b") (V4QI "b") (V8QI "b") (V16QI "b")
> @@ -453,7 +464,7 @@ (define_insn "*vec_set<mode>_plus"
> ; FIXME: Support also vector mode operands for 0
> ; FIXME: This should be (vec_select ..) or something but it does only allow constant selectors :(
> ; This is used via RTL standard name as well as for expanding the builtin
> -(define_expand "vec_extract<mode>"
> +(define_expand "vec_extract<mode><non_vec_l>"
> [(set (match_operand:<non_vec> 0 "nonimmediate_operand" "")
> (unspec:<non_vec> [(match_operand:V 1 "register_operand" "")
> (match_operand:SI 2 "nonmemory_operand" "")]
> @@ -485,7 +496,7 @@ (define_insn "*vec_extract<mode>_plus"
> "vlgv<bhfgq>\t%0,%v1,%Y3(%2)"
> [(set_attr "op_type" "VRS")])
>
> -(define_expand "vec_init<mode>"
> +(define_expand "vec_init<mode><non_vec_l>"
> [(match_operand:V_128 0 "register_operand" "")
> (match_operand:V_128 1 "nonmemory_operand" "")]
> "TARGET_VX"
> --- gcc/config/s390/s390-builtins.def.jj 2017-03-24 15:08:56.000000000 +0100
> +++ gcc/config/s390/s390-builtins.def 2017-07-24 18:02:22.571849086 +0200
> @@ -450,12 +450,12 @@ OB_DEF_VAR (s390_vec_extract_u64,
> OB_DEF_VAR (s390_vec_extract_b64, s390_vlgvg, 0, O2_ELEM, BT_OV_ULONGLONG_BV2DI_INT)
> OB_DEF_VAR (s390_vec_extract_dbl, s390_vlgvg_dbl, 0, O2_ELEM, BT_OV_DBL_V2DF_INT) /* vlgvg */
>
> -B_DEF (s390_vlgvb, vec_extractv16qi, 0, B_VX, O2_ELEM, BT_FN_UCHAR_UV16QI_INT)
> -B_DEF (s390_vlgvh, vec_extractv8hi, 0, B_VX, O2_ELEM, BT_FN_USHORT_UV8HI_INT)
> -B_DEF (s390_vlgvf, vec_extractv4si, 0, B_VX, O2_ELEM, BT_FN_UINT_UV4SI_INT)
> -B_DEF (s390_vlgvf_flt, vec_extractv4sf, 0, B_INT | B_VXE, O2_ELEM, BT_FN_FLT_V4SF_INT)
> -B_DEF (s390_vlgvg, vec_extractv2di, 0, B_VX, O2_ELEM, BT_FN_ULONGLONG_UV2DI_INT)
> -B_DEF (s390_vlgvg_dbl, vec_extractv2df, 0, B_INT | B_VX, O2_ELEM, BT_FN_DBL_V2DF_INT)
> +B_DEF (s390_vlgvb, vec_extractv16qiqi, 0, B_VX, O2_ELEM, BT_FN_UCHAR_UV16QI_INT)
> +B_DEF (s390_vlgvh, vec_extractv8hihi, 0, B_VX, O2_ELEM, BT_FN_USHORT_UV8HI_INT)
> +B_DEF (s390_vlgvf, vec_extractv4sisi, 0, B_VX, O2_ELEM, BT_FN_UINT_UV4SI_INT)
> +B_DEF (s390_vlgvf_flt, vec_extractv4sfsf, 0, B_INT | B_VXE, O2_ELEM, BT_FN_FLT_V4SF_INT)
> +B_DEF (s390_vlgvg, vec_extractv2didi, 0, B_VX, O2_ELEM, BT_FN_ULONGLONG_UV2DI_INT)
> +B_DEF (s390_vlgvg_dbl, vec_extractv2dfdf, 0, B_INT | B_VX, O2_ELEM, BT_FN_DBL_V2DF_INT)
>
> OB_DEF (s390_vec_insert_and_zero, s390_vec_insert_and_zero_s8,s390_vec_insert_and_zero_dbl,B_VX,BT_FN_OV4SI_INTCONSTPTR)
> OB_DEF_VAR (s390_vec_insert_and_zero_s8,s390_vllezb, 0, 0, BT_OV_V16QI_SCHARCONSTPTR)
> --- gcc/config/arm/iterators.md.jj 2017-05-05 09:20:02.000000000 +0200
> +++ gcc/config/arm/iterators.md 2017-07-24 17:25:15.665681575 +0200
> @@ -444,6 +444,14 @@ (define_mode_attr V_elem [(V8QI "QI") (V
> (V2SF "SF") (V4SF "SF")
> (DI "DI") (V2DI "DI")])
>
> +;; As above but in lower case.
> +(define_mode_attr V_elem_l [(V8QI "qi") (V16QI "qi")
> + (V4HI "hi") (V8HI "hi")
> + (V4HF "hf") (V8HF "hf")
> + (V2SI "si") (V4SI "si")
> + (V2SF "sf") (V4SF "sf")
> + (DI "di") (V2DI "di")])
> +
> ;; Element modes for vector extraction, padded up to register size.
>
> (define_mode_attr V_ext [(V8QI "SI") (V16QI "SI")
> --- gcc/config/arm/neon.md.jj 2017-07-17 10:08:41.000000000 +0200
> +++ gcc/config/arm/neon.md 2017-07-24 17:27:42.173917259 +0200
> @@ -412,7 +412,7 @@ (define_expand "vec_set<mode>"
> DONE;
> })
>
> -(define_insn "vec_extract<mode>"
> +(define_insn "vec_extract<mode><V_elem_l>"
> [(set (match_operand:<V_elem> 0 "nonimmediate_operand" "=Um,r")
> (vec_select:<V_elem>
> (match_operand:VD_LANE 1 "s_register_operand" "w,w")
> @@ -434,7 +434,7 @@ (define_insn "vec_extract<mode>"
> [(set_attr "type" "neon_store1_one_lane<q>,neon_to_gp<q>")]
> )
>
> -(define_insn "vec_extract<mode>"
> +(define_insn "vec_extract<mode><V_elem_l>"
> [(set (match_operand:<V_elem> 0 "nonimmediate_operand" "=Um,r")
> (vec_select:<V_elem>
> (match_operand:VQ2 1 "s_register_operand" "w,w")
> @@ -460,7 +460,7 @@ (define_insn "vec_extract<mode>"
> [(set_attr "type" "neon_store1_one_lane<q>,neon_to_gp<q>")]
> )
>
> -(define_insn "vec_extractv2di"
> +(define_insn "vec_extractv2didi"
> [(set (match_operand:DI 0 "nonimmediate_operand" "=Um,r")
> (vec_select:DI
> (match_operand:V2DI 1 "s_register_operand" "w,w")
> @@ -479,7 +479,7 @@ (define_insn "vec_extractv2di"
> [(set_attr "type" "neon_store1_one_lane_q,neon_to_gp_q")]
> )
>
> -(define_expand "vec_init<mode>"
> +(define_expand "vec_init<mode><V_elem_l>"
> [(match_operand:VDQ 0 "s_register_operand" "")
> (match_operand 1 "" "")]
> "TARGET_NEON"
> @@ -1581,7 +1581,7 @@ (define_expand "reduc_plus_scal_<mode>"
> neon_pairwise_reduce (vec, operands[1], <MODE>mode,
> &gen_neon_vpadd_internal<mode>);
> /* The same result is actually computed into every element. */
> - emit_insn (gen_vec_extract<mode> (operands[0], vec, const0_rtx));
> + emit_insn (gen_vec_extract<mode><V_elem_l> (operands[0], vec, const0_rtx));
> DONE;
> })
>
> @@ -1607,7 +1607,7 @@ (define_expand "reduc_plus_scal_v2di"
> rtx vec = gen_reg_rtx (V2DImode);
>
> emit_insn (gen_arm_reduc_plus_internal_v2di (vec, operands[1]));
> - emit_insn (gen_vec_extractv2di (operands[0], vec, const0_rtx));
> + emit_insn (gen_vec_extractv2didi (operands[0], vec, const0_rtx));
>
> DONE;
> })
> @@ -1631,7 +1631,7 @@ (define_expand "reduc_smin_scal_<mode>"
> neon_pairwise_reduce (vec, operands[1], <MODE>mode,
> &gen_neon_vpsmin<mode>);
> /* The result is computed into every element of the vector. */
> - emit_insn (gen_vec_extract<mode> (operands[0], vec, const0_rtx));
> + emit_insn (gen_vec_extract<mode><V_elem_l> (operands[0], vec, const0_rtx));
> DONE;
> })
>
> @@ -1658,7 +1658,7 @@ (define_expand "reduc_smax_scal_<mode>"
> neon_pairwise_reduce (vec, operands[1], <MODE>mode,
> &gen_neon_vpsmax<mode>);
> /* The result is computed into every element of the vector. */
> - emit_insn (gen_vec_extract<mode> (operands[0], vec, const0_rtx));
> + emit_insn (gen_vec_extract<mode><V_elem_l> (operands[0], vec, const0_rtx));
> DONE;
> })
>
> @@ -1685,7 +1685,7 @@ (define_expand "reduc_umin_scal_<mode>"
> neon_pairwise_reduce (vec, operands[1], <MODE>mode,
> &gen_neon_vpumin<mode>);
> /* The result is computed into every element of the vector. */
> - emit_insn (gen_vec_extract<mode> (operands[0], vec, const0_rtx));
> + emit_insn (gen_vec_extract<mode><V_elem_l> (operands[0], vec, const0_rtx));
> DONE;
> })
>
> @@ -1711,7 +1711,7 @@ (define_expand "reduc_umax_scal_<mode>"
> neon_pairwise_reduce (vec, operands[1], <MODE>mode,
> &gen_neon_vpumax<mode>);
> /* The result is computed into every element of the vector. */
> - emit_insn (gen_vec_extract<mode> (operands[0], vec, const0_rtx));
> + emit_insn (gen_vec_extract<mode><V_elem_l> (operands[0], vec, const0_rtx));
> DONE;
> })
>
> @@ -3272,7 +3272,8 @@ (define_expand "neon_vget_lane<mode>"
> }
>
> if (GET_MODE_UNIT_BITSIZE (<MODE>mode) == 32)
> - emit_insn (gen_vec_extract<mode> (operands[0], operands[1], operands[2]));
> + emit_insn (gen_vec_extract<mode><V_elem_l> (operands[0], operands[1],
> + operands[2]));
> else
> emit_insn (gen_neon_vget_lane<mode>_sext_internal (operands[0],
> operands[1],
> @@ -3301,7 +3302,8 @@ (define_expand "neon_vget_laneu<mode>"
> }
>
> if (GET_MODE_UNIT_BITSIZE (<MODE>mode) == 32)
> - emit_insn (gen_vec_extract<mode> (operands[0], operands[1], operands[2]));
> + emit_insn (gen_vec_extract<mode><V_elem_l> (operands[0], operands[1],
> + operands[2]));
> else
> emit_insn (gen_neon_vget_lane<mode>_zext_internal (operands[0],
> operands[1],
> --- gcc/config/mips/mips-msa.md.jj 2017-03-31 20:36:09.000000000 +0200
> +++ gcc/config/mips/mips-msa.md 2017-07-24 17:33:32.657689124 +0200
> @@ -231,7 +231,7 @@ (define_mode_attr bitimm
> (V4SI "uimm5")
> (V2DI "uimm6")])
>
> -(define_expand "vec_init<mode>"
> +(define_expand "vec_init<mode><unitmode>"
> [(match_operand:MSA 0 "register_operand")
> (match_operand:MSA 1 "")]
> "ISA_HAS_MSA"
> @@ -311,7 +311,7 @@ (define_expand "vec_unpacku_lo_<mode>"
> DONE;
> })
>
> -(define_expand "vec_extract<mode>"
> +(define_expand "vec_extract<mode><unitmode>"
> [(match_operand:<UNITMODE> 0 "register_operand")
> (match_operand:IMSA 1 "register_operand")
> (match_operand 2 "const_<indeximm>_operand")]
> @@ -329,7 +329,7 @@ (define_expand "vec_extract<mode>"
> DONE;
> })
>
> -(define_expand "vec_extract<mode>"
> +(define_expand "vec_extract<mode><unitmode>"
> [(match_operand:<UNITMODE> 0 "register_operand")
> (match_operand:FMSA 1 "register_operand")
> (match_operand 2 "const_<indeximm>_operand")]
> --- gcc/config/mips/loongson.md.jj 2017-01-01 12:45:40.000000000 +0100
> +++ gcc/config/mips/loongson.md 2017-07-24 18:08:29.736433972 +0200
> @@ -119,7 +119,7 @@ (define_insn "mov<mode>_internal"
>
> ;; Initialization of a vector.
>
> -(define_expand "vec_init<mode>"
> +(define_expand "vec_init<mode><unitmode>"
> [(set (match_operand:VWHB 0 "register_operand")
> (match_operand 1 ""))]
> "TARGET_HARD_FLOAT && TARGET_LOONGSON_VECTORS"
> --- gcc/config/mips/mips-ps-3d.md.jj 2017-01-01 12:45:40.000000000 +0100
> +++ gcc/config/mips/mips-ps-3d.md 2017-07-24 17:34:13.540195876 +0200
> @@ -254,7 +254,7 @@ (define_expand "mips_pll_ps"
> })
>
> ; vec_init
> -(define_expand "vec_initv2sf"
> +(define_expand "vec_initv2sfsf"
> [(match_operand:V2SF 0 "register_operand")
> (match_operand:V2SF 1 "")]
> "TARGET_HARD_FLOAT && TARGET_PAIRED_SINGLE_FLOAT"
> @@ -282,7 +282,7 @@ (define_insn "vec_concatv2sf"
> ;; emulated. There is no other way to get a vector mode bitfield extract
> ;; currently.
>
> -(define_insn "vec_extractv2sf"
> +(define_insn "vec_extractv2sfsf"
> [(set (match_operand:SF 0 "register_operand" "=f")
> (vec_select:SF (match_operand:V2SF 1 "register_operand" "f")
> (parallel
> @@ -379,7 +379,7 @@ (define_expand "reduc_plus_scal_v2sf"
> rtx temp = gen_reg_rtx (V2SFmode);
> emit_insn (gen_mips_addr_ps (temp, operands[1], operands[1]));
> rtx lane = BYTES_BIG_ENDIAN ? const1_rtx : const0_rtx;
> - emit_insn (gen_vec_extractv2sf (operands[0], temp, lane));
> + emit_insn (gen_vec_extractv2sfsf (operands[0], temp, lane));
> DONE;
> })
>
> @@ -757,7 +757,7 @@ (define_expand "reduc_smin_scal_v2sf"
> rtx temp = gen_reg_rtx (V2SFmode);
> mips_expand_vec_reduc (temp, operands[1], gen_sminv2sf3);
> rtx lane = BYTES_BIG_ENDIAN ? const1_rtx : const0_rtx;
> - emit_insn (gen_vec_extractv2sf (operands[0], temp, lane));
> + emit_insn (gen_vec_extractv2sfsf (operands[0], temp, lane));
> DONE;
> })
>
> @@ -769,6 +769,6 @@ (define_expand "reduc_smax_scal_v2sf"
> rtx temp = gen_reg_rtx (V2SFmode);
> mips_expand_vec_reduc (temp, operands[1], gen_smaxv2sf3);
> rtx lane = BYTES_BIG_ENDIAN ? const1_rtx : const0_rtx;
> - emit_insn (gen_vec_extractv2sf (operands[0], temp, lane));
> + emit_insn (gen_vec_extractv2sfsf (operands[0], temp, lane));
> DONE;
> })
> --- gcc/config/mips/mips.md.jj 2017-06-15 11:03:32.000000000 +0200
> +++ gcc/config/mips/mips.md 2017-07-24 19:00:15.519582707 +0200
> @@ -917,6 +917,11 @@ (define_mode_attr UNITMODE [(SF "SF") (D
> (V16QI "QI") (V8HI "HI") (V4SI "SI") (V2DI "DI")
> (V2DF "DF")])
>
> +;; As above, but in lower case.
> +(define_mode_attr unitmode [(SF "sf") (DF "df") (V2SF "sf") (V4SF "sf")
> + (V16QI "qi") (V8QI "qi") (V8HI "hi") (V4HI "hi")
> + (V4SI "si") (V2SI "si") (V2DI "di") (V2DF "df")])
> +
> ;; This attribute gives the integer mode that has the same size as a
> ;; fixed-point mode.
> (define_mode_attr IMODE [(QQ "QI") (HQ "HI") (SQ "SI") (DQ "DI")
> --- gcc/config/spu/spu.c.jj 2017-07-17 10:08:39.000000000 +0200
> +++ gcc/config/spu/spu.c 2017-07-24 18:06:01.693214125 +0200
> @@ -1773,7 +1773,7 @@ spu_expand_prologue (void)
> size_v4si = scratch_v4si;
> }
> emit_insn (gen_cgt_v4si (scratch_v4si, sp_v4si, size_v4si));
> - emit_insn (gen_vec_extractv4si
> + emit_insn (gen_vec_extractv4sisi
> (scratch_reg_0, scratch_v4si, GEN_INT (1)));
> emit_insn (gen_spu_heq (scratch_reg_0, GEN_INT (0)));
> }
> @@ -5368,7 +5368,7 @@ spu_allocate_stack (rtx op0, rtx op1)
> {
> rtx avail = gen_reg_rtx(SImode);
> rtx result = gen_reg_rtx(SImode);
> - emit_insn (gen_vec_extractv4si (avail, sp, GEN_INT (1)));
> + emit_insn (gen_vec_extractv4sisi (avail, sp, GEN_INT (1)));
> emit_insn (gen_cgt_si(result, avail, GEN_INT (-1)));
> emit_insn (gen_spu_heq (result, GEN_INT(0) ));
> }
> @@ -5684,22 +5684,22 @@ spu_builtin_extract (rtx ops[])
> switch (mode)
> {
> case V16QImode:
> - emit_insn (gen_vec_extractv16qi (ops[0], ops[1], ops[2]));
> + emit_insn (gen_vec_extractv16qiqi (ops[0], ops[1], ops[2]));
> break;
> case V8HImode:
> - emit_insn (gen_vec_extractv8hi (ops[0], ops[1], ops[2]));
> + emit_insn (gen_vec_extractv8hihi (ops[0], ops[1], ops[2]));
> break;
> case V4SFmode:
> - emit_insn (gen_vec_extractv4sf (ops[0], ops[1], ops[2]));
> + emit_insn (gen_vec_extractv4sfsf (ops[0], ops[1], ops[2]));
> break;
> case V4SImode:
> - emit_insn (gen_vec_extractv4si (ops[0], ops[1], ops[2]));
> + emit_insn (gen_vec_extractv4sisi (ops[0], ops[1], ops[2]));
> break;
> case V2DImode:
> - emit_insn (gen_vec_extractv2di (ops[0], ops[1], ops[2]));
> + emit_insn (gen_vec_extractv2didi (ops[0], ops[1], ops[2]));
> break;
> case V2DFmode:
> - emit_insn (gen_vec_extractv2df (ops[0], ops[1], ops[2]));
> + emit_insn (gen_vec_extractv2dfdf (ops[0], ops[1], ops[2]));
> break;
> default:
> abort ();
> --- gcc/config/spu/spu.md.jj 2017-01-01 12:45:40.000000000 +0100
> +++ gcc/config/spu/spu.md 2017-07-24 18:05:05.591888718 +0200
> @@ -256,6 +256,13 @@ (define_mode_attr inner [(V16QI "QI")
> (V2DI "DI")
> (V4SF "SF")
> (V2DF "DF")])
> +;; Like above, but in lower case
> +(define_mode_attr inner_l [(V16QI "qi")
> + (V8HI "hi")
> + (V4SI "si")
> + (V2DI "di")
> + (V4SF "sf")
> + (V2DF "df")])
> (define_mode_attr vmult [(V16QI "1")
> (V8HI "2")
> (V4SI "4")
> @@ -4318,7 +4325,7 @@ (define_expand "restore_stack_nonlocal"
> ;; vector patterns
>
> ;; Vector initialization
> -(define_expand "vec_init<mode>"
> +(define_expand "vec_init<mode><inner_l>"
> [(match_operand:V 0 "register_operand" "")
> (match_operand 1 "" "")]
> ""
> @@ -4347,7 +4354,7 @@ (define_expand "vec_set<mode>"
> operands[6] = GEN_INT (size);
> })
>
> -(define_expand "vec_extract<mode>"
> +(define_expand "vec_extract<mode><inner_l>"
> [(set (match_operand:<inner> 0 "spu_reg_operand" "=r")
> (vec_select:<inner> (match_operand:V 1 "spu_reg_operand" "r")
> (parallel [(match_operand 2 "const_int_operand" "i")])))]
> --- gcc/config/sparc/sparc.md.jj 2017-07-17 10:08:39.000000000 +0200
> +++ gcc/config/sparc/sparc.md 2017-07-24 18:11:52.396997069 +0200
> @@ -8621,6 +8621,8 @@ (define_mode_attr vconstr [(V1SI "f") (V
> (define_mode_attr vfptype [(V1SI "single") (V2HI "single") (V4QI "single")
> (V1DI "double") (V2SI "double") (V4HI "double")
> (V8QI "double")])
> +(define_mode_attr veltmode [(V1SI "si") (V2HI "hi") (V4QI "qi") (V1DI "di")
> + (V2SI "si") (V4HI "hi") (V8QI "qi")])
>
> (define_expand "mov<VMALL:mode>"
> [(set (match_operand:VMALL 0 "nonimmediate_operand" "")
> @@ -8762,7 +8764,7 @@ (define_split
> DONE;
> })
>
> -(define_expand "vec_init<VMALL:mode>"
> +(define_expand "vec_init<VMALL:mode><VMALL:veltmode>"
> [(match_operand:VMALL 0 "register_operand" "")
> (match_operand:VMALL 1 "" "")]
> "TARGET_VIS"
> --- gcc/config/ia64/vect.md.jj 2017-01-01 12:45:42.000000000 +0100
> +++ gcc/config/ia64/vect.md 2017-07-24 17:29:28.996628899 +0200
> @@ -1015,7 +1015,7 @@ (define_insn "*vec_interleave_highv2si"
> }
> [(set_attr "itanium_class" "mmshf")])
>
> -(define_expand "vec_initv2si"
> +(define_expand "vec_initv2sisi"
> [(match_operand:V2SI 0 "gr_register_operand" "")
> (match_operand 1 "" "")]
> ""
> @@ -1299,7 +1299,7 @@ (define_insn "*fselect"
> "fselect %0 = %F2, %F3, %1"
> [(set_attr "itanium_class" "fmisc")])
>
> -(define_expand "vec_initv2sf"
> +(define_expand "vec_initv2sfsf"
> [(match_operand:V2SF 0 "fr_register_operand" "")
> (match_operand 1 "" "")]
> ""
> @@ -1483,7 +1483,7 @@ (define_insn_and_split "*vec_extractv2sf
> operands[1] = gen_rtx_REG (SFmode, REGNO (operands[1]));
> })
>
> -(define_expand "vec_extractv2sf"
> +(define_expand "vec_extractv2sfsf"
> [(set (match_operand:SF 0 "register_operand" "")
> (unspec:SF [(match_operand:V2SF 1 "register_operand" "")
> (match_operand:DI 2 "const_int_operand" "")]
> --- gcc/config/powerpcspe/vector.md.jj 2017-05-25 10:37:03.000000000 +0200
> +++ gcc/config/powerpcspe/vector.md 2017-07-24 17:41:21.897027743 +0200
> @@ -74,6 +74,16 @@ (define_mode_attr VEC_base [(V16QI "QI")
> (V1TI "TI")
> (TI "TI")])
>
> +;; As above, but in lower case
> +(define_mode_attr VEC_base_l [(V16QI "qi")
> + (V8HI "hi")
> + (V4SI "si")
> + (V2DI "di")
> + (V4SF "sf")
> + (V2DF "df")
> + (V1TI "ti")
> + (TI "ti")])
> +
> ;; Same size integer type for floating point data
> (define_mode_attr VEC_int [(V4SF "v4si")
> (V2DF "v2di")])
> @@ -1017,7 +1027,7 @@ (define_expand "fixuns_trunc<mode><VEC_i
>
> \f
> ;; Vector initialization, set, extract
> -(define_expand "vec_init<mode>"
> +(define_expand "vec_init<mode><VEC_base_l>"
> [(match_operand:VEC_E 0 "vlogical_operand" "")
> (match_operand:VEC_E 1 "" "")]
> "VECTOR_MEM_ALTIVEC_OR_VSX_P (<MODE>mode)"
> @@ -1036,7 +1046,7 @@ (define_expand "vec_set<mode>"
> DONE;
> })
>
> -(define_expand "vec_extract<mode>"
> +(define_expand "vec_extract<mode><VEC_base_l>"
> [(match_operand:<VEC_base> 0 "register_operand" "")
> (match_operand:VEC_E 1 "vlogical_operand" "")
> (match_operand 2 "const_int_operand" "")]
> --- gcc/config/powerpcspe/paired.md.jj 2017-05-25 10:37:04.000000000 +0200
> +++ gcc/config/powerpcspe/paired.md 2017-07-24 17:42:17.980351097 +0200
> @@ -377,7 +377,7 @@ (define_insn "paired_muls1"
> "ps_muls1 %0, %1, %2"
> [(set_attr "type" "fp")])
>
> -(define_expand "vec_initv2sf"
> +(define_expand "vec_initv2sfsf"
> [(match_operand:V2SF 0 "gpc_reg_operand" "=f")
> (match_operand 1 "" "")]
> "TARGET_PAIRED_FLOAT"
> --- gcc/config/powerpcspe/altivec.md.jj 2017-05-25 10:37:05.000000000 +0200
> +++ gcc/config/powerpcspe/altivec.md 2017-07-24 17:42:49.897966010 +0200
> @@ -301,7 +301,7 @@ (define_split
> for (i = 0; i < num_elements; i++)
> RTVEC_ELT (v, i) = constm1_rtx;
>
> - emit_insn (gen_vec_initv4si (dest, gen_rtx_PARALLEL (mode, v)));
> + emit_insn (gen_vec_initv4sisi (dest, gen_rtx_PARALLEL (mode, v)));
> emit_insn (gen_rtx_SET (dest, gen_rtx_ASHIFT (mode, dest, dest)));
> DONE;
> })
> @@ -2222,7 +2222,7 @@ (define_expand "altivec_copysign_v4sf3"
> RTVEC_ELT (v, 2) = GEN_INT (mask_val);
> RTVEC_ELT (v, 3) = GEN_INT (mask_val);
>
> - emit_insn (gen_vec_initv4si (mask, gen_rtx_PARALLEL (V4SImode, v)));
> + emit_insn (gen_vec_initv4sisi (mask, gen_rtx_PARALLEL (V4SImode, v)));
> emit_insn (gen_vector_select_v4sf (operands[0], operands[1], operands[2],
> gen_lowpart (V4SFmode, mask)));
> DONE;
> @@ -3014,7 +3014,7 @@ (define_expand "vec_unpacku_hi_v16qi"
> RTVEC_ELT (v, 14) = gen_rtx_CONST_INT (QImode, be ? 16 : 0);
> RTVEC_ELT (v, 15) = gen_rtx_CONST_INT (QImode, be ? 7 : 16);
>
> - emit_insn (gen_vec_initv16qi (mask, gen_rtx_PARALLEL (V16QImode, v)));
> + emit_insn (gen_vec_initv16qiqi (mask, gen_rtx_PARALLEL (V16QImode, v)));
> emit_insn (gen_vperm_v16qiv8hi (operands[0], operands[1], vzero, mask));
> DONE;
> }")
> @@ -3050,7 +3050,7 @@ (define_expand "vec_unpacku_hi_v8hi"
> RTVEC_ELT (v, 14) = gen_rtx_CONST_INT (QImode, be ? 6 : 17);
> RTVEC_ELT (v, 15) = gen_rtx_CONST_INT (QImode, be ? 7 : 16);
>
> - emit_insn (gen_vec_initv16qi (mask, gen_rtx_PARALLEL (V16QImode, v)));
> + emit_insn (gen_vec_initv16qiqi (mask, gen_rtx_PARALLEL (V16QImode, v)));
> emit_insn (gen_vperm_v8hiv4si (operands[0], operands[1], vzero, mask));
> DONE;
> }")
> @@ -3086,7 +3086,7 @@ (define_expand "vec_unpacku_lo_v16qi"
> RTVEC_ELT (v, 14) = gen_rtx_CONST_INT (QImode, be ? 16 : 8);
> RTVEC_ELT (v, 15) = gen_rtx_CONST_INT (QImode, be ? 15 : 16);
>
> - emit_insn (gen_vec_initv16qi (mask, gen_rtx_PARALLEL (V16QImode, v)));
> + emit_insn (gen_vec_initv16qiqi (mask, gen_rtx_PARALLEL (V16QImode, v)));
> emit_insn (gen_vperm_v16qiv8hi (operands[0], operands[1], vzero, mask));
> DONE;
> }")
> @@ -3122,7 +3122,7 @@ (define_expand "vec_unpacku_lo_v8hi"
> RTVEC_ELT (v, 14) = gen_rtx_CONST_INT (QImode, be ? 14 : 17);
> RTVEC_ELT (v, 15) = gen_rtx_CONST_INT (QImode, be ? 15 : 16);
>
> - emit_insn (gen_vec_initv16qi (mask, gen_rtx_PARALLEL (V16QImode, v)));
> + emit_insn (gen_vec_initv16qiqi (mask, gen_rtx_PARALLEL (V16QImode, v)));
> emit_insn (gen_vperm_v8hiv4si (operands[0], operands[1], vzero, mask));
> DONE;
> }")
> @@ -3363,7 +3363,7 @@ (define_expand "mulv16qi3"
> = gen_rtx_CONST_INT (QImode, BYTES_BIG_ENDIAN ? 2 * i + 17 : 15 - 2 * i);
> }
>
> - emit_insn (gen_vec_initv16qi (mask, gen_rtx_PARALLEL (V16QImode, v)));
> + emit_insn (gen_vec_initv16qiqi (mask, gen_rtx_PARALLEL (V16QImode, v)));
> emit_insn (gen_altivec_vmulesb (even, operands[1], operands[2]));
> emit_insn (gen_altivec_vmulosb (odd, operands[1], operands[2]));
> emit_insn (gen_altivec_vperm_v8hiv16qi (operands[0], even, odd, mask));
>
> Jakub
>