From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <gcc-patches-return-459000-listarch-gcc-patches=gcc.gnu.org@gcc.gnu.org>
Received: (qmail 75234 invoked by alias); 26 Jul 2017 10:35:40 -0000
Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Id: <gcc-patches.gcc.gnu.org>
List-Archive: <http://gcc.gnu.org/ml/gcc-patches/>
List-Post: <mailto:gcc-patches@gcc.gnu.org>
List-Help: <mailto:gcc-patches-help@gcc.gnu.org>
Sender: gcc-patches-owner@gcc.gnu.org
Received: (qmail 72214 invoked by uid 89); 26 Jul 2017 10:35:39 -0000
Authentication-Results: sourceware.org; auth=none
X-Virus-Found: No
X-Spam-SWARE-Status: No, score=-16.9 required=5.0 tests=BAYES_00,GIT_PATCH_1,GIT_PATCH_2,GIT_PATCH_3,RP_MATCHES_RCVD,SPF_PASS autolearn=ham version=3.3.2 spammy=
X-HELO: mx1.suse.de
Received: from mx2.suse.de (HELO mx1.suse.de) (195.135.220.15) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Wed, 26 Jul 2017 10:35:31 +0000
Received: from relay1.suse.de (charybdis-ext.suse.de [195.135.220.254])	by mx1.suse.de (Postfix) with ESMTP id 9C593AB1E;	Wed, 26 Jul 2017 10:35:28 +0000 (UTC)
Date: Wed, 26 Jul 2017 10:35:00 -0000
From: Richard Biener <rguenther@suse.de>
To: Jakub Jelinek <jakub@redhat.com>
cc: Uros Bizjak <ubizjak@gmail.com>, David Edelsohn <dje.gcc@gmail.com>,     Segher Boessenkool <segher@kernel.crashing.org>,     Marcus Shawcroft <marcus.shawcroft@arm.com>,     Richard Earnshaw <richard.earnshaw@arm.com>,     Andreas Krebbel <Andreas.Krebbel@de.ibm.com>,     Matthew Fortune <matthew.fortune@imgtec.com>,     Eric Botcazou <ebotcazou@libertysurf.fr>,     Andrew Jenner <andrew@codesourcery.com>, gcc-patches@gcc.gnu.org
Subject: Re: [PATCH] Switch vec_init and vec_extract optabs to 2 mode optab to allow extraction of vector from vector or initialization of vector from smaller vectors (PR target/80846)
In-Reply-To: <20170725091432.GQ2123@tucnak>
Message-ID: <alpine.LSU.2.20.1707261235170.10808@zhemvz.fhfr.qr>
References: <20170725091432.GQ2123@tucnak>
User-Agent: Alpine 2.20 (LSU 67 2015-01-07)
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
X-SW-Source: 2017-07/txt/msg01649.txt.bz2

On Tue, 25 Jul 2017, Jakub Jelinek wrote:

> Hi!
> 
> The following patch adjusts the vec_init and vec_extract optabs so that
> the expander names carry not just the vector mode but also a second mode:
> for vec_extract the mode of the result, and for vec_init the mode of the
> elts of the vector passed as second operand.
> 
> Without this patch, the second mode has been implicit, GET_MODE_INNER of
> the vector mode, so one could just extract a single element from a vector
> or construct a vector from elements.  While that is most common, we allow
> in GIMPLE e.g. construction of V8DImode from 4 V2DImode elements etc.
> and the vectorizer uses them.  Having the second mode in the name
> allows the generic code (vectorizer, expansion) to query whether the
> backend supports such vector from vector expansions or inits from vector
> elts and use them if available.
> 
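To illustrate the naming point above: a minimal sketch in plain C, not GCC
code, of why a second mode in the expander name makes vector from vector
variants queryable.  It assumes a toy table of expander names built the way
the "$a$b" optab names are (operation, then both lowercase mode names).  The
table contents and the toy_has_handler helper are hypothetical and do not
describe any real backend.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical list of expander names a toy backend provides.  */
static const char *const toy_expanders[] = {
  "vec_extractv4sisi",    /* SImode element from V4SImode */
  "vec_extractv16siv8si", /* V8SImode half from V16SImode */
  "vec_initv4sisi",       /* V4SImode from SImode elements */
  "vec_initv8div2di",     /* V8DImode from V2DImode pieces */
};

/* Return 1 if the toy table has an expander named OP followed by the
   lowercase names of the OUTER and INNER modes, 0 otherwise.  */
static int
toy_has_handler (const char *op, const char *outer, const char *inner)
{
  char name[64];
  size_t i;

  /* Guard the fixed-size buffer used to build the name.  */
  assert (strlen (op) + strlen (outer) + strlen (inner) < sizeof name);
  strcpy (name, op);
  strcat (name, outer);
  strcat (name, inner);

  for (i = 0; i < sizeof toy_expanders / sizeof toy_expanders[0]; i++)
    if (strcmp (name, toy_expanders[i]) == 0)
      return 1;
  return 0;
}
```

With only the vector mode in the name, "vec_extractv16si" could describe one
variant at most; with both modes, generic code can ask separately for the
"si" and the "v8si" variant and fall back when the answer is no.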
> For vec_extract, if we want to extract, say, the high V2SImode from
> V4SImode, the fallback is to try to expand it as DImode extraction from
> V2DImode.
> This works well in many cases, but doesn't really work for very large
> vectors, say if we want to extract high V8SImode from V16SImode on x86,
> we'd need OImode extraction from V2OImode, which is something the backend
> doesn't have any support for.
> For vec_init, the fallback is usually to go through memory, which is slow in
> many cases.
> 
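The extract fallback described above can be sketched in plain C, as a rough
model rather than anything taken from the patch.  It assumes a 128-bit value
held as four 32-bit elements and views it as two 64-bit lanes, which is the
"DImode from V2DImode" trick.  The helper names are hypothetical.  The two
index helpers follow the arithmetic used in extract_bit_field_1 (position is
bitnum divided by the field width, and the field must not straddle a
boundary).

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Index of the field that holds bit BITNUM when fields are FIELD_BITS
   wide.  */
static unsigned
field_pos (unsigned bitnum, unsigned field_bits)
{
  assert (field_bits != 0);
  return bitnum / field_bits;
}

/* Nonzero if the BITSIZE bits starting at BITNUM lie within a single
   field of FIELD_BITS bits.  */
static int
within_one_field (unsigned bitnum, unsigned bitsize, unsigned field_bits)
{
  assert (field_bits != 0 && bitsize != 0);
  return (bitnum + bitsize - 1) / field_bits == bitnum / field_bits;
}

/* Copy the 64-bit lane at index POS (0 or 1) out of SRC, a 128-bit value
   given as four 32-bit elements, into DST as two 32-bit elements.  Only
   bytes are copied, so the result does not depend on host endianness.  */
static void
extract_64_from_128 (const uint32_t src[4], unsigned pos, uint32_t dst[2])
{
  uint64_t lanes[2];

  assert (pos < 2);
  memcpy (lanes, src, sizeof lanes);
  memcpy (dst, &lanes[pos], sizeof lanes[pos]);
}
```

The same trick for the high V8SImode of V16SImode would need a 256-bit
integer lane, which has no C counterpart here either; that is the case where
a direct vector from vector extract with the index counted in 256-bit units
(field_pos (256, 256) == 1) is needed.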
> This patch only adds new vector from vector extract and init patterns to
> the i386 backend, but I had to change many other targets too, because
> it needs to have the element mode in the vec_extract/vec_init expander
> names.  Seems most of the backends didn't really have a mode attribute
> usable for this or had it only in uppercase, while for the names we need
> lowercase.  Some backends had a convention on how to name lower case
> vs. upper case modes, others didn't have any.  So I'm CCing maintainers
> of affected backends to seek advice on what mode attributes they want to
> use.
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, where it improves
> e.g. the code generation for slp-43.c and slp-45.c testcases.
> make cc1 tested in cross-compilers to the remaining targets.
> 
> Ok for trunk?

The non-target specific bits are ok.

Thanks,
Richard.

> 2017-07-25  Jakub Jelinek  <jakub@redhat.com>
> 
> 	PR target/80846
> 	* optabs.def (vec_extract_optab, vec_init_optab): Change from
> 	a direct optab to conversion optab.
> 	* optabs.c (expand_vector_broadcast): Use convert_optab_handler
> 	with GET_MODE_INNER as last argument instead of optab_handler.
> 	* expmed.c (extract_bit_field_1): Likewise.  Use vector from
> 	vector extraction if possible and optab is available.
> 	* expr.c (store_constructor): Use convert_optab_handler instead
> 	of optab_handler.  Use vector initialization from smaller
> 	vectors if possible and optab is available.
> 	* tree-vect-stmts.c (vectorizable_load): Likewise.
> 	* doc/md.texi (vec_extract, vec_init): Document that the optabs
> 	now have two modes.
> 	* config/i386/i386.c (ix86_expand_vector_init): Handle expansion
> 	of vec_init from half-sized vectors with the same element mode.
> 	* config/i386/sse.md (ssehalfvecmode): Add V4TI case.
> 	(ssehalfvecmodelower, ssescalarmodelower): New mode attributes.
> 	(reduc_plus_scal_v8df, reduc_plus_scal_v4df, reduc_plus_scal_v2df,
> 	reduc_plus_scal_v16sf, reduc_plus_scal_v8sf, reduc_plus_scal_v4sf,
> 	reduc_<code>_scal_<mode>, reduc_umin_scal_v8hi): Add element mode
> 	after mode in gen_vec_extract* calls.
> 	(vec_extract<mode>): Renamed to ...
> 	(vec_extract<mode><ssescalarmodelower>): ... this.
> 	(vec_extract<mode><ssehalfvecmodelower>): New expander.
> 	(rotl<mode>3, rotr<mode>3, <shift_insn><mode>3, ashrv2di3): Add
> 	element mode after mode in gen_vec_init* calls.
> 	(VEC_INIT_HALF_MODE): New mode iterator.
> 	(vec_init<mode>): Renamed to ...
> 	(vec_init<mode><ssescalarmodelower>): ... this.
> 	(vec_init<mode><ssehalfvecmodelower>): New expander.
> 	* config/i386/mmx.md (vec_extractv2sf): Renamed to ...
> 	(vec_extractv2sfsf): ... this.
> 	(vec_initv2sf): Renamed to ...
> 	(vec_initv2sfsf): ... this.
> 	(vec_extractv2si): Renamed to ...
> 	(vec_extractv2sisi): ... this.
> 	(vec_initv2si): Renamed to ...
> 	(vec_initv2sisi): ... this.
> 	(vec_extractv4hi): Renamed to ...
> 	(vec_extractv4hihi): ... this.
> 	(vec_initv4hi): Renamed to ...
> 	(vec_initv4hihi): ... this.
> 	(vec_extractv8qi): Renamed to ...
> 	(vec_extractv8qiqi): ... this.
> 	(vec_initv8qi): Renamed to ...
> 	(vec_initv8qiqi): ... this.
> 	* config/rs6000/vector.md (VEC_base_l): New mode attribute.
> 	(vec_init<mode>): Renamed to ...
> 	(vec_init<mode><VEC_base_l>): ... this.
> 	(vec_extract<mode>): Renamed to ...
> 	(vec_extract<mode><VEC_base_l>): ... this.
> 	* config/rs6000/paired.md (vec_initv2sf): Renamed to ...
> 	(vec_initv2sfsf): ... this.
> 	* config/rs6000/altivec.md (splitter, altivec_copysign_v4sf3,
> 	vec_unpacku_hi_v16qi, vec_unpacku_hi_v8hi, vec_unpacku_lo_v16qi,
> 	vec_unpacku_lo_v8hi, mulv16qi3, altivec_vreve<mode>2): Add
> 	element mode after mode in gen_vec_init* calls.
> 	* config/aarch64/aarch64-simd.md (vec_init<mode>): Renamed to ...
> 	(vec_init<mode><Vel>): ... this.
> 	(vec_extract<mode>): Renamed to ...
> 	(vec_extract<mode><Vel>): ... this.
> 	* config/aarch64/iterators.md (Vel): New mode attribute.
> 	* config/s390/s390.c (s390_expand_vec_strlen, s390_expand_vec_movstr):
> 	Add element mode after mode in gen_vec_extract* calls.
> 	* config/s390/vector.md (non_vec_l): New mode attribute.
> 	(vec_extract<mode>): Renamed to ...
> 	(vec_extract<mode><non_vec_l>): ... this.
> 	(vec_init<mode>): Renamed to ...
> 	(vec_init<mode><non_vec_l>): ... this.
> 	* config/s390/s390-builtins.def (s390_vlgvb, s390_vlgvh, s390_vlgvf,
> 	s390_vlgvf_flt, s390_vlgvg, s390_vlgvg_dbl): Add element mode after
> 	vec_extract mode.
> 	* config/arm/iterators.md (V_elem_l): New mode attribute.
> 	* config/arm/neon.md (vec_extract<mode>): Renamed to ...
> 	(vec_extract<mode><V_elem_l>): ... this.
> 	(vec_extractv2di): Renamed to ...
> 	(vec_extractv2didi): ... this.
> 	(vec_init<mode>): Renamed to ...
> 	(vec_init<mode><V_elem_l>): ... this.
> 	(reduc_plus_scal_<mode>, reduc_plus_scal_v2di, reduc_smin_scal_<mode>,
> 	reduc_smax_scal_<mode>, reduc_umin_scal_<mode>,
> 	reduc_umax_scal_<mode>, neon_vget_lane<mode>, neon_vget_laneu<mode>):
> 	Add element mode after gen_vec_extract* calls.
> 	* config/mips/mips-msa.md (vec_init<mode>): Renamed to ...
> 	(vec_init<mode><unitmode>): ... this.
> 	(vec_extract<mode>): Renamed to ...
> 	(vec_extract<mode><unitmode>): ... this.
> 	* config/mips/loongson.md (vec_init<mode>): Renamed to ...
> 	(vec_init<mode><unitmode>): ... this.
> 	* config/mips/mips-ps-3d.md (vec_initv2sf): Renamed to ...
> 	(vec_initv2sfsf): ... this.
> 	(vec_extractv2sf): Renamed to ...
> 	(vec_extractv2sfsf): ... this.
> 	(reduc_plus_scal_v2sf, reduc_smin_scal_v2sf, reduc_smax_scal_v2sf):
> 	Add element mode after gen_vec_extract* calls.
> 	* config/mips/mips.md (unitmode): New mode iterator.
> 	* config/spu/spu.c (spu_expand_prologue, spu_allocate_stack,
> 	spu_builtin_extract): Add element mode after gen_vec_extract* calls.
> 	* config/spu/spu.md (inner_l): New mode attribute.
> 	(vec_init<mode>): Renamed to ...
> 	(vec_init<mode><inner_l>): ... this.
> 	(vec_extract<mode>): Renamed to ...
> 	(vec_extract<mode><inner_l>): ... this.
> 	* config/sparc/sparc.md (veltmode): New mode iterator.
> 	(vec_init<VMALL:mode>): Renamed to ...
> 	(vec_init<VMALL:mode><VMALL:veltmode>): ... this.
> 	* config/ia64/vect.md (vec_initv2si): Renamed to ...
> 	(vec_initv2sisi): ... this.
> 	(vec_initv2sf): Renamed to ...
> 	(vec_initv2sfsf): ... this.
> 	(vec_extractv2sf): Renamed to ...
> 	(vec_extractv2sfsf): ... this.
> 	* config/powerpcspe/vector.md (VEC_base_l): New mode attribute.
> 	(vec_init<mode>): Renamed to ...
> 	(vec_init<mode><VEC_base_l>): ... this.
> 	(vec_extract<mode>): Renamed to ...
> 	(vec_extract<mode><VEC_base_l>): ... this.
> 	* config/powerpcspe/paired.md (vec_initv2sf): Renamed to ...
> 	(vec_initv2sfsf): ... this.
> 	* config/powerpcspe/altivec.md (splitter, altivec_copysign_v4sf3,
> 	vec_unpacku_hi_v16qi, vec_unpacku_hi_v8hi, vec_unpacku_lo_v16qi,
> 	vec_unpacku_lo_v8hi, mulv16qi3): Add element mode after mode in
> 	gen_vec_init* calls.
> 
> --- gcc/optabs.def.jj	2017-07-24 10:57:45.944815535 +0200
> +++ gcc/optabs.def	2017-07-24 16:11:23.066229910 +0200
> @@ -89,6 +89,8 @@ OPTAB_CD(vec_cmpu_optab, "vec_cmpu$a$b")
>  OPTAB_CD(vec_cmpeq_optab, "vec_cmpeq$a$b")
>  OPTAB_CD(maskload_optab, "maskload$a$b")
>  OPTAB_CD(maskstore_optab, "maskstore$a$b")
> +OPTAB_CD(vec_extract_optab, "vec_extract$a$b")
> +OPTAB_CD(vec_init_optab, "vec_init$a$b")
>  
>  OPTAB_NL(add_optab, "add$P$a3", PLUS, "add", '3', gen_int_fp_fixed_libfunc)
>  OPTAB_NX(add_optab, "add$F$a3")
> @@ -294,8 +296,6 @@ OPTAB_D (udot_prod_optab, "udot_prod$I$a
>  OPTAB_D (usum_widen_optab, "widen_usum$I$a3")
>  OPTAB_D (usad_optab, "usad$I$a")
>  OPTAB_D (ssad_optab, "ssad$I$a")
> -OPTAB_D (vec_extract_optab, "vec_extract$a")
> -OPTAB_D (vec_init_optab, "vec_init$a")
>  OPTAB_D (vec_pack_sfix_trunc_optab, "vec_pack_sfix_trunc_$a")
>  OPTAB_D (vec_pack_ssat_optab, "vec_pack_ssat_$a")
>  OPTAB_D (vec_pack_trunc_optab, "vec_pack_trunc_$a")
> --- gcc/optabs.c.jj	2017-07-24 10:57:46.216812275 +0200
> +++ gcc/optabs.c	2017-07-24 16:11:23.067229898 +0200
> @@ -386,7 +386,8 @@ expand_vector_broadcast (machine_mode vm
>    /* ??? If the target doesn't have a vec_init, then we have no easy way
>       of performing this operation.  Most of this sort of generic support
>       is hidden away in the vector lowering support in gimple.  */
> -  icode = optab_handler (vec_init_optab, vmode);
> +  icode = convert_optab_handler (vec_init_optab, vmode,
> +				 GET_MODE_INNER (vmode));
>    if (icode == CODE_FOR_nothing)
>      return NULL;
>  
> --- gcc/expmed.c.jj	2017-07-24 10:57:45.914815894 +0200
> +++ gcc/expmed.c	2017-07-24 16:11:23.071229850 +0200
> @@ -1566,6 +1566,55 @@ extract_bit_field_1 (rtx str_rtx, unsign
>        return op0;
>      }
>  
> +  /* First try to check for vector from vector extractions.  */
> +  if (VECTOR_MODE_P (GET_MODE (op0))
> +      && !MEM_P (op0)
> +      && VECTOR_MODE_P (tmode)
> +      && GET_MODE_SIZE (GET_MODE (op0)) > GET_MODE_SIZE (tmode))
> +    {
> +      machine_mode new_mode = GET_MODE (op0);
> +      if (GET_MODE_INNER (new_mode) != GET_MODE_INNER (tmode))
> +	{
> +	  new_mode = mode_for_vector (GET_MODE_INNER (tmode),
> +				      GET_MODE_BITSIZE (GET_MODE (op0))
> +				      / GET_MODE_UNIT_BITSIZE (tmode));
> +	  if (!VECTOR_MODE_P (new_mode)
> +	      || GET_MODE_SIZE (new_mode) != GET_MODE_SIZE (GET_MODE (op0))
> +	      || GET_MODE_INNER (new_mode) != GET_MODE_INNER (tmode)
> +	      || !targetm.vector_mode_supported_p (new_mode))
> +	    new_mode = VOIDmode;
> +	}
> +      if (new_mode != VOIDmode
> +	  && (convert_optab_handler (vec_extract_optab, new_mode, tmode)
> +	      != CODE_FOR_nothing)
> +	  && ((bitnum + bitsize - 1) / GET_MODE_BITSIZE (tmode)
> +	      == bitnum / GET_MODE_BITSIZE (tmode)))
> +	{
> +	  struct expand_operand ops[3];
> +	  machine_mode outermode = new_mode;
> +	  machine_mode innermode = tmode;
> +	  enum insn_code icode
> +	    = convert_optab_handler (vec_extract_optab, outermode, innermode);
> +	  unsigned HOST_WIDE_INT pos = bitnum / GET_MODE_BITSIZE (innermode);
> +
> +	  if (new_mode != GET_MODE (op0))
> +	    op0 = gen_lowpart (new_mode, op0);
> +	  create_output_operand (&ops[0], target, innermode);
> +	  ops[0].target = 1;
> +	  create_input_operand (&ops[1], op0, outermode);
> +	  create_integer_operand (&ops[2], pos);
> +	  if (maybe_expand_insn (icode, 3, ops))
> +	    {
> +	      if (alt_rtl && ops[0].target)
> +		*alt_rtl = target;
> +	      target = ops[0].value;
> +	      if (GET_MODE (target) != mode)
> +		return gen_lowpart (tmode, target);
> +	      return target;
> +	    }
> +	}
> +    }
> +
>    /* See if we can get a better vector mode before extracting.  */
>    if (VECTOR_MODE_P (GET_MODE (op0))
>        && !MEM_P (op0)
> @@ -1599,14 +1648,17 @@ extract_bit_field_1 (rtx str_rtx, unsign
>       available.  */
>    if (VECTOR_MODE_P (GET_MODE (op0))
>        && !MEM_P (op0)
> -      && optab_handler (vec_extract_optab, GET_MODE (op0)) != CODE_FOR_nothing
> +      && (convert_optab_handler (vec_extract_optab, GET_MODE (op0),
> +				 GET_MODE_INNER (GET_MODE (op0)))
> +	  != CODE_FOR_nothing)
>        && ((bitnum + bitsize - 1) / GET_MODE_UNIT_BITSIZE (GET_MODE (op0))
>  	  == bitnum / GET_MODE_UNIT_BITSIZE (GET_MODE (op0))))
>      {
>        struct expand_operand ops[3];
>        machine_mode outermode = GET_MODE (op0);
>        machine_mode innermode = GET_MODE_INNER (outermode);
> -      enum insn_code icode = optab_handler (vec_extract_optab, outermode);
> +      enum insn_code icode
> +	= convert_optab_handler (vec_extract_optab, outermode, innermode);
>        unsigned HOST_WIDE_INT pos = bitnum / GET_MODE_BITSIZE (innermode);
>  
>        create_output_operand (&ops[0], target, innermode);
> --- gcc/expr.c.jj	2017-07-24 10:57:45.963815307 +0200
> +++ gcc/expr.c	2017-07-24 16:11:23.073229826 +0200
> @@ -6589,6 +6589,7 @@ store_constructor (tree exp, rtx target,
>  	rtvec vector = NULL;
>  	unsigned n_elts;
>  	alias_set_type alias;
> +	bool vec_vec_init_p = false;
>  
>  	gcc_assert (eltmode != BLKmode);
>  
> @@ -6596,27 +6597,30 @@ store_constructor (tree exp, rtx target,
>  	if (REG_P (target) && VECTOR_MODE_P (GET_MODE (target)))
>  	  {
>  	    machine_mode mode = GET_MODE (target);
> +	    machine_mode emode = eltmode;
>  
> -	    icode = (int) optab_handler (vec_init_optab, mode);
> -	    /* Don't use vec_init<mode> if some elements have VECTOR_TYPE.  */
> -	    if (icode != CODE_FOR_nothing)
> +	    if (CONSTRUCTOR_NELTS (exp)
> +		&& (TREE_CODE (TREE_TYPE (CONSTRUCTOR_ELT (exp, 0)->value))
> +		    == VECTOR_TYPE))
>  	      {
> -		tree value;
> -
> -		FOR_EACH_CONSTRUCTOR_VALUE (CONSTRUCTOR_ELTS (exp), idx, value)
> -		  if (TREE_CODE (TREE_TYPE (value)) == VECTOR_TYPE)
> -		    {
> -		      icode = CODE_FOR_nothing;
> -		      break;
> -		    }
> +		tree etype = TREE_TYPE (CONSTRUCTOR_ELT (exp, 0)->value);
> +		gcc_assert (CONSTRUCTOR_NELTS (exp) * TYPE_VECTOR_SUBPARTS (etype)
> +			    == n_elts);
> +		emode = TYPE_MODE (etype);
>  	      }
> +	    icode = (int) convert_optab_handler (vec_init_optab, mode, emode);
>  	    if (icode != CODE_FOR_nothing)
>  	      {
> -		unsigned int i;
> +		unsigned int i, n = n_elts;
>  
> -		vector = rtvec_alloc (n_elts);
> -		for (i = 0; i < n_elts; i++)
> -		  RTVEC_ELT (vector, i) = CONST0_RTX (GET_MODE_INNER (mode));
> +		if (emode != eltmode)
> +		  {
> +		    n = CONSTRUCTOR_NELTS (exp);
> +		    vec_vec_init_p = true;
> +		  }
> +		vector = rtvec_alloc (n);
> +		for (i = 0; i < n; i++)
> +		  RTVEC_ELT (vector, i) = CONST0_RTX (emode);
>  	      }
>  	  }
>  
> @@ -6634,10 +6638,10 @@ store_constructor (tree exp, rtx target,
>  
>  	    FOR_EACH_CONSTRUCTOR_VALUE (CONSTRUCTOR_ELTS (exp), idx, value)
>  	      {
> -		int n_elts_here = tree_to_uhwi
> -		  (int_const_binop (TRUNC_DIV_EXPR,
> -				    TYPE_SIZE (TREE_TYPE (value)),
> -				    TYPE_SIZE (elttype)));
> +		tree sz = TYPE_SIZE (TREE_TYPE (value));
> +		int n_elts_here
> +		  = tree_to_uhwi (int_const_binop (TRUNC_DIV_EXPR, sz,
> +						   TYPE_SIZE (elttype)));
>  
>  		count += n_elts_here;
>  		if (mostly_zeros_p (value))
> @@ -6687,18 +6691,21 @@ store_constructor (tree exp, rtx target,
>  
>  	    if (vector)
>  	      {
> -		/* vec_init<mode> should not be used if there are VECTOR_TYPE
> -		   elements.  */
> -		gcc_assert (TREE_CODE (TREE_TYPE (value)) != VECTOR_TYPE);
> -		RTVEC_ELT (vector, eltpos)
> -		  = expand_normal (value);
> +		if (vec_vec_init_p)
> +		  {
> +		    gcc_assert (ce->index == NULL_TREE);
> +		    gcc_assert (TREE_CODE (TREE_TYPE (value)) == VECTOR_TYPE);
> +		    eltpos = idx;
> +		  }
> +		else
> +		  gcc_assert (TREE_CODE (TREE_TYPE (value)) != VECTOR_TYPE);
> +		RTVEC_ELT (vector, eltpos) = expand_normal (value);
>  	      }
>  	    else
>  	      {
> -		machine_mode value_mode =
> -		  TREE_CODE (TREE_TYPE (value)) == VECTOR_TYPE
> -		  ? TYPE_MODE (TREE_TYPE (value))
> -		  : eltmode;
> +		machine_mode value_mode
> +		  = (TREE_CODE (TREE_TYPE (value)) == VECTOR_TYPE
> +		     ? TYPE_MODE (TREE_TYPE (value)) : eltmode);
>  		bitpos = eltpos * elt_size;
>  		store_constructor_field (target, bitsize, bitpos, 0,
>  					 bitregion_end, value_mode,
> @@ -6707,9 +6714,9 @@ store_constructor (tree exp, rtx target,
>  	  }
>  
>  	if (vector)
> -	  emit_insn (GEN_FCN (icode)
> -		     (target,
> -		      gen_rtx_PARALLEL (GET_MODE (target), vector)));
> +	  emit_insn (GEN_FCN (icode) (target,
> +				      gen_rtx_PARALLEL (GET_MODE (target),
> +							vector)));
>  	break;
>        }
>  
> --- gcc/tree-vect-stmts.c.jj	2017-07-24 10:57:46.004814816 +0200
> +++ gcc/tree-vect-stmts.c	2017-07-24 16:11:23.049230114 +0200
> @@ -6996,29 +6996,43 @@ vectorizable_load (gimple *stmt, gimple_
>  	{
>  	  if (group_size < nunits)
>  	    {
> -	      /* Avoid emitting a constructor of vector elements by performing
> -		 the loads using an integer type of the same size,
> -		 constructing a vector of those and then re-interpreting it
> -		 as the original vector type.  This works around the fact
> -		 that the vec_init optab was only designed for scalar
> -		 element modes and thus expansion goes through memory.
> -		 This avoids a huge runtime penalty due to the general
> -		 inability to perform store forwarding from smaller stores
> -		 to a larger load.  */
> -	      unsigned lsize
> -		= group_size * TYPE_PRECISION (TREE_TYPE (vectype));
> -	      machine_mode elmode = mode_for_size (lsize, MODE_INT, 0);
> -	      machine_mode vmode = mode_for_vector (elmode,
> -						    nunits / group_size);
> -	      /* If we can't construct such a vector fall back to
> -		 element loads of the original vector type.  */
> +	      /* First check if vec_init optab supports construction from
> +		 vector elts directly.  */
> +	      machine_mode elmode = TYPE_MODE (TREE_TYPE (vectype));
> +	      machine_mode vmode = mode_for_vector (elmode, group_size);
>  	      if (VECTOR_MODE_P (vmode)
> -		  && optab_handler (vec_init_optab, vmode) != CODE_FOR_nothing)
> +		  && (convert_optab_handler (vec_init_optab,
> +					     TYPE_MODE (vectype), vmode)
> +		      != CODE_FOR_nothing))
>  		{
>  		  nloads = nunits / group_size;
>  		  lnel = group_size;
> -		  ltype = build_nonstandard_integer_type (lsize, 1);
> -		  lvectype = build_vector_type (ltype, nloads);
> +		  ltype = build_vector_type (TREE_TYPE (vectype), group_size);
> +		}
> +	      else
> +		{
> +		  /* Otherwise avoid emitting a constructor of vector elements
> +		     by performing the loads using an integer type of the same
> +		     size, constructing a vector of those and then
> +		     re-interpreting it as the original vector type.
> +		     This avoids a huge runtime penalty due to the general
> +		     inability to perform store forwarding from smaller stores
> +		     to a larger load.  */
> +		  unsigned lsize
> +		    = group_size * TYPE_PRECISION (TREE_TYPE (vectype));
> +		  elmode = mode_for_size (lsize, MODE_INT, 0);
> +		  vmode = mode_for_vector (elmode, nunits / group_size);
> +		  /* If we can't construct such a vector fall back to
> +		     element loads of the original vector type.  */
> +		  if (VECTOR_MODE_P (vmode)
> +		      && (convert_optab_handler (vec_init_optab, vmode, elmode)
> +			  != CODE_FOR_nothing))
> +		    {
> +		      nloads = nunits / group_size;
> +		      lnel = group_size;
> +		      ltype = build_nonstandard_integer_type (lsize, 1);
> +		      lvectype = build_vector_type (ltype, nloads);
> +		    }
>  		}
>  	    }
>  	  else
> --- gcc/doc/md.texi.jj	2017-07-24 10:57:45.989814996 +0200
> +++ gcc/doc/md.texi	2017-07-24 17:09:55.536882382 +0200
> @@ -4871,15 +4871,22 @@ This pattern is not allowed to @code{FAI
>  Set given field in the vector value.  Operand 0 is the vector to modify,
>  operand 1 is new value of field and operand 2 specify the field index.
>  
> -@cindex @code{vec_extract@var{m}} instruction pattern
> -@item @samp{vec_extract@var{m}}
> +@cindex @code{vec_extract@var{m}@var{n}} instruction pattern
> +@item @samp{vec_extract@var{m}@var{n}}
>  Extract given field from the vector value.  Operand 1 is the vector, operand 2
> -specify field index and operand 0 place to store value into.
> +specify field index and operand 0 place to store value into.  The
> +@var{n} mode is the mode of the field or vector of fields that should be
> +extracted, should be either element mode of the vector mode @var{m}, or
> +a vector mode with the same element mode and smaller number of elements.
> +If @var{n} is a vector mode, the index is counted in units of that mode.
>  
> -@cindex @code{vec_init@var{m}} instruction pattern
> -@item @samp{vec_init@var{m}}
> +@cindex @code{vec_init@var{m}@var{n}} instruction pattern
> +@item @samp{vec_init@var{m}@var{n}}
>  Initialize the vector to given values.  Operand 0 is the vector to initialize
> -and operand 1 is parallel containing values for individual fields.
> +and operand 1 is parallel containing values for individual fields.  The
> +@var{n} mode is the mode of the elements, should be either element mode of
> +the vector mode @var{m}, or a vector mode with the same element mode and
> +smaller number of elements.
>  
>  @cindex @code{vec_cmp@var{m}@var{n}} instruction pattern
>  @item @samp{vec_cmp@var{m}@var{n}}
> --- gcc/config/i386/i386.c.jj	2017-07-24 10:58:11.831505333 +0200
> +++ gcc/config/i386/i386.c	2017-07-24 16:11:23.060229982 +0200
> @@ -44297,6 +44297,34 @@ ix86_expand_vector_init (bool mmx_ok, rt
>    int i;
>    rtx x;
>  
> +  /* Handle first initialization from vector elts.  */
> +  if (n_elts != XVECLEN (vals, 0))
> +    {
> +      rtx subtarget = target;
> +      x = XVECEXP (vals, 0, 0);
> +      gcc_assert (GET_MODE_INNER (GET_MODE (x)) == inner_mode);
> +      if (GET_MODE_NUNITS (GET_MODE (x)) * 2 == n_elts)
> +	{
> +	  rtx ops[2] = { XVECEXP (vals, 0, 0), XVECEXP (vals, 0, 1) };
> +	  if (inner_mode == QImode || inner_mode == HImode)
> +	    {
> +	      mode = mode_for_vector (SImode,
> +				      n_elts * GET_MODE_SIZE (inner_mode) / 4);
> +	      inner_mode
> +		= mode_for_vector (SImode,
> +				   n_elts * GET_MODE_SIZE (inner_mode) / 8);
> +	      ops[0] = gen_lowpart (inner_mode, ops[0]);
> +	      ops[1] = gen_lowpart (inner_mode, ops[1]);
> +	      subtarget = gen_reg_rtx (mode);
> +	    }
> +	  ix86_expand_vector_init_concat (mode, subtarget, ops, 2);
> +	  if (subtarget != target)
> +	    emit_move_insn (target, gen_lowpart (GET_MODE (target), subtarget));
> +	  return;
> +	}
> +      gcc_unreachable ();
> +    }
> +
>    for (i = 0; i < n_elts; ++i)
>      {
>        x = XVECEXP (vals, 0, i);
> --- gcc/config/i386/sse.md.jj	2017-07-24 10:57:45.807817176 +0200
> +++ gcc/config/i386/sse.md	2017-07-24 16:54:35.658088768 +0200
> @@ -658,13 +658,21 @@ (define_mode_attr ssedoublevecmode
>  
>  ;; Mapping of vector modes to a vector mode of half size
>  (define_mode_attr ssehalfvecmode
> -  [(V64QI "V32QI") (V32HI "V16HI") (V16SI "V8SI") (V8DI "V4DI")
> +  [(V64QI "V32QI") (V32HI "V16HI") (V16SI "V8SI") (V8DI "V4DI") (V4TI "V2TI")
>     (V32QI "V16QI") (V16HI  "V8HI") (V8SI  "V4SI") (V4DI "V2DI")
>     (V16QI  "V8QI") (V8HI   "V4HI") (V4SI  "V2SI")
>     (V16SF "V8SF") (V8DF "V4DF")
>     (V8SF  "V4SF") (V4DF "V2DF")
>     (V4SF  "V2SF")])
>  
> +(define_mode_attr ssehalfvecmodelower
> +  [(V64QI "v32qi") (V32HI "v16hi") (V16SI "v8si") (V8DI "v4di") (V4TI "v2ti")
> +   (V32QI "v16qi") (V16HI  "v8hi") (V8SI  "v4si") (V4DI "v2di")
> +   (V16QI  "v8qi") (V8HI   "v4hi") (V4SI  "v2si")
> +   (V16SF "v8sf") (V8DF "v4df")
> +   (V8SF  "v4sf") (V4DF "v2df")
> +   (V4SF  "v2sf")])
> +
>  ;; Mapping of vector modes ti packed single mode of the same size
>  (define_mode_attr ssePSmode
>    [(V16SI "V16SF") (V8DF "V16SF")
> @@ -690,6 +698,16 @@ (define_mode_attr ssescalarmode
>     (V8DF "DF")  (V4DF "DF")  (V2DF "DF")
>     (V4TI "TI")  (V2TI "TI")])
>  
> +;; Mapping of vector modes back to the scalar modes
> +(define_mode_attr ssescalarmodelower
> +  [(V64QI "qi") (V32QI "qi") (V16QI "qi")
> +   (V32HI "hi") (V16HI "hi") (V8HI "hi")
> +   (V16SI "si") (V8SI "si")  (V4SI "si")
> +   (V8DI "di")  (V4DI "di")  (V2DI "di")
> +   (V16SF "sf") (V8SF "sf")  (V4SF "sf")
> +   (V8DF "df")  (V4DF "df")  (V2DF "df")
> +   (V4TI "ti")  (V2TI "ti")])
> +
>  ;; Mapping of vector modes to the 128bit modes
>  (define_mode_attr ssexmmmode
>    [(V64QI "V16QI") (V32QI "V16QI") (V16QI "V16QI")
> @@ -2356,7 +2374,7 @@ (define_expand "reduc_plus_scal_v8df"
>  {
>    rtx tmp = gen_reg_rtx (V8DFmode);
>    ix86_expand_reduc (gen_addv8df3, tmp, operands[1]);
> -  emit_insn (gen_vec_extractv8df (operands[0], tmp, const0_rtx));
> +  emit_insn (gen_vec_extractv8dfdf (operands[0], tmp, const0_rtx));
>    DONE;
>  })
>  
> @@ -2371,7 +2389,7 @@ (define_expand "reduc_plus_scal_v4df"
>    emit_insn (gen_avx_haddv4df3 (tmp, operands[1], operands[1]));
>    emit_insn (gen_avx_vperm2f128v4df3 (tmp2, tmp, tmp, GEN_INT (1)));
>    emit_insn (gen_addv4df3 (vec_res, tmp, tmp2));
> -  emit_insn (gen_vec_extractv4df (operands[0], vec_res, const0_rtx));
> +  emit_insn (gen_vec_extractv4dfdf (operands[0], vec_res, const0_rtx));
>    DONE;
>  })
>  
> @@ -2382,7 +2400,7 @@ (define_expand "reduc_plus_scal_v2df"
>  {
>    rtx tmp = gen_reg_rtx (V2DFmode);
>    emit_insn (gen_sse3_haddv2df3 (tmp, operands[1], operands[1]));
> -  emit_insn (gen_vec_extractv2df (operands[0], tmp, const0_rtx));
> +  emit_insn (gen_vec_extractv2dfdf (operands[0], tmp, const0_rtx));
>    DONE;
>  })
>  
> @@ -2393,7 +2411,7 @@ (define_expand "reduc_plus_scal_v16sf"
>  {
>    rtx tmp = gen_reg_rtx (V16SFmode);
>    ix86_expand_reduc (gen_addv16sf3, tmp, operands[1]);
> -  emit_insn (gen_vec_extractv16sf (operands[0], tmp, const0_rtx));
> +  emit_insn (gen_vec_extractv16sfsf (operands[0], tmp, const0_rtx));
>    DONE;
>  })
>  
> @@ -2409,7 +2427,7 @@ (define_expand "reduc_plus_scal_v8sf"
>    emit_insn (gen_avx_haddv8sf3 (tmp2, tmp, tmp));
>    emit_insn (gen_avx_vperm2f128v8sf3 (tmp, tmp2, tmp2, GEN_INT (1)));
>    emit_insn (gen_addv8sf3 (vec_res, tmp, tmp2));
> -  emit_insn (gen_vec_extractv8sf (operands[0], vec_res, const0_rtx));
> +  emit_insn (gen_vec_extractv8sfsf (operands[0], vec_res, const0_rtx));
>    DONE;
>  })
>  
> @@ -2427,7 +2445,7 @@ (define_expand "reduc_plus_scal_v4sf"
>      }
>    else
>      ix86_expand_reduc (gen_addv4sf3, vec_res, operands[1]);
> -  emit_insn (gen_vec_extractv4sf (operands[0], vec_res, const0_rtx));
> +  emit_insn (gen_vec_extractv4sfsf (operands[0], vec_res, const0_rtx));
>    DONE;
>  })
>  
> @@ -2449,7 +2467,8 @@ (define_expand "reduc_<code>_scal_<mode>
>  {
>    rtx tmp = gen_reg_rtx (<MODE>mode);
>    ix86_expand_reduc (gen_<code><mode>3, tmp, operands[1]);
> -  emit_insn (gen_vec_extract<mode> (operands[0], tmp, const0_rtx));
> +  emit_insn (gen_vec_extract<mode><ssescalarmodelower> (operands[0], tmp,
> +							const0_rtx));
>    DONE;
>  })
>  
> @@ -2461,7 +2480,8 @@ (define_expand "reduc_<code>_scal_<mode>
>  {
>    rtx tmp = gen_reg_rtx (<MODE>mode);
>    ix86_expand_reduc (gen_<code><mode>3, tmp, operands[1]);
> -  emit_insn (gen_vec_extract<mode> (operands[0], tmp, const0_rtx));
> +  emit_insn (gen_vec_extract<mode><ssescalarmodelower> (operands[0], tmp,
> +  							const0_rtx));
>    DONE;
>  })
>  
> @@ -2473,7 +2493,8 @@ (define_expand "reduc_<code>_scal_<mode>
>  {
>    rtx tmp = gen_reg_rtx (<MODE>mode);
>    ix86_expand_reduc (gen_<code><mode>3, tmp, operands[1]);
> -  emit_insn (gen_vec_extract<mode> (operands[0], tmp, const0_rtx));
> +  emit_insn (gen_vec_extract<mode><ssescalarmodelower> (operands[0], tmp,
> +							const0_rtx));
>    DONE;
>  })
>  
> @@ -2485,7 +2506,7 @@ (define_expand "reduc_umin_scal_v8hi"
>  {
>    rtx tmp = gen_reg_rtx (V8HImode);
>    ix86_expand_reduc (gen_uminv8hi3, tmp, operands[1]);
> -  emit_insn (gen_vec_extractv8hi (operands[0], tmp, const0_rtx));
> +  emit_insn (gen_vec_extractv8hihi (operands[0], tmp, const0_rtx));
>    DONE;
>  })
>  
> @@ -7881,7 +7902,7 @@ (define_mode_iterator VEC_EXTRACT_MODE
>     (V8DF "TARGET_AVX512F") (V4DF "TARGET_AVX") V2DF
>     (V4TI "TARGET_AVX512F") (V2TI "TARGET_AVX")])
>  
> -(define_expand "vec_extract<mode>"
> +(define_expand "vec_extract<mode><ssescalarmodelower>"
>    [(match_operand:<ssescalarmode> 0 "register_operand")
>     (match_operand:VEC_EXTRACT_MODE 1 "register_operand")
>     (match_operand 2 "const_int_operand")]
> @@ -7892,6 +7913,19 @@ (define_expand "vec_extract<mode>"
>    DONE;
>  })
>  
> +(define_expand "vec_extract<mode><ssehalfvecmodelower>"
> +  [(match_operand:<ssehalfvecmode> 0 "nonimmediate_operand")
> +   (match_operand:V_512 1 "register_operand")
> +   (match_operand 2 "const_0_to_1_operand")]
> +  "TARGET_AVX512F"
> +{
> +  if (INTVAL (operands[2]))
> +    emit_insn (gen_vec_extract_hi_<mode> (operands[0], operands[1]));
> +  else
> +    emit_insn (gen_vec_extract_lo_<mode> (operands[0], operands[1]));
> +  DONE;
> +})
> +
>  ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
>  ;;
>  ;; Parallel double-precision floating point element swizzling
> @@ -16693,7 +16727,7 @@ (define_expand "rotl<mode>3"
>        for (i = 0; i < <ssescalarnum>; i++)
>  	RTVEC_ELT (vs, i) = op2;
>  
> -      emit_insn (gen_vec_init<mode> (reg, par));
> +      emit_insn (gen_vec_init<mode><ssescalarmodelower> (reg, par));
>        emit_insn (gen_xop_vrotl<mode>3 (operands[0], operands[1], reg));
>        DONE;
>      }
> @@ -16725,7 +16759,7 @@ (define_expand "rotr<mode>3"
>        for (i = 0; i < <ssescalarnum>; i++)
>  	RTVEC_ELT (vs, i) = op2;
>  
> -      emit_insn (gen_vec_init<mode> (reg, par));
> +      emit_insn (gen_vec_init<mode><ssescalarmodelower> (reg, par));
>        emit_insn (gen_neg<mode>2 (neg, reg));
>        emit_insn (gen_xop_vrotl<mode>3 (operands[0], operands[1], neg));
>        DONE;
> @@ -17019,7 +17053,7 @@ (define_expand "<shift_insn><mode>3"
>          XVECEXP (par, 0, i) = operands[2];
>  
>        tmp = gen_reg_rtx (V16QImode);
> -      emit_insn (gen_vec_initv16qi (tmp, par));
> +      emit_insn (gen_vec_initv16qiqi (tmp, par));
>  
>        if (negate)
>  	emit_insn (gen_negv16qi2 (tmp, tmp));
> @@ -17055,7 +17089,7 @@ (define_expand "ashrv2di3"
>        for (i = 0; i < 2; i++)
>  	XVECEXP (par, 0, i) = operands[2];
>  
> -      emit_insn (gen_vec_initv2di (reg, par));
> +      emit_insn (gen_vec_initv2didi (reg, par));
>  
>        if (negate)
>  	emit_insn (gen_negv2di2 (reg, reg));
> @@ -18775,7 +18809,7 @@ (define_insn_and_split "avx_<castmode><a
>  				  <ssehalfvecmode>mode);
>  })
>  
> -;; Modes handled by vec_init patterns.
> +;; Modes handled by vec_init expanders.
>  (define_mode_iterator VEC_INIT_MODE
>    [(V64QI "TARGET_AVX512F") (V32QI "TARGET_AVX") V16QI
>     (V32HI "TARGET_AVX512F") (V16HI "TARGET_AVX") V8HI
> @@ -18785,11 +18819,31 @@ (define_mode_iterator VEC_INIT_MODE
>     (V8DF "TARGET_AVX512F") (V4DF "TARGET_AVX") (V2DF "TARGET_SSE2")
>     (V4TI "TARGET_AVX512F") (V2TI "TARGET_AVX")])
>  
> -(define_expand "vec_init<mode>"
> +;; Likewise, but for initialization from half sized vectors.
> +;; Thus, these are all VEC_INIT_MODE modes except V2??.
> +(define_mode_iterator VEC_INIT_HALF_MODE
> +  [(V64QI "TARGET_AVX512F") (V32QI "TARGET_AVX") V16QI
> +   (V32HI "TARGET_AVX512F") (V16HI "TARGET_AVX") V8HI
> +   (V16SI "TARGET_AVX512F") (V8SI "TARGET_AVX") V4SI
> +   (V8DI "TARGET_AVX512F") (V4DI "TARGET_AVX")
> +   (V16SF "TARGET_AVX512F") (V8SF "TARGET_AVX") V4SF
> +   (V8DF "TARGET_AVX512F") (V4DF "TARGET_AVX")
> +   (V4TI "TARGET_AVX512F")])
> +
> +(define_expand "vec_init<mode><ssescalarmodelower>"
>    [(match_operand:VEC_INIT_MODE 0 "register_operand")
>     (match_operand 1)]
>    "TARGET_SSE"
>  {
> +  ix86_expand_vector_init (false, operands[0], operands[1]);
> +  DONE;
> +})
> +
> +(define_expand "vec_init<mode><ssehalfvecmodelower>"
> +  [(match_operand:VEC_INIT_HALF_MODE 0 "register_operand")
> +   (match_operand 1)]
> +  "TARGET_SSE"
> +{
>    ix86_expand_vector_init (false, operands[0], operands[1]);
>    DONE;
>  })
> --- gcc/config/i386/mmx.md.jj	2017-07-24 10:57:45.869816434 +0200
> +++ gcc/config/i386/mmx.md	2017-07-24 16:11:23.065229922 +0200
> @@ -641,7 +641,7 @@ (define_split
>    [(set (match_dup 0) (match_dup 1))]
>    "operands[1] = adjust_address (operands[1], SFmode, 4);")
>  
> -(define_expand "vec_extractv2sf"
> +(define_expand "vec_extractv2sfsf"
>    [(match_operand:SF 0 "register_operand")
>     (match_operand:V2SF 1 "register_operand")
>     (match_operand 2 "const_int_operand")]
> @@ -652,7 +652,7 @@ (define_expand "vec_extractv2sf"
>    DONE;
>  })
>  
> -(define_expand "vec_initv2sf"
> +(define_expand "vec_initv2sfsf"
>    [(match_operand:V2SF 0 "register_operand")
>     (match_operand 1)]
>    "TARGET_SSE"
> @@ -1344,7 +1344,7 @@ (define_insn_and_split "*vec_extractv2si
>    operands[1] = adjust_address (operands[1], SImode, INTVAL (operands[2]) * 4);
>  })
>  
> -(define_expand "vec_extractv2si"
> +(define_expand "vec_extractv2sisi"
>    [(match_operand:SI 0 "register_operand")
>     (match_operand:V2SI 1 "register_operand")
>     (match_operand 2 "const_int_operand")]
> @@ -1355,7 +1355,7 @@ (define_expand "vec_extractv2si"
>    DONE;
>  })
>  
> -(define_expand "vec_initv2si"
> +(define_expand "vec_initv2sisi"
>    [(match_operand:V2SI 0 "register_operand")
>     (match_operand 1)]
>    "TARGET_SSE"
> @@ -1375,7 +1375,7 @@ (define_expand "vec_setv4hi"
>    DONE;
>  })
>  
> -(define_expand "vec_extractv4hi"
> +(define_expand "vec_extractv4hihi"
>    [(match_operand:HI 0 "register_operand")
>     (match_operand:V4HI 1 "register_operand")
>     (match_operand 2 "const_int_operand")]
> @@ -1386,7 +1386,7 @@ (define_expand "vec_extractv4hi"
>    DONE;
>  })
>  
> -(define_expand "vec_initv4hi"
> +(define_expand "vec_initv4hihi"
>    [(match_operand:V4HI 0 "register_operand")
>     (match_operand 1)]
>    "TARGET_SSE"
> @@ -1406,7 +1406,7 @@ (define_expand "vec_setv8qi"
>    DONE;
>  })
>  
> -(define_expand "vec_extractv8qi"
> +(define_expand "vec_extractv8qiqi"
>    [(match_operand:QI 0 "register_operand")
>     (match_operand:V8QI 1 "register_operand")
>     (match_operand 2 "const_int_operand")]
> @@ -1417,7 +1417,7 @@ (define_expand "vec_extractv8qi"
>    DONE;
>  })
>  
> -(define_expand "vec_initv8qi"
> +(define_expand "vec_initv8qiqi"
>    [(match_operand:V8QI 0 "register_operand")
>     (match_operand 1)]
>    "TARGET_SSE"
> --- gcc/config/rs6000/vector.md.jj	2017-06-08 20:50:49.000000000 +0200
> +++ gcc/config/rs6000/vector.md	2017-07-24 17:44:44.699580927 +0200
> @@ -74,6 +74,16 @@ (define_mode_attr VEC_base [(V16QI "QI")
>  			    (V1TI  "TI")
>  			    (TI    "TI")])
>  
> +;; As above, but in lower case
> +(define_mode_attr VEC_base_l [(V16QI "qi")
> +			      (V8HI  "hi")
> +			      (V4SI  "si")
> +			      (V2DI  "di")
> +			      (V4SF  "sf")
> +			      (V2DF  "df")
> +			      (V1TI  "ti")
> +			      (TI    "ti")])
> +
>  ;; Same size integer type for floating point data
>  (define_mode_attr VEC_int [(V4SF  "v4si")
>  			   (V2DF  "v2di")])
> @@ -1016,7 +1026,7 @@ (define_expand "fixuns_trunc<mode><VEC_i
>  
>  
>  ;; Vector initialization, set, extract
> -(define_expand "vec_init<mode>"
> +(define_expand "vec_init<mode><VEC_base_l>"
>    [(match_operand:VEC_E 0 "vlogical_operand" "")
>     (match_operand:VEC_E 1 "" "")]
>    "VECTOR_MEM_ALTIVEC_OR_VSX_P (<MODE>mode)"
> @@ -1035,7 +1045,7 @@ (define_expand "vec_set<mode>"
>    DONE;
>  })
>  
> -(define_expand "vec_extract<mode>"
> +(define_expand "vec_extract<mode><VEC_base_l>"
>    [(match_operand:<VEC_base> 0 "register_operand" "")
>     (match_operand:VEC_E 1 "vlogical_operand" "")
>     (match_operand 2 "const_int_operand" "")]
> --- gcc/config/rs6000/paired.md.jj	2017-06-08 20:50:49.000000000 +0200
> +++ gcc/config/rs6000/paired.md	2017-07-24 17:48:20.324985029 +0200
> @@ -377,7 +377,7 @@ (define_insn "paired_muls1"
>    "ps_muls1 %0, %1, %2"
>    [(set_attr "type" "fp")])
>  
> -(define_expand "vec_initv2sf"
> +(define_expand "vec_initv2sfsf"
>    [(match_operand:V2SF 0 "gpc_reg_operand" "=f")
>     (match_operand 1 "" "")]
>    "TARGET_PAIRED_FLOAT"
> --- gcc/config/rs6000/altivec.md.jj	2017-07-24 10:58:12.000000000 +0200
> +++ gcc/config/rs6000/altivec.md	2017-07-24 17:48:49.573633038 +0200
> @@ -311,7 +311,7 @@ (define_split
>    for (i = 0; i < num_elements; i++)
>      RTVEC_ELT (v, i) = constm1_rtx;
>  
> -  emit_insn (gen_vec_initv4si (dest, gen_rtx_PARALLEL (mode, v)));
> +  emit_insn (gen_vec_initv4sisi (dest, gen_rtx_PARALLEL (mode, v)));
>    emit_insn (gen_rtx_SET (dest, gen_rtx_ASHIFT (mode, dest, dest)));
>    DONE;
>  })
> @@ -2267,7 +2267,7 @@ (define_expand "altivec_copysign_v4sf3"
>    RTVEC_ELT (v, 2) = GEN_INT (mask_val);
>    RTVEC_ELT (v, 3) = GEN_INT (mask_val);
>  
> -  emit_insn (gen_vec_initv4si (mask, gen_rtx_PARALLEL (V4SImode, v)));
> +  emit_insn (gen_vec_initv4sisi (mask, gen_rtx_PARALLEL (V4SImode, v)));
>    emit_insn (gen_vector_select_v4sf (operands[0], operands[1], operands[2],
>  				     gen_lowpart (V4SFmode, mask)));
>    DONE;
> @@ -3409,7 +3409,7 @@ (define_expand "vec_unpacku_hi_v16qi"
>    RTVEC_ELT (v, 14) = gen_rtx_CONST_INT (QImode, be ? 16 :  0);
>    RTVEC_ELT (v, 15) = gen_rtx_CONST_INT (QImode, be ?  7 : 16);
>  
> -  emit_insn (gen_vec_initv16qi (mask, gen_rtx_PARALLEL (V16QImode, v)));
> +  emit_insn (gen_vec_initv16qiqi (mask, gen_rtx_PARALLEL (V16QImode, v)));
>    emit_insn (gen_vperm_v16qiv8hi (operands[0], operands[1], vzero, mask));
>    DONE;
>  }")
> @@ -3445,7 +3445,7 @@ (define_expand "vec_unpacku_hi_v8hi"
>    RTVEC_ELT (v, 14) = gen_rtx_CONST_INT (QImode, be ?  6 : 17);
>    RTVEC_ELT (v, 15) = gen_rtx_CONST_INT (QImode, be ?  7 : 16);
>  
> -  emit_insn (gen_vec_initv16qi (mask, gen_rtx_PARALLEL (V16QImode, v)));
> +  emit_insn (gen_vec_initv16qiqi (mask, gen_rtx_PARALLEL (V16QImode, v)));
>    emit_insn (gen_vperm_v8hiv4si (operands[0], operands[1], vzero, mask));
>    DONE;
>  }")
> @@ -3481,7 +3481,7 @@ (define_expand "vec_unpacku_lo_v16qi"
>    RTVEC_ELT (v, 14) = gen_rtx_CONST_INT (QImode, be ? 16 :  8);
>    RTVEC_ELT (v, 15) = gen_rtx_CONST_INT (QImode, be ? 15 : 16);
>  
> -  emit_insn (gen_vec_initv16qi (mask, gen_rtx_PARALLEL (V16QImode, v)));
> +  emit_insn (gen_vec_initv16qiqi (mask, gen_rtx_PARALLEL (V16QImode, v)));
>    emit_insn (gen_vperm_v16qiv8hi (operands[0], operands[1], vzero, mask));
>    DONE;
>  }")
> @@ -3517,7 +3517,7 @@ (define_expand "vec_unpacku_lo_v8hi"
>    RTVEC_ELT (v, 14) = gen_rtx_CONST_INT (QImode, be ? 14 : 17);
>    RTVEC_ELT (v, 15) = gen_rtx_CONST_INT (QImode, be ? 15 : 16);
>  
> -  emit_insn (gen_vec_initv16qi (mask, gen_rtx_PARALLEL (V16QImode, v)));
> +  emit_insn (gen_vec_initv16qiqi (mask, gen_rtx_PARALLEL (V16QImode, v)));
>    emit_insn (gen_vperm_v8hiv4si (operands[0], operands[1], vzero, mask));
>    DONE;
>  }")
> @@ -3758,7 +3758,7 @@ (define_expand "mulv16qi3"
>       = gen_rtx_CONST_INT (QImode, BYTES_BIG_ENDIAN ? 2 * i + 17 : 15 - 2 * i);
>    }
>  
> -  emit_insn (gen_vec_initv16qi (mask, gen_rtx_PARALLEL (V16QImode, v)));
> +  emit_insn (gen_vec_initv16qiqi (mask, gen_rtx_PARALLEL (V16QImode, v)));
>    emit_insn (gen_altivec_vmulesb (even, operands[1], operands[2]));
>    emit_insn (gen_altivec_vmulosb (odd, operands[1], operands[2]));
>    emit_insn (gen_altivec_vperm_v8hiv16qi (operands[0], even, odd, mask));
> @@ -3804,7 +3804,7 @@ (define_expand "altivec_vreve<mode>2"
>        RTVEC_ELT (v, i + j * size)
>  	= GEN_INT (i + (num_elements - 1 - j) * size);
>  
> -  emit_insn (gen_vec_initv16qi (mask, gen_rtx_PARALLEL (V16QImode, v)));
> +  emit_insn (gen_vec_initv16qiqi (mask, gen_rtx_PARALLEL (V16QImode, v)));
>    emit_insn (gen_altivec_vperm_<mode> (operands[0], operands[1],
>  	     operands[1], mask));
>    DONE;
> --- gcc/config/aarch64/aarch64-simd.md.jj	2017-07-24 15:01:21.000000000 +0200
> +++ gcc/config/aarch64/aarch64-simd.md	2017-07-24 17:19:05.660170375 +0200
> @@ -5617,9 +5617,9 @@ (define_expand "aarch64_set_qreg<VSTRUCT
>    DONE;
>  })
>  
> -;; Standard pattern name vec_init<mode>.
> +;; Standard pattern name vec_init<mode><Vel>.
>  
> -(define_expand "vec_init<mode>"
> +(define_expand "vec_init<mode><Vel>"
>    [(match_operand:VALL_F16 0 "register_operand" "")
>     (match_operand 1 "" "")]
>    "TARGET_SIMD"
> @@ -5674,9 +5674,9 @@ (define_insn "aarch64_urecpe<mode>"
>   "urecpe\\t%0.<Vtype>, %1.<Vtype>"
>    [(set_attr "type" "neon_fp_recpe_<Vetype><q>")])
>  
> -;; Standard pattern name vec_extract<mode>.
> +;; Standard pattern name vec_extract<mode><Vel>.
>  
> -(define_expand "vec_extract<mode>"
> +(define_expand "vec_extract<mode><Vel>"
>    [(match_operand:<VEL> 0 "aarch64_simd_nonimmediate_operand" "")
>     (match_operand:VALL_F16 1 "register_operand" "")
>     (match_operand:SI 2 "immediate_operand" "")]
> --- gcc/config/aarch64/iterators.md.jj	2017-03-19 11:57:22.000000000 +0100
> +++ gcc/config/aarch64/iterators.md	2017-07-24 17:17:50.318091273 +0200
> @@ -520,6 +520,17 @@ (define_mode_attr VEL [(V8QI "QI") (V16Q
>  			(SI   "SI") (HI   "HI")
>  			(QI   "QI")])
>  
> +;; Define element mode for each vector mode (lower case).
> +(define_mode_attr Vel [(V8QI "qi") (V16QI "qi")
> +			(V4HI "hi") (V8HI "hi")
> +			(V2SI "si") (V4SI "si")
> +			(DI "di")   (V2DI "di")
> +			(V4HF "hf") (V8HF "hf")
> +			(V2SF "sf") (V4SF "sf")
> +			(V2DF "df") (DF "df")
> +			(SI   "si") (HI   "hi")
> +			(QI   "qi")])
> +
>  ;; 64-bit container modes the inner or scalar source mode.
>  (define_mode_attr VCOND [(HI "V4HI") (SI "V2SI")
>  			 (V4HI "V4HI") (V8HI "V4HI")
> --- gcc/config/s390/s390.c.jj	2017-07-17 10:08:39.000000000 +0200
> +++ gcc/config/s390/s390.c	2017-07-24 17:58:24.416715142 +0200
> @@ -5792,7 +5792,7 @@ s390_expand_vec_strlen (rtx target, rtx
>    add_int_reg_note (s390_emit_ccraw_jump (8, NE, loop_start_label),
>  		    REG_BR_PROB,
>  		    profile_probability::very_likely ().to_reg_br_prob_note ());
> -  emit_insn (gen_vec_extractv16qi (len, result_reg, GEN_INT (7)));
> +  emit_insn (gen_vec_extractv16qiqi (len, result_reg, GEN_INT (7)));
>  
>    /* If the string pointer wasn't aligned we have loaded less then 16
>       bytes and the remaining bytes got filled with zeros (by vll).
> @@ -5850,7 +5850,7 @@ s390_expand_vec_movstr (rtx result, rtx
>    emit_insn (gen_vlbb (vsrc, src, GEN_INT (6)));
>    emit_insn (gen_lcbb (loadlen, src_addr, GEN_INT (6)));
>    emit_insn (gen_vfenezv16qi (vpos, vsrc, vsrc));
> -  emit_insn (gen_vec_extractv16qi (gpos_qi, vpos, GEN_INT (7)));
> +  emit_insn (gen_vec_extractv16qiqi (gpos_qi, vpos, GEN_INT (7)));
>    emit_move_insn (gpos, gen_rtx_SUBREG (SImode, gpos_qi, 0));
>    /* gpos is the byte index if a zero was found and 16 otherwise.
>       So if it is lower than the loaded bytes we have a hit.  */
> @@ -5928,7 +5928,7 @@ s390_expand_vec_movstr (rtx result, rtx
>    force_expand_binop (Pmode, add_optab, dst_addr_reg, offset, dst_addr_reg,
>  		      1, OPTAB_DIRECT);
>  
> -  emit_insn (gen_vec_extractv16qi (gpos_qi, vpos, GEN_INT (7)));
> +  emit_insn (gen_vec_extractv16qiqi (gpos_qi, vpos, GEN_INT (7)));
>    emit_move_insn (gpos, gen_rtx_SUBREG (SImode, gpos_qi, 0));
>  
>    emit_insn (gen_vstlv16qi (vsrc, gpos, gen_rtx_MEM (BLKmode, dst_addr_reg)));
> --- gcc/config/s390/vector.md.jj	2017-04-25 15:51:31.000000000 +0200
> +++ gcc/config/s390/vector.md	2017-07-24 17:57:37.665277768 +0200
> @@ -90,6 +90,17 @@ (define_mode_attr non_vec[(V1QI "QI") (V
>  			  (V1DF "DF") (V2DF "DF")
>  			  (V1TF "TF") (TF "TF")])
>  
> +; Like above, but in lower case.
> +(define_mode_attr non_vec_l[(V1QI "qi") (V2QI "qi") (V4QI "qi") (V8QI "qi")
> +			    (V16QI "qi")
> +			    (V1HI "hi") (V2HI "hi") (V4HI "hi") (V8HI "hi")
> +			    (V1SI "si") (V2SI "si") (V4SI "si")
> +			    (V1DI "di") (V2DI "di")
> +			    (V1TI "ti") (TI "ti")
> +			    (V1SF "sf") (V2SF "sf") (V4SF "sf")
> +			    (V1DF "df") (V2DF "df")
> +			    (V1TF "tf") (TF "tf")])
> +
>  ; The instruction suffix for integer instructions and instructions
>  ; which do not care about whether it is floating point or integer.
>  (define_mode_attr bhfgq[(V1QI "b") (V2QI "b") (V4QI "b") (V8QI "b") (V16QI "b")
> @@ -453,7 +464,7 @@ (define_insn "*vec_set<mode>_plus"
>  ; FIXME: Support also vector mode operands for 0
>  ; FIXME: This should be (vec_select ..) or something but it does only allow constant selectors :(
>  ; This is used via RTL standard name as well as for expanding the builtin
> -(define_expand "vec_extract<mode>"
> +(define_expand "vec_extract<mode><non_vec_l>"
>    [(set (match_operand:<non_vec> 0 "nonimmediate_operand" "")
>  	(unspec:<non_vec> [(match_operand:V  1 "register_operand" "")
>  			   (match_operand:SI 2 "nonmemory_operand" "")]
> @@ -485,7 +496,7 @@ (define_insn "*vec_extract<mode>_plus"
>    "vlgv<bhfgq>\t%0,%v1,%Y3(%2)"
>    [(set_attr "op_type" "VRS")])
>  
> -(define_expand "vec_init<mode>"
> +(define_expand "vec_init<mode><non_vec_l>"
>    [(match_operand:V_128 0 "register_operand" "")
>     (match_operand:V_128 1 "nonmemory_operand" "")]
>    "TARGET_VX"
> --- gcc/config/s390/s390-builtins.def.jj	2017-03-24 15:08:56.000000000 +0100
> +++ gcc/config/s390/s390-builtins.def	2017-07-24 18:02:22.571849086 +0200
> @@ -450,12 +450,12 @@ OB_DEF_VAR (s390_vec_extract_u64,
>  OB_DEF_VAR (s390_vec_extract_b64,       s390_vlgvg,         0,                  O2_ELEM,            BT_OV_ULONGLONG_BV2DI_INT)
>  OB_DEF_VAR (s390_vec_extract_dbl,       s390_vlgvg_dbl,     0,                  O2_ELEM,            BT_OV_DBL_V2DF_INT)                      /* vlgvg */
>  
> -B_DEF      (s390_vlgvb,                 vec_extractv16qi,   0,                  B_VX,               O2_ELEM,            BT_FN_UCHAR_UV16QI_INT)
> -B_DEF      (s390_vlgvh,                 vec_extractv8hi,    0,                  B_VX,               O2_ELEM,            BT_FN_USHORT_UV8HI_INT)
> -B_DEF      (s390_vlgvf,                 vec_extractv4si,    0,                  B_VX,               O2_ELEM,            BT_FN_UINT_UV4SI_INT)
> -B_DEF      (s390_vlgvf_flt,             vec_extractv4sf,    0,                  B_INT | B_VXE,      O2_ELEM,            BT_FN_FLT_V4SF_INT)
> -B_DEF      (s390_vlgvg,                 vec_extractv2di,    0,                  B_VX,               O2_ELEM,            BT_FN_ULONGLONG_UV2DI_INT)
> -B_DEF      (s390_vlgvg_dbl,             vec_extractv2df,    0,                  B_INT | B_VX,       O2_ELEM,            BT_FN_DBL_V2DF_INT)
> +B_DEF      (s390_vlgvb,                 vec_extractv16qiqi, 0,                  B_VX,               O2_ELEM,            BT_FN_UCHAR_UV16QI_INT)
> +B_DEF      (s390_vlgvh,                 vec_extractv8hihi,  0,                  B_VX,               O2_ELEM,            BT_FN_USHORT_UV8HI_INT)
> +B_DEF      (s390_vlgvf,                 vec_extractv4sisi,  0,                  B_VX,               O2_ELEM,            BT_FN_UINT_UV4SI_INT)
> +B_DEF      (s390_vlgvf_flt,             vec_extractv4sfsf,  0,                  B_INT | B_VXE,      O2_ELEM,            BT_FN_FLT_V4SF_INT)
> +B_DEF      (s390_vlgvg,                 vec_extractv2didi,  0,                  B_VX,               O2_ELEM,            BT_FN_ULONGLONG_UV2DI_INT)
> +B_DEF      (s390_vlgvg_dbl,             vec_extractv2dfdf,  0,                  B_INT | B_VX,       O2_ELEM,            BT_FN_DBL_V2DF_INT)
>  
>  OB_DEF     (s390_vec_insert_and_zero,   s390_vec_insert_and_zero_s8,s390_vec_insert_and_zero_dbl,B_VX,BT_FN_OV4SI_INTCONSTPTR)
>  OB_DEF_VAR (s390_vec_insert_and_zero_s8,s390_vllezb,        0,                  0,                  BT_OV_V16QI_SCHARCONSTPTR)
> --- gcc/config/arm/iterators.md.jj	2017-05-05 09:20:02.000000000 +0200
> +++ gcc/config/arm/iterators.md	2017-07-24 17:25:15.665681575 +0200
> @@ -444,6 +444,14 @@ (define_mode_attr V_elem [(V8QI "QI") (V
>                            (V2SF "SF") (V4SF "SF")
>                            (DI "DI")   (V2DI "DI")])
>  
> +;; As above but in lower case.
> +(define_mode_attr V_elem_l [(V8QI "qi") (V16QI "qi")
> +			    (V4HI "hi") (V8HI "hi")
> +			    (V4HF "hf") (V8HF "hf")
> +			    (V2SI "si") (V4SI "si")
> +			    (V2SF "sf") (V4SF "sf")
> +			    (DI "di")   (V2DI "di")])
> +
>  ;; Element modes for vector extraction, padded up to register size.
>  
>  (define_mode_attr V_ext [(V8QI "SI") (V16QI "SI")
> --- gcc/config/arm/neon.md.jj	2017-07-17 10:08:41.000000000 +0200
> +++ gcc/config/arm/neon.md	2017-07-24 17:27:42.173917259 +0200
> @@ -412,7 +412,7 @@ (define_expand "vec_set<mode>"
>    DONE;
>  })
>  
> -(define_insn "vec_extract<mode>"
> +(define_insn "vec_extract<mode><V_elem_l>"
>    [(set (match_operand:<V_elem> 0 "nonimmediate_operand" "=Um,r")
>          (vec_select:<V_elem>
>            (match_operand:VD_LANE 1 "s_register_operand" "w,w")
> @@ -434,7 +434,7 @@ (define_insn "vec_extract<mode>"
>    [(set_attr "type" "neon_store1_one_lane<q>,neon_to_gp<q>")]
>  )
>  
> -(define_insn "vec_extract<mode>"
> +(define_insn "vec_extract<mode><V_elem_l>"
>    [(set (match_operand:<V_elem> 0 "nonimmediate_operand" "=Um,r")
>  	(vec_select:<V_elem>
>            (match_operand:VQ2 1 "s_register_operand" "w,w")
> @@ -460,7 +460,7 @@ (define_insn "vec_extract<mode>"
>    [(set_attr "type" "neon_store1_one_lane<q>,neon_to_gp<q>")]
>  )
>  
> -(define_insn "vec_extractv2di"
> +(define_insn "vec_extractv2didi"
>    [(set (match_operand:DI 0 "nonimmediate_operand" "=Um,r")
>  	(vec_select:DI
>            (match_operand:V2DI 1 "s_register_operand" "w,w")
> @@ -479,7 +479,7 @@ (define_insn "vec_extractv2di"
>    [(set_attr "type" "neon_store1_one_lane_q,neon_to_gp_q")]
>  )
>  
> -(define_expand "vec_init<mode>"
> +(define_expand "vec_init<mode><V_elem_l>"
>    [(match_operand:VDQ 0 "s_register_operand" "")
>     (match_operand 1 "" "")]
>    "TARGET_NEON"
> @@ -1581,7 +1581,7 @@ (define_expand "reduc_plus_scal_<mode>"
>    neon_pairwise_reduce (vec, operands[1], <MODE>mode,
>  			&gen_neon_vpadd_internal<mode>);
>    /* The same result is actually computed into every element.  */
> -  emit_insn (gen_vec_extract<mode> (operands[0], vec, const0_rtx));
> +  emit_insn (gen_vec_extract<mode><V_elem_l> (operands[0], vec, const0_rtx));
>    DONE;
>  })
>  
> @@ -1607,7 +1607,7 @@ (define_expand "reduc_plus_scal_v2di"
>    rtx vec = gen_reg_rtx (V2DImode);
>  
>    emit_insn (gen_arm_reduc_plus_internal_v2di (vec, operands[1]));
> -  emit_insn (gen_vec_extractv2di (operands[0], vec, const0_rtx));
> +  emit_insn (gen_vec_extractv2didi (operands[0], vec, const0_rtx));
>  
>    DONE;
>  })
> @@ -1631,7 +1631,7 @@ (define_expand "reduc_smin_scal_<mode>"
>    neon_pairwise_reduce (vec, operands[1], <MODE>mode,
>  			&gen_neon_vpsmin<mode>);
>    /* The result is computed into every element of the vector.  */
> -  emit_insn (gen_vec_extract<mode> (operands[0], vec, const0_rtx));
> +  emit_insn (gen_vec_extract<mode><V_elem_l> (operands[0], vec, const0_rtx));
>    DONE;
>  })
>  
> @@ -1658,7 +1658,7 @@ (define_expand "reduc_smax_scal_<mode>"
>    neon_pairwise_reduce (vec, operands[1], <MODE>mode,
>  			&gen_neon_vpsmax<mode>);
>    /* The result is computed into every element of the vector.  */
> -  emit_insn (gen_vec_extract<mode> (operands[0], vec, const0_rtx));
> +  emit_insn (gen_vec_extract<mode><V_elem_l> (operands[0], vec, const0_rtx));
>    DONE;
>  })
>  
> @@ -1685,7 +1685,7 @@ (define_expand "reduc_umin_scal_<mode>"
>    neon_pairwise_reduce (vec, operands[1], <MODE>mode,
>  			&gen_neon_vpumin<mode>);
>    /* The result is computed into every element of the vector.  */
> -  emit_insn (gen_vec_extract<mode> (operands[0], vec, const0_rtx));
> +  emit_insn (gen_vec_extract<mode><V_elem_l> (operands[0], vec, const0_rtx));
>    DONE;
>  })
>  
> @@ -1711,7 +1711,7 @@ (define_expand "reduc_umax_scal_<mode>"
>    neon_pairwise_reduce (vec, operands[1], <MODE>mode,
>  			&gen_neon_vpumax<mode>);
>    /* The result is computed into every element of the vector.  */
> -  emit_insn (gen_vec_extract<mode> (operands[0], vec, const0_rtx));
> +  emit_insn (gen_vec_extract<mode><V_elem_l> (operands[0], vec, const0_rtx));
>    DONE;
>  })
>  
> @@ -3272,7 +3272,8 @@ (define_expand "neon_vget_lane<mode>"
>      }
>  
>    if (GET_MODE_UNIT_BITSIZE (<MODE>mode) == 32)
> -    emit_insn (gen_vec_extract<mode> (operands[0], operands[1], operands[2]));
> +    emit_insn (gen_vec_extract<mode><V_elem_l> (operands[0], operands[1],
> +						operands[2]));
>    else
>      emit_insn (gen_neon_vget_lane<mode>_sext_internal (operands[0],
>  						       operands[1],
> @@ -3301,7 +3302,8 @@ (define_expand "neon_vget_laneu<mode>"
>      }
>  
>    if (GET_MODE_UNIT_BITSIZE (<MODE>mode) == 32)
> -    emit_insn (gen_vec_extract<mode> (operands[0], operands[1], operands[2]));
> +    emit_insn (gen_vec_extract<mode><V_elem_l> (operands[0], operands[1],
> +						operands[2]));
>    else
>      emit_insn (gen_neon_vget_lane<mode>_zext_internal (operands[0],
>  						       operands[1],
> --- gcc/config/mips/mips-msa.md.jj	2017-03-31 20:36:09.000000000 +0200
> +++ gcc/config/mips/mips-msa.md	2017-07-24 17:33:32.657689124 +0200
> @@ -231,7 +231,7 @@ (define_mode_attr bitimm
>     (V4SI  "uimm5")
>     (V2DI  "uimm6")])
>  
> -(define_expand "vec_init<mode>"
> +(define_expand "vec_init<mode><unitmode>"
>    [(match_operand:MSA 0 "register_operand")
>     (match_operand:MSA 1 "")]
>    "ISA_HAS_MSA"
> @@ -311,7 +311,7 @@ (define_expand "vec_unpacku_lo_<mode>"
>    DONE;
>  })
>  
> -(define_expand "vec_extract<mode>"
> +(define_expand "vec_extract<mode><unitmode>"
>    [(match_operand:<UNITMODE> 0 "register_operand")
>     (match_operand:IMSA 1 "register_operand")
>     (match_operand 2 "const_<indeximm>_operand")]
> @@ -329,7 +329,7 @@ (define_expand "vec_extract<mode>"
>    DONE;
>  })
>  
> -(define_expand "vec_extract<mode>"
> +(define_expand "vec_extract<mode><unitmode>"
>    [(match_operand:<UNITMODE> 0 "register_operand")
>     (match_operand:FMSA 1 "register_operand")
>     (match_operand 2 "const_<indeximm>_operand")]
> --- gcc/config/mips/loongson.md.jj	2017-01-01 12:45:40.000000000 +0100
> +++ gcc/config/mips/loongson.md	2017-07-24 18:08:29.736433972 +0200
> @@ -119,7 +119,7 @@ (define_insn "mov<mode>_internal"
>  
>  ;; Initialization of a vector.
>  
> -(define_expand "vec_init<mode>"
> +(define_expand "vec_init<mode><unitmode>"
>    [(set (match_operand:VWHB 0 "register_operand")
>  	(match_operand 1 ""))]
>    "TARGET_HARD_FLOAT && TARGET_LOONGSON_VECTORS"
> --- gcc/config/mips/mips-ps-3d.md.jj	2017-01-01 12:45:40.000000000 +0100
> +++ gcc/config/mips/mips-ps-3d.md	2017-07-24 17:34:13.540195876 +0200
> @@ -254,7 +254,7 @@ (define_expand "mips_pll_ps"
>  })
>  
>  ; vec_init
> -(define_expand "vec_initv2sf"
> +(define_expand "vec_initv2sfsf"
>    [(match_operand:V2SF 0 "register_operand")
>     (match_operand:V2SF 1 "")]
>    "TARGET_HARD_FLOAT && TARGET_PAIRED_SINGLE_FLOAT"
> @@ -282,7 +282,7 @@ (define_insn "vec_concatv2sf"
>  ;; emulated.  There is no other way to get a vector mode bitfield extract
>  ;; currently.
>  
> -(define_insn "vec_extractv2sf"
> +(define_insn "vec_extractv2sfsf"
>    [(set (match_operand:SF 0 "register_operand" "=f")
>  	(vec_select:SF (match_operand:V2SF 1 "register_operand" "f")
>  		       (parallel
> @@ -379,7 +379,7 @@ (define_expand "reduc_plus_scal_v2sf"
>      rtx temp = gen_reg_rtx (V2SFmode);
>      emit_insn (gen_mips_addr_ps (temp, operands[1], operands[1]));
>      rtx lane = BYTES_BIG_ENDIAN ? const1_rtx : const0_rtx;
> -    emit_insn (gen_vec_extractv2sf (operands[0], temp, lane));
> +    emit_insn (gen_vec_extractv2sfsf (operands[0], temp, lane));
>      DONE;
>    })
>  
> @@ -757,7 +757,7 @@ (define_expand "reduc_smin_scal_v2sf"
>    rtx temp = gen_reg_rtx (V2SFmode);
>    mips_expand_vec_reduc (temp, operands[1], gen_sminv2sf3);
>    rtx lane = BYTES_BIG_ENDIAN ? const1_rtx : const0_rtx;
> -  emit_insn (gen_vec_extractv2sf (operands[0], temp, lane));
> +  emit_insn (gen_vec_extractv2sfsf (operands[0], temp, lane));
>    DONE;
>  })
>  
> @@ -769,6 +769,6 @@ (define_expand "reduc_smax_scal_v2sf"
>    rtx temp = gen_reg_rtx (V2SFmode);
>    mips_expand_vec_reduc (temp, operands[1], gen_smaxv2sf3);
>    rtx lane = BYTES_BIG_ENDIAN ? const1_rtx : const0_rtx;
> -  emit_insn (gen_vec_extractv2sf (operands[0], temp, lane));
> +  emit_insn (gen_vec_extractv2sfsf (operands[0], temp, lane));
>    DONE;
>  })
> --- gcc/config/mips/mips.md.jj	2017-06-15 11:03:32.000000000 +0200
> +++ gcc/config/mips/mips.md	2017-07-24 19:00:15.519582707 +0200
> @@ -917,6 +917,11 @@ (define_mode_attr UNITMODE [(SF "SF") (D
>  			    (V16QI "QI") (V8HI "HI") (V4SI "SI") (V2DI "DI")
>  			    (V2DF "DF")])
>  
> +;; As above, but in lower case.
> +(define_mode_attr unitmode [(SF "sf") (DF "df") (V2SF "sf") (V4SF "sf")
> +			    (V16QI "qi") (V8QI "qi") (V8HI "hi") (V4HI "hi")
> +			    (V4SI "si") (V2SI "si") (V2DI "di") (V2DF "df")])
> +
>  ;; This attribute gives the integer mode that has the same size as a
>  ;; fixed-point mode.
>  (define_mode_attr IMODE [(QQ "QI") (HQ "HI") (SQ "SI") (DQ "DI")
> --- gcc/config/spu/spu.c.jj	2017-07-17 10:08:39.000000000 +0200
> +++ gcc/config/spu/spu.c	2017-07-24 18:06:01.693214125 +0200
> @@ -1773,7 +1773,7 @@ spu_expand_prologue (void)
>  	      size_v4si = scratch_v4si;
>  	    }
>  	  emit_insn (gen_cgt_v4si (scratch_v4si, sp_v4si, size_v4si));
> -	  emit_insn (gen_vec_extractv4si
> +	  emit_insn (gen_vec_extractv4sisi
>  		     (scratch_reg_0, scratch_v4si, GEN_INT (1)));
>  	  emit_insn (gen_spu_heq (scratch_reg_0, GEN_INT (0)));
>  	}
> @@ -5368,7 +5368,7 @@ spu_allocate_stack (rtx op0, rtx op1)
>      {
>        rtx avail = gen_reg_rtx(SImode);
>        rtx result = gen_reg_rtx(SImode);
> -      emit_insn (gen_vec_extractv4si (avail, sp, GEN_INT (1)));
> +      emit_insn (gen_vec_extractv4sisi (avail, sp, GEN_INT (1)));
>        emit_insn (gen_cgt_si(result, avail, GEN_INT (-1)));
>        emit_insn (gen_spu_heq (result, GEN_INT(0) ));
>      }
> @@ -5684,22 +5684,22 @@ spu_builtin_extract (rtx ops[])
>        switch (mode)
>  	{
>  	case V16QImode:
> -	  emit_insn (gen_vec_extractv16qi (ops[0], ops[1], ops[2]));
> +	  emit_insn (gen_vec_extractv16qiqi (ops[0], ops[1], ops[2]));
>  	  break;
>  	case V8HImode:
> -	  emit_insn (gen_vec_extractv8hi (ops[0], ops[1], ops[2]));
> +	  emit_insn (gen_vec_extractv8hihi (ops[0], ops[1], ops[2]));
>  	  break;
>  	case V4SFmode:
> -	  emit_insn (gen_vec_extractv4sf (ops[0], ops[1], ops[2]));
> +	  emit_insn (gen_vec_extractv4sfsf (ops[0], ops[1], ops[2]));
>  	  break;
>  	case V4SImode:
> -	  emit_insn (gen_vec_extractv4si (ops[0], ops[1], ops[2]));
> +	  emit_insn (gen_vec_extractv4sisi (ops[0], ops[1], ops[2]));
>  	  break;
>  	case V2DImode:
> -	  emit_insn (gen_vec_extractv2di (ops[0], ops[1], ops[2]));
> +	  emit_insn (gen_vec_extractv2didi (ops[0], ops[1], ops[2]));
>  	  break;
>  	case V2DFmode:
> -	  emit_insn (gen_vec_extractv2df (ops[0], ops[1], ops[2]));
> +	  emit_insn (gen_vec_extractv2dfdf (ops[0], ops[1], ops[2]));
>  	  break;
>  	default:
>  	  abort ();
> --- gcc/config/spu/spu.md.jj	2017-01-01 12:45:40.000000000 +0100
> +++ gcc/config/spu/spu.md	2017-07-24 18:05:05.591888718 +0200
> @@ -256,6 +256,13 @@ (define_mode_attr inner  [(V16QI "QI")
>  			  (V2DI  "DI")
>  			  (V4SF  "SF")
>  			  (V2DF  "DF")])
> +;; Like above, but in lower case
> +(define_mode_attr inner_l [(V16QI "qi")
> +			   (V8HI  "hi")
> +			   (V4SI  "si")
> +			   (V2DI  "di")
> +			   (V4SF  "sf")
> +			   (V2DF  "df")])
>  (define_mode_attr vmult  [(V16QI "1")
>  			  (V8HI  "2")
>  			  (V4SI  "4")
> @@ -4318,7 +4325,7 @@ (define_expand "restore_stack_nonlocal"
>  ;; vector patterns
>  
>  ;; Vector initialization
> -(define_expand "vec_init<mode>"
> +(define_expand "vec_init<mode><inner_l>"
>    [(match_operand:V 0 "register_operand" "")
>     (match_operand 1 "" "")]
>    ""
> @@ -4347,7 +4354,7 @@ (define_expand "vec_set<mode>"
>      operands[6] = GEN_INT (size);
>    })
>  
> -(define_expand "vec_extract<mode>"
> +(define_expand "vec_extract<mode><inner_l>"
>    [(set (match_operand:<inner> 0 "spu_reg_operand" "=r")
>  	(vec_select:<inner> (match_operand:V 1 "spu_reg_operand" "r")
>  			    (parallel [(match_operand 2 "const_int_operand" "i")])))]
> --- gcc/config/sparc/sparc.md.jj	2017-07-17 10:08:39.000000000 +0200
> +++ gcc/config/sparc/sparc.md	2017-07-24 18:11:52.396997069 +0200
> @@ -8621,6 +8621,8 @@ (define_mode_attr vconstr [(V1SI "f") (V
>  (define_mode_attr vfptype [(V1SI "single") (V2HI "single") (V4QI "single")
>  			   (V1DI "double") (V2SI "double") (V4HI "double")
>  			   (V8QI "double")])
> +(define_mode_attr veltmode [(V1SI "si") (V2HI "hi") (V4QI "qi") (V1DI "di")
> +			    (V2SI "si") (V4HI "hi") (V8QI "qi")])
>  
>  (define_expand "mov<VMALL:mode>"
>    [(set (match_operand:VMALL 0 "nonimmediate_operand" "")
> @@ -8762,7 +8764,7 @@ (define_split
>    DONE;
>  })
>  
> -(define_expand "vec_init<VMALL:mode>"
> +(define_expand "vec_init<VMALL:mode><VMALL:veltmode>"
>    [(match_operand:VMALL 0 "register_operand" "")
>     (match_operand:VMALL 1 "" "")]
>    "TARGET_VIS"
> --- gcc/config/ia64/vect.md.jj	2017-01-01 12:45:42.000000000 +0100
> +++ gcc/config/ia64/vect.md	2017-07-24 17:29:28.996628899 +0200
> @@ -1015,7 +1015,7 @@ (define_insn "*vec_interleave_highv2si"
>  }
>    [(set_attr "itanium_class" "mmshf")])
>  
> -(define_expand "vec_initv2si"
> +(define_expand "vec_initv2sisi"
>    [(match_operand:V2SI 0 "gr_register_operand" "")
>     (match_operand 1 "" "")]
>    ""
> @@ -1299,7 +1299,7 @@ (define_insn "*fselect"
>    "fselect %0 = %F2, %F3, %1"
>    [(set_attr "itanium_class" "fmisc")])
>  
> -(define_expand "vec_initv2sf"
> +(define_expand "vec_initv2sfsf"
>    [(match_operand:V2SF 0 "fr_register_operand" "")
>     (match_operand 1 "" "")]
>    ""
> @@ -1483,7 +1483,7 @@ (define_insn_and_split "*vec_extractv2sf
>    operands[1] = gen_rtx_REG (SFmode, REGNO (operands[1]));
>  })
>  
> -(define_expand "vec_extractv2sf"
> +(define_expand "vec_extractv2sfsf"
>    [(set (match_operand:SF 0 "register_operand" "")
>  	(unspec:SF [(match_operand:V2SF 1 "register_operand" "")
>  		    (match_operand:DI 2 "const_int_operand" "")]
> --- gcc/config/powerpcspe/vector.md.jj	2017-05-25 10:37:03.000000000 +0200
> +++ gcc/config/powerpcspe/vector.md	2017-07-24 17:41:21.897027743 +0200
> @@ -74,6 +74,16 @@ (define_mode_attr VEC_base [(V16QI "QI")
>  			    (V1TI  "TI")
>  			    (TI    "TI")])
>  
> +;; As above, but in lower case
> +(define_mode_attr VEC_base_l [(V16QI "qi")
> +			      (V8HI  "hi")
> +			      (V4SI  "si")
> +			      (V2DI  "di")
> +			      (V4SF  "sf")
> +			      (V2DF  "df")
> +			      (V1TI  "ti")
> +			      (TI    "ti")])
> +
>  ;; Same size integer type for floating point data
>  (define_mode_attr VEC_int [(V4SF  "v4si")
>  			   (V2DF  "v2di")])
> @@ -1017,7 +1027,7 @@ (define_expand "fixuns_trunc<mode><VEC_i
>  
>  
>  ;; Vector initialization, set, extract
> -(define_expand "vec_init<mode>"
> +(define_expand "vec_init<mode><VEC_base_l>"
>    [(match_operand:VEC_E 0 "vlogical_operand" "")
>     (match_operand:VEC_E 1 "" "")]
>    "VECTOR_MEM_ALTIVEC_OR_VSX_P (<MODE>mode)"
> @@ -1036,7 +1046,7 @@ (define_expand "vec_set<mode>"
>    DONE;
>  })
>  
> -(define_expand "vec_extract<mode>"
> +(define_expand "vec_extract<mode><VEC_base_l>"
>    [(match_operand:<VEC_base> 0 "register_operand" "")
>     (match_operand:VEC_E 1 "vlogical_operand" "")
>     (match_operand 2 "const_int_operand" "")]
> --- gcc/config/powerpcspe/paired.md.jj	2017-05-25 10:37:04.000000000 +0200
> +++ gcc/config/powerpcspe/paired.md	2017-07-24 17:42:17.980351097 +0200
> @@ -377,7 +377,7 @@ (define_insn "paired_muls1"
>    "ps_muls1 %0, %1, %2"
>    [(set_attr "type" "fp")])
>  
> -(define_expand "vec_initv2sf"
> +(define_expand "vec_initv2sfsf"
>    [(match_operand:V2SF 0 "gpc_reg_operand" "=f")
>     (match_operand 1 "" "")]
>    "TARGET_PAIRED_FLOAT"
> --- gcc/config/powerpcspe/altivec.md.jj	2017-05-25 10:37:05.000000000 +0200
> +++ gcc/config/powerpcspe/altivec.md	2017-07-24 17:42:49.897966010 +0200
> @@ -301,7 +301,7 @@ (define_split
>    for (i = 0; i < num_elements; i++)
>      RTVEC_ELT (v, i) = constm1_rtx;
>  
> -  emit_insn (gen_vec_initv4si (dest, gen_rtx_PARALLEL (mode, v)));
> +  emit_insn (gen_vec_initv4sisi (dest, gen_rtx_PARALLEL (mode, v)));
>    emit_insn (gen_rtx_SET (dest, gen_rtx_ASHIFT (mode, dest, dest)));
>    DONE;
>  })
> @@ -2222,7 +2222,7 @@ (define_expand "altivec_copysign_v4sf3"
>    RTVEC_ELT (v, 2) = GEN_INT (mask_val);
>    RTVEC_ELT (v, 3) = GEN_INT (mask_val);
>  
> -  emit_insn (gen_vec_initv4si (mask, gen_rtx_PARALLEL (V4SImode, v)));
> +  emit_insn (gen_vec_initv4sisi (mask, gen_rtx_PARALLEL (V4SImode, v)));
>    emit_insn (gen_vector_select_v4sf (operands[0], operands[1], operands[2],
>  				     gen_lowpart (V4SFmode, mask)));
>    DONE;
> @@ -3014,7 +3014,7 @@ (define_expand "vec_unpacku_hi_v16qi"
>    RTVEC_ELT (v, 14) = gen_rtx_CONST_INT (QImode, be ? 16 :  0);
>    RTVEC_ELT (v, 15) = gen_rtx_CONST_INT (QImode, be ?  7 : 16);
>  
> -  emit_insn (gen_vec_initv16qi (mask, gen_rtx_PARALLEL (V16QImode, v)));
> +  emit_insn (gen_vec_initv16qiqi (mask, gen_rtx_PARALLEL (V16QImode, v)));
>    emit_insn (gen_vperm_v16qiv8hi (operands[0], operands[1], vzero, mask));
>    DONE;
>  }")
> @@ -3050,7 +3050,7 @@ (define_expand "vec_unpacku_hi_v8hi"
>    RTVEC_ELT (v, 14) = gen_rtx_CONST_INT (QImode, be ?  6 : 17);
>    RTVEC_ELT (v, 15) = gen_rtx_CONST_INT (QImode, be ?  7 : 16);
>  
> -  emit_insn (gen_vec_initv16qi (mask, gen_rtx_PARALLEL (V16QImode, v)));
> +  emit_insn (gen_vec_initv16qiqi (mask, gen_rtx_PARALLEL (V16QImode, v)));
>    emit_insn (gen_vperm_v8hiv4si (operands[0], operands[1], vzero, mask));
>    DONE;
>  }")
> @@ -3086,7 +3086,7 @@ (define_expand "vec_unpacku_lo_v16qi"
>    RTVEC_ELT (v, 14) = gen_rtx_CONST_INT (QImode, be ? 16 :  8);
>    RTVEC_ELT (v, 15) = gen_rtx_CONST_INT (QImode, be ? 15 : 16);
>  
> -  emit_insn (gen_vec_initv16qi (mask, gen_rtx_PARALLEL (V16QImode, v)));
> +  emit_insn (gen_vec_initv16qiqi (mask, gen_rtx_PARALLEL (V16QImode, v)));
>    emit_insn (gen_vperm_v16qiv8hi (operands[0], operands[1], vzero, mask));
>    DONE;
>  }")
> @@ -3122,7 +3122,7 @@ (define_expand "vec_unpacku_lo_v8hi"
>    RTVEC_ELT (v, 14) = gen_rtx_CONST_INT (QImode, be ? 14 : 17);
>    RTVEC_ELT (v, 15) = gen_rtx_CONST_INT (QImode, be ? 15 : 16);
>  
> -  emit_insn (gen_vec_initv16qi (mask, gen_rtx_PARALLEL (V16QImode, v)));
> +  emit_insn (gen_vec_initv16qiqi (mask, gen_rtx_PARALLEL (V16QImode, v)));
>    emit_insn (gen_vperm_v8hiv4si (operands[0], operands[1], vzero, mask));
>    DONE;
>  }")
> @@ -3363,7 +3363,7 @@ (define_expand "mulv16qi3"
>       = gen_rtx_CONST_INT (QImode, BYTES_BIG_ENDIAN ? 2 * i + 17 : 15 - 2 * i);
>    }
>  
> -  emit_insn (gen_vec_initv16qi (mask, gen_rtx_PARALLEL (V16QImode, v)));
> +  emit_insn (gen_vec_initv16qiqi (mask, gen_rtx_PARALLEL (V16QImode, v)));
>    emit_insn (gen_altivec_vmulesb (even, operands[1], operands[2]));
>    emit_insn (gen_altivec_vmulosb (odd, operands[1], operands[2]));
>    emit_insn (gen_altivec_vperm_v8hiv16qi (operands[0], even, odd, mask));
> 
> 	Jakub
> 
> 

-- 
Richard Biener <rguenther@suse.de>
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nuernberg)