Date: Mon, 2 Aug 2021 09:44:09 +0200 (CEST)
From: Richard Biener
To: Richard Sandiford
Cc: gcc-patches@gcc.gnu.org, linkw@linux.ibm.com
Subject: Re: [PATCH] Add emulated gather capability to the vectorizer

On Fri, 30 Jul 2021, Richard Sandiford wrote:

> Richard Biener writes:
> > This adds a gather vectorization capability to the vectorizer
> > without target support by decomposing the offset vector, doing
> > scalar loads and then building a vector from the result.  This
> > is aimed mainly at cases where vectorizing the rest of the loop
> > offsets the cost of vectorizing the gather.
> >
> > Note it's difficult to avoid vectorizing the offset load, but in
> > some cases later passes can turn the vector load + extract into
> > scalar loads, see the followup patch.
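For illustration (not from the patch itself): the emulation amounts to
roughly the following, sketched here with GCC vector extensions for a
4-lane float vector indexed by int offsets:

    typedef float v4sf __attribute__((vector_size (16)));
    typedef int v4si __attribute__((vector_size (16)));

    /* Emulated gather: the offsets arrive as a vector, each lane is
       extracted and used for a scalar load, and the loaded values
       are combined back into a vector (a vec_construct in cost
       terms).  */
    v4sf
    emulated_gather (const float *base, v4si off)
    {
      v4sf res = { base[off[0]], base[off[1]],
                   base[off[2]], base[off[3]] };
      return res;
    }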
> >
> > On SPEC CPU 2017 510.parest_r this improves runtime from 250s
> > to 219s on a Zen2 CPU which has its native gather instructions
> > disabled (using those the runtime instead increases to 254s)
> > using -Ofast -march=znver2 [-flto].  It turns out the critical
> > loops in this benchmark all perform gather operations.
>
> Nice!
>
> > Bootstrapped and tested on x86_64-unknown-linux-gnu.
> >
> > Any comments?  I still plan to run this over full SPEC and
> > I have to apply TLC to the followup patch before I can post it.
> >
> > I think neither power nor z has gather, so I'm curious whether the
> > patch helps 510.parest_r there; I'm unsure about neon/advsimd.
> > Both might need the followup patch - I was surprised about
> > the speedup without it on Zen (the followup improves runtime
> > to 198s there).
> >
> > Thanks,
> > Richard.
> >
> > 2021-07-30  Richard Biener
> >
> > 	* tree-vect-data-refs.c (vect_check_gather_scatter):
> > 	Include widening conversions only when the result is
> > 	still handled by a native gather or the current offset
> > 	size does not already match the data size.
> > 	Also succeed analysis in case there's no native support,
> > 	noted by an IFN_LAST ifn and a NULL decl.
> > 	* tree-vect-patterns.c (vect_recog_gather_scatter_pattern):
> > 	Test for no IFN gather rather than decl gather.
> > 	* tree-vect-stmts.c (vect_model_load_cost): Pass in the
> > 	gather-scatter info and cost emulated gathers accordingly.
> > 	(vect_truncate_gather_scatter_offset): Properly test for
> > 	no IFN gather.
> > 	(vect_use_strided_gather_scatters_p): Likewise.
> > 	(get_load_store_type): Handle emulated gathers and their
> > 	restrictions.
> > 	(vectorizable_load): Likewise.  Emulate them by extracting
> > 	scalar offsets, doing scalar loads and a vector construct.
> >
> > 	* gcc.target/i386/vect-gather-1.c: New testcase.
> > 	* gfortran.dg/vect/vect-8.f90: Adjust.
> > ---
> >  gcc/testsuite/gcc.target/i386/vect-gather-1.c |  18 ++++
> >  gcc/testsuite/gfortran.dg/vect/vect-8.f90     |   2 +-
> >  gcc/tree-vect-data-refs.c                     |  29 +++--
> >  gcc/tree-vect-patterns.c                      |   2 +-
> >  gcc/tree-vect-stmts.c                         | 100 ++++++++++++++++--
> >  5 files changed, 136 insertions(+), 15 deletions(-)
> >  create mode 100644 gcc/testsuite/gcc.target/i386/vect-gather-1.c
> >
> > diff --git a/gcc/testsuite/gcc.target/i386/vect-gather-1.c b/gcc/testsuite/gcc.target/i386/vect-gather-1.c
> > new file mode 100644
> > index 00000000000..134aef39666
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.target/i386/vect-gather-1.c
> > @@ -0,0 +1,18 @@
> > +/* { dg-do compile } */
> > +/* { dg-options "-Ofast -msse2 -fdump-tree-vect-details" } */
> > +
> > +#ifndef INDEXTYPE
> > +#define INDEXTYPE int
> > +#endif
> > +double vmul(INDEXTYPE *rowstart, INDEXTYPE *rowend,
> > +	    double *luval, double *dst)
> > +{
> > +  double res = 0;
> > +  for (const INDEXTYPE * col = rowstart; col != rowend; ++col, ++luval)
> > +    res += *luval * dst[*col];
> > +  return res;
> > +}
> > +
> > +/* With gather emulation this should be profitable to vectorize
> > +   even with plain SSE2.  */
> > +/* { dg-final { scan-tree-dump "loop vectorized" "vect" } } */
> > diff --git a/gcc/testsuite/gfortran.dg/vect/vect-8.f90 b/gcc/testsuite/gfortran.dg/vect/vect-8.f90
> > index 9994805d77f..cc1aebfbd84 100644
> > --- a/gcc/testsuite/gfortran.dg/vect/vect-8.f90
> > +++ b/gcc/testsuite/gfortran.dg/vect/vect-8.f90
> > @@ -706,5 +706,5 @@ END SUBROUTINE kernel
> >
> >  ! { dg-final { scan-tree-dump-times "vectorized 24 loops" 1 "vect" { target aarch64_sve } } }
> >  ! { dg-final { scan-tree-dump-times "vectorized 23 loops" 1 "vect" { target { aarch64*-*-* && { ! aarch64_sve } } } } }
> > -! { dg-final { scan-tree-dump-times "vectorized 2\[23\] loops" 1 "vect" { target { vect_intdouble_cvt && { ! aarch64*-*-* } } } } }
> > +! { dg-final { scan-tree-dump-times "vectorized 2\[234\] loops" 1 "vect" { target { vect_intdouble_cvt && { ! aarch64*-*-* } } } } }
> >  ! { dg-final { scan-tree-dump-times "vectorized 17 loops" 1 "vect" { target { { ! vect_intdouble_cvt } && { ! aarch64*-*-* } } } } }
> > diff --git a/gcc/tree-vect-data-refs.c b/gcc/tree-vect-data-refs.c
> > index 6995efba899..0279e75fa8e 100644
> > --- a/gcc/tree-vect-data-refs.c
> > +++ b/gcc/tree-vect-data-refs.c
> > @@ -4007,8 +4007,26 @@ vect_check_gather_scatter (stmt_vec_info stmt_info, loop_vec_info loop_vinfo,
> >  	      continue;
> >  	    }
> >
> > -	  if (TYPE_PRECISION (TREE_TYPE (op0))
> > -	      < TYPE_PRECISION (TREE_TYPE (off)))
> > +	  /* Include the conversion if it is widening and we're using
> > +	     the IFN path or the target can handle the converted from
> > +	     offset or the current size is not already the same as the
> > +	     data vector element size.  */
> > +	  if ((TYPE_PRECISION (TREE_TYPE (op0))
> > +	       < TYPE_PRECISION (TREE_TYPE (off)))
> > +	      && ((!use_ifn_p
> > +		   && (DR_IS_READ (dr)
> > +		       ? (targetm.vectorize.builtin_gather
> > +			  && targetm.vectorize.builtin_gather (vectype,
> > +							       TREE_TYPE (op0),
> > +							       scale))
> > +		       : (targetm.vectorize.builtin_scatter
> > +			  && targetm.vectorize.builtin_scatter (vectype,
> > +								TREE_TYPE (op0),
> > +								scale))))
> > +		  || (!use_ifn_p
> > +		      && !operand_equal_p (TYPE_SIZE (TREE_TYPE (off)),
> > +					   TYPE_SIZE (TREE_TYPE (vectype)),
> > +					   0))))
>
> I don't understand this bit: it seems to disable the path for
> use_ifn_p altogether.

Ah, yeah - a || use_ifn_p is missing.  I hope to get to turning x86
to use IFNs for gather/scatter for GCC 12 to make this all a bit
less iffy.  Note x86 can do both DFmode[DImode] and DFmode[SImode]
where we'd prefer the SImode case (looking through another
conversion), but the way IFNs are currently selected is to stop at
the first "match".  The builtin_gather/scatter check I added above
instead looks at support for the converted-from type.  That would
need resolving somehow.
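To make the DFmode[DImode] vs. DFmode[SImode] point concrete, here is
an illustrative example (not taken from the patch): with int offsets
indexing double data, the pointer arithmetic widens the offset to
pointer width on LP64 targets, so the IL contains a widening
conversion the analysis can either include or look through:

    double
    gather_sum (const double *data, const int *idx, int n)
    {
      double sum = 0.0;
      for (int i = 0; i < n; ++i)
        /* data[idx[i]] widens idx[i] to pointer width, so both a
           DFmode[DImode] gather (offset after widening) and a
           DFmode[SImode] gather (offset before widening) are
           possible; the narrower offset form is generally
           preferable where the target supports it.  */
        sum += data[idx[i]];
      return sum;
    }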
> >  	    {
> >  	      off = op0;
> >  	      offtype = TREE_TYPE (off);
> > @@ -4036,7 +4054,8 @@ vect_check_gather_scatter (stmt_vec_info stmt_info, loop_vec_info loop_vinfo,
> >        if (!vect_gather_scatter_fn_p (loop_vinfo, DR_IS_READ (dr), masked_p,
> >  				     vectype, memory_type, offtype, scale,
> >  				     &ifn, &offset_vectype))
> > -	return false;
> > +	ifn = IFN_LAST;
> > +      decl = NULL_TREE;
> >      }
> >    else
> >      {
> > @@ -4050,10 +4069,6 @@ vect_check_gather_scatter (stmt_vec_info stmt_info, loop_vec_info loop_vinfo,
> >        if (targetm.vectorize.builtin_scatter)
> >  	decl = targetm.vectorize.builtin_scatter (vectype, offtype, scale);
> >      }
> > -
> > -  if (!decl)
> > -    return false;
> > -
> >    ifn = IFN_LAST;
> >    /* The offset vector type will be read from DECL when needed.  */
> >    offset_vectype = NULL_TREE;
> > diff --git a/gcc/tree-vect-patterns.c b/gcc/tree-vect-patterns.c
> > index 743fd3f5414..25de97bd9b0 100644
> > --- a/gcc/tree-vect-patterns.c
> > +++ b/gcc/tree-vect-patterns.c
> > @@ -4811,7 +4811,7 @@ vect_recog_gather_scatter_pattern (vec_info *vinfo,
> >       function for the gather/scatter operation.  */
> >    gather_scatter_info gs_info;
> >    if (!vect_check_gather_scatter (stmt_info, loop_vinfo, &gs_info)
> > -      || gs_info.decl)
> > +      || gs_info.ifn == IFN_LAST)
> >      return NULL;
> >
> >    /* Convert the mask to the right form.  */
> > diff --git a/gcc/tree-vect-stmts.c b/gcc/tree-vect-stmts.c
> > index 05085e1b110..9d51b476db6 100644
> > --- a/gcc/tree-vect-stmts.c
> > +++ b/gcc/tree-vect-stmts.c
> > @@ -1084,6 +1084,7 @@ static void
> >  vect_model_load_cost (vec_info *vinfo,
> >  		      stmt_vec_info stmt_info, unsigned ncopies, poly_uint64 vf,
> >  		      vect_memory_access_type memory_access_type,
> > +		      gather_scatter_info *gs_info,
> >  		      slp_tree slp_node,
> >  		      stmt_vector_for_cost *cost_vec)
> >  {
> > @@ -1172,9 +1173,17 @@ vect_model_load_cost (vec_info *vinfo,
> >    if (memory_access_type == VMAT_ELEMENTWISE
> >        || memory_access_type == VMAT_GATHER_SCATTER)
> >      {
> > -      /* N scalar loads plus gathering them into a vector.  */
> >        tree vectype = STMT_VINFO_VECTYPE (stmt_info);
> >        unsigned int assumed_nunits = vect_nunits_for_cost (vectype);
> > +      if (memory_access_type == VMAT_GATHER_SCATTER
> > +	  && gs_info->ifn == IFN_LAST && !gs_info->decl)
> > +	/* For emulated gathers N offset vector element extracts
> > +	   (we assume the scalar scaling and ptr + offset add is consumed by
> > +	   the load).  */
> > +	inside_cost += record_stmt_cost (cost_vec, ncopies * assumed_nunits,
> > +					 vec_to_scalar, stmt_info, 0,
> > +					 vect_body);
> > +      /* N scalar loads plus gathering them into a vector.  */
> >        inside_cost += record_stmt_cost (cost_vec,
> >  				       ncopies * assumed_nunits,
> >  				       scalar_load, stmt_info, 0, vect_body);
> > @@ -1184,7 +1193,9 @@ vect_model_load_cost (vec_info *vinfo,
> >  					&inside_cost, &prologue_cost,
> >  					cost_vec, cost_vec, true);
> >    if (memory_access_type == VMAT_ELEMENTWISE
> > -      || memory_access_type == VMAT_STRIDED_SLP)
> > +      || memory_access_type == VMAT_STRIDED_SLP
> > +      || (memory_access_type == VMAT_GATHER_SCATTER
> > +	  && gs_info->ifn == IFN_LAST && !gs_info->decl))
> >      inside_cost += record_stmt_cost (cost_vec, ncopies, vec_construct,
> >  				     stmt_info, 0, vect_body);
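As a back-of-the-envelope reading of the cost change above
(illustrative only; the real numbers come from the target's cost
hooks): for one copy of an emulated gather into a 4-lane vector this
charges 4 vec_to_scalar extracts + 4 scalar_load + 1 vec_construct,
i.e. 9 statement costs where a native gather would be a single load.
That lets the profitability of the rest of the vectorized loop decide
whether the emulation pays off.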
> > @@ -1866,7 +1877,8 @@ vect_truncate_gather_scatter_offset (stmt_vec_info stmt_info,
> >        tree memory_type = TREE_TYPE (DR_REF (dr));
> >        if (!vect_gather_scatter_fn_p (loop_vinfo, DR_IS_READ (dr), masked_p,
> >  				     vectype, memory_type, offset_type, scale,
> > -				     &gs_info->ifn, &gs_info->offset_vectype))
> > +				     &gs_info->ifn, &gs_info->offset_vectype)
> > +	  || gs_info->ifn == IFN_LAST)
> >  	continue;
> >
> >        gs_info->decl = NULL_TREE;
> > @@ -1901,7 +1913,7 @@ vect_use_strided_gather_scatters_p (stmt_vec_info stmt_info,
> >  				    gather_scatter_info *gs_info)
> >  {
> >    if (!vect_check_gather_scatter (stmt_info, loop_vinfo, gs_info)
> > -      || gs_info->decl)
> > +      || gs_info->ifn == IFN_LAST)
> >      return vect_truncate_gather_scatter_offset (stmt_info, loop_vinfo,
> >  						masked_p, gs_info);
> >
> > @@ -2355,6 +2367,27 @@ get_load_store_type (vec_info *vinfo, stmt_vec_info stmt_info,
> >  			     vls_type == VLS_LOAD ? "gather" : "scatter");
> >  	  return false;
> >  	}
> > +      else if (gs_info->ifn == IFN_LAST && !gs_info->decl)
> > +	{
> > +	  if (vls_type != VLS_LOAD)
> > +	    {
> > +	      if (dump_enabled_p ())
> > +		dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
> > +				 "unsupported emulated scatter.\n");
> > +	      return false;
> > +	    }
> > +	  else if (!TYPE_VECTOR_SUBPARTS (vectype).is_constant ()
> > +		   || !known_eq (TYPE_VECTOR_SUBPARTS (vectype),
> > +				 TYPE_VECTOR_SUBPARTS
> > +				   (gs_info->offset_vectype)))
> > +	    {
> > +	      if (dump_enabled_p ())
> > +		dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
> > +				 "unsupported vector types for emulated "
> > +				 "gather.\n");
> > +	      return false;
> > +	    }
> > +	}
> >        /* Gather-scatter accesses perform only component accesses, alignment
> >  	 is irrelevant for them.  */
> >        *alignment_support_scheme = dr_unaligned_supported;
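Spelling out the restriction above with a concrete (hypothetical)
instance: a 4-lane data vector gathered via a 4-lane offset vector is
fine, since offset lane k addresses result lane k; a 2-lane double
vector indexed by a 4-lane int offset vector would need lane
selection and is rejected, as are variable-length (e.g. SVE) vectors,
where TYPE_VECTOR_SUBPARTS is not a compile-time constant.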
"gather" : "scatter"); > > return false; > > } > > + else if (gs_info->ifn == IFN_LAST && !gs_info->decl) > > + { > > + if (vls_type != VLS_LOAD) > > + { > > + if (dump_enabled_p ()) > > + dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location, > > + "unsupported emulated scatter.\n"); > > + return false; > > + } > > + else if (!TYPE_VECTOR_SUBPARTS (vectype).is_constant () > > + || !known_eq (TYPE_VECTOR_SUBPARTS (vectype), > > + TYPE_VECTOR_SUBPARTS > > + (gs_info->offset_vectype))) > > + { > > + if (dump_enabled_p ()) > > + dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location, > > + "unsupported vector types for emulated " > > + "gather.\n"); > > + return false; > > + } > > + } > > /* Gather-scatter accesses perform only component accesses, alignment > > is irrelevant for them. */ > > *alignment_support_scheme = dr_unaligned_supported; > > @@ -8692,6 +8725,15 @@ vectorizable_load (vec_info *vinfo, > > "unsupported access type for masked load.\n"); > > return false; > > } > > + else if (memory_access_type == VMAT_GATHER_SCATTER > > + && gs_info.ifn == IFN_LAST > > + && !gs_info.decl) > > + { > > + if (dump_enabled_p ()) > > + dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location, > > + "unsupported masked emulated gather.\n"); > > + return false; > > + } > > } > > > > if (!vec_stmt) /* transformation not required. */ > > @@ -8725,7 +8767,7 @@ vectorizable_load (vec_info *vinfo, > > > > STMT_VINFO_TYPE (orig_stmt_info) = load_vec_info_type; > > vect_model_load_cost (vinfo, stmt_info, ncopies, vf, memory_access_type, > > - slp_node, cost_vec); > > + &gs_info, slp_node, cost_vec); > > return true; > > } > > > > @@ -9438,7 +9480,8 @@ vectorizable_load (vec_info *vinfo, > > unsigned int misalign; > > unsigned HOST_WIDE_INT align; > > > > - if (memory_access_type == VMAT_GATHER_SCATTER) > > + if (memory_access_type == VMAT_GATHER_SCATTER > > + && gs_info.ifn != IFN_LAST) > > { > > tree zero = build_zero_cst (vectype); > > tree scale = size_int (gs_info.scale); > > @@ -9456,6 +9499,51 @@ vectorizable_load (vec_info *vinfo, > > data_ref = NULL_TREE; > > break; > > } > > + else if (memory_access_type == VMAT_GATHER_SCATTER) > > + { > > + /* Emulated gather-scatter. */ > > + gcc_assert (!final_mask); > > For this to be safe, we need to make sure that > check_load_store_for_partial_vectors clears > LOOP_VINFO_CAN_USE_PARTIAL_VECTORS_P. Is that already guaranteed? > (Maybe yes, I didn't look closely. :-)) I've made sure to fail earlier: @@ -8692,6 +8725,15 @@ vectorizable_load (vec_info *vinfo, "unsupported access type for masked load.\n"); return false; } + else if (memory_access_type == VMAT_GATHER_SCATTER + && gs_info.ifn == IFN_LAST + && !gs_info.decl) + { + if (dump_enabled_p ()) + dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location, + "unsupported masked emulated gather.\n"); + return false; + } } at least I don't see a way to emulate masked gathers ;) > > LGTM otherwise FWIW. Thanks, I'll fix the above and will re-test on full SPEC (and also post the followup). Richard. 
-- 
Richard Biener
SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)