From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <gcc-patches-return-306646-listarch-gcc-patches=gcc.gnu.org@gcc.gnu.org>
Received: (qmail 26662 invoked by alias); 4 Nov 2011 11:22:12 -0000
Received: (qmail 26654 invoked by uid 22791); 4 Nov 2011 11:22:10 -0000
X-SWARE-Spam-Status: No, hits=-3.8 required=5.0	tests=AWL,BAYES_00,RP_MATCHES_RCVD,TW_TM
X-Spam-Check-By: sourceware.org
Received: from cantor2.suse.de (HELO mx2.suse.de) (195.135.220.15)    by sourceware.org (qpsmtpd/0.43rc1) with ESMTP; Fri, 04 Nov 2011 11:21:51 +0000
Received: from relay2.suse.de (charybdis-ext.suse.de [195.135.221.2])	(using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits))	(No client certificate requested)	by mx2.suse.de (Postfix) with ESMTP id 8C8BB8F903;	Fri,  4 Nov 2011 12:21:49 +0100 (CET)
Date: Fri, 04 Nov 2011 11:44:00 -0000
From: Richard Guenther <rguenther@suse.de>
To: Jakub Jelinek <jakub@redhat.com>
Cc: Ira Rosen <ira.rosen@linaro.org>,	Richard Henderson <rth@redhat.com>, gcc-patches@gcc.gnu.org
Subject: Re: Patch ping
In-Reply-To: <20111104102525.GH1052@tyan-ft48-01.lab.bos.redhat.com>
Message-ID: <alpine.LNX.2.00.1111041219460.26779@zhemvz.fhfr.qr>
References: <20111102201829.GT1052@tyan-ft48-01.lab.bos.redhat.com> <alpine.LNX.2.00.1111041043570.26779@zhemvz.fhfr.qr> <20111104102525.GH1052@tyan-ft48-01.lab.bos.redhat.com>
User-Agent: Alpine 2.00 (LNX 1167 2008-08-23)
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Id: <gcc-patches.gcc.gnu.org>
List-Archive: <http://gcc.gnu.org/ml/gcc-patches/>
List-Post: <mailto:gcc-patches@gcc.gnu.org>
List-Help: <mailto:gcc-patches-help@gcc.gnu.org>
Sender: gcc-patches-owner@gcc.gnu.org
X-SW-Source: 2011-11/txt/msg00536.txt.bz2

On Fri, 4 Nov 2011, Jakub Jelinek wrote:

> On Fri, Nov 04, 2011 at 10:52:44AM +0100, Richard Guenther wrote:
> > > - Gather vectorization patch + incremental patches
> > >   http://gcc.gnu.org/ml/gcc-patches/2011-10/msg02411.html
> > 
> > I'm not sure I like using new builtins for gather representation
> > on the tree level too much, given that we are now moving
> > towards using tree codes for suffle.  Thus, how complicated
> > would it be to have a gather tree code and optab and to
> > handle the mixed size index issue in the expander?
> > 
> > I realize this would be quite some reorg to the patchset ...
> > so, why did you choose builtins over a more generic approach?
> 
> Because while permutations etc. are common to most targets,
> currently gather is very specialized, specific to one target,
> with lots of details how it must look like (e.g. the mask stuff where
> we currently don't even have tree code for conditional loads or conditional
> stores), and additionally unlike VEC_PERM_EXPR etc. which are normal
> expressions this one is a reference (and conditional one too), so
> I'm afraid I'd need to touch huge amounts of code (most places that
> currently handle MEM_REF/TARGET_MEM_REF would need to handle
> VEC_GATHER_MEM_REF too, as it is a memory read (well, set of conditional
> memory reads).  The i?86 backend already has (except 4) all the needed
> builtins anyway and handles expanding them too, the 4 ones are just
> to cope with the weirdo definition of some of them (half sized vectors).
> And when it is represented as builtin, the side-effects are handled by
> all optimization passes automatically, similarly how e.g. atomic builtins
> are right now builtins and not expressions.
> 
> So I thought it is better to use builtins right now, then when we in 4.8+
> hopefully do something about conditional loads/stores and their
> vectorization and if we introduce for that some new GIMPLE representation,
> this could be done on top of that.

Ok.  I guess it's ok to use builtins for now - I didn't think of
the memory reference issue ;)

> > >   http://gcc.gnu.org/ml/gcc-patches/2011-10/msg02846.html
> > 
> > +  if (TREE_CODE (base) == MEM_REF)
> >      {
> > -      off = TREE_OPERAND (base, 1);
> > +      if (!integer_zerop (TREE_OPERAND (base, 1)))
> > +	{
> > +	  if (off == NULL_TREE)
> > +	    {
> > +	      double_int moff = mem_ref_offset (base);
> > +	      off = double_int_to_tree (sizetype, moff);
> > +	    }
> > +	  else
> > +	    off = size_binop (PLUS_EXPR, off, TREE_OPERAND (base, 1));
> > 
> > that's not safe, TREE_OPEAND (base, 1) is of pointer type, so
> > you unconditionally need to do the conversion to sizetype of
> > TREE_OPEAND (base, 1).
> 
> Ok, will fix.
> 
> > The routine lacks comments - it's got quite big and fails to
> 
> And add the comments.
> 
> > state any reason for its complexity.  I'm also not sure why
> > DR would include any loop invariant parts of the SCEV - doesn't
> > it instantiate just for the loop in question?
> 
> I'm not sure I understand your question.  With the incremental
> patch I'm not using any DR info appart from DR_REF to determine
> what is loop invariant part of the address and what is not.
> The reason for that is that split_constant_offset turns the GIMPLE
> code into a sometimes big tree, which actually may contain a mixture
> of loop invariants/constants and SSA_NAMEs defined in the loop,
> that all with casts, multiplications/additions/subtractions.
> For gather I need to split it into a single loop invariant
> argument (which can be computed before the loop as loop invariant
> and thus can be arbitrary tree expression that is just gimplified
> there) and another SSA_NAME defined into the loop which can be
> vectorized which is perhaps sign-extended and multiplied by 1/2/4/8.
> 
> With the approach the incremental patch does I just
> walk what split_constant_offset during DR walks and peel off
> loop invariants until I have something that should be used as the
> vectorized index.

It looks like split_constant_offset walks def stmts in an unbound
fashion.  That's surely a bad idea - SCEV should already have
expanded everything non-loop-invariant, thus it should at most
look through DEFs that trivially add to the constant offset,
not through others.

Richard.