From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <gcc-patches-return-306629-listarch-gcc-patches=gcc.gnu.org@gcc.gnu.org>
Received: (qmail 10605 invoked by alias); 4 Nov 2011 10:25:48 -0000
Received: (qmail 10561 invoked by uid 22791); 4 Nov 2011 10:25:46 -0000
X-SWARE-Spam-Status: No, hits=-7.2 required=5.0	tests=AWL,BAYES_00,RCVD_IN_DNSWL_HI,RP_MATCHES_RCVD,SPF_HELO_PASS
X-Spam-Check-By: sourceware.org
Received: from mx1.redhat.com (HELO mx1.redhat.com) (209.132.183.28)    by sourceware.org (qpsmtpd/0.43rc1) with ESMTP; Fri, 04 Nov 2011 10:25:28 +0000
Received: from int-mx12.intmail.prod.int.phx2.redhat.com (int-mx12.intmail.prod.int.phx2.redhat.com [10.5.11.25])	by mx1.redhat.com (8.14.4/8.14.4) with ESMTP id pA4APRqR028612	(version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK);	Fri, 4 Nov 2011 06:25:27 -0400
Received: from tyan-ft48-01.lab.bos.redhat.com (tyan-ft48-01.lab.bos.redhat.com [10.16.42.4])	by int-mx12.intmail.prod.int.phx2.redhat.com (8.14.4/8.14.4) with ESMTP id pA4APQ73026743	(version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO);	Fri, 4 Nov 2011 06:25:27 -0400
Received: from tyan-ft48-01.lab.bos.redhat.com (tyan-ft48-01.lab.bos.redhat.com [127.0.0.1])	by tyan-ft48-01.lab.bos.redhat.com (8.14.4/8.14.4) with ESMTP id pA4APQlD029302;	Fri, 4 Nov 2011 11:25:26 +0100
Received: (from jakub@localhost)	by tyan-ft48-01.lab.bos.redhat.com (8.14.4/8.14.4/Submit) id pA4APPCi029300;	Fri, 4 Nov 2011 11:25:25 +0100
Date: Fri, 04 Nov 2011 10:39:00 -0000
From: Jakub Jelinek <jakub@redhat.com>
To: Richard Guenther <rguenther@suse.de>
Cc: Ira Rosen <ira.rosen@linaro.org>, Richard Henderson <rth@redhat.com>,        gcc-patches@gcc.gnu.org
Subject: Re: Patch ping
Message-ID: <20111104102525.GH1052@tyan-ft48-01.lab.bos.redhat.com>
Reply-To: Jakub Jelinek <jakub@redhat.com>
References: <20111102201829.GT1052@tyan-ft48-01.lab.bos.redhat.com> <alpine.LNX.2.00.1111041043570.26779@zhemvz.fhfr.qr>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <alpine.LNX.2.00.1111041043570.26779@zhemvz.fhfr.qr>
User-Agent: Mutt/1.5.21 (2010-09-15)
X-IsSubscribed: yes
Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Id: <gcc-patches.gcc.gnu.org>
List-Archive: <http://gcc.gnu.org/ml/gcc-patches/>
List-Post: <mailto:gcc-patches@gcc.gnu.org>
List-Help: <mailto:gcc-patches-help@gcc.gnu.org>
Sender: gcc-patches-owner@gcc.gnu.org
X-SW-Source: 2011-11/txt/msg00519.txt.bz2

On Fri, Nov 04, 2011 at 10:52:44AM +0100, Richard Guenther wrote:
> > - Gather vectorization patch + incremental patches
> >   http://gcc.gnu.org/ml/gcc-patches/2011-10/msg02411.html
> 
> I'm not sure I like using new builtins for gather representation
> on the tree level too much, given that we are now moving
> towards using tree codes for suffle.  Thus, how complicated
> would it be to have a gather tree code and optab and to
> handle the mixed size index issue in the expander?
> 
> I realize this would be quite some reorg to the patchset ...
> so, why did you choose builtins over a more generic approach?

Because while permutations etc. are common to most targets,
currently gather is very specialized, specific to one target,
with lots of details how it must look like (e.g. the mask stuff where
we currently don't even have tree code for conditional loads or conditional
stores), and additionally unlike VEC_PERM_EXPR etc. which are normal
expressions this one is a reference (and conditional one too), so
I'm afraid I'd need to touch huge amounts of code (most places that
currently handle MEM_REF/TARGET_MEM_REF would need to handle
VEC_GATHER_MEM_REF too, as it is a memory read (well, set of conditional
memory reads).  The i?86 backend already has (except 4) all the needed
builtins anyway and handles expanding them too, the 4 ones are just
to cope with the weirdo definition of some of them (half sized vectors).
And when it is represented as builtin, the side-effects are handled by
all optimization passes automatically, similarly how e.g. atomic builtins
are right now builtins and not expressions.

So I thought it is better to use builtins right now, then when we in 4.8+
hopefully do something about conditional loads/stores and their
vectorization and if we introduce for that some new GIMPLE representation,
this could be done on top of that.

> >   http://gcc.gnu.org/ml/gcc-patches/2011-10/msg02846.html
> 
> +  if (TREE_CODE (base) == MEM_REF)
>      {
> -      off = TREE_OPERAND (base, 1);
> +      if (!integer_zerop (TREE_OPERAND (base, 1)))
> +	{
> +	  if (off == NULL_TREE)
> +	    {
> +	      double_int moff = mem_ref_offset (base);
> +	      off = double_int_to_tree (sizetype, moff);
> +	    }
> +	  else
> +	    off = size_binop (PLUS_EXPR, off, TREE_OPERAND (base, 1));
> 
> that's not safe, TREE_OPEAND (base, 1) is of pointer type, so
> you unconditionally need to do the conversion to sizetype of
> TREE_OPEAND (base, 1).

Ok, will fix.

> The routine lacks comments - it's got quite big and fails to

And add the comments.

> state any reason for its complexity.  I'm also not sure why
> DR would include any loop invariant parts of the SCEV - doesn't
> it instantiate just for the loop in question?

I'm not sure I understand your question.  With the incremental
patch I'm not using any DR info appart from DR_REF to determine
what is loop invariant part of the address and what is not.
The reason for that is that split_constant_offset turns the GIMPLE
code into a sometimes big tree, which actually may contain a mixture
of loop invariants/constants and SSA_NAMEs defined in the loop,
that all with casts, multiplications/additions/subtractions.
For gather I need to split it into a single loop invariant
argument (which can be computed before the loop as loop invariant
and thus can be arbitrary tree expression that is just gimplified
there) and another SSA_NAME defined into the loop which can be
vectorized which is perhaps sign-extended and multiplied by 1/2/4/8.

With the approach the incremental patch does I just
walk what split_constant_offset during DR walks and peel off
loop invariants until I have something that should be used as the
vectorized index.

	Jakub