From mboxrd@z Thu Jan  1 00:00:00 1970
List-Id: <gcc-patches.gcc.gnu.org>
List-Archive: <http://gcc.gnu.org/ml/gcc-patches/>
List-Post: <mailto:gcc-patches@gcc.gnu.org>
List-Help: <mailto:gcc-patches-help@gcc.gnu.org>
Sender: gcc-patches-owner@gcc.gnu.org
Message-ID: <1442438462.10907.9.camel@gnopaine>
Subject: Re: [PATCH, rs6000] Add expansions for min/max vector reductions
From: Bill Schmidt <wschmidt@linux.vnet.ibm.com>
To: Alan Lawrence <alan.lawrence@arm.com>
Cc: "gcc-patches@gcc.gnu.org" <gcc-patches@gcc.gnu.org>,        "dje.gcc@gmail.com" <dje.gcc@gmail.com>,        "rguenther@suse.de" <rguenther@suse.de>,        Alan Hayward <Alan.Hayward@arm.com>,        "ramana.gcc@googlemail.com" <ramana.gcc@googlemail.com>
Date: Wed, 16 Sep 2015 22:04:00 -0000
In-Reply-To: <55F9A0D8.3020900@arm.com>
References: <1442413689.2896.45.camel@gnopaine> <55F98AD2.4080408@arm.com>	 <1442419857.10907.0.camel@gnopaine> <55F9A0D8.3020900@arm.com>
Content-Type: text/plain; charset="UTF-8"
Mime-Version: 1.0
Content-Transfer-Encoding: 7bit

On Wed, 2015-09-16 at 18:03 +0100, Alan Lawrence wrote:
> On 16/09/15 17:10, Bill Schmidt wrote:
> >
> > On Wed, 2015-09-16 at 16:29 +0100, Alan Lawrence wrote:
> >> On 16/09/15 15:28, Bill Schmidt wrote:
> >>> 2015-09-16  Bill Schmidt  <wschmidt@linux.vnet.ibm.com>
> >>>
> >>>           * config/rs6000/altivec.md (UNSPEC_REDUC_SMAX, UNSPEC_REDUC_SMIN,
> >>>           UNSPEC_REDUC_UMAX, UNSPEC_REDUC_UMIN, UNSPEC_REDUC_SMAX_SCAL,
> >>>           UNSPEC_REDUC_SMIN_SCAL, UNSPEC_REDUC_UMAX_SCAL,
> >>>           UNSPEC_REDUC_UMIN_SCAL): New enumerated constants.
> >>>           (reduc_smax_v2di): New define_expand.
> >>>           (reduc_smax_scal_v2di): Likewise.
> >>>           (reduc_smin_v2di): Likewise.
> >>>           (reduc_smin_scal_v2di): Likewise.
> >>>           (reduc_umax_v2di): Likewise.
> >>>           (reduc_umax_scal_v2di): Likewise.
> >>>           (reduc_umin_v2di): Likewise.
> >>>           (reduc_umin_scal_v2di): Likewise.
> >>>           (reduc_smax_v4si): Likewise.
> >>>           (reduc_smin_v4si): Likewise.
> >>>           (reduc_umax_v4si): Likewise.
> >>>           (reduc_umin_v4si): Likewise.
> >>>           (reduc_smax_v8hi): Likewise.
> >>>           (reduc_smin_v8hi): Likewise.
> >>>           (reduc_umax_v8hi): Likewise.
> >>>           (reduc_umin_v8hi): Likewise.
> >>>           (reduc_smax_v16qi): Likewise.
> >>>           (reduc_smin_v16qi): Likewise.
> >>>           (reduc_umax_v16qi): Likewise.
> >>>           (reduc_umin_v16qi): Likewise.
> >>>           (reduc_smax_scal_<mode>): Likewise.
> >>>           (reduc_smin_scal_<mode>): Likewise.
> >>>           (reduc_umax_scal_<mode>): Likewise.
> >>>           (reduc_umin_scal_<mode>): Likewise.
> >>
> >> You shouldn't need the non-_scal reductions. Indeed, they shouldn't be used if
> >> the _scal are present. The non-_scal's were previously defined as producing a
> >> vector with one element holding the result and the other elements all zero, and
> >> this was only ever used with a vec_extract immediately after; the _scal pattern
> >> now includes the vec_extract as well. Hence the non-_scal patterns are
> >> deprecated / considered legacy, as per md.texi.
> >
> > Thanks -- I had misread the description of the non-scalar versions,
> > missing the part where the other elements are zero.  What I really
> > want/need is an optab defined as computing the maximum value in all
> > elements of the vector.
> 
> Yes, indeed. It seems reasonable to me that this would coexist with an optab 
> which computes only a single value (i.e. a scalar).
> 
> At that point it might be appropriate to change the cond-reduction code to 
> generate the reduce-to-vector in all cases, and have optabs.c expand it to 
> reduce-to-scalar + broadcast if reduce-to-vector is not available. Along with 
> the (parallel) changes to the cost model already proposed, does that cover all 
> the cases? It does add a new tree code, yes, but I feel that could be justified 
> if we go down this route.

That's how I envisioned it as well, and it was my original preference,
provided the maintainers are OK with it.  However, your next suggestion
is intriguing...

> 
> However, another point that springs to mind: if you reduce a loop containing OR 
> or MUL expressions, vect_create_epilog_for_reduction reduces these using 
> shifts, and I think it will also use shifts on platforms that lack a 
> reduc_plus/min/max. If the shifts could be replaced with rotates, the code there 
> would do your reduce-to-a-vector-of-identical-elements in the midend...can we 
> (sensibly!) bring all of these together?

Perhaps so.  I can take a look at that.  What I'm calling a
rotate is really a double-vector shift where both input vectors are the
same, so perhaps this is already pretty close to what we need.
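
To make the shift-versus-rotate point concrete, here is a minimal scalar
C sketch of a log-step max reduction.  It is an illustration only, not
GCC code: the vec type, the lane count N, and the helper names
(shift_double, vmax, reduce_max) are all hypothetical, and the zero fill
for the plain shift is an assumption about how the vacated lanes are
treated.  The "rotate" is modelled as a double-vector shift with the same
vector supplied for both inputs.

```c
#define N 4  /* lanes per vector; assumed to be a power of two */

/* Hypothetical stand-in for a vector register. */
typedef struct { int lane[N]; } vec;

/* Lane-wise maximum of two vectors. */
static vec vmax(vec a, vec b)
{
    vec r;
    for (int i = 0; i < N; i++)
        r.lane[i] = (a.lane[i] > b.lane[i]) ? a.lane[i] : b.lane[i];
    return r;
}

/* Double-vector shift: return N consecutive lanes, starting at lane k,
   of the 2N-lane concatenation of lo (lanes 0..N-1) and hi (lanes
   N..2N-1).  Valid for 0 <= k <= N. */
static vec shift_double(vec lo, vec hi, int k)
{
    vec r;
    for (int i = 0; i < N; i++) {
        int j = i + k;
        r.lane[i] = (j < N) ? lo.lane[j] : hi.lane[j - N];
    }
    return r;
}

/* Log-step max reduction.  With use_rotate == 0 the upper input is all
   zeros (a plain whole-vector shift), so only lane 0 is guaranteed to
   hold the maximum.  With use_rotate != 0 the upper input is the vector
   itself, so every lane ends up holding the maximum. */
static vec reduce_max(vec v, int use_rotate)
{
    const vec zero = {{0}};
    for (int k = N / 2; k >= 1; k /= 2) {
        vec moved = shift_double(v, use_rotate ? v : zero, k);
        v = vmax(v, moved);
    }
    return v;
}
```

For the input {3, 9, -2, 7}, the shift form leaves the maximum only in
lane 0 (the other lanes hold partial results mixed with the fill value),
whereas the rotate form leaves 9 in all four lanes, which is the
vector-of-identical-elements result discussed above without a separate
extract and broadcast.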

Thanks!  I'll try to put together some sort of proposal over the next
week or so, workload permitting.

Bill

> 
> > Perhaps the practical thing is to have the vectorizer also do an
> > add_stmt_cost with some new token that indicates the cost model should
> > make an adjustment if the back end doesn't need the extract/broadcast.
> > Targets like PowerPC and AArch32 could then subtract the unnecessary
> > cost, and remove the unnecessary code in simplify-rtx.
> 
> I think it'd be good if we could do it before simplify-rtx, really, although I'm 
> not sure I have a strong argument as to why, as long as we can cost it 
> appropriately.
> 
> > In any case, I will remove implementing the deprecated optabs, and I'll
> > also try to look at Alan L's patch shortly.
> 
> That'd be great, thanks :)
> 
> Cheers, Alan
>