Subject: Re: [PATCH, rs6000] Add expansions for min/max vector reductions
To: Bill Schmidt
References: <1442413689.2896.45.camel@gnopaine> <55F98AD2.4080408@arm.com> <1442419857.10907.0.camel@gnopaine>
Cc: "gcc-patches@gcc.gnu.org", "dje.gcc@gmail.com", "rguenther@suse.de", Alan Hayward, "ramana.gcc@googlemail.com"
From: Alan Lawrence
Message-ID: <55F9A0D8.3020900@arm.com>
Date: Wed, 16 Sep 2015 17:09:00 -0000
In-Reply-To: <1442419857.10907.0.camel@gnopaine>

On 16/09/15 17:10, Bill Schmidt wrote:
>
> On Wed, 2015-09-16 at 16:29 +0100, Alan Lawrence wrote:
>> On 16/09/15 15:28, Bill Schmidt wrote:
>>> 2015-09-16 Bill Schmidt
>>>
>>> * config/rs6000/altivec.md (UNSPEC_REDUC_SMAX, UNSPEC_REDUC_SMIN,
>>> UNSPEC_REDUC_UMAX, UNSPEC_REDUC_UMIN, UNSPEC_REDUC_SMAX_SCAL,
>>> UNSPEC_REDUC_SMIN_SCAL, UNSPEC_REDUC_UMAX_SCAL,
>>> UNSPEC_REDUC_UMIN_SCAL): New enumerated constants.
>>> (reduc_smax_v2di): New define_expand.
>>> (reduc_smax_scal_v2di): Likewise.
>>> (reduc_smin_v2di): Likewise.
>>> (reduc_smin_scal_v2di): Likewise.
>>> (reduc_umax_v2di): Likewise.
>>> (reduc_umax_scal_v2di): Likewise.
>>> (reduc_umin_v2di): Likewise.
>>> (reduc_umin_scal_v2di): Likewise.
>>> (reduc_smax_v4si): Likewise.
>>> (reduc_smin_v4si): Likewise.
>>> (reduc_umax_v4si): Likewise.
>>> (reduc_umin_v4si): Likewise.
>>> (reduc_smax_v8hi): Likewise.
>>> (reduc_smin_v8hi): Likewise.
>>> (reduc_umax_v8hi): Likewise.
>>> (reduc_umin_v8hi): Likewise.
>>> (reduc_smax_v16qi): Likewise.
>>> (reduc_smin_v16qi): Likewise.
>>> (reduc_umax_v16qi): Likewise.
>>> (reduc_umin_v16qi): Likewise.
>>> (reduc_smax_scal_): Likewise.
>>> (reduc_smin_scal_): Likewise.
>>> (reduc_umax_scal_): Likewise.
>>> (reduc_umin_scal_): Likewise.
>>
>> You shouldn't need the non-_scal reductions. Indeed, they shouldn't be used if
>> the _scal are present. The non-_scal's were previously defined as producing a
>> vector with one element holding the result and the other elements all zero, and
>> this was only ever used with a vec_extract immediately after; the _scal pattern
>> now includes the vec_extract as well. Hence the non-_scal patterns are
>> deprecated / considered legacy, as per md.texi.
>
> Thanks -- I had misread the description of the non-scalar versions,
> missing the part where the other elements are zero. What I really
> want/need is an optab defined as computing the maximum value in all
> elements of the vector.

Yes, indeed. It seems reasonable to me that this would coexist with an optab
which computes only a single value (i.e. a scalar).
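For concreteness, the three behaviours being discussed (reduce-to-scalar, the legacy one-element-plus-zeros form, and reduce-to-vector) can be modelled in plain C roughly as below. This is only a sketch: the function names are hypothetical and merely echo the optab names, the vector is modelled as an array, and the choice of element 0 for the legacy form is an assumption of this sketch, not something taken from md.texi.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Reduce-to-scalar, as described for the _scal patterns: the maximum
   comes back as a single scalar (the vec_extract is folded in). */
static int32_t reduc_smax_scal(const int32_t *v, size_t n)
{
    assert(n > 0);
    int32_t m = v[0];
    for (size_t i = 1; i < n; i++)
        if (v[i] > m)
            m = v[i];
    return m;
}

/* Legacy non-_scal behaviour as described in the thread: one element
   holds the result and all other elements are zero.
   ASSUMPTION: element 0 is used here purely for illustration. */
static void reduc_smax_legacy(const int32_t *v, size_t n, int32_t *out)
{
    int32_t m = reduc_smax_scal(v, n);   /* compute first: out may alias v */
    for (size_t i = 0; i < n; i++)
        out[i] = 0;
    out[0] = m;
}

/* Reduce-to-vector: every element holds the maximum. Modelled here as
   reduce-to-scalar followed by a broadcast, i.e. the fallback shape
   suggested for targets without a native reduce-to-vector. */
static void reduc_smax_all(const int32_t *v, size_t n, int32_t *out)
{
    int32_t m = reduc_smax_scal(v, n);
    for (size_t i = 0; i < n; i++)
        out[i] = m;
}
```

The point of writing the third form as scalar plus broadcast is that a target with a native all-elements reduction would not need either step, which is where the cost-model question comes from.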
At that point it might be appropriate to change the cond-reduction code to
generate the reduce-to-vector in all cases, and optabs.c expand it to
reduce-to-scalar + broadcast if reduce-to-vector was not available. Along with
the (parallel) changes to cost model already proposed, does that cover all the
cases? It does add a new tree code, yes, but I'm feeling that could be justified
if we go down this route.

However, another point that springs to mind: if you reduce a loop containing OR
or MUL expressions, the vect_create_epilog_for_reduction reduces these using
shifts, and I think will also use shifts for platforms not possessing a
reduc_plus/min/max. If shifts could be changed for rotates, the code there would
do your reduce-to-a-vector-of-identical-elements in the midend...can we
(sensibly!) bring all of these together?

> Perhaps the practical thing is to have the vectorizer also do an
> add_stmt_cost with some new token that indicates the cost model should
> make an adjustment if the back end doesn't need the extract/broadcast.
> Targets like PowerPC and AArch32 could then subtract the unnecessary
> cost, and remove the unnecessary code in simplify-rtx.

I think it'd be good if we could do it before simplify-rtx, really, although I'm
not sure I have a strong argument as to why, as long as we can cost it
appropriately.

> In any case, I will remove implementing the deprecated optabs, and I'll
> also try to look at Alan L's patch shortly.

That'd be great, thanks :)

Cheers, Alan
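As a rough illustration of the shifts-versus-rotates point above, here is a simplified C model of a log-step OR reduction over a 4-element vector. It is not the code in vect_create_epilog_for_reduction; the function names, the lane count and the shift direction are all assumptions made for this sketch.

```c
#include <assert.h>
#include <stdint.h>

#define LANES 4
static_assert((LANES & (LANES - 1)) == 0, "LANES must be a power of two");

/* Log-step reduction using whole-vector shifts with zero fill.
   ASSUMPTION: lane i receives lane i + k. After the loop only lane 0
   is guaranteed to hold the OR of all lanes. */
static void reduce_or_shift(uint32_t v[LANES])
{
    for (int k = LANES / 2; k >= 1; k /= 2) {
        uint32_t shifted[LANES];
        for (int i = 0; i < LANES; i++)
            shifted[i] = (i + k < LANES) ? v[i + k] : 0;
        for (int i = 0; i < LANES; i++)
            v[i] |= shifted[i];
    }
}

/* Same steps with a rotate instead of a shift: lanes wrap around, so
   after the loop every lane holds the OR of all lanes. */
static void reduce_or_rotate(uint32_t v[LANES])
{
    for (int k = LANES / 2; k >= 1; k /= 2) {
        uint32_t rotated[LANES];
        for (int i = 0; i < LANES; i++)
            rotated[i] = v[(i + k) % LANES];
        for (int i = 0; i < LANES; i++)
            v[i] |= rotated[i];
    }
}
```

Starting from {0x1, 0x2, 0x4, 0x8}, the shift version leaves {0xF, 0xE, 0xC, 0x8}, so a vec_extract of lane 0 is still needed, while the rotate version leaves 0xF in every lane with the same number of steps.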