From mboxrd@z Thu Jan 1 00:00:00 1970
Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm
Message-ID: <1442419857.10907.0.camel@gnopaine>
Subject: Re: [PATCH, rs6000] Add expansions for min/max vector reductions
From: Bill Schmidt
To: Alan Lawrence
Cc: "gcc-patches@gcc.gnu.org", "dje.gcc@gmail.com", rguenther@suse.de, alan.hayward@arm.com, ramana.gcc@googlemail.com
Date: Wed, 16 Sep 2015 16:14:00 -0000
In-Reply-To: <55F98AD2.4080408@arm.com>
References: <1442413689.2896.45.camel@gnopaine> <55F98AD2.4080408@arm.com>
Content-Type: text/plain; charset="UTF-8"
Mime-Version: 1.0
Content-Transfer-Encoding: 7bit
X-SW-Source: 2015-09/txt/msg01204.txt.bz2

On Wed, 2015-09-16 at 16:29 +0100, Alan Lawrence wrote:
> On 16/09/15 15:28, Bill Schmidt wrote:
> > 2015-09-16 Bill Schmidt
> >
> > * config/rs6000/altivec.md (UNSPEC_REDUC_SMAX, UNSPEC_REDUC_SMIN,
> > UNSPEC_REDUC_UMAX, UNSPEC_REDUC_UMIN, UNSPEC_REDUC_SMAX_SCAL,
> > UNSPEC_REDUC_SMIN_SCAL, UNSPEC_REDUC_UMAX_SCAL,
> > UNSPEC_REDUC_UMIN_SCAL): New enumerated constants.
> > (reduc_smax_v2di): New define_expand.
> > (reduc_smax_scal_v2di): Likewise.
> > (reduc_smin_v2di): Likewise.
> > (reduc_smin_scal_v2di): Likewise.
> > (reduc_umax_v2di): Likewise.
> > (reduc_umax_scal_v2di): Likewise.
> > (reduc_umin_v2di): Likewise.
> > (reduc_umin_scal_v2di): Likewise.
> > (reduc_smax_v4si): Likewise.
> > (reduc_smin_v4si): Likewise.
> > (reduc_umax_v4si): Likewise.
> > (reduc_umin_v4si): Likewise.
> > (reduc_smax_v8hi): Likewise.
> > (reduc_smin_v8hi): Likewise.
> > (reduc_umax_v8hi): Likewise.
> > (reduc_umin_v8hi): Likewise.
> > (reduc_smax_v16qi): Likewise.
> > (reduc_smin_v16qi): Likewise.
> > (reduc_umax_v16qi): Likewise.
> > (reduc_umin_v16qi): Likewise.
> > (reduc_smax_scal_): Likewise.
> > (reduc_smin_scal_): Likewise.
> > (reduc_umax_scal_): Likewise.
> > (reduc_umin_scal_): Likewise.
> >
> You shouldn't need the non-_scal reductions. Indeed, they shouldn't be used if
> the _scal are present. The non-_scal's were previously defined as producing a
> vector with one element holding the result and the other elements all zero, and
> this was only ever used with a vec_extract immediately after; the _scal pattern
> now includes the vec_extract as well. Hence the non-_scal patterns are
> deprecated / considered legacy, as per md.texi.

Thanks -- I had misread the description of the non-scalar versions,
missing the part where the other elements are zero. What I really
want/need is an optab defined as computing the maximum value in all
elements of the vector.

This seems like a strange thing to want, but Alan Hayward's proposed
patch will cause us to generate the scalar version, followed by a
broadcast of the vector. Since our patterns already generate the
maximum value in all positions, this creates an unnecessary extract
followed by an unnecessary broadcast.

As discussed elsewhere, we *could* remove the unnecessary code by
recognizing this in simplify-rtx, etc., but the vectorization cost
modeling would be wrong.
It would still tell us to model this as a vec_to_scalar for the
reduc_max_scal, and a vec_stmt for the broadcast. This would overcount
the cost of the reduction compared to what we would actually generate.

To get this right for all targets, one could envision having a new optab
for a reduction-to-vector, which most targets wouldn't implement, but
PowerPC and AArch32, at least, would. If a target has a
reduction-to-vector, the vectorizer would have to generate a different
GIMPLE code that mapped to this; otherwise it would do the
REDUC_MAX_EXPR and the broadcast. This obviously starts to get
complicated, since adding a GIMPLE code certainly has a nontrivial
cost. :/

Perhaps the practical thing is to have the vectorizer also call
add_stmt_cost with some new token that indicates the cost model should
make an adjustment if the back end doesn't need the extract/broadcast.
Targets like PowerPC and AArch32 could then subtract the unnecessary
cost, and remove the unnecessary code in simplify-rtx.

Copying Richi and ARM folks for opinions on the best design. I want to
be able to model this stuff as accurately as possible, but obviously we
need to avoid unnecessary effects on other architectures.

In any case, I will drop the implementations of the deprecated optabs,
and I'll also try to look at Alan L's patch shortly.

Thanks,
Bill

>
> I proposed a patch to migrate PPC off the old patterns, but have forgotten to
> ping it recently - last at
> https://gcc.gnu.org/ml/gcc-patches/2014-12/msg01024.html ... (ping?!)
>
> --Alan
>
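
The redundancy described in the reply (a reduction that already leaves the
maximum in every lane, followed by an extract and a broadcast) can be
illustrated with a small scalar model in C. This is only a sketch under
stated assumptions: the vec4 type and the helpers vmax, vrot,
reduc_max_to_vector, reduc_max_scal, broadcast and extract_then_broadcast
are hypothetical names invented for illustration, and the rotate-and-max
sequence is a generic butterfly reduction, not the actual rs6000 expansion
or any GCC internal API.

```c
#include <assert.h>

#define LANES 4

/* Hypothetical scalar model of a 4-lane integer vector (not a GCC type). */
typedef struct { int lane[LANES]; } vec4;

/* Elementwise maximum of two vectors. */
static vec4 vmax(vec4 a, vec4 b)
{
    vec4 r;
    for (int i = 0; i < LANES; i++)
        r.lane[i] = a.lane[i] > b.lane[i] ? a.lane[i] : b.lane[i];
    return r;
}

/* Rotate lanes left by n positions. */
static vec4 vrot(vec4 a, int n)
{
    vec4 r;
    for (int i = 0; i < LANES; i++)
        r.lane[i] = a.lane[(i + n) % LANES];
    return r;
}

/* Assumed "reduction-to-vector": log2(LANES) rotate+max steps leave the
   maximum of the input in every lane of the result. */
static vec4 reduc_max_to_vector(vec4 v)
{
    for (int shift = LANES / 2; shift >= 1; shift /= 2)
        v = vmax(v, vrot(v, shift));
    return v;
}

/* Scalar reduction modeled as the vector reduction plus an extract. */
static int reduc_max_scal(vec4 v)
{
    return reduc_max_to_vector(v).lane[0];
}

/* Broadcast a scalar into every lane. */
static vec4 broadcast(int x)
{
    vec4 r;
    for (int i = 0; i < LANES; i++)
        r.lane[i] = x;
    return r;
}

/* The sequence under discussion: reduce to scalar, then broadcast. */
static vec4 extract_then_broadcast(vec4 v)
{
    return broadcast(reduc_max_scal(v));
}

/* Check that the extract and broadcast add nothing on such a target. */
static void check_same(vec4 v)
{
    vec4 a = reduc_max_to_vector(v);
    vec4 b = extract_then_broadcast(v);
    for (int i = 0; i < LANES; i++)
        assert(a.lane[i] == b.lane[i]);
}
```

In this model, extract_then_broadcast produces exactly what
reduc_max_to_vector already holds, so the extra two steps are pure
overhead; a cost model that charges for both of them would overcount in
the way the reply describes.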