From: Bill Schmidt <wschmidt@linux.vnet.ibm.com>
To: Alan Lawrence
Cc: gcc-patches@gcc.gnu.org, Ramana Radhakrishnan, Richard Sandiford, Alan Hayward
Subject: Re: [PATCH] vectorizing conditional expressions (PR tree-optimization/65947)
Date: Mon, 14 Sep 2015 14:20:00 -0000
Message-ID: <1442239843.2896.10.camel@gnopaine>
In-Reply-To: <55F69799.8010901@arm.com>
References: <1441923254.4772.37.camel@oc8801110288.ibm.com> <1441977591.2795.11.camel@gnopaine> <55F69799.8010901@arm.com>

On Mon, 2015-09-14 at 10:47 +0100, Alan Lawrence wrote:
> On 11/09/15 14:19, Bill Schmidt wrote:
> >
> > A secondary concern for powerpc is that REDUC_MAX_EXPR produces a scalar
> > that has to be broadcast back to a vector, and the best way to implement
> > it for us already has the max value in all positions of a vector.
> > But
> > that is something we should be able to fix with simplify-rtx in the back
> > end.
>
> Reading this thread again, this bit stands out as unaddressed. Yes PowerPC can
> "fix" this with simplify-rtx, but the vector cost model will not take this into
> account - it will think that the broadcast-back-to-a-vector requires an extra
> operation after the reduction, whereas in fact it will not.
>
> Does that suggest we should have a new entry in vect_cost_for_stmt for
> vec_to_scalar-and-back-to-vector (that defaults to vec_to_scalar+scalar_to_vec,
> but on some architectures e.g. PowerPC would be the same as vec_to_scalar)?

Ideally I think we need to do something for that, yeah.  The back ends
could try to patch up the cost when finishing costs for the loop body,
epilogue, etc., but that would be somewhat of a guess; it would be
better to just be up-front that we're doing a reduction to a vector.

As part of this, I dislike the term "vec_to_scalar", which is somewhat
vague about what's going on (it sounds like it could mean a vector
extract operation, which is more of an inverse of "scalar_to_vec" than a
reduction is).  GIMPLE calls it a reduction, and the optabs call it a
reduction, so we ought to call it a reduction in the vectorizer cost
model, too.  To cover our bases for PowerPC and AArch32, we probably
need:

  plus_reduc_to_scalar
  plus_reduc_to_vector
  minmax_reduc_to_scalar
  minmax_reduc_to_vector

although I think plus_reduc_to_vector wouldn't be used yet, so it could
be omitted.  If we go this route, then at that time we would change your
code to use minmax_reduc_to_vector and let the back ends determine
whether that requires a scalar reduction followed by a broadcast, or
whether it can be performed directly.  Using direct reduction to vector
for MIN and MAX on PowerPC would be a big cost savings over scalar
reduction/broadcast.
Thanks,
Bill

> (I agree that if that's the limit of how "different" conditional reductions may
> be between architectures, then we should not have a vec_cost_for_stmt for a
> whole conditional reduction.)
>
> Cheers,
> Alan
>