Date: Wed, 21 Sep 2016 07:13:00 -0000
From: Richard Biener
To: Jeff Law
cc: Tamar Christina, GCC Patches, "jakub@redhat.com", nd
Subject: Re: [PATCH] Optimise the fpclassify builtin to perform integer operations when possible
References: <6c97e2e2-7934-00e3-a4ea-ec94a8c26abc@redhat.com>

On Tue, 20 Sep 2016, Jeff Law wrote:

> On 09/20/2016 06:00 AM, Tamar Christina wrote:
> >
> >
> > On 16/09/16 20:49, Jeff Law wrote:
> > > On 09/12/2016 10:19 AM, Tamar Christina wrote:
> > > > Hi All,
> > > > +
> > > > +  /* Re-interpret the float as an unsigned integer type
> > > > +     with equal precision.  */
> > > > +  int_arg_type = build_nonstandard_integer_type (TYPE_PRECISION (type), 0);
> > > > +  int_arg = fold_build1_loc (loc, INDIRECT_REF, int_arg_type,
> > > > +              fold_build1_loc (loc, NOP_EXPR,
> > > > +                               build_pointer_type (int_arg_type),
> > > > +                fold_build1_loc (loc, ADDR_EXPR,
> > > > +                                 build_pointer_type (type), arg)));
> > > Doesn't this make ARG addressable?  Which in turn means ARG won't be
> > > exposed to the gimple/ssa optimizers.  Or is it the case that when
> > > fpclassify is used its argument is already in memory (and thus
> > > addressable?)
> > >
> > I believe that it is the case that when fpclassify is used the argument
> > is already addressable, but I am not 100% certain. I may be able to do
> > this differently so I'll come back to you on this one.
> The more I think about it, the more I suspect ARG is only going to already be
> marked as addressable if it has already had its address taken.

Sure, if it has its address taken ... but I don't see how fpclassify
requires the arg to be address taken.

> But I think we can look at this as an opportunity.  If ARG is already
> addressable, then it's most likely going to be living in memory (there are
> exceptions).  If ARG is most likely going to be living in memory, then we
> clearly want to use your fast integer path, regardless of the target.
>
> If ARG is not addressable, then it's not as clear as the object is likely
> going to be assigned into an FP register.  Integer operations on an FP
> register likely will force a sequence where we dump the register into memory,
> load from memory into a GPR, then bit test on the GPR.  That gets very
> expensive on some architectures.
>
> Could we defer lowering in the case where the object is not addressable until
> gimple->rtl expansion time?  That's the best time to introduce target
> dependencies into the code we generate.

Note that GIMPLE doesn't require something to be addressable just because
you access random pieces of it.
The IL has tricks like allowing MEM[&decl + CST] w/o actually marking
decl TREE_ADDRESSABLE (and the expanders trying to cope with that) and
there is of course BIT_FIELD_REF which you can use to extract arbitrary
bits off any entity without it living in memory (and again the expanders
trying to cope with that).

So may I suggest to move the "folding" from builtins.c to gimplify.c
and simply emit GIMPLE directly there?  That would make it also clearer
that we are dealing with a lowering process rather than a "folding".

Doing it in GIMPLE lowering is another possibility - we lower things
like posix_memalign and setjmp there as well.

Thanks,
Richard.
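
For readers following the thread, here is a minimal user-level sketch in
plain C of the "fast integer path" under discussion: reinterpret the bits
of a float as an unsigned integer and classify it with integer tests only.
It is not the patch and not GCC-internal code. The helper name
classify_float_bits is hypothetical, and the sketch assumes float is
IEEE 754 binary32 (1 sign bit, 8 exponent bits, 23 mantissa bits).

```c
#include <assert.h>
#include <math.h>
#include <stdint.h>
#include <string.h>

/* Assumption of this sketch: float and uint32_t have the same size.  */
static_assert (sizeof (float) == sizeof (uint32_t),
               "sketch assumes a 32-bit float");

/* Hypothetical helper, not taken from the patch: classify X using only
   integer operations on its bit pattern (IEEE 754 binary32 assumed).  */
static int
classify_float_bits (float x)
{
  uint32_t bits;

  /* Copy the object representation of X into an integer.  memcpy is
     used instead of casting the address of X to an integer pointer,
     which would violate the C aliasing rules.  */
  memcpy (&bits, &x, sizeof bits);

  uint32_t exp = (bits >> 23) & 0xffu;  /* 8-bit biased exponent.  */
  uint32_t mant = bits & 0x7fffffu;     /* 23-bit mantissa field.  */

  if (exp == 0xffu)                     /* All-ones exponent.  */
    return mant ? FP_NAN : FP_INFINITE;
  if (exp == 0)                         /* All-zeros exponent.  */
    return mant ? FP_SUBNORMAL : FP_ZERO;
  return FP_NORMAL;
}
```

The sign bit is ignored, so -0.0f classifies as FP_ZERO just like 0.0f.
Whether a sequence like this beats the floating-point comparisons depends
on how cheaply the target can get the value from an FP register into a
general-purpose register, which is the cost concern raised above.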