From mboxrd@z Thu Jan  1 00:00:00 1970
Date: Wed, 21 Sep 2016 07:13:00 -0000
From: Richard Biener <rguenther@suse.de>
To: Jeff Law <law@redhat.com>
cc: Tamar Christina <tamar.christina@arm.com>,     GCC Patches <gcc-patches@gcc.gnu.org>,     "jakub@redhat.com" <jakub@redhat.com>, nd <nd@arm.com>
Subject: Re: [PATCH] Optimise the fpclassify builtin to perform integer operations when possible
In-Reply-To: <c092ca3e-fef4-0d89-f527-17ce09e41de3@redhat.com>
Message-ID: <alpine.LSU.2.11.1609210906270.26629@t29.fhfr.qr>
References: <VI1PR0801MB2031BC0C70CCAD966A9B2933FFFF0@VI1PR0801MB2031.eurprd08.prod.outlook.com> <6c97e2e2-7934-00e3-a4ea-ec94a8c26abc@redhat.com> <a5a81ad7-12e4-e8f4-03aa-1ef21c536d5e@arm.com> <c092ca3e-fef4-0d89-f527-17ce09e41de3@redhat.com>
User-Agent: Alpine 2.11 (LSU 23 2013-08-11)
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
X-SW-Source: 2016-09/txt/msg01377.txt.bz2

On Tue, 20 Sep 2016, Jeff Law wrote:

> On 09/20/2016 06:00 AM, Tamar Christina wrote:
> > 
> > 
> > On 16/09/16 20:49, Jeff Law wrote:
> > > On 09/12/2016 10:19 AM, Tamar Christina wrote:
> > > > Hi All,
> > > > +
> > > > +      /* Re-interpret the float as an unsigned integer type
> > > > +     with equal precision.  */
> > > > +      int_arg_type = build_nonstandard_integer_type (TYPE_PRECISION (type), 0);
> > > > +      int_arg = fold_build1_loc (loc, INDIRECT_REF, int_arg_type,
> > > > +          fold_build1_loc (loc, NOP_EXPR,
> > > > +                   build_pointer_type (int_arg_type),
> > > > +            fold_build1_loc (loc, ADDR_EXPR,
> > > > +                     build_pointer_type (type), arg)));
> > > Doesn't this make ARG addressable?  Which in turn means ARG won't be
> > > exposed to the gimple/ssa optimizers.    Or is it the case that when
> > > fpclassify is used its argument is already in memory (and thus
> > > addressable?)
> > > 
> > I believe that it is the case that when fpclassify is used the argument
> > is already addressable, but I am not 100% certain. I may be able to do
> > this differently so I'll come back to you on this one.
> The more I think about it, the more I suspect ARG is only going to already be
> marked as addressable if it has already had its address taken.

Sure, if it has its address taken ... but I don't see how
fpclassify requires the arg to have its address taken.

> But I think we can look at this as an opportunity.  If ARG is already
> addressable, then it's most likely going to be living in memory (there are
> exceptions).  If ARG is most likely going to be living in memory, then we
> clearly want to use your fast integer path, regardless of the target.
> 
> If ARG is not addressable, then it's not as clear, as the object is likely
> going to be assigned to an FP register.  Integer operations on an FP
> register will likely force a sequence where we dump the register into memory,
> load from memory into a GPR, then bit test on the GPR.  That gets very
> expensive on some architectures.
> 
> Could we defer lowering in the case where the object is not addressable until
> gimple->rtl expansion time?  That's the best time to introduce target
> dependencies into the code we generate.

Note that GIMPLE doesn't require something to be addressable just because
you access arbitrary pieces of it.  The IL has tricks like allowing
MEM[&decl + CST] without actually marking decl TREE_ADDRESSABLE (with the
expanders trying to cope with that), and there is of course
BIT_FIELD_REF, which you can use to extract arbitrary bits from any
entity without it living in memory (again with the expanders trying to
cope with that).
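
As a source-level illustration only (this is not the patch and not GCC
internals), the integer classification being discussed can be sketched in
plain C.  The sketch assumes that double is IEEE 754 binary64; the name
classify_bits is hypothetical.  It reads the bits with memcpy instead of
the pointer cast used in the quoted hunk, which is the user-level analogue
of getting at the bits without taking the argument's address for good.

```c
#include <assert.h>
#include <math.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: classify X using integer operations only.
   Assumption: double is IEEE 754 binary64 (1 sign bit, 11 exponent
   bits, 52 mantissa bits).  */
static int
classify_bits (double x)
{
  uint64_t bits;

  /* The bit copy below is only valid if both types have the same size.  */
  assert (sizeof (double) == sizeof (uint64_t));

  /* Reinterpret the object representation without a pointer cast.  */
  memcpy (&bits, &x, sizeof bits);

  uint64_t exponent = (bits >> 52) & 0x7ff;
  uint64_t mantissa = bits & ((UINT64_C (1) << 52) - 1);

  if (exponent == 0x7ff)        /* All exponent bits set.  */
    return mantissa ? FP_NAN : FP_INFINITE;
  if (exponent == 0)            /* No exponent bits set.  */
    return mantissa ? FP_SUBNORMAL : FP_ZERO;
  return FP_NORMAL;
}
```

For any double x, classify_bits (x) should agree with fpclassify (x) from
<math.h> under the stated assumption.  Compilers commonly turn a
fixed-size memcpy like this into a register move, but whether that move
is cheap is exactly the target-dependent question raised above.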

So may I suggest moving the "folding" from builtins.c to gimplify.c
and simply emitting GIMPLE directly there?  That would also make it clearer
that we are dealing with a lowering process rather than a "folding".

Doing it in GIMPLE lowering is another possibility - we lower things
like posix_memalign and setjmp there as well.

Thanks,
Richard.