Re: [PATCH][ARM][RFC] PR target/65578 Fix gcc.dg/torture/stackalign/builtin-apply-4.c for single-precision fpus

public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed

From: "Richard Earnshaw (lists)" <Richard.Earnshaw@arm.com>
To: Kyrill Tkachov <kyrylo.tkachov@foss.arm.com>,
	GCC Patches <gcc-patches@gcc.gnu.org>
Cc: Ramana Radhakrishnan <ramana.radhakrishnan@arm.com>
Subject: Re: [PATCH][ARM][RFC] PR target/65578 Fix gcc.dg/torture/stackalign/builtin-apply-4.c for single-precision fpus
Date: Fri, 08 Apr 2016 10:10:00 -0000	[thread overview]
Message-ID: <57078394.8020603@arm.com> (raw)
In-Reply-To: <56BA2014.1020708@foss.arm.com>

On 09/02/16 17:21, Kyrill Tkachov wrote:
> Hi all,
> 
> In this wrong-code PR the builtin-apply-4.c test fails with -flto but
> only when targeting an fpu
> with only single-precision capabilities.
> 
> bar is a function returing a double. For non-LTO compilation the caller
> of bar reads the return value
> from it from the s0 and s1 VFP registers like expected, but for -flto
> the caller seems to expect the
> return value from the r0 and r1 regs.  The RTL dumps show that too.
> 
> Debugging the calls to arm_function_value show that in the -flto
> compilation the function bar is deemed
> to be a local function call and assigned the ARM_PCS_AAPCS_LOCAL PCS
> variant, whereas for the non-LTO (and non-breaking)
> compilation it uses the ARM_PCS_AAPCS_VFP variant.
> 
> Further down in use_vfp_abi when deciding whether to use VFP registers
> for the result there is a bit of
> logic that rejects VFP registers when handling the ARM_PCS_AAPCS_LOCAL
> variant with a double precision value
> on an FPU that is not TARGET_VFP_DOUBLE.
> 
> This seems wrong for ARM_PCS_AAPCS_LOCAL to me. ARM_PCS_AAPCS_LOCAL
> means that the function doesn't escape
> the translation unit and we can thus use whatever variant we want. From
> what I understand we want to use the
> VFP regs when possible for FP values.
> 

I'm not sure I agree with you here.  The problem is that a core with a
single-precision only FPU is required by the hard-float ABI to pass
double precision values in VFP registers, but it can do nothing useful
with them while they are in that register bank, so before any processing
can be done on a double precision value it has to copy it back to a core
register; copying values between register banks is often relatively
expensive.  Furthermore, if that value is used as a parameter or
function call result it has to put it back again.  That adds further to
the inefficiency.  Don't forget also that the EABI soft-float libcalls
don't follow the normal ABI rules and always take parameters and pass
results in the core registers.

So I think the idea here is that we can avoid some copying overhead when
we know that the function in question does not escape from the current
compilation unit.  If that's going wrong with LTO then there's a bug
somewhere in how we assess whether a function escapes.

R.

> So this patch removes that restriction and for the testcase the caller
> of bar correctly reads the return
> value of bar from the VFP registers and everything works.
> 
> This patch has been bootstrapped and tested on arm-none-linux-gnueabihf
> configured with --with-fpu=fpv4-sp-d16.
> The bootstrapped was performed with LTO.
> I didn't see any regressions.
> 
> It seems that this logic was put there in 2009 with r154034 as part of a
> large patch to enable support for half-precision
> floating point.
> 
> I'm not very familiar with this part of the code, so is this a safe
> patch to do?
> The patch should only ever change behaviour for single-precision-only
> fpus and only for static functions
> that don't get called outside their translation units (or during LTO I
> suppose) so there shouldn't
> be any ABI problems, I think.
> 
> Is this ok for trunk?
> 
> Thanks,
> Kyrill
> 
> 2016-02-09  Kyrylo Tkachov  <kyrylo.tkachov@arm.com>
> 
>     PR target/65578
>     * config/arm/arm.c (use_vfp_abi): Remove id_double argument.
>     Don't check for is_double and TARGET_VFP_DOUBLE.
>     (aapcs_vfp_is_call_or_return_candidate): Update callsite.
>     (aapcs_vfp_is_return_candidate): Likewise.
>     (aapcs_vfp_is_call_candidate): Likewise.
>     (aapcs_vfp_allocate_return_reg): Likewise.

     prev parent reply	other threads:[~2016-04-08 10:10 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-02-09 17:21 Kyrill Tkachov
2016-02-09 17:24 ` Kyrill Tkachov
2016-02-17 10:12   ` Kyrill Tkachov
2016-02-24 13:48     ` Kyrill Tkachov
2016-03-02 13:46       ` Kyrill Tkachov
2016-03-09  9:32         ` Kyrill Tkachov
2016-03-23 12:25 ` Ramana Radhakrishnan
2016-04-07 12:32   ` Kyrill Tkachov
2016-04-08 10:10 ` Richard Earnshaw (lists) [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=57078394.8020603@arm.com \
    --to=richard.earnshaw@arm.com \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=kyrylo.tkachov@foss.arm.com \
    --cc=ramana.radhakrishnan@arm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).