From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <gcc-patches-return-422594-listarch-gcc-patches=gcc.gnu.org@gcc.gnu.org>
Received: (qmail 73931 invoked by alias); 2 Mar 2016 13:46:14 -0000
Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Id: <gcc-patches.gcc.gnu.org>
List-Archive: <http://gcc.gnu.org/ml/gcc-patches/>
List-Post: <mailto:gcc-patches@gcc.gnu.org>
List-Help: <mailto:gcc-patches-help@gcc.gnu.org>
Sender: gcc-patches-owner@gcc.gnu.org
Received: (qmail 73915 invoked by uid 89); 2 Mar 2016 13:46:13 -0000
Authentication-Results: sourceware.org; auth=none
X-Virus-Found: No
X-Spam-SWARE-Status: No, score=-0.9 required=5.0 tests=BAYES_00,KAM_LAZY_DOMAIN_SECURITY,RP_MATCHES_RCVD autolearn=no version=3.3.2 spammy=Ping, stackalign, letter
X-HELO: foss.arm.com
Received: from foss.arm.com (HELO foss.arm.com) (217.140.101.70) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Wed, 02 Mar 2016 13:46:12 +0000
Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.72.51.249])	by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 7167C49;	Wed,  2 Mar 2016 05:45:15 -0800 (PST)
Received: from [10.2.206.200] (e100706-lin.cambridge.arm.com [10.2.206.200])	by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id 3CC923F213;	Wed,  2 Mar 2016 05:46:10 -0800 (PST)
Message-ID: <56D6EEA0.7000107@foss.arm.com>
Date: Wed, 02 Mar 2016 13:46:00 -0000
From: Kyrill Tkachov <kyrylo.tkachov@foss.arm.com>
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Thunderbird/31.2.0
MIME-Version: 1.0
To: GCC Patches <gcc-patches@gcc.gnu.org>
CC: Ramana Radhakrishnan <ramana.radhakrishnan@arm.com>,  Richard Earnshaw <Richard.Earnshaw@arm.com>
Subject: Re: [PATCH][ARM][RFC] PR target/65578 Fix gcc.dg/torture/stackalign/builtin-apply-4.c for single-precision fpus
References: <56BA2014.1020708@foss.arm.com> <56BA20CF.5090108@foss.arm.com> <56C4479D.8010101@foss.arm.com> <56CDB49F.40109@foss.arm.com>
In-Reply-To: <56CDB49F.40109@foss.arm.com>
Content-Type: text/plain; charset=windows-1252; format=flowed
Content-Transfer-Encoding: 7bit
X-SW-Source: 2016-03/txt/msg00166.txt.bz2

Ping*3.

Thanks,
Kyrill
On 24/02/16 13:48, Kyrill Tkachov wrote:
> Ping*2
>
> Thanks,
> Kyrill
>
> On 17/02/16 10:12, Kyrill Tkachov wrote:
>> Ping.
>> https://gcc.gnu.org/ml/gcc-patches/2016-02/msg00634.html
>>
>> As mentioned before, this is actually a fix for PR target/69538.
>> I got confused when writing the cover letter and ChangeLog...
>>
>> Thanks,
>> Kyrill
>>
>> On 09/02/16 17:24, Kyrill Tkachov wrote:
>>>
>>> On 09/02/16 17:21, Kyrill Tkachov wrote:
>>>> Hi all,
>>>>
>>>> In this wrong-code PR the builtin-apply-4.c test fails with -flto but only when targeting an fpu
>>>> with only single-precision capabilities.
>>>>
>>>> bar is a function returning a double. For non-LTO compilation the caller of bar reads the return value
>>>> from the s0 and s1 VFP registers as expected, but with -flto the caller seems to expect the
>>>> return value in the r0 and r1 regs.  The RTL dumps show that too.
>>>>
>>>> Debugging the calls to arm_function_value shows that in the -flto compilation the function bar is deemed
>>>> to be a local function call and assigned the ARM_PCS_AAPCS_LOCAL PCS variant, whereas for the non-LTO (and non-breaking)
>>>> compilation it uses the ARM_PCS_AAPCS_VFP variant.
>>>>
>>>> Further down in use_vfp_abi when deciding whether to use VFP registers for the result there is a bit of
>>>> logic that rejects VFP registers when handling the ARM_PCS_AAPCS_LOCAL variant with a double precision value
>>>> on an FPU that is not TARGET_VFP_DOUBLE.
>>>>
>>>> This seems wrong for ARM_PCS_AAPCS_LOCAL to me. ARM_PCS_AAPCS_LOCAL means that the function doesn't escape
>>>> the translation unit and we can thus use whatever variant we want. From what I understand we want to use the
>>>> VFP regs when possible for FP values.
>>>>
>>>> So this patch removes that restriction and for the testcase the caller of bar correctly reads the return
>>>> value of bar from the VFP registers and everything works.
>>>>
>>>> This patch has been bootstrapped and tested on arm-none-linux-gnueabihf configured with --with-fpu=fpv4-sp-d16.
>>>> The bootstrap was performed with LTO.
>>>> I didn't see any regressions.
>>>>
>>>> It seems that this logic was put there in 2009 with r154034 as part of a large patch to enable support for half-precision
>>>> floating point.
>>>>
>>>> I'm not very familiar with this part of the code, so is this a safe patch to do?
>>>> The patch should only ever change behaviour for single-precision-only fpus and only for static functions
>>>> that don't get called outside their translation units (or during LTO I suppose) so there shouldn't
>>>> be any ABI problems, I think.
>>>>
>>>> Is this ok for trunk?
>>>>
>>>> Thanks,
>>>> Kyrill
>>>>
>>>
>>> Huh, I just realised I wrote completely the wrong PR number on this.
>>> The PR I'm talking about here is PR target/69538
>>>
>>> Sorry for the confusion.
>>>
>>> Kyrill
>>>
>>>
>>>> 2016-02-09 Kyrylo Tkachov <kyrylo.tkachov@arm.com>
>>>>
>>>>     PR target/65578
>>>>     * config/arm/arm.c (use_vfp_abi): Remove is_double argument.
>>>>     Don't check for is_double and TARGET_VFP_DOUBLE.
>>>>     (aapcs_vfp_is_call_or_return_candidate): Update callsite.
>>>>     (aapcs_vfp_is_return_candidate): Likewise.
>>>>     (aapcs_vfp_is_call_candidate): Likewise.
>>>>     (aapcs_vfp_allocate_return_reg): Likewise.
>>>
>>
>
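
For anyone reading the thread without the patch to hand, here is a simplified,
standalone C model of the decision described in the quoted message. It is only
a sketch, not the code from config/arm/arm.c: the enum, the fpu_model struct and
the use_vfp_regs_before/use_vfp_regs_after names are hypothetical, and it leaves
out any other target conditions the real use_vfp_abi may test. It only shows how
dropping the is_double/double-precision check changes the answer for a local
function returning a double on a single-precision-only FPU.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the PCS variants named in the thread. */
enum pcs_variant { PCS_AAPCS, PCS_AAPCS_VFP, PCS_AAPCS_LOCAL };

/* Hypothetical description of the target FPU. */
struct fpu_model {
  bool has_vfp;     /* VFP registers are available at all */
  bool has_double;  /* FPU supports double precision */
};

/* Model of the behaviour before the patch: a local function falls back
   to core registers for a double when the FPU is single-precision only. */
static bool
use_vfp_regs_before (enum pcs_variant pcs, bool is_double,
                     struct fpu_model fpu)
{
  if (pcs == PCS_AAPCS_VFP)
    return true;
  if (pcs != PCS_AAPCS_LOCAL)
    return false;
  return fpu.has_vfp && (fpu.has_double || !is_double);
}

/* Model of the behaviour after the patch: the is_double argument and the
   double-precision check are gone, so a local function uses VFP registers
   whenever they exist. */
static bool
use_vfp_regs_after (enum pcs_variant pcs, struct fpu_model fpu)
{
  if (pcs == PCS_AAPCS_VFP)
    return true;
  if (pcs != PCS_AAPCS_LOCAL)
    return false;
  return fpu.has_vfp;
}
```

In this model, a PCS_AAPCS_LOCAL function returning a double on an FPU with
has_vfp set and has_double clear gets core registers from the "before" function
and VFP registers from the "after" function, while the PCS_AAPCS_VFP and
PCS_AAPCS cases give the same answer in both.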