public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
From: Richard Earnshaw <Richard.Earnshaw@foss.arm.com>
To: Ramana Radhakrishnan <ramana.gcc@googlemail.com>
Cc: gcc-patches <gcc-patches@gcc.gnu.org>,
	Richard Earnshaw <Richard.Earnshaw@arm.com>,
	Alex Coplan <alex.coplan@arm.com>
Subject: Re: [Patch Arm] Fix PR 92999
Date: Thu, 24 Nov 2022 16:12:15 +0000	[thread overview]
Message-ID: <585303f9-1f2f-7177-3029-9646dd19f63e@foss.arm.com> (raw)
In-Reply-To: <CAJA7tRYadtzpCG8ehT5DVvHR9_BPFh7ZURwftwdqzYhHS6XLtw@mail.gmail.com>



On 11/11/2022 21:50, Ramana Radhakrishnan via Gcc-patches wrote:
> On Thu, Nov 10, 2022 at 7:46 PM Ramana Radhakrishnan
> <ramana.gcc@googlemail.com> wrote:
>>
>> On Thu, Nov 10, 2022 at 6:03 PM Richard Earnshaw
>> <Richard.Earnshaw@foss.arm.com> wrote:
>>>
>>>
>>>
>>> On 10/11/2022 17:21, Richard Earnshaw via Gcc-patches wrote:
>>>>
>>>>
>>>> On 08/11/2022 18:20, Ramana Radhakrishnan via Gcc-patches wrote:
>>>>> PR92999 is a case where the VFP calling convention does not allocate
>>>>> enough FP registers for a homogenous aggregate containing FP16 values.
>>>>> I believe this is the complete fix but would appreciate another set of
>>>>> eyes on this.
>>>>>
>>>>> Could I get a hand with a regression test run on an armhf environment
>>>>> while I fix my environment ?
>>>>>
>>>>> gcc/ChangeLog:
>>>>>
>>>>> PR target/92999
>>>>> *  config/arm/arm.c (aapcs_vfp_allocate_return_reg): Adjust to handle
>>>>> aggregates with elements smaller than SFmode.
>>>>>
>>>>> gcc/testsuite/ChangeLog:
>>>>>
>>>>> * gcc.target/arm/pr92999.c: New test.
>>>>>
>>>>>
>>>>> Thanks,
>>>>> Ramana
>>>>>
>>>>> Signed-off-by: Ramana Radhakrishnan <ramana.gcc@gmail.com>
>>>>
>>>> I'm not sure about this.  The AAPCS does not mention a base type of a
>>>> half-precision FP type as an appropriate homogeneous aggregate for using
>>>> VFP registers for either calling or returning.
>>
>> Ooh interesting, thanks for taking a look and poking at the AAPCS and
>> that's a good catch. BF16 should also have the same behaviour as FP16
>> , I suspect ?
> 
> I suspect I got caught out by the definition of the Homogenous
> aggregate from Section 5.3.5
> ((https://github.com/ARM-software/abi-aa/blob/2982a9f3b512a5bfdc9e3fea5d3b298f9165c36b/aapcs32/aapcs32.rst#homogeneous-aggregates)
> which simply suggests it's an aggregate of fundamental types which
> lists half precision floating point .

A homogeneous aggregate is any aggregate that fits the general 
definition, but only HAs of specific types are of interest for the VFP 
PCS rules.

The problem we have is that when we added HFmode (and later BF16mode) 
support we didn't notice that the base types are VFP candidates, but the 
nested types (eg in records or arrays) are not.

The problems started around SVN r236269 (git:1b81a1c1bd53) when we added 
FP16 support.


> 
> FTR, ideally I should have read 7.1.2.1
> https://github.com/ARM-software/abi-aa/blob/2982a9f3b512a5bfdc9e3fea5d3b298f9165c36b/aapcs32/aapcs32.rst#procedure-calling)
> :)
> 
> 
> 
>>
>>>>
>>>> So perhaps the bug is that we try to treat this as a homogeneous
>>>> aggregate at all.
>>
>> Yep I agree - I'll take a look again tomorrow and see if I can get a fix.
>>
>> (And thanks Alex for the test run, I might trouble you again while I
>> still (slowly) get some of my boards back up)
> 
> 
> and as promised take 2. I'd really prefer another review on this one
> to see if I've not missed anything in the cases below.

I think I'd prefer to try and fix this at the point where we accept the 
base types, ie around:

     case REAL_TYPE:
       mode = TYPE_MODE (type);
       if (mode != DFmode && mode != SFmode && mode != HFmode && mode != 
BFmode)
         return -1;

by changing this to something like

     /* HFmode and BFmode can be passed in registers, but are not valid
        base types for an HFA, so only accept these if we are at the top
        level.  */
     if (!(mode == DFmode || mode == SFmode
           || (depth == 0
               && (mode == HFmode || mode == BFmode)))
        return -1;

and we then pass depth into the recursion calls as an extra parameter, 
starting at 0 for the top level and incrementing it by 1 each time 
aapcs_vfp_sub_candidate recurses.

For the test, would it be possible to rewrite it in the style of 
gcc.target/arm/aapcs/* and put it there? That would ensure that not only 
are the caller and callee compatible, but that the values are passed in 
the correct location.

R.


      parent reply	other threads:[~2022-11-24 16:12 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-11-08 18:20 Ramana Radhakrishnan
2022-11-09 18:42 ` Alex Coplan
2022-11-10 17:21 ` Richard Earnshaw
2022-11-10 18:03   ` Richard Earnshaw
2022-11-10 19:46     ` Ramana Radhakrishnan
2022-11-11 21:50       ` Ramana Radhakrishnan
2022-11-17 20:15         ` Ramana Radhakrishnan
2022-11-24 12:32           ` Ramana Radhakrishnan
2022-11-24 16:12         ` Richard Earnshaw [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=585303f9-1f2f-7177-3029-9646dd19f63e@foss.arm.com \
    --to=richard.earnshaw@foss.arm.com \
    --cc=Richard.Earnshaw@arm.com \
    --cc=alex.coplan@arm.com \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=ramana.gcc@googlemail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).