public inbox for binutils@sourceware.org
 help / color / mirror / Atom feed
From: Indu Bhagat <indu.bhagat@oracle.com>
To: Jens Remus <jremus@linux.ibm.com>, binutils@sourceware.org
Cc: Andreas Krebbel <krebbel@linux.ibm.com>
Subject: Re: [RFC PATCH 1/1] sframe: Represent FP without RA on stack
Date: Wed, 24 Apr 2024 23:59:16 -0700	[thread overview]
Message-ID: <1b6b9b74-4b50-46a3-9773-9483f7dfdc9d@oracle.com> (raw)
In-Reply-To: <16c9b3b9-5b65-4722-81b4-0578ffe3409c@linux.ibm.com>

On 4/23/24 08:44, Jens Remus wrote:
> Am 23.04.2024 um 01:58 schrieb Indu Bhagat:
>> On 4/22/24 08:58, Jens Remus wrote:
>>> If an architecture uses both SFrame RA and FP tracking SFrame assumes
>>> that the RA offset is the 2nd offset and the FP offset is the 3rd offset
>>> following the SFrame FRE. An architecture does not need to store both on
>>> the stack. SFrame cannot represent a FP without RA on stack, since it
>>> cannot distinguish whether the 2nd offset is the RA or FP offset.
>>>
>>> Use an invalid SFrame FRE RA offset value of zero as dummy padding to
>>> represent the FP being saved on the stack when the RA is not saved on
>>> the stack.
>>>
>>> include/
>>>     * sframe.h (SFRAME_FRE_RA_OFFSET_INVALID): New macro defining
>>>     the invalid RA offset value used to represent a dummy padding
>>>     offset.
>>>
>>> gas/
>>>     * gen-sframe.c (get_fre_num_offsets): Accommodate for dummy
>>>     padding RA offset if FP without RA on stack.
>>>     (sframe_get_fre_offset_size): Likewise.
>>>     (output_sframe_row_entry): Write a dummy padding RA offset
>>>     if FP without RA needs to be represented.
>>>
>>> libsframe/
>>>     * sframe-dump.c (dump_sframe_func_with_fres): Treat invalid RA
>>>     offsets as if they were undefined. Display them as "u*" to
>>>     distinguish them.
>>>
>>> Signed-off-by: Jens Remus <jremus@linux.ibm.com>
>>> ---
>>>
>>> Notes (jremus):
>>>      This patch eliminates 497 occurrences of the warning "skipping 
>>> SFrame
>>>      FDE due to FP without RA on stack" for a build of glibc on 
>>> s390x. For
>>>      libc.so this increases the number of FDEs by 166 and the number of
>>>      FREs by 861, while adding 337 dummy padding RA offsets. With a 
>>> total
>>>      of 28157 offsets the dummy padding offsets account for ~1.20 % 
>>> of the
>>>      offsets.
>>
>> While this increase seems small, it does look wasteful.
>>
>> An orthogonal question below...
>>
>>>      SFrame statistics without patch:
>>>          VALUE        TOTAL      MIN        MAX        AVG
>>>          FDEs:        3478       -          -          -
>>>          FREs/FDE:    14441      1          15         4
>>>          Offsets/FDE: 28157      1          31         8
>>>             8-bit:    0          0          0          0
>>>            16-bit:    28157      1          31         8
>>>            32-bit:    0          0          0          0
>>>          Offsets/FRE: 28157      1          3          1
>>>             8-bit:    -          0          0          0
>>>            16-bit:    -          1          3          1
>>>            32-bit:    -          0          0          0
>>>      SFrame statistics with patch applied:
>>>          VALUE        TOTAL      MIN        MAX        AVG
>>>          FDEs:        3644       -          -          -
>>>          FREs/FDE:    15302      1          20         4
>>>          Offsets/FDE: 29944      1          38         8
>>>             8-bit:    0          0          0          0
>>>            16-bit:    29944      1          38         8
>>>            32-bit:    0          0          0          0
>>>          Offsets/FRE: 29944      1          3          1
>>>             8-bit:    -          0          0          0
>>>            16-bit:    -          1          3          1
>>>            32-bit:    -          0          0          0
>>>          O_Padd/FDE:  337        -          -          0
>>>             8-bit:    0
>>>            16-bit:    337
>>>            32-bit:    0
>>>      Note that on s390x the offsets are at minimum 16-bits in size, 
>>> due to
>>>      the mandatory CFA offset being at least 160.
>>>
>>
>> IIUC, all stack layouts supported in the ABI use the offset 160. Is 
>> that right ? I am wondering if adjusting the stored offsets in the 
>> SFrame section (by decrementing 160 from it) will work ?
>>
>> If yes, we could encode this constant in SFrame aux hdr bytes for s390x.
> 
> Thank you for the hint! Using a constant adjustment of -160 on s390x for 
> the CFA offset from CFA base register should work to allow for 8-bit 
> offsets to be used. Aren't all tracked offsets (i.e. CFA, FP, and RA) 
> signed anyway? Thus applying a constant adjustment should work in any case?
> 

Yes, these offsets are signed.  Applying the constant should work for CFA.

> Couldn't it simply be an architecture specific constant in the code to 
> begin with? For example a new macro, which is only applied when defined?
> 
> #define S390_SFRAME_CFA_OFFSET_ADJUSTMENT -160
> #define SFRAME_CFA_OFFSET_ADJUSTMENT S390_SFRAME_CFA_OFFSET_ADJUSTMENT
> 

I think a macro should also work in this case.

> Implementing this in the SFrame auxiliary header would of course allow 
> to implement this enhancement at a later stage and to change the 
> adjustment value in the future, as the linker can then either reject or 
> merge different adjustment values during link editing.
> 
> I wonder whether it would make sense to store the FP register number in 
> the SFrame auxiliary header for s390x as well. Register 11 is just a 
> convention of the compilers and not defined by the ABI. That would 
> enable us to choose a different register as frame pointer in the future.
> 

I think the problem will remain that there is ATM no way to communicate 
this information to the assembler (that compiler used r11 as fp role). 
And even if there was a way, I am not so sure.  Since the ABI doesnt 
mandate r11 as fp, the compiler may pick another register for say a 
different stack layout etc, in the future ?  IOW, even it picking 
different fp register across functions is a possibility, no? So what is 
expected of the compiler then...



  reply	other threads:[~2024-04-25  6:59 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-04-22 15:58 [RFC PATCH 0/1] sframe: Represent FP without RA on stack (padding) Jens Remus
2024-04-22 15:58 ` [RFC PATCH 1/1] sframe: Represent FP without RA on stack Jens Remus
2024-04-22 23:58   ` Indu Bhagat
2024-04-23 15:44     ` Jens Remus
2024-04-25  6:59       ` Indu Bhagat [this message]
2024-04-22 15:59 [RFC PATCH 0/1] sframe: Represent FP without RA on stack (bitmap) Jens Remus
2024-04-22 15:59 ` [RFC PATCH 1/1] sframe: Represent FP without RA on stack Jens Remus

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1b6b9b74-4b50-46a3-9773-9483f7dfdc9d@oracle.com \
    --to=indu.bhagat@oracle.com \
    --cc=binutils@sourceware.org \
    --cc=jremus@linux.ibm.com \
    --cc=krebbel@linux.ibm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).