* [RFC PATCH 0/1] sframe: Represent FP without RA on stack (padding) @ 2024-04-22 15:58 Jens Remus 2024-04-22 15:58 ` [RFC PATCH 1/1] sframe: Represent FP without RA on stack Jens Remus 0 siblings, 1 reply; 5+ messages in thread From: Jens Remus @ 2024-04-22 15:58 UTC (permalink / raw) To: binutils, Indu Bhagat; +Cc: Jens Remus, Andreas Krebbel This patch series adds support in SFrame to represent the frame pointer (FP) without the return address (RA) being saved on the stack (and/or on s390x in another register). This is the first of two proposed alternatives: 1. This patch series uses a dummy padding offset (invalid offset from CFA value of zero) as RA offset to represent FP without RA on saved on the stack. 2. The alternative patch series changes the SFrame FRE count field into a bitmap, to convey which offsets follow the FRE. Note that it currently applies on top of my v3 patch series series that adds initial support to generate .sframe from CFI directives on s390x, although it is independent of that. The use of padding offsets has the benefit that it is a minor change to the SFrame V2 format. The downside is that it adds some (but apparently only minimal) bloat to the .sframe information. Also a value of zero might not be an invalid offset from CFA on all architectures or in all use cases (e.g. CFI in glibc longjmp() on some architectures defines the jump buffer pointer register as CFA base for unwinders to restore the jump target registers from (as if the return would be to the jump target)). A test build of glibc on s390x with this patch series applied shows the following changes for libc.so: The number of FDEs increases by 166 and the number of FREs increases by 861, while adding 337 dummy padding RA offsets. With a total of 28157 offsets the dummy padding offsets account for ~1.20 % of the offsets. Thanks and regards, Jens Jens Remus (1): sframe: Represent FP without RA on stack gas/gen-sframe.c | 50 +++++++++++++++++++---------------------- include/sframe.h | 9 ++++++-- libsframe/sframe-dump.c | 4 ++++ 3 files changed, 34 insertions(+), 29 deletions(-) -- 2.40.1 ^ permalink raw reply [flat|nested] 5+ messages in thread
* [RFC PATCH 1/1] sframe: Represent FP without RA on stack 2024-04-22 15:58 [RFC PATCH 0/1] sframe: Represent FP without RA on stack (padding) Jens Remus @ 2024-04-22 15:58 ` Jens Remus 2024-04-22 23:58 ` Indu Bhagat 0 siblings, 1 reply; 5+ messages in thread From: Jens Remus @ 2024-04-22 15:58 UTC (permalink / raw) To: binutils, Indu Bhagat; +Cc: Jens Remus, Andreas Krebbel If an architecture uses both SFrame RA and FP tracking SFrame assumes that the RA offset is the 2nd offset and the FP offset is the 3rd offset following the SFrame FRE. An architecture does not need to store both on the stack. SFrame cannot represent a FP without RA on stack, since it cannot distinguish whether the 2nd offset is the RA or FP offset. Use an invalid SFrame FRE RA offset value of zero as dummy padding to represent the FP being saved on the stack when the RA is not saved on the stack. include/ * sframe.h (SFRAME_FRE_RA_OFFSET_INVALID): New macro defining the invalid RA offset value used to represent a dummy padding offset. gas/ * gen-sframe.c (get_fre_num_offsets): Accommodate for dummy padding RA offset if FP without RA on stack. (sframe_get_fre_offset_size): Likewise. (output_sframe_row_entry): Write a dummy padding RA offset if FP without RA needs to be represented. libsframe/ * sframe-dump.c (dump_sframe_func_with_fres): Treat invalid RA offsets as if they were undefined. Display them as "u*" to distinguish them. Signed-off-by: Jens Remus <jremus@linux.ibm.com> --- Notes (jremus): This patch eliminates 497 occurrences of the warning "skipping SFrame FDE due to FP without RA on stack" for a build of glibc on s390x. For libc.so this increases the number of FDEs by 166 and the number of FREs by 861, while adding 337 dummy padding RA offsets. With a total of 28157 offsets the dummy padding offsets account for ~1.20 % of the offsets. SFrame statistics without patch: VALUE TOTAL MIN MAX AVG FDEs: 3478 - - - FREs/FDE: 14441 1 15 4 Offsets/FDE: 28157 1 31 8 8-bit: 0 0 0 0 16-bit: 28157 1 31 8 32-bit: 0 0 0 0 Offsets/FRE: 28157 1 3 1 8-bit: - 0 0 0 16-bit: - 1 3 1 32-bit: - 0 0 0 SFrame statistics with patch applied: VALUE TOTAL MIN MAX AVG FDEs: 3644 - - - FREs/FDE: 15302 1 20 4 Offsets/FDE: 29944 1 38 8 8-bit: 0 0 0 0 16-bit: 29944 1 38 8 32-bit: 0 0 0 0 Offsets/FRE: 29944 1 3 1 8-bit: - 0 0 0 16-bit: - 1 3 1 32-bit: - 0 0 0 O_Padd/FDE: 337 - - 0 8-bit: 0 16-bit: 337 32-bit: 0 Note that on s390x the offsets are at minimum 16-bits in size, due to the mandatory CFA offset being at least 160. gas/gen-sframe.c | 50 +++++++++++++++++++---------------------- include/sframe.h | 9 ++++++-- libsframe/sframe-dump.c | 4 ++++ 3 files changed, 34 insertions(+), 29 deletions(-) diff --git a/gas/gen-sframe.c b/gas/gen-sframe.c index 4cc86eb6c815..990b08d87953 100644 --- a/gas/gen-sframe.c +++ b/gas/gen-sframe.c @@ -347,7 +347,9 @@ get_fre_num_offsets (struct sframe_row_entry *sframe_fre) fre_num_offsets++; #ifdef SFRAME_FRE_RA_TRACKING if (sframe_ra_tracking_p () - && sframe_fre->ra_loc == SFRAME_FRE_ELEM_LOC_STACK) + && (sframe_fre->ra_loc == SFRAME_FRE_ELEM_LOC_STACK + /* Accommodate for padding RA offset if FP without RA on stack. */ + || sframe_fre->bp_loc == SFRAME_FRE_ELEM_LOC_STACK)) fre_num_offsets++; #endif return fre_num_offsets; @@ -371,9 +373,14 @@ sframe_get_fre_offset_size (struct sframe_row_entry *sframe_fre) if (sframe_fre->bp_loc == SFRAME_FRE_ELEM_LOC_STACK) bp_offset_size = get_offset_size_in_bytes (sframe_fre->bp_offset); #ifdef SFRAME_FRE_RA_TRACKING - if (sframe_ra_tracking_p () - && sframe_fre->ra_loc == SFRAME_FRE_ELEM_LOC_STACK) - ra_offset_size = get_offset_size_in_bytes (sframe_fre->ra_offset); + if (sframe_ra_tracking_p ()) + { + if (sframe_fre->ra_loc == SFRAME_FRE_ELEM_LOC_STACK) + ra_offset_size = get_offset_size_in_bytes (sframe_fre->ra_offset); + /* Accommodate for padding RA offset if FP without RA on stack. */ + else if (sframe_fre->bp_loc == SFRAME_FRE_ELEM_LOC_STACK) + ra_offset_size = get_offset_size_in_bytes (SFRAME_FRE_RA_OFFSET_INVALID); + } #endif /* Get the maximum size needed to represent the offsets. */ @@ -537,11 +544,19 @@ output_sframe_row_entry (symbolS *fde_start_addr, fre_write_offsets++; #ifdef SFRAME_FRE_RA_TRACKING - if (sframe_ra_tracking_p () - && sframe_fre->ra_loc == SFRAME_FRE_ELEM_LOC_STACK) + if (sframe_ra_tracking_p ()) { - fre_offset_func_map[idx].out_func (sframe_fre->ra_offset); - fre_write_offsets++; + if (sframe_fre->ra_loc == SFRAME_FRE_ELEM_LOC_STACK) + { + fre_offset_func_map[idx].out_func (sframe_fre->ra_offset); + fre_write_offsets++; + } + /* Write padding RA offset if FP without RA on stack. */ + else if (sframe_fre->bp_loc == SFRAME_FRE_ELEM_LOC_STACK) + { + fre_offset_func_map[idx].out_func (SFRAME_FRE_RA_OFFSET_INVALID); + fre_write_offsets++; + } } #endif if (sframe_fre->bp_loc == SFRAME_FRE_ELEM_LOC_STACK) @@ -1497,25 +1512,6 @@ sframe_do_fde (struct sframe_xlate_ctx *xlate_ctx, = get_dw_fde_end_addrS (xlate_ctx->dw_fde); } -#ifdef SFRAME_FRE_RA_TRACKING - if (sframe_ra_tracking_p ()) - { - struct sframe_row_entry *fre; - - /* Iterate over the scratchpad FREs and validate them. */ - for (fre = xlate_ctx->first_fre; fre; fre = fre->next) - { - /* SFrame format cannot represent FP on stack without RA on stack. */ - if (fre->ra_loc != SFRAME_FRE_ELEM_LOC_STACK - && fre->bp_loc == SFRAME_FRE_ELEM_LOC_STACK) - { - as_warn (_("skipping SFrame FDE due to FP without RA on stack")); - return SFRAME_XLATE_ERR_NOTREPRESENTED; - } - } - } -#endif /* SFRAME_FRE_RA_TRACKING */ - return SFRAME_XLATE_OK; } diff --git a/include/sframe.h b/include/sframe.h index 90bc92a32f84..d1a26875b3e2 100644 --- a/include/sframe.h +++ b/include/sframe.h @@ -237,6 +237,9 @@ typedef struct sframe_func_desc_entry may or may not be tracked. */ #define SFRAME_FRE_FP_OFFSET_IDX 2 +/* Invalid RA offset. Used as padding to represent FP without RA on stack. */ +#define SFRAME_FRE_RA_OFFSET_INVALID 0 + typedef struct sframe_fre_info { /* Information about @@ -288,9 +291,11 @@ typedef struct sframe_fre_info offset1 (interpreted as CFA = BASE_REG + offset1) if RA is being tracked - offset2 (interpreted as RA = CFA + offset2) + offset2 (interpreted as RA = CFA + offset2; an offset value of + SFRAME_FRE_RA_OFFSET_INVALID indicates a dummy padding RA offset + to represent FP without RA saved on stack) if FP is being tracked - offset3 (intrepreted as FP = CFA + offset2) + offset3 (intrepreted as FP = CFA + offset3) fi else if FP is being tracked diff --git a/libsframe/sframe-dump.c b/libsframe/sframe-dump.c index 40ea531314ba..3ea4bc327efd 100644 --- a/libsframe/sframe-dump.c +++ b/libsframe/sframe-dump.c @@ -199,6 +199,10 @@ dump_sframe_func_with_fres (sframe_decoder_ctx *sfd_ctx, if (sframe_decoder_get_fixed_ra_offset (sfd_ctx) != SFRAME_CFA_FIXED_RA_INVALID) strcpy (temp, "f"); + /* If an ABI does track RA offset, e.g. AArch64 and S390, it can be a + dummy as padding to represent FP without RA being saved on stack. */ + else if (err[2] == 0 && ra_offset == SFRAME_FRE_RA_OFFSET_INVALID) + sprintf (temp, "u*"); else if (err[2] == 0) { if (is_sframe_abi_arch_s390 (sfd_ctx) && (ra_offset & 1)) -- 2.40.1 ^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [RFC PATCH 1/1] sframe: Represent FP without RA on stack 2024-04-22 15:58 ` [RFC PATCH 1/1] sframe: Represent FP without RA on stack Jens Remus @ 2024-04-22 23:58 ` Indu Bhagat 2024-04-23 15:44 ` Jens Remus 0 siblings, 1 reply; 5+ messages in thread From: Indu Bhagat @ 2024-04-22 23:58 UTC (permalink / raw) To: Jens Remus, binutils; +Cc: Andreas Krebbel On 4/22/24 08:58, Jens Remus wrote: > If an architecture uses both SFrame RA and FP tracking SFrame assumes > that the RA offset is the 2nd offset and the FP offset is the 3rd offset > following the SFrame FRE. An architecture does not need to store both on > the stack. SFrame cannot represent a FP without RA on stack, since it > cannot distinguish whether the 2nd offset is the RA or FP offset. > > Use an invalid SFrame FRE RA offset value of zero as dummy padding to > represent the FP being saved on the stack when the RA is not saved on > the stack. > > include/ > * sframe.h (SFRAME_FRE_RA_OFFSET_INVALID): New macro defining > the invalid RA offset value used to represent a dummy padding > offset. > > gas/ > * gen-sframe.c (get_fre_num_offsets): Accommodate for dummy > padding RA offset if FP without RA on stack. > (sframe_get_fre_offset_size): Likewise. > (output_sframe_row_entry): Write a dummy padding RA offset > if FP without RA needs to be represented. > > libsframe/ > * sframe-dump.c (dump_sframe_func_with_fres): Treat invalid RA > offsets as if they were undefined. Display them as "u*" to > distinguish them. > > Signed-off-by: Jens Remus <jremus@linux.ibm.com> > --- > > Notes (jremus): > This patch eliminates 497 occurrences of the warning "skipping SFrame > FDE due to FP without RA on stack" for a build of glibc on s390x. For > libc.so this increases the number of FDEs by 166 and the number of > FREs by 861, while adding 337 dummy padding RA offsets. With a total > of 28157 offsets the dummy padding offsets account for ~1.20 % of the > offsets. While this increase seems small, it does look wasteful. An orthogonal question below... > > SFrame statistics without patch: > > VALUE TOTAL MIN MAX AVG > FDEs: 3478 - - - > FREs/FDE: 14441 1 15 4 > Offsets/FDE: 28157 1 31 8 > 8-bit: 0 0 0 0 > 16-bit: 28157 1 31 8 > 32-bit: 0 0 0 0 > Offsets/FRE: 28157 1 3 1 > 8-bit: - 0 0 0 > 16-bit: - 1 3 1 > 32-bit: - 0 0 0 > > SFrame statistics with patch applied: > > VALUE TOTAL MIN MAX AVG > FDEs: 3644 - - - > FREs/FDE: 15302 1 20 4 > Offsets/FDE: 29944 1 38 8 > 8-bit: 0 0 0 0 > 16-bit: 29944 1 38 8 > 32-bit: 0 0 0 0 > Offsets/FRE: 29944 1 3 1 > 8-bit: - 0 0 0 > 16-bit: - 1 3 1 > 32-bit: - 0 0 0 > O_Padd/FDE: 337 - - 0 > 8-bit: 0 > 16-bit: 337 > 32-bit: 0 > > Note that on s390x the offsets are at minimum 16-bits in size, due to > the mandatory CFA offset being at least 160. > IIUC, all stack layouts supported in the ABI use the offset 160. Is that right ? I am wondering if adjusting the stored offsets in the SFrame section (by decrementing 160 from it) will work ? If yes, we could encode this constant in SFrame aux hdr bytes for s390x. > gas/gen-sframe.c | 50 +++++++++++++++++++---------------------- > include/sframe.h | 9 ++++++-- > libsframe/sframe-dump.c | 4 ++++ > 3 files changed, 34 insertions(+), 29 deletions(-) > > diff --git a/gas/gen-sframe.c b/gas/gen-sframe.c > index 4cc86eb6c815..990b08d87953 100644 > --- a/gas/gen-sframe.c > +++ b/gas/gen-sframe.c > @@ -347,7 +347,9 @@ get_fre_num_offsets (struct sframe_row_entry *sframe_fre) > fre_num_offsets++; > #ifdef SFRAME_FRE_RA_TRACKING > if (sframe_ra_tracking_p () > - && sframe_fre->ra_loc == SFRAME_FRE_ELEM_LOC_STACK) > + && (sframe_fre->ra_loc == SFRAME_FRE_ELEM_LOC_STACK > + /* Accommodate for padding RA offset if FP without RA on stack. */ > + || sframe_fre->bp_loc == SFRAME_FRE_ELEM_LOC_STACK)) > fre_num_offsets++; > #endif > return fre_num_offsets; > @@ -371,9 +373,14 @@ sframe_get_fre_offset_size (struct sframe_row_entry *sframe_fre) > if (sframe_fre->bp_loc == SFRAME_FRE_ELEM_LOC_STACK) > bp_offset_size = get_offset_size_in_bytes (sframe_fre->bp_offset); > #ifdef SFRAME_FRE_RA_TRACKING > - if (sframe_ra_tracking_p () > - && sframe_fre->ra_loc == SFRAME_FRE_ELEM_LOC_STACK) > - ra_offset_size = get_offset_size_in_bytes (sframe_fre->ra_offset); > + if (sframe_ra_tracking_p ()) > + { > + if (sframe_fre->ra_loc == SFRAME_FRE_ELEM_LOC_STACK) > + ra_offset_size = get_offset_size_in_bytes (sframe_fre->ra_offset); > + /* Accommodate for padding RA offset if FP without RA on stack. */ > + else if (sframe_fre->bp_loc == SFRAME_FRE_ELEM_LOC_STACK) > + ra_offset_size = get_offset_size_in_bytes (SFRAME_FRE_RA_OFFSET_INVALID); > + } > #endif > > /* Get the maximum size needed to represent the offsets. */ > @@ -537,11 +544,19 @@ output_sframe_row_entry (symbolS *fde_start_addr, > fre_write_offsets++; > > #ifdef SFRAME_FRE_RA_TRACKING > - if (sframe_ra_tracking_p () > - && sframe_fre->ra_loc == SFRAME_FRE_ELEM_LOC_STACK) > + if (sframe_ra_tracking_p ()) > { > - fre_offset_func_map[idx].out_func (sframe_fre->ra_offset); > - fre_write_offsets++; > + if (sframe_fre->ra_loc == SFRAME_FRE_ELEM_LOC_STACK) > + { > + fre_offset_func_map[idx].out_func (sframe_fre->ra_offset); > + fre_write_offsets++; > + } > + /* Write padding RA offset if FP without RA on stack. */ > + else if (sframe_fre->bp_loc == SFRAME_FRE_ELEM_LOC_STACK) > + { > + fre_offset_func_map[idx].out_func (SFRAME_FRE_RA_OFFSET_INVALID); > + fre_write_offsets++; > + } > } > #endif > if (sframe_fre->bp_loc == SFRAME_FRE_ELEM_LOC_STACK) > @@ -1497,25 +1512,6 @@ sframe_do_fde (struct sframe_xlate_ctx *xlate_ctx, > = get_dw_fde_end_addrS (xlate_ctx->dw_fde); > } > > -#ifdef SFRAME_FRE_RA_TRACKING > - if (sframe_ra_tracking_p ()) > - { > - struct sframe_row_entry *fre; > - > - /* Iterate over the scratchpad FREs and validate them. */ > - for (fre = xlate_ctx->first_fre; fre; fre = fre->next) > - { > - /* SFrame format cannot represent FP on stack without RA on stack. */ > - if (fre->ra_loc != SFRAME_FRE_ELEM_LOC_STACK > - && fre->bp_loc == SFRAME_FRE_ELEM_LOC_STACK) > - { > - as_warn (_("skipping SFrame FDE due to FP without RA on stack")); > - return SFRAME_XLATE_ERR_NOTREPRESENTED; > - } > - } > - } > -#endif /* SFRAME_FRE_RA_TRACKING */ > - > return SFRAME_XLATE_OK; > } > > diff --git a/include/sframe.h b/include/sframe.h > index 90bc92a32f84..d1a26875b3e2 100644 > --- a/include/sframe.h > +++ b/include/sframe.h > @@ -237,6 +237,9 @@ typedef struct sframe_func_desc_entry > may or may not be tracked. */ > #define SFRAME_FRE_FP_OFFSET_IDX 2 > > +/* Invalid RA offset. Used as padding to represent FP without RA on stack. */ > +#define SFRAME_FRE_RA_OFFSET_INVALID 0 > + > typedef struct sframe_fre_info > { > /* Information about > @@ -288,9 +291,11 @@ typedef struct sframe_fre_info > offset1 (interpreted as CFA = BASE_REG + offset1) > > if RA is being tracked > - offset2 (interpreted as RA = CFA + offset2) > + offset2 (interpreted as RA = CFA + offset2; an offset value of > + SFRAME_FRE_RA_OFFSET_INVALID indicates a dummy padding RA offset > + to represent FP without RA saved on stack) > if FP is being tracked > - offset3 (intrepreted as FP = CFA + offset2) > + offset3 (intrepreted as FP = CFA + offset3) I too noticed this typo recently and have a patch fixing this. > fi > else > if FP is being tracked > diff --git a/libsframe/sframe-dump.c b/libsframe/sframe-dump.c > index 40ea531314ba..3ea4bc327efd 100644 > --- a/libsframe/sframe-dump.c > +++ b/libsframe/sframe-dump.c > @@ -199,6 +199,10 @@ dump_sframe_func_with_fres (sframe_decoder_ctx *sfd_ctx, > if (sframe_decoder_get_fixed_ra_offset (sfd_ctx) > != SFRAME_CFA_FIXED_RA_INVALID) > strcpy (temp, "f"); > + /* If an ABI does track RA offset, e.g. AArch64 and S390, it can be a > + dummy as padding to represent FP without RA being saved on stack. */ > + else if (err[2] == 0 && ra_offset == SFRAME_FRE_RA_OFFSET_INVALID) > + sprintf (temp, "u*"); > else if (err[2] == 0) > { > if (is_sframe_abi_arch_s390 (sfd_ctx) && (ra_offset & 1)) ^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [RFC PATCH 1/1] sframe: Represent FP without RA on stack 2024-04-22 23:58 ` Indu Bhagat @ 2024-04-23 15:44 ` Jens Remus 2024-04-25 6:59 ` Indu Bhagat 0 siblings, 1 reply; 5+ messages in thread From: Jens Remus @ 2024-04-23 15:44 UTC (permalink / raw) To: Indu Bhagat, binutils; +Cc: Andreas Krebbel Am 23.04.2024 um 01:58 schrieb Indu Bhagat: > On 4/22/24 08:58, Jens Remus wrote: >> If an architecture uses both SFrame RA and FP tracking SFrame assumes >> that the RA offset is the 2nd offset and the FP offset is the 3rd offset >> following the SFrame FRE. An architecture does not need to store both on >> the stack. SFrame cannot represent a FP without RA on stack, since it >> cannot distinguish whether the 2nd offset is the RA or FP offset. >> >> Use an invalid SFrame FRE RA offset value of zero as dummy padding to >> represent the FP being saved on the stack when the RA is not saved on >> the stack. >> >> include/ >> * sframe.h (SFRAME_FRE_RA_OFFSET_INVALID): New macro defining >> the invalid RA offset value used to represent a dummy padding >> offset. >> >> gas/ >> * gen-sframe.c (get_fre_num_offsets): Accommodate for dummy >> padding RA offset if FP without RA on stack. >> (sframe_get_fre_offset_size): Likewise. >> (output_sframe_row_entry): Write a dummy padding RA offset >> if FP without RA needs to be represented. >> >> libsframe/ >> * sframe-dump.c (dump_sframe_func_with_fres): Treat invalid RA >> offsets as if they were undefined. Display them as "u*" to >> distinguish them. >> >> Signed-off-by: Jens Remus <jremus@linux.ibm.com> >> --- >> >> Notes (jremus): >> This patch eliminates 497 occurrences of the warning "skipping >> SFrame >> FDE due to FP without RA on stack" for a build of glibc on s390x. >> For >> libc.so this increases the number of FDEs by 166 and the number of >> FREs by 861, while adding 337 dummy padding RA offsets. With a total >> of 28157 offsets the dummy padding offsets account for ~1.20 % of >> the >> offsets. > > While this increase seems small, it does look wasteful. > > An orthogonal question below... > >> SFrame statistics without patch: >> VALUE TOTAL MIN MAX AVG >> FDEs: 3478 - - - >> FREs/FDE: 14441 1 15 4 >> Offsets/FDE: 28157 1 31 8 >> 8-bit: 0 0 0 0 >> 16-bit: 28157 1 31 8 >> 32-bit: 0 0 0 0 >> Offsets/FRE: 28157 1 3 1 >> 8-bit: - 0 0 0 >> 16-bit: - 1 3 1 >> 32-bit: - 0 0 0 >> SFrame statistics with patch applied: >> VALUE TOTAL MIN MAX AVG >> FDEs: 3644 - - - >> FREs/FDE: 15302 1 20 4 >> Offsets/FDE: 29944 1 38 8 >> 8-bit: 0 0 0 0 >> 16-bit: 29944 1 38 8 >> 32-bit: 0 0 0 0 >> Offsets/FRE: 29944 1 3 1 >> 8-bit: - 0 0 0 >> 16-bit: - 1 3 1 >> 32-bit: - 0 0 0 >> O_Padd/FDE: 337 - - 0 >> 8-bit: 0 >> 16-bit: 337 >> 32-bit: 0 >> Note that on s390x the offsets are at minimum 16-bits in size, >> due to >> the mandatory CFA offset being at least 160. >> > > IIUC, all stack layouts supported in the ABI use the offset 160. Is that > right ? I am wondering if adjusting the stored offsets in the SFrame > section (by decrementing 160 from it) will work ? > > If yes, we could encode this constant in SFrame aux hdr bytes for s390x. Thank you for the hint! Using a constant adjustment of -160 on s390x for the CFA offset from CFA base register should work to allow for 8-bit offsets to be used. Aren't all tracked offsets (i.e. CFA, FP, and RA) signed anyway? Thus applying a constant adjustment should work in any case? Couldn't it simply be an architecture specific constant in the code to begin with? For example a new macro, which is only applied when defined? #define S390_SFRAME_CFA_OFFSET_ADJUSTMENT -160 #define SFRAME_CFA_OFFSET_ADJUSTMENT S390_SFRAME_CFA_OFFSET_ADJUSTMENT Implementing this in the SFrame auxiliary header would of course allow to implement this enhancement at a later stage and to change the adjustment value in the future, as the linker can then either reject or merge different adjustment values during link editing. I wonder whether it would make sense to store the FP register number in the SFrame auxiliary header for s390x as well. Register 11 is just a convention of the compilers and not defined by the ABI. That would enable us to choose a different register as frame pointer in the future. > >> gas/gen-sframe.c | 50 +++++++++++++++++++---------------------- >> include/sframe.h | 9 ++++++-- >> libsframe/sframe-dump.c | 4 ++++ >> 3 files changed, 34 insertions(+), 29 deletions(-) >> >> diff --git a/gas/gen-sframe.c b/gas/gen-sframe.c >> index 4cc86eb6c815..990b08d87953 100644 >> --- a/gas/gen-sframe.c >> +++ b/gas/gen-sframe.c >> @@ -347,7 +347,9 @@ get_fre_num_offsets (struct sframe_row_entry >> *sframe_fre) >> fre_num_offsets++; >> #ifdef SFRAME_FRE_RA_TRACKING >> if (sframe_ra_tracking_p () >> - && sframe_fre->ra_loc == SFRAME_FRE_ELEM_LOC_STACK) >> + && (sframe_fre->ra_loc == SFRAME_FRE_ELEM_LOC_STACK >> + /* Accommodate for padding RA offset if FP without RA on >> stack. */ >> + || sframe_fre->bp_loc == SFRAME_FRE_ELEM_LOC_STACK)) >> fre_num_offsets++; >> #endif >> return fre_num_offsets; >> @@ -371,9 +373,14 @@ sframe_get_fre_offset_size (struct >> sframe_row_entry *sframe_fre) >> if (sframe_fre->bp_loc == SFRAME_FRE_ELEM_LOC_STACK) >> bp_offset_size = get_offset_size_in_bytes (sframe_fre->bp_offset); >> #ifdef SFRAME_FRE_RA_TRACKING >> - if (sframe_ra_tracking_p () >> - && sframe_fre->ra_loc == SFRAME_FRE_ELEM_LOC_STACK) >> - ra_offset_size = get_offset_size_in_bytes (sframe_fre->ra_offset); >> + if (sframe_ra_tracking_p ()) >> + { >> + if (sframe_fre->ra_loc == SFRAME_FRE_ELEM_LOC_STACK) >> + ra_offset_size = get_offset_size_in_bytes (sframe_fre->ra_offset); >> + /* Accommodate for padding RA offset if FP without RA on >> stack. */ >> + else if (sframe_fre->bp_loc == SFRAME_FRE_ELEM_LOC_STACK) >> + ra_offset_size = get_offset_size_in_bytes >> (SFRAME_FRE_RA_OFFSET_INVALID); >> + } >> #endif >> /* Get the maximum size needed to represent the offsets. */ >> @@ -537,11 +544,19 @@ output_sframe_row_entry (symbolS *fde_start_addr, >> fre_write_offsets++; >> #ifdef SFRAME_FRE_RA_TRACKING >> - if (sframe_ra_tracking_p () >> - && sframe_fre->ra_loc == SFRAME_FRE_ELEM_LOC_STACK) >> + if (sframe_ra_tracking_p ()) >> { >> - fre_offset_func_map[idx].out_func (sframe_fre->ra_offset); >> - fre_write_offsets++; >> + if (sframe_fre->ra_loc == SFRAME_FRE_ELEM_LOC_STACK) >> + { >> + fre_offset_func_map[idx].out_func (sframe_fre->ra_offset); >> + fre_write_offsets++; >> + } >> + /* Write padding RA offset if FP without RA on stack. */ >> + else if (sframe_fre->bp_loc == SFRAME_FRE_ELEM_LOC_STACK) >> + { >> + fre_offset_func_map[idx].out_func (SFRAME_FRE_RA_OFFSET_INVALID); >> + fre_write_offsets++; >> + } >> } >> #endif >> if (sframe_fre->bp_loc == SFRAME_FRE_ELEM_LOC_STACK) >> @@ -1497,25 +1512,6 @@ sframe_do_fde (struct sframe_xlate_ctx *xlate_ctx, >> = get_dw_fde_end_addrS (xlate_ctx->dw_fde); >> } >> -#ifdef SFRAME_FRE_RA_TRACKING >> - if (sframe_ra_tracking_p ()) >> - { >> - struct sframe_row_entry *fre; >> - >> - /* Iterate over the scratchpad FREs and validate them. */ >> - for (fre = xlate_ctx->first_fre; fre; fre = fre->next) >> - { >> - /* SFrame format cannot represent FP on stack without RA on >> stack. */ >> - if (fre->ra_loc != SFRAME_FRE_ELEM_LOC_STACK >> - && fre->bp_loc == SFRAME_FRE_ELEM_LOC_STACK) >> - { >> - as_warn (_("skipping SFrame FDE due to FP without RA on >> stack")); >> - return SFRAME_XLATE_ERR_NOTREPRESENTED; >> - } >> - } >> - } >> -#endif /* SFRAME_FRE_RA_TRACKING */ >> - >> return SFRAME_XLATE_OK; >> } >> diff --git a/include/sframe.h b/include/sframe.h >> index 90bc92a32f84..d1a26875b3e2 100644 >> --- a/include/sframe.h >> +++ b/include/sframe.h >> @@ -237,6 +237,9 @@ typedef struct sframe_func_desc_entry >> may or may not be tracked. */ >> #define SFRAME_FRE_FP_OFFSET_IDX 2 >> +/* Invalid RA offset. Used as padding to represent FP without RA on >> stack. */ >> +#define SFRAME_FRE_RA_OFFSET_INVALID 0 >> + >> typedef struct sframe_fre_info >> { >> /* Information about >> @@ -288,9 +291,11 @@ typedef struct sframe_fre_info >> offset1 (interpreted as CFA = BASE_REG + offset1) >> if RA is being tracked >> - offset2 (interpreted as RA = CFA + offset2) >> + offset2 (interpreted as RA = CFA + offset2; an offset value of >> + SFRAME_FRE_RA_OFFSET_INVALID indicates a dummy padding RA >> offset >> + to represent FP without RA saved on stack) >> if FP is being tracked >> - offset3 (intrepreted as FP = CFA + offset2) >> + offset3 (intrepreted as FP = CFA + offset3) > > I too noticed this typo recently and have a patch fixing this. > >> fi >> else >> if FP is being tracked >> diff --git a/libsframe/sframe-dump.c b/libsframe/sframe-dump.c >> index 40ea531314ba..3ea4bc327efd 100644 >> --- a/libsframe/sframe-dump.c >> +++ b/libsframe/sframe-dump.c >> @@ -199,6 +199,10 @@ dump_sframe_func_with_fres (sframe_decoder_ctx >> *sfd_ctx, >> if (sframe_decoder_get_fixed_ra_offset (sfd_ctx) >> != SFRAME_CFA_FIXED_RA_INVALID) >> strcpy (temp, "f"); >> + /* If an ABI does track RA offset, e.g. AArch64 and S390, it >> can be a >> + dummy as padding to represent FP without RA being saved on >> stack. */ >> + else if (err[2] == 0 && ra_offset == SFRAME_FRE_RA_OFFSET_INVALID) >> + sprintf (temp, "u*"); >> else if (err[2] == 0) >> { >> if (is_sframe_abi_arch_s390 (sfd_ctx) && (ra_offset & 1)) > Regards, Jens -- Jens Remus Linux on Z Development (D3303) and z/VSE Support +49-7031-16-1128 Office jremus@de.ibm.com IBM IBM Deutschland Research & Development GmbH; Vorsitzender des Aufsichtsrats: Wolfgang Wendt; Geschäftsführung: David Faller; Sitz der Gesellschaft: Böblingen; Registergericht: Amtsgericht Stuttgart, HRB 243294 IBM Data Privacy Statement: https://www.ibm.com/privacy/ ^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [RFC PATCH 1/1] sframe: Represent FP without RA on stack 2024-04-23 15:44 ` Jens Remus @ 2024-04-25 6:59 ` Indu Bhagat 0 siblings, 0 replies; 5+ messages in thread From: Indu Bhagat @ 2024-04-25 6:59 UTC (permalink / raw) To: Jens Remus, binutils; +Cc: Andreas Krebbel On 4/23/24 08:44, Jens Remus wrote: > Am 23.04.2024 um 01:58 schrieb Indu Bhagat: >> On 4/22/24 08:58, Jens Remus wrote: >>> If an architecture uses both SFrame RA and FP tracking SFrame assumes >>> that the RA offset is the 2nd offset and the FP offset is the 3rd offset >>> following the SFrame FRE. An architecture does not need to store both on >>> the stack. SFrame cannot represent a FP without RA on stack, since it >>> cannot distinguish whether the 2nd offset is the RA or FP offset. >>> >>> Use an invalid SFrame FRE RA offset value of zero as dummy padding to >>> represent the FP being saved on the stack when the RA is not saved on >>> the stack. >>> >>> include/ >>> * sframe.h (SFRAME_FRE_RA_OFFSET_INVALID): New macro defining >>> the invalid RA offset value used to represent a dummy padding >>> offset. >>> >>> gas/ >>> * gen-sframe.c (get_fre_num_offsets): Accommodate for dummy >>> padding RA offset if FP without RA on stack. >>> (sframe_get_fre_offset_size): Likewise. >>> (output_sframe_row_entry): Write a dummy padding RA offset >>> if FP without RA needs to be represented. >>> >>> libsframe/ >>> * sframe-dump.c (dump_sframe_func_with_fres): Treat invalid RA >>> offsets as if they were undefined. Display them as "u*" to >>> distinguish them. >>> >>> Signed-off-by: Jens Remus <jremus@linux.ibm.com> >>> --- >>> >>> Notes (jremus): >>> This patch eliminates 497 occurrences of the warning "skipping >>> SFrame >>> FDE due to FP without RA on stack" for a build of glibc on >>> s390x. For >>> libc.so this increases the number of FDEs by 166 and the number of >>> FREs by 861, while adding 337 dummy padding RA offsets. With a >>> total >>> of 28157 offsets the dummy padding offsets account for ~1.20 % >>> of the >>> offsets. >> >> While this increase seems small, it does look wasteful. >> >> An orthogonal question below... >> >>> SFrame statistics without patch: >>> VALUE TOTAL MIN MAX AVG >>> FDEs: 3478 - - - >>> FREs/FDE: 14441 1 15 4 >>> Offsets/FDE: 28157 1 31 8 >>> 8-bit: 0 0 0 0 >>> 16-bit: 28157 1 31 8 >>> 32-bit: 0 0 0 0 >>> Offsets/FRE: 28157 1 3 1 >>> 8-bit: - 0 0 0 >>> 16-bit: - 1 3 1 >>> 32-bit: - 0 0 0 >>> SFrame statistics with patch applied: >>> VALUE TOTAL MIN MAX AVG >>> FDEs: 3644 - - - >>> FREs/FDE: 15302 1 20 4 >>> Offsets/FDE: 29944 1 38 8 >>> 8-bit: 0 0 0 0 >>> 16-bit: 29944 1 38 8 >>> 32-bit: 0 0 0 0 >>> Offsets/FRE: 29944 1 3 1 >>> 8-bit: - 0 0 0 >>> 16-bit: - 1 3 1 >>> 32-bit: - 0 0 0 >>> O_Padd/FDE: 337 - - 0 >>> 8-bit: 0 >>> 16-bit: 337 >>> 32-bit: 0 >>> Note that on s390x the offsets are at minimum 16-bits in size, >>> due to >>> the mandatory CFA offset being at least 160. >>> >> >> IIUC, all stack layouts supported in the ABI use the offset 160. Is >> that right ? I am wondering if adjusting the stored offsets in the >> SFrame section (by decrementing 160 from it) will work ? >> >> If yes, we could encode this constant in SFrame aux hdr bytes for s390x. > > Thank you for the hint! Using a constant adjustment of -160 on s390x for > the CFA offset from CFA base register should work to allow for 8-bit > offsets to be used. Aren't all tracked offsets (i.e. CFA, FP, and RA) > signed anyway? Thus applying a constant adjustment should work in any case? > Yes, these offsets are signed. Applying the constant should work for CFA. > Couldn't it simply be an architecture specific constant in the code to > begin with? For example a new macro, which is only applied when defined? > > #define S390_SFRAME_CFA_OFFSET_ADJUSTMENT -160 > #define SFRAME_CFA_OFFSET_ADJUSTMENT S390_SFRAME_CFA_OFFSET_ADJUSTMENT > I think a macro should also work in this case. > Implementing this in the SFrame auxiliary header would of course allow > to implement this enhancement at a later stage and to change the > adjustment value in the future, as the linker can then either reject or > merge different adjustment values during link editing. > > I wonder whether it would make sense to store the FP register number in > the SFrame auxiliary header for s390x as well. Register 11 is just a > convention of the compilers and not defined by the ABI. That would > enable us to choose a different register as frame pointer in the future. > I think the problem will remain that there is ATM no way to communicate this information to the assembler (that compiler used r11 as fp role). And even if there was a way, I am not so sure. Since the ABI doesnt mandate r11 as fp, the compiler may pick another register for say a different stack layout etc, in the future ? IOW, even it picking different fp register across functions is a possibility, no? So what is expected of the compiler then... ^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2024-04-25 6:59 UTC | newest] Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2024-04-22 15:58 [RFC PATCH 0/1] sframe: Represent FP without RA on stack (padding) Jens Remus 2024-04-22 15:58 ` [RFC PATCH 1/1] sframe: Represent FP without RA on stack Jens Remus 2024-04-22 23:58 ` Indu Bhagat 2024-04-23 15:44 ` Jens Remus 2024-04-25 6:59 ` Indu Bhagat
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).