public inbox for binutils@sourceware.org
* [RFC PATCH 0/1] sframe: Represent FP without RA on stack (padding)
@ 2024-04-22 15:58 Jens Remus
  2024-04-22 15:58 ` [RFC PATCH 1/1] sframe: Represent FP without RA on stack Jens Remus
  0 siblings, 1 reply; 5+ messages in thread
From: Jens Remus @ 2024-04-22 15:58 UTC (permalink / raw)
  To: binutils, Indu Bhagat; +Cc: Jens Remus, Andreas Krebbel

This patch series adds support in SFrame to represent the frame pointer
(FP) without the return address (RA) being saved on the stack (and/or on
s390x in another register).

This is the first of two proposed alternatives:
1. This patch series uses a dummy padding offset (a value of zero, which
   is an invalid offset from CFA) as RA offset to represent FP without
   RA saved on the stack.
2. The alternative patch series changes the SFrame FRE count field into
   a bitmap, to convey which offsets follow the FRE.

Note that it currently applies on top of my v3 patch series that adds
initial support to generate .sframe from CFI directives on s390x,
although it is independent of that.

The use of padding offsets has the benefit that it is a minor change
to the SFrame V2 format. The downside is that it adds some (but
apparently only minimal) bloat to the .sframe information. Also, a value
of zero might not be an invalid offset from CFA on all architectures or
in all use cases. For example, the CFI for glibc longjmp() on some
architectures defines the jump buffer pointer register as CFA base, so
that unwinders restore the jump target registers from the jump buffer
as if the return were to the jump target.
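
As an illustration of the padding approach, the following is a minimal
sketch of how a consumer might interpret the offsets that follow an FRE
on an architecture that tracks both RA and FP. The struct, the function
name and the macro RA_OFFSET_PADDING are hypothetical stand-ins and not
part of libsframe; the sketch also ignores any architecture-specific
encoding of the RA offset.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for SFRAME_FRE_RA_OFFSET_INVALID from the patch.  */
#define RA_OFFSET_PADDING 0

/* Hypothetical decoded view of one FRE; not a libsframe type.  */
struct fre_view
{
  int32_t cfa_offset;
  bool ra_on_stack;
  int32_t ra_offset;
  bool fp_on_stack;
  int32_t fp_offset;
};

/* Interpret the N offsets following an FRE, assuming RA and FP are both
   tracked: offsets[0] is the CFA offset, offsets[1] is the RA offset or
   padding, and offsets[2] is the FP offset.  */
static struct fre_view
interpret_offsets (const int32_t *offsets, unsigned int n)
{
  struct fre_view v = { 0, false, 0, false, 0 };

  assert (n >= 1 && n <= 3);
  v.cfa_offset = offsets[0];

  /* A 2nd offset of zero is padding: RA is not saved on the stack.  */
  if (n >= 2 && offsets[1] != RA_OFFSET_PADDING)
    {
      v.ra_on_stack = true;
      v.ra_offset = offsets[1];
    }

  /* The 3rd offset, if present, is always the FP offset.  */
  if (n == 3)
    {
      v.fp_on_stack = true;
      v.fp_offset = offsets[2];
    }
  return v;
}
```

With the made-up offsets { 160, 0, -72 } this sketch reports FP saved at
CFA-72 and RA not saved on the stack, which is the case that could not
be told apart from "RA at CFA-72" without the padding.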

A test build of glibc on s390x with this patch series applied shows the
following changes for libc.so:
The number of FDEs increases by 166 and the number of FREs increases by
861, while adding 337 dummy padding RA offsets. With a total of 28157
offsets the dummy padding offsets account for ~1.20 % of the offsets.

Thanks and regards,
Jens


Jens Remus (1):
  sframe: Represent FP without RA on stack

 gas/gen-sframe.c        | 50 +++++++++++++++++++----------------------
 include/sframe.h        |  9 ++++++--
 libsframe/sframe-dump.c |  4 ++++
 3 files changed, 34 insertions(+), 29 deletions(-)

-- 
2.40.1



* [RFC PATCH 1/1] sframe: Represent FP without RA on stack
  2024-04-22 15:58 [RFC PATCH 0/1] sframe: Represent FP without RA on stack (padding) Jens Remus
@ 2024-04-22 15:58 ` Jens Remus
  2024-04-22 23:58   ` Indu Bhagat
  0 siblings, 1 reply; 5+ messages in thread
From: Jens Remus @ 2024-04-22 15:58 UTC (permalink / raw)
  To: binutils, Indu Bhagat; +Cc: Jens Remus, Andreas Krebbel

If an architecture uses both SFrame RA and FP tracking, SFrame assumes
that the RA offset is the 2nd offset and the FP offset is the 3rd offset
following the SFrame FRE. An architecture does not need to store both on
the stack. SFrame cannot represent an FP without RA on stack, since it
cannot distinguish whether the 2nd offset is the RA or FP offset.

Use an invalid SFrame FRE RA offset value of zero as dummy padding to
represent the FP being saved on the stack when the RA is not saved on
the stack.

include/
	* sframe.h (SFRAME_FRE_RA_OFFSET_INVALID): New macro defining
	the invalid RA offset value used to represent a dummy padding
	offset.

gas/
	* gen-sframe.c (get_fre_num_offsets): Accommodate for dummy
	padding RA offset if FP without RA on stack.
	(sframe_get_fre_offset_size): Likewise.
	(output_sframe_row_entry): Write a dummy padding RA offset
	if FP without RA needs to be represented.

libsframe/
	* sframe-dump.c (dump_sframe_func_with_fres): Treat invalid RA
	offsets as if they were undefined. Display them as "u*" to
	distinguish them.

Signed-off-by: Jens Remus <jremus@linux.ibm.com>
---

Notes (jremus):
    This patch eliminates 497 occurrences of the warning "skipping SFrame
    FDE due to FP without RA on stack" for a build of glibc on s390x. For
    libc.so this increases the number of FDEs by 166 and the number of
    FREs by 861, while adding 337 dummy padding RA offsets. With a total
    of 28157 offsets the dummy padding offsets account for ~1.20 % of the
    offsets.
    
    SFrame statistics without patch:
    
        VALUE        TOTAL      MIN        MAX        AVG
        FDEs:        3478       -          -          -
        FREs/FDE:    14441      1          15         4
        Offsets/FDE: 28157      1          31         8
           8-bit:    0          0          0          0
          16-bit:    28157      1          31         8
          32-bit:    0          0          0          0
        Offsets/FRE: 28157      1          3          1
           8-bit:    -          0          0          0
          16-bit:    -          1          3          1
          32-bit:    -          0          0          0
    
    SFrame statistics with patch applied:
    
        VALUE        TOTAL      MIN        MAX        AVG
        FDEs:        3644       -          -          -
        FREs/FDE:    15302      1          20         4
        Offsets/FDE: 29944      1          38         8
           8-bit:    0          0          0          0
          16-bit:    29944      1          38         8
          32-bit:    0          0          0          0
        Offsets/FRE: 29944      1          3          1
           8-bit:    -          0          0          0
          16-bit:    -          1          3          1
          32-bit:    -          0          0          0
        O_Padd/FDE:  337        -          -          0
           8-bit:    0
          16-bit:    337
          32-bit:    0
    
    Note that on s390x the offsets are at least 16 bits in size, because
    the mandatory CFA offset is at least 160.
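
The effect of the CFA offset on the offset size can be sketched as
follows. This assumes that offsets are stored as signed integers and
that all offsets of one FRE share the size of the widest one (the
"maximum size needed to represent the offsets" in gen-sframe.c). The
helper names are hypothetical and only loosely modeled on
get_offset_size_in_bytes.

```c
#include <assert.h>
#include <stdint.h>

/* Smallest size in bytes (1, 2, or 4) of a signed integer that can hold
   VALUE.  Hypothetical helper, not the gas implementation.  */
static unsigned int
signed_offset_size (int32_t value)
{
  if (value >= INT8_MIN && value <= INT8_MAX)
    return 1;
  if (value >= INT16_MIN && value <= INT16_MAX)
    return 2;
  return 4;
}

/* Size in bytes used for all N offsets of one FRE: the size required by
   the widest offset.  */
static unsigned int
fre_offset_size (const int32_t *offsets, unsigned int n)
{
  unsigned int size = 1;
  unsigned int i;

  assert (n >= 1);
  for (i = 0; i < n; i++)
    {
      unsigned int s = signed_offset_size (offsets[i]);
      if (s > size)
	size = s;
    }
  return size;
}
```

Under these assumptions a CFA offset of 160 exceeds the signed 8-bit
maximum of 127, so every offset in such an FRE, including a zero
padding offset, is written as 16 bits.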

 gas/gen-sframe.c        | 50 +++++++++++++++++++----------------------
 include/sframe.h        |  9 ++++++--
 libsframe/sframe-dump.c |  4 ++++
 3 files changed, 34 insertions(+), 29 deletions(-)

diff --git a/gas/gen-sframe.c b/gas/gen-sframe.c
index 4cc86eb6c815..990b08d87953 100644
--- a/gas/gen-sframe.c
+++ b/gas/gen-sframe.c
@@ -347,7 +347,9 @@ get_fre_num_offsets (struct sframe_row_entry *sframe_fre)
     fre_num_offsets++;
 #ifdef SFRAME_FRE_RA_TRACKING
   if (sframe_ra_tracking_p ()
-      && sframe_fre->ra_loc == SFRAME_FRE_ELEM_LOC_STACK)
+      && (sframe_fre->ra_loc == SFRAME_FRE_ELEM_LOC_STACK
+	  /* Accommodate for padding RA offset if FP without RA on stack.  */
+	  || sframe_fre->bp_loc == SFRAME_FRE_ELEM_LOC_STACK))
     fre_num_offsets++;
 #endif
   return fre_num_offsets;
@@ -371,9 +373,14 @@ sframe_get_fre_offset_size (struct sframe_row_entry *sframe_fre)
   if (sframe_fre->bp_loc == SFRAME_FRE_ELEM_LOC_STACK)
     bp_offset_size = get_offset_size_in_bytes (sframe_fre->bp_offset);
 #ifdef SFRAME_FRE_RA_TRACKING
-  if (sframe_ra_tracking_p ()
-      && sframe_fre->ra_loc == SFRAME_FRE_ELEM_LOC_STACK)
-    ra_offset_size = get_offset_size_in_bytes (sframe_fre->ra_offset);
+  if (sframe_ra_tracking_p ())
+    {
+      if (sframe_fre->ra_loc == SFRAME_FRE_ELEM_LOC_STACK)
+	ra_offset_size = get_offset_size_in_bytes (sframe_fre->ra_offset);
+      /* Accommodate for padding RA offset if FP without RA on stack.  */
+      else if (sframe_fre->bp_loc == SFRAME_FRE_ELEM_LOC_STACK)
+	ra_offset_size = get_offset_size_in_bytes (SFRAME_FRE_RA_OFFSET_INVALID);
+    }
 #endif
 
   /* Get the maximum size needed to represent the offsets.  */
@@ -537,11 +544,19 @@ output_sframe_row_entry (symbolS *fde_start_addr,
   fre_write_offsets++;
 
 #ifdef SFRAME_FRE_RA_TRACKING
-  if (sframe_ra_tracking_p ()
-      && sframe_fre->ra_loc == SFRAME_FRE_ELEM_LOC_STACK)
+  if (sframe_ra_tracking_p ())
     {
-      fre_offset_func_map[idx].out_func (sframe_fre->ra_offset);
-      fre_write_offsets++;
+      if (sframe_fre->ra_loc == SFRAME_FRE_ELEM_LOC_STACK)
+	{
+	  fre_offset_func_map[idx].out_func (sframe_fre->ra_offset);
+	  fre_write_offsets++;
+	}
+      /* Write padding RA offset if FP without RA on stack.  */
+      else if (sframe_fre->bp_loc == SFRAME_FRE_ELEM_LOC_STACK)
+	{
+	  fre_offset_func_map[idx].out_func (SFRAME_FRE_RA_OFFSET_INVALID);
+	  fre_write_offsets++;
+	}
     }
 #endif
   if (sframe_fre->bp_loc == SFRAME_FRE_ELEM_LOC_STACK)
@@ -1497,25 +1512,6 @@ sframe_do_fde (struct sframe_xlate_ctx *xlate_ctx,
 	= get_dw_fde_end_addrS (xlate_ctx->dw_fde);
     }
 
-#ifdef SFRAME_FRE_RA_TRACKING
-  if (sframe_ra_tracking_p ())
-    {
-      struct sframe_row_entry *fre;
-
-      /* Iterate over the scratchpad FREs and validate them.  */
-      for (fre = xlate_ctx->first_fre; fre; fre = fre->next)
-	{
-	  /* SFrame format cannot represent FP on stack without RA on stack.  */
-	  if (fre->ra_loc != SFRAME_FRE_ELEM_LOC_STACK
-	      && fre->bp_loc == SFRAME_FRE_ELEM_LOC_STACK)
-	    {
-	      as_warn (_("skipping SFrame FDE due to FP without RA on stack"));
-	      return SFRAME_XLATE_ERR_NOTREPRESENTED;
-	    }
-	}
-    }
-#endif /* SFRAME_FRE_RA_TRACKING  */
-
   return SFRAME_XLATE_OK;
 }
 
diff --git a/include/sframe.h b/include/sframe.h
index 90bc92a32f84..d1a26875b3e2 100644
--- a/include/sframe.h
+++ b/include/sframe.h
@@ -237,6 +237,9 @@ typedef struct sframe_func_desc_entry
    may or may not be tracked.  */
 #define SFRAME_FRE_FP_OFFSET_IDX    2
 
+/* Invalid RA offset.  Used as padding to represent FP without RA on stack.  */
+#define SFRAME_FRE_RA_OFFSET_INVALID 0
+
 typedef struct sframe_fre_info
 {
   /* Information about
@@ -288,9 +291,11 @@ typedef struct sframe_fre_info
     offset1 (interpreted as CFA = BASE_REG + offset1)
 
     if RA is being tracked
-      offset2 (interpreted as RA = CFA + offset2)
+      offset2 (interpreted as RA = CFA + offset2; an offset value of
+	       SFRAME_FRE_RA_OFFSET_INVALID indicates a dummy padding RA offset
+	       to represent FP without RA saved on stack)
       if FP is being tracked
-	offset3 (intrepreted as FP = CFA + offset2)
+	offset3 (intrepreted as FP = CFA + offset3)
       fi
     else
       if FP is being tracked
diff --git a/libsframe/sframe-dump.c b/libsframe/sframe-dump.c
index 40ea531314ba..3ea4bc327efd 100644
--- a/libsframe/sframe-dump.c
+++ b/libsframe/sframe-dump.c
@@ -199,6 +199,10 @@ dump_sframe_func_with_fres (sframe_decoder_ctx *sfd_ctx,
       if (sframe_decoder_get_fixed_ra_offset (sfd_ctx)
 	  != SFRAME_CFA_FIXED_RA_INVALID)
 	strcpy (temp, "f");
+      /* If an ABI does track RA offset, e.g. AArch64 and S390, it can be a
+	 dummy as padding to represent FP without RA being saved on stack.  */
+      else if (err[2] == 0 && ra_offset == SFRAME_FRE_RA_OFFSET_INVALID)
+	sprintf (temp, "u*");
       else if (err[2] == 0)
 	{
 	  if (is_sframe_abi_arch_s390 (sfd_ctx) && (ra_offset & 1))
-- 
2.40.1



* Re: [RFC PATCH 1/1] sframe: Represent FP without RA on stack
  2024-04-22 15:58 ` [RFC PATCH 1/1] sframe: Represent FP without RA on stack Jens Remus
@ 2024-04-22 23:58   ` Indu Bhagat
  2024-04-23 15:44     ` Jens Remus
  0 siblings, 1 reply; 5+ messages in thread
From: Indu Bhagat @ 2024-04-22 23:58 UTC (permalink / raw)
  To: Jens Remus, binutils; +Cc: Andreas Krebbel

On 4/22/24 08:58, Jens Remus wrote:
> If an architecture uses both SFrame RA and FP tracking SFrame assumes
> that the RA offset is the 2nd offset and the FP offset is the 3rd offset
> following the SFrame FRE. An architecture does not need to store both on
> the stack. SFrame cannot represent a FP without RA on stack, since it
> cannot distinguish whether the 2nd offset is the RA or FP offset.
> 
> Use an invalid SFrame FRE RA offset value of zero as dummy padding to
> represent the FP being saved on the stack when the RA is not saved on
> the stack.
> 
> include/
> 	* sframe.h (SFRAME_FRE_RA_OFFSET_INVALID): New macro defining
> 	the invalid RA offset value used to represent a dummy padding
> 	offset.
> 
> gas/
> 	* gen-sframe.c (get_fre_num_offsets): Accommodate for dummy
> 	padding RA offset if FP without RA on stack.
> 	(sframe_get_fre_offset_size): Likewise.
> 	(output_sframe_row_entry): Write a dummy padding RA offset
> 	if FP without RA needs to be represented.
> 
> libsframe/
> 	* sframe-dump.c (dump_sframe_func_with_fres): Treat invalid RA
> 	offsets as if they were undefined. Display them as "u*" to
> 	distinguish them.
> 
> Signed-off-by: Jens Remus <jremus@linux.ibm.com>
> ---
> 
> Notes (jremus):
>      This patch eliminates 497 occurrences of the warning "skipping SFrame
>      FDE due to FP without RA on stack" for a build of glibc on s390x. For
>      libc.so this increases the number of FDEs by 166 and the number of
>      FREs by 861, while adding 337 dummy padding RA offsets. With a total
>      of 28157 offsets the dummy padding offsets account for ~1.20 % of the
>      offsets.

While this increase seems small, it does look wasteful.

An orthogonal question below...

>      
>      SFrame statistics without patch:
>      
>          VALUE        TOTAL      MIN        MAX        AVG
>          FDEs:        3478       -          -          -
>          FREs/FDE:    14441      1          15         4
>          Offsets/FDE: 28157      1          31         8
>             8-bit:    0          0          0          0
>            16-bit:    28157      1          31         8
>            32-bit:    0          0          0          0
>          Offsets/FRE: 28157      1          3          1
>             8-bit:    -          0          0          0
>            16-bit:    -          1          3          1
>            32-bit:    -          0          0          0
>      
>      SFrame statistics with patch applied:
>      
>          VALUE        TOTAL      MIN        MAX        AVG
>          FDEs:        3644       -          -          -
>          FREs/FDE:    15302      1          20         4
>          Offsets/FDE: 29944      1          38         8
>             8-bit:    0          0          0          0
>            16-bit:    29944      1          38         8
>            32-bit:    0          0          0          0
>          Offsets/FRE: 29944      1          3          1
>             8-bit:    -          0          0          0
>            16-bit:    -          1          3          1
>            32-bit:    -          0          0          0
>          O_Padd/FDE:  337        -          -          0
>             8-bit:    0
>            16-bit:    337
>            32-bit:    0
>      
>      Note that on s390x the offsets are at minimum 16-bits in size, due to
>      the mandatory CFA offset being at least 160.
> 

IIUC, all stack layouts supported in the ABI use the offset 160. Is that
right? I am wondering whether adjusting the stored offsets in the SFrame
section (by subtracting 160 from them) would work.

If yes, we could encode this constant in SFrame aux hdr bytes for s390x.
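
A minimal sketch of the idea, assuming signed offsets and a fixed
adjustment of -160 applied to the CFA offset only. The macro and
function names are hypothetical; nothing like this exists in gas or
libsframe as of this thread, and where the constant would live (code or
SFrame aux header) is left open.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical adjustment for s390x; only proposed in this thread.  */
#define CFA_OFFSET_ADJUSTMENT (-160)

/* Producer side: value that would be stored in the SFrame section.  */
static int32_t
encode_cfa_offset (int32_t cfa_offset)
{
  return cfa_offset + CFA_OFFSET_ADJUSTMENT;
}

/* Consumer side: recover the real CFA offset from the stored value.  */
static int32_t
decode_cfa_offset (int32_t stored)
{
  return stored - CFA_OFFSET_ADJUSTMENT;
}

/* True if VALUE can be stored as a signed 8-bit offset.  */
static bool
fits_in_int8 (int32_t value)
{
  assert (INT8_MIN < 0 && INT8_MAX > 0);
  return value >= INT8_MIN && value <= INT8_MAX;
}
```

In this sketch a CFA offset of 160 is stored as 0 and so fits into
8 bits, while the unadjusted value would not.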


>   gas/gen-sframe.c        | 50 +++++++++++++++++++----------------------
>   include/sframe.h        |  9 ++++++--
>   libsframe/sframe-dump.c |  4 ++++
>   3 files changed, 34 insertions(+), 29 deletions(-)
> 
> diff --git a/gas/gen-sframe.c b/gas/gen-sframe.c
> index 4cc86eb6c815..990b08d87953 100644
> --- a/gas/gen-sframe.c
> +++ b/gas/gen-sframe.c
> @@ -347,7 +347,9 @@ get_fre_num_offsets (struct sframe_row_entry *sframe_fre)
>       fre_num_offsets++;
>   #ifdef SFRAME_FRE_RA_TRACKING
>     if (sframe_ra_tracking_p ()
> -      && sframe_fre->ra_loc == SFRAME_FRE_ELEM_LOC_STACK)
> +      && (sframe_fre->ra_loc == SFRAME_FRE_ELEM_LOC_STACK
> +	  /* Accommodate for padding RA offset if FP without RA on stack.  */
> +	  || sframe_fre->bp_loc == SFRAME_FRE_ELEM_LOC_STACK))
>       fre_num_offsets++;
>   #endif
>     return fre_num_offsets;
> @@ -371,9 +373,14 @@ sframe_get_fre_offset_size (struct sframe_row_entry *sframe_fre)
>     if (sframe_fre->bp_loc == SFRAME_FRE_ELEM_LOC_STACK)
>       bp_offset_size = get_offset_size_in_bytes (sframe_fre->bp_offset);
>   #ifdef SFRAME_FRE_RA_TRACKING
> -  if (sframe_ra_tracking_p ()
> -      && sframe_fre->ra_loc == SFRAME_FRE_ELEM_LOC_STACK)
> -    ra_offset_size = get_offset_size_in_bytes (sframe_fre->ra_offset);
> +  if (sframe_ra_tracking_p ())
> +    {
> +      if (sframe_fre->ra_loc == SFRAME_FRE_ELEM_LOC_STACK)
> +	ra_offset_size = get_offset_size_in_bytes (sframe_fre->ra_offset);
> +      /* Accommodate for padding RA offset if FP without RA on stack.  */
> +      else if (sframe_fre->bp_loc == SFRAME_FRE_ELEM_LOC_STACK)
> +	ra_offset_size = get_offset_size_in_bytes (SFRAME_FRE_RA_OFFSET_INVALID);
> +    }
>   #endif
>   
>     /* Get the maximum size needed to represent the offsets.  */
> @@ -537,11 +544,19 @@ output_sframe_row_entry (symbolS *fde_start_addr,
>     fre_write_offsets++;
>   
>   #ifdef SFRAME_FRE_RA_TRACKING
> -  if (sframe_ra_tracking_p ()
> -      && sframe_fre->ra_loc == SFRAME_FRE_ELEM_LOC_STACK)
> +  if (sframe_ra_tracking_p ())
>       {
> -      fre_offset_func_map[idx].out_func (sframe_fre->ra_offset);
> -      fre_write_offsets++;
> +      if (sframe_fre->ra_loc == SFRAME_FRE_ELEM_LOC_STACK)
> +	{
> +	  fre_offset_func_map[idx].out_func (sframe_fre->ra_offset);
> +	  fre_write_offsets++;
> +	}
> +      /* Write padding RA offset if FP without RA on stack.  */
> +      else if (sframe_fre->bp_loc == SFRAME_FRE_ELEM_LOC_STACK)
> +	{
> +	  fre_offset_func_map[idx].out_func (SFRAME_FRE_RA_OFFSET_INVALID);
> +	  fre_write_offsets++;
> +	}
>       }
>   #endif
>     if (sframe_fre->bp_loc == SFRAME_FRE_ELEM_LOC_STACK)
> @@ -1497,25 +1512,6 @@ sframe_do_fde (struct sframe_xlate_ctx *xlate_ctx,
>   	= get_dw_fde_end_addrS (xlate_ctx->dw_fde);
>       }
>   
> -#ifdef SFRAME_FRE_RA_TRACKING
> -  if (sframe_ra_tracking_p ())
> -    {
> -      struct sframe_row_entry *fre;
> -
> -      /* Iterate over the scratchpad FREs and validate them.  */
> -      for (fre = xlate_ctx->first_fre; fre; fre = fre->next)
> -	{
> -	  /* SFrame format cannot represent FP on stack without RA on stack.  */
> -	  if (fre->ra_loc != SFRAME_FRE_ELEM_LOC_STACK
> -	      && fre->bp_loc == SFRAME_FRE_ELEM_LOC_STACK)
> -	    {
> -	      as_warn (_("skipping SFrame FDE due to FP without RA on stack"));
> -	      return SFRAME_XLATE_ERR_NOTREPRESENTED;
> -	    }
> -	}
> -    }
> -#endif /* SFRAME_FRE_RA_TRACKING  */
> -
>     return SFRAME_XLATE_OK;
>   }
>   
> diff --git a/include/sframe.h b/include/sframe.h
> index 90bc92a32f84..d1a26875b3e2 100644
> --- a/include/sframe.h
> +++ b/include/sframe.h
> @@ -237,6 +237,9 @@ typedef struct sframe_func_desc_entry
>      may or may not be tracked.  */
>   #define SFRAME_FRE_FP_OFFSET_IDX    2
>   
> +/* Invalid RA offset.  Used as padding to represent FP without RA on stack.  */
> +#define SFRAME_FRE_RA_OFFSET_INVALID 0
> +
>   typedef struct sframe_fre_info
>   {
>     /* Information about
> @@ -288,9 +291,11 @@ typedef struct sframe_fre_info
>       offset1 (interpreted as CFA = BASE_REG + offset1)
>   
>       if RA is being tracked
> -      offset2 (interpreted as RA = CFA + offset2)
> +      offset2 (interpreted as RA = CFA + offset2; an offset value of
> +	       SFRAME_FRE_RA_OFFSET_INVALID indicates a dummy padding RA offset
> +	       to represent FP without RA saved on stack)
>         if FP is being tracked
> -	offset3 (intrepreted as FP = CFA + offset2)
> +	offset3 (intrepreted as FP = CFA + offset3)

I too noticed this typo recently and have a patch fixing this.

>         fi
>       else
>         if FP is being tracked
> diff --git a/libsframe/sframe-dump.c b/libsframe/sframe-dump.c
> index 40ea531314ba..3ea4bc327efd 100644
> --- a/libsframe/sframe-dump.c
> +++ b/libsframe/sframe-dump.c
> @@ -199,6 +199,10 @@ dump_sframe_func_with_fres (sframe_decoder_ctx *sfd_ctx,
>         if (sframe_decoder_get_fixed_ra_offset (sfd_ctx)
>   	  != SFRAME_CFA_FIXED_RA_INVALID)
>   	strcpy (temp, "f");
> +      /* If an ABI does track RA offset, e.g. AArch64 and S390, it can be a
> +	 dummy as padding to represent FP without RA being saved on stack.  */
> +      else if (err[2] == 0 && ra_offset == SFRAME_FRE_RA_OFFSET_INVALID)
> +	sprintf (temp, "u*");
>         else if (err[2] == 0)
>   	{
>   	  if (is_sframe_abi_arch_s390 (sfd_ctx) && (ra_offset & 1))



* Re: [RFC PATCH 1/1] sframe: Represent FP without RA on stack
  2024-04-22 23:58   ` Indu Bhagat
@ 2024-04-23 15:44     ` Jens Remus
  2024-04-25  6:59       ` Indu Bhagat
  0 siblings, 1 reply; 5+ messages in thread
From: Jens Remus @ 2024-04-23 15:44 UTC (permalink / raw)
  To: Indu Bhagat, binutils; +Cc: Andreas Krebbel

On 23.04.2024 at 01:58, Indu Bhagat wrote:
> On 4/22/24 08:58, Jens Remus wrote:
>> If an architecture uses both SFrame RA and FP tracking SFrame assumes
>> that the RA offset is the 2nd offset and the FP offset is the 3rd offset
>> following the SFrame FRE. An architecture does not need to store both on
>> the stack. SFrame cannot represent a FP without RA on stack, since it
>> cannot distinguish whether the 2nd offset is the RA or FP offset.
>>
>> Use an invalid SFrame FRE RA offset value of zero as dummy padding to
>> represent the FP being saved on the stack when the RA is not saved on
>> the stack.
>>
>> include/
>>     * sframe.h (SFRAME_FRE_RA_OFFSET_INVALID): New macro defining
>>     the invalid RA offset value used to represent a dummy padding
>>     offset.
>>
>> gas/
>>     * gen-sframe.c (get_fre_num_offsets): Accommodate for dummy
>>     padding RA offset if FP without RA on stack.
>>     (sframe_get_fre_offset_size): Likewise.
>>     (output_sframe_row_entry): Write a dummy padding RA offset
>>     if FP without RA needs to be represented.
>>
>> libsframe/
>>     * sframe-dump.c (dump_sframe_func_with_fres): Treat invalid RA
>>     offsets as if they were undefined. Display them as "u*" to
>>     distinguish them.
>>
>> Signed-off-by: Jens Remus <jremus@linux.ibm.com>
>> ---
>>
>> Notes (jremus):
>>      This patch eliminates 497 occurrences of the warning "skipping SFrame
>>      FDE due to FP without RA on stack" for a build of glibc on s390x. For
>>      libc.so this increases the number of FDEs by 166 and the number of
>>      FREs by 861, while adding 337 dummy padding RA offsets. With a total
>>      of 28157 offsets the dummy padding offsets account for ~1.20 % of the
>>      offsets.
> 
> While this increase seems small, it does look wasteful.
> 
> An orthogonal question below...
> 
>>      SFrame statistics without patch:
>>          VALUE        TOTAL      MIN        MAX        AVG
>>          FDEs:        3478       -          -          -
>>          FREs/FDE:    14441      1          15         4
>>          Offsets/FDE: 28157      1          31         8
>>             8-bit:    0          0          0          0
>>            16-bit:    28157      1          31         8
>>            32-bit:    0          0          0          0
>>          Offsets/FRE: 28157      1          3          1
>>             8-bit:    -          0          0          0
>>            16-bit:    -          1          3          1
>>            32-bit:    -          0          0          0
>>      SFrame statistics with patch applied:
>>          VALUE        TOTAL      MIN        MAX        AVG
>>          FDEs:        3644       -          -          -
>>          FREs/FDE:    15302      1          20         4
>>          Offsets/FDE: 29944      1          38         8
>>             8-bit:    0          0          0          0
>>            16-bit:    29944      1          38         8
>>            32-bit:    0          0          0          0
>>          Offsets/FRE: 29944      1          3          1
>>             8-bit:    -          0          0          0
>>            16-bit:    -          1          3          1
>>            32-bit:    -          0          0          0
>>          O_Padd/FDE:  337        -          -          0
>>             8-bit:    0
>>            16-bit:    337
>>            32-bit:    0
>>      Note that on s390x the offsets are at minimum 16-bits in size, due to
>>      the mandatory CFA offset being at least 160.
>>
> 
> IIUC, all stack layouts supported in the ABI use the offset 160. Is that 
> right ? I am wondering if adjusting the stored offsets in the SFrame 
> section (by decrementing 160 from it) will work ?
> 
> If yes, we could encode this constant in SFrame aux hdr bytes for s390x.

Thank you for the hint! Using a constant adjustment of -160 on s390x for 
the CFA offset from CFA base register should work to allow for 8-bit 
offsets to be used. Aren't all tracked offsets (i.e. CFA, FP, and RA) 
signed anyway? Thus applying a constant adjustment should work in any case?

Couldn't it simply be an architecture-specific constant in the code to
begin with? For example, a new macro that is applied only when defined?

#define S390_SFRAME_CFA_OFFSET_ADJUSTMENT -160
#define SFRAME_CFA_OFFSET_ADJUSTMENT S390_SFRAME_CFA_OFFSET_ADJUSTMENT

Implementing this in the SFrame auxiliary header would of course allow 
to implement this enhancement at a later stage and to change the 
adjustment value in the future, as the linker can then either reject or 
merge different adjustment values during link editing.

I wonder whether it would make sense to store the FP register number in 
the SFrame auxiliary header for s390x as well. Register 11 is just a 
convention of the compilers and not defined by the ABI. That would 
enable us to choose a different register as frame pointer in the future.

> 
>>   gas/gen-sframe.c        | 50 +++++++++++++++++++----------------------
>>   include/sframe.h        |  9 ++++++--
>>   libsframe/sframe-dump.c |  4 ++++
>>   3 files changed, 34 insertions(+), 29 deletions(-)
>>
>> diff --git a/gas/gen-sframe.c b/gas/gen-sframe.c
>> index 4cc86eb6c815..990b08d87953 100644
>> --- a/gas/gen-sframe.c
>> +++ b/gas/gen-sframe.c
>> @@ -347,7 +347,9 @@ get_fre_num_offsets (struct sframe_row_entry 
>> *sframe_fre)
>>       fre_num_offsets++;
>>   #ifdef SFRAME_FRE_RA_TRACKING
>>     if (sframe_ra_tracking_p ()
>> -      && sframe_fre->ra_loc == SFRAME_FRE_ELEM_LOC_STACK)
>> +      && (sframe_fre->ra_loc == SFRAME_FRE_ELEM_LOC_STACK
>> +      /* Accommodate for padding RA offset if FP without RA on 
>> stack.  */
>> +      || sframe_fre->bp_loc == SFRAME_FRE_ELEM_LOC_STACK))
>>       fre_num_offsets++;
>>   #endif
>>     return fre_num_offsets;
>> @@ -371,9 +373,14 @@ sframe_get_fre_offset_size (struct 
>> sframe_row_entry *sframe_fre)
>>     if (sframe_fre->bp_loc == SFRAME_FRE_ELEM_LOC_STACK)
>>       bp_offset_size = get_offset_size_in_bytes (sframe_fre->bp_offset);
>>   #ifdef SFRAME_FRE_RA_TRACKING
>> -  if (sframe_ra_tracking_p ()
>> -      && sframe_fre->ra_loc == SFRAME_FRE_ELEM_LOC_STACK)
>> -    ra_offset_size = get_offset_size_in_bytes (sframe_fre->ra_offset);
>> +  if (sframe_ra_tracking_p ())
>> +    {
>> +      if (sframe_fre->ra_loc == SFRAME_FRE_ELEM_LOC_STACK)
>> +    ra_offset_size = get_offset_size_in_bytes (sframe_fre->ra_offset);
>> +      /* Accommodate for padding RA offset if FP without RA on 
>> stack.  */
>> +      else if (sframe_fre->bp_loc == SFRAME_FRE_ELEM_LOC_STACK)
>> +    ra_offset_size = get_offset_size_in_bytes 
>> (SFRAME_FRE_RA_OFFSET_INVALID);
>> +    }
>>   #endif
>>     /* Get the maximum size needed to represent the offsets.  */
>> @@ -537,11 +544,19 @@ output_sframe_row_entry (symbolS *fde_start_addr,
>>     fre_write_offsets++;
>>   #ifdef SFRAME_FRE_RA_TRACKING
>> -  if (sframe_ra_tracking_p ()
>> -      && sframe_fre->ra_loc == SFRAME_FRE_ELEM_LOC_STACK)
>> +  if (sframe_ra_tracking_p ())
>>       {
>> -      fre_offset_func_map[idx].out_func (sframe_fre->ra_offset);
>> -      fre_write_offsets++;
>> +      if (sframe_fre->ra_loc == SFRAME_FRE_ELEM_LOC_STACK)
>> +    {
>> +      fre_offset_func_map[idx].out_func (sframe_fre->ra_offset);
>> +      fre_write_offsets++;
>> +    }
>> +      /* Write padding RA offset if FP without RA on stack.  */
>> +      else if (sframe_fre->bp_loc == SFRAME_FRE_ELEM_LOC_STACK)
>> +    {
>> +      fre_offset_func_map[idx].out_func (SFRAME_FRE_RA_OFFSET_INVALID);
>> +      fre_write_offsets++;
>> +    }
>>       }
>>   #endif
>>     if (sframe_fre->bp_loc == SFRAME_FRE_ELEM_LOC_STACK)
>> @@ -1497,25 +1512,6 @@ sframe_do_fde (struct sframe_xlate_ctx *xlate_ctx,
>>       = get_dw_fde_end_addrS (xlate_ctx->dw_fde);
>>       }
>> -#ifdef SFRAME_FRE_RA_TRACKING
>> -  if (sframe_ra_tracking_p ())
>> -    {
>> -      struct sframe_row_entry *fre;
>> -
>> -      /* Iterate over the scratchpad FREs and validate them.  */
>> -      for (fre = xlate_ctx->first_fre; fre; fre = fre->next)
>> -    {
>> -      /* SFrame format cannot represent FP on stack without RA on 
>> stack.  */
>> -      if (fre->ra_loc != SFRAME_FRE_ELEM_LOC_STACK
>> -          && fre->bp_loc == SFRAME_FRE_ELEM_LOC_STACK)
>> -        {
>> -          as_warn (_("skipping SFrame FDE due to FP without RA on 
>> stack"));
>> -          return SFRAME_XLATE_ERR_NOTREPRESENTED;
>> -        }
>> -    }
>> -    }
>> -#endif /* SFRAME_FRE_RA_TRACKING  */
>> -
>>     return SFRAME_XLATE_OK;
>>   }
>> diff --git a/include/sframe.h b/include/sframe.h
>> index 90bc92a32f84..d1a26875b3e2 100644
>> --- a/include/sframe.h
>> +++ b/include/sframe.h
>> @@ -237,6 +237,9 @@ typedef struct sframe_func_desc_entry
>>      may or may not be tracked.  */
>>   #define SFRAME_FRE_FP_OFFSET_IDX    2
>> +/* Invalid RA offset.  Used as padding to represent FP without RA on 
>> stack.  */
>> +#define SFRAME_FRE_RA_OFFSET_INVALID 0
>> +
>>   typedef struct sframe_fre_info
>>   {
>>     /* Information about
>> @@ -288,9 +291,11 @@ typedef struct sframe_fre_info
>>       offset1 (interpreted as CFA = BASE_REG + offset1)
>>       if RA is being tracked
>> -      offset2 (interpreted as RA = CFA + offset2)
>> +      offset2 (interpreted as RA = CFA + offset2; an offset value of
>> +           SFRAME_FRE_RA_OFFSET_INVALID indicates a dummy padding RA 
>> offset
>> +           to represent FP without RA saved on stack)
>>         if FP is being tracked
>> -    offset3 (intrepreted as FP = CFA + offset2)
>> +    offset3 (intrepreted as FP = CFA + offset3)
> 
> I too noticed this typo recently and have a patch fixing this.
> 
>>         fi
>>       else
>>         if FP is being tracked
>> diff --git a/libsframe/sframe-dump.c b/libsframe/sframe-dump.c
>> index 40ea531314ba..3ea4bc327efd 100644
>> --- a/libsframe/sframe-dump.c
>> +++ b/libsframe/sframe-dump.c
>> @@ -199,6 +199,10 @@ dump_sframe_func_with_fres (sframe_decoder_ctx 
>> *sfd_ctx,
>>         if (sframe_decoder_get_fixed_ra_offset (sfd_ctx)
>>         != SFRAME_CFA_FIXED_RA_INVALID)
>>       strcpy (temp, "f");
>> +      /* If an ABI does track RA offset, e.g. AArch64 and S390, it 
>> can be a
>> +     dummy as padding to represent FP without RA being saved on 
>> stack.  */
>> +      else if (err[2] == 0 && ra_offset == SFRAME_FRE_RA_OFFSET_INVALID)
>> +    sprintf (temp, "u*");
>>         else if (err[2] == 0)
>>       {
>>         if (is_sframe_abi_arch_s390 (sfd_ctx) && (ra_offset & 1))
> 

Regards,
Jens
-- 
Jens Remus
Linux on Z Development (D3303) and z/VSE Support
+49-7031-16-1128 Office
jremus@de.ibm.com



* Re: [RFC PATCH 1/1] sframe: Represent FP without RA on stack
  2024-04-23 15:44     ` Jens Remus
@ 2024-04-25  6:59       ` Indu Bhagat
  0 siblings, 0 replies; 5+ messages in thread
From: Indu Bhagat @ 2024-04-25  6:59 UTC (permalink / raw)
  To: Jens Remus, binutils; +Cc: Andreas Krebbel

On 4/23/24 08:44, Jens Remus wrote:
> On 23.04.2024 at 01:58, Indu Bhagat wrote:
>> On 4/22/24 08:58, Jens Remus wrote:
>>> If an architecture uses both SFrame RA and FP tracking SFrame assumes
>>> that the RA offset is the 2nd offset and the FP offset is the 3rd offset
>>> following the SFrame FRE. An architecture does not need to store both on
>>> the stack. SFrame cannot represent a FP without RA on stack, since it
>>> cannot distinguish whether the 2nd offset is the RA or FP offset.
>>>
>>> Use an invalid SFrame FRE RA offset value of zero as dummy padding to
>>> represent the FP being saved on the stack when the RA is not saved on
>>> the stack.
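
For readers following along, here is a minimal sketch of how a consumer
might interpret the offsets under this proposal, assuming an ABI that tracks
both RA and FP. The struct fre_view type and interpret_fre_offsets() are
hypothetical names for illustration and are not part of libsframe; only the
macro name and its value of zero come from the commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Value as described in the commit message: an RA offset of zero is
   treated as invalid and serves as dummy padding.  */
#define SFRAME_FRE_RA_OFFSET_INVALID 0

/* Hypothetical decoded view of one FRE; not a libsframe type.  */
struct fre_view
{
  int32_t cfa_offset;
  bool ra_saved;
  int32_t ra_offset;
  bool fp_saved;
  int32_t fp_offset;
};

/* Interpret the NUM offsets following an FRE on an ABI that tracks both
   RA and FP: offset1 = CFA, offset2 = RA (or padding), offset3 = FP.  */
static struct fre_view
interpret_fre_offsets (const int32_t *offsets, size_t num)
{
  struct fre_view v = { 0, false, 0, false, 0 };

  assert (offsets != NULL && num >= 1 && num <= 3);
  v.cfa_offset = offsets[0];

  /* A padding value in the RA slot means "RA not saved on stack".  */
  if (num >= 2 && offsets[1] != SFRAME_FRE_RA_OFFSET_INVALID)
    {
      v.ra_saved = true;
      v.ra_offset = offsets[1];
    }

  /* The FP offset is always the 3rd offset, so it stays unambiguous.  */
  if (num == 3)
    {
      v.fp_saved = true;
      v.fp_offset = offsets[2];
    }
  return v;
}
```

With illustrative offsets { 160, 0, -72 } this yields a CFA offset of 160,
no saved RA, and FP = CFA - 72. Without the padding, the same FRE would
carry only two offsets and -72 would be misread as the RA offset.
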
>>>
>>> include/
>>>     * sframe.h (SFRAME_FRE_RA_OFFSET_INVALID): New macro defining
>>>     the invalid RA offset value used to represent a dummy padding
>>>     offset.
>>>
>>> gas/
>>>     * gen-sframe.c (get_fre_num_offsets): Accommodate for dummy
>>>     padding RA offset if FP without RA on stack.
>>>     (sframe_get_fre_offset_size): Likewise.
>>>     (output_sframe_row_entry): Write a dummy padding RA offset
>>>     if FP without RA needs to be represented.
>>>
>>> libsframe/
>>>     * sframe-dump.c (dump_sframe_func_with_fres): Treat invalid RA
>>>     offsets as if they were undefined. Display them as "u*" to
>>>     distinguish them.
>>>
>>> Signed-off-by: Jens Remus <jremus@linux.ibm.com>
>>> ---
>>>
>>> Notes (jremus):
>>>      This patch eliminates 497 occurrences of the warning "skipping SFrame
>>>      FDE due to FP without RA on stack" for a build of glibc on s390x. For
>>>      libc.so this increases the number of FDEs by 166 and the number of
>>>      FREs by 861, while adding 337 dummy padding RA offsets. With a total
>>>      of 28157 offsets the dummy padding offsets account for ~1.20 % of the
>>>      offsets.
>>
>> While this increase seems small, it does look wasteful.
>>
>> An orthogonal question below...
>>
>>>      SFrame statistics without patch:
>>>          VALUE        TOTAL      MIN        MAX        AVG
>>>          FDEs:        3478       -          -          -
>>>          FREs/FDE:    14441      1          15         4
>>>          Offsets/FDE: 28157      1          31         8
>>>             8-bit:    0          0          0          0
>>>            16-bit:    28157      1          31         8
>>>            32-bit:    0          0          0          0
>>>          Offsets/FRE: 28157      1          3          1
>>>             8-bit:    -          0          0          0
>>>            16-bit:    -          1          3          1
>>>            32-bit:    -          0          0          0
>>>      SFrame statistics with patch applied:
>>>          VALUE        TOTAL      MIN        MAX        AVG
>>>          FDEs:        3644       -          -          -
>>>          FREs/FDE:    15302      1          20         4
>>>          Offsets/FDE: 29944      1          38         8
>>>             8-bit:    0          0          0          0
>>>            16-bit:    29944      1          38         8
>>>            32-bit:    0          0          0          0
>>>          Offsets/FRE: 29944      1          3          1
>>>             8-bit:    -          0          0          0
>>>            16-bit:    -          1          3          1
>>>            32-bit:    -          0          0          0
>>>          O_Padd/FDE:  337        -          -          0
>>>             8-bit:    0
>>>            16-bit:    337
>>>            32-bit:    0
>>>      Note that on s390x the offsets are at minimum 16-bits in size, due to
>>>      the mandatory CFA offset being at least 160.
>>>
>>
>> IIUC, all stack layouts supported in the ABI use the offset 160. Is 
>> that right ? I am wondering if adjusting the stored offsets in the 
>> SFrame section (by decrementing 160 from it) will work ?
>>
>> If yes, we could encode this constant in SFrame aux hdr bytes for s390x.
> 
> Thank you for the hint! Using a constant adjustment of -160 on s390x for 
> the CFA offset from CFA base register should work to allow for 8-bit 
> offsets to be used. Aren't all tracked offsets (i.e. CFA, FP, and RA) 
> signed anyway? Thus applying a constant adjustment should work in any case?
> 

Yes, these offsets are signed.  Applying the constant should work for CFA.

> Couldn't it simply be an architecture specific constant in the code to 
> begin with? For example a new macro, which is only applied when defined?
> 
> #define S390_SFRAME_CFA_OFFSET_ADJUSTMENT -160
> #define SFRAME_CFA_OFFSET_ADJUSTMENT S390_SFRAME_CFA_OFFSET_ADJUSTMENT
> 

I think a macro should also work in this case.
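
To make the effect concrete, a minimal sketch of the adjustment, assuming
it is applied only to the CFA offset. The macro name and the value -160 are
from the proposal quoted above (parentheses added); the helper functions are
hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Name and value as proposed above; parenthesized for safe expansion.  */
#define S390_SFRAME_CFA_OFFSET_ADJUSTMENT (-160)
#define SFRAME_CFA_OFFSET_ADJUSTMENT S390_SFRAME_CFA_OFFSET_ADJUSTMENT

/* Hypothetical helper: value the assembler would store in .sframe.  */
static int32_t
encode_cfa_offset (int32_t cfa_offset)
{
  /* Guard against signed overflow when adding the negative constant.  */
  assert (cfa_offset >= INT32_MIN - SFRAME_CFA_OFFSET_ADJUSTMENT);
  return cfa_offset + SFRAME_CFA_OFFSET_ADJUSTMENT;
}

/* Hypothetical helper: value an unwinder would recover from .sframe.  */
static int32_t
decode_cfa_offset (int32_t stored)
{
  return stored - SFRAME_CFA_OFFSET_ADJUSTMENT;
}

/* Smallest signed offset size, in bytes, that can hold VALUE.  */
static int
min_offset_bytes (int32_t value)
{
  if (value >= INT8_MIN && value <= INT8_MAX)
    return 1;
  if (value >= INT16_MIN && value <= INT16_MAX)
    return 2;
  return 4;
}
```

An unadjusted CFA offset of 160 needs 2 bytes, while the adjusted value
160 - 160 = 0 fits in 1 byte. With this constant, CFA offsets from 32 to 287
would fit into a signed 8-bit offset.
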

> Implementing this in the SFrame auxiliary header would of course allow 
> to implement this enhancement at a later stage and to change the 
> adjustment value in the future, as the linker can then either reject or 
> merge different adjustment values during link editing.
> 
> I wonder whether it would make sense to store the FP register number in 
> the SFrame auxiliary header for s390x as well. Register 11 is just a 
> convention of the compilers and not defined by the ABI. That would 
> enable us to choose a different register as frame pointer in the future.
> 

I think the problem will remain that there is at the moment no way to
communicate this information to the assembler (that the compiler used r11
in the fp role). And even if there was a way, I am not so sure.  Since the
ABI doesn't mandate r11 as fp, the compiler may pick another register for,
say, a different stack layout etc. in the future?  IOW, even the compiler
picking a different fp register across functions is a possibility, no? So
what is expected of the compiler then...




Thread overview: 5+ messages
2024-04-22 15:58 [RFC PATCH 0/1] sframe: Represent FP without RA on stack (padding) Jens Remus
2024-04-22 15:58 ` [RFC PATCH 1/1] sframe: Represent FP without RA on stack Jens Remus
2024-04-22 23:58   ` Indu Bhagat
2024-04-23 15:44     ` Jens Remus
2024-04-25  6:59       ` Indu Bhagat
