public inbox for binutils@sourceware.org
 help / color / mirror / Atom feed
* [RFC PATCH 0/1] sframe: Represent FP without RA on stack (bitmap)
@ 2024-04-22 15:59 Jens Remus
  2024-04-22 15:59 ` [RFC PATCH 1/1] sframe: Represent FP without RA on stack Jens Remus
  2024-04-22 23:59 ` [RFC PATCH 0/1] sframe: Represent FP without RA on stack (bitmap) Indu Bhagat
  0 siblings, 2 replies; 5+ messages in thread
From: Jens Remus @ 2024-04-22 15:59 UTC (permalink / raw)
  To: binutils, Indu Bhagat; +Cc: Jens Remus, Andreas Krebbel

This patch series adds support in SFrame to represent the frame pointer
(FP) without the return address (RA) being saved on the stack (and/or on
s390x in another register).

This is the second of two proposed alternatives:
1. The alternative patch series uses a dummy padding offset (invalid
   offset from CFA value of zero) as RA offset to represent FP without
   RA on saved on the stack.
2. This patch series changes the SFrame FRE count field into
   a bitmap, to convey which offsets follow the FRE.

Note that it currently applies on top of my v3 patch series series that
adds initial support to generate .sframe from CFI directives on s390x,
although it is independent of that.

A SFrame FRE currently has a 4-bit field containing the offset count
that follow the FRE. While this could account for up to 15 offsets (or
16, when excluding the mandatory CFA offset from CFA base register), it
cannot represent which of these offsets actually follow.

Redefining the 4-bit count field into a 4-bit offset bitmap allows to
track up to 4 offsets (or 5, when excluding the mandatory CFA offset
from CFA base register).

The main downside of this approach is that this is potentially a major
change to the SFrame V2 format, which may require a bump to V3. The
benefits are that (1) it does not add any dummy padding offsets, which
would unnecessarily add bloat to the SFrame information, and that (2)
it does not change any of the external SFrame API.
Using a lookup table the bitmap can easily be translated into an offset
count. Similar any logic that checks the presence of an offset can
easily be implemented using a bit test.

Note that there is a minor implementation issue with regards to the
internal API methods (callbacks), due to the change in SFrame format
changing a method argument from count to bitmap.
Additionally this initial implementation lacks better naming of the
tracked register IDs and any update of the SFrame format specification.

Thanks and regards,
Jens


Jens Remus (1):
  sframe: Represent FP without RA on stack

 gas/gen-sframe.c   | 66 ++++++++++++++++++++++++++++------------------
 include/sframe.h   | 13 ++++++---
 libsframe/sframe.c | 51 +++++++++++++++++++++++++++--------
 3 files changed, 90 insertions(+), 40 deletions(-)

-- 
2.40.1


^ permalink raw reply	[flat|nested] 5+ messages in thread

* [RFC PATCH 1/1] sframe: Represent FP without RA on stack
  2024-04-22 15:59 [RFC PATCH 0/1] sframe: Represent FP without RA on stack (bitmap) Jens Remus
@ 2024-04-22 15:59 ` Jens Remus
  2024-04-22 23:59 ` [RFC PATCH 0/1] sframe: Represent FP without RA on stack (bitmap) Indu Bhagat
  1 sibling, 0 replies; 5+ messages in thread
From: Jens Remus @ 2024-04-22 15:59 UTC (permalink / raw)
  To: binutils, Indu Bhagat; +Cc: Jens Remus, Andreas Krebbel

If an architecture uses both SFrame RA and FP tracking SFrame assumes
that the RA offset is the 2nd offset and the FP offset is the 3rd offset
following the SFrame FRE. An architecture does not need to store both on
the stack. SFrame cannot represent a FP without RA on stack.

Update the SFrame V2 FRE info word format to redefine the 4-bit offset
count field from the SFrame V1 FRE info word as 4-bit offset presence
flags field in the SFrame V2 FRE info word. This allows up to four
optional offsets to be tracked. While the mandatory CFA offset is
represented in the offset presence flags, it could be omitted to free
up one offset presence flag for an optional offset.

include/
        * sframe.h: Updated SFrame FRE info word for V2. Redefine the
	4-bit offset count field as offset presence bits.
	(SFRAME_V2_FRE_INFO): New macro to build the SFrame V2 FRE info
	word using offset presence bits instead of offset count.
	(SFRAME_V2_FRE_OFFSET_FLAGS): New macro to extract the offset
	presence flags from the SFrame V2 FRE info word.

gas/
        * gen-sframe.c (sframe_v2_set_fre_info): New function to build
	the SFrame V2 FRE info word using offset presence bits instead
	of offset count.
	(sframe_set_version): Use new sframe_v2_set_fre_info to set the
	SFrame V2 FRE info word.
	(sframe_set_fre_info): Redefine parameter num_offsets as
	offset_count_or_flags.
	(get_fre_offset_flags): New helper to compute the offset
	presence flags required for a SFrame V2 FRE.
	(output_sframe_row_entry): Use new get_fre_offset_flags helper
	and pass the offset presence flags instead of offset count to
	sframe_set_fre_info.
	(sframe_do_fde): Revert commit "gas: Skip SFrame FDE if FP
	without RA on stack".

libsframe/
        * sframe.c (sframe_get_fre_offset_count): Compute offset count
	from offset presence bits using a lookup table.
	(sframe_get_fre_offset_index): New helper to compute the offset
	index from the offset presence bits and the offset ID.
	(sframe_fre_get_cfa_offset, sframe_fre_get_fp_offset,
	sframe_fre_get_ra_offset): Use new sframe_get_fre_offset_index
	helper to determine the CFA, FP, and RA offset index.

Signed-off-by: Jens Remus <jremus@linux.ibm.com>
---

Notes (jremus):
    TODO:
    - Change wording of SFRAME_FRE_{CFA,FP,RA}_OFFSET_IDX, as they no longer
      represent an index, which the FP offset actually never really did.
      Maybe use _ID instead of _IDX?
    - Update SFrame V2 specification accordingly. Can this change still be
      included in V2 or would it require a V3?

 gas/gen-sframe.c   | 66 ++++++++++++++++++++++++++++------------------
 include/sframe.h   | 13 ++++++---
 libsframe/sframe.c | 51 +++++++++++++++++++++++++++--------
 3 files changed, 90 insertions(+), 40 deletions(-)

diff --git a/gas/gen-sframe.c b/gas/gen-sframe.c
index 4cc86eb6c815..7f3b406c7882 100644
--- a/gas/gen-sframe.c
+++ b/gas/gen-sframe.c
@@ -251,7 +251,7 @@ fre_offset_func_map[SFRAME_FRE_OFFSET_FUNC_MAP_INDEX_MAX+1] =
 static struct sframe_version_ops sframe_ver_ops;
 
 /* SFrame (SFRAME_VERSION_1) set FRE info.  */
-
+/*
 static unsigned char
 sframe_v1_set_fre_info (unsigned int base_reg, unsigned int num_offsets,
 			unsigned int offset_size, bool mangled_ra_p)
@@ -261,6 +261,19 @@ sframe_v1_set_fre_info (unsigned int base_reg, unsigned int num_offsets,
   fre_info = SFRAME_V1_FRE_INFO_UPDATE_MANGLED_RA_P (mangled_ra_p, fre_info);
   return fre_info;
 }
+*/
+
+/* SFrame (SFRAME_VERSION_2) set FRE info.  */
+
+static unsigned char
+sframe_v2_set_fre_info (unsigned int base_reg, unsigned int offset_flags,
+			unsigned int offset_size, bool mangled_ra_p)
+{
+  unsigned char fre_info;
+  fre_info = SFRAME_V2_FRE_INFO (base_reg, offset_flags, offset_size);
+  fre_info = SFRAME_V1_FRE_INFO_UPDATE_MANGLED_RA_P (mangled_ra_p, fre_info);
+  return fre_info;
+}
 
 /* SFrame (SFRAME_VERSION_1) set function info.  */
 static unsigned char
@@ -280,10 +293,10 @@ sframe_set_version (uint32_t sframe_version ATTRIBUTE_UNUSED)
 {
   sframe_ver_ops.format_version = SFRAME_VERSION_2;
 
-  /* These operations remain the same for SFRAME_VERSION_2 as fre_info and
-     func_info have not changed from SFRAME_VERSION_1.  */
+  sframe_ver_ops.set_fre_info = sframe_v2_set_fre_info;
 
-  sframe_ver_ops.set_fre_info = sframe_v1_set_fre_info;
+  /* This operation remains the same for SFRAME_VERSION_2 as func_info
+     has not changed from SFRAME_VERSION_1.  */
 
   sframe_ver_ops.set_func_info = sframe_v1_set_func_info;
 }
@@ -291,10 +304,10 @@ sframe_set_version (uint32_t sframe_version ATTRIBUTE_UNUSED)
 /* SFrame set FRE info.  */
 
 static unsigned char
-sframe_set_fre_info (unsigned int base_reg, unsigned int num_offsets,
+sframe_set_fre_info (unsigned int base_reg, unsigned int offset_count_or_flags,
 		     unsigned int offset_size, bool mangled_ra_p)
 {
-  return sframe_ver_ops.set_fre_info (base_reg, num_offsets,
+  return sframe_ver_ops.set_fre_info (base_reg, offset_count_or_flags,
 				      offset_size, mangled_ra_p);
 }
 
@@ -353,6 +366,24 @@ get_fre_num_offsets (struct sframe_row_entry *sframe_fre)
   return fre_num_offsets;
 }
 
+/* Get offset flags necessary for the SFrame Frame Row Entry.  */
+
+static unsigned int
+get_fre_offset_flags (struct sframe_row_entry *sframe_fre)
+{
+  /* CFA offset must always be present.  */
+  unsigned int fre_offset_flags = 1 << SFRAME_FRE_CFA_OFFSET_IDX;
+
+  if (sframe_fre->bp_loc == SFRAME_FRE_ELEM_LOC_STACK)
+    fre_offset_flags |= 1 << SFRAME_FRE_FP_OFFSET_IDX;
+#ifdef SFRAME_FRE_RA_TRACKING
+  if (sframe_ra_tracking_p ()
+      && sframe_fre->ra_loc == SFRAME_FRE_ELEM_LOC_STACK)
+    fre_offset_flags |= 1 << SFRAME_FRE_RA_OFFSET_IDX;
+#endif
+  return fre_offset_flags;
+}
+
 /* Get the minimum necessary offset size (in bytes) for this
    SFrame frame row entry.  */
 
@@ -494,6 +525,7 @@ output_sframe_row_entry (symbolS *fde_start_addr,
 {
   unsigned char fre_info;
   unsigned int fre_num_offsets;
+  unsigned int fre_offset_flags;
   unsigned int fre_offset_size;
   unsigned int fre_base_reg;
   expressionS exp;
@@ -524,8 +556,9 @@ output_sframe_row_entry (symbolS *fde_start_addr,
      size of offset in this frame row entry.  */
   fre_base_reg = get_fre_base_reg_id (sframe_fre);
   fre_num_offsets = get_fre_num_offsets (sframe_fre);
+  fre_offset_flags = get_fre_offset_flags (sframe_fre);
   fre_offset_size = sframe_get_fre_offset_size (sframe_fre);
-  fre_info = sframe_set_fre_info (fre_base_reg, fre_num_offsets,
+  fre_info = sframe_set_fre_info (fre_base_reg, fre_offset_flags,
 				  fre_offset_size, sframe_fre->mangled_ra_p);
   out_one (fre_info);
 
@@ -1497,25 +1530,6 @@ sframe_do_fde (struct sframe_xlate_ctx *xlate_ctx,
 	= get_dw_fde_end_addrS (xlate_ctx->dw_fde);
     }
 
-#ifdef SFRAME_FRE_RA_TRACKING
-  if (sframe_ra_tracking_p ())
-    {
-      struct sframe_row_entry *fre;
-
-      /* Iterate over the scratchpad FREs and validate them.  */
-      for (fre = xlate_ctx->first_fre; fre; fre = fre->next)
-	{
-	  /* SFrame format cannot represent FP on stack without RA on stack.  */
-	  if (fre->ra_loc != SFRAME_FRE_ELEM_LOC_STACK
-	      && fre->bp_loc == SFRAME_FRE_ELEM_LOC_STACK)
-	    {
-	      as_warn (_("skipping SFrame FDE due to FP without RA on stack"));
-	      return SFRAME_XLATE_ERR_NOTREPRESENTED;
-	    }
-	}
-    }
-#endif /* SFRAME_FRE_RA_TRACKING  */
-
   return SFRAME_XLATE_OK;
 }
 
diff --git a/include/sframe.h b/include/sframe.h
index 90bc92a32f84..093dca6cda9a 100644
--- a/include/sframe.h
+++ b/include/sframe.h
@@ -241,14 +241,14 @@ typedef struct sframe_fre_info
 {
   /* Information about
      - 1 bit: base reg for CFA
-     - 4 bits: Number of offsets (N).  A value of upto 3 is allowed to track
-     all three of CFA, FP and RA (fixed implicit order).
+     - 4 bits: Flags for offsets.  Upto 3 are allowed to track
+     all three of CFA, FP and RA (fixed order).
      - 2 bits: information about size of the offsets (S) in bytes.
      Valid values are SFRAME_FRE_OFFSET_1B, SFRAME_FRE_OFFSET_2B,
      SFRAME_FRE_OFFSET_4B
      - 1 bit: Mangled RA state bit (aarch64 only).
      ----------------------------------------------------------------------------------
-     | Mangled-RA (aarch64) |  Size of offsets   |   Number of offsets    |   base_reg |
+     | Mangled-RA (aarch64) |  Size of offsets   |   Flags for offsets    |   base_reg |
      | Unused (amd64, s390) |                    |                        |            |
      ----------------------------------------------------------------------------------
      8                     7                    5                        1            0
@@ -264,6 +264,11 @@ typedef struct sframe_fre_info
   (((0 & 0x1) << 7) | (((offset_size) & 0x3) << 5) | \
    (((offset_num) & 0xf) << 1) | ((base_reg_id) & 0x1))
 
+/* Note: Set mangled_ra_p to zero by default.  */
+#define SFRAME_V2_FRE_INFO(base_reg_id, offset_flags, offset_size) \
+  (((0 & 0x1) << 7) | (((offset_size) & 0x3) << 5) | \
+   (((offset_flags) & 0xf) << 1) | ((base_reg_id) & 0x1))
+
 /* Set the mangled_ra_p bit as indicated.  */
 #define SFRAME_V1_FRE_INFO_UPDATE_MANGLED_RA_P(mangled_ra_p, fre_info) \
   ((((mangled_ra_p) & 0x1) << 7) | ((fre_info) & 0x7f))
@@ -273,6 +278,8 @@ typedef struct sframe_fre_info
 #define SFRAME_V1_FRE_OFFSET_SIZE(data)		  (((data) >> 5) & 0x3)
 #define SFRAME_V1_FRE_MANGLED_RA_P(data)	  (((data) >> 7) & 0x1)
 
+#define SFRAME_V2_FRE_OFFSET_FLAGS(data)	  (((data) >> 1) & 0xf)
+
 /* SFrame Frame Row Entry definitions.
 
    Used for AMD64, AARCH64, and S390.
diff --git a/libsframe/sframe.c b/libsframe/sframe.c
index 460face4168e..f43f9eaecfd1 100644
--- a/libsframe/sframe.c
+++ b/libsframe/sframe.c
@@ -115,7 +115,10 @@ sframe_get_hdr_size (sframe_header *sfh)
 static uint8_t
 sframe_fre_get_offset_count (uint8_t fre_info)
 {
-  return SFRAME_V1_FRE_OFFSET_COUNT (fre_info);
+  unsigned int fre_offset_flags = SFRAME_V2_FRE_OFFSET_FLAGS (fre_info);
+  const char flags_to_count[16] = { 0, 1, 1, 2, 1, 2, 2, 3, 1, 2, 2, 3, 2, 3, 3, 4 };
+
+  return flags_to_count[fre_offset_flags];
 }
 
 static uint8_t
@@ -130,6 +133,19 @@ sframe_get_fre_ra_mangled_p (uint8_t fre_info)
   return SFRAME_V1_FRE_MANGLED_RA_P (fre_info);
 }
 
+static int
+sframe_get_fre_offset_index (uint8_t fre_info, unsigned int tracked_offset)
+{
+  unsigned int fre_offset_flags = SFRAME_V2_FRE_OFFSET_FLAGS (fre_info);
+  const char flags_to_count[16] = { 0, 1, 1, 2, 1, 2, 2, 3, 1, 2, 2, 3, 2, 3, 3, 4 };
+  unsigned int mask = (1 << tracked_offset) - 1;
+
+  if ((fre_offset_flags & (1 << tracked_offset)) == 0)
+    return -1;
+
+  return flags_to_count[fre_offset_flags & mask];
+}
+
 /* Access functions for info from function descriptor entry.  */
 
 static uint32_t
@@ -680,7 +696,13 @@ int32_t
 sframe_fre_get_cfa_offset (sframe_decoder_ctx *dctx ATTRIBUTE_UNUSED,
 			   sframe_frame_row_entry *fre, int *errp)
 {
-  return sframe_get_fre_offset (fre, SFRAME_FRE_CFA_OFFSET_IDX, errp);
+  int cfa_offset_idx = sframe_get_fre_offset_index (fre->fre_info,
+						    SFRAME_FRE_CFA_OFFSET_IDX);
+  if (cfa_offset_idx == -1)
+    return sframe_set_errno (errp, SFRAME_ERR_FREOFFSET_NOPRESENT);
+  sframe_assert (cfa_offset_idx == 0);
+
+  return sframe_get_fre_offset (fre, cfa_offset_idx, errp);
 }
 
 /* Get the FP offset from the FRE.  If the offset is invalid, sets errp.  */
@@ -689,7 +711,7 @@ int32_t
 sframe_fre_get_fp_offset (sframe_decoder_ctx *dctx,
 			  sframe_frame_row_entry *fre, int *errp)
 {
-  uint32_t fp_offset_idx = 0;
+  int fp_offset_idx;
   int8_t fp_offset = sframe_decoder_get_fixed_fp_offset (dctx);
   /* If the FP offset is not being tracked, return the fixed FP offset
      from the SFrame header.  */
@@ -700,13 +722,13 @@ sframe_fre_get_fp_offset (sframe_decoder_ctx *dctx,
       return fp_offset;
     }
 
-  /* In some ABIs, the stack offset to recover RA (using the CFA) from is
-     fixed (like AMD64).  In such cases, the stack offset to recover FP will
-     appear at the second index.  */
-  fp_offset_idx = ((sframe_decoder_get_fixed_ra_offset (dctx)
-		    != SFRAME_CFA_FIXED_RA_INVALID)
-		   ? SFRAME_FRE_RA_OFFSET_IDX
-		   : SFRAME_FRE_FP_OFFSET_IDX);
+  /* Get the FP offset index and check that it is valid.  */
+  fp_offset_idx = sframe_get_fre_offset_index (fre->fre_info,
+					       SFRAME_FRE_FP_OFFSET_IDX);
+  if (fp_offset_idx == -1)
+    return sframe_set_errno (errp, SFRAME_ERR_FREOFFSET_NOPRESENT);
+
+  /* Otherwise, get the FP offset from the FRE.  */
   return sframe_get_fre_offset (fre, fp_offset_idx, errp);
 }
 
@@ -716,6 +738,7 @@ int32_t
 sframe_fre_get_ra_offset (sframe_decoder_ctx *dctx,
 			  sframe_frame_row_entry *fre, int *errp)
 {
+  int ra_offset_idx;
   int8_t ra_offset = sframe_decoder_get_fixed_ra_offset (dctx);
   /* If the RA offset was not being tracked, return the fixed RA offset
      from the SFrame header.  */
@@ -726,8 +749,14 @@ sframe_fre_get_ra_offset (sframe_decoder_ctx *dctx,
       return ra_offset;
     }
 
+  /* Get the RA offset index and check that it is valid.  */
+  ra_offset_idx = sframe_get_fre_offset_index (fre->fre_info,
+					       SFRAME_FRE_RA_OFFSET_IDX);
+  if (ra_offset_idx == -1)
+    return sframe_set_errno (errp, SFRAME_ERR_FREOFFSET_NOPRESENT);
+
   /* Otherwise, get the RA offset from the FRE.  */
-  return sframe_get_fre_offset (fre, SFRAME_FRE_RA_OFFSET_IDX, errp);
+  return sframe_get_fre_offset (fre, ra_offset_idx, errp);
 }
 
 /* Get whether the RA is mangled.  */
-- 
2.40.1


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [RFC PATCH 0/1] sframe: Represent FP without RA on stack (bitmap)
  2024-04-22 15:59 [RFC PATCH 0/1] sframe: Represent FP without RA on stack (bitmap) Jens Remus
  2024-04-22 15:59 ` [RFC PATCH 1/1] sframe: Represent FP without RA on stack Jens Remus
@ 2024-04-22 23:59 ` Indu Bhagat
  2024-04-23  9:54   ` Jens Remus
  1 sibling, 1 reply; 5+ messages in thread
From: Indu Bhagat @ 2024-04-22 23:59 UTC (permalink / raw)
  To: Jens Remus, binutils; +Cc: Andreas Krebbel

On 4/22/24 08:59, Jens Remus wrote:
> This patch series adds support in SFrame to represent the frame pointer
> (FP) without the return address (RA) being saved on the stack (and/or on
> s390x in another register).
> 
> This is the second of two proposed alternatives:
> 1. The alternative patch series uses a dummy padding offset (invalid
>     offset from CFA value of zero) as RA offset to represent FP without
>     RA on saved on the stack.
> 2. This patch series changes the SFrame FRE count field into
>     a bitmap, to convey which offsets follow the FRE.
> 
> Note that it currently applies on top of my v3 patch series series that
> adds initial support to generate .sframe from CFI directives on s390x,
> although it is independent of that.
> 
> A SFrame FRE currently has a 4-bit field containing the offset count
> that follow the FRE. While this could account for up to 15 offsets (or
> 16, when excluding the mandatory CFA offset from CFA base register), it
> cannot represent which of these offsets actually follow.
> 
> Redefining the 4-bit count field into a 4-bit offset bitmap allows to
> track up to 4 offsets (or 5, when excluding the mandatory CFA offset
> from CFA base register).
> 

This approach, in its current form, immensely confines the future 
adaptability of the format.

My recommendation would be to avoid such changes to format where is 
becomes more restrictive for future needs.  While it is generally 
recommended to not add more registers to track, confining it now to 4 
(or 5) offsets now seems rather limiting.

I have two suggestions to resolve this issue of "FP without RA on 
stack".  I will reply on the other thread.

> The main downside of this approach is that this is potentially a major
> change to the SFrame V2 format, which may require a bump to V3. The
> benefits are that (1) it does not add any dummy padding offsets, which
> would unnecessarily add bloat to the SFrame information, and that (2)
> it does not change any of the external SFrame API.
> Using a lookup table the bitmap can easily be translated into an offset
> count. Similar any logic that checks the presence of an offset can
> easily be implemented using a bit test.
> 
> Note that there is a minor implementation issue with regards to the
> internal API methods (callbacks), due to the change in SFrame format
> changing a method argument from count to bitmap.
> Additionally this initial implementation lacks better naming of the
> tracked register IDs and any update of the SFrame format specification.
> 
> Thanks and regards,
> Jens
> 
> 
> Jens Remus (1):
>    sframe: Represent FP without RA on stack
> 
>   gas/gen-sframe.c   | 66 ++++++++++++++++++++++++++++------------------
>   include/sframe.h   | 13 ++++++---
>   libsframe/sframe.c | 51 +++++++++++++++++++++++++++--------
>   3 files changed, 90 insertions(+), 40 deletions(-)
> 


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [RFC PATCH 0/1] sframe: Represent FP without RA on stack (bitmap)
  2024-04-22 23:59 ` [RFC PATCH 0/1] sframe: Represent FP without RA on stack (bitmap) Indu Bhagat
@ 2024-04-23  9:54   ` Jens Remus
  2024-04-25  7:58     ` Indu Bhagat
  0 siblings, 1 reply; 5+ messages in thread
From: Jens Remus @ 2024-04-23  9:54 UTC (permalink / raw)
  To: Indu Bhagat, binutils; +Cc: Andreas Krebbel

Am 23.04.2024 um 01:59 schrieb Indu Bhagat:
> On 4/22/24 08:59, Jens Remus wrote:
>> This patch series adds support in SFrame to represent the frame pointer
>> (FP) without the return address (RA) being saved on the stack (and/or on
>> s390x in another register).
>>
>> This is the second of two proposed alternatives:
>> 1. The alternative patch series uses a dummy padding offset (invalid
>>     offset from CFA value of zero) as RA offset to represent FP without
>>     RA on saved on the stack.
>> 2. This patch series changes the SFrame FRE count field into
>>     a bitmap, to convey which offsets follow the FRE.
>>
>> Note that it currently applies on top of my v3 patch series series that
>> adds initial support to generate .sframe from CFI directives on s390x,
>> although it is independent of that.
>>
>> A SFrame FRE currently has a 4-bit field containing the offset count
>> that follow the FRE. While this could account for up to 15 offsets (or
>> 16, when excluding the mandatory CFA offset from CFA base register), it
>> cannot represent which of these offsets actually follow.
>>
>> Redefining the 4-bit count field into a 4-bit offset bitmap allows to
>> track up to 4 offsets (or 5, when excluding the mandatory CFA offset
>> from CFA base register).
>>
> 
> This approach, in its current form, immensely confines the future 
> adaptability of the format.
> 
> My recommendation would be to avoid such changes to format where is 
> becomes more restrictive for future needs.  While it is generally 
> recommended to not add more registers to track, confining it now to 4 
> (or 5) offsets now seems rather limiting.

Isn't it rather confining that using a simple offset count cannot 
represent which of the tracked offsets follow a FRE, which effectively 
enforces a precedence in which they can be tracked? I just wonder 
whether it is really that useful to theoretically be able to track 15 
(or 16) offsets with this limitation instead of just 3 (or 4) without 
any limitations. The number of bits could also be increased in a future 
SFrame version with a different FRE layout.

I don't have enough knowledge of other architectures to judge, whether 
the s390x case of an architecture using both FP and RA tracking while 
not always saving both at once or restoring them individually is really 
exotic or might be more widespread.

I would understand if changing SFrame V2 to a bitmap is too much of a 
change and would need to wait until the next version. Using the dummy 
padding offsets in V2 (on s390x) in the meantime, which are just a minor 
change, would be a good compromise.

> I have two suggestions to resolve this issue of "FP without RA on 
> stack".  I will reply on the other thread.

Thanks a lot for your feedback! I will follow up there.

>> The main downside of this approach is that this is potentially a major
>> change to the SFrame V2 format, which may require a bump to V3. The
>> benefits are that (1) it does not add any dummy padding offsets, which
>> would unnecessarily add bloat to the SFrame information, and that (2)
>> it does not change any of the external SFrame API.
>> Using a lookup table the bitmap can easily be translated into an offset
>> count. Similar any logic that checks the presence of an offset can
>> easily be implemented using a bit test.
>>
>> Note that there is a minor implementation issue with regards to the
>> internal API methods (callbacks), due to the change in SFrame format
>> changing a method argument from count to bitmap.
>> Additionally this initial implementation lacks better naming of the
>> tracked register IDs and any update of the SFrame format specification.
>>
>> Thanks and regards,
>> Jens
>>
>>
>> Jens Remus (1):
>>    sframe: Represent FP without RA on stack
>>
>>   gas/gen-sframe.c   | 66 ++++++++++++++++++++++++++++------------------
>>   include/sframe.h   | 13 ++++++---
>>   libsframe/sframe.c | 51 +++++++++++++++++++++++++++--------
>>   3 files changed, 90 insertions(+), 40 deletions(-)
>>
> 

Regards,
Jens
-- 
Jens Remus
Linux on Z Development (D3303) and z/VSE Support
+49-7031-16-1128 Office
jremus@de.ibm.com

IBM

IBM Deutschland Research & Development GmbH; Vorsitzender des 
Aufsichtsrats: Wolfgang Wendt; Geschäftsführung: David Faller; Sitz der 
Gesellschaft: Böblingen; Registergericht: Amtsgericht Stuttgart, HRB 243294
IBM Data Privacy Statement: https://www.ibm.com/privacy/

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [RFC PATCH 0/1] sframe: Represent FP without RA on stack (bitmap)
  2024-04-23  9:54   ` Jens Remus
@ 2024-04-25  7:58     ` Indu Bhagat
  0 siblings, 0 replies; 5+ messages in thread
From: Indu Bhagat @ 2024-04-25  7:58 UTC (permalink / raw)
  To: Jens Remus, binutils; +Cc: Andreas Krebbel

On 4/23/24 02:54, Jens Remus wrote:
> Am 23.04.2024 um 01:59 schrieb Indu Bhagat:
>> On 4/22/24 08:59, Jens Remus wrote:
>>> This patch series adds support in SFrame to represent the frame pointer
>>> (FP) without the return address (RA) being saved on the stack (and/or on
>>> s390x in another register).
>>>
>>> This is the second of two proposed alternatives:
>>> 1. The alternative patch series uses a dummy padding offset (invalid
>>>     offset from CFA value of zero) as RA offset to represent FP without
>>>     RA on saved on the stack.
>>> 2. This patch series changes the SFrame FRE count field into
>>>     a bitmap, to convey which offsets follow the FRE.
>>>
>>> Note that it currently applies on top of my v3 patch series series that
>>> adds initial support to generate .sframe from CFI directives on s390x,
>>> although it is independent of that.
>>>
>>> A SFrame FRE currently has a 4-bit field containing the offset count
>>> that follow the FRE. While this could account for up to 15 offsets (or
>>> 16, when excluding the mandatory CFA offset from CFA base register), it
>>> cannot represent which of these offsets actually follow.
>>>
>>> Redefining the 4-bit count field into a 4-bit offset bitmap allows to
>>> track up to 4 offsets (or 5, when excluding the mandatory CFA offset
>>> from CFA base register).
>>>
>>
>> This approach, in its current form, immensely confines the future 
>> adaptability of the format.
>>
>> My recommendation would be to avoid such changes to format where is 
>> becomes more restrictive for future needs.  While it is generally 
>> recommended to not add more registers to track, confining it now to 4 
>> (or 5) offsets now seems rather limiting.
> 
> Isn't it rather confining that using a simple offset count cannot 
> represent which of the tracked offsets follow a FRE, which effectively 
> enforces a precedence in which they can be tracked? I just wonder 
> whether it is really that useful to theoretically be able to track 15 
> (or 16) offsets with this limitation instead of just 3 (or 4) without 
> any limitations. The number of bits could also be increased in a future 
> SFrame version with a different FRE layout.
> 

While you do have a point, changing the 4-bits which are used to 
indicate "number of offsets" to "bitmap of offsets" does limit the 
flexibility of the format to support other architectures/ABIs in future.

We sort of envisioned that this array of stack offsets (a.k.a FRE 
offsets) can even be interpreted differently by each arch/ABI:

For AMD64:
offset1 = CFA offset
offset2 if present = FP offset

For aarch64:
offset1 = CFA offset
offset2 if present is RA offset
offset3 if present is FP offset
This works for aarch64, because Frame record (if created) saves both FP 
and LR on stack.

For s390:
The stack tracer may interpret offsets as needed. You may even store 
some metadata in one of the offsets:

offset1 = CFA offset
offset2 = metadata (= read next as FP)
offset3 = FP offset

offset1 = CFA offset
offset2 = metadata (= read next as RA)
offset3 = RA offset

offset1 = CFA offset
offset2 = metadata (= read next as FP, RA)
offset3 = FP offset
offset4 = RA offset

offset1 = CFA offset
offset2 = metadata (= read next as registers)
offset3 = RA register number
offset4 = FP register number

etc.

Your solution with padding to resolve the issue of "FP without RA" (and 
storing register number for leaf functions) in the FRE offset is what 
this is in spirit.  It is useful for flexibility as you see.

> I don't have enough knowledge of other architectures to judge, whether 
> the s390x case of an architecture using both FP and RA tracking while 
> not always saving both at once or restoring them individually is really 
> exotic or might be more widespread.
> 
> I would understand if changing SFrame V2 to a bitmap is too much of a 
> change and would need to wait until the next version. Using the dummy 
> padding offsets in V2 (on s390x) in the meantime, which are just a minor 
> change, would be a good compromise.
> 

Format bump if needed should be done.  My main concern about the bitmap 
solution is the loss of flexibility.  Its hard to predict the future 
needs, so it makes sense to not lose flexibility for adaptation.

>> I have two suggestions to resolve this issue of "FP without RA on 
>> stack".  I will reply on the other thread.
> 
> Thanks a lot for your feedback! I will follow up there.
> 
>>> The main downside of this approach is that this is potentially a major
>>> change to the SFrame V2 format, which may require a bump to V3. The
>>> benefits are that (1) it does not add any dummy padding offsets, which
>>> would unnecessarily add bloat to the SFrame information, and that (2)
>>> it does not change any of the external SFrame API.
>>> Using a lookup table the bitmap can easily be translated into an offset
>>> count. Similar any logic that checks the presence of an offset can
>>> easily be implemented using a bit test.
>>>
>>> Note that there is a minor implementation issue with regards to the
>>> internal API methods (callbacks), due to the change in SFrame format
>>> changing a method argument from count to bitmap.
>>> Additionally this initial implementation lacks better naming of the
>>> tracked register IDs and any update of the SFrame format specification.
>>>
>>> Thanks and regards,
>>> Jens
>>>
>>>
>>> Jens Remus (1):
>>>    sframe: Represent FP without RA on stack
>>>
>>>   gas/gen-sframe.c   | 66 ++++++++++++++++++++++++++++------------------
>>>   include/sframe.h   | 13 ++++++---
>>>   libsframe/sframe.c | 51 +++++++++++++++++++++++++++--------
>>>   3 files changed, 90 insertions(+), 40 deletions(-)
>>>
>>
> 
> Regards,
> Jens


^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2024-04-25  7:58 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-04-22 15:59 [RFC PATCH 0/1] sframe: Represent FP without RA on stack (bitmap) Jens Remus
2024-04-22 15:59 ` [RFC PATCH 1/1] sframe: Represent FP without RA on stack Jens Remus
2024-04-22 23:59 ` [RFC PATCH 0/1] sframe: Represent FP without RA on stack (bitmap) Indu Bhagat
2024-04-23  9:54   ` Jens Remus
2024-04-25  7:58     ` Indu Bhagat

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).