public inbox for newlib@sourceware.org
* [PATCH v5 0/8] Implement assembly cortex-M PACBTI functionality
@ 2022-12-21 11:03 Victor Do Nascimento
  2022-12-21 11:19 ` [PATCH v5 1/8] newlib: libc: define M-profile PACBTI-enablement macros Victor L. Do Nascimento
                   ` (7 more replies)
  0 siblings, 8 replies; 15+ messages in thread
From: Victor Do Nascimento @ 2022-12-21 11:03 UTC (permalink / raw)
  To: newlib; +Cc: Richard Earnshaw

Hi all,

This respin of the patch series makes the final modifications
requested in upstream review comments and rebases the work on the
setjmp and longjmp routines onto the fixed Arm ABI.

Tweaks needed to generate correct CFI information were made to:
* newlib/libc/machine/arm/strcmp-armv7.S
* newlib/libc/machine/arm/memchr.S

Stray comment restored in:
* newlib/libc/machine/arm/memcpy-armv7m.S

Patch rebased, cleaned up and missing BTI landing pad added:
* newlib/libc/machine/arm/setjmp.S


All other patches in the series are unchanged from previous iterations.

Thanks,
Victor

------

This patch series modifies hand-written assembly files for Arm
targets, introducing a uniform prologue/epilogue interface that is
responsible for pushing/popping registers on function entry and exit.
The interface conditionally enables branch target identification (BTI)
and return address signing and verification, based on Armv8.1-M
Pointer Authentication [1], using ACLE feature test macros at compile
time [2].

The incorporation of PACBTI functionality in function prologues/
epilogues is dictated by the combination of parameter macros in
arm_asm.h and arguments passed to the `-mbranch-protection' flag at
the time of Newlib compilation.

Regression tested on arm-none-eabi with and without MVE extension and
for Newlib and Newlib-nano.

[1]
<https://community.arm.com/developer/ip-products/processors/b/processors-ip-blog/posts/armv8-1-m-pointer-authentication-and-branch-target-identification-extension>
[2] 
<https://developer.arm.com/documentation/101028/0012/5--Feature-test-macros>

Victor Do Nascimento (8):
   newlib: libc: define M-profile PACBTI-enablement macros
   newlib: libc: strcmp M-profile PACBTI-enablement
   newlib: libc: strlen M-profile PACBTI-enablement
   newlib: libc: memchr M-profile PACBTI-enablement
   newlib: libc: memcpy M-profile PACBTI-enablement
   newlib: libc: aeabi_memmove M-profile PACBTI-enablement
   newlib: libc: aeabi_memset M-profile PACBTI-enablement
   newlib: libc: setjmp M-profile PACBTI-enablement

  .../libc/machine/arm/aeabi_memmove-thumb2.S   |  17 +-
  newlib/libc/machine/arm/aeabi_memset-thumb2.S |  14 +-
  newlib/libc/machine/arm/arm_asm.h             | 441 ++++++++++++++++++
  newlib/libc/machine/arm/memchr.S              |  50 +-
  newlib/libc/machine/arm/memcpy-armv7m.S       |  33 +-
  newlib/libc/machine/arm/setjmp.S              |  39 ++
  newlib/libc/machine/arm/strcmp-arm-tiny.S     |   8 +-
  newlib/libc/machine/arm/strcmp-armv7.S        |  57 ++-
  newlib/libc/machine/arm/strcmp-armv7m.S       |  26 +-
  newlib/libc/machine/arm/strlen-armv7.S        |  17 +-
  newlib/libc/machine/arm/strlen-thumb2-Os.S    |  14 +-
  11 files changed, 656 insertions(+), 60 deletions(-)

-- 
2.36.1



* [PATCH v5 1/8] newlib: libc: define M-profile PACBTI-enablement macros
  2022-12-21 11:03 [PATCH v5 0/8] Implement assembly cortex-M PACBTI functionality Victor Do Nascimento
@ 2022-12-21 11:19 ` Victor L. Do Nascimento
  2023-01-06 10:42   ` Christophe Lyon
  2022-12-21 11:21 ` [PATCH v5 2/8] newlib: libc: strcmp M-profile PACBTI-enablement Victor L. Do Nascimento
                   ` (6 subsequent siblings)
  7 siblings, 1 reply; 15+ messages in thread
From: Victor L. Do Nascimento @ 2022-12-21 11:19 UTC (permalink / raw)
  To: newlib; +Cc: Richard Earnshaw

Augment the arm_asm.h header file to simplify function prologues and
epilogues whilst adding support for PACBTI enablement via macros for
hand-written assembly functions.  For PACBTI, the prologues and
epilogues, together with the CFI-related directives, are automatically
amended depending on the compile-time -mbranch-protection argument
values.

It defines the following preprocessor macros:
   * HAVE_PAC_LEAF: Indicates whether pac-signing has been requested for
   leaf functions.
   * PAC_LEAF_PUSH_IP: Whether leaf functions should push the pac code
   to the stack irrespective of whether the ip register is clobbered in
   the function or not.
   * STACK_ALIGN_ENFORCE: Whether a dummy register should be added to
   the push list as necessary in the prologue to ensure stack
   alignment preservation at the start of assembly function.  The
   epilogue behavior is likewise affected by this flag, ensuring any
   pushed dummy registers also get popped on function return.
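
As an illustration of how the first two macros are derived, the
following C sketch applies the same bit test to a value supplied at
run time.  The function names and the idea of passing the feature
value as a parameter are assumptions made for this sketch; only the
bit position (2) and the default relationship between the two macros
come from the patch.

```c
/* Bit of __ARM_FEATURE_PAC_DEFAULT tested by the patch (LEAF_PROTECT_BIT).  */
#define LEAF_PROTECT_BIT 2

/* Hypothetical helper: same test as HAVE_PAC_LEAF, applied to an
   ordinary value instead of the compiler-provided macro.
   Always returns 0 or 1.  */
int have_pac_leaf(unsigned pac_default)
{
    return ((pac_default & (1u << LEAF_PROTECT_BIT)) && 1);
}

/* Hypothetical helper: the value PAC_LEAF_PUSH_IP takes when the
   user has not defined it beforehand.  It follows HAVE_PAC_LEAF.  */
int default_pac_leaf_push_ip(unsigned pac_default)
{
    return have_pac_leaf(pac_default) ? 1 : 0;
}
```

A value with only bit 0 set yields 0 from both helpers, while any
value with bit 2 set yields 1.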

It also defines the following assembler macros:
   * prologue: In addition to pushing any callee-saved registers onto
   the stack, it generates any requested pacbti instructions.
   Pushed registers are specified via the optional `first', `last',
   `push_ip' and `push_lr' macro arguments.
   When a single register number is provided, it pushes that
   register.  When two register numbers are provided, they specify a
   range to save.  If push_ip and/or push_lr are non-zero, the
   respective registers are also saved.  Stack alignment is requested
   via the `align8' argument, which defaults to the value of
   STACK_ALIGN_ENFORCE, unless manually overridden.

   For example:

       prologue push_ip=1 -> push {ip}
       prologue push_ip=1, align8=1 -> push {r2, ip}
       prologue push_ip=1, push_lr=1 -> push {ip, lr}
       prologue 1 -> push {r1}
       prologue 1, align8=1 -> push {r0, r1}
       prologue 1 push_ip=1 -> push {r1, ip}
       prologue 1 4 -> push {r1-r4}
       prologue 1 4 push_ip=1 -> push {r1-r4, ip}

   * epilogue: pops registers off the stack and emits the PAC
   authentication instruction, if requested. The `first', `last',
   `push_ip', `push_lr' and `align8' arguments function as per the
   prologue macro, generating pop instead of push instructions.

   Stack alignment is enforced via the following helper macro
   call-chain:

	{prologue|epilogue} ->_align8 -> _preprocess_reglist ->
	  _preprocess_reglist1 -> {_prologue|_epilogue}
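
The register-list selection described above, including the padding
applied when alignment is requested, can be modelled in C roughly as
follows.  This is a sketch only: `model_reglist' is a hypothetical
name, -1 stands in for an omitted `first'/`last' argument, and the
r0-r11 range check of the alignment path is applied to both paths for
simplicity.

```c
#include <stdio.h>

/* Hypothetical model of the register list chosen by prologue/epilogue.
   first/last are GPR numbers, or -1 when omitted.  Writes the list
   (e.g. "{r1-r4, ip}") into buf, or an empty string if nothing is
   pushed.  Returns 0, or -1 where the macros raise an error.  */
int model_reglist(int first, int last, int push_ip, int push_lr,
                  int align8, char *buf, size_t len)
{
    char regs[32] = "";

    if (first == -1 && last != -1)
        return -1;                      /* last given without first */
    if (first != -1 && last == -1)
        last = first;                   /* single register */
    if (first != -1 && (first < 0 || last > 11 || first > last))
        return -1;                      /* out of range */

    if (align8) {
        if (first == -1) {
            /* Only ip/lr requested: pad with r2 if the count is odd.  */
            if ((push_ip + push_lr) % 2 != 0)
                first = last = 2;
        } else if ((last - first + 1 + push_ip + push_lr) % 2 != 0) {
            /* Odd number of words: prepend the register below first.  */
            if (first == 0)
                return -1;              /* nothing below r0 */
            first -= 1;
        }
    }

    if (first == -1 && !push_ip && !push_lr) {
        snprintf(buf, len, "%s", "");   /* no push emitted at all */
        return 0;
    }
    if (first != -1) {
        if (first == last)
            snprintf(regs, sizeof regs, "r%d", first);
        else
            snprintf(regs, sizeof regs, "r%d-r%d", first, last);
    }
    snprintf(buf, len, "{%s%s%s%s%s}", regs,
             (regs[0] && push_ip) ? ", " : "",
             push_ip ? "ip" : "",
             ((regs[0] || push_ip) && push_lr) ? ", " : "",
             push_lr ? "lr" : "");
    return 0;
}
```

With this model, a lone `push_ip=1' with alignment requested yields
{r2, ip}, and a single r1 with alignment requested yields {r0-r1},
which is the same register set as the {r0, r1} shown in the examples.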

   Finally, the necessary cfi directives for adding debug information
   to prologue and epilogue are generated via the following macros:

   * cfisavelist - prologue macro helper function, generating
   necessary .cfi_offset directives associated with push instruction.
   Therefore, the net effect of calling `prologue 1 2 push_ip=1' is
   to generate the following:

       push {r1-r2, ip}
       .cfi_adjust_cfa_offset 12
       .cfi_offset 143, -4
       .cfi_offset 2, -8
       .cfi_offset 1, -12

   * cfirestorelist - epilogue macro helper function, emitting
   .cfi_restore directives prior to resetting the cfa offset.  As
   such, calling `epilogue 1 2 push_ip=1' will produce:

        pop {r1-r2, ip}
	.cfi_register 143, 12
	.cfi_restore 2
	.cfi_restore 1
	.cfi_def_cfa_offset 0
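
The offsets in the prologue example above follow a simple rule: lr
(if pushed) sits in the slot nearest the CFA, then the PAC code, then
the GPR range from `last' down to `first'.  The C sketch below models
that rule.  All names are hypothetical, and -1 again stands in for an
omitted range; the register numbers 14 and 143 are the ones used in
the patch's .cfi_offset lines.

```c
#define CFI_REG_LR  14   /* lr, as numbered in the patch */
#define CFI_REG_PAC 143  /* number the patch uses for the saved PAC code */

struct cfi_save {
    int reg;     /* register number given to .cfi_offset */
    int offset;  /* offset from the CFA, in bytes */
};

/* Hypothetical model: fill out[] with (register, offset) pairs in the
   order the prologue emits its .cfi_offset directives and return the
   number of pairs.  out[] needs room for up to 14 entries.  */
int model_cfi_saves(int first, int last, int push_ip, int push_lr,
                    struct cfi_save *out)
{
    int n = 0;
    int slot = 1;   /* 1 = word closest to the CFA */
    int r;

    if (push_lr) {
        out[n].reg = CFI_REG_LR;
        out[n].offset = -4 * slot;
        n++;
        slot++;
    }
    if (push_ip) {
        out[n].reg = CFI_REG_PAC;
        out[n].offset = -4 * slot;
        n++;
        slot++;
    }
    if (first != -1) {
        /* Mirrors cfisavelist: start at `last' and walk down.  */
        for (r = last; r >= first; r--) {
            out[n].reg = r;
            out[n].offset = -4 * slot;
            n++;
            slot++;
        }
    }
    return n;
}

/* Value passed to .cfi_adjust_cfa_offset for a push of n_saved words.  */
int model_cfa_adjust(int n_saved)
{
    return 4 * n_saved;
}
```

For `prologue 1 2 push_ip=1' the model gives three saves, (143, -4),
(2, -8) and (1, -12), and a CFA adjustment of 12, matching the
directives listed for the prologue.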
---
 newlib/libc/machine/arm/arm_asm.h | 441 ++++++++++++++++++++++++++++++
 1 file changed, 441 insertions(+)

diff --git a/newlib/libc/machine/arm/arm_asm.h b/newlib/libc/machine/arm/arm_asm.h
index 2708057de..94fa77b4d 100644
--- a/newlib/libc/machine/arm/arm_asm.h
+++ b/newlib/libc/machine/arm/arm_asm.h
@@ -60,4 +60,445 @@
 # define _ISA_THUMB_1
 #endif
 
+/* Check whether leaf function PAC signing has been requested in the
+   -mbranch-protection compile-time option.  */
+#define LEAF_PROTECT_BIT 2
+
+#ifdef __ARM_FEATURE_PAC_DEFAULT
+# define HAVE_PAC_LEAF \
+	((__ARM_FEATURE_PAC_DEFAULT & (1 << LEAF_PROTECT_BIT)) && 1)
+#else
+# define HAVE_PAC_LEAF 0
+#endif
+
+/* Provide default parameters for PAC-code handling in leaf-functions.  */
+#if HAVE_PAC_LEAF
+# ifndef PAC_LEAF_PUSH_IP
+#  define PAC_LEAF_PUSH_IP 1
+# endif
+#else /* !HAVE_PAC_LEAF */
+# undef PAC_LEAF_PUSH_IP
+# define PAC_LEAF_PUSH_IP 0
+#endif /* HAVE_PAC_LEAF */
+
+#define STACK_ALIGN_ENFORCE 0
+
+#ifdef __ASSEMBLER__
+
+/******************************************************************************
+* Implementation of the prologue and epilogue assembler macros and their
+* associated helper functions.
+*
+* These functions add support for the following:
+*
+* - M-profile branch target identification (BTI) landing-pads when compiled
+*   with `-mbranch-protection=bti'.
+* - PAC-signing and verification instructions, depending on hardware support
+*   and whether the PAC-signing of leaf functions has been requested via the
+*   `-mbranch-protection=pac-ret+leaf' compiler argument.
+* - 8-byte stack alignment preservation at function entry, defaulting to the
+*   value of STACK_ALIGN_ENFORCE.
+*
+* Notes:
+* - Prologue stack alignment is implemented by detecting a push with an odd
+*   number of registers and prepending a dummy register to the list.
+* - If alignment is attempted on a list containing r0, compilation will result
+*   in an error.
+* - If alignment is attempted in a list containing r1, r0 will be prepended to
+*   the register list and r0 will be restored prior to function return.  For
+*   functions with non-void return types, this will result in the corruption of
+*   the result register.
+* - Stack alignment is enforced via the following helper macro call-chain:
+*
+*	{prologue|epilogue} ->_align8 -> _preprocess_reglist ->
+*		_preprocess_reglist1 -> {_prologue|_epilogue}
+*
+* - Debug CFI directives are automatically added to prologues and epilogues,
+*   assisted by `cfisavelist' and `cfirestorelist', respectively.
+*
+* Arguments:
+* prologue
+* --------
+* - first	- If `last' specified, this serves as start of general-purpose
+*		  register (GPR) range to push onto stack, otherwise represents
+*		  single GPR to push onto stack.  If omitted, no GPRs pushed
+*		  onto stack at prologue.
+* - last	- If given, specifies inclusive upper-bound of GPR range.
+* - push_ip	- Determines whether IP register is to be pushed to stack at
+*		  prologue.  When pac-signing is requested, this holds the
+*		  pac-key.  Either 1 or 0 to push or not push, respectively.
+*		  Default behavior: Set to value of PAC_LEAF_PUSH_IP macro.
+* - push_lr	- Determines whether to push lr to the stack on function entry.
+*		  Either 1 or 0  to push or not push, respectively.
+* - align8	- Whether to enforce alignment. Either 1 or 0, with 1 requesting
+*		  alignment.
+*
+* epilogue
+* --------
+*   The epilogue should be called passing the same arguments as those passed to
+*   the prologue to ensure the stack is not corrupted on function return.
+*
+* Usage examples:
+*
+*   prologue push_ip=1 -> push {ip}
+*   epilogue push_ip=1, align8=1 -> pop {r2, ip}
+*   prologue push_ip=1, push_lr=1 -> push {ip, lr}
+*   epilogue 1 -> pop {r1}
+*   prologue 1, align8=1 -> push {r0, r1}
+*   epilogue 1, push_ip=1 -> pop {r1, ip}
+*   prologue 1, 4 -> push {r1-r4}
+*   epilogue 1, 4 push_ip=1 -> pop {r1-r4, ip}
+*
+******************************************************************************/
+
+/* Emit .cfi_restore directives for a consecutive sequence of registers.  */
+	.macro cfirestorelist first, last
+	.cfi_restore \last
+	.if \last-\first
+	 cfirestorelist \first, \last-1
+	.endif
+	.endm
+
+/* Emit .cfi_offset directives for a consecutive sequence of registers.  */
+	.macro cfisavelist first, last, index=1
+	.cfi_offset \last, -4*(\index)
+	.if \last-\first
+	 cfisavelist \first, \last-1, \index+1
+	.endif
+	.endm
+
+.macro _prologue first=-1, last=-1, push_ip=PAC_LEAF_PUSH_IP, push_lr=0
+	.if \push_ip & 1 != \push_ip
+	 .error "push_ip may be either 0 or 1"
+	.endif
+	.if \push_lr & 1 != \push_lr
+	 .error "push_lr may be either 0 or 1"
+	.endif
+	.if \first != -1
+	 .if \last == -1
+	  /* Upper-bound not provided: Set upper = lower.  */
+	  _prologue \first, \first, \push_ip, \push_lr
+	  .exitm
+	 .endif
+	.endif
+#if HAVE_PAC_LEAF
+#if __ARM_FEATURE_BTI_DEFAULT
+	pacbti	ip, lr, sp
+#else
+	pac	ip, lr, sp
+#endif /* __ARM_FEATURE_BTI_DEFAULT */
+	.cfi_register 143, 12
+#else
+#if __ARM_FEATURE_BTI_DEFAULT
+	bti
+#endif /* __ARM_FEATURE_BTI_DEFAULT */
+#endif /* HAVE_PAC_LEAF */
+	.if \first != -1
+	 .if \last != \first
+	  .if \last >= 13
+	.error "SP cannot be in the save list"
+	  .endif
+	  .if \push_ip
+	   .if \push_lr
+	/* Case 1: push register range, ip and lr registers.  */
+	push {r\first-r\last, ip, lr}
+	.cfi_adjust_cfa_offset ((\last-\first)+3)*4
+	.cfi_offset 14, -4
+	.cfi_offset 143, -8
+	cfisavelist \first, \last, 3
+	   .else // !\push_lr
+	/* Case 2: push register range and ip register.  */
+	push {r\first-r\last, ip}
+	.cfi_adjust_cfa_offset ((\last-\first)+2)*4
+	.cfi_offset 143, -4
+	cfisavelist \first, \last, 2
+	   .endif
+	  .else // !\push_ip
+	   .if \push_lr
+	/* Case 3: push register range and lr register.  */
+	push {r\first-r\last, lr}
+	.cfi_adjust_cfa_offset ((\last-\first)+2)*4
+	.cfi_offset 14, -4
+	cfisavelist \first, \last, 2
+	   .else // !\push_lr
+	/* Case 4: push register range.  */
+	push {r\first-r\last}
+	.cfi_adjust_cfa_offset ((\last-\first)+1)*4
+	cfisavelist \first, \last, 1
+	   .endif
+	  .endif
+	 .else // \last == \first
+	  .if \push_ip
+	   .if \push_lr
+	/* Case 5: push single GP register plus ip and lr registers.  */
+	push {r\first, ip, lr}
+	.cfi_adjust_cfa_offset 12
+	.cfi_offset 14, -4
+	.cfi_offset 143, -8
+        cfisavelist \first, \first, 3
+	   .else // !\push_lr
+	/* Case 6: push single GP register plus ip register.  */
+	push {r\first, ip}
+	.cfi_adjust_cfa_offset 8
+	.cfi_offset 143, -4
+        cfisavelist \first, \first, 2
+	   .endif
+	  .else // !\push_ip
+	   .if \push_lr
+	/* Case 7: push single GP register plus lr register.  */
+	push {r\first, lr}
+	.cfi_adjust_cfa_offset 8
+	.cfi_offset 14, -4
+	cfisavelist \first, \first, 2
+	   .else // !\push_lr
+	/* Case 8: push single GP register.  */
+	push {r\first}
+	.cfi_adjust_cfa_offset 4
+	cfisavelist \first, \first, 1
+	   .endif
+	  .endif
+	 .endif
+	.else // \first == -1
+	 .if \push_ip
+	  .if \push_lr
+	/* Case 9: push ip and lr registers.  */
+	push {ip, lr}
+	.cfi_adjust_cfa_offset 8
+	.cfi_offset 14, -4
+	.cfi_offset 143, -8
+	  .else // !\push_lr
+	/* Case 10: push ip register.  */
+	push {ip}
+	.cfi_adjust_cfa_offset 4
+	.cfi_offset 143, -4
+	  .endif
+	 .else // !\push_ip
+          .if \push_lr
+	/* Case 11: push lr register.  */
+	push {lr}
+	.cfi_adjust_cfa_offset 4
+	.cfi_offset 14, -4
+          .endif
+	 .endif
+	.endif
+.endm
+
+.macro _epilogue first=-1, last=-1, push_ip=PAC_LEAF_PUSH_IP, push_lr=0
+	.if \push_ip & 1 != \push_ip
+	 .error "push_ip may be either 0 or 1"
+	.endif
+	.if \push_lr & 1 != \push_lr
+	 .error "push_lr may be either 0 or 1"
+	.endif
+	.if \first != -1
+	 .if \last == -1
+	  /* Upper-bound not provided: Set upper = lower.  */
+	  _epilogue \first, \first, \push_ip, \push_lr
+	  .exitm
+	 .endif
+	 .if \last != \first
+	  .if \last >= 13
+	.error "SP cannot be in the save list"
+	  .endif
+	  .if \push_ip
+	   .if \push_lr
+	/* Case 1: pop register range, ip and lr registers.  */
+	pop {r\first-r\last, ip, lr}
+	.cfi_restore 14
+	.cfi_register 143, 12
+	cfirestorelist \first, \last
+	   .else // !\push_lr
+	/* Case 2: pop register range and ip register.  */
+	pop {r\first-r\last, ip}
+	.cfi_register 143, 12
+	cfirestorelist \first, \last
+	   .endif
+	  .else // !\push_ip
+	   .if \push_lr
+	/* Case 3: pop register range and lr register.  */
+	pop {r\first-r\last, lr}
+	.cfi_restore 14
+	cfirestorelist \first, \last
+	   .else // !\push_lr
+	/* Case 4: pop register range.  */
+	pop {r\first-r\last}
+	cfirestorelist \first, \last
+	   .endif
+	  .endif
+	 .else // \last == \first
+	  .if \push_ip
+	   .if \push_lr
+	/* Case 5: pop single GP register plus ip and lr registers.  */
+	pop {r\first, ip, lr}
+	.cfi_restore 14
+	.cfi_register 143, 12
+	cfirestorelist \first, \first
+	   .else // !\push_lr
+	/* Case 6: pop single GP register plus ip register.  */
+	pop {r\first, ip}
+	.cfi_register 143, 12
+	cfirestorelist \first, \first
+	   .endif
+	  .else // !\push_ip
+	   .if \push_lr
+	/* Case 7: pop single GP register plus lr register.  */
+	pop {r\first, lr}
+	.cfi_restore 14
+	cfirestorelist \first, \first
+	   .else // !\push_lr
+	/* Case 8: pop single GP register.  */
+	pop {r\first}
+	cfirestorelist \first, \first
+	   .endif
+	  .endif
+	 .endif
+	.else // \first == -1
+	 .if \push_ip
+	  .if \push_lr
+	/* Case 9: pop ip and lr registers.  */
+	pop {ip, lr}
+	.cfi_restore 14
+	.cfi_register 143, 12
+	  .else // !\push_lr
+	/* Case 10: pop ip register.  */
+	pop {ip}
+	.cfi_register 143, 12
+	  .endif
+	 .else // !\push_ip
+          .if \push_lr
+	/* Case 11: pop lr register.  */
+	pop {lr}
+	.cfi_restore 14
+          .endif
+	 .endif
+	.endif
+#if HAVE_PAC_LEAF
+	aut	ip, lr, sp
+#endif /* HAVE_PAC_LEAF */
+	bx	lr
+.endm
+
+# clean up expressions in 'last'
+.macro _preprocess_reglist1 first:req, last:req, push_ip:req, push_lr:req, reglist_op:req
+	.if \last == 0
+	 \reglist_op \first, 0, \push_ip, \push_lr
+	.elseif \last == 1
+	 \reglist_op \first, 1, \push_ip, \push_lr
+	.elseif \last == 2
+	 \reglist_op \first, 2, \push_ip, \push_lr
+	.elseif \last == 3
+	 \reglist_op \first, 3, \push_ip, \push_lr
+	.elseif \last == 4
+	 \reglist_op \first, 4, \push_ip, \push_lr
+	.elseif \last == 5
+	 \reglist_op \first, 5, \push_ip, \push_lr
+	.elseif \last == 6
+	 \reglist_op \first, 6, \push_ip, \push_lr
+	.elseif \last == 7
+	 \reglist_op \first, 7, \push_ip, \push_lr
+	.elseif \last == 8
+	 \reglist_op \first, 8, \push_ip, \push_lr
+	.elseif \last == 9
+	 \reglist_op \first, 9, \push_ip, \push_lr
+	.elseif \last == 10
+	 \reglist_op \first, 10, \push_ip, \push_lr
+	.elseif \last == 11
+	 \reglist_op \first, 11, \push_ip, \push_lr
+	.else
+	 .error "last (\last) out of range"
+	.endif
+.endm
+
+# clean up expressions in 'first'
+.macro _preprocess_reglist first:req, last, push_ip=0, push_lr=0, reglist_op:req
+	.ifb \last
+	 _preprocess_reglist \first \first \push_ip \push_lr \reglist_op
+	.else
+	 .if \first > \last
+	  .error "last (\last) must be at least as great as first (\first)"
+	 .endif
+	 .if \first == 0
+	  _preprocess_reglist1 0, \last, \push_ip, \push_lr, \reglist_op
+	 .elseif \first == 1
+	  _preprocess_reglist1 1, \last, \push_ip, \push_lr, \reglist_op
+	 .elseif \first == 2
+	  _preprocess_reglist1 2, \last, \push_ip, \push_lr, \reglist_op
+	 .elseif \first == 3
+	  _preprocess_reglist1 3, \last, \push_ip, \push_lr, \reglist_op
+	 .elseif \first == 4
+	  _preprocess_reglist1 4, \last, \push_ip, \push_lr, \reglist_op
+	 .elseif \first == 5
+	  _preprocess_reglist1 5, \last, \push_ip, \push_lr, \reglist_op
+	 .elseif \first == 6
+	  _preprocess_reglist1 6, \last, \push_ip, \push_lr, \reglist_op
+	 .elseif \first == 7
+	  _preprocess_reglist1 7, \last, \push_ip, \push_lr, \reglist_op
+	 .elseif \first == 8
+	  _preprocess_reglist1 8, \last, \push_ip, \push_lr, \reglist_op
+	 .elseif \first == 9
+	  _preprocess_reglist1 9, \last, \push_ip, \push_lr, \reglist_op
+	 .elseif \first == 10
+	  _preprocess_reglist1 10, \last, \push_ip, \push_lr, \reglist_op
+	 .elseif \first == 11
+	  _preprocess_reglist1 11, \last, \push_ip, \push_lr, \reglist_op
+	 .else
+	  .error "first (\first) out of range"
+	 .endif
+	.endif
+.endm
+
+.macro _align8 first, last, push_ip=0, push_lr=0, reglist_op=_prologue
+	.ifb \first
+	 .ifnb \last
+	  .error "can't have last (\last) without specifying first"
+	 .else // \last blank
+	  .if ((\push_ip + \push_lr) % 2) == 0
+	   \reglist_op first=-1, last=-1, push_ip=\push_ip, push_lr=\push_lr
+	   .exitm
+	  .else // ((\push_ip + \push_lr) % 2) odd
+	   _align8 2, 2, \push_ip, \push_lr, \reglist_op
+	   .exitm
+	  .endif // ((\push_ip + \push_lr) % 2) == 0
+	 .endif // .ifnb \last
+	.endif // .ifb \first
+
+	.ifb \last
+	 _align8 \first, \first, \push_ip, \push_lr, \reglist_op
+	.else
+	 .if \push_ip & 1 <> \push_ip
+	  .error "push_ip may be 0 or 1"
+	 .endif
+	 .if \push_lr & 1 <> \push_lr
+	  .error "push_lr may be 0 or 1"
+	 .endif
+	 .ifeq (\last - \first + \push_ip + \push_lr) % 2
+	  .if \first == 0
+	   .error "Alignment required and first register is r0"
+	   .exitm
+	  .endif
+	  _preprocess_reglist \first-1, \last, \push_ip, \push_lr, \reglist_op
+	 .else
+	  _preprocess_reglist \first \last, \push_ip, \push_lr, \reglist_op
+	 .endif
+	.endif
+.endm
+
+.macro prologue first, last, push_ip=PAC_LEAF_PUSH_IP, push_lr=0, align8=STACK_ALIGN_ENFORCE
+	.if \align8
+	 _align8 \first, \last, \push_ip, \push_lr, _prologue
+	.else
+	 _prologue first=\first, last=\last, push_ip=\push_ip, push_lr=\push_lr
+	.endif
+.endm
+
+.macro epilogue first, last, push_ip=PAC_LEAF_PUSH_IP, push_lr=0, align8=STACK_ALIGN_ENFORCE
+	.if \align8
+	 _align8 \first, \last, \push_ip, \push_lr, reglist_op=_epilogue
+	.else
+	 _epilogue first=\first, last=\last, push_ip=\push_ip, push_lr=\push_lr
+	.endif
+.endm
+
+#endif /* __ASSEMBLER__ */
+
 #endif /* ARM_ASM__H */
-- 
2.36.1


* [PATCH v5 2/8] newlib: libc: strcmp M-profile PACBTI-enablement
  2022-12-21 11:03 [PATCH v5 0/8] Implement assembly cortex-M PACBTI functionality Victor Do Nascimento
  2022-12-21 11:19 ` [PATCH v5 1/8] newlib: libc: define M-profile PACBTI-enablement macros Victor L. Do Nascimento
@ 2022-12-21 11:21 ` Victor L. Do Nascimento
  2023-01-06 11:09   ` Christophe Lyon
  2022-12-21 11:22 ` [PATCH v5 3/8] newlib: libc: strlen " Victor L. Do Nascimento
                   ` (5 subsequent siblings)
  7 siblings, 1 reply; 15+ messages in thread
From: Victor L. Do Nascimento @ 2022-12-21 11:21 UTC (permalink / raw)
  To: newlib; +Cc: Richard Earnshaw

Add function prologue/epilogue to conditionally add BTI landing pads
and/or PAC code generation & authentication instructions depending on
compilation flags.

This patch enables PACBTI for all relevant variants of strcmp:
     * Newlib for armv8.1-m.main+pacbti
     * Newlib for armv8.1-m.main+pacbti+mve
     * Newlib-nano
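
As a rough guide to what the bare `prologue' and `epilogue' calls in
this patch expand to, the C sketch below models the instruction
selection made by the macros in arm_asm.h.  The function names are
hypothetical, and the two parameters stand in for HAVE_PAC_LEAF and
__ARM_FEATURE_BTI_DEFAULT; register pushes and pops are left out.

```c
/* Hypothetical model: instruction emitted at function entry, before
   any push.  An empty string means no extra instruction.  */
const char *model_entry_insn(int have_pac_leaf, int bti_default)
{
    if (have_pac_leaf)
        return bti_default ? "pacbti ip, lr, sp" : "pac ip, lr, sp";
    return bti_default ? "bti" : "";
}

/* Hypothetical model: instruction emitted after any pop and before
   the final `bx lr'.  */
const char *model_exit_insn(int have_pac_leaf)
{
    return have_pac_leaf ? "aut ip, lr, sp" : "";
}
```

With neither feature enabled both helpers return an empty string, so
the only remaining effect of `epilogue' in that configuration is the
`bx lr' it always emits.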
---
 newlib/libc/machine/arm/strcmp-arm-tiny.S |  8 +++-
 newlib/libc/machine/arm/strcmp-armv7.S    | 57 ++++++++++++++---------
 newlib/libc/machine/arm/strcmp-armv7m.S   | 26 +++++++----
 3 files changed, 60 insertions(+), 31 deletions(-)

diff --git a/newlib/libc/machine/arm/strcmp-arm-tiny.S b/newlib/libc/machine/arm/strcmp-arm-tiny.S
index 607a41daf..0bd2a2e6e 100644
--- a/newlib/libc/machine/arm/strcmp-arm-tiny.S
+++ b/newlib/libc/machine/arm/strcmp-arm-tiny.S
@@ -29,10 +29,14 @@
 /* Tiny version of strcmp in ARM state.  Used only when optimizing
    for size.  Also supports Thumb-2.  */
 
+#include "arm_asm.h"
+
 	.syntax unified
 def_fn strcmp
+	.fnstart
 	.cfi_sections .debug_frame
 	.cfi_startproc
+	prologue
 1:
 	ldrb	r2, [r0], #1
 	ldrb	r3, [r1], #1
@@ -42,6 +46,8 @@ def_fn strcmp
 	beq	1b
 2:
 	subs	r0, r2, r3
-	bx	lr
+	epilogue
 	.cfi_endproc
+	.cantunwind
+	.fnend
 	.size	strcmp, . - strcmp
diff --git a/newlib/libc/machine/arm/strcmp-armv7.S b/newlib/libc/machine/arm/strcmp-armv7.S
index 2f93bfb73..7cafca151 100644
--- a/newlib/libc/machine/arm/strcmp-armv7.S
+++ b/newlib/libc/machine/arm/strcmp-armv7.S
@@ -45,6 +45,8 @@
 	.thumb
 	.syntax unified
 
+#include "arm_asm.h"
+
 /* Parameters and result.  */
 #define src1		r0
 #define src2		r1
@@ -91,8 +93,9 @@
 	ldrd	r4, r5, [sp], #16
 	.cfi_restore 4
 	.cfi_restore 5
+	.cfi_adjust_cfa_offset -16
 	sub	result, result, r1, lsr #24
-	bx	lr
+	epilogue push_ip=HAVE_PAC_LEAF
 #else
 	/* To use the big-endian trick we'd have to reverse all three words.
 	   that's slower than this approach.  */
@@ -112,22 +115,21 @@
 	ldrd	r4, r5, [sp], #16
 	.cfi_restore 4
 	.cfi_restore 5
+	.cfi_adjust_cfa_offset -16
 	sub	result, result, r1
 
-	bx	lr
+	epilogue push_ip=HAVE_PAC_LEAF
 #endif
 	.endm
 
+
 	.text
 	.p2align	5
-.Lstrcmp_start_addr:
-#ifndef STRCMP_NO_PRECHECK
-.Lfastpath_exit:
-	sub	r0, r2, r3
-	bx	lr
-	nop
-#endif
 def_fn	strcmp
+	.fnstart
+	.cfi_sections .debug_frame
+	.cfi_startproc
+	prologue push_ip=HAVE_PAC_LEAF
 #ifndef STRCMP_NO_PRECHECK
 	ldrb	r2, [src1]
 	ldrb	r3, [src2]
@@ -136,16 +138,14 @@ def_fn	strcmp
 	cmpcs	r2, r3
 	bne	.Lfastpath_exit
 #endif
-	.cfi_sections .debug_frame
-	.cfi_startproc
 	strd	r4, r5, [sp, #-16]!
-	.cfi_def_cfa_offset 16
-	.cfi_offset 4, -16
-	.cfi_offset 5, -12
+	.cfi_adjust_cfa_offset 16
+	.cfi_rel_offset 4, 0
+	.cfi_rel_offset 5, 4
 	orr	tmp1, src1, src2
 	strd	r6, r7, [sp, #8]
-	.cfi_offset 6, -8
-	.cfi_offset 7, -4
+	.cfi_rel_offset 6, 8
+	.cfi_rel_offset 7, 12
 	mvn	const_m1, #0
 	lsl	r2, tmp1, #29
 	cbz	r2, .Lloop_aligned8
@@ -270,7 +270,6 @@ def_fn	strcmp
 	ldr	data1, [src1], #4
 	beq	.Laligned_m2
 	bcs	.Laligned_m1
-
 #ifdef STRCMP_NO_PRECHECK
 	ldrb	data2, [src2, #1]
 	uxtb	tmp1, data1, ror #BYTE1_OFFSET
@@ -314,10 +313,19 @@ def_fn	strcmp
 	mov	result, tmp1
 	ldr	r4, [sp], #16
 	.cfi_restore 4
-	bx	lr
+	.cfi_adjust_cfa_offset -16
+	epilogue push_ip=HAVE_PAC_LEAF
 
 #ifndef STRCMP_NO_PRECHECK
+.Lfastpath_exit:
+	.cfi_restore_state
+	.cfi_remember_state
+	sub	r0, r2, r3
+	epilogue push_ip=HAVE_PAC_LEAF
+
 .Laligned_m1:
+	.cfi_restore_state
+	.cfi_remember_state
 	add	src2, src2, #4
 #endif
 .Lsrc1_aligned:
@@ -364,8 +372,9 @@ def_fn	strcmp
 	/* R6/7 Not used in this sequence.  */
 	.cfi_restore 6
 	.cfi_restore 7
+	.cfi_adjust_cfa_offset -16
 	neg	result, result
-	bx	lr
+	epilogue push_ip=HAVE_PAC_LEAF
 
 6:
 	.cfi_restore_state
@@ -441,7 +450,8 @@ def_fn	strcmp
 	/* R6/7 not used in this sequence.  */
 	.cfi_restore 6
 	.cfi_restore 7
-	bx	lr
+	.cfi_adjust_cfa_offset -16
+	epilogue push_ip=HAVE_PAC_LEAF
 
 .Lstrcmp_tail:
 	.cfi_restore_state
@@ -463,7 +473,10 @@ def_fn	strcmp
 	/* R6/7 not used in this sequence.  */
 	.cfi_restore 6
 	.cfi_restore 7
+	.cfi_adjust_cfa_offset -16
 	sub	result, result, data2, lsr #24
-	bx	lr
+	epilogue push_ip=HAVE_PAC_LEAF
 	.cfi_endproc
-	.size strcmp, . - .Lstrcmp_start_addr
+	.cantunwind
+	.fnend
+	.size strcmp, . - strcmp
diff --git a/newlib/libc/machine/arm/strcmp-armv7m.S b/newlib/libc/machine/arm/strcmp-armv7m.S
index cdb4912df..825b6e77f 100644
--- a/newlib/libc/machine/arm/strcmp-armv7m.S
+++ b/newlib/libc/machine/arm/strcmp-armv7m.S
@@ -29,6 +29,8 @@
 /* Very similar to the generic code, but uses Thumb2 as implemented
    in ARMv7-M.  */
 
+#include "arm_asm.h"
+
 /* Parameters and result.  */
 #define src1		r0
 #define src2		r1
@@ -44,8 +46,10 @@
 	.thumb
 	.syntax unified
 def_fn strcmp
+	.fnstart
 	.cfi_sections .debug_frame
 	.cfi_startproc
+	prologue push_ip=HAVE_PAC_LEAF
 	eor	tmp1, src1, src2
 	tst	tmp1, #3
 	/* Strings not at same byte offset from a word boundary.  */
@@ -82,6 +86,7 @@ def_fn strcmp
 	ldreq	data2, [src2], #4
 	beq	4b
 2:
+	.cfi_remember_state
 	/* There's a zero or a different byte in the word */
 	S2HI	result, data1, #24
 	S2LO	data1, data1, #8
@@ -106,7 +111,7 @@ def_fn strcmp
 	lsrs	result, result, #24
 	subs	result, result, data2
 #endif
-	bx	lr
+	epilogue push_ip=HAVE_PAC_LEAF
 
 
 #if 0
@@ -205,8 +210,10 @@ def_fn strcmp
 
 	/* First of all, compare bytes until src1(sp1) is word-aligned. */
 .Lstrcmp_unaligned:
+	.cfi_restore_state
 	tst	src1, #3
 	beq	2f
+	.cfi_remember_state
 	ldrb	data1, [src1], #1
 	ldrb	data2, [src2], #1
 	cmp	data1, #1
@@ -214,12 +221,13 @@ def_fn strcmp
 	cmpcs	data1, data2
 	beq	.Lstrcmp_unaligned
 	sub	result, data1, data2
-	bx	lr
+	epilogue push_ip=HAVE_PAC_LEAF
 
 2:
+	.cfi_restore_state
 	stmfd	sp!, {r5}
-	.cfi_def_cfa_offset 4
-	.cfi_offset 5, -4
+	.cfi_adjust_cfa_offset 4
+	.cfi_rel_offset 5, 0
 
 	ldr	data1, [src1], #4
 	and	tmp2, src2, #3
@@ -355,8 +363,8 @@ def_fn strcmp
 	.cfi_remember_state
 	ldmfd	sp!, {r5}
 	.cfi_restore 5
-	.cfi_def_cfa_offset 0
-	bx	lr
+	.cfi_adjust_cfa_offset -4
+	epilogue push_ip=HAVE_PAC_LEAF
 
 .Lstrcmp_tail:
 	.cfi_restore_state
@@ -372,7 +380,9 @@ def_fn strcmp
 	sub	result, r2, result
 	ldmfd	sp!, {r5}
 	.cfi_restore 5
-	.cfi_def_cfa_offset 0
-	bx	lr
+	.cfi_adjust_cfa_offset -4
+	epilogue push_ip=HAVE_PAC_LEAF
 	.cfi_endproc
+	.cantunwind
+	.fnend
 	.size strcmp, . - strcmp
-- 
2.36.1


* [PATCH v5 3/8] newlib: libc: strlen M-profile PACBTI-enablement
  2022-12-21 11:03 [PATCH v5 0/8] Implement assembly cortex-M PACBTI functionality Victor Do Nascimento
  2022-12-21 11:19 ` [PATCH v5 1/8] newlib: libc: define M-profile PACBTI-enablement macros Victor L. Do Nascimento
  2022-12-21 11:21 ` [PATCH v5 2/8] newlib: libc: strcmp M-profile PACBTI-enablement Victor L. Do Nascimento
@ 2022-12-21 11:22 ` Victor L. Do Nascimento
  2022-12-21 11:24 ` [PATCH v5 4/8] newlib: libc: memchr " Victor L. Do Nascimento
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 15+ messages in thread
From: Victor L. Do Nascimento @ 2022-12-21 11:22 UTC (permalink / raw)
  To: newlib; +Cc: Richard Earnshaw

Add function prologue/epilogue to conditionally add BTI landing pads
and/or PAC code generation & authentication instructions depending on
compilation flags.

This patch enables PACBTI for all relevant variants of strlen:
     * Newlib for armv8.1-m.main+pacbti
     * Newlib for armv8.1-m.main+pacbti+mve
     * Newlib-nano
---
 newlib/libc/machine/arm/strlen-armv7.S     | 17 ++++++++++++++---
 newlib/libc/machine/arm/strlen-thumb2-Os.S | 14 +++++++++++---
 2 files changed, 25 insertions(+), 6 deletions(-)

diff --git a/newlib/libc/machine/arm/strlen-armv7.S b/newlib/libc/machine/arm/strlen-armv7.S
index f3dda0d60..27094040c 100644
--- a/newlib/libc/machine/arm/strlen-armv7.S
+++ b/newlib/libc/machine/arm/strlen-armv7.S
@@ -59,6 +59,7 @@
    OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.  */
 
 #include "acle-compat.h"
+#include "arm_asm.h"
 
 	.macro def_fn f p2align=0
 	.text
@@ -78,7 +79,11 @@
 
 	/* This code requires Thumb.  */
 #if __ARM_ARCH_PROFILE == 'M'
+#if __ARM_ARCH >= 8
+	/* keep config inherited from -march=.  */
+#else
 	.arch   armv7e-m
+#endif /* if __ARM_ARCH >= 8 */
 #else
 	.arch	armv6t2
 #endif
@@ -100,8 +105,10 @@
 #define tmp2		r5
 
 def_fn	strlen p2align=6
+	.fnstart
+	.cfi_startproc
+	prologue 4 5 push_ip=HAVE_PAC_LEAF
 	pld	[srcin, #0]
-	strd	r4, r5, [sp, #-8]!
 	bic	src, srcin, #7
 	mvn	const_m1, #0
 	ands	tmp1, srcin, #7		/* (8 - bytes) to alignment.  */
@@ -151,6 +158,7 @@ def_fn	strlen p2align=6
 	beq	.Lloop_aligned
 
 .Lnull_found:
+	.cfi_remember_state
 	cmp	data1a, #0
 	itt	eq
 	addeq	result, result, #4
@@ -159,11 +167,11 @@ def_fn	strlen p2align=6
 	rev	data1a, data1a
 #endif
 	clz	data1a, data1a
-	ldrd	r4, r5, [sp], #8
 	add	result, result, data1a, lsr #3	/* Bits -> Bytes.  */
-	bx	lr
+	epilogue 4 5 push_ip=HAVE_PAC_LEAF
 
 .Lmisaligned8:
+	.cfi_restore_state
 	ldrd	data1a, data1b, [src]
 	and	tmp2, tmp1, #3
 	rsb	result, tmp1, #0
@@ -177,4 +185,7 @@ def_fn	strlen p2align=6
 	movne	data1a, const_m1
 	mov	const_0, #0
 	b	.Lstart_realigned
+	.cfi_endproc
+	.cantunwind
+	.fnend
 	.size	strlen, . - strlen
diff --git a/newlib/libc/machine/arm/strlen-thumb2-Os.S b/newlib/libc/machine/arm/strlen-thumb2-Os.S
index 961f41a0a..a46db573c 100644
--- a/newlib/libc/machine/arm/strlen-thumb2-Os.S
+++ b/newlib/libc/machine/arm/strlen-thumb2-Os.S
@@ -25,6 +25,7 @@
    OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.  */
 
 #include "acle-compat.h"
+#include "arm_asm.h"
 
 	.macro def_fn f p2align=0
 	.text
@@ -33,8 +34,9 @@
 	.type \f, %function
 \f:
 	.endm
-
-#if __ARM_ARCH_ISA_THUMB >= 2 && __ARM_ARCH >= 7
+#if __ARM_ARCH_PROFILE == 'M' && __ARM_ARCH >= 8
+	/* keep config inherited from -march=.  */
+#elif __ARM_ARCH_ISA_THUMB >= 2 && __ARM_ARCH >= 7
 	.arch   armv7
 #else
 	.arch	armv6t2
@@ -44,11 +46,17 @@
 	.syntax unified
 
 def_fn	strlen p2align=1
+	.fnstart
+	.cfi_startproc
+	prologue
 	mov     r3, r0
 1:	ldrb.w  r2, [r3], #1
 	cmp     r2, #0
 	bne	1b
 	subs    r0, r3, r0
 	subs    r0, #1
-	bx      lr
+	epilogue
+	.cfi_endproc
+	.cantunwind
+	.fnend
 	.size	strlen, . - strlen
-- 
2.36.1



* [PATCH v5 4/8] newlib: libc: memchr M-profile PACBTI-enablement
  2022-12-21 11:03 [PATCH v5 0/8] Implement assembly cortex-M PACBTI functionality Victor Do Nascimento
                   ` (2 preceding siblings ...)
  2022-12-21 11:22 ` [PATCH v5 3/8] newlib: libc: strlen " Victor L. Do Nascimento
@ 2022-12-21 11:24 ` Victor L. Do Nascimento
  2022-12-21 11:25 ` [PATCH v5 5/8] newlib: libc: memcpy " Victor L. Do Nascimento
                   ` (3 subsequent siblings)
  7 siblings, 0 replies; 15+ messages in thread
From: Victor L. Do Nascimento @ 2022-12-21 11:24 UTC (permalink / raw)
  To: newlib; +Cc: Richard Earnshaw

Add function prologue/epilogue to conditionally add BTI landing pads
and/or PAC code generation & authentication instructions depending on
compilation flags.
---
 newlib/libc/machine/arm/memchr.S | 50 ++++++++++++++++++++++++++++----
 1 file changed, 44 insertions(+), 6 deletions(-)

diff --git a/newlib/libc/machine/arm/memchr.S b/newlib/libc/machine/arm/memchr.S
index 1a4c6512c..3c11addad 100644
--- a/newlib/libc/machine/arm/memchr.S
+++ b/newlib/libc/machine/arm/memchr.S
@@ -76,6 +76,7 @@
 	.syntax unified
 
 #include "acle-compat.h"
+#include "arm_asm.h"
 
 @ NOTE: This ifdef MUST match the one in memchr-stub.c
 #if defined (__ARM_NEON__) || defined (__ARM_NEON)
@@ -267,10 +268,14 @@ memchr:
 #elif __ARM_ARCH_ISA_THUMB >= 2 && defined (__ARM_FEATURE_DSP)
 
 #if __ARM_ARCH_PROFILE == 'M'
-       .arch armv7e-m
+#if __ARM_ARCH >= 8
+	/* keep config inherited from -march=.  */
 #else
-       .arch armv6t2
-#endif
+	.arch armv7e-m
+#endif /* __ARM_ARCH >= 8 */
+#else
+	.arch armv6t2
+#endif /* __ARM_ARCH_PROFILE == 'M' */
 
 @ this lets us check a flag in a 00/ff byte easily in either endianness
 #ifdef __ARMEB__
@@ -287,11 +292,14 @@ memchr:
 	.p2align 4,,15
 	.global memchr
 	.type memchr,%function
+	.fnstart
+	.cfi_startproc
 memchr:
 	@ r0 = start of memory to scan
 	@ r1 = character to look for
 	@ r2 = length
 	@ returns r0 = pointer to character or NULL if not found
+	prologue
 	and	r1,r1,#0xff	@ Don't trust the caller to pass a char
 
 	cmp	r2,#16		@ If short don't bother with anything clever
@@ -313,6 +321,11 @@ memchr:
 10:
 	@ We are aligned, we know we have at least 8 bytes to work with
 	push	{r4,r5,r6,r7}
+	.cfi_adjust_cfa_offset 16
+	.cfi_rel_offset 4, 0
+	.cfi_rel_offset 5, 4
+	.cfi_rel_offset 6, 8
+	.cfi_rel_offset 7, 12
 	orr	r1, r1, r1, lsl #8	@ expand the match word across all bytes
 	orr	r1, r1, r1, lsl #16
 	bic	r4, r2, #7	@ Number of double words to work with * 8
@@ -334,6 +347,11 @@ memchr:
 	bne	15b		@ (Flags from the subs above)
 
 	pop	{r4,r5,r6,r7}
+	.cfi_restore 7
+	.cfi_restore 6
+	.cfi_restore 5
+	.cfi_restore 4
+	.cfi_adjust_cfa_offset -16
 	and	r1,r1,#0xff	@ r1 back to a single character
 	and	r2,r2,#7	@ Leave the count remaining as the number
 				@ after the double words have been done
@@ -349,17 +367,29 @@ memchr:
 	bne	21b		@ on r2 flags
 
 40:
+	.cfi_remember_state
 	movs	r0,#0		@ not found
-	bx	lr
+	epilogue
 
 50:
+	.cfi_restore_state
+	.cfi_remember_state
 	subs	r0,r0,#1	@ found
-	bx	lr
+	epilogue
 
 60:  @ We're here because the fast path found a hit 
      @ now we have to track down exactly which word it was
 	@ r0 points to the start of the double word after the one tested
 	@ r5 has the 00/ff pattern for the first word, r6 has the chained value
+	@ This point is reached from cbnz midway through label 15 prior to
+	@ popping r4-r7 off the stack.  .cfi_restore_state alone disregards
+	@ this, so we manually correct this.
+	.cfi_restore_state	@ Standard post-prologue state
+	.cfi_adjust_cfa_offset 16
+	.cfi_rel_offset 4, 0
+	.cfi_rel_offset 5, 4
+	.cfi_rel_offset 6, 8
+	.cfi_rel_offset 7, 12
 	cmp	r5, #0
 	itte	eq
 	moveq	r5, r6		@ the end is in the 2nd word
@@ -379,8 +409,16 @@ memchr:
 
 61:
 	pop	{r4,r5,r6,r7}
+	.cfi_restore 7
+	.cfi_restore 6
+	.cfi_restore 5
+	.cfi_restore 4
+	.cfi_adjust_cfa_offset -16
 	subs	r0,r0,#1
-	bx	lr
+	epilogue
+	.cfi_endproc
+	.cantunwind
+	.fnend
 #else
   /* Defined in memchr-stub.c.  */
 #endif
-- 
2.36.1


* [PATCH v5 5/8] newlib: libc: memcpy M-profile PACBTI-enablement
  2022-12-21 11:03 [PATCH v5 0/8] Implement assembly cortex-M PACBTI functionality Victor Do Nascimento
                   ` (3 preceding siblings ...)
  2022-12-21 11:24 ` [PATCH v5 4/8] newlib: libc: memchr " Victor L. Do Nascimento
@ 2022-12-21 11:25 ` Victor L. Do Nascimento
  2022-12-21 11:27 ` [PATCH v5 6/8] newlib: libc: aeabi_memmove " Victor L. Do Nascimento
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 15+ messages in thread
From: Victor L. Do Nascimento @ 2022-12-21 11:25 UTC (permalink / raw)
  To: newlib; +Cc: Richard Earnshaw

Add function prologue/epilogue to conditionally add BTI landing pads
and/or PAC code generation & authentication instructions depending on
compilation flags.
---
 newlib/libc/machine/arm/memcpy-armv7m.S | 33 ++++++++++++++++++-------
 1 file changed, 24 insertions(+), 9 deletions(-)

diff --git a/newlib/libc/machine/arm/memcpy-armv7m.S b/newlib/libc/machine/arm/memcpy-armv7m.S
index c8bff36f6..ec1ad6485 100644
--- a/newlib/libc/machine/arm/memcpy-armv7m.S
+++ b/newlib/libc/machine/arm/memcpy-armv7m.S
@@ -46,6 +46,8 @@
      __OPT_BIG_BLOCK_SIZE: Size of big block in words.  Default to 64.
      __OPT_MID_BLOCK_SIZE: Size of big block in words.  Default to 16.
  */
+#include "arm_asm.h"
+
 #ifndef __OPT_BIG_BLOCK_SIZE
 #define __OPT_BIG_BLOCK_SIZE (4 * 16)
 #endif
@@ -85,6 +87,8 @@
 	.global	memcpy
 	.thumb
 	.thumb_func
+	.fnstart
+	.cfi_startproc
 	.type	memcpy, %function
 memcpy:
 	@ r0: dst
@@ -93,10 +97,11 @@ memcpy:
 #ifdef __ARM_FEATURE_UNALIGNED
 	/* In case of UNALIGNED access supported, ip is not used in
 	   function body.  */
+	prologue push_ip=HAVE_PAC_LEAF
 	mov	ip, r0
 #else
-	push	{r0}
-#endif
+	prologue 0 push_ip=HAVE_PAC_LEAF
+#endif /* __ARM_FEATURE_UNALIGNED */
 	orr	r3, r1, r0
 	ands	r3, r3, #3
 	bne	.Lmisaligned_copy
@@ -178,15 +183,17 @@ memcpy:
 #endif /* __ARM_FEATURE_UNALIGNED */
 
 .Ldone:
+	.cfi_remember_state
 #ifdef __ARM_FEATURE_UNALIGNED
 	mov	r0, ip
+	epilogue push_ip=HAVE_PAC_LEAF
 #else
-	pop	{r0}
-#endif
-	bx	lr
+	epilogue 0 push_ip=HAVE_PAC_LEAF
+#endif /*  __ARM_FEATURE_UNALIGNED */
 
 	.align 2
 .Lmisaligned_copy:
+	.cfi_restore_state
 #ifdef __ARM_FEATURE_UNALIGNED
 	/* Define label DST_ALIGNED to BIG_BLOCK.  It will go to aligned copy
 	   once destination is adjusted to aligned.  */
@@ -247,6 +254,9 @@ memcpy:
 	/* dst is aligned, but src isn't.  Misaligned copy.  */
 
 	push	{r4, r5}
+	.cfi_adjust_cfa_offset 8
+	.cfi_rel_offset 4, 0
+	.cfi_rel_offset 5, 4
 	subs	r2, #4
 
 	/* Backward r1 by misaligned bytes, to make r1 aligned.
@@ -299,6 +309,9 @@ memcpy:
 	adds	r2, #4
 	subs	r1, ip
 	pop	{r4, r5}
+	.cfi_restore 4
+	.cfi_restore 5
+	.cfi_adjust_cfa_offset -8
 
 #endif /* __ARM_FEATURE_UNALIGNED */
 
@@ -321,9 +334,11 @@ memcpy:
 
 #ifdef __ARM_FEATURE_UNALIGNED
 	mov	r0, ip
+	epilogue push_ip=HAVE_PAC_LEAF
 #else
-	pop	{r0}
-#endif
-	bx	lr
-
+	epilogue 0 push_ip=HAVE_PAC_LEAF
+#endif /* __ARM_FEATURE_UNALIGNED */
+	.cfi_endproc
+	.cantunwind
+	.fnend
 	.size	memcpy, .-memcpy
-- 
2.36.1



* [PATCH v5 6/8] newlib: libc: aeabi_memmove M-profile PACBTI-enablement
  2022-12-21 11:03 [PATCH v5 0/8] Implement assembly cortex-M PACBTI functionality Victor Do Nascimento
                   ` (4 preceding siblings ...)
  2022-12-21 11:25 ` [PATCH v5 5/8] newlib: libc: memcpy " Victor L. Do Nascimento
@ 2022-12-21 11:27 ` Victor L. Do Nascimento
  2022-12-21 11:28 ` [PATCH v5 7/8] newlib: libc: aeabi_memset " Victor L. Do Nascimento
  2022-12-21 11:42 ` [PATCH v5 8/8] newlib: libc: setjmp " Victor L. Do Nascimento
  7 siblings, 0 replies; 15+ messages in thread
From: Victor L. Do Nascimento @ 2022-12-21 11:27 UTC (permalink / raw)
  To: newlib; +Cc: Richard Earnshaw

Add function prologue/epilogue to conditionally add BTI landing pads
and/or PAC code generation & authentication instructions depending on
compilation flags.
---
 newlib/libc/machine/arm/aeabi_memmove-thumb2.S | 17 ++++++++++++-----
 1 file changed, 12 insertions(+), 5 deletions(-)

diff --git a/newlib/libc/machine/arm/aeabi_memmove-thumb2.S b/newlib/libc/machine/arm/aeabi_memmove-thumb2.S
index e9504437b..20ca993e5 100644
--- a/newlib/libc/machine/arm/aeabi_memmove-thumb2.S
+++ b/newlib/libc/machine/arm/aeabi_memmove-thumb2.S
@@ -26,6 +26,8 @@
  * SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
  */
 
+#include "arm_asm.h"
+
 	.thumb
 	.syntax unified
 	.global __aeabi_memmove
@@ -33,8 +35,10 @@
 	ASM_ALIAS __aeabi_memmove4 __aeabi_memmove
 	ASM_ALIAS __aeabi_memmove8 __aeabi_memmove
 __aeabi_memmove:
+	.fnstart
+	.cfi_startproc
+	prologue 4
 	cmp	r0, r1
-	push	{r4}
 	bls	3f
 	adds	r3, r1, r2
 	cmp	r0, r3
@@ -48,9 +52,10 @@ __aeabi_memmove:
 	strb	r4, [r1, #-1]!
 	bne	1b
 2:
-	pop	{r4}
-	bx	lr
+	.cfi_remember_state
+	epilogue 4
 3:
+	.cfi_restore_state
 	cmp	r2, #0
 	beq	2b
 	add	r2, r2, r1
@@ -60,6 +65,8 @@ __aeabi_memmove:
 	cmp	r2, r1
 	strb	r4, [r3, #1]!
 	bne	4b
-	pop	{r4}
-	bx	lr
+	epilogue 4
+	.cfi_endproc
+	.cantunwind
+	.fnend
 	.size __aeabi_memmove, . - __aeabi_memmove
-- 
2.36.1


* [PATCH v5 7/8] newlib: libc: aeabi_memset M-profile PACBTI-enablement
  2022-12-21 11:03 [PATCH v5 0/8] Implement assembly cortex-M PACBTI functionality Victor Do Nascimento
                   ` (5 preceding siblings ...)
  2022-12-21 11:27 ` [PATCH v5 6/8] newlib: libc: aeabi_memmove " Victor L. Do Nascimento
@ 2022-12-21 11:28 ` Victor L. Do Nascimento
  2022-12-21 11:42 ` [PATCH v5 8/8] newlib: libc: setjmp " Victor L. Do Nascimento
  7 siblings, 0 replies; 15+ messages in thread
From: Victor L. Do Nascimento @ 2022-12-21 11:28 UTC (permalink / raw)
  To: newlib; +Cc: Richard Earnshaw

Add function prologue/epilogue to conditionally add BTI landing pads
and/or PAC code generation & authentication instructions depending on
compilation flags.
---
 newlib/libc/machine/arm/aeabi_memset-thumb2.S | 14 +++++++++++---
 1 file changed, 11 insertions(+), 3 deletions(-)

diff --git a/newlib/libc/machine/arm/aeabi_memset-thumb2.S b/newlib/libc/machine/arm/aeabi_memset-thumb2.S
index eaca1d8d7..6b77d3820 100644
--- a/newlib/libc/machine/arm/aeabi_memset-thumb2.S
+++ b/newlib/libc/machine/arm/aeabi_memset-thumb2.S
@@ -26,14 +26,18 @@
  * SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
  */
 
+#include "arm_asm.h"
+
 	.thumb
 	.syntax unified
 	.global __aeabi_memset
 	.type	__aeabi_memset, %function
+	.fnstart
+	.cfi_startproc
 	ASM_ALIAS __aeabi_memset4 __aeabi_memset
 	ASM_ALIAS __aeabi_memset8 __aeabi_memset
 __aeabi_memset:
-	push	{r4, r5, r6}
+	prologue 4 6
 	lsls	r4, r0, #30
 	beq	10f
 	subs	r4, r1, #1
@@ -98,10 +102,14 @@ __aeabi_memset:
 	cmp	r3, r4
 	bne	8b
 9:
-	pop	{r4, r5, r6}
-	bx	lr
+	.cfi_remember_state
+	epilogue 4 6
 10:
+	.cfi_restore_state
 	mov	r4, r1
 	mov	r3, r0
 	b	3b
+	.cfi_endproc
+	.cantunwind
+	.fnend
 	.size __aeabi_memset, . - __aeabi_memset
-- 
2.36.1


* [PATCH v5 8/8] newlib: libc: setjmp M-profile PACBTI-enablement
  2022-12-21 11:03 [PATCH v5 0/8] Implement assembly cortex-M PACBTI functionality Victor Do Nascimento
                   ` (6 preceding siblings ...)
  2022-12-21 11:28 ` [PATCH v5 7/8] newlib: libc: aeabi_memset " Victor L. Do Nascimento
@ 2022-12-21 11:42 ` Victor L. Do Nascimento
  2023-01-05 16:53   ` Richard Earnshaw
  7 siblings, 1 reply; 15+ messages in thread
From: Victor L. Do Nascimento @ 2022-12-21 11:42 UTC (permalink / raw)
  To: newlib; +Cc: Richard Earnshaw

Add function prologue/epilogue to conditionally add BTI landing pads
and/or PAC code generation & authentication instructions depending on
compilation flags.
---
 newlib/libc/machine/arm/setjmp.S | 39 ++++++++++++++++++++++++++++++++
 1 file changed, 39 insertions(+)

diff --git a/newlib/libc/machine/arm/setjmp.S b/newlib/libc/machine/arm/setjmp.S
index d814afea8..3e4d7cb70 100644
--- a/newlib/libc/machine/arm/setjmp.S
+++ b/newlib/libc/machine/arm/setjmp.S
@@ -155,6 +155,8 @@ SYM (.arm_start_of.\name):
 	.align 2
 	MODE
 	.globl SYM (\name)
+	.fnstart
+	.cfi_startproc
 	TYPE (\name)
 SYM (\name):
 	PROLOGUE \name
@@ -162,6 +164,8 @@ SYM (\name):
 
 .macro FUNC_END name
 	RET
+	.cfi_endproc
+	.fnend
 	SIZE (\name)
 .endm
 
@@ -171,6 +175,21 @@ SYM (\name):
 
 	FUNC_START setjmp
 
+#if __ARM_FEATURE_PAC_DEFAULT
+# if __ARM_FEATURE_BTI_DEFAULT
+	pacbti	ip, lr, sp
+# else
+	pac	ip, lr, sp
+# endif /* __ARM_FEATURE_BTI_DEFAULT */
+	mov r3, ip
+	str r3, [r0, #104]
+	.cfi_register 143, 12
+#else
+# if __ARM_FEATURE_BTI_DEFAULT
+	bti
+# endif /* __ARM_FEATURE_BTI_DEFAULT */
+#endif /* __ARM_FEATURE_PAC_DEFAULT */
+
 	/* Save all the callee-preserved registers into the jump buffer.  */
 #ifdef __thumb2__
 	mov		ip, sp
@@ -184,6 +203,10 @@ SYM (\name):
 
 	/* When setting up the jump buffer return 0.  */
 	mov		r0, #0
+#if __ARM_FEATURE_PAC_DEFAULT
+	mov ip, r3
+	aut ip, lr, sp
+#endif /* __ARM_FEATURE_PAC_DEFAULT */
 
 	FUNC_END setjmp
 
@@ -193,6 +216,16 @@ SYM (\name):
 
 	FUNC_START longjmp
 
+#if __ARM_FEATURE_BTI_DEFAULT
+	bti
+#endif /* __ARM_FEATURE_BTI_DEFAULT */
+
+#if __ARM_FEATURE_PAC_DEFAULT
+	/* Keep original jmpbuf address for retrieving pac-code
+	   for authentication.  */
+	mov	r2, r0
+#endif /* __ARM_FEATURE_PAC_DEFAULT */
+
 	/* If we have stack extension code it ought to be handled here.  */
 
 	/* Restore the registers, retrieving the state when setjmp() was called.  */
@@ -212,5 +245,11 @@ SYM (\name):
 	it		eq
 	moveq		r0, #1
 
+#if __ARM_FEATURE_PAC_DEFAULT
+	ldr r3, [r2, #104]
+	mov ip, r3
+	aut ip, lr, sp
+#endif /* __ARM_FEATURE_PAC_DEFAULT */
+
 	FUNC_END longjmp
 #endif
-- 
2.36.1


* Re: [PATCH v5 8/8] newlib: libc: setjmp M-profile PACBTI-enablement
  2022-12-21 11:42 ` [PATCH v5 8/8] newlib: libc: setjmp " Victor L. Do Nascimento
@ 2023-01-05 16:53   ` Richard Earnshaw
  0 siblings, 0 replies; 15+ messages in thread
From: Richard Earnshaw @ 2023-01-05 16:53 UTC (permalink / raw)
  To: Victor L. Do Nascimento, newlib; +Cc: Richard Earnshaw



On 21/12/2022 11:42, Victor L. Do Nascimento wrote:
> Add function prologue/epilogue to conditionally add BTI landing pads
> and/or PAC code generation & authentication instructions depending on
> compilation flags.
> ---
>   newlib/libc/machine/arm/setjmp.S | 39 ++++++++++++++++++++++++++++++++
>   1 file changed, 39 insertions(+)
> 
> diff --git a/newlib/libc/machine/arm/setjmp.S b/newlib/libc/machine/arm/setjmp.S
> index d814afea8..3e4d7cb70 100644
> --- a/newlib/libc/machine/arm/setjmp.S
> +++ b/newlib/libc/machine/arm/setjmp.S
> @@ -155,6 +155,8 @@ SYM (.arm_start_of.\name):
>   	.align 2
>   	MODE
>   	.globl SYM (\name)
> +	.fnstart
> +	.cfi_startproc
>   	TYPE (\name)
>   SYM (\name):
>   	PROLOGUE \name
> @@ -162,6 +164,8 @@ SYM (\name):
>   
>   .macro FUNC_END name
>   	RET
> +	.cfi_endproc
> +	.fnend
>   	SIZE (\name)
>   .endm
>   
> @@ -171,6 +175,21 @@ SYM (\name):
>   
>   	FUNC_START setjmp
>   
> +#if __ARM_FEATURE_PAC_DEFAULT
> +# if __ARM_FEATURE_BTI_DEFAULT
> +	pacbti	ip, lr, sp
> +# else
> +	pac	ip, lr, sp
> +# endif /* __ARM_FEATURE_BTI_DEFAULT */
> +	mov r3, ip
> +	str r3, [r0, #104]

#104 here is a bit obscure.  I think it would be clearer to write 
something like

	str r3, [r0, #(CORE_REGS_SAVE_SIZE + FP_REGS_SAVE_SIZE)]

and then define these as appropriate.
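
A minimal sketch of what such definitions could look like, assuming the
jump buffer holds ten 4-byte core registers followed by eight 8-byte FP
registers (which would account for the literal 104).  The register
counts and the names CORE_REGS_COUNT, FP_REGS_COUNT and PAC_CODE_OFFSET
are assumptions for illustration only; the real values have to follow
what setjmp.S actually stores.

```c
#include <assert.h>

/* Hypothetical layout constants; the counts below are assumptions.  */
#define CORE_REGS_COUNT      10  /* assumed: r4-r11, sp, lr */
#define FP_REGS_COUNT        8   /* assumed: d8-d15 */

#define CORE_REGS_SAVE_SIZE  (CORE_REGS_COUNT * 4)  /* 4 bytes per core reg */
#define FP_REGS_SAVE_SIZE    (FP_REGS_COUNT * 8)    /* 8 bytes per D reg */

/* Hypothetical name for the slot that follows both save areas.  */
#define PAC_CODE_OFFSET      (CORE_REGS_SAVE_SIZE + FP_REGS_SAVE_SIZE)

/* Helper returning the byte offset of the PAC code within the buffer.  */
static int pac_code_offset(void)
{
    return PAC_CODE_OFFSET;
}
```

With names like these, the store and the matching load in longjmp can
share one expression instead of repeating the bare constant.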

> +	.cfi_register 143, 12
> +#else
> +# if __ARM_FEATURE_BTI_DEFAULT
> +	bti
> +# endif /* __ARM_FEATURE_BTI_DEFAULT */
> +#endif /* __ARM_FEATURE_PAC_DEFAULT */
> +
>   	/* Save all the callee-preserved registers into the jump buffer.  */
>   #ifdef __thumb2__
>   	mov		ip, sp
> @@ -184,6 +203,10 @@ SYM (\name):
>   
>   	/* When setting up the jump buffer return 0.  */
>   	mov		r0, #0
> +#if __ARM_FEATURE_PAC_DEFAULT
> +	mov ip, r3
> +	aut ip, lr, sp
> +#endif /* __ARM_FEATURE_PAC_DEFAULT */
>   
>   	FUNC_END setjmp
>   
> @@ -193,6 +216,16 @@ SYM (\name):
>   
>   	FUNC_START longjmp
>   
> +#if __ARM_FEATURE_BTI_DEFAULT
> +	bti
> +#endif /* __ARM_FEATURE_BTI_DEFAULT */
> +
> +#if __ARM_FEATURE_PAC_DEFAULT
> +	/* Keep original jmpbuf address for retrieving pac-code
> +	   for authentication.  */
> +	mov	r2, r0
> +#endif /* __ARM_FEATURE_PAC_DEFAULT */
> +
>   	/* If we have stack extension code it ought to be handled here.  */
>   
>   	/* Restore the registers, retrieving the state when setjmp() was called.  */
> @@ -212,5 +245,11 @@ SYM (\name):
>   	it		eq
>   	moveq		r0, #1
>   
> +#if __ARM_FEATURE_PAC_DEFAULT
> +	ldr r3, [r2, #104]
> +	mov ip, r3

See above.  Also, you don't need to load into r3 and then move to ip; 
just load ip directly.

> +	aut ip, lr, sp
> +#endif /* __ARM_FEATURE_PAC_DEFAULT */
> +
>   	FUNC_END longjmp
>   #endif

R.


* Re: [PATCH v5 1/8] newlib: libc: define M-profile PACBTI-enablement macros
  2022-12-21 11:19 ` [PATCH v5 1/8] newlib: libc: define M-profile PACBTI-enablement macros Victor L. Do Nascimento
@ 2023-01-06 10:42   ` Christophe Lyon
  2023-01-06 20:51     ` Victor Do Nascimento
  2023-01-09  9:33     ` Christophe Lyon
  0 siblings, 2 replies; 15+ messages in thread
From: Christophe Lyon @ 2023-01-06 10:42 UTC (permalink / raw)
  To: Victor L. Do Nascimento, newlib; +Cc: Richard Earnshaw

Hi Victor,

Thanks for the patch series, a few comments/questions below.

Christophe


On 12/21/22 12:19, Victor L. Do Nascimento wrote:
> Augment the arm_asm.h header file to simplify function prologues and
> epilogues whilst adding support for PACBTI enablement via macros for
> hand-written assembly functions.  For PACBTI, both prologues/epilogues
> as well as cfi-related directives are automatically amended
> accordingly, depending on the compile-time mbranch-protection argument
> values.
> 
> It defines the following preprocessor macros:
>     * HAVE_PAC_LEAF: Indicates whether pac-signing has been requested for
>     leaf functions.
>     * PAC_LEAF_PUSH_IP: Whether leaf functions should push the pac code
>     to the stack irrespective of whether the ip register is clobbered in
>     the function or not.
>     * STACK_ALIGN_ENFORCE: Whether a dummy register should be added to
>     the push list as necessary in the prologue to ensure stack
>     alignment preservation at the start of assembly function.  The
>     epilogue behavior is likewise affected by this flag, ensuring any
>     pushed dummy registers also get popped on function return.
IIUC, these new macros are meant for general use outside of newlib.  Do 
they need proper documentation, or maybe an entry in the "News" section? 
I don't know. Otherwise, I think they should not appear in the user 
namespace.
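
For reference, a small C model of how the two PAC-related settings are
derived, based on the definitions in the arm_asm.h hunk of this patch
(leaf signing is bit 2 of __ARM_FEATURE_PAC_DEFAULT).  The function
names and the "-1 means not defined" convention are hypothetical, and
the mask values used with it are sample inputs, not the output of any
particular compiler flag.

```c
#include <assert.h>
#include <stdbool.h>

/* Bit 2 of __ARM_FEATURE_PAC_DEFAULT marks leaf-function signing
   (LEAF_PROTECT_BIT in the patch).  */
#define LEAF_PROTECT_BIT 2

/* Hypothetical helper mirroring the HAVE_PAC_LEAF expression: true when
   the leaf bit is set in the given feature mask.  */
static bool have_pac_leaf(unsigned pac_default)
{
    return (pac_default & (1u << LEAF_PROTECT_BIT)) != 0;
}

/* Hypothetical helper mirroring the PAC_LEAF_PUSH_IP default.  Without
   leaf signing the value is forced to 0.  With leaf signing, a value
   already defined by the user (user_value >= 0) is kept, otherwise the
   default is 1.  user_value == -1 stands for "not defined".  */
static int pac_leaf_push_ip(unsigned pac_default, int user_value)
{
    if (!have_pac_leaf(pac_default))
        return 0;
    return user_value >= 0 ? user_value : 1;
}
```

The point of the model is that PAC_LEAF_PUSH_IP can only be overridden
when leaf signing is on; otherwise the header resets it to 0.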


> It also defines the following assembler macros:
>     * prologue: In addition to pushing any callee-saved registers onto
>     the stack, it generates any requested pacbti instructions.
>     Pushed registers are specified via the optional `first', `last',
>     `push_ip' and `push_lr' macro argument parameters.
Maybe you should quote 'first' and 'last' differently from 'push_ip' and 
'push_lr', since the example below shows that 'first' and 'last' are in 
fact register numbers (IIUC)

>     when a single register number is provided, it pushes that
Typo: "When" (with a capital)

>     register.  When two register numbers are provided, they specify a
>     rage to save.  If push_ip and/or push_lr are non-zero, the
Typo: "range"

>     respective registers are also saved.  Stack alignment is requested
>     via the `align` argument, which defaults to the value of
>     STACK_ALIGN_ENFORCE, unless manually overridden.
> 
>     For example:
> 
>         prologue push_ip=1 -> push {ip}
>         prologue push_ip=1, align8=1 -> push {r2, ip}
>         prologue push_ip=1, push_lr=1 -> push {ip, lr}
>         prologue 1 -> push {r1}
>         prologue 1, align8=1 -> push {r0, r1}
>         prologue 1 push_ip=1 -> push {r1, ip}
>         prologue 1 4 -> push {r1-r4}
>         prologue 1 4 push_ip=1 -> push {r1-r4, ip}
can you include an example with pacbti?
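
The align8 examples above can be modelled roughly as follows.  This is
a sketch inferred from the examples and from the notes in the header
comment (odd register count gets a dummy register prepended, r0 as
first register is an error), not a transcription of the _align8 macro.
The reg_range type and align8_range name are hypothetical, and
first == -1 stands for "no GPR range given".

```c
#include <assert.h>

/* Hypothetical result type: adjusted GPR range plus an error flag.  */
typedef struct {
    int first;
    int last;
    int error;
} reg_range;

/* Return the GPR range to push so that the total number of pushed
   registers (GPRs + ip + lr) is even, i.e. a multiple of 8 bytes.  */
static reg_range align8_range(int first, int last, int push_ip, int push_lr)
{
    reg_range out = { first, last, 0 };
    int gprs = (first < 0) ? 0 : (last - first + 1);
    int total = gprs + push_ip + push_lr;

    if (total % 2 == 0)
        return out;                 /* already 8-byte aligned */
    if (first < 0) {
        out.first = out.last = 2;   /* only ip or lr: pad with r2 */
    } else if (first == 0) {
        out.error = 1;              /* nothing below r0 to prepend */
    } else {
        out.first = first - 1;      /* prepend the next lower register */
    }
    return out;
}
```

For instance, align8_range(-1, -1, 1, 0) yields r2 as the padding
register, matching `prologue push_ip=1, align8=1 -> push {r2, ip}'.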


>     * epilogue: pops registers off the stack and emits the pac
>     authentication instruction (aut), if requested. The `first', `last',
>     `push_ip', `push_lr' and `align' arguments function as per the
>     prologue macro, generating pop instead of push instructions.
> 
>     Stack alignment is enforced via the following helper macro
>     call-chain:
> 
> 	{prologue|epilogue} ->_align8 -> _preprocess_reglist ->
> 	  _preprocess_reglist1 -> {_prologue|_epilogue}
> 
>     Finally, the necessary cfi directives for adding debug information
>     to prologue and epilogue are generated via the following macros:
> 
>     * cfisavelist - prologue macro helper function, generating
>     necessary .cfi_offset directives associated with push instruction.
>     Therefore, the net effect of calling `prologue 1 2 push_ip=1' is
>     to generate the following:
> 
>         push {r1-r2, ip}
>         .cfi_adjust_cfa_offset 12
>         .cfi_offset 143, -4
>         .cfi_offset 2, -8
>         .cfi_offset 1, -12
> 
>     * cfirestorelist - epilogue macro helper function, emitting
>     .cfi_restore instructions prior to resetting the cfa offset.  As
>     such, calling `epilogue 1 2 push_ip=1' will produce:
> 
>          pop {r1-r2, ip}
> 	.cfi_register 143, 12
> 	.cfi_restore 2
> 	.cfi_restore 1
> 	.cfi_def_cfa_offset 0
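
The offset assignment described above can be sketched in C as follows.
It models the order used by the _prologue cases in the diff (lr takes
the highest slot, then the PAC code in ip recorded as register 143,
then the GPR range from `last' down to `first').  The cfi_entry type
and prologue_cfi name are hypothetical, and first == -1 stands for "no
GPR range".

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical record: DWARF register number and offset from the CFA.  */
typedef struct {
    int reg;
    int offset;
} cfi_entry;

/* Fill `out' with one entry per pushed register and return the entry
   count.  *cfa_adjust receives the number of bytes pushed.  `out' must
   have room for every pushed register (at most 14 entries here).  */
static size_t prologue_cfi(int first, int last, int push_ip, int push_lr,
                           cfi_entry *out, int *cfa_adjust)
{
    size_t n = 0;
    int index = 1;      /* slot k is saved at CFA - 4*k */

    if (push_lr)
        out[n++] = (cfi_entry){ 14, -4 * index++ };
    if (push_ip)
        out[n++] = (cfi_entry){ 143, -4 * index++ };
    if (first >= 0)
        for (int r = last; r >= first; r--)
            out[n++] = (cfi_entry){ r, -4 * index++ };

    *cfa_adjust = 4 * (int)n;
    return n;
}
```

Calling prologue_cfi(1, 2, 1, 0, ...) gives a CFA adjustment of 12 and
the pairs (143, -4), (2, -8), (1, -12), matching the listing for
`prologue 1 2 push_ip=1'.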
> ---
>   newlib/libc/machine/arm/arm_asm.h | 441 ++++++++++++++++++++++++++++++
>   1 file changed, 441 insertions(+)
> 
> diff --git a/newlib/libc/machine/arm/arm_asm.h b/newlib/libc/machine/arm/arm_asm.h
> index 2708057de..94fa77b4d 100644
> --- a/newlib/libc/machine/arm/arm_asm.h
> +++ b/newlib/libc/machine/arm/arm_asm.h
> @@ -60,4 +60,445 @@
>   # define _ISA_THUMB_1
>   #endif
>   
> +/* Check whether leaf function PAC signing has been requested in the
> +   -mbranch-protection compile-time option.  */
> +#define LEAF_PROTECT_BIT 2
Shouldn't this start with '__', or be #undef'ed at the end of this file, to 
avoid polluting the user namespace? (I noticed it's not used outside this 
file in this patch series.)

> +
> +#ifdef __ARM_FEATURE_PAC_DEFAULT
> +# define HAVE_PAC_LEAF \
> +	((__ARM_FEATURE_PAC_DEFAULT & (1 << LEAF_PROTECT_BIT)) && 1)
> +#else
> +# define HAVE_PAC_LEAF 0
> +#endif
> +
> +/* Provide default parameters for PAC-code handling in leaf-functions.  */
> +#if HAVE_PAC_LEAF
> +# ifndef PAC_LEAF_PUSH_IP
> +#  define PAC_LEAF_PUSH_IP 1
> +# endif
> +#else /* !HAVE_PAC_LEAF */
> +# undef PAC_LEAF_PUSH_IP
> +# define PAC_LEAF_PUSH_IP 0
> +#endif /* HAVE_PAC_LEAF */
> +
> +#define STACK_ALIGN_ENFORCE 0
> +
> +#ifdef __ASSEMBLER__
> +
> +/******************************************************************************
> +* Implementation of the prologue and epilogue assembler macros and their
> +* associated helper functions.
> +*
> +* These functions add support for the following:
> +*
> +* - M-profile branch target identification (BTI) landing-pads when compiled
> +*   with `-mbranch-protection=bti'.
> +* - PAC-signing and verification instructions, depending on hardware support
> +*   and whether the PAC-signing of leaf functions has been requested via the
> +*   `-mbranch-protection=pac-ret+leaf' compiler argument.
> +* - 8-byte stack alignment preservation at function entry, defaulting to the
> +*   value of STACK_ALIGN_ENFORCE.
> +*
> +* Notes:
> +* - Prologue stack alignment is implemented by detecting a push with an odd
> +*   number of registers and prepending a dummy register to the list.
> +* - If alignment is attempted on a list containing r0, compilation will result
> +*   in an error.
> +* - If alignment is attempted in a list containing r1, r0 will be prepended to
> +*   the register list and r0 will be restored prior to function return.  For
> +*   functions with non-void return types, this will result in the corruption of
> +*   the result register.
> +* - Stack alignment is enforced via the following helper macro call-chain:
> +*
> +*	{prologue|epilogue} ->_align8 -> _preprocess_reglist ->
> +*		_preprocess_reglist1 -> {_prologue|_epilogue}
> +*
> +* - Debug CFI directives are automatically added to prologues and epilogues,
> +*   assisted by `cfisavelist' and `cfirestorelist', respectively.
> +*
> +* Arguments:
> +* prologue
> +* --------
> +* - first	- If `last' specified, this serves as start of general-purpose
> +*		  register (GPR) range to push onto stack, otherwise represents
> +*		  single GPR to push onto stack.  If omitted, no GPRs pushed
> +*		  onto stack at prologue.
> +* - last	- If given, specifies inclusive upper-bound of GPR range.
> +* - push_ip	- Determines whether IP register is to be pushed to stack at
> +*		  prologue.  When pac-signing is requested, this holds the
> +*		  pac-key.  Either 1 or 0 to push or not push, respectively.
> +*		  Default behavior: Set to value of PAC_LEAF_PUSH_IP macro.
> +* - push_lr	- Determines whether to push lr to the stack on function entry.
> +*		  Either 1 or 0  to push or not push, respectively.
> +* - align8	- Whether to enforce alignment. Either 1 or 0, with 1 requesting
> +*		  alignment.
> +*
> +* epilogue
> +* --------
> +*   The epilogue should be called passing the same arguments as those passed to
> +*   the prologue to ensure the stack is not corrupted on function return.
> +*
> +* Usage examples:
> +*
> +*   prologue push_ip=1 -> push {ip}
> +*   epilogue push_ip=1, align8=1 -> pop {r2, ip}
> +*   prologue push_ip=1, push_lr=1 -> push {ip, lr}
> +*   epilogue 1 -> pop {r1}
> +*   prologue 1, align8=1 -> push {r0, r1}
> +*   epilogue 1, push_ip=1 -> pop {r1, ip}
> +*   prologue 1, 4 -> push {r1-r4}
> +*   epilogue 1, 4 push_ip=1 -> pop {r1-r4, ip}
> +*
> +******************************************************************************/
> +
> +/* Emit .cfi_restore directives for a consecutive sequence of registers.  */
> +	.macro cfirestorelist first, last
> +	.cfi_restore \last
> +	.if \last-\first
> +	 cfirestorelist \first, \last-1
> +	.endif
> +	.endm
> +
> +/* Emit .cfi_offset directives for a consecutive sequence of registers.  */
> +	.macro cfisavelist first, last, index=1
> +	.cfi_offset \last, -4*(\index)
> +	.if \last-\first
> +	 cfisavelist \first, \last-1, \index+1
> +	.endif
> +	.endm
> +
> +.macro _prologue first=-1, last=-1, push_ip=PAC_LEAF_PUSH_IP, push_lr=0
> +	.if \push_ip & 1 != \push_ip
> +	 .error "push_ip may be either 0 or 1"
> +	.endif
> +	.if \push_lr & 1 != \push_lr
> +	 .error "push_lr may be either 0 or 1"
> +	.endif
> +	.if \first != -1
> +	 .if \last == -1
> +	  /* Upper-bound not provided: Set upper = lower.  */
> +	  _prologue \first, \first, \push_ip, \push_lr
> +	  .exitm
> +	 .endif
> +	.endif
> +#if HAVE_PAC_LEAF
> +#if __ARM_FEATURE_BTI_DEFAULT
> +	pacbti	ip, lr, sp
> +#else
> +	pac	ip, lr, sp
> +#endif /* __ARM_FEATURE_BTI_DEFAULT */
> +	.cfi_register 143, 12
> +#else
> +#if __ARM_FEATURE_BTI_DEFAULT
> +	bti
> +#endif /* __ARM_FEATURE_BTI_DEFAULT */
> +#endif /* HAVE_PAC_LEAF */
> +	.if \first != -1
> +	 .if \last != \first
> +	  .if \last >= 13
> +	.error "SP cannot be in the save list"
> +	  .endif
I think you should also check that IP (r12) is not in the range; 
otherwise I think nothing prevents someone from writing
prologue 12, push_ip=1, which would emit push {r12, ip}
(I suppose gas would complain?)
... scratch that, I saw later that this sanity check is performed in 
_preprocess_reglist1 :-)
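
To summarise the checks involved, here is a rough C model of the
argument validation, assuming only what is visible in the diff: the
push_ip/push_lr flags must be 0 or 1, and _preprocess_reglist1 accepts
`last' only in the range r0-r11, which keeps ip out of the GPR range.
The function name check_reglist is hypothetical and the returned
strings are paraphrased from the .error messages.

```c
#include <assert.h>
#include <stddef.h>

/* Return NULL when the arguments are acceptable, otherwise a short
   description of the first failed check.  */
static const char *check_reglist(int last, int push_ip, int push_lr)
{
    if ((push_ip & 1) != push_ip)
        return "push_ip may be either 0 or 1";
    if ((push_lr & 1) != push_lr)
        return "push_lr may be either 0 or 1";
    if (last < 0 || last > 11)
        return "last out of range";     /* r12 (ip) and above rejected */
    return NULL;
}
```

Under this model check_reglist(12, 1, 0) is rejected, so the
`prologue 12, push_ip=1' case never reaches the push instruction.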


> +	  .if \push_ip
> +	   .if \push_lr
> +	/* Case 1: push register range, ip and lr registers.  */
> +	push {r\first-r\last, ip, lr}
> +	.cfi_adjust_cfa_offset ((\last-\first)+3)*4
> +	.cfi_offset 14, -4
> +	.cfi_offset 143, -8
> +	cfisavelist \first, \last, 3
> +	   .else // !\push_lr
> +	/* Case 2: push register range and ip register.  */
> +	push {r\first-r\last, ip}
> +	.cfi_adjust_cfa_offset ((\last-\first)+2)*4
> +	.cfi_offset 143, -4
> +	cfisavelist \first, \last, 2
> +	   .endif
> +	  .else // !\push_ip
> +	   .if \push_lr
> +	/* Case 3: push register range and lr register.  */
> +	push {r\first-r\last, lr}
> +	.cfi_adjust_cfa_offset ((\last-\first)+2)*4
> +	.cfi_offset 14, -4
> +	cfisavelist \first, \last, 2
> +	   .else // !\push_lr
> +	/* Case 4: push register range.  */
> +	push {r\first-r\last}
> +	.cfi_adjust_cfa_offset ((\last-\first)+1)*4
> +	cfisavelist \first, \last, 1
> +	   .endif
> +	  .endif
> +	 .else // \last == \first
> +	  .if \push_ip
> +	   .if \push_lr
> +	/* Case 5: push single GP register plus ip and lr registers.  */
> +	push {r\first, ip, lr}
> +	.cfi_adjust_cfa_offset 12
> +	.cfi_offset 14, -4
> +	.cfi_offset 143, -8
> +        cfisavelist \first, \first, 3
> +	   .else // !\push_lr
> +	/* Case 6: push single GP register plus ip register.  */
> +	push {r\first, ip}
> +	.cfi_adjust_cfa_offset 8
> +	.cfi_offset 143, -4
> +        cfisavelist \first, \first, 2
> +	   .endif
> +	  .else // !\push_ip
> +	   .if \push_lr
> +	/* Case 7: push single GP register plus lr register.  */
> +	push {r\first, lr}
> +	.cfi_adjust_cfa_offset 8
> +	.cfi_offset 14, -4
> +	cfisavelist \first, \first, 2
> +	   .else // !\push_lr
> +	/* Case 8: push single GP register.  */
> +	push {r\first}
> +	.cfi_adjust_cfa_offset 4
> +	cfisavelist \first, \first, 1
> +	   .endif
> +	  .endif
> +	 .endif
> +	.else // \first == -1
> +	 .if \push_ip
> +	  .if \push_lr
> +	/* Case 9: push ip and lr registers.  */
> +	push {ip, lr}
> +	.cfi_adjust_cfa_offset 8
> +	.cfi_offset 14, -4
> +	.cfi_offset 143, -8
> +	  .else // !\push_lr
> +	/* Case 10: push ip register.  */
> +	push {ip}
> +	.cfi_adjust_cfa_offset 4
> +	.cfi_offset 143, -4
> +	  .endif
> +	 .else // !\push_ip
> +          .if \push_lr
> +	/* Case 11: push lr register.  */
> +	push {lr}
> +	.cfi_adjust_cfa_offset 4
> +	.cfi_offset 14, -4
> +          .endif
> +	 .endif
> +	.endif
> +.endm
> +
> +.macro _epilogue first=-1, last=-1, push_ip=PAC_LEAF_PUSH_IP, push_lr=0
> +	.if \push_ip & 1 != \push_ip
> +	 .error "push_ip may be either 0 or 1"
> +	.endif
> +	.if \push_lr & 1 != \push_lr
> +	 .error "push_lr may be either 0 or 1"
> +	.endif
> +	.if \first != -1
> +	 .if \last == -1
> +	  /* Upper-bound not provided: Set upper = lower.  */
> +	  _epilogue \first, \first, \push_ip, \push_lr
> +	  .exitm
> +	 .endif
> +	 .if \last != \first
> +	  .if \last >= 13
> +	.error "SP cannot be in the save list"
> +	  .endif
> +	  .if \push_ip
> +	   .if \push_lr
> +	/* Case 1: pop register range, ip and lr registers.  */
> +	pop {r\first-r\last, ip, lr}
> +	.cfi_restore 14
> +	.cfi_register 143, 12
> +	cfirestorelist \first, \last
> +	   .else // !\push_lr
> +	/* Case 2: pop register range and ip register.  */
> +	pop {r\first-r\last, ip}
> +	.cfi_register 143, 12
> +	cfirestorelist \first, \last
> +	   .endif
> +	  .else // !\push_ip
> +	   .if \push_lr
> +	/* Case 3: pop register range and lr register.  */
> +	pop {r\first-r\last, lr}
> +	.cfi_restore 14
> +	cfirestorelist \first, \last
> +	   .else // !\push_lr
> +	/* Case 4: pop register range.  */
> +	pop {r\first-r\last}
> +	cfirestorelist \first, \last
> +	   .endif
> +	  .endif
> +	 .else // \last == \first
> +	  .if \push_ip
> +	   .if \push_lr
> +	/* Case 5: pop single GP register plus ip and lr registers.  */
> +	pop {r\first, ip, lr}
> +	.cfi_restore 14
> +	.cfi_register 143, 12
> +	cfirestorelist \first, \first
> +	   .else // !\push_lr
> +	/* Case 6: pop single GP register plus ip register.  */
> +	pop {r\first, ip}
> +	.cfi_register 143, 12
> +	cfirestorelist \first, \first
> +	   .endif
> +	  .else // !\push_ip
> +	   .if \push_lr
> +	/* Case 7: pop single GP register plus lr register.  */
> +	pop {r\first, lr}
> +	.cfi_restore 14
> +	cfirestorelist \first, \first
> +	   .else // !\push_lr
> +	/* Case 8: pop single GP register.  */
> +	pop {r\first}
> +	cfirestorelist \first, \first
> +	   .endif
> +	  .endif
> +	 .endif
> +	.else // \first == -1
> +	 .if \push_ip
> +	  .if \push_lr
> +	/* Case 9: pop ip and lr registers.  */
> +	pop {ip, lr}
> +	.cfi_restore 14
> +	.cfi_register 143, 12
> +	  .else // !\push_lr
> +	/* Case 10: pop ip register.  */
> +	pop {ip}
> +	.cfi_register 143, 12
> +	  .endif
> +	 .else // !\push_ip
> +          .if \push_lr
> +	/* Case 11: pop lr register.  */
> +	pop {lr}
> +	.cfi_restore 14
> +          .endif
> +	 .endif
> +	.endif
> +#if HAVE_PAC_LEAF
> +	aut	ip, lr, sp
> +#endif /* HAVE_PAC_LEAF */
> +	bx	lr
> +.endm
> +
> +# clean up expressions in 'last'
> +.macro _preprocess_reglist1 first:req, last:req, push_ip:req, push_lr:req, reglist_op:req
> +	.if \last == 0
> +	 \reglist_op \first, 0, \push_ip, \push_lr
> +	.elseif \last == 1
> +	 \reglist_op \first, 1, \push_ip, \push_lr
> +	.elseif \last == 2
> +	 \reglist_op \first, 2, \push_ip, \push_lr
> +	.elseif \last == 3
> +	 \reglist_op \first, 3, \push_ip, \push_lr
> +	.elseif \last == 4
> +	 \reglist_op \first, 4, \push_ip, \push_lr
> +	.elseif \last == 5
> +	 \reglist_op \first, 5, \push_ip, \push_lr
> +	.elseif \last == 6
> +	 \reglist_op \first, 6, \push_ip, \push_lr
> +	.elseif \last == 7
> +	 \reglist_op \first, 7, \push_ip, \push_lr
> +	.elseif \last == 8
> +	 \reglist_op \first, 8, \push_ip, \push_lr
> +	.elseif \last == 9
> +	 \reglist_op \first, 9, \push_ip, \push_lr
> +	.elseif \last == 10
> +	 \reglist_op \first, 10, \push_ip, \push_lr
> +	.elseif \last == 11
> +	 \reglist_op \first, 11, \push_ip, \push_lr
> +	.else
> +	 .error "last (\last) out of range"
> +	.endif
> +.endm
> +
> +# clean up expressions in 'first'
> +.macro _preprocess_reglist first:req, last, push_ip=0, push_lr=0, reglist_op:req
> +	.ifb \last
> +	 _preprocess_reglist \first \first \push_ip \push_lr
> +	.else
> +	 .if \first > \last
> +	  .error "last (\last) must be at least as great as first (\first)"
> +	 .endif
> +	 .if \first == 0
> +	  _preprocess_reglist1 0, \last, \push_ip, \push_lr, \reglist_op
> +	 .elseif \first == 1
> +	  _preprocess_reglist1 1, \last, \push_ip, \push_lr, \reglist_op
> +	 .elseif \first == 2
> +	  _preprocess_reglist1 2, \last, \push_ip, \push_lr, \reglist_op
> +	 .elseif \first == 3
> +	  _preprocess_reglist1 3, \last, \push_ip, \push_lr, \reglist_op
> +	 .elseif \first == 4
> +	  _preprocess_reglist1 4, \last, \push_ip, \push_lr, \reglist_op
> +	 .elseif \first == 5
> +	  _preprocess_reglist1 5, \last, \push_ip, \push_lr, \reglist_op
> +	 .elseif \first == 6
> +	  _preprocess_reglist1 6, \last, \push_ip, \push_lr, \reglist_op
> +	 .elseif \first == 7
> +	  _preprocess_reglist1 7, \last, \push_ip, \push_lr, \reglist_op
> +	 .elseif \first == 8
> +	  _preprocess_reglist1 8, \last, \push_ip, \push_lr, \reglist_op
> +	 .elseif \first == 9
> +	  _preprocess_reglist1 9, \last, \push_ip, \push_lr, \reglist_op
> +	 .elseif \first == 10
> +	  _preprocess_reglist1 10, \last, \push_ip, \push_lr, \reglist_op
> +	 .elseif \first == 11
> +	  _preprocess_reglist1 11, \last, \push_ip, \push_lr, \reglist_op
> +	 .else
> +	  .error "first (\first) out of range"
> +	 .endif
> +	.endif
> +.endm
> +
> +.macro _align8 first, last, push_ip=0, push_lr=0, reglist_op=_prologue
> +	.ifb \first
> +	 .ifnb \last
> +	  .error "can't have last (\last) without specifying first"
> +	 .else // \last not blank
> +	  .if ((\push_ip + \push_lr) % 2) == 0
> +	   \reglist_op first=-1, last=-1, push_ip=\push_ip, push_lr=\push_lr
> +	   .exitm
> +	  .else // ((\push_ip + \push_lr) % 2) odd
> +	   _align8 2, 2, \push_ip, \push_lr, \reglist_op
> +	   .exitm
> +	  .endif // ((\push_ip + \push_lr) % 2) == 0
> +	 .endif // .ifnb \last
> +	.endif // .ifb \first
> +
> +	.ifb \last
> +	 _align8 \first, \first, \push_ip, \push_lr, \reglist_op
> +	.else
> +	 .if \push_ip & 1 <> \push_ip
> +	  .error "push_ip may be 0 or 1"
> +	 .endif
> +	 .if \push_lr & 1 <> \push_lr
> +	  .error "push_lr may be 0 or 1"
> +	 .endif
> +	 .ifeq (\last - \first + \push_ip + \push_lr) % 2
> +	  .if \first == 0
> +	   .error "Alignment required and first register is r0"
> +	   .exitm
> +	  .endif
> +	  _preprocess_reglist \first-1, \last, \push_ip, \push_lr, \reglist_op
> +	 .else
> +	  _preprocess_reglist \first \last, \push_ip, \push_lr, \reglist_op
> +	 .endif
> +	.endif
> +.endm
> +
> +.macro prologue first, last, push_ip=PAC_LEAF_PUSH_IP, push_lr=0, align8=STACK_ALIGN_ENFORCE
> +	.if \align8
> +	 _align8 \first, \last, \push_ip, \push_lr, _prologue
> +	.else
> +	 _prologue first=\first, last=\last, push_ip=\push_ip, push_lr=\push_lr
> +	.endif
> +.endm
> +
> +.macro epilogue first, last, push_ip=PAC_LEAF_PUSH_IP, push_lr=0, align8=STACK_ALIGN_ENFORCE
> +	.if \align8
> +	 _align8 \first, \last, \push_ip, \push_lr, reglist_op=_epilogue
> +	.else
> +	 _epilogue first=\first, last=\last, push_ip=\push_ip, push_lr=\push_lr
> +	.endif
> +.endm
> +
> +#endif /* __ASSEMBLER__ */
> +
>   #endif /* ARM_ASM__H */


* Re: [PATCH v5 2/8] newlib: libc: strcmp M-profile PACBTI-enablement
  2022-12-21 11:21 ` [PATCH v5 2/8] newlib: libc: strcmp M-profile PACBTI-enablement Victor L. Do Nascimento
@ 2023-01-06 11:09   ` Christophe Lyon
  2023-01-06 21:35     ` Victor Do Nascimento
  0 siblings, 1 reply; 15+ messages in thread
From: Christophe Lyon @ 2023-01-06 11:09 UTC (permalink / raw)
  To: Victor L. Do Nascimento, newlib; +Cc: Richard Earnshaw



On 12/21/22 12:21, Victor L. Do Nascimento wrote:
> Add function prologue/epilogue to conditionally add BTI landing pads
> and/or PAC code generation & authentication instructions depending on
> compilation flags.
> 
> This patch enables PACBTI for all relevant variants of strcmp:
>       * Newlib for armv8.1-m.main+pacbti
>       * Newlib for armv8.1-m.main+pacbti+mve
>       * Newlib-nano
> ---
>   newlib/libc/machine/arm/strcmp-arm-tiny.S |  8 +++-
>   newlib/libc/machine/arm/strcmp-armv7.S    | 57 ++++++++++++++---------
>   newlib/libc/machine/arm/strcmp-armv7m.S   | 26 +++++++----
>   3 files changed, 60 insertions(+), 31 deletions(-)
> 
> diff --git a/newlib/libc/machine/arm/strcmp-arm-tiny.S b/newlib/libc/machine/arm/strcmp-arm-tiny.S
> index 607a41daf..0bd2a2e6e 100644
> --- a/newlib/libc/machine/arm/strcmp-arm-tiny.S
> +++ b/newlib/libc/machine/arm/strcmp-arm-tiny.S
> @@ -29,10 +29,14 @@
>   /* Tiny version of strcmp in ARM state.  Used only when optimizing
>      for size.  Also supports Thumb-2.  */
>   
> +#include "arm_asm.h"
> +
>   	.syntax unified
>   def_fn strcmp
> +	.fnstart
>   	.cfi_sections .debug_frame
>   	.cfi_startproc
> +	prologue
Why no push_ip=HAVE_PAC_LEAF here?
Is that because this is the tiny version and we don't want an extra
push of ip even if PACBTI is enabled?

>   1:
>   	ldrb	r2, [r0], #1
>   	ldrb	r3, [r1], #1
> @@ -42,6 +46,8 @@ def_fn strcmp
>   	beq	1b
>   2:
>   	subs	r0, r2, r3
> -	bx	lr
> +	epilogue
>   	.cfi_endproc
> +	.cantunwind
> +	.fnend
>   	.size	strcmp, . - strcmp
> diff --git a/newlib/libc/machine/arm/strcmp-armv7.S b/newlib/libc/machine/arm/strcmp-armv7.S
> index 2f93bfb73..7cafca151 100644
> --- a/newlib/libc/machine/arm/strcmp-armv7.S
> +++ b/newlib/libc/machine/arm/strcmp-armv7.S
> @@ -45,6 +45,8 @@
>   	.thumb
>   	.syntax unified
>   
> +#include "arm_asm.h"
> +
>   /* Parameters and result.  */
>   #define src1		r0
>   #define src2		r1
> @@ -91,8 +93,9 @@
>   	ldrd	r4, r5, [sp], #16
>   	.cfi_restore 4
>   	.cfi_restore 5
> +	.cfi_adjust_cfa_offset -16
>   	sub	result, result, r1, lsr #24
> -	bx	lr
> +	epilogue push_ip=HAVE_PAC_LEAF
>   #else
>   	/* To use the big-endian trick we'd have to reverse all three words.
>   	   that's slower than this approach.  */
> @@ -112,22 +115,21 @@
>   	ldrd	r4, r5, [sp], #16
>   	.cfi_restore 4
>   	.cfi_restore 5
> +	.cfi_adjust_cfa_offset -16
>   	sub	result, result, r1
>   
> -	bx	lr
> +	epilogue push_ip=HAVE_PAC_LEAF
>   #endif
>   	.endm
>   
> +
>   	.text
>   	.p2align	5
> -.Lstrcmp_start_addr:
> -#ifndef STRCMP_NO_PRECHECK
> -.Lfastpath_exit:
> -	sub	r0, r2, r3
> -	bx	lr
> -	nop
> -#endif
>   def_fn	strcmp
> +	.fnstart
> +	.cfi_sections .debug_frame
> +	.cfi_startproc
> +	prologue push_ip=HAVE_PAC_LEAF
>   #ifndef STRCMP_NO_PRECHECK
>   	ldrb	r2, [src1]
>   	ldrb	r3, [src2]
> @@ -136,16 +138,14 @@ def_fn	strcmp
>   	cmpcs	r2, r3
>   	bne	.Lfastpath_exit
>   #endif
> -	.cfi_sections .debug_frame
> -	.cfi_startproc
>   	strd	r4, r5, [sp, #-16]!
> -	.cfi_def_cfa_offset 16
> -	.cfi_offset 4, -16
> -	.cfi_offset 5, -12
> +	.cfi_adjust_cfa_offset 16
> +	.cfi_rel_offset 4, 0
> +	.cfi_rel_offset 5, 4
>   	orr	tmp1, src1, src2
>   	strd	r6, r7, [sp, #8]
> -	.cfi_offset 6, -8
> -	.cfi_offset 7, -4
> +	.cfi_rel_offset 6, 8
> +	.cfi_rel_offset 7, 12
>   	mvn	const_m1, #0
>   	lsl	r2, tmp1, #29
>   	cbz	r2, .Lloop_aligned8
> @@ -270,7 +270,6 @@ def_fn	strcmp
>   	ldr	data1, [src1], #4
>   	beq	.Laligned_m2
>   	bcs	.Laligned_m1
> -
>   #ifdef STRCMP_NO_PRECHECK
>   	ldrb	data2, [src2, #1]
>   	uxtb	tmp1, data1, ror #BYTE1_OFFSET
> @@ -314,10 +313,19 @@ def_fn	strcmp
>   	mov	result, tmp1
>   	ldr	r4, [sp], #16
>   	.cfi_restore 4
> -	bx	lr
> +	.cfi_adjust_cfa_offset -16
> +	epilogue push_ip=HAVE_PAC_LEAF
>   
>   #ifndef STRCMP_NO_PRECHECK
> +.Lfastpath_exit:
> +	.cfi_restore_state
> +	.cfi_remember_state
> +	sub	r0, r2, r3
> +	epilogue push_ip=HAVE_PAC_LEAF
> +
>   .Laligned_m1:
> +	.cfi_restore_state
> +	.cfi_remember_state
>   	add	src2, src2, #4
>   #endif
>   .Lsrc1_aligned:
> @@ -364,8 +372,9 @@ def_fn	strcmp
>   	/* R6/7 Not used in this sequence.  */
>   	.cfi_restore 6
>   	.cfi_restore 7
> +	.cfi_adjust_cfa_offset -16
>   	neg	result, result
> -	bx	lr
> +	epilogue push_ip=HAVE_PAC_LEAF
>   
>   6:
>   	.cfi_restore_state
> @@ -441,7 +450,8 @@ def_fn	strcmp
>   	/* R6/7 not used in this sequence.  */
>   	.cfi_restore 6
>   	.cfi_restore 7
> -	bx	lr
> +	.cfi_adjust_cfa_offset -16
> +	epilogue push_ip=HAVE_PAC_LEAF
>   
>   .Lstrcmp_tail:
>   	.cfi_restore_state
> @@ -463,7 +473,10 @@ def_fn	strcmp
>   	/* R6/7 not used in this sequence.  */
>   	.cfi_restore 6
>   	.cfi_restore 7
> +	.cfi_adjust_cfa_offset -16
>   	sub	result, result, data2, lsr #24
> -	bx	lr
> +	epilogue push_ip=HAVE_PAC_LEAF
>   	.cfi_endproc
> -	.size strcmp, . - .Lstrcmp_start_addr
> +	.cantunwind
> +	.fnend
> +	.size strcmp, . - strcmp
> diff --git a/newlib/libc/machine/arm/strcmp-armv7m.S b/newlib/libc/machine/arm/strcmp-armv7m.S
> index cdb4912df..825b6e77f 100644
> --- a/newlib/libc/machine/arm/strcmp-armv7m.S
> +++ b/newlib/libc/machine/arm/strcmp-armv7m.S
> @@ -29,6 +29,8 @@
>   /* Very similar to the generic code, but uses Thumb2 as implemented
>      in ARMv7-M.  */
>   
> +#include "arm_asm.h"
> +
>   /* Parameters and result.  */
>   #define src1		r0
>   #define src2		r1
> @@ -44,8 +46,10 @@
>   	.thumb
>   	.syntax unified
>   def_fn strcmp
> +	.fnstart
>   	.cfi_sections .debug_frame
>   	.cfi_startproc
> +	prologue push_ip=HAVE_PAC_LEAF
>   	eor	tmp1, src1, src2
>   	tst	tmp1, #3
>   	/* Strings not at same byte offset from a word boundary.  */
> @@ -82,6 +86,7 @@ def_fn strcmp
>   	ldreq	data2, [src2], #4
>   	beq	4b
>   2:
> +	.cfi_remember_state
>   	/* There's a zero or a different byte in the word */
>   	S2HI	result, data1, #24
>   	S2LO	data1, data1, #8
> @@ -106,7 +111,7 @@ def_fn strcmp
>   	lsrs	result, result, #24
>   	subs	result, result, data2
>   #endif
> -	bx	lr
> +	epilogue push_ip=HAVE_PAC_LEAF
>   
>   
>   #if 0
> @@ -205,8 +210,10 @@ def_fn strcmp
>   
>   	/* First of all, compare bytes until src1(sp1) is word-aligned. */
>   .Lstrcmp_unaligned:
> +	.cfi_restore_state
>   	tst	src1, #3
>   	beq	2f
> +	.cfi_remember_state
>   	ldrb	data1, [src1], #1
>   	ldrb	data2, [src2], #1
>   	cmp	data1, #1
> @@ -214,12 +221,13 @@ def_fn strcmp
>   	cmpcs	data1, data2
>   	beq	.Lstrcmp_unaligned
>   	sub	result, data1, data2
> -	bx	lr
> +	epilogue push_ip=HAVE_PAC_LEAF
>   
>   2:
> +	.cfi_restore_state
>   	stmfd	sp!, {r5}
> -	.cfi_def_cfa_offset 4
> -	.cfi_offset 5, -4
> +	.cfi_adjust_cfa_offset 4
> +	.cfi_rel_offset 5, 0
>   
>   	ldr	data1, [src1], #4
>   	and	tmp2, src2, #3
> @@ -355,8 +363,8 @@ def_fn strcmp
>   	.cfi_remember_state
>   	ldmfd	sp!, {r5}
>   	.cfi_restore 5
> -	.cfi_def_cfa_offset 0
> -	bx	lr
> +	.cfi_adjust_cfa_offset -4
> +	epilogue push_ip=HAVE_PAC_LEAF
>   
>   .Lstrcmp_tail:
>   	.cfi_restore_state
> @@ -372,7 +380,9 @@ def_fn strcmp
>   	sub	result, r2, result
>   	ldmfd	sp!, {r5}
>   	.cfi_restore 5
> -	.cfi_def_cfa_offset 0
> -	bx	lr
> +	.cfi_adjust_cfa_offset -4
> +	epilogue push_ip=HAVE_PAC_LEAF
>   	.cfi_endproc
> +	.cantunwind
> +	.fnend
>   	.size strcmp, . - strcmp


* Re: [PATCH v5 1/8] newlib: libc: define M-profile PACBTI-enablement macros
  2023-01-06 10:42   ` Christophe Lyon
@ 2023-01-06 20:51     ` Victor Do Nascimento
  2023-01-09  9:33     ` Christophe Lyon
  1 sibling, 0 replies; 15+ messages in thread
From: Victor Do Nascimento @ 2023-01-06 20:51 UTC (permalink / raw)
  To: Christophe Lyon, newlib; +Cc: Richard Earnshaw

On 1/6/23 10:42, Christophe Lyon wrote:
> Hi Victor,
> 
> Thanks for the patch series, a few comments/questions below.
> 
> Christophe
> 
> 
> On 12/21/22 12:19, Victor L. Do Nascimento wrote:
>> Augment the arm_asm.h header file to simplify function prologues and
>> epilogues whilst adding support for PACBTI enablement via macros for
>> hand-written assembly functions.  For PACBTI, both prologues/epilogues
>> as well as cfi-related directives are automatically amended
>> accordingly, depending on the compile-time mbranch-protection argument
>> values.
>>
>> It defines the following preprocessor macros:
>>     * HAVE_PAC_LEAF: Indicates whether pac-signing has been requested for
>>     leaf functions.
>>     * PAC_LEAF_PUSH_IP: Whether leaf functions should push the pac code
>>     to the stack irrespective of whether the ip register is clobbered in
>>     the function or not.
>>     * STACK_ALIGN_ENFORCE: Whether a dummy register should be added to
>>     the push list as necessary in the prologue to ensure stack
>>     alignment preservation at the start of assembly function.  The
>>     epilogue behavior is likewise affected by this flag, ensuring any
>>     pushed dummy registers also get popped on function return.
> IIUC, these new macros are meant for general usage outside of newlib, do 
> they need proper documentation? Or maybe an entry in the "News" section? 
> I don't know. Otherwise, I think they should not appear in the user 
> naming space.

My initial rationale for not prefixing the names with __ was that I 
suspect this header is private to newlib (it won't be exported to 
users), so the names cannot clash with anything in user code.

If you're interested, this was discussed in the second posted iteration 
of this patch.

I think users only concern themselves with headers under 
"newlib/libc/include". I may well be wrong though, so please feel free 
to correct me here!

That's not to say these macros wouldn't have use outside of Newlib, but 
their initial purpose was merely to standardize how the newlib assembly 
routines would respond to the presence of PACBTI-related architectural 
features.
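
For illustration only (this is not part of the patch), here is a small C 
sketch of the bit test behind HAVE_PAC_LEAF. The helper have_pac_leaf() is 
a hypothetical name; it takes the value that __ARM_FEATURE_PAC_DEFAULT 
would have as an ordinary argument, so the expression can be tried on any 
host. The sample values mentioned afterwards are assumptions about what a 
compiler might define, not output taken from a real build.

```c
/* Bit position tested by the patch (LEAF_PROTECT_BIT in arm_asm.h).  */
#define LEAF_PROTECT_BIT 2

/* Hypothetical helper: evaluates the same expression as HAVE_PAC_LEAF,
   but on an argument instead of the predefined
   __ARM_FEATURE_PAC_DEFAULT macro.  Returns 1 when the leaf-protection
   bit is set and 0 otherwise.  */
static int have_pac_leaf(unsigned int pac_default)
{
    return ((pac_default & (1u << LEAF_PROTECT_BIT)) && 1);
}
```

With this model, a value with bit 2 clear (for example 0x1) gives 0, so 
PAC_LEAF_PUSH_IP is forced to 0, whereas a value with bit 2 set (for 
example 0x5) gives 1, so PAC_LEAF_PUSH_IP defaults to 1 unless it was 
already defined.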

>> It also defines the following assembler macros:
>>     * prologue: In addition to pushing any callee-saved registers onto
>>     the stack, it generates any requested pacbti instructions.
>>     Pushed registers are specified via the optional `first', `last',
>>     `push_ip' and `push_lr' macro argument parameters.
> Maybe you should quote 'first' and 'last' differently from 'push_ip' and 
> 'push_lr', since the example below shows that 'first' and 'last' are in 
> fact register numbers (IIUC)

IIUC, your misgivings over my use of the quoting for `first' and `last' 
stem from the fact that in my examples they are not given as named 
parameters.

If so, it's worth noting that assembler macro parameters may be passed 
either as named or as positional parameters.

Therefore, when I give the example of `prologue 1 4', I could just as 
easily have written `prologue first=1 last=4' to the same effect. I 
avoided doing so purely for the sake of brevity.

I recognize this does make things a little confusing though...

>>     when a single register number is provided, it pushes that
> Typo: "When" (with a capital)
> 
>>     register.  When two register numbers are provided, they specify a
>>     rage to save.  If push_ip and/or push_lr are non-zero, the
> Typo: "range"
> 
>>     respective registers are also saved.  Stack alignment is requested
>>     via the `align` argument, which defaults to the value of
>>     STACK_ALIGN_ENFORCE, unless manually overridden.
>>
>>     For example:
>>
>>         prologue push_ip=1 -> push {ip}
>>         prologue push_ip=1, align8=1 -> push {r2, ip}
>>         prologue push_ip=1, push_lr=1 -> push {ip, lr}
>>         prologue 1 -> push {r1}
>>         prologue 1, align8=1 -> push {r0, r1}
>>         prologue 1 push_ip=1 -> push {r1, ip}
>>         prologue 1 4 -> push {r1-r4}
>>         prologue 1 4 push_ip=1 -> push {r1-r4, ip}
> can you include an example with pacbti?

Will do!

Thanks,

Victor

> 
>>     * epilogue: pops registers off the stack and emits pac key signing
>>     instruction, if requested. The `first', `last', `push_ip',
>>     `push_lr' and `align' function as per the prologue macro,
>>     generating pop instead of push instructions.
>>
>>     Stack alignment is enforced via the following helper macro
>>     call-chain:
>>
>>     {prologue|epilogue} ->_align8 -> _preprocess_reglist ->
>>       _preprocess_reglist1 -> {_prologue|_epilogue}
>>
>>     Finally, the necessary cfi directives for adding debug information
>>     to prologue and epilogue are generated via the following macros:
>>
>>     * cfisavelist - prologue macro helper function, generating
>>     necessary .cfi_offset directives associated with push instruction.
>>     Therefore, the net effect of calling `prologue 1 2 push_ip=1' is
>>     to generate the following:
>>
>>         push {r1-r2, ip}
>>         .cfi_adjust_cfa_offset 12
>>         .cfi_offset 143, -4
>>         .cfi_offset 2, -8
>>         .cfi_offset 1, -12
>>
>>     * cfirestorelist - epilogue macro helper function, emitting
>>     .cfi_restore instructions prior to resetting the cfa offset.  As
>>     such, calling `epilogue 1 2 push_ip=1' will produce:
>>
>>         pop {r1-r2, ip}
>>         .cfi_register 143, 12
>>         .cfi_restore 2
>>         .cfi_restore 1
>>         .cfi_def_cfa_offset 0
>> ---
>>   newlib/libc/machine/arm/arm_asm.h | 441 ++++++++++++++++++++++++++++++
>>   1 file changed, 441 insertions(+)
>>
>> diff --git a/newlib/libc/machine/arm/arm_asm.h b/newlib/libc/machine/arm/arm_asm.h
>> index 2708057de..94fa77b4d 100644
>> --- a/newlib/libc/machine/arm/arm_asm.h
>> +++ b/newlib/libc/machine/arm/arm_asm.h
>> @@ -60,4 +60,445 @@
>>   # define _ISA_THUMB_1
>>   #endif
>> +/* Check whether leaf function PAC signing has been requested in the
>> +   -mbranch-protect compile-time option.  */
>> +#define LEAF_PROTECT_BIT 2
> Shouldn't this start with '__' or be #undefed at the end of this file to 
> avoid polluting user naming space? (I noticed it's not used outside this 
> file in this patch series)
> 
>> +
>> +#ifdef __ARM_FEATURE_PAC_DEFAULT
>> +# define HAVE_PAC_LEAF \
>> +    ((__ARM_FEATURE_PAC_DEFAULT & (1 << LEAF_PROTECT_BIT)) && 1)
>> +#else
>> +# define HAVE_PAC_LEAF 0
>> +#endif
>> +
>> +/* Provide default parameters for PAC-code handling in leaf-functions.  */
>> +#if HAVE_PAC_LEAF
>> +# ifndef PAC_LEAF_PUSH_IP
>> +#  define PAC_LEAF_PUSH_IP 1
>> +# endif
>> +#else /* !HAVE_PAC_LEAF */
>> +# undef PAC_LEAF_PUSH_IP
>> +# define PAC_LEAF_PUSH_IP 0
>> +#endif /* HAVE_PAC_LEAF */
>> +
>> +#define STACK_ALIGN_ENFORCE 0
>> +
>> +#ifdef __ASSEMBLER__
>> +
>> +/******************************************************************************
>> +* Implementation of the prologue and epilogue assembler macros and their
>> +* associated helper functions.
>> +*
>> +* These functions add support for the following:
>> +*
>> +* - M-profile branch target identification (BTI) landing-pads when compiled
>> +*   with `-mbranch-protection=bti'.
>> +* - PAC-signing and verification instructions, depending on hardware support
>> +*   and whether the PAC-signing of leaf functions has been requested via the
>> +*   `-mbranch-protection=pac-ret+leaf' compiler argument.
>> +* - 8-byte stack alignment preservation at function entry, defaulting to the
>> +*   value of STACK_ALIGN_ENFORCE.
>> +*
>> +* Notes:
>> +* - Prologue stack alignment is implemented by detecting a push with an odd
>> +*   number of registers and prepending a dummy register to the list.
>> +* - If alignment is attempted on a list containing r0, compilation will result
>> +*   in an error.
>> +* - If alignment is attempted in a list containing r1, r0 will be prepended to
>> +*   the register list and r0 will be restored prior to function return.  for
>> +*   functions with non-void return types, this will result in the corruption of
>> +*   the result register.
>> +* - Stack alignment is enforced via the following helper macro call-chain:
>> +*
>> +*    {prologue|epilogue} ->_align8 -> _preprocess_reglist ->
>> +*        _preprocess_reglist1 -> {_prologue|_epilogue}
>> +*
>> +* - Debug CFI directives are automatically added to prologues and epilogues,
>> +*   assisted by `cfisavelist' and `cfirestorelist', respectively.
>> +*
>> +* Arguments:
>> +* prologue
>> +* --------
>> +* - first    - If `last' specified, this serves as start of general-purpose
>> +*          register (GPR) range to push onto stack, otherwise represents
>> +*          single GPR to push onto stack.  If omitted, no GPRs pushed
>> +*          onto stack at prologue.
>> +* - last    - If given, specifies inclusive upper-bound of GPR range.
>> +* - push_ip    - Determines whether IP register is to be pushed to stack at
>> +*          prologue.  When pac-signing is requested, this holds the
>> +*          the pac-key.  Either 1 or 0 to push or not push, respectively.
>> +*          Default behavior: Set to value of PAC_LEAF_PUSH_IP macro.
>> +* - push_lr    - Determines whether to push lr to the stack on function entry.
>> +*          Either 1 or 0  to push or not push, respectively.
>> +* - align8    - Whether to enforce alignment. Either 1 or 0, with 1 requesting
>> +*          alignment.
>> +*
>> +* epilogue
>> +* --------
>> +*   The epilogue should be called passing the same arguments as those passed to
>> +*   the prologue to ensure the stack is not corrupted on function return.
>> +*
>> +* Usage examples:
>> +*
>> +*   prologue push_ip=1 -> push {ip}
>> +*   epilogue push_ip=1, align8=1 -> pop {r2, ip}
>> +*   prologue push_ip=1, push_lr=1 -> push {ip, lr}
>> +*   epilogue 1 -> pop {r1}
>> +*   prologue 1, align8=1 -> push {r0, r1}
>> +*   epilogue 1, push_ip=1 -> pop {r1, ip}
>> +*   prologue 1, 4 -> push {r1-r4}
>> +*   epilogue 1, 4 push_ip=1 -> pop {r1-r4, ip}
>> +*
>> +******************************************************************************/
>> +
>> +/* Emit .cfi_restore directives for a consecutive sequence of registers.  */
>> +    .macro cfirestorelist first, last
>> +    .cfi_restore \last
>> +    .if \last-\first
>> +     cfirestorelist \first, \last-1
>> +    .endif
>> +    .endm
>> +
>> +/* Emit .cfi_offset directives for a consecutive sequence of registers.  */
>> +    .macro cfisavelist first, last, index=1
>> +    .cfi_offset \last, -4*(\index)
>> +    .if \last-\first
>> +     cfisavelist \first, \last-1, \index+1
>> +    .endif
>> +    .endm
>> +
>> +.macro _prologue first=-1, last=-1, push_ip=PAC_LEAF_PUSH_IP, push_lr=0
>> +    .if \push_ip & 1 != \push_ip
>> +     .error "push_ip may be either 0 or 1"
>> +    .endif
>> +    .if \push_lr & 1 != \push_lr
>> +     .error "push_lr may be either 0 or 1"
>> +    .endif
>> +    .if \first != -1
>> +     .if \last == -1
>> +      /* Upper-bound not provided: Set upper = lower.  */
>> +      _prologue \first, \first, \push_ip, \push_lr
>> +      .exitm
>> +     .endif
>> +    .endif
>> +#if HAVE_PAC_LEAF
>> +#if __ARM_FEATURE_BTI_DEFAULT
>> +    pacbti    ip, lr, sp
>> +#else
>> +    pac    ip, lr, sp
>> +#endif /* __ARM_FEATURE_BTI_DEFAULT */
>> +    .cfi_register 143, 12
>> +#else
>> +#if __ARM_FEATURE_BTI_DEFAULT
>> +    bti
>> +#endif /* __ARM_FEATURE_BTI_DEFAULT */
>> +#endif /* HAVE_PAC_LEAF */
>> +    .if \first != -1
>> +     .if \last != \first
>> +      .if \last >= 13
>> +    .error "SP cannot be in the save list"
>> +      .endif
> I think you should also check that IP (r12) is not in the range, 
> otherwise I think nothing prevents from doing
> prologue 12, push_ip=1 which will result in emitting push {r12, ip}
> (I suppose gas would complain?)
> .... scratch that, I saw later that this sanity checking is performed in 
> _preprocess_reglist1 :-)
> 
> 
>> +      .if \push_ip
>> +       .if \push_lr
>> +    /* Case 1: push register range, ip and lr registers.  */
>> +    push {r\first-r\last, ip, lr}
>> +    .cfi_adjust_cfa_offset ((\last-\first)+3)*4
>> +    .cfi_offset 14, -4
>> +    .cfi_offset 143, -8
>> +    cfisavelist \first, \last, 3
>> +       .else // !\push_lr
>> +    /* Case 2: push register range and ip register.  */
>> +    push {r\first-r\last, ip}
>> +    .cfi_adjust_cfa_offset ((\last-\first)+2)*4
>> +    .cfi_offset 143, -4
>> +    cfisavelist \first, \last, 2
>> +       .endif
>> +      .else // !\push_ip
>> +       .if \push_lr
>> +    /* Case 3: push register range and lr register.  */
>> +    push {r\first-r\last, lr}
>> +    .cfi_adjust_cfa_offset ((\last-\first)+2)*4
>> +    .cfi_offset 14, -4
>> +    cfisavelist \first, \last, 2
>> +       .else // !\push_lr
>> +    /* Case 4: push register range.  */
>> +    push {r\first-r\last}
>> +    .cfi_adjust_cfa_offset ((\last-\first)+1)*4
>> +    cfisavelist \first, \last, 1
>> +       .endif
>> +      .endif
>> +     .else // \last == \first
>> +      .if \push_ip
>> +       .if \push_lr
>> +    /* Case 5: push single GP register plus ip and lr registers.  */
>> +    push {r\first, ip, lr}
>> +    .cfi_adjust_cfa_offset 12
>> +    .cfi_offset 14, -4
>> +    .cfi_offset 143, -8
>> +        cfisavelist \first, \first, 3
>> +       .else // !\push_lr
>> +    /* Case 6: push single GP register plus ip register.  */
>> +    push {r\first, ip}
>> +    .cfi_adjust_cfa_offset 8
>> +    .cfi_offset 143, -4
>> +        cfisavelist \first, \first, 2
>> +       .endif
>> +      .else // !\push_ip
>> +       .if \push_lr
>> +    /* Case 7: push single GP register plus lr register.  */
>> +    push {r\first, lr}
>> +    .cfi_adjust_cfa_offset 8
>> +    .cfi_offset 14, -4
>> +    cfisavelist \first, \first, 2
>> +       .else // !\push_lr
>> +    /* Case 8: push single GP register.  */
>> +    push {r\first}
>> +    .cfi_adjust_cfa_offset 4
>> +    cfisavelist \first, \first, 1
>> +       .endif
>> +      .endif
>> +     .endif
>> +    .else // \first == -1
>> +     .if \push_ip
>> +      .if \push_lr
>> +    /* Case 9: push ip and lr registers.  */
>> +    push {ip, lr}
>> +    .cfi_adjust_cfa_offset 8
>> +    .cfi_offset 14, -4
>> +    .cfi_offset 143, -8
>> +      .else // !\push_lr
>> +    /* Case 10: push ip register.  */
>> +    push {ip}
>> +    .cfi_adjust_cfa_offset 4
>> +    .cfi_offset 143, -4
>> +      .endif
>> +     .else // !\push_ip
>> +          .if \push_lr
>> +    /* Case 11: push lr register.  */
>> +    push {lr}
>> +    .cfi_adjust_cfa_offset 4
>> +    .cfi_offset 14, -4
>> +          .endif
>> +     .endif
>> +    .endif
>> +.endm
>> +
>> +.macro _epilogue first=-1, last=-1, push_ip=PAC_LEAF_PUSH_IP, push_lr=0
>> +    .if \push_ip & 1 != \push_ip
>> +     .error "push_ip may be either 0 or 1"
>> +    .endif
>> +    .if \push_lr & 1 != \push_lr
>> +     .error "push_lr may be either 0 or 1"
>> +    .endif
>> +    .if \first != -1
>> +     .if \last == -1
>> +      /* Upper-bound not provided: Set upper = lower.  */
>> +      _epilogue \first, \first, \push_ip, \push_lr
>> +      .exitm
>> +     .endif
>> +     .if \last != \first
>> +      .if \last >= 13
>> +    .error "SP cannot be in the save list"
>> +      .endif
>> +      .if \push_ip
>> +       .if \push_lr
>> +    /* Case 1: pop register range, ip and lr registers.  */
>> +    pop {r\first-r\last, ip, lr}
>> +    .cfi_restore 14
>> +    .cfi_register 143, 12
>> +    cfirestorelist \first, \last
>> +       .else // !\push_lr
>> +    /* Case 2: pop register range and ip register.  */
>> +    pop {r\first-r\last, ip}
>> +    .cfi_register 143, 12
>> +    cfirestorelist \first, \last
>> +       .endif
>> +      .else // !\push_ip
>> +       .if \push_lr
>> +    /* Case 3: pop register range and lr register.  */
>> +    pop {r\first-r\last, lr}
>> +    .cfi_restore 14
>> +    cfirestorelist \first, \last
>> +       .else // !\push_lr
>> +    /* Case 4: pop register range.  */
>> +    pop {r\first-r\last}
>> +    cfirestorelist \first, \last
>> +       .endif
>> +      .endif
>> +     .else // \last == \first
>> +      .if \push_ip
>> +       .if \push_lr
>> +    /* Case 5: pop single GP register plus ip and lr registers.  */
>> +    pop {r\first, ip, lr}
>> +    .cfi_restore 14
>> +    .cfi_register 143, 12
>> +    cfirestorelist \first, \first
>> +       .else // !\push_lr
>> +    /* Case 6: pop single GP register plus ip register.  */
>> +    pop {r\first, ip}
>> +    .cfi_register 143, 12
>> +    cfirestorelist \first, \first
>> +       .endif
>> +      .else // !\push_ip
>> +       .if \push_lr
>> +    /* Case 7: pop single GP register plus lr register.  */
>> +    pop {r\first, lr}
>> +    .cfi_restore 14
>> +    cfirestorelist \first, \first
>> +       .else // !\push_lr
>> +    /* Case 8: pop single GP register.  */
>> +    pop {r\first}
>> +    cfirestorelist \first, \first
>> +       .endif
>> +      .endif
>> +     .endif
>> +    .else // \first == -1
>> +     .if \push_ip
>> +      .if \push_lr
>> +    /* Case 9: pop ip and lr registers.  */
>> +    pop {ip, lr}
>> +    .cfi_restore 14
>> +    .cfi_register 143, 12
>> +      .else // !\push_lr
>> +    /* Case 10: pop ip register.  */
>> +    pop {ip}
>> +    .cfi_register 143, 12
>> +      .endif
>> +     .else // !\push_ip
>> +          .if \push_lr
>> +    /* Case 11: pop lr register.  */
>> +    pop {lr}
>> +    .cfi_restore 14
>> +          .endif
>> +     .endif
>> +    .endif
>> +#if HAVE_PAC_LEAF
>> +    aut    ip, lr, sp
>> +#endif /* HAVE_PAC_LEAF */
>> +    bx    lr
>> +.endm
>> +
>> +# clean up expressions in 'last'
>> +.macro _preprocess_reglist1 first:req, last:req, push_ip:req, push_lr:req, reglist_op:req
>> +    .if \last == 0
>> +     \reglist_op \first, 0, \push_ip, \push_lr
>> +    .elseif \last == 1
>> +     \reglist_op \first, 1, \push_ip, \push_lr
>> +    .elseif \last == 2
>> +     \reglist_op \first, 2, \push_ip, \push_lr
>> +    .elseif \last == 3
>> +     \reglist_op \first, 3, \push_ip, \push_lr
>> +    .elseif \last == 4
>> +     \reglist_op \first, 4, \push_ip, \push_lr
>> +    .elseif \last == 5
>> +     \reglist_op \first, 5, \push_ip, \push_lr
>> +    .elseif \last == 6
>> +     \reglist_op \first, 6, \push_ip, \push_lr
>> +    .elseif \last == 7
>> +     \reglist_op \first, 7, \push_ip, \push_lr
>> +    .elseif \last == 8
>> +     \reglist_op \first, 8, \push_ip, \push_lr
>> +    .elseif \last == 9
>> +     \reglist_op \first, 9, \push_ip, \push_lr
>> +    .elseif \last == 10
>> +     \reglist_op \first, 10, \push_ip, \push_lr
>> +    .elseif \last == 11
>> +     \reglist_op \first, 11, \push_ip, \push_lr
>> +    .else
>> +     .error "last (\last) out of range"
>> +    .endif
>> +.endm
>> +
>> +# clean up expressions in 'first'
>> +.macro _preprocess_reglist first:req, last, push_ip=0, push_lr=0, 
>> reglist_op:req
>> +    .ifb \last
>> +     _preprocess_reglist \first \first \push_ip \push_lr
>> +    .else
>> +     .if \first > \last
>> +      .error "last (\last) must be at least as great as first (\first)"
>> +     .endif
>> +     .if \first == 0
>> +      _preprocess_reglist1 0, \last, \push_ip, \push_lr, \reglist_op
>> +     .elseif \first == 1
>> +      _preprocess_reglist1 1, \last, \push_ip, \push_lr, \reglist_op
>> +     .elseif \first == 2
>> +      _preprocess_reglist1 2, \last, \push_ip, \push_lr, \reglist_op
>> +     .elseif \first == 3
>> +      _preprocess_reglist1 3, \last, \push_ip, \push_lr, \reglist_op
>> +     .elseif \first == 4
>> +      _preprocess_reglist1 4, \last, \push_ip, \push_lr, \reglist_op
>> +     .elseif \first == 5
>> +      _preprocess_reglist1 5, \last, \push_ip, \push_lr, \reglist_op
>> +     .elseif \first == 6
>> +      _preprocess_reglist1 6, \last, \push_ip, \push_lr, \reglist_op
>> +     .elseif \first == 7
>> +      _preprocess_reglist1 7, \last, \push_ip, \push_lr, \reglist_op
>> +     .elseif \first == 8
>> +      _preprocess_reglist1 8, \last, \push_ip, \push_lr, \reglist_op
>> +     .elseif \first == 9
>> +      _preprocess_reglist1 9, \last, \push_ip, \push_lr, \reglist_op
>> +     .elseif \first == 10
>> +      _preprocess_reglist1 10, \last, \push_ip, \push_lr, \reglist_op
>> +     .elseif \first == 11
>> +      _preprocess_reglist1 11, \last, \push_ip, \push_lr, \reglist_op
>> +     .else
>> +      .error "first (\first) out of range"
>> +     .endif
>> +    .endif
>> +.endm
>> +
>> +.macro _align8 first, last, push_ip=0, push_lr=0, reglist_op=_prologue
>> +    .ifb \first
>> +     .ifnb \last
>> +      .error "can't have last (\last) without specifying first"
>> +     .else // \last not blank
>> +      .if ((\push_ip + \push_lr) % 2) == 0
>> +       \reglist_op first=-1, last=-1, push_ip=\push_ip, push_lr=\push_lr
>> +       .exitm
>> +      .else // ((\push_ip + \push_lr) % 2) odd
>> +       _align8 2, 2, \push_ip, \push_lr, \reglist_op
>> +       .exitm
>> +      .endif // ((\push_ip + \push_lr) % 2) == 0
>> +     .endif // .ifnb \last
>> +    .endif // .ifb \first
>> +
>> +    .ifb \last
>> +     _align8 \first, \first, \push_ip, \push_lr, \reglist_op
>> +    .else
>> +     .if \push_ip & 1 <> \push_ip
>> +      .error "push_ip may be 0 or 1"
>> +     .endif
>> +     .if \push_lr & 1 <> \push_lr
>> +      .error "push_lr may be 0 or 1"
>> +     .endif
>> +     .ifeq (\last - \first + \push_ip + \push_lr) % 2
>> +      .if \first == 0
>> +       .error "Alignment required and first register is r0"
>> +       .exitm
>> +      .endif
>> +      _preprocess_reglist \first-1, \last, \push_ip, \push_lr, 
>> \reglist_op
>> +     .else
>> +      _preprocess_reglist \first \last, \push_ip, \push_lr, \reglist_op
>> +     .endif
>> +    .endif
>> +.endm
>> +
>> +.macro prologue first, last, push_ip=PAC_LEAF_PUSH_IP, push_lr=0, 
>> align8=STACK_ALIGN_ENFORCE
>> +    .if \align8
>> +     _align8 \first, \last, \push_ip, \push_lr, _prologue
>> +    .else
>> +     _prologue first=\first, last=\last, push_ip=\push_ip, 
>> push_lr=\push_lr
>> +    .endif
>> +.endm
>> +
>> +.macro epilogue first, last, push_ip=PAC_LEAF_PUSH_IP, push_lr=0, 
>> align8=STACK_ALIGN_ENFORCE
>> +    .if \align8
>> +     _align8 \first, \last, \push_ip, \push_lr, reglist_op=_epilogue
>> +    .else
>> +     _epilogue first=\first, last=\last, push_ip=\push_ip, 
>> push_lr=\push_lr
>> +    .endif
>> +.endm
>> +
>> +#endif /* __ASSEMBLER__ */
>> +
>>   #endif /* ARM_ASM__H */
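
For anyone following the padding logic in `_align8' above, here is a rough C model of the decision it makes. This is only a sketch: `align8_model' and `struct reglist' are hypothetical names that do not exist in the patch, a blank macro argument is represented as -1, and the r0-r11 range checks done by the `_preprocess_reglist*' helpers are left out.

```c
#include <assert.h>

/* Register range r<first>-r<last>; first == -1 means no GPRs are pushed.  */
struct reglist {
    int first;
    int last;
};

/* Hypothetical model of _align8.  A blank macro argument is passed as -1.
   Returns 0 and fills *out on success, -1 where the macro would hit .error.
   Assumes first <= last when both are given.  */
static int align8_model(int first, int last, int push_ip, int push_lr,
                        struct reglist *out)
{
    assert((push_ip == 0 || push_ip == 1) && (push_lr == 0 || push_lr == 1));

    if (first < 0) {
        if (last >= 0)
            return -1;              /* last given without first */
        if ((push_ip + push_lr) % 2 == 0) {
            out->first = out->last = -1;   /* already 8-byte aligned */
            return 0;
        }
        first = last = 2;           /* odd count: r2 is used as padding */
    } else if (last < 0) {
        last = first;               /* single register */
    }
    assert(first <= last);

    /* The total count is (last - first + 1) + push_ip + push_lr, so it is
       odd exactly when the expression below is even.  */
    if ((last - first + push_ip + push_lr) % 2 == 0) {
        if (first == 0)
            return -1;              /* nothing below r0 to prepend */
        first -= 1;                 /* prepend a dummy register */
    }
    out->first = first;
    out->last = last;
    return 0;
}
```

Under these assumptions, `align8_model(1, -1, 0, 0, &r)' yields the range r0-r1, which corresponds to the documented `prologue 1, align8=1 -> push {r0, r1}' case.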


* Re: [PATCH v5 2/8] newlib: libc: strcmp M-profile PACBTI-enablement
  2023-01-06 11:09   ` Christophe Lyon
@ 2023-01-06 21:35     ` Victor Do Nascimento
  0 siblings, 0 replies; 15+ messages in thread
From: Victor Do Nascimento @ 2023-01-06 21:35 UTC (permalink / raw)
  To: Christophe Lyon, newlib; +Cc: Richard Earnshaw



On 1/6/23 11:09, Christophe Lyon wrote:
> 
> 
> On 12/21/22 12:21, Victor L. Do Nascimento wrote:
>> Add function prologue/epilogue to conditionally add BTI landing pads
>> and/or PAC code generation & authentication instructions depending on
>> compilation flags.
>>
>> This patch enables PACBTI for all relevant variants of strcmp:
>>       * Newlib for armv8.1-m.main+pacbti
>>       * Newlib for armv8.1-m.main+pacbti+mve
>>       * Newlib-nano
>> ---
>>   newlib/libc/machine/arm/strcmp-arm-tiny.S |  8 +++-
>>   newlib/libc/machine/arm/strcmp-armv7.S    | 57 ++++++++++++++---------
>>   newlib/libc/machine/arm/strcmp-armv7m.S   | 26 +++++++----
>>   3 files changed, 60 insertions(+), 31 deletions(-)
>>
>> diff --git a/newlib/libc/machine/arm/strcmp-arm-tiny.S 
>> b/newlib/libc/machine/arm/strcmp-arm-tiny.S
>> index 607a41daf..0bd2a2e6e 100644
>> --- a/newlib/libc/machine/arm/strcmp-arm-tiny.S
>> +++ b/newlib/libc/machine/arm/strcmp-arm-tiny.S
>> @@ -29,10 +29,14 @@
>>   /* Tiny version of strcmp in ARM state.  Used only when optimizing
>>      for size.  Also supports Thumb-2.  */
>> +#include "arm_asm.h"
>> +
>>       .syntax unified
>>   def_fn strcmp
>> +    .fnstart
>>       .cfi_sections .debug_frame
>>       .cfi_startproc
>> +    prologue
> Why no push_ip=HAVE_PAC_LEAF?
> Is that because this is a tiny version and we don't want to use an extra 
> push ip even if pacbti is enabled?

push_ip=HAVE_PAC_LEAF is reserved for a particular scenario.

If we're PAC-signing leaf functions (that is, HAVE_PAC_LEAF is set) but 
the intra-procedure-call scratch register r12 (ip) is not used in the 
function body, there's no strict need to push the PAC code onto the 
stack, so push_ip defaults to the overridable value of PAC_LEAF_PUSH_IP.

If, on the other hand, r12 is used as part of the function body, our 
PAC-code will be corrupted. In such cases, pushing ip should be strictly 
dictated by the fact that we have requested leaf function PAC-signing, 
so that it can later be restored.

Therefore, if r12 is corrupted and HAVE_PAC_LEAF is set we should push 
ip to the stack irrespective of any overrides, and that's where 
push_ip=HAVE_PAC_LEAF is important.

As strcmp-arm-tiny.S doesn't use r12, we have flexibility over whether 
or not to push ip onto the stack. That's why we have a bare `prologue', 
leaving push_ip at its default, and not `push_ip=HAVE_PAC_LEAF'. 
strcmp-armv7.S and strcmp-armv7m.S represent the opposite scenario. :-)
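
To make the rule above concrete, here is a small C sketch of how the effective push_ip value falls out of the three inputs. It is only a model: `have_pac_leaf' and `effective_push_ip' are hypothetical helper names, not part of arm_asm.h, and the `override' parameter stands in for whatever value PAC_LEAF_PUSH_IP was given at build time.

```c
#include <assert.h>

/* Bit 2 of __ARM_FEATURE_PAC_DEFAULT requests leaf-function signing
   (LEAF_PROTECT_BIT in arm_asm.h).  */
#define LEAF_PROTECT_BIT 2

/* Mirrors HAVE_PAC_LEAF for a given __ARM_FEATURE_PAC_DEFAULT value.  */
static int have_pac_leaf(unsigned pac_default)
{
    return (pac_default & (1u << LEAF_PROTECT_BIT)) != 0;
}

/* Hypothetical helper: the push_ip value the prologue ends up with.
   clobbers_ip - non-zero if the function body uses r12.
   override    - the PAC_LEAF_PUSH_IP value requested at build time.  */
static int effective_push_ip(unsigned pac_default, int clobbers_ip,
                             int override)
{
    assert(override == 0 || override == 1);
    if (!have_pac_leaf(pac_default))
        return 0;           /* PAC_LEAF_PUSH_IP is forced to 0 */
    if (clobbers_ip)
        return 1;           /* push_ip=HAVE_PAC_LEAF, override ignored */
    return override;        /* push_ip left at PAC_LEAF_PUSH_IP */
}
```

With leaf signing on (bit 2 set, for example a value of 5), a function that clobbers r12 always gets push_ip=1, while one that leaves r12 alone follows the override.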

Regards,
Victor

>>   1:
>>       ldrb    r2, [r0], #1
>>       ldrb    r3, [r1], #1
>> @@ -42,6 +46,8 @@ def_fn strcmp
>>       beq    1b
>>   2:
>>       subs    r0, r2, r3
>> -    bx    lr
>> +    epilogue
>>       .cfi_endproc
>> +    .cantunwind
>> +    .fnend
>>       .size    strcmp, . - strcmp
>> diff --git a/newlib/libc/machine/arm/strcmp-armv7.S 
>> b/newlib/libc/machine/arm/strcmp-armv7.S
>> index 2f93bfb73..7cafca151 100644
>> --- a/newlib/libc/machine/arm/strcmp-armv7.S
>> +++ b/newlib/libc/machine/arm/strcmp-armv7.S
>> @@ -45,6 +45,8 @@
>>       .thumb
>>       .syntax unified
>> +#include "arm_asm.h"
>> +
>>   /* Parameters and result.  */
>>   #define src1        r0
>>   #define src2        r1
>> @@ -91,8 +93,9 @@
>>       ldrd    r4, r5, [sp], #16
>>       .cfi_restore 4
>>       .cfi_restore 5
>> +    .cfi_adjust_cfa_offset -16
>>       sub    result, result, r1, lsr #24
>> -    bx    lr
>> +    epilogue push_ip=HAVE_PAC_LEAF
>>   #else
>>       /* To use the big-endian trick we'd have to reverse all three 
>> words.
>>          that's slower than this approach.  */
>> @@ -112,22 +115,21 @@
>>       ldrd    r4, r5, [sp], #16
>>       .cfi_restore 4
>>       .cfi_restore 5
>> +    .cfi_adjust_cfa_offset -16
>>       sub    result, result, r1
>> -    bx    lr
>> +    epilogue push_ip=HAVE_PAC_LEAF
>>   #endif
>>       .endm
>> +
>>       .text
>>       .p2align    5
>> -.Lstrcmp_start_addr:
>> -#ifndef STRCMP_NO_PRECHECK
>> -.Lfastpath_exit:
>> -    sub    r0, r2, r3
>> -    bx    lr
>> -    nop
>> -#endif
>>   def_fn    strcmp
>> +    .fnstart
>> +    .cfi_sections .debug_frame
>> +    .cfi_startproc
>> +    prologue push_ip=HAVE_PAC_LEAF
>>   #ifndef STRCMP_NO_PRECHECK
>>       ldrb    r2, [src1]
>>       ldrb    r3, [src2]
>> @@ -136,16 +138,14 @@ def_fn    strcmp
>>       cmpcs    r2, r3
>>       bne    .Lfastpath_exit
>>   #endif
>> -    .cfi_sections .debug_frame
>> -    .cfi_startproc
>>       strd    r4, r5, [sp, #-16]!
>> -    .cfi_def_cfa_offset 16
>> -    .cfi_offset 4, -16
>> -    .cfi_offset 5, -12
>> +    .cfi_adjust_cfa_offset 16
>> +    .cfi_rel_offset 4, 0
>> +    .cfi_rel_offset 5, 4
>>       orr    tmp1, src1, src2
>>       strd    r6, r7, [sp, #8]
>> -    .cfi_offset 6, -8
>> -    .cfi_offset 7, -4
>> +    .cfi_rel_offset 6, 8
>> +    .cfi_rel_offset 7, 12
>>       mvn    const_m1, #0
>>       lsl    r2, tmp1, #29
>>       cbz    r2, .Lloop_aligned8
>> @@ -270,7 +270,6 @@ def_fn    strcmp
>>       ldr    data1, [src1], #4
>>       beq    .Laligned_m2
>>       bcs    .Laligned_m1
>> -
>>   #ifdef STRCMP_NO_PRECHECK
>>       ldrb    data2, [src2, #1]
>>       uxtb    tmp1, data1, ror #BYTE1_OFFSET
>> @@ -314,10 +313,19 @@ def_fn    strcmp
>>       mov    result, tmp1
>>       ldr    r4, [sp], #16
>>       .cfi_restore 4
>> -    bx    lr
>> +    .cfi_adjust_cfa_offset -16
>> +    epilogue push_ip=HAVE_PAC_LEAF
>>   #ifndef STRCMP_NO_PRECHECK
>> +.Lfastpath_exit:
>> +    .cfi_restore_state
>> +    .cfi_remember_state
>> +    sub    r0, r2, r3
>> +    epilogue push_ip=HAVE_PAC_LEAF
>> +
>>   .Laligned_m1:
>> +    .cfi_restore_state
>> +    .cfi_remember_state
>>       add    src2, src2, #4
>>   #endif
>>   .Lsrc1_aligned:
>> @@ -364,8 +372,9 @@ def_fn    strcmp
>>       /* R6/7 Not used in this sequence.  */
>>       .cfi_restore 6
>>       .cfi_restore 7
>> +    .cfi_adjust_cfa_offset -16
>>       neg    result, result
>> -    bx    lr
>> +    epilogue push_ip=HAVE_PAC_LEAF
>>   6:
>>       .cfi_restore_state
>> @@ -441,7 +450,8 @@ def_fn    strcmp
>>       /* R6/7 not used in this sequence.  */
>>       .cfi_restore 6
>>       .cfi_restore 7
>> -    bx    lr
>> +    .cfi_adjust_cfa_offset -16
>> +    epilogue push_ip=HAVE_PAC_LEAF
>>   .Lstrcmp_tail:
>>       .cfi_restore_state
>> @@ -463,7 +473,10 @@ def_fn    strcmp
>>       /* R6/7 not used in this sequence.  */
>>       .cfi_restore 6
>>       .cfi_restore 7
>> +    .cfi_adjust_cfa_offset -16
>>       sub    result, result, data2, lsr #24
>> -    bx    lr
>> +    epilogue push_ip=HAVE_PAC_LEAF
>>       .cfi_endproc
>> -    .size strcmp, . - .Lstrcmp_start_addr
>> +    .cantunwind
>> +    .fnend
>> +    .size strcmp, . - strcmp
>> diff --git a/newlib/libc/machine/arm/strcmp-armv7m.S 
>> b/newlib/libc/machine/arm/strcmp-armv7m.S
>> index cdb4912df..825b6e77f 100644
>> --- a/newlib/libc/machine/arm/strcmp-armv7m.S
>> +++ b/newlib/libc/machine/arm/strcmp-armv7m.S
>> @@ -29,6 +29,8 @@
>>   /* Very similar to the generic code, but uses Thumb2 as implemented
>>      in ARMv7-M.  */
>> +#include "arm_asm.h"
>> +
>>   /* Parameters and result.  */
>>   #define src1        r0
>>   #define src2        r1
>> @@ -44,8 +46,10 @@
>>       .thumb
>>       .syntax unified
>>   def_fn strcmp
>> +    .fnstart
>>       .cfi_sections .debug_frame
>>       .cfi_startproc
>> +    prologue push_ip=HAVE_PAC_LEAF
>>       eor    tmp1, src1, src2
>>       tst    tmp1, #3
>>       /* Strings not at same byte offset from a word boundary.  */
>> @@ -82,6 +86,7 @@ def_fn strcmp
>>       ldreq    data2, [src2], #4
>>       beq    4b
>>   2:
>> +    .cfi_remember_state
>>       /* There's a zero or a different byte in the word */
>>       S2HI    result, data1, #24
>>       S2LO    data1, data1, #8
>> @@ -106,7 +111,7 @@ def_fn strcmp
>>       lsrs    result, result, #24
>>       subs    result, result, data2
>>   #endif
>> -    bx    lr
>> +    epilogue push_ip=HAVE_PAC_LEAF
>>   #if 0
>> @@ -205,8 +210,10 @@ def_fn strcmp
>>       /* First of all, compare bytes until src1(sp1) is word-aligned. */
>>   .Lstrcmp_unaligned:
>> +    .cfi_restore_state
>>       tst    src1, #3
>>       beq    2f
>> +    .cfi_remember_state
>>       ldrb    data1, [src1], #1
>>       ldrb    data2, [src2], #1
>>       cmp    data1, #1
>> @@ -214,12 +221,13 @@ def_fn strcmp
>>       cmpcs    data1, data2
>>       beq    .Lstrcmp_unaligned
>>       sub    result, data1, data2
>> -    bx    lr
>> +    epilogue push_ip=HAVE_PAC_LEAF
>>   2:
>> +    .cfi_restore_state
>>       stmfd    sp!, {r5}
>> -    .cfi_def_cfa_offset 4
>> -    .cfi_offset 5, -4
>> +    .cfi_adjust_cfa_offset 4
>> +    .cfi_rel_offset 5, 0
>>       ldr    data1, [src1], #4
>>       and    tmp2, src2, #3
>> @@ -355,8 +363,8 @@ def_fn strcmp
>>       .cfi_remember_state
>>       ldmfd    sp!, {r5}
>>       .cfi_restore 5
>> -    .cfi_def_cfa_offset 0
>> -    bx    lr
>> +    .cfi_adjust_cfa_offset -4
>> +    epilogue push_ip=HAVE_PAC_LEAF
>>   .Lstrcmp_tail:
>>       .cfi_restore_state
>> @@ -372,7 +380,9 @@ def_fn strcmp
>>       sub    result, r2, result
>>       ldmfd    sp!, {r5}
>>       .cfi_restore 5
>> -    .cfi_def_cfa_offset 0
>> -    bx    lr
>> +    .cfi_adjust_cfa_offset -4
>> +    epilogue push_ip=HAVE_PAC_LEAF
>>       .cfi_endproc
>> +    .cantunwind
>> +    .fnend
>>       .size strcmp, . - strcmp


* Re: [PATCH v5 1/8] newlib: libc: define M-profile PACBTI-enablement macros
  2023-01-06 10:42   ` Christophe Lyon
  2023-01-06 20:51     ` Victor Do Nascimento
@ 2023-01-09  9:33     ` Christophe Lyon
  1 sibling, 0 replies; 15+ messages in thread
From: Christophe Lyon @ 2023-01-09  9:33 UTC (permalink / raw)
  To: Victor L. Do Nascimento, newlib; +Cc: Richard Earnshaw



On 1/6/23 11:42, Christophe Lyon wrote:
> Hi Victor,
> 
> Thanks for the patch series, a few comments/questions below.
> 
> Christophe
> 
> 
> On 12/21/22 12:19, Victor L. Do Nascimento wrote:
>> Augment the arm_asm.h header file to simplify function prologues and
>> epilogues whilst adding support for PACBTI enablement via macros for
>> hand-written assembly functions.  For PACBTI, both prologues/epilogues
>> as well as cfi-related directives are automatically amended
>> accordingly, depending on the compile-time mbranch-protection argument
>> values.
>>
>> It defines the following preprocessor macros:
>>     * HAVE_PAC_LEAF: Indicates whether pac-signing has been requested for
>>     leaf functions.
>>     * PAC_LEAF_PUSH_IP: Whether leaf functions should push the pac code
>>     to the stack irrespective of whether the ip register is clobbered in
>>     the function or not.
>>     * STACK_ALIGN_ENFORCE: Whether a dummy register should be added to
>>     the push list as necessary in the prologue to ensure stack
>>     alignment preservation at the start of assembly function.  The
>>     epilogue behavior is likewise affected by this flag, ensuring any
>>     pushed dummy registers also get popped on function return.
> IIUC, these new macros are meant for general usage outside of newlib, do 
> they need proper documentation? Or maybe an entry in the "News" section? 
> I don't know. Otherwise, I think they should not appear in the user 
> naming space.

I've just noticed this point was already discussed with Richard some 
time ago, I somehow missed/forgot it:
https://sourceware.org/pipermail/newlib/2022/019831.html

So this looks fine, provided the header and its macros are not meant to 
be used outside of newlib.

Thanks,

Christophe

> 
> 
>> It also defines the following assembler macros:
>>     * prologue: In addition to pushing any callee-saved registers onto
>>     the stack, it generates any requested pacbti instructions.
>>     Pushed registers are specified via the optional `first', `last',
>>     `push_ip' and `push_lr' macro argument parameters.
> Maybe you should quote 'first' and 'last' differently from 'push_ip' and 
> 'push_lr', since the example below shows that 'first' and 'last' are in 
> fact register numbers (IIUC)
> 
>>     when a single register number is provided, it pushes that
> Typo: "When" (with a capital)
> 
>>     register.  When two register numbers are provided, they specify a
>>     rage to save.  If push_ip and/or push_lr are non-zero, the
> Typo: "range"
> 
>>     respective registers are also saved.  Stack alignment is requested
>>     via the `align` argument, which defaults to the value of
>>     STACK_ALIGN_ENFORCE, unless manually overridden.
>>
>>     For example:
>>
>>         prologue push_ip=1 -> push {ip}
>>         prologue push_ip=1, align8=1 -> push {r2, ip}
>>         prologue push_ip=1, push_lr=1 -> push {ip, lr}
>>         prologue 1 -> push {r1}
>>         prologue 1, align8=1 -> push {r0, r1}
>>         prologue 1 push_ip=1 -> push {r1, ip}
>>         prologue 1 4 -> push {r1-r4}
>>         prologue 1 4 push_ip=1 -> push {r1-r4, ip}
> can you include an example with pacbti?
> 
> 
>>     * epilogue: pops registers off the stack and emits pac key signing
>>     instruction, if requested. The `first', `last', `push_ip',
>>     `push_lr' and `align' function as per the prologue macro,
>>     generating pop instead of push instructions.
>>
>>     Stack alignment is enforced via the following helper macro
>>     call-chain:
>>
>>     {prologue|epilogue} ->_align8 -> _preprocess_reglist ->
>>       _preprocess_reglist1 -> {_prologue|_epilogue}
>>
>>     Finally, the necessary cfi directives for adding debug information
>>     to prologue and epilogue are generated via the following macros:
>>
>>     * cfisavelist - prologue macro helper function, generating
>>     necessary .cfi_offset directives associated with push instruction.
>>     Therefore, the net effect of calling `prologue 1 2 push_ip=1' is
>>     to generate the following:
>>
>>         push {r1-r2, ip}
>>         .cfi_adjust_cfa_offset 12
>>         .cfi_offset 143, -4
>>         .cfi_offset 2, -8
>>         .cfi_offset 1, -12
>>
>>     * cfirestorelist - epilogue macro helper function, emitting
>>     .cfi_restore instructions prior to resetting the cfa offset.  As
>>     such, calling `epilogue 1 2 push_ip=1' will produce:
>>
>>          pop {r1-r2, ip}
>>     .cfi_register 143, 12
>>     .cfi_restore 2
>>     .cfi_restore 1
>>     .cfi_def_cfa_offset 0
>> ---
>>   newlib/libc/machine/arm/arm_asm.h | 441 ++++++++++++++++++++++++++++++
>>   1 file changed, 441 insertions(+)
>>
>> diff --git a/newlib/libc/machine/arm/arm_asm.h 
>> b/newlib/libc/machine/arm/arm_asm.h
>> index 2708057de..94fa77b4d 100644
>> --- a/newlib/libc/machine/arm/arm_asm.h
>> +++ b/newlib/libc/machine/arm/arm_asm.h
>> @@ -60,4 +60,445 @@
>>   # define _ISA_THUMB_1
>>   #endif
>> +/* Check whether leaf function PAC signing has been requested in the
>> +   -mbranch-protect compile-time option.  */
>> +#define LEAF_PROTECT_BIT 2
> Shouldn't this start with '__' or be #undefed at the end of this file to 
> avoid polluting user naming space? (I noticed it's not used outside this 
> file in this patch series)
> 
>> +
>> +#ifdef __ARM_FEATURE_PAC_DEFAULT
>> +# define HAVE_PAC_LEAF \
>> +    ((__ARM_FEATURE_PAC_DEFAULT & (1 << LEAF_PROTECT_BIT)) && 1)
>> +#else
>> +# define HAVE_PAC_LEAF 0
>> +#endif
>> +
>> +/* Provide default parameters for PAC-code handling in 
>> leaf-functions.  */
>> +#if HAVE_PAC_LEAF
>> +# ifndef PAC_LEAF_PUSH_IP
>> +#  define PAC_LEAF_PUSH_IP 1
>> +# endif
>> +#else /* !HAVE_PAC_LEAF */
>> +# undef PAC_LEAF_PUSH_IP
>> +# define PAC_LEAF_PUSH_IP 0
>> +#endif /* HAVE_PAC_LEAF */
>> +
>> +#define STACK_ALIGN_ENFORCE 0
>> +
>> +#ifdef __ASSEMBLER__
>> +
>> +/******************************************************************************
>> +* Implementation of the prologue and epilogue assembler macros and their
>> +* associated helper functions.
>> +*
>> +* These functions add support for the following:
>> +*
>> +* - M-profile branch target identification (BTI) landing-pads when 
>> compiled
>> +*   with `-mbranch-protection=bti'.
>> +* - PAC-signing and verification instructions, depending on hardware 
>> support
>> +*   and whether the PAC-signing of leaf functions has been requested 
>> via the
>> +*   `-mbranch-protection=pac-ret+leaf' compiler argument.
>> +* - 8-byte stack alignment preservation at function entry, defaulting 
>> to the
>> +*   value of STACK_ALIGN_ENFORCE.
>> +*
>> +* Notes:
>> +* - Prologue stack alignment is implemented by detecting a push with 
>> an odd
>> +*   number of registers and prepending a dummy register to the list.
>> +* - If alignment is attempted on a list containing r0, compilation 
>> will result
>> +*   in an error.
>> +* - If alignment is attempted in a list containing r1, r0 will be 
>> prepended to
>> +*   the register list and r0 will be restored prior to function 
>> return.  for
>> +*   functions with non-void return types, this will result in the 
>> corruption of
>> +*   the result register.
>> +* - Stack alignment is enforced via the following helper macro 
>> call-chain:
>> +*
>> +*    {prologue|epilogue} ->_align8 -> _preprocess_reglist ->
>> +*        _preprocess_reglist1 -> {_prologue|_epilogue}
>> +*
>> +* - Debug CFI directives are automatically added to prologues and 
>> epilogues,
>> +*   assisted by `cfisavelist' and `cfirestorelist', respectively.
>> +*
>> +* Arguments:
>> +* prologue
>> +* --------
>> +* - first    - If `last' specified, this serves as start of 
>> general-purpose
>> +*          register (GPR) range to push onto stack, otherwise represents
>> +*          single GPR to push onto stack.  If omitted, no GPRs pushed
>> +*          onto stack at prologue.
>> +* - last    - If given, specifies inclusive upper-bound of GPR range.
>> +* - push_ip    - Determines whether IP register is to be pushed to 
>> stack at
>> +*          prologue.  When pac-signing is requested, this holds the
>> +*          the pac-key.  Either 1 or 0 to push or not push, 
>> respectively.
>> +*          Default behavior: Set to value of PAC_LEAF_PUSH_IP macro.
>> +* - push_lr    - Determines whether to push lr to the stack on 
>> function entry.
>> +*          Either 1 or 0  to push or not push, respectively.
>> +* - align8    - Whether to enforce alignment. Either 1 or 0, with 1 
>> requesting
>> +*          alignment.
>> +*
>> +* epilogue
>> +* --------
>> +*   The epilogue should be called passing the same arguments as those 
>> passed to
>> +*   the prologue to ensure the stack is not corrupted on function 
>> return.
>> +*
>> +* Usage examples:
>> +*
>> +*   prologue push_ip=1 -> push {ip}
>> +*   epilogue push_ip=1, align8=1 -> pop {r2, ip}
>> +*   prologue push_ip=1, push_lr=1 -> push {ip, lr}
>> +*   epilogue 1 -> pop {r1}
>> +*   prologue 1, align8=1 -> push {r0, r1}
>> +*   epilogue 1, push_ip=1 -> pop {r1, ip}
>> +*   prologue 1, 4 -> push {r1-r4}
>> +*   epilogue 1, 4 push_ip=1 -> pop {r1-r4, ip}
>> +*
>> +******************************************************************************/
>> +
>> +/* Emit .cfi_restore directives for a consecutive sequence of 
>> registers.  */
>> +    .macro cfirestorelist first, last
>> +    .cfi_restore \last
>> +    .if \last-\first
>> +     cfirestorelist \first, \last-1
>> +    .endif
>> +    .endm
>> +
>> +/* Emit .cfi_offset directives for a consecutive sequence of 
>> registers.  */
>> +    .macro cfisavelist first, last, index=1
>> +    .cfi_offset \last, -4*(\index)
>> +    .if \last-\first
>> +     cfisavelist \first, \last-1, \index+1
>> +    .endif
>> +    .endm
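
As a cross-check of the CFI offsets in the commit message, the recursion in `cfisavelist' can be modelled in C roughly as below. This is a sketch under my own naming: `cfi_save_offsets' and `cfa_adjust_bytes' are hypothetical helpers, and the loop simply unrolls the macro's recursion from `last' down to `first'.

```c
#include <assert.h>

/* Bytes added to the CFA by the push: one 4-byte slot per register,
   matching the .cfi_adjust_cfa_offset expressions in _prologue.  */
static int cfa_adjust_bytes(int first, int last, int push_ip, int push_lr)
{
    assert(first >= 0 && first <= last);
    return ((last - first) + 1 + push_ip + push_lr) * 4;
}

/* Model of cfisavelist: r<last> is saved at -4*index, and each lower
   register sits one slot further down.  offsets[i] receives the
   CFA-relative offset of register r<first + i>.  */
static void cfi_save_offsets(int first, int last, int index, int *offsets)
{
    assert(first >= 0 && first <= last && index >= 1);
    for (int reg = last; reg >= first; reg--, index++)
        offsets[reg - first] = -4 * index;
}
```

For `prologue 1 2 push_ip=1' the macro calls `cfisavelist 1, 2, 2', so this model gives r2 at -8 and r1 at -12 with a CFA adjustment of 12 bytes, in line with the directives listed in the commit message.
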
>> +
>> +.macro _prologue first=-1, last=-1, push_ip=PAC_LEAF_PUSH_IP, push_lr=0
>> +    .if \push_ip & 1 != \push_ip
>> +     .error "push_ip may be either 0 or 1"
>> +    .endif
>> +    .if \push_lr & 1 != \push_lr
>> +     .error "push_lr may be either 0 or 1"
>> +    .endif
>> +    .if \first != -1
>> +     .if \last == -1
>> +      /* Upper-bound not provided: Set upper = lower.  */
>> +      _prologue \first, \first, \push_ip, \push_lr
>> +      .exitm
>> +     .endif
>> +    .endif
>> +#if HAVE_PAC_LEAF
>> +#if __ARM_FEATURE_BTI_DEFAULT
>> +    pacbti    ip, lr, sp
>> +#else
>> +    pac    ip, lr, sp
>> +#endif /* __ARM_FEATURE_BTI_DEFAULT */
>> +    .cfi_register 143, 12
>> +#else
>> +#if __ARM_FEATURE_BTI_DEFAULT
>> +    bti
>> +#endif /* __ARM_FEATURE_BTI_DEFAULT */
>> +#endif /* HAVE_PAC_LEAF */
>> +    .if \first != -1
>> +     .if \last != \first
>> +      .if \last >= 13
>> +    .error "SP cannot be in the save list"
>> +      .endif
> I think you should also check that IP (r12) is not in the range, 
> otherwise I think nothing prevents from doing
> prologue 12, push_ip=1 which will result in emitting push {r12, ip}
> (I suppose gas would complain?)
> .... scratch that, I saw later that this sanity checking is performed in 
> _preprocess_reglist1 :-)
> 
> 
>> +      .if \push_ip
>> +       .if \push_lr
>> +    /* Case 1: push register range, ip and lr registers.  */
>> +    push {r\first-r\last, ip, lr}
>> +    .cfi_adjust_cfa_offset ((\last-\first)+3)*4
>> +    .cfi_offset 14, -4
>> +    .cfi_offset 143, -8
>> +    cfisavelist \first, \last, 3
>> +       .else // !\push_lr
>> +    /* Case 2: push register range and ip register.  */
>> +    push {r\first-r\last, ip}
>> +    .cfi_adjust_cfa_offset ((\last-\first)+2)*4
>> +    .cfi_offset 143, -4
>> +    cfisavelist \first, \last, 2
>> +       .endif
>> +      .else // !\push_ip
>> +       .if \push_lr
>> +    /* Case 3: push register range and lr register.  */
>> +    push {r\first-r\last, lr}
>> +    .cfi_adjust_cfa_offset ((\last-\first)+2)*4
>> +    .cfi_offset 14, -4
>> +    cfisavelist \first, \last, 2
>> +       .else // !\push_lr
>> +    /* Case 4: push register range.  */
>> +    push {r\first-r\last}
>> +    .cfi_adjust_cfa_offset ((\last-\first)+1)*4
>> +    cfisavelist \first, \last, 1
>> +       .endif
>> +      .endif
>> +     .else // \last == \first
>> +      .if \push_ip
>> +       .if \push_lr
>> +    /* Case 5: push single GP register plus ip and lr registers.  */
>> +    push {r\first, ip, lr}
>> +    .cfi_adjust_cfa_offset 12
>> +    .cfi_offset 14, -4
>> +    .cfi_offset 143, -8
>> +        cfisavelist \first, \first, 3
>> +       .else // !\push_lr
>> +    /* Case 6: push single GP register plus ip register.  */
>> +    push {r\first, ip}
>> +    .cfi_adjust_cfa_offset 8
>> +    .cfi_offset 143, -4
>> +        cfisavelist \first, \first, 2
>> +       .endif
>> +      .else // !\push_ip
>> +       .if \push_lr
>> +    /* Case 7: push single GP register plus lr register.  */
>> +    push {r\first, lr}
>> +    .cfi_adjust_cfa_offset 8
>> +    .cfi_offset 14, -4
>> +    cfisavelist \first, \first, 2
>> +       .else // !\push_lr
>> +    /* Case 8: push single GP register.  */
>> +    push {r\first}
>> +    .cfi_adjust_cfa_offset 4
>> +    cfisavelist \first, \first, 1
>> +       .endif
>> +      .endif
>> +     .endif
>> +    .else // \first == -1
>> +     .if \push_ip
>> +      .if \push_lr
>> +    /* Case 9: push ip and lr registers.  */
>> +    push {ip, lr}
>> +    .cfi_adjust_cfa_offset 8
>> +    .cfi_offset 14, -4
>> +    .cfi_offset 143, -8
>> +      .else // !\push_lr
>> +    /* Case 10: push ip register.  */
>> +    push {ip}
>> +    .cfi_adjust_cfa_offset 4
>> +    .cfi_offset 143, -4
>> +      .endif
>> +     .else // !\push_ip
>> +          .if \push_lr
>> +    /* Case 11: push lr register.  */
>> +    push {lr}
>> +    .cfi_adjust_cfa_offset 4
>> +    .cfi_offset 14, -4
>> +          .endif
>> +     .endif
>> +    .endif
>> +.endm
>> +
>> +.macro _epilogue first=-1, last=-1, push_ip=PAC_LEAF_PUSH_IP, push_lr=0
>> +    .if \push_ip & 1 != \push_ip
>> +     .error "push_ip may be either 0 or 1"
>> +    .endif
>> +    .if \push_lr & 1 != \push_lr
>> +     .error "push_lr may be either 0 or 1"
>> +    .endif
>> +    .if \first != -1
>> +     .if \last == -1
>> +      /* Upper-bound not provided: Set upper = lower.  */
>> +      _epilogue \first, \first, \push_ip, \push_lr
>> +      .exitm
>> +     .endif
>> +     .if \last != \first
>> +      .if \last >= 13
>> +    .error "SP cannot be in the save list"
>> +      .endif
>> +      .if \push_ip
>> +       .if \push_lr
>> +    /* Case 1: pop register range, ip and lr registers.  */
>> +    pop {r\first-r\last, ip, lr}
>> +    .cfi_restore 14
>> +    .cfi_register 143, 12
>> +    cfirestorelist \first, \last
>> +       .else // !\push_lr
>> +    /* Case 2: pop register range and ip register.  */
>> +    pop {r\first-r\last, ip}
>> +    .cfi_register 143, 12
>> +    cfirestorelist \first, \last
>> +       .endif
>> +      .else // !\push_ip
>> +       .if \push_lr
>> +    /* Case 3: pop register range and lr register.  */
>> +    pop {r\first-r\last, lr}
>> +    .cfi_restore 14
>> +    cfirestorelist \first, \last
>> +       .else // !\push_lr
>> +    /* Case 4: pop register range.  */
>> +    pop {r\first-r\last}
>> +    cfirestorelist \first, \last
>> +       .endif
>> +      .endif
>> +     .else // \last == \first
>> +      .if \push_ip
>> +       .if \push_lr
>> +    /* Case 5: pop single GP register plus ip and lr registers.  */
>> +    pop {r\first, ip, lr}
>> +    .cfi_restore 14
>> +    .cfi_register 143, 12
>> +    cfirestorelist \first, \first
>> +       .else // !\push_lr
>> +    /* Case 6: pop single GP register plus ip register.  */
>> +    pop {r\first, ip}
>> +    .cfi_register 143, 12
>> +    cfirestorelist \first, \first
>> +       .endif
>> +      .else // !\push_ip
>> +       .if \push_lr
>> +    /* Case 7: pop single GP register plus lr register.  */
>> +    pop {r\first, lr}
>> +    .cfi_restore 14
>> +    cfirestorelist \first, \first
>> +       .else // !\push_lr
>> +    /* Case 8: pop single GP register.  */
>> +    pop {r\first}
>> +    cfirestorelist \first, \first
>> +       .endif
>> +      .endif
>> +     .endif
>> +    .else // \first == -1
>> +     .if \push_ip
>> +      .if \push_lr
>> +    /* Case 9: pop ip and lr registers.  */
>> +    pop {ip, lr}
>> +    .cfi_restore 14
>> +    .cfi_register 143, 12
>> +      .else // !\push_lr
>> +    /* Case 10: pop ip register.  */
>> +    pop {ip}
>> +    .cfi_register 143, 12
>> +      .endif
>> +     .else // !\push_ip
>> +          .if \push_lr
>> +    /* Case 11: pop lr register.  */
>> +    pop {lr}
>> +    .cfi_restore 14
>> +          .endif
>> +     .endif
>> +    .endif
>> +#if HAVE_PAC_LEAF
>> +    aut    ip, lr, sp
>> +#endif /* HAVE_PAC_LEAF */
>> +    bx    lr
>> +.endm
>> +
>> +# clean up expressions in 'last'
>> +.macro _preprocess_reglist1 first:req, last:req, push_ip:req, 
>> push_lr:req, reglist_op:req
>> +    .if \last == 0
>> +     \reglist_op \first, 0, \push_ip, \push_lr
>> +    .elseif \last == 1
>> +     \reglist_op \first, 1, \push_ip, \push_lr
>> +    .elseif \last == 2
>> +     \reglist_op \first, 2, \push_ip, \push_lr
>> +    .elseif \last == 3
>> +     \reglist_op \first, 3, \push_ip, \push_lr
>> +    .elseif \last == 4
>> +     \reglist_op \first, 4, \push_ip, \push_lr
>> +    .elseif \last == 5
>> +     \reglist_op \first, 5, \push_ip, \push_lr
>> +    .elseif \last == 6
>> +     \reglist_op \first, 6, \push_ip, \push_lr
>> +    .elseif \last == 7
>> +     \reglist_op \first, 7, \push_ip, \push_lr
>> +    .elseif \last == 8
>> +     \reglist_op \first, 8, \push_ip, \push_lr
>> +    .elseif \last == 9
>> +     \reglist_op \first, 9, \push_ip, \push_lr
>> +    .elseif \last == 10
>> +     \reglist_op \first, 10, \push_ip, \push_lr
>> +    .elseif \last == 11
>> +     \reglist_op \first, 11, \push_ip, \push_lr
>> +    .else
>> +     .error "last (\last) out of range"
>> +    .endif
>> +.endm
>> +
>> +# clean up expressions in 'first'
>> +.macro _preprocess_reglist first:req, last, push_ip=0, push_lr=0, 
>> reglist_op:req
>> +    .ifb \last
>> +     _preprocess_reglist \first \first \push_ip \push_lr
>> +    .else
>> +     .if \first > \last
>> +      .error "last (\last) must be at least as great as first (\first)"
>> +     .endif
>> +     .if \first == 0
>> +      _preprocess_reglist1 0, \last, \push_ip, \push_lr, \reglist_op
>> +     .elseif \first == 1
>> +      _preprocess_reglist1 1, \last, \push_ip, \push_lr, \reglist_op
>> +     .elseif \first == 2
>> +      _preprocess_reglist1 2, \last, \push_ip, \push_lr, \reglist_op
>> +     .elseif \first == 3
>> +      _preprocess_reglist1 3, \last, \push_ip, \push_lr, \reglist_op
>> +     .elseif \first == 4
>> +      _preprocess_reglist1 4, \last, \push_ip, \push_lr, \reglist_op
>> +     .elseif \first == 5
>> +      _preprocess_reglist1 5, \last, \push_ip, \push_lr, \reglist_op
>> +     .elseif \first == 6
>> +      _preprocess_reglist1 6, \last, \push_ip, \push_lr, \reglist_op
>> +     .elseif \first == 7
>> +      _preprocess_reglist1 7, \last, \push_ip, \push_lr, \reglist_op
>> +     .elseif \first == 8
>> +      _preprocess_reglist1 8, \last, \push_ip, \push_lr, \reglist_op
>> +     .elseif \first == 9
>> +      _preprocess_reglist1 9, \last, \push_ip, \push_lr, \reglist_op
>> +     .elseif \first == 10
>> +      _preprocess_reglist1 10, \last, \push_ip, \push_lr, \reglist_op
>> +     .elseif \first == 11
>> +      _preprocess_reglist1 11, \last, \push_ip, \push_lr, \reglist_op
>> +     .else
>> +      .error "first (\first) out of range"
>> +     .endif
>> +    .endif
>> +.endm
>> +
>> +.macro _align8 first, last, push_ip=0, push_lr=0, reglist_op=_prologue
>> +    .ifb \first
>> +     .ifnb \last
>> +      .error "can't have last (\last) without specifying first"
>> +     .else // \last blank
>> +      .if ((\push_ip + \push_lr) % 2) == 0
>> +       \reglist_op first=-1, last=-1, push_ip=\push_ip, push_lr=\push_lr
>> +       .exitm
>> +      .else // ((\push_ip + \push_lr) % 2) odd
>> +       _align8 2, 2, \push_ip, \push_lr, \reglist_op
>> +       .exitm
>> +      .endif // ((\push_ip + \push_lr) % 2) == 0
>> +     .endif // .ifnb \last
>> +    .endif // .ifb \first
>> +
>> +    .ifb \last
>> +     _align8 \first, \first, \push_ip, \push_lr, \reglist_op
>> +    .else
>> +     .if \push_ip & 1 <> \push_ip
>> +      .error "push_ip may be 0 or 1"
>> +     .endif
>> +     .if \push_lr & 1 <> \push_lr
>> +      .error "push_lr may be 0 or 1"
>> +     .endif
>> +     .ifeq (\last - \first + \push_ip + \push_lr) % 2
>> +      .if \first == 0
>> +       .error "Alignment required and first register is r0"
>> +       .exitm
>> +      .endif
>> +      _preprocess_reglist \first-1, \last, \push_ip, \push_lr, \reglist_op
>> +     .else
>> +      _preprocess_reglist \first, \last, \push_ip, \push_lr, \reglist_op
>> +     .endif
>> +    .endif
>> +.endm
>> +
>> +.macro prologue first, last, push_ip=PAC_LEAF_PUSH_IP, push_lr=0, align8=STACK_ALIGN_ENFORCE
>> +    .if \align8
>> +     _align8 \first, \last, \push_ip, \push_lr, _prologue
>> +    .else
>> +     _prologue first=\first, last=\last, push_ip=\push_ip, push_lr=\push_lr
>> +    .endif
>> +.endm
>> +
>> +.macro epilogue first, last, push_ip=PAC_LEAF_PUSH_IP, push_lr=0, align8=STACK_ALIGN_ENFORCE
>> +    .if \align8
>> +     _align8 \first, \last, \push_ip, \push_lr, reglist_op=_epilogue
>> +    .else
>> +     _epilogue first=\first, last=\last, push_ip=\push_ip, push_lr=\push_lr
>> +    .endif
>> +.endm
>> +
>> +#endif /* __ASSEMBLER__ */
>> +
>>   #endif /* ARM_ASM__H */

end of thread, other threads:[~2023-01-09  9:33 UTC | newest]

Thread overview: 15+ messages
2022-12-21 11:03 [PATCH v5 0/8] Implement assembly cortex-M PACBTI functionality Victor Do Nascimento
2022-12-21 11:19 ` [PATCH v5 1/8] newlib: libc: define M-profile PACBTI-enablement macros Victor L. Do Nascimento
2023-01-06 10:42   ` Christophe Lyon
2023-01-06 20:51     ` Victor Do Nascimento
2023-01-09  9:33     ` Christophe Lyon
2022-12-21 11:21 ` [PATCH v5 2/8] newlib: libc: strcmp M-profile PACBTI-enablement Victor L. Do Nascimento
2023-01-06 11:09   ` Christophe Lyon
2023-01-06 21:35     ` Victor Do Nascimento
2022-12-21 11:22 ` [PATCH v5 3/8] newlib: libc: strlen " Victor L. Do Nascimento
2022-12-21 11:24 ` [PATCH v5 4/8] newlib: libc: memchr " Victor L. Do Nascimento
2022-12-21 11:25 ` [PATCH v5 5/8] newlib: libc: memcpy " Victor L. Do Nascimento
2022-12-21 11:27 ` [PATCH v5 6/8] newlib: libc: aeabi_memmove " Victor L. Do Nascimento
2022-12-21 11:28 ` [PATCH v5 7/8] newlib: libc: aeabi_memset " Victor L. Do Nascimento
2022-12-21 11:42 ` [PATCH v5 8/8] newlib: libc: setjmp " Victor L. Do Nascimento
2023-01-05 16:53   ` Richard Earnshaw