* [PATCH v5 0/8] Implement assembly cortex-M PACBTI functionality
From: Victor Do Nascimento @ 2022-12-21 11:03 UTC (permalink / raw)
To: newlib; +Cc: Richard Earnshaw
Hi all,
This respin of the patch series makes the final modifications requested
in upstream review and rebases the work on the setjmp and longjmp
routines onto the fixed Arm ABI.
Tweaks needed for correct CFI information generation were made to:
* newlib/libc/machine/arm/strcmp-armv7.S
* newlib/libc/machine/arm/memchr.S
Stray comment restored in:
* newlib/libc/machine/arm/memcpy-armv7m.S
Patch rebased, cleaned up and missing BTI landing pad added:
* newlib/libc/machine/arm/setjmp.S
All other patches in the series are unchanged from the previous iteration.
Thanks,
Victor
------
This patch series modifies hand-written assembly files for Arm
targets, introducing a uniform prologue/epilogue interface,
responsible for pushing/popping registers on function entry and exit,
while conditionally enabling branch target identification as well as
return address signing and verification based on Armv8.1-M Pointer
Authentication [1] using ACLE feature test macros at compile-time [2].
The incorporation of PACBTI functionality in function prologues/
epilogues is dictated by the combination of parameter macros in
arm-asm.h and arguments passed to the `-mbranch-protection' flag at
the time of Newlib compilation.
Regression tested on arm-none-eabi with and without MVE extension and
for Newlib and Newlib-nano.
[1]
<https://community.arm.com/developer/ip-products/processors/b/processors-ip-blog/posts/armv8-1-m-pointer-authentication-and-branch-target-identification-extension>
[2]
<https://developer.arm.com/documentation/101028/0012/5--Feature-test-macros>
Victor Do Nascimento (8):
newlib: libc: define M-profile PACBTI-enablement macros
newlib: libc: strcmp M-profile PACBTI-enablement
newlib: libc: strlen M-profile PACBTI-enablement
newlib: libc: memchr M-profile PACBTI-enablement
newlib: libc: memcpy M-profile PACBTI-enablement
newlib: libc: aeabi_memmove M-profile PACBTI-enablement
newlib: libc: aeabi_memset M-profile PACBTI-enablement
newlib: libc: setjmp M-profile PACBTI-enablement
.../libc/machine/arm/aeabi_memmove-thumb2.S | 17 +-
newlib/libc/machine/arm/aeabi_memset-thumb2.S | 14 +-
newlib/libc/machine/arm/arm_asm.h | 441 ++++++++++++++++++
newlib/libc/machine/arm/memchr.S | 50 +-
newlib/libc/machine/arm/memcpy-armv7m.S | 33 +-
newlib/libc/machine/arm/setjmp.S | 39 ++
newlib/libc/machine/arm/strcmp-arm-tiny.S | 8 +-
newlib/libc/machine/arm/strcmp-armv7.S | 57 ++-
newlib/libc/machine/arm/strcmp-armv7m.S | 26 +-
newlib/libc/machine/arm/strlen-armv7.S | 17 +-
newlib/libc/machine/arm/strlen-thumb2-Os.S | 14 +-
11 files changed, 656 insertions(+), 60 deletions(-)
--
2.36.1
* [PATCH v5 1/8] newlib: libc: define M-profile PACBTI-enablement macros
From: Victor L. Do Nascimento @ 2022-12-21 11:19 UTC (permalink / raw)
To: newlib; +Cc: Richard Earnshaw
Augment the arm_asm.h header file to simplify function prologues and
epilogues whilst adding support for PACBTI enablement via macros for
hand-written assembly functions. For PACBTI, both prologues/epilogues
as well as cfi-related directives are automatically amended
accordingly, depending on the compile-time `-mbranch-protection' argument
values.
It defines the following preprocessor macros:
* HAVE_PAC_LEAF: Indicates whether pac-signing has been requested for
leaf functions.
* PAC_LEAF_PUSH_IP: Whether leaf functions should push the pac code
to the stack irrespective of whether the ip register is clobbered in
the function or not.
* STACK_ALIGN_ENFORCE: Whether a dummy register should be added to
the push list as necessary in the prologue to preserve stack
alignment at the start of an assembly function. The
epilogue behavior is likewise affected by this flag, ensuring any
pushed dummy registers also get popped on function return.
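As a rough illustration of how the first two macros are derived, the
following C sketch models the preprocessor logic as ordinary functions.
The function names are hypothetical and not part of the patch; the
input values 1 and 5 used in the usage note are illustrative
__ARM_FEATURE_PAC_DEFAULT values only.

```c
#include <assert.h>
#include <stdbool.h>

/* Bit position of the "leaf" flag, matching LEAF_PROTECT_BIT in arm_asm.h. */
#define LEAF_PROTECT_BIT 2

/* Hypothetical helper: models HAVE_PAC_LEAF for a given
   __ARM_FEATURE_PAC_DEFAULT value.  Passing 0 also models the case
   where the macro is not defined at all.  */
bool have_pac_leaf(unsigned pac_default)
{
    return (pac_default & (1u << LEAF_PROTECT_BIT)) != 0;
}

/* Hypothetical helper: models the PAC_LEAF_PUSH_IP default.
   user_override < 0 means "not defined by the user".  */
int pac_leaf_push_ip(unsigned pac_default, int user_override)
{
    assert(user_override <= 1);
    if (!have_pac_leaf(pac_default))
        return 0;               /* forced to 0 without leaf signing */
    return user_override < 0 ? 1 : user_override;
}
```

With leaf signing requested (input 5), the default push_ip is 1 unless
the user overrides it; without it (input 1), the value is always 0.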
It also defines the following assembler macros:
* prologue: In addition to pushing any callee-saved registers onto
the stack, it generates any requested pacbti instructions.
Pushed registers are specified via the optional `first', `last',
`push_ip' and `push_lr' macro arguments.
When a single register number is provided, it pushes that
register. When two register numbers are provided, they specify a
range to save. If push_ip and/or push_lr are non-zero, the
respective registers are also saved. Stack alignment is requested
via the `align8' argument, which defaults to the value of
STACK_ALIGN_ENFORCE, unless manually overridden.
For example:
prologue push_ip=1 -> push {ip}
prologue push_ip=1, align8=1 -> push {r2, ip}
prologue push_ip=1, push_lr=1 -> push {ip, lr}
prologue 1 -> push {r1}
prologue 1, align8=1 -> push {r0, r1}
prologue 1 push_ip=1 -> push {r1, ip}
prologue 1 4 -> push {r1-r4}
prologue 1 4 push_ip=1 -> push {r1-r4, ip}
* epilogue: pops registers off the stack and emits the PAC
authentication instruction, if requested. The `first', `last', `push_ip',
`push_lr' and `align8' arguments function as per the prologue macro,
generating pop instead of push instructions.
Stack alignment is enforced via the following helper macro
call-chain:
{prologue|epilogue} ->_align8 -> _preprocess_reglist ->
_preprocess_reglist1 -> {_prologue|_epilogue}
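The padding decision taken by _align8 can be modelled in C as below.
This is only a sketch of the rule as read from the macro, not code from
the patch; the function name and the -1/-2 return conventions are
assumptions made for illustration.

```c
#include <assert.h>

/* Hypothetical model of the _align8 decision.  first/last are -1 when
   omitted.  Returns the lowest GPR actually pushed, -1 if no GPR is
   pushed, or -2 where the macro would raise an assembler error.  */
int align8_first_reg(int first, int last, int push_ip, int push_lr)
{
    assert((push_ip == 0 || push_ip == 1) && (push_lr == 0 || push_lr == 1));

    if (first < 0) {
        if (last >= 0)
            return -2;          /* last given without first */
        /* Only ip/lr are pushed: pad with r2 if that count is odd.  */
        return ((push_ip + push_lr) % 2 == 0) ? -1 : 2;
    }
    if (last < 0)
        last = first;           /* single register */

    int count = (last - first + 1) + push_ip + push_lr;
    if (count % 2 == 0)
        return first;           /* already a multiple of 8 bytes */
    if (first == 0)
        return -2;              /* no register below r0 to prepend */
    return first - 1;           /* prepend one dummy register */
}
```

For instance, align8_first_reg(1, -1, 0, 0) returns 0, corresponding to
`prologue 1, align8=1 -> push {r0, r1}', and
align8_first_reg(-1, -1, 1, 0) returns 2, corresponding to
`push {r2, ip}'.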
Finally, the necessary cfi directives for adding debug information
to prologue and epilogue are generated via the following macros:
* cfisavelist - prologue macro helper function, generating
necessary .cfi_offset directives associated with push instruction.
Therefore, the net effect of calling `prologue 1 2 push_ip=1' is
to generate the following:
push {r1-r2, ip}
.cfi_adjust_cfa_offset 12
.cfi_offset 143, -4
.cfi_offset 2, -8
.cfi_offset 1, -12
* cfirestorelist - epilogue macro helper function, emitting
.cfi_restore directives prior to resetting the cfa offset. As
such, calling `epilogue 1 2 push_ip=1' will produce:
pop {r1-r2, ip}
.cfi_register 143, 12
.cfi_restore 2
.cfi_restore 1
.cfi_def_cfa_offset 0
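To make the offset arithmetic behind the prologue directives explicit,
here is a small C model that prints the .cfi_* lines for a given
register list. It is a sketch under the assumption that lr, then
register 143, then the GPR range (highest first) occupy successive
4-byte slots below the CFA, as in the prologue cases of the patch; the
function name is hypothetical.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical model: print the CFI directives emitted after
   push {r<first>-r<last>[, ip][, lr]} and return the CFA adjustment
   in bytes.  Register 143 is the number the patch uses for the PAC
   code held in ip.  */
int prologue_cfi(int first, int last, int push_ip, int push_lr)
{
    assert(first >= 0 && last >= first && last <= 11);

    int slots = (last - first + 1) + push_ip + push_lr;
    int index = 1;              /* slot counter, 1 = closest to the CFA */

    printf(".cfi_adjust_cfa_offset %d\n", slots * 4);
    if (push_lr)
        printf(".cfi_offset 14, %d\n", -4 * index++);
    if (push_ip)
        printf(".cfi_offset 143, %d\n", -4 * index++);
    /* Same order as cfisavelist: from `last' down to `first'.  */
    for (int reg = last; reg >= first; --reg)
        printf(".cfi_offset %d, %d\n", reg, -4 * index++);

    return slots * 4;
}
```

Calling prologue_cfi(1, 2, 1, 0) prints the same four directives listed
above for `prologue 1 2 push_ip=1' and returns 12.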
---
newlib/libc/machine/arm/arm_asm.h | 441 ++++++++++++++++++++++++++++++
1 file changed, 441 insertions(+)
diff --git a/newlib/libc/machine/arm/arm_asm.h b/newlib/libc/machine/arm/arm_asm.h
index 2708057de..94fa77b4d 100644
--- a/newlib/libc/machine/arm/arm_asm.h
+++ b/newlib/libc/machine/arm/arm_asm.h
@@ -60,4 +60,445 @@
# define _ISA_THUMB_1
#endif
+/* Check whether leaf function PAC signing has been requested in the
+ -mbranch-protection compile-time option. */
+#define LEAF_PROTECT_BIT 2
+
+#ifdef __ARM_FEATURE_PAC_DEFAULT
+# define HAVE_PAC_LEAF \
+ ((__ARM_FEATURE_PAC_DEFAULT & (1 << LEAF_PROTECT_BIT)) && 1)
+#else
+# define HAVE_PAC_LEAF 0
+#endif
+
+/* Provide default parameters for PAC-code handling in leaf-functions. */
+#if HAVE_PAC_LEAF
+# ifndef PAC_LEAF_PUSH_IP
+# define PAC_LEAF_PUSH_IP 1
+# endif
+#else /* !HAVE_PAC_LEAF */
+# undef PAC_LEAF_PUSH_IP
+# define PAC_LEAF_PUSH_IP 0
+#endif /* HAVE_PAC_LEAF */
+
+#define STACK_ALIGN_ENFORCE 0
+
+#ifdef __ASSEMBLER__
+
+/******************************************************************************
+* Implementation of the prologue and epilogue assembler macros and their
+* associated helper functions.
+*
+* These functions add support for the following:
+*
+* - M-profile branch target identification (BTI) landing-pads when compiled
+* with `-mbranch-protection=bti'.
+* - PAC-signing and verification instructions, depending on hardware support
+* and whether the PAC-signing of leaf functions has been requested via the
+* `-mbranch-protection=pac-ret+leaf' compiler argument.
+* - 8-byte stack alignment preservation at function entry, defaulting to the
+* value of STACK_ALIGN_ENFORCE.
+*
+* Notes:
+* - Prologue stack alignment is implemented by detecting a push with an odd
+* number of registers and prepending a dummy register to the list.
+* - If alignment is attempted on a list containing r0, compilation will result
+* in an error.
+* - If alignment is attempted in a list containing r1, r0 will be prepended to
+* the register list and r0 will be restored prior to function return. For
+* functions with non-void return types, this will result in the corruption of
+* the result register.
+* - Stack alignment is enforced via the following helper macro call-chain:
+*
+* {prologue|epilogue} ->_align8 -> _preprocess_reglist ->
+* _preprocess_reglist1 -> {_prologue|_epilogue}
+*
+* - Debug CFI directives are automatically added to prologues and epilogues,
+* assisted by `cfisavelist' and `cfirestorelist', respectively.
+*
+* Arguments:
+* prologue
+* --------
+* - first - If `last' specified, this serves as start of general-purpose
+* register (GPR) range to push onto stack, otherwise represents
+* single GPR to push onto stack. If omitted, no GPRs pushed
+* onto stack at prologue.
+* - last - If given, specifies inclusive upper-bound of GPR range.
+* - push_ip - Determines whether IP register is to be pushed to stack at
+* prologue. When pac-signing is requested, this holds the
+* pac-key. Either 1 or 0 to push or not push, respectively.
+* Default behavior: Set to value of PAC_LEAF_PUSH_IP macro.
+* - push_lr - Determines whether to push lr to the stack on function entry.
+* Either 1 or 0 to push or not push, respectively.
+* - align8 - Whether to enforce alignment. Either 1 or 0, with 1 requesting
+* alignment.
+*
+* epilogue
+* --------
+* The epilogue should be called passing the same arguments as those passed to
+* the prologue to ensure the stack is not corrupted on function return.
+*
+* Usage examples:
+*
+* prologue push_ip=1 -> push {ip}
+* epilogue push_ip=1, align8=1 -> pop {r2, ip}
+* prologue push_ip=1, push_lr=1 -> push {ip, lr}
+* epilogue 1 -> pop {r1}
+* prologue 1, align8=1 -> push {r0, r1}
+* epilogue 1, push_ip=1 -> pop {r1, ip}
+* prologue 1, 4 -> push {r1-r4}
+* epilogue 1, 4 push_ip=1 -> pop {r1-r4, ip}
+*
+******************************************************************************/
+
+/* Emit .cfi_restore directives for a consecutive sequence of registers. */
+ .macro cfirestorelist first, last
+ .cfi_restore \last
+ .if \last-\first
+ cfirestorelist \first, \last-1
+ .endif
+ .endm
+
+/* Emit .cfi_offset directives for a consecutive sequence of registers. */
+ .macro cfisavelist first, last, index=1
+ .cfi_offset \last, -4*(\index)
+ .if \last-\first
+ cfisavelist \first, \last-1, \index+1
+ .endif
+ .endm
+
+.macro _prologue first=-1, last=-1, push_ip=PAC_LEAF_PUSH_IP, push_lr=0
+ .if \push_ip & 1 != \push_ip
+ .error "push_ip may be either 0 or 1"
+ .endif
+ .if \push_lr & 1 != \push_lr
+ .error "push_lr may be either 0 or 1"
+ .endif
+ .if \first != -1
+ .if \last == -1
+ /* Upper-bound not provided: Set upper = lower. */
+ _prologue \first, \first, \push_ip, \push_lr
+ .exitm
+ .endif
+ .endif
+#if HAVE_PAC_LEAF
+#if __ARM_FEATURE_BTI_DEFAULT
+ pacbti ip, lr, sp
+#else
+ pac ip, lr, sp
+#endif /* __ARM_FEATURE_BTI_DEFAULT */
+ .cfi_register 143, 12
+#else
+#if __ARM_FEATURE_BTI_DEFAULT
+ bti
+#endif /* __ARM_FEATURE_BTI_DEFAULT */
+#endif /* HAVE_PAC_LEAF */
+ .if \first != -1
+ .if \last != \first
+ .if \last >= 13
+ .error "SP cannot be in the save list"
+ .endif
+ .if \push_ip
+ .if \push_lr
+ /* Case 1: push register range, ip and lr registers. */
+ push {r\first-r\last, ip, lr}
+ .cfi_adjust_cfa_offset ((\last-\first)+3)*4
+ .cfi_offset 14, -4
+ .cfi_offset 143, -8
+ cfisavelist \first, \last, 3
+ .else // !\push_lr
+ /* Case 2: push register range and ip register. */
+ push {r\first-r\last, ip}
+ .cfi_adjust_cfa_offset ((\last-\first)+2)*4
+ .cfi_offset 143, -4
+ cfisavelist \first, \last, 2
+ .endif
+ .else // !\push_ip
+ .if \push_lr
+ /* Case 3: push register range and lr register. */
+ push {r\first-r\last, lr}
+ .cfi_adjust_cfa_offset ((\last-\first)+2)*4
+ .cfi_offset 14, -4
+ cfisavelist \first, \last, 2
+ .else // !\push_lr
+ /* Case 4: push register range. */
+ push {r\first-r\last}
+ .cfi_adjust_cfa_offset ((\last-\first)+1)*4
+ cfisavelist \first, \last, 1
+ .endif
+ .endif
+ .else // \last == \first
+ .if \push_ip
+ .if \push_lr
+ /* Case 5: push single GP register plus ip and lr registers. */
+ push {r\first, ip, lr}
+ .cfi_adjust_cfa_offset 12
+ .cfi_offset 14, -4
+ .cfi_offset 143, -8
+ cfisavelist \first, \first, 3
+ .else // !\push_lr
+ /* Case 6: push single GP register plus ip register. */
+ push {r\first, ip}
+ .cfi_adjust_cfa_offset 8
+ .cfi_offset 143, -4
+ cfisavelist \first, \first, 2
+ .endif
+ .else // !\push_ip
+ .if \push_lr
+ /* Case 7: push single GP register plus lr register. */
+ push {r\first, lr}
+ .cfi_adjust_cfa_offset 8
+ .cfi_offset 14, -4
+ cfisavelist \first, \first, 2
+ .else // !\push_lr
+ /* Case 8: push single GP register. */
+ push {r\first}
+ .cfi_adjust_cfa_offset 4
+ cfisavelist \first, \first, 1
+ .endif
+ .endif
+ .endif
+ .else // \first == -1
+ .if \push_ip
+ .if \push_lr
+ /* Case 9: push ip and lr registers. */
+ push {ip, lr}
+ .cfi_adjust_cfa_offset 8
+ .cfi_offset 14, -4
+ .cfi_offset 143, -8
+ .else // !\push_lr
+ /* Case 10: push ip register. */
+ push {ip}
+ .cfi_adjust_cfa_offset 4
+ .cfi_offset 143, -4
+ .endif
+ .else // !\push_ip
+ .if \push_lr
+ /* Case 11: push lr register. */
+ push {lr}
+ .cfi_adjust_cfa_offset 4
+ .cfi_offset 14, -4
+ .endif
+ .endif
+ .endif
+.endm
+
+.macro _epilogue first=-1, last=-1, push_ip=PAC_LEAF_PUSH_IP, push_lr=0
+ .if \push_ip & 1 != \push_ip
+ .error "push_ip may be either 0 or 1"
+ .endif
+ .if \push_lr & 1 != \push_lr
+ .error "push_lr may be either 0 or 1"
+ .endif
+ .if \first != -1
+ .if \last == -1
+ /* Upper-bound not provided: Set upper = lower. */
+ _epilogue \first, \first, \push_ip, \push_lr
+ .exitm
+ .endif
+ .if \last != \first
+ .if \last >= 13
+ .error "SP cannot be in the save list"
+ .endif
+ .if \push_ip
+ .if \push_lr
+ /* Case 1: pop register range, ip and lr registers. */
+ pop {r\first-r\last, ip, lr}
+ .cfi_restore 14
+ .cfi_register 143, 12
+ cfirestorelist \first, \last
+ .else // !\push_lr
+ /* Case 2: pop register range and ip register. */
+ pop {r\first-r\last, ip}
+ .cfi_register 143, 12
+ cfirestorelist \first, \last
+ .endif
+ .else // !\push_ip
+ .if \push_lr
+ /* Case 3: pop register range and lr register. */
+ pop {r\first-r\last, lr}
+ .cfi_restore 14
+ cfirestorelist \first, \last
+ .else // !\push_lr
+ /* Case 4: pop register range. */
+ pop {r\first-r\last}
+ cfirestorelist \first, \last
+ .endif
+ .endif
+ .else // \last == \first
+ .if \push_ip
+ .if \push_lr
+ /* Case 5: pop single GP register plus ip and lr registers. */
+ pop {r\first, ip, lr}
+ .cfi_restore 14
+ .cfi_register 143, 12
+ cfirestorelist \first, \first
+ .else // !\push_lr
+ /* Case 6: pop single GP register plus ip register. */
+ pop {r\first, ip}
+ .cfi_register 143, 12
+ cfirestorelist \first, \first
+ .endif
+ .else // !\push_ip
+ .if \push_lr
+ /* Case 7: pop single GP register plus lr register. */
+ pop {r\first, lr}
+ .cfi_restore 14
+ cfirestorelist \first, \first
+ .else // !\push_lr
+ /* Case 8: pop single GP register. */
+ pop {r\first}
+ cfirestorelist \first, \first
+ .endif
+ .endif
+ .endif
+ .else // \first == -1
+ .if \push_ip
+ .if \push_lr
+ /* Case 9: pop ip and lr registers. */
+ pop {ip, lr}
+ .cfi_restore 14
+ .cfi_register 143, 12
+ .else // !\push_lr
+ /* Case 10: pop ip register. */
+ pop {ip}
+ .cfi_register 143, 12
+ .endif
+ .else // !\push_ip
+ .if \push_lr
+ /* Case 11: pop lr register. */
+ pop {lr}
+ .cfi_restore 14
+ .endif
+ .endif
+ .endif
+#if HAVE_PAC_LEAF
+ aut ip, lr, sp
+#endif /* HAVE_PAC_LEAF */
+ bx lr
+.endm
+
+# clean up expressions in 'last'
+.macro _preprocess_reglist1 first:req, last:req, push_ip:req, push_lr:req, reglist_op:req
+ .if \last == 0
+ \reglist_op \first, 0, \push_ip, \push_lr
+ .elseif \last == 1
+ \reglist_op \first, 1, \push_ip, \push_lr
+ .elseif \last == 2
+ \reglist_op \first, 2, \push_ip, \push_lr
+ .elseif \last == 3
+ \reglist_op \first, 3, \push_ip, \push_lr
+ .elseif \last == 4
+ \reglist_op \first, 4, \push_ip, \push_lr
+ .elseif \last == 5
+ \reglist_op \first, 5, \push_ip, \push_lr
+ .elseif \last == 6
+ \reglist_op \first, 6, \push_ip, \push_lr
+ .elseif \last == 7
+ \reglist_op \first, 7, \push_ip, \push_lr
+ .elseif \last == 8
+ \reglist_op \first, 8, \push_ip, \push_lr
+ .elseif \last == 9
+ \reglist_op \first, 9, \push_ip, \push_lr
+ .elseif \last == 10
+ \reglist_op \first, 10, \push_ip, \push_lr
+ .elseif \last == 11
+ \reglist_op \first, 11, \push_ip, \push_lr
+ .else
+ .error "last (\last) out of range"
+ .endif
+.endm
+
+# clean up expressions in 'first'
+.macro _preprocess_reglist first:req, last, push_ip=0, push_lr=0, reglist_op:req
+ .ifb \last
+ _preprocess_reglist \first, \first, \push_ip, \push_lr, \reglist_op
+ .else
+ .if \first > \last
+ .error "last (\last) must be at least as great as first (\first)"
+ .endif
+ .if \first == 0
+ _preprocess_reglist1 0, \last, \push_ip, \push_lr, \reglist_op
+ .elseif \first == 1
+ _preprocess_reglist1 1, \last, \push_ip, \push_lr, \reglist_op
+ .elseif \first == 2
+ _preprocess_reglist1 2, \last, \push_ip, \push_lr, \reglist_op
+ .elseif \first == 3
+ _preprocess_reglist1 3, \last, \push_ip, \push_lr, \reglist_op
+ .elseif \first == 4
+ _preprocess_reglist1 4, \last, \push_ip, \push_lr, \reglist_op
+ .elseif \first == 5
+ _preprocess_reglist1 5, \last, \push_ip, \push_lr, \reglist_op
+ .elseif \first == 6
+ _preprocess_reglist1 6, \last, \push_ip, \push_lr, \reglist_op
+ .elseif \first == 7
+ _preprocess_reglist1 7, \last, \push_ip, \push_lr, \reglist_op
+ .elseif \first == 8
+ _preprocess_reglist1 8, \last, \push_ip, \push_lr, \reglist_op
+ .elseif \first == 9
+ _preprocess_reglist1 9, \last, \push_ip, \push_lr, \reglist_op
+ .elseif \first == 10
+ _preprocess_reglist1 10, \last, \push_ip, \push_lr, \reglist_op
+ .elseif \first == 11
+ _preprocess_reglist1 11, \last, \push_ip, \push_lr, \reglist_op
+ .else
+ .error "first (\first) out of range"
+ .endif
+ .endif
+.endm
+
+.macro _align8 first, last, push_ip=0, push_lr=0, reglist_op=_prologue
+ .ifb \first
+ .ifnb \last
+ .error "can't have last (\last) without specifying first"
+ .else // \last blank
+ .if ((\push_ip + \push_lr) % 2) == 0
+ \reglist_op first=-1, last=-1, push_ip=\push_ip, push_lr=\push_lr
+ .exitm
+ .else // ((\push_ip + \push_lr) % 2) odd
+ _align8 2, 2, \push_ip, \push_lr, \reglist_op
+ .exitm
+ .endif // ((\push_ip + \push_lr) % 2) == 0
+ .endif // .ifnb \last
+ .endif // .ifb \first
+
+ .ifb \last
+ _align8 \first, \first, \push_ip, \push_lr, \reglist_op
+ .else
+ .if \push_ip & 1 <> \push_ip
+ .error "push_ip may be 0 or 1"
+ .endif
+ .if \push_lr & 1 <> \push_lr
+ .error "push_lr may be 0 or 1"
+ .endif
+ .ifeq (\last - \first + \push_ip + \push_lr) % 2
+ .if \first == 0
+ .error "Alignment required and first register is r0"
+ .exitm
+ .endif
+ _preprocess_reglist \first-1, \last, \push_ip, \push_lr, \reglist_op
+ .else
+ _preprocess_reglist \first \last, \push_ip, \push_lr, \reglist_op
+ .endif
+ .endif
+.endm
+
+.macro prologue first, last, push_ip=PAC_LEAF_PUSH_IP, push_lr=0, align8=STACK_ALIGN_ENFORCE
+ .if \align8
+ _align8 \first, \last, \push_ip, \push_lr, _prologue
+ .else
+ _prologue first=\first, last=\last, push_ip=\push_ip, push_lr=\push_lr
+ .endif
+.endm
+
+.macro epilogue first, last, push_ip=PAC_LEAF_PUSH_IP, push_lr=0, align8=STACK_ALIGN_ENFORCE
+ .if \align8
+ _align8 \first, \last, \push_ip, \push_lr, reglist_op=_epilogue
+ .else
+ _epilogue first=\first, last=\last, push_ip=\push_ip, push_lr=\push_lr
+ .endif
+.endm
+
+#endif /* __ASSEMBLER__ */
+
#endif /* ARM_ASM__H */
--
2.36.1
* [PATCH v5 2/8] newlib: libc: strcmp M-profile PACBTI-enablement
From: Victor L. Do Nascimento @ 2022-12-21 11:21 UTC (permalink / raw)
To: newlib; +Cc: Richard Earnshaw
Add function prologue/epilogue to conditionally add BTI landing pads
and/or PAC code generation & authentication instructions depending on
compilation flags.
This patch enables PACBTI for all relevant variants of strcmp:
* Newlib for armv8.1-m.main+pacbti
* Newlib for armv8.1-m.main+pacbti+mve
* Newlib-nano
---
newlib/libc/machine/arm/strcmp-arm-tiny.S | 8 +++-
newlib/libc/machine/arm/strcmp-armv7.S | 57 ++++++++++++++---------
newlib/libc/machine/arm/strcmp-armv7m.S | 26 +++++++----
3 files changed, 60 insertions(+), 31 deletions(-)
diff --git a/newlib/libc/machine/arm/strcmp-arm-tiny.S b/newlib/libc/machine/arm/strcmp-arm-tiny.S
index 607a41daf..0bd2a2e6e 100644
--- a/newlib/libc/machine/arm/strcmp-arm-tiny.S
+++ b/newlib/libc/machine/arm/strcmp-arm-tiny.S
@@ -29,10 +29,14 @@
/* Tiny version of strcmp in ARM state. Used only when optimizing
for size. Also supports Thumb-2. */
+#include "arm_asm.h"
+
.syntax unified
def_fn strcmp
+ .fnstart
.cfi_sections .debug_frame
.cfi_startproc
+ prologue
1:
ldrb r2, [r0], #1
ldrb r3, [r1], #1
@@ -42,6 +46,8 @@ def_fn strcmp
beq 1b
2:
subs r0, r2, r3
- bx lr
+ epilogue
.cfi_endproc
+ .cantunwind
+ .fnend
.size strcmp, . - strcmp
diff --git a/newlib/libc/machine/arm/strcmp-armv7.S b/newlib/libc/machine/arm/strcmp-armv7.S
index 2f93bfb73..7cafca151 100644
--- a/newlib/libc/machine/arm/strcmp-armv7.S
+++ b/newlib/libc/machine/arm/strcmp-armv7.S
@@ -45,6 +45,8 @@
.thumb
.syntax unified
+#include "arm_asm.h"
+
/* Parameters and result. */
#define src1 r0
#define src2 r1
@@ -91,8 +93,9 @@
ldrd r4, r5, [sp], #16
.cfi_restore 4
.cfi_restore 5
+ .cfi_adjust_cfa_offset -16
sub result, result, r1, lsr #24
- bx lr
+ epilogue push_ip=HAVE_PAC_LEAF
#else
/* To use the big-endian trick we'd have to reverse all three words.
that's slower than this approach. */
@@ -112,22 +115,21 @@
ldrd r4, r5, [sp], #16
.cfi_restore 4
.cfi_restore 5
+ .cfi_adjust_cfa_offset -16
sub result, result, r1
- bx lr
+ epilogue push_ip=HAVE_PAC_LEAF
#endif
.endm
+
.text
.p2align 5
-.Lstrcmp_start_addr:
-#ifndef STRCMP_NO_PRECHECK
-.Lfastpath_exit:
- sub r0, r2, r3
- bx lr
- nop
-#endif
def_fn strcmp
+ .fnstart
+ .cfi_sections .debug_frame
+ .cfi_startproc
+ prologue push_ip=HAVE_PAC_LEAF
#ifndef STRCMP_NO_PRECHECK
ldrb r2, [src1]
ldrb r3, [src2]
@@ -136,16 +138,14 @@ def_fn strcmp
cmpcs r2, r3
bne .Lfastpath_exit
#endif
- .cfi_sections .debug_frame
- .cfi_startproc
strd r4, r5, [sp, #-16]!
- .cfi_def_cfa_offset 16
- .cfi_offset 4, -16
- .cfi_offset 5, -12
+ .cfi_adjust_cfa_offset 16
+ .cfi_rel_offset 4, 0
+ .cfi_rel_offset 5, 4
orr tmp1, src1, src2
strd r6, r7, [sp, #8]
- .cfi_offset 6, -8
- .cfi_offset 7, -4
+ .cfi_rel_offset 6, 8
+ .cfi_rel_offset 7, 12
mvn const_m1, #0
lsl r2, tmp1, #29
cbz r2, .Lloop_aligned8
@@ -270,7 +270,6 @@ def_fn strcmp
ldr data1, [src1], #4
beq .Laligned_m2
bcs .Laligned_m1
-
#ifdef STRCMP_NO_PRECHECK
ldrb data2, [src2, #1]
uxtb tmp1, data1, ror #BYTE1_OFFSET
@@ -314,10 +313,19 @@ def_fn strcmp
mov result, tmp1
ldr r4, [sp], #16
.cfi_restore 4
- bx lr
+ .cfi_adjust_cfa_offset -16
+ epilogue push_ip=HAVE_PAC_LEAF
#ifndef STRCMP_NO_PRECHECK
+.Lfastpath_exit:
+ .cfi_restore_state
+ .cfi_remember_state
+ sub r0, r2, r3
+ epilogue push_ip=HAVE_PAC_LEAF
+
.Laligned_m1:
+ .cfi_restore_state
+ .cfi_remember_state
add src2, src2, #4
#endif
.Lsrc1_aligned:
@@ -364,8 +372,9 @@ def_fn strcmp
/* R6/7 Not used in this sequence. */
.cfi_restore 6
.cfi_restore 7
+ .cfi_adjust_cfa_offset -16
neg result, result
- bx lr
+ epilogue push_ip=HAVE_PAC_LEAF
6:
.cfi_restore_state
@@ -441,7 +450,8 @@ def_fn strcmp
/* R6/7 not used in this sequence. */
.cfi_restore 6
.cfi_restore 7
- bx lr
+ .cfi_adjust_cfa_offset -16
+ epilogue push_ip=HAVE_PAC_LEAF
.Lstrcmp_tail:
.cfi_restore_state
@@ -463,7 +473,10 @@ def_fn strcmp
/* R6/7 not used in this sequence. */
.cfi_restore 6
.cfi_restore 7
+ .cfi_adjust_cfa_offset -16
sub result, result, data2, lsr #24
- bx lr
+ epilogue push_ip=HAVE_PAC_LEAF
.cfi_endproc
- .size strcmp, . - .Lstrcmp_start_addr
+ .cantunwind
+ .fnend
+ .size strcmp, . - strcmp
diff --git a/newlib/libc/machine/arm/strcmp-armv7m.S b/newlib/libc/machine/arm/strcmp-armv7m.S
index cdb4912df..825b6e77f 100644
--- a/newlib/libc/machine/arm/strcmp-armv7m.S
+++ b/newlib/libc/machine/arm/strcmp-armv7m.S
@@ -29,6 +29,8 @@
/* Very similar to the generic code, but uses Thumb2 as implemented
in ARMv7-M. */
+#include "arm_asm.h"
+
/* Parameters and result. */
#define src1 r0
#define src2 r1
@@ -44,8 +46,10 @@
.thumb
.syntax unified
def_fn strcmp
+ .fnstart
.cfi_sections .debug_frame
.cfi_startproc
+ prologue push_ip=HAVE_PAC_LEAF
eor tmp1, src1, src2
tst tmp1, #3
/* Strings not at same byte offset from a word boundary. */
@@ -82,6 +86,7 @@ def_fn strcmp
ldreq data2, [src2], #4
beq 4b
2:
+ .cfi_remember_state
/* There's a zero or a different byte in the word */
S2HI result, data1, #24
S2LO data1, data1, #8
@@ -106,7 +111,7 @@ def_fn strcmp
lsrs result, result, #24
subs result, result, data2
#endif
- bx lr
+ epilogue push_ip=HAVE_PAC_LEAF
#if 0
@@ -205,8 +210,10 @@ def_fn strcmp
/* First of all, compare bytes until src1(sp1) is word-aligned. */
.Lstrcmp_unaligned:
+ .cfi_restore_state
tst src1, #3
beq 2f
+ .cfi_remember_state
ldrb data1, [src1], #1
ldrb data2, [src2], #1
cmp data1, #1
@@ -214,12 +221,13 @@ def_fn strcmp
cmpcs data1, data2
beq .Lstrcmp_unaligned
sub result, data1, data2
- bx lr
+ epilogue push_ip=HAVE_PAC_LEAF
2:
+ .cfi_restore_state
stmfd sp!, {r5}
- .cfi_def_cfa_offset 4
- .cfi_offset 5, -4
+ .cfi_adjust_cfa_offset 4
+ .cfi_rel_offset 5, 0
ldr data1, [src1], #4
and tmp2, src2, #3
@@ -355,8 +363,8 @@ def_fn strcmp
.cfi_remember_state
ldmfd sp!, {r5}
.cfi_restore 5
- .cfi_def_cfa_offset 0
- bx lr
+ .cfi_adjust_cfa_offset -4
+ epilogue push_ip=HAVE_PAC_LEAF
.Lstrcmp_tail:
.cfi_restore_state
@@ -372,7 +380,9 @@ def_fn strcmp
sub result, r2, result
ldmfd sp!, {r5}
.cfi_restore 5
- .cfi_def_cfa_offset 0
- bx lr
+ .cfi_adjust_cfa_offset -4
+ epilogue push_ip=HAVE_PAC_LEAF
.cfi_endproc
+ .cantunwind
+ .fnend
.size strcmp, . - strcmp
--
2.36.1
* [PATCH v5 3/8] newlib: libc: strlen M-profile PACBTI-enablement
From: Victor L. Do Nascimento @ 2022-12-21 11:22 UTC (permalink / raw)
To: newlib; +Cc: Richard Earnshaw
Add function prologue/epilogue to conditionally add BTI landing pads
and/or PAC code generation & authentication instructions depending on
compilation flags.
This patch enables PACBTI for all relevant variants of strlen:
* Newlib for armv8.1-m.main+pacbti
* Newlib for armv8.1-m.main+pacbti+mve
* Newlib-nano
---
newlib/libc/machine/arm/strlen-armv7.S | 17 ++++++++++++++---
newlib/libc/machine/arm/strlen-thumb2-Os.S | 14 +++++++++++---
2 files changed, 25 insertions(+), 6 deletions(-)
diff --git a/newlib/libc/machine/arm/strlen-armv7.S b/newlib/libc/machine/arm/strlen-armv7.S
index f3dda0d60..27094040c 100644
--- a/newlib/libc/machine/arm/strlen-armv7.S
+++ b/newlib/libc/machine/arm/strlen-armv7.S
@@ -59,6 +59,7 @@
OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. */
#include "acle-compat.h"
+#include "arm_asm.h"
.macro def_fn f p2align=0
.text
@@ -78,7 +79,11 @@
/* This code requires Thumb. */
#if __ARM_ARCH_PROFILE == 'M'
+#if __ARM_ARCH >= 8
+ /* keep config inherited from -march=. */
+#else
.arch armv7e-m
+#endif /* if __ARM_ARCH >= 8 */
#else
.arch armv6t2
#endif
@@ -100,8 +105,10 @@
#define tmp2 r5
def_fn strlen p2align=6
+ .fnstart
+ .cfi_startproc
+ prologue 4 5 push_ip=HAVE_PAC_LEAF
pld [srcin, #0]
- strd r4, r5, [sp, #-8]!
bic src, srcin, #7
mvn const_m1, #0
ands tmp1, srcin, #7 /* (8 - bytes) to alignment. */
@@ -151,6 +158,7 @@ def_fn strlen p2align=6
beq .Lloop_aligned
.Lnull_found:
+ .cfi_remember_state
cmp data1a, #0
itt eq
addeq result, result, #4
@@ -159,11 +167,11 @@ def_fn strlen p2align=6
rev data1a, data1a
#endif
clz data1a, data1a
- ldrd r4, r5, [sp], #8
add result, result, data1a, lsr #3 /* Bits -> Bytes. */
- bx lr
+ epilogue 4 5 push_ip=HAVE_PAC_LEAF
.Lmisaligned8:
+ .cfi_restore_state
ldrd data1a, data1b, [src]
and tmp2, tmp1, #3
rsb result, tmp1, #0
@@ -177,4 +185,7 @@ def_fn strlen p2align=6
movne data1a, const_m1
mov const_0, #0
b .Lstart_realigned
+ .cfi_endproc
+ .cantunwind
+ .fnend
.size strlen, . - strlen
diff --git a/newlib/libc/machine/arm/strlen-thumb2-Os.S b/newlib/libc/machine/arm/strlen-thumb2-Os.S
index 961f41a0a..a46db573c 100644
--- a/newlib/libc/machine/arm/strlen-thumb2-Os.S
+++ b/newlib/libc/machine/arm/strlen-thumb2-Os.S
@@ -25,6 +25,7 @@
OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. */
#include "acle-compat.h"
+#include "arm_asm.h"
.macro def_fn f p2align=0
.text
@@ -33,8 +34,9 @@
.type \f, %function
\f:
.endm
-
-#if __ARM_ARCH_ISA_THUMB >= 2 && __ARM_ARCH >= 7
+#if __ARM_ARCH_PROFILE == 'M' && __ARM_ARCH >= 8
+ /* keep config inherited from -march=. */
+#elif __ARM_ARCH_ISA_THUMB >= 2 && __ARM_ARCH >= 7
.arch armv7
#else
.arch armv6t2
@@ -44,11 +46,17 @@
.syntax unified
def_fn strlen p2align=1
+ .fnstart
+ .cfi_startproc
+ prologue
mov r3, r0
1: ldrb.w r2, [r3], #1
cmp r2, #0
bne 1b
subs r0, r3, r0
subs r0, #1
- bx lr
+ epilogue
+ .cfi_endproc
+ .cantunwind
+ .fnend
.size strlen, . - strlen
--
2.36.1
* [PATCH v5 4/8] newlib: libc: memchr M-profile PACBTI-enablement
From: Victor L. Do Nascimento @ 2022-12-21 11:24 UTC (permalink / raw)
To: newlib; +Cc: Richard Earnshaw
Add function prologue/epilogue to conditionally add BTI landing pads
and/or PAC code generation & authentication instructions depending on
compilation flags.
---
newlib/libc/machine/arm/memchr.S | 50 ++++++++++++++++++++++++++++----
1 file changed, 44 insertions(+), 6 deletions(-)
diff --git a/newlib/libc/machine/arm/memchr.S b/newlib/libc/machine/arm/memchr.S
index 1a4c6512c..3c11addad 100644
--- a/newlib/libc/machine/arm/memchr.S
+++ b/newlib/libc/machine/arm/memchr.S
@@ -76,6 +76,7 @@
.syntax unified
#include "acle-compat.h"
+#include "arm_asm.h"
@ NOTE: This ifdef MUST match the one in memchr-stub.c
#if defined (__ARM_NEON__) || defined (__ARM_NEON)
@@ -267,10 +268,14 @@ memchr:
#elif __ARM_ARCH_ISA_THUMB >= 2 && defined (__ARM_FEATURE_DSP)
#if __ARM_ARCH_PROFILE == 'M'
- .arch armv7e-m
+#if __ARM_ARCH >= 8
+ /* keep config inherited from -march=. */
#else
- .arch armv6t2
-#endif
+ .arch armv7e-m
+#endif /* __ARM_ARCH >= 8 */
+#else
+ .arch armv6t2
+#endif /* __ARM_ARCH_PROFILE == 'M' */
@ this lets us check a flag in a 00/ff byte easily in either endianness
#ifdef __ARMEB__
@@ -287,11 +292,14 @@ memchr:
.p2align 4,,15
.global memchr
.type memchr,%function
+ .fnstart
+ .cfi_startproc
memchr:
@ r0 = start of memory to scan
@ r1 = character to look for
@ r2 = length
@ returns r0 = pointer to character or NULL if not found
+ prologue
and r1,r1,#0xff @ Don't trust the caller to pass a char
cmp r2,#16 @ If short don't bother with anything clever
@@ -313,6 +321,11 @@ memchr:
10:
@ We are aligned, we know we have at least 8 bytes to work with
push {r4,r5,r6,r7}
+ .cfi_adjust_cfa_offset 16
+ .cfi_rel_offset 4, 0
+ .cfi_rel_offset 5, 4
+ .cfi_rel_offset 6, 8
+ .cfi_rel_offset 7, 12
orr r1, r1, r1, lsl #8 @ expand the match word across all bytes
orr r1, r1, r1, lsl #16
bic r4, r2, #7 @ Number of double words to work with * 8
@@ -334,6 +347,11 @@ memchr:
bne 15b @ (Flags from the subs above)
pop {r4,r5,r6,r7}
+ .cfi_restore 7
+ .cfi_restore 6
+ .cfi_restore 5
+ .cfi_restore 4
+ .cfi_adjust_cfa_offset -16
and r1,r1,#0xff @ r1 back to a single character
and r2,r2,#7 @ Leave the count remaining as the number
@ after the double words have been done
@@ -349,17 +367,29 @@ memchr:
bne 21b @ on r2 flags
40:
+ .cfi_remember_state
movs r0,#0 @ not found
- bx lr
+ epilogue
50:
+ .cfi_restore_state
+ .cfi_remember_state
subs r0,r0,#1 @ found
- bx lr
+ epilogue
60: @ We're here because the fast path found a hit
@ now we have to track down exactly which word it was
@ r0 points to the start of the double word after the one tested
@ r5 has the 00/ff pattern for the first word, r6 has the chained value
+ @ This point is reached from cbnz midway through label 15 prior to
+ @ popping r4-r7 off the stack. .cfi_restore_state alone disregards
+ @ this, so we manually correct this.
+ .cfi_restore_state @ Standard post-prologue state
+ .cfi_adjust_cfa_offset 16
+ .cfi_rel_offset 4, 0
+ .cfi_rel_offset 5, 4
+ .cfi_rel_offset 6, 8
+ .cfi_rel_offset 7, 12
cmp r5, #0
itte eq
moveq r5, r6 @ the end is in the 2nd word
@@ -379,8 +409,16 @@ memchr:
61:
pop {r4,r5,r6,r7}
+ .cfi_restore 7
+ .cfi_restore 6
+ .cfi_restore 5
+ .cfi_restore 4
+ .cfi_adjust_cfa_offset -16
subs r0,r0,#1
- bx lr
+ epilogue
+ .cfi_endproc
+ .cantunwind
+ .fnend
#else
/* Defined in memchr-stub.c. */
#endif
--
2.36.1
* [PATCH v5 5/8] newlib: libc: memcpy M-profile PACBTI-enablement
2022-12-21 11:03 [PATCH v5 0/8] Implement assembly cortex-M PACBTI functionality Victor Do Nascimento
` (3 preceding siblings ...)
2022-12-21 11:24 ` [PATCH v5 4/8] newlib: libc: memchr " Victor L. Do Nascimento
@ 2022-12-21 11:25 ` Victor L. Do Nascimento
2022-12-21 11:27 ` [PATCH v5 6/8] newlib: libc: aeabi_memmove " Victor L. Do Nascimento
` (2 subsequent siblings)
7 siblings, 0 replies; 15+ messages in thread
From: Victor L. Do Nascimento @ 2022-12-21 11:25 UTC (permalink / raw)
To: newlib; +Cc: Richard Earnshaw
Add function prologue/epilogue to conditionally add BTI landing pads
and/or PAC code generation & authentication instructions depending on
compilation flags.
---
newlib/libc/machine/arm/memcpy-armv7m.S | 33 ++++++++++++++++++-------
1 file changed, 24 insertions(+), 9 deletions(-)
diff --git a/newlib/libc/machine/arm/memcpy-armv7m.S b/newlib/libc/machine/arm/memcpy-armv7m.S
index c8bff36f6..ec1ad6485 100644
--- a/newlib/libc/machine/arm/memcpy-armv7m.S
+++ b/newlib/libc/machine/arm/memcpy-armv7m.S
@@ -46,6 +46,8 @@
__OPT_BIG_BLOCK_SIZE: Size of big block in words. Default to 64.
__OPT_MID_BLOCK_SIZE: Size of big block in words. Default to 16.
*/
+#include "arm_asm.h"
+
#ifndef __OPT_BIG_BLOCK_SIZE
#define __OPT_BIG_BLOCK_SIZE (4 * 16)
#endif
@@ -85,6 +87,8 @@
.global memcpy
.thumb
.thumb_func
+ .fnstart
+ .cfi_startproc
.type memcpy, %function
memcpy:
@ r0: dst
@@ -93,10 +97,11 @@ memcpy:
#ifdef __ARM_FEATURE_UNALIGNED
/* In case of UNALIGNED access supported, ip is not used in
function body. */
+ prologue push_ip=HAVE_PAC_LEAF
mov ip, r0
#else
- push {r0}
-#endif
+ prologue 0 push_ip=HAVE_PAC_LEAF
+#endif /* __ARM_FEATURE_UNALIGNED */
orr r3, r1, r0
ands r3, r3, #3
bne .Lmisaligned_copy
@@ -178,15 +183,17 @@ memcpy:
#endif /* __ARM_FEATURE_UNALIGNED */
.Ldone:
+ .cfi_remember_state
#ifdef __ARM_FEATURE_UNALIGNED
mov r0, ip
+ epilogue push_ip=HAVE_PAC_LEAF
#else
- pop {r0}
-#endif
- bx lr
+ epilogue 0 push_ip=HAVE_PAC_LEAF
+#endif /* __ARM_FEATURE_UNALIGNED */
.align 2
.Lmisaligned_copy:
+ .cfi_restore_state
#ifdef __ARM_FEATURE_UNALIGNED
/* Define label DST_ALIGNED to BIG_BLOCK. It will go to aligned copy
once destination is adjusted to aligned. */
@@ -247,6 +254,9 @@ memcpy:
/* dst is aligned, but src isn't. Misaligned copy. */
push {r4, r5}
+ .cfi_adjust_cfa_offset 8
+ .cfi_rel_offset 4, 0
+ .cfi_rel_offset 5, 4
subs r2, #4
/* Backward r1 by misaligned bytes, to make r1 aligned.
@@ -299,6 +309,9 @@ memcpy:
adds r2, #4
subs r1, ip
pop {r4, r5}
+ .cfi_restore 4
+ .cfi_restore 5
+ .cfi_adjust_cfa_offset -8
#endif /* __ARM_FEATURE_UNALIGNED */
@@ -321,9 +334,11 @@ memcpy:
#ifdef __ARM_FEATURE_UNALIGNED
mov r0, ip
+ epilogue push_ip=HAVE_PAC_LEAF
#else
- pop {r0}
-#endif
- bx lr
-
+ epilogue 0 push_ip=HAVE_PAC_LEAF
+#endif /* __ARM_FEATURE_UNALIGNED */
+ .cfi_endproc
+ .cantunwind
+ .fnend
.size memcpy, .-memcpy
--
2.36.1
* [PATCH v5 6/8] newlib: libc: aeabi_memmove M-profile PACBTI-enablement
2022-12-21 11:03 [PATCH v5 0/8] Implement assembly cortex-M PACBTI functionality Victor Do Nascimento
` (4 preceding siblings ...)
2022-12-21 11:25 ` [PATCH v5 5/8] newlib: libc: memcpy " Victor L. Do Nascimento
@ 2022-12-21 11:27 ` Victor L. Do Nascimento
2022-12-21 11:28 ` [PATCH v5 7/8] newlib: libc: aeabi_memset " Victor L. Do Nascimento
2022-12-21 11:42 ` [PATCH v5 8/8] newlib: libc: setjmp " Victor L. Do Nascimento
7 siblings, 0 replies; 15+ messages in thread
From: Victor L. Do Nascimento @ 2022-12-21 11:27 UTC (permalink / raw)
To: newlib; +Cc: Richard Earnshaw
Add function prologue/epilogue to conditionally add BTI landing pads
and/or PAC code generation & authentication instructions depending on
compilation flags.
---
newlib/libc/machine/arm/aeabi_memmove-thumb2.S | 17 ++++++++++++-----
1 file changed, 12 insertions(+), 5 deletions(-)
diff --git a/newlib/libc/machine/arm/aeabi_memmove-thumb2.S b/newlib/libc/machine/arm/aeabi_memmove-thumb2.S
index e9504437b..20ca993e5 100644
--- a/newlib/libc/machine/arm/aeabi_memmove-thumb2.S
+++ b/newlib/libc/machine/arm/aeabi_memmove-thumb2.S
@@ -26,6 +26,8 @@
* SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
*/
+#include "arm_asm.h"
+
.thumb
.syntax unified
.global __aeabi_memmove
@@ -33,8 +35,10 @@
ASM_ALIAS __aeabi_memmove4 __aeabi_memmove
ASM_ALIAS __aeabi_memmove8 __aeabi_memmove
__aeabi_memmove:
+ .fnstart
+ .cfi_startproc
+ prologue 4
cmp r0, r1
- push {r4}
bls 3f
adds r3, r1, r2
cmp r0, r3
@@ -48,9 +52,10 @@ __aeabi_memmove:
strb r4, [r1, #-1]!
bne 1b
2:
- pop {r4}
- bx lr
+ .cfi_remember_state
+ epilogue 4
3:
+ .cfi_restore_state
cmp r2, #0
beq 2b
add r2, r2, r1
@@ -60,6 +65,8 @@ __aeabi_memmove:
cmp r2, r1
strb r4, [r3, #1]!
bne 4b
- pop {r4}
- bx lr
+ epilogue 4
+ .cfi_endproc
+ .cantunwind
+ .fnend
.size __aeabi_memmove, . - __aeabi_memmove
--
2.36.1
* [PATCH v5 7/8] newlib: libc: aeabi_memset M-profile PACBTI-enablement
2022-12-21 11:03 [PATCH v5 0/8] Implement assembly cortex-M PACBTI functionality Victor Do Nascimento
` (5 preceding siblings ...)
2022-12-21 11:27 ` [PATCH v5 6/8] newlib: libc: aeabi_memmove " Victor L. Do Nascimento
@ 2022-12-21 11:28 ` Victor L. Do Nascimento
2022-12-21 11:42 ` [PATCH v5 8/8] newlib: libc: setjmp " Victor L. Do Nascimento
7 siblings, 0 replies; 15+ messages in thread
From: Victor L. Do Nascimento @ 2022-12-21 11:28 UTC (permalink / raw)
To: newlib; +Cc: Richard Earnshaw
Add function prologue/epilogue to conditionally add BTI landing pads
and/or PAC code generation & authentication instructions depending on
compilation flags.
---
newlib/libc/machine/arm/aeabi_memset-thumb2.S | 14 +++++++++++---
1 file changed, 11 insertions(+), 3 deletions(-)
diff --git a/newlib/libc/machine/arm/aeabi_memset-thumb2.S b/newlib/libc/machine/arm/aeabi_memset-thumb2.S
index eaca1d8d7..6b77d3820 100644
--- a/newlib/libc/machine/arm/aeabi_memset-thumb2.S
+++ b/newlib/libc/machine/arm/aeabi_memset-thumb2.S
@@ -26,14 +26,18 @@
* SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
*/
+#include "arm_asm.h"
+
.thumb
.syntax unified
.global __aeabi_memset
.type __aeabi_memset, %function
+ .fnstart
+ .cfi_startproc
ASM_ALIAS __aeabi_memset4 __aeabi_memset
ASM_ALIAS __aeabi_memset8 __aeabi_memset
__aeabi_memset:
- push {r4, r5, r6}
+ prologue 4 6
lsls r4, r0, #30
beq 10f
subs r4, r1, #1
@@ -98,10 +102,14 @@ __aeabi_memset:
cmp r3, r4
bne 8b
9:
- pop {r4, r5, r6}
- bx lr
+ .cfi_remember_state
+ epilogue 4 6
10:
+ .cfi_restore_state
mov r4, r1
mov r3, r0
b 3b
+ .cfi_endproc
+ .cantunwind
+ .fnend
.size __aeabi_memset, . - __aeabi_memset
--
2.36.1
* [PATCH v5 8/8] newlib: libc: setjmp M-profile PACBTI-enablement
2022-12-21 11:03 [PATCH v5 0/8] Implement assembly cortex-M PACBTI functionality Victor Do Nascimento
` (6 preceding siblings ...)
2022-12-21 11:28 ` [PATCH v5 7/8] newlib: libc: aeabi_memset " Victor L. Do Nascimento
@ 2022-12-21 11:42 ` Victor L. Do Nascimento
2023-01-05 16:53 ` Richard Earnshaw
7 siblings, 1 reply; 15+ messages in thread
From: Victor L. Do Nascimento @ 2022-12-21 11:42 UTC (permalink / raw)
To: newlib; +Cc: Richard Earnshaw
Add function prologue/epilogue to conditionally add BTI landing pads
and/or PAC code generation & authentication instructions depending on
compilation flags.
---
newlib/libc/machine/arm/setjmp.S | 39 ++++++++++++++++++++++++++++++++
1 file changed, 39 insertions(+)
diff --git a/newlib/libc/machine/arm/setjmp.S b/newlib/libc/machine/arm/setjmp.S
index d814afea8..3e4d7cb70 100644
--- a/newlib/libc/machine/arm/setjmp.S
+++ b/newlib/libc/machine/arm/setjmp.S
@@ -155,6 +155,8 @@ SYM (.arm_start_of.\name):
.align 2
MODE
.globl SYM (\name)
+ .fnstart
+ .cfi_startproc
TYPE (\name)
SYM (\name):
PROLOGUE \name
@@ -162,6 +164,8 @@ SYM (\name):
.macro FUNC_END name
RET
+ .cfi_endproc
+ .fnend
SIZE (\name)
.endm
@@ -171,6 +175,21 @@ SYM (\name):
FUNC_START setjmp
+#if __ARM_FEATURE_PAC_DEFAULT
+# if __ARM_FEATURE_BTI_DEFAULT
+ pacbti ip, lr, sp
+# else
+ pac ip, lr, sp
+# endif /* __ARM_FEATURE_BTI_DEFAULT */
+ mov r3, ip
+ str r3, [r0, #104]
+ .cfi_register 143, 12
+#else
+# if __ARM_FEATURE_BTI_DEFAULT
+ bti
+# endif /* __ARM_FEATURE_BTI_DEFAULT */
+#endif /* __ARM_FEATURE_PAC_DEFAULT */
+
/* Save all the callee-preserved registers into the jump buffer. */
#ifdef __thumb2__
mov ip, sp
@@ -184,6 +203,10 @@ SYM (\name):
/* When setting up the jump buffer return 0. */
mov r0, #0
+#if __ARM_FEATURE_PAC_DEFAULT
+ mov ip, r3
+ aut ip, lr, sp
+#endif /* __ARM_FEATURE_PAC_DEFAULT */
FUNC_END setjmp
@@ -193,6 +216,16 @@ SYM (\name):
FUNC_START longjmp
+#if __ARM_FEATURE_BTI_DEFAULT
+ bti
+#endif /* __ARM_FEATURE_BTI_DEFAULT */
+
+#if __ARM_FEATURE_PAC_DEFAULT
+ /* Keep original jmpbuf address for retrieving pac-code
+ for authentication. */
+ mov r2, r0
+#endif /* __ARM_FEATURE_PAC_DEFAULT */
+
/* If we have stack extension code it ought to be handled here. */
/* Restore the registers, retrieving the state when setjmp() was called. */
@@ -212,5 +245,11 @@ SYM (\name):
it eq
moveq r0, #1
+#if __ARM_FEATURE_PAC_DEFAULT
+ ldr r3, [r2, #104]
+ mov ip, r3
+ aut ip, lr, sp
+#endif /* __ARM_FEATURE_PAC_DEFAULT */
+
FUNC_END longjmp
#endif
--
2.36.1
* Re: [PATCH v5 8/8] newlib: libc: setjmp M-profile PACBTI-enablement
2022-12-21 11:42 ` [PATCH v5 8/8] newlib: libc: setjmp " Victor L. Do Nascimento
@ 2023-01-05 16:53 ` Richard Earnshaw
0 siblings, 0 replies; 15+ messages in thread
From: Richard Earnshaw @ 2023-01-05 16:53 UTC (permalink / raw)
To: Victor L. Do Nascimento, newlib; +Cc: Richard Earnshaw
On 21/12/2022 11:42, Victor L. Do Nascimento wrote:
> Add function prologue/epilogue to conditionally add BTI landing pads
> and/or PAC code generation & authentication instructions depending on
> compilation flags.
> ---
> newlib/libc/machine/arm/setjmp.S | 39 ++++++++++++++++++++++++++++++++
> 1 file changed, 39 insertions(+)
>
> diff --git a/newlib/libc/machine/arm/setjmp.S b/newlib/libc/machine/arm/setjmp.S
> index d814afea8..3e4d7cb70 100644
> --- a/newlib/libc/machine/arm/setjmp.S
> +++ b/newlib/libc/machine/arm/setjmp.S
> @@ -155,6 +155,8 @@ SYM (.arm_start_of.\name):
> .align 2
> MODE
> .globl SYM (\name)
> + .fnstart
> + .cfi_startproc
> TYPE (\name)
> SYM (\name):
> PROLOGUE \name
> @@ -162,6 +164,8 @@ SYM (\name):
>
> .macro FUNC_END name
> RET
> + .cfi_endproc
> + .fnend
> SIZE (\name)
> .endm
>
> @@ -171,6 +175,21 @@ SYM (\name):
>
> FUNC_START setjmp
>
> +#if __ARM_FEATURE_PAC_DEFAULT
> +# if __ARM_FEATURE_BTI_DEFAULT
> + pacbti ip, lr, sp
> +# else
> + pac ip, lr, sp
> +# endif /* __ARM_FEATURE_BTI_DEFAULT */
> + mov r3, ip
> + str r3, [r0, #104]
#104 here is a bit obscure. I think it would be clearer to write
something like
str r3, [r0, #(CORE_REGS_SAVE_SIZE + FP_REGS_SAVE_SIZE)]
and then define these as appropriate.
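A minimal sketch of how such definitions might look, written as plain C preprocessor constants so the arithmetic can be checked on any host. The macro names CORE_REGS_SAVE_SIZE and FP_REGS_SAVE_SIZE come from the comment above; the register counts (ten 4-byte core registers followed by eight 8-byte FP registers) are an assumption about the jump buffer layout that happens to add up to 104, and PAC_CODE_OFFSET and pac_code_offset() are hypothetical names used only for illustration.

```c
#include <assert.h>

/* Assumed jmp_buf layout (not confirmed by the patch itself):
   r4-r10, fp, sp, lr -> 10 core registers of 4 bytes each.  */
#define CORE_REGS_SAVE_SIZE (10 * 4)
/* Assumed: d8-d15 -> 8 double-precision registers of 8 bytes each.  */
#define FP_REGS_SAVE_SIZE (8 * 8)
/* Hypothetical name: offset of the slot holding the PAC code,
   placed directly after both save areas.  */
#define PAC_CODE_OFFSET (CORE_REGS_SAVE_SIZE + FP_REGS_SAVE_SIZE)

/* Fail the build if the named constants drift away from the literal
   currently used in the patch.  */
static_assert(PAC_CODE_OFFSET == 104, "PAC slot offset changed");

/* Return the byte offset a store such as `str r3, [r0, #...]' would use.  */
int pac_code_offset(void)
{
    return PAC_CODE_OFFSET;
}
```

With definitions along these lines, both the store in setjmp and the load in longjmp could name the same expression instead of repeating the literal.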
> + .cfi_register 143, 12
> +#else
> +# if __ARM_FEATURE_BTI_DEFAULT
> + bti
> +# endif /* __ARM_FEATURE_BTI_DEFAULT */
> +#endif /* __ARM_FEATURE_PAC_DEFAULT */
> +
> /* Save all the callee-preserved registers into the jump buffer. */
> #ifdef __thumb2__
> mov ip, sp
> @@ -184,6 +203,10 @@ SYM (\name):
>
> /* When setting up the jump buffer return 0. */
> mov r0, #0
> +#if __ARM_FEATURE_PAC_DEFAULT
> + mov ip, r3
> + aut ip, lr, sp
> +#endif /* __ARM_FEATURE_PAC_DEFAULT */
>
> FUNC_END setjmp
>
> @@ -193,6 +216,16 @@ SYM (\name):
>
> FUNC_START longjmp
>
> +#if __ARM_FEATURE_BTI_DEFAULT
> + bti
> +#endif /* __ARM_FEATURE_BTI_DEFAULT */
> +
> +#if __ARM_FEATURE_PAC_DEFAULT
> + /* Keep original jmpbuf address for retrieving pac-code
> + for authentication. */
> + mov r2, r0
> +#endif /* __ARM_FEATURE_PAC_DEFAULT */
> +
> /* If we have stack extension code it ought to be handled here. */
>
> /* Restore the registers, retrieving the state when setjmp() was called. */
> @@ -212,5 +245,11 @@ SYM (\name):
> it eq
> moveq r0, #1
>
> +#if __ARM_FEATURE_PAC_DEFAULT
> + ldr r3, [r2, #104]
> + mov ip, r3
See above. Also, you don't need to load into r3 and then move to IP,
just load ip directly.
> + aut ip, lr, sp
> +#endif /* __ARM_FEATURE_PAC_DEFAULT */
> +
> FUNC_END longjmp
> #endif
R.
* Re: [PATCH v5 1/8] newlib: libc: define M-profile PACBTI-enablement macros
2022-12-21 11:19 ` [PATCH v5 1/8] newlib: libc: define M-profile PACBTI-enablement macros Victor L. Do Nascimento
@ 2023-01-06 10:42 ` Christophe Lyon
2023-01-06 20:51 ` Victor Do Nascimento
2023-01-09 9:33 ` Christophe Lyon
0 siblings, 2 replies; 15+ messages in thread
From: Christophe Lyon @ 2023-01-06 10:42 UTC (permalink / raw)
To: Victor L. Do Nascimento, newlib; +Cc: Richard Earnshaw
Hi Victor,
Thanks for the patch series, a few comments/questions below.
Christophe
On 12/21/22 12:19, Victor L. Do Nascimento wrote:
> Augment the arm_asm.h header file to simplify function prologues and
> epilogues whilst adding support for PACBTI enablement via macros for
> hand-written assembly functions. For PACBTI, both prologues/epilogues
> as well as cfi-related directives are automatically amended
> accordingly, depending on the compile-time mbranch-protection argument
> values.
>
> It defines the following preprocessor macros:
> * HAVE_PAC_LEAF: Indicates whether pac-signing has been requested for
> leaf functions.
> * PAC_LEAF_PUSH_IP: Whether leaf functions should push the pac code
> to the stack irrespective of whether the ip register is clobbered in
> the function or not.
> * STACK_ALIGN_ENFORCE: Whether a dummy register should be added to
> the push list as necessary in the prologue to ensure stack
> alignment preservation at the start of assembly function. The
> epilogue behavior is likewise affected by this flag, ensuring any
> pushed dummy registers also get popped on function return.
IIUC, these new macros are meant for general use outside of newlib. If
so, do they need proper documentation, or maybe an entry in the "News"
section? I don't know. Otherwise, I think they should not appear in the
user namespace.
> It also defines the following assembler macros:
> * prologue: In addition to pushing any callee-saved registers onto
> the stack, it generates any requested pacbti instructions.
> Pushed registers are specified via the optional `first', `last',
> `push_ip' and `push_lr' macro argument parameters.
Maybe you should quote 'first' and 'last' differently from 'push_ip' and
'push_lr', since the example below shows that 'first' and 'last' are in
fact register numbers (IIUC)
> when a single register number is provided, it pushes that
Typo: "When" (with a capital)
> register. When two register numbers are provided, they specify a
> rage to save. If push_ip and/or push_lr are non-zero, the
Typo: "range"
> respective registers are also saved. Stack alignment is requested
> via the `align` argument, which defaults to the value of
> STACK_ALIGN_ENFORCE, unless manually overridden.
>
> For example:
>
> prologue push_ip=1 -> push {ip}
> prologue push_ip=1, align8=1 -> push {r2, ip}
> prologue push_ip=1, push_lr=1 -> push {ip, lr}
> prologue 1 -> push {r1}
> prologue 1, align8=1 -> push {r0, r1}
> prologue 1 push_ip=1 -> push {r1, ip}
> prologue 1 4 -> push {r1-r4}
> prologue 1 4 push_ip=1 -> push {r1-r4, ip}
Can you include an example with pacbti?
> * epilogue: pops registers off the stack and emits pac key signing
> instruction, if requested. The `first', `last', `push_ip',
> `push_lr' and `align' function as per the prologue macro,
> generating pop instead of push instructions.
>
> Stack alignment is enforced via the following helper macro
> call-chain:
>
> {prologue|epilogue} ->_align8 -> _preprocess_reglist ->
> _preprocess_reglist1 -> {_prologue|_epilogue}
>
> Finally, the necessary cfi directives for adding debug information
> to prologue and epilogue are generated via the following macros:
>
> * cfisavelist - prologue macro helper function, generating
> necessary .cfi_offset directives associated with push instruction.
> Therefore, the net effect of calling `prologue 1 2 push_ip=1' is
> to generate the following:
>
> push {r1-r2, ip}
> .cfi_adjust_cfa_offset 12
> .cfi_offset 143, -4
> .cfi_offset 2, -8
> .cfi_offset 1, -12
>
> * cfirestorelist - epilogue macro helper function, emitting
> .cfi_restore instructions prior to resetting the cfa offset. As
> such, calling `epilogue 1 2 push_ip=1' will produce:
>
> pop {r1-r2, ip}
> .cfi_register 143, 12
> .cfi_restore 2
> .cfi_restore 1
> .cfi_def_cfa_offset 0
> ---
> newlib/libc/machine/arm/arm_asm.h | 441 ++++++++++++++++++++++++++++++
> 1 file changed, 441 insertions(+)
>
> diff --git a/newlib/libc/machine/arm/arm_asm.h b/newlib/libc/machine/arm/arm_asm.h
> index 2708057de..94fa77b4d 100644
> --- a/newlib/libc/machine/arm/arm_asm.h
> +++ b/newlib/libc/machine/arm/arm_asm.h
> @@ -60,4 +60,445 @@
> # define _ISA_THUMB_1
> #endif
>
> +/* Check whether leaf function PAC signing has been requested in the
> + -mbranch-protect compile-time option. */
> +#define LEAF_PROTECT_BIT 2
Shouldn't this start with '__' or be #undef'd at the end of this file to
avoid polluting the user namespace? (I noticed it's not used outside this
file in this patch series.)
> +
> +#ifdef __ARM_FEATURE_PAC_DEFAULT
> +# define HAVE_PAC_LEAF \
> + ((__ARM_FEATURE_PAC_DEFAULT & (1 << LEAF_PROTECT_BIT)) && 1)
> +#else
> +# define HAVE_PAC_LEAF 0
> +#endif
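To make the bit test above concrete, here is a small host-side sketch of the same expression. It takes the value of __ARM_FEATURE_PAC_DEFAULT as a function argument so it can be tried without an Armv8.1-M toolchain. The function name have_pac_leaf() is hypothetical, and the sample inputs in the usage note are illustrative bit patterns rather than values confirmed for any particular -mbranch-protection setting.

```c
#include <assert.h>

#define LEAF_PROTECT_BIT 2

/* The mask selected by the leaf bit; bit 2 corresponds to the value 4.  */
static_assert((1u << LEAF_PROTECT_BIT) == 4u, "leaf bit mask");

/* Mirror of the HAVE_PAC_LEAF expression, taking the value of
   __ARM_FEATURE_PAC_DEFAULT as an argument.  The `&& 1' collapses
   the masked value to exactly 0 or 1.  */
int have_pac_leaf(unsigned pac_default)
{
    return (pac_default & (1u << LEAF_PROTECT_BIT)) && 1;
}
```

For instance, have_pac_leaf(5u) yields 1 because bit 2 is set, while have_pac_leaf(1u) yields 0. Normalising to 0 or 1 matters because the result is later passed as push_ip, which the macros reject unless it is exactly 0 or 1.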
> +
> +/* Provide default parameters for PAC-code handling in leaf-functions. */
> +#if HAVE_PAC_LEAF
> +# ifndef PAC_LEAF_PUSH_IP
> +# define PAC_LEAF_PUSH_IP 1
> +# endif
> +#else /* !HAVE_PAC_LEAF */
> +# undef PAC_LEAF_PUSH_IP
> +# define PAC_LEAF_PUSH_IP 0
> +#endif /* HAVE_PAC_LEAF */
> +
> +#define STACK_ALIGN_ENFORCE 0
> +
> +#ifdef __ASSEMBLER__
> +
> +/******************************************************************************
> +* Implementation of the prologue and epilogue assembler macros and their
> +* associated helper functions.
> +*
> +* These functions add support for the following:
> +*
> +* - M-profile branch target identification (BTI) landing-pads when compiled
> +* with `-mbranch-protection=bti'.
> +* - PAC-signing and verification instructions, depending on hardware support
> +* and whether the PAC-signing of leaf functions has been requested via the
> +* `-mbranch-protection=pac-ret+leaf' compiler argument.
> +* - 8-byte stack alignment preservation at function entry, defaulting to the
> +* value of STACK_ALIGN_ENFORCE.
> +*
> +* Notes:
> +* - Prologue stack alignment is implemented by detecting a push with an odd
> +* number of registers and prepending a dummy register to the list.
> +* - If alignment is attempted on a list containing r0, compilation will result
> +* in an error.
> +* - If alignment is attempted in a list containing r1, r0 will be prepended to
> +* the register list and r0 will be restored prior to function return. for
> +* functions with non-void return types, this will result in the corruption of
> +* the result register.
> +* - Stack alignment is enforced via the following helper macro call-chain:
> +*
> +* {prologue|epilogue} ->_align8 -> _preprocess_reglist ->
> +* _preprocess_reglist1 -> {_prologue|_epilogue}
> +*
> +* - Debug CFI directives are automatically added to prologues and epilogues,
> +* assisted by `cfisavelist' and `cfirestorelist', respectively.
> +*
> +* Arguments:
> +* prologue
> +* --------
> +* - first - If `last' specified, this serves as start of general-purpose
> +* register (GPR) range to push onto stack, otherwise represents
> +* single GPR to push onto stack. If omitted, no GPRs pushed
> +* onto stack at prologue.
> +* - last - If given, specifies inclusive upper-bound of GPR range.
> +* - push_ip - Determines whether IP register is to be pushed to stack at
> +* prologue. When pac-signing is requested, this holds the
> +* the pac-key. Either 1 or 0 to push or not push, respectively.
> +* Default behavior: Set to value of PAC_LEAF_PUSH_IP macro.
> +* - push_lr - Determines whether to push lr to the stack on function entry.
> +* Either 1 or 0 to push or not push, respectively.
> +* - align8 - Whether to enforce alignment. Either 1 or 0, with 1 requesting
> +* alignment.
> +*
> +* epilogue
> +* --------
> +* The epilogue should be called passing the same arguments as those passed to
> +* the prologue to ensure the stack is not corrupted on function return.
> +*
> +* Usage examples:
> +*
> +* prologue push_ip=1 -> push {ip}
> +* epilogue push_ip=1, align8=1 -> pop {r2, ip}
> +* prologue push_ip=1, push_lr=1 -> push {ip, lr}
> +* epilogue 1 -> pop {r1}
> +* prologue 1, align8=1 -> push {r0, r1}
> +* epilogue 1, push_ip=1 -> pop {r1, ip}
> +* prologue 1, 4 -> push {r1-r4}
> +* epilogue 1, 4 push_ip=1 -> pop {r1-r4, ip}
> +*
> +******************************************************************************/
> +
> +/* Emit .cfi_restore directives for a consecutive sequence of registers. */
> + .macro cfirestorelist first, last
> + .cfi_restore \last
> + .if \last-\first
> + cfirestorelist \first, \last-1
> + .endif
> + .endm
> +
> +/* Emit .cfi_offset directives for a consecutive sequence of registers. */
> + .macro cfisavelist first, last, index=1
> + .cfi_offset \last, -4*(\index)
> + .if \last-\first
> + cfisavelist \first, \last-1, \index+1
> + .endif
> + .endm
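The recursion in cfisavelist can be modelled on the host to check which offsets it assigns. The sketch below is an illustration only, not part of the patch: it records the CFA-relative offset each register would be given, using a hypothetical offsets[] array indexed by register number.

```c
#include <assert.h>

/* Host-side model of the recursive `cfisavelist' macro: walking from
   `last' down to `first', each register gets the offset -4*index, and
   index grows by one per step.  offsets[] is indexed by register number
   and is a hypothetical stand-in for the emitted .cfi_offset lines.  */
void cfisavelist(int first, int last, int index, int offsets[16])
{
    assert(0 <= first && first <= last && last < 16);
    offsets[last] = -4 * index;
    if (last - first)
        cfisavelist(first, last - 1, index + 1, offsets);
}
```

Calling cfisavelist(1, 2, 2, offsets) models the `prologue 1 2 push_ip=1' case from the commit message: ip takes the slot at -4, so the list starts at index 2 and r2 and r1 land at -8 and -12.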
> +
> +.macro _prologue first=-1, last=-1, push_ip=PAC_LEAF_PUSH_IP, push_lr=0
> + .if \push_ip & 1 != \push_ip
> + .error "push_ip may be either 0 or 1"
> + .endif
> + .if \push_lr & 1 != \push_lr
> + .error "push_lr may be either 0 or 1"
> + .endif
> + .if \first != -1
> + .if \last == -1
> + /* Upper-bound not provided: Set upper = lower. */
> + _prologue \first, \first, \push_ip, \push_lr
> + .exitm
> + .endif
> + .endif
> +#if HAVE_PAC_LEAF
> +#if __ARM_FEATURE_BTI_DEFAULT
> + pacbti ip, lr, sp
> +#else
> + pac ip, lr, sp
> +#endif /* __ARM_FEATURE_BTI_DEFAULT */
> + .cfi_register 143, 12
> +#else
> +#if __ARM_FEATURE_BTI_DEFAULT
> + bti
> +#endif /* __ARM_FEATURE_BTI_DEFAULT */
> +#endif /* HAVE_PAC_LEAF */
> + .if \first != -1
> + .if \last != \first
> + .if \last >= 13
> + .error "SP cannot be in the save list"
> + .endif
I think you should also check that IP (r12) is not in the range;
otherwise I think nothing prevents someone from writing
prologue 12, push_ip=1, which would result in emitting push {r12, ip}
(I suppose gas would complain?)
... scratch that, I saw later that this sanity check is performed in
_preprocess_reglist1 :-)
> + .if \push_ip
> + .if \push_lr
> + /* Case 1: push register range, ip and lr registers. */
> + push {r\first-r\last, ip, lr}
> + .cfi_adjust_cfa_offset ((\last-\first)+3)*4
> + .cfi_offset 14, -4
> + .cfi_offset 143, -8
> + cfisavelist \first, \last, 3
> + .else // !\push_lr
> + /* Case 2: push register range and ip register. */
> + push {r\first-r\last, ip}
> + .cfi_adjust_cfa_offset ((\last-\first)+2)*4
> + .cfi_offset 143, -4
> + cfisavelist \first, \last, 2
> + .endif
> + .else // !\push_ip
> + .if \push_lr
> + /* Case 3: push register range and lr register. */
> + push {r\first-r\last, lr}
> + .cfi_adjust_cfa_offset ((\last-\first)+2)*4
> + .cfi_offset 14, -4
> + cfisavelist \first, \last, 2
> + .else // !\push_lr
> + /* Case 4: push register range. */
> + push {r\first-r\last}
> + .cfi_adjust_cfa_offset ((\last-\first)+1)*4
> + cfisavelist \first, \last, 1
> + .endif
> + .endif
> + .else // \last == \first
> + .if \push_ip
> + .if \push_lr
> + /* Case 5: push single GP register plus ip and lr registers. */
> + push {r\first, ip, lr}
> + .cfi_adjust_cfa_offset 12
> + .cfi_offset 14, -4
> + .cfi_offset 143, -8
> + cfisavelist \first, \first, 3
> + .else // !\push_lr
> + /* Case 6: push single GP register plus ip register. */
> + push {r\first, ip}
> + .cfi_adjust_cfa_offset 8
> + .cfi_offset 143, -4
> + cfisavelist \first, \first, 2
> + .endif
> + .else // !\push_ip
> + .if \push_lr
> + /* Case 7: push single GP register plus lr register. */
> + push {r\first, lr}
> + .cfi_adjust_cfa_offset 8
> + .cfi_offset 14, -4
> + cfisavelist \first, \first, 2
> + .else // !\push_lr
> + /* Case 8: push single GP register. */
> + push {r\first}
> + .cfi_adjust_cfa_offset 4
> + cfisavelist \first, \first, 1
> + .endif
> + .endif
> + .endif
> + .else // \first == -1
> + .if \push_ip
> + .if \push_lr
> + /* Case 9: push ip and lr registers. */
> + push {ip, lr}
> + .cfi_adjust_cfa_offset 8
> + .cfi_offset 14, -4
> + .cfi_offset 143, -8
> + .else // !\push_lr
> + /* Case 10: push ip register. */
> + push {ip}
> + .cfi_adjust_cfa_offset 4
> + .cfi_offset 143, -4
> + .endif
> + .else // !\push_ip
> + .if \push_lr
> + /* Case 11: push lr register. */
> + push {lr}
> + .cfi_adjust_cfa_offset 4
> + .cfi_offset 14, -4
> + .endif
> + .endif
> + .endif
> +.endm
> +
> +.macro _epilogue first=-1, last=-1, push_ip=PAC_LEAF_PUSH_IP, push_lr=0
> + .if \push_ip & 1 != \push_ip
> + .error "push_ip may be either 0 or 1"
> + .endif
> + .if \push_lr & 1 != \push_lr
> + .error "push_lr may be either 0 or 1"
> + .endif
> + .if \first != -1
> + .if \last == -1
> + /* Upper-bound not provided: Set upper = lower. */
> + _epilogue \first, \first, \push_ip, \push_lr
> + .exitm
> + .endif
> + .if \last != \first
> + .if \last >= 13
> + .error "SP cannot be in the save list"
> + .endif
> + .if \push_ip
> + .if \push_lr
> + /* Case 1: pop register range, ip and lr registers. */
> + pop {r\first-r\last, ip, lr}
> + .cfi_restore 14
> + .cfi_register 143, 12
> + cfirestorelist \first, \last
> + .else // !\push_lr
> + /* Case 2: pop register range and ip register. */
> + pop {r\first-r\last, ip}
> + .cfi_register 143, 12
> + cfirestorelist \first, \last
> + .endif
> + .else // !\push_ip
> + .if \push_lr
> + /* Case 3: pop register range and lr register. */
> + pop {r\first-r\last, lr}
> + .cfi_restore 14
> + cfirestorelist \first, \last
> + .else // !\push_lr
> + /* Case 4: pop register range. */
> + pop {r\first-r\last}
> + cfirestorelist \first, \last
> + .endif
> + .endif
> + .else // \last == \first
> + .if \push_ip
> + .if \push_lr
> + /* Case 5: pop single GP register plus ip and lr registers. */
> + pop {r\first, ip, lr}
> + .cfi_restore 14
> + .cfi_register 143, 12
> + cfirestorelist \first, \first
> + .else // !\push_lr
> + /* Case 6: pop single GP register plus ip register. */
> + pop {r\first, ip}
> + .cfi_register 143, 12
> + cfirestorelist \first, \first
> + .endif
> + .else // !\push_ip
> + .if \push_lr
> + /* Case 7: pop single GP register plus lr register. */
> + pop {r\first, lr}
> + .cfi_restore 14
> + cfirestorelist \first, \first
> + .else // !\push_lr
> + /* Case 8: pop single GP register. */
> + pop {r\first}
> + cfirestorelist \first, \first
> + .endif
> + .endif
> + .endif
> + .else // \first == -1
> + .if \push_ip
> + .if \push_lr
> + /* Case 9: pop ip and lr registers. */
> + pop {ip, lr}
> + .cfi_restore 14
> + .cfi_register 143, 12
> + .else // !\push_lr
> + /* Case 10: pop ip register. */
> + pop {ip}
> + .cfi_register 143, 12
> + .endif
> + .else // !\push_ip
> + .if \push_lr
> + /* Case 11: pop lr register. */
> + pop {lr}
> + .cfi_restore 14
> + .endif
> + .endif
> + .endif
> +#if HAVE_PAC_LEAF
> + aut ip, lr, sp
> +#endif /* HAVE_PAC_LEAF */
> + bx lr
> +.endm
> +
> +# clean up expressions in 'last'
> +.macro _preprocess_reglist1 first:req, last:req, push_ip:req, push_lr:req, reglist_op:req
> + .if \last == 0
> + \reglist_op \first, 0, \push_ip, \push_lr
> + .elseif \last == 1
> + \reglist_op \first, 1, \push_ip, \push_lr
> + .elseif \last == 2
> + \reglist_op \first, 2, \push_ip, \push_lr
> + .elseif \last == 3
> + \reglist_op \first, 3, \push_ip, \push_lr
> + .elseif \last == 4
> + \reglist_op \first, 4, \push_ip, \push_lr
> + .elseif \last == 5
> + \reglist_op \first, 5, \push_ip, \push_lr
> + .elseif \last == 6
> + \reglist_op \first, 6, \push_ip, \push_lr
> + .elseif \last == 7
> + \reglist_op \first, 7, \push_ip, \push_lr
> + .elseif \last == 8
> + \reglist_op \first, 8, \push_ip, \push_lr
> + .elseif \last == 9
> + \reglist_op \first, 9, \push_ip, \push_lr
> + .elseif \last == 10
> + \reglist_op \first, 10, \push_ip, \push_lr
> + .elseif \last == 11
> + \reglist_op \first, 11, \push_ip, \push_lr
> + .else
> + .error "last (\last) out of range"
> + .endif
> +.endm
> +
> +# clean up expressions in 'first'
> +.macro _preprocess_reglist first:req, last, push_ip=0, push_lr=0, reglist_op:req
> + .ifb \last
> + _preprocess_reglist \first \first \push_ip \push_lr
> + .else
> + .if \first > \last
> + .error "last (\last) must be at least as great as first (\first)"
> + .endif
> + .if \first == 0
> + _preprocess_reglist1 0, \last, \push_ip, \push_lr, \reglist_op
> + .elseif \first == 1
> + _preprocess_reglist1 1, \last, \push_ip, \push_lr, \reglist_op
> + .elseif \first == 2
> + _preprocess_reglist1 2, \last, \push_ip, \push_lr, \reglist_op
> + .elseif \first == 3
> + _preprocess_reglist1 3, \last, \push_ip, \push_lr, \reglist_op
> + .elseif \first == 4
> + _preprocess_reglist1 4, \last, \push_ip, \push_lr, \reglist_op
> + .elseif \first == 5
> + _preprocess_reglist1 5, \last, \push_ip, \push_lr, \reglist_op
> + .elseif \first == 6
> + _preprocess_reglist1 6, \last, \push_ip, \push_lr, \reglist_op
> + .elseif \first == 7
> + _preprocess_reglist1 7, \last, \push_ip, \push_lr, \reglist_op
> + .elseif \first == 8
> + _preprocess_reglist1 8, \last, \push_ip, \push_lr, \reglist_op
> + .elseif \first == 9
> + _preprocess_reglist1 9, \last, \push_ip, \push_lr, \reglist_op
> + .elseif \first == 10
> + _preprocess_reglist1 10, \last, \push_ip, \push_lr, \reglist_op
> + .elseif \first == 11
> + _preprocess_reglist1 11, \last, \push_ip, \push_lr, \reglist_op
> + .else
> + .error "first (\first) out of range"
> + .endif
> + .endif
> +.endm
> +
> +.macro _align8 first, last, push_ip=0, push_lr=0, reglist_op=_prologue
> + .ifb \first
> + .ifnb \last
> + .error "can't have last (\last) without specifying first"
> + .else // \last not blank
> + .if ((\push_ip + \push_lr) % 2) == 0
> + \reglist_op first=-1, last=-1, push_ip=\push_ip, push_lr=\push_lr
> + .exitm
> + .else // ((\push_ip + \push_lr) % 2) odd
> + _align8 2, 2, \push_ip, \push_lr, \reglist_op
> + .exitm
> + .endif // ((\push_ip + \push_lr) % 2) == 0
> + .endif // .ifnb \last
> + .endif // .ifb \first
> +
> + .ifb \last
> + _align8 \first, \first, \push_ip, \push_lr, \reglist_op
> + .else
> + .if \push_ip & 1 <> \push_ip
> + .error "push_ip may be 0 or 1"
> + .endif
> + .if \push_lr & 1 <> \push_lr
> + .error "push_lr may be 0 or 1"
> + .endif
> + .ifeq (\last - \first + \push_ip + \push_lr) % 2
> + .if \first == 0
> + .error "Alignment required and first register is r0"
> + .exitm
> + .endif
> + _preprocess_reglist \first-1, \last, \push_ip, \push_lr, \reglist_op
> + .else
> + _preprocess_reglist \first \last, \push_ip, \push_lr, \reglist_op
> + .endif
> + .endif
> +.endm
> +
> +.macro prologue first, last, push_ip=PAC_LEAF_PUSH_IP, push_lr=0, align8=STACK_ALIGN_ENFORCE
> + .if \align8
> + _align8 \first, \last, \push_ip, \push_lr, _prologue
> + .else
> + _prologue first=\first, last=\last, push_ip=\push_ip, push_lr=\push_lr
> + .endif
> +.endm
> +
> +.macro epilogue first, last, push_ip=PAC_LEAF_PUSH_IP, push_lr=0, align8=STACK_ALIGN_ENFORCE
> + .if \align8
> + _align8 \first, \last, \push_ip, \push_lr, reglist_op=_epilogue
> + .else
> + _epilogue first=\first, last=\last, push_ip=\push_ip, push_lr=\push_lr
> + .endif
> +.endm
> +
> +#endif /* __ASSEMBLER__ */
> +
> #endif /* ARM_ASM__H */
* Re: [PATCH v5 2/8] newlib: libc: strcmp M-profile PACBTI-enablement
2022-12-21 11:21 ` [PATCH v5 2/8] newlib: libc: strcmp M-profile PACBTI-enablement Victor L. Do Nascimento
@ 2023-01-06 11:09 ` Christophe Lyon
2023-01-06 21:35 ` Victor Do Nascimento
0 siblings, 1 reply; 15+ messages in thread
From: Christophe Lyon @ 2023-01-06 11:09 UTC (permalink / raw)
To: Victor L. Do Nascimento, newlib; +Cc: Richard Earnshaw
On 12/21/22 12:21, Victor L. Do Nascimento wrote:
> Add function prologue/epilogue to conditionally add BTI landing pads
> and/or PAC code generation & authentication instructions depending on
> compilation flags.
>
> This patch enables PACBTI for all relevant variants of strcmp:
> * Newlib for armv8.1-m.main+pacbti
> * Newlib for armv8.1-m.main+pacbti+mve
> * Newlib-nano
> ---
> newlib/libc/machine/arm/strcmp-arm-tiny.S | 8 +++-
> newlib/libc/machine/arm/strcmp-armv7.S | 57 ++++++++++++++---------
> newlib/libc/machine/arm/strcmp-armv7m.S | 26 +++++++----
> 3 files changed, 60 insertions(+), 31 deletions(-)
>
> diff --git a/newlib/libc/machine/arm/strcmp-arm-tiny.S b/newlib/libc/machine/arm/strcmp-arm-tiny.S
> index 607a41daf..0bd2a2e6e 100644
> --- a/newlib/libc/machine/arm/strcmp-arm-tiny.S
> +++ b/newlib/libc/machine/arm/strcmp-arm-tiny.S
> @@ -29,10 +29,14 @@
> /* Tiny version of strcmp in ARM state. Used only when optimizing
> for size. Also supports Thumb-2. */
>
> +#include "arm_asm.h"
> +
> .syntax unified
> def_fn strcmp
> + .fnstart
> .cfi_sections .debug_frame
> .cfi_startproc
> + prologue
Why no push_ip=HAVE_PAC_LEAF?
Is that because this is a tiny version and we don't want to use an extra
push ip even if pacbti is enabled?
> 1:
> ldrb r2, [r0], #1
> ldrb r3, [r1], #1
> @@ -42,6 +46,8 @@ def_fn strcmp
> beq 1b
> 2:
> subs r0, r2, r3
> - bx lr
> + epilogue
> .cfi_endproc
> + .cantunwind
> + .fnend
> .size strcmp, . - strcmp
> diff --git a/newlib/libc/machine/arm/strcmp-armv7.S b/newlib/libc/machine/arm/strcmp-armv7.S
> index 2f93bfb73..7cafca151 100644
> --- a/newlib/libc/machine/arm/strcmp-armv7.S
> +++ b/newlib/libc/machine/arm/strcmp-armv7.S
> @@ -45,6 +45,8 @@
> .thumb
> .syntax unified
>
> +#include "arm_asm.h"
> +
> /* Parameters and result. */
> #define src1 r0
> #define src2 r1
> @@ -91,8 +93,9 @@
> ldrd r4, r5, [sp], #16
> .cfi_restore 4
> .cfi_restore 5
> + .cfi_adjust_cfa_offset -16
> sub result, result, r1, lsr #24
> - bx lr
> + epilogue push_ip=HAVE_PAC_LEAF
> #else
> /* To use the big-endian trick we'd have to reverse all three words.
> that's slower than this approach. */
> @@ -112,22 +115,21 @@
> ldrd r4, r5, [sp], #16
> .cfi_restore 4
> .cfi_restore 5
> + .cfi_adjust_cfa_offset -16
> sub result, result, r1
>
> - bx lr
> + epilogue push_ip=HAVE_PAC_LEAF
> #endif
> .endm
>
> +
> .text
> .p2align 5
> -.Lstrcmp_start_addr:
> -#ifndef STRCMP_NO_PRECHECK
> -.Lfastpath_exit:
> - sub r0, r2, r3
> - bx lr
> - nop
> -#endif
> def_fn strcmp
> + .fnstart
> + .cfi_sections .debug_frame
> + .cfi_startproc
> + prologue push_ip=HAVE_PAC_LEAF
> #ifndef STRCMP_NO_PRECHECK
> ldrb r2, [src1]
> ldrb r3, [src2]
> @@ -136,16 +138,14 @@ def_fn strcmp
> cmpcs r2, r3
> bne .Lfastpath_exit
> #endif
> - .cfi_sections .debug_frame
> - .cfi_startproc
> strd r4, r5, [sp, #-16]!
> - .cfi_def_cfa_offset 16
> - .cfi_offset 4, -16
> - .cfi_offset 5, -12
> + .cfi_adjust_cfa_offset 16
> + .cfi_rel_offset 4, 0
> + .cfi_rel_offset 5, 4
> orr tmp1, src1, src2
> strd r6, r7, [sp, #8]
> - .cfi_offset 6, -8
> - .cfi_offset 7, -4
> + .cfi_rel_offset 6, 8
> + .cfi_rel_offset 7, 12
> mvn const_m1, #0
> lsl r2, tmp1, #29
> cbz r2, .Lloop_aligned8
> @@ -270,7 +270,6 @@ def_fn strcmp
> ldr data1, [src1], #4
> beq .Laligned_m2
> bcs .Laligned_m1
> -
> #ifdef STRCMP_NO_PRECHECK
> ldrb data2, [src2, #1]
> uxtb tmp1, data1, ror #BYTE1_OFFSET
> @@ -314,10 +313,19 @@ def_fn strcmp
> mov result, tmp1
> ldr r4, [sp], #16
> .cfi_restore 4
> - bx lr
> + .cfi_adjust_cfa_offset -16
> + epilogue push_ip=HAVE_PAC_LEAF
>
> #ifndef STRCMP_NO_PRECHECK
> +.Lfastpath_exit:
> + .cfi_restore_state
> + .cfi_remember_state
> + sub r0, r2, r3
> + epilogue push_ip=HAVE_PAC_LEAF
> +
> .Laligned_m1:
> + .cfi_restore_state
> + .cfi_remember_state
> add src2, src2, #4
> #endif
> .Lsrc1_aligned:
> @@ -364,8 +372,9 @@ def_fn strcmp
> /* R6/7 Not used in this sequence. */
> .cfi_restore 6
> .cfi_restore 7
> + .cfi_adjust_cfa_offset -16
> neg result, result
> - bx lr
> + epilogue push_ip=HAVE_PAC_LEAF
>
> 6:
> .cfi_restore_state
> @@ -441,7 +450,8 @@ def_fn strcmp
> /* R6/7 not used in this sequence. */
> .cfi_restore 6
> .cfi_restore 7
> - bx lr
> + .cfi_adjust_cfa_offset -16
> + epilogue push_ip=HAVE_PAC_LEAF
>
> .Lstrcmp_tail:
> .cfi_restore_state
> @@ -463,7 +473,10 @@ def_fn strcmp
> /* R6/7 not used in this sequence. */
> .cfi_restore 6
> .cfi_restore 7
> + .cfi_adjust_cfa_offset -16
> sub result, result, data2, lsr #24
> - bx lr
> + epilogue push_ip=HAVE_PAC_LEAF
> .cfi_endproc
> - .size strcmp, . - .Lstrcmp_start_addr
> + .cantunwind
> + .fnend
> + .size strcmp, . - strcmp
> diff --git a/newlib/libc/machine/arm/strcmp-armv7m.S b/newlib/libc/machine/arm/strcmp-armv7m.S
> index cdb4912df..825b6e77f 100644
> --- a/newlib/libc/machine/arm/strcmp-armv7m.S
> +++ b/newlib/libc/machine/arm/strcmp-armv7m.S
> @@ -29,6 +29,8 @@
> /* Very similar to the generic code, but uses Thumb2 as implemented
> in ARMv7-M. */
>
> +#include "arm_asm.h"
> +
> /* Parameters and result. */
> #define src1 r0
> #define src2 r1
> @@ -44,8 +46,10 @@
> .thumb
> .syntax unified
> def_fn strcmp
> + .fnstart
> .cfi_sections .debug_frame
> .cfi_startproc
> + prologue push_ip=HAVE_PAC_LEAF
> eor tmp1, src1, src2
> tst tmp1, #3
> /* Strings not at same byte offset from a word boundary. */
> @@ -82,6 +86,7 @@ def_fn strcmp
> ldreq data2, [src2], #4
> beq 4b
> 2:
> + .cfi_remember_state
> /* There's a zero or a different byte in the word */
> S2HI result, data1, #24
> S2LO data1, data1, #8
> @@ -106,7 +111,7 @@ def_fn strcmp
> lsrs result, result, #24
> subs result, result, data2
> #endif
> - bx lr
> + epilogue push_ip=HAVE_PAC_LEAF
>
>
> #if 0
> @@ -205,8 +210,10 @@ def_fn strcmp
>
> /* First of all, compare bytes until src1(sp1) is word-aligned. */
> .Lstrcmp_unaligned:
> + .cfi_restore_state
> tst src1, #3
> beq 2f
> + .cfi_remember_state
> ldrb data1, [src1], #1
> ldrb data2, [src2], #1
> cmp data1, #1
> @@ -214,12 +221,13 @@ def_fn strcmp
> cmpcs data1, data2
> beq .Lstrcmp_unaligned
> sub result, data1, data2
> - bx lr
> + epilogue push_ip=HAVE_PAC_LEAF
>
> 2:
> + .cfi_restore_state
> stmfd sp!, {r5}
> - .cfi_def_cfa_offset 4
> - .cfi_offset 5, -4
> + .cfi_adjust_cfa_offset 4
> + .cfi_rel_offset 5, 0
>
> ldr data1, [src1], #4
> and tmp2, src2, #3
> @@ -355,8 +363,8 @@ def_fn strcmp
> .cfi_remember_state
> ldmfd sp!, {r5}
> .cfi_restore 5
> - .cfi_def_cfa_offset 0
> - bx lr
> + .cfi_adjust_cfa_offset -4
> + epilogue push_ip=HAVE_PAC_LEAF
>
> .Lstrcmp_tail:
> .cfi_restore_state
> @@ -372,7 +380,9 @@ def_fn strcmp
> sub result, r2, result
> ldmfd sp!, {r5}
> .cfi_restore 5
> - .cfi_def_cfa_offset 0
> - bx lr
> + .cfi_adjust_cfa_offset -4
> + epilogue push_ip=HAVE_PAC_LEAF
> .cfi_endproc
> + .cantunwind
> + .fnend
> .size strcmp, . - strcmp
* Re: [PATCH v5 1/8] newlib: libc: define M-profile PACBTI-enablement macros
2023-01-06 10:42 ` Christophe Lyon
@ 2023-01-06 20:51 ` Victor Do Nascimento
2023-01-09 9:33 ` Christophe Lyon
1 sibling, 0 replies; 15+ messages in thread
From: Victor Do Nascimento @ 2023-01-06 20:51 UTC (permalink / raw)
To: Christophe Lyon, newlib; +Cc: Richard Earnshaw
On 1/6/23 10:42, Christophe Lyon wrote:
> Hi Victor,
>
> Thanks for the patch series, a few comments/questions below.
>
> Christophe
>
>
> On 12/21/22 12:19, Victor L. Do Nascimento wrote:
>> Augment the arm_asm.h header file to simplify function prologues and
>> epilogues whilst adding support for PACBTI enablement via macros for
>> hand-written assembly functions. For PACBTI, both prologues/epilogues
>> as well as cfi-related directives are automatically amended
>> accordingly, depending on the compile-time mbranch-protection argument
>> values.
>>
>> It defines the following preprocessor macros:
>> * HAVE_PAC_LEAF: Indicates whether pac-signing has been requested for
>> leaf functions.
>> * PAC_LEAF_PUSH_IP: Whether leaf functions should push the pac code
>> to the stack irrespective of whether the ip register is clobbered in
>> the function or not.
>> * STACK_ALIGN_ENFORCE: Whether a dummy register should be added to
>> the push list as necessary in the prologue to ensure stack
>> alignment preservation at the start of assembly function. The
>> epilogue behavior is likewise affected by this flag, ensuring any
>> pushed dummy registers also get popped on function return.
> IIUC, these new macros are meant for general usage outside of newlib, do
> they need proper documentation? Or maybe an entry in the "News" section?
> I don't know. Otherwise, I think they should not appear in the user
> naming space.
My initial rationale for not prefixing the names with __ was that I
believe this header is private to newlib (it won't be exported to
users), so the names need no __ prefix.
If you're interested, this was discussed in the second posted iteration
of this patch.
I think users only concern themselves with headers under
"newlib/libc/include". I may well be wrong though, so please feel free
to correct me here!
That's not to say these macros wouldn't have use outside of Newlib, but
their initial purpose was merely to standardize how the newlib assembly
routines would respond to the presence of PACBTI-related architectural
features.
>> It also defines the following assembler macros:
>> * prologue: In addition to pushing any callee-saved registers onto
>> the stack, it generates any requested pacbti instructions.
>> Pushed registers are specified via the optional `first', `last',
>> `push_ip' and `push_lr' macro argument parameters.
> Maybe you should quote 'first' and 'last' differently from 'push_ip' and
> 'push_lr', since the example below shows that 'first' and 'last' are in
> fact register numbers (IIUC)
IIUC, your misgivings over my quoting of `first' and `last' stem from
the fact that in my examples they are not given as named parameters.
If so, it's worth noting that assembler macro parameters may be passed
either by name or by position.
Therefore, where I give the example `prologue 1 4', I could just as
easily have written `prologue first=1 last=4' to the same effect. I
avoided doing so purely for the sake of brevity.
I recognize this does make things a little confusing though...
>> when a single register number is provided, it pushes that
> Typo: "When" (with a capital)
>
>> register. When two register numbers are provided, they specify a
>> rage to save. If push_ip and/or push_lr are non-zero, the
> Typo: "range"
>
>> respective registers are also saved. Stack alignment is requested
>> via the `align` argument, which defaults to the value of
>> STACK_ALIGN_ENFORCE, unless manually overridden.
>>
>> For example:
>>
>> prologue push_ip=1 -> push {ip}
>> prologue push_ip=1, align8=1 -> push {r2, ip}
>> prologue push_ip=1, push_lr=1 -> push {ip, lr}
>> prologue 1 -> push {r1}
>> prologue 1, align8=1 -> push {r0, r1}
>> prologue 1 push_ip=1 -> push {r1, ip}
>> prologue 1 4 -> push {r1-r4}
>> prologue 1 4 push_ip=1 -> push {r1-r4, ip}
> can you include an example with pacbti?
Will do!
Thanks,
Victor
>
>> * epilogue: pops registers off the stack and emits pac key signing
>> instruction, if requested. The `first', `last', `push_ip',
>> `push_lr' and `align' function as per the prologue macro,
>> generating pop instead of push instructions.
>>
>> Stack alignment is enforced via the following helper macro
>> call-chain:
>>
>> {prologue|epilogue} ->_align8 -> _preprocess_reglist ->
>> _preprocess_reglist1 -> {_prologue|_epilogue}
>>
>> Finally, the necessary cfi directives for adding debug information
>> to prologue and epilogue are generated via the following macros:
>>
>> * cfisavelist - prologue macro helper function, generating
>> necessary .cfi_offset directives associated with push instruction.
>> Therefore, the net effect of calling `prologue 1 2 push_ip=1' is
>> to generate the following:
>>
>> push {r1-r2, ip}
>> .cfi_adjust_cfa_offset 12
>> .cfi_offset 143, -4
>> .cfi_offset 2, -8
>> .cfi_offset 1, -12
>>
>> * cfirestorelist - epilogue macro helper function, emitting
>> .cfi_restore instructions prior to resetting the cfa offset. As
>> such, calling `epilogue 1 2 push_ip=1' will produce:
>>
>> pop {r1-r2, ip}
>> .cfi_register 143, 12
>> .cfi_restore 2
>> .cfi_restore 1
>> .cfi_def_cfa_offset 0
>> ---
>> newlib/libc/machine/arm/arm_asm.h | 441 ++++++++++++++++++++++++++++++
>> 1 file changed, 441 insertions(+)
>>
>> diff --git a/newlib/libc/machine/arm/arm_asm.h
>> b/newlib/libc/machine/arm/arm_asm.h
>> index 2708057de..94fa77b4d 100644
>> --- a/newlib/libc/machine/arm/arm_asm.h
>> +++ b/newlib/libc/machine/arm/arm_asm.h
>> @@ -60,4 +60,445 @@
>> # define _ISA_THUMB_1
>> #endif
>> +/* Check whether leaf function PAC signing has been requested in the
>> + -mbranch-protect compile-time option. */
>> +#define LEAF_PROTECT_BIT 2
> Shouldn't this start with '__' or be #undefed at the end of this file to
> avoid polluting user naming space? (I noticed it's not used outside this
> file in this patch series)
>
>> +
>> +#ifdef __ARM_FEATURE_PAC_DEFAULT
>> +# define HAVE_PAC_LEAF \
>> + ((__ARM_FEATURE_PAC_DEFAULT & (1 << LEAF_PROTECT_BIT)) && 1)
>> +#else
>> +# define HAVE_PAC_LEAF 0
>> +#endif
>> +
>> +/* Provide default parameters for PAC-code handling in
>> leaf-functions. */
>> +#if HAVE_PAC_LEAF
>> +# ifndef PAC_LEAF_PUSH_IP
>> +# define PAC_LEAF_PUSH_IP 1
>> +# endif
>> +#else /* !HAVE_PAC_LEAF */
>> +# undef PAC_LEAF_PUSH_IP
>> +# define PAC_LEAF_PUSH_IP 0
>> +#endif /* HAVE_PAC_LEAF */
>> +
>> +#define STACK_ALIGN_ENFORCE 0
>> +
>> +#ifdef __ASSEMBLER__
>> +
>> +/******************************************************************************
>> +* Implementation of the prologue and epilogue assembler macros and their
>> +* associated helper functions.
>> +*
>> +* These functions add support for the following:
>> +*
>> +* - M-profile branch target identification (BTI) landing-pads when
>> compiled
>> +* with `-mbranch-protection=bti'.
>> +* - PAC-signing and verification instructions, depending on hardware
>> support
>> +* and whether the PAC-signing of leaf functions has been requested
>> via the
>> +* `-mbranch-protection=pac-ret+leaf' compiler argument.
>> +* - 8-byte stack alignment preservation at function entry, defaulting
>> to the
>> +* value of STACK_ALIGN_ENFORCE.
>> +*
>> +* Notes:
>> +* - Prologue stack alignment is implemented by detecting a push with
>> an odd
>> +* number of registers and prepending a dummy register to the list.
>> +* - If alignment is attempted on a list containing r0, compilation
>> will result
>> +* in an error.
>> +* - If alignment is attempted in a list containing r1, r0 will be
>> prepended to
>> +* the register list and r0 will be restored prior to function
>> return. for
>> +* functions with non-void return types, this will result in the
>> corruption of
>> +* the result register.
>> +* - Stack alignment is enforced via the following helper macro
>> call-chain:
>> +*
>> +* {prologue|epilogue} ->_align8 -> _preprocess_reglist ->
>> +* _preprocess_reglist1 -> {_prologue|_epilogue}
>> +*
>> +* - Debug CFI directives are automatically added to prologues and
>> epilogues,
>> +* assisted by `cfisavelist' and `cfirestorelist', respectively.
>> +*
>> +* Arguments:
>> +* prologue
>> +* --------
>> +* - first - If `last' specified, this serves as start of
>> general-purpose
>> +* register (GPR) range to push onto stack, otherwise represents
>> +* single GPR to push onto stack. If omitted, no GPRs pushed
>> +* onto stack at prologue.
>> +* - last - If given, specifies inclusive upper-bound of GPR range.
>> +* - push_ip - Determines whether IP register is to be pushed to
>> stack at
>> +* prologue. When pac-signing is requested, this holds the
>> +* the pac-key. Either 1 or 0 to push or not push,
>> respectively.
>> +* Default behavior: Set to value of PAC_LEAF_PUSH_IP macro.
>> +* - push_lr - Determines whether to push lr to the stack on
>> function entry.
>> +* Either 1 or 0 to push or not push, respectively.
>> +* - align8 - Whether to enforce alignment. Either 1 or 0, with 1
>> requesting
>> +* alignment.
>> +*
>> +* epilogue
>> +* --------
>> +* The epilogue should be called passing the same arguments as those
>> passed to
>> +* the prologue to ensure the stack is not corrupted on function
>> return.
>> +*
>> +* Usage examples:
>> +*
>> +* prologue push_ip=1 -> push {ip}
>> +* epilogue push_ip=1, align8=1 -> pop {r2, ip}
>> +* prologue push_ip=1, push_lr=1 -> push {ip, lr}
>> +* epilogue 1 -> pop {r1}
>> +* prologue 1, align8=1 -> push {r0, r1}
>> +* epilogue 1, push_ip=1 -> pop {r1, ip}
>> +* prologue 1, 4 -> push {r1-r4}
>> +* epilogue 1, 4 push_ip=1 -> pop {r1-r4, ip}
>> +*
>> +******************************************************************************/
>> +
>> +/* Emit .cfi_restore directives for a consecutive sequence of
>> registers. */
>> + .macro cfirestorelist first, last
>> + .cfi_restore \last
>> + .if \last-\first
>> + cfirestorelist \first, \last-1
>> + .endif
>> + .endm
>> +
>> +/* Emit .cfi_offset directives for a consecutive sequence of
>> registers. */
>> + .macro cfisavelist first, last, index=1
>> + .cfi_offset \last, -4*(\index)
>> + .if \last-\first
>> + cfisavelist \first, \last-1, \index+1
>> + .endif
>> + .endm
>> +
>> +.macro _prologue first=-1, last=-1, push_ip=PAC_LEAF_PUSH_IP, push_lr=0
>> + .if \push_ip & 1 != \push_ip
>> + .error "push_ip may be either 0 or 1"
>> + .endif
>> + .if \push_lr & 1 != \push_lr
>> + .error "push_lr may be either 0 or 1"
>> + .endif
>> + .if \first != -1
>> + .if \last == -1
>> + /* Upper-bound not provided: Set upper = lower. */
>> + _prologue \first, \first, \push_ip, \push_lr
>> + .exitm
>> + .endif
>> + .endif
>> +#if HAVE_PAC_LEAF
>> +#if __ARM_FEATURE_BTI_DEFAULT
>> + pacbti ip, lr, sp
>> +#else
>> + pac ip, lr, sp
>> +#endif /* __ARM_FEATURE_BTI_DEFAULT */
>> + .cfi_register 143, 12
>> +#else
>> +#if __ARM_FEATURE_BTI_DEFAULT
>> + bti
>> +#endif /* __ARM_FEATURE_BTI_DEFAULT */
>> +#endif /* HAVE_PAC_LEAF */
>> + .if \first != -1
>> + .if \last != \first
>> + .if \last >= 13
>> + .error "SP cannot be in the save list"
>> + .endif
> I think you should also check that IP (r12) is not in the range,
> otherwise I think nothing prevents from doing
> prologue 12, push_ip=1 which will result in emitting push {r12, ip}
> (I suppose gas would complain?)
> .... scratch that, I saw later that this sanity checking is performed in
> _preprocess_reglist1 :-)
>
>
>> + .if \push_ip
>> + .if \push_lr
>> + /* Case 1: push register range, ip and lr registers. */
>> + push {r\first-r\last, ip, lr}
>> + .cfi_adjust_cfa_offset ((\last-\first)+3)*4
>> + .cfi_offset 14, -4
>> + .cfi_offset 143, -8
>> + cfisavelist \first, \last, 3
>> + .else // !\push_lr
>> + /* Case 2: push register range and ip register. */
>> + push {r\first-r\last, ip}
>> + .cfi_adjust_cfa_offset ((\last-\first)+2)*4
>> + .cfi_offset 143, -4
>> + cfisavelist \first, \last, 2
>> + .endif
>> + .else // !\push_ip
>> + .if \push_lr
>> + /* Case 3: push register range and lr register. */
>> + push {r\first-r\last, lr}
>> + .cfi_adjust_cfa_offset ((\last-\first)+2)*4
>> + .cfi_offset 14, -4
>> + cfisavelist \first, \last, 2
>> + .else // !\push_lr
>> + /* Case 4: push register range. */
>> + push {r\first-r\last}
>> + .cfi_adjust_cfa_offset ((\last-\first)+1)*4
>> + cfisavelist \first, \last, 1
>> + .endif
>> + .endif
>> + .else // \last == \first
>> + .if \push_ip
>> + .if \push_lr
>> + /* Case 5: push single GP register plus ip and lr registers. */
>> + push {r\first, ip, lr}
>> + .cfi_adjust_cfa_offset 12
>> + .cfi_offset 14, -4
>> + .cfi_offset 143, -8
>> + cfisavelist \first, \first, 3
>> + .else // !\push_lr
>> + /* Case 6: push single GP register plus ip register. */
>> + push {r\first, ip}
>> + .cfi_adjust_cfa_offset 8
>> + .cfi_offset 143, -4
>> + cfisavelist \first, \first, 2
>> + .endif
>> + .else // !\push_ip
>> + .if \push_lr
>> + /* Case 7: push single GP register plus lr register. */
>> + push {r\first, lr}
>> + .cfi_adjust_cfa_offset 8
>> + .cfi_offset 14, -4
>> + cfisavelist \first, \first, 2
>> + .else // !\push_lr
>> + /* Case 8: push single GP register. */
>> + push {r\first}
>> + .cfi_adjust_cfa_offset 4
>> + cfisavelist \first, \first, 1
>> + .endif
>> + .endif
>> + .endif
>> + .else // \first == -1
>> + .if \push_ip
>> + .if \push_lr
>> + /* Case 9: push ip and lr registers. */
>> + push {ip, lr}
>> + .cfi_adjust_cfa_offset 8
>> + .cfi_offset 14, -4
>> + .cfi_offset 143, -8
>> + .else // !\push_lr
>> + /* Case 10: push ip register. */
>> + push {ip}
>> + .cfi_adjust_cfa_offset 4
>> + .cfi_offset 143, -4
>> + .endif
>> + .else // !\push_ip
>> + .if \push_lr
>> + /* Case 11: push lr register. */
>> + push {lr}
>> + .cfi_adjust_cfa_offset 4
>> + .cfi_offset 14, -4
>> + .endif
>> + .endif
>> + .endif
>> +.endm
>> +
>> +.macro _epilogue first=-1, last=-1, push_ip=PAC_LEAF_PUSH_IP, push_lr=0
>> + .if \push_ip & 1 != \push_ip
>> + .error "push_ip may be either 0 or 1"
>> + .endif
>> + .if \push_lr & 1 != \push_lr
>> + .error "push_lr may be either 0 or 1"
>> + .endif
>> + .if \first != -1
>> + .if \last == -1
>> + /* Upper-bound not provided: Set upper = lower. */
>> + _epilogue \first, \first, \push_ip, \push_lr
>> + .exitm
>> + .endif
>> + .if \last != \first
>> + .if \last >= 13
>> + .error "SP cannot be in the save list"
>> + .endif
>> + .if \push_ip
>> + .if \push_lr
>> + /* Case 1: pop register range, ip and lr registers. */
>> + pop {r\first-r\last, ip, lr}
>> + .cfi_restore 14
>> + .cfi_register 143, 12
>> + cfirestorelist \first, \last
>> + .else // !\push_lr
>> + /* Case 2: pop register range and ip register. */
>> + pop {r\first-r\last, ip}
>> + .cfi_register 143, 12
>> + cfirestorelist \first, \last
>> + .endif
>> + .else // !\push_ip
>> + .if \push_lr
>> + /* Case 3: pop register range and lr register. */
>> + pop {r\first-r\last, lr}
>> + .cfi_restore 14
>> + cfirestorelist \first, \last
>> + .else // !\push_lr
>> + /* Case 4: pop register range. */
>> + pop {r\first-r\last}
>> + cfirestorelist \first, \last
>> + .endif
>> + .endif
>> + .else // \last == \first
>> + .if \push_ip
>> + .if \push_lr
>> + /* Case 5: pop single GP register plus ip and lr registers. */
>> + pop {r\first, ip, lr}
>> + .cfi_restore 14
>> + .cfi_register 143, 12
>> + cfirestorelist \first, \first
>> + .else // !\push_lr
>> + /* Case 6: pop single GP register plus ip register. */
>> + pop {r\first, ip}
>> + .cfi_register 143, 12
>> + cfirestorelist \first, \first
>> + .endif
>> + .else // !\push_ip
>> + .if \push_lr
>> + /* Case 7: pop single GP register plus lr register. */
>> + pop {r\first, lr}
>> + .cfi_restore 14
>> + cfirestorelist \first, \first
>> + .else // !\push_lr
>> + /* Case 8: pop single GP register. */
>> + pop {r\first}
>> + cfirestorelist \first, \first
>> + .endif
>> + .endif
>> + .endif
>> + .else // \first == -1
>> + .if \push_ip
>> + .if \push_lr
>> + /* Case 9: pop ip and lr registers. */
>> + pop {ip, lr}
>> + .cfi_restore 14
>> + .cfi_register 143, 12
>> + .else // !\push_lr
>> + /* Case 10: pop ip register. */
>> + pop {ip}
>> + .cfi_register 143, 12
>> + .endif
>> + .else // !\push_ip
>> + .if \push_lr
>> + /* Case 11: pop lr register. */
>> + pop {lr}
>> + .cfi_restore 14
>> + .endif
>> + .endif
>> + .endif
>> +#if HAVE_PAC_LEAF
>> + aut ip, lr, sp
>> +#endif /* HAVE_PAC_LEAF */
>> + bx lr
>> +.endm
>> +
>> +# clean up expressions in 'last'
>> +.macro _preprocess_reglist1 first:req, last:req, push_ip:req,
>> push_lr:req, reglist_op:req
>> + .if \last == 0
>> + \reglist_op \first, 0, \push_ip, \push_lr
>> + .elseif \last == 1
>> + \reglist_op \first, 1, \push_ip, \push_lr
>> + .elseif \last == 2
>> + \reglist_op \first, 2, \push_ip, \push_lr
>> + .elseif \last == 3
>> + \reglist_op \first, 3, \push_ip, \push_lr
>> + .elseif \last == 4
>> + \reglist_op \first, 4, \push_ip, \push_lr
>> + .elseif \last == 5
>> + \reglist_op \first, 5, \push_ip, \push_lr
>> + .elseif \last == 6
>> + \reglist_op \first, 6, \push_ip, \push_lr
>> + .elseif \last == 7
>> + \reglist_op \first, 7, \push_ip, \push_lr
>> + .elseif \last == 8
>> + \reglist_op \first, 8, \push_ip, \push_lr
>> + .elseif \last == 9
>> + \reglist_op \first, 9, \push_ip, \push_lr
>> + .elseif \last == 10
>> + \reglist_op \first, 10, \push_ip, \push_lr
>> + .elseif \last == 11
>> + \reglist_op \first, 11, \push_ip, \push_lr
>> + .else
>> + .error "last (\last) out of range"
>> + .endif
>> +.endm
>> +
>> +# clean up expressions in 'first'
>> +.macro _preprocess_reglist first:req, last, push_ip=0, push_lr=0,
>> reglist_op:req
>> + .ifb \last
>> + _preprocess_reglist \first \first \push_ip \push_lr
>> + .else
>> + .if \first > \last
>> + .error "last (\last) must be at least as great as first (\first)"
>> + .endif
>> + .if \first == 0
>> + _preprocess_reglist1 0, \last, \push_ip, \push_lr, \reglist_op
>> + .elseif \first == 1
>> + _preprocess_reglist1 1, \last, \push_ip, \push_lr, \reglist_op
>> + .elseif \first == 2
>> + _preprocess_reglist1 2, \last, \push_ip, \push_lr, \reglist_op
>> + .elseif \first == 3
>> + _preprocess_reglist1 3, \last, \push_ip, \push_lr, \reglist_op
>> + .elseif \first == 4
>> + _preprocess_reglist1 4, \last, \push_ip, \push_lr, \reglist_op
>> + .elseif \first == 5
>> + _preprocess_reglist1 5, \last, \push_ip, \push_lr, \reglist_op
>> + .elseif \first == 6
>> + _preprocess_reglist1 6, \last, \push_ip, \push_lr, \reglist_op
>> + .elseif \first == 7
>> + _preprocess_reglist1 7, \last, \push_ip, \push_lr, \reglist_op
>> + .elseif \first == 8
>> + _preprocess_reglist1 8, \last, \push_ip, \push_lr, \reglist_op
>> + .elseif \first == 9
>> + _preprocess_reglist1 9, \last, \push_ip, \push_lr, \reglist_op
>> + .elseif \first == 10
>> + _preprocess_reglist1 10, \last, \push_ip, \push_lr, \reglist_op
>> + .elseif \first == 11
>> + _preprocess_reglist1 11, \last, \push_ip, \push_lr, \reglist_op
>> + .else
>> + .error "first (\first) out of range"
>> + .endif
>> + .endif
>> +.endm
>> +
>> +.macro _align8 first, last, push_ip=0, push_lr=0, reglist_op=_prologue
>> + .ifb \first
>> + .ifnb \last
>> + .error "can't have last (\last) without specifying first"
>> + .else // \last not blank
>> + .if ((\push_ip + \push_lr) % 2) == 0
>> + \reglist_op first=-1, last=-1, push_ip=\push_ip, push_lr=\push_lr
>> + .exitm
>> + .else // ((\push_ip + \push_lr) % 2) odd
>> + _align8 2, 2, \push_ip, \push_lr, \reglist_op
>> + .exitm
>> + .endif // ((\push_ip + \push_lr) % 2) == 0
>> + .endif // .ifnb \last
>> + .endif // .ifb \first
>> +
>> + .ifb \last
>> + _align8 \first, \first, \push_ip, \push_lr, \reglist_op
>> + .else
>> + .if \push_ip & 1 <> \push_ip
>> + .error "push_ip may be 0 or 1"
>> + .endif
>> + .if \push_lr & 1 <> \push_lr
>> + .error "push_lr may be 0 or 1"
>> + .endif
>> + .ifeq (\last - \first + \push_ip + \push_lr) % 2
>> + .if \first == 0
>> + .error "Alignment required and first register is r0"
>> + .exitm
>> + .endif
>> + _preprocess_reglist \first-1, \last, \push_ip, \push_lr,
>> \reglist_op
>> + .else
>> + _preprocess_reglist \first \last, \push_ip, \push_lr, \reglist_op
>> + .endif
>> + .endif
>> +.endm
>> +
>> +.macro prologue first, last, push_ip=PAC_LEAF_PUSH_IP, push_lr=0,
>> align8=STACK_ALIGN_ENFORCE
>> + .if \align8
>> + _align8 \first, \last, \push_ip, \push_lr, _prologue
>> + .else
>> + _prologue first=\first, last=\last, push_ip=\push_ip,
>> push_lr=\push_lr
>> + .endif
>> +.endm
>> +
>> +.macro epilogue first, last, push_ip=PAC_LEAF_PUSH_IP, push_lr=0,
>> align8=STACK_ALIGN_ENFORCE
>> + .if \align8
>> + _align8 \first, \last, \push_ip, \push_lr, reglist_op=_epilogue
>> + .else
>> + _epilogue first=\first, last=\last, push_ip=\push_ip,
>> push_lr=\push_lr
>> + .endif
>> +.endm
>> +
>> +#endif /* __ASSEMBLER__ */
>> +
>> #endif /* ARM_ASM__H */
* Re: [PATCH v5 2/8] newlib: libc: strcmp M-profile PACBTI-enablement
2023-01-06 11:09 ` Christophe Lyon
@ 2023-01-06 21:35 ` Victor Do Nascimento
0 siblings, 0 replies; 15+ messages in thread
From: Victor Do Nascimento @ 2023-01-06 21:35 UTC (permalink / raw)
To: Christophe Lyon, newlib; +Cc: Richard Earnshaw
On 1/6/23 11:09, Christophe Lyon wrote:
>
>
> On 12/21/22 12:21, Victor L. Do Nascimento wrote:
>> Add function prologue/epilogue to conditionally add BTI landing pads
>> and/or PAC code generation & authentication instructions depending on
>> compilation flags.
>>
>> This patch enables PACBTI for all relevant variants of strcmp:
>> * Newlib for armv8.1-m.main+pacbti
>> * Newlib for armv8.1-m.main+pacbti+mve
>> * Newlib-nano
>> ---
>> newlib/libc/machine/arm/strcmp-arm-tiny.S | 8 +++-
>> newlib/libc/machine/arm/strcmp-armv7.S | 57 ++++++++++++++---------
>> newlib/libc/machine/arm/strcmp-armv7m.S | 26 +++++++----
>> 3 files changed, 60 insertions(+), 31 deletions(-)
>>
>> diff --git a/newlib/libc/machine/arm/strcmp-arm-tiny.S
>> b/newlib/libc/machine/arm/strcmp-arm-tiny.S
>> index 607a41daf..0bd2a2e6e 100644
>> --- a/newlib/libc/machine/arm/strcmp-arm-tiny.S
>> +++ b/newlib/libc/machine/arm/strcmp-arm-tiny.S
>> @@ -29,10 +29,14 @@
>> /* Tiny version of strcmp in ARM state. Used only when optimizing
>> for size. Also supports Thumb-2. */
>> +#include "arm_asm.h"
>> +
>> .syntax unified
>> def_fn strcmp
>> + .fnstart
>> .cfi_sections .debug_frame
>> .cfi_startproc
>> + prologue
> Why no push_ip=HAVE_PAC_LEAF?
> Is that because this is a tiny version and we don't want to use an extra
> push ip even if pacbti is enabled?
push_ip=HAVE_PAC_LEAF is reserved for a particular scenario.
If we're PAC-signing leaf functions (that is, HAVE_PAC_LEAF is set) but
the intra-procedure-call scratch register r12 is not used in the function
body, there's no strict need to push the PAC code onto the stack, so
push_ip defaults to a potentially overridable value of PAC_LEAF_PUSH_IP.
If, on the other hand, r12 is used as part of the function body, our
PAC code will be corrupted. In such cases, whether ip is pushed should be
dictated strictly by whether leaf-function PAC signing has been requested,
so that the PAC code can later be restored.
Therefore, if r12 is clobbered and HAVE_PAC_LEAF is set, we should push
ip to the stack irrespective of any overrides, and that's where
push_ip=HAVE_PAC_LEAF is important.
As strcmp-arm-tiny.S doesn't use r12, we have flexibility over whether
or not to push ip onto the stack. That's why it has a bare `prologue',
leaving push_ip at its default, and not `push_ip=HAVE_PAC_LEAF'.
strcmp-armv7.S and strcmp-armv7m.S represent the opposite scenario. :-)
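As a rough illustration of the rule above, the choice of push_ip can be modelled in C. This is only a sketch: `choose_push_ip' and its parameter names are hypothetical and do not exist in arm_asm.h, where the selection is made at assembly time through the macro arguments.

```c
#include <assert.h>

/* Hypothetical model (not part of arm_asm.h) of the push_ip value a call
   site ends up with.
     have_pac_leaf    - value of HAVE_PAC_LEAF (0 or 1)
     pac_leaf_push_ip - requested PAC_LEAF_PUSH_IP override (0 or 1)
     body_uses_r12    - 1 if the function body clobbers r12/ip, else 0  */
int choose_push_ip(int have_pac_leaf, int pac_leaf_push_ip,
                   int body_uses_r12)
{
    assert((have_pac_leaf & 1) == have_pac_leaf);
    assert((pac_leaf_push_ip & 1) == pac_leaf_push_ip);
    assert((body_uses_r12 & 1) == body_uses_r12);

    /* arm_asm.h forces PAC_LEAF_PUSH_IP to 0 when leaf signing is off. */
    int default_push_ip = have_pac_leaf ? pac_leaf_push_ip : 0;

    /* r12 clobbered: pass push_ip=HAVE_PAC_LEAF, ignoring the override.
       r12 untouched: leave push_ip at its PAC_LEAF_PUSH_IP default.  */
    return body_uses_r12 ? have_pac_leaf : default_push_ip;
}
```

With leaf signing on and the override set to 0, choose_push_ip(1, 0, 0) gives 0 (the strcmp-arm-tiny.S case) while choose_push_ip(1, 0, 1) gives 1 (the strcmp-armv7.S case).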
Regards,
Victor
>> 1:
>> ldrb r2, [r0], #1
>> ldrb r3, [r1], #1
>> @@ -42,6 +46,8 @@ def_fn strcmp
>> beq 1b
>> 2:
>> subs r0, r2, r3
>> - bx lr
>> + epilogue
>> .cfi_endproc
>> + .cantunwind
>> + .fnend
>> .size strcmp, . - strcmp
>> diff --git a/newlib/libc/machine/arm/strcmp-armv7.S
>> b/newlib/libc/machine/arm/strcmp-armv7.S
>> index 2f93bfb73..7cafca151 100644
>> --- a/newlib/libc/machine/arm/strcmp-armv7.S
>> +++ b/newlib/libc/machine/arm/strcmp-armv7.S
>> @@ -45,6 +45,8 @@
>> .thumb
>> .syntax unified
>> +#include "arm_asm.h"
>> +
>> /* Parameters and result. */
>> #define src1 r0
>> #define src2 r1
>> @@ -91,8 +93,9 @@
>> ldrd r4, r5, [sp], #16
>> .cfi_restore 4
>> .cfi_restore 5
>> + .cfi_adjust_cfa_offset -16
>> sub result, result, r1, lsr #24
>> - bx lr
>> + epilogue push_ip=HAVE_PAC_LEAF
>> #else
>> /* To use the big-endian trick we'd have to reverse all three
>> words.
>> that's slower than this approach. */
>> @@ -112,22 +115,21 @@
>> ldrd r4, r5, [sp], #16
>> .cfi_restore 4
>> .cfi_restore 5
>> + .cfi_adjust_cfa_offset -16
>> sub result, result, r1
>> - bx lr
>> + epilogue push_ip=HAVE_PAC_LEAF
>> #endif
>> .endm
>> +
>> .text
>> .p2align 5
>> -.Lstrcmp_start_addr:
>> -#ifndef STRCMP_NO_PRECHECK
>> -.Lfastpath_exit:
>> - sub r0, r2, r3
>> - bx lr
>> - nop
>> -#endif
>> def_fn strcmp
>> + .fnstart
>> + .cfi_sections .debug_frame
>> + .cfi_startproc
>> + prologue push_ip=HAVE_PAC_LEAF
>> #ifndef STRCMP_NO_PRECHECK
>> ldrb r2, [src1]
>> ldrb r3, [src2]
>> @@ -136,16 +138,14 @@ def_fn strcmp
>> cmpcs r2, r3
>> bne .Lfastpath_exit
>> #endif
>> - .cfi_sections .debug_frame
>> - .cfi_startproc
>> strd r4, r5, [sp, #-16]!
>> - .cfi_def_cfa_offset 16
>> - .cfi_offset 4, -16
>> - .cfi_offset 5, -12
>> + .cfi_adjust_cfa_offset 16
>> + .cfi_rel_offset 4, 0
>> + .cfi_rel_offset 5, 4
>> orr tmp1, src1, src2
>> strd r6, r7, [sp, #8]
>> - .cfi_offset 6, -8
>> - .cfi_offset 7, -4
>> + .cfi_rel_offset 6, 8
>> + .cfi_rel_offset 7, 12
>> mvn const_m1, #0
>> lsl r2, tmp1, #29
>> cbz r2, .Lloop_aligned8
>> @@ -270,7 +270,6 @@ def_fn strcmp
>> ldr data1, [src1], #4
>> beq .Laligned_m2
>> bcs .Laligned_m1
>> -
>> #ifdef STRCMP_NO_PRECHECK
>> ldrb data2, [src2, #1]
>> uxtb tmp1, data1, ror #BYTE1_OFFSET
>> @@ -314,10 +313,19 @@ def_fn strcmp
>> mov result, tmp1
>> ldr r4, [sp], #16
>> .cfi_restore 4
>> - bx lr
>> + .cfi_adjust_cfa_offset -16
>> + epilogue push_ip=HAVE_PAC_LEAF
>> #ifndef STRCMP_NO_PRECHECK
>> +.Lfastpath_exit:
>> + .cfi_restore_state
>> + .cfi_remember_state
>> + sub r0, r2, r3
>> + epilogue push_ip=HAVE_PAC_LEAF
>> +
>> .Laligned_m1:
>> + .cfi_restore_state
>> + .cfi_remember_state
>> add src2, src2, #4
>> #endif
>> .Lsrc1_aligned:
>> @@ -364,8 +372,9 @@ def_fn strcmp
>> /* R6/7 Not used in this sequence. */
>> .cfi_restore 6
>> .cfi_restore 7
>> + .cfi_adjust_cfa_offset -16
>> neg result, result
>> - bx lr
>> + epilogue push_ip=HAVE_PAC_LEAF
>> 6:
>> .cfi_restore_state
>> @@ -441,7 +450,8 @@ def_fn strcmp
>> /* R6/7 not used in this sequence. */
>> .cfi_restore 6
>> .cfi_restore 7
>> - bx lr
>> + .cfi_adjust_cfa_offset -16
>> + epilogue push_ip=HAVE_PAC_LEAF
>> .Lstrcmp_tail:
>> .cfi_restore_state
>> @@ -463,7 +473,10 @@ def_fn strcmp
>> /* R6/7 not used in this sequence. */
>> .cfi_restore 6
>> .cfi_restore 7
>> + .cfi_adjust_cfa_offset -16
>> sub result, result, data2, lsr #24
>> - bx lr
>> + epilogue push_ip=HAVE_PAC_LEAF
>> .cfi_endproc
>> - .size strcmp, . - .Lstrcmp_start_addr
>> + .cantunwind
>> + .fnend
>> + .size strcmp, . - strcmp
>> diff --git a/newlib/libc/machine/arm/strcmp-armv7m.S
>> b/newlib/libc/machine/arm/strcmp-armv7m.S
>> index cdb4912df..825b6e77f 100644
>> --- a/newlib/libc/machine/arm/strcmp-armv7m.S
>> +++ b/newlib/libc/machine/arm/strcmp-armv7m.S
>> @@ -29,6 +29,8 @@
>> /* Very similar to the generic code, but uses Thumb2 as implemented
>> in ARMv7-M. */
>> +#include "arm_asm.h"
>> +
>> /* Parameters and result. */
>> #define src1 r0
>> #define src2 r1
>> @@ -44,8 +46,10 @@
>> .thumb
>> .syntax unified
>> def_fn strcmp
>> + .fnstart
>> .cfi_sections .debug_frame
>> .cfi_startproc
>> + prologue push_ip=HAVE_PAC_LEAF
>> eor tmp1, src1, src2
>> tst tmp1, #3
>> /* Strings not at same byte offset from a word boundary. */
>> @@ -82,6 +86,7 @@ def_fn strcmp
>> ldreq data2, [src2], #4
>> beq 4b
>> 2:
>> + .cfi_remember_state
>> /* There's a zero or a different byte in the word */
>> S2HI result, data1, #24
>> S2LO data1, data1, #8
>> @@ -106,7 +111,7 @@ def_fn strcmp
>> lsrs result, result, #24
>> subs result, result, data2
>> #endif
>> - bx lr
>> + epilogue push_ip=HAVE_PAC_LEAF
>> #if 0
>> @@ -205,8 +210,10 @@ def_fn strcmp
>> /* First of all, compare bytes until src1(sp1) is word-aligned. */
>> .Lstrcmp_unaligned:
>> + .cfi_restore_state
>> tst src1, #3
>> beq 2f
>> + .cfi_remember_state
>> ldrb data1, [src1], #1
>> ldrb data2, [src2], #1
>> cmp data1, #1
>> @@ -214,12 +221,13 @@ def_fn strcmp
>> cmpcs data1, data2
>> beq .Lstrcmp_unaligned
>> sub result, data1, data2
>> - bx lr
>> + epilogue push_ip=HAVE_PAC_LEAF
>> 2:
>> + .cfi_restore_state
>> stmfd sp!, {r5}
>> - .cfi_def_cfa_offset 4
>> - .cfi_offset 5, -4
>> + .cfi_adjust_cfa_offset 4
>> + .cfi_rel_offset 5, 0
>> ldr data1, [src1], #4
>> and tmp2, src2, #3
>> @@ -355,8 +363,8 @@ def_fn strcmp
>> .cfi_remember_state
>> ldmfd sp!, {r5}
>> .cfi_restore 5
>> - .cfi_def_cfa_offset 0
>> - bx lr
>> + .cfi_adjust_cfa_offset -4
>> + epilogue push_ip=HAVE_PAC_LEAF
>> .Lstrcmp_tail:
>> .cfi_restore_state
>> @@ -372,7 +380,9 @@ def_fn strcmp
>> sub result, r2, result
>> ldmfd sp!, {r5}
>> .cfi_restore 5
>> - .cfi_def_cfa_offset 0
>> - bx lr
>> + .cfi_adjust_cfa_offset -4
>> + epilogue push_ip=HAVE_PAC_LEAF
>> .cfi_endproc
>> + .cantunwind
>> + .fnend
>> .size strcmp, . - strcmp
* Re: [PATCH v5 1/8] newlib: libc: define M-profile PACBTI-enablement macros
2023-01-06 10:42 ` Christophe Lyon
2023-01-06 20:51 ` Victor Do Nascimento
@ 2023-01-09 9:33 ` Christophe Lyon
1 sibling, 0 replies; 15+ messages in thread
From: Christophe Lyon @ 2023-01-09 9:33 UTC (permalink / raw)
To: Victor L. Do Nascimento, newlib; +Cc: Richard Earnshaw
On 1/6/23 11:42, Christophe Lyon wrote:
> Hi Victor,
>
> Thanks for the patch series, a few comments/questions below.
>
> Christophe
>
>
> On 12/21/22 12:19, Victor L. Do Nascimento wrote:
>> Augment the arm_asm.h header file to simplify function prologues and
>> epilogues whilst adding support for PACBTI enablement via macros for
>> hand-written assembly functions. For PACBTI, both prologues/epilogues
>> as well as cfi-related directives are automatically amended
>> accordingly, depending on the compile-time mbranch-protection argument
>> values.
>>
>> It defines the following preprocessor macros:
>> * HAVE_PAC_LEAF: Indicates whether pac-signing has been requested for
>> leaf functions.
>> * PAC_LEAF_PUSH_IP: Whether leaf functions should push the pac code
>> to the stack irrespective of whether the ip register is clobbered in
>> the function or not.
>> * STACK_ALIGN_ENFORCE: Whether a dummy register should be added to
>> the push list as necessary in the prologue to ensure stack
>> alignment preservation at the start of assembly function. The
>> epilogue behavior is likewise affected by this flag, ensuring any
>> pushed dummy registers also get popped on function return.
> IIUC, these new macros are meant for general usage outside of newlib, do
> they need proper documentation? Or maybe an entry in the "News" section?
> I don't know. Otherwise, I think they should not appear in the user
> naming space.
I've just noticed that this point was already discussed with Richard some
time ago; I somehow missed/forgot it:
https://sourceware.org/pipermail/newlib/2022/019831.html
So it looks fine if this header and its macros are not meant to be used
outside of newlib.
Thanks,
Christophe
>
>
>> It also defines the following assembler macros:
>> * prologue: In addition to pushing any callee-saved registers onto
>> the stack, it generates any requested pacbti instructions.
>> Pushed registers are specified via the optional `first', `last',
>> `push_ip' and `push_lr' macro argument parameters.
> Maybe you should quote 'first' and 'last' differently from 'push_ip' and
> 'push_lr', since the example below shows that 'first' and 'last' are in
> fact register numbers (IIUC)
>
>> when a single register number is provided, it pushes that
> Typo: "When" (with a capital)
>
>> register. When two register numbers are provided, they specify a
>> rage to save. If push_ip and/or push_lr are non-zero, the
> Typo: "range"
>
>> respective registers are also saved. Stack alignment is requested
>> via the `align` argument, which defaults to the value of
>> STACK_ALIGN_ENFORCE, unless manually overridden.
>>
>> For example:
>>
>> prologue push_ip=1 -> push {ip}
>> prologue push_ip=1, align8=1 -> push {r2, ip}
>> prologue push_ip=1, push_lr=1 -> push {ip, lr}
>> prologue 1 -> push {r1}
>> prologue 1, align8=1 -> push {r0, r1}
>> prologue 1 push_ip=1 -> push {r1, ip}
>> prologue 1 4 -> push {r1-r4}
>> prologue 1 4 push_ip=1 -> push {r1-r4, ip}
> can you include an example with pacbti?
>
>
>> * epilogue: pops registers off the stack and emits pac key signing
>> instruction, if requested. The `first', `last', `push_ip',
>> `push_lr' and `align' function as per the prologue macro,
>> generating pop instead of push instructions.
>>
>> Stack alignment is enforced via the following helper macro
>> call-chain:
>>
>> {prologue|epilogue} ->_align8 -> _preprocess_reglist ->
>> _preprocess_reglist1 -> {_prologue|_epilogue}
>>
>> Finally, the necessary cfi directives for adding debug information
>> to prologue and epilogue are generated via the following macros:
>>
>> * cfisavelist - prologue macro helper function, generating
>> necessary .cfi_offset directives associated with push instruction.
>> Therefore, the net effect of calling `prologue 1 2 push_ip=1' is
>> to generate the following:
>>
>> push {r1-r2, ip}
>> .cfi_adjust_cfa_offset 12
>> .cfi_offset 143, -4
>> .cfi_offset 2, -8
>> .cfi_offset 1, -12
>>
>> * cfirestorelist - epilogue macro helper function, emitting
>> .cfi_restore instructions prior to resetting the cfa offset. As
>> such, calling `epilogue 1 2 push_ip=1' will produce:
>>
>> pop {r1-r2, ip}
>> .cfi_register 143, 12
>> .cfi_restore 2
>> .cfi_restore 1
>> .cfi_def_cfa_offset 0
>> ---
>> newlib/libc/machine/arm/arm_asm.h | 441 ++++++++++++++++++++++++++++++
>> 1 file changed, 441 insertions(+)
>>
>> diff --git a/newlib/libc/machine/arm/arm_asm.h
>> b/newlib/libc/machine/arm/arm_asm.h
>> index 2708057de..94fa77b4d 100644
>> --- a/newlib/libc/machine/arm/arm_asm.h
>> +++ b/newlib/libc/machine/arm/arm_asm.h
>> @@ -60,4 +60,445 @@
>> # define _ISA_THUMB_1
>> #endif
>> +/* Check whether leaf function PAC signing has been requested in the
>> + -mbranch-protect compile-time option. */
>> +#define LEAF_PROTECT_BIT 2
> Shouldn't this start with '__' or be #undefed at the end of this file to
> avoid polluting user naming space? (I noticed it's not used outside this
> file in this patch series)
>
>> +
>> +#ifdef __ARM_FEATURE_PAC_DEFAULT
>> +# define HAVE_PAC_LEAF \
>> + ((__ARM_FEATURE_PAC_DEFAULT & (1 << LEAF_PROTECT_BIT)) && 1)
>> +#else
>> +# define HAVE_PAC_LEAF 0
>> +#endif
>> +
>> +/* Provide default parameters for PAC-code handling in
>> leaf-functions. */
>> +#if HAVE_PAC_LEAF
>> +# ifndef PAC_LEAF_PUSH_IP
>> +# define PAC_LEAF_PUSH_IP 1
>> +# endif
>> +#else /* !HAVE_PAC_LEAF */
>> +# undef PAC_LEAF_PUSH_IP
>> +# define PAC_LEAF_PUSH_IP 0
>> +#endif /* HAVE_PAC_LEAF */
>> +
>> +#define STACK_ALIGN_ENFORCE 0
>> +
>> +#ifdef __ASSEMBLER__
>> +
>> +/******************************************************************************
>> +* Implementation of the prologue and epilogue assembler macros and their
>> +* associated helper functions.
>> +*
>> +* These functions add support for the following:
>> +*
>> +* - M-profile branch target identification (BTI) landing-pads when
>> compiled
>> +* with `-mbranch-protection=bti'.
>> +* - PAC-signing and verification instructions, depending on hardware
>> support
>> +* and whether the PAC-signing of leaf functions has been requested
>> via the
>> +* `-mbranch-protection=pac-ret+leaf' compiler argument.
>> +* - 8-byte stack alignment preservation at function entry, defaulting
>> to the
>> +* value of STACK_ALIGN_ENFORCE.
>> +*
>> +* Notes:
>> +* - Prologue stack alignment is implemented by detecting a push with
>> an odd
>> +* number of registers and prepending a dummy register to the list.
>> +* - If alignment is attempted on a list containing r0, compilation
>> will result
>> +* in an error.
>> +* - If alignment is attempted in a list containing r1, r0 will be
>> prepended to
>> +* the register list and r0 will be restored prior to function
>> return. for
>> +* functions with non-void return types, this will result in the
>> corruption of
>> +* the result register.
>> +* - Stack alignment is enforced via the following helper macro
>> call-chain:
>> +*
>> +* {prologue|epilogue} ->_align8 -> _preprocess_reglist ->
>> +* _preprocess_reglist1 -> {_prologue|_epilogue}
>> +*
>> +* - Debug CFI directives are automatically added to prologues and
>> epilogues,
>> +* assisted by `cfisavelist' and `cfirestorelist', respectively.
>> +*
>> +* Arguments:
>> +* prologue
>> +* --------
>> +* - first - If `last' specified, this serves as start of
>> general-purpose
>> +* register (GPR) range to push onto stack, otherwise represents
>> +* single GPR to push onto stack. If omitted, no GPRs pushed
>> +* onto stack at prologue.
>> +* - last - If given, specifies inclusive upper-bound of GPR range.
>> +* - push_ip - Determines whether IP register is to be pushed to
>> stack at
>> +* prologue. When pac-signing is requested, this holds the
>> +* the pac-key. Either 1 or 0 to push or not push,
>> respectively.
>> +* Default behavior: Set to value of PAC_LEAF_PUSH_IP macro.
>> +* - push_lr - Determines whether to push lr to the stack on
>> function entry.
>> +* Either 1 or 0 to push or not push, respectively.
>> +* - align8 - Whether to enforce alignment. Either 1 or 0, with 1
>> requesting
>> +* alignment.
>> +*
>> +* epilogue
>> +* --------
>> +* The epilogue should be called passing the same arguments as those
>> passed to
>> +* the prologue to ensure the stack is not corrupted on function
>> return.
>> +*
>> +* Usage examples:
>> +*
>> +* prologue push_ip=1 -> push {ip}
>> +* epilogue push_ip=1, align8=1 -> pop {r2, ip}
>> +* prologue push_ip=1, push_lr=1 -> push {ip, lr}
>> +* epilogue 1 -> pop {r1}
>> +* prologue 1, align8=1 -> push {r0, r1}
>> +* epilogue 1, push_ip=1 -> pop {r1, ip}
>> +* prologue 1, 4 -> push {r1-r4}
>> +* epilogue 1, 4 push_ip=1 -> pop {r1-r4, ip}
>> +*
>> +******************************************************************************/
>> +
>> +/* Emit .cfi_restore directives for a consecutive sequence of
>> registers. */
>> + .macro cfirestorelist first, last
>> + .cfi_restore \last
>> + .if \last-\first
>> + cfirestorelist \first, \last-1
>> + .endif
>> + .endm
>> +
>> +/* Emit .cfi_offset directives for a consecutive sequence of
>> registers. */
>> + .macro cfisavelist first, last, index=1
>> + .cfi_offset \last, -4*(\index)
>> + .if \last-\first
>> + cfisavelist \first, \last-1, \index+1
>> + .endif
>> + .endm
>> +
>> +.macro _prologue first=-1, last=-1, push_ip=PAC_LEAF_PUSH_IP, push_lr=0
>> + .if \push_ip & 1 != \push_ip
>> + .error "push_ip may be either 0 or 1"
>> + .endif
>> + .if \push_lr & 1 != \push_lr
>> + .error "push_lr may be either 0 or 1"
>> + .endif
>> + .if \first != -1
>> + .if \last == -1
>> + /* Upper-bound not provided: Set upper = lower. */
>> + _prologue \first, \first, \push_ip, \push_lr
>> + .exitm
>> + .endif
>> + .endif
>> +#if HAVE_PAC_LEAF
>> +#if __ARM_FEATURE_BTI_DEFAULT
>> + pacbti ip, lr, sp
>> +#else
>> + pac ip, lr, sp
>> +#endif /* __ARM_FEATURE_BTI_DEFAULT */
>> + .cfi_register 143, 12
>> +#else
>> +#if __ARM_FEATURE_BTI_DEFAULT
>> + bti
>> +#endif /* __ARM_FEATURE_BTI_DEFAULT */
>> +#endif /* HAVE_PAC_LEAF */
>> + .if \first != -1
>> + .if \last != \first
>> + .if \last >= 13
>> + .error "SP cannot be in the save list"
>> + .endif
> I think you should also check that IP (r12) is not in the range,
> otherwise I think nothing prevents from doing
> prologue 12, push_ip=1 which will result in emitting push {r12, ip}
> (I suppose gas would complain?)
> .... scratch that, I saw later that this sanity checking is performed in
> _preprocess_reglist1 :-)
>
>
>> + .if \push_ip
>> + .if \push_lr
>> + /* Case 1: push register range, ip and lr registers. */
>> + push {r\first-r\last, ip, lr}
>> + .cfi_adjust_cfa_offset ((\last-\first)+3)*4
>> + .cfi_offset 14, -4
>> + .cfi_offset 143, -8
>> + cfisavelist \first, \last, 3
>> + .else // !\push_lr
>> + /* Case 2: push register range and ip register. */
>> + push {r\first-r\last, ip}
>> + .cfi_adjust_cfa_offset ((\last-\first)+2)*4
>> + .cfi_offset 143, -4
>> + cfisavelist \first, \last, 2
>> + .endif
>> + .else // !\push_ip
>> + .if \push_lr
>> + /* Case 3: push register range and lr register. */
>> + push {r\first-r\last, lr}
>> + .cfi_adjust_cfa_offset ((\last-\first)+2)*4
>> + .cfi_offset 14, -4
>> + cfisavelist \first, \last, 2
>> + .else // !\push_lr
>> + /* Case 4: push register range. */
>> + push {r\first-r\last}
>> + .cfi_adjust_cfa_offset ((\last-\first)+1)*4
>> + cfisavelist \first, \last, 1
>> + .endif
>> + .endif
>> + .else // \last == \first
>> + .if \push_ip
>> + .if \push_lr
>> + /* Case 5: push single GP register plus ip and lr registers. */
>> + push {r\first, ip, lr}
>> + .cfi_adjust_cfa_offset 12
>> + .cfi_offset 14, -4
>> + .cfi_offset 143, -8
>> + cfisavelist \first, \first, 3
>> + .else // !\push_lr
>> + /* Case 6: push single GP register plus ip register. */
>> + push {r\first, ip}
>> + .cfi_adjust_cfa_offset 8
>> + .cfi_offset 143, -4
>> + cfisavelist \first, \first, 2
>> + .endif
>> + .else // !\push_ip
>> + .if \push_lr
>> + /* Case 7: push single GP register plus lr register. */
>> + push {r\first, lr}
>> + .cfi_adjust_cfa_offset 8
>> + .cfi_offset 14, -4
>> + cfisavelist \first, \first, 2
>> + .else // !\push_lr
>> + /* Case 8: push single GP register. */
>> + push {r\first}
>> + .cfi_adjust_cfa_offset 4
>> + cfisavelist \first, \first, 1
>> + .endif
>> + .endif
>> + .endif
>> + .else // \first == -1
>> + .if \push_ip
>> + .if \push_lr
>> + /* Case 9: push ip and lr registers. */
>> + push {ip, lr}
>> + .cfi_adjust_cfa_offset 8
>> + .cfi_offset 14, -4
>> + .cfi_offset 143, -8
>> + .else // !\push_lr
>> + /* Case 10: push ip register. */
>> + push {ip}
>> + .cfi_adjust_cfa_offset 4
>> + .cfi_offset 143, -4
>> + .endif
>> + .else // !\push_ip
>> + .if \push_lr
>> + /* Case 11: push lr register. */
>> + push {lr}
>> + .cfi_adjust_cfa_offset 4
>> + .cfi_offset 14, -4
>> + .endif
>> + .endif
>> + .endif
>> +.endm
>> +
>> +.macro _epilogue first=-1, last=-1, push_ip=PAC_LEAF_PUSH_IP, push_lr=0
>> + .if \push_ip & 1 != \push_ip
>> + .error "push_ip may be either 0 or 1"
>> + .endif
>> + .if \push_lr & 1 != \push_lr
>> + .error "push_lr may be either 0 or 1"
>> + .endif
>> + .if \first != -1
>> + .if \last == -1
>> + /* Upper-bound not provided: Set upper = lower. */
>> + _epilogue \first, \first, \push_ip, \push_lr
>> + .exitm
>> + .endif
>> + .if \last != \first
>> + .if \last >= 13
>> + .error "SP cannot be in the save list"
>> + .endif
>> + .if \push_ip
>> + .if \push_lr
>> + /* Case 1: pop register range, ip and lr registers. */
>> + pop {r\first-r\last, ip, lr}
>> + .cfi_restore 14
>> + .cfi_register 143, 12
>> + cfirestorelist \first, \last
>> + .else // !\push_lr
>> + /* Case 2: pop register range and ip register. */
>> + pop {r\first-r\last, ip}
>> + .cfi_register 143, 12
>> + cfirestorelist \first, \last
>> + .endif
>> + .else // !\push_ip
>> + .if \push_lr
>> + /* Case 3: pop register range and lr register. */
>> + pop {r\first-r\last, lr}
>> + .cfi_restore 14
>> + cfirestorelist \first, \last
>> + .else // !\push_lr
>> + /* Case 4: pop register range. */
>> + pop {r\first-r\last}
>> + cfirestorelist \first, \last
>> + .endif
>> + .endif
>> + .else // \last == \first
>> + .if \push_ip
>> + .if \push_lr
>> + /* Case 5: pop single GP register plus ip and lr registers. */
>> + pop {r\first, ip, lr}
>> + .cfi_restore 14
>> + .cfi_register 143, 12
>> + cfirestorelist \first, \first
>> + .else // !\push_lr
>> + /* Case 6: pop single GP register plus ip register. */
>> + pop {r\first, ip}
>> + .cfi_register 143, 12
>> + cfirestorelist \first, \first
>> + .endif
>> + .else // !\push_ip
>> + .if \push_lr
>> + /* Case 7: pop single GP register plus lr register. */
>> + pop {r\first, lr}
>> + .cfi_restore 14
>> + cfirestorelist \first, \first
>> + .else // !\push_lr
>> + /* Case 8: pop single GP register. */
>> + pop {r\first}
>> + cfirestorelist \first, \first
>> + .endif
>> + .endif
>> + .endif
>> + .else // \first == -1
>> + .if \push_ip
>> + .if \push_lr
>> + /* Case 9: pop ip and lr registers. */
>> + pop {ip, lr}
>> + .cfi_restore 14
>> + .cfi_register 143, 12
>> + .else // !\push_lr
>> + /* Case 10: pop ip register. */
>> + pop {ip}
>> + .cfi_register 143, 12
>> + .endif
>> + .else // !\push_ip
>> + .if \push_lr
>> + /* Case 11: pop lr register. */
>> + pop {lr}
>> + .cfi_restore 14
>> + .endif
>> + .endif
>> + .endif
>> +#if HAVE_PAC_LEAF
>> + aut ip, lr, sp
>> +#endif /* HAVE_PAC_LEAF */
>> + bx lr
>> +.endm
>> +
>> +# clean up expressions in 'last'
>> +.macro _preprocess_reglist1 first:req, last:req, push_ip:req,
>> push_lr:req, reglist_op:req
>> + .if \last == 0
>> + \reglist_op \first, 0, \push_ip, \push_lr
>> + .elseif \last == 1
>> + \reglist_op \first, 1, \push_ip, \push_lr
>> + .elseif \last == 2
>> + \reglist_op \first, 2, \push_ip, \push_lr
>> + .elseif \last == 3
>> + \reglist_op \first, 3, \push_ip, \push_lr
>> + .elseif \last == 4
>> + \reglist_op \first, 4, \push_ip, \push_lr
>> + .elseif \last == 5
>> + \reglist_op \first, 5, \push_ip, \push_lr
>> + .elseif \last == 6
>> + \reglist_op \first, 6, \push_ip, \push_lr
>> + .elseif \last == 7
>> + \reglist_op \first, 7, \push_ip, \push_lr
>> + .elseif \last == 8
>> + \reglist_op \first, 8, \push_ip, \push_lr
>> + .elseif \last == 9
>> + \reglist_op \first, 9, \push_ip, \push_lr
>> + .elseif \last == 10
>> + \reglist_op \first, 10, \push_ip, \push_lr
>> + .elseif \last == 11
>> + \reglist_op \first, 11, \push_ip, \push_lr
>> + .else
>> + .error "last (\last) out of range"
>> + .endif
>> +.endm
>> +
>> +# clean up expressions in 'first'
>> +.macro _preprocess_reglist first:req, last, push_ip=0, push_lr=0,
>> reglist_op:req
>> + .ifb \last
>> + _preprocess_reglist \first \first \push_ip \push_lr
>> + .else
>> + .if \first > \last
>> + .error "last (\last) must be at least as great as first (\first)"
>> + .endif
>> + .if \first == 0
>> + _preprocess_reglist1 0, \last, \push_ip, \push_lr, \reglist_op
>> + .elseif \first == 1
>> + _preprocess_reglist1 1, \last, \push_ip, \push_lr, \reglist_op
>> + .elseif \first == 2
>> + _preprocess_reglist1 2, \last, \push_ip, \push_lr, \reglist_op
>> + .elseif \first == 3
>> + _preprocess_reglist1 3, \last, \push_ip, \push_lr, \reglist_op
>> + .elseif \first == 4
>> + _preprocess_reglist1 4, \last, \push_ip, \push_lr, \reglist_op
>> + .elseif \first == 5
>> + _preprocess_reglist1 5, \last, \push_ip, \push_lr, \reglist_op
>> + .elseif \first == 6
>> + _preprocess_reglist1 6, \last, \push_ip, \push_lr, \reglist_op
>> + .elseif \first == 7
>> + _preprocess_reglist1 7, \last, \push_ip, \push_lr, \reglist_op
>> + .elseif \first == 8
>> + _preprocess_reglist1 8, \last, \push_ip, \push_lr, \reglist_op
>> + .elseif \first == 9
>> + _preprocess_reglist1 9, \last, \push_ip, \push_lr, \reglist_op
>> + .elseif \first == 10
>> + _preprocess_reglist1 10, \last, \push_ip, \push_lr, \reglist_op
>> + .elseif \first == 11
>> + _preprocess_reglist1 11, \last, \push_ip, \push_lr, \reglist_op
>> + .else
>> + .error "first (\first) out of range"
>> + .endif
>> + .endif
>> +.endm
>> +
>> +.macro _align8 first, last, push_ip=0, push_lr=0, reglist_op=_prologue
>> + .ifb \first
>> + .ifnb \last
>> + .error "can't have last (\last) without specifying first"
>> + .else // \last not blank
>> + .if ((\push_ip + \push_lr) % 2) == 0
>> + \reglist_op first=-1, last=-1, push_ip=\push_ip, push_lr=\push_lr
>> + .exitm
>> + .else // ((\push_ip + \push_lr) % 2) odd
>> + _align8 2, 2, \push_ip, \push_lr, \reglist_op
>> + .exitm
>> + .endif // ((\push_ip + \push_lr) % 2) == 0
>> + .endif // .ifnb \last
>> + .endif // .ifb \first
>> +
>> + .ifb \last
>> + _align8 \first, \first, \push_ip, \push_lr, \reglist_op
>> + .else
>> + .if \push_ip & 1 <> \push_ip
>> + .error "push_ip may be 0 or 1"
>> + .endif
>> + .if \push_lr & 1 <> \push_lr
>> + .error "push_lr may be 0 or 1"
>> + .endif
>> + .ifeq (\last - \first + \push_ip + \push_lr) % 2
>> + .if \first == 0
>> + .error "Alignment required and first register is r0"
>> + .exitm
>> + .endif
>> + _preprocess_reglist \first-1, \last, \push_ip, \push_lr,
>> \reglist_op
>> + .else
>> + _preprocess_reglist \first \last, \push_ip, \push_lr, \reglist_op
>> + .endif
>> + .endif
>> +.endm
>> +
>> +.macro prologue first, last, push_ip=PAC_LEAF_PUSH_IP, push_lr=0,
>> align8=STACK_ALIGN_ENFORCE
>> + .if \align8
>> + _align8 \first, \last, \push_ip, \push_lr, _prologue
>> + .else
>> + _prologue first=\first, last=\last, push_ip=\push_ip,
>> push_lr=\push_lr
>> + .endif
>> +.endm
>> +
>> +.macro epilogue first, last, push_ip=PAC_LEAF_PUSH_IP, push_lr=0,
>> align8=STACK_ALIGN_ENFORCE
>> + .if \align8
>> + _align8 \first, \last, \push_ip, \push_lr, reglist_op=_epilogue
>> + .else
>> + _epilogue first=\first, last=\last, push_ip=\push_ip,
>> push_lr=\push_lr
>> + .endif
>> +.endm
>> +
>> +#endif /* __ASSEMBLER__ */
>> +
>> #endif /* ARM_ASM__H */
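For readers following the quoted patch, the padding rule applied by the `_align8' macro can be modelled in C. This is a sketch under assumptions: `align8_first_reg', `NO_GPR' and `ALIGN_ERR' are hypothetical names introduced here for illustration, and the function only mirrors the parity test in the macro; it does not emit any instructions.

```c
#include <assert.h>

#define NO_GPR    (-1)  /* no general-purpose register in the list */
#define ALIGN_ERR (-2)  /* padding needed but the list starts at r0 */

/* Hypothetical model of the quoted _align8 macro: returns the lowest GPR
   number that ends up in the push/pop list, NO_GPR if there is none, or
   ALIGN_ERR where the macro would emit .error.  */
int align8_first_reg(int first, int last, int push_ip, int push_lr)
{
    assert((push_ip & 1) == push_ip);
    assert((push_lr & 1) == push_lr);

    if (first == NO_GPR) {
        assert(last == NO_GPR);  /* last without first is an error */
        /* Only ip/lr requested: an odd count gets r2 as a dummy. */
        return ((push_ip + push_lr) % 2 == 0) ? NO_GPR : 2;
    }
    if (last == NO_GPR)
        last = first;            /* single register given */
    assert(first <= last);

    /* Each register is 4 bytes, so an even count keeps sp 8-byte aligned. */
    int count = (last - first + 1) + push_ip + push_lr;
    if (count % 2 == 0)
        return first;
    if (first == 0)
        return ALIGN_ERR;        /* nothing below r0 to prepend */
    return first - 1;            /* prepend one dummy register */
}
```

align8_first_reg(1, NO_GPR, 0, 0) returns 0, matching `prologue 1, align8=1 -> push {r0, r1}', and align8_first_reg(NO_GPR, NO_GPR, 1, 0) returns 2, matching `prologue push_ip=1, align8=1 -> push {r2, ip}'.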
Thread overview: 15+ messages
2022-12-21 11:03 [PATCH v5 0/8] Implement assembly cortex-M PACBTI functionality Victor Do Nascimento
2022-12-21 11:19 ` [PATCH v5 1/8] newlib: libc: define M-profile PACBTI-enablement macros Victor L. Do Nascimento
2023-01-06 10:42 ` Christophe Lyon
2023-01-06 20:51 ` Victor Do Nascimento
2023-01-09 9:33 ` Christophe Lyon
2022-12-21 11:21 ` [PATCH v5 2/8] newlib: libc: strcmp M-profile PACBTI-enablement Victor L. Do Nascimento
2023-01-06 11:09 ` Christophe Lyon
2023-01-06 21:35 ` Victor Do Nascimento
2022-12-21 11:22 ` [PATCH v5 3/8] newlib: libc: strlen " Victor L. Do Nascimento
2022-12-21 11:24 ` [PATCH v5 4/8] newlib: libc: memchr " Victor L. Do Nascimento
2022-12-21 11:25 ` [PATCH v5 5/8] newlib: libc: memcpy " Victor L. Do Nascimento
2022-12-21 11:27 ` [PATCH v5 6/8] newlib: libc: aeabi_memmove " Victor L. Do Nascimento
2022-12-21 11:28 ` [PATCH v5 7/8] newlib: libc: aeabi_memset " Victor L. Do Nascimento
2022-12-21 11:42 ` [PATCH v5 8/8] newlib: libc: setjmp " Victor L. Do Nascimento
2023-01-05 16:53 ` Richard Earnshaw