* [PATCH v4 0/8] Implement assembly cortex-M PACBTI functionality
@ 2022-10-26 11:37 Victor L. Do Nascimento
2022-10-26 11:45 ` [PATCH v4 1/8] newlib: libc: define M-profile PACBTI-enablement macros Victor L. Do Nascimento
` (7 more replies)
0 siblings, 8 replies; 17+ messages in thread
From: Victor L. Do Nascimento @ 2022-10-26 11:37 UTC (permalink / raw)
To: newlib; +Cc: Richard.Earnshaw
Hi all,
This respin of the patch series builds on the previously proposed
prologue/epilogue interfaces. It makes them more flexible by adding new
parameters that control their behavior, and it corrects and simplifies
the CFI directives they emit.
Thanks,
Victor
------
This patch series modifies hand-written assembly files for Arm
targets, introducing a uniform prologue/epilogue interface,
responsible for pushing/popping registers on function entry and exit,
while conditionally enabling branch target identification as well as
return address signing and verification based on Armv8.1-M Pointer
Authentication [1], selected at compile time via ACLE feature test macros [2].
The incorporation of PACBTI functionality in function
prologues/epilogues is dictated by the combination of parameter macros
in arm_asm.h and arguments passed to the `-mbranch-protection' flag at
the time of Newlib compilation.
Regression tested on arm-none-eabi with and without the MVE extension,
for both Newlib and Newlib-nano.
[1]
<https://community.arm.com/developer/ip-products/processors/b/processors-ip-blog/posts/armv8-1-m-pointer-authentication-and-branch-target-identification-extension>
[2] <https://developer.arm.com/documentation/101028/0012/5--Feature-test-macros>
Victor Do Nascimento (8):
newlib: libc: define M-profile PACBTI-enablement macros
newlib: libc: strcmp M-profile PACBTI-enablement
newlib: libc: strlen M-profile PACBTI-enablement
newlib: libc: memchr M-profile PACBTI-enablement
newlib: libc: memcpy M-profile PACBTI-enablement
newlib: libc: setjmp/longjmp M-profile PACBTI-enablement
newlib: libc: aeabi_memmove M-profile PACBTI-enablement
newlib: libc: aeabi_memset M-profile PACBTI-enablement
.../libc/machine/arm/aeabi_memmove-thumb2.S | 17 +-
newlib/libc/machine/arm/aeabi_memset-thumb2.S | 14 +-
newlib/libc/machine/arm/arm_asm.h | 441 ++++++++++++++++++
newlib/libc/machine/arm/memchr.S | 42 +-
newlib/libc/machine/arm/memcpy-armv7m.S | 37 +-
newlib/libc/machine/arm/setjmp.S | 33 +-
newlib/libc/machine/arm/strcmp-arm-tiny.S | 8 +-
newlib/libc/machine/arm/strcmp-armv7.S | 44 +-
newlib/libc/machine/arm/strcmp-armv7m.S | 26 +-
newlib/libc/machine/arm/strlen-armv7.S | 17 +-
newlib/libc/machine/arm/strlen-thumb2-Os.S | 14 +-
11 files changed, 636 insertions(+), 57 deletions(-)
--
2.36.1
* [PATCH v4 1/8] newlib: libc: define M-profile PACBTI-enablement macros
2022-10-26 11:37 [PATCH v4 0/8] Implement assembly cortex-M PACBTI functionality Victor L. Do Nascimento
@ 2022-10-26 11:45 ` Victor L. Do Nascimento
2022-11-22 15:04 ` Richard Earnshaw
2022-10-26 11:46 ` [PATCH v4 2/8] newlib: libc: strcmp M-profile PACBTI-enablement Victor L. Do Nascimento
` (6 subsequent siblings)
7 siblings, 1 reply; 17+ messages in thread
From: Victor L. Do Nascimento @ 2022-10-26 11:45 UTC (permalink / raw)
To: newlib; +Cc: Richard.Earnshaw
Augment the arm_asm.h header file to simplify function prologues and
epilogues whilst adding support for PACBTI enablement via macros for
hand-written assembly functions. For PACBTI, both the
prologues/epilogues and the CFI-related directives are amended
automatically, depending on the value passed to `-mbranch-protection'
at compile time.
It defines the following preprocessor macros:
* HAVE_PAC_LEAF: Indicates whether pac-signing has been requested for
leaf functions.
* PAC_LEAF_PUSH_IP: Whether leaf functions should push the pac code
to the stack irrespective of whether the ip register is clobbered in
the function or not.
* STACK_ALIGN_ENFORCE: Whether a dummy register should be added to
the push list as necessary in the prologue to preserve stack
alignment at the start of an assembly function. The epilogue
behavior is likewise affected by this flag, ensuring any pushed
dummy registers also get popped on function return.
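As a rough illustration of how these defaults interact, the following
host-side C sketch models the preprocessor logic as ordinary functions.
It is not part of the patch: `have_pac_leaf' and `pac_leaf_push_ip' are
hypothetical names, and a negative `user_setting' stands in for
PAC_LEAF_PUSH_IP not having been defined before the header is included.

```c
#include <stdbool.h>

/* Bit of __ARM_FEATURE_PAC_DEFAULT that the patch tests for leaf
   protection (LEAF_PROTECT_BIT in arm_asm.h).  */
#define LEAF_PROTECT_BIT 2

/* Model of HAVE_PAC_LEAF.  Pass 0 for pac_default to model
   __ARM_FEATURE_PAC_DEFAULT being undefined.  */
bool have_pac_leaf(unsigned pac_default)
{
    return (pac_default & (1u << LEAF_PROTECT_BIT)) != 0;
}

/* Model of the PAC_LEAF_PUSH_IP default.  A negative user_setting
   means the macro was not predefined (an assumption of this sketch).  */
int pac_leaf_push_ip(unsigned pac_default, int user_setting)
{
    if (!have_pac_leaf(pac_default))
        return 0;                 /* header forces 0 without leaf PAC */
    return (user_setting >= 0) ? user_setting : 1;
}
```

In this model, a value with bit 2 set (for example 5) enables leaf
signing and makes push_ip default to 1, while a value of 1 leaves both
off regardless of any user setting.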
It also defines the following assembler macros:
* prologue: In addition to pushing any callee-saved registers onto
the stack, it generates any requested pacbti instructions.
Pushed registers are specified via the optional `first', `last',
`push_ip' and `push_lr' macro arguments.
When a single register number is provided, it pushes that
register. When two register numbers are provided, they specify a
range to save. If push_ip and/or push_lr are non-zero, the
respective registers are also saved. Stack alignment is requested
via the `align8' argument, which defaults to the value of
STACK_ALIGN_ENFORCE unless manually overridden.
For example:
prologue push_ip=1 -> push {ip}
prologue push_ip=1, align8=1 -> push {r2, ip}
prologue push_ip=1, push_lr=1 -> push {ip, lr}
prologue 1 -> push {r1}
prologue 1, align8=1 -> push {r0, r1}
prologue 1 push_ip=1 -> push {r1, ip}
prologue 1 4 -> push {r1-r4}
prologue 1 4 push_ip=1 -> push {r1-r4, ip}
* epilogue: pops registers off the stack and emits the PAC
authentication instruction, if requested. The `first', `last',
`push_ip', `push_lr' and `align8' arguments function as per the
prologue macro, generating pop instead of push instructions.
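The padding rule behind `align8' can be modelled on the host as
follows. This is only a sketch of the behavior described above, not
code from the patch: `struct reglist', `reglist_count' and
`align8_reglist' are hypothetical names, and -1 stands for an omitted
`first' or `last' argument.

```c
/* Hypothetical host-side model of the align8 padding decision.  */
struct reglist {
    int first;    /* lowest GPR number to push, or -1 for none */
    int last;     /* highest GPR number to push, or -1 if omitted */
    int push_ip;  /* 1 to push ip, else 0 */
    int push_lr;  /* 1 to push lr, else 0 */
};

/* Number of registers pushed; expects last to be set when first is.  */
int reglist_count(const struct reglist *r)
{
    int gprs = (r->first >= 0) ? r->last - r->first + 1 : 0;
    return gprs + r->push_ip + r->push_lr;
}

/* Pad the list to an even register count so sp stays 8-byte aligned.
   Returns 0 on success, -1 where the macro would raise an error.  */
int align8_reglist(struct reglist *r)
{
    if (r->first < 0) {
        if (r->last >= 0)
            return -1;               /* last given without first */
        if ((r->push_ip + r->push_lr) % 2 != 0)
            r->first = r->last = 2;  /* r2 serves as the dummy register */
        return 0;
    }
    if (r->last < 0)
        r->last = r->first;          /* single register */
    if (reglist_count(r) % 2 != 0) {
        if (r->first == 0)
            return -1;               /* nothing below r0 to prepend */
        r->first -= 1;               /* prepend the next lower register */
    }
    return 0;
}
```

Under this model, `prologue 1, align8=1' corresponds to {1, -1, 0, 0}
and comes back as the range r0-r1, while `prologue push_ip=1, align8=1'
corresponds to {-1, -1, 1, 0} and gains r2 as padding, matching the
examples above.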
Stack alignment is enforced via the following helper macro
call-chain:
{prologue|epilogue} -> _align8 -> _preprocess_reglist ->
_preprocess_reglist1 -> {_prologue|_epilogue}
Finally, the necessary cfi directives for adding debug information
to prologue and epilogue are generated via the following macros:
* cfisavelist - prologue macro helper function, generating
necessary .cfi_offset directives associated with push instruction.
Therefore, the net effect of calling `prologue 1 2 push_ip=1' is
to generate the following:
push {r1-r2, ip}
.cfi_adjust_cfa_offset 12
.cfi_offset 143, -4
.cfi_offset 2, -8
.cfi_offset 1, -12
* cfirestorelist - epilogue macro helper function, emitting
.cfi_restore directives prior to resetting the cfa offset. As
such, calling `epilogue 1 2 push_ip=1' will produce:
pop {r1-r2, ip}
.cfi_register 143, 12
.cfi_restore 2
.cfi_restore 1
.cfi_def_cfa_offset 0
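To make the offset bookkeeping on the prologue side concrete, here is a
small host-side C sketch that prints the push-related directives for a
given register list. It is an illustration only, not code from the
patch: `append' and `prologue_cfi' are hypothetical helpers, -1 for
`first' means no GPR range, and the `.cfi_register 143, 12' line that
accompanies the pac instruction is deliberately left out.

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/* Append formatted text at buf[len]; returns the new length, or -1 if
   the text does not fit or an earlier append already failed.  */
static int append(char *buf, size_t size, int len, const char *fmt, ...)
{
    va_list ap;
    int written;

    if (len < 0 || (size_t)len >= size)
        return -1;
    va_start(ap, fmt);
    written = vsnprintf(buf + len, size - (size_t)len, fmt, ap);
    va_end(ap);
    if (written < 0 || (size_t)written >= size - (size_t)len)
        return -1;
    return len + written;
}

/* Write the CFI directives that follow the push: lr (14) takes the
   slot nearest the CFA, then the PAC code (143), then the GPR range
   from last down to first.  Returns the text length or -1.  */
int prologue_cfi(char *buf, size_t size, int first, int last,
                 int push_ip, int push_lr)
{
    int nregs = (first >= 0 ? last - first + 1 : 0) + push_ip + push_lr;
    int slot = 1;
    int len = 0;
    int r;

    if (size == 0)
        return -1;
    buf[0] = '\0';
    if (nregs == 0)
        return 0;                   /* nothing pushed, nothing to emit */

    len = append(buf, size, len, ".cfi_adjust_cfa_offset %d\n", 4 * nregs);
    if (push_lr)
        len = append(buf, size, len, ".cfi_offset 14, %d\n", -4 * slot++);
    if (push_ip)
        len = append(buf, size, len, ".cfi_offset 143, %d\n", -4 * slot++);
    for (r = last; first >= 0 && r >= first; r--)
        len = append(buf, size, len, ".cfi_offset %d, %d\n", r, -4 * slot++);
    return len;
}
```

Calling prologue_cfi with first=1, last=2, push_ip=1 and push_lr=0
reproduces the four directives listed for `prologue 1 2 push_ip=1'
above.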
---
newlib/libc/machine/arm/arm_asm.h | 441 ++++++++++++++++++++++++++++++
1 file changed, 441 insertions(+)
diff --git a/newlib/libc/machine/arm/arm_asm.h b/newlib/libc/machine/arm/arm_asm.h
index 2708057de..94fa77b4d 100644
--- a/newlib/libc/machine/arm/arm_asm.h
+++ b/newlib/libc/machine/arm/arm_asm.h
@@ -60,4 +60,445 @@
# define _ISA_THUMB_1
#endif
+/* Check whether leaf function PAC signing has been requested in the
+ -mbranch-protection compile-time option. */
+#define LEAF_PROTECT_BIT 2
+
+#ifdef __ARM_FEATURE_PAC_DEFAULT
+# define HAVE_PAC_LEAF \
+ ((__ARM_FEATURE_PAC_DEFAULT & (1 << LEAF_PROTECT_BIT)) && 1)
+#else
+# define HAVE_PAC_LEAF 0
+#endif
+
+/* Provide default parameters for PAC-code handling in leaf-functions. */
+#if HAVE_PAC_LEAF
+# ifndef PAC_LEAF_PUSH_IP
+# define PAC_LEAF_PUSH_IP 1
+# endif
+#else /* !HAVE_PAC_LEAF */
+# undef PAC_LEAF_PUSH_IP
+# define PAC_LEAF_PUSH_IP 0
+#endif /* HAVE_PAC_LEAF */
+
+#define STACK_ALIGN_ENFORCE 0
+
+#ifdef __ASSEMBLER__
+
+/******************************************************************************
+* Implementation of the prologue and epilogue assembler macros and their
+* associated helper functions.
+*
+* These functions add support for the following:
+*
+* - M-profile branch target identification (BTI) landing-pads when compiled
+* with `-mbranch-protection=bti'.
+* - PAC-signing and verification instructions, depending on hardware support
+* and whether the PAC-signing of leaf functions has been requested via the
+* `-mbranch-protection=pac-ret+leaf' compiler argument.
+* - 8-byte stack alignment preservation at function entry, defaulting to the
+* value of STACK_ALIGN_ENFORCE.
+*
+* Notes:
+* - Prologue stack alignment is implemented by detecting a push with an odd
+* number of registers and prepending a dummy register to the list.
+* - If alignment is attempted on a list containing r0, compilation will result
+* in an error.
+* - If alignment is attempted in a list containing r1, r0 will be prepended to
+* the register list and r0 will be restored prior to function return. For
+* functions with non-void return types, this will result in the corruption of
+* the result register.
+* - Stack alignment is enforced via the following helper macro call-chain:
+*
+* {prologue|epilogue} ->_align8 -> _preprocess_reglist ->
+* _preprocess_reglist1 -> {_prologue|_epilogue}
+*
+* - Debug CFI directives are automatically added to prologues and epilogues,
+* assisted by `cfisavelist' and `cfirestorelist', respectively.
+*
+* Arguments:
+* prologue
+* --------
+* - first - If `last' specified, this serves as start of general-purpose
+* register (GPR) range to push onto stack, otherwise represents
+* single GPR to push onto stack. If omitted, no GPRs pushed
+* onto stack at prologue.
+* - last - If given, specifies inclusive upper-bound of GPR range.
+* - push_ip - Determines whether IP register is to be pushed to stack at
+* prologue. When pac-signing is requested, this holds the
+* pac-key. Either 1 or 0 to push or not push, respectively.
+* Default behavior: Set to value of PAC_LEAF_PUSH_IP macro.
+* - push_lr - Determines whether to push lr to the stack on function entry.
+* Either 1 or 0 to push or not push, respectively.
+* - align8 - Whether to enforce alignment. Either 1 or 0, with 1 requesting
+* alignment.
+*
+* epilogue
+* --------
+* The epilogue should be called passing the same arguments as those passed to
+* the prologue to ensure the stack is not corrupted on function return.
+*
+* Usage examples:
+*
+* prologue push_ip=1 -> push {ip}
+* epilogue push_ip=1, align8=1 -> pop {r2, ip}
+* prologue push_ip=1, push_lr=1 -> push {ip, lr}
+* epilogue 1 -> pop {r1}
+* prologue 1, align8=1 -> push {r0, r1}
+* epilogue 1, push_ip=1 -> pop {r1, ip}
+* prologue 1, 4 -> push {r1-r4}
+* epilogue 1, 4 push_ip=1 -> pop {r1-r4, ip}
+*
+******************************************************************************/
+
+/* Emit .cfi_restore directives for a consecutive sequence of registers. */
+ .macro cfirestorelist first, last
+ .cfi_restore \last
+ .if \last-\first
+ cfirestorelist \first, \last-1
+ .endif
+ .endm
+
+/* Emit .cfi_offset directives for a consecutive sequence of registers. */
+ .macro cfisavelist first, last, index=1
+ .cfi_offset \last, -4*(\index)
+ .if \last-\first
+ cfisavelist \first, \last-1, \index+1
+ .endif
+ .endm
+
+.macro _prologue first=-1, last=-1, push_ip=PAC_LEAF_PUSH_IP, push_lr=0
+ .if \push_ip & 1 != \push_ip
+ .error "push_ip may be either 0 or 1"
+ .endif
+ .if \push_lr & 1 != \push_lr
+ .error "push_lr may be either 0 or 1"
+ .endif
+ .if \first != -1
+ .if \last == -1
+ /* Upper-bound not provided: Set upper = lower. */
+ _prologue \first, \first, \push_ip, \push_lr
+ .exitm
+ .endif
+ .endif
+#if HAVE_PAC_LEAF
+#if __ARM_FEATURE_BTI_DEFAULT
+ pacbti ip, lr, sp
+#else
+ pac ip, lr, sp
+#endif /* __ARM_FEATURE_BTI_DEFAULT */
+ .cfi_register 143, 12
+#else
+#if __ARM_FEATURE_BTI_DEFAULT
+ bti
+#endif /* __ARM_FEATURE_BTI_DEFAULT */
+#endif /* HAVE_PAC_LEAF */
+ .if \first != -1
+ .if \last != \first
+ .if \last >= 13
+ .error "SP cannot be in the save list"
+ .endif
+ .if \push_ip
+ .if \push_lr
+ /* Case 1: push register range, ip and lr registers. */
+ push {r\first-r\last, ip, lr}
+ .cfi_adjust_cfa_offset ((\last-\first)+3)*4
+ .cfi_offset 14, -4
+ .cfi_offset 143, -8
+ cfisavelist \first, \last, 3
+ .else // !\push_lr
+ /* Case 2: push register range and ip register. */
+ push {r\first-r\last, ip}
+ .cfi_adjust_cfa_offset ((\last-\first)+2)*4
+ .cfi_offset 143, -4
+ cfisavelist \first, \last, 2
+ .endif
+ .else // !\push_ip
+ .if \push_lr
+ /* Case 3: push register range and lr register. */
+ push {r\first-r\last, lr}
+ .cfi_adjust_cfa_offset ((\last-\first)+2)*4
+ .cfi_offset 14, -4
+ cfisavelist \first, \last, 2
+ .else // !\push_lr
+ /* Case 4: push register range. */
+ push {r\first-r\last}
+ .cfi_adjust_cfa_offset ((\last-\first)+1)*4
+ cfisavelist \first, \last, 1
+ .endif
+ .endif
+ .else // \last == \first
+ .if \push_ip
+ .if \push_lr
+ /* Case 5: push single GP register plus ip and lr registers. */
+ push {r\first, ip, lr}
+ .cfi_adjust_cfa_offset 12
+ .cfi_offset 14, -4
+ .cfi_offset 143, -8
+ cfisavelist \first, \first, 3
+ .else // !\push_lr
+ /* Case 6: push single GP register plus ip register. */
+ push {r\first, ip}
+ .cfi_adjust_cfa_offset 8
+ .cfi_offset 143, -4
+ cfisavelist \first, \first, 2
+ .endif
+ .else // !\push_ip
+ .if \push_lr
+ /* Case 7: push single GP register plus lr register. */
+ push {r\first, lr}
+ .cfi_adjust_cfa_offset 8
+ .cfi_offset 14, -4
+ cfisavelist \first, \first, 2
+ .else // !\push_lr
+ /* Case 8: push single GP register. */
+ push {r\first}
+ .cfi_adjust_cfa_offset 4
+ cfisavelist \first, \first, 1
+ .endif
+ .endif
+ .endif
+ .else // \first == -1
+ .if \push_ip
+ .if \push_lr
+ /* Case 9: push ip and lr registers. */
+ push {ip, lr}
+ .cfi_adjust_cfa_offset 8
+ .cfi_offset 14, -4
+ .cfi_offset 143, -8
+ .else // !\push_lr
+ /* Case 10: push ip register. */
+ push {ip}
+ .cfi_adjust_cfa_offset 4
+ .cfi_offset 143, -4
+ .endif
+ .else // !\push_ip
+ .if \push_lr
+ /* Case 11: push lr register. */
+ push {lr}
+ .cfi_adjust_cfa_offset 4
+ .cfi_offset 14, -4
+ .endif
+ .endif
+ .endif
+.endm
+
+.macro _epilogue first=-1, last=-1, push_ip=PAC_LEAF_PUSH_IP, push_lr=0
+ .if \push_ip & 1 != \push_ip
+ .error "push_ip may be either 0 or 1"
+ .endif
+ .if \push_lr & 1 != \push_lr
+ .error "push_lr may be either 0 or 1"
+ .endif
+ .if \first != -1
+ .if \last == -1
+ /* Upper-bound not provided: Set upper = lower. */
+ _epilogue \first, \first, \push_ip, \push_lr
+ .exitm
+ .endif
+ .if \last != \first
+ .if \last >= 13
+ .error "SP cannot be in the save list"
+ .endif
+ .if \push_ip
+ .if \push_lr
+ /* Case 1: pop register range, ip and lr registers. */
+ pop {r\first-r\last, ip, lr}
+ .cfi_restore 14
+ .cfi_register 143, 12
+ cfirestorelist \first, \last
+ .else // !\push_lr
+ /* Case 2: pop register range and ip register. */
+ pop {r\first-r\last, ip}
+ .cfi_register 143, 12
+ cfirestorelist \first, \last
+ .endif
+ .else // !\push_ip
+ .if \push_lr
+ /* Case 3: pop register range and lr register. */
+ pop {r\first-r\last, lr}
+ .cfi_restore 14
+ cfirestorelist \first, \last
+ .else // !\push_lr
+ /* Case 4: pop register range. */
+ pop {r\first-r\last}
+ cfirestorelist \first, \last
+ .endif
+ .endif
+ .else // \last == \first
+ .if \push_ip
+ .if \push_lr
+ /* Case 5: pop single GP register plus ip and lr registers. */
+ pop {r\first, ip, lr}
+ .cfi_restore 14
+ .cfi_register 143, 12
+ cfirestorelist \first, \first
+ .else // !\push_lr
+ /* Case 6: pop single GP register plus ip register. */
+ pop {r\first, ip}
+ .cfi_register 143, 12
+ cfirestorelist \first, \first
+ .endif
+ .else // !\push_ip
+ .if \push_lr
+ /* Case 7: pop single GP register plus lr register. */
+ pop {r\first, lr}
+ .cfi_restore 14
+ cfirestorelist \first, \first
+ .else // !\push_lr
+ /* Case 8: pop single GP register. */
+ pop {r\first}
+ cfirestorelist \first, \first
+ .endif
+ .endif
+ .endif
+ .else // \first == -1
+ .if \push_ip
+ .if \push_lr
+ /* Case 9: pop ip and lr registers. */
+ pop {ip, lr}
+ .cfi_restore 14
+ .cfi_register 143, 12
+ .else // !\push_lr
+ /* Case 10: pop ip register. */
+ pop {ip}
+ .cfi_register 143, 12
+ .endif
+ .else // !\push_ip
+ .if \push_lr
+ /* Case 11: pop lr register. */
+ pop {lr}
+ .cfi_restore 14
+ .endif
+ .endif
+ .endif
+#if HAVE_PAC_LEAF
+ aut ip, lr, sp
+#endif /* HAVE_PAC_LEAF */
+ bx lr
+.endm
+
+# clean up expressions in 'last'
+.macro _preprocess_reglist1 first:req, last:req, push_ip:req, push_lr:req, reglist_op:req
+ .if \last == 0
+ \reglist_op \first, 0, \push_ip, \push_lr
+ .elseif \last == 1
+ \reglist_op \first, 1, \push_ip, \push_lr
+ .elseif \last == 2
+ \reglist_op \first, 2, \push_ip, \push_lr
+ .elseif \last == 3
+ \reglist_op \first, 3, \push_ip, \push_lr
+ .elseif \last == 4
+ \reglist_op \first, 4, \push_ip, \push_lr
+ .elseif \last == 5
+ \reglist_op \first, 5, \push_ip, \push_lr
+ .elseif \last == 6
+ \reglist_op \first, 6, \push_ip, \push_lr
+ .elseif \last == 7
+ \reglist_op \first, 7, \push_ip, \push_lr
+ .elseif \last == 8
+ \reglist_op \first, 8, \push_ip, \push_lr
+ .elseif \last == 9
+ \reglist_op \first, 9, \push_ip, \push_lr
+ .elseif \last == 10
+ \reglist_op \first, 10, \push_ip, \push_lr
+ .elseif \last == 11
+ \reglist_op \first, 11, \push_ip, \push_lr
+ .else
+ .error "last (\last) out of range"
+ .endif
+.endm
+
+# clean up expressions in 'first'
+.macro _preprocess_reglist first:req, last, push_ip=0, push_lr=0, reglist_op:req
+ .ifb \last
+ _preprocess_reglist \first, \first, \push_ip, \push_lr, \reglist_op
+ .else
+ .if \first > \last
+ .error "last (\last) must be at least as great as first (\first)"
+ .endif
+ .if \first == 0
+ _preprocess_reglist1 0, \last, \push_ip, \push_lr, \reglist_op
+ .elseif \first == 1
+ _preprocess_reglist1 1, \last, \push_ip, \push_lr, \reglist_op
+ .elseif \first == 2
+ _preprocess_reglist1 2, \last, \push_ip, \push_lr, \reglist_op
+ .elseif \first == 3
+ _preprocess_reglist1 3, \last, \push_ip, \push_lr, \reglist_op
+ .elseif \first == 4
+ _preprocess_reglist1 4, \last, \push_ip, \push_lr, \reglist_op
+ .elseif \first == 5
+ _preprocess_reglist1 5, \last, \push_ip, \push_lr, \reglist_op
+ .elseif \first == 6
+ _preprocess_reglist1 6, \last, \push_ip, \push_lr, \reglist_op
+ .elseif \first == 7
+ _preprocess_reglist1 7, \last, \push_ip, \push_lr, \reglist_op
+ .elseif \first == 8
+ _preprocess_reglist1 8, \last, \push_ip, \push_lr, \reglist_op
+ .elseif \first == 9
+ _preprocess_reglist1 9, \last, \push_ip, \push_lr, \reglist_op
+ .elseif \first == 10
+ _preprocess_reglist1 10, \last, \push_ip, \push_lr, \reglist_op
+ .elseif \first == 11
+ _preprocess_reglist1 11, \last, \push_ip, \push_lr, \reglist_op
+ .else
+ .error "first (\first) out of range"
+ .endif
+ .endif
+.endm
+
+.macro _align8 first, last, push_ip=0, push_lr=0, reglist_op=_prologue
+ .ifb \first
+ .ifnb \last
+ .error "can't have last (\last) without specifying first"
+ .else // \last blank
+ .if ((\push_ip + \push_lr) % 2) == 0
+ \reglist_op first=-1, last=-1, push_ip=\push_ip, push_lr=\push_lr
+ .exitm
+ .else // ((\push_ip + \push_lr) % 2) odd
+ _align8 2, 2, \push_ip, \push_lr, \reglist_op
+ .exitm
+ .endif // ((\push_ip + \push_lr) % 2) == 0
+ .endif // .ifnb \last
+ .endif // .ifb \first
+
+ .ifb \last
+ _align8 \first, \first, \push_ip, \push_lr, \reglist_op
+ .else
+ .if \push_ip & 1 <> \push_ip
+ .error "push_ip may be 0 or 1"
+ .endif
+ .if \push_lr & 1 <> \push_lr
+ .error "push_lr may be 0 or 1"
+ .endif
+ .ifeq (\last - \first + \push_ip + \push_lr) % 2
+ .if \first == 0
+ .error "Alignment required and first register is r0"
+ .exitm
+ .endif
+ _preprocess_reglist \first-1, \last, \push_ip, \push_lr, \reglist_op
+ .else
+ _preprocess_reglist \first \last, \push_ip, \push_lr, \reglist_op
+ .endif
+ .endif
+.endm
+
+.macro prologue first, last, push_ip=PAC_LEAF_PUSH_IP, push_lr=0, align8=STACK_ALIGN_ENFORCE
+ .if \align8
+ _align8 \first, \last, \push_ip, \push_lr, _prologue
+ .else
+ _prologue first=\first, last=\last, push_ip=\push_ip, push_lr=\push_lr
+ .endif
+.endm
+
+.macro epilogue first, last, push_ip=PAC_LEAF_PUSH_IP, push_lr=0, align8=STACK_ALIGN_ENFORCE
+ .if \align8
+ _align8 \first, \last, \push_ip, \push_lr, reglist_op=_epilogue
+ .else
+ _epilogue first=\first, last=\last, push_ip=\push_ip, push_lr=\push_lr
+ .endif
+.endm
+
+#endif /* __ASSEMBLER__ */
+
#endif /* ARM_ASM__H */
--
2.36.1
* [PATCH v4 2/8] newlib: libc: strcmp M-profile PACBTI-enablement
2022-10-26 11:37 [PATCH v4 0/8] Implement assembly cortex-M PACBTI functionality Victor L. Do Nascimento
2022-10-26 11:45 ` [PATCH v4 1/8] newlib: libc: define M-profile PACBTI-enablement macros Victor L. Do Nascimento
@ 2022-10-26 11:46 ` Victor L. Do Nascimento
2022-11-22 15:04 ` Richard Earnshaw
2022-10-26 11:47 ` [PATCH v4 3/8] newlib: libc: strlen " Victor L. Do Nascimento
` (5 subsequent siblings)
7 siblings, 1 reply; 17+ messages in thread
From: Victor L. Do Nascimento @ 2022-10-26 11:46 UTC (permalink / raw)
To: newlib; +Cc: Richard.Earnshaw
Add function prologue/epilogue to conditionally add BTI landing pads
and/or PAC code generation & authentication instructions depending on
compilation flags.
This patch enables PACBTI for all relevant variants of strcmp:
* Newlib for armv8.1-m.main+pacbti
* Newlib for armv8.1-m.main+pacbti+mve
* Newlib-nano
---
newlib/libc/machine/arm/strcmp-arm-tiny.S | 8 ++++-
newlib/libc/machine/arm/strcmp-armv7.S | 44 +++++++++++++++--------
newlib/libc/machine/arm/strcmp-armv7m.S | 26 +++++++++-----
3 files changed, 54 insertions(+), 24 deletions(-)
diff --git a/newlib/libc/machine/arm/strcmp-arm-tiny.S b/newlib/libc/machine/arm/strcmp-arm-tiny.S
index 607a41daf..0bd2a2e6e 100644
--- a/newlib/libc/machine/arm/strcmp-arm-tiny.S
+++ b/newlib/libc/machine/arm/strcmp-arm-tiny.S
@@ -29,10 +29,14 @@
/* Tiny version of strcmp in ARM state. Used only when optimizing
for size. Also supports Thumb-2. */
+#include "arm_asm.h"
+
.syntax unified
def_fn strcmp
+ .fnstart
.cfi_sections .debug_frame
.cfi_startproc
+ prologue
1:
ldrb r2, [r0], #1
ldrb r3, [r1], #1
@@ -42,6 +46,8 @@ def_fn strcmp
beq 1b
2:
subs r0, r2, r3
- bx lr
+ epilogue
.cfi_endproc
+ .cantunwind
+ .fnend
.size strcmp, . - strcmp
diff --git a/newlib/libc/machine/arm/strcmp-armv7.S b/newlib/libc/machine/arm/strcmp-armv7.S
index 2f93bfb73..26ba579ae 100644
--- a/newlib/libc/machine/arm/strcmp-armv7.S
+++ b/newlib/libc/machine/arm/strcmp-armv7.S
@@ -45,6 +45,8 @@
.thumb
.syntax unified
+#include "arm_asm.h"
+
/* Parameters and result. */
#define src1 r0
#define src2 r1
@@ -91,8 +93,9 @@
ldrd r4, r5, [sp], #16
.cfi_restore 4
.cfi_restore 5
+ .cfi_adjust_cfa_offset -16
sub result, result, r1, lsr #24
- bx lr
+ epilogue push_ip=HAVE_PAC_LEAF
#else
/* To use the big-endian trick we'd have to reverse all three words.
that's slower than this approach. */
@@ -112,22 +115,30 @@
ldrd r4, r5, [sp], #16
.cfi_restore 4
.cfi_restore 5
+ .cfi_adjust_cfa_offset -16
sub result, result, r1
- bx lr
+ epilogue push_ip=HAVE_PAC_LEAF
#endif
.endm
+
.text
.p2align 5
+ .fnstart
+ .cfi_sections .debug_frame
+ .cfi_startproc
.Lstrcmp_start_addr:
#ifndef STRCMP_NO_PRECHECK
.Lfastpath_exit:
+ .cfi_remember_state
sub r0, r2, r3
- bx lr
+ epilogue push_ip=HAVE_PAC_LEAF
nop
#endif
def_fn strcmp
+ .cfi_restore_state
+ prologue push_ip=HAVE_PAC_LEAF
#ifndef STRCMP_NO_PRECHECK
ldrb r2, [src1]
ldrb r3, [src2]
@@ -136,16 +147,14 @@ def_fn strcmp
cmpcs r2, r3
bne .Lfastpath_exit
#endif
- .cfi_sections .debug_frame
- .cfi_startproc
strd r4, r5, [sp, #-16]!
- .cfi_def_cfa_offset 16
- .cfi_offset 4, -16
- .cfi_offset 5, -12
+ .cfi_adjust_cfa_offset 16
+ .cfi_rel_offset 4, 0
+ .cfi_rel_offset 5, 4
orr tmp1, src1, src2
strd r6, r7, [sp, #8]
- .cfi_offset 6, -8
- .cfi_offset 7, -4
+ .cfi_rel_offset 6, 8
+ .cfi_rel_offset 7, 12
mvn const_m1, #0
lsl r2, tmp1, #29
cbz r2, .Lloop_aligned8
@@ -270,7 +279,6 @@ def_fn strcmp
ldr data1, [src1], #4
beq .Laligned_m2
bcs .Laligned_m1
-
#ifdef STRCMP_NO_PRECHECK
ldrb data2, [src2, #1]
uxtb tmp1, data1, ror #BYTE1_OFFSET
@@ -314,7 +322,8 @@ def_fn strcmp
mov result, tmp1
ldr r4, [sp], #16
.cfi_restore 4
- bx lr
+ .cfi_adjust_cfa_offset -16
+ epilogue push_ip=HAVE_PAC_LEAF
#ifndef STRCMP_NO_PRECHECK
.Laligned_m1:
@@ -364,8 +373,9 @@ def_fn strcmp
/* R6/7 Not used in this sequence. */
.cfi_restore 6
.cfi_restore 7
+ .cfi_adjust_cfa_offset -16
neg result, result
- bx lr
+ epilogue push_ip=HAVE_PAC_LEAF
6:
.cfi_restore_state
@@ -441,7 +451,8 @@ def_fn strcmp
/* R6/7 not used in this sequence. */
.cfi_restore 6
.cfi_restore 7
- bx lr
+ .cfi_adjust_cfa_offset -16
+ epilogue push_ip=HAVE_PAC_LEAF
.Lstrcmp_tail:
.cfi_restore_state
@@ -463,7 +474,10 @@ def_fn strcmp
/* R6/7 not used in this sequence. */
.cfi_restore 6
.cfi_restore 7
+ .cfi_adjust_cfa_offset -16
sub result, result, data2, lsr #24
- bx lr
+ epilogue push_ip=HAVE_PAC_LEAF
.cfi_endproc
+ .cantunwind
+ .fnend
.size strcmp, . - .Lstrcmp_start_addr
diff --git a/newlib/libc/machine/arm/strcmp-armv7m.S b/newlib/libc/machine/arm/strcmp-armv7m.S
index cdb4912df..825b6e77f 100644
--- a/newlib/libc/machine/arm/strcmp-armv7m.S
+++ b/newlib/libc/machine/arm/strcmp-armv7m.S
@@ -29,6 +29,8 @@
/* Very similar to the generic code, but uses Thumb2 as implemented
in ARMv7-M. */
+#include "arm_asm.h"
+
/* Parameters and result. */
#define src1 r0
#define src2 r1
@@ -44,8 +46,10 @@
.thumb
.syntax unified
def_fn strcmp
+ .fnstart
.cfi_sections .debug_frame
.cfi_startproc
+ prologue push_ip=HAVE_PAC_LEAF
eor tmp1, src1, src2
tst tmp1, #3
/* Strings not at same byte offset from a word boundary. */
@@ -82,6 +86,7 @@ def_fn strcmp
ldreq data2, [src2], #4
beq 4b
2:
+ .cfi_remember_state
/* There's a zero or a different byte in the word */
S2HI result, data1, #24
S2LO data1, data1, #8
@@ -106,7 +111,7 @@ def_fn strcmp
lsrs result, result, #24
subs result, result, data2
#endif
- bx lr
+ epilogue push_ip=HAVE_PAC_LEAF
#if 0
@@ -205,8 +210,10 @@ def_fn strcmp
/* First of all, compare bytes until src1(sp1) is word-aligned. */
.Lstrcmp_unaligned:
+ .cfi_restore_state
tst src1, #3
beq 2f
+ .cfi_remember_state
ldrb data1, [src1], #1
ldrb data2, [src2], #1
cmp data1, #1
@@ -214,12 +221,13 @@ def_fn strcmp
cmpcs data1, data2
beq .Lstrcmp_unaligned
sub result, data1, data2
- bx lr
+ epilogue push_ip=HAVE_PAC_LEAF
2:
+ .cfi_restore_state
stmfd sp!, {r5}
- .cfi_def_cfa_offset 4
- .cfi_offset 5, -4
+ .cfi_adjust_cfa_offset 4
+ .cfi_rel_offset 5, 0
ldr data1, [src1], #4
and tmp2, src2, #3
@@ -355,8 +363,8 @@ def_fn strcmp
.cfi_remember_state
ldmfd sp!, {r5}
.cfi_restore 5
- .cfi_def_cfa_offset 0
- bx lr
+ .cfi_adjust_cfa_offset -4
+ epilogue push_ip=HAVE_PAC_LEAF
.Lstrcmp_tail:
.cfi_restore_state
@@ -372,7 +380,9 @@ def_fn strcmp
sub result, r2, result
ldmfd sp!, {r5}
.cfi_restore 5
- .cfi_def_cfa_offset 0
- bx lr
+ .cfi_adjust_cfa_offset -4
+ epilogue push_ip=HAVE_PAC_LEAF
.cfi_endproc
+ .cantunwind
+ .fnend
.size strcmp, . - strcmp
--
2.36.1
* [PATCH v4 3/8] newlib: libc: strlen M-profile PACBTI-enablement
2022-10-26 11:37 [PATCH v4 0/8] Implement assembly cortex-M PACBTI functionality Victor L. Do Nascimento
2022-10-26 11:45 ` [PATCH v4 1/8] newlib: libc: define M-profile PACBTI-enablement macros Victor L. Do Nascimento
2022-10-26 11:46 ` [PATCH v4 2/8] newlib: libc: strcmp M-profile PACBTI-enablement Victor L. Do Nascimento
@ 2022-10-26 11:47 ` Victor L. Do Nascimento
2022-11-22 15:20 ` Richard Earnshaw
2022-10-26 11:49 ` [PATCH v4 4/8] newlib: libc: memchr " Victor L. Do Nascimento
` (4 subsequent siblings)
7 siblings, 1 reply; 17+ messages in thread
From: Victor L. Do Nascimento @ 2022-10-26 11:47 UTC (permalink / raw)
To: newlib; +Cc: Richard.Earnshaw
Add function prologue/epilogue to conditionally add BTI landing pads
and/or PAC code generation & authentication instructions depending on
compilation flags.
This patch enables PACBTI for all relevant variants of strlen:
* Newlib for armv8.1-m.main+pacbti
* Newlib for armv8.1-m.main+pacbti+mve
* Newlib-nano
---
newlib/libc/machine/arm/strlen-armv7.S | 17 ++++++++++++++---
newlib/libc/machine/arm/strlen-thumb2-Os.S | 14 +++++++++++---
2 files changed, 25 insertions(+), 6 deletions(-)
diff --git a/newlib/libc/machine/arm/strlen-armv7.S b/newlib/libc/machine/arm/strlen-armv7.S
index f3dda0d60..27094040c 100644
--- a/newlib/libc/machine/arm/strlen-armv7.S
+++ b/newlib/libc/machine/arm/strlen-armv7.S
@@ -59,6 +59,7 @@
OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. */
#include "acle-compat.h"
+#include "arm_asm.h"
.macro def_fn f p2align=0
.text
@@ -78,7 +79,11 @@
/* This code requires Thumb. */
#if __ARM_ARCH_PROFILE == 'M'
+#if __ARM_ARCH >= 8
+ /* keep config inherited from -march=. */
+#else
.arch armv7e-m
+#endif /* if __ARM_ARCH >= 8 */
#else
.arch armv6t2
#endif
@@ -100,8 +105,10 @@
#define tmp2 r5
def_fn strlen p2align=6
+ .fnstart
+ .cfi_startproc
+ prologue 4 5 push_ip=HAVE_PAC_LEAF
pld [srcin, #0]
- strd r4, r5, [sp, #-8]!
bic src, srcin, #7
mvn const_m1, #0
ands tmp1, srcin, #7 /* (8 - bytes) to alignment. */
@@ -151,6 +158,7 @@ def_fn strlen p2align=6
beq .Lloop_aligned
.Lnull_found:
+ .cfi_remember_state
cmp data1a, #0
itt eq
addeq result, result, #4
@@ -159,11 +167,11 @@ def_fn strlen p2align=6
rev data1a, data1a
#endif
clz data1a, data1a
- ldrd r4, r5, [sp], #8
add result, result, data1a, lsr #3 /* Bits -> Bytes. */
- bx lr
+ epilogue 4 5 push_ip=HAVE_PAC_LEAF
.Lmisaligned8:
+ .cfi_restore_state
ldrd data1a, data1b, [src]
and tmp2, tmp1, #3
rsb result, tmp1, #0
@@ -177,4 +185,7 @@ def_fn strlen p2align=6
movne data1a, const_m1
mov const_0, #0
b .Lstart_realigned
+ .cfi_endproc
+ .cantunwind
+ .fnend
.size strlen, . - strlen
diff --git a/newlib/libc/machine/arm/strlen-thumb2-Os.S b/newlib/libc/machine/arm/strlen-thumb2-Os.S
index 961f41a0a..a46db573c 100644
--- a/newlib/libc/machine/arm/strlen-thumb2-Os.S
+++ b/newlib/libc/machine/arm/strlen-thumb2-Os.S
@@ -25,6 +25,7 @@
OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. */
#include "acle-compat.h"
+#include "arm_asm.h"
.macro def_fn f p2align=0
.text
@@ -33,8 +34,9 @@
.type \f, %function
\f:
.endm
-
-#if __ARM_ARCH_ISA_THUMB >= 2 && __ARM_ARCH >= 7
+#if __ARM_ARCH_PROFILE == 'M' && __ARM_ARCH >= 8
+ /* keep config inherited from -march=. */
+#elif __ARM_ARCH_ISA_THUMB >= 2 && __ARM_ARCH >= 7
.arch armv7
#else
.arch armv6t2
@@ -44,11 +46,17 @@
.syntax unified
def_fn strlen p2align=1
+ .fnstart
+ .cfi_startproc
+ prologue
mov r3, r0
1: ldrb.w r2, [r3], #1
cmp r2, #0
bne 1b
subs r0, r3, r0
subs r0, #1
- bx lr
+ epilogue
+ .cfi_endproc
+ .cantunwind
+ .fnend
.size strlen, . - strlen
--
2.36.1
* [PATCH v4 4/8] newlib: libc: memchr M-profile PACBTI-enablement
2022-10-26 11:37 [PATCH v4 0/8] Implement assembly cortex-M PACBTI functionality Victor L. Do Nascimento
` (2 preceding siblings ...)
2022-10-26 11:47 ` [PATCH v4 3/8] newlib: libc: strlen " Victor L. Do Nascimento
@ 2022-10-26 11:49 ` Victor L. Do Nascimento
2022-11-22 15:33 ` Richard Earnshaw
2022-10-26 11:50 ` [PATCH v4 5/8] newlib: libc: memcpy " Victor L. Do Nascimento
` (3 subsequent siblings)
7 siblings, 1 reply; 17+ messages in thread
From: Victor L. Do Nascimento @ 2022-10-26 11:49 UTC (permalink / raw)
To: newlib; +Cc: Richard.Earnshaw
Add function prologue/epilogue to conditionally add BTI landing pads
and/or PAC code generation & authentication instructions depending on
compilation flags.
---
newlib/libc/machine/arm/memchr.S | 42 +++++++++++++++++++++++++++-----
1 file changed, 36 insertions(+), 6 deletions(-)
diff --git a/newlib/libc/machine/arm/memchr.S b/newlib/libc/machine/arm/memchr.S
index 1a4c6512c..5b051123d 100644
--- a/newlib/libc/machine/arm/memchr.S
+++ b/newlib/libc/machine/arm/memchr.S
@@ -76,6 +76,7 @@
.syntax unified
#include "acle-compat.h"
+#include "arm_asm.h"
@ NOTE: This ifdef MUST match the one in memchr-stub.c
#if defined (__ARM_NEON__) || defined (__ARM_NEON)
@@ -267,10 +268,14 @@ memchr:
#elif __ARM_ARCH_ISA_THUMB >= 2 && defined (__ARM_FEATURE_DSP)
#if __ARM_ARCH_PROFILE == 'M'
- .arch armv7e-m
+#if __ARM_ARCH >= 8
+ /* keep config inherited from -march=. */
#else
- .arch armv6t2
-#endif
+ .arch armv7e-m
+#endif /* __ARM_ARCH >= 8 */
+#else
+ .arch armv6t2
+#endif /* __ARM_ARCH_PROFILE == 'M' */
@ this lets us check a flag in a 00/ff byte easily in either endianness
#ifdef __ARMEB__
@@ -287,11 +292,14 @@ memchr:
.p2align 4,,15
.global memchr
.type memchr,%function
+ .fnstart
+ .cfi_startproc
memchr:
@ r0 = start of memory to scan
@ r1 = character to look for
@ r2 = length
@ returns r0 = pointer to character or NULL if not found
+ prologue
and r1,r1,#0xff @ Don't trust the caller to pass a char
cmp r2,#16 @ If short don't bother with anything clever
@@ -313,6 +321,11 @@ memchr:
10:
@ We are aligned, we know we have at least 8 bytes to work with
push {r4,r5,r6,r7}
+ .cfi_adjust_cfa_offset 16
+ .cfi_rel_offset 4, 0
+ .cfi_rel_offset 5, 4
+ .cfi_rel_offset 6, 8
+ .cfi_rel_offset 7, 12
orr r1, r1, r1, lsl #8 @ expand the match word across all bytes
orr r1, r1, r1, lsl #16
bic r4, r2, #7 @ Number of double words to work with * 8
@@ -334,6 +347,11 @@ memchr:
bne 15b @ (Flags from the subs above)
pop {r4,r5,r6,r7}
+ .cfi_restore 7
+ .cfi_restore 6
+ .cfi_restore 5
+ .cfi_restore 4
+ .cfi_adjust_cfa_offset -16
and r1,r1,#0xff @ r1 back to a single character
and r2,r2,#7 @ Leave the count remaining as the number
@ after the double words have been done
@@ -349,17 +367,21 @@ memchr:
bne 21b @ on r2 flags
40:
+ .cfi_remember_state
movs r0,#0 @ not found
- bx lr
+ epilogue
50:
+ .cfi_restore_state
+ .cfi_remember_state
subs r0,r0,#1 @ found
- bx lr
+ epilogue
60: @ We're here because the fast path found a hit
@ now we have to track down exactly which word it was
@ r0 points to the start of the double word after the one tested
@ r5 has the 00/ff pattern for the first word, r6 has the chained value
+ .cfi_restore_state
cmp r5, #0
itte eq
moveq r5, r6 @ the end is in the 2nd word
@@ -379,8 +401,16 @@ memchr:
61:
pop {r4,r5,r6,r7}
+ .cfi_restore 7
+ .cfi_restore 6
+ .cfi_restore 5
+ .cfi_restore 4
+ .cfi_adjust_cfa_offset -16
subs r0,r0,#1
- bx lr
+ epilogue
+ .cfi_endproc
+ .cantunwind
+ .fnend
#else
/* Defined in memchr-stub.c. */
#endif
--
2.36.1
* [PATCH v4 5/8] newlib: libc: memcpy M-profile PACBTI-enablement
2022-10-26 11:37 [PATCH v4 0/8] Implement assembly cortex-M PACBTI functionality Victor L. Do Nascimento
` (3 preceding siblings ...)
2022-10-26 11:49 ` [PATCH v4 4/8] newlib: libc: memchr " Victor L. Do Nascimento
@ 2022-10-26 11:50 ` Victor L. Do Nascimento
2022-11-22 16:03 ` Richard Earnshaw
2022-10-26 11:51 ` [PATCH v4 6/8] newlib: libc: setjmp/longjmp " Victor L. Do Nascimento
` (2 subsequent siblings)
7 siblings, 1 reply; 17+ messages in thread
From: Victor L. Do Nascimento @ 2022-10-26 11:50 UTC (permalink / raw)
To: newlib; +Cc: Richard.Earnshaw
Add function prologue/epilogue to conditionally add BTI landing pads
and/or PAC code generation & authentication instructions depending on
compilation flags.
---
newlib/libc/machine/arm/memcpy-armv7m.S | 37 +++++++++++++++++--------
1 file changed, 26 insertions(+), 11 deletions(-)
diff --git a/newlib/libc/machine/arm/memcpy-armv7m.S b/newlib/libc/machine/arm/memcpy-armv7m.S
index c8bff36f6..a74bacc97 100644
--- a/newlib/libc/machine/arm/memcpy-armv7m.S
+++ b/newlib/libc/machine/arm/memcpy-armv7m.S
@@ -46,6 +46,8 @@
__OPT_BIG_BLOCK_SIZE: Size of big block in words. Default to 64.
__OPT_MID_BLOCK_SIZE: Size of big block in words. Default to 16.
*/
+#include "arm_asm.h"
+
#ifndef __OPT_BIG_BLOCK_SIZE
#define __OPT_BIG_BLOCK_SIZE (4 * 16)
#endif
@@ -85,6 +87,8 @@
.global memcpy
.thumb
.thumb_func
+ .fnstart
+ .cfi_startproc
.type memcpy, %function
memcpy:
@ r0: dst
@@ -93,10 +97,11 @@ memcpy:
#ifdef __ARM_FEATURE_UNALIGNED
/* In case of UNALIGNED access supported, ip is not used in
function body. */
+ prologue push_ip=HAVE_PAC_LEAF
mov ip, r0
#else
- push {r0}
-#endif
+ prologue 0 push_ip=HAVE_PAC_LEAF
+#endif /* __ARM_FEATURE_UNALIGNED */
orr r3, r1, r0
ands r3, r3, #3
bne .Lmisaligned_copy
@@ -135,13 +140,13 @@ memcpy:
ldr r3, [r1], #4
str r3, [r0], #4
END_UNROLL
-#else /* __ARM_ARCH_7M__ */
+#else
ldr r3, [r1, \offset]
str r3, [r0, \offset]
END_UNROLL
adds r0, __OPT_MID_BLOCK_SIZE
adds r1, __OPT_MID_BLOCK_SIZE
-#endif
+#endif /* __ARM_ARCH_7M__ */
subs r2, __OPT_MID_BLOCK_SIZE
bhs .Lmid_block_loop
@@ -178,15 +183,17 @@ memcpy:
#endif /* __ARM_FEATURE_UNALIGNED */
.Ldone:
+ .cfi_remember_state
#ifdef __ARM_FEATURE_UNALIGNED
mov r0, ip
+ epilogue push_ip=HAVE_PAC_LEAF
#else
- pop {r0}
-#endif
- bx lr
+ epilogue 0 push_ip=HAVE_PAC_LEAF
+#endif /* __ARM_FEATURE_UNALIGNED */
.align 2
.Lmisaligned_copy:
+ .cfi_restore_state
#ifdef __ARM_FEATURE_UNALIGNED
/* Define label DST_ALIGNED to BIG_BLOCK. It will go to aligned copy
once destination is adjusted to aligned. */
@@ -247,6 +254,9 @@ memcpy:
/* dst is aligned, but src isn't. Misaligned copy. */
push {r4, r5}
+ .cfi_adjust_cfa_offset 8
+ .cfi_rel_offset 4, 0
+ .cfi_rel_offset 5, 4
subs r2, #4
/* Backward r1 by misaligned bytes, to make r1 aligned.
@@ -299,6 +309,9 @@ memcpy:
adds r2, #4
subs r1, ip
pop {r4, r5}
+ .cfi_restore 4
+ .cfi_restore 5
+ .cfi_adjust_cfa_offset -8
#endif /* __ARM_FEATURE_UNALIGNED */
@@ -321,9 +334,11 @@ memcpy:
#ifdef __ARM_FEATURE_UNALIGNED
mov r0, ip
+ epilogue push_ip=HAVE_PAC_LEAF
#else
- pop {r0}
-#endif
- bx lr
-
+ epilogue 0 push_ip=HAVE_PAC_LEAF
+#endif /* __ARM_FEATURE_UNALIGNED */
+ .cfi_endproc
+ .cantunwind
+ .fnend
.size memcpy, .-memcpy
--
2.36.1
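The `push_ip=HAVE_PAC_LEAF' arguments in the memcpy hunks above depend on
the HAVE_PAC_LEAF macro that patch 1/8 adds to arm_asm.h, which tests the
leaf-protection bit of __ARM_FEATURE_PAC_DEFAULT. A minimal C sketch of that
test follows. The helper name have_pac_leaf is hypothetical, and the sample
values 1 and 5 used in the note below (assumed to correspond to pac-ret and
pac-ret+leaf) are an assumption based on the ACLE bit layout, not something
stated in this thread.

```c
#include <stdbool.h>

#define LEAF_PROTECT_BIT 2 /* same bit index as in the arm_asm.h patch */

/* Hypothetical helper: mirrors the HAVE_PAC_LEAF expression, but takes the
   value of __ARM_FEATURE_PAC_DEFAULT as a run-time argument so it can be
   exercised on any host.  */
bool have_pac_leaf(unsigned pac_default)
{
    return (pac_default & (1u << LEAF_PROTECT_BIT)) != 0;
}
```

Under that assumption, have_pac_leaf(5) is true and have_pac_leaf(1) is
false, so memcpy would only push ip when leaf signing has been requested.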
* [PATCH v4 6/8] newlib: libc: setjmp/longjmp M-profile PACBTI-enablement
2022-10-26 11:37 [PATCH v4 0/8] Implement assembly cortex-M PACBTI functionality Victor L. Do Nascimento
` (4 preceding siblings ...)
2022-10-26 11:50 ` [PATCH v4 5/8] newlib: libc: memcpy " Victor L. Do Nascimento
@ 2022-10-26 11:51 ` Victor L. Do Nascimento
2022-11-22 16:17 ` Richard Earnshaw
2022-10-26 11:52 ` [PATCH v4 7/8] newlib: libc: aeabi_memmove " Victor L. Do Nascimento
2022-10-26 11:53 ` [PATCH v4 8/8] newlib: libc: aeabi_memset " Victor L. Do Nascimento
7 siblings, 1 reply; 17+ messages in thread
From: Victor L. Do Nascimento @ 2022-10-26 11:51 UTC (permalink / raw)
To: newlib; +Cc: Richard.Earnshaw
Add function prologue/epilogue to conditionally add BTI landing pads
and/or PAC code generation & authentication instructions depending on
compilation flags.
---
newlib/libc/machine/arm/setjmp.S | 33 ++++++++++++++++++++++++++++++--
1 file changed, 31 insertions(+), 2 deletions(-)
diff --git a/newlib/libc/machine/arm/setjmp.S b/newlib/libc/machine/arm/setjmp.S
index 21d6ff9e7..4fe53cdf2 100644
--- a/newlib/libc/machine/arm/setjmp.S
+++ b/newlib/libc/machine/arm/setjmp.S
@@ -157,11 +157,15 @@ SYM (.arm_start_of.\name):
.globl SYM (\name)
TYPE (\name)
SYM (\name):
+ .fnstart
+ .cfi_startproc
PROLOGUE \name
.endm
.macro FUNC_END name
RET
+ .cfi_endproc
+ .fnend
SIZE (\name)
.endm
@@ -173,11 +177,26 @@ SYM (\name):
/* Save all the callee-preserved registers into the jump buffer. */
#ifdef __thumb2__
+#if __ARM_FEATURE_PAC_DEFAULT
+#if __ARM_FEATURE_BTI_DEFAULT
+ pacbti ip, lr, sp
+#else
+ pac ip, lr, sp
+#endif /* __ARM_FEATURE_BTI_DEFAULT */
+ .cfi_register 143, 12
+ mov a4, ip
+ mov ip, sp
+ stmea a1!, { a4, v1-v7, fp, ip, lr }
+#else
+#if __ARM_FEATURE_BTI_DEFAULT
+ bti
+#endif /* __ARM_FEATURE_BTI_DEFAULT */
mov ip, sp
stmea a1!, { v1-v7, fp, ip, lr }
+#endif /* __ARM_FEATURE_PAC_DEFAULT */
#else
stmea a1!, { v1-v7, fp, ip, sp, lr }
-#endif
+#endif /* __thumb2__ */
#if 0 /* Simulator does not cope with FP instructions yet. */
#ifndef __SOFTFP__
@@ -200,11 +219,17 @@ SYM (\name):
/* Restore the registers, retrieving the state when setjmp() was called. */
#ifdef __thumb2__
+#if __ARM_FEATURE_PAC_DEFAULT
+ ldmfd a1!, { a4, v1-v7, fp, ip, lr }
+ mov sp, ip
+ mov ip, a4
+#else
ldmfd a1!, { v1-v7, fp, ip, lr }
mov sp, ip
+#endif /* __ARM_FEATURE_PAC_DEFAULT */
#else
ldmfd a1!, { v1-v7, fp, ip, sp, lr }
-#endif
+#endif /* __thumb2__ */
#if 0 /* Simulator does not cope with FP instructions yet. */
#ifndef __SOFTFP__
@@ -220,5 +245,9 @@ SYM (\name):
#endif
moveq a1, #1
+#if __ARM_FEATURE_PAC_DEFAULT
+ aut ip, lr, sp
+#endif
+
FUNC_END longjmp
#endif
--
2.36.1
* [PATCH v4 7/8] newlib: libc: aeabi_memmove M-profile PACBTI-enablement
2022-10-26 11:37 [PATCH v4 0/8] Implement assembly cortex-M PACBTI functionality Victor L. Do Nascimento
` (5 preceding siblings ...)
2022-10-26 11:51 ` [PATCH v4 6/8] newlib: libc: setjmp/longjmp " Victor L. Do Nascimento
@ 2022-10-26 11:52 ` Victor L. Do Nascimento
2022-11-22 16:18 ` Richard Earnshaw
2022-10-26 11:53 ` [PATCH v4 8/8] newlib: libc: aeabi_memset " Victor L. Do Nascimento
7 siblings, 1 reply; 17+ messages in thread
From: Victor L. Do Nascimento @ 2022-10-26 11:52 UTC (permalink / raw)
To: newlib; +Cc: Richard.Earnshaw
Add function prologue/epilogue to conditionally add BTI landing pads
and/or PAC code generation & authentication instructions depending on
compilation flags.
---
newlib/libc/machine/arm/aeabi_memmove-thumb2.S | 17 ++++++++++++-----
1 file changed, 12 insertions(+), 5 deletions(-)
diff --git a/newlib/libc/machine/arm/aeabi_memmove-thumb2.S b/newlib/libc/machine/arm/aeabi_memmove-thumb2.S
index e9504437b..20ca993e5 100644
--- a/newlib/libc/machine/arm/aeabi_memmove-thumb2.S
+++ b/newlib/libc/machine/arm/aeabi_memmove-thumb2.S
@@ -26,6 +26,8 @@
* SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
*/
+#include "arm_asm.h"
+
.thumb
.syntax unified
.global __aeabi_memmove
@@ -33,8 +35,10 @@
ASM_ALIAS __aeabi_memmove4 __aeabi_memmove
ASM_ALIAS __aeabi_memmove8 __aeabi_memmove
__aeabi_memmove:
+ .fnstart
+ .cfi_startproc
+ prologue 4
cmp r0, r1
- push {r4}
bls 3f
adds r3, r1, r2
cmp r0, r3
@@ -48,9 +52,10 @@ __aeabi_memmove:
strb r4, [r1, #-1]!
bne 1b
2:
- pop {r4}
- bx lr
+ .cfi_remember_state
+ epilogue 4
3:
+ .cfi_restore_state
cmp r2, #0
beq 2b
add r2, r2, r1
@@ -60,6 +65,8 @@ __aeabi_memmove:
cmp r2, r1
strb r4, [r3, #1]!
bne 4b
- pop {r4}
- bx lr
+ epilogue 4
+ .cfi_endproc
+ .cantunwind
+ .fnend
.size __aeabi_memmove, . - __aeabi_memmove
--
2.36.1
* [PATCH v4 8/8] newlib: libc: aeabi_memset M-profile PACBTI-enablement
2022-10-26 11:37 [PATCH v4 0/8] Implement assembly cortex-M PACBTI functionality Victor L. Do Nascimento
` (6 preceding siblings ...)
2022-10-26 11:52 ` [PATCH v4 7/8] newlib: libc: aeabi_memmove " Victor L. Do Nascimento
@ 2022-10-26 11:53 ` Victor L. Do Nascimento
2022-11-22 16:19 ` Richard Earnshaw
7 siblings, 1 reply; 17+ messages in thread
From: Victor L. Do Nascimento @ 2022-10-26 11:53 UTC (permalink / raw)
To: newlib; +Cc: Richard.Earnshaw
Add function prologue/epilogue to conditionally add BTI landing pads
and/or PAC code generation & authentication instructions depending on
compilation flags.
---
newlib/libc/machine/arm/aeabi_memset-thumb2.S | 14 +++++++++++---
1 file changed, 11 insertions(+), 3 deletions(-)
diff --git a/newlib/libc/machine/arm/aeabi_memset-thumb2.S b/newlib/libc/machine/arm/aeabi_memset-thumb2.S
index eaca1d8d7..6b77d3820 100644
--- a/newlib/libc/machine/arm/aeabi_memset-thumb2.S
+++ b/newlib/libc/machine/arm/aeabi_memset-thumb2.S
@@ -26,14 +26,18 @@
* SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
*/
+#include "arm_asm.h"
+
.thumb
.syntax unified
.global __aeabi_memset
.type __aeabi_memset, %function
+ .fnstart
+ .cfi_startproc
ASM_ALIAS __aeabi_memset4 __aeabi_memset
ASM_ALIAS __aeabi_memset8 __aeabi_memset
__aeabi_memset:
- push {r4, r5, r6}
+ prologue 4 6
lsls r4, r0, #30
beq 10f
subs r4, r1, #1
@@ -98,10 +102,14 @@ __aeabi_memset:
cmp r3, r4
bne 8b
9:
- pop {r4, r5, r6}
- bx lr
+ .cfi_remember_state
+ epilogue 4 6
10:
+ .cfi_restore_state
mov r4, r1
mov r3, r0
b 3b
+ .cfi_endproc
+ .cantunwind
+ .fnend
.size __aeabi_memset, . - __aeabi_memset
--
2.36.1
* Re: [PATCH v4 1/8] newlib: libc: define M-profile PACBTI-enablement macros
2022-10-26 11:45 ` [PATCH v4 1/8] newlib: libc: define M-profile PACBTI-enablement macros Victor L. Do Nascimento
@ 2022-11-22 15:04 ` Richard Earnshaw
0 siblings, 0 replies; 17+ messages in thread
From: Richard Earnshaw @ 2022-11-22 15:04 UTC (permalink / raw)
To: Victor L. Do Nascimento, newlib; +Cc: Richard.Earnshaw
On 26/10/2022 12:45, Victor L. Do Nascimento wrote:
> Augment the arm_asm.h header file to simplify function prologues and
> epilogues whilst adding support for PACBTI enablement via macros for
> hand-written assembly functions. For PACBTI, both prologues/epilogues
> as well as cfi-related directives are automatically amended
> accordingly, depending on the compile-time mbranch-protection argument
> values.
>
> It defines the following preprocessor macros:
> * HAVE_PAC_LEAF: Indicates whether pac-signing has been requested for
> leaf functions.
> * PAC_LEAF_PUSH_IP: Whether leaf functions should push the pac code
> to the stack irrespective of whether the ip register is clobbered in
> the function or not.
> * STACK_ALIGN_ENFORCE: Whether a dummy register should be added to
> the push list as necessary in the prologue to ensure stack
> alignment preservation at the start of assembly function. The
> epilogue behavior is likewise affected by this flag, ensuring any
> pushed dummy registers also get popped on function return.
>
> It also defines the following assembler macros:
> * prologue: In addition to pushing any callee-saved registers onto
> the stack, it generates any requested pacbti instructions.
> Pushed registers are specified via the optional `first', `last',
> `push_ip' and `push_lr' macro argument parameters.
> When a single register number is provided, it pushes that
> register. When two register numbers are provided, they specify a
> range to save. If push_ip and/or push_lr are non-zero, the
> respective registers are also saved. Stack alignment is requested
> via the `align8' argument, which defaults to the value of
> STACK_ALIGN_ENFORCE, unless manually overridden.
>
> For example:
>
> prologue push_ip=1 -> push {ip}
> prologue push_ip=1, align8=1 -> push {r2, ip}
> prologue push_ip=1, push_lr=1 -> push {ip, lr}
> prologue 1 -> push {r1}
> prologue 1, align8=1 -> push {r0, r1}
> prologue 1 push_ip=1 -> push {r1, ip}
> prologue 1 4 -> push {r1-r4}
> prologue 1 4 push_ip=1 -> push {r1-r4, ip}
>
> * epilogue: pops registers off the stack and emits the pac authentication
> instruction (aut), if requested. The `first', `last', `push_ip',
> `push_lr' and `align8' arguments function as per the prologue macro,
> generating pop instead of push instructions.
>
> Stack alignment is enforced via the following helper macro
> call-chain:
>
> {prologue|epilogue} ->_align8 -> _preprocess_reglist ->
> _preprocess_reglist1 -> {_prologue|_epilogue}
>
> Finally, the necessary cfi directives for adding debug information
> to prologue and epilogue are generated via the following macros:
>
> * cfisavelist - prologue macro helper function, generating
> necessary .cfi_offset directives associated with push instruction.
> Therefore, the net effect of calling `prologue 1 2 push_ip=1' is
> to generate the following:
>
> push {r1-r2, ip}
> .cfi_adjust_cfa_offset 12
> .cfi_offset 143, -4
> .cfi_offset 2, -8
> .cfi_offset 1, -12
>
> * cfirestorelist - epilogue macro helper function, emitting
> .cfi_restore directives prior to resetting the cfa offset. As
> such, calling `epilogue 1 2 push_ip=1' will produce:
>
> pop {r1-r2, ip}
> .cfi_register 143, 12
> .cfi_restore 2
> .cfi_restore 1
> .cfi_def_cfa_offset 0
OK.
R.
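The offsets in the `prologue 1 2 push_ip=1' example above can be reproduced
with a small C model. This is only a sketch: the struct cfi_slot type and
the prologue_cfi_slots function are hypothetical names invented here, and
the slot ordering (lr first, then ip recorded as register 143, then the GPR
range from `last' down to `first') is read off the quoted description
rather than taken from any existing API.

```c
#include <stddef.h>

/* One .cfi_offset directive: a DWARF register number and its CFA offset. */
struct cfi_slot {
    int dwarf_reg;
    int cfa_offset;
};

/* Hypothetical model of the save layout for
   push {rFIRST-rLAST[, ip][, lr]}.  Pass first < 0 when no GPR range is
   pushed.  Fills `out' (which must have room for every pushed register)
   and returns the number of slots; the CFA adjustment is 4 * that count. */
size_t prologue_cfi_slots(int first, int last, int push_ip, int push_lr,
                          struct cfi_slot *out)
{
    size_t n = 0;
    int index = 1; /* slot 1 is the word just below the CFA */

    if (push_lr) {
        out[n].dwarf_reg = 14;
        out[n].cfa_offset = -4 * index;
        n++;
        index++;
    }
    if (push_ip) {
        out[n].dwarf_reg = 143; /* register number used in the quoted CFI */
        out[n].cfa_offset = -4 * index;
        n++;
        index++;
    }
    if (first >= 0) {
        for (int reg = last; reg >= first; --reg) {
            out[n].dwarf_reg = reg;
            out[n].cfa_offset = -4 * index;
            n++;
            index++;
        }
    }
    return n;
}
```

Calling prologue_cfi_slots(1, 2, 1, 0, slots) yields three slots, which
corresponds to the 12-byte CFA adjustment in the quoted example.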
> ---
> newlib/libc/machine/arm/arm_asm.h | 441 ++++++++++++++++++++++++++++++
> 1 file changed, 441 insertions(+)
>
> diff --git a/newlib/libc/machine/arm/arm_asm.h b/newlib/libc/machine/arm/arm_asm.h
> index 2708057de..94fa77b4d 100644
> --- a/newlib/libc/machine/arm/arm_asm.h
> +++ b/newlib/libc/machine/arm/arm_asm.h
> @@ -60,4 +60,445 @@
> # define _ISA_THUMB_1
> #endif
>
> +/* Check whether leaf function PAC signing has been requested in the
> + -mbranch-protection compile-time option. */
> +#define LEAF_PROTECT_BIT 2
> +
> +#ifdef __ARM_FEATURE_PAC_DEFAULT
> +# define HAVE_PAC_LEAF \
> + ((__ARM_FEATURE_PAC_DEFAULT & (1 << LEAF_PROTECT_BIT)) && 1)
> +#else
> +# define HAVE_PAC_LEAF 0
> +#endif
> +
> +/* Provide default parameters for PAC-code handling in leaf-functions. */
> +#if HAVE_PAC_LEAF
> +# ifndef PAC_LEAF_PUSH_IP
> +# define PAC_LEAF_PUSH_IP 1
> +# endif
> +#else /* !HAVE_PAC_LEAF */
> +# undef PAC_LEAF_PUSH_IP
> +# define PAC_LEAF_PUSH_IP 0
> +#endif /* HAVE_PAC_LEAF */
> +
> +#define STACK_ALIGN_ENFORCE 0
> +
> +#ifdef __ASSEMBLER__
> +
> +/******************************************************************************
> +* Implementation of the prologue and epilogue assembler macros and their
> +* associated helper functions.
> +*
> +* These functions add support for the following:
> +*
> +* - M-profile branch target identification (BTI) landing-pads when compiled
> +* with `-mbranch-protection=bti'.
> +* - PAC-signing and verification instructions, depending on hardware support
> +* and whether the PAC-signing of leaf functions has been requested via the
> +* `-mbranch-protection=pac-ret+leaf' compiler argument.
> +* - 8-byte stack alignment preservation at function entry, defaulting to the
> +* value of STACK_ALIGN_ENFORCE.
> +*
> +* Notes:
> +* - Prologue stack alignment is implemented by detecting a push with an odd
> +* number of registers and prepending a dummy register to the list.
> +* - If alignment is attempted on a list containing r0, compilation will result
> +* in an error.
> +* - If alignment is attempted in a list containing r1, r0 will be prepended to
> +* the register list and r0 will be restored prior to function return. For
> +* functions with non-void return types, this will result in the corruption of
> +* the result register.
> +* - Stack alignment is enforced via the following helper macro call-chain:
> +*
> +* {prologue|epilogue} ->_align8 -> _preprocess_reglist ->
> +* _preprocess_reglist1 -> {_prologue|_epilogue}
> +*
> +* - Debug CFI directives are automatically added to prologues and epilogues,
> +* assisted by `cfisavelist' and `cfirestorelist', respectively.
> +*
> +* Arguments:
> +* prologue
> +* --------
> +* - first - If `last' specified, this serves as start of general-purpose
> +* register (GPR) range to push onto stack, otherwise represents
> +* single GPR to push onto stack. If omitted, no GPRs pushed
> +* onto stack at prologue.
> +* - last - If given, specifies inclusive upper-bound of GPR range.
> +* - push_ip - Determines whether IP register is to be pushed to stack at
> +* prologue. When pac-signing is requested, this holds the
> +* pac code. Either 1 or 0 to push or not push, respectively.
> +* Default behavior: Set to value of PAC_LEAF_PUSH_IP macro.
> +* - push_lr - Determines whether to push lr to the stack on function entry.
> +* Either 1 or 0 to push or not push, respectively.
> +* - align8 - Whether to enforce alignment. Either 1 or 0, with 1 requesting
> +* alignment.
> +*
> +* epilogue
> +* --------
> +* The epilogue should be called passing the same arguments as those passed to
> +* the prologue to ensure the stack is not corrupted on function return.
> +*
> +* Usage examples:
> +*
> +* prologue push_ip=1 -> push {ip}
> +* epilogue push_ip=1, align8=1 -> pop {r2, ip}
> +* prologue push_ip=1, push_lr=1 -> push {ip, lr}
> +* epilogue 1 -> pop {r1}
> +* prologue 1, align8=1 -> push {r0, r1}
> +* epilogue 1, push_ip=1 -> pop {r1, ip}
> +* prologue 1, 4 -> push {r1-r4}
> +* epilogue 1, 4 push_ip=1 -> pop {r1-r4, ip}
> +*
> +******************************************************************************/
> +
> +/* Emit .cfi_restore directives for a consecutive sequence of registers. */
> + .macro cfirestorelist first, last
> + .cfi_restore \last
> + .if \last-\first
> + cfirestorelist \first, \last-1
> + .endif
> + .endm
> +
> +/* Emit .cfi_offset directives for a consecutive sequence of registers. */
> + .macro cfisavelist first, last, index=1
> + .cfi_offset \last, -4*(\index)
> + .if \last-\first
> + cfisavelist \first, \last-1, \index+1
> + .endif
> + .endm
> +
> +.macro _prologue first=-1, last=-1, push_ip=PAC_LEAF_PUSH_IP, push_lr=0
> + .if \push_ip & 1 != \push_ip
> + .error "push_ip may be either 0 or 1"
> + .endif
> + .if \push_lr & 1 != \push_lr
> + .error "push_lr may be either 0 or 1"
> + .endif
> + .if \first != -1
> + .if \last == -1
> + /* Upper-bound not provided: Set upper = lower. */
> + _prologue \first, \first, \push_ip, \push_lr
> + .exitm
> + .endif
> + .endif
> +#if HAVE_PAC_LEAF
> +#if __ARM_FEATURE_BTI_DEFAULT
> + pacbti ip, lr, sp
> +#else
> + pac ip, lr, sp
> +#endif /* __ARM_FEATURE_BTI_DEFAULT */
> + .cfi_register 143, 12
> +#else
> +#if __ARM_FEATURE_BTI_DEFAULT
> + bti
> +#endif /* __ARM_FEATURE_BTI_DEFAULT */
> +#endif /* HAVE_PAC_LEAF */
> + .if \first != -1
> + .if \last != \first
> + .if \last >= 13
> + .error "SP cannot be in the save list"
> + .endif
> + .if \push_ip
> + .if \push_lr
> + /* Case 1: push register range, ip and lr registers. */
> + push {r\first-r\last, ip, lr}
> + .cfi_adjust_cfa_offset ((\last-\first)+3)*4
> + .cfi_offset 14, -4
> + .cfi_offset 143, -8
> + cfisavelist \first, \last, 3
> + .else // !\push_lr
> + /* Case 2: push register range and ip register. */
> + push {r\first-r\last, ip}
> + .cfi_adjust_cfa_offset ((\last-\first)+2)*4
> + .cfi_offset 143, -4
> + cfisavelist \first, \last, 2
> + .endif
> + .else // !\push_ip
> + .if \push_lr
> + /* Case 3: push register range and lr register. */
> + push {r\first-r\last, lr}
> + .cfi_adjust_cfa_offset ((\last-\first)+2)*4
> + .cfi_offset 14, -4
> + cfisavelist \first, \last, 2
> + .else // !\push_lr
> + /* Case 4: push register range. */
> + push {r\first-r\last}
> + .cfi_adjust_cfa_offset ((\last-\first)+1)*4
> + cfisavelist \first, \last, 1
> + .endif
> + .endif
> + .else // \last == \first
> + .if \push_ip
> + .if \push_lr
> + /* Case 5: push single GP register plus ip and lr registers. */
> + push {r\first, ip, lr}
> + .cfi_adjust_cfa_offset 12
> + .cfi_offset 14, -4
> + .cfi_offset 143, -8
> + cfisavelist \first, \first, 3
> + .else // !\push_lr
> + /* Case 6: push single GP register plus ip register. */
> + push {r\first, ip}
> + .cfi_adjust_cfa_offset 8
> + .cfi_offset 143, -4
> + cfisavelist \first, \first, 2
> + .endif
> + .else // !\push_ip
> + .if \push_lr
> + /* Case 7: push single GP register plus lr register. */
> + push {r\first, lr}
> + .cfi_adjust_cfa_offset 8
> + .cfi_offset 14, -4
> + cfisavelist \first, \first, 2
> + .else // !\push_lr
> + /* Case 8: push single GP register. */
> + push {r\first}
> + .cfi_adjust_cfa_offset 4
> + cfisavelist \first, \first, 1
> + .endif
> + .endif
> + .endif
> + .else // \first == -1
> + .if \push_ip
> + .if \push_lr
> + /* Case 9: push ip and lr registers. */
> + push {ip, lr}
> + .cfi_adjust_cfa_offset 8
> + .cfi_offset 14, -4
> + .cfi_offset 143, -8
> + .else // !\push_lr
> + /* Case 10: push ip register. */
> + push {ip}
> + .cfi_adjust_cfa_offset 4
> + .cfi_offset 143, -4
> + .endif
> + .else // !\push_ip
> + .if \push_lr
> + /* Case 11: push lr register. */
> + push {lr}
> + .cfi_adjust_cfa_offset 4
> + .cfi_offset 14, -4
> + .endif
> + .endif
> + .endif
> +.endm
> +
> +.macro _epilogue first=-1, last=-1, push_ip=PAC_LEAF_PUSH_IP, push_lr=0
> + .if \push_ip & 1 != \push_ip
> + .error "push_ip may be either 0 or 1"
> + .endif
> + .if \push_lr & 1 != \push_lr
> + .error "push_lr may be either 0 or 1"
> + .endif
> + .if \first != -1
> + .if \last == -1
> + /* Upper-bound not provided: Set upper = lower. */
> + _epilogue \first, \first, \push_ip, \push_lr
> + .exitm
> + .endif
> + .if \last != \first
> + .if \last >= 13
> + .error "SP cannot be in the save list"
> + .endif
> + .if \push_ip
> + .if \push_lr
> + /* Case 1: pop register range, ip and lr registers. */
> + pop {r\first-r\last, ip, lr}
> + .cfi_restore 14
> + .cfi_register 143, 12
> + cfirestorelist \first, \last
> + .else // !\push_lr
> + /* Case 2: pop register range and ip register. */
> + pop {r\first-r\last, ip}
> + .cfi_register 143, 12
> + cfirestorelist \first, \last
> + .endif
> + .else // !\push_ip
> + .if \push_lr
> + /* Case 3: pop register range and lr register. */
> + pop {r\first-r\last, lr}
> + .cfi_restore 14
> + cfirestorelist \first, \last
> + .else // !\push_lr
> + /* Case 4: pop register range. */
> + pop {r\first-r\last}
> + cfirestorelist \first, \last
> + .endif
> + .endif
> + .else // \last == \first
> + .if \push_ip
> + .if \push_lr
> + /* Case 5: pop single GP register plus ip and lr registers. */
> + pop {r\first, ip, lr}
> + .cfi_restore 14
> + .cfi_register 143, 12
> + cfirestorelist \first, \first
> + .else // !\push_lr
> + /* Case 6: pop single GP register plus ip register. */
> + pop {r\first, ip}
> + .cfi_register 143, 12
> + cfirestorelist \first, \first
> + .endif
> + .else // !\push_ip
> + .if \push_lr
> + /* Case 7: pop single GP register plus lr register. */
> + pop {r\first, lr}
> + .cfi_restore 14
> + cfirestorelist \first, \first
> + .else // !\push_lr
> + /* Case 8: pop single GP register. */
> + pop {r\first}
> + cfirestorelist \first, \first
> + .endif
> + .endif
> + .endif
> + .else // \first == -1
> + .if \push_ip
> + .if \push_lr
> + /* Case 9: pop ip and lr registers. */
> + pop {ip, lr}
> + .cfi_restore 14
> + .cfi_register 143, 12
> + .else // !\push_lr
> + /* Case 10: pop ip register. */
> + pop {ip}
> + .cfi_register 143, 12
> + .endif
> + .else // !\push_ip
> + .if \push_lr
> + /* Case 11: pop lr register. */
> + pop {lr}
> + .cfi_restore 14
> + .endif
> + .endif
> + .endif
> +#if HAVE_PAC_LEAF
> + aut ip, lr, sp
> +#endif /* HAVE_PAC_LEAF */
> + bx lr
> +.endm
> +
> +# clean up expressions in 'last'
> +.macro _preprocess_reglist1 first:req, last:req, push_ip:req, push_lr:req, reglist_op:req
> + .if \last == 0
> + \reglist_op \first, 0, \push_ip, \push_lr
> + .elseif \last == 1
> + \reglist_op \first, 1, \push_ip, \push_lr
> + .elseif \last == 2
> + \reglist_op \first, 2, \push_ip, \push_lr
> + .elseif \last == 3
> + \reglist_op \first, 3, \push_ip, \push_lr
> + .elseif \last == 4
> + \reglist_op \first, 4, \push_ip, \push_lr
> + .elseif \last == 5
> + \reglist_op \first, 5, \push_ip, \push_lr
> + .elseif \last == 6
> + \reglist_op \first, 6, \push_ip, \push_lr
> + .elseif \last == 7
> + \reglist_op \first, 7, \push_ip, \push_lr
> + .elseif \last == 8
> + \reglist_op \first, 8, \push_ip, \push_lr
> + .elseif \last == 9
> + \reglist_op \first, 9, \push_ip, \push_lr
> + .elseif \last == 10
> + \reglist_op \first, 10, \push_ip, \push_lr
> + .elseif \last == 11
> + \reglist_op \first, 11, \push_ip, \push_lr
> + .else
> + .error "last (\last) out of range"
> + .endif
> +.endm
> +
> +# clean up expressions in 'first'
> +.macro _preprocess_reglist first:req, last, push_ip=0, push_lr=0, reglist_op:req
> + .ifb \last
> + _preprocess_reglist \first \first \push_ip \push_lr
> + .else
> + .if \first > \last
> + .error "last (\last) must be at least as great as first (\first)"
> + .endif
> + .if \first == 0
> + _preprocess_reglist1 0, \last, \push_ip, \push_lr, \reglist_op
> + .elseif \first == 1
> + _preprocess_reglist1 1, \last, \push_ip, \push_lr, \reglist_op
> + .elseif \first == 2
> + _preprocess_reglist1 2, \last, \push_ip, \push_lr, \reglist_op
> + .elseif \first == 3
> + _preprocess_reglist1 3, \last, \push_ip, \push_lr, \reglist_op
> + .elseif \first == 4
> + _preprocess_reglist1 4, \last, \push_ip, \push_lr, \reglist_op
> + .elseif \first == 5
> + _preprocess_reglist1 5, \last, \push_ip, \push_lr, \reglist_op
> + .elseif \first == 6
> + _preprocess_reglist1 6, \last, \push_ip, \push_lr, \reglist_op
> + .elseif \first == 7
> + _preprocess_reglist1 7, \last, \push_ip, \push_lr, \reglist_op
> + .elseif \first == 8
> + _preprocess_reglist1 8, \last, \push_ip, \push_lr, \reglist_op
> + .elseif \first == 9
> + _preprocess_reglist1 9, \last, \push_ip, \push_lr, \reglist_op
> + .elseif \first == 10
> + _preprocess_reglist1 10, \last, \push_ip, \push_lr, \reglist_op
> + .elseif \first == 11
> + _preprocess_reglist1 11, \last, \push_ip, \push_lr, \reglist_op
> + .else
> + .error "first (\first) out of range"
> + .endif
> + .endif
> +.endm
> +
> +.macro _align8 first, last, push_ip=0, push_lr=0, reglist_op=_prologue
> + .ifb \first
> + .ifnb \last
> + .error "can't have last (\last) without specifying first"
> + .else // \last not blank
> + .if ((\push_ip + \push_lr) % 2) == 0
> + \reglist_op first=-1, last=-1, push_ip=\push_ip, push_lr=\push_lr
> + .exitm
> + .else // ((\push_ip + \push_lr) % 2) odd
> + _align8 2, 2, \push_ip, \push_lr, \reglist_op
> + .exitm
> + .endif // ((\push_ip + \push_lr) % 2) == 0
> + .endif // .ifnb \last
> + .endif // .ifb \first
> +
> + .ifb \last
> + _align8 \first, \first, \push_ip, \push_lr, \reglist_op
> + .else
> + .if \push_ip & 1 <> \push_ip
> + .error "push_ip may be 0 or 1"
> + .endif
> + .if \push_lr & 1 <> \push_lr
> + .error "push_lr may be 0 or 1"
> + .endif
> + .ifeq (\last - \first + \push_ip + \push_lr) % 2
> + .if \first == 0
> + .error "Alignment required and first register is r0"
> + .exitm
> + .endif
> + _preprocess_reglist \first-1, \last, \push_ip, \push_lr, \reglist_op
> + .else
> + _preprocess_reglist \first \last, \push_ip, \push_lr, \reglist_op
> + .endif
> + .endif
> +.endm
> +
> +.macro prologue first, last, push_ip=PAC_LEAF_PUSH_IP, push_lr=0, align8=STACK_ALIGN_ENFORCE
> + .if \align8
> + _align8 \first, \last, \push_ip, \push_lr, _prologue
> + .else
> + _prologue first=\first, last=\last, push_ip=\push_ip, push_lr=\push_lr
> + .endif
> +.endm
> +
> +.macro epilogue first, last, push_ip=PAC_LEAF_PUSH_IP, push_lr=0, align8=STACK_ALIGN_ENFORCE
> + .if \align8
> + _align8 \first, \last, \push_ip, \push_lr, reglist_op=_epilogue
> + .else
> + _epilogue first=\first, last=\last, push_ip=\push_ip, push_lr=\push_lr
> + .endif
> +.endm
> +
> +#endif /* __ASSEMBLER__ */
> +
> #endif /* ARM_ASM__H */
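The padding rule implemented by the `_align8' macro in the hunk above can
be modelled in C as follows. This is a sketch of my reading of the macro,
not code from the patch: the function name align8_first_reg and the
NO_GPR/ALIGN_ERROR sentinels are hypothetical. It returns the lowest GPR
that ends up in the push list once a dummy register has been prepended to
make the register count even.

```c
#define NO_GPR      (-1) /* no general-purpose register in the list */
#define ALIGN_ERROR (-2) /* padding needed but the list already starts at r0 */

/* Hypothetical model of _align8: `first'/`last' are GPR numbers or NO_GPR,
   push_ip/push_lr are 0 or 1.  */
int align8_first_reg(int first, int last, int push_ip, int push_lr)
{
    if (first == NO_GPR) {
        /* Only ip and/or lr are pushed: an odd count is padded with r2. */
        return ((push_ip + push_lr) % 2 == 0) ? NO_GPR : 2;
    }
    if (last == NO_GPR) {
        last = first; /* single register: the range collapses to one entry */
    }

    int count = (last - first + 1) + push_ip + push_lr;
    if (count % 2 == 0) {
        return first; /* already a multiple of 8 bytes */
    }
    if (first == 0) {
        return ALIGN_ERROR; /* nothing below r0 to prepend */
    }
    return first - 1; /* prepend the next lower register as a dummy */
}
```

With these assumptions, align8_first_reg(1, NO_GPR, 0, 0) gives 0, matching
the `prologue 1, align8=1 -> push {r0, r1}' usage example, and
align8_first_reg(NO_GPR, NO_GPR, 1, 0) gives 2, matching
`push_ip=1, align8=1 -> {r2, ip}'.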
* Re: [PATCH v4 2/8] newlib: libc: strcmp M-profile PACBTI-enablement
2022-10-26 11:46 ` [PATCH v4 2/8] newlib: libc: strcmp M-profile PACBTI-enablement Victor L. Do Nascimento
@ 2022-11-22 15:04 ` Richard Earnshaw
0 siblings, 0 replies; 17+ messages in thread
From: Richard Earnshaw @ 2022-11-22 15:04 UTC (permalink / raw)
To: Victor L. Do Nascimento, newlib; +Cc: Richard.Earnshaw
On 26/10/2022 12:46, Victor L. Do Nascimento wrote:
> Add function prologue/epilogue to conditionally add BTI landing pads
> and/or PAC code generation & authentication instructions depending on
> compilation flags.
>
> This patch enables PACBTI for all relevant variants of strcmp:
> * Newlib for armv8.1-m.main+pacbti
> * Newlib for armv8.1-m.main+pacbti+mve
> * Newlib-nano
> ---
> newlib/libc/machine/arm/strcmp-arm-tiny.S | 8 ++++-
> newlib/libc/machine/arm/strcmp-armv7.S | 44 +++++++++++++++--------
> newlib/libc/machine/arm/strcmp-armv7m.S | 26 +++++++++-----
> 3 files changed, 54 insertions(+), 24 deletions(-)
>
> diff --git a/newlib/libc/machine/arm/strcmp-arm-tiny.S b/newlib/libc/machine/arm/strcmp-arm-tiny.S
> index 607a41daf..0bd2a2e6e 100644
> --- a/newlib/libc/machine/arm/strcmp-arm-tiny.S
> +++ b/newlib/libc/machine/arm/strcmp-arm-tiny.S
> @@ -29,10 +29,14 @@
> /* Tiny version of strcmp in ARM state. Used only when optimizing
> for size. Also supports Thumb-2. */
>
> +#include "arm_asm.h"
> +
> .syntax unified
> def_fn strcmp
> + .fnstart
> .cfi_sections .debug_frame
> .cfi_startproc
> + prologue
> 1:
> ldrb r2, [r0], #1
> ldrb r3, [r1], #1
> @@ -42,6 +46,8 @@ def_fn strcmp
> beq 1b
> 2:
> subs r0, r2, r3
> - bx lr
> + epilogue
> .cfi_endproc
> + .cantunwind
> + .fnend
> .size strcmp, . - strcmp
> diff --git a/newlib/libc/machine/arm/strcmp-armv7.S b/newlib/libc/machine/arm/strcmp-armv7.S
> index 2f93bfb73..26ba579ae 100644
> --- a/newlib/libc/machine/arm/strcmp-armv7.S
> +++ b/newlib/libc/machine/arm/strcmp-armv7.S
> @@ -45,6 +45,8 @@
> .thumb
> .syntax unified
>
> +#include "arm_asm.h"
> +
> /* Parameters and result. */
> #define src1 r0
> #define src2 r1
> @@ -91,8 +93,9 @@
> ldrd r4, r5, [sp], #16
> .cfi_restore 4
> .cfi_restore 5
> + .cfi_adjust_cfa_offset -16
> sub result, result, r1, lsr #24
> - bx lr
> + epilogue push_ip=HAVE_PAC_LEAF
> #else
> /* To use the big-endian trick we'd have to reverse all three words.
> that's slower than this approach. */
> @@ -112,22 +115,30 @@
> ldrd r4, r5, [sp], #16
> .cfi_restore 4
> .cfi_restore 5
> + .cfi_adjust_cfa_offset -16
> sub result, result, r1
>
> - bx lr
> + epilogue push_ip=HAVE_PAC_LEAF
> #endif
> .endm
>
> +
> .text
> .p2align 5
> + .fnstart
> + .cfi_sections .debug_frame
> + .cfi_startproc
> .Lstrcmp_start_addr:
> #ifndef STRCMP_NO_PRECHECK
> .Lfastpath_exit:
> + .cfi_remember_state
> sub r0, r2, r3
> - bx lr
> + epilogue push_ip=HAVE_PAC_LEAF
> nop
> #endif
> def_fn strcmp
> + .cfi_restore_state
I guess this is supposed to match the remember state above, but that's
inside an ifdef and this isn't, so they are unbalanced.
I think it would be better to move the fastpath_exit code further down,
perhaps below the first epilogue sequence. Then we can clean up the
entry part and make it more natural (in particular we can simplify the
.size directive calculation).
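One way to keep the pair balanced, sketched on a toy function rather than on strcmp itself: both directives sit under the same preprocessor condition, so a build with the pre-check disabled sees neither of them. The names demo and DEMO_NO_PRECHECK are made up for illustration, and the push/pop of r7 and lr is only a stand-in for the prologue/epilogue macros from arm_asm.h; none of this is taken from the patch.

```
	.syntax unified
	.thumb
	.text
	.p2align 2
	.global	demo
	.type	demo, %function
demo:
	.cfi_sections .debug_frame
	.cfi_startproc
	push	{r7, lr}		@ stand-in for the prologue macro
	.cfi_adjust_cfa_offset 8
	.cfi_rel_offset 7, 0
	.cfi_rel_offset 14, 4
#ifndef DEMO_NO_PRECHECK
	ldrb	r2, [r0]
	ldrb	r3, [r1]
	cmp	r2, r3
	bne	.Lfast_exit
	.cfi_remember_state	@ save the post-prologue rules
#endif
	movs	r0, #0
	pop	{r7, lr}		@ stand-in for the epilogue macro
	.cfi_restore 7
	.cfi_restore 14
	.cfi_adjust_cfa_offset -8
	bx	lr
#ifndef DEMO_NO_PRECHECK
.Lfast_exit:
	.cfi_restore_state	@ r7 and lr are still on the stack here
	subs	r0, r2, r3
	pop	{r7, lr}
	.cfi_restore 7
	.cfi_restore 14
	.cfi_adjust_cfa_offset -8
	bx	lr
#endif
	.cfi_endproc
	.size	demo, . - demo
```

Because the fast exit now follows the first return, the function starts at its own label and the .size expression can be the plain ". - demo" form. The sketch assumes a preprocessed .S file built for Thumb-2 (for example with -mthumb -march=armv7-m).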
The rest of this patch is OK.
R.
> + prologue push_ip=HAVE_PAC_LEAF
> #ifndef STRCMP_NO_PRECHECK
> ldrb r2, [src1]
> ldrb r3, [src2]
> @@ -136,16 +147,14 @@ def_fn strcmp
> cmpcs r2, r3
> bne .Lfastpath_exit
> #endif
> - .cfi_sections .debug_frame
> - .cfi_startproc
> strd r4, r5, [sp, #-16]!
> - .cfi_def_cfa_offset 16
> - .cfi_offset 4, -16
> - .cfi_offset 5, -12
> + .cfi_adjust_cfa_offset 16
> + .cfi_rel_offset 4, 0
> + .cfi_rel_offset 5, 4
> orr tmp1, src1, src2
> strd r6, r7, [sp, #8]
> - .cfi_offset 6, -8
> - .cfi_offset 7, -4
> + .cfi_rel_offset 6, 8
> + .cfi_rel_offset 7, 12
> mvn const_m1, #0
> lsl r2, tmp1, #29
> cbz r2, .Lloop_aligned8
> @@ -270,7 +279,6 @@ def_fn strcmp
> ldr data1, [src1], #4
> beq .Laligned_m2
> bcs .Laligned_m1
> -
> #ifdef STRCMP_NO_PRECHECK
> ldrb data2, [src2, #1]
> uxtb tmp1, data1, ror #BYTE1_OFFSET
> @@ -314,7 +322,8 @@ def_fn strcmp
> mov result, tmp1
> ldr r4, [sp], #16
> .cfi_restore 4
> - bx lr
> + .cfi_adjust_cfa_offset -16
> + epilogue push_ip=HAVE_PAC_LEAF
>
> #ifndef STRCMP_NO_PRECHECK
> .Laligned_m1:
> @@ -364,8 +373,9 @@ def_fn strcmp
> /* R6/7 Not used in this sequence. */
> .cfi_restore 6
> .cfi_restore 7
> + .cfi_adjust_cfa_offset -16
> neg result, result
> - bx lr
> + epilogue push_ip=HAVE_PAC_LEAF
>
> 6:
> .cfi_restore_state
> @@ -441,7 +451,8 @@ def_fn strcmp
> /* R6/7 not used in this sequence. */
> .cfi_restore 6
> .cfi_restore 7
> - bx lr
> + .cfi_adjust_cfa_offset -16
> + epilogue push_ip=HAVE_PAC_LEAF
>
> .Lstrcmp_tail:
> .cfi_restore_state
> @@ -463,7 +474,10 @@ def_fn strcmp
> /* R6/7 not used in this sequence. */
> .cfi_restore 6
> .cfi_restore 7
> + .cfi_adjust_cfa_offset -16
> sub result, result, data2, lsr #24
> - bx lr
> + epilogue push_ip=HAVE_PAC_LEAF
> .cfi_endproc
> + .cantunwind
> + .fnend
> .size strcmp, . - .Lstrcmp_start_addr
> diff --git a/newlib/libc/machine/arm/strcmp-armv7m.S b/newlib/libc/machine/arm/strcmp-armv7m.S
> index cdb4912df..825b6e77f 100644
> --- a/newlib/libc/machine/arm/strcmp-armv7m.S
> +++ b/newlib/libc/machine/arm/strcmp-armv7m.S
> @@ -29,6 +29,8 @@
> /* Very similar to the generic code, but uses Thumb2 as implemented
> in ARMv7-M. */
>
> +#include "arm_asm.h"
> +
> /* Parameters and result. */
> #define src1 r0
> #define src2 r1
> @@ -44,8 +46,10 @@
> .thumb
> .syntax unified
> def_fn strcmp
> + .fnstart
> .cfi_sections .debug_frame
> .cfi_startproc
> + prologue push_ip=HAVE_PAC_LEAF
> eor tmp1, src1, src2
> tst tmp1, #3
> /* Strings not at same byte offset from a word boundary. */
> @@ -82,6 +86,7 @@ def_fn strcmp
> ldreq data2, [src2], #4
> beq 4b
> 2:
> + .cfi_remember_state
> /* There's a zero or a different byte in the word */
> S2HI result, data1, #24
> S2LO data1, data1, #8
> @@ -106,7 +111,7 @@ def_fn strcmp
> lsrs result, result, #24
> subs result, result, data2
> #endif
> - bx lr
> + epilogue push_ip=HAVE_PAC_LEAF
>
>
> #if 0
> @@ -205,8 +210,10 @@ def_fn strcmp
>
> /* First of all, compare bytes until src1(sp1) is word-aligned. */
> .Lstrcmp_unaligned:
> + .cfi_restore_state
> tst src1, #3
> beq 2f
> + .cfi_remember_state
> ldrb data1, [src1], #1
> ldrb data2, [src2], #1
> cmp data1, #1
> @@ -214,12 +221,13 @@ def_fn strcmp
> cmpcs data1, data2
> beq .Lstrcmp_unaligned
> sub result, data1, data2
> - bx lr
> + epilogue push_ip=HAVE_PAC_LEAF
>
> 2:
> + .cfi_restore_state
> stmfd sp!, {r5}
> - .cfi_def_cfa_offset 4
> - .cfi_offset 5, -4
> + .cfi_adjust_cfa_offset 4
> + .cfi_rel_offset 5, 0
>
> ldr data1, [src1], #4
> and tmp2, src2, #3
> @@ -355,8 +363,8 @@ def_fn strcmp
> .cfi_remember_state
> ldmfd sp!, {r5}
> .cfi_restore 5
> - .cfi_def_cfa_offset 0
> - bx lr
> + .cfi_adjust_cfa_offset -4
> + epilogue push_ip=HAVE_PAC_LEAF
>
> .Lstrcmp_tail:
> .cfi_restore_state
> @@ -372,7 +380,9 @@ def_fn strcmp
> sub result, r2, result
> ldmfd sp!, {r5}
> .cfi_restore 5
> - .cfi_def_cfa_offset 0
> - bx lr
> + .cfi_adjust_cfa_offset -4
> + epilogue push_ip=HAVE_PAC_LEAF
> .cfi_endproc
> + .cantunwind
> + .fnend
> .size strcmp, . - strcmp
* Re: [PATCH v4 3/8] newlib: libc: strlen M-profile PACBTI-enablement
2022-10-26 11:47 ` [PATCH v4 3/8] newlib: libc: strlen " Victor L. Do Nascimento
@ 2022-11-22 15:20 ` Richard Earnshaw
0 siblings, 0 replies; 17+ messages in thread
From: Richard Earnshaw @ 2022-11-22 15:20 UTC (permalink / raw)
To: Victor L. Do Nascimento, newlib; +Cc: Richard.Earnshaw
On 26/10/2022 12:47, Victor L. Do Nascimento wrote:
> Add function prologue/epilogue to conditionally add BTI landing pads
> and/or PAC code generation & authentication instructions depending on
> compilation flags.
>
> This patch enables PACBTI for all relevant variants of strlen:
> * Newlib for armv8.1-m.main+pacbti
> * Newlib for armv8.1-m.main+pacbti+mve
> * Newlib-nano
This is OK. It's slightly unfortunate that we can no longer have the PLD
instruction before the prologue, but it is what it is.
R.
> ---
> newlib/libc/machine/arm/strlen-armv7.S | 17 ++++++++++++++---
> newlib/libc/machine/arm/strlen-thumb2-Os.S | 14 +++++++++++---
> 2 files changed, 25 insertions(+), 6 deletions(-)
>
> diff --git a/newlib/libc/machine/arm/strlen-armv7.S b/newlib/libc/machine/arm/strlen-armv7.S
> index f3dda0d60..27094040c 100644
> --- a/newlib/libc/machine/arm/strlen-armv7.S
> +++ b/newlib/libc/machine/arm/strlen-armv7.S
> @@ -59,6 +59,7 @@
> OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. */
>
> #include "acle-compat.h"
> +#include "arm_asm.h"
>
> .macro def_fn f p2align=0
> .text
> @@ -78,7 +79,11 @@
>
> /* This code requires Thumb. */
> #if __ARM_ARCH_PROFILE == 'M'
> +#if __ARM_ARCH >= 8
> + /* keep config inherited from -march=. */
> +#else
> .arch armv7e-m
> +#endif /* if __ARM_ARCH >= 8 */
> #else
> .arch armv6t2
> #endif
> @@ -100,8 +105,10 @@
> #define tmp2 r5
>
> def_fn strlen p2align=6
> + .fnstart
> + .cfi_startproc
> + prologue 4 5 push_ip=HAVE_PAC_LEAF
> pld [srcin, #0]
> - strd r4, r5, [sp, #-8]!
> bic src, srcin, #7
> mvn const_m1, #0
> ands tmp1, srcin, #7 /* (8 - bytes) to alignment. */
> @@ -151,6 +158,7 @@ def_fn strlen p2align=6
> beq .Lloop_aligned
>
> .Lnull_found:
> + .cfi_remember_state
> cmp data1a, #0
> itt eq
> addeq result, result, #4
> @@ -159,11 +167,11 @@ def_fn strlen p2align=6
> rev data1a, data1a
> #endif
> clz data1a, data1a
> - ldrd r4, r5, [sp], #8
> add result, result, data1a, lsr #3 /* Bits -> Bytes. */
> - bx lr
> + epilogue 4 5 push_ip=HAVE_PAC_LEAF
>
> .Lmisaligned8:
> + .cfi_restore_state
> ldrd data1a, data1b, [src]
> and tmp2, tmp1, #3
> rsb result, tmp1, #0
> @@ -177,4 +185,7 @@ def_fn strlen p2align=6
> movne data1a, const_m1
> mov const_0, #0
> b .Lstart_realigned
> + .cfi_endproc
> + .cantunwind
> + .fnend
> .size strlen, . - strlen
> diff --git a/newlib/libc/machine/arm/strlen-thumb2-Os.S b/newlib/libc/machine/arm/strlen-thumb2-Os.S
> index 961f41a0a..a46db573c 100644
> --- a/newlib/libc/machine/arm/strlen-thumb2-Os.S
> +++ b/newlib/libc/machine/arm/strlen-thumb2-Os.S
> @@ -25,6 +25,7 @@
> OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. */
>
> #include "acle-compat.h"
> +#include "arm_asm.h"
>
> .macro def_fn f p2align=0
> .text
> @@ -33,8 +34,9 @@
> .type \f, %function
> \f:
> .endm
> -
> -#if __ARM_ARCH_ISA_THUMB >= 2 && __ARM_ARCH >= 7
> +#if __ARM_ARCH_PROFILE == 'M' && __ARM_ARCH >= 8
> + /* keep config inherited from -march=. */
> +#elif __ARM_ARCH_ISA_THUMB >= 2 && __ARM_ARCH >= 7
> .arch armv7
> #else
> .arch armv6t2
> @@ -44,11 +46,17 @@
> .syntax unified
>
> def_fn strlen p2align=1
> + .fnstart
> + .cfi_startproc
> + prologue
> mov r3, r0
> 1: ldrb.w r2, [r3], #1
> cmp r2, #0
> bne 1b
> subs r0, r3, r0
> subs r0, #1
> - bx lr
> + epilogue
> + .cfi_endproc
> + .cantunwind
> + .fnend
> .size strlen, . - strlen
* Re: [PATCH v4 4/8] newlib: libc: memchr M-profile PACBTI-enablement
2022-10-26 11:49 ` [PATCH v4 4/8] newlib: libc: memchr " Victor L. Do Nascimento
@ 2022-11-22 15:33 ` Richard Earnshaw
0 siblings, 0 replies; 17+ messages in thread
From: Richard Earnshaw @ 2022-11-22 15:33 UTC (permalink / raw)
To: Victor L. Do Nascimento, newlib; +Cc: Richard.Earnshaw
On 26/10/2022 12:49, Victor L. Do Nascimento wrote:
> Add function prologue/epilogue to conditionally add BTI landing pads
> and/or PAC code generation & authentication instructions depending on
> compilation flags.
> ---
> newlib/libc/machine/arm/memchr.S | 42 +++++++++++++++++++++++++++-----
> 1 file changed, 36 insertions(+), 6 deletions(-)
>
> diff --git a/newlib/libc/machine/arm/memchr.S b/newlib/libc/machine/arm/memchr.S
> index 1a4c6512c..5b051123d 100644
> --- a/newlib/libc/machine/arm/memchr.S
> +++ b/newlib/libc/machine/arm/memchr.S
> @@ -76,6 +76,7 @@
> .syntax unified
>
> #include "acle-compat.h"
> +#include "arm_asm.h"
>
> @ NOTE: This ifdef MUST match the one in memchr-stub.c
> #if defined (__ARM_NEON__) || defined (__ARM_NEON)
> @@ -267,10 +268,14 @@ memchr:
> #elif __ARM_ARCH_ISA_THUMB >= 2 && defined (__ARM_FEATURE_DSP)
>
> #if __ARM_ARCH_PROFILE == 'M'
> - .arch armv7e-m
> +#if __ARM_ARCH >= 8
> + /* keep config inherited from -march=. */
> #else
> - .arch armv6t2
> -#endif
> + .arch armv7e-m
> +#endif /* __ARM_ARCH >= 8 */
> +#else
> + .arch armv6t2
> +#endif /* __ARM_ARCH_PROFILE == 'M' */
>
> @ this lets us check a flag in a 00/ff byte easily in either endianness
> #ifdef __ARMEB__
> @@ -287,11 +292,14 @@ memchr:
> .p2align 4,,15
> .global memchr
> .type memchr,%function
> + .fnstart
> + .cfi_startproc
> memchr:
> @ r0 = start of memory to scan
> @ r1 = character to look for
> @ r2 = length
> @ returns r0 = pointer to character or NULL if not found
> + prologue
> and r1,r1,#0xff @ Don't trust the caller to pass a char
>
> cmp r2,#16 @ If short don't bother with anything clever
> @@ -313,6 +321,11 @@ memchr:
> 10:
> @ We are aligned, we know we have at least 8 bytes to work with
> push {r4,r5,r6,r7}
> + .cfi_adjust_cfa_offset 16
> + .cfi_rel_offset 4, 0
> + .cfi_rel_offset 5, 4
> + .cfi_rel_offset 6, 8
> + .cfi_rel_offset 7, 12
> orr r1, r1, r1, lsl #8 @ expand the match word across all bytes
> orr r1, r1, r1, lsl #16
> bic r4, r2, #7 @ Number of double words to work with * 8
> @@ -334,6 +347,11 @@ memchr:
> bne 15b @ (Flags from the subs above)
>
> pop {r4,r5,r6,r7}
> + .cfi_restore 7
> + .cfi_restore 6
> + .cfi_restore 5
> + .cfi_restore 4
> + .cfi_adjust_cfa_offset -16
> and r1,r1,#0xff @ r1 back to a single character
> and r2,r2,#7 @ Leave the count remaining as the number
> @ after the double words have been done
> @@ -349,17 +367,21 @@ memchr:
> bne 21b @ on r2 flags
>
> 40:
> + .cfi_remember_state
> movs r0,#0 @ not found
> - bx lr
> + epilogue
>
> 50:
> + .cfi_restore_state
> + .cfi_remember_state
> subs r0,r0,#1 @ found
> - bx lr
> + epilogue
>
> 60: @ We're here because the fast path found a hit
> @ now we have to track down exactly which word it was
> @ r0 points to the start of the double word after the one tested
> @ r5 has the 00/ff pattern for the first word, r6 has the chained value
> + .cfi_restore_state
This restores the wrong state. Label 60 is branched to from within the
hot loop, where r4-r7 have been saved on the stack, so at the very least
you'll need to add some additional directives here to correctly describe
those still being on the stack. You can probably get away with:
.cfi_restore_state @ Standard post-prologue state
.cfi_adjust_cfa_offset 16
.cfi_rel_offset 4, 0
.cfi_rel_offset 5, 4
.cfi_rel_offset 6, 8
.cfi_rel_offset 7, 12
> cmp r5, #0
> itte eq
> moveq r5, r6 @ the end is in the 2nd word
> @@ -379,8 +401,16 @@ memchr:
>
> 61:
> pop {r4,r5,r6,r7}
> + .cfi_restore 7
> + .cfi_restore 6
> + .cfi_restore 5
> + .cfi_restore 4
> + .cfi_adjust_cfa_offset -16
> subs r0,r0,#1
> - bx lr
> + epilogue
> + .cfi_endproc
> + .cantunwind
> + .fnend
> #else
> /* Defined in memchr-stub.c. */
> #endif
R.
* Re: [PATCH v4 5/8] newlib: libc: memcpy M-profile PACBTI-enablement
2022-10-26 11:50 ` [PATCH v4 5/8] newlib: libc: memcpy " Victor L. Do Nascimento
@ 2022-11-22 16:03 ` Richard Earnshaw
0 siblings, 0 replies; 17+ messages in thread
From: Richard Earnshaw @ 2022-11-22 16:03 UTC (permalink / raw)
To: Victor L. Do Nascimento, newlib; +Cc: Richard.Earnshaw
On 26/10/2022 12:50, Victor L. Do Nascimento wrote:
> Add function prologue/epilogue to conditionally add BTI landing pads
> and/or PAC code generation & authentication instructions depending on
> compilation flags.
> ---
> newlib/libc/machine/arm/memcpy-armv7m.S | 37 +++++++++++++++++--------
> 1 file changed, 26 insertions(+), 11 deletions(-)
>
> diff --git a/newlib/libc/machine/arm/memcpy-armv7m.S b/newlib/libc/machine/arm/memcpy-armv7m.S
> index c8bff36f6..a74bacc97 100644
> --- a/newlib/libc/machine/arm/memcpy-armv7m.S
> +++ b/newlib/libc/machine/arm/memcpy-armv7m.S
> @@ -46,6 +46,8 @@
> __OPT_BIG_BLOCK_SIZE: Size of big block in words. Default to 64.
> __OPT_MID_BLOCK_SIZE: Size of big block in words. Default to 16.
> */
> +#include "arm_asm.h"
> +
> #ifndef __OPT_BIG_BLOCK_SIZE
> #define __OPT_BIG_BLOCK_SIZE (4 * 16)
> #endif
> @@ -85,6 +87,8 @@
> .global memcpy
> .thumb
> .thumb_func
> + .fnstart
> + .cfi_startproc
> .type memcpy, %function
> memcpy:
> @ r0: dst
> @@ -93,10 +97,11 @@ memcpy:
> #ifdef __ARM_FEATURE_UNALIGNED
> /* In case of UNALIGNED access supported, ip is not used in
> function body. */
> + prologue push_ip=HAVE_PAC_LEAF
> mov ip, r0
> #else
> - push {r0}
> -#endif
> + prologue 0 push_ip=HAVE_PAC_LEAF
> +#endif /* __ARM_FEATURE_UNALIGNED */
> orr r3, r1, r0
> ands r3, r3, #3
> bne .Lmisaligned_copy
> @@ -135,13 +140,13 @@ memcpy:
> ldr r3, [r1], #4
> str r3, [r0], #4
> END_UNROLL
> -#else /* __ARM_ARCH_7M__ */
> +#else
Why this change?
> ldr r3, [r1, \offset]
> str r3, [r0, \offset]
> END_UNROLL
> adds r0, __OPT_MID_BLOCK_SIZE
> adds r1, __OPT_MID_BLOCK_SIZE
> -#endif
> +#endif /* __ARM_ARCH_7M__ */
And this?
Please put them back as they were.
[Just for the record, this test is using a deprecated GCC extension and
needs rewriting using the appropriate feature tests from ACLE; but
that's not something you need to fix.]
Otherwise OK.
R.
> subs r2, __OPT_MID_BLOCK_SIZE
> bhs .Lmid_block_loop
>
> @@ -178,15 +183,17 @@ memcpy:
> #endif /* __ARM_FEATURE_UNALIGNED */
>
> .Ldone:
> + .cfi_remember_state
> #ifdef __ARM_FEATURE_UNALIGNED
> mov r0, ip
> + epilogue push_ip=HAVE_PAC_LEAF
> #else
> - pop {r0}
> -#endif
> - bx lr
> + epilogue 0 push_ip=HAVE_PAC_LEAF
> +#endif /* __ARM_FEATURE_UNALIGNED */
>
> .align 2
> .Lmisaligned_copy:
> + .cfi_restore_state
> #ifdef __ARM_FEATURE_UNALIGNED
> /* Define label DST_ALIGNED to BIG_BLOCK. It will go to aligned copy
> once destination is adjusted to aligned. */
> @@ -247,6 +254,9 @@ memcpy:
> /* dst is aligned, but src isn't. Misaligned copy. */
>
> push {r4, r5}
> + .cfi_adjust_cfa_offset 8
> + .cfi_rel_offset 4, 0
> + .cfi_rel_offset 5, 4
> subs r2, #4
>
> /* Backward r1 by misaligned bytes, to make r1 aligned.
> @@ -299,6 +309,9 @@ memcpy:
> adds r2, #4
> subs r1, ip
> pop {r4, r5}
> + .cfi_restore 4
> + .cfi_restore 5
> + .cfi_adjust_cfa_offset -8
>
> #endif /* __ARM_FEATURE_UNALIGNED */
>
> @@ -321,9 +334,11 @@ memcpy:
>
> #ifdef __ARM_FEATURE_UNALIGNED
> mov r0, ip
> + epilogue push_ip=HAVE_PAC_LEAF
> #else
> - pop {r0}
> -#endif
> - bx lr
> -
> + epilogue 0 push_ip=HAVE_PAC_LEAF
> +#endif /* __ARM_FEATURE_UNALIGNED */
> + .cfi_endproc
> + .cantunwind
> + .fnend
> .size memcpy, .-memcpy
* Re: [PATCH v4 6/8] newlib: libc: setjmp/longjmp M-profile PACBTI-enablement
2022-10-26 11:51 ` [PATCH v4 6/8] newlib: libc: setjmp/longjmp " Victor L. Do Nascimento
@ 2022-11-22 16:17 ` Richard Earnshaw
0 siblings, 0 replies; 17+ messages in thread
From: Richard Earnshaw @ 2022-11-22 16:17 UTC (permalink / raw)
To: Victor L. Do Nascimento, newlib; +Cc: Richard.Earnshaw
On 26/10/2022 12:51, Victor L. Do Nascimento wrote:
> Add function prologue/epilogue to conditionally add BTI landing pads
> and/or PAC code generation & authentication instructions depending on
> compilation flags.
> ---
> newlib/libc/machine/arm/setjmp.S | 33 ++++++++++++++++++++++++++++++--
> 1 file changed, 31 insertions(+), 2 deletions(-)
>
> diff --git a/newlib/libc/machine/arm/setjmp.S b/newlib/libc/machine/arm/setjmp.S
> index 21d6ff9e7..4fe53cdf2 100644
> --- a/newlib/libc/machine/arm/setjmp.S
> +++ b/newlib/libc/machine/arm/setjmp.S
> @@ -157,11 +157,15 @@ SYM (.arm_start_of.\name):
> .globl SYM (\name)
> TYPE (\name)
> SYM (\name):
> + .fnstart
> + .cfi_startproc
> PROLOGUE \name
> .endm
>
> .macro FUNC_END name
> RET
> + .cfi_endproc
> + .fnend
> SIZE (\name)
> .endm
>
> @@ -173,11 +177,26 @@ SYM (\name):
>
> /* Save all the callee-preserved registers into the jump buffer. */
> #ifdef __thumb2__
> +#if __ARM_FEATURE_PAC_DEFAULT
> +#if __ARM_FEATURE_BTI_DEFAULT
> + pacbti ip, lr, sp
> +#else
> + pac ip, lr, sp
> +#endif /* __ARM_FEATURE_BTI_DEFAULT */
> + .cfi_register 143, 12
> + mov a4, ip
> + mov ip, sp
> + stmea a1!, { a4, v1-v7, fp, ip, lr }
So this stores an extra value in the jump buf (and also changes the
offsets of the stored values). Have you checked that there's enough
space for that?
This might be considered an ABI break, though that's perhaps not too
important in a bare-metal environment.
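The word counts involved can be read straight off the register lists in the hunk. A minimal sketch of the space check as assemble-time arithmetic follows; the symbol names are invented here, and JB_WORDS_AVAILABLE is a placeholder that would have to be replaced with the real size of jmp_buf from machine/setjmp.h before the check means anything.

```
	@ Words written to the jump buffer by each stmea variant quoted here.
	.equ	JB_WORDS_ARM, 11	@ v1-v7, fp, ip, sp, lr
	.equ	JB_WORDS_THUMB2, 10	@ v1-v7, fp, ip, lr
	.equ	JB_WORDS_THUMB2_PAC, 11	@ a4, v1-v7, fp, ip, lr

	@ Placeholder value: replace with the real word count of jmp_buf.
	.equ	JB_WORDS_AVAILABLE, 23

	.if	JB_WORDS_THUMB2_PAC > JB_WORDS_AVAILABLE
	.error	"PAC setjmp would overflow the jump buffer"
	.endif
```

On these counts the PAC variant stores as many words as the existing Arm-state path, but v1-v7 each sit one word later than in the non-PAC Thumb-2 layout. The floating-point saves are under "#if 0" in this file, so they are left out of the count.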
R.
> +#else
> +#if __ARM_FEATURE_BTI_DEFAULT
> + bti
> +#endif /* __ARM_FEATURE_BTI_DEFAULT */
> mov ip, sp
> stmea a1!, { v1-v7, fp, ip, lr }
> +#endif /* __ARM_FEATURE_PAC_DEFAULT */
> #else
> stmea a1!, { v1-v7, fp, ip, sp, lr }
> -#endif
> +#endif /* __thumb2__ */
>
> #if 0 /* Simulator does not cope with FP instructions yet. */
> #ifndef __SOFTFP__
> @@ -200,11 +219,17 @@ SYM (\name):
>
> /* Restore the registers, retrieving the state when setjmp() was called. */
> #ifdef __thumb2__
> +#if __ARM_FEATURE_PAC_DEFAULT
> + ldmfd a1!, { a4, v1-v7, fp, ip, lr }
> + mov sp, ip
> + mov ip, a4
> +#else
> ldmfd a1!, { v1-v7, fp, ip, lr }
> mov sp, ip
> +#endif /* __ARM_FEATURE_PAC_DEFAULT */
> #else
> ldmfd a1!, { v1-v7, fp, ip, sp, lr }
> -#endif
> +#endif /* __thumb2__ */
>
> #if 0 /* Simulator does not cope with FP instructions yet. */
> #ifndef __SOFTFP__
> @@ -220,5 +245,9 @@ SYM (\name):
> #endif
> moveq a1, #1
>
> +#if __ARM_FEATURE_PAC_DEFAULT
> + aut ip, lr, sp
> +#endif
> +
> FUNC_END longjmp
> #endif
* Re: [PATCH v4 7/8] newlib: libc: aeabi_memmove M-profile PACBTI-enablement
2022-10-26 11:52 ` [PATCH v4 7/8] newlib: libc: aeabi_memmove " Victor L. Do Nascimento
@ 2022-11-22 16:18 ` Richard Earnshaw
0 siblings, 0 replies; 17+ messages in thread
From: Richard Earnshaw @ 2022-11-22 16:18 UTC (permalink / raw)
To: Victor L. Do Nascimento, newlib; +Cc: Richard.Earnshaw
On 26/10/2022 12:52, Victor L. Do Nascimento wrote:
> Add function prologue/epilogue to conditionally add BTI landing pads
> and/or PAC code generation & authentication instructions depending on
> compilation flags.
> ---
> newlib/libc/machine/arm/aeabi_memmove-thumb2.S | 17 ++++++++++++-----
> 1 file changed, 12 insertions(+), 5 deletions(-)
OK.
R.
>
> diff --git a/newlib/libc/machine/arm/aeabi_memmove-thumb2.S b/newlib/libc/machine/arm/aeabi_memmove-thumb2.S
> index e9504437b..20ca993e5 100644
> --- a/newlib/libc/machine/arm/aeabi_memmove-thumb2.S
> +++ b/newlib/libc/machine/arm/aeabi_memmove-thumb2.S
> @@ -26,6 +26,8 @@
> * SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
> */
>
> +#include "arm_asm.h"
> +
> .thumb
> .syntax unified
> .global __aeabi_memmove
> @@ -33,8 +35,10 @@
> ASM_ALIAS __aeabi_memmove4 __aeabi_memmove
> ASM_ALIAS __aeabi_memmove8 __aeabi_memmove
> __aeabi_memmove:
> + .fnstart
> + .cfi_startproc
> + prologue 4
> cmp r0, r1
> - push {r4}
> bls 3f
> adds r3, r1, r2
> cmp r0, r3
> @@ -48,9 +52,10 @@ __aeabi_memmove:
> strb r4, [r1, #-1]!
> bne 1b
> 2:
> - pop {r4}
> - bx lr
> + .cfi_remember_state
> + epilogue 4
> 3:
> + .cfi_restore_state
> cmp r2, #0
> beq 2b
> add r2, r2, r1
> @@ -60,6 +65,8 @@ __aeabi_memmove:
> cmp r2, r1
> strb r4, [r3, #1]!
> bne 4b
> - pop {r4}
> - bx lr
> + epilogue 4
> + .cfi_endproc
> + .cantunwind
> + .fnend
> .size __aeabi_memmove, . - __aeabi_memmove
* Re: [PATCH v4 8/8] newlib: libc: aeabi_memset M-profile PACBTI-enablement
2022-10-26 11:53 ` [PATCH v4 8/8] newlib: libc: aeabi_memset " Victor L. Do Nascimento
@ 2022-11-22 16:19 ` Richard Earnshaw
0 siblings, 0 replies; 17+ messages in thread
From: Richard Earnshaw @ 2022-11-22 16:19 UTC (permalink / raw)
To: Victor L. Do Nascimento, newlib; +Cc: Richard.Earnshaw
On 26/10/2022 12:53, Victor L. Do Nascimento wrote:
> Add function prologue/epilogue to conditionally add BTI landing pads
> and/or PAC code generation & authentication instructions depending on
> compilation flags.
> ---
> newlib/libc/machine/arm/aeabi_memset-thumb2.S | 14 +++++++++++---
> 1 file changed, 11 insertions(+), 3 deletions(-)
OK.
R.
>
> diff --git a/newlib/libc/machine/arm/aeabi_memset-thumb2.S b/newlib/libc/machine/arm/aeabi_memset-thumb2.S
> index eaca1d8d7..6b77d3820 100644
> --- a/newlib/libc/machine/arm/aeabi_memset-thumb2.S
> +++ b/newlib/libc/machine/arm/aeabi_memset-thumb2.S
> @@ -26,14 +26,18 @@
> * SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
> */
>
> +#include "arm_asm.h"
> +
> .thumb
> .syntax unified
> .global __aeabi_memset
> .type __aeabi_memset, %function
> + .fnstart
> + .cfi_startproc
> ASM_ALIAS __aeabi_memset4 __aeabi_memset
> ASM_ALIAS __aeabi_memset8 __aeabi_memset
> __aeabi_memset:
> - push {r4, r5, r6}
> + prologue 4 6
> lsls r4, r0, #30
> beq 10f
> subs r4, r1, #1
> @@ -98,10 +102,14 @@ __aeabi_memset:
> cmp r3, r4
> bne 8b
> 9:
> - pop {r4, r5, r6}
> - bx lr
> + .cfi_remember_state
> + epilogue 4 6
> 10:
> + .cfi_restore_state
> mov r4, r1
> mov r3, r0
> b 3b
> + .cfi_endproc
> + .cantunwind
> + .fnend
> .size __aeabi_memset, . - __aeabi_memset