* [PATCH v2 0/8] Implement assembly cortex-M PACBTI functionality
@ 2022-08-03 15:35 Victor Do Nascimento
2022-08-03 15:35 ` [PATCH v2 1/8] newlib: libc: define M-profile PACBTI-enablement macros Victor Do Nascimento
` (7 more replies)
0 siblings, 8 replies; 18+ messages in thread
From: Victor Do Nascimento @ 2022-08-03 15:35 UTC (permalink / raw)
To: newlib; +Cc: richard.earnshaw, Victor Do Nascimento
This patch series modifies hand-written assembly files for Arm targets,
conditionally enabling branch target identification as well as address
return signature and verification based on Armv8.1-M Pointer
Authentication [1] using ACLE feature test macros at compile-time [2].
The incorportaion of PACBTI functionality in function prologues/
epilogues is dictated by the arguments passed to the
`-mbranch-protection' flag at the time of Newlib compilation.
Regression tested on arm-none-eabi with and without MVE extension and
for Newlib and Newlib-nano.
[1]
<https://community.arm.com/developer/ip-products/processors/b/processors-ip-blog/posts/armv8-1-m-pointer-authentication-and-branch-target-identification-extension>
[2] <https://developer.arm.com/documentation/101028/0012/5--Feature-test-macros>
Victor Do Nascimento (8):
newlib: libc: define M-profile PACBTI-enablement macros
newlib: libc: strcmp M-profile PACBTI-enablement
newlib: libc: strlen M-profile PACBTI-enablement
newlib: libc: memchr M-profile PACBTI-enablement
newlib: libc: memcpy M-profile PACBTI-enablement
newlib: libc: setjmp/longjmp M-profile PACBTI-enablement
newlib: libc: aeabi_memmove M-profile PACBTI-enablement
newlib: libc: aeabi_memset M-profile PACBTI-enablement
.../libc/machine/arm/aeabi_memmove-thumb2.S | 54 ++++++++++++--
newlib/libc/machine/arm/aeabi_memset-thumb2.S | 47 +++++++++++-
newlib/libc/machine/arm/memchr.S | 55 ++++++++++++--
newlib/libc/machine/arm/memcpy-armv7m.S | 74 +++++++++++++++++--
newlib/libc/machine/arm/pacbti.h | 64 ++++++++++++++++
newlib/libc/machine/arm/setjmp.S | 33 ++++++++-
newlib/libc/machine/arm/strcmp-arm-tiny.S | 7 +-
newlib/libc/machine/arm/strcmp-armv7.S | 45 ++++++++---
newlib/libc/machine/arm/strcmp-armv7m.S | 36 +++++++--
newlib/libc/machine/arm/strlen-armv7.S | 53 ++++++++++++-
newlib/libc/machine/arm/strlen-thumb2-Os.S | 13 +++-
11 files changed, 435 insertions(+), 46 deletions(-)
create mode 100644 newlib/libc/machine/arm/pacbti.h
--
2.36.1
^ permalink raw reply [flat|nested] 18+ messages in thread
* [PATCH v2 1/8] newlib: libc: define M-profile PACBTI-enablement macros
2022-08-03 15:35 [PATCH v2 0/8] Implement assembly cortex-M PACBTI functionality Victor Do Nascimento
@ 2022-08-03 15:35 ` Victor Do Nascimento
2022-08-04 15:19 ` Richard Earnshaw
2022-08-03 15:35 ` [PATCH v2 2/8] newlib: libc: strcmp M-profile PACBTI-enablement Victor Do Nascimento
` (6 subsequent siblings)
7 siblings, 1 reply; 18+ messages in thread
From: Victor Do Nascimento @ 2022-08-03 15:35 UTC (permalink / raw)
To: newlib; +Cc: richard.earnshaw, Victor Do Nascimento
Create an assembly header file that conditionally defines fuction
prologues/epilogues depending on the compile-time mbranch-protection
argument values.
* newlib/libc/machine/arm/pacbti.h: New.
---
newlib/libc/machine/arm/pacbti.h | 64 ++++++++++++++++++++++++++++++++
1 file changed, 64 insertions(+)
create mode 100644 newlib/libc/machine/arm/pacbti.h
diff --git a/newlib/libc/machine/arm/pacbti.h b/newlib/libc/machine/arm/pacbti.h
new file mode 100644
index 000000000..4c31d00bc
--- /dev/null
+++ b/newlib/libc/machine/arm/pacbti.h
@@ -0,0 +1,64 @@
+/*
+ * Copyright (c) 2022 Arm Ltd
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ * 1. Redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer.
+ * 2. Redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in the
+ * documentation and/or other materials provided with the distribution.
+ * 3. The name of the company may not be used to endorse or promote
+ * products derived from this software without specific prior written
+ * permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY ARM LTD ``AS IS'' AND ANY EXPRESS OR IMPLIED
+ * WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF
+ * MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED.
+ * IN NO EVENT SHALL ARM LTD BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED
+ * TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR
+ * PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF
+ * LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING
+ * NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
+ * SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+/* Checki whether leaf function PAC signing has been requested
+ in the -mbranch-protect compile-time option. */
+#define LEAF_PROTECT_BIT 2
+#define __HAVE_PAC_LEAF \
+ __ARM_FEATURE_PAC_DEFAULT & (1 << LEAF_PROTECT_BIT)
+
+/* Macro to handle function entry depending on branch-protection
+ schemes. */
+ .macro pacbti_prologue
+#if __HAVE_PAC_LEAF
+#if __ARM_FEATURE_BTI_DEFAULT
+ pacbti ip, lr, sp
+#else
+ pac ip, lr, sp
+#endif /* __ARM_FEATURE_BTI_DEFAULT */
+ .cfi_register 143, 12
+ str ip, [sp, #-4]!
+ .save {ra_auth_code}
+ .cfi_def_cfa_offset 4
+ .cfi_offset 143, -4
+#elif __ARM_FEATURE_BTI_DEFAULT
+ bti
+#endif /* __HAVE_PAC_LEAF */
+ .endm
+
+/* Macro to handle different branch exchange cases depending on
+ branch-protection schemes. */
+ .macro pacbti_epilogue
+#if __HAVE_PAC_LEAF
+ ldr ip, [sp], #4
+ .cfi_restore 143
+ .cfi_def_cfa_offset 0
+ aut ip, lr, sp
+#endif /* __HAVE_PAC_LEAF */
+ bx lr
+ .endm
--
2.36.1
^ permalink raw reply [flat|nested] 18+ messages in thread
* [PATCH v2 2/8] newlib: libc: strcmp M-profile PACBTI-enablement
2022-08-03 15:35 [PATCH v2 0/8] Implement assembly cortex-M PACBTI functionality Victor Do Nascimento
2022-08-03 15:35 ` [PATCH v2 1/8] newlib: libc: define M-profile PACBTI-enablement macros Victor Do Nascimento
@ 2022-08-03 15:35 ` Victor Do Nascimento
2022-08-04 15:48 ` Richard Earnshaw
2022-08-03 15:35 ` [PATCH v2 3/8] newlib: libc: strlen " Victor Do Nascimento
` (5 subsequent siblings)
7 siblings, 1 reply; 18+ messages in thread
From: Victor Do Nascimento @ 2022-08-03 15:35 UTC (permalink / raw)
To: newlib; +Cc: richard.earnshaw, Victor Do Nascimento
Add function prologue/epilogue to conditionally add BTI landing pads
and/or PAC code generation & authentication instructions depending on
compilation flags.
This patch enables PACBTI for all relevant variants of strcmp:
* Newlib for armv8.1-m.main+pacbti
* Newlib for armv8.1-m.main+pacbti+mve
* Newlib-nano
---
newlib/libc/machine/arm/strcmp-arm-tiny.S | 7 +++-
newlib/libc/machine/arm/strcmp-armv7.S | 45 +++++++++++++++++------
newlib/libc/machine/arm/strcmp-armv7m.S | 36 +++++++++++++++---
3 files changed, 71 insertions(+), 17 deletions(-)
diff --git a/newlib/libc/machine/arm/strcmp-arm-tiny.S b/newlib/libc/machine/arm/strcmp-arm-tiny.S
index 607a41daf..8085eb4df 100644
--- a/newlib/libc/machine/arm/strcmp-arm-tiny.S
+++ b/newlib/libc/machine/arm/strcmp-arm-tiny.S
@@ -29,10 +29,14 @@
/* Tiny version of strcmp in ARM state. Used only when optimizing
for size. Also supports Thumb-2. */
+#include "pacbti.h"
+
.syntax unified
def_fn strcmp
+ .fnstart
.cfi_sections .debug_frame
.cfi_startproc
+ pacbti_prologue
1:
ldrb r2, [r0], #1
ldrb r3, [r1], #1
@@ -42,6 +46,7 @@ def_fn strcmp
beq 1b
2:
subs r0, r2, r3
- bx lr
+ pacbti_epilogue
.cfi_endproc
+ .fnend
.size strcmp, . - strcmp
diff --git a/newlib/libc/machine/arm/strcmp-armv7.S b/newlib/libc/machine/arm/strcmp-armv7.S
index 2f93bfb73..584c89f7e 100644
--- a/newlib/libc/machine/arm/strcmp-armv7.S
+++ b/newlib/libc/machine/arm/strcmp-armv7.S
@@ -45,6 +45,8 @@
.thumb
.syntax unified
+#include "pacbti.h"
+
/* Parameters and result. */
#define src1 r0
#define src2 r1
@@ -91,8 +93,9 @@
ldrd r4, r5, [sp], #16
.cfi_restore 4
.cfi_restore 5
+ .cfi_adjust_cfa_offset -16
sub result, result, r1, lsr #24
- bx lr
+ pacbti_epilogue
#else
/* To use the big-endian trick we'd have to reverse all three words.
that's slower than this approach. */
@@ -112,22 +115,28 @@
ldrd r4, r5, [sp], #16
.cfi_restore 4
.cfi_restore 5
+ .cfi_adjust_cfa_offset -16
sub result, result, r1
- bx lr
+ pacbti_epilogue
#endif
.endm
+
.text
.p2align 5
+ .fnstart
+ .cfi_sections .debug_frame
+ .cfi_startproc
.Lstrcmp_start_addr:
#ifndef STRCMP_NO_PRECHECK
.Lfastpath_exit:
sub r0, r2, r3
- bx lr
+ pacbti_epilogue
nop
#endif
def_fn strcmp
+ pacbti_prologue
#ifndef STRCMP_NO_PRECHECK
ldrb r2, [src1]
ldrb r3, [src2]
@@ -136,16 +145,26 @@ def_fn strcmp
cmpcs r2, r3
bne .Lfastpath_exit
#endif
- .cfi_sections .debug_frame
- .cfi_startproc
strd r4, r5, [sp, #-16]!
- .cfi_def_cfa_offset 16
+ .save {r4, r5}
+ .cfi_adjust_cfa_offset 16
+#if __HAVE_PAC_LEAF
+ .cfi_offset 4, -20
+ .cfi_offset 5, -16
+#else
.cfi_offset 4, -16
.cfi_offset 5, -12
+#endif /* __HAVE_PAC_LEAF */
orr tmp1, src1, src2
strd r6, r7, [sp, #8]
+ .save {r6, r7}
+#if __HAVE_PAC_LEAF
+ .cfi_offset 6, -12
+ .cfi_offset 7, -8
+#else
.cfi_offset 6, -8
.cfi_offset 7, -4
+#endif /* __HAVE_PAC_LEAF */
mvn const_m1, #0
lsl r2, tmp1, #29
cbz r2, .Lloop_aligned8
@@ -270,7 +289,6 @@ def_fn strcmp
ldr data1, [src1], #4
beq .Laligned_m2
bcs .Laligned_m1
-
#ifdef STRCMP_NO_PRECHECK
ldrb data2, [src2, #1]
uxtb tmp1, data1, ror #BYTE1_OFFSET
@@ -314,7 +332,8 @@ def_fn strcmp
mov result, tmp1
ldr r4, [sp], #16
.cfi_restore 4
- bx lr
+ .cfi_adjust_cfa_offset -16
+ pacbti_epilogue
#ifndef STRCMP_NO_PRECHECK
.Laligned_m1:
@@ -364,8 +383,9 @@ def_fn strcmp
/* R6/7 Not used in this sequence. */
.cfi_restore 6
.cfi_restore 7
+ .cfi_adjust_cfa_offset -16
neg result, result
- bx lr
+ pacbti_epilogue
6:
.cfi_restore_state
@@ -441,7 +461,8 @@ def_fn strcmp
/* R6/7 not used in this sequence. */
.cfi_restore 6
.cfi_restore 7
- bx lr
+ .cfi_adjust_cfa_offset -16
+ pacbti_epilogue
.Lstrcmp_tail:
.cfi_restore_state
@@ -463,7 +484,9 @@ def_fn strcmp
/* R6/7 not used in this sequence. */
.cfi_restore 6
.cfi_restore 7
+ .cfi_adjust_cfa_offset -16
sub result, result, data2, lsr #24
- bx lr
+ pacbti_epilogue
.cfi_endproc
+ .fnend
.size strcmp, . - .Lstrcmp_start_addr
diff --git a/newlib/libc/machine/arm/strcmp-armv7m.S b/newlib/libc/machine/arm/strcmp-armv7m.S
index cdb4912df..fe1519f4d 100644
--- a/newlib/libc/machine/arm/strcmp-armv7m.S
+++ b/newlib/libc/machine/arm/strcmp-armv7m.S
@@ -29,6 +29,8 @@
/* Very similar to the generic code, but uses Thumb2 as implemented
in ARMv7-M. */
+#include "pacbti.h"
+
/* Parameters and result. */
#define src1 r0
#define src2 r1
@@ -44,8 +46,10 @@
.thumb
.syntax unified
def_fn strcmp
+ .fnstart
.cfi_sections .debug_frame
.cfi_startproc
+ pacbti_prologue
eor tmp1, src1, src2
tst tmp1, #3
/* Strings not at same byte offset from a word boundary. */
@@ -106,7 +110,7 @@ def_fn strcmp
lsrs result, result, #24
subs result, result, data2
#endif
- bx lr
+ pacbti_epilogue
#if 0
@@ -214,12 +218,18 @@ def_fn strcmp
cmpcs data1, data2
beq .Lstrcmp_unaligned
sub result, data1, data2
- bx lr
+ pacbti_epilogue
2:
stmfd sp!, {r5}
- .cfi_def_cfa_offset 4
+ .save {r5}
+ .cfi_adjust_cfa_offset 4
+#if __HAVE_PAC_LEAF
+ .cfi_offset 5, -8 /* Account for ip register already on stack. */
+#else
.cfi_offset 5, -4
+#endif /* __HAVE_PAC_LEAF */
+
ldr data1, [src1], #4
and tmp2, src2, #3
@@ -353,10 +363,17 @@ def_fn strcmp
.Lstrcmp_done_equal:
mov result, #0
.cfi_remember_state
+#if __HAVE_PAC_LEAF
+ pop {r5, ip}
+ .cfi_restore 5
+ .cfi_restore 143
+ aut ip, lr, sp
+#else
ldmfd sp!, {r5}
.cfi_restore 5
+#endif /* __HAVE_PAC_LEAF */
.cfi_def_cfa_offset 0
- bx lr
+ bx lr
.Lstrcmp_tail:
.cfi_restore_state
@@ -370,9 +387,18 @@ def_fn strcmp
S2LOEQ data2, data2, #8
beq .Lstrcmp_tail
sub result, r2, result
+#if __HAVE_PAC_LEAF
+ pop {r5, ip}
+ .cfi_restore 5
+ .cfi_restore 143
+ .cfi_def_cfa_offset 0
+ aut ip, lr, sp
+#else
ldmfd sp!, {r5}
.cfi_restore 5
.cfi_def_cfa_offset 0
- bx lr
+#endif /* __HAVE_PAC_LEAF */
+ bx lr
.cfi_endproc
+ .fnend
.size strcmp, . - strcmp
--
2.36.1
^ permalink raw reply [flat|nested] 18+ messages in thread
* [PATCH v2 3/8] newlib: libc: strlen M-profile PACBTI-enablement
2022-08-03 15:35 [PATCH v2 0/8] Implement assembly cortex-M PACBTI functionality Victor Do Nascimento
2022-08-03 15:35 ` [PATCH v2 1/8] newlib: libc: define M-profile PACBTI-enablement macros Victor Do Nascimento
2022-08-03 15:35 ` [PATCH v2 2/8] newlib: libc: strcmp M-profile PACBTI-enablement Victor Do Nascimento
@ 2022-08-03 15:35 ` Victor Do Nascimento
2022-08-03 15:35 ` [PATCH v2 4/8] newlib: libc: memchr " Victor Do Nascimento
` (4 subsequent siblings)
7 siblings, 0 replies; 18+ messages in thread
From: Victor Do Nascimento @ 2022-08-03 15:35 UTC (permalink / raw)
To: newlib; +Cc: richard.earnshaw, Victor Do Nascimento
Add function prologue/epilogue to conditionally add BTI landing pads
and/or PAC code generation & authentication instructions depending on
compilation flags.
This patch enables PACBTI for all relevant variants of strlen:
* Newlib for armv8.1-m.main+pacbti
* Newlib for armv8.1-m.main+pacbti+mve
* Newlib-nano
---
newlib/libc/machine/arm/strlen-armv7.S | 53 ++++++++++++++++++++--
newlib/libc/machine/arm/strlen-thumb2-Os.S | 13 ++++--
2 files changed, 60 insertions(+), 6 deletions(-)
diff --git a/newlib/libc/machine/arm/strlen-armv7.S b/newlib/libc/machine/arm/strlen-armv7.S
index f3dda0d60..0492800f9 100644
--- a/newlib/libc/machine/arm/strlen-armv7.S
+++ b/newlib/libc/machine/arm/strlen-armv7.S
@@ -59,6 +59,7 @@
OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. */
#include "acle-compat.h"
+#include "pacbti.h"
.macro def_fn f p2align=0
.text
@@ -78,7 +79,11 @@
/* This code requires Thumb. */
#if __ARM_ARCH_PROFILE == 'M'
+#if __ARM_ARCH >= 8
+ /* keep config inherited from -march=. */
+#else
.arch armv7e-m
+#endif /* if __ARM_ARCH >= 8 */
#else
.arch armv6t2
#endif
@@ -100,8 +105,35 @@
#define tmp2 r5
def_fn strlen p2align=6
+ .fnstart
+ .cfi_startproc
+ /* common pacbti_prologue macro from pacbti.h not used.
+ handwritten prologue saves one push instruction. */
+#if __HAVE_PAC_LEAF
+#if __ARM_FEATURE_BTI_DEFAULT
+ pacbti ip, lr, sp
+#else
+ pac ip, lr, sp
+#endif /* __ARM_FEATURE_BTI_DEFAULT */
+ .cfi_register 143, 12
+ push {r4, r5, ip}
+ .save {r4, r5, ra_auth_code}
+ .cfi_def_cfa_offset 12
+ .cfi_offset 143, -4
+ .cfi_offset 5, -8
+ .cfi_offset 4, -12
+
+#else
+#if __ARM_FEATURE_BTI_DEFAULT
+ bti
+#endif /* __ARM_FEATURE_BTI_DEFAULT */
+ push {r4, r5}
+ .save {r4, r5}
+ .cfi_def_cfa_offset 8
+ .cfi_offset 5, -4
+ .cfi_offset 4, -8
+#endif /* __HAVE_PAC_LEAF */
pld [srcin, #0]
- strd r4, r5, [sp, #-8]!
bic src, srcin, #7
mvn const_m1, #0
ands tmp1, srcin, #7 /* (8 - bytes) to alignment. */
@@ -159,9 +191,22 @@ def_fn strlen p2align=6
rev data1a, data1a
#endif
clz data1a, data1a
- ldrd r4, r5, [sp], #8
add result, result, data1a, lsr #3 /* Bits -> Bytes. */
- bx lr
+#if __HAVE_PAC_LEAF
+ pop {r4, r5, ip}
+ .cfi_restore 4
+ .cfi_restore 5
+ .cfi_restore 143
+ .cfi_def_cfa_offset 0
+ aut ip, lr, sp
+#else
+ ldrd r4, r5, [sp], #8
+ .cfi_restore 4
+ .cfi_restore 5
+ .cfi_def_cfa_offset 0
+#endif /* __HAVE_PAC_LEAF */
+ bx lr
+
.Lmisaligned8:
ldrd data1a, data1b, [src]
@@ -177,4 +222,6 @@ def_fn strlen p2align=6
movne data1a, const_m1
mov const_0, #0
b .Lstart_realigned
+ .cfi_endproc
+ .fnend
.size strlen, . - strlen
diff --git a/newlib/libc/machine/arm/strlen-thumb2-Os.S b/newlib/libc/machine/arm/strlen-thumb2-Os.S
index 961f41a0a..571068dac 100644
--- a/newlib/libc/machine/arm/strlen-thumb2-Os.S
+++ b/newlib/libc/machine/arm/strlen-thumb2-Os.S
@@ -25,6 +25,7 @@
OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. */
#include "acle-compat.h"
+#include "pacbti.h"
.macro def_fn f p2align=0
.text
@@ -33,8 +34,9 @@
.type \f, %function
\f:
.endm
-
-#if __ARM_ARCH_ISA_THUMB >= 2 && __ARM_ARCH >= 7
+#if __ARM_ARCH_PROFILE == 'M' && __ARM_ARCH >= 8
+ /* keep config inherited from -march=. */
+#elif __ARM_ARCH_ISA_THUMB >= 2 && __ARM_ARCH >= 7
.arch armv7
#else
.arch armv6t2
@@ -44,11 +46,16 @@
.syntax unified
def_fn strlen p2align=1
+ .fnstart
+ .cfi_startproc
+ pacbti_prologue
mov r3, r0
1: ldrb.w r2, [r3], #1
cmp r2, #0
bne 1b
subs r0, r3, r0
subs r0, #1
- bx lr
+ pacbti_epilogue
+ .cfi_endproc
+ .fnend
.size strlen, . - strlen
--
2.36.1
^ permalink raw reply [flat|nested] 18+ messages in thread
* [PATCH v2 4/8] newlib: libc: memchr M-profile PACBTI-enablement
2022-08-03 15:35 [PATCH v2 0/8] Implement assembly cortex-M PACBTI functionality Victor Do Nascimento
` (2 preceding siblings ...)
2022-08-03 15:35 ` [PATCH v2 3/8] newlib: libc: strlen " Victor Do Nascimento
@ 2022-08-03 15:35 ` Victor Do Nascimento
2022-08-03 15:35 ` [PATCH v2 5/8] newlib: libc: memcpy " Victor Do Nascimento
` (3 subsequent siblings)
7 siblings, 0 replies; 18+ messages in thread
From: Victor Do Nascimento @ 2022-08-03 15:35 UTC (permalink / raw)
To: newlib; +Cc: richard.earnshaw, Victor Do Nascimento
Add function prologue/epilogue to conditionally add BTI landing pads
and/or PAC code generation & authentication instructions depending on
compilation flags.
---
newlib/libc/machine/arm/memchr.S | 55 ++++++++++++++++++++++++++++----
1 file changed, 48 insertions(+), 7 deletions(-)
diff --git a/newlib/libc/machine/arm/memchr.S b/newlib/libc/machine/arm/memchr.S
index 1a4c6512c..6bdccb934 100644
--- a/newlib/libc/machine/arm/memchr.S
+++ b/newlib/libc/machine/arm/memchr.S
@@ -76,6 +76,7 @@
.syntax unified
#include "acle-compat.h"
+#include "pacbti.h"
@ NOTE: This ifdef MUST match the one in memchr-stub.c
#if defined (__ARM_NEON__) || defined (__ARM_NEON)
@@ -267,10 +268,14 @@ memchr:
#elif __ARM_ARCH_ISA_THUMB >= 2 && defined (__ARM_FEATURE_DSP)
#if __ARM_ARCH_PROFILE == 'M'
- .arch armv7e-m
+#if __ARM_ARCH >= 8
+ /* keep config inherited from -march=. */
#else
- .arch armv6t2
-#endif
+ .arch armv7e-m
+#endif /* __ARM_ARCH >= 8 */
+#else
+ .arch armv6t2
+#endif /* __ARM_ARCH_PROFILE == 'M' */
@ this lets us check a flag in a 00/ff byte easily in either endianness
#ifdef __ARMEB__
@@ -287,11 +292,14 @@ memchr:
.p2align 4,,15
.global memchr
.type memchr,%function
+ .fnstart
+ .cfi_startproc
memchr:
@ r0 = start of memory to scan
@ r1 = character to look for
@ r2 = length
@ returns r0 = pointer to character or NULL if not found
+ pacbti_prologue
and r1,r1,#0xff @ Don't trust the caller to pass a char
cmp r2,#16 @ If short don't bother with anything clever
@@ -313,6 +321,19 @@ memchr:
10:
@ We are aligned, we know we have at least 8 bytes to work with
push {r4,r5,r6,r7}
+ .save {r4-r7}
+ .cfi_adjust_cfa_offset 16
+#if __HAVE_PAC_LEAF
+ .cfi_offset 4, -20
+ .cfi_offset 5, -16
+ .cfi_offset 6, -12
+ .cfi_offset 7, -8
+#else
+ .cfi_offset 4, -16
+ .cfi_offset 5, -12
+ .cfi_offset 6, -8
+ .cfi_offset 7, -4
+#endif /* __HAVE_PAC_LEAF*/
orr r1, r1, r1, lsl #8 @ expand the match word across all bytes
orr r1, r1, r1, lsl #16
bic r4, r2, #7 @ Number of double words to work with * 8
@@ -334,6 +355,11 @@ memchr:
bne 15b @ (Flags from the subs above)
pop {r4,r5,r6,r7}
+ .cfi_restore 7
+ .cfi_restore 6
+ .cfi_restore 5
+ .cfi_restore 4
+ .cfi_adjust_cfa_offset -16
and r1,r1,#0xff @ r1 back to a single character
and r2,r2,#7 @ Leave the count remaining as the number
@ after the double words have been done
@@ -350,11 +376,11 @@ memchr:
40:
movs r0,#0 @ not found
- bx lr
+ pacbti_epilogue
50:
subs r0,r0,#1 @ found
- bx lr
+ pacbti_epilogue
60: @ We're here because the fast path found a hit
@ now we have to track down exactly which word it was
@@ -378,9 +404,24 @@ memchr:
addeq r0,r0,#1
61:
- pop {r4,r5,r6,r7}
subs r0,r0,#1
- bx lr
+#if __HAVE_PAC_LEAF
+ pop {r4,r5,r6,r7,ip}
+ .cfi_restore 143
+#else
+ pop {r4,r5,r6,r7}
+#endif /* __HAVE_PAC_LEAF*/
+ .cfi_restore 7
+ .cfi_restore 6
+ .cfi_restore 5
+ .cfi_restore 4
+ .cfi_def_cfa_offset 0
+#if __HAVE_PAC_LEAF
+ aut ip, lr, sp
+#endif /* __HAVE_PAC_LEAF*/
+ bx lr
+ .cfi_endproc
+ .fnend
#else
/* Defined in memchr-stub.c. */
#endif
--
2.36.1
^ permalink raw reply [flat|nested] 18+ messages in thread
* [PATCH v2 5/8] newlib: libc: memcpy M-profile PACBTI-enablement
2022-08-03 15:35 [PATCH v2 0/8] Implement assembly cortex-M PACBTI functionality Victor Do Nascimento
` (3 preceding siblings ...)
2022-08-03 15:35 ` [PATCH v2 4/8] newlib: libc: memchr " Victor Do Nascimento
@ 2022-08-03 15:35 ` Victor Do Nascimento
2022-08-03 15:35 ` [PATCH v2 6/8] newlib: libc: setjmp/longjmp " Victor Do Nascimento
` (2 subsequent siblings)
7 siblings, 0 replies; 18+ messages in thread
From: Victor Do Nascimento @ 2022-08-03 15:35 UTC (permalink / raw)
To: newlib; +Cc: richard.earnshaw, Victor Do Nascimento
Add function prologue/epilogue to conditionally add BTI landing pads
and/or PAC code generation & authentication instructions depending on
compilation flags.
---
newlib/libc/machine/arm/memcpy-armv7m.S | 74 ++++++++++++++++++++++---
1 file changed, 66 insertions(+), 8 deletions(-)
diff --git a/newlib/libc/machine/arm/memcpy-armv7m.S b/newlib/libc/machine/arm/memcpy-armv7m.S
index c8bff36f6..d9b9e4984 100644
--- a/newlib/libc/machine/arm/memcpy-armv7m.S
+++ b/newlib/libc/machine/arm/memcpy-armv7m.S
@@ -46,6 +46,8 @@
__OPT_BIG_BLOCK_SIZE: Size of big block in words. Default to 64.
__OPT_MID_BLOCK_SIZE: Size of big block in words. Default to 16.
*/
+#include "pacbti.h"
+
#ifndef __OPT_BIG_BLOCK_SIZE
#define __OPT_BIG_BLOCK_SIZE (4 * 16)
#endif
@@ -85,6 +87,8 @@
.global memcpy
.thumb
.thumb_func
+ .fnstart
+ .cfi_startproc
.type memcpy, %function
memcpy:
@ r0: dst
@@ -93,10 +97,31 @@ memcpy:
#ifdef __ARM_FEATURE_UNALIGNED
/* In case of UNALIGNED access supported, ip is not used in
function body. */
+ pacbti_prologue
mov ip, r0
#else
+#if __HAVE_PAC_LEAF
+#if __ARM_FEATURE_BTI_DEFAULT
+ pacbti ip, lr, sp
+#else
+ pac ip, lr, sp
+#endif /* __ARM_FEATURE_BTI_DEFAULT */
+ .cfi_register 143, 12
+ push {r0, ip}
+ .save {r0, ra_auth_code}
+ .cfi_def_cfa_offset 8
+ .cfi_offset 0, -8
+ .cfi_offset 143, -4
+#else
+#if __ARM_FEATURE_BTI_DEFAULT
+ bti
+#endif /* __ARM_FEATURE_BTI_DEFAULT */
push {r0}
-#endif
+ .save {r0}
+ .cfi_adjust_cfa_offset 4
+ .cfi_offset 0, -4
+#endif /* __HAVE_PAC_LEAF */
+#endif /* __ARM_FEATURE_UNALIGNED */
orr r3, r1, r0
ands r3, r3, #3
bne .Lmisaligned_copy
@@ -135,13 +160,13 @@ memcpy:
ldr r3, [r1], #4
str r3, [r0], #4
END_UNROLL
-#else /* __ARM_ARCH_7M__ */
+#else
ldr r3, [r1, \offset]
str r3, [r0, \offset]
END_UNROLL
adds r0, __OPT_MID_BLOCK_SIZE
adds r1, __OPT_MID_BLOCK_SIZE
-#endif
+#endif /* __ARM_ARCH_7M__ */
subs r2, __OPT_MID_BLOCK_SIZE
bhs .Lmid_block_loop
@@ -180,10 +205,21 @@ memcpy:
.Ldone:
#ifdef __ARM_FEATURE_UNALIGNED
mov r0, ip
+ pacbti_epilogue
+#else
+#if __HAVE_PAC_LEAF
+ pop {r0, ra_auth_code}
+ .cfi_restore 0
+ .cfi_restore 143
+ .cfi_def_cfa_offset 0
+ aut ip, lr, sp
#else
pop {r0}
-#endif
- bx lr
+ .cfi_restore 0
+ .cfi_def_cfa_offset 0
+#endif /* __HAVE_PAC_LEAF */
+ bx lr
+#endif /* __ARM_FEATURE_UNALIGNED */
.align 2
.Lmisaligned_copy:
@@ -247,6 +283,15 @@ memcpy:
/* dst is aligned, but src isn't. Misaligned copy. */
push {r4, r5}
+ .save {r4, r5}
+ .cfi_adjust_cfa_offset 8
+#if __HAVE_PAC_LEAF /* we pushed just the pac code. */
+ .cfi_offset 4, -12
+ .cfi_offset 5, -8
+#else /* we haven't pushed anything to stack. */
+ .cfi_offset 4, -8
+ .cfi_offset 5, -4
+#endif /* __HAVE_PAC_LEAF */
subs r2, #4
/* Backward r1 by misaligned bytes, to make r1 aligned.
@@ -299,6 +344,7 @@ memcpy:
adds r2, #4
subs r1, ip
pop {r4, r5}
+ .cfi_adjust_cfa_offset -8
#endif /* __ARM_FEATURE_UNALIGNED */
@@ -321,9 +367,21 @@ memcpy:
#ifdef __ARM_FEATURE_UNALIGNED
mov r0, ip
+ pacbti_epilogue
+#else
+#if __HAVE_PAC_LEAF
+ pop {r0, ra_auth_code}
+ .cfi_restore 0
+ .cfi_restore 143
+ .cfi_def_cfa_offset 0
+ aut ip, lr, sp
#else
pop {r0}
-#endif
- bx lr
-
+ .cfi_restore 0
+ .cfi_def_cfa_offset 0
+#endif /* __HAVE_PAC_LEAF */
+ bx lr
+#endif /* __ARM_FEATURE_UNALIGNED */
+ .cfi_endproc
+ .fnend
.size memcpy, .-memcpy
--
2.36.1
^ permalink raw reply [flat|nested] 18+ messages in thread
* [PATCH v2 6/8] newlib: libc: setjmp/longjmp M-profile PACBTI-enablement
2022-08-03 15:35 [PATCH v2 0/8] Implement assembly cortex-M PACBTI functionality Victor Do Nascimento
` (4 preceding siblings ...)
2022-08-03 15:35 ` [PATCH v2 5/8] newlib: libc: memcpy " Victor Do Nascimento
@ 2022-08-03 15:35 ` Victor Do Nascimento
2022-08-03 15:35 ` [PATCH v2 7/8] newlib: libc: aeabi_memmove " Victor Do Nascimento
2022-08-03 15:35 ` [PATCH v2 8/8] newlib: libc: aeabi_memset " Victor Do Nascimento
7 siblings, 0 replies; 18+ messages in thread
From: Victor Do Nascimento @ 2022-08-03 15:35 UTC (permalink / raw)
To: newlib; +Cc: richard.earnshaw, Victor Do Nascimento
Add function prologue/epilogue to conditionally add BTI landing pads
and/or PAC code generation & authentication instructions depending on
compilation flags.
---
newlib/libc/machine/arm/setjmp.S | 33 ++++++++++++++++++++++++++++++--
1 file changed, 31 insertions(+), 2 deletions(-)
diff --git a/newlib/libc/machine/arm/setjmp.S b/newlib/libc/machine/arm/setjmp.S
index 21d6ff9e7..d60dd57df 100644
--- a/newlib/libc/machine/arm/setjmp.S
+++ b/newlib/libc/machine/arm/setjmp.S
@@ -157,11 +157,15 @@ SYM (.arm_start_of.\name):
.globl SYM (\name)
TYPE (\name)
SYM (\name):
+ .fnstart
+ .cfi_startproc
PROLOGUE \name
.endm
.macro FUNC_END name
RET
+ .cfi_endproc
+ .fnend
SIZE (\name)
.endm
@@ -173,11 +177,26 @@ SYM (\name):
/* Save all the callee-preserved registers into the jump buffer. */
#ifdef __thumb2__
+#if __ARM_FEATURE_PAC_DEFAULT
+#if __ARM_FEATURE_BTI_DEFAULT
+ pacbti ip, lr, sp
+#else
+ pac ip, lr, sp
+#endif /* __ARM_FEATURE_BTI_DEFAULT */
+ .cfi_register 143, 12
+ mov a4, ip
+ mov ip, sp
+ stmea a1!, { a4, v1-v7, fp, ip, lr }
+#else
+#if __ARM_FEATURE_BTI_DEFAULT
+ bti
+#endif /* __ARM_FEATURE_BTI_DEFAULT */
mov ip, sp
stmea a1!, { v1-v7, fp, ip, lr }
+#endif /* __ARM_FEATURE_PAC_DEFAULT */
#else
stmea a1!, { v1-v7, fp, ip, sp, lr }
-#endif
+#endif /* __thumb2__ */
#if 0 /* Simulator does not cope with FP instructions yet. */
#ifndef __SOFTFP__
@@ -200,11 +219,17 @@ SYM (\name):
/* Restore the registers, retrieving the state when setjmp() was called. */
#ifdef __thumb2__
+#if __ARM_FEATURE_PAC_DEFAULT
+ ldmfd a1!, { a4, v1-v7, fp, ip, lr }
+ mov sp, ip
+ mov ip, a4
+#else
ldmfd a1!, { v1-v7, fp, ip, lr }
mov sp, ip
+#endif /* __ARM_FEATURE_PAC_DEFAULT */
#else
ldmfd a1!, { v1-v7, fp, ip, sp, lr }
-#endif
+#endif /* __thumb2__ */
#if 0 /* Simulator does not cope with FP instructions yet. */
#ifndef __SOFTFP__
@@ -220,5 +245,9 @@ SYM (\name):
#endif
moveq a1, #1
+#if __ARM_FEATURE_PAC_DEFAULT
+ aut ip, lr, sp
+#endif
+
FUNC_END longjmp
#endif
--
2.36.1
^ permalink raw reply [flat|nested] 18+ messages in thread
* [PATCH v2 7/8] newlib: libc: aeabi_memmove M-profile PACBTI-enablement
2022-08-03 15:35 [PATCH v2 0/8] Implement assembly cortex-M PACBTI functionality Victor Do Nascimento
` (5 preceding siblings ...)
2022-08-03 15:35 ` [PATCH v2 6/8] newlib: libc: setjmp/longjmp " Victor Do Nascimento
@ 2022-08-03 15:35 ` Victor Do Nascimento
2022-08-03 15:35 ` [PATCH v2 8/8] newlib: libc: aeabi_memset " Victor Do Nascimento
7 siblings, 0 replies; 18+ messages in thread
From: Victor Do Nascimento @ 2022-08-03 15:35 UTC (permalink / raw)
To: newlib; +Cc: richard.earnshaw, Victor Do Nascimento
Add function prologue/epilogue to conditionally add BTI landing pads
and/or PAC code generation & authentication instructions depending on
compilation flags.
---
.../libc/machine/arm/aeabi_memmove-thumb2.S | 54 +++++++++++++++++--
1 file changed, 49 insertions(+), 5 deletions(-)
diff --git a/newlib/libc/machine/arm/aeabi_memmove-thumb2.S b/newlib/libc/machine/arm/aeabi_memmove-thumb2.S
index e9504437b..34054d96c 100644
--- a/newlib/libc/machine/arm/aeabi_memmove-thumb2.S
+++ b/newlib/libc/machine/arm/aeabi_memmove-thumb2.S
@@ -26,6 +26,8 @@
* SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
*/
+ #include "pacbti.h"
+
.thumb
.syntax unified
.global __aeabi_memmove
@@ -33,8 +35,28 @@
ASM_ALIAS __aeabi_memmove4 __aeabi_memmove
ASM_ALIAS __aeabi_memmove8 __aeabi_memmove
__aeabi_memmove:
- cmp r0, r1
+ .fnstart
+ .cfi_startproc
+#if __HAVE_PAC_LEAF
+#if __ARM_FEATURE_BTI_DEFAULT
+ pacbti ip, lr, sp
+#else
+ pac ip, lr, sp
+#endif /* __ARM_FEATURE_BTI_DEFAULT */
+ .cfi_register 143, 12
+ push {r4, ip}
+ .save {r4, ra_auth_code}
+ .cfi_def_cfa_offset 8
+ .cfi_offset 4, -8
+ .cfi_offset 143, -4
+#elif __ARM_FEATURE_BTI_DEFAULT
+ bti
push {r4}
+ .save {r4}
+ .cfi_def_cfa_offset 4
+ .cfi_offset 4, -4
+#endif /* __HAVE_PAC_LEAF */
+ cmp r0, r1
bls 3f
adds r3, r1, r2
cmp r0, r3
@@ -48,8 +70,18 @@ __aeabi_memmove:
strb r4, [r1, #-1]!
bne 1b
2:
- pop {r4}
- bx lr
+#if __HAVE_PAC_LEAF
+ pop {r4, ip}
+ .cfi_restore 4
+ .cfi_restore 143
+ .cfi_def_cfa_offset 0
+ aut ip, lr, sp
+#else
+ pop {r4}
+ .cfi_restore 4
+ .cfi_def_cfa_offset 0
+#endif /* __HAVE_PAC_LEAF */
+ bx lr
3:
cmp r2, #0
beq 2b
@@ -60,6 +92,18 @@ __aeabi_memmove:
cmp r2, r1
strb r4, [r3, #1]!
bne 4b
- pop {r4}
- bx lr
+#if __HAVE_PAC_LEAF
+ pop {r4, ip}
+ .cfi_restore 4
+ .cfi_restore 143
+ .cfi_def_cfa_offset 0
+ aut ip, lr, sp
+#else
+ pop {r4}
+ .cfi_restore 4
+ .cfi_def_cfa_offset 0
+#endif /* __HAVE_PAC_LEAF */
+ bx lr
+ .cfi_endproc
+ .fnend
.size __aeabi_memmove, . - __aeabi_memmove
--
2.36.1
^ permalink raw reply [flat|nested] 18+ messages in thread
* [PATCH v2 8/8] newlib: libc: aeabi_memset M-profile PACBTI-enablement
2022-08-03 15:35 [PATCH v2 0/8] Implement assembly cortex-M PACBTI functionality Victor Do Nascimento
` (6 preceding siblings ...)
2022-08-03 15:35 ` [PATCH v2 7/8] newlib: libc: aeabi_memmove " Victor Do Nascimento
@ 2022-08-03 15:35 ` Victor Do Nascimento
7 siblings, 0 replies; 18+ messages in thread
From: Victor Do Nascimento @ 2022-08-03 15:35 UTC (permalink / raw)
To: newlib; +Cc: richard.earnshaw, Victor Do Nascimento
Add function prologue/epilogue to conditionally add BTI landing pads
and/or PAC code generation & authentication instructions depending on
compilation flags.
---
newlib/libc/machine/arm/aeabi_memset-thumb2.S | 47 ++++++++++++++++++-
1 file changed, 46 insertions(+), 1 deletion(-)
diff --git a/newlib/libc/machine/arm/aeabi_memset-thumb2.S b/newlib/libc/machine/arm/aeabi_memset-thumb2.S
index eaca1d8d7..8bd4b5eaa 100644
--- a/newlib/libc/machine/arm/aeabi_memset-thumb2.S
+++ b/newlib/libc/machine/arm/aeabi_memset-thumb2.S
@@ -26,14 +26,43 @@
* SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
*/
+#include "pacbti.h"
+
.thumb
.syntax unified
.global __aeabi_memset
.type __aeabi_memset, %function
+ .fnstart
+ .cfi_startproc
ASM_ALIAS __aeabi_memset4 __aeabi_memset
ASM_ALIAS __aeabi_memset8 __aeabi_memset
__aeabi_memset:
+#if __HAVE_PAC_LEAF
+#if __ARM_FEATURE_BTI_DEFAULT
+ pacbti ip, lr, sp
+#else
+ pac ip, lr, sp
+#endif /* __ARM_FEATURE_BTI_DEFAULT */
+ .cfi_register 143, 12
+ push {r4, r5, r6, ip}
+ .save {r4, r5, r6, ra_auth_code}
+ .cfi_def_cfa_offset 16
+ .cfi_offset 4, -16
+ .cfi_offset 5, -12
+ .cfi_offset 6, -8
+ .cfi_offset 143, -4
+#else
+#if __ARM_FEATURE_BTI_DEFAULT
+ bti
+#endif /* __ARM_FEATURE_BTI_DEFAULT */
push {r4, r5, r6}
+ .save {r4, r5, r6}
+ .cfi_def_cfa_offset 12
+ .cfi_offset 4, -12
+ .cfi_offset 5, -8
+ .cfi_offset 6, -4
+#endif /* __HAVE_PAC_LEAF */
+
lsls r4, r0, #30
beq 10f
subs r4, r1, #1
@@ -98,10 +127,26 @@ __aeabi_memset:
cmp r3, r4
bne 8b
9:
+#if __HAVE_PAC_LEAF
+ pop {r4, r5, r6, ip}
+ .cfi_restore 143
+ .cfi_restore 6
+ .cfi_restore 5
+ .cfi_restore 4
+ .cfi_def_cfa_offset 0
+ aut ip, lr, sp
+#else
pop {r4, r5, r6}
- bx lr
+ .cfi_restore 6
+ .cfi_restore 5
+ .cfi_restore 4
+ .cfi_def_cfa_offset 0
+#endif /* __HAVE_PAC_LEAF */
+ bx lr
10:
mov r4, r1
mov r3, r0
b 3b
+ .cfi_endproc
+ .fnend
.size __aeabi_memset, . - __aeabi_memset
--
2.36.1
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH v2 1/8] newlib: libc: define M-profile PACBTI-enablement macros
2022-08-03 15:35 ` [PATCH v2 1/8] newlib: libc: define M-profile PACBTI-enablement macros Victor Do Nascimento
@ 2022-08-04 15:19 ` Richard Earnshaw
2022-08-05 9:21 ` Victor L. Do Nascimento
2022-08-16 16:11 ` Victor L. Do Nascimento
0 siblings, 2 replies; 18+ messages in thread
From: Richard Earnshaw @ 2022-08-04 15:19 UTC (permalink / raw)
To: Victor Do Nascimento, newlib; +Cc: richard.earnshaw
On 03/08/2022 16:35, Victor Do Nascimento wrote:
> Create an assembly header file that conditionally defines fuction
> prologues/epilogues depending on the compile-time mbranch-protection
> argument values.
>
> * newlib/libc/machine/arm/pacbti.h: New.
> ---
> newlib/libc/machine/arm/pacbti.h | 64 ++++++++++++++++++++++++++++++++
> 1 file changed, 64 insertions(+)
> create mode 100644 newlib/libc/machine/arm/pacbti.h
>
> diff --git a/newlib/libc/machine/arm/pacbti.h b/newlib/libc/machine/arm/pacbti.h
> new file mode 100644
> index 000000000..4c31d00bc
> --- /dev/null
> +++ b/newlib/libc/machine/arm/pacbti.h
> @@ -0,0 +1,64 @@
> +/*
> + * Copyright (c) 2022 Arm Ltd
> + * All rights reserved.
> + *
> + * Redistribution and use in source and binary forms, with or without
> + * modification, are permitted provided that the following conditions
> + * are met:
> + * 1. Redistributions of source code must retain the above copyright
> + * notice, this list of conditions and the following disclaimer.
> + * 2. Redistributions in binary form must reproduce the above copyright
> + * notice, this list of conditions and the following disclaimer in the
> + * documentation and/or other materials provided with the distribution.
> + * 3. The name of the company may not be used to endorse or promote
> + * products derived from this software without specific prior written
> + * permission.
> + *
> + * THIS SOFTWARE IS PROVIDED BY ARM LTD ``AS IS'' AND ANY EXPRESS OR IMPLIED
> + * WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF
> + * MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED.
> + * IN NO EVENT SHALL ARM LTD BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
> + * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED
> + * TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR
> + * PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF
> + * LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING
> + * NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
> + * SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
> + */
> +
> +/* Checki whether leaf function PAC signing has been requested
> + in the -mbranch-protect compile-time option. */
> +#define LEAF_PROTECT_BIT 2
> +#define __HAVE_PAC_LEAF \
> + __ARM_FEATURE_PAC_DEFAULT & (1 << LEAF_PROTECT_BIT)
Either this header needs to avoid polluting the user namespace, or it
doesn't. If it does, then LEAF_PROTECT_BIT fails to do that. If it
doesn't then __HAVE_PAC_LEAF should really be HAVE_PAC_LEAF (to avoid
polluting the reserved namespace. I suspect this header is private to
newlib (won't be exported to users), so should not be prefixing names
with __.
> +
> +/* Macro to handle function entry depending on branch-protection
> + schemes. */
> + .macro pacbti_prologue
> +#if __HAVE_PAC_LEAF
> +#if __ARM_FEATURE_BTI_DEFAULT
> + pacbti ip, lr, sp
> +#else
> + pac ip, lr, sp
> +#endif /* __ARM_FEATURE_BTI_DEFAULT */
> + .cfi_register 143, 12
> + str ip, [sp, #-4]!
This causes the stack to lose 8-byte alignment. Whilst for leaf
functions that's probably not a problem, the macro should have an option
where the user can ask for alignment to be preserved. But there should
also be an option to not save IP on the stack at all (for when the user
does not modify IP in the function body).
> + .save {ra_auth_code}
> + .cfi_def_cfa_offset 4
> + .cfi_offset 143, -4
> +#elif __ARM_FEATURE_BTI_DEFAULT
> + bti
> +#endif /* __HAVE_PAC_LEAF */
> + .endm
> +
> +/* Macro to handle different branch exchange cases depending on
> + branch-protection schemes. */
> + .macro pacbti_epilogue
> +#if __HAVE_PAC_LEAF
> + ldr ip, [sp], #4
> + .cfi_restore 143
> + .cfi_def_cfa_offset 0
> + aut ip, lr, sp
> +#endif /* __HAVE_PAC_LEAF */
> + bx lr
> + .endm
I think these macros are really misnamed, firstly, they're only for leaf
functions and secondly, they (particularly the epilogue) does something
even if PAC is not needed. So I think they should be renamed as
'leaf_prologue' and 'leaf_epilogue' respectively.
In consequence, I think this file should really be merged into the
existing arm_asm.h, then we don't need yet another header file.
Finally, the header needs to define a (C) macro that defines how much to
adjust the CFI offset by for various scenarios:
- When PAC is not used
- When PAC is used with no alignment
- When PAC is used and asked for 8-byte alignment to be preserved.
R.
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH v2 2/8] newlib: libc: strcmp M-profile PACBTI-enablement
2022-08-03 15:35 ` [PATCH v2 2/8] newlib: libc: strcmp M-profile PACBTI-enablement Victor Do Nascimento
@ 2022-08-04 15:48 ` Richard Earnshaw
2022-08-05 10:51 ` Victor L. Do Nascimento
0 siblings, 1 reply; 18+ messages in thread
From: Richard Earnshaw @ 2022-08-04 15:48 UTC (permalink / raw)
To: Victor Do Nascimento, newlib; +Cc: richard.earnshaw
On 03/08/2022 16:35, Victor Do Nascimento wrote:
> Add function prologue/epilogue to conditionally add BTI landing pads
> and/or PAC code generation & authentication instructions depending on
> compilation flags.
>
> This patch enables PACBTI for all relevant variants of strcmp:
> * Newlib for armv8.1-m.main+pacbti
> * Newlib for armv8.1-m.main+pacbti+mve
> * Newlib-nano
> ---
> newlib/libc/machine/arm/strcmp-arm-tiny.S | 7 +++-
> newlib/libc/machine/arm/strcmp-armv7.S | 45 +++++++++++++++++------
> newlib/libc/machine/arm/strcmp-armv7m.S | 36 +++++++++++++++---
> 3 files changed, 71 insertions(+), 17 deletions(-)
>
> diff --git a/newlib/libc/machine/arm/strcmp-arm-tiny.S b/newlib/libc/machine/arm/strcmp-arm-tiny.S
> index 607a41daf..8085eb4df 100644
> --- a/newlib/libc/machine/arm/strcmp-arm-tiny.S
> +++ b/newlib/libc/machine/arm/strcmp-arm-tiny.S
> @@ -29,10 +29,14 @@
> /* Tiny version of strcmp in ARM state. Used only when optimizing
> for size. Also supports Thumb-2. */
>
> +#include "pacbti.h"
> +
> .syntax unified
> def_fn strcmp
> + .fnstart
> .cfi_sections .debug_frame
> .cfi_startproc
> + pacbti_prologue
See comment on patch1 about the names of these macros.
> 1:
> ldrb r2, [r0], #1
> ldrb r3, [r1], #1
> @@ -42,6 +46,7 @@ def_fn strcmp
> beq 1b
> 2:
> subs r0, r2, r3
> - bx lr
> + pacbti_epilogue
> .cfi_endproc
> + .fnend
> .size strcmp, . - strcmp
> diff --git a/newlib/libc/machine/arm/strcmp-armv7.S b/newlib/libc/machine/arm/strcmp-armv7.S
> index 2f93bfb73..584c89f7e 100644
> --- a/newlib/libc/machine/arm/strcmp-armv7.S
> +++ b/newlib/libc/machine/arm/strcmp-armv7.S
> @@ -45,6 +45,8 @@
> .thumb
> .syntax unified
>
> +#include "pacbti.h"
> +
> /* Parameters and result. */
> #define src1 r0
> #define src2 r1
> @@ -91,8 +93,9 @@
> ldrd r4, r5, [sp], #16
> .cfi_restore 4
> .cfi_restore 5
> + .cfi_adjust_cfa_offset -16
This looks like it's an omission from the original file and thus should
be separated out as a separate patch.
> sub result, result, r1, lsr #24
> - bx lr
> + pacbti_epilogue
> #else
> /* To use the big-endian trick we'd have to reverse all three words.
> that's slower than this approach. */
> @@ -112,22 +115,28 @@
> ldrd r4, r5, [sp], #16
> .cfi_restore 4
> .cfi_restore 5
> + .cfi_adjust_cfa_offset -16
> sub result, result, r1
>
> - bx lr
> + pacbti_epilogue
> #endif
> .endm
>
> +
> .text
> .p2align 5
> + .fnstart
> + .cfi_sections .debug_frame
> + .cfi_startproc
> .Lstrcmp_start_addr:
> #ifndef STRCMP_NO_PRECHECK
> .Lfastpath_exit:
> sub r0, r2, r3
> - bx lr
> + pacbti_epilogue
> nop
> #endif
> def_fn strcmp
> + pacbti_prologue
> #ifndef STRCMP_NO_PRECHECK
> ldrb r2, [src1]
> ldrb r3, [src2]
> @@ -136,16 +145,26 @@ def_fn strcmp
> cmpcs r2, r3
> bne .Lfastpath_exit
> #endif
> - .cfi_sections .debug_frame
> - .cfi_startproc
> strd r4, r5, [sp, #-16]!
> - .cfi_def_cfa_offset 16
> + .save {r4, r5}
Hmm, I've just remembered that leaf functions cannot throw exceptions
(the EHABI only supports synchronous exceptions), so I don't think we
need .save directives at all. Instead, such functions should be marked
with ".cantunwind". You still want the dwarf unwind information as
that's for the debugger, it's just the '.save' directives that aren't
needed.
> + .cfi_adjust_cfa_offset 16
> +#if __HAVE_PAC_LEAF
> + .cfi_offset 4, -20
> + .cfi_offset 5, -16
> +#else
> .cfi_offset 4, -16
> .cfi_offset 5, -12
> +#endif /* __HAVE_PAC_LEAF */
Ugh! This is not the right way to do this. Instead (see comments on
patch 1) use a macro that defines the additional adjustment to use in
various contexts, something like
.cfi_offset 4, -(16+PAC_UNALIGNED_CFI_ADJ)
> orr tmp1, src1, src2
> strd r6, r7, [sp, #8]
> + .save {r6, r7}
> +#if __HAVE_PAC_LEAF
> + .cfi_offset 6, -12
> + .cfi_offset 7, -8
> +#else
> .cfi_offset 6, -8
> .cfi_offset 7, -4
> +#endif /* __HAVE_PAC_LEAF */
> mvn const_m1, #0
> lsl r2, tmp1, #29
> cbz r2, .Lloop_aligned8
> @@ -270,7 +289,6 @@ def_fn strcmp
> ldr data1, [src1], #4
> beq .Laligned_m2
> bcs .Laligned_m1
> -
> #ifdef STRCMP_NO_PRECHECK
> ldrb data2, [src2, #1]
> uxtb tmp1, data1, ror #BYTE1_OFFSET
> @@ -314,7 +332,8 @@ def_fn strcmp
> mov result, tmp1
> ldr r4, [sp], #16
> .cfi_restore 4
> - bx lr
> + .cfi_adjust_cfa_offset -16
> + pacbti_epilogue
>
> #ifndef STRCMP_NO_PRECHECK
> .Laligned_m1:
> @@ -364,8 +383,9 @@ def_fn strcmp
> /* R6/7 Not used in this sequence. */
> .cfi_restore 6
> .cfi_restore 7
> + .cfi_adjust_cfa_offset -16
> neg result, result
> - bx lr
> + pacbti_epilogue
>
> 6:
> .cfi_restore_state
> @@ -441,7 +461,8 @@ def_fn strcmp
> /* R6/7 not used in this sequence. */
> .cfi_restore 6
> .cfi_restore 7
> - bx lr
> + .cfi_adjust_cfa_offset -16
> + pacbti_epilogue
>
> .Lstrcmp_tail:
> .cfi_restore_state
> @@ -463,7 +484,9 @@ def_fn strcmp
> /* R6/7 not used in this sequence. */
> .cfi_restore 6
> .cfi_restore 7
> + .cfi_adjust_cfa_offset -16
> sub result, result, data2, lsr #24
> - bx lr
> + pacbti_epilogue
> .cfi_endproc
> + .fnend
> .size strcmp, . - .Lstrcmp_start_addr
> diff --git a/newlib/libc/machine/arm/strcmp-armv7m.S b/newlib/libc/machine/arm/strcmp-armv7m.S
> index cdb4912df..fe1519f4d 100644
> --- a/newlib/libc/machine/arm/strcmp-armv7m.S
> +++ b/newlib/libc/machine/arm/strcmp-armv7m.S
> @@ -29,6 +29,8 @@
> /* Very similar to the generic code, but uses Thumb2 as implemented
> in ARMv7-M. */
>
> +#include "pacbti.h"
> +
> /* Parameters and result. */
> #define src1 r0
> #define src2 r1
> @@ -44,8 +46,10 @@
> .thumb
> .syntax unified
> def_fn strcmp
> + .fnstart
> .cfi_sections .debug_frame
> .cfi_startproc
> + pacbti_prologue
> eor tmp1, src1, src2
> tst tmp1, #3
> /* Strings not at same byte offset from a word boundary. */
> @@ -106,7 +110,7 @@ def_fn strcmp
> lsrs result, result, #24
> subs result, result, data2
> #endif
> - bx lr
> + pacbti_epilogue
>
>
> #if 0
> @@ -214,12 +218,18 @@ def_fn strcmp
> cmpcs data1, data2
> beq .Lstrcmp_unaligned
> sub result, data1, data2
> - bx lr
> + pacbti_epilogue
>
> 2:
> stmfd sp!, {r5}
> - .cfi_def_cfa_offset 4
> + .save {r5}
> + .cfi_adjust_cfa_offset 4
> +#if __HAVE_PAC_LEAF
> + .cfi_offset 5, -8 /* Account for ip register already on stack. */
> +#else
> .cfi_offset 5, -4
> +#endif /* __HAVE_PAC_LEAF */
> +
>
> ldr data1, [src1], #4
> and tmp2, src2, #3
> @@ -353,10 +363,17 @@ def_fn strcmp
> .Lstrcmp_done_equal:
> mov result, #0
> .cfi_remember_state
> +#if __HAVE_PAC_LEAF
> + pop {r5, ip}
> + .cfi_restore 5
> + .cfi_restore 143
> + aut ip, lr, sp
> +#else
> ldmfd sp!, {r5}
> .cfi_restore 5
> +#endif /* __HAVE_PAC_LEAF */
I think we could define a macro for this as well, something like
pop_reg_and_pac r5
While you're doing that, please change the single-register ldmfd into 'pop'.
> .cfi_def_cfa_offset 0
> - bx lr
> + bx lr
stray change to the BX instruction.
>
> .Lstrcmp_tail:
> .cfi_restore_state
> @@ -370,9 +387,18 @@ def_fn strcmp
> S2LOEQ data2, data2, #8
> beq .Lstrcmp_tail
> sub result, r2, result
> +#if __HAVE_PAC_LEAF
> + pop {r5, ip}
> + .cfi_restore 5
> + .cfi_restore 143
> + .cfi_def_cfa_offset 0
> + aut ip, lr, sp
> +#else
> ldmfd sp!, {r5}
> .cfi_restore 5
> .cfi_def_cfa_offset 0
> - bx lr
> +#endif /* __HAVE_PAC_LEAF */
Same as above
> + bx lr
tab between mnemonic and operands.
> .cfi_endproc
> + .fnend
> .size strcmp, . - strcmp
R.
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH v2 1/8] newlib: libc: define M-profile PACBTI-enablement macros
2022-08-04 15:19 ` Richard Earnshaw
@ 2022-08-05 9:21 ` Victor L. Do Nascimento
2022-08-16 16:11 ` Victor L. Do Nascimento
1 sibling, 0 replies; 18+ messages in thread
From: Victor L. Do Nascimento @ 2022-08-05 9:21 UTC (permalink / raw)
To: Richard Earnshaw; +Cc: newlib, richard.earnshaw
Richard Earnshaw <Richard.Earnshaw@foss.arm.com> writes:
> On 03/08/2022 16:35, Victor Do Nascimento wrote:
>> Create an assembly header file that conditionally defines fuction
>> prologues/epilogues depending on the compile-time mbranch-protection
>> argument values.
>> * newlib/libc/machine/arm/pacbti.h: New.
>> ---
>> newlib/libc/machine/arm/pacbti.h | 64 ++++++++++++++++++++++++++++++++
>> 1 file changed, 64 insertions(+)
>> create mode 100644 newlib/libc/machine/arm/pacbti.h
>> diff --git a/newlib/libc/machine/arm/pacbti.h
>> b/newlib/libc/machine/arm/pacbti.h
>> new file mode 100644
>> index 000000000..4c31d00bc
>> --- /dev/null
>> +++ b/newlib/libc/machine/arm/pacbti.h
>> @@ -0,0 +1,64 @@
>> +/*
>> + * Copyright (c) 2022 Arm Ltd
>> + * All rights reserved.
>> + *
>> + * Redistribution and use in source and binary forms, with or without
>> + * modification, are permitted provided that the following conditions
>> + * are met:
>> + * 1. Redistributions of source code must retain the above copyright
>> + * notice, this list of conditions and the following disclaimer.
>> + * 2. Redistributions in binary form must reproduce the above copyright
>> + * notice, this list of conditions and the following disclaimer in the
>> + * documentation and/or other materials provided with the distribution.
>> + * 3. The name of the company may not be used to endorse or promote
>> + * products derived from this software without specific prior written
>> + * permission.
>> + *
>> + * THIS SOFTWARE IS PROVIDED BY ARM LTD ``AS IS'' AND ANY EXPRESS OR IMPLIED
>> + * WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF
>> + * MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED.
>> + * IN NO EVENT SHALL ARM LTD BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
>> + * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED
>> + * TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR
>> + * PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF
>> + * LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING
>> + * NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
>> + * SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
>> + */
>> +
>> +/* Checki whether leaf function PAC signing has been requested
>> + in the -mbranch-protect compile-time option. */
>> +#define LEAF_PROTECT_BIT 2
>> +#define __HAVE_PAC_LEAF \
>> + __ARM_FEATURE_PAC_DEFAULT & (1 << LEAF_PROTECT_BIT)
>
> Either this header needs to avoid polluting the user namespace, or it
> doesn't. If it does, then LEAF_PROTECT_BIT fails to do that. If it
> doesn't then __HAVE_PAC_LEAF should really be HAVE_PAC_LEAF (to avoid
> polluting the reserved namespace. I suspect this header is private to
> newlib (won't be exported to users), so should not be prefixing names
> with __.
>
>> +
>> +/* Macro to handle function entry depending on branch-protection
>> + schemes. */
>> + .macro pacbti_prologue
>> +#if __HAVE_PAC_LEAF
>> +#if __ARM_FEATURE_BTI_DEFAULT
>> + pacbti ip, lr, sp
>> +#else
>> + pac ip, lr, sp
>> +#endif /* __ARM_FEATURE_BTI_DEFAULT */
>> + .cfi_register 143, 12
>> + str ip, [sp, #-4]!
>
> This causes the stack to lose 8-byte alignment. Whilst for leaf
> functions that's probably not a problem, the macro should have an
> option where the user can ask for alignment to be preserved. But
> there should also be an option to not save IP on the stack at all (for
> when the user does not modify IP in the function body).
Thanks for the feedback here. So we introduce two options the user is
given control over. I guess this raises two questions concerning design
choices to be made.
1. How do we expose these options to the user, and
2. What should we define as default values?
Regarding defaults, in order to maximize performance my initial guess
would be that we don't push IP to the stack, such that we save on
instructions by omitting pushes and pops and automatically preserve
stack alignment that way.
Even so, that leaves the question of how to best allow users to modify
these parameters prior to compiling the library.
Any thoughts??
>> + .save {ra_auth_code}
>> + .cfi_def_cfa_offset 4
>> + .cfi_offset 143, -4
>> +#elif __ARM_FEATURE_BTI_DEFAULT
>> + bti
>> +#endif /* __HAVE_PAC_LEAF */
>> + .endm
>> +
>> +/* Macro to handle different branch exchange cases depending on
>> + branch-protection schemes. */
>> + .macro pacbti_epilogue
>> +#if __HAVE_PAC_LEAF
>> + ldr ip, [sp], #4
>> + .cfi_restore 143
>> + .cfi_def_cfa_offset 0
>> + aut ip, lr, sp
>> +#endif /* __HAVE_PAC_LEAF */
>> + bx lr
>> + .endm
>
> I think these macros are really misnamed, firstly, they're only for
> leaf functions and secondly, they (particularly the epilogue) does
> something even if PAC is not needed. So I think they should be
> renamed as 'leaf_prologue' and 'leaf_epilogue' respectively.
Agreed.
> In consequence, I think this file should really be merged into the
> existing arm_asm.h, then we don't need yet another header file.
>
> Finally, the header needs to define a (C) macro that defines how much
> to adjust the CFI offset by for various scenarios:
> - When PAC is not used
> - When PAC is used with no alignment
> - When PAC is used and asked for 8-byte alignment to be preserved.
I'm thinking we set the value of PAC_CFI_ADJ contigent on values of
PAC_LEAF_PUSH_SP and PAC_PRESERVE_STACK_ALIGN when HAVE_PAC_LEAF is
set (once I define these new macros, of course).
V.
> R.
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH v2 2/8] newlib: libc: strcmp M-profile PACBTI-enablement
2022-08-04 15:48 ` Richard Earnshaw
@ 2022-08-05 10:51 ` Victor L. Do Nascimento
2022-08-05 14:34 ` Richard Earnshaw
0 siblings, 1 reply; 18+ messages in thread
From: Victor L. Do Nascimento @ 2022-08-05 10:51 UTC (permalink / raw)
To: Richard Earnshaw; +Cc: newlib, richard.earnshaw
Richard Earnshaw <Richard.Earnshaw@foss.arm.com> writes:
> On 03/08/2022 16:35, Victor Do Nascimento wrote:
>> Add function prologue/epilogue to conditionally add BTI landing pads
>> and/or PAC code generation & authentication instructions depending on
>> compilation flags.
>> This patch enables PACBTI for all relevant variants of strcmp:
>> * Newlib for armv8.1-m.main+pacbti
>> * Newlib for armv8.1-m.main+pacbti+mve
>> * Newlib-nano
>> ---
>> newlib/libc/machine/arm/strcmp-arm-tiny.S | 7 +++-
>> newlib/libc/machine/arm/strcmp-armv7.S | 45 +++++++++++++++++------
>> newlib/libc/machine/arm/strcmp-armv7m.S | 36 +++++++++++++++---
>> 3 files changed, 71 insertions(+), 17 deletions(-)
>> diff --git a/newlib/libc/machine/arm/strcmp-arm-tiny.S
>> b/newlib/libc/machine/arm/strcmp-arm-tiny.S
>> index 607a41daf..8085eb4df 100644
>> --- a/newlib/libc/machine/arm/strcmp-arm-tiny.S
>> +++ b/newlib/libc/machine/arm/strcmp-arm-tiny.S
>> @@ -29,10 +29,14 @@
>> /* Tiny version of strcmp in ARM state. Used only when optimizing
>> for size. Also supports Thumb-2. */
>> +#include "pacbti.h"
>> +
>> .syntax unified
>> def_fn strcmp
>> + .fnstart
>> .cfi_sections .debug_frame
>> .cfi_startproc
>> + pacbti_prologue
>
> See comment on patch1 about the names of these macros.
>
>> 1:
>> ldrb r2, [r0], #1
>> ldrb r3, [r1], #1
>> @@ -42,6 +46,7 @@ def_fn strcmp
>> beq 1b
>> 2:
>> subs r0, r2, r3
>> - bx lr
>> + pacbti_epilogue
>> .cfi_endproc
>> + .fnend
>> .size strcmp, . - strcmp
>> diff --git a/newlib/libc/machine/arm/strcmp-armv7.S b/newlib/libc/machine/arm/strcmp-armv7.S
>> index 2f93bfb73..584c89f7e 100644
>> --- a/newlib/libc/machine/arm/strcmp-armv7.S
>> +++ b/newlib/libc/machine/arm/strcmp-armv7.S
>> @@ -45,6 +45,8 @@
>> .thumb
>> .syntax unified
>> +#include "pacbti.h"
>> +
>> /* Parameters and result. */
>> #define src1 r0
>> #define src2 r1
>> @@ -91,8 +93,9 @@
>> ldrd r4, r5, [sp], #16
>> .cfi_restore 4
>> .cfi_restore 5
>> + .cfi_adjust_cfa_offset -16
>
> This looks like it's an omission from the original file and thus
> should be separated out as a separate patch.
>
>> sub result, result, r1, lsr #24
>> - bx lr
>> + pacbti_epilogue
>> #else
>> /* To use the big-endian trick we'd have to reverse all three words.
>> that's slower than this approach. */
>> @@ -112,22 +115,28 @@
>> ldrd r4, r5, [sp], #16
>> .cfi_restore 4
>> .cfi_restore 5
>> + .cfi_adjust_cfa_offset -16
>> sub result, result, r1
>> - bx lr
>> + pacbti_epilogue
>> #endif
>> .endm
>> +
>> .text
>> .p2align 5
>> + .fnstart
>> + .cfi_sections .debug_frame
>> + .cfi_startproc
>> .Lstrcmp_start_addr:
>> #ifndef STRCMP_NO_PRECHECK
>> .Lfastpath_exit:
>> sub r0, r2, r3
>> - bx lr
>> + pacbti_epilogue
>> nop
>> #endif
>> def_fn strcmp
>> + pacbti_prologue
>> #ifndef STRCMP_NO_PRECHECK
>> ldrb r2, [src1]
>> ldrb r3, [src2]
>> @@ -136,16 +145,26 @@ def_fn strcmp
>> cmpcs r2, r3
>> bne .Lfastpath_exit
>> #endif
>> - .cfi_sections .debug_frame
>> - .cfi_startproc
>> strd r4, r5, [sp, #-16]!
>> - .cfi_def_cfa_offset 16
>> + .save {r4, r5}
>
> Hmm, I've just remembered that leaf functions cannot throw exceptions
> (the EHABI only supports synchronous exceptions), so I don't think we
> need .save directives at all. Instead, such functions should be
> marked with ".cantunwind". You still want the dwarf unwind
> information as that's for the debugger, it's just the '.save'
> directives that aren't needed.
>
>> + .cfi_adjust_cfa_offset 16
>> +#if __HAVE_PAC_LEAF
>> + .cfi_offset 4, -20
>> + .cfi_offset 5, -16
>> +#else
>> .cfi_offset 4, -16
>> .cfi_offset 5, -12
>> +#endif /* __HAVE_PAC_LEAF */
>
> Ugh! This is not the right way to do this. Instead (see comments on
> patch 1) use a macro that defines the additional adjustment to use in
> various contexts, something like
>
> .cfi_offset 4, -(16+PAC_UNALIGNED_CFI_ADJ)
>
>> orr tmp1, src1, src2
>> strd r6, r7, [sp, #8]
>> + .save {r6, r7}
>> +#if __HAVE_PAC_LEAF
>> + .cfi_offset 6, -12
>> + .cfi_offset 7, -8
>> +#else
>> .cfi_offset 6, -8
>> .cfi_offset 7, -4
>> +#endif /* __HAVE_PAC_LEAF */
>> mvn const_m1, #0
>> lsl r2, tmp1, #29
>> cbz r2, .Lloop_aligned8
>> @@ -270,7 +289,6 @@ def_fn strcmp
>> ldr data1, [src1], #4
>> beq .Laligned_m2
>> bcs .Laligned_m1
>> -
>> #ifdef STRCMP_NO_PRECHECK
>> ldrb data2, [src2, #1]
>> uxtb tmp1, data1, ror #BYTE1_OFFSET
>> @@ -314,7 +332,8 @@ def_fn strcmp
>> mov result, tmp1
>> ldr r4, [sp], #16
>> .cfi_restore 4
>> - bx lr
>> + .cfi_adjust_cfa_offset -16
>> + pacbti_epilogue
>> #ifndef STRCMP_NO_PRECHECK
>> .Laligned_m1:
>> @@ -364,8 +383,9 @@ def_fn strcmp
>> /* R6/7 Not used in this sequence. */
>> .cfi_restore 6
>> .cfi_restore 7
>> + .cfi_adjust_cfa_offset -16
>> neg result, result
>> - bx lr
>> + pacbti_epilogue
>> 6:
>> .cfi_restore_state
>> @@ -441,7 +461,8 @@ def_fn strcmp
>> /* R6/7 not used in this sequence. */
>> .cfi_restore 6
>> .cfi_restore 7
>> - bx lr
>> + .cfi_adjust_cfa_offset -16
>> + pacbti_epilogue
>> .Lstrcmp_tail:
>> .cfi_restore_state
>> @@ -463,7 +484,9 @@ def_fn strcmp
>> /* R6/7 not used in this sequence. */
>> .cfi_restore 6
>> .cfi_restore 7
>> + .cfi_adjust_cfa_offset -16
>> sub result, result, data2, lsr #24
>> - bx lr
>> + pacbti_epilogue
>> .cfi_endproc
>> + .fnend
>> .size strcmp, . - .Lstrcmp_start_addr
>> diff --git a/newlib/libc/machine/arm/strcmp-armv7m.S b/newlib/libc/machine/arm/strcmp-armv7m.S
>> index cdb4912df..fe1519f4d 100644
>> --- a/newlib/libc/machine/arm/strcmp-armv7m.S
>> +++ b/newlib/libc/machine/arm/strcmp-armv7m.S
>> @@ -29,6 +29,8 @@
>> /* Very similar to the generic code, but uses Thumb2 as implemented
>> in ARMv7-M. */
>> +#include "pacbti.h"
>> +
>> /* Parameters and result. */
>> #define src1 r0
>> #define src2 r1
>> @@ -44,8 +46,10 @@
>> .thumb
>> .syntax unified
>> def_fn strcmp
>> + .fnstart
>> .cfi_sections .debug_frame
>> .cfi_startproc
>> + pacbti_prologue
>> eor tmp1, src1, src2
>> tst tmp1, #3
>> /* Strings not at same byte offset from a word boundary. */
>> @@ -106,7 +110,7 @@ def_fn strcmp
>> lsrs result, result, #24
>> subs result, result, data2
>> #endif
>> - bx lr
>> + pacbti_epilogue
>> #if 0
>> @@ -214,12 +218,18 @@ def_fn strcmp
>> cmpcs data1, data2
>> beq .Lstrcmp_unaligned
>> sub result, data1, data2
>> - bx lr
>> + pacbti_epilogue
>> 2:
>> stmfd sp!, {r5}
>> - .cfi_def_cfa_offset 4
>> + .save {r5}
>> + .cfi_adjust_cfa_offset 4
>> +#if __HAVE_PAC_LEAF
>> + .cfi_offset 5, -8 /* Account for ip register already on stack. */
>> +#else
>> .cfi_offset 5, -4
>> +#endif /* __HAVE_PAC_LEAF */
>> +
>> ldr data1, [src1], #4
>> and tmp2, src2, #3
>> @@ -353,10 +363,17 @@ def_fn strcmp
>> .Lstrcmp_done_equal:
>> mov result, #0
>> .cfi_remember_state
>> +#if __HAVE_PAC_LEAF
>> + pop {r5, ip}
>> + .cfi_restore 5
>> + .cfi_restore 143
>> + aut ip, lr, sp
>> +#else
>> ldmfd sp!, {r5}
>> .cfi_restore 5
>> +#endif /* __HAVE_PAC_LEAF */
>
> I think we could define a macro for this as well, something like
> pop_reg_and_pac r5
This sounds good, but a survey through relevant function epilogues
reveals a series of different `pop' scenarios. On the one hand, we have
pop {r5, ip}, as we have here but on another we also have, for example,
pop {r4,r5,r6,r7,ip}.
Do we wish to provide a short-hand notation for one pop operation but
not the other via macros? My initial understanding is that assembler
macros don't provide us with the necessary flexibility to handle all
relevant cases.
I think the use of a macro here would greatly enhance code readability
and by extension maintainability, but at first sight it feels inelegant
to provide a solution applicable only to a single case of a more general
problem.
Any wisdom here would be appreciated.
V.
> While you're doing that, please change the single-register ldmfd into 'pop'.
>
>> .cfi_def_cfa_offset 0
>> - bx lr
>> + bx lr
>
> stray change to the BX instruction.
>
>> .Lstrcmp_tail:
>> .cfi_restore_state
>> @@ -370,9 +387,18 @@ def_fn strcmp
>> S2LOEQ data2, data2, #8
>> beq .Lstrcmp_tail
>> sub result, r2, result
>> +#if __HAVE_PAC_LEAF
>> + pop {r5, ip}
>> + .cfi_restore 5
>> + .cfi_restore 143
>> + .cfi_def_cfa_offset 0
>> + aut ip, lr, sp
>> +#else
>> ldmfd sp!, {r5}
>> .cfi_restore 5
>> .cfi_def_cfa_offset 0
>> - bx lr
>> +#endif /* __HAVE_PAC_LEAF */
>
> Same as above
>
>> + bx lr
>
> tab between mnemonic and operands.
>
>> .cfi_endproc
>> + .fnend
>> .size strcmp, . - strcmp
>
> R.
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH v2 2/8] newlib: libc: strcmp M-profile PACBTI-enablement
2022-08-05 10:51 ` Victor L. Do Nascimento
@ 2022-08-05 14:34 ` Richard Earnshaw
2022-08-05 14:58 ` Victor L. Do Nascimento
0 siblings, 1 reply; 18+ messages in thread
From: Richard Earnshaw @ 2022-08-05 14:34 UTC (permalink / raw)
To: Victor L. Do Nascimento; +Cc: newlib, richard.earnshaw
On 05/08/2022 11:51, Victor L. Do Nascimento wrote:
> Richard Earnshaw <Richard.Earnshaw@foss.arm.com> writes:
>
>> On 03/08/2022 16:35, Victor Do Nascimento wrote:
>>> Add function prologue/epilogue to conditionally add BTI landing pads
>>> and/or PAC code generation & authentication instructions depending on
>>> compilation flags.
>>> This patch enables PACBTI for all relevant variants of strcmp:
>>> * Newlib for armv8.1-m.main+pacbti
>>> * Newlib for armv8.1-m.main+pacbti+mve
>>> * Newlib-nano
>>> ---
>>> newlib/libc/machine/arm/strcmp-arm-tiny.S | 7 +++-
>>> newlib/libc/machine/arm/strcmp-armv7.S | 45 +++++++++++++++++------
>>> newlib/libc/machine/arm/strcmp-armv7m.S | 36 +++++++++++++++---
>>> 3 files changed, 71 insertions(+), 17 deletions(-)
>>> diff --git a/newlib/libc/machine/arm/strcmp-arm-tiny.S
>>> b/newlib/libc/machine/arm/strcmp-arm-tiny.S
>>> index 607a41daf..8085eb4df 100644
>>> --- a/newlib/libc/machine/arm/strcmp-arm-tiny.S
>>> +++ b/newlib/libc/machine/arm/strcmp-arm-tiny.S
>>> @@ -29,10 +29,14 @@
>>> /* Tiny version of strcmp in ARM state. Used only when optimizing
>>> for size. Also supports Thumb-2. */
>>> +#include "pacbti.h"
>>> +
>>> .syntax unified
>>> def_fn strcmp
>>> + .fnstart
>>> .cfi_sections .debug_frame
>>> .cfi_startproc
>>> + pacbti_prologue
>>
>> See comment on patch1 about the names of these macros.
>>
>>> 1:
>>> ldrb r2, [r0], #1
>>> ldrb r3, [r1], #1
>>> @@ -42,6 +46,7 @@ def_fn strcmp
>>> beq 1b
>>> 2:
>>> subs r0, r2, r3
>>> - bx lr
>>> + pacbti_epilogue
>>> .cfi_endproc
>>> + .fnend
>>> .size strcmp, . - strcmp
>>> diff --git a/newlib/libc/machine/arm/strcmp-armv7.S b/newlib/libc/machine/arm/strcmp-armv7.S
>>> index 2f93bfb73..584c89f7e 100644
>>> --- a/newlib/libc/machine/arm/strcmp-armv7.S
>>> +++ b/newlib/libc/machine/arm/strcmp-armv7.S
>>> @@ -45,6 +45,8 @@
>>> .thumb
>>> .syntax unified
>>> +#include "pacbti.h"
>>> +
>>> /* Parameters and result. */
>>> #define src1 r0
>>> #define src2 r1
>>> @@ -91,8 +93,9 @@
>>> ldrd r4, r5, [sp], #16
>>> .cfi_restore 4
>>> .cfi_restore 5
>>> + .cfi_adjust_cfa_offset -16
>>
>> This looks like it's an omission from the original file and thus
>> should be separated out as a separate patch.
>>
>>> sub result, result, r1, lsr #24
>>> - bx lr
>>> + pacbti_epilogue
>>> #else
>>> /* To use the big-endian trick we'd have to reverse all three words.
>>> that's slower than this approach. */
>>> @@ -112,22 +115,28 @@
>>> ldrd r4, r5, [sp], #16
>>> .cfi_restore 4
>>> .cfi_restore 5
>>> + .cfi_adjust_cfa_offset -16
>>> sub result, result, r1
>>> - bx lr
>>> + pacbti_epilogue
>>> #endif
>>> .endm
>>> +
>>> .text
>>> .p2align 5
>>> + .fnstart
>>> + .cfi_sections .debug_frame
>>> + .cfi_startproc
>>> .Lstrcmp_start_addr:
>>> #ifndef STRCMP_NO_PRECHECK
>>> .Lfastpath_exit:
>>> sub r0, r2, r3
>>> - bx lr
>>> + pacbti_epilogue
>>> nop
>>> #endif
>>> def_fn strcmp
>>> + pacbti_prologue
>>> #ifndef STRCMP_NO_PRECHECK
>>> ldrb r2, [src1]
>>> ldrb r3, [src2]
>>> @@ -136,16 +145,26 @@ def_fn strcmp
>>> cmpcs r2, r3
>>> bne .Lfastpath_exit
>>> #endif
>>> - .cfi_sections .debug_frame
>>> - .cfi_startproc
>>> strd r4, r5, [sp, #-16]!
>>> - .cfi_def_cfa_offset 16
>>> + .save {r4, r5}
>>
>> Hmm, I've just remembered that leaf functions cannot throw exceptions
>> (the EHABI only supports synchronous exceptions), so I don't think we
>> need .save directives at all. Instead, such functions should be
>> marked with ".cantunwind". You still want the dwarf unwind
>> information as that's for the debugger, it's just the '.save'
>> directives that aren't needed.
>>
>>> + .cfi_adjust_cfa_offset 16
>>> +#if __HAVE_PAC_LEAF
>>> + .cfi_offset 4, -20
>>> + .cfi_offset 5, -16
>>> +#else
>>> .cfi_offset 4, -16
>>> .cfi_offset 5, -12
>>> +#endif /* __HAVE_PAC_LEAF */
>>
>> Ugh! This is not the right way to do this. Instead (see comments on
>> patch 1) use a macro that defines the additional adjustment to use in
>> various contexts, something like
>>
>> .cfi_offset 4, -(16+PAC_UNALIGNED_CFI_ADJ)
>>
>>> orr tmp1, src1, src2
>>> strd r6, r7, [sp, #8]
>>> + .save {r6, r7}
>>> +#if __HAVE_PAC_LEAF
>>> + .cfi_offset 6, -12
>>> + .cfi_offset 7, -8
>>> +#else
>>> .cfi_offset 6, -8
>>> .cfi_offset 7, -4
>>> +#endif /* __HAVE_PAC_LEAF */
>>> mvn const_m1, #0
>>> lsl r2, tmp1, #29
>>> cbz r2, .Lloop_aligned8
>>> @@ -270,7 +289,6 @@ def_fn strcmp
>>> ldr data1, [src1], #4
>>> beq .Laligned_m2
>>> bcs .Laligned_m1
>>> -
>>> #ifdef STRCMP_NO_PRECHECK
>>> ldrb data2, [src2, #1]
>>> uxtb tmp1, data1, ror #BYTE1_OFFSET
>>> @@ -314,7 +332,8 @@ def_fn strcmp
>>> mov result, tmp1
>>> ldr r4, [sp], #16
>>> .cfi_restore 4
>>> - bx lr
>>> + .cfi_adjust_cfa_offset -16
>>> + pacbti_epilogue
>>> #ifndef STRCMP_NO_PRECHECK
>>> .Laligned_m1:
>>> @@ -364,8 +383,9 @@ def_fn strcmp
>>> /* R6/7 Not used in this sequence. */
>>> .cfi_restore 6
>>> .cfi_restore 7
>>> + .cfi_adjust_cfa_offset -16
>>> neg result, result
>>> - bx lr
>>> + pacbti_epilogue
>>> 6:
>>> .cfi_restore_state
>>> @@ -441,7 +461,8 @@ def_fn strcmp
>>> /* R6/7 not used in this sequence. */
>>> .cfi_restore 6
>>> .cfi_restore 7
>>> - bx lr
>>> + .cfi_adjust_cfa_offset -16
>>> + pacbti_epilogue
>>> .Lstrcmp_tail:
>>> .cfi_restore_state
>>> @@ -463,7 +484,9 @@ def_fn strcmp
>>> /* R6/7 not used in this sequence. */
>>> .cfi_restore 6
>>> .cfi_restore 7
>>> + .cfi_adjust_cfa_offset -16
>>> sub result, result, data2, lsr #24
>>> - bx lr
>>> + pacbti_epilogue
>>> .cfi_endproc
>>> + .fnend
>>> .size strcmp, . - .Lstrcmp_start_addr
>>> diff --git a/newlib/libc/machine/arm/strcmp-armv7m.S b/newlib/libc/machine/arm/strcmp-armv7m.S
>>> index cdb4912df..fe1519f4d 100644
>>> --- a/newlib/libc/machine/arm/strcmp-armv7m.S
>>> +++ b/newlib/libc/machine/arm/strcmp-armv7m.S
>>> @@ -29,6 +29,8 @@
>>> /* Very similar to the generic code, but uses Thumb2 as implemented
>>> in ARMv7-M. */
>>> +#include "pacbti.h"
>>> +
>>> /* Parameters and result. */
>>> #define src1 r0
>>> #define src2 r1
>>> @@ -44,8 +46,10 @@
>>> .thumb
>>> .syntax unified
>>> def_fn strcmp
>>> + .fnstart
>>> .cfi_sections .debug_frame
>>> .cfi_startproc
>>> + pacbti_prologue
>>> eor tmp1, src1, src2
>>> tst tmp1, #3
>>> /* Strings not at same byte offset from a word boundary. */
>>> @@ -106,7 +110,7 @@ def_fn strcmp
>>> lsrs result, result, #24
>>> subs result, result, data2
>>> #endif
>>> - bx lr
>>> + pacbti_epilogue
>>> #if 0
>>> @@ -214,12 +218,18 @@ def_fn strcmp
>>> cmpcs data1, data2
>>> beq .Lstrcmp_unaligned
>>> sub result, data1, data2
>>> - bx lr
>>> + pacbti_epilogue
>>> 2:
>>> stmfd sp!, {r5}
>>> - .cfi_def_cfa_offset 4
>>> + .save {r5}
>>> + .cfi_adjust_cfa_offset 4
>>> +#if __HAVE_PAC_LEAF
>>> + .cfi_offset 5, -8 /* Account for ip register already on stack. */
>>> +#else
>>> .cfi_offset 5, -4
>>> +#endif /* __HAVE_PAC_LEAF */
>>> +
>>> ldr data1, [src1], #4
>>> and tmp2, src2, #3
>>> @@ -353,10 +363,17 @@ def_fn strcmp
>>> .Lstrcmp_done_equal:
>>> mov result, #0
>>> .cfi_remember_state
>>> +#if __HAVE_PAC_LEAF
>>> + pop {r5, ip}
>>> + .cfi_restore 5
>>> + .cfi_restore 143
>>> + aut ip, lr, sp
>>> +#else
>>> ldmfd sp!, {r5}
>>> .cfi_restore 5
>>> +#endif /* __HAVE_PAC_LEAF */
>>
>> I think we could define a macro for this as well, something like
>> pop_reg_and_pac r5
>
> This sounds good, but a survey through relevant function epilogues
> reveals a series of different `pop' scenarios. On the one hand, we have
> pop {r5, ip}, as we have here but on another we also have, for example,
> pop {r4,r5,r6,r7,ip}.
>
> Do we wish to provide a short-hand notation for one pop operation but
> not the other via macros? My initial understanding is that assembler
> macros don't provide us with the necessary flexibility to handle all
> relevant cases.
>
> I think the use of a macro here would greatly enhance code readability
> and by extension maintainability, but at first sight it feels inelegant
> to provide a solution applicable only to a single case of a more general
> problem.
>
> Any wisdom here would be appreciated.
Yes, this is all quite tricky. Fortunately, the vast majority of time
registers saved on the stack frame are consecutively numbered. So I
think it's possible to write some macros to handle this. The example
below is not complete (I'll leave you to flesh out all the different
cases :), but I think it shows that we can achieve what's needed.
@ Emit .cfi_offset directives for a consecutive sequence of registers
.macro cfisavelist first, last, count
.cfi_offset \last, -4*(\count)
.if \last-\first
cfisavelist \first, \last-1, \count+1
.endif
.endm
@ Create a prologue entry sequence handling PAC/BTI, if required and
emitting
@ CFI directives for
.macro prologue first=-1, last=-1, savepac=1
.if \first != -1
.if \last != -1
.if \savepac
push {r\first-r\last, ip}
.cfi_offset 143, -4
cfisavelist \first, \last, 2
.else
push {r\first-r\last}
cfisavelist \first, \last, 1
.endif
.else
push {r\first}
cfisavelist \first, \first, 1
.endif
.else
...
.endif
.endm
.text
.global foo
.type foo, %function
foo:
.cfi_startproc
prologue 4, 7, savepac=0
.cfi_endproc
.text
.global bar
.type bar, %function
bar:
.cfi_startproc
prologue 4, 7
.cfi_endproc
.text
.global wom
.type wom, %function
wom:
.cfi_startproc
prologue 4, savepac=0
.cfi_endproc
.text
.global bat
.type bat, %function
bat:
.cfi_startproc
prologue savepac=0
.cfi_endproc
R.
>
> V.
>
>> While you're doing that, please change the single-register ldmfd into 'pop'.
>>
>>> .cfi_def_cfa_offset 0
>>> - bx lr
>>> + bx lr
>>
>> stray change to the BX instruction.
>>
>>> .Lstrcmp_tail:
>>> .cfi_restore_state
>>> @@ -370,9 +387,18 @@ def_fn strcmp
>>> S2LOEQ data2, data2, #8
>>> beq .Lstrcmp_tail
>>> sub result, r2, result
>>> +#if __HAVE_PAC_LEAF
>>> + pop {r5, ip}
>>> + .cfi_restore 5
>>> + .cfi_restore 143
>>> + .cfi_def_cfa_offset 0
>>> + aut ip, lr, sp
>>> +#else
>>> ldmfd sp!, {r5}
>>> .cfi_restore 5
>>> .cfi_def_cfa_offset 0
>>> - bx lr
>>> +#endif /* __HAVE_PAC_LEAF */
>>
>> Same as above
>>
>>> + bx lr
>>
>> tab between mnemonic and operands.
>>
>>> .cfi_endproc
>>> + .fnend
>>> .size strcmp, . - strcmp
>>
>> R.
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH v2 2/8] newlib: libc: strcmp M-profile PACBTI-enablement
2022-08-05 14:34 ` Richard Earnshaw
@ 2022-08-05 14:58 ` Victor L. Do Nascimento
2022-08-05 15:15 ` Richard Earnshaw
0 siblings, 1 reply; 18+ messages in thread
From: Victor L. Do Nascimento @ 2022-08-05 14:58 UTC (permalink / raw)
To: Richard Earnshaw; +Cc: newlib, richard.earnshaw
Richard Earnshaw <Richard.Earnshaw@foss.arm.com> writes:
> On 05/08/2022 11:51, Victor L. Do Nascimento wrote:
>> Richard Earnshaw <Richard.Earnshaw@foss.arm.com> writes:
>>
>>> On 03/08/2022 16:35, Victor Do Nascimento wrote:
>>>> Add function prologue/epilogue to conditionally add BTI landing pads
>>>> and/or PAC code generation & authentication instructions depending on
>>>> compilation flags.
>>>> This patch enables PACBTI for all relevant variants of strcmp:
>>>> * Newlib for armv8.1-m.main+pacbti
>>>> * Newlib for armv8.1-m.main+pacbti+mve
>>>> * Newlib-nano
>>>> ---
>>>> newlib/libc/machine/arm/strcmp-arm-tiny.S | 7 +++-
>>>> newlib/libc/machine/arm/strcmp-armv7.S | 45 +++++++++++++++++------
>>>> newlib/libc/machine/arm/strcmp-armv7m.S | 36 +++++++++++++++---
>>>> 3 files changed, 71 insertions(+), 17 deletions(-)
>>>> diff --git a/newlib/libc/machine/arm/strcmp-arm-tiny.S
>>>> b/newlib/libc/machine/arm/strcmp-arm-tiny.S
>>>> index 607a41daf..8085eb4df 100644
>>>> --- a/newlib/libc/machine/arm/strcmp-arm-tiny.S
>>>> +++ b/newlib/libc/machine/arm/strcmp-arm-tiny.S
>>>> @@ -29,10 +29,14 @@
>>>> /* Tiny version of strcmp in ARM state. Used only when optimizing
>>>> for size. Also supports Thumb-2. */
>>>> +#include "pacbti.h"
>>>> +
>>>> .syntax unified
>>>> def_fn strcmp
>>>> + .fnstart
>>>> .cfi_sections .debug_frame
>>>> .cfi_startproc
>>>> + pacbti_prologue
>>>
>>> See comment on patch1 about the names of these macros.
>>>
>>>> 1:
>>>> ldrb r2, [r0], #1
>>>> ldrb r3, [r1], #1
>>>> @@ -42,6 +46,7 @@ def_fn strcmp
>>>> beq 1b
>>>> 2:
>>>> subs r0, r2, r3
>>>> - bx lr
>>>> + pacbti_epilogue
>>>> .cfi_endproc
>>>> + .fnend
>>>> .size strcmp, . - strcmp
>>>> diff --git a/newlib/libc/machine/arm/strcmp-armv7.S b/newlib/libc/machine/arm/strcmp-armv7.S
>>>> index 2f93bfb73..584c89f7e 100644
>>>> --- a/newlib/libc/machine/arm/strcmp-armv7.S
>>>> +++ b/newlib/libc/machine/arm/strcmp-armv7.S
>>>> @@ -45,6 +45,8 @@
>>>> .thumb
>>>> .syntax unified
>>>> +#include "pacbti.h"
>>>> +
>>>> /* Parameters and result. */
>>>> #define src1 r0
>>>> #define src2 r1
>>>> @@ -91,8 +93,9 @@
>>>> ldrd r4, r5, [sp], #16
>>>> .cfi_restore 4
>>>> .cfi_restore 5
>>>> + .cfi_adjust_cfa_offset -16
>>>
>>> This looks like it's an omission from the original file and thus
>>> should be separated out as a separate patch.
>>>
>>>> sub result, result, r1, lsr #24
>>>> - bx lr
>>>> + pacbti_epilogue
>>>> #else
>>>> /* To use the big-endian trick we'd have to reverse all three words.
>>>> that's slower than this approach. */
>>>> @@ -112,22 +115,28 @@
>>>> ldrd r4, r5, [sp], #16
>>>> .cfi_restore 4
>>>> .cfi_restore 5
>>>> + .cfi_adjust_cfa_offset -16
>>>> sub result, result, r1
>>>> - bx lr
>>>> + pacbti_epilogue
>>>> #endif
>>>> .endm
>>>> +
>>>> .text
>>>> .p2align 5
>>>> + .fnstart
>>>> + .cfi_sections .debug_frame
>>>> + .cfi_startproc
>>>> .Lstrcmp_start_addr:
>>>> #ifndef STRCMP_NO_PRECHECK
>>>> .Lfastpath_exit:
>>>> sub r0, r2, r3
>>>> - bx lr
>>>> + pacbti_epilogue
>>>> nop
>>>> #endif
>>>> def_fn strcmp
>>>> + pacbti_prologue
>>>> #ifndef STRCMP_NO_PRECHECK
>>>> ldrb r2, [src1]
>>>> ldrb r3, [src2]
>>>> @@ -136,16 +145,26 @@ def_fn strcmp
>>>> cmpcs r2, r3
>>>> bne .Lfastpath_exit
>>>> #endif
>>>> - .cfi_sections .debug_frame
>>>> - .cfi_startproc
>>>> strd r4, r5, [sp, #-16]!
>>>> - .cfi_def_cfa_offset 16
>>>> + .save {r4, r5}
>>>
>>> Hmm, I've just remembered that leaf functions cannot throw exceptions
>>> (the EHABI only supports synchronous exceptions), so I don't think we
>>> need .save directives at all. Instead, such functions should be
>>> marked with ".cantunwind". You still want the dwarf unwind
>>> information as that's for the debugger, it's just the '.save'
>>> directives that aren't needed.
>>>
>>>> + .cfi_adjust_cfa_offset 16
>>>> +#if __HAVE_PAC_LEAF
>>>> + .cfi_offset 4, -20
>>>> + .cfi_offset 5, -16
>>>> +#else
>>>> .cfi_offset 4, -16
>>>> .cfi_offset 5, -12
>>>> +#endif /* __HAVE_PAC_LEAF */
>>>
>>> Ugh! This is not the right way to do this. Instead (see comments on
>>> patch 1) use a macro that defines the additional adjustment to use in
>>> various contexts, something like
>>>
>>> .cfi_offset 4, -(16+PAC_UNALIGNED_CFI_ADJ)
>>>
>>>> orr tmp1, src1, src2
>>>> strd r6, r7, [sp, #8]
>>>> + .save {r6, r7}
>>>> +#if __HAVE_PAC_LEAF
>>>> + .cfi_offset 6, -12
>>>> + .cfi_offset 7, -8
>>>> +#else
>>>> .cfi_offset 6, -8
>>>> .cfi_offset 7, -4
>>>> +#endif /* __HAVE_PAC_LEAF */
>>>> mvn const_m1, #0
>>>> lsl r2, tmp1, #29
>>>> cbz r2, .Lloop_aligned8
>>>> @@ -270,7 +289,6 @@ def_fn strcmp
>>>> ldr data1, [src1], #4
>>>> beq .Laligned_m2
>>>> bcs .Laligned_m1
>>>> -
>>>> #ifdef STRCMP_NO_PRECHECK
>>>> ldrb data2, [src2, #1]
>>>> uxtb tmp1, data1, ror #BYTE1_OFFSET
>>>> @@ -314,7 +332,8 @@ def_fn strcmp
>>>> mov result, tmp1
>>>> ldr r4, [sp], #16
>>>> .cfi_restore 4
>>>> - bx lr
>>>> + .cfi_adjust_cfa_offset -16
>>>> + pacbti_epilogue
>>>> #ifndef STRCMP_NO_PRECHECK
>>>> .Laligned_m1:
>>>> @@ -364,8 +383,9 @@ def_fn strcmp
>>>> /* R6/7 Not used in this sequence. */
>>>> .cfi_restore 6
>>>> .cfi_restore 7
>>>> + .cfi_adjust_cfa_offset -16
>>>> neg result, result
>>>> - bx lr
>>>> + pacbti_epilogue
>>>> 6:
>>>> .cfi_restore_state
>>>> @@ -441,7 +461,8 @@ def_fn strcmp
>>>> /* R6/7 not used in this sequence. */
>>>> .cfi_restore 6
>>>> .cfi_restore 7
>>>> - bx lr
>>>> + .cfi_adjust_cfa_offset -16
>>>> + pacbti_epilogue
>>>> .Lstrcmp_tail:
>>>> .cfi_restore_state
>>>> @@ -463,7 +484,9 @@ def_fn strcmp
>>>> /* R6/7 not used in this sequence. */
>>>> .cfi_restore 6
>>>> .cfi_restore 7
>>>> + .cfi_adjust_cfa_offset -16
>>>> sub result, result, data2, lsr #24
>>>> - bx lr
>>>> + pacbti_epilogue
>>>> .cfi_endproc
>>>> + .fnend
>>>> .size strcmp, . - .Lstrcmp_start_addr
>>>> diff --git a/newlib/libc/machine/arm/strcmp-armv7m.S b/newlib/libc/machine/arm/strcmp-armv7m.S
>>>> index cdb4912df..fe1519f4d 100644
>>>> --- a/newlib/libc/machine/arm/strcmp-armv7m.S
>>>> +++ b/newlib/libc/machine/arm/strcmp-armv7m.S
>>>> @@ -29,6 +29,8 @@
>>>> /* Very similar to the generic code, but uses Thumb2 as implemented
>>>> in ARMv7-M. */
>>>> +#include "pacbti.h"
>>>> +
>>>> /* Parameters and result. */
>>>> #define src1 r0
>>>> #define src2 r1
>>>> @@ -44,8 +46,10 @@
>>>> .thumb
>>>> .syntax unified
>>>> def_fn strcmp
>>>> + .fnstart
>>>> .cfi_sections .debug_frame
>>>> .cfi_startproc
>>>> + pacbti_prologue
>>>> eor tmp1, src1, src2
>>>> tst tmp1, #3
>>>> /* Strings not at same byte offset from a word boundary. */
>>>> @@ -106,7 +110,7 @@ def_fn strcmp
>>>> lsrs result, result, #24
>>>> subs result, result, data2
>>>> #endif
>>>> - bx lr
>>>> + pacbti_epilogue
>>>> #if 0
>>>> @@ -214,12 +218,18 @@ def_fn strcmp
>>>> cmpcs data1, data2
>>>> beq .Lstrcmp_unaligned
>>>> sub result, data1, data2
>>>> - bx lr
>>>> + pacbti_epilogue
>>>> 2:
>>>> stmfd sp!, {r5}
>>>> - .cfi_def_cfa_offset 4
>>>> + .save {r5}
>>>> + .cfi_adjust_cfa_offset 4
>>>> +#if __HAVE_PAC_LEAF
>>>> + .cfi_offset 5, -8 /* Account for ip register already on stack. */
>>>> +#else
>>>> .cfi_offset 5, -4
>>>> +#endif /* __HAVE_PAC_LEAF */
>>>> +
>>>> ldr data1, [src1], #4
>>>> and tmp2, src2, #3
>>>> @@ -353,10 +363,17 @@ def_fn strcmp
>>>> .Lstrcmp_done_equal:
>>>> mov result, #0
>>>> .cfi_remember_state
>>>> +#if __HAVE_PAC_LEAF
>>>> + pop {r5, ip}
>>>> + .cfi_restore 5
>>>> + .cfi_restore 143
>>>> + aut ip, lr, sp
>>>> +#else
>>>> ldmfd sp!, {r5}
>>>> .cfi_restore 5
>>>> +#endif /* __HAVE_PAC_LEAF */
>>>
>>> I think we could define a macro for this as well, something like
>>> pop_reg_and_pac r5
>> This sounds good, but a survey through relevant function epilogues
>> reveals a series of different `pop' scenarios. On the one hand, we have
>> pop {r5, ip}, as we have here but on another we also have, for example,
>> pop {r4,r5,r6,r7,ip}.
>> Do we wish to provide a short-hand notation for one pop operation
>> but
>> not the other via macros? My initial understanding is that assembler
>> macros don't provide us with the necessary flexibility to handle all
>> relevant cases.
>> I think the use of a macro here would greatly enhance code
>> readability
>> and by extension maintainability, but at first sight it feels inelegant
>> to provide a solution applicable only to a single case of a more general
>> problem.
>> Any wisdom here would be appreciated.
>
> Yes, this is all quite tricky. Fortunately, the vast majority of time
> registers saved on the stack frame are consecutively numbered. So I
> think it's possible to write some macros to handle this. The example
> below is not complete (I'll leave you to flesh out all the different
> cases :), but I think it shows that we can achieve what's needed.
>
> @ Emit .cfi_offset directives for a consecutive sequence of registers
> .macro cfisavelist first, last, count
> .cfi_offset \last, -4*(\count)
> .if \last-\first
> cfisavelist \first, \last-1, \count+1
> .endif
> .endm
>
> @ Create a prologue entry sequence handling PAC/BTI, if required and
> emitting
> @ CFI directives for
> .macro prologue first=-1, last=-1, savepac=1
> .if \first != -1
> .if \last != -1
> .if \savepac
> push {r\first-r\last, ip}
> .cfi_offset 143, -4
> cfisavelist \first, \last, 2
> .else
> push {r\first-r\last}
> cfisavelist \first, \last, 1
> .endif
> .else
> push {r\first}
> cfisavelist \first, \first, 1
> .endif
> .else
> ...
> .endif
> .endm
>
> .text
> .global foo
> .type foo, %function
> foo:
> .cfi_startproc
> prologue 4, 7, savepac=0
> .cfi_endproc
>
> .text
> .global bar
> .type bar, %function
> bar:
> .cfi_startproc
> prologue 4, 7
> .cfi_endproc
>
> .text
> .global wom
> .type wom, %function
> wom:
> .cfi_startproc
> prologue 4, savepac=0
> .cfi_endproc
>
> .text
> .global bat
> .type bat, %function
> bat:
> .cfi_startproc
> prologue savepac=0
> .cfi_endproc
>
> R.
Thanks for these, they'll come in extremely handy. Regarding the
pop_reg_and_pac macro you suggested, I'd concocted a solution which at
first sight seemed to make minimal assumptions about what we're asking
it to pop and, looking at initial example disassemblies, seems to
behave well.
I'll share it with you here in case you come across anything warranting
feedback.
Furthermore, I will see whether it will be of any use in building upon
your example.
.macro cfi_restore_list first:req, rest:vararg
.cfi_restore \first
.ifnb \rest
cfi_restore_list \rest
.endif
.endm
.macro pop_reg_and_pac first:req, rest:vararg
.ifnb \rest
pop {\first, \rest, ip}
.else
pop {\first, ip}
.endif
cfi_restore_list \first, \rest
.cfi_restore 143
aut ip, lr, sp
.endm
As always, thanks for your input.
V.
>> V.
>>
>>> While you're doing that, please change the single-register ldmfd into 'pop'.
>>>
>>>> .cfi_def_cfa_offset 0
>>>> - bx lr
>>>> + bx lr
>>>
>>> stray change to the BX instruction.
>>>
>>>> .Lstrcmp_tail:
>>>> .cfi_restore_state
>>>> @@ -370,9 +387,18 @@ def_fn strcmp
>>>> S2LOEQ data2, data2, #8
>>>> beq .Lstrcmp_tail
>>>> sub result, r2, result
>>>> +#if __HAVE_PAC_LEAF
>>>> + pop {r5, ip}
>>>> + .cfi_restore 5
>>>> + .cfi_restore 143
>>>> + .cfi_def_cfa_offset 0
>>>> + aut ip, lr, sp
>>>> +#else
>>>> ldmfd sp!, {r5}
>>>> .cfi_restore 5
>>>> .cfi_def_cfa_offset 0
>>>> - bx lr
>>>> +#endif /* __HAVE_PAC_LEAF */
>>>
>>> Same as above
>>>
>>>> + bx lr
>>>
>>> tab between mnemonic and operands.
>>>
>>>> .cfi_endproc
>>>> + .fnend
>>>> .size strcmp, . - strcmp
>>>
>>> R.
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH v2 2/8] newlib: libc: strcmp M-profile PACBTI-enablement
2022-08-05 14:58 ` Victor L. Do Nascimento
@ 2022-08-05 15:15 ` Richard Earnshaw
0 siblings, 0 replies; 18+ messages in thread
From: Richard Earnshaw @ 2022-08-05 15:15 UTC (permalink / raw)
To: Victor L. Do Nascimento; +Cc: newlib, richard.earnshaw
On 05/08/2022 15:58, Victor L. Do Nascimento wrote:
> Richard Earnshaw <Richard.Earnshaw@foss.arm.com> writes:
>
>> On 05/08/2022 11:51, Victor L. Do Nascimento wrote:
>>> Richard Earnshaw <Richard.Earnshaw@foss.arm.com> writes:
>>>
>>>> On 03/08/2022 16:35, Victor Do Nascimento wrote:
>>>>> Add function prologue/epilogue to conditionally add BTI landing pads
>>>>> and/or PAC code generation & authentication instructions depending on
>>>>> compilation flags.
>>>>> This patch enables PACBTI for all relevant variants of strcmp:
>>>>> * Newlib for armv8.1-m.main+pacbti
>>>>> * Newlib for armv8.1-m.main+pacbti+mve
>>>>> * Newlib-nano
>>>>> ---
>>>>> newlib/libc/machine/arm/strcmp-arm-tiny.S | 7 +++-
>>>>> newlib/libc/machine/arm/strcmp-armv7.S | 45 +++++++++++++++++------
>>>>> newlib/libc/machine/arm/strcmp-armv7m.S | 36 +++++++++++++++---
>>>>> 3 files changed, 71 insertions(+), 17 deletions(-)
>>>>> diff --git a/newlib/libc/machine/arm/strcmp-arm-tiny.S
>>>>> b/newlib/libc/machine/arm/strcmp-arm-tiny.S
>>>>> index 607a41daf..8085eb4df 100644
>>>>> --- a/newlib/libc/machine/arm/strcmp-arm-tiny.S
>>>>> +++ b/newlib/libc/machine/arm/strcmp-arm-tiny.S
>>>>> @@ -29,10 +29,14 @@
>>>>> /* Tiny version of strcmp in ARM state. Used only when optimizing
>>>>> for size. Also supports Thumb-2. */
>>>>> +#include "pacbti.h"
>>>>> +
>>>>> .syntax unified
>>>>> def_fn strcmp
>>>>> + .fnstart
>>>>> .cfi_sections .debug_frame
>>>>> .cfi_startproc
>>>>> + pacbti_prologue
>>>>
>>>> See comment on patch1 about the names of these macros.
>>>>
>>>>> 1:
>>>>> ldrb r2, [r0], #1
>>>>> ldrb r3, [r1], #1
>>>>> @@ -42,6 +46,7 @@ def_fn strcmp
>>>>> beq 1b
>>>>> 2:
>>>>> subs r0, r2, r3
>>>>> - bx lr
>>>>> + pacbti_epilogue
>>>>> .cfi_endproc
>>>>> + .fnend
>>>>> .size strcmp, . - strcmp
>>>>> diff --git a/newlib/libc/machine/arm/strcmp-armv7.S b/newlib/libc/machine/arm/strcmp-armv7.S
>>>>> index 2f93bfb73..584c89f7e 100644
>>>>> --- a/newlib/libc/machine/arm/strcmp-armv7.S
>>>>> +++ b/newlib/libc/machine/arm/strcmp-armv7.S
>>>>> @@ -45,6 +45,8 @@
>>>>> .thumb
>>>>> .syntax unified
>>>>> +#include "pacbti.h"
>>>>> +
>>>>> /* Parameters and result. */
>>>>> #define src1 r0
>>>>> #define src2 r1
>>>>> @@ -91,8 +93,9 @@
>>>>> ldrd r4, r5, [sp], #16
>>>>> .cfi_restore 4
>>>>> .cfi_restore 5
>>>>> + .cfi_adjust_cfa_offset -16
>>>>
>>>> This looks like it's an omission from the original file and thus
>>>> should be separated out as a separate patch.
>>>>
>>>>> sub result, result, r1, lsr #24
>>>>> - bx lr
>>>>> + pacbti_epilogue
>>>>> #else
>>>>> /* To use the big-endian trick we'd have to reverse all three words.
>>>>> that's slower than this approach. */
>>>>> @@ -112,22 +115,28 @@
>>>>> ldrd r4, r5, [sp], #16
>>>>> .cfi_restore 4
>>>>> .cfi_restore 5
>>>>> + .cfi_adjust_cfa_offset -16
>>>>> sub result, result, r1
>>>>> - bx lr
>>>>> + pacbti_epilogue
>>>>> #endif
>>>>> .endm
>>>>> +
>>>>> .text
>>>>> .p2align 5
>>>>> + .fnstart
>>>>> + .cfi_sections .debug_frame
>>>>> + .cfi_startproc
>>>>> .Lstrcmp_start_addr:
>>>>> #ifndef STRCMP_NO_PRECHECK
>>>>> .Lfastpath_exit:
>>>>> sub r0, r2, r3
>>>>> - bx lr
>>>>> + pacbti_epilogue
>>>>> nop
>>>>> #endif
>>>>> def_fn strcmp
>>>>> + pacbti_prologue
>>>>> #ifndef STRCMP_NO_PRECHECK
>>>>> ldrb r2, [src1]
>>>>> ldrb r3, [src2]
>>>>> @@ -136,16 +145,26 @@ def_fn strcmp
>>>>> cmpcs r2, r3
>>>>> bne .Lfastpath_exit
>>>>> #endif
>>>>> - .cfi_sections .debug_frame
>>>>> - .cfi_startproc
>>>>> strd r4, r5, [sp, #-16]!
>>>>> - .cfi_def_cfa_offset 16
>>>>> + .save {r4, r5}
>>>>
>>>> Hmm, I've just remembered that leaf functions cannot throw exceptions
>>>> (the EHABI only supports synchronous exceptions), so I don't think we
>>>> need .save directives at all. Instead, such functions should be
>>>> marked with ".cantunwind". You still want the dwarf unwind
>>>> information as that's for the debugger, it's just the '.save'
>>>> directives that aren't needed.
>>>>
>>>>> + .cfi_adjust_cfa_offset 16
>>>>> +#if __HAVE_PAC_LEAF
>>>>> + .cfi_offset 4, -20
>>>>> + .cfi_offset 5, -16
>>>>> +#else
>>>>> .cfi_offset 4, -16
>>>>> .cfi_offset 5, -12
>>>>> +#endif /* __HAVE_PAC_LEAF */
>>>>
>>>> Ugh! This is not the right way to do this. Instead (see comments on
>>>> patch 1) use a macro that defines the additional adjustment to use in
>>>> various contexts, something like
>>>>
>>>> .cfi_offset 4, -(16+PAC_UNALIGNED_CFI_ADJ)
>>>>
>>>>> orr tmp1, src1, src2
>>>>> strd r6, r7, [sp, #8]
>>>>> + .save {r6, r7}
>>>>> +#if __HAVE_PAC_LEAF
>>>>> + .cfi_offset 6, -12
>>>>> + .cfi_offset 7, -8
>>>>> +#else
>>>>> .cfi_offset 6, -8
>>>>> .cfi_offset 7, -4
>>>>> +#endif /* __HAVE_PAC_LEAF */
>>>>> mvn const_m1, #0
>>>>> lsl r2, tmp1, #29
>>>>> cbz r2, .Lloop_aligned8
>>>>> @@ -270,7 +289,6 @@ def_fn strcmp
>>>>> ldr data1, [src1], #4
>>>>> beq .Laligned_m2
>>>>> bcs .Laligned_m1
>>>>> -
>>>>> #ifdef STRCMP_NO_PRECHECK
>>>>> ldrb data2, [src2, #1]
>>>>> uxtb tmp1, data1, ror #BYTE1_OFFSET
>>>>> @@ -314,7 +332,8 @@ def_fn strcmp
>>>>> mov result, tmp1
>>>>> ldr r4, [sp], #16
>>>>> .cfi_restore 4
>>>>> - bx lr
>>>>> + .cfi_adjust_cfa_offset -16
>>>>> + pacbti_epilogue
>>>>> #ifndef STRCMP_NO_PRECHECK
>>>>> .Laligned_m1:
>>>>> @@ -364,8 +383,9 @@ def_fn strcmp
>>>>> /* R6/7 Not used in this sequence. */
>>>>> .cfi_restore 6
>>>>> .cfi_restore 7
>>>>> + .cfi_adjust_cfa_offset -16
>>>>> neg result, result
>>>>> - bx lr
>>>>> + pacbti_epilogue
>>>>> 6:
>>>>> .cfi_restore_state
>>>>> @@ -441,7 +461,8 @@ def_fn strcmp
>>>>> /* R6/7 not used in this sequence. */
>>>>> .cfi_restore 6
>>>>> .cfi_restore 7
>>>>> - bx lr
>>>>> + .cfi_adjust_cfa_offset -16
>>>>> + pacbti_epilogue
>>>>> .Lstrcmp_tail:
>>>>> .cfi_restore_state
>>>>> @@ -463,7 +484,9 @@ def_fn strcmp
>>>>> /* R6/7 not used in this sequence. */
>>>>> .cfi_restore 6
>>>>> .cfi_restore 7
>>>>> + .cfi_adjust_cfa_offset -16
>>>>> sub result, result, data2, lsr #24
>>>>> - bx lr
>>>>> + pacbti_epilogue
>>>>> .cfi_endproc
>>>>> + .fnend
>>>>> .size strcmp, . - .Lstrcmp_start_addr
>>>>> diff --git a/newlib/libc/machine/arm/strcmp-armv7m.S b/newlib/libc/machine/arm/strcmp-armv7m.S
>>>>> index cdb4912df..fe1519f4d 100644
>>>>> --- a/newlib/libc/machine/arm/strcmp-armv7m.S
>>>>> +++ b/newlib/libc/machine/arm/strcmp-armv7m.S
>>>>> @@ -29,6 +29,8 @@
>>>>> /* Very similar to the generic code, but uses Thumb2 as implemented
>>>>> in ARMv7-M. */
>>>>> +#include "pacbti.h"
>>>>> +
>>>>> /* Parameters and result. */
>>>>> #define src1 r0
>>>>> #define src2 r1
>>>>> @@ -44,8 +46,10 @@
>>>>> .thumb
>>>>> .syntax unified
>>>>> def_fn strcmp
>>>>> + .fnstart
>>>>> .cfi_sections .debug_frame
>>>>> .cfi_startproc
>>>>> + pacbti_prologue
>>>>> eor tmp1, src1, src2
>>>>> tst tmp1, #3
>>>>> /* Strings not at same byte offset from a word boundary. */
>>>>> @@ -106,7 +110,7 @@ def_fn strcmp
>>>>> lsrs result, result, #24
>>>>> subs result, result, data2
>>>>> #endif
>>>>> - bx lr
>>>>> + pacbti_epilogue
>>>>> #if 0
>>>>> @@ -214,12 +218,18 @@ def_fn strcmp
>>>>> cmpcs data1, data2
>>>>> beq .Lstrcmp_unaligned
>>>>> sub result, data1, data2
>>>>> - bx lr
>>>>> + pacbti_epilogue
>>>>> 2:
>>>>> stmfd sp!, {r5}
>>>>> - .cfi_def_cfa_offset 4
>>>>> + .save {r5}
>>>>> + .cfi_adjust_cfa_offset 4
>>>>> +#if __HAVE_PAC_LEAF
>>>>> + .cfi_offset 5, -8 /* Account for ip register already on stack. */
>>>>> +#else
>>>>> .cfi_offset 5, -4
>>>>> +#endif /* __HAVE_PAC_LEAF */
>>>>> +
>>>>> ldr data1, [src1], #4
>>>>> and tmp2, src2, #3
>>>>> @@ -353,10 +363,17 @@ def_fn strcmp
>>>>> .Lstrcmp_done_equal:
>>>>> mov result, #0
>>>>> .cfi_remember_state
>>>>> +#if __HAVE_PAC_LEAF
>>>>> + pop {r5, ip}
>>>>> + .cfi_restore 5
>>>>> + .cfi_restore 143
>>>>> + aut ip, lr, sp
>>>>> +#else
>>>>> ldmfd sp!, {r5}
>>>>> .cfi_restore 5
>>>>> +#endif /* __HAVE_PAC_LEAF */
>>>>
>>>> I think we could define a macro for this as well, something like
>>>> pop_reg_and_pac r5
>>> This sounds good, but a survey through relevant function epilogues
>>> reveals a series of different `pop' scenarios. On the one hand, we have
>>> pop {r5, ip}, as we have here but on another we also have, for example,
>>> pop {r4,r5,r6,r7,ip}.
>>> Do we wish to provide a short-hand notation for one pop operation
>>> but
>>> not the other via macros? My initial understanding is that assembler
>>> macros don't provide us with the necessary flexibility to handle all
>>> relevant cases.
>>> I think the use of a macro here would greatly enhance code
>>> readability
>>> and by extension maintainability, but at first sight it feels inelegant
>>> to provide a solution applicable only to a single case of a more general
>>> problem.
>>> Any wisdom here would be appreciated.
>>
>> Yes, this is all quite tricky. Fortunately, the vast majority of time
>> registers saved on the stack frame are consecutively numbered. So I
>> think it's possible to write some macros to handle this. The example
>> below is not complete (I'll leave you to flesh out all the different
>> cases :), but I think it shows that we can achieve what's needed.
>>
>> @ Emit .cfi_offset directives for a consecutive sequence of registers
>> .macro cfisavelist first, last, count
>> .cfi_offset \last, -4*(\count)
>> .if \last-\first
>> cfisavelist \first, \last-1, \count+1
>> .endif
>> .endm
>>
>> @ Create a prologue entry sequence handling PAC/BTI, if required and
>> emitting
>> @ CFI directives for
>> .macro prologue first=-1, last=-1, savepac=1
>> .if \first != -1
>> .if \last != -1
>> .if \savepac
>> push {r\first-r\last, ip}
>> .cfi_offset 143, -4
>> cfisavelist \first, \last, 2
>> .else
>> push {r\first-r\last}
>> cfisavelist \first, \last, 1
>> .endif
>> .else
>> push {r\first}
>> cfisavelist \first, \first, 1
>> .endif
>> .else
>> ...
>> .endif
>> .endm
>>
>> .text
>> .global foo
>> .type foo, %function
>> foo:
>> .cfi_startproc
>> prologue 4, 7, savepac=0
>> .cfi_endproc
>>
>> .text
>> .global bar
>> .type bar, %function
>> bar:
>> .cfi_startproc
>> prologue 4, 7
>> .cfi_endproc
>>
>> .text
>> .global wom
>> .type wom, %function
>> wom:
>> .cfi_startproc
>> prologue 4, savepac=0
>> .cfi_endproc
>>
>> .text
>> .global bat
>> .type bat, %function
>> bat:
>> .cfi_startproc
>> prologue savepac=0
>> .cfi_endproc
>>
>> R.
>
> Thanks for these, they'll come in extremely handy. Regarding the
> pop_reg_and_pac macro you suggested, I'd concocted a solution which at
> first sight seemed to make minimal assumptions about what we're asking
> it to pop and, looking at initial example disassemblies, seems to
> behave well.
>
> I'll share it with you here in case you come across anything warranting
> feedback.
>
> Furthermore, I will see whether it will be of any use in building upon
> your example.
>
> .macro cfi_restore_list first:req, rest:vararg
> .cfi_restore \first
> .ifnb \rest
> cfi_restore_list \rest
> .endif
> .endm
>
> .macro pop_reg_and_pac first:req, rest:vararg
> .ifnb \rest
> pop {\first, \rest, ip}
> .else
> pop {\first, ip}
> .endif
> cfi_restore_list \first, \rest
> .cfi_restore 143
> aut ip, lr, sp
> .endm
>
Yes, I tried something similar to start with, but couldn't make it work
for the depth-first recursion needed for the cfi save directives
(because you need to count the list of registers to save in order to
work out the offsets of each one). The second problem is that, AFAICT,
macros can't be used as values to parameters (they simply produce text),
so you can't map names onto values (and I was trying to avoid register
names in .cfi statements).
R.
> As always, thanks for your input.
> V.
>
>>> V.
>>>
>>>> While you're doing that, please change the single-register ldmfd into 'pop'.
>>>>
>>>>> .cfi_def_cfa_offset 0
>>>>> - bx lr
>>>>> + bx lr
>>>>
>>>> stray change to the BX instruction.
>>>>
>>>>> .Lstrcmp_tail:
>>>>> .cfi_restore_state
>>>>> @@ -370,9 +387,18 @@ def_fn strcmp
>>>>> S2LOEQ data2, data2, #8
>>>>> beq .Lstrcmp_tail
>>>>> sub result, r2, result
>>>>> +#if __HAVE_PAC_LEAF
>>>>> + pop {r5, ip}
>>>>> + .cfi_restore 5
>>>>> + .cfi_restore 143
>>>>> + .cfi_def_cfa_offset 0
>>>>> + aut ip, lr, sp
>>>>> +#else
>>>>> ldmfd sp!, {r5}
>>>>> .cfi_restore 5
>>>>> .cfi_def_cfa_offset 0
>>>>> - bx lr
>>>>> +#endif /* __HAVE_PAC_LEAF */
>>>>
>>>> Same as above
>>>>
>>>>> + bx lr
>>>>
>>>> tab between mnemonic and operands.
>>>>
>>>>> .cfi_endproc
>>>>> + .fnend
>>>>> .size strcmp, . - strcmp
>>>>
>>>> R.
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH v2 1/8] newlib: libc: define M-profile PACBTI-enablement macros
2022-08-04 15:19 ` Richard Earnshaw
2022-08-05 9:21 ` Victor L. Do Nascimento
@ 2022-08-16 16:11 ` Victor L. Do Nascimento
2022-08-18 16:28 ` Victor L. Do Nascimento
1 sibling, 1 reply; 18+ messages in thread
From: Victor L. Do Nascimento @ 2022-08-16 16:11 UTC (permalink / raw)
To: Richard Earnshaw; +Cc: newlib, richard.earnshaw
Richard Earnshaw <Richard.Earnshaw@foss.arm.com> writes:
> On 03/08/2022 16:35, Victor Do Nascimento wrote:
>> Create an assembly header file that conditionally defines fuction
>> prologues/epilogues depending on the compile-time mbranch-protection
>> argument values.
>> * newlib/libc/machine/arm/pacbti.h: New.
>> ---
>> newlib/libc/machine/arm/pacbti.h | 64 ++++++++++++++++++++++++++++++++
>> 1 file changed, 64 insertions(+)
>> create mode 100644 newlib/libc/machine/arm/pacbti.h
>> diff --git a/newlib/libc/machine/arm/pacbti.h
>> b/newlib/libc/machine/arm/pacbti.h
>> new file mode 100644
>> index 000000000..4c31d00bc
>> --- /dev/null
>> +++ b/newlib/libc/machine/arm/pacbti.h
>> @@ -0,0 +1,64 @@
>> +/*
>> + * Copyright (c) 2022 Arm Ltd
>> + * All rights reserved.
>> + *
>> + * Redistribution and use in source and binary forms, with or without
>> + * modification, are permitted provided that the following conditions
>> + * are met:
>> + * 1. Redistributions of source code must retain the above copyright
>> + * notice, this list of conditions and the following disclaimer.
>> + * 2. Redistributions in binary form must reproduce the above copyright
>> + * notice, this list of conditions and the following disclaimer in the
>> + * documentation and/or other materials provided with the distribution.
>> + * 3. The name of the company may not be used to endorse or promote
>> + * products derived from this software without specific prior written
>> + * permission.
>> + *
>> + * THIS SOFTWARE IS PROVIDED BY ARM LTD ``AS IS'' AND ANY EXPRESS OR IMPLIED
>> + * WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF
>> + * MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED.
>> + * IN NO EVENT SHALL ARM LTD BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
>> + * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED
>> + * TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR
>> + * PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF
>> + * LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING
>> + * NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
>> + * SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
>> + */
>> +
>> +/* Checki whether leaf function PAC signing has been requested
>> + in the -mbranch-protect compile-time option. */
>> +#define LEAF_PROTECT_BIT 2
>> +#define __HAVE_PAC_LEAF \
>> + __ARM_FEATURE_PAC_DEFAULT & (1 << LEAF_PROTECT_BIT)
>
> Either this header needs to avoid polluting the user namespace, or it
> doesn't. If it does, then LEAF_PROTECT_BIT fails to do that. If it
> doesn't then __HAVE_PAC_LEAF should really be HAVE_PAC_LEAF (to avoid
> polluting the reserved namespace. I suspect this header is private to
> newlib (won't be exported to users), so should not be prefixing names
> with __.
>
>> +
>> +/* Macro to handle function entry depending on branch-protection
>> + schemes. */
>> + .macro pacbti_prologue
>> +#if __HAVE_PAC_LEAF
>> +#if __ARM_FEATURE_BTI_DEFAULT
>> + pacbti ip, lr, sp
>> +#else
>> + pac ip, lr, sp
>> +#endif /* __ARM_FEATURE_BTI_DEFAULT */
>> + .cfi_register 143, 12
>> + str ip, [sp, #-4]!
>
> This causes the stack to lose 8-byte alignment. Whilst for leaf
> functions that's probably not a problem, the macro should have an
> option where the user can ask for alignment to be preserved. But
> there should also be an option to not save IP on the stack at all (for
> when the user does not modify IP in the function body).
Work is ongoing on implementing the macro with all necessary features.
A better macro implementation for the prologue is given as follows (but
still lacks any alignment preservation logic):
/* Emit .cfi_offset directives for a consecutive sequence of registers.
PAC_CFI_ADJ will be 4 bytes if IP reg is pushed. */
.macro cfisavelist first, last, index=1
.cfi_offset \last, -4*(\index) - PAC_CFI_ADJ
.if \last-\first
cfisavelist \first, \last-1, \index+1
.endif
.endm
/* Create a prologue entry sequence handling PAC/BTI, if required and emitting
CFI directives for generated PAC code and any pushed registers. */
#define NREG ((\last-\first)+1)
.macro prologue first=-1, last=-1, savepac=PAC_LEAF_PUSH_IP
#if HAVE_PAC_LEAF
#if __ARM_FEATURE_BTI_DEFAULT
pacbti ip, lr, sp
#else
pac ip, lr, sp
#endif /* __ARM_FEATURE_BTI_DEFAULT */
.cfi_register 143, 12
#else
#if __ARM_FEATURE_BTI_DEFAULT
bti
#endif /* __ARM_FEATURE_BTI_DEFAULT */
#endif /* HAVE_PAC_LEAF */
.if \first != -1
.if \last != -1
.if \savepac
push {r\first-r\last, ip}
.cfi_adjust_cfa_offset NREG*4 + PAC_CFI_ADJ
.cfi_offset 143, -PAC_CFI_ADJ
cfisavelist \first, \last
.else
push {r\first-r\last}
.cfi_adjust_cfa_offset NREG*4
cfisavelist \first, \last
.endif
.else
.if \savepac
push {r\first, ip}
.cfi_adjust_cfa_offset 4 + PAC_CFI_ADJ
.cfi_offset 143, -PAC_CFI_ADJ
cfisavelist \first, \first
.else // !\savepac
push {r\first}
cfisavelist \first, \first
.endif
.endif
.else // \first == -1
.if \savepac
push {ip}
.cfi_adjust_cfa_offset PAC_CFI_ADJ
.cfi_offset 143, -PAC_CFI_ADJ
.endif
.endif
.endm
The question remains of how I can ammend the register list in the push
directive when we wish to push an extra dummy register to the stack for
alignment purposes.
A hypothetical solution (albeit a broken one, as we can't redefine macro
arguments in the body of the macro) for when pushing an even no. of
registers + the pac code is:
.if \first != -1
.if \last != -1
.if \savepac
+ .if \align_requested && ( ( (NREG+1) % 2 ) != 0 && (\first != 0) )
+ \first = \first - 1
+ .endif
push {r\first-r\last, ip}
.cfi_adjust_cfa_offset NREG*4 + PAC_CFI_ADJ
+ .if \align_requested && ( ((NREG+1) % 2 ) != 0 && (\first != 0) )
+ /* Having pushed the dummy, restore original register range. */
+ \first = \first + 1
+ .endif
.cfi_offset 143, -PAC_CFI_ADJ
cfisavelist \first, \last
.else
...
The logic above seems correct to me, but I need to figure out a way of
adjusting the value of \first in the least invasive way.
If we emit the push instruction via a separate macro, we could then pass
the \first argument as \first-1 (e.g. pushmacro \first=\first-1,
last=last, savepac=1). Even so, creating a macro just to emit the push
instruction seems overkill. We may end up having way too many macros.
Do you have any insights on how I may be able to fix the above code to
allow for alignment preservation elegantly within the limitations of
what assembler macros can do?
Cheers,
V.
>> + .save {ra_auth_code}
>> + .cfi_def_cfa_offset 4
>> + .cfi_offset 143, -4
>> +#elif __ARM_FEATURE_BTI_DEFAULT
>> + bti
>> +#endif /* __HAVE_PAC_LEAF */
>> + .endm
>> +
>> +/* Macro to handle different branch exchange cases depending on
>> + branch-protection schemes. */
>> + .macro pacbti_epilogue
>> +#if __HAVE_PAC_LEAF
>> + ldr ip, [sp], #4
>> + .cfi_restore 143
>> + .cfi_def_cfa_offset 0
>> + aut ip, lr, sp
>> +#endif /* __HAVE_PAC_LEAF */
>> + bx lr
>> + .endm
>
> I think these macros are really misnamed, firstly, they're only for
> leaf functions and secondly, they (particularly the epilogue) does
> something even if PAC is not needed. So I think they should be
> renamed as 'leaf_prologue' and 'leaf_epilogue' respectively.
>
> In consequence, I think this file should really be merged into the
> existing arm_asm.h, then we don't need yet another header file.
>
> Finally, the header needs to define a (C) macro that defines how much
> to adjust the CFI offset by for various scenarios:
> - When PAC is not used
> - When PAC is used with no alignment
> - When PAC is used and asked for 8-byte alignment to be preserved.
>
> R.
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH v2 1/8] newlib: libc: define M-profile PACBTI-enablement macros
2022-08-16 16:11 ` Victor L. Do Nascimento
@ 2022-08-18 16:28 ` Victor L. Do Nascimento
0 siblings, 0 replies; 18+ messages in thread
From: Victor L. Do Nascimento @ 2022-08-18 16:28 UTC (permalink / raw)
To: Richard Earnshaw; +Cc: newlib, richard.earnshaw
"Victor L. Do Nascimento" <victor.donascimento@arm.com> writes:
> Richard Earnshaw <Richard.Earnshaw@foss.arm.com> writes:
>
>> On 03/08/2022 16:35, Victor Do Nascimento wrote:
>>> Create an assembly header file that conditionally defines fuction
>>> prologues/epilogues depending on the compile-time mbranch-protection
>>> argument values.
>>> * newlib/libc/machine/arm/pacbti.h: New.
>>> ---
>>> newlib/libc/machine/arm/pacbti.h | 64 ++++++++++++++++++++++++++++++++
>>> 1 file changed, 64 insertions(+)
>>> create mode 100644 newlib/libc/machine/arm/pacbti.h
>>> diff --git a/newlib/libc/machine/arm/pacbti.h
>>> b/newlib/libc/machine/arm/pacbti.h
>>> new file mode 100644
>>> index 000000000..4c31d00bc
>>> --- /dev/null
>>> +++ b/newlib/libc/machine/arm/pacbti.h
>>> @@ -0,0 +1,64 @@
>>> +/*
>>> + * Copyright (c) 2022 Arm Ltd
>>> + * All rights reserved.
>>> + *
>>> + * Redistribution and use in source and binary forms, with or without
>>> + * modification, are permitted provided that the following conditions
>>> + * are met:
>>> + * 1. Redistributions of source code must retain the above copyright
>>> + * notice, this list of conditions and the following disclaimer.
>>> + * 2. Redistributions in binary form must reproduce the above copyright
>>> + * notice, this list of conditions and the following disclaimer in the
>>> + * documentation and/or other materials provided with the distribution.
>>> + * 3. The name of the company may not be used to endorse or promote
>>> + * products derived from this software without specific prior written
>>> + * permission.
>>> + *
>>> + * THIS SOFTWARE IS PROVIDED BY ARM LTD ``AS IS'' AND ANY EXPRESS OR IMPLIED
>>> + * WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF
>>> + * MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED.
>>> + * IN NO EVENT SHALL ARM LTD BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
>>> + * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED
>>> + * TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR
>>> + * PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF
>>> + * LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING
>>> + * NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
>>> + * SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
>>> + */
>>> +
>>> +/* Checki whether leaf function PAC signing has been requested
>>> + in the -mbranch-protect compile-time option. */
>>> +#define LEAF_PROTECT_BIT 2
>>> +#define __HAVE_PAC_LEAF \
>>> + __ARM_FEATURE_PAC_DEFAULT & (1 << LEAF_PROTECT_BIT)
>>
>> Either this header needs to avoid polluting the user namespace, or it
>> doesn't. If it does, then LEAF_PROTECT_BIT fails to do that. If it
>> doesn't then __HAVE_PAC_LEAF should really be HAVE_PAC_LEAF (to avoid
>> polluting the reserved namespace. I suspect this header is private to
>> newlib (won't be exported to users), so should not be prefixing names
>> with __.
>>
>>> +
>>> +/* Macro to handle function entry depending on branch-protection
>>> + schemes. */
>>> + .macro pacbti_prologue
>>> +#if __HAVE_PAC_LEAF
>>> +#if __ARM_FEATURE_BTI_DEFAULT
>>> + pacbti ip, lr, sp
>>> +#else
>>> + pac ip, lr, sp
>>> +#endif /* __ARM_FEATURE_BTI_DEFAULT */
>>> + .cfi_register 143, 12
>>> + str ip, [sp, #-4]!
>>
>> This causes the stack to lose 8-byte alignment. Whilst for leaf
>> functions that's probably not a problem, the macro should have an
>> option where the user can ask for alignment to be preserved. But
>> there should also be an option to not save IP on the stack at all (for
>> when the user does not modify IP in the function body).
>
> Work is ongoing on implementing the macro with all necessary features.
>
> A better macro implementation for the prologue is given as follows (but
> still lacks any alignment preservation logic):
>
> /* Emit .cfi_offset directives for a consecutive sequence of registers.
> PAC_CFI_ADJ will be 4 bytes if IP reg is pushed. */
> .macro cfisavelist first, last, index=1
> .cfi_offset \last, -4*(\index) - PAC_CFI_ADJ
> .if \last-\first
> cfisavelist \first, \last-1, \index+1
> .endif
> .endm
>
> /* Create a prologue entry sequence handling PAC/BTI, if required and emitting
> CFI directives for generated PAC code and any pushed registers. */
>
> #define NREG ((\last-\first)+1)
>
> .macro prologue first=-1, last=-1, savepac=PAC_LEAF_PUSH_IP
> #if HAVE_PAC_LEAF
> #if __ARM_FEATURE_BTI_DEFAULT
> pacbti ip, lr, sp
> #else
> pac ip, lr, sp
> #endif /* __ARM_FEATURE_BTI_DEFAULT */
> .cfi_register 143, 12
> #else
> #if __ARM_FEATURE_BTI_DEFAULT
> bti
> #endif /* __ARM_FEATURE_BTI_DEFAULT */
> #endif /* HAVE_PAC_LEAF */
> .if \first != -1
> .if \last != -1
> .if \savepac
> push {r\first-r\last, ip}
> .cfi_adjust_cfa_offset NREG*4 + PAC_CFI_ADJ
> .cfi_offset 143, -PAC_CFI_ADJ
> cfisavelist \first, \last
> .else
> push {r\first-r\last}
> .cfi_adjust_cfa_offset NREG*4
> cfisavelist \first, \last
> .endif
> .else
> .if \savepac
> push {r\first, ip}
> .cfi_adjust_cfa_offset 4 + PAC_CFI_ADJ
> .cfi_offset 143, -PAC_CFI_ADJ
> cfisavelist \first, \first
> .else // !\savepac
> push {r\first}
> cfisavelist \first, \first
> .endif
> .endif
> .else // \first == -1
> .if \savepac
> push {ip}
> .cfi_adjust_cfa_offset PAC_CFI_ADJ
> .cfi_offset 143, -PAC_CFI_ADJ
> .endif
> .endif
> .endm
>
> The question remains of how I can ammend the register list in the push
> directive when we wish to push an extra dummy register to the stack for
> alignment purposes.
>
> A hypothetical solution (albeit a broken one, as we can't redefine macro
> arguments in the body of the macro) for when pushing an even no. of
> registers + the pac code is:
>
> .if \first != -1
> .if \last != -1
> .if \savepac
>
> + .if \align_requested && ( ( (NREG+1) % 2 ) != 0 && (\first != 0) )
> + \first = \first - 1
> + .endif
>
> push {r\first-r\last, ip}
> .cfi_adjust_cfa_offset NREG*4 + PAC_CFI_ADJ
>
> + .if \align_requested && ( ((NREG+1) % 2 ) != 0 && (\first != 0) )
> + /* Having pushed the dummy, restore original register range. */
> + \first = \first + 1
> + .endif
>
> .cfi_offset 143, -PAC_CFI_ADJ
> cfisavelist \first, \last
>
> .else
> ...
>
> The logic above seems correct to me, but I need to figure out a way of
> adjusting the value of \first in the least invasive way.
>
> If we emit the push instruction via a separate macro, we could then pass
> the \first argument as \first-1 (e.g. pushmacro \first=\first-1,
> last=last, savepac=1). Even so, creating a macro just to emit the push
> instruction seems overkill. We may end up having way too many macros.
>
> Do you have any insights on how I may be able to fix the above code to
> allow for alignment preservation elegantly within the limitations of
> what assembler macros can do?
>
> Cheers,
> V.
>
I've also been thinking about simplifying the align logic, such that if
we request alignment preservation, it always pushes a dummy register along
w/ the pac code. This way, we don't preserve alignment, but rather
alignment "parity.
If in the absence of pushing the pac-code a simple push operation would
result in a misaligned stack pointer, I believe that adding the pac code
to the register list should also result in the preservation of
misalignment.
I think that if in a hypothetical situation code gets written such
that the original prologue results in a misaligned stack, it's correct
to assume that the developer is aware of this and will manually
compensate for it later on in the code if alignment is needed. The
risk I see is that having too much hidden logic in our prologue macro
might result in unforseen bugs for coders with insufficient familiarity
with it.
The alignment preservation feature should behave as follows: "With this
turned on, no perceptable difference is observed in program behaviour
irrespective of the presence/absence of the pac code on the stack."
>>> + .save {ra_auth_code}
>>> + .cfi_def_cfa_offset 4
>>> + .cfi_offset 143, -4
>>> +#elif __ARM_FEATURE_BTI_DEFAULT
>>> + bti
>>> +#endif /* __HAVE_PAC_LEAF */
>>> + .endm
>>> +
>>> +/* Macro to handle different branch exchange cases depending on
>>> + branch-protection schemes. */
>>> + .macro pacbti_epilogue
>>> +#if __HAVE_PAC_LEAF
>>> + ldr ip, [sp], #4
>>> + .cfi_restore 143
>>> + .cfi_def_cfa_offset 0
>>> + aut ip, lr, sp
>>> +#endif /* __HAVE_PAC_LEAF */
>>> + bx lr
>>> + .endm
>>
>> I think these macros are really misnamed, firstly, they're only for
>> leaf functions and secondly, they (particularly the epilogue) does
>> something even if PAC is not needed. So I think they should be
>> renamed as 'leaf_prologue' and 'leaf_epilogue' respectively.
>>
>> In consequence, I think this file should really be merged into the
>> existing arm_asm.h, then we don't need yet another header file.
>>
>> Finally, the header needs to define a (C) macro that defines how much
>> to adjust the CFI offset by for various scenarios:
>> - When PAC is not used
>> - When PAC is used with no alignment
>> - When PAC is used and asked for 8-byte alignment to be preserved.
This creates anoter level of complexity. As you mention that users
should have the option of whether or not to push the pac code onto the stack
when it won't be overwritten over the course of a leaf fuction, the
logic for this C macro will take this into account. This leads us to
create something as follows:
#if HAVE_PAC_LEAF /* Pac code requested for leaf functions. */
# if PAC_LEAF_PUSH_IP /* Always push pac code to the stack. */
# if ALIGN_PRESERVE
# define PAC_CFI_ADJ 8
# else
# define PAC_CFI_ADJ 4
# endif
# else
# define PAC_CFI_ADJ 0
# endif /* PAC_LEAF_PUSH_IP*/
#else
# define PAC_CFI_ADJ 0
#endif /* HAVE_PAC_LEAF */
Thus, if PAC_LEAF_PUSH_IP is not requested, PAC_CFI_ADJ is set to 0.
This creates a problem for functions where pushing the pac code onto
the stack is mandatory, as is the case for functions where the register
storing it will be reused in the body of the function.
In That case, we need a second macro which is independent of the value
of PAC_LEAF_PUSH_IP. This leads to our second macro to be used for such
functions:
#if HAVE_PAC_LEAF
# if ALIGN_PRESERVE
# define PAC_CFI_ADJ_DEFAULT 8
# else
# define PAC_CFI_ADJ_DEFAULT 4
# endif
#endif
and developers would have to choose correctly whether to write something
like
.cfi_offset <reg>, -(<base_offset>+PAC_CFI_ADJ)
or
.cfi_offset <reg>, -(<base_offset>+PAC_CFI_ADJ_DEFAULT)
Does this sound remotely acceptable?
We could use an assembler macro rather than C macro and calculate the
offset based on some passed argument, but either way we find ourselves
catering for a growing number of use cases and with it the complexity of
any solution grows.
I just thought I'd ask what your thoughts were on these matters.
V.
>>
>> R.
^ permalink raw reply [flat|nested] 18+ messages in thread
end of thread, other threads:[~2022-08-18 16:28 UTC | newest]
Thread overview: 18+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-08-03 15:35 [PATCH v2 0/8] Implement assembly cortex-M PACBTI functionality Victor Do Nascimento
2022-08-03 15:35 ` [PATCH v2 1/8] newlib: libc: define M-profile PACBTI-enablement macros Victor Do Nascimento
2022-08-04 15:19 ` Richard Earnshaw
2022-08-05 9:21 ` Victor L. Do Nascimento
2022-08-16 16:11 ` Victor L. Do Nascimento
2022-08-18 16:28 ` Victor L. Do Nascimento
2022-08-03 15:35 ` [PATCH v2 2/8] newlib: libc: strcmp M-profile PACBTI-enablement Victor Do Nascimento
2022-08-04 15:48 ` Richard Earnshaw
2022-08-05 10:51 ` Victor L. Do Nascimento
2022-08-05 14:34 ` Richard Earnshaw
2022-08-05 14:58 ` Victor L. Do Nascimento
2022-08-05 15:15 ` Richard Earnshaw
2022-08-03 15:35 ` [PATCH v2 3/8] newlib: libc: strlen " Victor Do Nascimento
2022-08-03 15:35 ` [PATCH v2 4/8] newlib: libc: memchr " Victor Do Nascimento
2022-08-03 15:35 ` [PATCH v2 5/8] newlib: libc: memcpy " Victor Do Nascimento
2022-08-03 15:35 ` [PATCH v2 6/8] newlib: libc: setjmp/longjmp " Victor Do Nascimento
2022-08-03 15:35 ` [PATCH v2 7/8] newlib: libc: aeabi_memmove " Victor Do Nascimento
2022-08-03 15:35 ` [PATCH v2 8/8] newlib: libc: aeabi_memset " Victor Do Nascimento
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).