From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from foss.arm.com (foss.arm.com [217.140.110.172]) by sourceware.org (Postfix) with ESMTP id 3CD0E3858C27 for ; Tue, 22 Nov 2022 15:33:06 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 3CD0E3858C27 Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=foss.arm.com Authentication-Results: sourceware.org; spf=none smtp.mailfrom=foss.arm.com Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 7CE641FB; Tue, 22 Nov 2022 07:33:12 -0800 (PST) Received: from [10.2.78.76] (unknown [10.2.78.76]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id 7233C3F73D; Tue, 22 Nov 2022 07:33:05 -0800 (PST) Message-ID: Date: Tue, 22 Nov 2022 15:33:04 +0000 MIME-Version: 1.0 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101 Thunderbird/102.4.2 Subject: Re: [PATCH v4 4/8] newlib: libc: memchr M-profile PACBTI-enablement Content-Language: en-GB To: "Victor L. Do Nascimento" , newlib@sourceware.org Cc: Richard.Earnshaw@arm.com References: From: Richard Earnshaw In-Reply-To: Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit X-Spam-Status: No, score=-3495.8 required=5.0 tests=BAYES_00,GIT_PATCH_0,KAM_DMARC_STATUS,KAM_LAZY_DOMAIN_SECURITY,NICE_REPLY_A,SPF_HELO_NONE,SPF_NONE,TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org List-Id: On 26/10/2022 12:49, Victor L. Do Nascimento wrote: > Add function prologue/epilogue to conditionally add BTI landing pads > and/or PAC code generation & authentication instructions depending on > compilation flags. > --- > newlib/libc/machine/arm/memchr.S | 42 +++++++++++++++++++++++++++----- > 1 file changed, 36 insertions(+), 6 deletions(-) > > diff --git a/newlib/libc/machine/arm/memchr.S b/newlib/libc/machine/arm/memchr.S > index 1a4c6512c..5b051123d 100644 > --- a/newlib/libc/machine/arm/memchr.S > +++ b/newlib/libc/machine/arm/memchr.S > @@ -76,6 +76,7 @@ > .syntax unified > > #include "acle-compat.h" > +#include "arm_asm.h" > > @ NOTE: This ifdef MUST match the one in memchr-stub.c > #if defined (__ARM_NEON__) || defined (__ARM_NEON) > @@ -267,10 +268,14 @@ memchr: > #elif __ARM_ARCH_ISA_THUMB >= 2 && defined (__ARM_FEATURE_DSP) > > #if __ARM_ARCH_PROFILE == 'M' > - .arch armv7e-m > +#if __ARM_ARCH >= 8 > + /* keep config inherited from -march=. */ > #else > - .arch armv6t2 > -#endif > + .arch armv7e-m > +#endif /* __ARM_ARCH >= 8 */ > +#else > + .arch armv6t2 > +#endif /* __ARM_ARCH_PROFILE == 'M' */ > > @ this lets us check a flag in a 00/ff byte easily in either endianness > #ifdef __ARMEB__ > @@ -287,11 +292,14 @@ memchr: > .p2align 4,,15 > .global memchr > .type memchr,%function > + .fnstart > + .cfi_startproc > memchr: > @ r0 = start of memory to scan > @ r1 = character to look for > @ r2 = length > @ returns r0 = pointer to character or NULL if not found > + prologue > and r1,r1,#0xff @ Don't trust the caller to pass a char > > cmp r2,#16 @ If short don't bother with anything clever > @@ -313,6 +321,11 @@ memchr: > 10: > @ We are aligned, we know we have at least 8 bytes to work with > push {r4,r5,r6,r7} > + .cfi_adjust_cfa_offset 16 > + .cfi_rel_offset 4, 0 > + .cfi_rel_offset 5, 4 > + .cfi_rel_offset 6, 8 > + .cfi_rel_offset 7, 12 > orr r1, r1, r1, lsl #8 @ expand the match word across all bytes > orr r1, r1, r1, lsl #16 > bic r4, r2, #7 @ Number of double words to work with * 8 > @@ -334,6 +347,11 @@ memchr: > bne 15b @ (Flags from the subs above) > > pop {r4,r5,r6,r7} > + .cfi_restore 7 > + .cfi_restore 6 > + .cfi_restore 5 > + .cfi_restore 4 > + .cfi_adjust_cfa_offset -16 > and r1,r1,#0xff @ r1 back to a single character > and r2,r2,#7 @ Leave the count remaining as the number > @ after the double words have been done > @@ -349,17 +367,21 @@ memchr: > bne 21b @ on r2 flags > > 40: > + .cfi_remember_state > movs r0,#0 @ not found > - bx lr > + epilogue > > 50: > + .cfi_restore_state > + .cfi_remember_state > subs r0,r0,#1 @ found > - bx lr > + epilogue > > 60: @ We're here because the fast path found a hit > @ now we have to track down exactly which word it was > @ r0 points to the start of the double word after the one tested > @ r5 has the 00/ff pattern for the first word, r6 has the chained value > + .cfi_restore_state This restores the wrong state. label 60 is branched to from within the hot loop where r4-r7 have been saved on the stack, so at the very least you'll need to add some additional directives here to correctly describe those still being on the stack. You can probably get away with .cfi_restore_state @ Standard post-prologue state .cfi_adjust_cfa_offset 16 .cfi_rel_offset 4, 0 .cfi_rel_offset 5, 4 .cfi_rel_offset 6, 8 .cfi_rel_offset 7, 12 > cmp r5, #0 > itte eq > moveq r5, r6 @ the end is in the 2nd word > @@ -379,8 +401,16 @@ memchr: > > 61: > pop {r4,r5,r6,r7} > + .cfi_restore 7 > + .cfi_restore 6 > + .cfi_restore 5 > + .cfi_restore 4 > + .cfi_adjust_cfa_offset -16 > subs r0,r0,#1 > - bx lr > + epilogue > + .cfi_endproc > + .cantunwind > + .fnend > #else > /* Defined in memchr-stub.c. */ > #endif R.