From: Richard Earnshaw
Date: Tue, 13 Dec 2022 15:54:34 +0000
Subject: Re: [PATCH] libc: arm: fix setjmp abi non-conformance
To: "Victor L. Do Nascimento" , newlib@sourceware.org
Cc: richard.earnshaw@arm.com
Message-ID: <29d883c3-73b4-c6c4-2b8f-52036496b7cd@foss.arm.com>

On 13/12/2022 14:51, Victor L. Do Nascimento wrote:
> As per the arm Procedure Call Standard for the Arm Architecture
> section 6.1.2 [1], VFP registers s16-s31 (d8-d15, q4-q7) must be
> preserved across subroutine calls.
>
> The current setjmp/longjmp implementations preserve only the core
> registers, with the jump buffer size too small to store the required
> co-processor registers.
>
> In accordance with the C Library ABI for the Arm Architecture
> section 6.11 [2], this patch sets _JBTYPE to long long adjusting
> _JBLEN to 20.
>
> It also emits vfp load/store instructions depending on architectural
> support, predicated at compile time on ACLE feature-test macros.
>

Pushed.

Would you mind writing a short entry for the top-level NEWS file, as
this is technically an ABI break? Just point out that the size and
alignment of jmp_buf have been changed to conform with the ABI and to
fix a bug with saving floating-point registers.

R.

> [1] https://github.com/ARM-software/abi-aa/blob/main/aapcs32/aapcs32.rst
> [2] https://github.com/ARM-software/abi-aa/blob/main/clibabi32/clibabi32.rst
> ---
> COPYING.NEWLIB | 2 +-
> newlib/libc/include/machine/setjmp.h | 8 ++-
> newlib/libc/machine/arm/setjmp.S | 74 +++++++++++++++-------------
> 3 files changed, 46 insertions(+), 38 deletions(-)
>
> diff --git a/COPYING.NEWLIB b/COPYING.NEWLIB
> index 2d1473639..d54ed293d 100644
> --- a/COPYING.NEWLIB
> +++ b/COPYING.NEWLIB
> @@ -762,7 +762,7 @@ SUCH DAMAGE.
>
> (35) - Arm Ltd
>
> - Copyright (c) 2009-2018 Arm Ltd
> + Copyright (c) 2009-2022 Arm Ltd
> All rights reserved.
>
> Redistribution and use in source and binary forms, with or without
> diff --git a/newlib/libc/include/machine/setjmp.h b/newlib/libc/include/machine/setjmp.h
> index 53878a03d..29b76cec1 100644
> --- a/newlib/libc/include/machine/setjmp.h
> +++ b/newlib/libc/include/machine/setjmp.h
> @@ -12,9 +12,13 @@ _BEGIN_STD_C
> #if defined(__arm__) || defined(__thumb__)
> /*
> * All callee preserved registers:
> - * v1 - v7, fp, ip, sp, lr, f4, f5, f6, f7
> + * core registers:
> + * r4 - r10, fp, sp, lr
> + * VFP registers (architectural support dependent):
> + * d8 - d15
> */
> -#define _JBLEN 23
> +#define _JBLEN 20
> +#define _JBTYPE long long
> #endif
>
> #if defined(__aarch64__)
> diff --git a/newlib/libc/machine/arm/setjmp.S b/newlib/libc/machine/arm/setjmp.S
> index 21d6ff9e7..4cf0a8e3f 100644
> --- a/newlib/libc/machine/arm/setjmp.S
> +++ b/newlib/libc/machine/arm/setjmp.S
> @@ -27,34 +27,34 @@
> The interworking scheme expects functions to use a BX instruction
> to return control to their parent. Since we need this code to work
> in both interworked and non-interworked environments as well as with
> - older processors which do not have the BX instruction we do the
> + older processors which do not have the BX instruction we do the
> following:
> Test the return address.
> If the bottom bit is clear perform an "old style" function exit.
> (We know that we are in ARM mode and returning to an ARM mode caller).
> Otherwise use the BX instruction to perform the function exit.
>
> - We know that we will never attempt to perform the BX instruction on
> - an older processor, because that kind of processor will never be
> - interworked, and a return address with the bottom bit set will never
> + We know that we will never attempt to perform the BX instruction on
> + an older processor, because that kind of processor will never be
> + interworked, and a return address with the bottom bit set will never
> be generated.
>
> In addition, we do not actually assemble the BX instruction as this would
> require us to tell the assembler that the processor is an ARM7TDMI and
> it would store this information in the binary. We want this binary to be
> able to be linked with binaries compiled for older processors however, so
> - we do not want such information stored there.
> + we do not want such information stored there.
>
> If we are running using the APCS-26 convention however, then we never
> - test the bottom bit, because this is part of the processor status.
> - Instead we just do a normal return, since we know that we cannot be
> + test the bottom bit, because this is part of the processor status.
> + Instead we just do a normal return, since we know that we cannot be
> returning to a Thumb caller - the Thumb does not support APCS-26.
> -
> - Function entry is much simpler. If we are compiling for the Thumb we
> +
> + Function entry is much simpler. If we are compiling for the Thumb we
> just switch into ARM mode and then drop through into the rest of the
> function. The function exit code will take care of the restore to
> Thumb mode.
> -
> +
> For Thumb-2 do everything in Thumb mode. */
>
> .syntax unified
> @@ -115,15 +115,15 @@ SYM (longjmp):
> #else
> #define RET tst lr, #1; \
> moveq pc, lr ; \
> -.word 0xe12fff1e /* bx lr */
> +.inst 0xe12fff1e /* bx lr */
> #endif
>
> #ifdef __thumb2__
> -.macro COND where when
> +.macro COND where when
> i\where \when
> .endm
> #else
> -.macro COND where when
> +.macro COND where when
> .endm
> #endif
>
> @@ -140,7 +140,7 @@ SYM (longjmp):
> .macro PROLOGUE name
> .code 16
> bx pc
> - nop
> + nop
> .code 32
> SYM (.arm_start_of.\name):
> .endm
> @@ -149,7 +149,7 @@ SYM (.arm_start_of.\name):
> .macro PROLOGUE name
> .endm
> #endif
> -
> +
> .macro FUNC_START name
> .text
> .align 2
> @@ -164,61 +164,65 @@ SYM (\name):
> RET
> SIZE (\name)
> .endm
> -
> +
> /* --------------------------------------------------------------------
> - int setjmp (jmp_buf);
> + int setjmp (jmp_buf);
> -------------------------------------------------------------------- */
> -
> +
> FUNC_START setjmp
>
> /* Save all the callee-preserved registers into the jump buffer. */
> #ifdef __thumb2__
> mov ip, sp
> - stmea a1!, { v1-v7, fp, ip, lr }
> + stmia r0!, { r4-r10, fp, ip, lr }
> #else
> - stmea a1!, { v1-v7, fp, ip, sp, lr }
> + stmia r0!, { r4-r10, fp, sp, lr }
> +#endif
> +#if defined __ARM_FP || defined __ARM_FEATURE_MVE
> + vstm r0, { d8-d15 }
> #endif
> -
> +
> #if 0 /* Simulator does not cope with FP instructions yet. */
> #ifndef __SOFTFP__
> /* Save the floating point registers. */
> sfmea f4, 4, [a1]
> #endif
> -#endif
> +#endif
> /* When setting up the jump buffer return 0. */
> - mov a1, #0
> + mov r0, #0
>
> FUNC_END setjmp
> -
> +
> /* --------------------------------------------------------------------
> volatile void longjmp (jmp_buf, int);
> -------------------------------------------------------------------- */
> -
> +
> FUNC_START longjmp
>
> /* If we have stack extension code it ought to be handled here. */
> -
> +
> /* Restore the registers, retrieving the state when setjmp() was called. */
> #ifdef __thumb2__
> - ldmfd a1!, { v1-v7, fp, ip, lr }
> + ldmia r0!, { r4-r10, fp, ip, lr }
> mov sp, ip
> #else
> - ldmfd a1!, { v1-v7, fp, ip, sp, lr }
> + ldmia r0!, { r4-r10, fp, sp, lr }
> +#endif
> +#if defined __ARM_FP || defined __ARM_FEATURE_MVE
> + vldm r0, { d8-d15 }
> #endif
> -
> +
> #if 0 /* Simulator does not cope with FP instructions yet. */
> #ifndef __SOFTFP__
> /* Restore floating point registers as well. */
> lfmfd f4, 4, [a1]
> #endif
> -#endif
> +#endif
> /* Put the return value into the integer result register.
> - But if it is zero then return 1 instead. */
> - movs a1, a2
> -#ifdef __thumb2__
> + But if it is zero then return 1 instead. */
> + movs r0, r1
> it eq
> -#endif
> - moveq a1, #1
> + moveq r0, #1
>
> FUNC_END longjmp
> #endif
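
For the NEWS entry, the size change can be sanity-checked with a rough
C sketch of the arithmetic. This is an illustration only, not code from
the patch: old_jmp_buf, new_jmp_buf, bytes_saved and the two *_fits
helpers are hypothetical names, and it assumes a 4-byte int, an 8-byte
long long, and that the old jmp_buf used int elements (taken here to be
the default when _JBTYPE is not defined).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the two layouts; not newlib's real typedefs. */
typedef int       old_jmp_buf[23];  /* before: assumed int elements, _JBLEN 23 */
typedef long long new_jmp_buf[20];  /* after: _JBTYPE long long, _JBLEN 20 */

enum {
    CORE_REGS = 10, /* r4-r10, fp, sp, lr as stored by the stmia */
    VFP_REGS  = 8   /* d8-d15 as stored by the vstm */
};

/* Bytes written by the patched setjmp when VFP state is saved:
   32-bit core registers followed by 64-bit D registers. */
static size_t bytes_saved(void)
{
    return CORE_REGS * 4u + VFP_REGS * 8u;
}

/* Non-zero if the old 23-element buffer could hold that state. */
static int old_buffer_fits(void)
{
    return sizeof(old_jmp_buf) >= bytes_saved();
}

/* Non-zero if the new 20-element long long buffer can hold it. */
static int new_buffer_fits(void)
{
    return sizeof(new_jmp_buf) >= bytes_saved();
}
```

Under those assumptions the saved state needs 104 bytes, the old buffer
provides 92 and the new one 160, which is the size change the NEWS entry
should mention. The element type moving to long long is also what
changes the alignment.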