From mboxrd@z Thu Jan 1 00:00:00 1970
From: Sebastian Huber
To: newlib@sourceware.org
Subject: [PATCH] powerpc/setjmp: Fix 64-bit support
Date: Thu, 27 Oct 2022 09:17:29 +0200
Message-Id: <20221027071729.47233-1-sebastian.huber@embedded-brains.de>
X-Mailer: git-send-email 2.35.3

The first attempt to support the 64-bit mode had two bugs:

1. The saved general-purpose register 31 value was overwritten with the
saved link register value.

2. The link register was saved and restored using 32-bit instructions.

Use 64-bit store/load instructions to save/restore the link register.  Make
sure that the general-purpose register 31 and the link register storage areas
do not overlap.
---
 newlib/libc/machine/powerpc/setjmp.S | 129 +++++++++++++++------------
 1 file changed, 72 insertions(+), 57 deletions(-)

diff --git a/newlib/libc/machine/powerpc/setjmp.S b/newlib/libc/machine/powerpc/setjmp.S
index dc8b239a9..ee486a6ef 100644
--- a/newlib/libc/machine/powerpc/setjmp.S
+++ b/newlib/libc/machine/powerpc/setjmp.S
@@ -42,30 +42,34 @@ FUNC_START(setjmp)
 	   store instruction uses an offset of 4.
 	*/
 	addi 3,3,164
 #elif __powerpc64__
-	/* In the first store, add 16 to r3 so that the subsequent floating
+	/* In the first store, add 8 to r3 so that the subsequent floating
 	   point stores are aligned on an 8 byte boundary and the Altivec
 	   stores are aligned on a 16 byte boundary.  */
-	stdu 1,16(3)	# offset 16
-	stdu 2,8(3)	# offset 24
-	stdu 13,8(3)	# offset 32
-	stdu 14,8(3)	# offset 40
-	stdu 15,8(3)	# offset 48
-	stdu 16,8(3)	# offset 56
-	stdu 17,8(3)	# offset 64
-	stdu 18,8(3)	# offset 72
-	stdu 19,8(3)	# offset 80
-	stdu 20,8(3)	# offset 88
-	stdu 21,8(3)	# offset 96
-	stdu 22,8(3)	# offset 104
-	stdu 23,8(3)	# offset 112
-	stdu 24,8(3)	# offset 120
-	stdu 25,8(3)	# offset 128
-	stdu 26,8(3)	# offset 136
-	stdu 27,8(3)	# offset 144
-	stdu 28,8(3)	# offset 152
-	stdu 29,8(3)	# offset 160
-	stdu 30,8(3)	# offset 168
-	stdu 31,8(3)	# offset 176
+	stdu 1,8(3)	# offset 8
+	stdu 2,8(3)	# offset 16
+	stdu 13,8(3)	# offset 24
+	stdu 14,8(3)	# offset 32
+	stdu 15,8(3)	# offset 40
+	stdu 16,8(3)	# offset 48
+	stdu 17,8(3)	# offset 56
+	stdu 18,8(3)	# offset 64
+	stdu 19,8(3)	# offset 72
+	stdu 20,8(3)	# offset 80
+	stdu 21,8(3)	# offset 88
+	stdu 22,8(3)	# offset 96
+	stdu 23,8(3)	# offset 104
+	stdu 24,8(3)	# offset 112
+	stdu 25,8(3)	# offset 120
+	stdu 26,8(3)	# offset 128
+	stdu 27,8(3)	# offset 136
+	stdu 28,8(3)	# offset 144
+	stdu 29,8(3)	# offset 152
+	stdu 30,8(3)	# offset 160
+	stdu 31,8(3)	# offset 168
+	mflr 4
+	stdu 4,8(3)	# offset 176
+	mfcr 4
+	stwu 4,8(3)	# offset 184
 #else
 	stw 1,0(3)	# offset 0
 	stwu 2,4(3)	# offset 4
@@ -90,20 +94,16 @@ FUNC_START(setjmp)
 	stwu 31,4(3)	# offset 80
 #endif

+#if !__powerpc64__
 	/* If __SPE__, then add 84 to the offset shown from this point on until
 	   the end of this function.  This difference comes from the fact that
-	   we save 21 64-bit registers instead of 21 32-bit registers above.
-
-	   If __powerpc64__, then add 96 to the offset shown from this point on until
-	   the end of this function.  This difference comes from the fact that
-	   we save 21 64-bit registers instead of 21 32-bit registers above and
-	   we take alignement requirements of floating point and Altivec stores
-	   into account.  */
+	   we save 21 64-bit registers instead of 21 32-bit registers above.  */
 	mflr 4
 	stwu 4,4(3)	# offset 84
 	mfcr 4
 	stwu 4,4(3)	# offset 88
 	# one word pad to get floating point aligned on 8 byte boundary
+#endif

 	/* Check whether we need to save FPRs.  Checking __NO_FPRS__
 	   on its own would be enough for GCC 4.1 and above, but older
@@ -117,6 +117,13 @@ FUNC_START(setjmp)
 	andi. 5,5,0x2000
 	beq 1f
 #endif
+
+	/* If __powerpc64__, then add 96 to the offset shown from this point on until
+	   the end of this function.  This difference comes from the fact that
+	   we save 22 64-bit registers instead of 22 32-bit registers above and
+	   we take alignment requirements of floating point and Altivec stores
+	   into account.  */
+
 	stfdu 14,8(3)	# offset 96
 	stfdu 15,8(3)	# offset 104
 	stfdu 16,8(3)	# offset 112
@@ -220,30 +227,34 @@ FUNC_START(longjmp)
 	   load instruction uses an offset of 4.
 	*/
 	addi 3,3,164
 #elif __powerpc64__
-	/* In the first load, add 16 to r3 so that the subsequent floating
+	/* In the first load, add 8 to r3 so that the subsequent floating
 	   point loades are aligned on an 8 byte boundary and the Altivec
 	   loads are aligned on a 16 byte boundary.
 	*/
-	ldu 1,16(3)	# offset 16
-	ldu 2,8(3)	# offset 24
-	ldu 13,8(3)	# offset 32
-	ldu 14,8(3)	# offset 40
-	ldu 15,8(3)	# offset 48
-	ldu 16,8(3)	# offset 56
-	ldu 17,8(3)	# offset 64
-	ldu 18,8(3)	# offset 72
-	ldu 19,8(3)	# offset 80
-	ldu 20,8(3)	# offset 88
-	ldu 21,8(3)	# offset 96
-	ldu 22,8(3)	# offset 104
-	ldu 23,8(3)	# offset 112
-	ldu 24,8(3)	# offset 120
-	ldu 25,8(3)	# offset 128
-	ldu 26,8(3)	# offset 136
-	ldu 27,8(3)	# offset 144
-	ldu 28,8(3)	# offset 152
-	ldu 29,8(3)	# offset 160
-	ldu 30,8(3)	# offset 168
-	ldu 31,8(3)	# offset 176
+	ldu 1,8(3)	# offset 8
+	ldu 2,8(3)	# offset 16
+	ldu 13,8(3)	# offset 24
+	ldu 14,8(3)	# offset 32
+	ldu 15,8(3)	# offset 40
+	ldu 16,8(3)	# offset 48
+	ldu 17,8(3)	# offset 56
+	ldu 18,8(3)	# offset 64
+	ldu 19,8(3)	# offset 72
+	ldu 20,8(3)	# offset 80
+	ldu 21,8(3)	# offset 88
+	ldu 22,8(3)	# offset 96
+	ldu 23,8(3)	# offset 104
+	ldu 24,8(3)	# offset 112
+	ldu 25,8(3)	# offset 120
+	ldu 26,8(3)	# offset 128
+	ldu 27,8(3)	# offset 136
+	ldu 28,8(3)	# offset 144
+	ldu 29,8(3)	# offset 152
+	ldu 30,8(3)	# offset 160
+	ldu 31,8(3)	# offset 168
+	ldu 5,8(3)	# offset 176
+	mtlr 5
+	lwzu 5,8(3)	# offset 184
+	mtcrf 255,5
 #else
 	lwz 1,0(3)	# offset 0
 	lwzu 2,4(3)	# offset 4
@@ -269,18 +280,15 @@ FUNC_START(longjmp)
 #endif
 	/* If __SPE__, then add 84 to the offset shown from this point on until
 	   the end of this function.  This difference comes from the fact that
-	   we restore 21 64-bit registers instead of 21 32-bit registers above.
+	   we restore 22 64-bit registers instead of 22 32-bit registers above.  */

-	   If __powerpc64__, then add 96 to the offset shown from this point on until
-	   the end of this function.  This difference comes from the fact that
-	   we restore 21 64-bit registers instead of 21 32-bit registers above and
-	   we take alignement requirements of floating point and Altivec loads
-	   into account.  */
+#if !__powerpc64__
 	lwzu 5,4(3)	# offset 84
 	mtlr 5
 	lwzu 5,4(3)	# offset 88
 	mtcrf 255,5
 	# one word pad to get floating point aligned on 8 byte boundary
+#endif

 	/* Check whether we need to restore FPRs.  Checking
 	   __NO_FPRS__ on its own would be enough for GCC 4.1 and
@@ -292,6 +300,13 @@ FUNC_START(longjmp)
 	andi. 5,5,0x2000
 	beq 1f
 #endif
+
+	/* If __powerpc64__, then add 96 to the offset shown from this point on until
+	   the end of this function.  This difference comes from the fact that
+	   we restore 21 64-bit registers instead of 21 32-bit registers above and
+	   we take alignment requirements of floating point and Altivec loads
+	   into account.  */
+
 	lfdu 14,8(3)	# offset 96
 	lfdu 15,8(3)	# offset 104
 	lfdu 16,8(3)	# offset 112
-- 
2.35.3
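
For reference, below is a minimal reproducer sketch in C for the failure modes
the patch describes.  It is hypothetical and not part of this patch or the
newlib test suite; the name "keep" and the bit pattern are illustrative.  C
cannot pin a variable to r31, so the register-clobber check is only likely to
fire, but simply returning through longjmp() on a 64-bit target whose text
addresses do not fit in 32 bits exercises the link register save/restore width.

#include <setjmp.h>
#include <stdio.h>
#include <stdlib.h>

static jmp_buf env;

static void do_jump(void)
{
	/* longjmp() reloads the callee-saved GPRs and the link register
	   from env; with the old 64-bit code the LR was stored and loaded
	   as a 32-bit word that landed inside the slot just used for r31.  */
	longjmp(env, 1);
}

int main(void)
{
	/* Deliberately not volatile: a volatile variable lives in memory and
	   would hide a clobbered callee-saved register.  The value is set
	   before setjmp() and never modified afterwards, so the C standard
	   requires it to survive the longjmp().  */
	long keep = 0x1234567890abcdefL;

	if (setjmp(env) == 0)
		do_jump();	/* getting back here needs a full 64-bit LR */

	if (keep != 0x1234567890abcdefL) {
		puts("FAIL: callee-saved register clobbered by longjmp");
		return EXIT_FAILURE;
	}
	puts("PASS");
	return EXIT_SUCCESS;
}

With the pre-patch code, a run of this sketch could fail in either way: the
truncated return address sends longjmp() to the wrong place, or keep comes back
corrupted because the 32-bit LR store overlapped the saved r31 doubleword.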