From mboxrd@z Thu Jan 1 00:00:00 1970
From: Sebastian Huber
To: newlib@sourceware.org
Subject: [PATCH] powerpc/setjmp: Fix 64-bit support
Date: Thu, 27 Oct 2022 09:17:29 +0200
Message-Id: <20221027071729.47233-1-sebastian.huber@embedded-brains.de>
X-Mailer: git-send-email 2.35.3

The first attempt to support the 64-bit mode had two bugs:

1. The saved general-purpose register 31 value was overwritten with the
saved link register value.

2. The link register was saved and restored using 32-bit instructions.

Use 64-bit store/load instructions to save/restore the link register.  Make
sure that the general-purpose register 31 and the link register storage areas
do not overlap.
---
 newlib/libc/machine/powerpc/setjmp.S | 129 +++++++++++++++------------
 1 file changed, 72 insertions(+), 57 deletions(-)

diff --git a/newlib/libc/machine/powerpc/setjmp.S b/newlib/libc/machine/powerpc/setjmp.S
index dc8b239a9..ee486a6ef 100644
--- a/newlib/libc/machine/powerpc/setjmp.S
+++ b/newlib/libc/machine/powerpc/setjmp.S
@@ -42,30 +42,34 @@ FUNC_START(setjmp)
 	   store instruction uses an offset of 4.
 	*/
 	addi 3,3,164
 #elif __powerpc64__
-	/* In the first store, add 16 to r3 so that the subsequent floating
+	/* In the first store, add 8 to r3 so that the subsequent floating
 	   point stores are aligned on an 8 byte boundary and the Altivec
 	   stores are aligned on a 16 byte boundary.  */
-	stdu 1,16(3)	# offset 16
-	stdu 2,8(3)	# offset 24
-	stdu 13,8(3)	# offset 32
-	stdu 14,8(3)	# offset 40
-	stdu 15,8(3)	# offset 48
-	stdu 16,8(3)	# offset 56
-	stdu 17,8(3)	# offset 64
-	stdu 18,8(3)	# offset 72
-	stdu 19,8(3)	# offset 80
-	stdu 20,8(3)	# offset 88
-	stdu 21,8(3)	# offset 96
-	stdu 22,8(3)	# offset 104
-	stdu 23,8(3)	# offset 112
-	stdu 24,8(3)	# offset 120
-	stdu 25,8(3)	# offset 128
-	stdu 26,8(3)	# offset 136
-	stdu 27,8(3)	# offset 144
-	stdu 28,8(3)	# offset 152
-	stdu 29,8(3)	# offset 160
-	stdu 30,8(3)	# offset 168
-	stdu 31,8(3)	# offset 176
+	stdu 1,8(3)	# offset 8
+	stdu 2,8(3)	# offset 16
+	stdu 13,8(3)	# offset 24
+	stdu 14,8(3)	# offset 32
+	stdu 15,8(3)	# offset 40
+	stdu 16,8(3)	# offset 48
+	stdu 17,8(3)	# offset 56
+	stdu 18,8(3)	# offset 64
+	stdu 19,8(3)	# offset 72
+	stdu 20,8(3)	# offset 80
+	stdu 21,8(3)	# offset 88
+	stdu 22,8(3)	# offset 96
+	stdu 23,8(3)	# offset 104
+	stdu 24,8(3)	# offset 112
+	stdu 25,8(3)	# offset 120
+	stdu 26,8(3)	# offset 128
+	stdu 27,8(3)	# offset 136
+	stdu 28,8(3)	# offset 144
+	stdu 29,8(3)	# offset 152
+	stdu 30,8(3)	# offset 160
+	stdu 31,8(3)	# offset 168
+	mflr 4
+	stdu 4,8(3)	# offset 176
+	mfcr 4
+	stwu 4,8(3)	# offset 184
 #else
 	stw 1,0(3)	# offset 0
 	stwu 2,4(3)	# offset 4
@@ -90,20 +94,16 @@ FUNC_START(setjmp)
 	stwu 31,4(3)	# offset 80
 #endif

+#if !__powerpc64__
 	/* If __SPE__, then add 84 to the offset shown from this point on until
 	   the end of this function.  This difference comes from the fact that
-	   we save 21 64-bit registers instead of 21 32-bit registers above.
-
-	   If __powerpc64__, then add 96 to the offset shown from this point on until
-	   the end of this function.  This difference comes from the fact that
-	   we save 21 64-bit registers instead of 21 32-bit registers above and
-	   we take alignement requirements of floating point and Altivec stores
-	   into account.  */
+	   we save 21 64-bit registers instead of 21 32-bit registers above.  */
 	mflr 4
 	stwu 4,4(3)	# offset 84
 	mfcr 4
 	stwu 4,4(3)	# offset 88
 	# one word pad to get floating point aligned on 8 byte boundary
+#endif

 	/* Check whether we need to save FPRs.  Checking __NO_FPRS__
 	   on its own would be enough for GCC 4.1 and above, but older
@@ -117,6 +117,13 @@ FUNC_START(setjmp)
 	andi. 5,5,0x2000
 	beq 1f
 #endif
+
+	/* If __powerpc64__, then add 96 to the offset shown from this point on until
+	   the end of this function.  This difference comes from the fact that
+	   we save 22 64-bit registers instead of 22 32-bit registers above and
+	   we take alignment requirements of floating point and Altivec stores
+	   into account.  */
+
 	stfdu 14,8(3)	# offset 96
 	stfdu 15,8(3)	# offset 104
 	stfdu 16,8(3)	# offset 112
@@ -220,30 +227,34 @@ FUNC_START(longjmp)
 	   load instruction uses an offset of 4.
 	*/
 	addi 3,3,164
 #elif __powerpc64__
-	/* In the first load, add 16 to r3 so that the subsequent floating
+	/* In the first load, add 8 to r3 so that the subsequent floating
 	   point loades are aligned on an 8 byte boundary and the Altivec
 	   loads are aligned on a 16 byte boundary.
 	*/
-	ldu 1,16(3)	# offset 16
-	ldu 2,8(3)	# offset 24
-	ldu 13,8(3)	# offset 32
-	ldu 14,8(3)	# offset 40
-	ldu 15,8(3)	# offset 48
-	ldu 16,8(3)	# offset 56
-	ldu 17,8(3)	# offset 64
-	ldu 18,8(3)	# offset 72
-	ldu 19,8(3)	# offset 80
-	ldu 20,8(3)	# offset 88
-	ldu 21,8(3)	# offset 96
-	ldu 22,8(3)	# offset 104
-	ldu 23,8(3)	# offset 112
-	ldu 24,8(3)	# offset 120
-	ldu 25,8(3)	# offset 128
-	ldu 26,8(3)	# offset 136
-	ldu 27,8(3)	# offset 144
-	ldu 28,8(3)	# offset 152
-	ldu 29,8(3)	# offset 160
-	ldu 30,8(3)	# offset 168
-	ldu 31,8(3)	# offset 176
+	ldu 1,8(3)	# offset 8
+	ldu 2,8(3)	# offset 16
+	ldu 13,8(3)	# offset 24
+	ldu 14,8(3)	# offset 32
+	ldu 15,8(3)	# offset 40
+	ldu 16,8(3)	# offset 48
+	ldu 17,8(3)	# offset 56
+	ldu 18,8(3)	# offset 64
+	ldu 19,8(3)	# offset 72
+	ldu 20,8(3)	# offset 80
+	ldu 21,8(3)	# offset 88
+	ldu 22,8(3)	# offset 96
+	ldu 23,8(3)	# offset 104
+	ldu 24,8(3)	# offset 112
+	ldu 25,8(3)	# offset 120
+	ldu 26,8(3)	# offset 128
+	ldu 27,8(3)	# offset 136
+	ldu 28,8(3)	# offset 144
+	ldu 29,8(3)	# offset 152
+	ldu 30,8(3)	# offset 160
+	ldu 31,8(3)	# offset 168
+	ldu 5,8(3)	# offset 176
+	mtlr 5
+	lwzu 5,8(3)	# offset 184
+	mtcrf 255,5
 #else
 	lwz 1,0(3)	# offset 0
 	lwzu 2,4(3)	# offset 4
@@ -269,18 +280,15 @@ FUNC_START(longjmp)
 #endif
 	/* If __SPE__, then add 84 to the offset shown from this point on until
 	   the end of this function.  This difference comes from the fact that
-	   we restore 21 64-bit registers instead of 21 32-bit registers above.
+	   we restore 22 64-bit registers instead of 22 32-bit registers above.  */

-	   If __powerpc64__, then add 96 to the offset shown from this point on until
-	   the end of this function.  This difference comes from the fact that
-	   we restore 21 64-bit registers instead of 21 32-bit registers above and
-	   we take alignement requirements of floating point and Altivec loads
-	   into account.  */
+#if !__powerpc64__
 	lwzu 5,4(3)	# offset 84
 	mtlr 5
 	lwzu 5,4(3)	# offset 88
 	mtcrf 255,5
 	# one word pad to get floating point aligned on 8 byte boundary
+#endif

 	/* Check whether we need to restore FPRs.  Checking
 	   __NO_FPRS__ on its own would be enough for GCC 4.1 and
@@ -292,6 +300,13 @@ FUNC_START(longjmp)
 	andi. 5,5,0x2000
 	beq 1f
 #endif
+
+	/* If __powerpc64__, then add 96 to the offset shown from this point on until
+	   the end of this function.  This difference comes from the fact that
+	   we restore 21 64-bit registers instead of 21 32-bit registers above and
+	   we take alignment requirements of floating point and Altivec loads
+	   into account.  */
+
 	lfdu 14,8(3)	# offset 96
 	lfdu 15,8(3)	# offset 104
 	lfdu 16,8(3)	# offset 112
-- 
2.35.3
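
For reference, below is a minimal reproducer sketch in C for the failure modes
the patch describes.  It is hypothetical and not part of this patch or the
newlib test suite; the name "keep" and the bit pattern are illustrative.  C
cannot pin a variable to r31, so the register-clobber check is only likely to
fire, but simply returning through longjmp() on a 64-bit target whose text
addresses do not fit in 32 bits exercises the link register save/restore width.

#include <setjmp.h>
#include <stdio.h>
#include <stdlib.h>

static jmp_buf env;

static void do_jump(void)
{
	/* longjmp() reloads the callee-saved GPRs and the link register
	   from env; with the old 64-bit code the LR was stored and loaded
	   as a 32-bit word that landed inside the slot just used for r31.  */
	longjmp(env, 1);
}

int main(void)
{
	/* Deliberately not volatile: a volatile variable lives in memory and
	   would hide a clobbered callee-saved register.  The value is set
	   before setjmp() and never modified afterwards, so the C standard
	   requires it to survive the longjmp().  */
	long keep = 0x1234567890abcdefL;

	if (setjmp(env) == 0)
		do_jump();	/* getting back here needs a full 64-bit LR */

	if (keep != 0x1234567890abcdefL) {
		puts("FAIL: callee-saved register clobbered by longjmp");
		return EXIT_FAILURE;
	}
	puts("PASS");
	return EXIT_SUCCESS;
}

With the pre-patch code, a run of this sketch could fail in either way: the
truncated return address sends longjmp() to the wrong place, or keep comes back
corrupted because the 32-bit LR store overlapped the saved r31 doubleword.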