From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <siguza@siguza.net>
Received: from mail.siguza.net (mail.siguza.net [62.75.137.16])
 by sourceware.org (Postfix) with ESMTPS id 5192C387084C
 for <newlib@sourceware.org>; Mon, 11 Jan 2021 23:15:29 +0000 (GMT)
DMARC-Filter: OpenDMARC Filter v1.3.2 sourceware.org 5192C387084C
Received: from acacia.home (191.178.78.83.dynamic.wline.res.cust.swisscom.ch
 [83.78.178.191])
 by mail.siguza.net (Postfix) with ESMTPSA id 0D95B4A20049
 for <newlib@sourceware.org>; Tue, 12 Jan 2021 00:15:27 +0100 (CET)
From: Siguza <siguza@siguza.net>
Content-Type: text/plain;
	charset=us-ascii
Content-Transfer-Encoding: quoted-printable
Mime-Version: 1.0 (Mac OS X Mail 14.0 \(3654.20.0.2.21\))
Subject: Patches for targeting AArch64 Darwin with clang
Message-Id: <983159DB-FF02-4264-A7F2-AC963A4C68F7@siguza.net>
Date: Tue, 12 Jan 2021 00:15:26 +0100
To: newlib@sourceware.org
X-Mailer: Apple Mail (2.3654.20.0.2.21)
X-Spam-Status: No, score=-11.1 required=5.0 tests=BAYES_00, DKIM_SIGNED,
 DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0,
 RCVD_IN_BARRACUDACENTRAL, SPF_HELO_NONE,
 SPF_PASS autolearn=ham autolearn_force=no version=3.4.2
X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on
 server2.sourceware.org
X-BeenThere: newlib@sourceware.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Newlib mailing list <newlib.sourceware.org>
List-Unsubscribe: <https://sourceware.org/mailman/options/newlib>,
 <mailto:newlib-request@sourceware.org?subject=unsubscribe>
List-Archive: <https://sourceware.org/pipermail/newlib/>
List-Post: <mailto:newlib@sourceware.org>
List-Help: <mailto:newlib-request@sourceware.org?subject=help>
List-Subscribe: <https://sourceware.org/mailman/listinfo/newlib>,
 <mailto:newlib-request@sourceware.org?subject=subscribe>
X-List-Received-Date: Mon, 11 Jan 2021 23:15:32 -0000

Hi

We at the checkra1n team are using Newlib as the standard library of a =
pre-boot bare-metal execution environment on jailbroken iPhones (i.e. =
aarch64).
Since our target uses the Darwin ABI and we're building with clang, we =
had to apply some patches. We'd like to upstream those.

The first two patches should be uncontroversial. They merely consist of:
1. an additional header include (omitting it only causes a warning for =
Linux/ELF targets, but seems to be fatal when targeting Darwin).
2. a change that makes all AArch64 "p2align" directives default to 2 =
rather than 0 (which I'm assuming is done implicitly anyway for =
non-Darwin targets?).
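
For illustration (a sketch, not taken from the patches; "example_fn" is a =
made-up symbol), ".p2align 2" pads to a 2^2, i.e. 4-byte, boundary, which =
matches the fixed 4-byte size of A64 instructions:

```asm
	.text
	.p2align	2		/* align to 2^2 bytes */
	.global	example_fn	/* hypothetical name */
example_fn:
	ret
```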

The third patch changes SIMD/Neon register arguments in instructions =
that move between general-purpose and vector registers.
This is required when building with clang, even for non-Darwin targets. =
As far as I can tell, the "2" in "reg.2d[0]" does not appear in the ARMv8 =
Reference Manual and is a gcc-specific thing. I'm assuming it has no =
actual meaning and gcc just silently ignores it, but I didn't find any =
documentation on that.
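
To make the difference concrete, here is a minimal sketch (x0 and v0 are =
arbitrary registers picked for illustration, not taken from the patches):

```asm
	.text
	/* Form documented for MOV (to general), an alias of UMOV: */
	mov	x0, v0.d[0]
	/* Form used in newlib before this patch, which clang rejects: */
	/* mov	x0, v0.2d[0] */
```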

The fourth patch makes all the AArch64 assembly files compatible with =
the Darwin ABI. In particular:
- The .type and .size directives are illegal for Darwin targets, so they =
are wrapped in "#ifndef __APPLE__" blocks.
- Macro invocations must separate arguments by commas, otherwise they =
are concatenated and treated as one argument. This should work on all =
targets and not require any ifdefs.
- Darwin prefixes C symbols with an underscore, so the assembly for e.g. =
memcpy has to use _memcpy as label. I figured the least invasive patch =
for this was to just #define these symbols when targeting Darwin.
- In one case there was a "b.hs memcpy". Darwin seems to not allow =
jumping to external labels in conditional branches, so I replaced that =
with a conditional jump to a local label, followed by an unconditional =
jump to the external one.
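
The branch rewrite from the last point looks roughly like this (a sketch; =
"example_fn" and "ext_fn" are hypothetical names, and on Darwin the =
external symbol would additionally carry the underscore prefix):

```asm
	.text
	.p2align	2
	.global	example_fn	/* hypothetical caller */
example_fn:
	cmp	x0, x1
	/* Replaces "b.hs ext_fn", where ext_fn is external: */
	b.lo	0f		/* inverted condition, local label */
	b	ext_fn		/* unconditional jump to the extern */
0:
	ret
```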

Please find the patches attached below.

- Siguza


=46rom 461d0a53041b94d23c3dd76b785b60b675ebdaa5 Mon Sep 17 00:00:00 2001
From: Siguza <siguza@siguza.net>
Date: Mon, 11 Jan 2021 22:47:57 +0100
Subject: [PATCH 1/4] Fix include of _memalign_r in aligned_alloc.c

---
 newlib/libc/stdlib/aligned_alloc.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/newlib/libc/stdlib/aligned_alloc.c =
b/newlib/libc/stdlib/aligned_alloc.c
index feb22c24b..ad8887bd0 100644
--- a/newlib/libc/stdlib/aligned_alloc.c
+++ b/newlib/libc/stdlib/aligned_alloc.c
@@ -26,6 +26,7 @@
    NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
    SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. */
=20
+#include <malloc.h>
 #include <reent.h>
 #include <stdlib.h>
=20
--=20
2.24.3 (Apple Git-128)


=46rom f9342c71fbcf968c26395ce0f1532266602b07af Mon Sep 17 00:00:00 2001
From: Siguza <siguza@siguza.net>
Date: Mon, 11 Jan 2021 22:52:11 +0100
Subject: [PATCH 2/4] Make aarch64 p2align default to 2

---
 newlib/libc/machine/aarch64/memchr.S    | 2 +-
 newlib/libc/machine/aarch64/memcmp.S    | 2 +-
 newlib/libc/machine/aarch64/memcpy.S    | 2 +-
 newlib/libc/machine/aarch64/memmove.S   | 2 +-
 newlib/libc/machine/aarch64/memset.S    | 2 +-
 newlib/libc/machine/aarch64/rawmemchr.S | 3 +--
 newlib/libc/machine/aarch64/setjmp.S    | 2 ++
 newlib/libc/machine/aarch64/strchr.S    | 2 +-
 newlib/libc/machine/aarch64/strchrnul.S | 2 +-
 newlib/libc/machine/aarch64/strcmp.S    | 2 +-
 newlib/libc/machine/aarch64/strcpy.S    | 2 +-
 newlib/libc/machine/aarch64/strlen.S    | 2 +-
 newlib/libc/machine/aarch64/strncmp.S   | 2 +-
 newlib/libc/machine/aarch64/strnlen.S   | 2 +-
 newlib/libc/machine/aarch64/strrchr.S   | 2 +-
 15 files changed, 16 insertions(+), 15 deletions(-)

diff --git a/newlib/libc/machine/aarch64/memchr.S =
b/newlib/libc/machine/aarch64/memchr.S
index 53f5d6bc0..91c2af22d 100644
--- a/newlib/libc/machine/aarch64/memchr.S
+++ b/newlib/libc/machine/aarch64/memchr.S
@@ -70,7 +70,7 @@
  * identify exactly which byte has matched.
  */
=20
-	.macro def_fn f p2align=3D0
+	.macro def_fn f p2align=3D2
 	.text
 	.p2align \p2align
 	.global \f
diff --git a/newlib/libc/machine/aarch64/memcmp.S =
b/newlib/libc/machine/aarch64/memcmp.S
index 605d99365..981baab3c 100644
--- a/newlib/libc/machine/aarch64/memcmp.S
+++ b/newlib/libc/machine/aarch64/memcmp.S
@@ -81,7 +81,7 @@
 #define tmp1		x7
 #define tmp2		x8
=20
-        .macro def_fn f p2align=3D0
+        .macro def_fn f p2align=3D2
         .text
         .p2align \p2align
         .global \f
diff --git a/newlib/libc/machine/aarch64/memcpy.S =
b/newlib/libc/machine/aarch64/memcpy.S
index 463bad0a1..d2de7415d 100644
--- a/newlib/libc/machine/aarch64/memcpy.S
+++ b/newlib/libc/machine/aarch64/memcpy.S
@@ -87,7 +87,7 @@
=20
 #define L(l) .L ## l
=20
-	.macro def_fn f p2align=3D0
+	.macro def_fn f p2align=3D2
 	.text
 	.p2align \p2align
 	.global \f
diff --git a/newlib/libc/machine/aarch64/memmove.S =
b/newlib/libc/machine/aarch64/memmove.S
index 597a8c8e9..6da548f10 100644
--- a/newlib/libc/machine/aarch64/memmove.S
+++ b/newlib/libc/machine/aarch64/memmove.S
@@ -61,7 +61,7 @@
 /* See memmove-stub.c  */
 #else
=20
-	.macro def_fn f p2align=3D0
+	.macro def_fn f p2align=3D2
 	.text
 	.p2align \p2align
 	.global \f
diff --git a/newlib/libc/machine/aarch64/memset.S =
b/newlib/libc/machine/aarch64/memset.S
index 103e3f8bb..cad9117b7 100644
--- a/newlib/libc/machine/aarch64/memset.S
+++ b/newlib/libc/machine/aarch64/memset.S
@@ -77,7 +77,7 @@
=20
 #define L(l) .L ## l
=20
-	.macro def_fn f p2align=3D0
+	.macro def_fn f p2align=3D2
 	.text
 	.p2align \p2align
 	.global \f
diff --git a/newlib/libc/machine/aarch64/rawmemchr.S =
b/newlib/libc/machine/aarch64/rawmemchr.S
index 26da81005..484971b3f 100644
--- a/newlib/libc/machine/aarch64/rawmemchr.S
+++ b/newlib/libc/machine/aarch64/rawmemchr.S
@@ -36,7 +36,7 @@
=20
 #define L(l) .L ## l
=20
-	.macro def_fn f p2align=3D0
+	.macro def_fn f p2align=3D2
 	.text
 	.p2align \p2align
 	.global \f
@@ -65,4 +65,3 @@ L(do_strlen):
=20
 	.size   rawmemchr, . - rawmemchr
 #endif
-
diff --git a/newlib/libc/machine/aarch64/setjmp.S =
b/newlib/libc/machine/aarch64/setjmp.S
index 0856145bf..fde0e45a7 100644
--- a/newlib/libc/machine/aarch64/setjmp.S
+++ b/newlib/libc/machine/aarch64/setjmp.S
@@ -43,6 +43,7 @@
=20
 // int setjmp (jmp_buf)
 	.global	setjmp
+	.p2align	2
 	.type	setjmp, %function
 setjmp:
 	mov	x16, sp
@@ -58,6 +59,7 @@ setjmp:
=20
 // void longjmp (jmp_buf, int) __attribute__ ((noreturn))
 	.global	longjmp
+	.p2align	2
 	.type	longjmp, %function
 longjmp:
 #define REG_PAIR(REG1, REG2, OFFS)	ldp REG1, REG2, [x0, OFFS]
diff --git a/newlib/libc/machine/aarch64/strchr.S =
b/newlib/libc/machine/aarch64/strchr.S
index 2448dbc7d..5fc0fd06e 100644
--- a/newlib/libc/machine/aarch64/strchr.S
+++ b/newlib/libc/machine/aarch64/strchr.S
@@ -74,7 +74,7 @@
=20
 /* Locals and temporaries.  */
=20
-	.macro def_fn f p2align=3D0
+	.macro def_fn f p2align=3D2
 	.text
 	.p2align \p2align
 	.global \f
diff --git a/newlib/libc/machine/aarch64/strchrnul.S =
b/newlib/libc/machine/aarch64/strchrnul.S
index a0ac13b7f..99fba3128 100644
--- a/newlib/libc/machine/aarch64/strchrnul.S
+++ b/newlib/libc/machine/aarch64/strchrnul.S
@@ -70,7 +70,7 @@
=20
 /* Locals and temporaries.  */
=20
-	.macro def_fn f p2align=3D0
+	.macro def_fn f p2align=3D2
 	.text
 	.p2align \p2align
 	.global \f
diff --git a/newlib/libc/machine/aarch64/strcmp.S =
b/newlib/libc/machine/aarch64/strcmp.S
index e2bef2d49..cabcf4faa 100644
--- a/newlib/libc/machine/aarch64/strcmp.S
+++ b/newlib/libc/machine/aarch64/strcmp.S
@@ -33,7 +33,7 @@
 /* See strcmp-stub.c  */
 #else
=20
-	.macro def_fn f p2align=3D0
+	.macro def_fn f p2align=3D2
 	.text
 	.p2align \p2align
 	.global \f
diff --git a/newlib/libc/machine/aarch64/strcpy.S =
b/newlib/libc/machine/aarch64/strcpy.S
index e5405f253..95533de60 100644
--- a/newlib/libc/machine/aarch64/strcpy.S
+++ b/newlib/libc/machine/aarch64/strcpy.S
@@ -72,7 +72,7 @@
 #define STRCPY strcpy
 #endif
=20
-	.macro def_fn f p2align=3D0
+	.macro def_fn f p2align=3D2
 	.text
 	.p2align \p2align
 	.global \f
diff --git a/newlib/libc/machine/aarch64/strlen.S =
b/newlib/libc/machine/aarch64/strlen.S
index 872d136ef..7e6ced01d 100644
--- a/newlib/libc/machine/aarch64/strlen.S
+++ b/newlib/libc/machine/aarch64/strlen.S
@@ -55,7 +55,7 @@
=20
 #define L(l) .L ## l
=20
-	.macro def_fn f p2align=3D0
+	.macro def_fn f p2align=3D2
 	.text
 	.p2align \p2align
 	.global \f
diff --git a/newlib/libc/machine/aarch64/strncmp.S =
b/newlib/libc/machine/aarch64/strncmp.S
index ffdabc260..b218e95a7 100644
--- a/newlib/libc/machine/aarch64/strncmp.S
+++ b/newlib/libc/machine/aarch64/strncmp.S
@@ -33,7 +33,7 @@
  * ARMv8-a, AArch64
  */
=20
-	.macro def_fn f p2align=3D0
+	.macro def_fn f p2align=3D2
 	.text
 	.p2align \p2align
 	.global \f
diff --git a/newlib/libc/machine/aarch64/strnlen.S =
b/newlib/libc/machine/aarch64/strnlen.S
index c255c3f7c..0eb742412 100644
--- a/newlib/libc/machine/aarch64/strnlen.S
+++ b/newlib/libc/machine/aarch64/strnlen.S
@@ -55,7 +55,7 @@
 #define pos		x13
 #define limit_wd	x14
=20
-	.macro def_fn f p2align=3D0
+	.macro def_fn f p2align=3D2
 	.text
 	.p2align \p2align
 	.global \f
diff --git a/newlib/libc/machine/aarch64/strrchr.S =
b/newlib/libc/machine/aarch64/strrchr.S
index d64fc09b1..8cf8d302d 100644
--- a/newlib/libc/machine/aarch64/strrchr.S
+++ b/newlib/libc/machine/aarch64/strrchr.S
@@ -80,7 +80,7 @@
=20
 /* Locals and temporaries.  */
=20
-	.macro def_fn f p2align=3D0
+	.macro def_fn f p2align=3D2
 	.text
 	.p2align \p2align
 	.global \f
--=20
2.24.3 (Apple Git-128)


=46rom 779f336fc4bfae8933b141460bff1c53f29effad Mon Sep 17 00:00:00 2001
From: Siguza <siguza@siguza.net>
Date: Mon, 11 Jan 2021 22:54:12 +0100
Subject: [PATCH 3/4] Make aarch64 assembly clang-compatible

---
 newlib/libc/machine/aarch64/memchr.S    |  6 +++---
 newlib/libc/machine/aarch64/strchr.S    |  6 +++---
 newlib/libc/machine/aarch64/strchrnul.S |  6 +++---
 newlib/libc/machine/aarch64/strrchr.S   | 10 +++++-----
 4 files changed, 14 insertions(+), 14 deletions(-)

diff --git a/newlib/libc/machine/aarch64/memchr.S =
b/newlib/libc/machine/aarch64/memchr.S
index 91c2af22d..8389c8a50 100644
--- a/newlib/libc/machine/aarch64/memchr.S
+++ b/newlib/libc/machine/aarch64/memchr.S
@@ -110,7 +110,7 @@ def_fn memchr
 	and	vhas_chr2.16b, vhas_chr2.16b, vrepmask.16b
 	addp	vend.16b, vhas_chr1.16b, vhas_chr2.16b		/* =
256->128 */
 	addp	vend.16b, vend.16b, vend.16b			/* =
128->64 */
-	mov	synd, vend.2d[0]
+	mov	synd, vend.d[0]
 	/* Clear the soff*2 lower bits */
 	lsl	tmp, soff, #1
 	lsr	synd, synd, tmp
@@ -130,7 +130,7 @@ def_fn memchr
 	/* Use a fast check for the termination condition */
 	orr	vend.16b, vhas_chr1.16b, vhas_chr2.16b
 	addp	vend.2d, vend.2d, vend.2d
-	mov	synd, vend.2d[0]
+	mov	synd, vend.d[0]
 	/* We're not out of data, loop if we haven't found the character =
*/
 	cbz	synd, .Lloop
=20
@@ -140,7 +140,7 @@ def_fn memchr
 	and	vhas_chr2.16b, vhas_chr2.16b, vrepmask.16b
 	addp	vend.16b, vhas_chr1.16b, vhas_chr2.16b		/* =
256->128 */
 	addp	vend.16b, vend.16b, vend.16b			/* =
128->64 */
-	mov	synd, vend.2d[0]
+	mov	synd, vend.d[0]
 	/* Only do the clear for the last possible block */
 	b.hi	.Ltail
=20
diff --git a/newlib/libc/machine/aarch64/strchr.S =
b/newlib/libc/machine/aarch64/strchr.S
index 5fc0fd06e..8ed6ef673 100644
--- a/newlib/libc/machine/aarch64/strchr.S
+++ b/newlib/libc/machine/aarch64/strchr.S
@@ -117,7 +117,7 @@ def_fn strchr
 	addp	vend1.16b, vend1.16b, vend2.16b		// 128->64
 	lsr	tmp1, tmp3, tmp1
=20
-	mov	tmp3, vend1.2d[0]
+	mov	tmp3, vend1.d[0]
 	bic	tmp1, tmp3, tmp1	// Mask padding bits.
 	cbnz	tmp1, .Ltail
=20
@@ -132,7 +132,7 @@ def_fn strchr
 	orr	vend2.16b, vhas_nul2.16b, vhas_chr2.16b
 	orr	vend1.16b, vend1.16b, vend2.16b
 	addp	vend1.2d, vend1.2d, vend1.2d
-	mov	tmp1, vend1.2d[0]
+	mov	tmp1, vend1.d[0]
 	cbz	tmp1, .Lloop
=20
 	/* Termination condition found.  Now need to establish exactly =
why
@@ -146,7 +146,7 @@ def_fn strchr
 	addp	vend1.16b, vend1.16b, vend2.16b		// 256->128
 	addp	vend1.16b, vend1.16b, vend2.16b		// 128->64
=20
-	mov	tmp1, vend1.2d[0]
+	mov	tmp1, vend1.d[0]
 .Ltail:
 	/* Count the trailing zeros, by bit reversing...  */
 	rbit	tmp1, tmp1
diff --git a/newlib/libc/machine/aarch64/strchrnul.S =
b/newlib/libc/machine/aarch64/strchrnul.S
index 99fba3128..0e257fa06 100644
--- a/newlib/libc/machine/aarch64/strchrnul.S
+++ b/newlib/libc/machine/aarch64/strchrnul.S
@@ -109,7 +109,7 @@ def_fn strchrnul
 	addp	vend1.16b, vend1.16b, vend1.16b		// 128->64
 	lsr	tmp1, tmp3, tmp1
=20
-	mov	tmp3, vend1.2d[0]
+	mov	tmp3, vend1.d[0]
 	bic	tmp1, tmp3, tmp1	// Mask padding bits.
 	cbnz	tmp1, .Ltail
=20
@@ -124,7 +124,7 @@ def_fn strchrnul
 	orr	vhas_chr2.16b, vhas_nul2.16b, vhas_chr2.16b
 	orr	vend1.16b, vhas_chr1.16b, vhas_chr2.16b
 	addp	vend1.2d, vend1.2d, vend1.2d
-	mov	tmp1, vend1.2d[0]
+	mov	tmp1, vend1.d[0]
 	cbz	tmp1, .Lloop
=20
 	/* Termination condition found.  Now need to establish exactly =
why
@@ -134,7 +134,7 @@ def_fn strchrnul
 	addp	vend1.16b, vhas_chr1.16b, vhas_chr2.16b		// =
256->128
 	addp	vend1.16b, vend1.16b, vend1.16b		// 128->64
=20
-	mov	tmp1, vend1.2d[0]
+	mov	tmp1, vend1.d[0]
 .Ltail:
 	/* Count the trailing zeros, by bit reversing...  */
 	rbit	tmp1, tmp1
diff --git a/newlib/libc/machine/aarch64/strrchr.S =
b/newlib/libc/machine/aarch64/strrchr.S
index 8cf8d302d..ee425c42b 100644
--- a/newlib/libc/machine/aarch64/strrchr.S
+++ b/newlib/libc/machine/aarch64/strrchr.S
@@ -120,10 +120,10 @@ def_fn strrchr
 	addp	vhas_chr1.16b, vhas_chr1.16b, vhas_chr2.16b	// =
256->128
 	addp	vhas_nul1.16b, vhas_nul1.16b, vhas_nul1.16b	// =
128->64
 	addp	vhas_chr1.16b, vhas_chr1.16b, vhas_chr1.16b	// =
128->64
-	mov	nul_match, vhas_nul1.2d[0]
+	mov	nul_match, vhas_nul1.d[0]
 	lsl	tmp1, tmp1, #1
 	mov	const_m1, #~0
-	mov	chr_match, vhas_chr1.2d[0]
+	mov	chr_match, vhas_chr1.d[0]
 	lsr	tmp3, const_m1, tmp1
=20
 	bic	nul_match, nul_match, tmp3	// Mask padding bits.
@@ -146,15 +146,15 @@ def_fn strrchr
 	addp	vhas_chr1.16b, vhas_chr1.16b, vhas_chr2.16b	// =
256->128
 	addp	vend1.16b, vend1.16b, vend1.16b	// 128->64
 	addp	vhas_chr1.16b, vhas_chr1.16b, vhas_chr1.16b	// =
128->64
-	mov	nul_match, vend1.2d[0]
-	mov	chr_match, vhas_chr1.2d[0]
+	mov	nul_match, vend1.d[0]
+	mov	chr_match, vhas_chr1.d[0]
 	cbz	nul_match, .Lloop
=20
 	and	vhas_nul1.16b, vhas_nul1.16b, vrepmask_0.16b
 	and	vhas_nul2.16b, vhas_nul2.16b, vrepmask_0.16b
 	addp	vhas_nul1.16b, vhas_nul1.16b, vhas_nul2.16b
 	addp	vhas_nul1.16b, vhas_nul1.16b, vhas_nul1.16b
-	mov	nul_match, vhas_nul1.2d[0]
+	mov	nul_match, vhas_nul1.d[0]
=20
 .Ltail:
 	/* Work out exactly where the string ends.  */
--=20
2.24.3 (Apple Git-128)


=46rom d80083fccf21ab7664732d88978d982c1bc99080 Mon Sep 17 00:00:00 2001
From: Siguza <siguza@siguza.net>
Date: Mon, 11 Jan 2021 23:01:35 +0100
Subject: [PATCH 4/4] Make aarch64 support the Darwin ABI

---
 newlib/libc/machine/aarch64/memchr.S    |  8 ++++++++
 newlib/libc/machine/aarch64/memcmp.S    | 10 +++++++++-
 newlib/libc/machine/aarch64/memcpy.S    | 10 +++++++++-
 newlib/libc/machine/aarch64/memmove.S   | 14 +++++++++++++-
 newlib/libc/machine/aarch64/memset.S    | 10 +++++++++-
 newlib/libc/machine/aarch64/rawmemchr.S | 12 +++++++++++-
 newlib/libc/machine/aarch64/setjmp.S    | 14 ++++++++++++++
 newlib/libc/machine/aarch64/strchr.S    |  8 ++++++++
 newlib/libc/machine/aarch64/strchrnul.S |  8 ++++++++
 newlib/libc/machine/aarch64/strcmp.S    | 12 ++++++++++--
 newlib/libc/machine/aarch64/strcpy.S    | 14 +++++++++++++-
 newlib/libc/machine/aarch64/strlen.S    | 10 +++++++++-
 newlib/libc/machine/aarch64/strncmp.S   |  9 +++++++++
 newlib/libc/machine/aarch64/strnlen.S   | 10 +++++++++-
 newlib/libc/machine/aarch64/strrchr.S   |  8 ++++++++
 15 files changed, 147 insertions(+), 10 deletions(-)

diff --git a/newlib/libc/machine/aarch64/memchr.S =
b/newlib/libc/machine/aarch64/memchr.S
index 8389c8a50..7025919a0 100644
--- a/newlib/libc/machine/aarch64/memchr.S
+++ b/newlib/libc/machine/aarch64/memchr.S
@@ -70,11 +70,17 @@
  * identify exactly which byte has matched.
  */
=20
+#ifdef __APPLE__
+#   define memchr _memchr
+#endif
+
 	.macro def_fn f p2align=3D2
 	.text
 	.p2align \p2align
 	.global \f
+#ifndef __APPLE__
 	.type \f, %function
+#endif
 \f:
 	.endm
=20
@@ -172,5 +178,7 @@ def_fn memchr
 	mov	result, #0
 	ret
=20
+#ifndef __APPLE__
 	.size	memchr, . - memchr
 #endif
+#endif
diff --git a/newlib/libc/machine/aarch64/memcmp.S =
b/newlib/libc/machine/aarch64/memcmp.S
index 981baab3c..95a7d2a8c 100644
--- a/newlib/libc/machine/aarch64/memcmp.S
+++ b/newlib/libc/machine/aarch64/memcmp.S
@@ -81,15 +81,21 @@
 #define tmp1		x7
 #define tmp2		x8
=20
+#ifdef __APPLE__
+#   define memcmp _memcmp
+#endif
+
         .macro def_fn f p2align=3D2
         .text
         .p2align \p2align
         .global \f
+#ifndef __APPLE__
         .type \f, %function
+#endif
 \f:
         .endm
=20
-def_fn memcmp p2align=3D6
+def_fn memcmp, p2align=3D6
 	subs	limit, limit, 8
 	b.lo	L(less8)
=20
@@ -192,5 +198,7 @@ L(byte_loop):
 	sub	result, data1w, data2w
 	ret
=20
+#ifndef __APPLE__
 	.size	memcmp, . - memcmp
 #endif
+#endif
diff --git a/newlib/libc/machine/aarch64/memcpy.S =
b/newlib/libc/machine/aarch64/memcpy.S
index d2de7415d..d9d3ef20f 100644
--- a/newlib/libc/machine/aarch64/memcpy.S
+++ b/newlib/libc/machine/aarch64/memcpy.S
@@ -87,11 +87,17 @@
=20
 #define L(l) .L ## l
=20
+#ifdef __APPLE__
+#   define memcpy _memcpy
+#endif
+
 	.macro def_fn f p2align=3D2
 	.text
 	.p2align \p2align
 	.global \f
+#ifndef __APPLE__
 	.type \f, %function
+#endif
 \f:
 	.endm
=20
@@ -104,7 +110,7 @@
    well as non-overlapping copies.
 */
=20
-def_fn memcpy p2align=3D6
+def_fn memcpy, p2align=3D6
 	prfm	PLDL1KEEP, [src]
 	add	srcend, src, count
 	add	dstend, dstin, count
@@ -226,5 +232,7 @@ L(copy_long):
 	stp	C_l, C_h, [dstend, -16]
 	ret
=20
+#ifndef __APPLE__
 	.size	memcpy, . - memcpy
 #endif
+#endif
diff --git a/newlib/libc/machine/aarch64/memmove.S =
b/newlib/libc/machine/aarch64/memmove.S
index 6da548f10..395482061 100644
--- a/newlib/libc/machine/aarch64/memmove.S
+++ b/newlib/libc/machine/aarch64/memmove.S
@@ -61,11 +61,18 @@
 /* See memmove-stub.c  */
 #else
=20
+#ifdef __APPLE__
+#   define memcpy _memcpy
+#   define memmove _memmove
+#endif
+
 	.macro def_fn f p2align=3D2
 	.text
 	.p2align \p2align
 	.global \f
+#ifndef __APPLE__
 	.type \f, %function
+#endif
 \f:
 	.endm
=20
@@ -97,8 +104,11 @@ def_fn memmove, 6
 	sub	tmp1, dstin, src
 	cmp	count, 96
 	ccmp	tmp1, count, 2, hi
-	b.hs	memcpy
+	/* Darwin can't use b.hs to jump to external labels. */
+	b.lo	0f
+	b	memcpy
=20
+0:
 	cbz	tmp1, 3f
 	add	dstend, dstin, count
 	add	srcend, src, count
@@ -151,5 +161,7 @@ def_fn memmove, 6
 	stp	C_l, C_h, [dstin]
 3:	ret
=20
+#ifndef __APPLE__
 	.size	memmove, . - memmove
 #endif
+#endif
diff --git a/newlib/libc/machine/aarch64/memset.S =
b/newlib/libc/machine/aarch64/memset.S
index cad9117b7..7bf190943 100644
--- a/newlib/libc/machine/aarch64/memset.S
+++ b/newlib/libc/machine/aarch64/memset.S
@@ -77,15 +77,21 @@
=20
 #define L(l) .L ## l
=20
+#ifdef __APPLE__
+#   define memset _memset
+#endif
+
 	.macro def_fn f p2align=3D2
 	.text
 	.p2align \p2align
 	.global \f
+#ifndef __APPLE__
 	.type \f, %function
+#endif
 \f:
 	.endm
=20
-def_fn memset p2align=3D6
+def_fn memset, p2align=3D6
=20
 	dup	v0.16B, valw
 	add	dstend, dstin, count
@@ -236,5 +242,7 @@ L(zva_other):
 	sub	dst, dst, 32		/* Bias dst for tail loop.  */
 	b	L(tail64)
=20
+#ifndef __APPLE__
 	.size	memset, . - memset
 #endif
+#endif
diff --git a/newlib/libc/machine/aarch64/rawmemchr.S =
b/newlib/libc/machine/aarch64/rawmemchr.S
index 484971b3f..9f37a4d83 100644
--- a/newlib/libc/machine/aarch64/rawmemchr.S
+++ b/newlib/libc/machine/aarch64/rawmemchr.S
@@ -36,11 +36,19 @@
=20
 #define L(l) .L ## l
=20
+#ifdef __APPLE__
+#   define memchr _memchr
+#   define rawmemchr _rawmemchr
+#   define strlen _strlen
+#endif
+
 	.macro def_fn f p2align=3D2
 	.text
 	.p2align \p2align
 	.global \f
+#ifndef __APPLE__
 	.type \f, %function
+#endif
 \f:
 	.endm
=20
@@ -48,7 +56,7 @@
    Call strlen without setting up a full frame - it preserves x14/x15.
 */
=20
-def_fn rawmemchr p2align=3D5
+def_fn rawmemchr, p2align=3D5
 	.cfi_startproc
 	cbz	w1, L(do_strlen)
 	mov	x2, -1
@@ -63,5 +71,7 @@ L(do_strlen):
 	ret	x15
 	.cfi_endproc
=20
+#ifndef __APPLE__
 	.size   rawmemchr, . - rawmemchr
 #endif
+#endif
diff --git a/newlib/libc/machine/aarch64/setjmp.S =
b/newlib/libc/machine/aarch64/setjmp.S
index fde0e45a7..0335b6729 100644
--- a/newlib/libc/machine/aarch64/setjmp.S
+++ b/newlib/libc/machine/aarch64/setjmp.S
@@ -41,10 +41,17 @@
 	REG_PAIR (d12, d13, 144);	\
 	REG_PAIR (d14, d15, 160);
=20
+#ifdef __APPLE__
+#   define setjmp _setjmp
+#   define longjmp _longjmp
+#endif
+
 // int setjmp (jmp_buf)
 	.global	setjmp
 	.p2align	2
+#ifndef __APPLE__
 	.type	setjmp, %function
+#endif
 setjmp:
 	mov	x16, sp
 #define REG_PAIR(REG1, REG2, OFFS)	stp REG1, REG2, [x0, OFFS]
@@ -55,12 +62,16 @@ setjmp:
 #undef REG_ONE
 	mov	w0, #0
 	ret
+#ifndef __APPLE__
 	.size	setjmp, .-setjmp
+#endif
=20
 // void longjmp (jmp_buf, int) __attribute__ ((noreturn))
 	.global	longjmp
 	.p2align	2
+#ifndef __APPLE__
 	.type	longjmp, %function
+#endif
 longjmp:
 #define REG_PAIR(REG1, REG2, OFFS)	ldp REG1, REG2, [x0, OFFS]
 #define REG_ONE(REG1, OFFS)		ldr REG1, [x0, OFFS]
@@ -73,4 +84,7 @@ longjmp:
 	cinc	w0, w1, eq
 	// use br not ret, as ret is guaranteed to mispredict
 	br	x30
+
+#ifndef __APPLE__
 	.size	longjmp, .-longjmp
+#endif
diff --git a/newlib/libc/machine/aarch64/strchr.S =
b/newlib/libc/machine/aarch64/strchr.S
index 8ed6ef673..c7e159b0a 100644
--- a/newlib/libc/machine/aarch64/strchr.S
+++ b/newlib/libc/machine/aarch64/strchr.S
@@ -74,11 +74,17 @@
=20
 /* Locals and temporaries.  */
=20
+#ifdef __APPLE__
+#   define strchr _strchr
+#endif
+
 	.macro def_fn f p2align=3D2
 	.text
 	.p2align \p2align
 	.global \f
+#ifndef __APPLE__
 	.type \f, %function
+#endif
 \f:
 	.endm
=20
@@ -160,5 +166,7 @@ def_fn strchr
 	csel	result, result, xzr, eq
 	ret
=20
+#ifndef __APPLE__
 	.size	strchr, . - strchr
 #endif
+#endif
diff --git a/newlib/libc/machine/aarch64/strchrnul.S =
b/newlib/libc/machine/aarch64/strchrnul.S
index 0e257fa06..9f5551f59 100644
--- a/newlib/libc/machine/aarch64/strchrnul.S
+++ b/newlib/libc/machine/aarch64/strchrnul.S
@@ -70,11 +70,17 @@
=20
 /* Locals and temporaries.  */
=20
+#ifdef __APPLE__
+#   define strchrnul _strchrnul
+#endif
+
 	.macro def_fn f p2align=3D2
 	.text
 	.p2align \p2align
 	.global \f
+#ifndef __APPLE__
 	.type \f, %function
+#endif
 \f:
 	.endm
=20
@@ -145,5 +151,7 @@ def_fn strchrnul
 	add	result, src, tmp1, lsr #1
 	ret
=20
+#ifndef __APPLE__
 	.size	strchrnul, . - strchrnul
 #endif
+#endif
diff --git a/newlib/libc/machine/aarch64/strcmp.S =
b/newlib/libc/machine/aarch64/strcmp.S
index cabcf4faa..ce6c2f5ad 100644
--- a/newlib/libc/machine/aarch64/strcmp.S
+++ b/newlib/libc/machine/aarch64/strcmp.S
@@ -33,11 +33,17 @@
 /* See strcmp-stub.c  */
 #else
=20
+#ifdef __APPLE__
+#   define strcmp _strcmp
+#endif
+
 	.macro def_fn f p2align=3D2
 	.text
 	.p2align \p2align
 	.global \f
+#ifndef __APPLE__
 	.type \f, %function
+#endif
 \f:
 	.endm
=20
@@ -67,7 +73,7 @@
 #define pos		x11
=20
 	/* Start of performance-critical section  -- one 64B cache line. =
 */
-def_fn strcmp p2align=3D6
+def_fn strcmp, p2align=3D6
 	eor	tmp1, src1, src2
 	mov	zeroones, #REP8_01
 	tst	tmp1, #7
@@ -197,6 +203,8 @@ L(loop_misaligned):
 L(done):
 	sub	result, data1, data2
 	ret
-	.size	strcmp, .-strcmp
=20
+#ifndef __APPLE__
+	.size	strcmp, .-strcmp
+#endif
 #endif
diff --git a/newlib/libc/machine/aarch64/strcpy.S =
b/newlib/libc/machine/aarch64/strcpy.S
index 95533de60..f9b293423 100644
--- a/newlib/libc/machine/aarch64/strcpy.S
+++ b/newlib/libc/machine/aarch64/strcpy.S
@@ -66,17 +66,27 @@
 #define len		x16
 #define to_align	x17
=20
+#ifdef __APPLE__
+#ifdef BUILD_STPCPY
+#define STRCPY _stpcpy
+#else
+#define STRCPY _strcpy
+#endif
+#else
 #ifdef BUILD_STPCPY
 #define STRCPY stpcpy
 #else
 #define STRCPY strcpy
+#endif
 #endif
=20
 	.macro def_fn f p2align=3D2
 	.text
 	.p2align \p2align
 	.global \f
+#ifndef __APPLE__
 	.type \f, %function
+#endif
 \f:
 	.endm
=20
@@ -111,7 +121,7 @@
=20
 #define MIN_PAGE_SIZE (1 << MIN_PAGE_P2)
=20
-def_fn STRCPY p2align=3D6
+def_fn STRCPY, p2align=3D6
 	/* For moderately short strings, the fastest way to do the copy =
is to
 	   calculate the length of the string in the same way as strlen, =
then
 	   essentially do a memcpy of the result.  This avoids the need =
for
@@ -337,5 +347,7 @@ def_fn STRCPY p2align=3D6
 	bic	has_nul2, tmp3, tmp4
 	b	.Lfp_gt8
=20
+#ifndef __APPLE__
 	.size	STRCPY, . - STRCPY
 #endif
+#endif
diff --git a/newlib/libc/machine/aarch64/strlen.S =
b/newlib/libc/machine/aarch64/strlen.S
index 7e6ced01d..c1ef145ea 100644
--- a/newlib/libc/machine/aarch64/strlen.S
+++ b/newlib/libc/machine/aarch64/strlen.S
@@ -55,11 +55,17 @@
=20
 #define L(l) .L ## l
=20
+#ifdef __APPLE__
+#   define strlen _strlen
+#endif
+
 	.macro def_fn f p2align=3D2
 	.text
 	.p2align \p2align
 	.global \f
+#ifndef __APPLE__
 	.type \f, %function
+#endif
 \f:
 	.endm
=20
@@ -104,7 +110,7 @@
 	   whether the first fetch, which may be misaligned, crosses a =
page
 	   boundary.  */
=20
-def_fn strlen p2align=3D6
+def_fn strlen, p2align=3D6
 	and	tmp1, srcin, MIN_PAGE_SIZE - 1
 	mov	zeroones, REP8_01
 	cmp	tmp1, MIN_PAGE_SIZE - 16
@@ -234,5 +240,7 @@ L(page_cross):
 	csel	data2, data2, tmp2, eq
 	b	L(page_cross_entry)
=20
+#ifndef __APPLE__
 	.size	strlen, . - strlen
 #endif
+#endif
diff --git a/newlib/libc/machine/aarch64/strncmp.S =
b/newlib/libc/machine/aarch64/strncmp.S
index b218e95a7..bbae2a083 100644
--- a/newlib/libc/machine/aarch64/strncmp.S
+++ b/newlib/libc/machine/aarch64/strncmp.S
@@ -33,11 +33,17 @@
  * ARMv8-a, AArch64
  */
=20
+#ifdef __APPLE__
+#   define strncmp _strncmp
+#endif
+
 	.macro def_fn f p2align=3D2
 	.text
 	.p2align \p2align
 	.global \f
+#ifndef __APPLE__
 	.type \f, %function
+#endif
 \f:
 	.endm
=20
@@ -286,5 +292,8 @@ def_fn strncmp
 .Lret0:
 	mov	result, #0
 	ret
+
+#ifndef __APPLE__
 	.size strncmp, . - strncmp
 #endif
+#endif
diff --git a/newlib/libc/machine/aarch64/strnlen.S =
b/newlib/libc/machine/aarch64/strnlen.S
index 0eb742412..f6f501fec 100644
--- a/newlib/libc/machine/aarch64/strnlen.S
+++ b/newlib/libc/machine/aarch64/strnlen.S
@@ -55,11 +55,17 @@
 #define pos		x13
 #define limit_wd	x14
=20
+#ifdef __APPLE__
+#   define strnlen _strnlen
+#endif
+
 	.macro def_fn f p2align=3D2
 	.text
 	.p2align \p2align
 	.global \f
+#ifndef __APPLE__
 	.type \f, %function
+#endif
 \f:
 	.endm
=20
@@ -182,6 +188,8 @@ def_fn strnlen
 	csinv	data1, data1, xzr, le
 	csel	data2, data2, data2a, le
 	b	.Lrealigned
-	.size	strnlen, . - .Lstart	/* Include pre-padding in size.  =
*/
=20
+#ifndef __APPLE__
+	.size	strnlen, . - .Lstart	/* Include pre-padding in size.  =
*/
+#endif
 #endif
diff --git a/newlib/libc/machine/aarch64/strrchr.S =
b/newlib/libc/machine/aarch64/strrchr.S
index ee425c42b..b65833fe0 100644
--- a/newlib/libc/machine/aarch64/strrchr.S
+++ b/newlib/libc/machine/aarch64/strrchr.S
@@ -80,11 +80,17 @@
=20
 /* Locals and temporaries.  */
=20
+#ifdef __APPLE__
+#   define strrchr _strrchr
+#endif
+
 	.macro def_fn f p2align=3D2
 	.text
 	.p2align \p2align
 	.global \f
+#ifndef __APPLE__
 	.type \f, %function
+#endif
 \f:
 	.endm
=20
@@ -178,5 +184,7 @@ def_fn strrchr
=20
 	ret
=20
+#ifndef __APPLE__
 	.size	strrchr, . - strrchr
 #endif
+#endif
--=20
2.24.3 (Apple Git-128)