From: Siguza
Subject: Patches for targeting AArch64 Darwin with clang
Message-Id: <983159DB-FF02-4264-A7F2-AC963A4C68F7@siguza.net>
Date: Tue, 12 Jan 2021 00:15:26 +0100
To: newlib@sourceware.org
List-Id: Newlib mailing list

Hi

We at the checkra1n team are using Newlib as the standard library of a
pre-boot bare metal execution environment on jailbroken iPhones (i.e.
aarch64).
As our target is using the Darwin ABI and we're building with clang, we
had to apply some patches. We'd like to upstream those.

The first two patches should be uncontroversial. They merely consist of:
1. an additional header include (the missing include causes a warning
   for Linux/ELF targets, but seems to be fatal when targeting Darwin).
2. a change that makes all AArch64 "p2align" directives default to 2
   rather than 0 (which I'm assuming is done implicitly anyway for
   non-Darwin targets?).

The third patch changes SIMD/Neon register arguments in instructions
that move between general-purpose and vector registers.
This is required when building with clang, even for non-Darwin targets.
As far as I can tell, the "2" in "reg.2d[0]" does not appear in the ARMv8
Reference Manual and is a gcc-specific thing. I'm assuming it has no
actual meaning and gcc just silently ignores it, but I didn't find any
actual documentation on that.

The fourth patch makes all the AArch64 assembly files compatible with
the Darwin ABI. In particular:
- The .type and .size directives are illegal for Darwin targets, so they
  are wrapped in "#ifndef __APPLE__" blocks.
- Macro invocations must separate arguments by commas, otherwise they
  are concatenated and treated as one argument. This should work on all
  targets and not require any ifdefs.
- Darwin prefixes C symbols with an underscore, so the assembly for e.g.
  memcpy has to use _memcpy as label. I figured the least invasive patch
  for this was to just #define these symbols when targeting Darwin.
- In one case there was a "b.hs memcpy". Darwin seems to not allow
  jumping to external labels in conditional branches, so I replaced that
  with a conditional jump to a local label, followed by an unconditional
  jump to the external one.

Please find the patches attached below.
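On the underscore prefix: patch 4 handles it with per-file #defines. As a side note, and only as a sketch that is not part of the patches, GCC and clang both predefine `__USER_LABEL_PREFIX__`, which expands to `_` when targeting Darwin and to nothing on typical ELF targets. The C snippet below uses it to show which assembly-level name a C symbol would get. The names `STR`, `C_SYMBOL_PREFIX`, `ASM_NAME` and `memcpy_asm_name` are made up for this illustration, and it assumes a GCC or clang toolchain.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Two-step stringification: the argument is macro-expanded before it is
   turned into a string literal. */
#define STR_(x) #x
#define STR(x)  STR_(x)

/* Assumption: the compiler is GCC or clang, which predefine
   __USER_LABEL_PREFIX__ ("_" for Darwin, empty for typical ELF targets). */
#define C_SYMBOL_PREFIX STR(__USER_LABEL_PREFIX__)

/* Hypothetical helper: assembly-level name of the C symbol `sym`,
   built by concatenating the prefix and the stringified identifier. */
#define ASM_NAME(sym) C_SYMBOL_PREFIX #sym

static const char *memcpy_asm_name(void)
{
    return ASM_NAME(memcpy);
}

#ifdef DEMO_MAIN
int main(void)
{
    const char *name = memcpy_asm_name();

    /* Whatever the prefix is, the remainder must be the C name. */
    assert(strcmp(name + strlen(C_SYMBOL_PREFIX), "memcpy") == 0);
    printf("memcpy is assembled as: %s\n", name);
    return 0;
}
#endif
```

Built with -DDEMO_MAIN, this should report `_memcpy` for a Darwin target and `memcpy` for an ELF one. The same predefined macro could in principle replace the `#ifdef __APPLE__` symbol defines in the .S files, but that would be a design change beyond what the patches below do.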
- Siguza

From 461d0a53041b94d23c3dd76b785b60b675ebdaa5 Mon Sep 17 00:00:00 2001
From: Siguza
Date: Mon, 11 Jan 2021 22:47:57 +0100
Subject: [PATCH 1/4] Fix include of _memalign_r in aligned_alloc.c

---
 newlib/libc/stdlib/aligned_alloc.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/newlib/libc/stdlib/aligned_alloc.c b/newlib/libc/stdlib/aligned_alloc.c
index feb22c24b..ad8887bd0 100644
--- a/newlib/libc/stdlib/aligned_alloc.c
+++ b/newlib/libc/stdlib/aligned_alloc.c
@@ -26,6 +26,7 @@
    NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
    SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. */
 
+#include
 #include
 #include
 
--
2.24.3 (Apple Git-128)


From f9342c71fbcf968c26395ce0f1532266602b07af Mon Sep 17 00:00:00 2001
From: Siguza
Date: Mon, 11 Jan 2021 22:52:11 +0100
Subject: [PATCH 2/4] Make aarch64 p2align default to 2

---
 newlib/libc/machine/aarch64/memchr.S    | 2 +-
 newlib/libc/machine/aarch64/memcmp.S    | 2 +-
 newlib/libc/machine/aarch64/memcpy.S    | 2 +-
 newlib/libc/machine/aarch64/memmove.S   | 2 +-
 newlib/libc/machine/aarch64/memset.S    | 2 +-
 newlib/libc/machine/aarch64/rawmemchr.S | 3 +--
 newlib/libc/machine/aarch64/setjmp.S    | 2 ++
 newlib/libc/machine/aarch64/strchr.S    | 2 +-
 newlib/libc/machine/aarch64/strchrnul.S | 2 +-
 newlib/libc/machine/aarch64/strcmp.S    | 2 +-
 newlib/libc/machine/aarch64/strcpy.S    | 2 +-
 newlib/libc/machine/aarch64/strlen.S    | 2 +-
 newlib/libc/machine/aarch64/strncmp.S   | 2 +-
 newlib/libc/machine/aarch64/strnlen.S   | 2 +-
 newlib/libc/machine/aarch64/strrchr.S   | 2 +-
 15 files changed, 16 insertions(+), 15 deletions(-)

diff --git a/newlib/libc/machine/aarch64/memchr.S b/newlib/libc/machine/aarch64/memchr.S
index 53f5d6bc0..91c2af22d 100644
--- a/newlib/libc/machine/aarch64/memchr.S
+++ b/newlib/libc/machine/aarch64/memchr.S
@@ -70,7 +70,7 @@
  * identify exactly which byte has matched.
  */
 
-	.macro def_fn f p2align=0
+	.macro def_fn f p2align=2
 	.text
 	.p2align \p2align
 	.global \f
diff --git a/newlib/libc/machine/aarch64/memcmp.S b/newlib/libc/machine/aarch64/memcmp.S
index 605d99365..981baab3c 100644
--- a/newlib/libc/machine/aarch64/memcmp.S
+++ b/newlib/libc/machine/aarch64/memcmp.S
@@ -81,7 +81,7 @@
 #define tmp1 x7
 #define tmp2 x8
 
-	.macro def_fn f p2align=0
+	.macro def_fn f p2align=2
 	.text
 	.p2align \p2align
 	.global \f
diff --git a/newlib/libc/machine/aarch64/memcpy.S b/newlib/libc/machine/aarch64/memcpy.S
index 463bad0a1..d2de7415d 100644
--- a/newlib/libc/machine/aarch64/memcpy.S
+++ b/newlib/libc/machine/aarch64/memcpy.S
@@ -87,7 +87,7 @@
 
 #define L(l) .L ## l
 
-	.macro def_fn f p2align=0
+	.macro def_fn f p2align=2
 	.text
 	.p2align \p2align
 	.global \f
diff --git a/newlib/libc/machine/aarch64/memmove.S b/newlib/libc/machine/aarch64/memmove.S
index 597a8c8e9..6da548f10 100644
--- a/newlib/libc/machine/aarch64/memmove.S
+++ b/newlib/libc/machine/aarch64/memmove.S
@@ -61,7 +61,7 @@
 /* See memmove-stub.c */
 #else
 
-	.macro def_fn f p2align=0
+	.macro def_fn f p2align=2
 	.text
 	.p2align \p2align
 	.global \f
diff --git a/newlib/libc/machine/aarch64/memset.S b/newlib/libc/machine/aarch64/memset.S
index 103e3f8bb..cad9117b7 100644
--- a/newlib/libc/machine/aarch64/memset.S
+++ b/newlib/libc/machine/aarch64/memset.S
@@ -77,7 +77,7 @@
 
 #define L(l) .L ## l
 
-	.macro def_fn f p2align=0
+	.macro def_fn f p2align=2
 	.text
 	.p2align \p2align
 	.global \f
diff --git a/newlib/libc/machine/aarch64/rawmemchr.S b/newlib/libc/machine/aarch64/rawmemchr.S
index 26da81005..484971b3f 100644
--- a/newlib/libc/machine/aarch64/rawmemchr.S
+++ b/newlib/libc/machine/aarch64/rawmemchr.S
@@ -36,7 +36,7 @@
 
 #define L(l) .L ## l
 
-	.macro def_fn f p2align=0
+	.macro def_fn f p2align=2
 	.text
 	.p2align \p2align
 	.global \f
@@ -65,4 +65,3 @@ L(do_strlen):
 
 	.size rawmemchr, . - rawmemchr
 #endif
-
diff --git a/newlib/libc/machine/aarch64/setjmp.S b/newlib/libc/machine/aarch64/setjmp.S
index 0856145bf..fde0e45a7 100644
--- a/newlib/libc/machine/aarch64/setjmp.S
+++ b/newlib/libc/machine/aarch64/setjmp.S
@@ -43,6 +43,7 @@
 
 // int setjmp (jmp_buf)
 	.global setjmp
+	.p2align 2
 	.type setjmp, %function
 setjmp:
 	mov x16, sp
@@ -58,6 +59,7 @@ setjmp:
 
 // void longjmp (jmp_buf, int) __attribute__ ((noreturn))
 	.global longjmp
+	.p2align 2
 	.type longjmp, %function
 longjmp:
 #define REG_PAIR(REG1, REG2, OFFS) ldp REG1, REG2, [x0, OFFS]
diff --git a/newlib/libc/machine/aarch64/strchr.S b/newlib/libc/machine/aarch64/strchr.S
index 2448dbc7d..5fc0fd06e 100644
--- a/newlib/libc/machine/aarch64/strchr.S
+++ b/newlib/libc/machine/aarch64/strchr.S
@@ -74,7 +74,7 @@
 
 /* Locals and temporaries. */
 
-	.macro def_fn f p2align=0
+	.macro def_fn f p2align=2
 	.text
 	.p2align \p2align
 	.global \f
diff --git a/newlib/libc/machine/aarch64/strchrnul.S b/newlib/libc/machine/aarch64/strchrnul.S
index a0ac13b7f..99fba3128 100644
--- a/newlib/libc/machine/aarch64/strchrnul.S
+++ b/newlib/libc/machine/aarch64/strchrnul.S
@@ -70,7 +70,7 @@
 
 /* Locals and temporaries. */
 
-	.macro def_fn f p2align=0
+	.macro def_fn f p2align=2
 	.text
 	.p2align \p2align
 	.global \f
diff --git a/newlib/libc/machine/aarch64/strcmp.S b/newlib/libc/machine/aarch64/strcmp.S
index e2bef2d49..cabcf4faa 100644
--- a/newlib/libc/machine/aarch64/strcmp.S
+++ b/newlib/libc/machine/aarch64/strcmp.S
@@ -33,7 +33,7 @@
 /* See strcmp-stub.c */
 #else
 
-	.macro def_fn f p2align=0
+	.macro def_fn f p2align=2
 	.text
 	.p2align \p2align
 	.global \f
diff --git a/newlib/libc/machine/aarch64/strcpy.S b/newlib/libc/machine/aarch64/strcpy.S
index e5405f253..95533de60 100644
--- a/newlib/libc/machine/aarch64/strcpy.S
+++ b/newlib/libc/machine/aarch64/strcpy.S
@@ -72,7 +72,7 @@
 #define STRCPY strcpy
 #endif
 
-	.macro def_fn f p2align=0
+	.macro def_fn f p2align=2
 	.text
 	.p2align \p2align
 	.global \f
diff --git a/newlib/libc/machine/aarch64/strlen.S b/newlib/libc/machine/aarch64/strlen.S
index 872d136ef..7e6ced01d 100644
--- a/newlib/libc/machine/aarch64/strlen.S
+++ b/newlib/libc/machine/aarch64/strlen.S
@@ -55,7 +55,7 @@
 
 #define L(l) .L ## l
 
-	.macro def_fn f p2align=0
+	.macro def_fn f p2align=2
 	.text
 	.p2align \p2align
 	.global \f
diff --git a/newlib/libc/machine/aarch64/strncmp.S b/newlib/libc/machine/aarch64/strncmp.S
index ffdabc260..b218e95a7 100644
--- a/newlib/libc/machine/aarch64/strncmp.S
+++ b/newlib/libc/machine/aarch64/strncmp.S
@@ -33,7 +33,7 @@
  * ARMv8-a, AArch64
  */
 
-	.macro def_fn f p2align=0
+	.macro def_fn f p2align=2
 	.text
 	.p2align \p2align
 	.global \f
diff --git a/newlib/libc/machine/aarch64/strnlen.S b/newlib/libc/machine/aarch64/strnlen.S
index c255c3f7c..0eb742412 100644
--- a/newlib/libc/machine/aarch64/strnlen.S
+++ b/newlib/libc/machine/aarch64/strnlen.S
@@ -55,7 +55,7 @@
 #define pos x13
 #define limit_wd x14
 
-	.macro def_fn f p2align=0
+	.macro def_fn f p2align=2
 	.text
 	.p2align \p2align
 	.global \f
diff --git a/newlib/libc/machine/aarch64/strrchr.S b/newlib/libc/machine/aarch64/strrchr.S
index d64fc09b1..8cf8d302d 100644
--- a/newlib/libc/machine/aarch64/strrchr.S
+++ b/newlib/libc/machine/aarch64/strrchr.S
@@ -80,7 +80,7 @@
 
 /* Locals and temporaries. */
 
-	.macro def_fn f p2align=0
+	.macro def_fn f p2align=2
 	.text
 	.p2align \p2align
 	.global \f
--
2.24.3 (Apple Git-128)


From 779f336fc4bfae8933b141460bff1c53f29effad Mon Sep 17 00:00:00 2001
From: Siguza
Date: Mon, 11 Jan 2021 22:54:12 +0100
Subject: [PATCH 3/4] Make aarch64 assembly clang-compatible

---
 newlib/libc/machine/aarch64/memchr.S    |  6 +++---
 newlib/libc/machine/aarch64/strchr.S    |  6 +++---
 newlib/libc/machine/aarch64/strchrnul.S |  6 +++---
 newlib/libc/machine/aarch64/strrchr.S   | 10 +++++-----
 4 files changed, 14 insertions(+), 14 deletions(-)

diff --git a/newlib/libc/machine/aarch64/memchr.S b/newlib/libc/machine/aarch64/memchr.S
index 91c2af22d..8389c8a50 100644
--- a/newlib/libc/machine/aarch64/memchr.S
+++ b/newlib/libc/machine/aarch64/memchr.S
@@ -110,7 +110,7 @@ def_fn memchr
 	and vhas_chr2.16b, vhas_chr2.16b, vrepmask.16b
 	addp vend.16b, vhas_chr1.16b, vhas_chr2.16b /* 256->128 */
 	addp vend.16b, vend.16b, vend.16b /* 128->64 */
-	mov synd, vend.2d[0]
+	mov synd, vend.d[0]
 	/* Clear the soff*2 lower bits */
 	lsl tmp, soff, #1
 	lsr synd, synd, tmp
@@ -130,7 +130,7 @@ def_fn memchr
 	/* Use a fast check for the termination condition */
 	orr vend.16b, vhas_chr1.16b, vhas_chr2.16b
 	addp vend.2d, vend.2d, vend.2d
-	mov synd, vend.2d[0]
+	mov synd, vend.d[0]
 	/* We're not out of data, loop if we haven't found the character */
 	cbz synd, .Lloop
 
@@ -140,7 +140,7 @@ def_fn memchr
 	and vhas_chr2.16b, vhas_chr2.16b, vrepmask.16b
 	addp vend.16b, vhas_chr1.16b, vhas_chr2.16b /* 256->128 */
 	addp vend.16b, vend.16b, vend.16b /* 128->64 */
-	mov synd, vend.2d[0]
+	mov synd, vend.d[0]
 	/* Only do the clear for the last possible block */
 	b.hi .Ltail
 
diff --git a/newlib/libc/machine/aarch64/strchr.S b/newlib/libc/machine/aarch64/strchr.S
index 5fc0fd06e..8ed6ef673 100644
--- a/newlib/libc/machine/aarch64/strchr.S
+++ b/newlib/libc/machine/aarch64/strchr.S
@@ -117,7 +117,7 @@ def_fn strchr
 	addp vend1.16b, vend1.16b, vend2.16b // 128->64
 	lsr tmp1, tmp3, tmp1
 
-	mov tmp3, vend1.2d[0]
+	mov tmp3, vend1.d[0]
 	bic tmp1, tmp3, tmp1 // Mask padding bits.
 	cbnz tmp1, .Ltail
 
@@ -132,7 +132,7 @@ def_fn strchr
 	orr vend2.16b, vhas_nul2.16b, vhas_chr2.16b
 	orr vend1.16b, vend1.16b, vend2.16b
 	addp vend1.2d, vend1.2d, vend1.2d
-	mov tmp1, vend1.2d[0]
+	mov tmp1, vend1.d[0]
 	cbz tmp1, .Lloop
 
 	/* Termination condition found. Now need to establish exactly why
@@ -146,7 +146,7 @@ def_fn strchr
 	addp vend1.16b, vend1.16b, vend2.16b // 256->128
 	addp vend1.16b, vend1.16b, vend2.16b // 128->64
 
-	mov tmp1, vend1.2d[0]
+	mov tmp1, vend1.d[0]
 .Ltail:
 	/* Count the trailing zeros, by bit reversing... */
 	rbit tmp1, tmp1
diff --git a/newlib/libc/machine/aarch64/strchrnul.S b/newlib/libc/machine/aarch64/strchrnul.S
index 99fba3128..0e257fa06 100644
--- a/newlib/libc/machine/aarch64/strchrnul.S
+++ b/newlib/libc/machine/aarch64/strchrnul.S
@@ -109,7 +109,7 @@ def_fn strchrnul
 	addp vend1.16b, vend1.16b, vend1.16b // 128->64
 	lsr tmp1, tmp3, tmp1
 
-	mov tmp3, vend1.2d[0]
+	mov tmp3, vend1.d[0]
 	bic tmp1, tmp3, tmp1 // Mask padding bits.
 	cbnz tmp1, .Ltail
 
@@ -124,7 +124,7 @@ def_fn strchrnul
 	orr vhas_chr2.16b, vhas_nul2.16b, vhas_chr2.16b
 	orr vend1.16b, vhas_chr1.16b, vhas_chr2.16b
 	addp vend1.2d, vend1.2d, vend1.2d
-	mov tmp1, vend1.2d[0]
+	mov tmp1, vend1.d[0]
 	cbz tmp1, .Lloop
 
 	/* Termination condition found. Now need to establish exactly why
@@ -134,7 +134,7 @@ def_fn strchrnul
 	addp vend1.16b, vhas_chr1.16b, vhas_chr2.16b // 256->128
 	addp vend1.16b, vend1.16b, vend1.16b // 128->64
 
-	mov tmp1, vend1.2d[0]
+	mov tmp1, vend1.d[0]
 .Ltail:
 	/* Count the trailing zeros, by bit reversing... */
 	rbit tmp1, tmp1
diff --git a/newlib/libc/machine/aarch64/strrchr.S b/newlib/libc/machine/aarch64/strrchr.S
index 8cf8d302d..ee425c42b 100644
--- a/newlib/libc/machine/aarch64/strrchr.S
+++ b/newlib/libc/machine/aarch64/strrchr.S
@@ -120,10 +120,10 @@ def_fn strrchr
 	addp vhas_chr1.16b, vhas_chr1.16b, vhas_chr2.16b // 256->128
 	addp vhas_nul1.16b, vhas_nul1.16b, vhas_nul1.16b // 128->64
 	addp vhas_chr1.16b, vhas_chr1.16b, vhas_chr1.16b // 128->64
-	mov nul_match, vhas_nul1.2d[0]
+	mov nul_match, vhas_nul1.d[0]
 	lsl tmp1, tmp1, #1
 	mov const_m1, #~0
-	mov chr_match, vhas_chr1.2d[0]
+	mov chr_match, vhas_chr1.d[0]
 	lsr tmp3, const_m1, tmp1
 
 	bic nul_match, nul_match, tmp3 // Mask padding bits.
@@ -146,15 +146,15 @@ def_fn strrchr
 	addp vhas_chr1.16b, vhas_chr1.16b, vhas_chr2.16b // 256->128
 	addp vend1.16b, vend1.16b, vend1.16b // 128->64
 	addp vhas_chr1.16b, vhas_chr1.16b, vhas_chr1.16b // 128->64
-	mov nul_match, vend1.2d[0]
-	mov chr_match, vhas_chr1.2d[0]
+	mov nul_match, vend1.d[0]
+	mov chr_match, vhas_chr1.d[0]
 	cbz nul_match, .Lloop
 
 	and vhas_nul1.16b, vhas_nul1.16b, vrepmask_0.16b
 	and vhas_nul2.16b, vhas_nul2.16b, vrepmask_0.16b
 	addp vhas_nul1.16b, vhas_nul1.16b, vhas_nul2.16b
 	addp vhas_nul1.16b, vhas_nul1.16b, vhas_nul1.16b
-	mov nul_match, vhas_nul1.2d[0]
+	mov nul_match, vhas_nul1.d[0]
 
 .Ltail:
 	/* Work out exactly where the string ends. */
--
2.24.3 (Apple Git-128)


From d80083fccf21ab7664732d88978d982c1bc99080 Mon Sep 17 00:00:00 2001
From: Siguza
Date: Mon, 11 Jan 2021 23:01:35 +0100
Subject: [PATCH 4/4] Make aarch64 support the Darwin ABI

---
 newlib/libc/machine/aarch64/memchr.S    |  8 ++++++++
 newlib/libc/machine/aarch64/memcmp.S    | 10 +++++++++-
 newlib/libc/machine/aarch64/memcpy.S    | 10 +++++++++-
 newlib/libc/machine/aarch64/memmove.S   | 14 +++++++++++++-
 newlib/libc/machine/aarch64/memset.S    | 10 +++++++++-
 newlib/libc/machine/aarch64/rawmemchr.S | 12 +++++++++++-
 newlib/libc/machine/aarch64/setjmp.S    | 14 ++++++++++++++
 newlib/libc/machine/aarch64/strchr.S    |  8 ++++++++
 newlib/libc/machine/aarch64/strchrnul.S |  8 ++++++++
 newlib/libc/machine/aarch64/strcmp.S    | 12 ++++++++++--
 newlib/libc/machine/aarch64/strcpy.S    | 14 +++++++++++++-
 newlib/libc/machine/aarch64/strlen.S    | 10 +++++++++-
 newlib/libc/machine/aarch64/strncmp.S   |  9 +++++++++
 newlib/libc/machine/aarch64/strnlen.S   | 10 +++++++++-
 newlib/libc/machine/aarch64/strrchr.S   |  8 ++++++++
 15 files changed, 147 insertions(+), 10 deletions(-)

diff --git a/newlib/libc/machine/aarch64/memchr.S b/newlib/libc/machine/aarch64/memchr.S
index 8389c8a50..7025919a0 100644
--- a/newlib/libc/machine/aarch64/memchr.S
+++ b/newlib/libc/machine/aarch64/memchr.S
@@ -70,11 +70,17 @@
  * identify exactly which byte has matched.
  */
 
+#ifdef __APPLE__
+# define memchr _memchr
+#endif
+
 	.macro def_fn f p2align=2
 	.text
 	.p2align \p2align
 	.global \f
+#ifndef __APPLE__
 	.type \f, %function
+#endif
 \f:
 	.endm
 
@@ -172,5 +178,7 @@ def_fn memchr
 	mov result, #0
 	ret
 
+#ifndef __APPLE__
 	.size memchr, . - memchr
 #endif
+#endif
diff --git a/newlib/libc/machine/aarch64/memcmp.S b/newlib/libc/machine/aarch64/memcmp.S
index 981baab3c..95a7d2a8c 100644
--- a/newlib/libc/machine/aarch64/memcmp.S
+++ b/newlib/libc/machine/aarch64/memcmp.S
@@ -81,15 +81,21 @@
 #define tmp1 x7
 #define tmp2 x8
 
+#ifdef __APPLE__
+# define memcmp _memcmp
+#endif
+
 	.macro def_fn f p2align=2
 	.text
 	.p2align \p2align
 	.global \f
+#ifndef __APPLE__
 	.type \f, %function
+#endif
 \f:
 	.endm
 
-def_fn memcmp p2align=6
+def_fn memcmp, p2align=6
 	subs limit, limit, 8
 	b.lo L(less8)
 
@@ -192,5 +198,7 @@ L(byte_loop):
 	sub result, data1w, data2w
 	ret
 
+#ifndef __APPLE__
 	.size memcmp, . - memcmp
 #endif
+#endif
diff --git a/newlib/libc/machine/aarch64/memcpy.S b/newlib/libc/machine/aarch64/memcpy.S
index d2de7415d..d9d3ef20f 100644
--- a/newlib/libc/machine/aarch64/memcpy.S
+++ b/newlib/libc/machine/aarch64/memcpy.S
@@ -87,11 +87,17 @@
 
 #define L(l) .L ## l
 
+#ifdef __APPLE__
+# define memcpy _memcpy
+#endif
+
 	.macro def_fn f p2align=2
 	.text
 	.p2align \p2align
 	.global \f
+#ifndef __APPLE__
 	.type \f, %function
+#endif
 \f:
 	.endm
 
@@ -104,7 +110,7 @@
    well as non-overlapping copies.
 */
 
-def_fn memcpy p2align=6
+def_fn memcpy, p2align=6
 	prfm PLDL1KEEP, [src]
 	add srcend, src, count
 	add dstend, dstin, count
@@ -226,5 +232,7 @@ L(copy_long):
 	stp C_l, C_h, [dstend, -16]
 	ret
 
+#ifndef __APPLE__
 	.size memcpy, . - memcpy
 #endif
+#endif
diff --git a/newlib/libc/machine/aarch64/memmove.S b/newlib/libc/machine/aarch64/memmove.S
index 6da548f10..395482061 100644
--- a/newlib/libc/machine/aarch64/memmove.S
+++ b/newlib/libc/machine/aarch64/memmove.S
@@ -61,11 +61,18 @@
 /* See memmove-stub.c */
 #else
 
+#ifdef __APPLE__
+# define memcpy _memcpy
+# define memmove _memmove
+#endif
+
 	.macro def_fn f p2align=2
 	.text
 	.p2align \p2align
 	.global \f
+#ifndef __APPLE__
 	.type \f, %function
+#endif
 \f:
 	.endm
 
@@ -97,8 +104,11 @@ def_fn memmove, 6
 	sub tmp1, dstin, src
 	cmp count, 96
 	ccmp tmp1, count, 2, hi
-	b.hs memcpy
+	/* Darwin can't use b.hs to jump to external labels. */
+	b.lo 0f
+	b memcpy
 
+0:
 	cbz tmp1, 3f
 	add dstend, dstin, count
 	add srcend, src, count
@@ -151,5 +161,7 @@ def_fn memmove, 6
 	stp C_l, C_h, [dstin]
 3:	ret
 
+#ifndef __APPLE__
 	.size memmove, . - memmove
 #endif
+#endif
diff --git a/newlib/libc/machine/aarch64/memset.S b/newlib/libc/machine/aarch64/memset.S
index cad9117b7..7bf190943 100644
--- a/newlib/libc/machine/aarch64/memset.S
+++ b/newlib/libc/machine/aarch64/memset.S
@@ -77,15 +77,21 @@
 
 #define L(l) .L ## l
 
+#ifdef __APPLE__
+# define memset _memset
+#endif
+
 	.macro def_fn f p2align=2
 	.text
 	.p2align \p2align
 	.global \f
+#ifndef __APPLE__
 	.type \f, %function
+#endif
 \f:
 	.endm
 
-def_fn memset p2align=6
+def_fn memset, p2align=6
 
 	dup v0.16B, valw
 	add dstend, dstin, count
@@ -236,5 +242,7 @@ L(zva_other):
 	sub dst, dst, 32 /* Bias dst for tail loop. */
 	b L(tail64)
 
+#ifndef __APPLE__
 	.size memset, . - memset
 #endif
+#endif
diff --git a/newlib/libc/machine/aarch64/rawmemchr.S b/newlib/libc/machine/aarch64/rawmemchr.S
index 484971b3f..9f37a4d83 100644
--- a/newlib/libc/machine/aarch64/rawmemchr.S
+++ b/newlib/libc/machine/aarch64/rawmemchr.S
@@ -36,11 +36,19 @@
 
 #define L(l) .L ## l
 
+#ifdef __APPLE__
+# define memchr _memchr
+# define rawmemchr _rawmemchr
+# define strlen _strlen
+#endif
+
 	.macro def_fn f p2align=2
 	.text
 	.p2align \p2align
 	.global \f
+#ifndef __APPLE__
 	.type \f, %function
+#endif
 \f:
 	.endm
 
@@ -48,7 +56,7 @@
    Call strlen without setting up a full frame - it preserves x14/x15.
 */
 
-def_fn rawmemchr p2align=5
+def_fn rawmemchr, p2align=5
 	.cfi_startproc
 	cbz w1, L(do_strlen)
 	mov x2, -1
@@ -63,5 +71,7 @@ L(do_strlen):
 	ret x15
 	.cfi_endproc
 
+#ifndef __APPLE__
 	.size rawmemchr, . - rawmemchr
 #endif
+#endif
diff --git a/newlib/libc/machine/aarch64/setjmp.S b/newlib/libc/machine/aarch64/setjmp.S
index fde0e45a7..0335b6729 100644
--- a/newlib/libc/machine/aarch64/setjmp.S
+++ b/newlib/libc/machine/aarch64/setjmp.S
@@ -41,10 +41,17 @@
 	REG_PAIR (d12, d13, 144); \
 	REG_PAIR (d14, d15, 160);
 
+#ifdef __APPLE__
+# define setjmp _setjmp
+# define longjmp _longjmp
+#endif
+
 // int setjmp (jmp_buf)
 	.global setjmp
 	.p2align 2
+#ifndef __APPLE__
 	.type setjmp, %function
+#endif
 setjmp:
 	mov x16, sp
 #define REG_PAIR(REG1, REG2, OFFS) stp REG1, REG2, [x0, OFFS]
@@ -55,12 +62,16 @@ setjmp:
 #undef REG_ONE
 	mov w0, #0
 	ret
+#ifndef __APPLE__
 	.size setjmp, .-setjmp
+#endif
 
 // void longjmp (jmp_buf, int) __attribute__ ((noreturn))
 	.global longjmp
 	.p2align 2
+#ifndef __APPLE__
 	.type longjmp, %function
+#endif
 longjmp:
 #define REG_PAIR(REG1, REG2, OFFS) ldp REG1, REG2, [x0, OFFS]
 #define REG_ONE(REG1, OFFS) ldr REG1, [x0, OFFS]
@@ -73,4 +84,7 @@ longjmp:
 	cinc w0, w1, eq
 	// use br not ret, as ret is guaranteed to mispredict
 	br x30
+
+#ifndef __APPLE__
 	.size longjmp, .-longjmp
+#endif
diff --git a/newlib/libc/machine/aarch64/strchr.S b/newlib/libc/machine/aarch64/strchr.S
index 8ed6ef673..c7e159b0a 100644
--- a/newlib/libc/machine/aarch64/strchr.S
+++ b/newlib/libc/machine/aarch64/strchr.S
@@ -74,11 +74,17 @@
 
 /* Locals and temporaries. */
 
+#ifdef __APPLE__
+# define strchr _strchr
+#endif
+
 	.macro def_fn f p2align=2
 	.text
 	.p2align \p2align
 	.global \f
+#ifndef __APPLE__
 	.type \f, %function
+#endif
 \f:
 	.endm
 
@@ -160,5 +166,7 @@ def_fn strchr
 	csel result, result, xzr, eq
 	ret
 
+#ifndef __APPLE__
 	.size strchr, . - strchr
 #endif
+#endif
diff --git a/newlib/libc/machine/aarch64/strchrnul.S b/newlib/libc/machine/aarch64/strchrnul.S
index 0e257fa06..9f5551f59 100644
--- a/newlib/libc/machine/aarch64/strchrnul.S
+++ b/newlib/libc/machine/aarch64/strchrnul.S
@@ -70,11 +70,17 @@
 
 /* Locals and temporaries. */
 
+#ifdef __APPLE__
+# define strchrnul _strchrnul
+#endif
+
 	.macro def_fn f p2align=2
 	.text
 	.p2align \p2align
 	.global \f
+#ifndef __APPLE__
 	.type \f, %function
+#endif
 \f:
 	.endm
 
@@ -145,5 +151,7 @@ def_fn strchrnul
 	add result, src, tmp1, lsr #1
 	ret
 
+#ifndef __APPLE__
 	.size strchrnul, . - strchrnul
 #endif
+#endif
diff --git a/newlib/libc/machine/aarch64/strcmp.S b/newlib/libc/machine/aarch64/strcmp.S
index cabcf4faa..ce6c2f5ad 100644
--- a/newlib/libc/machine/aarch64/strcmp.S
+++ b/newlib/libc/machine/aarch64/strcmp.S
@@ -33,11 +33,17 @@
 /* See strcmp-stub.c */
 #else
 
+#ifdef __APPLE__
+# define strcmp _strcmp
+#endif
+
 	.macro def_fn f p2align=2
 	.text
 	.p2align \p2align
 	.global \f
+#ifndef __APPLE__
 	.type \f, %function
+#endif
 \f:
 	.endm
 
@@ -67,7 +73,7 @@
 #define pos x11
 
 	/* Start of performance-critical section -- one 64B cache line. */
-def_fn strcmp p2align=6
+def_fn strcmp, p2align=6
 	eor tmp1, src1, src2
 	mov zeroones, #REP8_01
 	tst tmp1, #7
@@ -197,6 +203,8 @@ L(loop_misaligned):
 L(done):
 	sub result, data1, data2
 	ret
-	.size strcmp, .-strcmp
 
+#ifndef __APPLE__
+	.size strcmp, .-strcmp
+#endif
 #endif
diff --git a/newlib/libc/machine/aarch64/strcpy.S b/newlib/libc/machine/aarch64/strcpy.S
index 95533de60..f9b293423 100644
--- a/newlib/libc/machine/aarch64/strcpy.S
+++ b/newlib/libc/machine/aarch64/strcpy.S
@@ -66,17 +66,27 @@
 #define len x16
 #define to_align x17
 
+#ifdef __APPLE__
+#ifdef BUILD_STPCPY
+#define STRCPY _stpcpy
+#else
+#define STRCPY _strcpy
+#endif
+#else
 #ifdef BUILD_STPCPY
 #define STRCPY stpcpy
 #else
 #define STRCPY strcpy
+#endif
 #endif
 
 	.macro def_fn f p2align=2
 	.text
 	.p2align \p2align
 	.global \f
+#ifndef __APPLE__
 	.type \f, %function
+#endif
 \f:
 	.endm
 
@@ -111,7 +121,7 @@
 
 #define MIN_PAGE_SIZE (1 << MIN_PAGE_P2)
 
-def_fn STRCPY p2align=6
+def_fn STRCPY, p2align=6
 	/* For moderately short strings, the fastest way to do the copy is to
 	   calculate the length of the string in the same way as strlen, then
 	   essentially do a memcpy of the result. This avoids the need for
@@ -337,5 +347,7 @@ def_fn STRCPY p2align=6
 	bic has_nul2, tmp3, tmp4
 	b .Lfp_gt8
 
+#ifndef __APPLE__
 	.size STRCPY, . - STRCPY
 #endif
+#endif
diff --git a/newlib/libc/machine/aarch64/strlen.S b/newlib/libc/machine/aarch64/strlen.S
index 7e6ced01d..c1ef145ea 100644
--- a/newlib/libc/machine/aarch64/strlen.S
+++ b/newlib/libc/machine/aarch64/strlen.S
@@ -55,11 +55,17 @@
 
 #define L(l) .L ## l
 
+#ifdef __APPLE__
+# define strlen _strlen
+#endif
+
 	.macro def_fn f p2align=2
 	.text
 	.p2align \p2align
 	.global \f
+#ifndef __APPLE__
 	.type \f, %function
+#endif
 \f:
 	.endm
 
@@ -104,7 +110,7 @@
 	   whether the first fetch, which may be misaligned, crosses a page
 	   boundary. */
 
-def_fn strlen p2align=6
+def_fn strlen, p2align=6
 	and tmp1, srcin, MIN_PAGE_SIZE - 1
 	mov zeroones, REP8_01
 	cmp tmp1, MIN_PAGE_SIZE - 16
@@ -234,5 +240,7 @@ L(page_cross):
 	csel data2, data2, tmp2, eq
 	b L(page_cross_entry)
 
+#ifndef __APPLE__
 	.size strlen, . - strlen
 #endif
+#endif
diff --git a/newlib/libc/machine/aarch64/strncmp.S b/newlib/libc/machine/aarch64/strncmp.S
index b218e95a7..bbae2a083 100644
--- a/newlib/libc/machine/aarch64/strncmp.S
+++ b/newlib/libc/machine/aarch64/strncmp.S
@@ -33,11 +33,17 @@
  * ARMv8-a, AArch64
  */
 
+#ifdef __APPLE__
+# define strncmp _strncmp
+#endif
+
 	.macro def_fn f p2align=2
 	.text
 	.p2align \p2align
 	.global \f
+#ifndef __APPLE__
 	.type \f, %function
+#endif
 \f:
 	.endm
 
@@ -286,5 +292,8 @@ def_fn strncmp
 .Lret0:
 	mov result, #0
 	ret
+
+#ifndef __APPLE__
 	.size strncmp, . - strncmp
 #endif
+#endif
diff --git a/newlib/libc/machine/aarch64/strnlen.S b/newlib/libc/machine/aarch64/strnlen.S
index 0eb742412..f6f501fec 100644
--- a/newlib/libc/machine/aarch64/strnlen.S
+++ b/newlib/libc/machine/aarch64/strnlen.S
@@ -55,11 +55,17 @@
 #define pos x13
 #define limit_wd x14
 
+#ifdef __APPLE__
+# define strnlen _strnlen
+#endif
+
 	.macro def_fn f p2align=2
 	.text
 	.p2align \p2align
 	.global \f
+#ifndef __APPLE__
 	.type \f, %function
+#endif
 \f:
 	.endm
 
@@ -182,6 +188,8 @@ def_fn strnlen
 	csinv data1, data1, xzr, le
 	csel data2, data2, data2a, le
 	b .Lrealigned
-	.size strnlen, . - .Lstart /* Include pre-padding in size. */
 
+#ifndef __APPLE__
+	.size strnlen, . - .Lstart /* Include pre-padding in size. */
+#endif
 #endif
diff --git a/newlib/libc/machine/aarch64/strrchr.S b/newlib/libc/machine/aarch64/strrchr.S
index ee425c42b..b65833fe0 100644
--- a/newlib/libc/machine/aarch64/strrchr.S
+++ b/newlib/libc/machine/aarch64/strrchr.S
@@ -80,11 +80,17 @@
 
 /* Locals and temporaries. */
 
+#ifdef __APPLE__
+# define strrchr _strrchr
+#endif
+
 	.macro def_fn f p2align=2
 	.text
 	.p2align \p2align
 	.global \f
+#ifndef __APPLE__
 	.type \f, %function
+#endif
 \f:
 	.endm
 
@@ -178,5 +184,7 @@ def_fn strrchr
 
 	ret
 
+#ifndef __APPLE__
 	.size strrchr, . - strrchr
 #endif
+#endif
--
2.24.3 (Apple Git-128)