From mboxrd@z Thu Jan 1 00:00:00 1970
MIME-Version: 1.0
References: <20231027165918.69721-1-sebastian.huber@embedded-brains.de>
In-Reply-To: <20231027165918.69721-1-sebastian.huber@embedded-brains.de>
Reply-To: joel@rtems.org
From: Joel Sherrill
Date: Fri, 27 Oct 2023 12:15:25 -0500
Subject: Re: [PATCH] aarch64: Remove duplicated optimized memmove()
To: Sebastian Huber
Cc: newlib@sourceware.org
Content-Type: multipart/alternative; boundary="00000000000043a8090608b5d6f8"

--00000000000043a8090608b5d6f8
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable

This doesn't appear to resolve the issue.

$ aarch64-rtems6-nm -g ~/test-gcc/install-master/aarch64-rtems6/lib/libc.a | grep "T memmov"
0000000000000000 T memmove
0000000000000000 T memmove

The libc/string/memmove.c is compiled and not overridden by anything in
machine/aarch64. This results in the libc/string and memcpy.S objects
both being in the libc.a with memmove().

I think there has to be an empty file for memmove.[cS] in machine/aarch64
to override the libc/string version. Alternatively, I think the
memmove-stub.c could just be renamed memmove.c and this should also give
the same result.

--joel

On Fri, Oct 27, 2023 at 11:59 AM Sebastian Huber <
sebastian.huber@embedded-brains.de> wrote:

> The optimized aarch64/memcpy.S already provides a memmove() implementation.
> ---
> newlib/Makefile.in | 20 ---
> newlib/libc/machine/aarch64/Makefile.inc | 1 -
> newlib/libc/machine/aarch64/memmove-stub.c | 2 +-
> newlib/libc/machine/aarch64/memmove.S | 155 ---------------------
> 4 files changed, 1 insertion(+), 177 deletions(-)
> delete mode 100644 newlib/libc/machine/aarch64/memmove.S
>
> diff --git a/newlib/Makefile.in b/newlib/Makefile.in
> index 4cb3534cc4..9eb21c7919 100644
> --- a/newlib/Makefile.in
> +++ b/newlib/Makefile.in
> @@ -595,7 +595,6 @@ check_PROGRAMS =
> @HAVE_LIBC_MACHINE_AARCH64_TRUE@
> libc/machine/aarch64/memcpy-stub.c \
> @HAVE_LIBC_MACHINE_AARCH64_TRUE@ libc/machine/aarch64/memcpy.S \
> @HAVE_LIBC_MACHINE_AARCH64_TRUE@
> libc/machine/aarch64/memmove-stub.c \
> -@HAVE_LIBC_MACHINE_AARCH64_TRUE@ libc/machine/aarch64/memmove.S \
> @HAVE_LIBC_MACHINE_AARCH64_TRUE@
> libc/machine/aarch64/memrchr-stub.c \
> @HAVE_LIBC_MACHINE_AARCH64_TRUE@ libc/machine/aarch64/memrchr.S \
> @HAVE_LIBC_MACHINE_AARCH64_TRUE@
> libc/machine/aarch64/memset-stub.c \
> @@ -1848,7 +1847,6 @@ am__objects_51 = libc/ssp/libc_a-chk_fail.$(OBJEXT) \
> @HAVE_LIBC_MACHINE_AARCH64_TRUE@
> libc/machine/aarch64/libc_a-memcpy-stub.$(OBJEXT) \
> @HAVE_LIBC_MACHINE_AARCH64_TRUE@
> libc/machine/aarch64/libc_a-memcpy.$(OBJEXT) \
> @HAVE_LIBC_MACHINE_AARCH64_TRUE@
> libc/machine/aarch64/libc_a-memmove-stub.$(OBJEXT) \
> -@HAVE_LIBC_MACHINE_AARCH64_TRUE@
> libc/machine/aarch64/libc_a-memmove.$(OBJEXT) \
> @HAVE_LIBC_MACHINE_AARCH64_TRUE@
> libc/machine/aarch64/libc_a-memrchr-stub.$(OBJEXT) \
> @HAVE_LIBC_MACHINE_AARCH64_TRUE@
> libc/machine/aarch64/libc_a-memrchr.$(OBJEXT) \
> @HAVE_LIBC_MACHINE_AARCH64_TRUE@
> libc/machine/aarch64/libc_a-memset-stub.$(OBJEXT) \
> @@ -8025,9 +8023,6 @@ libc/machine/aarch64/libc_a-memcpy.$(OBJEXT): \
> libc/machine/aarch64/libc_a-memmove-stub.$(OBJEXT): \
> libc/machine/aarch64/$(am__dirstamp) \
> libc/machine/aarch64/$(DEPDIR)/$(am__dirstamp)
> -libc/machine/aarch64/libc_a-memmove.$(OBJEXT): \
> - libc/machine/aarch64/$(am__dirstamp) \
> - libc/machine/aarch64/$(DEPDIR)/$(am__dirstamp)
> libc/machine/aarch64/libc_a-memrchr-stub.$(OBJEXT): \
> libc/machine/aarch64/$(am__dirstamp) \
> libc/machine/aarch64/$(DEPDIR)/$(am__dirstamp)
> @@ -12739,7 +12734,6 @@ distclean-compile:
> @AMDEP_TRUE@@am__include@ @am__quote@libc
> /machine/aarch64/$(DEPDIR)/libc_a-memcpy-stub.Po@am__quote@
> @AMDEP_TRUE@@am__include@ @am__quote@libc
> /machine/aarch64/$(DEPDIR)/libc_a-memcpy.Po@am__quote@
> @AMDEP_TRUE@@am__include@ @am__quote@libc
> /machine/aarch64/$(DEPDIR)/libc_a-memmove-stub.Po@am__quote@
> -@AMDEP_TRUE@@am__include@ @am__quote@libc
> /machine/aarch64/$(DEPDIR)/libc_a-memmove.Po@am__quote@
> @AMDEP_TRUE@@am__include@ @am__quote@libc
> /machine/aarch64/$(DEPDIR)/libc_a-memrchr-stub.Po@am__quote@
> @AMDEP_TRUE@@am__include@ @am__quote@libc
> /machine/aarch64/$(DEPDIR)/libc_a-memrchr.Po@am__quote@
> @AMDEP_TRUE@@am__include@ @am__quote@libc
> /machine/aarch64/$(DEPDIR)/libc_a-memset-stub.Po@am__quote@
> @@ -16709,20 +16703,6 @@ libc/machine/aarch64/libc_a-memcpy.obj:
> libc/machine/aarch64/memcpy.S
> @AMDEP_TRUE@@am__fastdepCCAS_FALSE@ DEPDIR=$(DEPDIR) $(CCASDEPMODE)
> $(depcomp) @AMDEPBACKSLASH@
> @am__fastdepCCAS_FALSE@ $(AM_V_CPPAS@am__nodep@)$(CCAS) $(DEFS)
> $(DEFAULT_INCLUDES) $(INCLUDES) $(libc_a_CPPFLAGS) $(CPPFLAGS)
> $(libc_a_CCASFLAGS) $(CCASFLAGS) -c -o
> libc/machine/aarch64/libc_a-memcpy.obj `if test -f
> 'libc/machine/aarch64/memcpy.S'; then $(CYGPATH_W)
> 'libc/machine/aarch64/memcpy.S'; else $(CYGPATH_W)
> '$(srcdir)/libc/machine/aarch64/memcpy.S'; fi`
>
> -libc/machine/aarch64/libc_a-memmove.o: libc/machine/aarch64/memmove.S
> -@am__fastdepCCAS_TRUE@ $(AM_V_CPPAS)$(CCAS) $(DEFS) $(DEFAULT_INCLUDES)
> $(INCLUDES) $(libc_a_CPPFLAGS) $(CPPFLAGS) $(libc_a_CCASFLAGS) $(CCASFLAGS)
> -MT libc/machine/aarch64/libc_a-memmove.o -MD -MP -MF
> libc/machine/aarch64/$(DEPDIR)/libc_a-memmove.Tpo -c -o
> libc/machine/aarch64/libc_a-memmove.o `test -f
> 'libc/machine/aarch64/memmove.S' || echo
> '$(srcdir)/'`libc/machine/aarch64/memmove.S
> -@am__fastdepCCAS_TRUE@ $(AM_V_at)$(am__mv)
> libc/machine/aarch64/$(DEPDIR)/libc_a-memmove.Tpo
> libc/machine/aarch64/$(DEPDIR)/libc_a-memmove.Po
> -@AMDEP_TRUE@@am__fastdepCCAS_FALSE@
> $(AM_V_CPPAS)source='libc/machine/aarch64/memmove.S'
> object='libc/machine/aarch64/libc_a-memmove.o' libtool=no @AMDEPBACKSLASH@
> -@AMDEP_TRUE@@am__fastdepCCAS_FALSE@ DEPDIR=$(DEPDIR) $(CCASDEPMODE)
> $(depcomp) @AMDEPBACKSLASH@
> -@am__fastdepCCAS_FALSE@ $(AM_V_CPPAS@am__nodep@)$(CCAS) $(DEFS)
> $(DEFAULT_INCLUDES) $(INCLUDES) $(libc_a_CPPFLAGS) $(CPPFLAGS)
> $(libc_a_CCASFLAGS) $(CCASFLAGS) -c -o
> libc/machine/aarch64/libc_a-memmove.o `test -f
> 'libc/machine/aarch64/memmove.S' || echo
> '$(srcdir)/'`libc/machine/aarch64/memmove.S
> -
> -libc/machine/aarch64/libc_a-memmove.obj: libc/machine/aarch64/memmove.S
> -@am__fastdepCCAS_TRUE@ $(AM_V_CPPAS)$(CCAS) $(DEFS) $(DEFAULT_INCLUDES)
> $(INCLUDES) $(libc_a_CPPFLAGS) $(CPPFLAGS) $(libc_a_CCASFLAGS) $(CCASFLAGS)
> -MT libc/machine/aarch64/libc_a-memmove.obj -MD -MP -MF
> libc/machine/aarch64/$(DEPDIR)/libc_a-memmove.Tpo -c -o
> libc/machine/aarch64/libc_a-memmove.obj `if test -f
> 'libc/machine/aarch64/memmove.S'; then $(CYGPATH_W)
> 'libc/machine/aarch64/memmove.S'; else $(CYGPATH_W)
> '$(srcdir)/libc/machine/aarch64/memmove.S'; fi`
> -@am__fastdepCCAS_TRUE@ $(AM_V_at)$(am__mv)
> libc/machine/aarch64/$(DEPDIR)/libc_a-memmove.Tpo
> libc/machine/aarch64/$(DEPDIR)/libc_a-memmove.Po
> -@AMDEP_TRUE@@am__fastdepCCAS_FALSE@
> $(AM_V_CPPAS)source='libc/machine/aarch64/memmove.S'
> object='libc/machine/aarch64/libc_a-memmove.obj' libtool=no @AMDEPBACKSLASH@
> -@AMDEP_TRUE@@am__fastdepCCAS_FALSE@ DEPDIR=$(DEPDIR) $(CCASDEPMODE)
> $(depcomp) @AMDEPBACKSLASH@
> -@am__fastdepCCAS_FALSE@ $(AM_V_CPPAS@am__nodep@)$(CCAS) $(DEFS)
> $(DEFAULT_INCLUDES) $(INCLUDES) $(libc_a_CPPFLAGS) $(CPPFLAGS)
> $(libc_a_CCASFLAGS) $(CCASFLAGS) -c -o
> libc/machine/aarch64/libc_a-memmove.obj `if test -f
> 'libc/machine/aarch64/memmove.S'; then $(CYGPATH_W)
> 'libc/machine/aarch64/memmove.S'; else $(CYGPATH_W)
> '$(srcdir)/libc/machine/aarch64/memmove.S'; fi`
> -
> libc/machine/aarch64/libc_a-memrchr.o: libc/machine/aarch64/memrchr.S
> @am__fastdepCCAS_TRUE@ $(AM_V_CPPAS)$(CCAS) $(DEFS) $(DEFAULT_INCLUDES)
> $(INCLUDES) $(libc_a_CPPFLAGS) $(CPPFLAGS) $(libc_a_CCASFLAGS) $(CCASFLAGS)
> -MT libc/machine/aarch64/libc_a-memrchr.o -MD -MP -MF
> libc/machine/aarch64/$(DEPDIR)/libc_a-memrchr.Tpo -c -o
> libc/machine/aarch64/libc_a-memrchr.o `test -f
> 'libc/machine/aarch64/memrchr.S' || echo
> '$(srcdir)/'`libc/machine/aarch64/memrchr.S
> @am__fastdepCCAS_TRUE@ $(AM_V_at)$(am__mv)
> libc/machine/aarch64/$(DEPDIR)/libc_a-memrchr.Tpo
> libc/machine/aarch64/$(DEPDIR)/libc_a-memrchr.Po
> diff --git a/newlib/libc/machine/aarch64/Makefile.inc
> b/newlib/libc/machine/aarch64/Makefile.inc
> index c749b0d575..f705dfea15 100644
> --- a/newlib/libc/machine/aarch64/Makefile.inc
> +++ b/newlib/libc/machine/aarch64/Makefile.inc
> @@ -6,7 +6,6 @@ libc_a_SOURCES += \
> %D%/memcpy-stub.c \
> %D%/memcpy.S \
> %D%/memmove-stub.c \
> - %D%/memmove.S \
> %D%/memrchr-stub.c \
> %D%/memrchr.S \
> %D%/memset-stub.c \
> diff --git a/newlib/libc/machine/aarch64/memmove-stub.c
> b/newlib/libc/machine/aarch64/memmove-stub.c
> index 8fa4ab9387..bc8255fb8b 100644
> --- a/newlib/libc/machine/aarch64/memmove-stub.c
> +++ b/newlib/libc/machine/aarch64/memmove-stub.c
> @@ -27,5 +27,5 @@
> #if (defined (__OPTIMIZE_SIZE__) || defined (PREFER_SIZE_OVER_SPEED))
> # include "../../string/memmove.c"
> #else
> -/* See memmove.S */
> +/* See memcpy.S */
> #endif
> diff --git a/newlib/libc/machine/aarch64/memmove.S
> b/newlib/libc/machine/aarch64/memmove.S
> deleted file mode 100644
> index 597a8c8e9e..0000000000
> --- a/newlib/libc/machine/aarch64/memmove.S
> +++ /dev/null
> @@ -1,155 +0,0 @@
> -/* Copyright (c) 2013, Linaro Limited
> - All rights reserved.
> -
> - Redistribution and use in source and binary forms, with or without
> - modification, are permitted provided that the following conditions are
> met:
> - * Redistributions of source code must retain the above copyright
> - notice, this list of conditions and the following disclaimer.
> - * Redistributions in binary form must reproduce the above copyright
> - notice, this list of conditions and the following disclaimer in
> the
> - documentation and/or other materials provided with the
> distribution.
> - * Neither the name of the Linaro nor the
> - names of its contributors may be used to endorse or promote
> products
> - derived from this software without specific prior written
> permission.
> -
> - THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
> - "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
> - LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
> - A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
> - HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
> - SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
> - LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
> - DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
> - THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
> - (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
> - OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. */
> -
> -/*
> - * Copyright (c) 2015 ARM Ltd
> - * All rights reserved.
> - *
> - * Redistribution and use in source and binary forms, with or without
> - * modification, are permitted provided that the following conditions
> - * are met:
> - * 1. Redistributions of source code must retain the above copyright
> - * notice, this list of conditions and the following disclaimer.
> - * 2. Redistributions in binary form must reproduce the above copyright
> - * notice, this list of conditions and the following disclaimer in the
> - * documentation and/or other materials provided with the distribution.
> - * 3. The name of the company may not be used to endorse or promote
> - * products derived from this software without specific prior written
> - * permission.
> - *
> - * THIS SOFTWARE IS PROVIDED BY ARM LTD ``AS IS'' AND ANY EXPRESS OR
> IMPLIED
> - * WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF
> - * MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED.
> - * IN NO EVENT SHALL ARM LTD BE LIABLE FOR ANY DIRECT, INDIRECT,
> INCIDENTAL,
> - * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
> LIMITED
> - * TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR
> - * PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF
> - * LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING
> - * NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
> - * SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
> - */
> -
> -/* Assumptions:
> - *
> - * ARMv8-a, AArch64, unaligned accesses
> - */
> -
> -#if (defined (__OPTIMIZE_SIZE__) || defined (PREFER_SIZE_OVER_SPEED))
> -/* See memmove-stub.c */
> -#else
> -
> - .macro def_fn f p2align=0
> - .text
> - .p2align \p2align
> - .global \f
> - .type \f, %function
> -\f:
> - .endm
> -
> -/* Parameters and result. */
> -#define dstin x0
> -#define src x1
> -#define count x2
> -#define srcend x3
> -#define dstend x4
> -#define tmp1 x5
> -#define A_l x6
> -#define A_h x7
> -#define B_l x8
> -#define B_h x9
> -#define C_l x10
> -#define C_h x11
> -#define D_l x12
> -#define D_h x13
> -#define E_l count
> -#define E_h tmp1
> -
> -/* All memmoves up to 96 bytes are done by memcpy as it supports overlaps.
> - Larger backwards copies are also handled by memcpy. The only remaining
> - case is forward large copies. The destination is aligned, and an
> - unrolled loop processes 64 bytes per iteration.
> -*/
> -
> -def_fn memmove, 6
> - sub tmp1, dstin, src
> - cmp count, 96
> - ccmp tmp1, count, 2, hi
> - b.hs memcpy
> -
> - cbz tmp1, 3f
> - add dstend, dstin, count
> - add srcend, src, count
> -
> - /* Align dstend to 16 byte alignment so that we don't cross cache
> line
> - boundaries on both loads and stores. There are at least 96
> bytes
> - to copy, so copy 16 bytes unaligned and then align. The loop
> - copies 64 bytes per iteration and prefetches one iteration
> ahead. */
> -
> - and tmp1, dstend, 15
> - ldp D_l, D_h, [srcend, -16]
> - sub srcend, srcend, tmp1
> - sub count, count, tmp1
> - ldp A_l, A_h, [srcend, -16]
> - stp D_l, D_h, [dstend, -16]
> - ldp B_l, B_h, [srcend, -32]
> - ldp C_l, C_h, [srcend, -48]
> - ldp D_l, D_h, [srcend, -64]!
> - sub dstend, dstend, tmp1
> - subs count, count, 128
> - b.ls 2f
> - nop
> -1:
> - stp A_l, A_h, [dstend, -16]
> - ldp A_l, A_h, [srcend, -16]
> - stp B_l, B_h, [dstend, -32]
> - ldp B_l, B_h, [srcend, -32]
> - stp C_l, C_h, [dstend, -48]
> - ldp C_l, C_h, [srcend, -48]
> - stp D_l, D_h, [dstend, -64]!
> - ldp D_l, D_h, [srcend, -64]!
> - subs count, count, 64
> - b.hi 1b
> -
> - /* Write the last full set of 64 bytes. The remainder is at most
> 64
> - bytes, so it is safe to always copy 64 bytes from the start
> even if
> - there is just 1 byte left. */
> -2:
> - ldp E_l, E_h, [src, 48]
> - stp A_l, A_h, [dstend, -16]
> - ldp A_l, A_h, [src, 32]
> - stp B_l, B_h, [dstend, -32]
> - ldp B_l, B_h, [src, 16]
> - stp C_l, C_h, [dstend, -48]
> - ldp C_l, C_h, [src]
> - stp D_l, D_h, [dstend, -64]
> - stp E_l, E_h, [dstin, 48]
> - stp A_l, A_h, [dstin, 32]
> - stp B_l, B_h, [dstin, 16]
> - stp C_l, C_h, [dstin]
> -3: ret
> -
> - .size memmove, . - memmove
> -#endif
> --
> 2.35.3
>
>

--00000000000043a8090608b5d6f8--