From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-yb1-xb36.google.com (mail-yb1-xb36.google.com [IPv6:2607:f8b0:4864:20::b36]) by sourceware.org (Postfix) with ESMTPS id 9A2B83841457 for ; Wed, 29 Jun 2022 22:13:46 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 9A2B83841457 Received: by mail-yb1-xb36.google.com with SMTP id i15so30532863ybp.1 for ; Wed, 29 Jun 2022 15:13:46 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=0ZjH4/CfzyF6R+2lrhlfC6jylHUyxGaaIOX7xddJDgg=; b=ajhtqKAfOEHL9RzlncUJHV4/zQFdFrpYNV9U/p0qC2BQcQAWHMUUVj/OpNYWMoOatZ DRQC6FR2wvHYEhFTRsgfs5U0S8u5qUQ5f/G4aOviKMuZeyanih98Q6tvpuXZsD0pX06j gOmt3PJnZvgQBfOFY3Tqs/Q7F3SPy4vGwTkWDL7hpfJrfvQEaumsfA7dmmT7KczioO2p UI5FlDrH8oRgouMXVsUa6lRb9pEF1qb7W5H0Je2IWrdNmxFLlW6yf7M9sZfMhFPxXgtm SVLHLLrYYHNBffy9e84qkMcv6968B6bEPk6WAfF9TQmMf9G4Yz7OVrb1gYBfk8b+303R e2Pw== X-Gm-Message-State: AJIora9cau8gRLREzFt0QnYQj4cm9K7WH/r87+H99qa+7pSvR/dLVuUM wMkVntSnmOhp4cKxc/7r8i+qIT3n9XO7RTyD5pGZmUbfWFs= X-Google-Smtp-Source: AGRyM1vyRidIOdHqH45AJrbqVkNxyU861S82mzt3plmjM34eAFjQNCC+PQ2KVm7FtnjGLjQav2k0eS8T6cDWw651yOs= X-Received: by 2002:a25:d715:0:b0:66a:1c6a:3eb5 with SMTP id o21-20020a25d715000000b0066a1c6a3eb5mr6066739ybg.555.1656540825978; Wed, 29 Jun 2022 15:13:45 -0700 (PDT) MIME-Version: 1.0 References: <20220628152757.17922-1-goldstein.w.n@gmail.com> In-Reply-To: From: Noah Goldstein Date: Wed, 29 Jun 2022 15:13:35 -0700 Message-ID: Subject: Re: [PATCH v1 1/2] x86: Move mem{p}{mov|cpy}_{chk_}erms to its own file To: "H.J. Lu" Cc: GNU C Library , "Carlos O'Donell" Content-Type: text/plain; charset="UTF-8" X-Spam-Status: No, score=-8.1 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, FREEMAIL_FROM, GIT_PATCH_0, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: libc-alpha@sourceware.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Libc-alpha mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 29 Jun 2022 22:13:48 -0000 On Wed, Jun 29, 2022 at 1:17 PM H.J. Lu wrote: > > On Wed, Jun 29, 2022 at 1:03 PM Noah Goldstein wrote: > > > > On Wed, Jun 29, 2022 at 12:59 PM H.J. Lu wrote: > > > > > > On Wed, Jun 29, 2022 at 12:48 PM Noah Goldstein wrote: > > > > > > > > On Wed, Jun 29, 2022 at 12:46 PM H.J. Lu wrote: > > > > > > > > > > On Wed, Jun 29, 2022 at 12:34 PM Noah Goldstein wrote: > > > > > > > > > > > > On Wed, Jun 29, 2022 at 12:32 PM H.J. Lu wrote: > > > > > > > > > > > > > > On Tue, Jun 28, 2022 at 8:28 AM Noah Goldstein wrote: > > > > > > > > > > > > > > > > The primary memmove_{impl}_unaligned_erms implementations don't > > > > > > > > interact with this function. Putting them in same file both > > > > > > > > wastes space and unnecessarily bloats a hot code section. > > > > > > > > --- > > > > > > > > sysdeps/x86_64/multiarch/memmove-erms.S | 53 +++++++++++++++++++ > > > > > > > > .../multiarch/memmove-vec-unaligned-erms.S | 50 ----------------- > > > > > > > > 2 files changed, 53 insertions(+), 50 deletions(-) > > > > > > > > create mode 100644 sysdeps/x86_64/multiarch/memmove-erms.S > > > > > > > > > > > > > > > > diff --git a/sysdeps/x86_64/multiarch/memmove-erms.S b/sysdeps/x86_64/multiarch/memmove-erms.S > > > > > > > > new file mode 100644 > > > > > > > > index 0000000000..d98d21644b > > > > > > > > --- /dev/null > > > > > > > > +++ b/sysdeps/x86_64/multiarch/memmove-erms.S > > > > > > > > @@ -0,0 +1,53 @@ > > > > > > > > +#include > > > > > > > > + > > > > > > > > +#if defined USE_MULTIARCH && IS_IN (libc) > > > > > > > > + .text > > > > > > > > +ENTRY (__mempcpy_chk_erms) > > > > > > > > + cmp %RDX_LP, %RCX_LP > > > > > > > > + jb HIDDEN_JUMPTARGET (__chk_fail) > > > > > > > > +END (__mempcpy_chk_erms) > > > > > > > > + > > > > > > > > +/* Only used to measure performance of REP MOVSB. */ > > > > > > > > +ENTRY (__mempcpy_erms) > > > > > > > > + mov %RDI_LP, %RAX_LP > > > > > > > > + /* Skip zero length. */ > > > > > > > > + test %RDX_LP, %RDX_LP > > > > > > > > + jz 2f > > > > > > > > + add %RDX_LP, %RAX_LP > > > > > > > > + jmp L(start_movsb) > > > > > > > > +END (__mempcpy_erms) > > > > > > > > + > > > > > > > > +ENTRY (__memmove_chk_erms) > > > > > > > > + cmp %RDX_LP, %RCX_LP > > > > > > > > + jb HIDDEN_JUMPTARGET (__chk_fail) > > > > > > > > +END (__memmove_chk_erms) > > > > > > > > + > > > > > > > > +ENTRY (__memmove_erms) > > > > > > > > + movq %rdi, %rax > > > > > > > > + /* Skip zero length. */ > > > > > > > > + test %RDX_LP, %RDX_LP > > > > > > > > + jz 2f > > > > > > > > +L(start_movsb): > > > > > > > > + mov %RDX_LP, %RCX_LP > > > > > > > > + cmp %RSI_LP, %RDI_LP > > > > > > > > + jb 1f > > > > > > > > + /* Source == destination is less common. */ > > > > > > > > + je 2f > > > > > > > > + lea (%rsi,%rcx), %RDX_LP > > > > > > > > + cmp %RDX_LP, %RDI_LP > > > > > > > > + jb L(movsb_backward) > > > > > > > > +1: > > > > > > > > + rep movsb > > > > > > > > +2: > > > > > > > > + ret > > > > > > > > +L(movsb_backward): > > > > > > > > + leaq -1(%rdi,%rcx), %rdi > > > > > > > > + leaq -1(%rsi,%rcx), %rsi > > > > > > > > + std > > > > > > > > + rep movsb > > > > > > > > + cld > > > > > > > > + ret > > > > > > > > +END (__memmove_erms) > > > > > > > > +strong_alias (__memmove_erms, __memcpy_erms) > > > > > > > > +strong_alias (__memmove_chk_erms, __memcpy_chk_erms) > > > > > > > > +#endif > > > > > > > > diff --git a/sysdeps/x86_64/multiarch/memmove-vec-unaligned-erms.S b/sysdeps/x86_64/multiarch/memmove-vec-unaligned-erms.S > > > > > > > > index d1518b8bab..04747133b7 100644 > > > > > > > > --- a/sysdeps/x86_64/multiarch/memmove-vec-unaligned-erms.S > > > > > > > > +++ b/sysdeps/x86_64/multiarch/memmove-vec-unaligned-erms.S > > > > > > > > @@ -239,56 +239,6 @@ L(start): > > > > > > > > #endif > > > > > > > > #if defined USE_MULTIARCH && IS_IN (libc) > > > > > > > > END (MEMMOVE_SYMBOL (__memmove, unaligned)) > > > > > > > > -# if VEC_SIZE == 16 > > > > > > > > -ENTRY (__mempcpy_chk_erms) > > > > > > > > - cmp %RDX_LP, %RCX_LP > > > > > > > > - jb HIDDEN_JUMPTARGET (__chk_fail) > > > > > > > > -END (__mempcpy_chk_erms) > > > > > > > > - > > > > > > > > -/* Only used to measure performance of REP MOVSB. */ > > > > > > > > -ENTRY (__mempcpy_erms) > > > > > > > > - mov %RDI_LP, %RAX_LP > > > > > > > > - /* Skip zero length. */ > > > > > > > > - test %RDX_LP, %RDX_LP > > > > > > > > - jz 2f > > > > > > > > - add %RDX_LP, %RAX_LP > > > > > > > > - jmp L(start_movsb) > > > > > > > > -END (__mempcpy_erms) > > > > > > > > - > > > > > > > > -ENTRY (__memmove_chk_erms) > > > > > > > > - cmp %RDX_LP, %RCX_LP > > > > > > > > - jb HIDDEN_JUMPTARGET (__chk_fail) > > > > > > > > -END (__memmove_chk_erms) > > > > > > > > - > > > > > > > > -ENTRY (__memmove_erms) > > > > > > > > - movq %rdi, %rax > > > > > > > > - /* Skip zero length. */ > > > > > > > > - test %RDX_LP, %RDX_LP > > > > > > > > - jz 2f > > > > > > > > -L(start_movsb): > > > > > > > > - mov %RDX_LP, %RCX_LP > > > > > > > > - cmp %RSI_LP, %RDI_LP > > > > > > > > - jb 1f > > > > > > > > - /* Source == destination is less common. */ > > > > > > > > - je 2f > > > > > > > > - lea (%rsi,%rcx), %RDX_LP > > > > > > > > - cmp %RDX_LP, %RDI_LP > > > > > > > > - jb L(movsb_backward) > > > > > > > > -1: > > > > > > > > - rep movsb > > > > > > > > -2: > > > > > > > > - ret > > > > > > > > -L(movsb_backward): > > > > > > > > - leaq -1(%rdi,%rcx), %rdi > > > > > > > > - leaq -1(%rsi,%rcx), %rsi > > > > > > > > - std > > > > > > > > - rep movsb > > > > > > > > - cld > > > > > > > > - ret > > > > > > > > -END (__memmove_erms) > > > > > > > > -strong_alias (__memmove_erms, __memcpy_erms) > > > > > > > > -strong_alias (__memmove_chk_erms, __memcpy_chk_erms) > > > > > > > > -# endif > > > > > > > > > > > > > > > > # ifdef SHARED > > > > > > > > ENTRY (MEMMOVE_CHK_SYMBOL (__mempcpy_chk, unaligned_erms)) > > > > > > > > -- > > > > > > > > 2.34.1 > > > > > > > > > > > > > > > > > > > > > > Please make a standalone patch. > > > > > > > > > > > > The memmove isa raising change has a dependency on it hence the series. > > > > > > Submit this first then rebsubmit memmove? Same for memset-erms / > > > > > > memset-isa raising? > > > > > > > > > > Yes. Moving the erms version to a separate file should be standalone. > > > > > Each patch should build. > > > > > > > > The reason it doesn't is that the isa raising for memmove-vec... > > > > would require handling the memmove_erms case for it to build > > > > otherwise ISA V3/V4 would leave memmove_erms > > > > as undefined symbols in the multiarch build. > > > > > > The first patch in your patch set isn't standalone since Makefile > > > change is in the second patch. > > > > Err I think there is a misunderstanding. > > > > I am going to fixup to make the memmove_erms / memset_erms > > changes standalone. My point is the following ISA raising patches > > are dependent on the respective erms functions being moved. > > > > Hences its a series. > > A series is OK. Fixed up both memset/memmove erms patches in V2. > > > > > > > > > > > > > > > > > You should combine 2 memset-erms changes > > > > > in 2 memset patches into a single patch. > > > > ? > > > > > > > > > > The first patch moves memset-erms to the file end > > > and the second patch moves it to another file. > > > > > > -- > > > H.J. > > > > -- > H.J.