public inbox for libc-alpha@sourceware.org
From: Noah Goldstein <goldstein.w.n@gmail.com>
To: "H.J. Lu" <hjl.tools@gmail.com>
Cc: GNU C Library <libc-alpha@sourceware.org>,
	"Carlos O'Donell" <carlos@systemhalted.org>
Subject: Re: [PATCH v1 2/3] x86: Move and slightly improve memset_erms
Date: Wed, 29 Jun 2022 12:32:33 -0700
Message-ID: <CAFUsyfLkK-n0vNmzUYqNV9wifAGqU5b7hsU3dES0Rq8HcQsuVA@mail.gmail.com>
In-Reply-To: <CAMe9rOpQjA91qGQLnm=+702wEKj1seHmqCL-gZr+xUCm39XFqw@mail.gmail.com>

On Wed, Jun 29, 2022 at 12:26 PM H.J. Lu <hjl.tools@gmail.com> wrote:
>
> On Tue, Jun 28, 2022 at 8:27 AM Noah Goldstein <goldstein.w.n@gmail.com> wrote:
> >
> > Implementation wise:
> >     1. Remove the VZEROUPPER as memset_{impl}_unaligned_erms does not
> >        use the L(stosb) label that was previously defined.
> >
> >     2. Don't give the hotpath (fallthrough) to zero size.
> >
> > Code positioning wise:
> >
> > Move L(memset_{chk}_erms) to its own file.  Leaving it in between the
>
>  It is ENTRY, not L.   Did you mean to move them to the end of file?

Will fix L -> ENTRY for v2.

Yes, it should be moved to a new file in this patch. That was a rebase
mistake: the file change ended up in the ISA-raising patch. Will fix both
for v2.
>
> > memset_{impl}_unaligned both adds unnecessary complexity to the
> > file and wastes space in a relatively hot cache section.
> > ---
> >  .../multiarch/memset-vec-unaligned-erms.S     | 54 ++++++++-----------
> >  1 file changed, 23 insertions(+), 31 deletions(-)
> >
> > diff --git a/sysdeps/x86_64/multiarch/memset-vec-unaligned-erms.S b/sysdeps/x86_64/multiarch/memset-vec-unaligned-erms.S
> > index abc12d9cda..d98c613651 100644
> > --- a/sysdeps/x86_64/multiarch/memset-vec-unaligned-erms.S
> > +++ b/sysdeps/x86_64/multiarch/memset-vec-unaligned-erms.S
> > @@ -156,37 +156,6 @@ L(entry_from_wmemset):
> >  #if defined USE_MULTIARCH && IS_IN (libc)
> >  END (MEMSET_SYMBOL (__memset, unaligned))
> >
> > -# if VEC_SIZE == 16
> > -ENTRY (__memset_chk_erms)
> > -       cmp     %RDX_LP, %RCX_LP
> > -       jb      HIDDEN_JUMPTARGET (__chk_fail)
> > -END (__memset_chk_erms)
> > -
> > -/* Only used to measure performance of REP STOSB.  */
> > -ENTRY (__memset_erms)
> > -       /* Skip zero length.  */
> > -       test    %RDX_LP, %RDX_LP
> > -       jnz      L(stosb)
> > -       movq    %rdi, %rax
> > -       ret
> > -# else
> > -/* Provide a hidden symbol to debugger.  */
> > -       .hidden MEMSET_SYMBOL (__memset, erms)
> > -ENTRY (MEMSET_SYMBOL (__memset, erms))
> > -# endif
> > -L(stosb):
> > -       mov     %RDX_LP, %RCX_LP
> > -       movzbl  %sil, %eax
> > -       mov     %RDI_LP, %RDX_LP
> > -       rep stosb
> > -       mov     %RDX_LP, %RAX_LP
> > -       VZEROUPPER_RETURN
> > -# if VEC_SIZE == 16
> > -END (__memset_erms)
> > -# else
> > -END (MEMSET_SYMBOL (__memset, erms))
> > -# endif
> > -
> >  # if defined SHARED && IS_IN (libc)
> >  ENTRY_CHK (MEMSET_CHK_SYMBOL (__memset_chk, unaligned_erms))
> >         cmp     %RDX_LP, %RCX_LP
> > @@ -461,3 +430,26 @@ L(between_2_3):
> >  #endif
> >         ret
> >  END (MEMSET_SYMBOL (__memset, unaligned_erms))
> > +
> > +#if defined USE_MULTIARCH && IS_IN (libc) && VEC_SIZE == 16
> > +ENTRY (__memset_chk_erms)
> > +       cmp     %RDX_LP, %RCX_LP
> > +       jb      HIDDEN_JUMPTARGET (__chk_fail)
> > +END (__memset_chk_erms)
> > +
> > +/* Only used to measure performance of REP STOSB.  */
> > +ENTRY (__memset_erms)
> > +       /* Skip zero length.  */
> > +       test    %RDX_LP, %RDX_LP
> > +       jz       L(stosb_return_zero)
> > +       mov     %RDX_LP, %RCX_LP
> > +       movzbl  %sil, %eax
> > +       mov     %RDI_LP, %RDX_LP
> > +       rep stosb
> > +       mov     %RDX_LP, %RAX_LP
> > +       ret
> > +L(stosb_return_zero):
> > +       movq    %rdi, %rax
> > +       ret
> > +END (__memset_erms)
> > +#endif
> > --
> > 2.34.1
> >
>
>
> --
> H.J.
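
For readers who do not read x86-64 assembly, item 2 of the commit message
(don't give the fallthrough path to zero size) can be modeled in C roughly
as follows. This is only a sketch, not glibc code: memset_erms_model is a
hypothetical name, the byte loop stands in for REP STOSB, and
__builtin_expect is a GCC/Clang builtin used here only to express that the
zero-length case is expected to be rare.

```c
#include <stddef.h>

/* Hypothetical C model of the reordered __memset_erms; not glibc code.  */
void *
memset_erms_model (void *dst, int c, size_t n)
{
  /* Zero length is treated as the unlikely case, so it takes the branch
     (jz L(stosb_return_zero)) rather than the fallthrough.  */
  if (__builtin_expect (n == 0, 0))
    return dst;

  unsigned char *p = dst;
  unsigned char byte = (unsigned char) c;  /* Like movzbl %sil, %eax.  */

  /* Stand-in for REP STOSB: store the byte n times.  */
  for (size_t i = 0; i < n; i++)
    p[i] = byte;

  /* The assembly saves the destination in %rdx and returns it in %rax.  */
  return dst;
}
```

In the old layout the non-zero case had to take the jnz to L(stosb), and
the zero-length return sat on the fallthrough path. The new layout swaps
the two.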

Thread overview: 24+ messages
2022-06-28 15:27 [PATCH v1 1/3] x86: Add definition for __wmemset_chk AVX2 RTM in ifunc impl list Noah Goldstein
2022-06-28 15:27 ` [PATCH v1 2/3] x86: Move and slightly improve memset_erms Noah Goldstein
2022-06-29 19:25   ` H.J. Lu
2022-06-29 19:32     ` Noah Goldstein [this message]
2022-06-29 22:11       ` Noah Goldstein
2022-06-28 15:27 ` [PATCH v1 3/3] x86: Add support for building {w}memset{_chk} with explicit ISA level Noah Goldstein
2022-06-29 19:30   ` H.J. Lu
2022-06-29 22:12     ` Noah Goldstein
2022-06-29 19:21 ` [PATCH v1 1/3] x86: Add definition for __wmemset_chk AVX2 RTM in ifunc impl list H.J. Lu
2022-06-29 23:08   ` Noah Goldstein
2022-07-14  3:03     ` Sunil Pandey
2022-06-29 22:12 ` [PATCH v2 " Noah Goldstein
2022-06-29 22:12   ` [PATCH v2 2/3] x86: Move and slightly improve memset_erms Noah Goldstein
2022-06-29 22:18     ` H.J. Lu
2022-06-29 23:09       ` Noah Goldstein
2022-06-29 22:12   ` [PATCH v2 3/3] x86: Add support for building {w}memset{_chk} with explicit ISA level Noah Goldstein
2022-06-29 22:53     ` H.J. Lu
2022-06-29 23:07 ` [PATCH v3 1/3] x86: Add definition for __wmemset_chk AVX2 RTM in ifunc impl list Noah Goldstein
2022-06-29 23:07   ` [PATCH v3 2/3] x86: Move and slightly improve memset_erms Noah Goldstein
2022-06-29 23:30     ` H.J. Lu
2022-07-14  3:04       ` Sunil Pandey
2022-06-29 23:07   ` [PATCH v3 3/3] x86: Add support for building {w}memset{_chk} with explicit ISA level Noah Goldstein
2022-06-29 23:28     ` H.J. Lu
2022-07-01 22:45     ` H.J. Lu
