public inbox for libc-alpha@sourceware.org
From: Noah Goldstein <goldstein.w.n@gmail.com>
To: Joerg Sonnenberger <joerg@bec.de>
Cc: libc-coord@lists.openwall.com,
	Richard Biener via Gcc <gcc@gcc.gnu.org>,
	 GNU C Library <libc-alpha@sourceware.org>
Subject: Re: [libc-coord] Add new ABIs '__strcmpeq', '__strncmpeq', '__wcscmpeq' and '__wcsncmpeq' to libc
Date: Fri, 21 Jan 2022 15:50:26 -0600
Message-ID: <CAFUsyfKRicqg3zGdReTRgT-4LJv5Asbfn1+YE7tKvXLb4fuJLw@mail.gmail.com>
In-Reply-To: <YesAqsfgEpBsEAOy@bec.de>

On Fri, Jan 21, 2022 at 12:51 PM Joerg Sonnenberger <joerg@bec.de> wrote:
>
> On Thu, Jan 20, 2022 at 04:56:59PM -0600, Noah Goldstein wrote:
> > The goal is that the new interfaces will be usable as an optimization
> > by compilers if a program uses the return value of the non "eq"
> > variant as a boolean.
>
> So I'm curious, but can you demonstrate that it can be implemented
> noticeably faster than regular strcmp? Unlike for memcmp, I don't see an
> obvious way to save any operations.

Strong point! I had been assuming we could make the same optimizations
as with `__memcmpeq`, but there still needs to be some logic that tracks
which comes first: the mismatch or the null terminator.

The savings are not quite as large as for `memcmp` vs `__memcmpeq`, but
we can still save some work.
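
To make the contract concrete, here is a minimal scalar sketch in C. It
assumes `__strcmpeq` would mirror `__memcmpeq`: return 0 when the strings
are equal and any nonzero value otherwise. The names `strcmpeq_ref`,
`is_help_flag` and `strcmpeq_ref_examples` are hypothetical and exist only
for illustration; this is not the glibc implementation.

```c
#include <assert.h>

/* Hypothetical scalar model, not the glibc implementation.
   Assumed contract (mirroring __memcmpeq): return 0 when the strings
   are equal, and any nonzero value otherwise. */
int strcmpeq_ref(const char *s1, const char *s2)
{
    for (;; s1++, s2++) {
        /* A mismatch seen before any null terminator means "not equal".
           This also covers one string ending before the other. */
        if (*s1 != *s2)
            return 1;
        /* The bytes are equal here, so a null ends both strings. */
        if (*s1 == '\0')
            return 0;
    }
}

/* The kind of call site a compiler could redirect: only the
   zero/nonzero result of the comparison is observed. */
int is_help_flag(const char *arg)
{
    return strcmpeq_ref(arg, "--help") == 0;
}

/* Small usage check of the assumed contract. */
void strcmpeq_ref_examples(void)
{
    assert(strcmpeq_ref("abc", "abc") == 0);
    assert(strcmpeq_ref("abc", "abd") != 0);
    assert(strcmpeq_ref("abc", "abcd") != 0);
}
```

Even in this form the loop has to notice whichever of the mismatch or the
null terminator comes first; the only thing it can drop is the ordering
information in the return value.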

Using the x86_64 AVX2 optimized implementation as reference:
https://sourceware.org/git/?p=glibc.git;a=blob;f=sysdeps/x86_64/multiarch/strcmp-avx2.S;h=9c73b5899d55a72b292f21b52593284cd513d2a3;hb=HEAD

We can convert the general return path, which checks for equality and for
the null terminator, from:

```
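VMOVU (%rdi), %ymm0              /* load 32 bytes of s1 */
VPCMPEQ (%rsi), %ymm0, %ymm1     /* 1s where s1 and s2 are equal */
VPCMPEQ %ymm0, %ymmZERO, %ymm2   /* 1s at null CHAR */
vpandn %ymm1, %ymm2, %ymm1       /* 1s where equal AND not null */
vpmovmskb %ymm1, %ecx
incl %ecx                        /* all 1s wraps to 0: keep going */
jz L(keep_going)
tzcntl %ecx, %ecx                /* index of first mismatch or null */
movzbl (%rdi, %rcx), %eax
movzbl (%rsi, %rcx), %ecx
subl %ecx, %eax                  /* difference of the two bytes */
vzeroupper
ret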
```

To

```
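VMOVU (%rdi), %ymm0              /* load 32 bytes of s1 */
VPCMPEQ (%rsi), %ymm0, %ymm1     /* 1s where s1 and s2 are equal */
VPCMPEQ %ymm0, %ymmZERO, %ymm2   /* 1s at null CHAR */
vpandn %ymm1, %ymm2, %ymm2       /* 1s where equal AND not null; ymm1 kept */
vpmovmskb %ymm2, %ecx
incl %ecx                        /* all 1s wraps to 0: keep going */
jz L(keep_going)
vpmovmskb %ymm1, %eax            /* equality mask */
blsi %ecx, %ecx                  /* isolate bit of first mismatch or null */
andn %ecx, %eax, %eax            /* nonzero iff s1 and s2 differ at that bit */
vzeroupper
ret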
```
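
The mask arithmetic in the two return paths can be modelled in scalar C.
This is only a sketch of the intended computation on one 32-byte block,
not the glibc code: the helper names (`build_masks`, `lowest_set_index`,
`ret_strcmp`, `ret_strcmpeq`, `check_return_paths`) and the `done` out
parameter are made up for illustration, and the masks are built with a
plain loop rather than vector instructions.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Model of the two vpmovmskb results: bit i of *eq is set when
   a[i] == b[i]; bit i of *nul is set when a[i] == 0. */
void build_masks(const unsigned char a[32], const unsigned char b[32],
                 uint32_t *eq, uint32_t *nul)
{
    *eq = 0;
    *nul = 0;
    for (int i = 0; i < 32; i++) {
        if (a[i] == b[i])
            *eq |= UINT32_C(1) << i;
        if (a[i] == 0)
            *nul |= UINT32_C(1) << i;
    }
}

/* Index of the lowest set bit; x must be nonzero (models tzcnt). */
int lowest_set_index(uint32_t x)
{
    int i = 0;
    while ((x & 1) == 0) {
        x >>= 1;
        i++;
    }
    return i;
}

/* strcmp-style return: needs the byte index and two more loads.
   *done is 0 when all 32 bytes matched with no null (keep going). */
int ret_strcmp(const unsigned char a[32], const unsigned char b[32], int *done)
{
    uint32_t eq, nul;
    build_masks(a, b, &eq, &nul);
    uint32_t m = (eq & ~nul) + 1;      /* incl: wraps to 0 if all bits set */
    *done = (m != 0);
    if (m == 0)
        return 0;
    int idx = lowest_set_index(m);     /* first mismatch or null */
    return (int)a[idx] - (int)b[idx];
}

/* strcmpeq-style return: works on the masks only, no memory offset. */
int ret_strcmpeq(const unsigned char a[32], const unsigned char b[32], int *done)
{
    uint32_t eq, nul;
    build_masks(a, b, &eq, &nul);
    uint32_t m = (eq & ~nul) + 1;
    *done = (m != 0);
    if (m == 0)
        return 0;
    uint32_t first = m & (0u - m);     /* blsi: isolate lowest set bit */
    return (first & ~eq) != 0;         /* set only if the bytes differ there */
}

/* Usage check on a block holding "xxxxx\0" followed by unrelated bytes. */
void check_return_paths(void)
{
    unsigned char a[32], b[32];
    int done;
    memset(a, 'x', sizeof a);
    memset(b, 'x', sizeof b);
    a[5] = 0; b[5] = 0;                /* equal strings of length 5 */
    a[6] = 'p'; b[6] = 'q';            /* junk after the terminator */
    assert(ret_strcmpeq(a, b, &done) == 0 && done);
    b[3] = 'y';                        /* mismatch before the null */
    assert(ret_strcmpeq(a, b, &done) != 0 && done);
    assert(ret_strcmp(a, b, &done) < 0);
}
```

The point of the model is that `ret_strcmpeq` never turns the stopping bit
back into an index, which is what removes the `tzcnt` and the two dependent
byte loads from the return path.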

Testing this on comparisons where the mismatch or the null terminator falls
in the first 32 bytes (the common case), throughput is about the same but
latency drops by ~20%.

Another benefit is that we can reuse this exact return logic throughout, as
the memory offset is no longer required. This simplifies the page-cross
logic a great deal and will net us a substantial code-size reduction for
the common usage of strcmp.

I think, though, that I was a bit overoptimistic about the performance
benefits, as I was using `memcmp` vs `__memcmpeq` as a reference. I'll put
together a patch for just `__strcmpeq` and post the results here. I think
the wide-character versions have more expensive return-value checks, so if
the character versions show a benefit we can expect it to translate.



>
> Joerg


Thread overview: 3+ messages
2022-01-20 22:56 Noah Goldstein
2022-01-21 18:51 ` [libc-coord] " Joerg Sonnenberger
2022-01-21 21:50   ` Noah Goldstein [this message]
