From: Adhemerval Zanella Netto <adhemerval.zanella@linaro.org>
To: Wilco Dijkstra <Wilco.Dijkstra@arm.com>,
'GNU C Library' <libc-alpha@sourceware.org>
Subject: Re: [PATCH 04/17] Add string vectorized find and detection functions
Date: Mon, 19 Sep 2022 10:59:01 -0300 [thread overview]
Message-ID: <3622cffe-9cce-63f7-5321-a7903cca2890@linaro.org> (raw)
In-Reply-To: <AS4PR08MB79018243EE6ABB25C95FDB5C837D9@AS4PR08MB7901.eurprd08.prod.outlook.com>
On 03/09/22 10:13, Wilco Dijkstra wrote:
> Hi Adhemerval,
>
> +static inline unsigned int
> +__clz (op_t x)
> +{
> +#if !HAVE_BUILTIN_CLZ
> + unsigned r;
> + op_t i;
> +
> + x |= x >> 1;
> + x |= x >> 2;
> + x |= x >> 4;
> + x |= x >> 8;
> + x |= x >> 16;
> +# if __WORDSIZE == 64
> + x |= x >> 32;
> + i = x * 0x03F79D71B4CB0A89ull >> 58;
> +# else
> + i = x * 0x07C4ACDDU >> 27;
> +# endif
> + r = index_access (i);
> + return r ^ (sizeof (op_t) * CHAR_BIT - 1);
> +#else
> + if (sizeof (op_t) == sizeof (long int))
> + return __builtin_clzl (x);
> + else
> + return __builtin_clzll (x);
> +#endif
> +}
>
> This is a really bad idea. Firstly it is incorrect - sizeof (op_t) != __WORDSIZE due to
> the odd way it is defined (it can be 64 bits on 32-bit targets). That in itself is
> problematic since it isn't clear that using 64 bits operations extensively is efficient
> on 32-bit targets (using 64-bit multiplies in GMP is different from using 64-bit
> load/store in memcpy/memset which is different from 64-bit logical operations and
> shifts, so all of these should be decoupled rather than forced together).
>
> Secondly, there are already several ways to use count leading zeroes in GLIBC.
> One is use the builtin unconditionally (done in lots of places, eg. by math code),
> another is count_leading_zeros defined in longlong.h. This would add the third way.
> It's not clear how much gain inlining gives over using the libgcc implementation,
> but if it is significant then we could provide a generic inline clzl/clzll that can be
> used throughout GLIBC (replacing existing builtin_clz and count_leading_zeros).
>
Fair enough, I can't really recall I have added another count bits routine instead
of using the already provided ones on longlong.h. The longlong.h already take care
of avoiding libcall, so I adjusted the patch to use them instead.
> Finally, emulating a full clz is inefficient. If you have already called find_zero_low
> then there are at most 4 bits set on a 32-bit LE target, so you can trivially get the
> index of the first zero byte via:
>
> x = x & -x;
> x = (x >> 15) + (x >> 22) + 3 * (x >> 31);
>
> This is many times faster. There may be similar sequences for big-endian, but
> you could just do a multiply with a magic word that gives the correct result
> without needing a lookup table.
I take that on recent architectures it would be faster assuming the existence of
clz instruction, and this specific code is just used on memrchr tail call. So
I think we can optimize it further on a subsequent patch.
next prev parent reply other threads:[~2022-09-19 13:59 UTC|newest]
Thread overview: 5+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-09-03 13:13 Wilco Dijkstra
2022-09-19 13:59 ` Adhemerval Zanella Netto [this message]
-- strict thread matches above, loose matches on Subject: below --
2022-09-02 20:39 [PATCH 00/17] Improve generic string routines Adhemerval Zanella
2022-09-02 20:39 ` [PATCH 04/17] Add string vectorized find and detection functions Adhemerval Zanella
2022-09-03 3:20 ` Noah Goldstein
2022-09-19 14:00 ` Adhemerval Zanella Netto
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=3622cffe-9cce-63f7-5321-a7903cca2890@linaro.org \
--to=adhemerval.zanella@linaro.org \
--cc=Wilco.Dijkstra@arm.com \
--cc=libc-alpha@sourceware.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).