From: Corinna Vinschen <vinschen@redhat.com>
To: Seija Kijin <doremylover456@gmail.com>
Cc: newlib@sourceware.org
Subject: Re: [PATCH] newlib: libc: Optimize the string functions
Date: Tue, 16 Aug 2022 13:08:47 +0200 [thread overview]
Message-ID: <Yvt6v/ioprcZfKEc@calimero.vinschen.de> (raw)
In-Reply-To: <CAOfV4GcjyQe5Lc1JZ-yovoAScFkrmU1_Vb5OvyT15XJW2juZpw@mail.gmail.com>
Hi Seija,
Thanks for the patch.
First of all, your patch is broken in terms of whitespaces and line
breaks, so it can't be applied. I suspect your MUA is doing that
automatically. You better provide the git patch as attachement, or
change your MUA settings to leave whitespaces and line breaks alone.
On Aug 12 17:11, Seija Kijin wrote:
> --- a/newlib/libc/string/memccpy.c
> +++ b/newlib/libc/string/memccpy.c
> @@ -34,16 +34,17 @@ PORTABILITY
> /* Nonzero if either X or Y is not aligned on a "long" boundary. */
> #define UNALIGNED(X, Y) \
> - (((long)X & (sizeof (long) - 1)) | ((long)Y & (sizeof (long) - 1)))
> + (((unsigned long)X | (unsigned long)Y) & (sizeof (unsigned long) - 1))
> /* How many bytes are copied each iteration of the word copy loop. */
> -#define LITTLEBLOCKSIZE (sizeof (long))
> +#define LITTLEBLOCKSIZE (sizeof (unsigned long))
> /* Threshhold for punting to the byte copier. */
> #define TOO_SMALL(LEN) ((LEN) < LITTLEBLOCKSIZE)
> /* Macros for detecting endchar */
> #if LONG_MAX == 2147483647L
> +#define MAGIC
Why? There's no MAGIC in this code.
> #define DETECTNULL(X) (((X) - 0x01010101) & ~(X) & 0x80808080)
> #else
> #if LONG_MAX == 9223372036854775807L
> @@ -64,11 +65,11 @@ memccpy (void *__restrict dst0,
> #if defined(PREFER_SIZE_OVER_SPEED) || defined(__OPTIMIZE_SIZE__)
> void *ptr = NULL;
> - char *dst = (char *) dst0;
> - char *src = (char *) src0;
> - char endchar = endchar0 & 0xff;
> + unsigned char *dst = (unsigned char *) dst0;
> + const unsigned char *src = (const unsigned char *) src0;
> + const unsigned char endchar = (unsigned char) endchar0;
> - while (len0--)
> + for (; len0; len0--)
What exactly are you winning here? How do you know that splitting a
compare-postdecrement into separate compare and decrement is faster than
the original operation?
- Not decrementing immediately after compare only "optimizes" the first
iteration of the loop. Any further iteration needs to decrement and
compare anyway. So the supposed advantage only affects the degenerated
case of len0 == 0 and is insignificant in any other case.
- The compiler can optimize this by its own, with different optimizations
leading to one of the expressions being faster than the other depending
on compiler version and/or target CPU.
- One CPU could have optimized microcode for compare-postdecrement which
you may break up into less optimized code if the compiler's optimizer
isn't capable of combining the ops again.
- Alternatively the target CPU has an optimized predecrement-compare op
and no optimized compare-postdecrement op. But this is generic code.
You don't know if you're speeding up or slowing down, or not affecting
at all.
Given that, I'd leave this loops alone. It's the job of the compiler to
optimize them. The changes only raise the size of the patch unnecessarily.
Same goes for changes like this:
while ([...])
{
- if (!length--)
+ if (!length)
[...]
+ length--;
}
> void *
> memmem (const void *haystack, size_t hs_len, const void *needle, size_t ne_len)
> {
> - const char *hs = haystack;
> - const char *ne = needle;
> + const unsigned char *hs = haystack;
> + const unsigned char *ne = needle;
> if (ne_len == 0)
> return (void *)hs;
> - int i;
> - int c = ne[0];
> - const char *end = hs + hs_len - ne_len;
> + size_t i;
^^^^^^^^^
This is actually a bugfix. Thanks for catching it, but it
very much deserves a patch of its own, together with the
other changes from int to size_t in memmem and strstr.
> @@ -143,7 +143,7 @@ memmem (const void *haystack, size_t hs_len, const
> void *needle, size_t ne_len)
> size_t tmp, shift1;
> size_t m1 = ne_len - 1;
> size_t offset = 0;
> - int i;
> + size_t i;
^^^^^^^^^
...this one.
Sorry, but I've given up after about half of the patch. I don't see why
this is only a single huge patch. If there's any problem, multiple
smaller patches are easier to understand, and easier to revert or
bisect.
Please resend this patch as a patchset, preferredly one file per patch
including a matching commit message per patch. The aforementioned
bugfix should be a separate patch in the patchset.
Thanks,
Corinna
next prev parent reply other threads:[~2022-08-16 11:08 UTC|newest]
Thread overview: 6+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-08-12 21:11 Seija Kijin
2022-08-16 11:08 ` Corinna Vinschen [this message]
-- strict thread matches above, loose matches on Subject: below --
2022-12-13 20:02 Brian Inglis
2022-08-12 21:09 Seija Kijin
2022-08-12 20:38 Seija Kijin
2022-08-11 18:56 doremylover456
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=Yvt6v/ioprcZfKEc@calimero.vinschen.de \
--to=vinschen@redhat.com \
--cc=doremylover456@gmail.com \
--cc=newlib@sourceware.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).