From: Adhemerval Zanella <adhemerval.zanella@linaro.org>
To: "Ondřej Bílka" <neleai@seznam.cz>
Cc: libc-alpha@sourceware.org
Subject: Re: [PATCH v3 00/18] Improve generic string routines
Date: Thu, 11 Jan 2018 10:54:00 -0000 [thread overview]
Message-ID: <aa328892-994e-9cc5-4d2c-08a15b4bd033@linaro.org> (raw)
In-Reply-To: <20180110223020.GB12399@domone>
On 10/01/2018 20:30, OndÅej BÃlka wrote:
> On Wed, Jan 10, 2018 at 10:47:44AM -0200, Adhemerval Zanella wrote:
>> It is an update of previous Richard's patchset [1] to provide generic
>> string implementation for newer ports and make them only focus on
>> just specific routines to get a better overall improvement.
>> It is done by:
>>
> This is basically poorly reinvented version of my patch to make generic
> string routines. By dusting it of one could get lot better performance
> than this.
Why not work together to get to get a better implementation instead of
bashing a patch proposal by just saying 'I know/did better'? I did take a
look at your suggestion and my first impression what a more convoluted
with a lot of over-engineering. I used Richard's proposal because as a
start point mainly because I thought it is better decomposed and organized
over the internal string operations and a bit simpler for generic
implementations.
Not saying your previous proposal is out of value, and I would like if we
could improve this patch proposal with your input.
>
> This contains lot of glaring issues, main ones are:
> strnlen - quite ineffective as testing vs zero is faster than generic c.
> Could be trivially improved by inlining memchr.
The 08/08 is using memchr instead (just no inline it)...
>
> strcmp - this
>
> + /* Handle the unaligned bytes of p1 first. */
> + n = -(uintptr_t)p1 % sizeof(op_t);
> + for (i = 0; i < n; ++i)
>
> is a the worst way how to write memcmp as you make loop with unpredictable
> branch about alignment. Also in strcmp trying to be clever usually
> causes performance regression because its likely that you get difference
> in starting bytes.
I am open to suggestions here, usually the other methods trade speed for
more complexity and hardware/abi idiosyncrasies. Also keep in mind the idea
of implementation is to be used on a large sets of different hardware,
where strategies like unaligned access within a page can not be guarantee
to work.
>
> strcpy - this was previously changed to strlen+memcpy because it with
> platform specific strlen/memcpy it is faster. One should at least check
> if some platforms are affected. I wouldn't be surprised that one could
> get faster function just by adding some unrolling to memcpy.
>
It really depends, for instance aarch64 implementation sets a threshold
where strlen+memcpy is actually used where then it switches to load/store.
Also both strlen and memcpy is inlined, witch amortizes the function call
cost.
I though about adding such optimization, but it will need to rely if
compiler can inline size bounded strlen/memcpy to actually make sense
for small sizes (where it is not true for a lot of platforms). And
there is also the issue of which is the size to used as threshold.
next prev parent reply other threads:[~2018-01-11 10:54 UTC|newest]
Thread overview: 45+ messages / expand[flat|nested] mbox.gz Atom feed top
2018-01-10 12:48 Adhemerval Zanella
2018-01-10 12:48 ` [PATCH v3 15/18] alpha: Add string-fzb.h and string-fzi.h Adhemerval Zanella
2018-01-10 12:48 ` [PATCH v3 01/18] Parameterize op_t from memcopy.h Adhemerval Zanella
2018-01-11 13:28 ` Joseph Myers
2018-01-11 18:04 ` Adhemerval Zanella
2018-01-10 12:48 ` [PATCH v3 05/18] string: Improve generic strlen Adhemerval Zanella
2018-01-11 17:21 ` Paul Eggert
2018-01-12 18:00 ` Adhemerval Zanella
2018-01-10 12:48 ` [PATCH v3 10/18] string: Improve generic strchrnul Adhemerval Zanella
2018-01-10 12:48 ` [PATCH v3 02/18] Parameterize OP_T_THRES from memcopy.h Adhemerval Zanella
2018-01-10 12:48 ` [PATCH v3 14/18] hppa: Add string-fzb.h and string-fzi.h Adhemerval Zanella
2018-01-10 12:48 ` [PATCH v3 08/18] string: Improve generic strnlen Adhemerval Zanella
2018-01-10 12:48 ` [PATCH v3 17/18] powerpc: Add string-fza.h Adhemerval Zanella
2018-01-10 12:56 ` Tulio Magno Quites Machado Filho
2018-01-10 12:48 ` [PATCH v3 06/18] string: Improve generic memchr Adhemerval Zanella
2018-01-10 12:48 ` [PATCH v3 12/18] string: Improve generic strcpy Adhemerval Zanella
2018-01-10 12:48 ` [PATCH v3 16/18] arm: Add string-fza.h Adhemerval Zanella
2018-01-10 12:48 ` [PATCH v3 11/18] string: Improve generic strcmp Adhemerval Zanella
2018-01-10 12:48 ` [PATCH v3 03/18] Add string-maskoff.h generic header Adhemerval Zanella
2018-01-10 23:25 ` Paul Eggert
2018-01-11 10:54 ` Adhemerval Zanella
2018-01-11 13:29 ` Joseph Myers
2018-01-11 17:57 ` Adhemerval Zanella
2018-01-10 12:48 ` [PATCH v3 07/18] string: Improve generic memrchr Adhemerval Zanella
2018-01-10 12:48 ` [PATCH v3 09/18] string: Improve generic strchr Adhemerval Zanella
2018-01-10 12:48 ` [PATCH v3 18/18] sh: Add string-fzb.h Adhemerval Zanella
2018-01-10 12:48 ` [PATCH v3 13/18] hppa: Add memcopy.h Adhemerval Zanella
2018-01-11 13:36 ` Joseph Myers
2018-01-12 18:01 ` Adhemerval Zanella
2018-01-12 18:18 ` Joseph Myers
2018-01-12 18:37 ` Adhemerval Zanella
2018-01-10 12:48 ` [PATCH v3 04/18] Add string vectorized find and detection functions Adhemerval Zanella
2018-01-11 13:34 ` Joseph Myers
2018-01-11 18:25 ` Adhemerval Zanella
2018-01-11 13:44 ` Luis Machado
2018-01-11 18:25 ` Adhemerval Zanella
2018-01-11 16:47 ` Paul Eggert
2018-01-11 18:54 ` Adhemerval Zanella
2018-01-12 1:08 ` Paul Eggert
2018-01-12 17:08 ` Joseph Myers
2018-01-12 17:59 ` Adhemerval Zanella
2018-01-12 13:30 ` Adhemerval Zanella
2018-01-10 22:30 ` [PATCH v3 00/18] Improve generic string routines Ondřej Bílka
2018-01-11 10:54 ` Adhemerval Zanella [this message]
2018-01-11 13:50 ` Joseph Myers
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=aa328892-994e-9cc5-4d2c-08a15b4bd033@linaro.org \
--to=adhemerval.zanella@linaro.org \
--cc=libc-alpha@sourceware.org \
--cc=neleai@seznam.cz \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).