public inbox for libc-alpha@sourceware.org
 help / color / mirror / Atom feed
From: Adhemerval Zanella <adhemerval.zanella@linaro.org>
To: "Ondřej Bílka" <neleai@seznam.cz>
Cc: libc-alpha@sourceware.org
Subject: Re: [PATCH v3 00/18] Improve generic string routines
Date: Thu, 11 Jan 2018 10:54:00 -0000	[thread overview]
Message-ID: <aa328892-994e-9cc5-4d2c-08a15b4bd033@linaro.org> (raw)
In-Reply-To: <20180110223020.GB12399@domone>



On 10/01/2018 20:30, Ondřej Bílka wrote:
> On Wed, Jan 10, 2018 at 10:47:44AM -0200, Adhemerval Zanella wrote:
>> It is an update of previous Richard's patchset [1] to provide generic 
>> string implementation for newer ports and make them only focus on 
>> just specific routines to get a better overall improvement.
>> It is done by:
>>
> This is basically poorly reinvented version of my patch to make generic
> string routines. By dusting it of one could get lot better performance
> than this.

Why not work together to get to get a better implementation instead of
bashing a patch proposal by just saying 'I know/did better'? I did take a 
look at your suggestion and my first impression what a more convoluted
with a lot of over-engineering. I used Richard's proposal because as a
start point mainly because I thought it is better decomposed and organized
over the internal string operations and a bit simpler for generic
implementations.

Not saying your previous proposal is out of value, and I would like if we
could improve this patch proposal with your input.

> 
> This contains lot of glaring issues, main ones are:
> strnlen - quite ineffective as testing vs zero is faster than generic c.
> Could be trivially improved by inlining memchr. 

The 08/08 is using memchr instead (just no inline it)...

> 
> strcmp - this
> 
> +  /* Handle the unaligned bytes of p1 first.  */
> +  n = -(uintptr_t)p1 % sizeof(op_t);
> +  for (i = 0; i < n; ++i)
> 
> is a the worst way how to write memcmp as you make loop with unpredictable
> branch about alignment. Also in strcmp trying to be clever usually
> causes performance regression because its likely that you get difference
> in starting bytes.

I am open to suggestions here, usually the other methods trade speed for
more complexity and hardware/abi idiosyncrasies. Also keep in mind the idea 
of implementation is to be used on a large sets of different hardware, 
where strategies like unaligned access within a page can not be guarantee
to work.


> 
> strcpy - this was previously changed to strlen+memcpy because it with
> platform specific strlen/memcpy it is faster. One should at least check
> if some platforms are affected. I wouldn't be surprised that one could
> get faster function just by adding some unrolling to memcpy. 
> 

It really depends, for instance aarch64 implementation sets a threshold
where strlen+memcpy is actually used where then it switches to load/store.
Also both strlen and memcpy is inlined, witch amortizes the function call
cost.

I though about adding such optimization, but it will need to rely if 
compiler can inline size bounded strlen/memcpy to actually make sense 
for small sizes (where it is not true for a lot of platforms).  And 
there is also the issue of which is the size to used as threshold. 

  reply	other threads:[~2018-01-11 10:54 UTC|newest]

Thread overview: 45+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-01-10 12:48 Adhemerval Zanella
2018-01-10 12:48 ` [PATCH v3 15/18] alpha: Add string-fzb.h and string-fzi.h Adhemerval Zanella
2018-01-10 12:48 ` [PATCH v3 01/18] Parameterize op_t from memcopy.h Adhemerval Zanella
2018-01-11 13:28   ` Joseph Myers
2018-01-11 18:04     ` Adhemerval Zanella
2018-01-10 12:48 ` [PATCH v3 05/18] string: Improve generic strlen Adhemerval Zanella
2018-01-11 17:21   ` Paul Eggert
2018-01-12 18:00     ` Adhemerval Zanella
2018-01-10 12:48 ` [PATCH v3 10/18] string: Improve generic strchrnul Adhemerval Zanella
2018-01-10 12:48 ` [PATCH v3 02/18] Parameterize OP_T_THRES from memcopy.h Adhemerval Zanella
2018-01-10 12:48 ` [PATCH v3 14/18] hppa: Add string-fzb.h and string-fzi.h Adhemerval Zanella
2018-01-10 12:48 ` [PATCH v3 08/18] string: Improve generic strnlen Adhemerval Zanella
2018-01-10 12:48 ` [PATCH v3 17/18] powerpc: Add string-fza.h Adhemerval Zanella
2018-01-10 12:56   ` Tulio Magno Quites Machado Filho
2018-01-10 12:48 ` [PATCH v3 06/18] string: Improve generic memchr Adhemerval Zanella
2018-01-10 12:48 ` [PATCH v3 12/18] string: Improve generic strcpy Adhemerval Zanella
2018-01-10 12:48 ` [PATCH v3 16/18] arm: Add string-fza.h Adhemerval Zanella
2018-01-10 12:48 ` [PATCH v3 11/18] string: Improve generic strcmp Adhemerval Zanella
2018-01-10 12:48 ` [PATCH v3 03/18] Add string-maskoff.h generic header Adhemerval Zanella
2018-01-10 23:25   ` Paul Eggert
2018-01-11 10:54     ` Adhemerval Zanella
2018-01-11 13:29   ` Joseph Myers
2018-01-11 17:57     ` Adhemerval Zanella
2018-01-10 12:48 ` [PATCH v3 07/18] string: Improve generic memrchr Adhemerval Zanella
2018-01-10 12:48 ` [PATCH v3 09/18] string: Improve generic strchr Adhemerval Zanella
2018-01-10 12:48 ` [PATCH v3 18/18] sh: Add string-fzb.h Adhemerval Zanella
2018-01-10 12:48 ` [PATCH v3 13/18] hppa: Add memcopy.h Adhemerval Zanella
2018-01-11 13:36   ` Joseph Myers
2018-01-12 18:01     ` Adhemerval Zanella
2018-01-12 18:18       ` Joseph Myers
2018-01-12 18:37         ` Adhemerval Zanella
2018-01-10 12:48 ` [PATCH v3 04/18] Add string vectorized find and detection functions Adhemerval Zanella
2018-01-11 13:34   ` Joseph Myers
2018-01-11 18:25     ` Adhemerval Zanella
2018-01-11 13:44   ` Luis Machado
2018-01-11 18:25     ` Adhemerval Zanella
2018-01-11 16:47   ` Paul Eggert
2018-01-11 18:54     ` Adhemerval Zanella
2018-01-12  1:08       ` Paul Eggert
2018-01-12 17:08         ` Joseph Myers
2018-01-12 17:59           ` Adhemerval Zanella
2018-01-12 13:30       ` Adhemerval Zanella
2018-01-10 22:30 ` [PATCH v3 00/18] Improve generic string routines Ondřej Bílka
2018-01-11 10:54   ` Adhemerval Zanella [this message]
2018-01-11 13:50     ` Joseph Myers

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=aa328892-994e-9cc5-4d2c-08a15b4bd033@linaro.org \
    --to=adhemerval.zanella@linaro.org \
    --cc=libc-alpha@sourceware.org \
    --cc=neleai@seznam.cz \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).