public inbox for libc-alpha@sourceware.org
 help / color / mirror / Atom feed
From: "Ondřej Bílka" <neleai@seznam.cz>
To: Wilco Dijkstra <wdijkstr@arm.com>
Cc: libc-alpha@sourceware.org
Subject: Re: [PATCH][PING] Improve stpncpy performance
Date: Wed, 12 Aug 2015 20:57:00 -0000	[thread overview]
Message-ID: <20150812205744.GA15321@domone> (raw)
In-Reply-To: <000c01d0ba58$44d1c170$ce754450$@com>

On Thu, Jul 09, 2015 at 04:02:26PM +0100, Wilco Dijkstra wrote:
> > Ondřej Bílka wrote:
> >
> > You don't have to use special case
> > 
> > if (size == n)
> >   return dest;
> > 
> > as it should be handled by
> > 
> > return memset (dest, '\0', 0);
> > 
> > That could improve performance a bit if its rare case. That doesn't
> > matter much as memset makes that function slow and it shouldn't be
> > used in performance sensitive code.
> > 
> > Otherwise ok for me.
> 
> On the benchtests the extra if made a significant difference, particularly
> since memset of 0 is relatively expensive as it is being regarded as a very
> rare case. It seems it should be less likely than the benchtests indicate,
> but we'd have to fix the benchtest first to use more realistic data.
>
Ok, I did data collection and I take my objection back as it almost
always happens in bash. I was surprised why it needs to use strncpy to
copy small number of bytes.  

When I tested dryrun benchmark special casing is faster. I got following
data on strncpy but no on stpncpy so we could reuse that for stpcpy patch that you also submitted.

replaying bash

calls 194

average n:  15.6082    n <= 0:   4.6% n <= 4:   6.7% n <= 8:  43.3% n <= 16:  80.9% n <= 24:  88.7% n <= 32:  88.7% n <= 48:  91.2% n <= 64:  98.5% 
s aligned to 4 bytes:  99.5%  8 bytes:  97.9% 16 bytes:   0.5% 
average *s access cache latency  0.9072    l <= 8: 100.0% l <= 16: 100.0% l <= 32: 100.0% l <= 64: 100.0% l <= 128: 100.0% 
s2 aligned to 4 bytes:  34.0%  8 bytes:  24.2% 16 bytes:   1.5% 
s-s2 aligned to 4 bytes:  34.5%  8 bytes:  22.7% 16 bytes:  22.7% 
average *s2 access cache latency  1.1186    l <= 8: 100.0% l <= 16: 100.0% l <= 32: 100.0% l <= 64: 100.0% l <= 128: 100.0% 
average capacity:  15.6082    c <= 0:   4.6% c <= 4:   6.7% c <= 8:  43.3% c <= 16:  80.9% c <= 24:  88.7% c <= 32:  88.7% c <= 48:  91.2% c <= 64:  98.5% n == capa : 100.0% 

replaying mc

calls 6971

average n:  9.0773    n <= 0:   2.7% n <= 4:  54.3% n <= 8:  71.9% n <= 16:  85.5% n <= 24:  91.0% n <= 32:  94.2% n <= 48:  96.9% n <= 64:  98.7% 
s aligned to 4 bytes: 100.0%  8 bytes: 100.0% 16 bytes: 100.0% 
average *s access cache latency  36.3347    l <= 8: 100.0% l <= 16: 100.0% l <= 32: 100.0% l <= 64: 100.0% l <= 128: 100.0% 
s2 aligned to 4 bytes:  55.3%  8 bytes:  49.5% 16 bytes:  48.1% 
s-s2 aligned to 4 bytes:  55.3%  8 bytes:  49.5% 16 bytes:  48.1% 
average *s2 access cache latency  1.0126    l <= 8: 100.0% l <= 16: 100.0% l <= 32: 100.0% l <= 64: 100.0% l <= 128: 100.0% 
average capacity:  9.5847    c <= 0:   0.7% c <= 4:  52.2% c <= 8:  70.9% c <= 16:  84.9% c <= 24:  90.7% c <= 32:  93.8% c <= 48:  96.5% c <= 64:  98.7% n == capa :  63.6% 

replaying mutt

calls 10415

average n:  7.6572    n <= 0:   0.1% n <= 4:  68.6% n <= 8:  82.5% n <= 16:  86.4% n <= 24:  87.7% n <= 32:  94.5% n <= 48:  96.6% n <= 64:  98.9% 
s aligned to 4 bytes:  57.9%  8 bytes:  49.1% 16 bytes:  45.2% 
average *s access cache latency  1.1092    l <= 8: 100.0% l <= 16: 100.0% l <= 32: 100.0% l <= 64: 100.0% l <= 128: 100.0% 
s2 aligned to 4 bytes:  85.7%  8 bytes:  79.8% 16 bytes:  79.5% 
s-s2 aligned to 4 bytes:  43.6%  8 bytes:  28.9% 16 bytes:  24.7% 
average *s2 access cache latency  1.1324    l <= 8: 100.0% l <= 16: 100.0% l <= 32: 100.0% l <= 64: 100.0% l <= 128: 100.0% 
average capacity:  51.2750    c <= 0:   0.0% c <= 4:  65.3% c <= 8:  73.0% c <= 16:  74.6% c <= 24:  75.2% c <= 32:  75.3% c <= 48:  75.3% c <= 64:  75.3% n == capa :  64.4% 

replaying /bin/bash

calls 60

average n:  10.3167    n <= 0:   1.7% n <= 4:  26.7% n <= 8:  56.7% n <= 16:  88.3% n <= 24:  98.3% n <= 32:  98.3% n <= 48:  98.3% n <= 64:  98.3% 
s aligned to 4 bytes: 100.0%  8 bytes: 100.0% 16 bytes:   1.7% 
average *s access cache latency  0.8833    l <= 8: 100.0% l <= 16: 100.0% l <= 32: 100.0% l <= 64: 100.0% l <= 128: 100.0% 
s2 aligned to 4 bytes:  36.7%  8 bytes:  33.3% 16 bytes:   3.3% 
s-s2 aligned to 4 bytes:  36.7%  8 bytes:  33.3% 16 bytes:  31.7% 
average *s2 access cache latency  0.9167    l <= 8: 100.0% l <= 16: 100.0% l <= 32: 100.0% l <= 64: 100.0% l <= 128: 100.0% 
average capacity:  10.3167    c <= 0:   1.7% c <= 4:  26.7% c <= 8:  56.7% c <= 16:  88.3% c <= 24:  98.3% c <= 32:  98.3% c <= 48:  98.3% c <= 64:  98.3% n == capa : 100.0% 

replaying as

calls 122

average n:  6.8115    n <= 0:   0.8% n <= 4:   6.6% n <= 8:  95.1% n <= 16:  98.4% n <= 24: 100.0% n <= 32: 100.0% n <= 48: 100.0% n <= 64: 100.0% 
s aligned to 4 bytes: 100.0%  8 bytes: 100.0% 16 bytes: 100.0% 
average *s access cache latency  1.0410    l <= 8: 100.0% l <= 16: 100.0% l <= 32: 100.0% l <= 64: 100.0% l <= 128: 100.0% 
s2 aligned to 4 bytes:  25.4%  8 bytes:  13.1% 16 bytes:   5.7% 
s-s2 aligned to 4 bytes:  25.4%  8 bytes:  13.1% 16 bytes:   5.7% 
average *s2 access cache latency  0.9262    l <= 8: 100.0% l <= 16: 100.0% l <= 32: 100.0% l <= 64: 100.0% l <= 128: 100.0% 
average capacity:  126.9508    c <= 0:   0.8% c <= 4:   0.8% c <= 8:   0.8% c <= 16:   0.8% c <= 24:   0.8% c <= 32:   0.8% c <= 48:   0.8% c <= 64:   0.8% n == capa :   0.8% 

  reply	other threads:[~2015-08-12 20:57 UTC|newest]

Thread overview: 5+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-07-06 11:29 Wilco Dijkstra
2015-07-09 13:21 ` Ondřej Bílka
2015-07-09 15:02   ` Wilco Dijkstra
2015-08-12 20:57     ` Ondřej Bílka [this message]
2015-08-19 16:15       ` Wilco Dijkstra

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20150812205744.GA15321@domone \
    --to=neleai@seznam.cz \
    --cc=libc-alpha@sourceware.org \
    --cc=wdijkstr@arm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).