From: "Ondřej Bílka" <neleai@seznam.cz>
To: Wilco Dijkstra <wdijkstr@arm.com>
Cc: libc-alpha@sourceware.org
Subject: Re: [PATCH][PING] Improve stpncpy performance
Date: Wed, 12 Aug 2015 20:57:00 -0000
Message-ID: <20150812205744.GA15321@domone>
In-Reply-To: <000c01d0ba58$44d1c170$ce754450$@com>
On Thu, Jul 09, 2015 at 04:02:26PM +0100, Wilco Dijkstra wrote:
> > Ondřej Bílka wrote:
> >
> > You don't have to use special case
> >
> > if (size == n)
> > return dest;
> >
> > as it should be handled by
> >
> > return memset (dest, '\0', 0);
> >
> > That could improve performance a bit if its rare case. That doesn't
> > matter much as memset makes that function slow and it shouldn't be
> > used in performance sensitive code.
> >
> > Otherwise ok for me.
>
> On the benchtests the extra if made a significant difference, particularly
> since memset of 0 is relatively expensive as it is being regarded as a very
> rare case. It seems it should be less likely than the benchtests indicate,
> but we'd have to fix the benchtest first to use more realistic data.
>
Ok, I collected data and I take my objection back, as the size == n case
almost always happens in bash. I was surprised that it uses strncpy to
copy such a small number of bytes.
When I tested with the dryrun benchmark, special casing was faster. I have
the following data for strncpy but none for stpncpy, so we could reuse it
for the stpcpy patch that you also submitted.
replaying bash
calls 194
average n: 15.6082 n <= 0: 4.6% n <= 4: 6.7% n <= 8: 43.3% n <= 16: 80.9% n <= 24: 88.7% n <= 32: 88.7% n <= 48: 91.2% n <= 64: 98.5%
s aligned to 4 bytes: 99.5% 8 bytes: 97.9% 16 bytes: 0.5%
average *s access cache latency 0.9072 l <= 8: 100.0% l <= 16: 100.0% l <= 32: 100.0% l <= 64: 100.0% l <= 128: 100.0%
s2 aligned to 4 bytes: 34.0% 8 bytes: 24.2% 16 bytes: 1.5%
s-s2 aligned to 4 bytes: 34.5% 8 bytes: 22.7% 16 bytes: 22.7%
average *s2 access cache latency 1.1186 l <= 8: 100.0% l <= 16: 100.0% l <= 32: 100.0% l <= 64: 100.0% l <= 128: 100.0%
average capacity: 15.6082 c <= 0: 4.6% c <= 4: 6.7% c <= 8: 43.3% c <= 16: 80.9% c <= 24: 88.7% c <= 32: 88.7% c <= 48: 91.2% c <= 64: 98.5% n == capa : 100.0%
replaying mc
calls 6971
average n: 9.0773 n <= 0: 2.7% n <= 4: 54.3% n <= 8: 71.9% n <= 16: 85.5% n <= 24: 91.0% n <= 32: 94.2% n <= 48: 96.9% n <= 64: 98.7%
s aligned to 4 bytes: 100.0% 8 bytes: 100.0% 16 bytes: 100.0%
average *s access cache latency 36.3347 l <= 8: 100.0% l <= 16: 100.0% l <= 32: 100.0% l <= 64: 100.0% l <= 128: 100.0%
s2 aligned to 4 bytes: 55.3% 8 bytes: 49.5% 16 bytes: 48.1%
s-s2 aligned to 4 bytes: 55.3% 8 bytes: 49.5% 16 bytes: 48.1%
average *s2 access cache latency 1.0126 l <= 8: 100.0% l <= 16: 100.0% l <= 32: 100.0% l <= 64: 100.0% l <= 128: 100.0%
average capacity: 9.5847 c <= 0: 0.7% c <= 4: 52.2% c <= 8: 70.9% c <= 16: 84.9% c <= 24: 90.7% c <= 32: 93.8% c <= 48: 96.5% c <= 64: 98.7% n == capa : 63.6%
replaying mutt
calls 10415
average n: 7.6572 n <= 0: 0.1% n <= 4: 68.6% n <= 8: 82.5% n <= 16: 86.4% n <= 24: 87.7% n <= 32: 94.5% n <= 48: 96.6% n <= 64: 98.9%
s aligned to 4 bytes: 57.9% 8 bytes: 49.1% 16 bytes: 45.2%
average *s access cache latency 1.1092 l <= 8: 100.0% l <= 16: 100.0% l <= 32: 100.0% l <= 64: 100.0% l <= 128: 100.0%
s2 aligned to 4 bytes: 85.7% 8 bytes: 79.8% 16 bytes: 79.5%
s-s2 aligned to 4 bytes: 43.6% 8 bytes: 28.9% 16 bytes: 24.7%
average *s2 access cache latency 1.1324 l <= 8: 100.0% l <= 16: 100.0% l <= 32: 100.0% l <= 64: 100.0% l <= 128: 100.0%
average capacity: 51.2750 c <= 0: 0.0% c <= 4: 65.3% c <= 8: 73.0% c <= 16: 74.6% c <= 24: 75.2% c <= 32: 75.3% c <= 48: 75.3% c <= 64: 75.3% n == capa : 64.4%
replaying /bin/bash
calls 60
average n: 10.3167 n <= 0: 1.7% n <= 4: 26.7% n <= 8: 56.7% n <= 16: 88.3% n <= 24: 98.3% n <= 32: 98.3% n <= 48: 98.3% n <= 64: 98.3%
s aligned to 4 bytes: 100.0% 8 bytes: 100.0% 16 bytes: 1.7%
average *s access cache latency 0.8833 l <= 8: 100.0% l <= 16: 100.0% l <= 32: 100.0% l <= 64: 100.0% l <= 128: 100.0%
s2 aligned to 4 bytes: 36.7% 8 bytes: 33.3% 16 bytes: 3.3%
s-s2 aligned to 4 bytes: 36.7% 8 bytes: 33.3% 16 bytes: 31.7%
average *s2 access cache latency 0.9167 l <= 8: 100.0% l <= 16: 100.0% l <= 32: 100.0% l <= 64: 100.0% l <= 128: 100.0%
average capacity: 10.3167 c <= 0: 1.7% c <= 4: 26.7% c <= 8: 56.7% c <= 16: 88.3% c <= 24: 98.3% c <= 32: 98.3% c <= 48: 98.3% c <= 64: 98.3% n == capa : 100.0%
replaying as
calls 122
average n: 6.8115 n <= 0: 0.8% n <= 4: 6.6% n <= 8: 95.1% n <= 16: 98.4% n <= 24: 100.0% n <= 32: 100.0% n <= 48: 100.0% n <= 64: 100.0%
s aligned to 4 bytes: 100.0% 8 bytes: 100.0% 16 bytes: 100.0%
average *s access cache latency 1.0410 l <= 8: 100.0% l <= 16: 100.0% l <= 32: 100.0% l <= 64: 100.0% l <= 128: 100.0%
s2 aligned to 4 bytes: 25.4% 8 bytes: 13.1% 16 bytes: 5.7%
s-s2 aligned to 4 bytes: 25.4% 8 bytes: 13.1% 16 bytes: 5.7%
average *s2 access cache latency 0.9262 l <= 8: 100.0% l <= 16: 100.0% l <= 32: 100.0% l <= 64: 100.0% l <= 128: 100.0%
average capacity: 126.9508 c <= 0: 0.8% c <= 4: 0.8% c <= 8: 0.8% c <= 16: 0.8% c <= 24: 0.8% c <= 32: 0.8% c <= 48: 0.8% c <= 64: 0.8% n == capa : 0.8%
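
For reference, the special case discussed above can be sketched roughly as
follows. This is only an illustration under stated assumptions, not the code
from the patch: the name sketch_stpncpy is hypothetical, and the bounded
length is computed with memchr to keep the sketch in standard C.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical sketch of stpncpy semantics with the early return.
   Copies at most n bytes from src, pads the rest of dest with zeros,
   and returns a pointer to the first padding byte (or dest + n if
   no terminator was written).  */
char *
sketch_stpncpy (char *dest, const char *src, size_t n)
{
  /* Bounded length of src: stop at the first NUL or after n bytes.  */
  const char *nul = memchr (src, '\0', n);
  size_t size = nul != NULL ? (size_t) (nul - src) : n;

  memcpy (dest, src, size);
  dest += size;

  /* Special case: the string fills the whole buffer, so there is
     nothing to pad and the memset call can be skipped.  */
  if (size == n)
    return dest;

  /* memset returns its first argument, i.e. the end of the copy.  */
  return memset (dest, '\0', n - size);
}
```

Without the if, the size == n case would fall through to a zero-length
memset, which gives the same result but pays for the call.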