* [PATCH] Improve performance of strcat
@ 2014-08-07 13:28 Wilco Dijkstra
2014-08-07 13:31 ` Adhemerval Zanella
` (2 more replies)
0 siblings, 3 replies; 10+ messages in thread
From: Wilco Dijkstra @ 2014-08-07 13:28 UTC
To: libc-alpha
[-- Attachment #1: Type: text/plain, Size: 523 bytes --]
Hi,
This patch improves strcat performance by using strlen and strcpy. Strlen has a fast C
implementation, so this improves performance even on targets which don't have an optimized strlen
and strcpy - it is 25% faster in bench-strcat. On targets which don't provide an optimized strcat
but which do have an optimized strlen and strcpy, performance gain is > 2x.
OK for commit?
ChangeLog:
2014-08-07 Wilco Dijkstra <wdijkstr@arm.com>
* string/strcat.c (strcat): Improve performance by using strlen/strcpy.
[-- Attachment #2: Improve-strcat-performance.txt --]
[-- Type: text/plain, Size: 796 bytes --]
---
string/strcat.c | 21 +--------------------
1 file changed, 1 insertion(+), 20 deletions(-)
diff --git a/string/strcat.c b/string/strcat.c
index 2cbe8b3..983d115 100644
--- a/string/strcat.c
+++ b/string/strcat.c
@@ -23,26 +23,7 @@
 char *
 strcat (char *dest, const char *src)
 {
-  char *s1 = dest;
-  const char *s2 = src;
-  char c;
-
-  /* Find the end of the string. */
-  do
-    c = *s1++;
-  while (c != '\0');
-
-  /* Make S1 point before the next character, so we can increment
-     it while memory is read (wins on pipelined cpus). */
-  s1 -= 2;
-
-  do
-    {
-      c = *s2++;
-      *++s1 = c;
-    }
-  while (c != '\0');
-
+  strcpy (dest + strlen (dest), src);
   return dest;
 }
 libc_hidden_builtin_def (strcat)
--
1.7.9.5
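As a hedged illustration of the change in the patch above, the following standalone sketch places the removed byte-by-byte loop next to the one-line replacement and checks that both produce the same string. The names old_strcat, new_strcat and same_result are hypothetical and exist only for this sketch; the real function is strcat in string/strcat.c. The 64-byte buffers are likewise an assumption made to keep the example small.

```c
#include <assert.h>
#include <string.h>

/* The loop removed by the patch, renamed so it does not clash with libc.  */
char *
old_strcat (char *dest, const char *src)
{
  char *s1 = dest;
  const char *s2 = src;
  char c;

  /* Find the end of the string.  */
  do
    c = *s1++;
  while (c != '\0');

  /* S1 is now one past the terminator; step back two so that the
     pre-increment store below starts by overwriting the terminator.  */
  s1 -= 2;

  do
    {
      c = *s2++;
      *++s1 = c;
    }
  while (c != '\0');

  return dest;
}

/* The replacement: locate the end with strlen, then copy with strcpy.  */
char *
new_strcat (char *dest, const char *src)
{
  strcpy (dest + strlen (dest), src);
  return dest;
}

/* Hypothetical helper: returns 1 if both versions build the same string.
   Assumes strlen (dest) + strlen (src) < 64.  */
int
same_result (const char *dest, const char *src)
{
  char a[64], b[64];

  assert (strlen (dest) + strlen (src) < sizeof a);
  strcpy (a, dest);
  strcpy (b, dest);
  old_strcat (a, src);
  new_strcat (b, src);
  return strcmp (a, b) == 0;
}
```

Calling same_result ("foo", "bar") exercises both versions on private copies of the destination, so neither call can affect the other.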
* Re: [PATCH] Improve performance of strcat
2014-08-07 13:28 [PATCH] Improve performance of strcat Wilco Dijkstra
@ 2014-08-07 13:31 ` Adhemerval Zanella
2014-08-07 13:58 ` Joseph S. Myers
2014-09-10 17:24 ` Florian Weimer
2014-09-11 19:42 ` Carlos O'Donell
2 siblings, 1 reply; 10+ messages in thread
From: Adhemerval Zanella @ 2014-08-07 13:31 UTC
To: libc-alpha
On 07-08-2014 10:27, Wilco Dijkstra wrote:
> Hi,
>
> This patch improves strcat performance by using strlen and strcpy. Strlen has a fast C
> implementation, so this improves performance even on targets which don't have an optimized strlen
> and strcpy - it is 25% faster in bench-strcat. On targets which don't provide an optimized strcat
> but which do have an optimized strlen and strcpy, performance gain is > 2x.
>
> OK for commit?
>
> ChangeLog:
> 2014-08-07 Wilco Dijkstra <wdijkstr@arm.com>
>
> * string/strcat.c (strcat): Improve performance by using strlen/strcpy.
> ---
> string/strcat.c | 21 +--------------------
> 1 file changed, 1 insertion(+), 20 deletions(-)
>
> diff --git a/string/strcat.c b/string/strcat.c
>
> - do
> - {
> - c = *s2++;
> - *++s1 = c;
> - }
> - while (c != '\0');
> -
> + strcpy (dest + strlen (dest), src);
Should it be __strcpy/__strlen ?
* Re: [PATCH] Improve performance of strcat
2014-08-07 13:31 ` Adhemerval Zanella
@ 2014-08-07 13:58 ` Joseph S. Myers
0 siblings, 0 replies; 10+ messages in thread
From: Joseph S. Myers @ 2014-08-07 13:58 UTC
To: Adhemerval Zanella; +Cc: libc-alpha
On Thu, 7 Aug 2014, Adhemerval Zanella wrote:
> > + strcpy (dest + strlen (dest), src);
>
> Should it be __strcpy/__strlen ?
As explained in recent discussion, there is no need for use of __* when
calling functions in ISO C90 that haven't been removed in more recent
standards (or more generally, when calling function A from function B if
function A is in all the supported standards containing function B). You
do need *_hidden_* for PLT avoidance, but include/string.h already has
libc_hidden_builtin_proto calls for strcpy and strlen (and if any
definition of those functions is missing the corresponding
libc_hidden_builtin_def, there will be an obvious error linking glibc).
--
Joseph S. Myers
joseph@codesourcery.com
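The PLT-avoidance point above can be illustrated with a much simplified sketch. This is not the glibc macro machinery (libc_hidden_builtin_proto and libc_hidden_builtin_def are not reproduced here); it only assumes a GCC-compatible compiler on an ELF target and uses the alias and visibility attributes directly. All demo_* names are hypothetical.

```c
#include <stddef.h>

/* Public, exported function (stand-in for a libc string function).  */
size_t
demo_strlen (const char *s)
{
  const char *p = s;

  while (*p != '\0')
    ++p;
  return (size_t) (p - s);
}

/* Hidden alias for the same code.  A call through this name is bound
   inside the library at link time, so it does not need to go through
   the PLT and cannot be interposed.  */
extern __typeof (demo_strlen) demo_strlen_hidden
  __attribute__ ((alias ("demo_strlen"), visibility ("hidden")));

/* Internal caller: uses the hidden alias rather than the public name.  */
size_t
demo_count (const char *s)
{
  return demo_strlen_hidden (s);
}
```

In this sketch the caller names the alias explicitly; the message above describes glibc arranging the same effect through declarations in include/string.h, so callers can keep writing the plain function name.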
* Re: [PATCH] Improve performance of strcat
2014-08-07 13:28 [PATCH] Improve performance of strcat Wilco Dijkstra
2014-08-07 13:31 ` Adhemerval Zanella
@ 2014-09-10 17:24 ` Florian Weimer
2014-09-11 19:42 ` Carlos O'Donell
2 siblings, 0 replies; 10+ messages in thread
From: Florian Weimer @ 2014-09-10 17:24 UTC
To: Wilco Dijkstra, libc-alpha
On 08/07/2014 03:27 PM, Wilco Dijkstra wrote:
> This patch improves strcat performance by using strlen and strcpy. Strlen has a fast C
> implementation, so this improves performance even on targets which don't have an optimized strlen
> and strcpy - it is 25% faster in bench-strcat. On targets which don't provide an optimized strcat
> but which do have an optimized strlen and strcpy, performance gain is > 2x.
>
> OK for commit?
This is okay for master. Thanks.
--
Florian Weimer / Red Hat Product Security
* Re: [PATCH] Improve performance of strcat
2014-08-07 13:28 [PATCH] Improve performance of strcat Wilco Dijkstra
2014-08-07 13:31 ` Adhemerval Zanella
2014-09-10 17:24 ` Florian Weimer
@ 2014-09-11 19:42 ` Carlos O'Donell
2014-09-12 5:19 ` Ondřej Bílka
2014-09-12 11:14 ` Wilco Dijkstra
2 siblings, 2 replies; 10+ messages in thread
From: Carlos O'Donell @ 2014-09-11 19:42 UTC
To: Wilco Dijkstra, libc-alpha
On 08/07/2014 09:27 AM, Wilco Dijkstra wrote:
> Hi,
>
> This patch improves strcat performance by using strlen and strcpy. Strlen has a fast C
> implementation, so this improves performance even on targets which don't have an optimized strlen
> and strcpy - it is 25% faster in bench-strcat. On targets which don't provide an optimized strcat
> but which do have an optimized strlen and strcpy, performance gain is > 2x.
What benchmarks did you use to test this performance gain?
Did you use glibc's microbenchmark?
What numbers did you get?
c.
* Re: [PATCH] Improve performance of strcat
2014-09-11 19:42 ` Carlos O'Donell
@ 2014-09-12 5:19 ` Ondřej Bílka
2014-09-12 11:51 ` Wilco Dijkstra
2014-09-12 11:14 ` Wilco Dijkstra
1 sibling, 1 reply; 10+ messages in thread
From: Ondřej Bílka @ 2014-09-12 5:19 UTC
To: Carlos O'Donell; +Cc: Wilco Dijkstra, libc-alpha
On Thu, Sep 11, 2014 at 03:42:31PM -0400, Carlos O'Donell wrote:
> On 08/07/2014 09:27 AM, Wilco Dijkstra wrote:
> > Hi,
> >
> > This patch improves strcat performance by using strlen and strcpy. Strlen has a fast C
> > implementation, so this improves performance even on targets which don't have an optimized strlen
> > and strcpy - it is 25% faster in bench-strcat. On targets which don't provide an optimized strcat
> > but which do have an optimized strlen and strcpy, performance gain is > 2x.
>
> What benchmarks did you use to test this performance gain?
>
> Did you use glibc's microbenchmark?
>
> What numbers did you get?
>
Ah, this is the same patch that I sent about a year ago; your question is
answered in the quoted email.
Also, trying to benchmark C implementations is mostly meaningless: if you
want good performance you need to write at least assembly implementations
of memcpy, strlen and strcpy. Optimizing one of these has a bigger
performance impact than the rest of the functions combined, and if you do
not care about that you do not have to care about this C implementation
either.
Also, in the case of strcat it's easy to see that no matter how well you
optimize it, you cannot get faster than strcpy (dest + strlen (dest), src);
minus a constant overhead of about one function call. The reason is that
you could use strcat both as
char *
strcpy (char *x, const char *y)
{
  x[0] = 0;
  return strcat (x, y);
}
and
size_t
strlen (char *x)
{
  char y[] = "x";
  /* strcat2 is strcat where, in the assembly, each write instruction is
     replaced by a jump to code that calculates the end of the string.  */
  return strcat2 (x, y);
}
Now, if you had an input where strcat is faster than the strcpy+strlen pair,
it would give you a faster strlen or strcpy via the constructions above.
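The reduction argument above can be made concrete with a small, hedged sketch. strcpy_via_strcat follows the first construction directly. The second construction relies on a hypothetical assembly-level variant (strcat2) that cannot be written in portable C, so strlen_via_strcat_scan below is only a stand-in: it runs the scan phase that any strcat must perform and returns at the point where the first write would happen. Both function names are assumptions made for this example.

```c
#include <stddef.h>
#include <string.h>

/* strcpy expressed through strcat: make DEST empty, then append SRC.  */
char *
strcpy_via_strcat (char *dest, const char *src)
{
  dest[0] = '\0';
  return strcat (dest, src);
}

/* Stand-in for the hypothetical strcat2: perform only the scan for the
   terminator and report how far it got, instead of starting to write.  */
size_t
strlen_via_strcat_scan (const char *s)
{
  const char *end = s;

  while (*end != '\0')
    ++end;
  return (size_t) (end - s);
}
```

Under these assumptions, a strcat that beat the strlen-plus-strcpy pair on some input would, through wrappers like these, yield a faster strcpy or strlen on a related input, which is the point being argued.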
* RE: [PATCH] Improve performance of strcat
2014-09-12 5:19 ` Ondřej Bílka
@ 2014-09-12 11:51 ` Wilco Dijkstra
0 siblings, 0 replies; 10+ messages in thread
From: Wilco Dijkstra @ 2014-09-12 11:51 UTC
To: 'Ondřej Bílka', Carlos O'Donell; +Cc: libc-alpha
> Ondřej Bílka wrote:
> Also, trying to benchmark C implementations is mostly meaningless: if you
> want good performance you need to write at least assembly implementations
> of memcpy, strlen and strcpy. Optimizing one of these has a bigger
> performance impact than the rest of the functions combined, and if you do
> not care about that you do not have to care about this C implementation
> either.
The issue is that you're forced to write lots of highly optimized assembly
code in order to get good string performance for any new architecture.
This issue exists across GLIBC; for example, the math functions used to be
extremely inefficient due to a bad generic implementation. With a simple
architecture-independent patch I achieved 99% of the performance of the
best optimized implementation. Rather than forcing all targets to add
large amounts of highly optimized assembler, why not put a little more
effort into GLIBC generic code to ensure it is efficient to start with?
So the goal here is to provide good string performance using just C routines
and ensure they benefit further from assembler implementations of strlen,
memcpy, memset etc.
> Also, in the case of strcat it's easy to see that no matter how well you
> optimize it, you cannot get faster than strcpy (dest + strlen (dest), src);
> minus a constant overhead of about one function call.
Exactly, there is no need for a specialized assembler variant for strcat.
Wilco
* RE: [PATCH] Improve performance of strcat
2014-09-11 19:42 ` Carlos O'Donell
2014-09-12 5:19 ` Ondřej Bílka
@ 2014-09-12 11:14 ` Wilco Dijkstra
1 sibling, 0 replies; 10+ messages in thread
From: Wilco Dijkstra @ 2014-09-12 11:14 UTC
To: 'Carlos O'Donell', libc-alpha
> Carlos O'Donell wrote:
> On 08/07/2014 09:27 AM, Wilco Dijkstra wrote:
> > Hi,
> >
> > This patch improves strcat performance by using strlen and strcpy. Strlen has a fast C
> > implementation, so this improves performance even on targets which don't have an optimized strlen
> > and strcpy - it is 25% faster in bench-strcat. On targets which don't provide an optimized strcat
> > but which do have an optimized strlen and strcpy, performance gain is > 2x.
>
> What benchmarks did you use to test this performance gain?
>
> Did you use glibc's microbenchmark?
>
> What numbers did you get?
These results are for the GLIBC benchtests/bench-strcat.c - I increased the iterations
significantly and profiled the results with a high tickrate to verify the timings.
    65.74%  11343  bench-strcat  bench-strcat     [.] simple_strcat
    24.90%   4307  bench-strcat  libc-2.20.90.so  [.] strcpy
     5.20%    902  bench-strcat  libc-2.20.90.so  [.] strlen
     1.22%    214  bench-strcat  bench-strcat     [.] do_test
     1.08%    190  bench-strcat  libc-2.20.90.so  [.] strcat
So strcat + strlen + strcpy = 31.18% vs. simple_strcat at 65.74%, i.e. a 2.1x speedup.
Wilco
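The arithmetic behind the quoted speedup can be checked with a short sketch. The percentages are copied from the profile above; the function names are hypothetical. Reading the ratio of sample shares as a speedup assumes that the benchmark runs simple_strcat and the library strcat on the same inputs for the same number of iterations, which the message does not state explicitly.

```c
/* Share of samples attributed to the new strcat: the strcat wrapper
   itself plus the strlen and strcpy calls it makes (values from the
   profile above).  */
double
combined_share (void)
{
  return 24.90 + 5.20 + 1.08;
}

/* Ratio of the byte-by-byte loop (simple_strcat) to the new strcat.  */
double
speedup (void)
{
  return 65.74 / combined_share ();
}
```

The sum comes to 31.18 and the ratio to roughly 2.1, matching the figures given in the message.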
* Re: [PATCH] Improve performance of strcat
@ 2014-08-07 13:58 Wilco Dijkstra
0 siblings, 0 replies; 10+ messages in thread
From: Wilco Dijkstra @ 2014-08-07 13:58 UTC
To: azanella; +Cc: libc-alpha
>> + strcpy (dest + strlen (dest), src);
>
> Should it be __strcpy/__strlen ?
Not according to: https://sourceware.org/ml/libc-alpha/2014-04/msg00471.html
* RE: [PATCH] Improve performance of strcat
@ 2014-09-10 15:32 Wilco Dijkstra
0 siblings, 0 replies; 10+ messages in thread
From: Wilco Dijkstra @ 2014-09-10 15:32 UTC
To: libc-alpha
Ping
> -----Original Message-----
> From: Wilco Dijkstra [mailto:wdijkstr@arm.com]
> Sent: 07 August 2014 14:28
> To: 'libc-alpha@sourceware.org'
> Subject: [PATCH] Improve performance of strcat
>
> Hi,
>
> This patch improves strcat performance by using strlen and strcpy. Strlen has a fast C
> implementation, so this improves performance even on targets which don't have an optimized
> strlen and strcpy - it is 25% faster in bench-strcat. On targets which don't provide an
> optimized strcat but which do have an optimized strlen and strcpy, performance gain is > 2x.
>
> OK for commit?
>
> ChangeLog:
> 2014-08-07 Wilco Dijkstra <wdijkstr@arm.com>
>
> * string/strcat.c (strcat): Improve performance by using strlen/strcpy.