public inbox for libc-alpha@sourceware.org
From: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
To: "libc-alpha@sourceware.org" <libc-alpha@sourceware.org>
Cc: nd <nd@arm.com>
Subject: Re: [PATCH] Improve strtok(_r) performance
Date: Mon, 14 Nov 2016 12:21:00 -0000
Message-ID: <AM5PR0802MB261081079A9A4BC050B185E483BC0@AM5PR0802MB2610.eurprd08.prod.outlook.com>
In-Reply-To: <AM5PR0802MB2610F3F2F9864ECF0089F9E583AD0@AM5PR0802MB2610.eurprd08.prod.outlook.com>


ping

From: Wilco Dijkstra
Sent: 28 October 2016 12:35
To: libc-alpha@sourceware.org
Cc: nd
Subject: [PATCH] Improve strtok(_r) performance
    
Improve strtok(_r) performance.  Instead of calling strpbrk, which itself
calls strcspn, call strcspn directly so that the end of the token is known
without an extra call to rawmemchr.  Also avoid an unnecessary call to strspn
after the last token by adding an early exit for an empty string.  The result
is a ~2x speedup of strtok on most inputs in bench-strtok.
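
For readers who want to try the control flow outside of glibc, here is a
minimal standalone sketch of the approach described above.  It is not the
glibc source: the names sketch_strtok_r and count_tokens are hypothetical,
and the body of the "only delimiters left" branch is an assumption, because
the hunks below elide it.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical name.  Follows the control flow described in the patch:
   an early exit on an empty string, then strspn, then a single strcspn.  */
static char *
sketch_strtok_r (char *s, const char *delim, char **save_ptr)
{
  char *end;

  if (s == NULL)
    s = *save_ptr;

  /* Early exit: nothing left to scan, so no string function is called.  */
  if (*s == '\0')
    {
      *save_ptr = s;
      return NULL;
    }

  /* Skip leading delimiters.  */
  s += strspn (s, delim);
  if (*s == '\0')
    {
      /* Assumed behaviour; this part is not shown in the diff context.  */
      *save_ptr = s;
      return NULL;
    }

  /* One strcspn call gives the end of the token, whether it stops at a
     delimiter or at the terminating NUL.  */
  end = s + strcspn (s, delim);
  if (*end == '\0')
    {
      *save_ptr = end;
      return s;
    }

  /* Terminate the token and resume just past the delimiter next time.  */
  *end = '\0';
  *save_ptr = end + 1;
  return s;
}

/* Hypothetical helper showing the usual calling loop.  BUF is modified.  */
static size_t
count_tokens (char *buf, const char *delim)
{
  size_t n = 0;
  char *save;

  for (char *tok = sketch_strtok_r (buf, delim, &save); tok != NULL;
       tok = sketch_strtok_r (NULL, delim, &save))
    n++;
  return n;
}
```

Once the last token has been returned, the saved pointer refers to the
terminating NUL, so every later call takes the early exit.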

Passes regression tests, OK for commit?

ChangeLog:
2016-10-28  Wilco Dijkstra  <wdijkstr@arm.com>

        * string/strtok.c (STRTOK): Optimize for performance.
        * string/strtok_r.c (__strtok_r): Likewise.
--

diff --git a/string/strtok.c b/string/strtok.c
index 7a4574db5c80501e47d045ad4347e8a287b32191..b1ed48c24c8d20706b7d05481a138b18a01ff802 100644
--- a/string/strtok.c
+++ b/string/strtok.c
@@ -38,11 +38,18 @@ static char *olds;
 char *
 STRTOK (char *s, const char *delim)
 {
-  char *token;
+  char *end;
 
   if (s == NULL)
     s = olds;
 
+  /* Return immediately at end of string.  */
+  if (*s == '\0')
+    {
+      olds = s;
+      return NULL;
+    }
+
   /* Scan leading delimiters.  */
   s += strspn (s, delim);
   if (*s == '\0')
@@ -52,16 +59,15 @@ STRTOK (char *s, const char *delim)
     }
 
   /* Find the end of the token.  */
-  token = s;
-  s = strpbrk (token, delim);
-  if (s == NULL)
-    /* This token finishes the string.  */
-    olds = __rawmemchr (token, '\0');
-  else
+  end = s + strcspn (s, delim);
+  if (*end == '\0')
     {
-      /* Terminate the token and make OLDS point past it.  */
-      *s = '\0';
-      olds = s + 1;
+      olds = end;
+      return s;
     }
-  return token;
+
+  /* Terminate the token and make OLDS point past it.  */
+  *end = '\0';
+  olds = end + 1;
+  return s;
 }
diff --git a/string/strtok_r.c b/string/strtok_r.c
index f351304766108dad2c1cff881ad3bebae821b2a0..e049a5c82e026a3b6c1ba5da16ce81743717805e 100644
--- a/string/strtok_r.c
+++ b/string/strtok_r.c
@@ -45,11 +45,17 @@
 char *
 __strtok_r (char *s, const char *delim, char **save_ptr)
 {
-  char *token;
+  char *end;
 
   if (s == NULL)
     s = *save_ptr;
 
+  if (*s == '\0')
+    {
+      *save_ptr = s;
+      return NULL;
+    }
+
   /* Scan leading delimiters.  */
   s += strspn (s, delim);
   if (*s == '\0')
@@ -59,18 +65,17 @@ __strtok_r (char *s, const char *delim, char **save_ptr)
     }
 
   /* Find the end of the token.  */
-  token = s;
-  s = strpbrk (token, delim);
-  if (s == NULL)
-    /* This token finishes the string.  */
-    *save_ptr = __rawmemchr (token, '\0');
-  else
+  end = s + strcspn (s, delim);
+  if (*end == '\0')
     {
-      /* Terminate the token and make *SAVE_PTR point past it.  */
-      *s = '\0';
-      *save_ptr = s + 1;
+      *save_ptr = end;
+      return s;
     }
-  return token;
+
+  /* Terminate the token and make *SAVE_PTR point past it.  */
+  *end = '\0';
+  *save_ptr = end + 1;
+  return s;
 }
 #ifdef weak_alias
 libc_hidden_def (__strtok_r)