public inbox for libc-alpha@sourceware.org
* [PATCH] Improve strtok(_r) performance
@ 2016-10-28 11:35 Wilco Dijkstra
  2016-11-04 12:55 ` Adhemerval Zanella
  2016-11-14 12:21 ` Wilco Dijkstra
  0 siblings, 2 replies; 8+ messages in thread
From: Wilco Dijkstra @ 2016-10-28 11:35 UTC (permalink / raw)
  To: libc-alpha; +Cc: nd

Improve strtok(_r) performance.  Instead of calling strpbrk, which calls
strcspn, call strcspn directly so we get the end of the token without
an extra call to rawmemchr.  Also avoid an unnecessary call to strspn after
the last token by adding an early exit for an empty string.  The result
is a ~2x speedup of strtok on most inputs in bench-strtok.

Passes regression tests, OK for commit?

ChangeLog:
2016-10-28  Wilco Dijkstra  <wdijkstr@arm.com>

	* string/strtok.c (STRTOK): Optimize for performance.
	* string/strtok_r.c (__strtok_r): Likewise.
--

diff --git a/string/strtok.c b/string/strtok.c
index 7a4574db5c80501e47d045ad4347e8a287b32191..b1ed48c24c8d20706b7d05481a138b18a01ff802 100644
--- a/string/strtok.c
+++ b/string/strtok.c
@@ -38,11 +38,18 @@ static char *olds;
 char *
 STRTOK (char *s, const char *delim)
 {
-  char *token;
+  char *end;
 
   if (s == NULL)
     s = olds;
 
+  /* Return immediately at end of string.  */
+  if (*s == '\0')
+    {
+      olds = s;
+      return NULL;
+    }
+
   /* Scan leading delimiters.  */
   s += strspn (s, delim);
   if (*s == '\0')
@@ -52,16 +59,15 @@ STRTOK (char *s, const char *delim)
     }
 
   /* Find the end of the token.  */
-  token = s;
-  s = strpbrk (token, delim);
-  if (s == NULL)
-    /* This token finishes the string.  */
-    olds = __rawmemchr (token, '\0');
-  else
+  end = s + strcspn (s, delim);
+  if (*end == '\0')
     {
-      /* Terminate the token and make OLDS point past it.  */
-      *s = '\0';
-      olds = s + 1;
+      olds = end;
+      return s;
     }
-  return token;
+
+  /* Terminate the token and make OLDS point past it.  */
+  *end = '\0';
+  olds = end + 1;
+  return s;
 }
diff --git a/string/strtok_r.c b/string/strtok_r.c
index f351304766108dad2c1cff881ad3bebae821b2a0..e049a5c82e026a3b6c1ba5da16ce81743717805e 100644
--- a/string/strtok_r.c
+++ b/string/strtok_r.c
@@ -45,11 +45,17 @@
 char *
 __strtok_r (char *s, const char *delim, char **save_ptr)
 {
-  char *token;
+  char *end;
 
   if (s == NULL)
     s = *save_ptr;
 
+  if (*s == '\0')
+    {
+      *save_ptr = s;
+      return NULL;
+    }
+
   /* Scan leading delimiters.  */
   s += strspn (s, delim);
   if (*s == '\0')
@@ -59,18 +65,17 @@ __strtok_r (char *s, const char *delim, char **save_ptr)
     }
 
   /* Find the end of the token.  */
-  token = s;
-  s = strpbrk (token, delim);
-  if (s == NULL)
-    /* This token finishes the string.  */
-    *save_ptr = __rawmemchr (token, '\0');
-  else
+  end = s + strcspn (s, delim);
+  if (*end == '\0')
     {
-      /* Terminate the token and make *SAVE_PTR point past it.  */
-      *s = '\0';
-      *save_ptr = s + 1;
+      *save_ptr = end;
+      return s;
     }
-  return token;
+
+  /* Terminate the token and make *SAVE_PTR point past it.  */
+  *end = '\0';
+  *save_ptr = end + 1;
+  return s;
 }
 #ifdef weak_alias
 libc_hidden_def (__strtok_r)
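
For readers who want to try the patched logic outside glibc, the following
is a minimal standalone sketch.  The names sketch_strtok_r and
sketch_count_tokens are hypothetical; the body mirrors the post-patch
__strtok_r from the diff above and uses only the standard strspn and
strcspn.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical standalone copy of the post-patch tokenizer logic.  */
char *
sketch_strtok_r (char *s, const char *delim, char **save_ptr)
{
  char *end;

  if (s == NULL)
    s = *save_ptr;

  /* Early exit: nothing is left to scan, so skip the strspn call.  */
  if (*s == '\0')
    {
      *save_ptr = s;
      return NULL;
    }

  /* Skip leading delimiters.  */
  s += strspn (s, delim);
  if (*s == '\0')
    {
      *save_ptr = s;
      return NULL;
    }

  /* strcspn gives the token length, so END is either the first
     delimiter or the terminating NUL; no second scan is needed.  */
  end = s + strcspn (s, delim);
  if (*end == '\0')
    {
      *save_ptr = end;
      return s;
    }

  /* Terminate the token and resume after it on the next call.  */
  *end = '\0';
  *save_ptr = end + 1;
  return s;
}

/* Count the tokens in BUF, which is modified in place.  */
size_t
sketch_count_tokens (char *buf, const char *delim)
{
  char *save;
  size_t n = 0;

  for (char *tok = sketch_strtok_r (buf, delim, &save); tok != NULL;
       tok = sketch_strtok_r (NULL, delim, &save))
    n++;
  return n;
}
```

Once the input is exhausted, the save pointer rests on the terminating NUL,
so every further call takes the early exit and returns NULL without calling
strspn.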

* Re: [PATCH] Improve strtok(_r) performance
  2016-10-28 11:35 [PATCH] Improve strtok(_r) performance Wilco Dijkstra
@ 2016-11-04 12:55 ` Adhemerval Zanella
  2016-11-14 12:21 ` Wilco Dijkstra
  1 sibling, 0 replies; 8+ messages in thread
From: Adhemerval Zanella @ 2016-11-04 12:55 UTC (permalink / raw)
  To: libc-alpha



On 28/10/2016 09:35, Wilco Dijkstra wrote:
> Improve strtok(_r) performance.  Instead of calling strpbrk, which calls
> strcspn, call strcspn directly so we get the end of the token without
> an extra call to rawmemchr.  Also avoid an unnecessary call to strspn after
> the last token by adding an early exit for an empty string.  The result
> is a ~2x speedup of strtok on most inputs in bench-strtok.
> 
> Passes regression tests, OK for commit?

Why not aim for simplicity and just use strtok_r in strtok? It should
be a tail call on most architectures, and the performance loss should be
minimal.

Either way, LGTM. I also found that the powerpc64-optimized implementation
performs worse than this new default one; once you push this in, I plan to
remove it.
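
As a rough illustration of the wrapper idea, here is a minimal sketch.  The
name wrapper_strtok is hypothetical, and the sketch calls the public POSIX
strtok_r rather than any glibc-internal symbol.

```c
#define _POSIX_C_SOURCE 200809L
#include <stddef.h>
#include <string.h>

/* Hypothetical strtok replacement: all tokenizing work is done by
   strtok_r, and the only state kept here is the static save pointer.
   Like strtok itself, this is not reentrant or thread-safe.  */
char *
wrapper_strtok (char *s, const char *delim)
{
  static char *olds;
  return strtok_r (s, delim, &olds);
}
```

Because the call is the last thing the function does and its result is
returned unchanged, a compiler is free to emit it as a tail call, though
whether it does depends on the target and optimization level.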

* Re: [PATCH] Improve strtok(_r) performance
  2016-10-28 11:35 [PATCH] Improve strtok(_r) performance Wilco Dijkstra
  2016-11-04 12:55 ` Adhemerval Zanella
@ 2016-11-14 12:21 ` Wilco Dijkstra
  2016-11-14 12:26   ` Adhemerval Zanella
  1 sibling, 1 reply; 8+ messages in thread
From: Wilco Dijkstra @ 2016-11-14 12:21 UTC (permalink / raw)
  To: libc-alpha; +Cc: nd


ping

* Re: [PATCH] Improve strtok(_r) performance
  2016-11-14 12:21 ` Wilco Dijkstra
@ 2016-11-14 12:26   ` Adhemerval Zanella
  2016-11-14 12:50     ` Wilco Dijkstra
  0 siblings, 1 reply; 8+ messages in thread
From: Adhemerval Zanella @ 2016-11-14 12:26 UTC (permalink / raw)
  To: Wilco Dijkstra, libc-alpha; +Cc: nd

https://sourceware.org/ml/libc-alpha/2016-11/msg00150.html

On 14/11/2016 10:20, Wilco Dijkstra wrote:
> 
> ping
> 

* Re: [PATCH] Improve strtok(_r) performance
  2016-11-14 12:26   ` Adhemerval Zanella
@ 2016-11-14 12:50     ` Wilco Dijkstra
  2016-11-14 12:57       ` Adhemerval Zanella
  0 siblings, 1 reply; 8+ messages in thread
From: Wilco Dijkstra @ 2016-11-14 12:50 UTC (permalink / raw)
  To: Adhemerval Zanella, libc-alpha; +Cc: nd

Adhemerval Zanella <adhemerval.zanella@linaro.org> wrote:
> Why not aim for simplicity and just use strtok_r in strtok? It should
> be a tail call on most architectures, and the performance loss should be
> minimal.

You mean avoiding the duplication of code? Say, inlining or tailcalling
strtok_r in string/strtok.c?

Wilco

* Re: [PATCH] Improve strtok(_r) performance
  2016-11-14 12:50     ` Wilco Dijkstra
@ 2016-11-14 12:57       ` Adhemerval Zanella
  2016-11-16 17:06         ` [PATCH v2] " Wilco Dijkstra
  0 siblings, 1 reply; 8+ messages in thread
From: Adhemerval Zanella @ 2016-11-14 12:57 UTC (permalink / raw)
  To: Wilco Dijkstra, libc-alpha; +Cc: nd



On 14/11/2016 10:50, Wilco Dijkstra wrote:
> Adhemerval Zanella <adhemerval.zanella@linaro.org> wrote:
>> Why not aim for simplicity and just use strtok_r in strtok? It should
>> be a tail call on most architectures, and the performance loss should be
>> minimal.
> 
> You mean avoiding the duplication of code? Say, inlining or tailcalling
> strtok_r in string/strtok.c?
> 
> Wilco

Yes.

* [PATCH v2] Improve strtok(_r) performance
  2016-11-14 12:57       ` Adhemerval Zanella
@ 2016-11-16 17:06         ` Wilco Dijkstra
  2016-11-16 18:07           ` Andreas Schwab
  0 siblings, 1 reply; 8+ messages in thread
From: Wilco Dijkstra @ 2016-11-16 17:06 UTC (permalink / raw)
  To: Adhemerval Zanella, libc-alpha; +Cc: nd

Improve strtok and strtok_r performance.  Instead of calling strpbrk, which
calls strcspn, call strcspn directly so we get the end of the token without
an extra call to rawmemchr.  Also avoid an unnecessary call to strspn after
the last token by adding an early exit for an empty string.  Change strtok
to tailcall strtok_r to avoid unnecessary code duplication.

Remove the special-case header optimization in strtok_r for a constant
1-character delimiter string, since both strspn and strcspn already
contain optimizations for this case.  Benchmarking this showed similar
performance in the worst case, but up to 5.5x better performance in the
"found" case for large inputs.

Passes regression tests, OK for commit?
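
To illustrate that dropping the 1-character inline changes only speed and
not results, here is a minimal sketch that runs a copy of the removed inline
next to the generic path and compares the tokens.  The names
sketch_strtok_r_1c and sketch_same_tokens are hypothetical, the generic path
is the public POSIX strtok_r, and the 256-byte buffers are an arbitrary
limit chosen for this sketch.

```c
#define _POSIX_C_SOURCE 200809L
#include <stddef.h>
#include <string.h>

/* Copy of the removed single-character fast path, under a
   hypothetical name.  SEP must not be NUL.  */
static char *
sketch_strtok_r_1c (char *s, char sep, char **nextp)
{
  char *result;

  if (s == NULL)
    s = *nextp;
  while (*s == sep)
    ++s;
  result = NULL;
  if (*s != '\0')
    {
      result = s++;
      while (*s != '\0')
        if (*s++ == sep)
          {
            s[-1] = '\0';
            break;
          }
    }
  *nextp = s;
  return result;
}

/* Return 1 if both paths split TEXT into the same tokens on SEP,
   0 otherwise (including when the input is rejected).  */
int
sketch_same_tokens (const char *text, char sep)
{
  char a[256], b[256];
  char delim[2] = { sep, '\0' };
  char *save_a, *save_b;

  if (sep == '\0' || strlen (text) >= sizeof a)
    return 0;
  strcpy (a, text);
  strcpy (b, text);

  char *ta = sketch_strtok_r_1c (a, sep, &save_a);
  char *tb = strtok_r (b, delim, &save_b);
  while (ta != NULL && tb != NULL)
    {
      if (strcmp (ta, tb) != 0)
        return 0;
      ta = sketch_strtok_r_1c (NULL, sep, &save_a);
      tb = strtok_r (NULL, delim, &save_b);
    }
  /* Both sequences must end at the same point.  */
  return ta == NULL && tb == NULL;
}
```

This only checks behavior on the inputs it is given; it says nothing about
the relative speed of the two paths.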

ChangeLog:
2016-11-16  Wilco Dijkstra  <wdijkstr@arm.com>

	* benchtests/bench-strtok.c (oldstrtok): Add old implementation.
	* string/strtok.c (strtok): Change to tailcall __strtok_r.
	* string/strtok_r.c (__strtok_r): Optimize for performance.
	* string/string-inlines.c (__old_strtok_r_1c): New function.
	* string/bits/string2.h (__strtok_r): Move to string-inlines.c.

--
diff --git a/benchtests/bench-strtok.c b/benchtests/bench-strtok.c
index eeb798f01575c712b958ad1b7d5f88f91e158202..074be9748993d26539df642bc246e475b77557a7 100644
--- a/benchtests/bench-strtok.c
+++ b/benchtests/bench-strtok.c
@@ -20,13 +20,40 @@
 #define TEST_NAME "strtok"
 #include "bench-string.h"
 
-#define STRTOK strtok_string
-#include <string/strtok.c>
+char *oldstrtok (char *s, const char *delim)
+{
+  static char *olds;
+  char *token;
+
+  if (s == NULL)
+    s = olds;
+
+  /* Scan leading delimiters.  */
+  s += strspn (s, delim);
+  if (*s == '\0')
+    {
+      olds = s;
+      return NULL;
+    }
 
+  /* Find the end of the token.  */
+  token = s;
+  s = strpbrk (token, delim);
+  if (s == NULL)
+    /* This token finishes the string.  */
+    olds = __rawmemchr (token, '\0');
+  else
+    {
+      /* Terminate the token and make OLDS point past it.  */
+      *s = '\0';
+      olds = s + 1;
+    }
+  return token;
+}
 
 typedef char *(*proto_t) (const char *, const char *);
 
-IMPL (strtok_string, 0)
+IMPL (oldstrtok, 0)
 IMPL (strtok, 1)
 
 static void
diff --git a/string/bits/string2.h b/string/bits/string2.h
index 80987602f34ded483854bcea86dabd5b81e42a18..1b84c88661c2b4083296ece6bd763c7987597953 100644
--- a/string/bits/string2.h
+++ b/string/bits/string2.h
@@ -180,45 +180,6 @@ extern void *__rawmemchr (const void *__s, int __c);
 #endif
 
 
-#if !defined _HAVE_STRING_ARCH_strtok_r || defined _FORCE_INLINES
-# ifndef _HAVE_STRING_ARCH_strtok_r
-#  define __strtok_r(s, sep, nextp) \
-  (__extension__ (__builtin_constant_p (sep) && __string2_1bptr_p (sep)	      \
-		  && ((const char *) (sep))[0] != '\0'			      \
-		  && ((const char *) (sep))[1] == '\0'			      \
-		  ? __strtok_r_1c (s, ((const char *) (sep))[0], nextp)       \
-		  : __strtok_r (s, sep, nextp)))
-# endif
-
-__STRING_INLINE char *__strtok_r_1c (char *__s, char __sep, char **__nextp);
-__STRING_INLINE char *
-__strtok_r_1c (char *__s, char __sep, char **__nextp)
-{
-  char *__result;
-  if (__s == NULL)
-    __s = *__nextp;
-  while (*__s == __sep)
-    ++__s;
-  __result = NULL;
-  if (*__s != '\0')
-    {
-      __result = __s++;
-      while (*__s != '\0')
-	if (*__s++ == __sep)
-	  {
-	    __s[-1] = '\0';
-	    break;
-	  }
-    }
-  *__nextp = __s;
-  return __result;
-}
-# ifdef __USE_POSIX
-#  define strtok_r(s, sep, nextp) __strtok_r (s, sep, nextp)
-# endif
-#endif
-
-
 #if !defined _HAVE_STRING_ARCH_strsep || defined _FORCE_INLINES
 # ifndef _HAVE_STRING_ARCH_strsep
 
diff --git a/string/string-inlines.c b/string/string-inlines.c
index 1091468519e1561ac2a4e9c3ed6eb75ee9fdf43f..d43e5897c37430e5f97940469d65e7ddbdcbd09c 100644
--- a/string/string-inlines.c
+++ b/string/string-inlines.c
@@ -35,6 +35,36 @@
 
 #include "shlib-compat.h"
 
+#if SHLIB_COMPAT (libc, GLIBC_2_1_1, GLIBC_2_25)
+/* The inline functions are not used from GLIBC 2.25 and forward, however
+   they are required to provide the symbols through string-inlines.c
+   (if inlining is not possible for compatibility reasons).  */
+
+char *
+__old_strtok_r_1c (char *__s, char __sep, char **__nextp)
+{
+  char *__result;
+  if (__s == NULL)
+    __s = *__nextp;
+  while (*__s == __sep)
+    ++__s;
+  __result = NULL;
+  if (*__s != '\0')
+    {
+      __result = __s++;
+      while (*__s != '\0')
+	if (*__s++ == __sep)
+	  {
+	    __s[-1] = '\0';
+	    break;
+	  }
+    }
+  *__nextp = __s;
+  return __result;
+}
+compat_symbol (libc, __old_strtok_r_1c, __strtok_r_1c, GLIBC_2_1_1);
+#endif
+
 #if SHLIB_COMPAT (libc, GLIBC_2_1_1, GLIBC_2_24)
 /* The inline functions are not used from GLIBC 2.24 and forward, however
    they are required to provide the symbols through string-inlines.c
diff --git a/string/strtok.c b/string/strtok.c
index 7a4574db5c80501e47d045ad4347e8a287b32191..482cdc1da45a71173080b6eff3857e863b5977ea 100644
--- a/string/strtok.c
+++ b/string/strtok.c
@@ -18,14 +18,6 @@
 #include <string.h>
 
 
-static char *olds;
-
-#undef strtok
-
-#ifndef STRTOK
-# define STRTOK strtok
-#endif
-
 /* Parse S into tokens separated by characters in DELIM.
    If S is NULL, the last string strtok() was called with is
    used.  For example:
@@ -36,32 +28,8 @@ static char *olds;
 		// s = "abc\0=-def\0"
 */
 char *
-STRTOK (char *s, const char *delim)
+strtok (char *s, const char *delim)
 {
-  char *token;
-
-  if (s == NULL)
-    s = olds;
-
-  /* Scan leading delimiters.  */
-  s += strspn (s, delim);
-  if (*s == '\0')
-    {
-      olds = s;
-      return NULL;
-    }
-
-  /* Find the end of the token.  */
-  token = s;
-  s = strpbrk (token, delim);
-  if (s == NULL)
-    /* This token finishes the string.  */
-    olds = __rawmemchr (token, '\0');
-  else
-    {
-      /* Terminate the token and make OLDS point past it.  */
-      *s = '\0';
-      olds = s + 1;
-    }
-  return token;
+  static char *olds;
+  return __strtok_r (s, delim, &olds);
 }
diff --git a/string/strtok_r.c b/string/strtok_r.c
index f351304766108dad2c1cff881ad3bebae821b2a0..2d251f90d79b6c546e80e1d25c03955ea8dad92b 100644
--- a/string/strtok_r.c
+++ b/string/strtok_r.c
@@ -22,14 +22,10 @@
 
 #include <string.h>
 
-#undef strtok_r
-#undef __strtok_r
-
 #ifndef _LIBC
 /* Get specification.  */
 # include "strtok_r.h"
 # define __strtok_r strtok_r
-# define __rawmemchr strchr
 #endif
 
 /* Parse S into tokens separated by characters in DELIM.
@@ -45,11 +41,17 @@
 char *
 __strtok_r (char *s, const char *delim, char **save_ptr)
 {
-  char *token;
+  char *end;
 
   if (s == NULL)
     s = *save_ptr;
 
+  if (*s == '\0')
+    {
+      *save_ptr = s;
+      return NULL;
+    }
+
   /* Scan leading delimiters.  */
   s += strspn (s, delim);
   if (*s == '\0')
@@ -59,18 +61,17 @@ __strtok_r (char *s, const char *delim, char **save_ptr)
     }
 
   /* Find the end of the token.  */
-  token = s;
-  s = strpbrk (token, delim);
-  if (s == NULL)
-    /* This token finishes the string.  */
-    *save_ptr = __rawmemchr (token, '\0');
-  else
+  end = s + strcspn (s, delim);
+  if (*end == '\0')
     {
-      /* Terminate the token and make *SAVE_PTR point past it.  */
-      *s = '\0';
-      *save_ptr = s + 1;
+      *save_ptr = end;
+      return s;
     }
-  return token;
+
+  /* Terminate the token and make *SAVE_PTR point past it.  */
+  *end = '\0';
+  *save_ptr = end + 1;
+  return s;
 }
 #ifdef weak_alias
 libc_hidden_def (__strtok_r)
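
To compare the two code paths outside the glibc benchtests harness, a rough
timing sketch along the following lines could be used.  All names here
(bench_old_tok, bench_new_tok, bench_run) are hypothetical, strchr stands in
for the glibc-internal __rawmemchr, and clock() is a coarse timer, so any
numbers it produces are only indicative.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

typedef char *(*tok_fn) (char *, const char *, char **);

/* Pre-patch logic: strpbrk, plus a second scan when no delimiter
   follows the token.  */
char *
bench_old_tok (char *s, const char *delim, char **save)
{
  char *token;

  if (s == NULL)
    s = *save;
  s += strspn (s, delim);
  if (*s == '\0')
    {
      *save = s;
      return NULL;
    }
  token = s;
  s = strpbrk (token, delim);
  if (s == NULL)
    *save = strchr (token, '\0');
  else
    {
      *s = '\0';
      *save = s + 1;
    }
  return token;
}

/* Post-patch logic: a single strcspn finds the token end.  */
char *
bench_new_tok (char *s, const char *delim, char **save)
{
  char *end;

  if (s == NULL)
    s = *save;
  if (*s == '\0')
    {
      *save = s;
      return NULL;
    }
  s += strspn (s, delim);
  if (*s == '\0')
    {
      *save = s;
      return NULL;
    }
  end = s + strcspn (s, delim);
  if (*end == '\0')
    {
      *save = end;
      return s;
    }
  *end = '\0';
  *save = end + 1;
  return s;
}

/* Tokenize a fresh copy of TEXT with FN, ITERS times over.  Return the
   total token count and store the elapsed CPU time in *SECONDS.  */
size_t
bench_run (tok_fn fn, const char *text, const char *delim, int iters,
           double *seconds)
{
  size_t len = strlen (text) + 1, count = 0;
  char *buf = malloc (len);

  *seconds = 0.0;
  if (buf == NULL)
    return 0;
  clock_t start = clock ();
  for (int i = 0; i < iters; i++)
    {
      char *save;
      memcpy (buf, text, len);
      for (char *t = fn (buf, delim, &save); t != NULL;
           t = fn (NULL, delim, &save))
        count++;
    }
  *seconds = (double) (clock () - start) / CLOCKS_PER_SEC;
  free (buf);
  return count;
}
```

Calling bench_run once with each function on the same input should give
equal token counts; the two elapsed times can then be compared, bearing in
mind that short inputs and small iteration counts will mostly measure noise.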

* Re: [PATCH v2] Improve strtok(_r) performance
  2016-11-16 17:06         ` [PATCH v2] " Wilco Dijkstra
@ 2016-11-16 18:07           ` Andreas Schwab
  0 siblings, 0 replies; 8+ messages in thread
From: Andreas Schwab @ 2016-11-16 18:07 UTC (permalink / raw)
  To: Wilco Dijkstra; +Cc: Adhemerval Zanella, libc-alpha, nd

On Nov 16 2016, Wilco Dijkstra <Wilco.Dijkstra@arm.com> wrote:

> diff --git a/benchtests/bench-strtok.c b/benchtests/bench-strtok.c
> index eeb798f01575c712b958ad1b7d5f88f91e158202..074be9748993d26539df642bc246e475b77557a7 100644
> --- a/benchtests/bench-strtok.c
> +++ b/benchtests/bench-strtok.c
> @@ -20,13 +20,40 @@
>  #define TEST_NAME "strtok"
>  #include "bench-string.h"
>  
> -#define STRTOK strtok_string
> -#include <string/strtok.c>
> +char *oldstrtok (char *s, const char *delim)

Style: break after return type.
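
For illustration, GNU style puts the return type on its own line so that the
function name starts in column 0.  The helper below is hypothetical and
exists only to show the layout.

```c
#include <stddef.h>
#include <string.h>

/* The reviewed hunk had the definition on one line:
     char *oldstrtok (char *s, const char *delim)
   With the break after the return type it looks like this.  */
char *
style_demo_first_token (char *s, const char *delim)
{
  /* Hypothetical helper: return the first token of S, or NULL if S
     contains only delimiters.  S is modified in place.  */
  s += strspn (s, delim);
  if (*s == '\0')
    return NULL;
  s[strcspn (s, delim)] = '\0';
  return s;
}
```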

Andreas.

-- 
Andreas Schwab, schwab@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."
