public inbox for libc-alpha@sourceware.org
 help / color / mirror / Atom feed
* [PATCH v12 14/31] string: Improve generic strrchr with memrchr and strlen
@ 2023-02-02 19:54 Wilco Dijkstra
  2023-02-03 12:15 ` Adhemerval Zanella Netto
  0 siblings, 1 reply; 5+ messages in thread
From: Wilco Dijkstra @ 2023-02-02 19:54 UTC (permalink / raw)
  To: Adhemerval Zanella; +Cc: 'GNU C Library'

Hi Adhemerval,

 STRRCHR (const char *s, int c)
 {
+  return __memrchr (s, c, strlen (s) + 1);
 }

LGTM

I did a quick check on AArch64 and the speedup vs the old generic version
on bench-strrchr is 9.9 times! It's ~15% slower overall than the assembler
strrchr.

While the benchmark is not perfect, it shows that if you have decent strlen
and memrchr then strlen+memchr gives a competitive strrchr.

Cheers,
Wilco

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH v12 14/31] string: Improve generic strrchr with memrchr and strlen
  2023-02-02 19:54 [PATCH v12 14/31] string: Improve generic strrchr with memrchr and strlen Wilco Dijkstra
@ 2023-02-03 12:15 ` Adhemerval Zanella Netto
  0 siblings, 0 replies; 5+ messages in thread
From: Adhemerval Zanella Netto @ 2023-02-03 12:15 UTC (permalink / raw)
  To: Wilco Dijkstra; +Cc: 'GNU C Library'



On 02/02/23 16:54, Wilco Dijkstra wrote:
> Hi Adhemerval,
> 
>  STRRCHR (const char *s, int c)
>  {
> +  return __memrchr (s, c, strlen (s) + 1);
>  }
> 
> LGTM
> 
> I did a quick check on AArch64 and the speedup vs the old generic version
> on bench-strrchr is 9.9 times! It's ~15% slower overall than the assembler
> strrchr.
> 
> While the benchmark is not perfect, it shows that if you have decent strlen
> and memrchr then strlen+memchr gives a competitive strrchr.

Yeah, I saw a similar improvement on my side as well.  Thanks for verify it
as well.

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH v12 14/31] string: Improve generic strrchr with memrchr and strlen
  2023-02-04  3:06   ` Richard Henderson
@ 2023-02-06 14:01     ` Adhemerval Zanella Netto
  0 siblings, 0 replies; 5+ messages in thread
From: Adhemerval Zanella Netto @ 2023-02-06 14:01 UTC (permalink / raw)
  To: Richard Henderson, libc-alpha, Jeff Law, Xi Ruoyao, Noah Goldstein



On 04/02/23 00:06, Richard Henderson wrote:
> On 2/2/23 08:11, Adhemerval Zanella wrote:
>> Now that both strlen and memrchr have word vectorized implementation,
>> it should be faster to implement strrchr based on memrchr over the
>> string length instead of calling strchr on a loop.
>>
>> Checked on x86_64-linux-gnu, i686-linux-gnu, powerpc-linux-gnu,
>> and powerpc64-linux-gnu by removing the arch-specific assembly
>> implementation and disabling multi-arch (it covers both LE and BE
>> for 64 and 32 bits).
>> ---
>>   string/strrchr.c | 18 +-----------------
>>   1 file changed, 1 insertion(+), 17 deletions(-)
> 
> memrchr needs libc_hidden_builtin_proto.  On riscv64:
> 
>    c:   00000097                auipc   ra,0x0
>                         c: R_RISCV_CALL __GI_strlen
>                         c: R_RISCV_RELAX        *ABS*
>   10:   000080e7                jalr    ra # c <__GI_strrchr+0xc>
> ...
>   24:   00000317                auipc   t1,0x0
>                         24: R_RISCV_CALL_PLT    __memrchr
>                         24: R_RISCV_RELAX       *ABS*
>   28:   00030067                jr      t1 # 24 <.LVL2+0x8>
> 
> 
> r~

Similar to strchr [1], static linker ends up creating the expected local
call:

$ riscv64-glibc-linux-gnu-objdump -dr libc.so
[...]
0000000000088f5c <strrchr>:
   88f5c:       7179                    addi    sp,sp,-48
   88f5e:       e84a                    sd      s2,16(sp)
   88f60:       000e7917                auipc   s2,0xe7
   88f64:       73893903                ld      s2,1848(s2) # 170698 <__stack_chk_guard@GLIBC_2.27>
   88f68:       00093783                ld      a5,0(s2)
[...]
   88f9a:       6145                    addi    sp,sp,48
   88f9c:       c70fd06f                j       8640c <memrchr>
   88fa0:       193500ef                jal     ra,d9932 <__stack_chk_fail>
[...]

But I agree that we should change to have the libc_hidden_builin_proto
in this case.  I will send a patch to fix it.

[1] https://sourceware.org/pipermail/libc-alpha/2023-February/145322.html

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH v12 14/31] string: Improve generic strrchr with memrchr and strlen
  2023-02-02 18:11 ` [PATCH v12 14/31] string: Improve generic strrchr with memrchr and strlen Adhemerval Zanella
@ 2023-02-04  3:06   ` Richard Henderson
  2023-02-06 14:01     ` Adhemerval Zanella Netto
  0 siblings, 1 reply; 5+ messages in thread
From: Richard Henderson @ 2023-02-04  3:06 UTC (permalink / raw)
  To: Adhemerval Zanella, libc-alpha, Jeff Law, Xi Ruoyao, Noah Goldstein

On 2/2/23 08:11, Adhemerval Zanella wrote:
> Now that both strlen and memrchr have word vectorized implementation,
> it should be faster to implement strrchr based on memrchr over the
> string length instead of calling strchr on a loop.
> 
> Checked on x86_64-linux-gnu, i686-linux-gnu, powerpc-linux-gnu,
> and powerpc64-linux-gnu by removing the arch-specific assembly
> implementation and disabling multi-arch (it covers both LE and BE
> for 64 and 32 bits).
> ---
>   string/strrchr.c | 18 +-----------------
>   1 file changed, 1 insertion(+), 17 deletions(-)

memrchr needs libc_hidden_builtin_proto.  On riscv64:

    c:   00000097                auipc   ra,0x0
                         c: R_RISCV_CALL __GI_strlen
                         c: R_RISCV_RELAX        *ABS*
   10:   000080e7                jalr    ra # c <__GI_strrchr+0xc>
...
   24:   00000317                auipc   t1,0x0
                         24: R_RISCV_CALL_PLT    __memrchr
                         24: R_RISCV_RELAX       *ABS*
   28:   00030067                jr      t1 # 24 <.LVL2+0x8>


r~

^ permalink raw reply	[flat|nested] 5+ messages in thread

* [PATCH v12 14/31] string: Improve generic strrchr with memrchr and strlen
  2023-02-02 18:11 [PATCH v12 00/31] Improve generic string routines Adhemerval Zanella
@ 2023-02-02 18:11 ` Adhemerval Zanella
  2023-02-04  3:06   ` Richard Henderson
  0 siblings, 1 reply; 5+ messages in thread
From: Adhemerval Zanella @ 2023-02-02 18:11 UTC (permalink / raw)
  To: libc-alpha, Richard Henderson, Jeff Law, Xi Ruoyao, Noah Goldstein

Now that both strlen and memrchr have word vectorized implementation,
it should be faster to implement strrchr based on memrchr over the
string length instead of calling strchr on a loop.

Checked on x86_64-linux-gnu, i686-linux-gnu, powerpc-linux-gnu,
and powerpc64-linux-gnu by removing the arch-specific assembly
implementation and disabling multi-arch (it covers both LE and BE
for 64 and 32 bits).
---
 string/strrchr.c | 18 +-----------------
 1 file changed, 1 insertion(+), 17 deletions(-)

diff --git a/string/strrchr.c b/string/strrchr.c
index 3c6e715d3b..7b76dea4e0 100644
--- a/string/strrchr.c
+++ b/string/strrchr.c
@@ -27,23 +27,7 @@
 char *
 STRRCHR (const char *s, int c)
 {
-  const char *found, *p;
-
-  c = (unsigned char) c;
-
-  /* Since strchr is fast, we use it rather than the obvious loop.  */
-
-  if (c == '\0')
-    return strchr (s, '\0');
-
-  found = NULL;
-  while ((p = strchr (s, c)) != NULL)
-    {
-      found = p;
-      s = p + 1;
-    }
-
-  return (char *) found;
+  return __memrchr (s, c, strlen (s) + 1);
 }
 
 #ifdef weak_alias
-- 
2.34.1


^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2023-02-06 14:01 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-02-02 19:54 [PATCH v12 14/31] string: Improve generic strrchr with memrchr and strlen Wilco Dijkstra
2023-02-03 12:15 ` Adhemerval Zanella Netto
  -- strict thread matches above, loose matches on Subject: below --
2023-02-02 18:11 [PATCH v12 00/31] Improve generic string routines Adhemerval Zanella
2023-02-02 18:11 ` [PATCH v12 14/31] string: Improve generic strrchr with memrchr and strlen Adhemerval Zanella
2023-02-04  3:06   ` Richard Henderson
2023-02-06 14:01     ` Adhemerval Zanella Netto

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).