public inbox for glibc-bugs@sourceware.org
* [Bug string/26091] New: strcpy cost more time in glibc-2.31
@ 2020-06-08  2:04 guojinhui at huawei dot com
  2020-06-08  2:05 ` [Bug string/26091] " guojinhui at huawei dot com
                   ` (2 more replies)
  0 siblings, 3 replies; 4+ messages in thread
From: guojinhui at huawei dot com @ 2020-06-08  2:04 UTC
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=26091

            Bug ID: 26091
           Summary: strcpy cost more time in glibc-2.31
           Product: glibc
           Version: 2.31
            Status: UNCONFIRMED
          Severity: normal
          Priority: P2
         Component: string
          Assignee: unassigned at sourceware dot org
          Reporter: guojinhui at huawei dot com
  Target Milestone: ---

Created attachment 12602
  --> https://sourceware.org/bugzilla/attachment.cgi?id=12602&action=edit
I reduced the bug to a stand-alone test case, now attached.

When I use strcpy to copy ten bytes of data, it takes about 70 ns with
glibc-2.31 but about 53 ns with glibc-2.29. I found that the difference is
related to the address of strcpy: when strcpy is placed at a 32-byte-aligned
address, it takes less time than when it is only 16-byte aligned.

------------------------------------------------------------------------
testcase                 address           alignment          time(ns)
------------------------------------------------------------------------
strcpy_10_libmicro       0x95AF0           16                 70.48611
strcpy_10_libmicro       0x95C90           16                 69.54695
strcpy_10_libmicro       0x95C10           16                 69.0097
strcpy_10_libmicro       0x95AE0           32                 53.42931
strcpy_10_libmicro       0x95B00           32                 53.28875
strcpy_10_libmicro       0x95B20           32                 53.29308
strcpy_10_libmicro       0x95B40           32                 53.31686
strcpy_10_libmicro       0x95B60           32                 53.28691
------------------------------------------------------------------------

Thus, should it be 32-byte alignment?

diff --git a/sysdeps/powerpc/powerpc32/strcpy.S b/sysdeps/powerpc/powerpc32/strcpy.S
index 0067e76..7a8badd 100644
--- a/sysdeps/powerpc/powerpc32/strcpy.S
+++ b/sysdeps/powerpc/powerpc32/strcpy.S
@@ -22,7 +22,7 @@
 
 /* char * [r3] strcpy (char *dest [r3], const char *src [r4])  */
 
-EALIGN (strcpy, 4, 0)
+EALIGN (strcpy, 5, 0)
 
 #define rTMP   r0
 #define rRTN   r3      /* incoming DEST arg preserved as result */
-- 
2.12.3

-- 
You are receiving this mail because:
You are on the CC list for the bug.


* [Bug string/26091] strcpy cost more time in glibc-2.31
  2020-06-08  2:04 [Bug string/26091] New: strcpy cost more time in glibc-2.31 guojinhui at huawei dot com
@ 2020-06-08  2:05 ` guojinhui at huawei dot com
  2020-06-08  2:06 ` guojinhui at huawei dot com
  2020-06-22 20:28 ` adhemerval.zanella at linaro dot org
  2 siblings, 0 replies; 4+ messages in thread
From: guojinhui at huawei dot com @ 2020-06-08  2:05 UTC
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=26091

--- Comment #1 from JinhuiGuo <guojinhui at huawei dot com> ---
test case

I reduced the bug to a stand-alone test case, now attached.


* [Bug string/26091] strcpy cost more time in glibc-2.31
  2020-06-08  2:04 [Bug string/26091] New: strcpy cost more time in glibc-2.31 guojinhui at huawei dot com
  2020-06-08  2:05 ` [Bug string/26091] " guojinhui at huawei dot com
@ 2020-06-08  2:06 ` guojinhui at huawei dot com
  2020-06-22 20:28 ` adhemerval.zanella at linaro dot org
  2 siblings, 0 replies; 4+ messages in thread
From: guojinhui at huawei dot com @ 2020-06-08  2:06 UTC
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=26091

--- Comment #2 from JinhuiGuo <guojinhui at huawei dot com> ---
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <sys/time.h>
#include <string.h>

int s = 10;
int unaligned = 0;

void init_str(char *str)
{
        static char *demo =
                "The quick brown fox jumps over the lazy dog.";
        int l = strlen(demo);
        int i;
        for (i = 0; i < s; i++) {
                str[i] = demo[i % l];
        }

        str[s] = 0;
}

int main(void)
{
        int i;
        struct timespec tv;
        struct timespec tv1;

        char *src2 = (char *)malloc(s + 1);
        char *src = (char *)malloc(s + 1 + unaligned);
        init_str(src2);
        src2 += unaligned;

        clock_gettime(CLOCK_MONOTONIC, &tv);

        for (i = 0; i < 1100000; i += 10) {
                (void) strcpy(src, src2);
                (void) strcpy(src, src2);
                (void) strcpy(src, src2);
                (void) strcpy(src, src2);
                (void) strcpy(src, src2);
                (void) strcpy(src, src2);
                (void) strcpy(src, src2);
                (void) strcpy(src, src2);
                (void) strcpy(src, src2);
                (void) strcpy(src, src2);
        }

        clock_gettime(CLOCK_MONOTONIC, &tv1);
        long long tmp = ((long long)tv1.tv_sec * 1000000000LL)
                        - ((long long)tv.tv_sec * 1000000000LL)
                        + ((long long)tv1.tv_nsec)
                        - ((long long)tv.tv_nsec);
        printf("cost: %f ns\n", ((double)tmp) / i);

        src2 -= unaligned;
        free(src);
        free(src2);

        return 0;
}


* [Bug string/26091] strcpy cost more time in glibc-2.31
  2020-06-08  2:04 [Bug string/26091] New: strcpy cost more time in glibc-2.31 guojinhui at huawei dot com
  2020-06-08  2:05 ` [Bug string/26091] " guojinhui at huawei dot com
  2020-06-08  2:06 ` guojinhui at huawei dot com
@ 2020-06-22 20:28 ` adhemerval.zanella at linaro dot org
  2 siblings, 0 replies; 4+ messages in thread
From: adhemerval.zanella at linaro dot org @ 2020-06-22 20:28 UTC
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=26091

Adhemerval Zanella <adhemerval.zanella at linaro dot org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |adhemerval.zanella at linaro dot o
                   |                            |rg

--- Comment #3 from Adhemerval Zanella <adhemerval.zanella at linaro dot org> ---
(In reply to JinhuiGuo from comment #0)
> Created attachment 12602 [details]
> I reduced the bug to a stand-alone test case, now attached.
> 
> When I use strcpy to copy ten byte of data, it takes 70ns in glibc-2.31
> while 53ns in glibc-2.29. I found it related to the address of strcpy. When
> the address of strcpy is 32-byte alignment, it takes less time than 16-byte
> alignment.
> 
> ------------------------------------------------------------------------
> testcase                 address           alignment          time(ns)
> ------------------------------------------------------------------------
> strcpy_10_libmicro       0x95AF0           16                 70.48611
> strcpy_10_libmicro       0x95C90           16                 69.54695
> strcpy_10_libmicro       0x95C10           16                 69.0097
> strcpy_10_libmicro       0x95AE0           32                 53.42931
> strcpy_10_libmicro       0x95B00           32                 53.28875
> strcpy_10_libmicro       0x95B20           32                 53.29308
> strcpy_10_libmicro       0x95B40           32                 53.31686
> strcpy_10_libmicro       0x95B60           32                 53.28691
> ------------------------------------------------------------------------

I am seeing the opposite on gcc203 (POWER8), where changing the alignment to 32
bytes (EALIGN (..., 5, 0)) increases the cost from ~11.64 to ~12.41 per call.
This is using the provided benchmark.

In fact, this is really micro-architecture dependent: icache alignment might or
might not cause performance issues.  GCC also seems to use a different
alignment depending on the target processor (-mcpu=xxx), and the default for
powerX is '.p2align 4,,15'.

So before actually changing the default alignment, I would like to check that
this is not a pessimization on generic powerpc32, as it seems to be on POWER.

> 
> Thus, should it be 32-byte alignment?
> 
> diff --git a/sysdeps/powerpc/powerpc32/strcpy.S b/sysdeps/powerpc/powerpc32/strcpy.S
> index 0067e76..7a8badd 100644
> --- a/sysdeps/powerpc/powerpc32/strcpy.S
> +++ b/sysdeps/powerpc/powerpc32/strcpy.S
> @@ -22,7 +22,7 @@
>  
>  /* char * [r3] strcpy (char *dest [r3], const char *src [r4])  */
>  
> -EALIGN (strcpy, 4, 0)
> +EALIGN (strcpy, 5, 0)
>  
>  #define rTMP   r0
>  #define rRTN   r3      /* incoming DEST arg preserved as result */
> -- 
> 2.12.3
