[Bug c/99953] New: In AVX, SIMD support environment, strlen performance without optimization is 3 times faster than optimized strlen function.

public inbox for gcc-bugs@sourceware.org

* [Bug c/99953] New: In AVX, SIMD support environment, strlen performance without optimization is 3 times faster than optimized strlen function.
From: novemberizing at gmail dot com @ 2021-04-07  7:53 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99953

            Bug ID: 99953
           Summary: In AVX, SIMD support environment, strlen performance
                    without optimization is 3 times faster than optimized
                    strlen function.
           Product: gcc
           Version: 9.3.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c
          Assignee: unassigned at gcc dot gnu.org
          Reporter: novemberizing at gmail dot com
  Target Milestone: ---

Created attachment 50519
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=50519&action=edit
the preprocessed file

I measured strlen on a 65K-byte string, calling it 65536 times at each of -O0,
-O1, -O2 and -O3. The optimized variants are slower than the unoptimized one, as
shown below. In the unoptimized case, I confirmed that glibc@strlen_avx is called.

$ gcc -Wall -Wextra -fno-strict-aliasing -fwrapv
-fno-aggressive-loop-optimizations  -fsanitize=undefined -save-temps strlen.c

$ ./a.out 
no optimize =>  0.000007655
o1 optimize =>  0.000062935
o2 optimize =>  0.000022461
o3 optimize =>  0.000023192
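
For readers without the attachment, here is a minimal sketch of how such a measurement could be structured. It is not the reporter's test source. It assumes that each variant is compiled at its own level with GCC's optimize function attribute (the command line above passes no -O flag), that the buffer is 65536 bytes, and that the printed figure is the average time per call. All names (make_string, strlen_o0 ... strlen_o3, time_variant) are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <stddef.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

#define BUF_LEN 65536   /* assumed size of the "65K bytes" string */
#define ROUNDS  65536   /* calls per variant, as stated in the report */

/* Build a NUL-terminated string of len - 1 'a' characters. Caller frees. */
char *make_string(size_t len)
{
    char *s = malloc(len);
    if (s == NULL)
        return NULL;
    memset(s, 'a', len - 1);
    s[len - 1] = '\0';
    return s;
}

/* GCC-specific: each wrapper is compiled at its own optimization level.
   noinline keeps the wrappers from being merged into the caller. */
__attribute__((optimize("O0"), noinline))
size_t strlen_o0(const char *s) { return strlen(s); }

__attribute__((optimize("O1"), noinline))
size_t strlen_o1(const char *s) { return strlen(s); }

__attribute__((optimize("O2"), noinline))
size_t strlen_o2(const char *s) { return strlen(s); }

__attribute__((optimize("O3"), noinline))
size_t strlen_o3(const char *s) { return strlen(s); }

/* Call fn(s) ROUNDS times and return the average seconds per call.
   The last result is stored through out_len so the calls are not dropped. */
double time_variant(size_t (*fn)(const char *), const char *s, size_t *out_len)
{
    struct timespec t0, t1;
    volatile size_t sink = 0;

    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (int i = 0; i < ROUNDS; i++)
        sink = fn(s);
    clock_gettime(CLOCK_MONOTONIC, &t1);

    *out_len = sink;
    double elapsed = (double)(t1.tv_sec - t0.tv_sec)
                   + (double)(t1.tv_nsec - t0.tv_nsec) / 1e9;
    return elapsed / ROUNDS;
}
```

A driver would call time_variant once per wrapper on the same string from make_string(BUF_LEN) and print each result with "%.9f". Calling through a function pointer keeps the timing loop identical across variants, so only the wrapper body differs.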

$ gcc -v
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-linux-gnu/9/lto-wrapper
OFFLOAD_TARGET_NAMES=nvptx-none:hsa
OFFLOAD_TARGET_DEFAULT=1
Target: x86_64-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Ubuntu
9.3.0-17ubuntu1~20.04' --with-bugurl=file:///usr/share/doc/gcc-9/README.Bugs
--enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,gm2 --prefix=/usr
--with-gcc-major-version-only --program-suffix=-9
--program-prefix=x86_64-linux-gnu- --enable-shared --enable-linker-build-id
--libexecdir=/usr/lib --without-included-gettext --enable-threads=posix
--libdir=/usr/lib --enable-nls --enable-clocale=gnu --enable-libstdcxx-debug
--enable-libstdcxx-time=yes --with-default-libstdcxx-abi=new
--enable-gnu-unique-object --disable-vtable-verify --enable-plugin
--enable-default-pie --with-system-zlib --with-target-system-zlib=auto
--enable-objc-gc=auto --enable-multiarch --disable-werror --with-arch-32=i686
--with-abi=m64 --with-multilib-list=m32,m64,mx32 --enable-multilib
--with-tune=generic
--enable-offload-targets=nvptx-none=/build/gcc-9-HskZEa/gcc-9-9.3.0/debian/tmp-nvptx/usr,hsa
--without-cuda-driver --enable-checking=release --build=x86_64-linux-gnu
--host=x86_64-linux-gnu --target=x86_64-linux-gnu
Thread model: posix
gcc version 9.3.0 (Ubuntu 9.3.0-17ubuntu1~20.04)


* [Bug c/99953] In AVX, SIMD support environment, strlen performance without optimization is 3 times faster than optimized strlen function.
From: novemberizing at gmail dot com @ 2021-04-07  7:54 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99953

--- Comment #1 from Hyun Sik Park <novemberizing at gmail dot com> ---
Created attachment 50520
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=50520&action=edit
simple test source


* [Bug c/99953] In AVX, SIMD support environment, strlen performance without optimization is 3 times faster than optimized strlen function.
From: novemberizing at gmail dot com @ 2021-04-07  8:02 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99953

--- Comment #2 from Hyun Sik Park <novemberizing at gmail dot com> ---
Test environment: gcc version 9.3.0 (Ubuntu 9.3.0-17ubuntu1~20.04)/Acer Aspire
V3-372/Intel(R) Core(TM) i5-6200U CPU @ 2.30GHz 4 Core


* [Bug target/99953] In AVX, SIMD support environment, strlen performance without optimization is 3 times faster than optimized strlen function.
From: pinskia at gcc dot gnu.org @ 2021-04-07  8:06 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99953

--- Comment #3 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Does -mcpu=native help?


* [Bug target/99953] In AVX, SIMD support environment, strlen performance without optimization is 3 times faster than optimized strlen function.
From: pinskia at gcc dot gnu.org @ 2021-04-07  8:06 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99953

--- Comment #4 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
I meant -march=native.


* [Bug target/99953] In AVX, SIMD support environment, strlen performance without optimization is 3 times faster than optimized strlen function.
From: novemberizing at gmail dot com @ 2021-04-07  8:14 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99953

--- Comment #5 from Hyun Sik Park <novemberizing at gmail dot com> ---
$ gcc -march=native strlen.c 
$ ./a.out 
no optimize =>  0.000007860
o1 optimize =>  0.000062609
o2 optimize =>  0.000024775
o3 optimize =>  0.000022288

Same result.


* [Bug target/99953] In AVX, SIMD support environment, strlen performance without optimization is 3 times faster than optimized strlen function.
From: pinskia at gcc dot gnu.org @ 2021-04-07  8:25 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99953

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |DUPLICATE
             Status|UNCONFIRMED                 |RESOLVED

--- Comment #6 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
So this was fixed for GCC 10.
See dup bug 88809.

*** This bug has been marked as a duplicate of bug 88809 ***
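
On a toolchain that still shows the slowdown, one possible workaround is to keep the compiler from expanding strlen inline so that the call always reaches the C library. The sketch below does this by calling through a volatile function pointer. This is an illustration under that assumption, not something proposed in this thread, and the names strlen_via_libc and library_strlen are hypothetical. Compiling with -fno-builtin-strlen should have a similar effect without changing the source.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper. Because the pointer is volatile, the compiler must
   load it at run time and cannot assume the callee is strlen, so it has to
   emit a real call instead of an inline expansion. */
static size_t (*volatile strlen_via_libc)(const char *) = strlen;

size_t library_strlen(const char *s)
{
    return strlen_via_libc(s);
}
```

The indirect call adds a small fixed cost per call, which matters little for long strings like the 65K-byte one measured here but could be noticeable for very short ones.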


* [Bug target/99953] In AVX, SIMD support environment, strlen performance without optimization is 3 times faster than optimized strlen function.
From: novemberizing at gmail dot com @ 2021-04-07  9:05 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99953

--- Comment #7 from Hyun Sik Park <novemberizing at gmail dot com> ---
Thank you.

I tested it, and the result is below.

$ ./a.out
no optimize => 0.000009640
o1 optimize => 0.000009126
o2 optimize => 0.000009422
o3 optimize => 0.000009081

experiment_optimize_3
    17d5: 48 01 c7          add    %rax,%rdi
    17d8: e8 c3 f8 ff ff    callq  10a0 <strlen@plt>
    17dd: 48 8b 74 24 08    mov    0x8(%rsp),%rsi

experiment_optimize_2
    168d: 48 01 c7          add    %rax,%rdi
    1690: e8 0b fa ff ff    callq  10a0 <strlen@plt>
    1695: 48 8b 74 24 10    mov    0x10(%rsp),%rsi

experiment_optimize_1
    154c: e8 4f fb ff ff    callq  10a0 <strlen@plt>
    1551: 48 89 04 24       mov    %rax,(%rsp)

experiment_optimize_0
    1375: 48 89 c7          mov    %rax,%rdi
    1378: e8 23 fd ff ff    callq  10a0 <strlen@plt>
    137d: 48 89 45 a8       mov    %rax,-0x58(%rbp)

Thank you.
