public inbox for gcc-bugs@sourceware.org
* [Bug c/57571] New: linux kernel function memcpy() execute with low efficiency on Intel Ivybridge platform
@ 2013-06-09 6:56 yiyi8761 at gmail dot com
From: yiyi8761 at gmail dot com @ 2013-06-09 6:56 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57571
Bug ID: 57571
Summary: linux kernel function memcpy() execute with low
efficiency on Intel Ivybridge platform
Product: gcc
Version: 4.7.2
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: c
Assignee: unassigned at gcc dot gnu.org
Reporter: yiyi8761 at gmail dot com
OS type: OpenSuse 12.3 or SUSE 11 SP2
CPU type: Intel Ivybridge i7-3612QE or Intel Ivybridge i7-3615QE
GCC Ver: 4.7.2(Open Suse 12.3) or 4.3.4(SUSE 11 SP2)
GCC 4.7.2 Configured with: ../configure --prefix=/usr
--infodir=/usr/share/info --mandir=/usr/share/man --libdir=/usr/lib64
--libexecdir=/usr/lib64 --enable-languages=c,c++,objc,fortran,obj-c++,java,ada
--enable-checking=release --with-gxx-include-dir=/usr/include/c++/4.7
--enable-ssp --disable-libssp --disable-libitm --disable-plugin
--with-bugurl=http://bugs.opensuse.org/ --with-pkgversion='SUSE Linux'
--disable-libgcj --disable-libmudflap --with-slibdir=/lib64 --with-system-zlib
--enable-__cxa_atexit --enable-libstdcxx-allocator=new --disable-libstdcxx-pch
--enable-version-specific-runtime-libs --enable-linker-build-id
--program-suffix=-4.7 --enable-linux-futex --without-system-libunwind
--with-arch-32=i586 --with-tune=generic --build=x86_64-suse-linux
GCC 4.3.4 Configured with: ../configure --prefix=/usr
--infodir=/usr/share/info --mandir=/usr/share/man --libdir=/usr/lib64
--libexecdir=/usr/lib64 --enable-languages=c,c++,objc,fortran,obj-c++,java,ada
--enable-checking=release --with-gxx-include-dir=/usr/include/c++/4.7
--enable-ssp --disable-libssp --disable-libitm --disable-plugin
--with-bugurl=http://bugs.opensuse.org/ --with-pkgversion='SUSE Linux'
--disable-libgcj --disable-libmudflap --with-slibdir=/lib64 --with-system-zlib
--enable-__cxa_atexit --enable-libstdcxx-allocator=new --disable-libstdcxx-pch
--enable-version-specific-runtime-libs --enable-linker-build-id
--program-suffix=-4.7 --enable-linux-futex --without-system-libunwind
--with-arch-32=i586 --with-tune=generic --build=x86_64-suse-linux
description:
1. With the configurations above, the memcpy() used by the Linux kernel
performs very poorly. Viewing memcpy() in gdb shows the following
disassembly:
(gdb) set disassembly-flavor intel
(gdb) x/20i 0xffffffff812ca220
0xffffffff812ca220: mov rax,rdi
0xffffffff812ca223: mov rcx,rdx
0xffffffff812ca226: rep movs BYTE PTR es:[rdi],BYTE PTR ds:[rsi]
0xffffffff812ca228: ret
0xffffffff812ca229: add eax,DWORD PTR [rbx+0x48f307e2]
0xffffffff812ca22f: movs DWORD PTR es:[rdi],DWORD PTR ds:[rsi]
0xffffffff812ca230: mov ecx,edx
0xffffffff812ca232: rep movs BYTE PTR es:[rdi],BYTE PTR ds:[rsi]
0xffffffff812ca234: ret
2. However, with the same OS (same GCC version and configuration) on an Intel
Arrandale platform (i7 CPU L620), gdb shows the following disassembly for
memcpy():
(gdb) set disassembly-flavor intel
(gdb) x/20i 0xffffffff81250e80
0xffffffff81250e80: mov rax,rdi
0xffffffff81250e83: mov ecx,edx
0xffffffff81250e85: shr ecx,0x3
0xffffffff81250e88: and edx,0x7
0xffffffff81250e8b: rep movs QWORD PTR es:[rdi],QWORD PTR ds:[rsi]
0xffffffff81250e8e: mov ecx,edx
0xffffffff81250e90: rep movs BYTE PTR es:[rdi],BYTE PTR ds:[rsi]
0xffffffff81250e92: ret
3. So memcpy() on the i7 L620 should be about eight times as efficient as on
the Intel Ivybridge platform when the copy length is greater than 8 bytes,
because it moves 8 bytes per iteration instead of 1.
4. We have already consulted Intel and Novell; their engineers said this issue
may be related to the compiler.
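For readers comparing the two listings, they can be approximated in C with
GCC-style inline assembly. This is only a minimal sketch for x86-64, not the
kernel's source: the names copy_rep_movsb(), copy_rep_movsq() and
copies_match() are hypothetical, and the second routine uses a full 64-bit
count where the listing uses ecx/edx.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Mirrors the first listing: a single "rep movsb" over the whole length. */
static void *copy_rep_movsb(void *dst, const void *src, size_t len)
{
    void *ret = dst;
    __asm__ volatile("rep movsb"
                     : "+D"(dst), "+S"(src), "+c"(len)
                     :
                     : "memory");
    return ret;
}

/* Mirrors the second listing: "rep movsq" for len / 8 quadwords,
 * then "rep movsb" for the remaining len % 8 bytes. */
static void *copy_rep_movsq(void *dst, const void *src, size_t len)
{
    void *ret = dst;
    size_t qwords = len >> 3;
    size_t tail = len & 7;
    __asm__ volatile("rep movsq"
                     : "+D"(dst), "+S"(src), "+c"(qwords)
                     :
                     : "memory");
    __asm__ volatile("rep movsb"
                     : "+D"(dst), "+S"(src), "+c"(tail)
                     :
                     : "memory");
    return ret;
}

/* Hypothetical self-check: both routines must produce the same bytes. */
int copies_match(size_t len)
{
    unsigned char src[256], a[256], b[256];
    assert(len <= sizeof src);
    for (size_t i = 0; i < sizeof src; i++)
        src[i] = (unsigned char)(i * 7 + 1);
    memset(a, 0, sizeof a);
    memset(b, 0, sizeof b);
    copy_rep_movsb(a, src, len);
    copy_rep_movsq(b, src, len);
    return memcmp(a, src, len) == 0 && memcmp(a, b, sizeof a) == 0;
}
```

Both routines copy the same bytes; they differ only in how many string-move
iterations the CPU is asked to perform, which is what the "eight times"
estimate in item 3 assumes determines the speed.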
From: yiyi8761 at gmail dot com @ 2013-06-09 7:03 UTC (permalink / raw)
--- Comment #1 from phoenix <yiyi8761 at gmail dot com> ---
Sorry, a correction to the description: the GCC 4.3.4 configuration given
above is wrong. The correct one is:
GCC 4.3.4 Configured with: ../configure --prefix=/usr --infodir=/usr/share/info
--mandir=/usr/share/man --libdir=/usr/lib64 --libexecdir=/usr/lib64
--enable-languages=c,c++,objc,fortran,obj-c++,java,ada
--enable-checking=release --with-gxx-include-dir=/usr/include/c++/4.3
--enable-ssp --disable-libssp --with-bugurl=http://bugs.opensuse.org/
--with-pkgversion='SUSE Linux' --disable-libgcj --disable-libmudflap
--with-slibdir=/lib64 --with-system-zlib --enable-__cxa_atexit
--enable-libstdcxx-allocator=new --disable-libstdcxx-pch
--enable-version-specific-runtime-libs --program-suffix=-4.3
--enable-linux-futex --without-system-libunwind --with-cpu=generic
--build=x86_64-suse-linux
Thread model: posix
From: pinskia at gcc dot gnu.org @ 2013-06-09 7:07 UTC (permalink / raw)
Andrew Pinski <pinskia at gcc dot gnu.org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |missed-optimization
          Component|c                           |target
--- Comment #2 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Can you provide the preprocessed source which is used to generate memcpy here?
Are you sure that the kernel is not generating memcpy at runtime?
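One possible reading of the runtime question: x86-64 Linux kernels of this
period ship more than one memcpy body and select a variant at boot from CPU
feature bits, so the code seen in gdb need not be what GCC emitted. That is an
assumption about the reporter's kernels, not something this thread confirms.
A user-space sketch of the same idea, with hypothetical function names, reads
the ERMS bit from CPUID and picks a strategy:

```c
#include <assert.h>
#include <cpuid.h>
#include <stddef.h>

/* CPUID returns its results in four 32-bit registers. */
static_assert(sizeof(unsigned int) == 4, "CPUID registers are 32-bit");

/* CPUID.(EAX=7,ECX=0):EBX bit 9 reports Enhanced REP MOVSB/STOSB (ERMS). */
int cpu_has_erms(void)
{
    unsigned int eax, ebx, ecx, edx;

    /* Leaf 7 is only valid if the CPU reports at least that many leaves. */
    if (__get_cpuid_max(0, NULL) < 7)
        return 0;
    __cpuid_count(7, 0, eax, ebx, ecx, edx);
    return (int)((ebx >> 9) & 1u);
}

/* Hypothetical policy: prefer a plain "rep movsb" only when ERMS is set. */
const char *preferred_copy_strategy(void)
{
    return cpu_has_erms() ? "rep movsb" : "rep movsq + rep movsb";
}
```

Under this policy an Ivy Bridge CPU would report ERMS and get the byte-wise
variant, while an Arrandale CPU would not, which is consistent with the two
listings in the report.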
From: yiyi8761 at gmail dot com @ 2013-06-09 8:13 UTC (permalink / raw)
--- Comment #3 from phoenix <yiyi8761 at gmail dot com> ---
1. "Can you provide the preprocessed source"
Sorry, I traced the Linux source code: the kernel uses __builtin___memcpy() in
place of memcpy(). I can't find the source of __builtin___memcpy() in the
Linux source tree; it seems to be handled by GCC.
2. "the kernel is not generating memcpy at runtime?"
Could you explain this a little? Thank you very much!
From: jakub at gcc dot gnu.org @ 2013-06-09 8:45 UTC (permalink / raw)
Jakub Jelinek <jakub at gcc dot gnu.org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jakub at gcc dot gnu.org
--- Comment #4 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Have you actually measured any slowdown, or you just think that rep movsd must
be 8 times faster than rep movsb?
If you look at page 3-91 of
http://www.intel.com/content/dam/doc/manual/64-ia-32-architectures-optimization-manual.pdf
IvyBridge and later CPUs are supposed to have the ERMSB feature, where rep
movsb is always supposed to be faster than rep movsd + movsb.
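Whether one strategy is actually slower can be measured rather than inferred
from the instruction count. The following is a minimal user-space timing
sketch for x86-64 with GCC-style inline assembly; the function names, the
buffer size and the iteration count are all assumptions, and results for
ordinary cached memory say nothing about copies to or from device memory.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

typedef void (*copy_fn)(void *dst, const void *src, size_t len);

/* Byte-wise strategy: one "rep movsb" for the whole buffer. */
static void copy_rep_movsb(void *dst, const void *src, size_t len)
{
    __asm__ volatile("rep movsb"
                     : "+D"(dst), "+S"(src), "+c"(len) : : "memory");
}

/* Quadword strategy: "rep movsq" for len / 8, "rep movsb" for the tail. */
static void copy_rep_movsq(void *dst, const void *src, size_t len)
{
    size_t qwords = len >> 3, tail = len & 7;
    __asm__ volatile("rep movsq"
                     : "+D"(dst), "+S"(src), "+c"(qwords) : : "memory");
    __asm__ volatile("rep movsb"
                     : "+D"(dst), "+S"(src), "+c"(tail) : : "memory");
}

/* Returns the average seconds per copy, or -1.0 on allocation failure
 * or if the destination does not match the source afterwards. */
double seconds_per_copy(copy_fn fn, size_t len, int iters)
{
    assert(iters > 0);
    unsigned char *src = malloc(len ? len : 1);
    unsigned char *dst = malloc(len ? len : 1);
    struct timespec t0, t1;

    if (!src || !dst) {
        free(src);
        free(dst);
        return -1.0;
    }
    memset(src, 0xA5, len);

    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (int i = 0; i < iters; i++)
        fn(dst, src, len);
    clock_gettime(CLOCK_MONOTONIC, &t1);

    int ok = memcmp(dst, src, len) == 0;
    free(src);
    free(dst);
    if (!ok)
        return -1.0;
    return ((double)(t1.tv_sec - t0.tv_sec) +
            (double)(t1.tv_nsec - t0.tv_nsec) / 1e9) / iters;
}
```

Calling seconds_per_copy(copy_rep_movsb, 4096, 100000) and the same with
copy_rep_movsq on both machines would give numbers to attach to the report
instead of the estimated factor of eight.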
From: yiyi8761 at gmail dot com @ 2013-06-09 9:10 UTC (permalink / raw)
--- Comment #5 from phoenix <yiyi8761 at gmail dot com> ---
(In reply to Jakub Jelinek from comment #4)
> Have you actually measured any slowdown, or you just think that rep movsd
> must be 8 times faster than rep movsb?
Yes. I first measured a sharp drop in Flash read speed (the read path executes
memcpy()), and then found the kernel memcpy() efficiency problem. Afterwards I
tested memcpy() by other methods, which further confirmed the result.
From: jakub at gcc dot gnu.org @ 2013-06-09 10:05 UTC (permalink / raw)
--- Comment #6 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Anyway, please see http://gcc.gnu.org/bugs.html; without preprocessed source
and the gcc options passed to it, this report is useless.
Also, if it is a memcpy into a hardware device area, perhaps the kernel
shouldn't use memcpy for that but some routine optimized for device memory
access. The compiler isn't told in any way that it isn't normal memory, and
all it can do is choose the best memcpy strategy for host memory-to-memory
copies.
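In the Linux kernel the usual routines for this are memcpy_fromio() and
memcpy_toio(). As a rough user-space illustration of the point about telling
the compiler that the memory is special, the sketch below reads a source
region through a volatile pointer with a fixed access width. The helper name
copy_from_device32() is hypothetical, and a real driver would use the kernel's
I/O accessors rather than this loop.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: copy `count` 32-bit words out of a device window.
 * The volatile qualifier tells the compiler that every read matters, so
 * it keeps one 32-bit access per element instead of choosing its own
 * memcpy strategy for the loop. */
void copy_from_device32(uint32_t *dst, const volatile uint32_t *src,
                        size_t count)
{
    assert(count == 0 || (dst != NULL && src != NULL));
    for (size_t i = 0; i < count; i++)
        dst[i] = src[i];
}
```

The access width is then a property of the source code, not of whichever
string-move sequence the compiler or kernel selects for ordinary memory.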
From: rguenth at gcc dot gnu.org @ 2013-06-10 8:41 UTC (permalink / raw)
Richard Biener <rguenth at gcc dot gnu.org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |WAITING
   Last reconfirmed|                            |2013-06-10
     Ever confirmed|0                           |1
From: pinskia at gcc dot gnu.org @ 2013-11-10 20:14 UTC (permalink / raw)
Andrew Pinski <pinskia at gcc dot gnu.org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|WAITING                     |RESOLVED
         Resolution|---                         |INVALID
--- Comment #7 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
No testcase in 5 months so closing.