public inbox for gcc-bugs@sourceware.org
* [Bug c/57571] New: linux kernel function memcpy() execute with low efficiency on Intel Ivybridge platform
@ 2013-06-09  6:56 yiyi8761 at gmail dot com
  2013-06-09  7:03 ` [Bug c/57571] " yiyi8761 at gmail dot com
                   ` (7 more replies)
  0 siblings, 8 replies; 9+ messages in thread
From: yiyi8761 at gmail dot com @ 2013-06-09  6:56 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57571

            Bug ID: 57571
           Summary: linux kernel function memcpy() execute with low
                    efficiency on Intel Ivybridge platform
           Product: gcc
           Version: 4.7.2
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c
          Assignee: unassigned at gcc dot gnu.org
          Reporter: yiyi8761 at gmail dot com

OS type: OpenSuse 12.3 or SUSE 11 SP2
CPU type: Intel Ivybridge i7-3612QE or Intel Ivybridge i7-3615QE
GCC Ver: 4.7.2(Open Suse 12.3) or 4.3.4(SUSE 11 SP2)
         GCC 4.7.2 Configured with: ../configure --prefix=/usr
--infodir=/usr/share/info --mandir=/usr/share/man --libdir=/usr/lib64
--libexecdir=/usr/lib64 --enable-languages=c,c++,objc,fortran,obj-c++,java,ada
--enable-checking=release --with-gxx-include-dir=/usr/include/c++/4.7
--enable-ssp --disable-libssp --disable-libitm --disable-plugin
--with-bugurl=http://bugs.opensuse.org/ --with-pkgversion='SUSE Linux'
--disable-libgcj --disable-libmudflap --with-slibdir=/lib64 --with-system-zlib
--enable-__cxa_atexit --enable-libstdcxx-allocator=new --disable-libstdcxx-pch
--enable-version-specific-runtime-libs --enable-linker-build-id
--program-suffix=-4.7 --enable-linux-futex --without-system-libunwind
--with-arch-32=i586 --with-tune=generic --build=x86_64-suse-linux
         GCC 4.3.4 Configured with: ../configure --prefix=/usr
--infodir=/usr/share/info --mandir=/usr/share/man --libdir=/usr/lib64
--libexecdir=/usr/lib64 --enable-languages=c,c++,objc,fortran,obj-c++,java,ada
--enable-checking=release --with-gxx-include-dir=/usr/include/c++/4.7
--enable-ssp --disable-libssp --disable-libitm --disable-plugin
--with-bugurl=http://bugs.opensuse.org/ --with-pkgversion='SUSE Linux'
--disable-libgcj --disable-libmudflap --with-slibdir=/lib64 --with-system-zlib
--enable-__cxa_atexit --enable-libstdcxx-allocator=new --disable-libstdcxx-pch
--enable-version-specific-runtime-libs --enable-linker-build-id
--program-suffix=-4.7 --enable-linux-futex --without-system-libunwind
--with-arch-32=i586 --with-tune=generic --build=x86_64-suse-linux

description: 
1. With the configurations above, memcpy() as used by the Linux kernel performs
very poorly. Disassembling memcpy() in gdb shows the following code:

(gdb) set disassembly-flavor intel
(gdb) x/20i 0xffffffff812ca220
   0xffffffff812ca220:  mov    rax,rdi
   0xffffffff812ca223:  mov    rcx,rdx
   0xffffffff812ca226:  rep movs BYTE PTR es:[rdi],BYTE PTR ds:[rsi]
   0xffffffff812ca228:  ret    
   0xffffffff812ca229:  add    eax,DWORD PTR [rbx+0x48f307e2]
   0xffffffff812ca22f:  movs   DWORD PTR es:[rdi],DWORD PTR ds:[rsi]
   0xffffffff812ca230:  mov    ecx,edx
   0xffffffff812ca232:  rep movs BYTE PTR es:[rdi],BYTE PTR ds:[rsi]
   0xffffffff812ca234:  ret 

2. However, with the same OS (same GCC version and configuration) on an Intel
Arrandale platform (i7 CPU L620), gdb shows the following disassembly for
memcpy():

(gdb) set disassembly-flavor intel
(gdb) x/20i 0xffffffff81250e80
   0xffffffff81250e80:  mov    rax,rdi    
   0xffffffff81250e83:  mov    ecx,edx
   0xffffffff81250e85:  shr    ecx,0x3
   0xffffffff81250e88:  and    edx,0x7
   0xffffffff81250e8b:  rep movs QWORD PTR es:[rdi],QWORD PTR ds:[rsi]    
   0xffffffff81250e8e:  mov    ecx,edx
   0xffffffff81250e90:  rep movs BYTE PTR es:[rdi],BYTE PTR ds:[rsi]
   0xffffffff81250e92:  ret

3. So, for copy lengths greater than 8, memcpy() on the i7 L620 moves eight
bytes per rep iteration where the Ivybridge version moves one, which makes it
eight times as efficient.

4. I have already consulted Intel and Novell; their engineers said this issue
may be related to the compiler.
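
For readers comparing the two listings, the following is a minimal portable C sketch of the two copy strategies. The function names are hypothetical and the loops only mirror the instruction sequences shown above (a single byte-granular rep movs versus a quadword rep movs followed by a byte tail); it is not the kernel's code and it says nothing about actual timing, which depends on how the CPU implements the string instructions.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Byte-at-a-time copy: the portable analogue of one `rep movs BYTE`
   with the full length in rcx (first listing). */
static void *copy_bytes(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    while (n--)
        *d++ = *s++;
    return dst;
}

/* Quadword copy plus byte tail (second listing): n >> 3 eight-byte
   moves, then n & 7 single-byte moves. */
static void *copy_qwords_then_bytes(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;
    size_t qwords = n >> 3;   /* corresponds to: shr ecx,0x3 */
    size_t tail   = n & 7;    /* corresponds to: and edx,0x7 */

    while (qwords--) {
        uint64_t q;
        memcpy(&q, s, sizeof q);   /* alignment-safe 8-byte load */
        memcpy(d, &q, sizeof q);   /* alignment-safe 8-byte store */
        s += 8;
        d += 8;
    }
    while (tail--)
        *d++ = *s++;
    return dst;
}
```

For a 29-byte copy the first routine performs 29 moves and the second performs 3 + 5 = 8. That difference in iteration count is the basis of the "eight times" estimate; it is a count of loop steps, not a measurement.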



* [Bug c/57571] linux kernel function memcpy() execute with low efficiency on Intel Ivybridge platform
  2013-06-09  6:56 [Bug c/57571] New: linux kernel function memcpy() execute with low efficiency on Intel Ivybridge platform yiyi8761 at gmail dot com
@ 2013-06-09  7:03 ` yiyi8761 at gmail dot com
  2013-06-09  7:07 ` [Bug target/57571] " pinskia at gcc dot gnu.org
                   ` (6 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: yiyi8761 at gmail dot com @ 2013-06-09  7:03 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57571

--- Comment #1 from phoenix <yiyi8761 at gmail dot com> ---
Sorry, a correction to the description: the GCC 4.3.4 configuration given
above is wrong. The correct one is:
GCC 4.3.4 Configured with: ../configure --prefix=/usr --infodir=/usr/share/info
--mandir=/usr/share/man --libdir=/usr/lib64 --libexecdir=/usr/lib64
--enable-languages=c,c++,objc,fortran,obj-c++,java,ada
--enable-checking=release --with-gxx-include-dir=/usr/include/c++/4.3
--enable-ssp --disable-libssp --with-bugurl=http://bugs.opensuse.org/
--with-pkgversion='SUSE Linux' --disable-libgcj --disable-libmudflap
--with-slibdir=/lib64 --with-system-zlib --enable-__cxa_atexit
--enable-libstdcxx-allocator=new --disable-libstdcxx-pch
--enable-version-specific-runtime-libs --program-suffix=-4.3
--enable-linux-futex --without-system-libunwind --with-cpu=generic
--build=x86_64-suse-linux
Thread model: posix



* [Bug target/57571] linux kernel function memcpy() execute with low efficiency on Intel Ivybridge platform
  2013-06-09  6:56 [Bug c/57571] New: linux kernel function memcpy() execute with low efficiency on Intel Ivybridge platform yiyi8761 at gmail dot com
  2013-06-09  7:03 ` [Bug c/57571] " yiyi8761 at gmail dot com
@ 2013-06-09  7:07 ` pinskia at gcc dot gnu.org
  2013-06-09  8:13 ` yiyi8761 at gmail dot com
                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: pinskia at gcc dot gnu.org @ 2013-06-09  7:07 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57571

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |missed-optimization
          Component|c                           |target

--- Comment #2 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Can you provide the preprocessed source which is used to generate memcpy here? 
Are you sure that the kernel is not generating memcpy at runtime?
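
For context on the question above: a program can choose its copy routine when it starts up rather than when it is compiled, in which case the instructions seen in a debugger are not necessarily what the compiler emitted for a C-level memcpy() call. A minimal sketch of that idea follows. All names are hypothetical, and this only illustrates startup-time selection through a function pointer; it is not a description of how the kernel actually does it.

```c
#include <stddef.h>
#include <string.h>

typedef void *(*copy_fn)(void *dst, const void *src, size_t n);

/* Variant 1: plain byte loop. */
static void *copy_bytewise(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    while (n--)
        *d++ = *s++;
    return dst;
}

/* Variant 2: defer to the C library's memcpy(). */
static void *copy_libc(void *dst, const void *src, size_t n)
{
    return memcpy(dst, src, n);
}

/* Pick a variant once, from a feature flag probed at startup.
   `cpu_has_fast_byte_copy` is an assumed input, not a real API. */
static copy_fn select_copy(int cpu_has_fast_byte_copy)
{
    return cpu_has_fast_byte_copy ? copy_bytewise : copy_libc;
}
```

A caller would store the result of select_copy() once and route every later copy through that pointer, so the same binary runs different code on different CPUs.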



* [Bug target/57571] linux kernel function memcpy() execute with low efficiency on Intel Ivybridge platform
  2013-06-09  6:56 [Bug c/57571] New: linux kernel function memcpy() execute with low efficiency on Intel Ivybridge platform yiyi8761 at gmail dot com
  2013-06-09  7:03 ` [Bug c/57571] " yiyi8761 at gmail dot com
  2013-06-09  7:07 ` [Bug target/57571] " pinskia at gcc dot gnu.org
@ 2013-06-09  8:13 ` yiyi8761 at gmail dot com
  2013-06-09  8:45 ` jakub at gcc dot gnu.org
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: yiyi8761 at gmail dot com @ 2013-06-09  8:13 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57571

--- Comment #3 from phoenix <yiyi8761 at gmail dot com> ---
1. "Can you provide the preprocessed source"

Sorry, I traced the Linux source code: the kernel uses __builtin___memcpy() in
place of memcpy(). I can't find the source of __builtin___memcpy() in the
Linux source tree, so it seems to be handled by GCC.

2. "the kernel is not generating memcpy at runtime?"

Could you give me a little explanation about this? Thank you very much!
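
As background for point 1: GCC provides memcpy as a compiler builtin, spelled __builtin_memcpy, which is why no C source for it appears in the project being compiled. A minimal sketch, assuming a hypothetical struct and function name: for a call like the one below the compiler may expand the copy inline or emit a call to an out-of-line memcpy, and which it picks depends on the options and target tuning.

```c
#include <stddef.h>

/* Hypothetical fixed-size record; all members are bytes, so the
   struct has no padding. */
struct packet {
    unsigned char header[8];
    unsigned char payload[24];
};

static void copy_packet(struct packet *dst, const struct packet *src)
{
    /* The size is a compile-time constant, so GCC is free to expand
       this into move instructions instead of calling a function. */
    __builtin_memcpy(dst, src, sizeof *dst);
}
```

Compiling such a file with -S and reading the assembly shows which expansion was chosen for a given set of options.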



* [Bug target/57571] linux kernel function memcpy() execute with low efficiency on Intel Ivybridge platform
  2013-06-09  6:56 [Bug c/57571] New: linux kernel function memcpy() execute with low efficiency on Intel Ivybridge platform yiyi8761 at gmail dot com
                   ` (2 preceding siblings ...)
  2013-06-09  8:13 ` yiyi8761 at gmail dot com
@ 2013-06-09  8:45 ` jakub at gcc dot gnu.org
  2013-06-09  9:10 ` yiyi8761 at gmail dot com
                   ` (3 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: jakub at gcc dot gnu.org @ 2013-06-09  8:45 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57571

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jakub at gcc dot gnu.org

--- Comment #4 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Have you actually measured any slowdown, or you just think that rep movsd must
be 8 times faster than rep movsb?
According to page 3-91 of
http://www.intel.com/content/dam/doc/manual/64-ia-32-architectures-optimization-manual.pdf
IvyBridge CPUs and later are supposed to have the ERMSB feature, with which
rep movsb is always supposed to be faster than rep movsd + movsb.
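
Whether a given CPU advertises this feature can be checked from user space. The sketch below is one way to do it, assuming a GCC or Clang toolchain that ships <cpuid.h>; the function name cpu_has_erms is hypothetical. It reads CPUID leaf 7, subleaf 0, and tests EBX bit 9, which is where Intel documents the Enhanced REP MOVSB/STOSB flag. On non-x86 targets it simply reports false.

```c
#include <stdbool.h>

#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>
#endif

/* Returns true when CPUID.(EAX=7,ECX=0):EBX bit 9 (ERMS) is set. */
static bool cpu_has_erms(void)
{
#if defined(__x86_64__) || defined(__i386__)
    unsigned int eax = 0, ebx = 0, ecx = 0, edx = 0;

    /* Leaf 7 must be supported before it can be queried. */
    if (__get_cpuid_max(0, 0) < 7)
        return false;

    __cpuid_count(7, 0, eax, ebx, ecx, edx);
    return ((ebx >> 9) & 1u) != 0;
#else
    return false;
#endif
}
```

On Linux the same bit is reported as the "erms" flag in /proc/cpuinfo, which gives a quick cross-check.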



* [Bug target/57571] linux kernel function memcpy() execute with low efficiency on Intel Ivybridge platform
  2013-06-09  6:56 [Bug c/57571] New: linux kernel function memcpy() execute with low efficiency on Intel Ivybridge platform yiyi8761 at gmail dot com
                   ` (3 preceding siblings ...)
  2013-06-09  8:45 ` jakub at gcc dot gnu.org
@ 2013-06-09  9:10 ` yiyi8761 at gmail dot com
  2013-06-09 10:05 ` jakub at gcc dot gnu.org
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: yiyi8761 at gmail dot com @ 2013-06-09  9:10 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57571

--- Comment #5 from phoenix <yiyi8761 at gmail dot com> ---
(In reply to Jakub Jelinek from comment #4)
> Have you actually measured any slowdown, or you just think that rep movsd
> must be 8 times faster than rep movsb?
Yes. I first measured a sharp drop in Flash read speed (the read path uses
memcpy()), and then traced it to the kernel memcpy() efficiency problem.
Afterwards I tested memcpy() with other methods, which further confirmed the
result.



* [Bug target/57571] linux kernel function memcpy() execute with low efficiency on Intel Ivybridge platform
  2013-06-09  6:56 [Bug c/57571] New: linux kernel function memcpy() execute with low efficiency on Intel Ivybridge platform yiyi8761 at gmail dot com
                   ` (4 preceding siblings ...)
  2013-06-09  9:10 ` yiyi8761 at gmail dot com
@ 2013-06-09 10:05 ` jakub at gcc dot gnu.org
  2013-06-10  8:41 ` rguenth at gcc dot gnu.org
  2013-11-10 20:14 ` pinskia at gcc dot gnu.org
  7 siblings, 0 replies; 9+ messages in thread
From: jakub at gcc dot gnu.org @ 2013-06-09 10:05 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57571

--- Comment #6 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Anyway, please see http://gcc.gnu.org/bugs.html; without the preprocessed
source and the gcc options passed to it, this report is useless.
Also, if it is a memcpy into a hardware device area, perhaps the kernel
shouldn't use memcpy for that but some routine optimized for device memory
access. The compiler isn't told in any way that it isn't normal memory, and all
it can do is choose the best memcpy strategy for host memory-to-memory copies.
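
To illustrate the distinction drawn above, here is a minimal user-space sketch of a copy routine written for a device-style region. The function name and the choice of 32-bit accesses are assumptions for illustration. The volatile qualifier tells the compiler to perform each load as written, so the loop is not turned into a generic memcpy whose access width the compiler picks. In the kernel, helpers such as memcpy_fromio() exist for I/O memory; whether using one would change the reporter's flash read speed is not established in this thread.

```c
#include <stddef.h>
#include <stdint.h>

/* Copy `words` 32-bit words out of a device-style region.
   Each element is read with its own volatile 32-bit access. */
static void copy_from_device32(uint32_t *dst,
                               const volatile uint32_t *src,
                               size_t words)
{
    for (size_t i = 0; i < words; i++)
        dst[i] = src[i];
}
```

The access width that suits a particular device is a property of that device and its bus, so the 32-bit width here is only a placeholder.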



* [Bug target/57571] linux kernel function memcpy() execute with low efficiency on Intel Ivybridge platform
  2013-06-09  6:56 [Bug c/57571] New: linux kernel function memcpy() execute with low efficiency on Intel Ivybridge platform yiyi8761 at gmail dot com
                   ` (5 preceding siblings ...)
  2013-06-09 10:05 ` jakub at gcc dot gnu.org
@ 2013-06-10  8:41 ` rguenth at gcc dot gnu.org
  2013-11-10 20:14 ` pinskia at gcc dot gnu.org
  7 siblings, 0 replies; 9+ messages in thread
From: rguenth at gcc dot gnu.org @ 2013-06-10  8:41 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57571

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |WAITING
   Last reconfirmed|                            |2013-06-10
     Ever confirmed|0                           |1



* [Bug target/57571] linux kernel function memcpy() execute with low efficiency on Intel Ivybridge platform
  2013-06-09  6:56 [Bug c/57571] New: linux kernel function memcpy() execute with low efficiency on Intel Ivybridge platform yiyi8761 at gmail dot com
                   ` (6 preceding siblings ...)
  2013-06-10  8:41 ` rguenth at gcc dot gnu.org
@ 2013-11-10 20:14 ` pinskia at gcc dot gnu.org
  7 siblings, 0 replies; 9+ messages in thread
From: pinskia at gcc dot gnu.org @ 2013-11-10 20:14 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57571

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|WAITING                     |RESOLVED
         Resolution|---                         |INVALID

--- Comment #7 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
No testcase in 5 months so closing.




Thread overview: 9+ messages
2013-06-09  6:56 [Bug c/57571] New: linux kernel function memcpy() execute with low efficiency on Intel Ivybridge platform yiyi8761 at gmail dot com
2013-06-09  7:03 ` [Bug c/57571] " yiyi8761 at gmail dot com
2013-06-09  7:07 ` [Bug target/57571] " pinskia at gcc dot gnu.org
2013-06-09  8:13 ` yiyi8761 at gmail dot com
2013-06-09  8:45 ` jakub at gcc dot gnu.org
2013-06-09  9:10 ` yiyi8761 at gmail dot com
2013-06-09 10:05 ` jakub at gcc dot gnu.org
2013-06-10  8:41 ` rguenth at gcc dot gnu.org
2013-11-10 20:14 ` pinskia at gcc dot gnu.org
