public inbox for gcc-bugs@sourceware.org
* [Bug c++/49279] New: Optimization incorrectly presuming constant variable inside loop in g++ 4.5 and 4.6 with -O2 and -O3 for x86_64 targets
@ 2011-06-03 21:23 tcmartins at gmail dot com
  2011-06-04 16:37 ` [Bug c++/49279] " hjl.tools at gmail dot com
                   ` (21 more replies)
  0 siblings, 22 replies; 23+ messages in thread
From: tcmartins at gmail dot com @ 2011-06-03 21:23 UTC
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49279

           Summary: Optimization incorrectly presuming constant variable
                    inside loop in g++ 4.5 and 4.6 with -O2 and -O3 for
                    x86_64 targets
           Product: gcc
           Version: 4.6.0
            Status: UNCONFIRMED
          Severity: major
          Priority: P3
         Component: c++
        AssignedTo: unassigned@gcc.gnu.org
        ReportedBy: tcmartins@gmail.com


Created attachment 24427
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=24427
Testcase (reduced automatically using multidelta)

GCC apparently generates wrong code for Eigen2 (a template-based linear
algebra library) at optimization levels -O2 and -O3 on
x86_64-unknown-linux-gnu targets. A reduced test case that reproduces the
error is attached.

As I understand it, the core of the problem is this loop (line 1132 of
the submitted test case):

for (; (ProcessFirstHalf ? i && i.index () < j : i); ++i)  {
   if (LhsIsSelfAdjoint) {
      int a = LhsIsRowMajor ? j : i.index ();
      int b = LhsIsRowMajor ? i.index () : j;
      Scalar v = i.value ();
      derived ().row (b) += ei_conj (v) * product.rhs ().row (a);        
  }
}

which is being translated into:

    movq    -8(%rsp), %rsi
    movq    (%rsi), %rbp
    addq    %rdx, %rbp
    movsd    0(%rbp), %xmm1   # <- %xmm1 is initialized here and 
.L5:                             #    no longer touched!
    leaq    0(,%rcx,8), %rsi
    leaq    4(%r8,%rcx,4), %r8
    movl    %r9d, %ecx
    jmp    .L8
.L13:                              # <-Loop here!!!
    movl    (%r8), %r10d
    addq    $4, %r8
.L8:
    movsd    0(%r13,%rsi), %xmm0
    movslq    %r10d, %r10
    addl    $1, %ecx
    mulsd    (%r12,%r10,8), %xmm0
    cmpl    %r11d, %ecx
    addsd    %xmm1, %xmm0
    movsd    %xmm0, 0(%rbp)   # <- shouldn't %xmm1 be updated here?
    je    .L3              
    addq    $8, %rsi
    cmpl    %ecx, %r9d
    jle    .L13              # <- Loop ends 

The sum operation on the line

 derived ().row (b) += ei_conj (v) * product.rhs ().row (a);

is apparently being performed by the instruction

 addsd    %xmm1, %xmm0

but the value of %xmm1 isn't updated inside the loop! Apparently the
compiler presumes that derived ().row (b) is constant inside the loop, which
is evidently *not* true. Since %xmm1 is never updated, the value of
derived ().row (b) at the end of the loop comes only from the last
ei_conj (v) * product.rhs ().row (a) result (added to the value loaded
before the loop) instead of the sum over all iterations.

The bug was verified on gcc versions 4.5.2 and 4.6.0 with -O2 and -O3 switches.
The compiler produces the correct code with -O0 and -O switches.

It is *NOT* present on the 4.4 branch (that is, 4.4 compiles the code
correctly) with the -O0, -O, -O2 and -O3 switches.

I suppose it is a regression introduced in the 4.5 branch.

Command line (for gcc 4.6.0):
/opt/gnu/gcc-4.6/bin/g++ -v -save-temps -nostdinc -O3  testcase.a.cpp

Compiler output:
Using built-in specs.
COLLECT_GCC=/opt/gnu/gcc-4.6/bin/g++
COLLECT_LTO_WRAPPER=/opt/gnu/gcc-4.6/bin/../libexec/gcc/x86_64-unknown-linux-gnu/4.6.0/lto-wrapper
Target: x86_64-unknown-linux-gnu
Configured with: ../configure : (reconfigured) ../configure : (reconfigured)
../configure
Thread model: posix
gcc version 4.6.0 (GCC) 
COLLECT_GCC_OPTIONS='-v' '-save-temps' '-nostdinc' '-O3' '-shared-libgcc'
'-mtune=generic' '-march=x86-64'
 /opt/gnu/gcc-4.6/bin/../libexec/gcc/x86_64-unknown-linux-gnu/4.6.0/cc1plus -E
-quiet -nostdinc -v -iprefix
/opt/gnu/gcc-4.6/bin/../lib/gcc/x86_64-unknown-linux-gnu/4.6.0/ -D_GNU_SOURCE
testcase.a.cpp -mtune=generic -march=x86-64 -O3 -fpch-preprocess -o
testcase.a.ii
#include "..." search starts here:
#include <...> search starts here:
End of search list.
COLLECT_GCC_OPTIONS='-v' '-save-temps' '-nostdinc' '-O3' '-shared-libgcc'
'-mtune=generic' '-march=x86-64'
 /opt/gnu/gcc-4.6/bin/../libexec/gcc/x86_64-unknown-linux-gnu/4.6.0/cc1plus
-fpreprocessed testcase.a.ii -quiet -dumpbase testcase.a.cpp -mtune=generic
-march=x86-64 -auxbase testcase.a -O3 -version -o testcase.a.s
GNU C++ (GCC) version 4.6.0 (x86_64-unknown-linux-gnu)
        compiled by GNU C version 4.6.0, GMP version 4.3.2, MPFR version
3.0.0-p8, MPC version 0.9
GGC heuristics: --param ggc-min-expand=100 --param ggc-min-heapsize=131072
GNU C++ (GCC) version 4.6.0 (x86_64-unknown-linux-gnu)
        compiled by GNU C version 4.6.0, GMP version 4.3.2, MPFR version
3.0.0-p8, MPC version 0.9
GGC heuristics: --param ggc-min-expand=100 --param ggc-min-heapsize=131072
Compiler executable checksum: 3f3899c46d47b31a2bc0cb7f3d1408a6
COLLECT_GCC_OPTIONS='-v' '-save-temps' '-nostdinc' '-O3' '-shared-libgcc'
'-mtune=generic' '-march=x86-64'
 as --64 -o testcase.a.o testcase.a.s
COMPILER_PATH=/opt/gnu/gcc-4.6/bin/../libexec/gcc/x86_64-unknown-linux-gnu/4.6.0/:/opt/gnu/gcc-4.6/bin/../libexec/gcc/
LIBRARY_PATH=/opt/gnu/gcc-4.6/bin/../lib/gcc/x86_64-unknown-linux-gnu/4.6.0/:/opt/gnu/gcc-4.6/bin/../lib/gcc/:/opt/gnu/gcc-4.6/bin/../lib/gcc/x86_64-unknown-linux-gnu/4.6.0/../../../../lib64/:/lib/../lib64/:/usr/lib/../lib64/:/usr/lib/nvidia-current/:/opt/gnu/gcc-4.6/bin/../lib/gcc/x86_64-unknown-linux-gnu/4.6.0/../../../:/lib/:/usr/lib/
COLLECT_GCC_OPTIONS='-v' '-save-temps' '-nostdinc' '-O3' '-shared-libgcc'
'-mtune=generic' '-march=x86-64'
 /opt/gnu/gcc-4.6/bin/../libexec/gcc/x86_64-unknown-linux-gnu/4.6.0/collect2
--eh-frame-hdr -m elf_x86_64 -dynamic-linker /lib64/ld-linux-x86-64.so.2
/usr/lib/../lib64/crt1.o /usr/lib/../lib64/crti.o
/opt/gnu/gcc-4.6/bin/../lib/gcc/x86_64-unknown-linux-gnu/4.6.0/crtbegin.o
-L/opt/gnu/gcc-4.6/bin/../lib/gcc/x86_64-unknown-linux-gnu/4.6.0
-L/opt/gnu/gcc-4.6/bin/../lib/gcc
-L/opt/gnu/gcc-4.6/bin/../lib/gcc/x86_64-unknown-linux-gnu/4.6.0/../../../../lib64
-L/lib/../lib64 -L/usr/lib/../lib64 -L/usr/lib/nvidia-current
-L/opt/gnu/gcc-4.6/bin/../lib/gcc/x86_64-unknown-linux-gnu/4.6.0/../../..
testcase.a.o -lstdc++ -lm -lgcc_s -lgcc -lc -lgcc_s -lgcc
/opt/gnu/gcc-4.6/bin/../lib/gcc/x86_64-unknown-linux-gnu/4.6.0/crtend.o
/usr/lib/../lib64/crtn.o

The test case was produced from the preprocessed output (generated from Eigen
version 2.0.15) and automatically reduced using the multidelta tool. The test
case is attached.



Thread overview: 23+ messages
2011-06-03 21:23 [Bug c++/49279] New: Optimization incorrectly presuming constant variable inside loop in g++ 4.5 and 4.6 with -O2 and -O3 for x86_64 targets tcmartins at gmail dot com
2011-06-04 16:37 ` [Bug c++/49279] " hjl.tools at gmail dot com
2011-06-06  9:06 ` [Bug c++/49279] [4.5/4.6/4.7 Regression] " rguenth at gcc dot gnu.org
2011-06-06 13:01 ` rguenth at gcc dot gnu.org
2011-06-06 14:04 ` rguenth at gcc dot gnu.org
2011-08-01 14:03 ` rguenth at gcc dot gnu.org
2011-10-04 16:48 ` [Bug tree-optimization/49279] " jakub at gcc dot gnu.org
2011-10-04 16:59 ` jakub at gcc dot gnu.org
2011-10-05  8:09 ` jakub at gcc dot gnu.org
2011-10-05  9:06 ` jakub at gcc dot gnu.org
2011-10-05  9:43 ` rguenther at suse dot de
2011-10-05 14:39 ` rguenth at gcc dot gnu.org
2011-10-05 15:51 ` jakub at gcc dot gnu.org
2011-10-05 15:53 ` jakub at gcc dot gnu.org
2011-10-06  8:08 ` rguenth at gcc dot gnu.org
2011-10-06  8:10 ` rguenth at gcc dot gnu.org
2011-10-06 16:39 ` jakub at gcc dot gnu.org
2011-10-06 19:58 ` jakub at gcc dot gnu.org
2011-10-07  8:16 ` [Bug tree-optimization/49279] [4.5 " rguenth at gcc dot gnu.org
2011-10-12 15:27 ` matz at gcc dot gnu.org
2012-01-03 12:21 ` rguenth at gcc dot gnu.org
2012-01-03 13:56 ` rguenth at gcc dot gnu.org
2012-01-03 14:05 ` rguenth at gcc dot gnu.org
