public inbox for gcc-bugs@sourceware.org
* [Bug middle-end/53844] New: GCC generates suboptimal code for unused members of classes in some cases on multiple targets.
@ 2012-07-03 19:56 ed at edrosten dot com
  2012-07-03 19:57 ` [Bug middle-end/53844] " ed at edrosten dot com
                   ` (16 more replies)
  0 siblings, 17 replies; 18+ messages in thread
From: ed at edrosten dot com @ 2012-07-03 19:56 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53844

             Bug #: 53844
           Summary: GCC generates suboptimal code for unused members of
                    classes in some cases on multiple targets.
    Classification: Unclassified
           Product: gcc
           Version: unknown
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: middle-end
        AssignedTo: unassigned@gcc.gnu.org
        ReportedBy: ed@edrosten.com


Created attachment 27737
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=27737
Source file for which suboptimal code is generated (self contained)

The attached source file contains a minimal extract from a numerics library
that uses expression templates and was exhibiting poor performance. The poor
performance is due to the compiler emitting instructions to store values on
the stack which are ultimately never used.

The attached file seems to be the smallest which replicates the behaviour.
Compiled with:

g++-4.7  -S minimal.cc -O3 


The output generated is:

_Z4testRK6VectorI5VBaseERS1_i:
.LFB8:
    .cfi_startproc
    subq    $160, %rsp
    .cfi_def_cfa_offset 168
    movq    (%rsi), %rcx
    movq    (%rdi), %rdx
    leaq    -120(%rsp), %rax
    movl    $1, 8(%rsp)
    movl    $1, -8(%rsp)
    movl    $1, -24(%rsp)
    movl    $1, -40(%rsp)
    movq    %rax, 24(%rsp)
    leaq    -104(%rsp), %rax
    movl    $1, -56(%rsp)
    movl    $1, -72(%rsp)
    movl    $1, -88(%rsp)
    movq    %rax, 40(%rsp)
    leaq    24(%rsp), %rax
    movl    $1, -104(%rsp)
    movl    $1, -120(%rsp)
    movq    %rdi, 32(%rsp)
    movq    %rax, 48(%rsp)
    leaq    -88(%rsp), %rax
    movq    %rax, 56(%rsp)
    leaq    40(%rsp), %rax
    movq    %rax, 64(%rsp)
    leaq    -72(%rsp), %rax
    movq    %rax, 72(%rsp)
    leaq    56(%rsp), %rax
    movq    %rax, 80(%rsp)
    leaq    -56(%rsp), %rax
    movq    %rax, 88(%rsp)
    leaq    72(%rsp), %rax
    movq    %rax, 96(%rsp)
    leaq    -40(%rsp), %rax
    movq    %rax, 104(%rsp)
    leaq    88(%rsp), %rax
    movq    %rax, 112(%rsp)
    leaq    -24(%rsp), %rax
    movq    %rax, 120(%rsp)
    leaq    104(%rsp), %rax
    movq    %rax, 128(%rsp)
    leaq    -8(%rsp), %rax
    movq    %rax, 136(%rsp)
    leaq    120(%rsp), %rax
    movq    %rax, 144(%rsp)
    xorl    %eax, %eax
    .p2align 4,,10
    .p2align 3
.L2:
    movsd    (%rdx,%rax), %xmm0
    movsd    %xmm0, (%rcx,%rax)
    addq    $8, %rax
    cmpq    $800, %rax
    jne    .L2
    addq    $160, %rsp
    .cfi_def_cfa_offset 8
    ret
    .cfi_endproc

The majority of the instructions correspond to storing the numbers (line 68 of
the source file) and the corresponding references (line 39 of the source file)
on the stack.
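
For readers without the attachment, the shape of code involved can be sketched
roughly as follows. This is a hypothetical sketch, not the attached minimal.cc:
only the names Vector, VBase, ScalarMulExpr, my_data, vec, mul and test appear
in this thread, while kSize, eval() and the operator bodies are assumptions.
Each "*1" wraps the previous expression in a temporary that holds a reference
to its operand and a reference to the scalar.

```cpp
#include <cassert>

// Hypothetical sketch; NOT the attached minimal.cc. The size, the eval()
// interface and the operator bodies are assumptions made for illustration.
const int kSize = 100;

struct VBase {
    double* my_data;
    double eval(int i) const {
        assert(i >= 0 && i < kSize);
        return my_data[i];
    }
};

template <class E>
struct ScalarMulExpr {
    const E& vec;    // reference to the operand expression
    const int& mul;  // reference to the scalar (a temporary for a literal)
    ScalarMulExpr(const E& v, const int& m) : vec(v), mul(m) {}
    double eval(int i) const { return vec.eval(i) * mul; }
};

template <class E = VBase>
struct Vector {
    E expr;
    explicit Vector(const E& e) : expr(e) {}
    double eval(int i) const { return expr.eval(i); }

    // The whole expression is evaluated here, element by element.
    template <class F>
    Vector& operator=(const Vector<F>& rhs) {
        for (int i = 0; i < kSize; ++i) expr.my_data[i] = rhs.eval(i);
        return *this;
    }
};

// Each multiplication wraps the operand in one more ScalarMulExpr temporary.
template <class E>
Vector<ScalarMulExpr<E> > operator*(const Vector<E>& v, const int& m) {
    return Vector<ScalarMulExpr<E> >(ScalarMulExpr<E>(v.expr, m));
}

// All temporaries live until the end of the full expression, so the
// references held inside them are still valid while operator= runs.
void test(const Vector<>& in, Vector<>& out, int /*unused*/) {
    out = in * 1 * 1 * 1 * 1;
}
```

In a sketch like this, the stores that build the chain of temporaries are
needed only to describe the expression. Once everything is inlined and the
constants are propagated, the loop itself no longer reads them.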

The behaviour is quite changeable. For example, replacing the assignment with
    const auto&& v=in*1*1*1*1*1*1*1*1*1*1*1;
    out=v;

and enabling C++11 produces much worse results. 


The main compiler tested was:

COLLECT_GCC=g++-4.7
COLLECT_LTO_WRAPPER=/usr/local/libexec/gcc/x86_64-unknown-linux-gnu/4.7.0/lto-wrapper
Target: x86_64-unknown-linux-gnu
Configured with: ../gcc-4.7.0/configure --program-suffix=-4.7
--enable-languages=c,c++
Thread model: posix
gcc version 4.7.0 (GCC) 

The source has been tried with a number of other compilers in 32 and 64 bit
mode where applicable with similar results. The compilers are:

COLLECT_GCC=arm-linux-gnueabi-g++-4.6
COLLECT_LTO_WRAPPER=/usr/lib/gcc/arm-linux-gnueabi/4.6.1/lto-wrapper
Target: arm-linux-gnueabi
Configured with: ../src/configure -v --with-pkgversion='Ubuntu/Linaro
4.6.1-9ubuntu3' --with-bugurl=file:///usr/share/doc/gcc-4.6/README.Bugs
--enable-languages=c,c++,fortran,objc,obj-c++ --prefix=/usr
--program-suffix=-4.6 --enable-shared --enable-linker-build-id
--with-system-zlib --libexecdir=/usr/lib --without-included-gettext
--enable-threads=posix
--with-gxx-include-dir=/usr/arm-linux-gnueabi/include/c++/4.6.1
--libdir=/usr/lib --enable-nls --enable-clocale=gnu --enable-libstdcxx-debug
--enable-libstdcxx-time=yes --enable-plugin --enable-objc-gc --enable-multilib
--disable-sjlj-exceptions --with-arch=armv7-a --with-float=softfp
--with-fpu=vfpv3-d16 --with-mode=thumb --disable-werror
--enable-checking=release --build=x86_64-linux-gnu --host=x86_64-linux-gnu
--target=arm-linux-gnueabi --program-prefix=arm-linux-gnueabi-
--includedir=/usr/arm-linux-gnueabi/include
--with-headers=/usr/arm-linux-gnueabi/include
--with-libs=/usr/arm-linux-gnueabi/lib
Thread model: posix
gcc version 4.6.1 (Ubuntu/Linaro 4.6.1-9ubuntu3) 


Using built-in specs.
COLLECT_GCC=/usr/bin/g++-4.6.real
COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-linux-gnu/4.6.1/lto-wrapper
Target: x86_64-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Ubuntu/Linaro
4.6.1-9ubuntu3' --with-bugurl=file:///usr/share/doc/gcc-4.6/README.Bugs
--enable-languages=c,c++,fortran,objc,obj-c++,go --prefix=/usr
--program-suffix=-4.6 --enable-shared --enable-linker-build-id
--with-system-zlib --libexecdir=/usr/lib --without-included-gettext
--enable-threads=posix --with-gxx-include-dir=/usr/include/c++/4.6
--libdir=/usr/lib --enable-nls --with-sysroot=/ --enable-clocale=gnu
--enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-plugin
--enable-objc-gc --disable-werror --with-arch-32=i686 --with-tune=generic
--enable-checking=release --build=x86_64-linux-gnu --host=x86_64-linux-gnu
--target=x86_64-linux-gnu
Thread model: posix
gcc version 4.6.1 (Ubuntu/Linaro 4.6.1-9ubuntu3) 


Using built-in specs.
COLLECT_GCC=/usr/bin/g++-4.5.real
COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-linux-gnu/4.5.4/lto-wrapper
Target: x86_64-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Ubuntu/Linaro
4.5.3-9ubuntu1' --with-bugurl=file:///usr/share/doc/gcc-4.5/README.Bugs
--enable-languages=c,c++,fortran,objc,obj-c++ --prefix=/usr
--program-suffix=-4.5 --enable-shared --enable-linker-build-id
--with-system-zlib --libexecdir=/usr/lib --without-included-gettext
--enable-threads=posix --with-gxx-include-dir=/usr/include/c++/4.5
--libdir=/usr/lib --enable-nls --with-sysroot=/ --enable-clocale=gnu
--enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-plugin
--enable-gold --enable-ld=default --with-plugin-ld=ld.gold --enable-objc-gc
--disable-werror --with-arch-32=i686 --with-tune=generic
--enable-checking=release --build=x86_64-linux-gnu --host=x86_64-linux-gnu
--target=x86_64-linux-gnu
Thread model: posix
gcc version 4.5.4 (Ubuntu/Linaro 4.5.3-9ubuntu1) 

Using built-in specs.
Target: x86_64-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Ubuntu/Linaro
4.4.6-11ubuntu2' --with-bugurl=file:///usr/share/doc/gcc-4.4/README.Bugs
--enable-languages=c,c++,fortran,objc,obj-c++ --prefix=/usr
--program-suffix=-4.4 --enable-shared --enable-linker-build-id
--with-system-zlib --libexecdir=/usr/lib --without-included-gettext
--enable-threads=posix --with-gxx-include-dir=/usr/include/c++/4.4
--libdir=/usr/lib --enable-nls --with-sysroot=/ --enable-clocale=gnu
--enable-libstdcxx-debug --enable-objc-gc --disable-werror --with-arch-32=i686
--with-tune=generic --enable-checking=release --build=x86_64-linux-gnu
--host=x86_64-linux-gnu --target=x86_64-linux-gnu
Thread model: posix
gcc version 4.4.6 (Ubuntu/Linaro 4.4.6-11ubuntu2)


* [Bug middle-end/53844] GCC generates suboptimal code for unused members of classes in some cases on multiple targets.
  2012-07-03 19:56 [Bug middle-end/53844] New: GCC generates suboptimal code for unused members of classes in some cases on multiple targets ed at edrosten dot com
@ 2012-07-03 19:57 ` ed at edrosten dot com
  2012-07-03 20:23 ` pinskia at gcc dot gnu.org
                   ` (15 subsequent siblings)
  16 siblings, 0 replies; 18+ messages in thread
From: ed at edrosten dot com @ 2012-07-03 19:57 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53844

Edward Rosten <ed at edrosten dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Severity|normal                      |enhancement


* [Bug middle-end/53844] GCC generates suboptimal code for unused members of classes in some cases on multiple targets.
  2012-07-03 19:56 [Bug middle-end/53844] New: GCC generates suboptimal code for unused members of classes in some cases on multiple targets ed at edrosten dot com
  2012-07-03 19:57 ` [Bug middle-end/53844] " ed at edrosten dot com
@ 2012-07-03 20:23 ` pinskia at gcc dot gnu.org
  2012-07-03 20:27 ` [Bug middle-end/53844] [4.6/4.7 Regression] " pinskia at gcc dot gnu.org
                   ` (14 subsequent siblings)
  16 siblings, 0 replies; 18+ messages in thread
From: pinskia at gcc dot gnu.org @ 2012-07-03 20:23 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53844

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |missed-optimization
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2012-07-03
            Version|unknown                     |4.7.0
     Ever Confirmed|0                           |1
      Known to fail|                            |4.6.1, 4.7.0

--- Comment #1 from Andrew Pinski <pinskia at gcc dot gnu.org> 2012-07-03 20:23:40 UTC ---
Confirmed.


* [Bug middle-end/53844] [4.6/4.7 Regression] GCC generates suboptimal code for unused members of classes in some cases on multiple targets.
  2012-07-03 19:56 [Bug middle-end/53844] New: GCC generates suboptimal code for unused members of classes in some cases on multiple targets ed at edrosten dot com
  2012-07-03 19:57 ` [Bug middle-end/53844] " ed at edrosten dot com
  2012-07-03 20:23 ` pinskia at gcc dot gnu.org
@ 2012-07-03 20:27 ` pinskia at gcc dot gnu.org
  2012-07-04  9:35 ` [Bug tree-optimization/53844] " rguenth at gcc dot gnu.org
                   ` (13 subsequent siblings)
  16 siblings, 0 replies; 18+ messages in thread
From: pinskia at gcc dot gnu.org @ 2012-07-03 20:27 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53844

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
      Known to work|                            |4.3.3
   Target Milestone|---                         |4.6.4
            Summary|GCC generates suboptimal    |[4.6/4.7 Regression] GCC
                   |code for unused members of  |generates suboptimal code
                   |classes in some cases on    |for unused members of
                   |multiple targets.           |classes in some cases on
                   |                            |multiple targets.
           Severity|enhancement                 |normal

--- Comment #2 from Andrew Pinski <pinskia at gcc dot gnu.org> 2012-07-03 20:26:46 UTC ---
DCE used to be able to remove these stores at least in 4.3.3.


* [Bug tree-optimization/53844] [4.6/4.7 Regression] GCC generates suboptimal code for unused members of classes in some cases on multiple targets.
  2012-07-03 19:56 [Bug middle-end/53844] New: GCC generates suboptimal code for unused members of classes in some cases on multiple targets ed at edrosten dot com
                   ` (2 preceding siblings ...)
  2012-07-03 20:27 ` [Bug middle-end/53844] [4.6/4.7 Regression] " pinskia at gcc dot gnu.org
@ 2012-07-04  9:35 ` rguenth at gcc dot gnu.org
  2012-07-04  9:43 ` ed at edrosten dot com
                   ` (12 subsequent siblings)
  16 siblings, 0 replies; 18+ messages in thread
From: rguenth at gcc dot gnu.org @ 2012-07-04  9:35 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53844

Richard Guenther <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |ASSIGNED
          Component|middle-end                  |tree-optimization
         AssignedTo|unassigned at gcc dot       |rguenth at gcc dot gnu.org
                   |gnu.org                     |

--- Comment #3 from Richard Guenther <rguenth at gcc dot gnu.org> 2012-07-04 09:34:59 UTC ---
With the alias-oracle in place DCE can no longer see this (the temporaries
have their address taken).  DSE should be able to, but it seems it does not.
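
The pattern being described can be illustrated with a small sketch: locals
whose addresses are stored into other locals, none of which is ever read. The
names Link and copy100 are hypothetical and not taken from the test case, and
whether a particular GCC version removes these stores depends on its passes;
the sketch only shows what an address-taken dead store looks like.

```cpp
#include <cassert>

// Hypothetical illustration; the type and function names are assumptions.
struct Link {
    const int*  scalar;  // points at a local int
    const Link* prev;    // points at the previous local Link
};

void copy100(const double* in, double* out) {
    int one = 1;            // its address is taken below
    Link a = { &one, 0 };   // stores &one into a
    Link b = { &one, &a };  // stores &a into b, so a is address-taken too
    (void)b;                // nothing ever reads a or b afterwards

    for (int i = 0; i < 100; ++i)
        out[i] = in[i];     // the only stores with an observable effect
}
```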

Investigating.


* [Bug tree-optimization/53844] [4.6/4.7 Regression] GCC generates suboptimal code for unused members of classes in some cases on multiple targets.
  2012-07-03 19:56 [Bug middle-end/53844] New: GCC generates suboptimal code for unused members of classes in some cases on multiple targets ed at edrosten dot com
                   ` (3 preceding siblings ...)
  2012-07-04  9:35 ` [Bug tree-optimization/53844] " rguenth at gcc dot gnu.org
@ 2012-07-04  9:43 ` ed at edrosten dot com
  2012-07-04  9:55 ` rguenth at gcc dot gnu.org
                   ` (11 subsequent siblings)
  16 siblings, 0 replies; 18+ messages in thread
From: ed at edrosten dot com @ 2012-07-04  9:43 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53844

--- Comment #4 from Edward Rosten <ed at edrosten dot com> 2012-07-04 09:42:52 UTC ---
It doesn't seem to be entirely to do with the address. It still pushes the
values onto the stack even if the class is changed to have a const int, rather
than a const int&.


* [Bug tree-optimization/53844] [4.6/4.7 Regression] GCC generates suboptimal code for unused members of classes in some cases on multiple targets.
  2012-07-03 19:56 [Bug middle-end/53844] New: GCC generates suboptimal code for unused members of classes in some cases on multiple targets ed at edrosten dot com
                   ` (4 preceding siblings ...)
  2012-07-04  9:43 ` ed at edrosten dot com
@ 2012-07-04  9:55 ` rguenth at gcc dot gnu.org
  2012-07-04 12:11 ` rguenth at gcc dot gnu.org
                   ` (10 subsequent siblings)
  16 siblings, 0 replies; 18+ messages in thread
From: rguenth at gcc dot gnu.org @ 2012-07-04  9:55 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53844

--- Comment #5 from Richard Guenther <rguenth at gcc dot gnu.org> 2012-07-04 09:55:27 UTC ---
On trunk, dse_possible_dead_store_p fails to look through the loop when
it is at the VDEF .MEM_165:

<bb 3>:
  # i_93 = PHI <i_78(3), 0(2)>
  # .MEM_100 = PHI <.MEM_165(3), .MEM_68(2)>
  # VUSE <.MEM_100>
  D.3879_71 = MEM[(struct VBase *)out_2(D)].my_data;
  D.3880_73 = (long unsigned int) i_93;
  D.3881_74 = D.3880_73 * 8;
  D.3878_75 = D.3879_71 + D.3881_74;
  # VUSE <.MEM_100>
  D.3975_138 = MEM[(const struct VBase *)in_1(D)].my_data;
  D.3974_141 = D.3975_138 + D.3881_74;
  # VUSE <.MEM_100>
  D.3981_142 = *D.3974_141;
  # .MEM_165 = VDEF <.MEM_100>
  *D.3878_75 = D.3981_142;
  i_78 = i_93 + 1;
  if (i_78 != 100)
    goto <bb 3>;
  else
    goto <bb 4>;

<bb 4>:
  # .MEM_28 = VDEF <.MEM_165>
  D.2811 ={v} {CLOBBER};

we then see two uses that have a VDEF - the PHI (which we processed before),
and the store after the loop.

The following fixes that

Index: gcc/tree-ssa-dse.c
===================================================================
--- gcc/tree-ssa-dse.c  (revision 189248)
+++ gcc/tree-ssa-dse.c  (working copy)
@@ -94,7 +94,7 @@ dse_possible_dead_store_p (gimple stmt,
   temp = stmt;
   do
     {
-      gimple use_stmt;
+      gimple use_stmt, defvar_def;
       imm_use_iterator ui;
       bool fail = false;
       tree defvar;
@@ -108,6 +108,7 @@ dse_possible_dead_store_p (gimple stmt,
        defvar = PHI_RESULT (temp);
       else
        defvar = gimple_vdef (temp);
+      defvar_def = temp;
       temp = NULL;
       FOR_EACH_IMM_USE_STMT (use_stmt, ui, defvar)
        {
@@ -139,7 +140,13 @@ dse_possible_dead_store_p (gimple stmt,
                  fail = true;
                  BREAK_FROM_IMM_USE_STMT (ui);
                }
-             temp = use_stmt;
+             /* Do not consider the PHI as use if it dominates the 
+                stmt defining the virtual operand we are processing.  */
+             if (gimple_bb (defvar_def) != gimple_bb (use_stmt)
+                 && !dominated_by_p (CDI_DOMINATORS,
+                                     gimple_bb (defvar_def),
+                                     gimple_bb (use_stmt)))
+               temp = use_stmt;
            }
          /* If the statement is a use the store is not dead.  */
          else if (ref_maybe_used_by_stmt_p (use_stmt,

but the existing loop code is weird ...

I'm testing the above anyway.
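
The condition added by the patch can be modelled outside GCC with a toy
dominator tree. This is a sketch under stated assumptions, not GCC code:
ToyCfg, dominated_by and follow_phi are hypothetical names, and blocks are
plain integers (0, 1, 2 standing in for bb 2, bb 3, bb 4 of the dump above).
In that dump both the virtual PHI and the store defining .MEM_165 sit in bb 3,
so the PHI is not followed.

```cpp
#include <cassert>
#include <vector>

// Toy model, not GCC code. idom[b] is the immediate dominator of block b;
// the entry block is its own immediate dominator.
struct ToyCfg {
    std::vector<int> idom;

    // True if block a is dominated by block b (a block dominates itself).
    bool dominated_by(int a, int b) const {
        for (;;) {
            if (a == b) return true;
            if (idom[a] == a) return false;  // reached the entry block
            a = idom[a];
        }
    }
};

// Mirrors the shape of the patched condition: keep walking through the PHI
// only if the PHI's block is neither the defining block nor a dominator of it.
bool follow_phi(const ToyCfg& cfg, int def_bb, int phi_bb) {
    return def_bb != phi_bb && !cfg.dominated_by(def_bb, phi_bb);
}
```

With idom = {0, 0, 1}, follow_phi(cfg, 1, 1) is false (the loop PHI case),
while a PHI in a later merge block, as in follow_phi(cfg, 1, 2), would still
be followed.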


* [Bug tree-optimization/53844] [4.6/4.7 Regression] GCC generates suboptimal code for unused members of classes in some cases on multiple targets.
  2012-07-03 19:56 [Bug middle-end/53844] New: GCC generates suboptimal code for unused members of classes in some cases on multiple targets ed at edrosten dot com
                   ` (5 preceding siblings ...)
  2012-07-04  9:55 ` rguenth at gcc dot gnu.org
@ 2012-07-04 12:11 ` rguenth at gcc dot gnu.org
  2012-07-04 12:12 ` rguenth at gcc dot gnu.org
                   ` (9 subsequent siblings)
  16 siblings, 0 replies; 18+ messages in thread
From: rguenth at gcc dot gnu.org @ 2012-07-04 12:11 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53844

--- Comment #6 from Richard Guenther <rguenth at gcc dot gnu.org> 2012-07-04 12:10:56 UTC ---
Author: rguenth
Date: Wed Jul  4 12:10:40 2012
New Revision: 189256

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=189256
Log:
2012-07-04  Richard Guenther  <rguenther@suse.de>

    PR tree-optimization/53844
    * tree-ssa-dse.c (dse_possible_dead_store_p): Properly handle
    the loop virtual PHI.

    * g++.dg/tree-ssa/pr53844.C: New testcase.

Added:
    trunk/gcc/testsuite/g++.dg/tree-ssa/pr53844.C
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/testsuite/ChangeLog
    trunk/gcc/tree-ssa-dse.c


* [Bug tree-optimization/53844] [4.6/4.7 Regression] GCC generates suboptimal code for unused members of classes in some cases on multiple targets.
  2012-07-03 19:56 [Bug middle-end/53844] New: GCC generates suboptimal code for unused members of classes in some cases on multiple targets ed at edrosten dot com
                   ` (6 preceding siblings ...)
  2012-07-04 12:11 ` rguenth at gcc dot gnu.org
@ 2012-07-04 12:12 ` rguenth at gcc dot gnu.org
  2012-07-04 13:36 ` [Bug middle-end/53844] " ed at edrosten dot com
                   ` (8 subsequent siblings)
  16 siblings, 0 replies; 18+ messages in thread
From: rguenth at gcc dot gnu.org @ 2012-07-04 12:12 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53844

Richard Guenther <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
      Known to work|                            |4.8.0

--- Comment #7 from Richard Guenther <rguenth at gcc dot gnu.org> 2012-07-04 12:11:57 UTC ---
Fixed on trunk so far, watching for fallout.


* [Bug middle-end/53844] [4.6/4.7 Regression] GCC generates suboptimal code for unused members of classes in some cases on multiple targets.
  2012-07-03 19:56 [Bug middle-end/53844] New: GCC generates suboptimal code for unused members of classes in some cases on multiple targets ed at edrosten dot com
                   ` (7 preceding siblings ...)
  2012-07-04 12:12 ` rguenth at gcc dot gnu.org
@ 2012-07-04 13:36 ` ed at edrosten dot com
  2012-07-04 13:41 ` ed at edrosten dot com
                   ` (7 subsequent siblings)
  16 siblings, 0 replies; 18+ messages in thread
From: ed at edrosten dot com @ 2012-07-04 13:36 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53844

Edward Rosten <ed at edrosten dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
          Component|tree-optimization           |middle-end

--- Comment #8 from Edward Rosten <ed at edrosten dot com> 2012-07-04 13:36:28 UTC ---
(In reply to comment #7)
> Fixed on trunk so far, watching for fallout.

I pulled the latest change from SVN and tried it on the test code, with
success.

I'm using the shortened test function:


void test(const Vector<>& in, Vector<>& out, int i)
{
    out = in*1*1*1*1;
}

If I change the test function to:


void test(const Vector<>& in, Vector<>& out, int i)
{
    const Vector<ScalarMulExpr<ScalarMulExpr<ScalarMulExpr<ScalarMulExpr<VBase> > > > >& v = in*1*1*1*1;
    out = v;
}

Then the results become almost identical to gcc 4.7 (and much worse than for
the first test function). The asm code (compiled with -fno-tree-vectorize to
avoid all the extra asm that deals with alignment etc.) is:

_Z4testRK6VectorI5VBaseERS1_i:
.LFB8:
    .cfi_startproc
    movq    (%rsi), %rcx
    xorl    %esi, %esi
    movq    -16(%rsp), %rax
    cvtsi2sd    %esi, %xmm1
    movq    8(%rax), %rdx
    movq    (%rax), %rax
    cvtsi2sd    (%rax), %xmm3
    movq    -24(%rsp), %rax
    movq    (%rdx), %rdx
    cvtsi2sd    (%rax), %xmm2
    xorl    %eax, %eax
    .p2align 4,,10
    .p2align 3
.L3:
    movsd    (%rdx,%rax), %xmm0
    mulsd    %xmm3, %xmm0
    mulsd    %xmm2, %xmm0
    mulsd    %xmm1, %xmm0
    movsd    %xmm0, (%rcx,%rax)
    addq    $8, %rax
    cmpq    $800, %rax
    jne    .L3
    rep; ret
    .cfi_endproc


In this case, it's clearly converting all those 1's to doubles and then
multiplying by all but the first one. Note that if the following test function
is used:

void test(const Vector<>& in, Vector<>& out, int i)
{
    const Vector<ScalarMulExpr<ScalarMulExpr<VBase> > >& v = in*1*1;
    out = v;
}

Then suboptimal code isn't produced. Further investigation shows that this also
applies to the previous symptom with unnecessary pushes in gcc 4.7. 

If I change the mul member in ScalarMulExpr to int rather than int&, the
compiler can optimize away the first two multiplications, rather than the first
one.

GCC version is:

gcc-4.8-svn -v
Using built-in specs.
COLLECT_GCC=gcc-4.8-svn
COLLECT_LTO_WRAPPER=/usr/local/libexec/gcc/x86_64-unknown-linux-gnu/4.8.0/lto-wrapper
Target: x86_64-unknown-linux-gnu
Configured with: ../gcc/configure --program-suffix=-4.8-svn
--enable-languages=c,c++
Thread model: posix
gcc version 4.8.0 20120704 (experimental) (GCC) 

but similar results are reported on 4.7 as well.


Is this a continuation of the same bug, or should I refile this as a new bug?


* [Bug middle-end/53844] [4.6/4.7 Regression] GCC generates suboptimal code for unused members of classes in some cases on multiple targets.
  2012-07-03 19:56 [Bug middle-end/53844] New: GCC generates suboptimal code for unused members of classes in some cases on multiple targets ed at edrosten dot com
                   ` (8 preceding siblings ...)
  2012-07-04 13:36 ` [Bug middle-end/53844] " ed at edrosten dot com
@ 2012-07-04 13:41 ` ed at edrosten dot com
  2012-07-04 13:47 ` rguenther at suse dot de
                   ` (6 subsequent siblings)
  16 siblings, 0 replies; 18+ messages in thread
From: ed at edrosten dot com @ 2012-07-04 13:41 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53844

--- Comment #9 from Edward Rosten <ed at edrosten dot com> 2012-07-04 13:40:54 UTC ---
(In reply to comment #7)
> Fixed on trunk so far, watching for fallout.

I would like to note that your fix seems to remove the performance hit in my
numerics which revealed the bug.


* [Bug middle-end/53844] [4.6/4.7 Regression] GCC generates suboptimal code for unused members of classes in some cases on multiple targets.
  2012-07-03 19:56 [Bug middle-end/53844] New: GCC generates suboptimal code for unused members of classes in some cases on multiple targets ed at edrosten dot com
                   ` (9 preceding siblings ...)
  2012-07-04 13:41 ` ed at edrosten dot com
@ 2012-07-04 13:47 ` rguenther at suse dot de
  2012-07-04 14:13 ` ed at edrosten dot com
                   ` (5 subsequent siblings)
  16 siblings, 0 replies; 18+ messages in thread
From: rguenther at suse dot de @ 2012-07-04 13:47 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53844

--- Comment #10 from rguenther at suse dot de <rguenther at suse dot de> 2012-07-04 13:47:09 UTC ---
On Wed, 4 Jul 2012, ed at edrosten dot com wrote:

> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53844
> 
> Edward Rosten <ed at edrosten dot com> changed:
> 
>            What    |Removed                     |Added
> ----------------------------------------------------------------------------
>           Component|tree-optimization           |middle-end
> 
> --- Comment #8 from Edward Rosten <ed at edrosten dot com> 2012-07-04 13:36:28 UTC ---
> (In reply to comment #7)
> > Fixed on trunk so far, watching for fallout.
> 
> I pulled the latest change from SVN and tried it on the test code, with
> success.
> 
> I'm using the shortened test function:
> 
> 
> void test(const Vector<>& in, Vector<>& out, int i)
> {
>     out = in*1*1*1*1;
> }
> 
> If I change the test function to:
> 
> 
> void test(const Vector<>& in, Vector<>& out, int i)
> {
>     const Vector<ScalarMulExpr<ScalarMulExpr<ScalarMulExpr<ScalarMulExpr<VBase> > > > >& v = in*1*1*1*1;
>     out = v;
> }

I can at least see that you are using variables that are no longer live:

<bb 2>:
  D.2391 ={v} {CLOBBER};
  D.2396 ={v} {CLOBBER};

<bb 3>:
  # i_36 = PHI <i_35(3), 0(2)>
  D.2928_28 = MEM[(struct VBase *)out_3(D)].my_data;
  D.2929_30 = (long unsigned int) i_36;
  D.2930_31 = D.2929_30 * 8;
  D.2927_32 = D.2928_28 + D.2930_31;
  D.2948_44 = MEM[(const struct ScalarMulExpr *)&D.2391].vec;

here D.2391 is already dead.  So possibly you are returning references
to temporaries somewhere.


* [Bug middle-end/53844] [4.6/4.7 Regression] GCC generates suboptimal code for unused members of classes in some cases on multiple targets.
  2012-07-03 19:56 [Bug middle-end/53844] New: GCC generates suboptimal code for unused members of classes in some cases on multiple targets ed at edrosten dot com
                   ` (10 preceding siblings ...)
  2012-07-04 13:47 ` rguenther at suse dot de
@ 2012-07-04 14:13 ` ed at edrosten dot com
  2012-09-07  9:35 ` rguenth at gcc dot gnu.org
                   ` (4 subsequent siblings)
  16 siblings, 0 replies; 18+ messages in thread
From: ed at edrosten dot com @ 2012-07-04 14:13 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53844

--- Comment #11 from Edward Rosten <ed at edrosten dot com> 2012-07-04 14:13:28 UTC ---
(In reply to comment #10)
> On Wed, 4 Jul 2012, ed at edrosten dot com wrote:

> here D.2391 is already dead.  So possibly you are returning references
> to temporaries somewhere.

You are correct. In fiddling around adding and removing &'s and not tracking my
files properly, I left some references in. If I do it properly, then good code
is produced.
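
For reference, a variant that avoids dangling references when the expression
is bound to a named reference might look like the sketch below. It is
hypothetical and not the code from the attachment: kSize, eval(), test_named
and the operator bodies are assumptions. The point is that binding a temporary
to a const reference extends the lifetime of the outermost temporary only, so
inner operands and scalars have to be held by value.

```cpp
#include <cassert>

// Hypothetical sketch, not the attached source. Names other than Vector,
// VBase, ScalarMulExpr, my_data, vec and mul are assumptions.
const int kSize = 100;

struct VBase {
    double* my_data;  // non-owning pointer, cheap to copy
    double eval(int i) const { return my_data[i]; }
};

template <class E>
struct ScalarMulExpr {
    E   vec;  // operand held by value, so it cannot dangle
    int mul;  // scalar held by value as well
    ScalarMulExpr(const E& v, int m) : vec(v), mul(m) {}
    double eval(int i) const { return vec.eval(i) * mul; }
};

template <class E = VBase>
struct Vector {
    E expr;
    explicit Vector(const E& e) : expr(e) {}
    double eval(int i) const { return expr.eval(i); }

    template <class F>
    Vector& operator=(const Vector<F>& rhs) {
        for (int i = 0; i < kSize; ++i) expr.my_data[i] = rhs.eval(i);
        return *this;
    }
};

template <class E>
Vector<ScalarMulExpr<E> > operator*(const Vector<E>& v, int m) {
    return Vector<ScalarMulExpr<E> >(ScalarMulExpr<E>(v.expr, m));
}

void test_named(const Vector<>& in, Vector<>& out) {
    // Only the outermost temporary is lifetime-extended by v. The inner
    // expression was copied into it, so nothing refers to a dead object.
    const Vector<ScalarMulExpr<ScalarMulExpr<VBase> > >& v = in * 2 * 3;
    out = v;
}
```

The scalars 2 and 3 are used here instead of 1 only so that the result is
distinguishable from a plain copy.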


* [Bug middle-end/53844] [4.6/4.7 Regression] GCC generates suboptimal code for unused members of classes in some cases on multiple targets.
  2012-07-03 19:56 [Bug middle-end/53844] New: GCC generates suboptimal code for unused members of classes in some cases on multiple targets ed at edrosten dot com
                   ` (11 preceding siblings ...)
  2012-07-04 14:13 ` ed at edrosten dot com
@ 2012-09-07  9:35 ` rguenth at gcc dot gnu.org
  2013-02-04 12:05 ` rguenth at gcc dot gnu.org
                   ` (3 subsequent siblings)
  16 siblings, 0 replies; 18+ messages in thread
From: rguenth at gcc dot gnu.org @ 2012-09-07  9:35 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53844

Richard Guenther <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Priority|P3                          |P2


* [Bug middle-end/53844] [4.6/4.7 Regression] GCC generates suboptimal code for unused members of classes in some cases on multiple targets.
  2012-07-03 19:56 [Bug middle-end/53844] New: GCC generates suboptimal code for unused members of classes in some cases on multiple targets ed at edrosten dot com
                   ` (12 preceding siblings ...)
  2012-09-07  9:35 ` rguenth at gcc dot gnu.org
@ 2013-02-04 12:05 ` rguenth at gcc dot gnu.org
  2013-02-04 12:06 ` [Bug middle-end/53844] [4.6 " rguenth at gcc dot gnu.org
                   ` (2 subsequent siblings)
  16 siblings, 0 replies; 18+ messages in thread
From: rguenth at gcc dot gnu.org @ 2013-02-04 12:05 UTC (permalink / raw)
  To: gcc-bugs


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53844

--- Comment #12 from Richard Biener <rguenth at gcc dot gnu.org> 2013-02-04 12:04:42 UTC ---
Author: rguenth
Date: Mon Feb  4 12:04:35 2013
New Revision: 195708

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=195708
Log:
2013-02-04  Richard Biener  <rguenther@suse.de>

    Backport from mainline
    2012-07-04  Richard Guenther  <rguenther@suse.de>

    PR tree-optimization/53844
    * tree-ssa-dse.c (dse_possible_dead_store_p): Properly handle
    the loop virtual PHI.

    * g++.dg/tree-ssa/pr53844.C: New testcase.

    2012-12-13  Richard Biener  <rguenther@suse.de>

    PR lto/55660
    * tree-streamer.c (record_common_node): Check that we are not
    recursively pre-loading nodes we want to skip.  Handle
    char_type_node appearing as part of va_list_type_node.

    * gcc.dg/lto/pr55660_0.c: New testcase.
    * gcc.dg/lto/pr55660_1.c: Likewise.

2013-02-04  Richard Biener  <rguenther@suse.de>

    PR middle-end/55890
    * gimple.h (gimple_call_builtin_class_p): New function.
    * gimple.c (validate_call): New function.
    (gimple_call_builtin_class_p): Likewise.
    * tree-ssa-structalias.c (find_func_aliases_for_builtin_call):
    Use gimple_call_builtin_class_p.
    (find_func_clobbers): Likewise.
    * tree-ssa-strlen.c (adjust_last_stmt): Likewise.
    (strlen_optimize_stmt): Likewise.

    * gcc.dg/torture/pr55890-1.c: New testcase.
    * gcc.dg/torture/pr55890-2.c: Likewise.

Added:
    branches/gcc-4_7-branch/gcc/testsuite/g++.dg/tree-ssa/pr53844.C
    branches/gcc-4_7-branch/gcc/testsuite/gcc.dg/lto/pr55660_0.c
    branches/gcc-4_7-branch/gcc/testsuite/gcc.dg/lto/pr55660_1.c
    branches/gcc-4_7-branch/gcc/testsuite/gcc.dg/torture/pr55890-1.c
    branches/gcc-4_7-branch/gcc/testsuite/gcc.dg/torture/pr55890-2.c
Modified:
    branches/gcc-4_7-branch/gcc/ChangeLog
    branches/gcc-4_7-branch/gcc/gimple.c
    branches/gcc-4_7-branch/gcc/gimple.h
    branches/gcc-4_7-branch/gcc/testsuite/ChangeLog
    branches/gcc-4_7-branch/gcc/tree-ssa-dse.c
    branches/gcc-4_7-branch/gcc/tree-ssa-strlen.c
    branches/gcc-4_7-branch/gcc/tree-ssa-structalias.c
    branches/gcc-4_7-branch/gcc/tree-streamer.c



* [Bug middle-end/53844] [4.6 Regression] GCC generates suboptimal code for unused members of classes in some cases on multiple targets.
  2012-07-03 19:56 [Bug middle-end/53844] New: GCC generates suboptimal code for unused members of classes in some cases on multiple targets ed at edrosten dot com
                   ` (13 preceding siblings ...)
  2013-02-04 12:05 ` rguenth at gcc dot gnu.org
@ 2013-02-04 12:06 ` rguenth at gcc dot gnu.org
  2013-02-18 13:47 ` rguenth at gcc dot gnu.org
  2013-02-18 13:47 ` [Bug middle-end/53844] [4.6/4.7 " rguenth at gcc dot gnu.org
  16 siblings, 0 replies; 18+ messages in thread
From: rguenth at gcc dot gnu.org @ 2013-02-04 12:06 UTC (permalink / raw)
  To: gcc-bugs


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53844

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |RESOLVED
      Known to work|                            |4.7.3
         Resolution|                            |FIXED
   Target Milestone|4.6.4                       |4.7.3
            Summary|[4.6/4.7 Regression] GCC    |[4.6 Regression] GCC
                   |generates suboptimal code   |generates suboptimal code
                   |for unused members of       |for unused members of
                   |classes in some cases on    |classes in some cases on
                   |multiple targets.           |multiple targets.

--- Comment #13 from Richard Biener <rguenth at gcc dot gnu.org> 2013-02-04 12:06:13 UTC ---
Also fixed for 4.7.3.  I am not considering backporting further, thus FIXED.



* [Bug middle-end/53844] [4.6/4.7 Regression] GCC generates suboptimal code for unused members of classes in some cases on multiple targets.
  2012-07-03 19:56 [Bug middle-end/53844] New: GCC generates suboptimal code for unused members of classes in some cases on multiple targets ed at edrosten dot com
                   ` (15 preceding siblings ...)
  2013-02-18 13:47 ` rguenth at gcc dot gnu.org
@ 2013-02-18 13:47 ` rguenth at gcc dot gnu.org
  16 siblings, 0 replies; 18+ messages in thread
From: rguenth at gcc dot gnu.org @ 2013-02-18 13:47 UTC (permalink / raw)
  To: gcc-bugs


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53844

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
      Known to work|4.7.3                       |
   Target Milestone|4.7.3                       |4.8.0
            Summary|[4.6 Regression] GCC        |[4.6/4.7 Regression] GCC
                   |generates suboptimal code   |generates suboptimal code
                   |for unused members of       |for unused members of
                   |classes in some cases on    |classes in some cases on
                   |multiple targets.           |multiple targets.

--- Comment #15 from Richard Biener <rguenth at gcc dot gnu.org> 2013-02-18 13:47:33 UTC ---
Reverted even on the 4.7 branch.  Fixed for 4.8.



* [Bug middle-end/53844] [4.6 Regression] GCC generates suboptimal code for unused members of classes in some cases on multiple targets.
  2012-07-03 19:56 [Bug middle-end/53844] New: GCC generates suboptimal code for unused members of classes in some cases on multiple targets ed at edrosten dot com
                   ` (14 preceding siblings ...)
  2013-02-04 12:06 ` [Bug middle-end/53844] [4.6 " rguenth at gcc dot gnu.org
@ 2013-02-18 13:47 ` rguenth at gcc dot gnu.org
  2013-02-18 13:47 ` [Bug middle-end/53844] [4.6/4.7 " rguenth at gcc dot gnu.org
  16 siblings, 0 replies; 18+ messages in thread
From: rguenth at gcc dot gnu.org @ 2013-02-18 13:47 UTC (permalink / raw)
  To: gcc-bugs


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53844

--- Comment #14 from Richard Biener <rguenth at gcc dot gnu.org> 2013-02-18 13:46:46 UTC ---
Author: rguenth
Date: Mon Feb 18 13:46:37 2013
New Revision: 196120

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=196120
Log:
2013-02-18  Richard Biener  <rguenther@suse.de>

    Revert
    2013-02-04  Richard Biener  <rguenther@suse.de>

    Backport from mainline
    2012-07-04  Richard Guenther  <rguenther@suse.de>

    PR tree-optimization/53844
    * tree-ssa-dse.c (dse_possible_dead_store_p): Properly handle
    the loop virtual PHI.

    * g++.dg/tree-ssa/pr53844.C: New testcase.

Removed:
    branches/gcc-4_7-branch/gcc/testsuite/g++.dg/tree-ssa/pr53844.C
Modified:
    branches/gcc-4_7-branch/gcc/ChangeLog
    branches/gcc-4_7-branch/gcc/testsuite/ChangeLog
    branches/gcc-4_7-branch/gcc/tree-ssa-dse.c



Thread overview: 18+ messages
2012-07-03 19:56 [Bug middle-end/53844] New: GCC generates suboptimal code for unused members of classes in some cases on multiple targets ed at edrosten dot com
2012-07-03 19:57 ` [Bug middle-end/53844] " ed at edrosten dot com
2012-07-03 20:23 ` pinskia at gcc dot gnu.org
2012-07-03 20:27 ` [Bug middle-end/53844] [4.6/4.7 Regression] " pinskia at gcc dot gnu.org
2012-07-04  9:35 ` [Bug tree-optimization/53844] " rguenth at gcc dot gnu.org
2012-07-04  9:43 ` ed at edrosten dot com
2012-07-04  9:55 ` rguenth at gcc dot gnu.org
2012-07-04 12:11 ` rguenth at gcc dot gnu.org
2012-07-04 12:12 ` rguenth at gcc dot gnu.org
2012-07-04 13:36 ` [Bug middle-end/53844] " ed at edrosten dot com
2012-07-04 13:41 ` ed at edrosten dot com
2012-07-04 13:47 ` rguenther at suse dot de
2012-07-04 14:13 ` ed at edrosten dot com
2012-09-07  9:35 ` rguenth at gcc dot gnu.org
2013-02-04 12:05 ` rguenth at gcc dot gnu.org
2013-02-04 12:06 ` [Bug middle-end/53844] [4.6 " rguenth at gcc dot gnu.org
2013-02-18 13:47 ` rguenth at gcc dot gnu.org
2013-02-18 13:47 ` [Bug middle-end/53844] [4.6/4.7 " rguenth at gcc dot gnu.org
