public inbox for gcc-bugs@sourceware.org
* [Bug c++/54000] New: Performance breakdown for gcc-4.{6,7} vs. gcc-4.5 using std::vector in matrix vector multiplication
@ 2012-07-17 15:59 benedict.geihe at ins dot uni-bonn.de
  2012-07-17 16:01 ` [Bug c++/54000] " benedict.geihe at ins dot uni-bonn.de
                   ` (16 more replies)
  0 siblings, 17 replies; 18+ messages in thread
From: benedict.geihe at ins dot uni-bonn.de @ 2012-07-17 15:59 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54000

             Bug #: 54000
           Summary: Performance breakdown for gcc-4.{6,7} vs. gcc-4.5
                    using std::vector in matrix vector multiplication
    Classification: Unclassified
           Product: gcc
           Version: 4.7.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c++
        AssignedTo: unassigned@gcc.gnu.org
        ReportedBy: benedict.geihe@ins.uni-bonn.de


Dear experts,

I got here from the gcc-help mailing list but have not submitted any bug
reports before. So I hope for your patience.

In a self-written library used for numerical computations we have some typical
programs serving as benchmarks for new compiler versions or optimization flags.
When gcc-4.6 was released we noticed a performance breakdown. The problem
persisted with gcc-4.7. I tried to produce a minimal stand-alone example and
followed the instructions at http://gcc.gnu.org/bugs/minimize.html. However,
because std::vector is included, I was not able to arrive at a really small
file.

What you see at the end of the file is just a matrix-vector multiplication
repeated 1000 times. However, the matrix has a highly specific structure that
is encountered when performing numerical computations using the Finite Element
Method (FEM), i.e.:

std::vector<MinimalVec3> rows[9];

Thus it consists of 9 bands of triples of doubles. The length of each band
corresponds to the length of the vector it is applied to.
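
A minimal sketch of how such a banded matrix could be applied to a vector is
given here for orientation. It is not the attached test case: the struct
BandMatrix, the function apply, the per-band column offsets, and the rule that
entry k of a triple multiplies x[i + offset + k - 1] are all assumptions made
for illustration. Only the name MinimalVec3 and the "9 bands of triples"
layout come from the description above.

```cpp
#include <cstddef>
#include <vector>

// Stand-in for the report's MinimalVec3: one triple of doubles.
struct MinimalVec3 {
    double v[3];
    double operator[](std::size_t k) const { return v[k]; }
};

// Hypothetical container: 9 bands with one triple per row, plus an assumed
// column offset per band (the offsets are not given in the report).
struct BandMatrix {
    std::vector<MinimalVec3> rows[9];
    long offset[9];
};

// y = A * x. Entry k of the triple rows[b][i] is assumed to multiply
// x[i + offset[b] + k - 1]; columns outside the vector are skipped.
// Every band is expected to hold exactly x.size() triples.
std::vector<double> apply(const BandMatrix& A, const std::vector<double>& x)
{
    const long n = static_cast<long>(x.size());
    std::vector<double> y(x.size(), 0.0);
    for (int b = 0; b < 9; ++b) {
        for (long i = 0; i < n; ++i) {
            for (int k = 0; k < 3; ++k) {
                const long col = i + A.offset[b] + k - 1;
                if (col >= 0 && col < n)
                    y[i] += A.rows[b][i][k] * x[col];
            }
        }
    }
    return y;
}
```

The innermost loop over k has a fixed trip count of 3, which matches the
three-element loops discussed later in this thread.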


Compiling with gcc-4.5.0 (our current standard), the 'time' command gives:
  real    1m32.606s
Using gcc-4.7.0 we get:
  real    2m6.923s
After removing the member variable "double stuff" in "class MinimalVector" and
using gcc-4.7.0 we get:
  real    1m27.354s


Using a C array instead of std::vector above resolves this issue.


The specifications of the two compilers used are:

Using built-in specs.
COLLECT_GCC=/home/prog/gcc-4.5.0-64/bin/g++
COLLECT_LTO_WRAPPER=/home/prog/gcc-4.5.0-64/libexec/gcc/x86_64-unknown-linux-gnu/4.5.0/lto-wrapper
Target: x86_64-unknown-linux-gnu
Configured with: ./configure --prefix=/home/prog/gcc-4.5.0-64/
--enable-languages=c,c++,fortran --disable-multilib --enable-lto
--with-libelf=/home/prog/libelf-64/ --with-ppl=/home/prog/ppl-64/
--with-cloog=/home/prog/cloog-ppl-64/
Thread model: posix
gcc version 4.5.0 (GCC)

and

Using built-in specs.
COLLECT_GCC=/home/prog/gcc-4.7.0-64/bin/g++
COLLECT_LTO_WRAPPER=/home/prog/gcc-4.7.0-64/libexec/gcc/x86_64-unknown-linux-gnu/4.7.0/lto-wrapper
Target: x86_64-unknown-linux-gnu
Configured with: ../gcc-4.7.0/configure --prefix=/home/prog/gcc-4.7.0-64/
--enable-languages=c,c++,fortran --with-gmp=/home/prog/gmp-5.0.4-64/
--with-ppl=/home/prog/ppl-0.12-64/ --enable-cloog-backend=isl
--with-cloog=/home/prog/cloog-0.16.3-64/ --disable-multilib
--enable-libstdcxx-debug
Thread model: posix
gcc version 4.7.0 (GCC)

They have been compiled manually on a machine running openSUSE 11.3.


The command line was: g++ -O2 -o minmvmult minmvmult.ii


There were no warnings or error messages.



We'd be grateful for any suggestions.
Best regards
Benedict



* [Bug c++/54000] Performance breakdown for gcc-4.{6,7} vs. gcc-4.5 using std::vector in matrix vector multiplication
  2012-07-17 15:59 [Bug c++/54000] New: Performance breakdown for gcc-4.{6,7} vs. gcc-4.5 using std::vector in matrix vector multiplication benedict.geihe at ins dot uni-bonn.de
@ 2012-07-17 16:01 ` benedict.geihe at ins dot uni-bonn.de
  2012-07-17 16:20 ` [Bug c++/54000] [4.6/4.7/4.8 Regression] " redi at gcc dot gnu.org
                   ` (15 subsequent siblings)
  16 siblings, 0 replies; 18+ messages in thread
From: benedict.geihe at ins dot uni-bonn.de @ 2012-07-17 16:01 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54000

--- Comment #1 from Benedict Geihe <benedict.geihe at ins dot uni-bonn.de> 2012-07-17 16:01:08 UTC ---
Created attachment 27816
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=27816
preprocessed minimal example



* [Bug c++/54000] [4.6/4.7/4.8 Regression] Performance breakdown for gcc-4.{6,7} vs. gcc-4.5 using std::vector in matrix vector multiplication
  2012-07-17 15:59 [Bug c++/54000] New: Performance breakdown for gcc-4.{6,7} vs. gcc-4.5 using std::vector in matrix vector multiplication benedict.geihe at ins dot uni-bonn.de
  2012-07-17 16:01 ` [Bug c++/54000] " benedict.geihe at ins dot uni-bonn.de
@ 2012-07-17 16:20 ` redi at gcc dot gnu.org
  2012-07-17 20:14 ` hjl.tools at gmail dot com
                   ` (14 subsequent siblings)
  16 siblings, 0 replies; 18+ messages in thread
From: redi at gcc dot gnu.org @ 2012-07-17 16:20 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54000

Jonathan Wakely <redi at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2012-07-17
      Known to work|                            |4.5.2
            Summary|Performance breakdown for   |[4.6/4.7/4.8 Regression]
                   |gcc-4.{6,7} vs. gcc-4.5     |Performance breakdown for
                   |using std::vector in matrix |gcc-4.{6,7} vs. gcc-4.5
                   |vector multiplication       |using std::vector in matrix
                   |                            |vector multiplication
     Ever Confirmed|0                           |1

--- Comment #2 from Jonathan Wakely <redi at gcc dot gnu.org> 2012-07-17 16:19:42 UTC ---
confirmed



* [Bug c++/54000] [4.6/4.7/4.8 Regression] Performance breakdown for gcc-4.{6,7} vs. gcc-4.5 using std::vector in matrix vector multiplication
  2012-07-17 15:59 [Bug c++/54000] New: Performance breakdown for gcc-4.{6,7} vs. gcc-4.5 using std::vector in matrix vector multiplication benedict.geihe at ins dot uni-bonn.de
  2012-07-17 16:01 ` [Bug c++/54000] " benedict.geihe at ins dot uni-bonn.de
  2012-07-17 16:20 ` [Bug c++/54000] [4.6/4.7/4.8 Regression] " redi at gcc dot gnu.org
@ 2012-07-17 20:14 ` hjl.tools at gmail dot com
  2012-07-17 21:47 ` [Bug tree-optimization/54000] " jason at gcc dot gnu.org
                   ` (13 subsequent siblings)
  16 siblings, 0 replies; 18+ messages in thread
From: hjl.tools at gmail dot com @ 2012-07-17 20:14 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54000

H.J. Lu <hjl.tools at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |hubicka at gcc dot gnu.org

--- Comment #3 from H.J. Lu <hjl.tools at gmail dot com> 2012-07-17 20:14:18 UTC ---
It is caused by revision 172873:

http://gcc.gnu.org/ml/gcc-cvs/2011-04/msg01069.html



* [Bug tree-optimization/54000] [4.6/4.7/4.8 Regression] Performance breakdown for gcc-4.{6,7} vs. gcc-4.5 using std::vector in matrix vector multiplication
  2012-07-17 15:59 [Bug c++/54000] New: Performance breakdown for gcc-4.{6,7} vs. gcc-4.5 using std::vector in matrix vector multiplication benedict.geihe at ins dot uni-bonn.de
                   ` (2 preceding siblings ...)
  2012-07-17 20:14 ` hjl.tools at gmail dot com
@ 2012-07-17 21:47 ` jason at gcc dot gnu.org
  2012-07-18  8:41 ` rguenth at gcc dot gnu.org
                   ` (12 subsequent siblings)
  16 siblings, 0 replies; 18+ messages in thread
From: jason at gcc dot gnu.org @ 2012-07-17 21:47 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54000

Jason Merrill <jason at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jason at gcc dot gnu.org
          Component|c++                         |tree-optimization

--- Comment #4 from Jason Merrill <jason at gcc dot gnu.org> 2012-07-17 21:46:52 UTC ---
So, seems like an inlining heuristics issue.  Changing category to
tree-optimization.



* [Bug tree-optimization/54000] [4.6/4.7/4.8 Regression] Performance breakdown for gcc-4.{6,7} vs. gcc-4.5 using std::vector in matrix vector multiplication
  2012-07-17 15:59 [Bug c++/54000] New: Performance breakdown for gcc-4.{6,7} vs. gcc-4.5 using std::vector in matrix vector multiplication benedict.geihe at ins dot uni-bonn.de
                   ` (3 preceding siblings ...)
  2012-07-17 21:47 ` [Bug tree-optimization/54000] " jason at gcc dot gnu.org
@ 2012-07-18  8:41 ` rguenth at gcc dot gnu.org
  2012-07-24  9:53 ` benedict.geihe at ins dot uni-bonn.de
                   ` (11 subsequent siblings)
  16 siblings, 0 replies; 18+ messages in thread
From: rguenth at gcc dot gnu.org @ 2012-07-18  8:41 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54000

Richard Guenther <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |missed-optimization
   Target Milestone|---                         |4.6.4



* [Bug tree-optimization/54000] [4.6/4.7/4.8 Regression] Performance breakdown for gcc-4.{6,7} vs. gcc-4.5 using std::vector in matrix vector multiplication
  2012-07-17 15:59 [Bug c++/54000] New: Performance breakdown for gcc-4.{6,7} vs. gcc-4.5 using std::vector in matrix vector multiplication benedict.geihe at ins dot uni-bonn.de
                   ` (4 preceding siblings ...)
  2012-07-18  8:41 ` rguenth at gcc dot gnu.org
@ 2012-07-24  9:53 ` benedict.geihe at ins dot uni-bonn.de
  2012-09-05  9:30 ` benedict.geihe at ins dot uni-bonn.de
                   ` (10 subsequent siblings)
  16 siblings, 0 replies; 18+ messages in thread
From: benedict.geihe at ins dot uni-bonn.de @ 2012-07-24  9:53 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54000

--- Comment #5 from Benedict Geihe <benedict.geihe at ins dot uni-bonn.de> 2012-07-24 09:52:42 UTC ---
Thank you all for your quick replies.
Please let us know if we can further assist you.



* [Bug tree-optimization/54000] [4.6/4.7/4.8 Regression] Performance breakdown for gcc-4.{6,7} vs. gcc-4.5 using std::vector in matrix vector multiplication
  2012-07-17 15:59 [Bug c++/54000] New: Performance breakdown for gcc-4.{6,7} vs. gcc-4.5 using std::vector in matrix vector multiplication benedict.geihe at ins dot uni-bonn.de
                   ` (5 preceding siblings ...)
  2012-07-24  9:53 ` benedict.geihe at ins dot uni-bonn.de
@ 2012-09-05  9:30 ` benedict.geihe at ins dot uni-bonn.de
  2012-09-05  9:32 ` benedict.geihe at ins dot uni-bonn.de
                   ` (9 subsequent siblings)
  16 siblings, 0 replies; 18+ messages in thread
From: benedict.geihe at ins dot uni-bonn.de @ 2012-09-05  9:30 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54000

--- Comment #6 from Benedict Geihe <benedict.geihe at ins dot uni-bonn.de> 2012-09-05 09:30:05 UTC ---
I originally reported that using a C array instead of STL's vector solves the
problem. I am afraid that was wrong. I also cannot remember what led me to
this conclusion.

Anyway, I have attached a new minimal example without STL stuff. I hope it helps.



* [Bug tree-optimization/54000] [4.6/4.7/4.8 Regression] Performance breakdown for gcc-4.{6,7} vs. gcc-4.5 using std::vector in matrix vector multiplication
  2012-07-17 15:59 [Bug c++/54000] New: Performance breakdown for gcc-4.{6,7} vs. gcc-4.5 using std::vector in matrix vector multiplication benedict.geihe at ins dot uni-bonn.de
                   ` (6 preceding siblings ...)
  2012-09-05  9:30 ` benedict.geihe at ins dot uni-bonn.de
@ 2012-09-05  9:32 ` benedict.geihe at ins dot uni-bonn.de
  2012-09-07 10:00 ` [Bug tree-optimization/54000] [4.6/4.7/4.8 Regression][IVOPTS] " rguenth at gcc dot gnu.org
                   ` (8 subsequent siblings)
  16 siblings, 0 replies; 18+ messages in thread
From: benedict.geihe at ins dot uni-bonn.de @ 2012-09-05  9:32 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54000

Benedict Geihe <benedict.geihe at ins dot uni-bonn.de> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #27816|0                           |1
        is obsolete|                            |

--- Comment #7 from Benedict Geihe <benedict.geihe at ins dot uni-bonn.de> 2012-09-05 09:32:27 UTC ---
Created attachment 28132
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=28132
preprocessed minimal example without std::vector



* [Bug tree-optimization/54000] [4.6/4.7/4.8 Regression][IVOPTS] Performance breakdown for gcc-4.{6,7} vs. gcc-4.5 using std::vector in matrix vector multiplication
  2012-07-17 15:59 [Bug c++/54000] New: Performance breakdown for gcc-4.{6,7} vs. gcc-4.5 using std::vector in matrix vector multiplication benedict.geihe at ins dot uni-bonn.de
                   ` (7 preceding siblings ...)
  2012-09-05  9:32 ` benedict.geihe at ins dot uni-bonn.de
@ 2012-09-07 10:00 ` rguenth at gcc dot gnu.org
  2013-02-22 23:41 ` [Bug tree-optimization/54000] [4.6/4.7/4.8 Regression] Performance breakdown for gcc-4.{6,7} vs. gcc-4.5 using std::vector in matrix vector multiplication (IVopts / inliner) steven at gcc dot gnu.org
                   ` (7 subsequent siblings)
  16 siblings, 0 replies; 18+ messages in thread
From: rguenth at gcc dot gnu.org @ 2012-09-07 10:00 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54000

Richard Guenther <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Priority|P3                          |P2
            Summary|[4.6/4.7/4.8 Regression]    |[4.6/4.7/4.8
                   |Performance breakdown for   |Regression][IVOPTS]
                   |gcc-4.{6,7} vs. gcc-4.5     |Performance breakdown for
                   |using std::vector in matrix |gcc-4.{6,7} vs. gcc-4.5
                   |vector multiplication       |using std::vector in matrix
                   |                            |vector multiplication
      Known to fail|                            |4.7.1, 4.8.0

--- Comment #8 from Richard Guenther <rguenth at gcc dot gnu.org> 2012-09-07 10:00:27 UTC ---
Thanks for the reduced testcase.  The innermost loops compare as follows:

4.5:

.L7:
        movsd   (%rbx,%rcx), %xmm0
        addq    $8, %rcx
        mulsd   0(%rbp,%rdx), %xmm0
        addq    $8, %rdx
        cmpq    $24, %rdx
        addsd   %xmm0, %xmm1
        movsd   %xmm1, (%rsi)
        jne     .L7

4.7:

.L13:
        movq    64(%rsp), %rdi
        movq    80(%rsp), %rdx
        addq    %rcx, %rdi
        addq    %r8, %rdx
        movsd   -8(%rax,%rdi), %xmm0
        mulsd   (%rsi,%rax), %xmm0
        addq    $8, %rax
        cmpq    $24, %rax
        addsd   (%rdx), %xmm0
        movsd   %xmm0, (%rdx)
        jne     .L13

so we seem to have a register allocation / spilling issue here as well
as a bad induction variable choice.  GCC 4.8 is not any better here.
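
A source-level analogue of the two loop shapes may make the difference easier
to see. This is only a sketch under assumptions: the function names are
hypothetical, the reduced testcase is not reproduced here, and a compiler is
free to generate identical code for both variants. The first variant keeps the
running sum in a local, as the 4.5 loop keeps it in %xmm1; the second reads
and writes the destination on every iteration, as the 4.7 loop does with
(%rdx).

```cpp
#include <cstddef>

// Shape of the 4.5 loop: the running sum stays in a local variable and is
// written back to the destination once the three products are added.
inline void accumulate_in_local(const double* a, const double* x, double* y)
{
    double sum = *y;
    for (std::size_t k = 0; k < 3; ++k)
        sum += a[k] * x[k];
    *y = sum;
}

// Shape of the 4.7 loop: the destination is loaded, updated, and stored
// again on every iteration.
inline void accumulate_through_memory(const double* a, const double* x, double* y)
{
    for (std::size_t k = 0; k < 3; ++k)
        *y += a[k] * x[k];
}
```

Both functions compute the same value. The 4.7 listing additionally reloads
two addresses from the stack (64(%rsp) and 80(%rsp)) on each iteration, which
has no direct source-level counterpart in this sketch.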



* [Bug tree-optimization/54000] [4.6/4.7/4.8 Regression] Performance breakdown for gcc-4.{6,7} vs. gcc-4.5 using std::vector in matrix vector multiplication (IVopts / inliner)
  2012-07-17 15:59 [Bug c++/54000] New: Performance breakdown for gcc-4.{6,7} vs. gcc-4.5 using std::vector in matrix vector multiplication benedict.geihe at ins dot uni-bonn.de
                   ` (8 preceding siblings ...)
  2012-09-07 10:00 ` [Bug tree-optimization/54000] [4.6/4.7/4.8 Regression][IVOPTS] " rguenth at gcc dot gnu.org
@ 2013-02-22 23:41 ` steven at gcc dot gnu.org
  2013-02-22 23:42 ` steven at gcc dot gnu.org
                   ` (6 subsequent siblings)
  16 siblings, 0 replies; 18+ messages in thread
From: steven at gcc dot gnu.org @ 2013-02-22 23:41 UTC (permalink / raw)
  To: gcc-bugs


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54000

Steven Bosscher <steven at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |rguenth at gcc dot gnu.org
            Summary|[4.6/4.7/4.8                |[4.6/4.7/4.8 Regression]
                   |Regression][IVOPTS]         |Performance breakdown for
                   |Performance breakdown for   |gcc-4.{6,7} vs. gcc-4.5
                   |gcc-4.{6,7} vs. gcc-4.5     |using std::vector in matrix
                   |using std::vector in matrix |vector multiplication
                   |vector multiplication       |(IVopts / inliner)

--- Comment #9 from Steven Bosscher <steven at gcc dot gnu.org> 2013-02-22 23:40:35 UTC ---
(In reply to comment #8)
> Thanks for the reduced testcase.  The innermost loops compare as follows:
> 
> 4.5:
> 
> .L7:
>         movsd   (%rbx,%rcx), %xmm0
>         addq    $8, %rcx
>         mulsd   0(%rbp,%rdx), %xmm0
>         addq    $8, %rdx
>         cmpq    $24, %rdx
>         addsd   %xmm0, %xmm1
>         movsd   %xmm1, (%rsi)
>         jne     .L7

4.8 r196182 with "--param early-inlining-insns=2" (2 x the default value):

.L13:   
        movsd   (%rdx), %xmm0
        addq    $8, %rdx
        mulsd   (%rsi,%rax), %xmm0
        addq    $8, %rax
        cmpq    $24, %rax
        addsd   %xmm0, %xmm1
        movsd   %xmm1, 8(%rdi,%rcx)
        jne     .L13


> 
> 4.7:
> 
> .L13:
>         movq    64(%rsp), %rdi
>         movq    80(%rsp), %rdx
>         addq    %rcx, %rdi
>         addq    %r8, %rdx
>         movsd   -8(%rax,%rdi), %xmm0
>         mulsd   (%rsi,%rax), %xmm0
>         addq    $8, %rax
>         cmpq    $24, %rax
>         addsd   (%rdx), %xmm0
>         movsd   %xmm0, (%rdx)
>         jne     .L13

This is similar to what 4.8 r196182 produces without inliner tweaks:

.L18:   
        movq    %rcx, %rdi
        addq    64(%rsp), %rdi
        movq    %r8, %rdx
        addq    80(%rsp), %rdx
        movsd   -8(%rax,%rdi), %xmm0
        mulsd   (%rsi,%rax), %xmm0
        addq    $8, %rax
        cmpq    $24, %rax
        addsd   (%rdx), %xmm0
        movsd   %xmm0, (%rdx)
        jne     .L18


> so we seem to have a register allocation / spilling issue here as well
> as a bad induction variable choice.  GCC 4.8 is not any better here.

All true, but in the end it looks like an inliner heuristics issue first
(as also suggested by comment #3).



* [Bug tree-optimization/54000] [4.6/4.7/4.8 Regression] Performance breakdown for gcc-4.{6,7} vs. gcc-4.5 using std::vector in matrix vector multiplication (IVopts / inliner)
  2012-07-17 15:59 [Bug c++/54000] New: Performance breakdown for gcc-4.{6,7} vs. gcc-4.5 using std::vector in matrix vector multiplication benedict.geihe at ins dot uni-bonn.de
                   ` (9 preceding siblings ...)
  2013-02-22 23:41 ` [Bug tree-optimization/54000] [4.6/4.7/4.8 Regression] Performance breakdown for gcc-4.{6,7} vs. gcc-4.5 using std::vector in matrix vector multiplication (IVopts / inliner) steven at gcc dot gnu.org
@ 2013-02-22 23:42 ` steven at gcc dot gnu.org
  2013-04-12 15:17 ` [Bug tree-optimization/54000] [4.7/4.8/4.9 " jakub at gcc dot gnu.org
                   ` (5 subsequent siblings)
  16 siblings, 0 replies; 18+ messages in thread
From: steven at gcc dot gnu.org @ 2013-02-22 23:42 UTC (permalink / raw)
  To: gcc-bugs


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54000

--- Comment #10 from Steven Bosscher <steven at gcc dot gnu.org> 2013-02-22 23:42:03 UTC ---
(In reply to comment #9)

> 4.8 r196182 with "--param early-inlining-insns=2" (2 x the default value):

"--param early-inlining-insns=22"



* [Bug tree-optimization/54000] [4.7/4.8/4.9 Regression] Performance breakdown for gcc-4.{6,7} vs. gcc-4.5 using std::vector in matrix vector multiplication (IVopts / inliner)
  2012-07-17 15:59 [Bug c++/54000] New: Performance breakdown for gcc-4.{6,7} vs. gcc-4.5 using std::vector in matrix vector multiplication benedict.geihe at ins dot uni-bonn.de
                   ` (10 preceding siblings ...)
  2013-02-22 23:42 ` steven at gcc dot gnu.org
@ 2013-04-12 15:17 ` jakub at gcc dot gnu.org
  2014-06-12 13:48 ` [Bug tree-optimization/54000] [4.7/4.8/4.9/4.10 " rguenth at gcc dot gnu.org
                   ` (4 subsequent siblings)
  16 siblings, 0 replies; 18+ messages in thread
From: jakub at gcc dot gnu.org @ 2013-04-12 15:17 UTC (permalink / raw)
  To: gcc-bugs


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54000

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|4.6.4                       |4.7.4

--- Comment #11 from Jakub Jelinek <jakub at gcc dot gnu.org> 2013-04-12 15:16:47 UTC ---
GCC 4.6.4 has been released and the branch has been closed.



* [Bug tree-optimization/54000] [4.7/4.8/4.9/4.10 Regression] Performance breakdown for gcc-4.{6,7} vs. gcc-4.5 using std::vector in matrix vector multiplication (IVopts / inliner)
  2012-07-17 15:59 [Bug c++/54000] New: Performance breakdown for gcc-4.{6,7} vs. gcc-4.5 using std::vector in matrix vector multiplication benedict.geihe at ins dot uni-bonn.de
                   ` (11 preceding siblings ...)
  2013-04-12 15:17 ` [Bug tree-optimization/54000] [4.7/4.8/4.9 " jakub at gcc dot gnu.org
@ 2014-06-12 13:48 ` rguenth at gcc dot gnu.org
  2014-12-19 13:32 ` [Bug tree-optimization/54000] [4.8/4.9/5 " jakub at gcc dot gnu.org
                   ` (3 subsequent siblings)
  16 siblings, 0 replies; 18+ messages in thread
From: rguenth at gcc dot gnu.org @ 2014-06-12 13:48 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54000

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|4.7.4                       |4.8.4

--- Comment #12 from Richard Biener <rguenth at gcc dot gnu.org> ---
The 4.7 branch is being closed, moving target milestone to 4.8.4.



* [Bug tree-optimization/54000] [4.8/4.9/5 Regression] Performance breakdown for gcc-4.{6,7} vs. gcc-4.5 using std::vector in matrix vector multiplication (IVopts / inliner)
  2012-07-17 15:59 [Bug c++/54000] New: Performance breakdown for gcc-4.{6,7} vs. gcc-4.5 using std::vector in matrix vector multiplication benedict.geihe at ins dot uni-bonn.de
                   ` (12 preceding siblings ...)
  2014-06-12 13:48 ` [Bug tree-optimization/54000] [4.7/4.8/4.9/4.10 " rguenth at gcc dot gnu.org
@ 2014-12-19 13:32 ` jakub at gcc dot gnu.org
  2015-02-09  8:46 ` [Bug tree-optimization/54000] [4.8/4.9 " rguenth at gcc dot gnu.org
                   ` (2 subsequent siblings)
  16 siblings, 0 replies; 18+ messages in thread
From: jakub at gcc dot gnu.org @ 2014-12-19 13:32 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54000

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|4.8.4                       |4.8.5

--- Comment #13 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
GCC 4.8.4 has been released.



* [Bug tree-optimization/54000] [4.8/4.9 Regression] Performance breakdown for gcc-4.{6,7} vs. gcc-4.5 using std::vector in matrix vector multiplication (IVopts / inliner)
  2012-07-17 15:59 [Bug c++/54000] New: Performance breakdown for gcc-4.{6,7} vs. gcc-4.5 using std::vector in matrix vector multiplication benedict.geihe at ins dot uni-bonn.de
                   ` (13 preceding siblings ...)
  2014-12-19 13:32 ` [Bug tree-optimization/54000] [4.8/4.9/5 " jakub at gcc dot gnu.org
@ 2015-02-09  8:46 ` rguenth at gcc dot gnu.org
  2015-02-09  8:57 ` [Bug tree-optimization/54000] [4.8 " rguenth at gcc dot gnu.org
  2015-02-09 11:51 ` rguenth at gcc dot gnu.org
  16 siblings, 0 replies; 18+ messages in thread
From: rguenth at gcc dot gnu.org @ 2015-02-09  8:46 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54000

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Target|                            |x86_64-*-*, i?86-*-*
      Known to work|                            |5.0
            Summary|[4.8/4.9/5 Regression]      |[4.8/4.9 Regression]
                   |Performance breakdown for   |Performance breakdown for
                   |gcc-4.{6,7} vs. gcc-4.5     |gcc-4.{6,7} vs. gcc-4.5
                   |using std::vector in matrix |using std::vector in matrix
                   |vector multiplication       |vector multiplication
                   |(IVopts / inliner)          |(IVopts / inliner)

--- Comment #14 from Richard Biener <rguenth at gcc dot gnu.org> ---
Btw, with trunk (gcc 5) I see

.L13:
        movsd   (%rdx), %xmm1
        xorl    %eax, %eax
.L12:
        movsd   -8(%rcx,%rax), %xmm0
        mulsd   (%rsi,%rax), %xmm0
        addq    $8, %rax
        cmpq    $24, %rax
        addsd   %xmm0, %xmm1
        movsd   %xmm1, (%rdx)
        jne     .L12
        addq    $8, %rdx
        addq    $8, %rcx
        addq    $24, %rsi
        cmpq    %rdi, %rdx
        jne     .L13

thus maybe even better than 4.5.

GCC 4.9 produces

.L17:
        leaq    (%r8,%rdx), %rcx
        movsd   8(%rdi,%rdx), %xmm1
        xorl    %eax, %eax
        addq    %r9, %rcx
.L14:
        movsd   -8(%rcx,%rax), %xmm0
        mulsd   (%rsi,%rax), %xmm0
        addq    $8, %rax
        cmpq    $24, %rax
        addsd   %xmm0, %xmm1
        movsd   %xmm1, 8(%rdi,%rdx)
        jne     .L14
        addq    $8, %rdx
        addq    $24, %rsi
        cmpq    $1016, %rdx
        jne     .L17

it might be again inliner changes that trigger the better behavior of course.

So - fixed in GCC 5.  Not sure how to produce a testcase that reliably
tracks good behavior here.  IVOPTs dumping should be improved somewhat.



* [Bug tree-optimization/54000] [4.8 Regression] Performance breakdown for gcc-4.{6,7} vs. gcc-4.5 using std::vector in matrix vector multiplication (IVopts / inliner)
  2012-07-17 15:59 [Bug c++/54000] New: Performance breakdown for gcc-4.{6,7} vs. gcc-4.5 using std::vector in matrix vector multiplication benedict.geihe at ins dot uni-bonn.de
                   ` (14 preceding siblings ...)
  2015-02-09  8:46 ` [Bug tree-optimization/54000] [4.8/4.9 " rguenth at gcc dot gnu.org
@ 2015-02-09  8:57 ` rguenth at gcc dot gnu.org
  2015-02-09 11:51 ` rguenth at gcc dot gnu.org
  16 siblings, 0 replies; 18+ messages in thread
From: rguenth at gcc dot gnu.org @ 2015-02-09  8:57 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54000

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
      Known to work|                            |4.9.0
         Resolution|---                         |FIXED
   Target Milestone|4.8.5                       |4.9.0
            Summary|[4.8/4.9 Regression]        |[4.8 Regression]
                   |Performance breakdown for   |Performance breakdown for
                   |gcc-4.{6,7} vs. gcc-4.5     |gcc-4.{6,7} vs. gcc-4.5
                   |using std::vector in matrix |using std::vector in matrix
                   |vector multiplication       |vector multiplication
                   |(IVopts / inliner)          |(IVopts / inliner)

--- Comment #15 from Richard Biener <rguenth at gcc dot gnu.org> ---
Timing-wise GCC 4.9 looks good as well.  With GCC 4.8 the testcase takes
2:30 to execute, with GCC 4.9 it takes 1:09, and with GCC 5 finally 1:06.
GCC 4.9 also chooses a single IV for the innermost loop.
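
As an illustration of what "a single IV" means at the source level, the sketch
below writes the same three-element product sum twice. The function names are
hypothetical and the code is not taken from the testcase; which form the
compiler actually emits is decided by the IVOPTS pass, not by how the loop is
spelled in the source.

```cpp
#include <cstddef>

// Two induction variables: each array has its own pointer that is advanced
// separately, comparable to the separate %rcx and %rdx increments in the
// 4.5 listing from comment #8.
inline double dot3_two_ivs(const double* a, const double* x)
{
    double sum = 0.0;
    const double* pa = a;
    const double* px = x;
    for (; pa != a + 3; ++pa, ++px)
        sum += *pa * *px;
    return sum;
}

// Single induction variable: one index addresses both arrays, comparable to
// the %rax offset shared by both memory operands in the 4.9 and GCC 5
// listings from comment #14.
inline double dot3_single_iv(const double* a, const double* x)
{
    double sum = 0.0;
    for (std::size_t k = 0; k != 3; ++k)
        sum += a[k] * x[k];
    return sum;
}
```

Both functions return the same result; the single-IV form needs one increment
per iteration instead of two.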

Declaring fixed.



* [Bug tree-optimization/54000] [4.8 Regression] Performance breakdown for gcc-4.{6,7} vs. gcc-4.5 using std::vector in matrix vector multiplication (IVopts / inliner)
  2012-07-17 15:59 [Bug c++/54000] New: Performance breakdown for gcc-4.{6,7} vs. gcc-4.5 using std::vector in matrix vector multiplication benedict.geihe at ins dot uni-bonn.de
                   ` (15 preceding siblings ...)
  2015-02-09  8:57 ` [Bug tree-optimization/54000] [4.8 " rguenth at gcc dot gnu.org
@ 2015-02-09 11:51 ` rguenth at gcc dot gnu.org
  16 siblings, 0 replies; 18+ messages in thread
From: rguenth at gcc dot gnu.org @ 2015-02-09 11:51 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54000

--- Comment #16 from Richard Biener <rguenth at gcc dot gnu.org> ---
Author: rguenth
Date: Mon Feb  9 11:51:05 2015
New Revision: 220536

URL: https://gcc.gnu.org/viewcvs?rev=220536&root=gcc&view=rev
Log:
2015-02-09  Richard Biener  <rguenther@suse.de>

    PR tree-optimization/54000
    * tree-ssa-loop-ivopts.c: Include tree-vectorizer.h.
    (struct ivopts_data): Add loop_loc member.
    (tree_ssa_iv_optimize_loop): Dump loop location.
    (create_new_ivs): Likewise, also dump number of IVs generated.

    * g++.dg/tree-ssa/ivopts-3.C: New testcase.

Added:
    trunk/gcc/testsuite/g++.dg/tree-ssa/ivopts-3.C
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/testsuite/ChangeLog
    trunk/gcc/tree-ssa-loop-ivopts.c



