public inbox for gcc-bugs@sourceware.org
* [Bug c++/54000] New: Performance breakdown for gcc-4.{6,7} vs. gcc-4.5 using std::vector in matrix vector multiplication
From: benedict.geihe at ins dot uni-bonn.de @ 2012-07-17 15:59 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54000
Bug #: 54000
Summary: Performance breakdown for gcc-4.{6,7} vs. gcc-4.5
using std::vector in matrix vector multiplication
Classification: Unclassified
Product: gcc
Version: 4.7.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: c++
AssignedTo: unassigned@gcc.gnu.org
ReportedBy: benedict.geihe@ins.uni-bonn.de
Dear experts,
I got here from the gcc-help mailing list but have not submitted any bug
reports before. So I hope for your patience.
In a self-written library used for numerical computations we have some typical
programs serving as benchmarks for new compiler versions or optimization flags.
When gcc-4.6 was released we noticed a performance breakdown. The problem
persisted with gcc-4.7. I tried to produce a minimal stand-alone example and
followed the instructions at http://gcc.gnu.org/bugs/minimize.html. As
std::vector is included I was however not able to arrive at a really small
file.
What you see at the end of the file is just a matrix-vector multiplication
repeated 1000 times. However, the matrix has a highly specific structure which
is encountered when performing numerical computations using the Finite Element
Method (FEM), i.e.:
std::vector<MinimalVec3> rows[9];
Thus it consists of 9 bands of triples of doubles. The length of each band
corresponds to the length of the vector it is applied to.
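For readers without the attachment, the band structure described above can be sketched roughly as follows. This is only an illustration under assumptions: `Triple` stands in for the report's MinimalVec3, and the `offset` array (which column the first entry of each triple multiplies) is a hypothetical choice, since the actual index mapping is only visible in the attached testcase.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the report's MinimalVec3: a triple of doubles.
struct Triple {
    double v[3];
    double operator[](std::size_t k) const { return v[k]; }
};

// Assumed layout: rows[b][i] holds the three entries of band b in row i,
// and offset[b] is the column of the first of those entries relative to i.
struct BandMatrix {
    std::vector<Triple> rows[9];
    std::ptrdiff_t offset[9];
};

// Computes y = A * x; entries that would fall outside the vector are skipped.
inline void apply(const BandMatrix& A, const std::vector<double>& x,
                  std::vector<double>& y)
{
    assert(y.size() == x.size());
    const std::ptrdiff_t n = static_cast<std::ptrdiff_t>(x.size());
    for (std::ptrdiff_t i = 0; i < n; ++i) {
        double sum = 0.0;
        for (int b = 0; b < 9; ++b) {
            const Triple& t = A.rows[b][i];
            // Innermost loop over the three entries of one triple.
            for (int k = 0; k < 3; ++k) {
                const std::ptrdiff_t col = i + A.offset[b] + k;
                if (col >= 0 && col < n)
                    sum += t[k] * x[col];
            }
        }
        y[i] = sum;
    }
}
```

The benchmark in the report would then correspond to calling something like `apply` 1000 times on a large vector.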
Compiling with gcc-4.5.0 (our current standard), the 'time' command gives:
real 1m32.606s
Using gcc-4.7.0 we get:
real 2m6.923s
When removing member variable "double stuff" in "class MinimalVector" and using
gcc-4.7.0 we get:
real 1m27.354s
Using a C array instead of std::vector above resolves this issue.
The specifications of the two compilers used are:
Using built-in specs.
COLLECT_GCC=/home/prog/gcc-4.5.0-64/bin/g++
COLLECT_LTO_WRAPPER=/home/prog/gcc-4.5.0-64/libexec/gcc/x86_64-unknown-linux-gnu/4.5.0/lto-wrapper
Target: x86_64-unknown-linux-gnu
Configured with: ./configure --prefix=/home/prog/gcc-4.5.0-64/
--enable-languages=c,c++,fortran --disable-multilib --enable-lto
--with-libelf=/home/prog/libelf-64/ --with-ppl=/home/prog/ppl-64/
--with-cloog=/home/prog/cloog-ppl-64/
Thread model: posix
gcc version 4.5.0 (GCC)
and
Using built-in specs.
COLLECT_GCC=/home/prog/gcc-4.7.0-64/bin/g++
COLLECT_LTO_WRAPPER=/home/prog/gcc-4.7.0-64/libexec/gcc/x86_64-unknown-linux-gnu/4.7.0/lto-wrapper
Target: x86_64-unknown-linux-gnu
Configured with: ../gcc-4.7.0/configure --prefix=/home/prog/gcc-4.7.0-64/
--enable-languages=c,c++,fortran --with-gmp=/home/prog/gmp-5.0.4-64/
--with-ppl=/home/prog/ppl-0.12-64/ --enable-cloog-backend=isl
--with-cloog=/home/prog/cloog-0.16.3-64/ --disable-multilib
--enable-libstdcxx-debug
Thread model: posix
gcc version 4.7.0 (GCC)
They have been compiled manually on a machine running openSuse 11.3.
The command line was: g++ -O2 -o minmvmult minmvmult.ii
There were no warnings or error messages.
We'd be grateful for any suggestions.
Best regards
Benedict
* [Bug c++/54000] Performance breakdown for gcc-4.{6,7} vs. gcc-4.5 using std::vector in matrix vector multiplication
From: benedict.geihe at ins dot uni-bonn.de @ 2012-07-17 16:01 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54000
--- Comment #1 from Benedict Geihe <benedict.geihe at ins dot uni-bonn.de> 2012-07-17 16:01:08 UTC ---
Created attachment 27816
--> http://gcc.gnu.org/bugzilla/attachment.cgi?id=27816
preprocessed minimal example
* [Bug c++/54000] [4.6/4.7/4.8 Regression] Performance breakdown for gcc-4.{6,7} vs. gcc-4.5 using std::vector in matrix vector multiplication
From: redi at gcc dot gnu.org @ 2012-07-17 16:20 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54000
Jonathan Wakely <redi at gcc dot gnu.org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2012-07-17
      Known to work|                            |4.5.2
            Summary|Performance breakdown for   |[4.6/4.7/4.8 Regression]
                   |gcc-4.{6,7} vs. gcc-4.5     |Performance breakdown for
                   |using std::vector in matrix |gcc-4.{6,7} vs. gcc-4.5
                   |vector multiplication       |using std::vector in matrix
                   |                            |vector multiplication
     Ever Confirmed|0                           |1
--- Comment #2 from Jonathan Wakely <redi at gcc dot gnu.org> 2012-07-17 16:19:42 UTC ---
confirmed
* [Bug c++/54000] [4.6/4.7/4.8 Regression] Performance breakdown for gcc-4.{6,7} vs. gcc-4.5 using std::vector in matrix vector multiplication
From: hjl.tools at gmail dot com @ 2012-07-17 20:14 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54000
H.J. Lu <hjl.tools at gmail dot com> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |hubicka at gcc dot gnu.org
--- Comment #3 from H.J. Lu <hjl.tools at gmail dot com> 2012-07-17 20:14:18 UTC ---
It is caused by revision 172873:
http://gcc.gnu.org/ml/gcc-cvs/2011-04/msg01069.html
* [Bug tree-optimization/54000] [4.6/4.7/4.8 Regression] Performance breakdown for gcc-4.{6,7} vs. gcc-4.5 using std::vector in matrix vector multiplication
From: jason at gcc dot gnu.org @ 2012-07-17 21:47 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54000
Jason Merrill <jason at gcc dot gnu.org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jason at gcc dot gnu.org
          Component|c++                         |tree-optimization
--- Comment #4 from Jason Merrill <jason at gcc dot gnu.org> 2012-07-17 21:46:52 UTC ---
So, seems like an inlining heuristics issue. Changing category to
tree-optimization.
* [Bug tree-optimization/54000] [4.6/4.7/4.8 Regression] Performance breakdown for gcc-4.{6,7} vs. gcc-4.5 using std::vector in matrix vector multiplication
From: rguenth at gcc dot gnu.org @ 2012-07-18 8:41 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54000
Richard Guenther <rguenth at gcc dot gnu.org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |missed-optimization
   Target Milestone|---                         |4.6.4
* [Bug tree-optimization/54000] [4.6/4.7/4.8 Regression] Performance breakdown for gcc-4.{6,7} vs. gcc-4.5 using std::vector in matrix vector multiplication
From: benedict.geihe at ins dot uni-bonn.de @ 2012-07-24 9:53 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54000
--- Comment #5 from Benedict Geihe <benedict.geihe at ins dot uni-bonn.de> 2012-07-24 09:52:42 UTC ---
Thank you all for your quick replies.
Please let us know if we can further assist you.
* [Bug tree-optimization/54000] [4.6/4.7/4.8 Regression] Performance breakdown for gcc-4.{6,7} vs. gcc-4.5 using std::vector in matrix vector multiplication
From: benedict.geihe at ins dot uni-bonn.de @ 2012-09-05 9:30 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54000
--- Comment #6 from Benedict Geihe <benedict.geihe at ins dot uni-bonn.de> 2012-09-05 09:30:05 UTC ---
I originally reported that using a C array instead of STL's vector solves the
problem. I am afraid that was wrong. I also cannot remember what led me to
this conclusion.
Anyway, I attached a new minimal example without STL stuff. Hope it helps.
* [Bug tree-optimization/54000] [4.6/4.7/4.8 Regression] Performance breakdown for gcc-4.{6,7} vs. gcc-4.5 using std::vector in matrix vector multiplication
From: benedict.geihe at ins dot uni-bonn.de @ 2012-09-05 9:32 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54000
Benedict Geihe <benedict.geihe at ins dot uni-bonn.de> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #27816|0                           |1
        is obsolete|                            |
--- Comment #7 from Benedict Geihe <benedict.geihe at ins dot uni-bonn.de> 2012-09-05 09:32:27 UTC ---
Created attachment 28132
--> http://gcc.gnu.org/bugzilla/attachment.cgi?id=28132
preprocessed minimal example without std::vector
* [Bug tree-optimization/54000] [4.6/4.7/4.8 Regression][IVOPTS] Performance breakdown for gcc-4.{6,7} vs. gcc-4.5 using std::vector in matrix vector multiplication
From: rguenth at gcc dot gnu.org @ 2012-09-07 10:00 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54000
Richard Guenther <rguenth at gcc dot gnu.org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
           Priority|P3                          |P2
            Summary|[4.6/4.7/4.8 Regression]    |[4.6/4.7/4.8
                   |Performance breakdown for   |Regression][IVOPTS]
                   |gcc-4.{6,7} vs. gcc-4.5     |Performance breakdown for
                   |using std::vector in matrix |gcc-4.{6,7} vs. gcc-4.5
                   |vector multiplication       |using std::vector in matrix
                   |                            |vector multiplication
      Known to fail|                            |4.7.1, 4.8.0
--- Comment #8 from Richard Guenther <rguenth at gcc dot gnu.org> 2012-09-07 10:00:27 UTC ---
Thanks for the reduced testcase. The innermost loops compare as follows:
4.5:

.L7:
        movsd   (%rbx,%rcx), %xmm0
        addq    $8, %rcx
        mulsd   0(%rbp,%rdx), %xmm0
        addq    $8, %rdx
        cmpq    $24, %rdx
        addsd   %xmm0, %xmm1
        movsd   %xmm1, (%rsi)
        jne     .L7

4.7:

.L13:
        movq    64(%rsp), %rdi
        movq    80(%rsp), %rdx
        addq    %rcx, %rdi
        addq    %r8, %rdx
        movsd   -8(%rax,%rdi), %xmm0
        mulsd   (%rsi,%rax), %xmm0
        addq    $8, %rax
        cmpq    $24, %rax
        addsd   (%rdx), %xmm0
        movsd   %xmm0, (%rdx)
        jne     .L13

so we seem to have a register allocation / spilling issue here as well
as a bad induction variable choice. GCC 4.8 is not any better here.
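As a rough source-level picture of the difference between the two loops above: the 4.7 loop adds into the destination through memory on every iteration (addsd (%rdx) followed by a store), whereas the 4.5 loop keeps the running sum in %xmm1. The sketch below is only an analogy with hypothetical function names, not code from the testcase, and it does not claim that either form compiles to the assembly shown.

```cpp
#include <cstddef>

// Analogous to the 4.7 loop: the running sum lives in *dest and is
// read back and written on every iteration.
inline void accumulate_through_memory(const double* a, const double* x,
                                      double* dest)
{
    for (std::size_t k = 0; k < 3; ++k)
        *dest += a[k] * x[k];
}

// Closer to the 4.5 loop: the running sum is held in a local variable,
// so the destination does not have to be reloaded inside the loop.
// Unlike the 4.5 code, this variant also stores only once, at the end.
inline void accumulate_in_register(const double* a, const double* x,
                                   double* dest)
{
    double sum = *dest;
    for (std::size_t k = 0; k < 3; ++k)
        sum += a[k] * x[k];
    *dest = sum;
}
```

Both functions compute the same value as long as dest does not alias a or x; whether the second form actually avoids the slowdown on an affected compiler has not been measured here.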
* [Bug tree-optimization/54000] [4.6/4.7/4.8 Regression] Performance breakdown for gcc-4.{6,7} vs. gcc-4.5 using std::vector in matrix vector multiplication (IVopts / inliner)
From: steven at gcc dot gnu.org @ 2013-02-22 23:41 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54000
Steven Bosscher <steven at gcc dot gnu.org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |rguenth at gcc dot gnu.org
            Summary|[4.6/4.7/4.8                |[4.6/4.7/4.8 Regression]
                   |Regression][IVOPTS]         |Performance breakdown for
                   |Performance breakdown for   |gcc-4.{6,7} vs. gcc-4.5
                   |gcc-4.{6,7} vs. gcc-4.5     |using std::vector in matrix
                   |using std::vector in matrix |vector multiplication
                   |vector multiplication       |(IVopts / inliner)
--- Comment #9 from Steven Bosscher <steven at gcc dot gnu.org> 2013-02-22 23:40:35 UTC ---
(In reply to comment #8)
> Thanks for the reduced testcase. The innermost loops compare as follows:
>
> 4.5:
>
> .L7:
> movsd (%rbx,%rcx), %xmm0
> addq $8, %rcx
> mulsd 0(%rbp,%rdx), %xmm0
> addq $8, %rdx
> cmpq $24, %rdx
> addsd %xmm0, %xmm1
> movsd %xmm1, (%rsi)
> jne .L7
4.8 r196182 with "--param early-inlining-insns=2" (2 x the default value):
.L13:
        movsd   (%rdx), %xmm0
        addq    $8, %rdx
        mulsd   (%rsi,%rax), %xmm0
        addq    $8, %rax
        cmpq    $24, %rax
        addsd   %xmm0, %xmm1
        movsd   %xmm1, 8(%rdi,%rcx)
        jne     .L13
>
> 4.7:
>
> .L13:
> movq 64(%rsp), %rdi
> movq 80(%rsp), %rdx
> addq %rcx, %rdi
> addq %r8, %rdx
> movsd -8(%rax,%rdi), %xmm0
> mulsd (%rsi,%rax), %xmm0
> addq $8, %rax
> cmpq $24, %rax
> addsd (%rdx), %xmm0
> movsd %xmm0, (%rdx)
> jne .L13
This is similar to what 4.8 r196182 produces without inliner tweaks:
.L18:
        movq    %rcx, %rdi
        addq    64(%rsp), %rdi
        movq    %r8, %rdx
        addq    80(%rsp), %rdx
        movsd   -8(%rax,%rdi), %xmm0
        mulsd   (%rsi,%rax), %xmm0
        addq    $8, %rax
        cmpq    $24, %rax
        addsd   (%rdx), %xmm0
        movsd   %xmm0, (%rdx)
        jne     .L18
> so we seem to have a register allocation / spilling issue here as well
> as a bad induction variable choice. GCC 4.8 is not any better here.
All true, but in the end it looks like an inliner heuristics issue first
(as also suggested by comment #3).
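If the inliner heuristics are indeed the trigger, one source-level experiment (not proposed anywhere in this thread, so purely an assumption) would be to force inlining of the small element accessors with GCC's always_inline function attribute instead of tuning --param values. The class and function names below are hypothetical and not taken from the testcase.

```cpp
#include <cstddef>

// always_inline is a GCC/Clang extension; fall back to plain inline elsewhere.
#if defined(__GNUC__)
#  define FORCE_INLINE inline __attribute__((always_inline))
#else
#  define FORCE_INLINE inline
#endif

// Hypothetical triple-of-doubles class with force-inlined accessors.
class Vec3Forced {
public:
    Vec3Forced(double a, double b, double c) { v_[0] = a; v_[1] = b; v_[2] = c; }
    FORCE_INLINE double operator[](std::size_t k) const { return v_[k]; }
    FORCE_INLINE double& operator[](std::size_t k) { return v_[k]; }
private:
    double v_[3];
};

// Dot product of two triples using only the accessors above.
inline double dot_forced(const Vec3Forced& a, const Vec3Forced& b)
{
    double sum = 0.0;
    for (std::size_t k = 0; k < 3; ++k)
        sum += a[k] * b[k];
    return sum;
}
```

Whether this restores the 4.5 code shape on an affected compiler is untested; it only removes the inlining decision for these accessors from the heuristics.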
* [Bug tree-optimization/54000] [4.6/4.7/4.8 Regression] Performance breakdown for gcc-4.{6,7} vs. gcc-4.5 using std::vector in matrix vector multiplication (IVopts / inliner)
From: steven at gcc dot gnu.org @ 2013-02-22 23:42 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54000
--- Comment #10 from Steven Bosscher <steven at gcc dot gnu.org> 2013-02-22 23:42:03 UTC ---
(In reply to comment #9)
> 4.8 r196182 with "--param early-inlining-insns=2" (2 x the default value):
"--param early-inlining-insns=22"
* [Bug tree-optimization/54000] [4.7/4.8/4.9 Regression] Performance breakdown for gcc-4.{6,7} vs. gcc-4.5 using std::vector in matrix vector multiplication (IVopts / inliner)
From: jakub at gcc dot gnu.org @ 2013-04-12 15:17 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54000
Jakub Jelinek <jakub at gcc dot gnu.org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|4.6.4                       |4.7.4
--- Comment #11 from Jakub Jelinek <jakub at gcc dot gnu.org> 2013-04-12 15:16:47 UTC ---
GCC 4.6.4 has been released and the branch has been closed.
* [Bug tree-optimization/54000] [4.7/4.8/4.9/4.10 Regression] Performance breakdown for gcc-4.{6,7} vs. gcc-4.5 using std::vector in matrix vector multiplication (IVopts / inliner)
From: rguenth at gcc dot gnu.org @ 2014-06-12 13:48 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54000
Richard Biener <rguenth at gcc dot gnu.org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|4.7.4                       |4.8.4
--- Comment #12 from Richard Biener <rguenth at gcc dot gnu.org> ---
The 4.7 branch is being closed, moving target milestone to 4.8.4.
* [Bug tree-optimization/54000] [4.8/4.9/5 Regression] Performance breakdown for gcc-4.{6,7} vs. gcc-4.5 using std::vector in matrix vector multiplication (IVopts / inliner)
From: jakub at gcc dot gnu.org @ 2014-12-19 13:32 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54000
Jakub Jelinek <jakub at gcc dot gnu.org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|4.8.4                       |4.8.5
--- Comment #13 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
GCC 4.8.4 has been released.
* [Bug tree-optimization/54000] [4.8/4.9 Regression] Performance breakdown for gcc-4.{6,7} vs. gcc-4.5 using std::vector in matrix vector multiplication (IVopts / inliner)
From: rguenth at gcc dot gnu.org @ 2015-02-09 8:46 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54000
Richard Biener <rguenth at gcc dot gnu.org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
             Target|                            |x86_64-*-*, i?86-*-*
      Known to work|                            |5.0
            Summary|[4.8/4.9/5 Regression]      |[4.8/4.9 Regression]
                   |Performance breakdown for   |Performance breakdown for
                   |gcc-4.{6,7} vs. gcc-4.5     |gcc-4.{6,7} vs. gcc-4.5
                   |using std::vector in matrix |using std::vector in matrix
                   |vector multiplication       |vector multiplication
                   |(IVopts / inliner)          |(IVopts / inliner)
--- Comment #14 from Richard Biener <rguenth at gcc dot gnu.org> ---
Btw, with trunk (gcc 5) I see
.L13:
        movsd   (%rdx), %xmm1
        xorl    %eax, %eax
.L12:
        movsd   -8(%rcx,%rax), %xmm0
        mulsd   (%rsi,%rax), %xmm0
        addq    $8, %rax
        cmpq    $24, %rax
        addsd   %xmm0, %xmm1
        movsd   %xmm1, (%rdx)
        jne     .L12
        addq    $8, %rdx
        addq    $8, %rcx
        addq    $24, %rsi
        cmpq    %rdi, %rdx
        jne     .L13
thus maybe even better than 4.5.
GCC 4.9 produces
.L17:
        leaq    (%r8,%rdx), %rcx
        movsd   8(%rdi,%rdx), %xmm1
        xorl    %eax, %eax
        addq    %r9, %rcx
.L14:
        movsd   -8(%rcx,%rax), %xmm0
        mulsd   (%rsi,%rax), %xmm0
        addq    $8, %rax
        cmpq    $24, %rax
        addsd   %xmm0, %xmm1
        movsd   %xmm1, 8(%rdi,%rdx)
        jne     .L14
        addq    $8, %rdx
        addq    $24, %rsi
        cmpq    $1016, %rdx
        jne     .L17
it might be again inliner changes that trigger the better behavior of course.
So - fixed in GCC 5. Not sure how to produce a testcase that reliably
tracks good behavior here. IVOPTs dumping should be improved somewhat.
* [Bug tree-optimization/54000] [4.8 Regression] Performance breakdown for gcc-4.{6,7} vs. gcc-4.5 using std::vector in matrix vector multiplication (IVopts / inliner)
From: rguenth at gcc dot gnu.org @ 2015-02-09 8:57 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54000
Richard Biener <rguenth at gcc dot gnu.org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
      Known to work|                            |4.9.0
         Resolution|---                         |FIXED
   Target Milestone|4.8.5                       |4.9.0
            Summary|[4.8/4.9 Regression]        |[4.8 Regression]
                   |Performance breakdown for   |Performance breakdown for
                   |gcc-4.{6,7} vs. gcc-4.5     |gcc-4.{6,7} vs. gcc-4.5
                   |using std::vector in matrix |using std::vector in matrix
                   |vector multiplication       |vector multiplication
                   |(IVopts / inliner)          |(IVopts / inliner)
--- Comment #15 from Richard Biener <rguenth at gcc dot gnu.org> ---
Timing-wise GCC 4.9 looks good as well. With GCC 4.8 the testcase takes
2:30 to execute, with 4.9 it takes 1:09, and with GCC 5 finally 1:06. GCC 4.9
also chooses a single IV for the innermost loop.
Declaring fixed.
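For readers unfamiliar with the term, "a single IV" means the innermost loop advances one induction variable (%rax in the 4.9 and GCC 5 listings) and derives both addresses from it, where the 4.5 loop advanced two (%rcx and %rdx). A source-level analogy with hypothetical function names follows; the IVOPTS pass is free to turn either form into the other, so this only illustrates the terminology.

```cpp
#include <cstddef>

// Two induction variables: both pointers are advanced on every iteration.
inline double dot3_two_ivs(const double* a, const double* x)
{
    double sum = 0.0;
    const double* const end = x + 3;
    for (; x != end; ++a, ++x)
        sum += *a * *x;
    return sum;
}

// One induction variable: a single index addresses both arrays.
inline double dot3_one_iv(const double* a, const double* x)
{
    double sum = 0.0;
    for (std::size_t k = 0; k < 3; ++k)
        sum += a[k] * x[k];
    return sum;
}
```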
* [Bug tree-optimization/54000] [4.8 Regression] Performance breakdown for gcc-4.{6,7} vs. gcc-4.5 using std::vector in matrix vector multiplication (IVopts / inliner)
From: rguenth at gcc dot gnu.org @ 2015-02-09 11:51 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54000
--- Comment #16 from Richard Biener <rguenth at gcc dot gnu.org> ---
Author: rguenth
Date: Mon Feb 9 11:51:05 2015
New Revision: 220536
URL: https://gcc.gnu.org/viewcvs?rev=220536&root=gcc&view=rev
Log:
2015-02-09 Richard Biener <rguenther@suse.de>
PR tree-optimization/54000
* tree-ssa-loop-ivopts.c: Include tree-vectorizer.h.
(struct ivopts_data): Add loop_loc member.
(tree_ssa_iv_optimize_loop): Dump loop location.
(create_new_ivs): Likewise, also dump number of IVs generated.
* g++.dg/tree-ssa/ivopts-3.C: New testcase.
Added:
trunk/gcc/testsuite/g++.dg/tree-ssa/ivopts-3.C
Modified:
trunk/gcc/ChangeLog
trunk/gcc/testsuite/ChangeLog
trunk/gcc/tree-ssa-loop-ivopts.c