public inbox for gcc-bugs@sourceware.org
* [Bug lto/44334] New: [4.6 Regression] rnflow.f90 ~27% slower with -fwhole-program -flto after revision 159852
@ 2010-05-30 17:17 dominiq at lps dot ens dot fr
2010-05-30 18:06 ` [Bug lto/44334] " dominiq at lps dot ens dot fr
` (10 more replies)
0 siblings, 11 replies; 12+ messages in thread
From: dominiq at lps dot ens dot fr @ 2010-05-30 17:17 UTC
To: gcc-bugs
After revision 159852
Author: pault
Date: Wed May 26 05:11:04 2010 UTC (4 days, 12 hours ago)
Changed paths: 4
Log Message:
2010-05-26 Paul Thomas <pault@gcc.gnu.org>
PR fortran/40011
* resolve.c (resolve_global_procedure): Resolve the gsymbol's
namespace before trying to reorder the gsymbols.
2010-05-26 Paul Thomas <pault@gcc.gnu.org>
PR fortran/40011
* gfortran.dg/whole_file_19.f90 : New test.
the executable of the polyhedron test rnflow.f90 is ~27% slower when compiled
with -fwhole-program -flto:
[macbook] lin/test% gfcpf -v
Using built-in specs.
COLLECT_GCC=gfcpf
COLLECT_LTO_WRAPPER=/opt/gcc/gcc4.6pf/libexec/gcc/x86_64-apple-darwin10/4.6.0/lto-wrapper
Target: x86_64-apple-darwin10
Configured with: ../p_work/configure --prefix=/opt/gcc/gcc4.6pf
--mandir=/opt/gcc/gcc4.6pf/share/man --infodir=/opt/gcc/gcc4.6pf/share/info
--build=x86_64-apple-darwin10 --host=x86_64-apple-darwin10
--target=x86_64-apple-darwin10 --enable-languages=c,fortran
--with-gmp=/opt/sw64 --with-libiconv-prefix=/opt/sw64 --with-system-zlib
--x-includes=/usr/X11R6/include --x-libraries=/usr/X11R6/lib
--with-cloog=/opt/sw64 --with-ppl=/opt/sw64 --with-mpc=/opt/sw64 --enable-lto
Thread model: posix
gcc version 4.6.0 20100526 (experimental) [trunk revision 159851] (GCC)
[macbook] lin/test% gfcpf -O3 -ffast-math -funroll-loops -fomit-frame-pointer
rnflow.f90
[macbook] lin/test% time a.out > /dev/null
25.826u 0.686s 0:26.52 99.9% 0+0k 0+0io 0pf+0w
[macbook] lin/test% gfcpf -O3 -ffast-math -funroll-loops -fomit-frame-pointer
-fwhole-file -flto rnflow.f90
[macbook] lin/test% time a.out > /dev/null
25.506u 0.674s 0:26.19 99.9% 0+0k 0+0io 0pf+0w
[macbook] lin/test% gfcpf -O3 -ffast-math -funroll-loops -fomit-frame-pointer
-fwhole-program -flto rnflow.f90
[macbook] lin/test% time a.out > /dev/null
25.772u 0.678s 0:26.46 99.9% 0+0k 0+0io 0pf+0w
[macbook] lin/test% gfcp -v
Using built-in specs.
COLLECT_GCC=gfcp
COLLECT_LTO_WRAPPER=/opt/gcc/gcc4.6p/libexec/gcc/x86_64-apple-darwin10/4.6.0/lto-wrapper
Target: x86_64-apple-darwin10
Configured with: ../p_work/configure --prefix=/opt/gcc/gcc4.6p
--mandir=/opt/gcc/gcc4.6p/share/man --infodir=/opt/gcc/gcc4.6p/share/info
--build=x86_64-apple-darwin10 --host=x86_64-apple-darwin10
--target=x86_64-apple-darwin10 --enable-languages=c,fortran
--with-gmp=/opt/sw64 --with-libiconv-prefix=/opt/sw64 --with-system-zlib
--x-includes=/usr/X11R6/include --x-libraries=/usr/X11R6/lib
--with-cloog=/opt/sw64 --with-ppl=/opt/sw64 --with-mpc=/opt/sw64 --enable-lto
Thread model: posix
gcc version 4.6.0 20100526 (experimental) [trunk revision 159852] (GCC)
[macbook] lin/test% gfcp -O3 -ffast-math -funroll-loops -fomit-frame-pointer
rnflow.f90
[macbook] lin/test% time a.out > /dev/null
25.841u 0.696s 0:26.54 99.9% 0+0k 0+0io 0pf+0w
[macbook] lin/test% gfcp -O3 -ffast-math -funroll-loops -fomit-frame-pointer
-fwhole-file -flto rnflow.f90
[macbook] lin/test% time a.out > /dev/null
25.540u 0.677s 0:26.22 99.9% 0+0k 0+0io 0pf+0w
[macbook] lin/test% gfcp -O3 -ffast-math -funroll-loops -fomit-frame-pointer
-fwhole-program -flto rnflow.f90
[macbook] lin/test% time a.out > /dev/null
32.627u 0.685s 0:33.31 99.9% 0+0k 0+0io 0pf+0w <--- ~27% slower
As noticed previously, the executable of fatigue.f90 is ~30% faster
when compiled with -fwhole-program:
[macbook] lin/test% gfcp -O3 -ffast-math -funroll-loops -fomit-frame-pointer
-fwhole-file -flto fatigue.f90
[macbook] lin/test% time a.out > /dev/null
9.031u 0.006s 0:09.04 99.8% 0+0k 0+1io 0pf+0w
[macbook] lin/test% gfcp -O3 -ffast-math -funroll-loops -fomit-frame-pointer
-fwhole-program fatigue.f90
[macbook] lin/test% time a.out > /dev/null
6.448u 0.004s 0:06.47 99.5% 0+0k 0+1io 0pf+0w
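The percentages quoted above follow from the user times printed by `time`. A minimal sketch of that arithmetic, assuming the first field of each timing line (user CPU seconds) is the figure being compared; the helper name `percent_change` is illustrative and not part of the report:

```python
def percent_change(baseline: float, measured: float) -> float:
    """Return the change of `measured` relative to `baseline`, in percent."""
    return (measured - baseline) / baseline * 100.0

# rnflow.f90 with -fwhole-program -flto: r159851 vs. r159852 (user seconds)
rnflow = percent_change(25.772, 32.627)

# fatigue.f90 at r159852: -fwhole-file -flto vs. -fwhole-program (user seconds)
fatigue = percent_change(9.031, 6.448)

print(f"rnflow:  {rnflow:+.1f}%")
print(f"fatigue: {fatigue:+.1f}%")
```

A positive value means more run time than the baseline, a negative value means less. These are single runs, so the figures carry whatever run-to-run noise the machine has.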
--
Summary: [4.6 Regression] rnflow.f90 ~27% slower with -fwhole-program
-flto after revision 159852
Product: gcc
Version: 4.6.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: lto
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: dominiq at lps dot ens dot fr
GCC build triplet: x86_64-apple-darwin10
GCC host triplet: x86_64-apple-darwin10
GCC target triplet: x86_64-apple-darwin10
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44334
* [Bug lto/44334] [4.6 Regression] rnflow.f90 ~27% slower with -fwhole-program -flto after revision 159852
From: dominiq at lps dot ens dot fr @ 2010-05-30 18:06 UTC
To: gcc-bugs
------- Comment #1 from dominiq at lps dot ens dot fr 2010-05-30 18:06 -------
I'll attach the assembly generated with -O3 -ffast-math -funroll-loops
-fomit-frame-pointer -flto for revisions 159851 and 159852. It is the same
with/without -fwhole-program (probably obvious), however when assembled and
linked with
gfcp -O3 -ffast-math -funroll-loops -fomit-frame-pointer -fwhole-program -flto
rnflow_wp5*.s
the timing depends on the revision used to generate the assembly, but not on
the compiler revision.
--
dominiq at lps dot ens dot fr changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |rguenther at suse dot de, jh
| |at suse dot cz, pault at gcc
| |dot gnu dot org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44334
* [Bug fortran/44334] rnflow.f90 ~27% slower with -fwhole-program -flto after revision 159852
From: rguenth at gcc dot gnu dot org @ 2010-05-30 18:09 UTC
To: gcc-bugs
------- Comment #2 from rguenth at gcc dot gnu dot org 2010-05-30 18:09 -------
Insufficient analysis. This more sounds like a dup of profile-estimate
messed up by inlining.
--
rguenth at gcc dot gnu dot org changed:
What |Removed |Added
----------------------------------------------------------------------------
Component|lto |fortran
Summary|[4.6 Regression] rnflow.f90 |rnflow.f90 ~27% slower with
|~27% slower with -fwhole- |-fwhole-program -flto after
|program -flto after revision|revision 159852
|159852 |
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44334
* [Bug fortran/44334] rnflow.f90 ~27% slower with -fwhole-program -flto after revision 159852
From: dominiq at lps dot ens dot fr @ 2010-05-30 18:11 UTC
To: gcc-bugs
------- Comment #3 from dominiq at lps dot ens dot fr 2010-05-30 18:10 -------
Created an attachment (id=20780)
--> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=20780&action=view)
Assembly generated with -O3 -ffast-math -funroll-loops -fomit-frame-pointer
-flto and revision 159851
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44334
* [Bug fortran/44334] rnflow.f90 ~27% slower with -fwhole-program -flto after revision 159852
From: dominiq at lps dot ens dot fr @ 2010-05-30 18:12 UTC
To: gcc-bugs
------- Comment #4 from dominiq at lps dot ens dot fr 2010-05-30 18:12 -------
Created an attachment (id=20781)
--> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=20781&action=view)
Assembly generated with -O3 -ffast-math -funroll-loops -fomit-frame-pointer
-flto and revision 159852
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44334
* [Bug fortran/44334] rnflow.f90 ~27% slower with -fwhole-program -flto after revision 159852
From: dominiq at lps dot ens dot fr @ 2010-05-30 18:31 UTC
To: gcc-bugs
------- Comment #5 from dominiq at lps dot ens dot fr 2010-05-30 18:30 -------
Output of gprof on darwin:
Revision 159851:
called/total parents
index %time self descendents called+self name index
called/total children
520605 _dgetf2_ [81]
0.00 0.00 64/1041192 ___timctr_MOD_gettim [1429]
0.00 0.00 6548/1041192 _dswap_ [4112]
0.00 0.00 1034580/1041192 _xerbla_ [83]
[81] 0.0 0.00 0.00 1041192+520605 _dgetf2_ [81]
0.00 0.00 64137/110864 _dgetrf_ [82]
520605 _dgetf2_ [81]
-----------------------------------------------
13315 _dgetrf_ [82]
0.00 0.00 8/110864 ___timctr_MOD_gettim [1429]
0.00 0.00 6548/110864 _dswap_ [4112]
0.00 0.00 6685/110864 __dyld_func_lookup [1665]
0.00 0.00 33486/110864 _xerbla_ [83]
0.00 0.00 64137/110864 _dgetf2_ [81]
[82] 0.0 0.00 0.00 110864+13315 _dgetrf_ [82]
0.00 0.00 1/1 _main [85]
13315 _dgetrf_ [82]
-----------------------------------------------
0.00 0.00 10872/10872 _dswap_ [4112]
[83] 0.0 0.00 0.00 10872 _xerbla_ [83]
0.00 0.00 1034580/1041192 _dgetf2_ [81]
0.00 0.00 33486/110864 _dgetrf_ [82]
-----------------------------------------------
0.00 0.00 1/1 _main [85]
[84] 0.0 0.00 0.00 1 __start [84]
-----------------------------------------------
0.00 0.00 1/1 _dgetrf_ [82]
[85] 0.0 0.00 0.00 1 _main [85]
0.00 0.00 1/1 __start [84]
-----------------------------------------------
...
% cumulative self self total
time seconds seconds calls ms/call ms/call name
0.0 0.00 0.00 1561733 0.00 0.00 _dgetf2_ [81]
0.0 0.00 0.00 110927 0.00 0.00 _dgetrf_ [82]
0.0 0.00 0.00 10872 0.00 0.00 _xerbla_ [83]
0.0 0.00 0.00 1 0.00 0.00 __start [84]
0.0 0.00 0.00 1 0.00 0.00 _main [85]
================================================================================
Revision 159852:
called/total parents
index %time self descendents called+self name index
called/total children
0.00 0.00 6548/1561733 _dswap_ [4112]
0.00 0.00 1555185/1561733 _xerbla_ [83]
[81] 0.0 0.00 0.00 1561733 _dgetf2_ [81]
0.00 0.00 64136/110927 _dgetrf_ [82]
-----------------------------------------------
13315 _dgetrf_ [82]
0.00 0.00 72/110927 ___timctr_MOD_gettim [1429]
0.00 0.00 6548/110927 _dswap_ [4112]
0.00 0.00 6685/110927 __dyld_func_lookup [1665]
0.00 0.00 33486/110927 _xerbla_ [83]
0.00 0.00 64136/110927 _dgetf2_ [81]
[82] 0.0 0.00 0.00 110927+13315 _dgetrf_ [82]
0.00 0.00 1/1 _main [85]
13315 _dgetrf_ [82]
-----------------------------------------------
0.00 0.00 10872/10872 _dswap_ [4112]
[83] 0.0 0.00 0.00 10872 _xerbla_ [83]
0.00 0.00 1555185/1561733 _dgetf2_ [81]
0.00 0.00 33486/110927 _dgetrf_ [82]
-----------------------------------------------
0.00 0.00 1/1 _main [85]
[84] 0.0 0.00 0.00 1 __start [84]
-----------------------------------------------
0.00 0.00 1/1 _dgetrf_ [82]
[85] 0.0 0.00 0.00 1 _main [85]
0.00 0.00 1/1 __start [84]
-----------------------------------------------
...
% cumulative self self total
time seconds seconds calls ms/call ms/call name
0.0 0.00 0.00 5572994 0.00 0.00 _xerbla_ [154]
0.0 0.00 0.00 20556 0.00 0.00 _dswap_ [155]
0.0 0.00 0.00 20000 0.00 0.00 ___timctr_MOD_gettim [156]
0.0 0.00 0.00 3 0.00 0.00 __dyld_func_lookup [157]
0.0 0.00 0.00 2 0.00 0.00 __start [158]
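To compare call counts between the two flat profiles without reading the columns by eye, something like the following could be used. It is only a sketch: the regular expression assumes the seven-column flat-profile row layout shown above, and `call_counts` is a hypothetical helper, not a gprof facility.

```python
import re

# One flat-profile row: %time, cumulative s, self s, calls,
# self ms/call, total ms/call, name. Header rows do not match.
ROW = re.compile(
    r"^\s*[\d.]+\s+[\d.]+\s+[\d.]+\s+(\d+)\s+[\d.]+\s+[\d.]+\s+(\S+)"
)

def call_counts(flat_profile: str) -> dict[str, int]:
    """Map each function name in a flat profile to its call count."""
    counts = {}
    for line in flat_profile.splitlines():
        match = ROW.match(line)
        if match:
            counts[match.group(2)] = int(match.group(1))
    return counts

# Rows copied from the two excerpts above.
r159851 = call_counts("""\
0.0 0.00 0.00 1561733 0.00 0.00 _dgetf2_ [81]
0.0 0.00 0.00 110927 0.00 0.00 _dgetrf_ [82]
0.0 0.00 0.00 10872 0.00 0.00 _xerbla_ [83]
""")
r159852 = call_counts("""\
0.0 0.00 0.00 5572994 0.00 0.00 _xerbla_ [154]
0.0 0.00 0.00 20556 0.00 0.00 _dswap_ [155]
""")

# Print the functions that appear in both excerpts.
for name in sorted(r159851.keys() & r159852.keys()):
    print(name, r159851[name], "->", r159852[name])
```

The sketch only compares the numbers as printed. It says nothing about whether gprof attributed the calls to the right symbols.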
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44334
* [Bug fortran/44334] rnflow.f90 ~27% slower with -fwhole-program -flto after revision 159852
From: rguenth at gcc dot gnu dot org @ 2010-05-30 18:49 UTC
To: gcc-bugs
------- Comment #6 from rguenth at gcc dot gnu dot org 2010-05-30 18:48 -------
0.0 0.00 0.00 5572994 0.00 0.00 _xerbla_ [154]
eh? that's the blas error handler. something is fishy with your setup.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44334
* [Bug fortran/44334] rnflow.f90 ~27% slower with -fwhole-program -flto after revision 159852
From: dominiq at lps dot ens dot fr @ 2010-05-30 18:55 UTC
To: gcc-bugs
------- Comment #7 from dominiq at lps dot ens dot fr 2010-05-30 18:55 -------
> Insufficient analysis. This more sounds like a dup of profile-estimate
> messed up by inlining.
Do you mean a dup of pr40106? Or are there others I am not aware of?
> eh? that's the blas error handler. something is fishy with your setup.
Which setup?
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44334
* [Bug fortran/44334] rnflow.f90 ~27% slower with -fwhole-program -flto after revision 159852
From: dominiq at lps dot ens dot fr @ 2010-06-05 9:52 UTC
To: gcc-bugs
------- Comment #8 from dominiq at lps dot ens dot fr 2010-06-05 09:52 -------
At revision 160309, I get
[macbook] lin/test% gfc -O3 -ffast-math -funroll-loops -fomit-frame-pointer
-fwhole-program -flto rnflow.f90 --param hot-bb-frequency-fraction=1000
[macbook] lin/test% time a.out > /dev/null
32.601u 0.716s 0:33.35 99.8% 0+0k 0+0io 0pf+0w
[macbook] lin/test% gfc -O3 -ffast-math -funroll-loops -fomit-frame-pointer
-fwhole-program -flto rnflow.f90 --param hot-bb-frequency-fraction=2000
[macbook] lin/test% time a.out > /dev/null
25.760u 0.708s 0:26.47 99.9% 0+0k 0+0io 0pf+0w
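The two builds above differ only in the value passed to `--param hot-bb-frequency-fraction`. A minimal sketch of generating such command lines for a small sweep, assuming the same flags as in this comment; `gfc` is the reporter's local compiler wrapper, and `build_command` is a hypothetical helper:

```python
import shlex

# Flags copied from the command lines in this comment.
BASE_FLAGS = [
    "-O3", "-ffast-math", "-funroll-loops", "-fomit-frame-pointer",
    "-fwhole-program", "-flto",
]

def build_command(compiler: str, source: str, fraction: int) -> list[str]:
    """Return the argv list for one build with the given --param value."""
    return [
        compiler, *BASE_FLAGS, source,
        "--param", f"hot-bb-frequency-fraction={fraction}",
    ]

# Only 1000 and 2000 were measured in the report.
for fraction in (1000, 2000):
    print(shlex.join(build_command("gfc", "rnflow.f90", fraction)))
```

Each list can be passed to `subprocess.run` as is, followed by timing the resulting `a.out`. Values other than the two above would be new measurements, not results from this thread.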
--
dominiq at lps dot ens dot fr changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |hubicka at gcc dot gnu dot
| |org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44334
* [Bug fortran/44334] rnflow.f90 ~27% slower with -fwhole-program -flto after revision 159852
From: burnus at gcc dot gnu dot org @ 2010-09-08 21:00 UTC
To: gcc-bugs
------- Comment #9 from burnus at gcc dot gnu dot org 2010-09-08 21:00 -------
For what it is worth, on AMD Athlon 64 X2 4800+ / x86-64-linux, I get for
gfortran -O3 -ffast-math -march=native -- both with and without -flto:
0m45.132s -- (options as above)
0m52.731s -- additionally -fwhole-program
That's a +16% increase in run-time with -fwhole-program.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44334
* [Bug fortran/44334] rnflow.f90 ~27% slower with -fwhole-program -flto after revision 159852
From: hubicka at gcc dot gnu dot org @ 2010-09-08 21:04 UTC
To: gcc-bugs
------- Comment #10 from hubicka at gcc dot gnu dot org 2010-09-08 21:04 -------
So hot-bb-frequency-fraction solves the whole regression?
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44334
* [Bug fortran/44334] rnflow.f90 ~27% slower with -fwhole-program -flto after revision 159852
From: burnus at gcc dot gnu dot org @ 2010-09-09 9:01 UTC
To: gcc-bugs
------- Comment #11 from burnus at gcc dot gnu dot org 2010-09-09 09:00 -------
[Move comment from IRC #gcc to bugzilla]
(In reply to comment #9)
> For what it is worth, on AMD Athlon 64 X2 4800+ / x86-64-linux, [...]
> That's a +16% increase in run-time with -fwhole-program.
(In reply to comment #10)
> So hot-bb-frequency-fraction solves the whole regression?
For me (cf. system above), --param hot-bb-frequency-fraction=2000 reduces the
slow down due to -fwhole-program from 16% to 3%. (The LTO version with and
without -fwhole-file is about 2% slower than the corresponding -fno-lto
version.)
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44334
Thread overview: 12+ messages
2010-05-30 17:17 [Bug lto/44334] New: [4.6 Regression] rnflow.f90 ~27% slower with -fwhole-program -flto after revision 159852 dominiq at lps dot ens dot fr
2010-05-30 18:06 ` [Bug lto/44334] " dominiq at lps dot ens dot fr
2010-05-30 18:09 ` [Bug fortran/44334] " rguenth at gcc dot gnu dot org
2010-05-30 18:11 ` dominiq at lps dot ens dot fr
2010-05-30 18:12 ` dominiq at lps dot ens dot fr
2010-05-30 18:31 ` dominiq at lps dot ens dot fr
2010-05-30 18:49 ` rguenth at gcc dot gnu dot org
2010-05-30 18:55 ` dominiq at lps dot ens dot fr
2010-06-05 9:52 ` dominiq at lps dot ens dot fr
2010-09-08 21:00 ` burnus at gcc dot gnu dot org
2010-09-08 21:04 ` hubicka at gcc dot gnu dot org
2010-09-09 9:01 ` burnus at gcc dot gnu dot org