public inbox for gcc-bugs@sourceware.org
* [Bug lto/44334] rnflow.f90 ~27% slower with -fwhole-program -flto after revision 159852
       [not found] <bug-44334-4@http.gcc.gnu.org/bugzilla/>
@ 2010-11-15 13:41 ` dominiq at lps dot ens.fr
  2010-11-15 15:33 ` hubicka at ucw dot cz
                   ` (36 subsequent siblings)
  37 siblings, 0 replies; 38+ messages in thread
From: dominiq at lps dot ens.fr @ 2010-11-15 13:41 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44334

Dominique d'Humieres <dominiq at lps dot ens.fr> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
          Component|fortran                     |lto

--- Comment #12 from Dominique d'Humieres <dominiq at lps dot ens.fr> 2010-11-15 13:29:24 UTC ---
I think this is not a gfortran bug. Marked as an LTO one.



* [Bug lto/44334] rnflow.f90 ~27% slower with -fwhole-program -flto after revision 159852
       [not found] <bug-44334-4@http.gcc.gnu.org/bugzilla/>
  2010-11-15 13:41 ` [Bug lto/44334] rnflow.f90 ~27% slower with -fwhole-program -flto after revision 159852 dominiq at lps dot ens.fr
@ 2010-11-15 15:33 ` hubicka at ucw dot cz
  2010-12-19 11:44 ` hubicka at gcc dot gnu.org
                   ` (35 subsequent siblings)
  37 siblings, 0 replies; 38+ messages in thread
From: hubicka at ucw dot cz @ 2010-11-15 15:33 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44334

--- Comment #13 from Jan Hubicka <hubicka at ucw dot cz> 2010-11-15 15:22:19 UTC ---
Static profile estimation problem, to be exact. LTO is just triggering it by
bringing in enough context ;)



* [Bug lto/44334] rnflow.f90 ~27% slower with -fwhole-program -flto after revision 159852
       [not found] <bug-44334-4@http.gcc.gnu.org/bugzilla/>
  2010-11-15 13:41 ` [Bug lto/44334] rnflow.f90 ~27% slower with -fwhole-program -flto after revision 159852 dominiq at lps dot ens.fr
  2010-11-15 15:33 ` hubicka at ucw dot cz
@ 2010-12-19 11:44 ` hubicka at gcc dot gnu.org
  2010-12-19 13:37 ` dominiq at lps dot ens.fr
                   ` (34 subsequent siblings)
  37 siblings, 0 replies; 38+ messages in thread
From: hubicka at gcc dot gnu.org @ 2010-12-19 11:44 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44334

Jan Hubicka <hubicka at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |WAITING
   Last reconfirmed|                            |2010.12.19 11:43:48
     Ever Confirmed|0                           |1

--- Comment #14 from Jan Hubicka <hubicka at gcc dot gnu.org> 2010-12-19 11:43:48 UTC ---
I finally got some time to test the various solutions. The easiest is probably
the following:
Index: predict.c
===================================================================
--- predict.c   (revision 168047)
+++ predict.c   (working copy)
@@ -126,7 +126,7 @@ maybe_hot_frequency_p (int freq)
   if (node->frequency == NODE_FREQUENCY_EXECUTED_ONCE
       && freq <= (ENTRY_BLOCK_PTR->frequency * 2 / 3))
     return false;
-  if (freq < BB_FREQ_MAX / PARAM_VALUE (HOT_BB_FREQUENCY_FRACTION))
+  if (freq < ENTRY_BLOCK_PTR->frequency / PARAM_VALUE (HOT_BB_FREQUENCY_FRACTION))
     return false;
   return true;
 }
It makes GCC decide which basic blocks are cold based not on the innermost loop
nest but on the entry block frequency, so many conditionals or EH render a BB
cold, but the fact that it is outside of very many BBs does not.

Could you try if this solves the problem?



* [Bug lto/44334] rnflow.f90 ~27% slower with -fwhole-program -flto after revision 159852
       [not found] <bug-44334-4@http.gcc.gnu.org/bugzilla/>
                   ` (2 preceding siblings ...)
  2010-12-19 11:44 ` hubicka at gcc dot gnu.org
@ 2010-12-19 13:37 ` dominiq at lps dot ens.fr
  2010-12-20  8:57 ` dominiq at lps dot ens.fr
                   ` (33 subsequent siblings)
  37 siblings, 0 replies; 38+ messages in thread
From: dominiq at lps dot ens.fr @ 2010-12-19 13:37 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44334

--- Comment #15 from Dominique d'Humieres <dominiq at lps dot ens.fr> 2010-12-19 13:37:17 UTC ---
> Could you try if this solves the problem?

The patch in comment #14 fixed the problem on x86_64-apple-darwin10 (I cannot
say anything for AMD). I have run the polyhedron tests without noticing any
slowdown. I'll do a clean regstrap tonight. Thanks for the patch.



* [Bug lto/44334] rnflow.f90 ~27% slower with -fwhole-program -flto after revision 159852
       [not found] <bug-44334-4@http.gcc.gnu.org/bugzilla/>
                   ` (3 preceding siblings ...)
  2010-12-19 13:37 ` dominiq at lps dot ens.fr
@ 2010-12-20  8:57 ` dominiq at lps dot ens.fr
  2010-12-21 10:46 ` dominiq at lps dot ens.fr
                   ` (32 subsequent siblings)
  37 siblings, 0 replies; 38+ messages in thread
From: dominiq at lps dot ens.fr @ 2010-12-20  8:57 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44334

--- Comment #16 from Dominique d'Humieres <dominiq at lps dot ens.fr> 2010-12-20 08:57:32 UTC ---
The patch in comment #14 fixed the problem on x86_64-apple-darwin10, but causes
the following regressions:

FAIL: gcc.dg/autopar/outer-2.c scan-tree-dump-times parloops "parallelizing
outer loop" 1
FAIL: gcc.dg/autopar/outer-2.c scan-tree-dump-times optimized "loopfn" 5
FAIL: gcc.dg/tree-ssa/ldist-pr45948.c scan-tree-dump ldist "distributed: split
to 3"

which disappear if I revert the patch. Note that something looks uninitialized
with the patch:

[macbook] f90/bug% gcc46 -O2 -ftree-loop-distribution -fdump-tree-ldist-details
-c /opt/gcc/work/gcc/testsuite/gcc.dg/tree-ssa/ldist-pr45948.c
[macbook] f90/bug% grep distributed ldist-pr45948.c.101t.ldist
Loop -1515870811 distributed: split to 2 loops.
          ^^^^
instead of

Loop 1 distributed: split to 3 loops.



* [Bug lto/44334] rnflow.f90 ~27% slower with -fwhole-program -flto after revision 159852
       [not found] <bug-44334-4@http.gcc.gnu.org/bugzilla/>
                   ` (4 preceding siblings ...)
  2010-12-20  8:57 ` dominiq at lps dot ens.fr
@ 2010-12-21 10:46 ` dominiq at lps dot ens.fr
  2011-01-17 21:15 ` howarth at nitro dot med.uc.edu
                   ` (31 subsequent siblings)
  37 siblings, 0 replies; 38+ messages in thread
From: dominiq at lps dot ens.fr @ 2010-12-21 10:46 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44334

--- Comment #17 from Dominique d'Humieres <dominiq at lps dot ens.fr> 2010-12-21 10:46:06 UTC ---
For the record I have also tested the patch in comment #14 on
powerpc-apple-darwin9 at revision 168070. Without the patch I get

[karma] lin/test% gfc -Ofast -funroll-loops -ftree-loop-linear
-fomit-frame-pointer -finline-limit=600 --param hot-bb-frequency-fraction=2000
-fwhole-program -flto rnflow.f90
[karma] lin/test% time a.out > /dev/null
68.236u 6.947s 1:17.77 96.6%    0+0k 0+0io 0pf+0w
[karma] lin/test% gfc -Ofast -funroll-loops -ftree-loop-linear
-fomit-frame-pointer -finline-limit=600 -fwhole-program -flto rnflow.f90
[karma] lin/test% time a.out > /dev/null
65.229u 6.838s 1:14.61 96.5%    0+0k 0+0io 0pf+0w

Note a slight slowdown with --param hot-bb-frequency-fraction=2000. With the
patch I get

[karma] lin/test% gfc -Ofast -funroll-loops -ftree-loop-linear
-fomit-frame-pointer -finline-limit=600 --param hot-bb-frequency-fraction=2000
-fwhole-program -flto rnflow.f90
[karma] lin/test% time a.out > /dev/null
69.690u 6.917s 1:19.44 96.4%    0+0k 0+0io 1pf+0w
[karma] lin/test% gfc -Ofast -funroll-loops -ftree-loop-linear
-fomit-frame-pointer -finline-limit=600 -fwhole-program -flto rnflow.f90
[karma] lin/test% time a.out > /dev/null
69.791u 7.225s 1:20.08 96.1%    0+0k 0+0io 0pf+0w

i.e., --param hot-bb-frequency-fraction=2000 does not change the timings, but
the resulting code is slower.



* [Bug lto/44334] rnflow.f90 ~27% slower with -fwhole-program -flto after revision 159852
       [not found] <bug-44334-4@http.gcc.gnu.org/bugzilla/>
                   ` (5 preceding siblings ...)
  2010-12-21 10:46 ` dominiq at lps dot ens.fr
@ 2011-01-17 21:15 ` howarth at nitro dot med.uc.edu
  2011-01-17 21:16 ` howarth at nitro dot med.uc.edu
                   ` (30 subsequent siblings)
  37 siblings, 0 replies; 38+ messages in thread
From: howarth at nitro dot med.uc.edu @ 2011-01-17 21:15 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44334

--- Comment #18 from Jack Howarth <howarth at nitro dot med.uc.edu> 2011-01-17 21:12:03 UTC ---
Created attachment 23000
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=23000
assembly for gcc.dg/autopar/outer-2.c at -m32 with r168907

/Users/howarth/work/gcc/xgcc -B/Users/howarth/work/gcc/
/Users/howarth/gcc-4.6-20110116/gcc/testsuite/gcc.dg/autopar/outer-2.c -O2
-ftree-parallelize-loops=4 -fdump-tree-parloops-details -fdump-tree-optimized
-S -m32 -o outer-2.s



* [Bug lto/44334] rnflow.f90 ~27% slower with -fwhole-program -flto after revision 159852
       [not found] <bug-44334-4@http.gcc.gnu.org/bugzilla/>
                   ` (6 preceding siblings ...)
  2011-01-17 21:15 ` howarth at nitro dot med.uc.edu
@ 2011-01-17 21:16 ` howarth at nitro dot med.uc.edu
  2011-01-17 21:36 ` howarth at nitro dot med.uc.edu
                   ` (29 subsequent siblings)
  37 siblings, 0 replies; 38+ messages in thread
From: howarth at nitro dot med.uc.edu @ 2011-01-17 21:16 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44334

--- Comment #19 from Jack Howarth <howarth at nitro dot med.uc.edu> 2011-01-17 21:13:36 UTC ---
Created attachment 23001
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=23001
assembly for gcc.dg/autopar/outer-2.c at -m32 with patch from comment 14



* [Bug lto/44334] rnflow.f90 ~27% slower with -fwhole-program -flto after revision 159852
       [not found] <bug-44334-4@http.gcc.gnu.org/bugzilla/>
                   ` (7 preceding siblings ...)
  2011-01-17 21:16 ` howarth at nitro dot med.uc.edu
@ 2011-01-17 21:36 ` howarth at nitro dot med.uc.edu
  2011-01-17 21:43 ` howarth at nitro dot med.uc.edu
                   ` (28 subsequent siblings)
  37 siblings, 0 replies; 38+ messages in thread
From: howarth at nitro dot med.uc.edu @ 2011-01-17 21:36 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44334

--- Comment #20 from Jack Howarth <howarth at nitro dot med.uc.edu> 2011-01-17 21:14:10 UTC ---
Created attachment 23002
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=23002
parloops for gcc.dg/autopar/outer-2.c at -m32 with r168907



* [Bug lto/44334] rnflow.f90 ~27% slower with -fwhole-program -flto after revision 159852
       [not found] <bug-44334-4@http.gcc.gnu.org/bugzilla/>
                   ` (8 preceding siblings ...)
  2011-01-17 21:36 ` howarth at nitro dot med.uc.edu
@ 2011-01-17 21:43 ` howarth at nitro dot med.uc.edu
  2011-01-17 21:52 ` howarth at nitro dot med.uc.edu
                   ` (27 subsequent siblings)
  37 siblings, 0 replies; 38+ messages in thread
From: howarth at nitro dot med.uc.edu @ 2011-01-17 21:43 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44334

--- Comment #21 from Jack Howarth <howarth at nitro dot med.uc.edu> 2011-01-17 21:14:42 UTC ---
Created attachment 23003
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=23003
parloops for gcc.dg/autopar/outer-2.c at -m32 with patch from comment 14



* [Bug lto/44334] rnflow.f90 ~27% slower with -fwhole-program -flto after revision 159852
       [not found] <bug-44334-4@http.gcc.gnu.org/bugzilla/>
                   ` (9 preceding siblings ...)
  2011-01-17 21:43 ` howarth at nitro dot med.uc.edu
@ 2011-01-17 21:52 ` howarth at nitro dot med.uc.edu
  2011-01-17 21:56 ` howarth at nitro dot med.uc.edu
                   ` (26 subsequent siblings)
  37 siblings, 0 replies; 38+ messages in thread
From: howarth at nitro dot med.uc.edu @ 2011-01-17 21:52 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44334

--- Comment #22 from Jack Howarth <howarth at nitro dot med.uc.edu> 2011-01-17 21:15:38 UTC ---
Created attachment 23004
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=23004
optimized for gcc.dg/autopar/outer-2.c at -m32 with r168907



* [Bug lto/44334] rnflow.f90 ~27% slower with -fwhole-program -flto after revision 159852
       [not found] <bug-44334-4@http.gcc.gnu.org/bugzilla/>
                   ` (10 preceding siblings ...)
  2011-01-17 21:52 ` howarth at nitro dot med.uc.edu
@ 2011-01-17 21:56 ` howarth at nitro dot med.uc.edu
  2011-01-22 17:22 ` hubicka at gcc dot gnu.org
                   ` (25 subsequent siblings)
  37 siblings, 0 replies; 38+ messages in thread
From: howarth at nitro dot med.uc.edu @ 2011-01-17 21:56 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44334

--- Comment #23 from Jack Howarth <howarth at nitro dot med.uc.edu> 2011-01-17 21:16:22 UTC ---
Created attachment 23005
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=23005
optimized for gcc.dg/autopar/outer-2.c at -m32 with patch from comment 14



* [Bug lto/44334] rnflow.f90 ~27% slower with -fwhole-program -flto after revision 159852
       [not found] <bug-44334-4@http.gcc.gnu.org/bugzilla/>
                   ` (11 preceding siblings ...)
  2011-01-17 21:56 ` howarth at nitro dot med.uc.edu
@ 2011-01-22 17:22 ` hubicka at gcc dot gnu.org
  2011-01-22 21:54 ` hubicka at gcc dot gnu.org
                   ` (24 subsequent siblings)
  37 siblings, 0 replies; 38+ messages in thread
From: hubicka at gcc dot gnu.org @ 2011-01-22 17:22 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44334

--- Comment #24 from Jan Hubicka <hubicka at gcc dot gnu.org> 2011-01-22 16:23:49 UTC ---
PR 43884 has a similar problem with deep loop nests.



* [Bug lto/44334] rnflow.f90 ~27% slower with -fwhole-program -flto after revision 159852
       [not found] <bug-44334-4@http.gcc.gnu.org/bugzilla/>
                   ` (12 preceding siblings ...)
  2011-01-22 17:22 ` hubicka at gcc dot gnu.org
@ 2011-01-22 21:54 ` hubicka at gcc dot gnu.org
  2011-01-22 21:56 ` hubicka at gcc dot gnu.org
                   ` (23 subsequent siblings)
  37 siblings, 0 replies; 38+ messages in thread
From: hubicka at gcc dot gnu.org @ 2011-01-22 21:54 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44334

--- Comment #25 from Jan Hubicka <hubicka at gcc dot gnu.org> 2011-01-22 21:47:43 UTC ---
Author: hubicka
Date: Sat Jan 22 21:47:40 2011
New Revision: 169136

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=169136
Log:
    PR tree-optimization/43884
    PR lto/44334
    * predict.c (maybe_hot_frequency_p): Use entry block frequency as an base.
    * doc/invoke.texi (hot-bb-frequency-fraction): Update docs.
    * gcc.dg/autopar/outer-2.c: Increase array size.
    * gcc.dg/tree-ssa/ldist-pr45948.c: Update test.

Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/predict.c
    trunk/gcc/testsuite/ChangeLog
    trunk/gcc/testsuite/gcc.dg/autopar/outer-2.c
    trunk/gcc/testsuite/gcc.dg/tree-ssa/ldist-pr45948.c



* [Bug lto/44334] rnflow.f90 ~27% slower with -fwhole-program -flto after revision 159852
       [not found] <bug-44334-4@http.gcc.gnu.org/bugzilla/>
                   ` (13 preceding siblings ...)
  2011-01-22 21:54 ` hubicka at gcc dot gnu.org
@ 2011-01-22 21:56 ` hubicka at gcc dot gnu.org
  2011-01-23 10:10 ` howarth at nitro dot med.uc.edu
                   ` (22 subsequent siblings)
  37 siblings, 0 replies; 38+ messages in thread
From: hubicka at gcc dot gnu.org @ 2011-01-22 21:56 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44334

--- Comment #26 from Jan Hubicka <hubicka at gcc dot gnu.org> 2011-01-22 21:49:19 UTC ---
OK,
I committed the branch prediction change.  I am a bit confused by the rest of
the trail; can you please confirm whether the problem is fixed in all the
configurations mentioned?



* [Bug lto/44334] rnflow.f90 ~27% slower with -fwhole-program -flto after revision 159852
       [not found] <bug-44334-4@http.gcc.gnu.org/bugzilla/>
                   ` (14 preceding siblings ...)
  2011-01-22 21:56 ` hubicka at gcc dot gnu.org
@ 2011-01-23 10:10 ` howarth at nitro dot med.uc.edu
  2011-01-23 10:47 ` dominiq at lps dot ens.fr
                   ` (21 subsequent siblings)
  37 siblings, 0 replies; 38+ messages in thread
From: howarth at nitro dot med.uc.edu @ 2011-01-23 10:10 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44334

Jack Howarth <howarth at nitro dot med.uc.edu> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |howarth at nitro dot
                   |                            |med.uc.edu

--- Comment #27 from Jack Howarth <howarth at nitro dot med.uc.edu> 2011-01-23 03:36:02 UTC ---
On x86_64-apple-darwin10 at r169137, the pb05 benchmarks compiled with

benchmark  -O3 -ffast-math  -O3 -ffast-math -funroll-loops   %change
           -funroll-loops   -flto -fwhole-program

ac            8.81            8.81                            0.0
aermod       17.30           17.50                            1.2 
air           5.62            5.57                           -0.9
capacita     32.77           33.35                            1.8
channel       1.89            1.89                            0.0
doduc        26.58           26.52                           -0.2
fatigue       8.37            8.36                           -0.1
gas_dyn       4.36            4.35                           -0.2
induct       13.05           13.04                           -0.1
linpk        17.15           17.05                           -0.6
mdbx         11.25           11.26                            0.1
nf           32.14           33.50                            4.2
protein      32.50           32.27                           -0.7
rnflow       24.11           24.84                            3.0
test_fpu      8.22            8.20                           -0.2
tfft          1.89            1.88                           -0.5

Geometric    11.07           11.11                            0.4
Mean



* [Bug lto/44334] rnflow.f90 ~27% slower with -fwhole-program -flto after revision 159852
       [not found] <bug-44334-4@http.gcc.gnu.org/bugzilla/>
                   ` (15 preceding siblings ...)
  2011-01-23 10:10 ` howarth at nitro dot med.uc.edu
@ 2011-01-23 10:47 ` dominiq at lps dot ens.fr
  2011-01-23 13:16 ` dominiq at lps dot ens.fr
                   ` (20 subsequent siblings)
  37 siblings, 0 replies; 38+ messages in thread
From: dominiq at lps dot ens.fr @ 2011-01-23 10:47 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44334

--- Comment #28 from Dominique d'Humieres <dominiq at lps dot ens.fr> 2011-01-23 08:44:45 UTC ---
According to http://gcc.gnu.org/ml/gcc-regression/2011-01/msg00375.html
revision 169136 caused a bootstrap failure on powerpc-apple-darwin9.8.0:

....
/Users/regress/tbox/native/build/./prev-gcc/xgcc
-B/Users/regress/tbox/native/build/./prev-gcc/
-B/Users/regress/tbox/objs/powerpc-apple-darwin9.8.0/bin/
-B/Users/regress/tbox/objs/powerpc-apple-darwin9.8.0/bin/
-B/Users/regress/tbox/objs/powerpc-apple-darwin9.8.0/lib/ -isystem
/Users/regress/tbox/objs/powerpc-apple-darwin9.8.0/include -isystem
/Users/regress/tbox/objs/powerpc-apple-darwin9.8.0/sys-include    -c   -g -O2
-mdynamic-no-pic -gtoggle -DIN_GCC   -W -Wall -Wwrite-strings -Wcast-qual
-Wstrict-prototypes -Wmissing-prototypes -Wmissing-format-attribute -pedantic
-Wno-long-long -Wno-variadic-macros -Wno-overlength-strings -Werror
-Wold-style-definition -Wc++-compat -fno-common  -DHAVE_CONFIG_H -I. -I.
-I/Users/regress/tbox/svn-gcc/gcc -I/Users/regress/tbox/svn-gcc/gcc/.
-I/Users/regress/tbox/svn-gcc/gcc/../include -I./../intl
-I/Users/regress/tbox/svn-gcc/gcc/../libcpp/include 
-I/Users/regress/tbox/svn-gcc/gcc/../libdecnumber
-I/Users/regress/tbox/svn-gcc/gcc/../libdecnumber/dpd -I../libdecnumber   
/Users/regress/tbox/svn-gcc/gcc/compare-elim.c -o compare-elim.o
/Users/regress/tbox/svn-gcc/gcc/compare-elim.c: In function
'maybe_select_cc_mode':
/Users/regress/tbox/svn-gcc/gcc/compare-elim.c:407:58: error: unused parameter
'b' [-Werror=unused-parameter]
cc1: all warnings being treated as errors



* [Bug lto/44334] rnflow.f90 ~27% slower with -fwhole-program -flto after revision 159852
       [not found] <bug-44334-4@http.gcc.gnu.org/bugzilla/>
                   ` (16 preceding siblings ...)
  2011-01-23 10:47 ` dominiq at lps dot ens.fr
@ 2011-01-23 13:16 ` dominiq at lps dot ens.fr
  2011-01-23 13:17 ` dominiq at lps dot ens.fr
                   ` (19 subsequent siblings)
  37 siblings, 0 replies; 38+ messages in thread
From: dominiq at lps dot ens.fr @ 2011-01-23 13:16 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44334

--- Comment #29 from Dominique d'Humieres <dominiq at lps dot ens.fr> 2011-01-23 11:17:32 UTC ---
From http://gcc.gnu.org/ml/gcc-patches/2011-01/msg01607.html  the bootstrap
failure seems rather due to revision 169131. Note that revision 169142
bootstrapped on x86_64-apple-darwin10 configured with
--enable-checking=release.



* [Bug lto/44334] rnflow.f90 ~27% slower with -fwhole-program -flto after revision 159852
       [not found] <bug-44334-4@http.gcc.gnu.org/bugzilla/>
                   ` (17 preceding siblings ...)
  2011-01-23 13:16 ` dominiq at lps dot ens.fr
@ 2011-01-23 13:17 ` dominiq at lps dot ens.fr
  2011-01-23 13:19 ` dominiq at lps dot ens.fr
                   ` (18 subsequent siblings)
  37 siblings, 0 replies; 38+ messages in thread
From: dominiq at lps dot ens.fr @ 2011-01-23 13:17 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44334

--- Comment #30 from Dominique d'Humieres <dominiq at lps dot ens.fr> 2011-01-23 11:43:09 UTC ---
Concerning the timings in comment #27, they may reflect the fact that the inliner
is not aggressive enough for Fortran codes and that this gets worse when using
-flto:

For rnflow.f90 I get

26.75s   with -Ofast -funroll-loops -ftree-loop-linear -fomit-frame-pointer
26.66s   with -Ofast -funroll-loops -ftree-loop-linear -fomit-frame-pointer
-finline-limit=600
27.60s   with -Ofast -funroll-loops -ftree-loop-linear -fomit-frame-pointer
-fwhole-program -flto
27.14s   with -Ofast -funroll-loops -ftree-loop-linear -fomit-frame-pointer
-finline-limit=600 -fwhole-program -flto
26.79s  with -Ofast -funroll-loops -ftree-loop-linear -fomit-frame-pointer
-finline-limit=2000 -fwhole-program -flto

The result is more spectacular for fatigue.f90

8.50s    with -Ofast -funroll-loops -ftree-loop-linear -fomit-frame-pointer
-finline-limit=600 -fwhole-program -flto
4.69s    with -Ofast -funroll-loops -ftree-loop-linear -fomit-frame-pointer
-finline-limit=2000 -fwhole-program -flto

Note that revision 169136 seems to require higher values of -finline-limit:
before it, 600 was sufficient to see the speed-up (I have reported that in
another PR); now the required value has increased (I have not tried values lower
than 2000 yet).



* [Bug lto/44334] rnflow.f90 ~27% slower with -fwhole-program -flto after revision 159852
       [not found] <bug-44334-4@http.gcc.gnu.org/bugzilla/>
                   ` (18 preceding siblings ...)
  2011-01-23 13:17 ` dominiq at lps dot ens.fr
@ 2011-01-23 13:19 ` dominiq at lps dot ens.fr
  2011-01-23 13:53 ` hubicka at ucw dot cz
                   ` (17 subsequent siblings)
  37 siblings, 0 replies; 38+ messages in thread
From: dominiq at lps dot ens.fr @ 2011-01-23 13:19 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44334

--- Comment #31 from Dominique d'Humieres <dominiq at lps dot ens.fr> 2011-01-23 12:06:00 UTC ---
The relevant pr for comment #30 is pr45810 comment #9. The threshold for
fatigue.f90 was 322 before revision 169136 and is now 1520 (~x5).



* [Bug lto/44334] rnflow.f90 ~27% slower with -fwhole-program -flto after revision 159852
       [not found] <bug-44334-4@http.gcc.gnu.org/bugzilla/>
                   ` (19 preceding siblings ...)
  2011-01-23 13:19 ` dominiq at lps dot ens.fr
@ 2011-01-23 13:53 ` hubicka at ucw dot cz
  2011-01-23 14:30 ` hubicka at ucw dot cz
                   ` (16 subsequent siblings)
  37 siblings, 0 replies; 38+ messages in thread
From: hubicka at ucw dot cz @ 2011-01-23 13:53 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44334

--- Comment #33 from Jan Hubicka <hubicka at ucw dot cz> 2011-01-23 13:15:27 UTC ---
Please use -fdump-ipa-inline-details to generate the dump.  Perhaps we just
miscompute function body size somehow.



* [Bug lto/44334] rnflow.f90 ~27% slower with -fwhole-program -flto after revision 159852
       [not found] <bug-44334-4@http.gcc.gnu.org/bugzilla/>
                   ` (20 preceding siblings ...)
  2011-01-23 13:53 ` hubicka at ucw dot cz
@ 2011-01-23 14:30 ` hubicka at ucw dot cz
  2011-01-23 14:46 ` hubicka at ucw dot cz
                   ` (15 subsequent siblings)
  37 siblings, 0 replies; 38+ messages in thread
From: hubicka at ucw dot cz @ 2011-01-23 14:30 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44334

--- Comment #32 from Jan Hubicka <hubicka at ucw dot cz> 2011-01-23 13:14:56 UTC ---
> The relevant pr for comment #30 is pr45810 comment #9. The threshold for
> fatigue.f90 was 322 before revision 169136 and is now 1520 (~x5).
Interesting. Do you know what function we fail to inline?
Can you attach ipa-inline dump from both settings?
I know that c-ray also wants increased inline limits.  I can increase them a
bit, but not by a factor of 5, since that would cause a code size explosion at
-O3.  (I did some tests on this two weeks ago.)

Honza



* [Bug lto/44334] rnflow.f90 ~27% slower with -fwhole-program -flto after revision 159852
       [not found] <bug-44334-4@http.gcc.gnu.org/bugzilla/>
                   ` (21 preceding siblings ...)
  2011-01-23 14:30 ` hubicka at ucw dot cz
@ 2011-01-23 14:46 ` hubicka at ucw dot cz
  2011-01-23 15:50 ` dominiq at lps dot ens.fr
                   ` (14 subsequent siblings)
  37 siblings, 0 replies; 38+ messages in thread
From: hubicka at ucw dot cz @ 2011-01-23 14:46 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44334

--- Comment #34 from Jan Hubicka <hubicka at ucw dot cz> 2011-01-23 13:16:34 UTC ---
A pretty obvious fix for the compare-elim issue is adding ATTRIBUTE_UNUSED to the
b parameter.
It is used by the SELECT_CC_MODE macro, which by default is defined not to use it.

Honza



* [Bug lto/44334] rnflow.f90 ~27% slower with -fwhole-program -flto after revision 159852
       [not found] <bug-44334-4@http.gcc.gnu.org/bugzilla/>
                   ` (22 preceding siblings ...)
  2011-01-23 14:46 ` hubicka at ucw dot cz
@ 2011-01-23 15:50 ` dominiq at lps dot ens.fr
  2011-01-23 16:26 ` howarth at nitro dot med.uc.edu
                   ` (13 subsequent siblings)
  37 siblings, 0 replies; 38+ messages in thread
From: dominiq at lps dot ens.fr @ 2011-01-23 15:50 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44334

--- Comment #35 from Dominique d'Humieres <dominiq at lps dot ens.fr> 2011-01-23 15:02:43 UTC ---
> Do you know what function we fail to inline?

It is generalized_hookes_law.

I have looked at fatigue.f90 in more detail. With revision 168741, I see the
transitions:

 9.25s for inline-limit < 214
 6.50s for 213 < inline-limit < 322
 4.76s for 321 < inline-limit

With revision 169142, I see

 9.25s for inline-limit < 214
 6.50s for 213 < inline-limit < 322
 8.48s for 321 < inline-limit < 1520
 4.70s for 1519 < inline-limit

Indeed I may have missed other thresholds (especially in the range 322--1519).

I have dumps for values below and above the thresholds (10 of them). Do you
want them all, or only a subset? In the latter case, which ones?



* [Bug lto/44334] rnflow.f90 ~27% slower with -fwhole-program -flto after revision 159852
       [not found] <bug-44334-4@http.gcc.gnu.org/bugzilla/>
                   ` (23 preceding siblings ...)
  2011-01-23 15:50 ` dominiq at lps dot ens.fr
@ 2011-01-23 16:26 ` howarth at nitro dot med.uc.edu
  2011-01-23 16:33 ` howarth at nitro dot med.uc.edu
                   ` (12 subsequent siblings)
  37 siblings, 0 replies; 38+ messages in thread
From: howarth at nitro dot med.uc.edu @ 2011-01-23 16:26 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44334

--- Comment #36 from Jack Howarth <howarth at nitro dot med.uc.edu> 2011-01-23 15:45:19 UTC ---
Created attachment 23086
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=23086
bzip2 compressed ipa-inline-details dump without -finline-limit

generated at r169137 on x86_64-apple-darwin10 with...

gfortran -O3 -ffast-math -funroll-loops -fdump-ipa-inline-details -flto
-fwhole-program ../fatigue.f90 -o ../fatigue



* [Bug lto/44334] rnflow.f90 ~27% slower with -fwhole-program -flto after revision 159852
       [not found] <bug-44334-4@http.gcc.gnu.org/bugzilla/>
                   ` (24 preceding siblings ...)
  2011-01-23 16:26 ` howarth at nitro dot med.uc.edu
@ 2011-01-23 16:33 ` howarth at nitro dot med.uc.edu
  2011-01-23 16:35 ` howarth at nitro dot med.uc.edu
                   ` (11 subsequent siblings)
  37 siblings, 0 replies; 38+ messages in thread
From: howarth at nitro dot med.uc.edu @ 2011-01-23 16:33 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44334

--- Comment #37 from Jack Howarth <howarth at nitro dot med.uc.edu> 2011-01-23 15:47:54 UTC ---
Created attachment 23087
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=23087
bzip2 compressed ipa-inline-details dump with -finline-limit=600


generated at r169137 on x86_64-apple-darwin10 with...

gfortran -O3 -ffast-math -funroll-loops  -finline-limit=600
-fdump-ipa-inline-details -flto
-fwhole-program ../fatigue.f90 -o ../fatigue


^ permalink raw reply	[flat|nested] 38+ messages in thread

* [Bug lto/44334] rnflow.f90 ~27% slower with -fwhole-program -flto after revision 159852
       [not found] <bug-44334-4@http.gcc.gnu.org/bugzilla/>
                   ` (25 preceding siblings ...)
  2011-01-23 16:33 ` howarth at nitro dot med.uc.edu
@ 2011-01-23 16:35 ` howarth at nitro dot med.uc.edu
  2011-01-23 16:40 ` dominiq at lps dot ens.fr
                   ` (10 subsequent siblings)
  37 siblings, 0 replies; 38+ messages in thread
From: howarth at nitro dot med.uc.edu @ 2011-01-23 16:35 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44334

--- Comment #38 from Jack Howarth <howarth at nitro dot med.uc.edu> 2011-01-23 15:49:19 UTC ---
Created attachment 23088
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=23088
bzip2 compressed ipa-inline-details dump with -finline-limit=2000

generated at r169137 on x86_64-apple-darwin10 with...

gfortran -O3 -ffast-math -funroll-loops  -finline-limit=2000
-fdump-ipa-inline-details -flto
-fwhole-program ../fatigue.f90 -o ../fatigue


^ permalink raw reply	[flat|nested] 38+ messages in thread

* [Bug lto/44334] rnflow.f90 ~27% slower with -fwhole-program -flto after revision 159852
       [not found] <bug-44334-4@http.gcc.gnu.org/bugzilla/>
                   ` (26 preceding siblings ...)
  2011-01-23 16:35 ` howarth at nitro dot med.uc.edu
@ 2011-01-23 16:40 ` dominiq at lps dot ens.fr
  2011-01-23 16:45 ` dominiq at lps dot ens.fr
                   ` (9 subsequent siblings)
  37 siblings, 0 replies; 38+ messages in thread
From: dominiq at lps dot ens.fr @ 2011-01-23 16:40 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44334

--- Comment #39 from Dominique d'Humieres <dominiq at lps dot ens.fr> 2011-01-23 16:32:38 UTC ---
Created attachment 23089
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=23089
-finline-limit=321 revision 168741

bzip2 fatigue.f90.048i.inline generated at revision 168741 with -Ofast
-funroll-loops -ftree-loop-linear -fomit-frame-pointer -finline-limit=321
-fwhole-program -flto -fdump-ipa-inline-details


^ permalink raw reply	[flat|nested] 38+ messages in thread

* [Bug lto/44334] rnflow.f90 ~27% slower with -fwhole-program -flto after revision 159852
       [not found] <bug-44334-4@http.gcc.gnu.org/bugzilla/>
                   ` (27 preceding siblings ...)
  2011-01-23 16:40 ` dominiq at lps dot ens.fr
@ 2011-01-23 16:45 ` dominiq at lps dot ens.fr
  2011-01-23 17:04 ` dominiq at lps dot ens.fr
                   ` (8 subsequent siblings)
  37 siblings, 0 replies; 38+ messages in thread
From: dominiq at lps dot ens.fr @ 2011-01-23 16:45 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44334

--- Comment #40 from Dominique d'Humieres <dominiq at lps dot ens.fr> 2011-01-23 16:33:39 UTC ---
Created attachment 23090
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=23090
-finline-limit=322 revision 168741

bzip2 fatigue.f90.048i.inline generated at revision 168741 with -Ofast
-funroll-loops -ftree-loop-linear -fomit-frame-pointer -finline-limit=322
-fwhole-program -flto -fdump-ipa-inline-details


^ permalink raw reply	[flat|nested] 38+ messages in thread

* [Bug lto/44334] rnflow.f90 ~27% slower with -fwhole-program -flto after revision 159852
       [not found] <bug-44334-4@http.gcc.gnu.org/bugzilla/>
                   ` (28 preceding siblings ...)
  2011-01-23 16:45 ` dominiq at lps dot ens.fr
@ 2011-01-23 17:04 ` dominiq at lps dot ens.fr
  2011-01-23 17:56 ` dominiq at lps dot ens.fr
                   ` (7 subsequent siblings)
  37 siblings, 0 replies; 38+ messages in thread
From: dominiq at lps dot ens.fr @ 2011-01-23 17:04 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44334

--- Comment #41 from Dominique d'Humieres <dominiq at lps dot ens.fr> 2011-01-23 16:35:02 UTC ---
Created attachment 23091
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=23091
-finline-limit=321 revision 169142

bzip2 fatigue.f90.048i.inline generated at revision 169142  with -Ofast
-funroll-loops -ftree-loop-linear -fomit-frame-pointer -finline-limit=321
-fwhole-program -flto -fdump-ipa-inline-details


^ permalink raw reply	[flat|nested] 38+ messages in thread

* [Bug lto/44334] rnflow.f90 ~27% slower with -fwhole-program -flto after revision 159852
       [not found] <bug-44334-4@http.gcc.gnu.org/bugzilla/>
                   ` (29 preceding siblings ...)
  2011-01-23 17:04 ` dominiq at lps dot ens.fr
@ 2011-01-23 17:56 ` dominiq at lps dot ens.fr
  2011-01-23 21:06 ` howarth at nitro dot med.uc.edu
                   ` (6 subsequent siblings)
  37 siblings, 0 replies; 38+ messages in thread
From: dominiq at lps dot ens.fr @ 2011-01-23 17:56 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44334

--- Comment #42 from Dominique d'Humieres <dominiq at lps dot ens.fr> 2011-01-23 16:36:00 UTC ---
Created attachment 23092
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=23092
-finline-limit=322 revision 169142

bzip2 fatigue.f90.048i.inline generated at revision 169142  with -Ofast
-funroll-loops -ftree-loop-linear -fomit-frame-pointer -finline-limit=322
-fwhole-program -flto -fdump-ipa-inline-details


^ permalink raw reply	[flat|nested] 38+ messages in thread

* [Bug lto/44334] rnflow.f90 ~27% slower with -fwhole-program -flto after revision 159852
       [not found] <bug-44334-4@http.gcc.gnu.org/bugzilla/>
                   ` (30 preceding siblings ...)
  2011-01-23 17:56 ` dominiq at lps dot ens.fr
@ 2011-01-23 21:06 ` howarth at nitro dot med.uc.edu
  2011-01-25  4:15 ` howarth at nitro dot med.uc.edu
                   ` (5 subsequent siblings)
  37 siblings, 0 replies; 38+ messages in thread
From: howarth at nitro dot med.uc.edu @ 2011-01-23 21:06 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44334

--- Comment #43 from Jack Howarth <howarth at nitro dot med.uc.edu> 2011-01-23 18:07:36 UTC ---
On x86_64-apple-darwin10 at r169137, the pb05 benchmarks compiled with

  (A) -O3 -ffast-math -funroll-loops -flto -fwhole-program
  (B) -O3 -ffast-math -funroll-loops -finline-limit=2000 -flto -fwhole-program

give:

benchmark          (A)        (B)    %change

ac                8.81       7.30      -17.1
aermod           17.50      17.43       -0.4
air               5.57       5.57        0.0
capacita         33.35      31.86       -4.5
channel           1.89       1.76       -6.9
doduc            26.52      25.15       -5.2
fatigue           8.36       4.21      -49.6
gas_dyn           4.35       4.28       -1.6
induct           13.04      13.05        0.1
linpk            17.05      17.31        1.5
mdbx             11.26      11.26        0.0
nf               33.50      30.97       -7.6
protein          32.27      32.62        1.1
rnflow           24.84      24.16       -2.7
test_fpu          8.20       8.90        8.5
tfft              1.88       1.94        3.2

Geometric Mean   11.11      10.42       -6.2
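
For readers who want to re-derive the %change column and the geometric-mean row, here is a minimal sketch. The numbers are copied from the table above; the helper names `percent_change` and `geometric_mean` are made up for this illustration, and the comments assume the two timing columns are run times (lower is better), which the message does not state explicitly.

```python
import math

# (benchmark, first timing column, second timing column), copied from the table.
TIMINGS = [
    ("ac", 8.81, 7.30),
    ("aermod", 17.50, 17.43),
    ("air", 5.57, 5.57),
    ("capacita", 33.35, 31.86),
    ("channel", 1.89, 1.76),
    ("doduc", 26.52, 25.15),
    ("fatigue", 8.36, 4.21),
    ("gas_dyn", 4.35, 4.28),
    ("induct", 13.04, 13.05),
    ("linpk", 17.05, 17.31),
    ("mdbx", 11.26, 11.26),
    ("nf", 33.50, 30.97),
    ("protein", 32.27, 32.62),
    ("rnflow", 24.84, 24.16),
    ("test_fpu", 8.20, 8.90),
    ("tfft", 1.88, 1.94),
]


def percent_change(before, after):
    """Relative change in percent; negative means the second value is smaller."""
    return (after - before) / before * 100.0


def geometric_mean(values):
    """Geometric mean computed through logarithms to avoid a huge product."""
    values = list(values)
    return math.exp(sum(math.log(v) for v in values) / len(values))


if __name__ == "__main__":
    for name, before, after in TIMINGS:
        change = percent_change(before, after)
        print(f"{name:10s} {before:7.2f} {after:7.2f} {change:7.1f}")
    gm_before = geometric_mean(t[1] for t in TIMINGS)
    gm_after = geometric_mean(t[2] for t in TIMINGS)
    gm_change = percent_change(gm_before, gm_after)
    print(f"{'geomean':10s} {gm_before:7.2f} {gm_after:7.2f} {gm_change:7.1f}")
```

The geometric mean is used here rather than the arithmetic mean so that long-running benchmarks such as capacita or nf do not dominate the summary.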


^ permalink raw reply	[flat|nested] 38+ messages in thread

* [Bug lto/44334] rnflow.f90 ~27% slower with -fwhole-program -flto after revision 159852
       [not found] <bug-44334-4@http.gcc.gnu.org/bugzilla/>
                   ` (31 preceding siblings ...)
  2011-01-23 21:06 ` howarth at nitro dot med.uc.edu
@ 2011-01-25  4:15 ` howarth at nitro dot med.uc.edu
  2011-01-25 10:31 ` rguenther at suse dot de
                   ` (4 subsequent siblings)
  37 siblings, 0 replies; 38+ messages in thread
From: howarth at nitro dot med.uc.edu @ 2011-01-25  4:15 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44334

--- Comment #44 from Jack Howarth <howarth at nitro dot med.uc.edu> 2011-01-25 03:13:39 UTC ---
Testing...

Index: gcc/params.def
===================================================================
--- gcc/params.def    (revision 169185)
+++ gcc/params.def    (working copy)
@@ -182,7 +182,7 @@ DEFPARAM(PARAM_LARGE_FUNCTION_INSNS,
 DEFPARAM(PARAM_LARGE_FUNCTION_GROWTH,
      "large-function-growth",
      "Maximal growth due to inlining of large function (in percent)",
-     100, 0, 0)
+     400, 0, 0)
 DEFPARAM(PARAM_LARGE_UNIT_INSNS,
      "large-unit-insns",
      "The size of translation unit to be considered large",

shows a major improvement only for fatigue (30%). The same improvement can be
achieved at -m32 and -m64 by increasing large-function-growth to just 200.
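
The same experiment can presumably be run without patching params.def, since `--param large-function-growth=N` on the command line overrides the built-in default. The sketch below only builds the command lines for a few values; `gfortran_command` is a hypothetical helper written for this illustration, and the base flags and file names are the ones quoted earlier in this thread.

```python
import shlex

# Base flags copied from the gfortran commands quoted in this thread.
BASE_FLAGS = ["-O3", "-ffast-math", "-funroll-loops", "-flto", "-fwhole-program"]


def gfortran_command(source, output, large_function_growth=None, inline_limit=None):
    """Return an argv list for gfortran (hypothetical helper, not part of GCC)."""
    argv = ["gfortran", *BASE_FLAGS]
    if inline_limit is not None:
        argv.append(f"-finline-limit={inline_limit}")
    if large_function_growth is not None:
        # --param NAME=VALUE overrides the default recorded in params.def.
        argv += ["--param", f"large-function-growth={large_function_growth}"]
    argv += [source, "-o", output]
    return argv


if __name__ == "__main__":
    # 100 is the default shown in the patch; 200 and 400 are the values tried.
    for growth in (100, 200, 400):
        argv = gfortran_command("../fatigue.f90", "../fatigue",
                                large_function_growth=growth)
        print(shlex.join(argv))
```

Printing the commands instead of running them keeps the sketch usable on machines without gfortran; passing each list to `subprocess.run` would perform the actual builds.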


^ permalink raw reply	[flat|nested] 38+ messages in thread

* [Bug lto/44334] rnflow.f90 ~27% slower with -fwhole-program -flto after revision 159852
       [not found] <bug-44334-4@http.gcc.gnu.org/bugzilla/>
                   ` (32 preceding siblings ...)
  2011-01-25  4:15 ` howarth at nitro dot med.uc.edu
@ 2011-01-25 10:31 ` rguenther at suse dot de
  2011-01-25 18:47 ` jh at suse dot de
                   ` (3 subsequent siblings)
  37 siblings, 0 replies; 38+ messages in thread
From: rguenther at suse dot de @ 2011-01-25 10:31 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44334

--- Comment #45 from rguenther at suse dot de <rguenther at suse dot de> 2011-01-25 10:20:01 UTC ---
On Tue, 25 Jan 2011, howarth at nitro dot med.uc.edu wrote:

> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44334
> 
> --- Comment #44 from Jack Howarth <howarth at nitro dot med.uc.edu> 2011-01-25 03:13:39 UTC ---
> Testing...
> 
> Index: gcc/params.def
> ===================================================================
> --- gcc/params.def    (revision 169185)
> +++ gcc/params.def    (working copy)
> @@ -182,7 +182,7 @@ DEFPARAM(PARAM_LARGE_FUNCTION_INSNS,
>  DEFPARAM(PARAM_LARGE_FUNCTION_GROWTH,
>       "large-function-growth",
>       "Maximal growth due to inlining of large function (in percent)",
> -     100, 0, 0)
> +     400, 0, 0)
>  DEFPARAM(PARAM_LARGE_UNIT_INSNS,
>       "large-unit-insns",
>       "The size of translation unit to be considered large",
> 
> shows only a major improvement for fatigue (30%). This same improvement can be
> achieved at -m32 and -m64 with just an increase of large-function-growth to
> 200.

We certainly won't adjust params at this stage.  There are other cases
(that c-ray one) where more aggressive inlining helps, but we should
avoid regressing for -O2 and only tune -O3 params eventually.


^ permalink raw reply	[flat|nested] 38+ messages in thread

* [Bug lto/44334] rnflow.f90 ~27% slower with -fwhole-program -flto after revision 159852
       [not found] <bug-44334-4@http.gcc.gnu.org/bugzilla/>
                   ` (33 preceding siblings ...)
  2011-01-25 10:31 ` rguenther at suse dot de
@ 2011-01-25 18:47 ` jh at suse dot de
  2011-01-25 19:40 ` dominiq at lps dot ens.fr
                   ` (2 subsequent siblings)
  37 siblings, 0 replies; 38+ messages in thread
From: jh at suse dot de @ 2011-01-25 18:47 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44334

--- Comment #46 from jh at suse dot de 2011-01-25 17:57:38 UTC ---
I settled on increasing the large function growth ratio as the safest way
to deal with (the easier half of) this problem. Unlike the parameters for
inline limits, it won't cause code size issues. It just allows somewhat
bigger functions and thus puts more stress on the linearity of the backend.

Given that the parameter has never been tuned since its inclusion in GCC
4.2, I guess we are not terribly sensitive here. We also improved
scalability a bit here, as I tuned the df code a bit for LTO and
spaghetti code.

Otherwise we need to wait for 4.7 or possibly 4.6.1. That is fine with me.
I will still run tests tonight on how increasing the parameter affects
our tester. This is the first time I have seen it hit in a
performance-sensitive way. I am not sure how common it is in practice, since
I never really tried to change it.
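
As a rough illustration of why this parameter only lets already large functions grow further, here is a toy model of the check. It is a simplified sketch based on the parameter descriptions, not GCC's actual inliner code: the function name `inlining_allowed` is invented, and the sizes and the large-function-insns threshold in the example are illustrative values only.

```python
def inlining_allowed(size_before, size_after,
                     large_function_insns, large_function_growth):
    """Toy model of the large-function growth limit (not GCC's real code).

    size_before: estimated size of the caller before any inlining
    size_after: estimated size of the caller after the candidate inline
    large_function_insns: size above which a function counts as large
    large_function_growth: maximal growth of a large function, in percent
    """
    # Functions that stay below the "large" threshold are not constrained here.
    if size_after <= large_function_insns:
        return True
    # A large function may grow by at most large_function_growth percent.
    limit = size_before + size_before * large_function_growth // 100
    return size_after <= limit


if __name__ == "__main__":
    # Illustrative numbers: a caller of size 3000 that would grow to 7000.
    for growth in (100, 200):
        ok = inlining_allowed(3000, 7000, large_function_insns=2700,
                              large_function_growth=growth)
        print(f"large-function-growth={growth}: allowed={ok}")
```

In this model, raising the growth percentage never affects small functions, which is consistent with the remark that it should not cause general code size issues.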


^ permalink raw reply	[flat|nested] 38+ messages in thread

* [Bug lto/44334] rnflow.f90 ~27% slower with -fwhole-program -flto after revision 159852
       [not found] <bug-44334-4@http.gcc.gnu.org/bugzilla/>
                   ` (34 preceding siblings ...)
  2011-01-25 18:47 ` jh at suse dot de
@ 2011-01-25 19:40 ` dominiq at lps dot ens.fr
  2011-01-25 21:49 ` howarth at nitro dot med.uc.edu
  2011-02-16 17:24 ` dominiq at lps dot ens.fr
  37 siblings, 0 replies; 38+ messages in thread
From: dominiq at lps dot ens.fr @ 2011-01-25 19:40 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44334

--- Comment #47 from Dominique d'Humieres <dominiq at lps dot ens.fr> 2011-01-25 19:06:04 UTC ---
> I sorted out increasing large function growth ratio as most safe way  
> to deal with (easier half of) this problem. Unlike the parameters for  
> inline limits it won't cause code size issues. It just allow somewhat  
> bigger functions and thus stress more the backend on its linearity.

Well, the choice is not '-finline-limit' versus '--param
large-function-growth': some polyhedron tests are sensitive to the value of
'-finline-limit' (ac, channel, fatigue, ...), and for most of them '--param
large-function-growth' does not change anything.

fatigue is quite peculiar in that there is a big speed-up with -fwhole-program
for -finline-limit>=322 and an additional small speed-up for --param
large-function-growth>=132. In addition, the latter prevents a bad choice with
-flto (this should probably be discussed in pr 45810 and this pr closed as
fixed).

Note that I am not interested in fine tuning, but in finding acceptable
default parameter values that give good results for all (most ;-) Fortran
codes.


^ permalink raw reply	[flat|nested] 38+ messages in thread

* [Bug lto/44334] rnflow.f90 ~27% slower with -fwhole-program -flto after revision 159852
       [not found] <bug-44334-4@http.gcc.gnu.org/bugzilla/>
                   ` (35 preceding siblings ...)
  2011-01-25 19:40 ` dominiq at lps dot ens.fr
@ 2011-01-25 21:49 ` howarth at nitro dot med.uc.edu
  2011-02-16 17:24 ` dominiq at lps dot ens.fr
  37 siblings, 0 replies; 38+ messages in thread
From: howarth at nitro dot med.uc.edu @ 2011-01-25 21:49 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44334

--- Comment #48 from Jack Howarth <howarth at nitro dot med.uc.edu> 2011-01-25 21:29:53 UTC ---
(In reply to comment #47)

> Well, the choice is not '-finline-limit' versus '--param
> large-function-growth': some polyhedron tests are sensitive to some value of
> '-finline-limit' (ac, channel, fatigue, ...) and for most of them '--param
> large-function-growth' does not change anything. 
> 
> fatigue is quite peculiar in that there is a big speed-up with -fwhole-program
> for -finline-limit>=322and an additional small speed-up for --param
> large-function-growth>=132. In addition the later prevent a bad choice with
> -flto (this should probably be discussed in pr 45810 and this pr closed as
> fixed).
> 
> Note that I am not interested by fine tuning, but to find some acceptable
> values of the default parameters that give good results for all (most;-)
> fortran codes).

In my tests, --param large-function-growth=200 was sufficient to yield 60% of
the performance increase in the fatigue benchmark obtained by modifying both
-finline-limit and --param large-function-growth.  Unlike increasing
-finline-limit, none of the other pb05 benchmarks showed even minor regressions
in speed.


^ permalink raw reply	[flat|nested] 38+ messages in thread

* [Bug lto/44334] rnflow.f90 ~27% slower with -fwhole-program -flto after revision 159852
       [not found] <bug-44334-4@http.gcc.gnu.org/bugzilla/>
                   ` (36 preceding siblings ...)
  2011-01-25 21:49 ` howarth at nitro dot med.uc.edu
@ 2011-02-16 17:24 ` dominiq at lps dot ens.fr
  37 siblings, 0 replies; 38+ messages in thread
From: dominiq at lps dot ens.fr @ 2011-02-16 17:24 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44334

Dominique d'Humieres <dominiq at lps dot ens.fr> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|WAITING                     |RESOLVED
         Resolution|                            |FIXED

--- Comment #49 from Dominique d'Humieres <dominiq at lps dot ens.fr> 2011-02-16 17:21:14 UTC ---
Since it seems that revision 169136 is the "right fix", I am closing this PR as
fixed. Any further discussion about the interaction between -fwhole-program and
-flto for the polyhedron test fatigue.f90 should take place in pr45810.


^ permalink raw reply	[flat|nested] 38+ messages in thread

