public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed
* [Bug tree-optimization/34683] New: compile-time problem with -fstrict-aliasing
@ 2008-01-05 23:51 jaydub66 at gmail dot com
2008-01-06 0:37 ` [Bug tree-optimization/34683] " jaydub66 at gmail dot com
` (49 more replies)
0 siblings, 50 replies; 51+ messages in thread
From: jaydub66 at gmail dot com @ 2008-01-05 23:51 UTC (permalink / raw)
To: gcc-bugs
The attached Fortran file takes unusually long to compile with full
optimization on the latest GCC trunk builds. I tried a lot of the optimization
flags, and found that -fstrict-aliasing seems to trigger the long compile-time.
Without this flag the compile-time is perfectly reasonable, as you can see from
the following numbers (using trunk revision 131325 on i686-pc-linux-gnu,
Pentium M 1,7Ghz and 1GB memory):
-O0: 0.9s
-O1: 4.5s
-O2 -fno-strict-aliasing: 8.0s
-O3 -fno-strict-aliasing: 10.4s
But once I use -fstrict-aliasing, compile-time explodes:
-O1 -fstrict-aliasing: 1m10s
-O2: 2m16s
-O3: 1m43s
(-fstrict-aliasing is included in -O2 and -O3, but not in -O1)
I don't know why -O3 is faster than -O2 here (which seems funny), but anyway
all three numbers are way too large in my opinion. I mean: compile-time rises
by a factor of roughly 15, just due to this one flag! I don't even know what
the flag is supposed to do for this code. Maybe someone can explain this to me?
I didn't examine memory consumption very carefully, but this also seems
unusually high, as a similar file failed to compile on an old machine (with
only 256MB) due to lack of memory(!).
Am I right to assume that something goes wrong here somehow, or can someone
tell me why I should expect this kind of behaviour for this code?
--
Summary: compile-time problem with -fstrict-aliasing
Product: gcc
Version: 4.3.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: tree-optimization
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: jaydub66 at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34683
^ permalink raw reply [flat|nested] 51+ messages in thread
* [Bug tree-optimization/34683] compile-time problem with -fstrict-aliasing
2008-01-05 23:51 [Bug tree-optimization/34683] New: compile-time problem with -fstrict-aliasing jaydub66 at gmail dot com
@ 2008-01-06 0:37 ` jaydub66 at gmail dot com
2008-01-06 0:55 ` pinskia at gcc dot gnu dot org
` (48 subsequent siblings)
49 siblings, 0 replies; 51+ messages in thread
From: jaydub66 at gmail dot com @ 2008-01-06 0:37 UTC (permalink / raw)
To: gcc-bugs
------- Comment #1 from jaydub66 at gmail dot com 2008-01-05 23:36 -------
Created an attachment (id=14885)
--> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=14885&action=view)
test case
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34683
^ permalink raw reply [flat|nested] 51+ messages in thread
* [Bug tree-optimization/34683] compile-time problem with -fstrict-aliasing
2008-01-05 23:51 [Bug tree-optimization/34683] New: compile-time problem with -fstrict-aliasing jaydub66 at gmail dot com
2008-01-06 0:37 ` [Bug tree-optimization/34683] " jaydub66 at gmail dot com
@ 2008-01-06 0:55 ` pinskia at gcc dot gnu dot org
2008-01-06 6:47 ` jaydub66 at gmail dot com
` (47 subsequent siblings)
49 siblings, 0 replies; 51+ messages in thread
From: pinskia at gcc dot gnu dot org @ 2008-01-06 0:55 UTC (permalink / raw)
To: gcc-bugs
------- Comment #2 from pinskia at gcc dot gnu dot org 2008-01-05 23:37 -------
Can you give the output of the compiler when -ftime-report is added?
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34683
^ permalink raw reply [flat|nested] 51+ messages in thread
* [Bug tree-optimization/34683] compile-time problem with -fstrict-aliasing
2008-01-05 23:51 [Bug tree-optimization/34683] New: compile-time problem with -fstrict-aliasing jaydub66 at gmail dot com
2008-01-06 0:37 ` [Bug tree-optimization/34683] " jaydub66 at gmail dot com
2008-01-06 0:55 ` pinskia at gcc dot gnu dot org
@ 2008-01-06 6:47 ` jaydub66 at gmail dot com
2008-01-06 8:08 ` jaydub66 at gmail dot com
` (46 subsequent siblings)
49 siblings, 0 replies; 51+ messages in thread
From: jaydub66 at gmail dot com @ 2008-01-06 6:47 UTC (permalink / raw)
To: gcc-bugs
------- Comment #3 from jaydub66 at gmail dot com 2008-01-05 23:46 -------
(In reply to comment #2)
> Can you give the output of the compiler when -ftime-report is added?
Sure thing, thanks for the remark. Here it goes:
gfortran-4.3 -c -ftime-report -O1 -fstrict-aliasing Amplitudes.f90
Execution times (seconds)
garbage collection : 0.21 ( 0%) usr 0.02 ( 2%) sys 0.24 ( 0%) wall
0 kB ( 0%) ggc
callgraph construction: 0.02 ( 0%) usr 0.00 ( 0%) sys 0.02 ( 0%) wall
299 kB ( 0%) ggc
callgraph optimization: 0.02 ( 0%) usr 0.00 ( 0%) sys 0.03 ( 0%) wall
62 kB ( 0%) ggc
trivially dead code : 0.01 ( 0%) usr 0.00 ( 0%) sys 0.00 ( 0%) wall
0 kB ( 0%) ggc
df reaching defs : 0.02 ( 0%) usr 0.00 ( 0%) sys 0.02 ( 0%) wall
0 kB ( 0%) ggc
df live regs : 0.10 ( 0%) usr 0.00 ( 0%) sys 0.09 ( 0%) wall
0 kB ( 0%) ggc
df live&initialized regs: 0.04 ( 0%) usr 0.00 ( 0%) sys 0.04 ( 0%) wall
0 kB ( 0%) ggc
df reg dead/unused notes: 0.02 ( 0%) usr 0.00 ( 0%) sys 0.01 ( 0%) wall
127 kB ( 0%) ggc
register information : 0.01 ( 0%) usr 0.00 ( 0%) sys 0.02 ( 0%) wall
0 kB ( 0%) ggc
alias analysis : 0.02 ( 0%) usr 0.00 ( 0%) sys 0.01 ( 0%) wall
196 kB ( 0%) ggc
rebuild jump labels : 0.01 ( 0%) usr 0.00 ( 0%) sys 0.01 ( 0%) wall
0 kB ( 0%) ggc
parser : 0.08 ( 0%) usr 0.00 ( 0%) sys 0.08 ( 0%) wall
1848 kB ( 0%) ggc
inline heuristics : 0.06 ( 0%) usr 0.00 ( 0%) sys 0.05 ( 0%) wall
1 kB ( 0%) ggc
tree gimplify : 0.04 ( 0%) usr 0.00 ( 0%) sys 0.04 ( 0%) wall
847 kB ( 0%) ggc
tree CFG construction : 0.00 ( 0%) usr 0.00 ( 0%) sys 0.01 ( 0%) wall
769 kB ( 0%) ggc
tree CFG cleanup : 4.20 ( 5%) usr 0.06 ( 7%) sys 4.60 ( 5%) wall
2550 kB ( 0%) ggc
tree copy propagation : 0.25 ( 0%) usr 0.01 ( 1%) sys 0.31 ( 0%) wall
24 kB ( 0%) ggc
tree PTA : 0.02 ( 0%) usr 0.00 ( 0%) sys 0.02 ( 0%) wall
67 kB ( 0%) ggc
tree alias analysis : 0.01 ( 0%) usr 0.01 ( 1%) sys 0.01 ( 0%) wall
2 kB ( 0%) ggc
tree call clobbering : 0.04 ( 0%) usr 0.00 ( 0%) sys 0.04 ( 0%) wall
0 kB ( 0%) ggc
tree flow sensitive alias: 0.01 ( 0%) usr 0.00 ( 0%) sys 0.00 ( 0%) wall
16 kB ( 0%) ggc
tree memory partitioning: 0.07 ( 0%) usr 0.00 ( 0%) sys 0.08 ( 0%) wall
0 kB ( 0%) ggc
tree SSA rewrite : 0.48 ( 1%) usr 0.01 ( 1%) sys 0.53 ( 1%) wall
8455 kB ( 1%) ggc
tree SSA other : 0.02 ( 0%) usr 0.00 ( 0%) sys 0.01 ( 0%) wall
0 kB ( 0%) ggc
tree SSA incremental : 3.26 ( 4%) usr 0.08 ( 9%) sys 3.62 ( 4%) wall
19615 kB ( 2%) ggc
tree operand scan : 49.67 (63%) usr 0.48 (52%) sys 57.11 (64%) wall
247409 kB (30%) ggc
dominator optimization: 0.85 ( 1%) usr 0.02 ( 2%) sys 1.57 ( 2%) wall
14019 kB ( 2%) ggc
tree SRA : 0.02 ( 0%) usr 0.00 ( 0%) sys 0.02 ( 0%) wall
27 kB ( 0%) ggc
tree STORE-CCP : 0.09 ( 0%) usr 0.00 ( 0%) sys 0.09 ( 0%) wall
1 kB ( 0%) ggc
tree CCP : 0.07 ( 0%) usr 0.01 ( 1%) sys 0.08 ( 0%) wall
15 kB ( 0%) ggc
tree PHI const/copy prop: 0.35 ( 0%) usr 0.00 ( 0%) sys 0.37 ( 0%) wall
0 kB ( 0%) ggc
tree split crit edges : 0.03 ( 0%) usr 0.00 ( 0%) sys 0.04 ( 0%) wall
2489 kB ( 0%) ggc
tree reassociation : 0.00 ( 0%) usr 0.00 ( 0%) sys 0.01 ( 0%) wall
72 kB ( 0%) ggc
tree FRE : 16.12 (20%) usr 0.17 (18%) sys 16.75 (19%) wall
505793 kB (62%) ggc
tree code sinking : 0.01 ( 0%) usr 0.00 ( 0%) sys 0.01 ( 0%) wall
41 kB ( 0%) ggc
tree forward propagate: 0.01 ( 0%) usr 0.00 ( 0%) sys 0.03 ( 0%) wall
7 kB ( 0%) ggc
tree conservative DCE : 0.18 ( 0%) usr 0.00 ( 0%) sys 0.19 ( 0%) wall
0 kB ( 0%) ggc
tree aggressive DCE : 0.05 ( 0%) usr 0.00 ( 0%) sys 0.04 ( 0%) wall
0 kB ( 0%) ggc
tree DSE : 0.07 ( 0%) usr 0.00 ( 0%) sys 0.07 ( 0%) wall
1879 kB ( 0%) ggc
tree loop bounds : 0.01 ( 0%) usr 0.00 ( 0%) sys 0.02 ( 0%) wall
160 kB ( 0%) ggc
loop invariant motion : 0.02 ( 0%) usr 0.00 ( 0%) sys 0.02 ( 0%) wall
1 kB ( 0%) ggc
tree canonical iv : 0.01 ( 0%) usr 0.00 ( 0%) sys 0.02 ( 0%) wall
138 kB ( 0%) ggc
scev constant prop : 0.02 ( 0%) usr 0.00 ( 0%) sys 0.02 ( 0%) wall
68 kB ( 0%) ggc
complete unrolling : 0.30 ( 0%) usr 0.01 ( 1%) sys 0.36 ( 0%) wall
2694 kB ( 0%) ggc
tree iv optimization : 0.17 ( 0%) usr 0.00 ( 0%) sys 0.20 ( 0%) wall
3958 kB ( 0%) ggc
tree loop init : 0.04 ( 0%) usr 0.00 ( 0%) sys 0.04 ( 0%) wall
52 kB ( 0%) ggc
tree copy headers : 0.02 ( 0%) usr 0.00 ( 0%) sys 0.04 ( 0%) wall
1280 kB ( 0%) ggc
tree SSA uncprop : 0.03 ( 0%) usr 0.00 ( 0%) sys 0.03 ( 0%) wall
0 kB ( 0%) ggc
tree SSA to normal : 0.11 ( 0%) usr 0.01 ( 1%) sys 0.12 ( 0%) wall
274 kB ( 0%) ggc
tree rename SSA copies: 0.05 ( 0%) usr 0.00 ( 0%) sys 0.07 ( 0%) wall
0 kB ( 0%) ggc
dominance frontiers : 0.06 ( 0%) usr 0.00 ( 0%) sys 0.09 ( 0%) wall
0 kB ( 0%) ggc
dominance computation : 0.03 ( 0%) usr 0.00 ( 0%) sys 0.01 ( 0%) wall
0 kB ( 0%) ggc
expand : 0.91 ( 1%) usr 0.02 ( 2%) sys 1.01 ( 1%) wall
3726 kB ( 0%) ggc
lower subreg : 0.00 ( 0%) usr 0.00 ( 0%) sys 0.01 ( 0%) wall
8 kB ( 0%) ggc
CSE : 0.05 ( 0%) usr 0.00 ( 0%) sys 0.05 ( 0%) wall
49 kB ( 0%) ggc
dead code elimination : 0.01 ( 0%) usr 0.00 ( 0%) sys 0.01 ( 0%) wall
0 kB ( 0%) ggc
dead store elim1 : 0.04 ( 0%) usr 0.00 ( 0%) sys 0.04 ( 0%) wall
249 kB ( 0%) ggc
dead store elim2 : 0.03 ( 0%) usr 0.00 ( 0%) sys 0.04 ( 0%) wall
214 kB ( 0%) ggc
loop analysis : 0.06 ( 0%) usr 0.00 ( 0%) sys 0.06 ( 0%) wall
83 kB ( 0%) ggc
branch prediction : 0.01 ( 0%) usr 0.00 ( 0%) sys 0.01 ( 0%) wall
154 kB ( 0%) ggc
combiner : 0.09 ( 0%) usr 0.00 ( 0%) sys 0.09 ( 0%) wall
485 kB ( 0%) ggc
if-conversion : 0.02 ( 0%) usr 0.00 ( 0%) sys 0.03 ( 0%) wall
58 kB ( 0%) ggc
local alloc : 0.06 ( 0%) usr 0.00 ( 0%) sys 0.06 ( 0%) wall
217 kB ( 0%) ggc
global alloc : 0.26 ( 0%) usr 0.01 ( 1%) sys 0.28 ( 0%) wall
1050 kB ( 0%) ggc
reload CSE regs : 0.07 ( 0%) usr 0.00 ( 0%) sys 0.09 ( 0%) wall
258 kB ( 0%) ggc
thread pro- & epilogue: 0.01 ( 0%) usr 0.00 ( 0%) sys 0.01 ( 0%) wall
5 kB ( 0%) ggc
rename registers : 0.01 ( 0%) usr 0.00 ( 0%) sys 0.03 ( 0%) wall
0 kB ( 0%) ggc
machine dep reorg : 0.02 ( 0%) usr 0.00 ( 0%) sys 0.03 ( 0%) wall
0 kB ( 0%) ggc
reg stack : 0.01 ( 0%) usr 0.00 ( 0%) sys 0.01 ( 0%) wall
34 kB ( 0%) ggc
final : 0.03 ( 0%) usr 0.00 ( 0%) sys 0.03 ( 0%) wall
0 kB ( 0%) ggc
TOTAL : 79.11 0.92 89.22
822312 kB
So we have 30% for "tree operand scan" and 62% for "tree FRE". I have no idea
what that means though. I hope you do ;)
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34683
^ permalink raw reply [flat|nested] 51+ messages in thread
* [Bug tree-optimization/34683] compile-time problem with -fstrict-aliasing
2008-01-05 23:51 [Bug tree-optimization/34683] New: compile-time problem with -fstrict-aliasing jaydub66 at gmail dot com
` (2 preceding siblings ...)
2008-01-06 6:47 ` jaydub66 at gmail dot com
@ 2008-01-06 8:08 ` jaydub66 at gmail dot com
2008-01-06 9:09 ` jvdelisle at gcc dot gnu dot org
` (45 subsequent siblings)
49 siblings, 0 replies; 51+ messages in thread
From: jaydub66 at gmail dot com @ 2008-01-06 8:08 UTC (permalink / raw)
To: gcc-bugs
------- Comment #4 from jaydub66 at gmail dot com 2008-01-05 23:51 -------
But 800MB of memory is definitely *a lot* for such a small file!
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34683
^ permalink raw reply [flat|nested] 51+ messages in thread
* [Bug tree-optimization/34683] compile-time problem with -fstrict-aliasing
2008-01-05 23:51 [Bug tree-optimization/34683] New: compile-time problem with -fstrict-aliasing jaydub66 at gmail dot com
` (3 preceding siblings ...)
2008-01-06 8:08 ` jaydub66 at gmail dot com
@ 2008-01-06 9:09 ` jvdelisle at gcc dot gnu dot org
2008-01-06 10:08 ` rguenth at gcc dot gnu dot org
` (44 subsequent siblings)
49 siblings, 0 replies; 51+ messages in thread
From: jvdelisle at gcc dot gnu dot org @ 2008-01-06 9:09 UTC (permalink / raw)
To: gcc-bugs
------- Comment #5 from jvdelisle at gcc dot gnu dot org 2008-01-06 00:36 -------
Confirmed: Some comparative timings with Intel and Sun compilers which also use
very little memory.
$ time gfc -c -O2 janus_slow.f90
real 2m53.448s
user 2m51.425s
sys 0m1.878s
$ time ifort -c -O2 janus_slow.f90
real 0m10.489s
user 0m10.264s
sys 0m0.185s
$ time sunf95 -c -O2 janus_slow.f90
real 0m24.029s
user 0m23.339s
sys 0m0.685s
--
jvdelisle at gcc dot gnu dot org changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|UNCONFIRMED |NEW
Ever Confirmed|0 |1
Last reconfirmed|0000-00-00 00:00:00 |2008-01-06 00:36:29
date| |
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34683
^ permalink raw reply [flat|nested] 51+ messages in thread
* [Bug tree-optimization/34683] compile-time problem with -fstrict-aliasing
2008-01-05 23:51 [Bug tree-optimization/34683] New: compile-time problem with -fstrict-aliasing jaydub66 at gmail dot com
` (4 preceding siblings ...)
2008-01-06 9:09 ` jvdelisle at gcc dot gnu dot org
@ 2008-01-06 10:08 ` rguenth at gcc dot gnu dot org
2008-01-06 10:35 ` [Bug fortran/34683] Fortran FE generated IL pessimizes middle-end IL and analysis rguenth at gcc dot gnu dot org
` (43 subsequent siblings)
49 siblings, 0 replies; 51+ messages in thread
From: rguenth at gcc dot gnu dot org @ 2008-01-06 10:08 UTC (permalink / raw)
To: gcc-bugs
------- Comment #6 from rguenth at gcc dot gnu dot org 2008-01-06 09:56 -------
The problem is we have loads of unpartitionable SFTs. There are workarounds
possible that only might affect performance of generated code, for example
you can use --param salias-max-array-elements=0, which will reduce the number
of unpartitionable SFTs somewhat. Or as a more big hammer, disable generating
SFTs completely by --param max-fields-for-field-sensitive=0.
--
rguenth at gcc dot gnu dot org changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |rguenth at gcc dot gnu dot
| |org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34683
^ permalink raw reply [flat|nested] 51+ messages in thread
* [Bug fortran/34683] Fortran FE generated IL pessimizes middle-end IL and analysis
2008-01-05 23:51 [Bug tree-optimization/34683] New: compile-time problem with -fstrict-aliasing jaydub66 at gmail dot com
` (5 preceding siblings ...)
2008-01-06 10:08 ` rguenth at gcc dot gnu dot org
@ 2008-01-06 10:35 ` rguenth at gcc dot gnu dot org
2008-01-06 11:10 ` burnus at gcc dot gnu dot org
` (42 subsequent siblings)
49 siblings, 0 replies; 51+ messages in thread
From: rguenth at gcc dot gnu dot org @ 2008-01-06 10:35 UTC (permalink / raw)
To: gcc-bugs
------- Comment #7 from rguenth at gcc dot gnu dot org 2008-01-06 10:07 -------
Note that this shouldn't really be affected by strict aliasing or not. Note
that the FE generated IL causes the first alias pass to emit too conservative
alias info as well:
# VUSE <MPT.5965_208230>
D.6049_4652 = atmp.1110.data;
D.6050_4653 = (complex(kind=4)[0:] *) D.6049_4652;
D.6051_4654 = S.1113_289 + D.3340_4634;
# VUSE <SFT.......>
D.6052_4655 = (*D.6050_4653)[D.6051_4654];
I suppose the atmp.1110.data type is something like (void *), so the
cast is required. But this really pessimizes the middle-end IL and
it looks like the FE knows it will be complex(kind=4)[4] at the point
of creation. Note fixing this will also improve optimization and thus
runtime performance.
I also see the FE creates lots of array temporaries:
struct array2_complex(kind=4) atmp.1093;
complex(kind=4) A.1094[4];
struct array2_complex(kind=4) atmp.1095;
complex(kind=4) A.1096[4];
struct array2_complex(kind=4) atmp.1100;
complex(kind=4) A.1101[4];
struct array2_complex(kind=4) atmp.1102;
complex(kind=4) A.1103[4];
struct array2_complex(kind=4) atmp.1106;
complex(kind=4) A.1107[4];
real(kind=4) D.3326;
...
instead of re-using a single one. This also causes internal representation
to blow up.
So, to sum up, the situation could be significantly improved by improving
the FE. (Like I noticed for I/O statements, where I provided a patch -
that was not accepted - to share temporary variables created for the I/O
context)
As we already have bugs for the middle-end partitioning problems, I make
this one a Fortran FE bug instead.
--
rguenth at gcc dot gnu dot org changed:
What |Removed |Added
----------------------------------------------------------------------------
BugsThisDependsOn| |33974
Component|tree-optimization |fortran
Summary|compile-time problem with - |Fortran FE generated IL
|fstrict-aliasing |pessimizes middle-end IL and
| |analysis
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34683
^ permalink raw reply [flat|nested] 51+ messages in thread
* [Bug fortran/34683] Fortran FE generated IL pessimizes middle-end IL and analysis
2008-01-05 23:51 [Bug tree-optimization/34683] New: compile-time problem with -fstrict-aliasing jaydub66 at gmail dot com
` (6 preceding siblings ...)
2008-01-06 10:35 ` [Bug fortran/34683] Fortran FE generated IL pessimizes middle-end IL and analysis rguenth at gcc dot gnu dot org
@ 2008-01-06 11:10 ` burnus at gcc dot gnu dot org
2008-01-06 11:14 ` rguenth at gcc dot gnu dot org
` (41 subsequent siblings)
49 siblings, 0 replies; 51+ messages in thread
From: burnus at gcc dot gnu dot org @ 2008-01-06 11:10 UTC (permalink / raw)
To: gcc-bugs
------- Comment #8 from burnus at gcc dot gnu dot org 2008-01-06 10:24 -------
> So, to sum up, the situation could be significantly improved by improving
> the FE. (Like I noticed for I/O statements, where I provided a patch -
> that was not accepted - to share temporary variables created for the I/O
> context)
Could you add a link to that patch? I quickly searched my mail archive, but
could not find it.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34683
^ permalink raw reply [flat|nested] 51+ messages in thread
* [Bug fortran/34683] Fortran FE generated IL pessimizes middle-end IL and analysis
2008-01-05 23:51 [Bug tree-optimization/34683] New: compile-time problem with -fstrict-aliasing jaydub66 at gmail dot com
` (7 preceding siblings ...)
2008-01-06 11:10 ` burnus at gcc dot gnu dot org
@ 2008-01-06 11:14 ` rguenth at gcc dot gnu dot org
2008-01-06 11:46 ` rguenth at gcc dot gnu dot org
` (40 subsequent siblings)
49 siblings, 0 replies; 51+ messages in thread
From: rguenth at gcc dot gnu dot org @ 2008-01-06 11:14 UTC (permalink / raw)
To: gcc-bugs
------- Comment #9 from rguenth at gcc dot gnu dot org 2008-01-06 10:34 -------
Basically, not re-using array descriptor variables is as bad as not re-using
I/O descriptor variables. I also wonder what this _gfortran_internal_{un,}pack
is about?
For
taux=gn(p2+p4)
gfortran generates
real(kind=4) A.1[4];
struct array1_real(kind=4) atmp.0;
atmp.0.dtype = 281;
atmp.0.dim[0].stride = 1;
atmp.0.dim[0].lbound = 0;
atmp.0.dim[0].ubound = 3;
atmp.0.data = (void *) &A.1;
atmp.0.offset = 0;
{
integer(kind=4) S.2;
S.2 = 0;
while (1)
{
if (S.2 > 3) goto L.1;
(*(real(kind=4)[0:] *) atmp.0.data)[S.2] = (*p2)[S.2] + (*p4)[S.2];
S.2 = S.2 + 1;
}
L.1:;
}
D.531 = _gfortran_internal_pack (&atmp.0);
D.533 = gn (D.531);
if ((real(kind=4)[0:] *) D.531 != (real(kind=4)[0:] *) atmp.0.data)
{
_gfortran_internal_unpack (&atmp.0, D.531);
{
void * D.532;
D.532 = D.531;
if (D.532 != 0B)
{
__builtin_free (D.532);
}
}
}
why does it not know the result of
if ((real(kind=4)[0:] *) D.531 != (real(kind=4)[0:] *) atmp.0.data)
at the point it creates the IL? I suppose this is about arrays with
stride != 1?
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34683
^ permalink raw reply [flat|nested] 51+ messages in thread
* [Bug fortran/34683] Fortran FE generated IL pessimizes middle-end IL and analysis
2008-01-05 23:51 [Bug tree-optimization/34683] New: compile-time problem with -fstrict-aliasing jaydub66 at gmail dot com
` (8 preceding siblings ...)
2008-01-06 11:14 ` rguenth at gcc dot gnu dot org
@ 2008-01-06 11:46 ` rguenth at gcc dot gnu dot org
2008-01-06 12:02 ` rguenth at gcc dot gnu dot org
` (39 subsequent siblings)
49 siblings, 0 replies; 51+ messages in thread
From: rguenth at gcc dot gnu dot org @ 2008-01-06 11:46 UTC (permalink / raw)
To: gcc-bugs
------- Comment #10 from rguenth at gcc dot gnu dot org 2008-01-06 10:38 -------
This one:
http://gcc.gnu.org/ml/gcc-patches/2006-02/msg01762.html
basically it mentiones async I/O as a reason to not do it, but instead the
callee can copy the I/O structure in that case instead (it probably needs
to anyway, as currently those structures live on the stack, so with async
I/O you'd need otherwise to guarantee that the I/O finished before the
current frame is left).
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34683
^ permalink raw reply [flat|nested] 51+ messages in thread
* [Bug fortran/34683] Fortran FE generated IL pessimizes middle-end IL and analysis
2008-01-05 23:51 [Bug tree-optimization/34683] New: compile-time problem with -fstrict-aliasing jaydub66 at gmail dot com
` (9 preceding siblings ...)
2008-01-06 11:46 ` rguenth at gcc dot gnu dot org
@ 2008-01-06 12:02 ` rguenth at gcc dot gnu dot org
2008-01-06 12:02 ` rguenth at gcc dot gnu dot org
` (38 subsequent siblings)
49 siblings, 0 replies; 51+ messages in thread
From: rguenth at gcc dot gnu dot org @ 2008-01-06 12:02 UTC (permalink / raw)
To: gcc-bugs
------- Comment #12 from rguenth at gcc dot gnu dot org 2008-01-06 11:09 -------
Actually it isn't that bad for I/O as long as there are no pointers to the
dt_parm structs in the IL (function arguments are ok).
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34683
^ permalink raw reply [flat|nested] 51+ messages in thread
* [Bug fortran/34683] Fortran FE generated IL pessimizes middle-end IL and analysis
2008-01-05 23:51 [Bug tree-optimization/34683] New: compile-time problem with -fstrict-aliasing jaydub66 at gmail dot com
` (10 preceding siblings ...)
2008-01-06 12:02 ` rguenth at gcc dot gnu dot org
@ 2008-01-06 12:02 ` rguenth at gcc dot gnu dot org
2008-01-06 12:08 ` rguenth at gcc dot gnu dot org
` (37 subsequent siblings)
49 siblings, 0 replies; 51+ messages in thread
From: rguenth at gcc dot gnu dot org @ 2008-01-06 12:02 UTC (permalink / raw)
To: gcc-bugs
------- Comment #11 from rguenth at gcc dot gnu dot org 2008-01-06 10:50 -------
So, currently you can build up arbitrary large chains of virtual clobbers with
write (*,*) 'hello'
write (*,*) 'hello'
...
as each of those writes create two un-partitionable SFTs. For arrays
temporaries
this is the same (no idea what is the most simple statement to create one).
Without the fix for PR34093 this would even reliably ICE the compiler (and it
of course creates compile-time and memory-usage problems, because the growth
of the IL is non-linear).
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34683
^ permalink raw reply [flat|nested] 51+ messages in thread
* [Bug fortran/34683] Fortran FE generated IL pessimizes middle-end IL and analysis
2008-01-05 23:51 [Bug tree-optimization/34683] New: compile-time problem with -fstrict-aliasing jaydub66 at gmail dot com
` (11 preceding siblings ...)
2008-01-06 12:02 ` rguenth at gcc dot gnu dot org
@ 2008-01-06 12:08 ` rguenth at gcc dot gnu dot org
2008-01-06 12:19 ` rguenth at gcc dot gnu dot org
` (36 subsequent siblings)
49 siblings, 0 replies; 51+ messages in thread
From: rguenth at gcc dot gnu dot org @ 2008-01-06 12:08 UTC (permalink / raw)
To: gcc-bugs
------- Comment #13 from rguenth at gcc dot gnu dot org 2008-01-06 11:13 -------
For array temporaries the pinning of SFTs happens because we have the address
of
the actual array _data_ in the IL:
# SFT.10_52 = VDEF <SFT.10_51(D)>
atmp.0.data = &A.1;
the array descriptor itself is not the problem (the redundant descriptors still
consume memory, but should not cause compile-time problems where observed).
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34683
^ permalink raw reply [flat|nested] 51+ messages in thread
* [Bug fortran/34683] Fortran FE generated IL pessimizes middle-end IL and analysis
2008-01-05 23:51 [Bug tree-optimization/34683] New: compile-time problem with -fstrict-aliasing jaydub66 at gmail dot com
` (12 preceding siblings ...)
2008-01-06 12:08 ` rguenth at gcc dot gnu dot org
@ 2008-01-06 12:19 ` rguenth at gcc dot gnu dot org
2008-01-06 12:30 ` rguenth at gcc dot gnu dot org
` (35 subsequent siblings)
49 siblings, 0 replies; 51+ messages in thread
From: rguenth at gcc dot gnu dot org @ 2008-01-06 12:19 UTC (permalink / raw)
To: gcc-bugs
------- Comment #14 from rguenth at gcc dot gnu dot org 2008-01-06 11:46 -------
As of optimization, the conversion sequence
atmp.0.data = &A.1;
D.540_5 = atmp.0.data;
D.541_6 = (real(kind=4)[0:] *) D.540_5;
...
(*D.541_6)[S.5_1] = D.545_14;
can be optimized to
A.1[S.5_1] = D.545_14;
with the following patch that makes sure we use (real(kind=4)[4] *) instead
of the unknown size array type.
Index: trans-types.c
===================================================================
--- trans-types.c (revision 131336)
+++ trans-types.c (working copy)
@@ -1511,10 +1511,12 @@ gfc_get_array_type_bounds (tree etype, i
/* TODO: known offsets for descriptors. */
GFC_TYPE_ARRAY_OFFSET (fat_type) = NULL_TREE;
- /* We define data as an unknown size array. Much better than doing
+ /* We define data as an array with the correct size. Much better than doing
pointer arithmetic. */
arraytype =
- build_array_type (etype, gfc_array_range_type);
+ build_array_type (etype, build_range_type (gfc_array_index_type,
+ gfc_index_zero_node, int_const_binop (MINUS_EXPR, stride,
+ integer_one_node, 0)));
arraytype = build_pointer_type (arraytype);
GFC_TYPE_ARRAY_DATAPTR_TYPE (fat_type) = arraytype;
(the patch needs to be adjusted for the cases stride is not the actual array
size, but you should get the idea)
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34683
^ permalink raw reply [flat|nested] 51+ messages in thread
* [Bug fortran/34683] Fortran FE generated IL pessimizes middle-end IL and analysis
2008-01-05 23:51 [Bug tree-optimization/34683] New: compile-time problem with -fstrict-aliasing jaydub66 at gmail dot com
` (13 preceding siblings ...)
2008-01-06 12:19 ` rguenth at gcc dot gnu dot org
@ 2008-01-06 12:30 ` rguenth at gcc dot gnu dot org
2008-01-06 22:15 ` jaydub66 at gmail dot com
` (34 subsequent siblings)
49 siblings, 0 replies; 51+ messages in thread
From: rguenth at gcc dot gnu dot org @ 2008-01-06 12:30 UTC (permalink / raw)
To: gcc-bugs
------- Comment #15 from rguenth at gcc dot gnu dot org 2008-01-06 12:07 -------
Note that the large FRE compile-time is a dup of PR34450, try experimenting
with --param sccvn-max-scc-size=N and find a N where the problem vanishes.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34683
^ permalink raw reply [flat|nested] 51+ messages in thread
* [Bug fortran/34683] Fortran FE generated IL pessimizes middle-end IL and analysis
2008-01-05 23:51 [Bug tree-optimization/34683] New: compile-time problem with -fstrict-aliasing jaydub66 at gmail dot com
` (14 preceding siblings ...)
2008-01-06 12:30 ` rguenth at gcc dot gnu dot org
@ 2008-01-06 22:15 ` jaydub66 at gmail dot com
2008-01-06 22:24 ` rguenther at suse dot de
` (33 subsequent siblings)
49 siblings, 0 replies; 51+ messages in thread
From: jaydub66 at gmail dot com @ 2008-01-06 22:15 UTC (permalink / raw)
To: gcc-bugs
------- Comment #16 from jaydub66 at gmail dot com 2008-01-06 22:08 -------
I've done some experimenting with older GCC versions I have floating around on
my machines:
Compiling the test case with both 4.1.2 and 4.2.1 gives an ICE, so I guess we
can't exactly call this a regression.
But then compiling with older 4.3 trunk builds works much better:
I have a version from August 24 of last year, which runs the test case at full
-O3 in a perfect 8sec, with 32MB memory usage (that is with
--enable-checking=release). And I have two other builds from October 8 and
November 9, which both run the test case in about 38s with 85MB mem-usage (they
are built with --enable-checking=debug, so this is probably just debugging
overhead). At least none of them takes the crazy 800MB of recent builds.
To sum up: It seems like trunk versions up to at least November 9 seem to work
fine on this test case.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34683
^ permalink raw reply [flat|nested] 51+ messages in thread
* [Bug middle-end/34683] [4.3 Regression] Fortran FE generated IL pessimizes middle-end IL and analysis
2008-01-05 23:51 [Bug tree-optimization/34683] New: compile-time problem with -fstrict-aliasing jaydub66 at gmail dot com
` (16 preceding siblings ...)
2008-01-06 22:24 ` rguenther at suse dot de
@ 2008-01-06 22:24 ` pinskia at gcc dot gnu dot org
2008-01-07 14:05 ` [Bug tree-optimization/34683] " rguenth at gcc dot gnu dot org
` (31 subsequent siblings)
49 siblings, 0 replies; 51+ messages in thread
From: pinskia at gcc dot gnu dot org @ 2008-01-06 22:24 UTC (permalink / raw)
To: gcc-bugs
--
pinskia at gcc dot gnu dot org changed:
What |Removed |Added
----------------------------------------------------------------------------
BugsThisDependsOn| |33870
Component|fortran |middle-end
Summary|Fortran FE generated IL |[4.3 Regression] Fortran FE
|pessimizes middle-end IL and|generated IL pessimizes
|analysis |middle-end IL and analysis
Target Milestone|--- |4.3.0
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34683
^ permalink raw reply [flat|nested] 51+ messages in thread
* [Bug fortran/34683] Fortran FE generated IL pessimizes middle-end IL and analysis
2008-01-05 23:51 [Bug tree-optimization/34683] New: compile-time problem with -fstrict-aliasing jaydub66 at gmail dot com
` (15 preceding siblings ...)
2008-01-06 22:15 ` jaydub66 at gmail dot com
@ 2008-01-06 22:24 ` rguenther at suse dot de
2008-01-06 22:24 ` [Bug middle-end/34683] [4.3 Regression] " pinskia at gcc dot gnu dot org
` (32 subsequent siblings)
49 siblings, 0 replies; 51+ messages in thread
From: rguenther at suse dot de @ 2008-01-06 22:24 UTC (permalink / raw)
To: gcc-bugs
------- Comment #17 from rguenther at suse dot de 2008-01-06 22:12 -------
Subject: Re: Fortran FE generated IL pessimizes middle-end
IL and analysis
On Sun, 6 Jan 2008, jaydub66 at gmail dot com wrote:
> ------- Comment #16 from jaydub66 at gmail dot com 2008-01-06 22:08 -------
> I've done some experimenting with older GCC versions I have floating around on
> my machines:
>
> Compiling the test case with both 4.1.2 and 4.2.1 gives an ICE, so I guess we
> can't exactly call this a regression.
>
> But then compiling with older 4.3 trunk builds works much better:
> I have a version from August 24 of last year, which runs the test case at full
> -O3 in a perfect 8sec, with 32MB memory usage (that is with
> --enable-checking=release). And I have two other builds from October 8 and
> November 9, which both run the test case in about 38s with 85MB mem-usage (they
> are built with --enable-checking=debug, so this is probably just debugging
> overhead). At least none of them takes the crazy 800MB of recent builds.
>
> To sum up: It seems like trunk versions up to at least November 9 seem to work
> fine on this test case.
Well, we know what introduced this regression on the trunk. It was
the fix for PR33870.
Richard.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34683
^ permalink raw reply [flat|nested] 51+ messages in thread
* [Bug tree-optimization/34683] [4.3 Regression] Fortran FE generated IL pessimizes middle-end IL and analysis
2008-01-05 23:51 [Bug tree-optimization/34683] New: compile-time problem with -fstrict-aliasing jaydub66 at gmail dot com
` (17 preceding siblings ...)
2008-01-06 22:24 ` [Bug middle-end/34683] [4.3 Regression] " pinskia at gcc dot gnu dot org
@ 2008-01-07 14:05 ` rguenth at gcc dot gnu dot org
2008-01-07 15:00 ` rguenth at gcc dot gnu dot org
` (30 subsequent siblings)
49 siblings, 0 replies; 51+ messages in thread
From: rguenth at gcc dot gnu dot org @ 2008-01-07 14:05 UTC (permalink / raw)
To: gcc-bugs
------- Comment #18 from rguenth at gcc dot gnu dot org 2008-01-07 13:25 -------
Most of the operand scanner time is caused by cfg_cleanup:
tree CFG cleanup : 93.71 (79%) usr 1.16 (54%) sys 95.03 (78%) wall
474427 kB (28%) ggc
globbing CFG cleanup (and operand scanner) time to the passes that trigger it
reveals:
tree PHI const/copy prop: 27.30 (16%) usr 0.27 ( 9%) sys 31.33 (16%) wall
9 kB ( 0%) ggc
tree FRE : 25.01 (15%) usr 0.75 (25%) sys 27.63 (14%) wall
1061857 kB (62%) ggc
complete unrolling : 100.00 (59%) usr 1.46 (49%) sys 111.81 (58%) wall
482221 kB (28%) ggc
wheee ;)
looking where we spend most of the time it seems to be building tree lists
in ssa_redirect_edge. I have a hack to make this a single allocation of a
VEC on the heap, still in cfg_cleanup remove_forwarder_block_with_phi there
is a linear search(!) in the list of pending PHI args.
Profile:
Each sample counts as 0.01 seconds.
% cumulative self self total
time seconds seconds calls s/call s/call name
48.75 52.85 52.85 2734252 0.00 0.00 ggc_alloc_stat
11.07 64.85 12.00 35486689 0.00 0.00 add_vars_for_offset
4.22 69.42 4.57 327924414 0.00 0.00 var_ann
3.89 73.64 4.22 138084 0.00 0.00
finalize_ssa_stmt_operands
3.22 77.13 3.49 44525266 0.00 0.00 add_virtual_operand
...
the add_vars_for_offset is PR33870 related.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34683
^ permalink raw reply [flat|nested] 51+ messages in thread
* [Bug tree-optimization/34683] [4.3 Regression] Fortran FE generated IL pessimizes middle-end IL and analysis
2008-01-05 23:51 [Bug tree-optimization/34683] New: compile-time problem with -fstrict-aliasing jaydub66 at gmail dot com
` (18 preceding siblings ...)
2008-01-07 14:05 ` [Bug tree-optimization/34683] " rguenth at gcc dot gnu dot org
@ 2008-01-07 15:00 ` rguenth at gcc dot gnu dot org
2008-01-07 15:20 ` rguenth at gcc dot gnu dot org
` (29 subsequent siblings)
49 siblings, 0 replies; 51+ messages in thread
From: rguenth at gcc dot gnu dot org @ 2008-01-07 15:00 UTC (permalink / raw)
To: gcc-bugs
------- Comment #19 from rguenth at gcc dot gnu dot org 2008-01-07 14:32 -------
-fno-tree-loop-optimize reduces compile-time of this testcase from 117s to 35s
with -O -fstrict-aliasing. try_unroll_loop_completely calls update_ssa, for
every loop unrolled, added by r98599, which seems to be necessary - mainly
because
of the interesting PHI redirect re-use of PENDING_STMT.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34683
^ permalink raw reply [flat|nested] 51+ messages in thread
* [Bug tree-optimization/34683] [4.3 Regression] Fortran FE generated IL pessimizes middle-end IL and analysis
2008-01-05 23:51 [Bug tree-optimization/34683] New: compile-time problem with -fstrict-aliasing jaydub66 at gmail dot com
` (19 preceding siblings ...)
2008-01-07 15:00 ` rguenth at gcc dot gnu dot org
@ 2008-01-07 15:20 ` rguenth at gcc dot gnu dot org
2008-01-07 16:18 ` rguenth at gcc dot gnu dot org
` (28 subsequent siblings)
49 siblings, 0 replies; 51+ messages in thread
From: rguenth at gcc dot gnu dot org @ 2008-01-07 15:20 UTC (permalink / raw)
To: gcc-bugs
------- Comment #20 from rguenth at gcc dot gnu dot org 2008-01-07 14:50 -------
Subject: Bug 34683
Author: rguenth
Date: Mon Jan 7 14:49:36 2008
New Revision: 131375
URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=131375
Log:
2008-01-07 Richard Guenther <rguenther@suse.de>
PR tree-optimization/34683
* tree-ssa-sccvn.c (vuses_to_vec): Pre-allocate the vector of
VOPs of the needed size to save memory. Use VEC_quick_push
to save compile-time.
(vdefs_to_vec): Likewise.
Modified:
trunk/gcc/ChangeLog
trunk/gcc/tree-ssa-sccvn.c
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34683
^ permalink raw reply [flat|nested] 51+ messages in thread
* [Bug tree-optimization/34683] [4.3 Regression] Fortran FE generated IL pessimizes middle-end IL and analysis
2008-01-05 23:51 [Bug tree-optimization/34683] New: compile-time problem with -fstrict-aliasing jaydub66 at gmail dot com
` (20 preceding siblings ...)
2008-01-07 15:20 ` rguenth at gcc dot gnu dot org
@ 2008-01-07 16:18 ` rguenth at gcc dot gnu dot org
2008-01-07 16:41 ` rguenth at gcc dot gnu dot org
` (27 subsequent siblings)
49 siblings, 0 replies; 51+ messages in thread
From: rguenth at gcc dot gnu dot org @ 2008-01-07 16:18 UTC (permalink / raw)
To: gcc-bugs
------- Comment #21 from rguenth at gcc dot gnu dot org 2008-01-07 15:08 -------
Without the loop optimizer the profile looks like:
Each sample counts as 0.01 seconds.
% cumulative self self total
time seconds seconds calls s/call s/call name
9.77 2.96 2.96 14743868 0.00 0.00 add_vars_for_offset
8.00 5.38 2.42 vuses_compare
7.06 7.51 2.14 44262948 0.00 0.00 add_virtual_operand
6.65 9.52 2.01 235204852 0.00 0.00 var_ann
4.83 10.98 1.46 68814811 0.00 0.00 iterative_hash_expr
4.66 12.39 1.41 172 0.01 0.04 DFS
4.63 13.79 1.40 69248754 0.00 0.00 htab_find_with_hash
4.28 15.09 1.30 1229783 0.00 0.00 ggc_alloc_stat
4.13 16.34 1.25 206399974 0.00 0.00 VN_INFO
4.00 17.55 1.21 19029637 0.00 0.00 set_bb_for_stmt
3.74 18.68 1.13 12610 0.00 0.00 get_call_expr_operands
2.88 19.55 0.87 266239 0.00 0.00 valueize_vuses
2.31 20.25 0.70 68677290 0.00 0.00 uid_decl_map_eq
2.25 20.93 0.68 129473 0.00 0.00 visit_reference_op_store
1.90 21.50 0.58 114366877 0.00 0.00 is_gimple_reg
1.49 21.95 0.45 147496 0.00 0.00 vec_gc_o_reserve_1
1.41 22.38 0.43 68653081 0.00 0.00 referenced_var_lookup
and most of the time (cfgcleanup and operand scanner globbed into passes) is
spent in
tree PHI const/copy prop: 27.82 (42%) usr 0.05 ( 5%) sys 27.95 (41%) wall
8 kB ( 0%) ggc
tree FRE : 24.94 (37%) usr 0.78 (73%) sys 25.73 (38%) wall
508759 kB (80%) ggc
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34683
^ permalink raw reply [flat|nested] 51+ messages in thread
* [Bug tree-optimization/34683] [4.3 Regression] Fortran FE generated IL pessimizes middle-end IL and analysis
2008-01-05 23:51 [Bug tree-optimization/34683] New: compile-time problem with -fstrict-aliasing jaydub66 at gmail dot com
` (21 preceding siblings ...)
2008-01-07 16:18 ` rguenth at gcc dot gnu dot org
@ 2008-01-07 16:41 ` rguenth at gcc dot gnu dot org
2008-01-07 17:28 ` rguenth at gcc dot gnu dot org
` (26 subsequent siblings)
49 siblings, 0 replies; 51+ messages in thread
From: rguenth at gcc dot gnu dot org @ 2008-01-07 16:41 UTC (permalink / raw)
To: gcc-bugs
------- Comment #22 from rguenth at gcc dot gnu dot org 2008-01-07 15:56 -------
A histogram over the number of VUSEs/VDEFs that are being sorted by sort_vuses
reveals:
180 2
281964 256
1498 4
that is, we call qsort 281964 times with 256 virtual operands in the VEC. Now,
curiously(?), virtual operands are sorted already, but after DECL_UID, not
SSA_NAME_VERSION (see operand_build_sort_virtual).
Adjusting the tree-vn sorting order to that of the operand scanner lets us
improve these numbers to
27 2
28665 256
128 4
a reduction of a factor of 10. Profile after that:
Each sample counts as 0.01 seconds.
% cumulative self self total
time seconds seconds calls s/call s/call name
10.37 2.91 2.91 14769893 0.00 0.00 add_vars_for_offset
7.91 5.13 2.22 44264182 0.00 0.00 add_virtual_operand
7.16 7.14 2.01 235231611 0.00 0.00 var_ann
5.24 8.61 1.47 206399974 0.00 0.00 VN_INFO
5.20 10.07 1.46 172 0.01 0.04 DFS
5.17 11.52 1.45 68814811 0.00 0.00 iterative_hash_expr
4.92 12.90 1.38 12610 0.00 0.00 get_call_expr_operands
4.74 14.23 1.33 1439173 0.00 0.00 ggc_alloc_stat
4.63 15.53 1.30 69275355 0.00 0.00 htab_find_with_hash
3.88 16.62 1.09 19029637 0.00 0.00 set_bb_for_stmt
3.56 17.62 1.00 266239 0.00 0.00 valueize_vuses
2.21 18.24 0.62 129473 0.00 0.00 visit_reference_op_store
1.96 18.79 0.55 68703417 0.00 0.00 uid_decl_map_eq
1.92 19.33 0.54 operand_build_cmp
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34683
^ permalink raw reply [flat|nested] 51+ messages in thread
* [Bug tree-optimization/34683] [4.3 Regression] Fortran FE generated IL pessimizes middle-end IL and analysis
2008-01-05 23:51 [Bug tree-optimization/34683] New: compile-time problem with -fstrict-aliasing jaydub66 at gmail dot com
` (22 preceding siblings ...)
2008-01-07 16:41 ` rguenth at gcc dot gnu dot org
@ 2008-01-07 17:28 ` rguenth at gcc dot gnu dot org
2008-01-08 10:18 ` rguenth at gcc dot gnu dot org
` (25 subsequent siblings)
49 siblings, 0 replies; 51+ messages in thread
From: rguenth at gcc dot gnu dot org @ 2008-01-07 17:28 UTC (permalink / raw)
To: gcc-bugs
------- Comment #23 from rguenth at gcc dot gnu dot org 2008-01-07 17:08 -------
Zdenek, can you have a look if there is sth low-hanging with the tree level
loop unroller?
--
rguenth at gcc dot gnu dot org changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |rakdver at gcc dot gnu dot
| |org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34683
^ permalink raw reply [flat|nested] 51+ messages in thread
* [Bug tree-optimization/34683] [4.3 Regression] Fortran FE generated IL pessimizes middle-end IL and analysis
2008-01-05 23:51 [Bug tree-optimization/34683] New: compile-time problem with -fstrict-aliasing jaydub66 at gmail dot com
` (23 preceding siblings ...)
2008-01-07 17:28 ` rguenth at gcc dot gnu dot org
@ 2008-01-08 10:18 ` rguenth at gcc dot gnu dot org
2008-01-08 10:27 ` rguenth at gcc dot gnu dot org
` (24 subsequent siblings)
49 siblings, 0 replies; 51+ messages in thread
From: rguenth at gcc dot gnu dot org @ 2008-01-08 10:18 UTC (permalink / raw)
To: gcc-bugs
------- Comment #24 from rguenth at gcc dot gnu dot org 2008-01-08 09:46 -------
The patch in comment #14 reduces compile time from 120s to 80s. I'll try to
get it into shape.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34683
^ permalink raw reply [flat|nested] 51+ messages in thread
* [Bug tree-optimization/34683] [4.3 Regression] Fortran FE generated IL pessimizes middle-end IL and analysis
2008-01-05 23:51 [Bug tree-optimization/34683] New: compile-time problem with -fstrict-aliasing jaydub66 at gmail dot com
` (24 preceding siblings ...)
2008-01-08 10:18 ` rguenth at gcc dot gnu dot org
@ 2008-01-08 10:27 ` rguenth at gcc dot gnu dot org
2008-01-08 12:18 ` rguenth at gcc dot gnu dot org
` (23 subsequent siblings)
49 siblings, 0 replies; 51+ messages in thread
From: rguenth at gcc dot gnu dot org @ 2008-01-08 10:27 UTC (permalink / raw)
To: gcc-bugs
------- Comment #25 from rguenth at gcc dot gnu dot org 2008-01-08 09:55 -------
The unroll problem is that if you have multiple loops like
for (i=0; i<3; ++i)
a[i] = i;
...
for (i=0; i<3; ++i)
a[i] = i;
then even though we are in loop closed SSA form, we rename all variables
in the loop BBs after duplicating them for *every loop* we unroll. Which
causes the complexity of renaming to be quadratic (you can easily see
that for this testcase the number of SSA_NAMEs allocated by the renamer
is quadratic with the number of loops in the testcase). Now what makes
it so bad in this testcase is that we mostly deal with virtual operands
during renaming.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34683
^ permalink raw reply [flat|nested] 51+ messages in thread
* [Bug tree-optimization/34683] [4.3 Regression] Fortran FE generated IL pessimizes middle-end IL and analysis
2008-01-05 23:51 [Bug tree-optimization/34683] New: compile-time problem with -fstrict-aliasing jaydub66 at gmail dot com
` (25 preceding siblings ...)
2008-01-08 10:27 ` rguenth at gcc dot gnu dot org
@ 2008-01-08 12:18 ` rguenth at gcc dot gnu dot org
2008-01-08 12:22 ` rguenth at gcc dot gnu dot org
` (22 subsequent siblings)
49 siblings, 0 replies; 51+ messages in thread
From: rguenth at gcc dot gnu dot org @ 2008-01-08 12:18 UTC (permalink / raw)
To: gcc-bugs
------- Comment #26 from rguenth at gcc dot gnu dot org 2008-01-08 11:06 -------
We do a _lot_ of calls to var_ann during the compilation:
Each sample counts as 0.01 seconds.
% cumulative self self total
time seconds seconds calls s/call s/call name
39.79 30.12 30.12 1311048 0.00 0.00 ggc_alloc_stat
12.62 39.67 9.55 34186814 0.00 0.00 add_vars_for_offset
5.77 44.04 4.37 331206978 0.00 0.00 var_ann
so var_ann is not inlined despite of it being declared as such.
0.00 0.00 35522/331206978 add_to_addressable_set [318]
1.58 0.00 119702754/331206978 add_virtual_operand [12]
2.79 0.00 211468702/331206978 add_vars_for_offset [17]
[39] 5.8 4.37 0.00 331206978 var_ann [39]
0.00 0.00 119615/96446282 htab_find_with_hash <cycle
34> [46]
So, the add_vars_for_offset loop is resposible for most of the calls
(unsurprisingly), and most of the calls are not for global vars.
Making sure this call gets inlined brings another 5s of compile-time.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34683
^ permalink raw reply [flat|nested] 51+ messages in thread
* [Bug tree-optimization/34683] [4.3 Regression] Fortran FE generated IL pessimizes middle-end IL and analysis
2008-01-05 23:51 [Bug tree-optimization/34683] New: compile-time problem with -fstrict-aliasing jaydub66 at gmail dot com
` (26 preceding siblings ...)
2008-01-08 12:18 ` rguenth at gcc dot gnu dot org
@ 2008-01-08 12:22 ` rguenth at gcc dot gnu dot org
2008-01-08 12:31 ` steven at gcc dot gnu dot org
` (21 subsequent siblings)
49 siblings, 0 replies; 51+ messages in thread
From: rguenth at gcc dot gnu dot org @ 2008-01-08 12:22 UTC (permalink / raw)
To: gcc-bugs
------- Comment #27 from rguenth at gcc dot gnu dot org 2008-01-08 12:13 -------
Subject: Bug 34683
Author: rguenth
Date: Tue Jan 8 12:12:56 2008
New Revision: 131393
URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=131393
Log:
2008-01-08 Richard Guenther <rguenther@suse.de>
PR tree-optimization/34683
* tree-flow-inline.h (var_ann): Remove overzealous asserts.
Modified:
trunk/gcc/ChangeLog
trunk/gcc/tree-flow-inline.h
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34683
^ permalink raw reply [flat|nested] 51+ messages in thread
* [Bug tree-optimization/34683] [4.3 Regression] Fortran FE generated IL pessimizes middle-end IL and analysis
2008-01-05 23:51 [Bug tree-optimization/34683] New: compile-time problem with -fstrict-aliasing jaydub66 at gmail dot com
` (27 preceding siblings ...)
2008-01-08 12:22 ` rguenth at gcc dot gnu dot org
@ 2008-01-08 12:31 ` steven at gcc dot gnu dot org
2008-01-08 12:41 ` steven at gcc dot gnu dot org
` (20 subsequent siblings)
49 siblings, 0 replies; 51+ messages in thread
From: steven at gcc dot gnu dot org @ 2008-01-08 12:31 UTC (permalink / raw)
To: gcc-bugs
------- Comment #28 from steven at gcc dot gnu dot org 2008-01-08 12:17 -------
Re. comment #26
More proof that assert checking _is_ expensive. I wonder how much compile time
can be blamed overall on release checking (i.e. checking disabled vs. release
checking).
Have you ever compared e.g. build times of cc1-i files or a linux kernel with a
non-checking compiler vs. a release-checking compiler?
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34683
^ permalink raw reply [flat|nested] 51+ messages in thread
* [Bug tree-optimization/34683] [4.3 Regression] Fortran FE generated IL pessimizes middle-end IL and analysis
2008-01-05 23:51 [Bug tree-optimization/34683] New: compile-time problem with -fstrict-aliasing jaydub66 at gmail dot com
` (28 preceding siblings ...)
2008-01-08 12:31 ` steven at gcc dot gnu dot org
@ 2008-01-08 12:41 ` steven at gcc dot gnu dot org
2008-01-08 12:57 ` rguenth at gcc dot gnu dot org
` (19 subsequent siblings)
49 siblings, 0 replies; 51+ messages in thread
From: steven at gcc dot gnu dot org @ 2008-01-08 12:41 UTC (permalink / raw)
To: gcc-bugs
------- Comment #29 from steven at gcc dot gnu dot org 2008-01-08 12:18 -------
Re. comment #26
More proof that assert checking _is_ expensive. I wonder how much compile time
can be blamed overall on release checking (i.e. checking disabled vs. release
checking).
Have you ever compared e.g. build times of cc1-i files or a linux kernel with a
non-checking compiler vs. a release-checking compiler?
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34683
^ permalink raw reply [flat|nested] 51+ messages in thread
* [Bug tree-optimization/34683] [4.3 Regression] Fortran FE generated IL pessimizes middle-end IL and analysis
2008-01-05 23:51 [Bug tree-optimization/34683] New: compile-time problem with -fstrict-aliasing jaydub66 at gmail dot com
` (29 preceding siblings ...)
2008-01-08 12:41 ` steven at gcc dot gnu dot org
@ 2008-01-08 12:57 ` rguenth at gcc dot gnu dot org
2008-01-08 13:02 ` rguenth at gcc dot gnu dot org
` (18 subsequent siblings)
49 siblings, 0 replies; 51+ messages in thread
From: rguenth at gcc dot gnu dot org @ 2008-01-08 12:57 UTC (permalink / raw)
To: gcc-bugs
------- Comment #30 from rguenth at gcc dot gnu dot org 2008-01-08 12:30 -------
No, I didn't. But yes, overzealous asserts in the simple inline accessor
functions are probably counter-productive. For example things like
static inline bool
gimple_aliases_computed_p (const struct function *fun)
{
gcc_assert (fun && fun->gimple_df);
return fun->gimple_df->aliases_computed_p;
}
would segfault anyway if fun or fun->gimple_df would be NULL. Or
static inline function_ann_t
function_ann (const_tree t)
{
gcc_assert (t);
gcc_assert (TREE_CODE (t) == FUNCTION_DECL);
gcc_assert (!t->base.ann
|| t->base.ann->common.type == FUNCTION_ANN);
return (function_ann_t) t->base.ann;
}
multiple asserts generate multiple diagnostic calls. Now, both of these
functions are not in hot code-paths.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34683
^ permalink raw reply [flat|nested] 51+ messages in thread
* [Bug tree-optimization/34683] [4.3 Regression] Fortran FE generated IL pessimizes middle-end IL and analysis
2008-01-05 23:51 [Bug tree-optimization/34683] New: compile-time problem with -fstrict-aliasing jaydub66 at gmail dot com
` (30 preceding siblings ...)
2008-01-08 12:57 ` rguenth at gcc dot gnu dot org
@ 2008-01-08 13:02 ` rguenth at gcc dot gnu dot org
2008-01-08 13:56 ` steven at gcc dot gnu dot org
` (17 subsequent siblings)
49 siblings, 0 replies; 51+ messages in thread
From: rguenth at gcc dot gnu dot org @ 2008-01-08 13:02 UTC (permalink / raw)
To: gcc-bugs
------- Comment #31 from rguenth at gcc dot gnu dot org 2008-01-08 12:55 -------
Another hot assert is
static inline tree
memory_partition (tree sym)
{
tree tag;
/* MPTs belong to their own partition. */
if (TREE_CODE (sym) == MEMORY_PARTITION_TAG)
return sym;
gcc_assert (!is_gimple_reg (sym));
and is_gimple_reg isn't cheap either. For this testcase two thirds of
the calls are attributed to add_vars_for_offset (through add_v{use,def}):
0.50 0.01 135234172/199610365 add_vars_for_offset [27]
[114] 1.1 0.74 0.01 199610365 is_gimple_reg [114]
and it really isn't cheap:
CPU: AMD64 processors, speed 2000 MHz (estimated)
Counted CPU_CLK_UNHALTED events (Cycles outside of halt state) with a unit mask
of 0x00 (No unit mask) count 100000
samples % symbol name
545326 36.8318 ggc_alloc_stat
310838 20.9943 add_vars_for_offset
85655 5.7852 add_virtual_operand
69156 4.6709 finalize_ssa_stmt_operands
57057 3.8537 bitmap_set_bit
36755 2.4825 htab_find_with_hash
35915 2.4257 get_expr_operands
28207 1.9051 is_gimple_reg
23340 1.5764 referenced_var_lookup
wrapping the assert inside ENABLE_CHECKING buys us 2s or 2.5%. But this
is too much micro-optimizing for me ;)
I'd rather have the quadratic SSA rewriting in the unroller fixed.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34683
^ permalink raw reply [flat|nested] 51+ messages in thread
* [Bug tree-optimization/34683] [4.3 Regression] Fortran FE generated IL pessimizes middle-end IL and analysis
2008-01-05 23:51 [Bug tree-optimization/34683] New: compile-time problem with -fstrict-aliasing jaydub66 at gmail dot com
` (31 preceding siblings ...)
2008-01-08 13:02 ` rguenth at gcc dot gnu dot org
@ 2008-01-08 13:56 ` steven at gcc dot gnu dot org
2008-01-08 16:50 ` rguenth at gcc dot gnu dot org
` (16 subsequent siblings)
49 siblings, 0 replies; 51+ messages in thread
From: steven at gcc dot gnu dot org @ 2008-01-08 13:56 UTC (permalink / raw)
To: gcc-bugs
------- Comment #32 from steven at gcc dot gnu dot org 2008-01-08 13:08 -------
You are of course right that for this test case the bigger win is in the
algorithmic issues.
On the other hand, for compile time in general, the gcc_asserts are IMHO the
worst thing that happened to GCC since the introduction of the garbage
allocator ;-) It is way too easy to write "gcc_assert (blah)" instead of "if
(blah)\nabort ();", and people don't realize how expensive these "cheap"
asserts are.
This would then be a nice project for the Wiki beginners projects pages,
perhaps: Audit the gcc_assert calls in small "inline" functions and see if
there is any measurable gain to win from removing some that don't make sense
(like e.g. the ones you identified in comment #30).
But that's not relevant for this bug report, of course :-)
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34683
^ permalink raw reply [flat|nested] 51+ messages in thread
* [Bug tree-optimization/34683] [4.3 Regression] Fortran FE generated IL pessimizes middle-end IL and analysis
2008-01-05 23:51 [Bug tree-optimization/34683] New: compile-time problem with -fstrict-aliasing jaydub66 at gmail dot com
` (32 preceding siblings ...)
2008-01-08 13:56 ` steven at gcc dot gnu dot org
@ 2008-01-08 16:50 ` rguenth at gcc dot gnu dot org
2008-01-08 21:36 ` rguenth at gcc dot gnu dot org
` (15 subsequent siblings)
49 siblings, 0 replies; 51+ messages in thread
From: rguenth at gcc dot gnu dot org @ 2008-01-08 16:50 UTC (permalink / raw)
To: gcc-bugs
------- Comment #33 from rguenth at gcc dot gnu dot org 2008-01-08 16:29 -------
Subject: Bug 34683
Author: rguenth
Date: Tue Jan 8 16:29:14 2008
New Revision: 131400
URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=131400
Log:
2008-01-08 Richard Guenther <rguenther@suse.de>
PR tree-optimization/34683
* tree-ssa-operands.c (operand_build_cmp): Export.
* tree-ssa-operands.h (operand_build_cmp): Declare.
* tree-vn.c (vuses_compare): Remove.
(sort_vuses): Use operand_build_cmp.
(sort_vuses_heap): Likewise.
* tree-ssa-sccvn.c (vuses_to_vec): Use VEC_reserve, not VEC_alloc
to re-use old VEC if available. Do not sort already sorted VUSEs.
(vdefs_to_vec): Do not sort already sorted VDEFs.
Modified:
trunk/gcc/ChangeLog
trunk/gcc/tree-ssa-operands.c
trunk/gcc/tree-ssa-operands.h
trunk/gcc/tree-ssa-sccvn.c
trunk/gcc/tree-vn.c
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34683
^ permalink raw reply [flat|nested] 51+ messages in thread
* [Bug tree-optimization/34683] [4.3 Regression] Fortran FE generated IL pessimizes middle-end IL and analysis
2008-01-05 23:51 [Bug tree-optimization/34683] New: compile-time problem with -fstrict-aliasing jaydub66 at gmail dot com
` (33 preceding siblings ...)
2008-01-08 16:50 ` rguenth at gcc dot gnu dot org
@ 2008-01-08 21:36 ` rguenth at gcc dot gnu dot org
2008-01-09 1:09 ` steven at gcc dot gnu dot org
` (14 subsequent siblings)
49 siblings, 0 replies; 51+ messages in thread
From: rguenth at gcc dot gnu dot org @ 2008-01-08 21:36 UTC (permalink / raw)
To: gcc-bugs
------- Comment #34 from rguenth at gcc dot gnu dot org 2008-01-08 21:22 -------
Subject: Bug 34683
Author: rguenth
Date: Tue Jan 8 21:21:29 2008
New Revision: 131404
URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=131404
Log:
2008-01-08 Richard Guenther <rguenther@suse.de>
PR fortran/34706
PR tree-optimization/34683
* trans-types.c (gfc_get_array_type_bounds): Use an array type
with known size for accesses if that is known.
Modified:
trunk/gcc/fortran/ChangeLog
trunk/gcc/fortran/trans-types.c
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34683
^ permalink raw reply [flat|nested] 51+ messages in thread
* [Bug tree-optimization/34683] [4.3 Regression] Fortran FE generated IL pessimizes middle-end IL and analysis
2008-01-05 23:51 [Bug tree-optimization/34683] New: compile-time problem with -fstrict-aliasing jaydub66 at gmail dot com
` (34 preceding siblings ...)
2008-01-08 21:36 ` rguenth at gcc dot gnu dot org
@ 2008-01-09 1:09 ` steven at gcc dot gnu dot org
2008-01-09 10:32 ` jaydub66 at gmail dot com
` (13 subsequent siblings)
49 siblings, 0 replies; 51+ messages in thread
From: steven at gcc dot gnu dot org @ 2008-01-09 1:09 UTC (permalink / raw)
To: gcc-bugs
------- Comment #35 from steven at gcc dot gnu dot org 2008-01-09 00:04 -------
How does the trunk compare now to the numbers mentioned in comment #16 ?
--
steven at gcc dot gnu dot org changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|NEW |WAITING
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34683
^ permalink raw reply [flat|nested] 51+ messages in thread
* [Bug tree-optimization/34683] [4.3 Regression] Fortran FE generated IL pessimizes middle-end IL and analysis
2008-01-05 23:51 [Bug tree-optimization/34683] New: compile-time problem with -fstrict-aliasing jaydub66 at gmail dot com
` (36 preceding siblings ...)
2008-01-09 10:32 ` jaydub66 at gmail dot com
@ 2008-01-09 10:32 ` rguenther at suse dot de
2008-01-09 10:35 ` [Bug tree-optimization/34683] [4.3 Regression] SSA rewriting in the loop unroller causes quadratic behavior steven at gcc dot gnu dot org
` (11 subsequent siblings)
49 siblings, 0 replies; 51+ messages in thread
From: rguenther at suse dot de @ 2008-01-09 10:32 UTC (permalink / raw)
To: gcc-bugs
------- Comment #37 from rguenther at suse dot de 2008-01-09 09:45 -------
Subject: Re: [4.3 Regression] Fortran FE
generated IL pessimizes middle-end IL and analysis
On Wed, 9 Jan 2008, jaydub66 at gmail dot com wrote:
> ------- Comment #36 from jaydub66 at gmail dot com 2008-01-09 09:38 -------
>
> > How does the trunk compare now to the numbers mentioned in comment #16 ?
>
> Compiling with rev. 131414 gives:
>
> -O1 -fstrict-aliasing: 33sec, 438197 kB
> -O2: 97sec, 669637 kB
> -O3: 50sec, 392669 kB
>
> This is of course better than a few days ago (thanks to Richard's patches). But
> it's not even close to the numbers in comment #16.
>
> What worries me about this is not the 33sec compile time (I could live with
> that, if I'd have to), but the really large amount of memory. On a machine with
> only 256MB compiling takes several minutes, due to massive swapping.
>
> Richard:
> You said that the remaining issue is probably not gonna be fixed for 4.3. I
> don't quite see the reason for this, since this apparently *was* fixed at some
> point in the 4.3 trunk (see comment 16).
Yeah, but we won't trade wrong-code bugs for better compile-time here.
I have no idea how practical it is to fix basic-block duplication code
(which is the worst offender regarding compile-time in this circumstances
of too many VOPs).
> Also this would really be a shame since it would make GCC 4.3 practically
> unusable for our project (http://gibuu.physik.uni-giessen.de/GiBUU/).
There is a workaround available - you can use
--param max-fields-for-field-sensitive=0
to disable SFTs, but this may disable useful optimization. This is
the route we want to go with 4.4, addressing the missed optimizations in
different ways.
Richard.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34683
^ permalink raw reply [flat|nested] 51+ messages in thread
* [Bug tree-optimization/34683] [4.3 Regression] Fortran FE generated IL pessimizes middle-end IL and analysis
2008-01-05 23:51 [Bug tree-optimization/34683] New: compile-time problem with -fstrict-aliasing jaydub66 at gmail dot com
` (35 preceding siblings ...)
2008-01-09 1:09 ` steven at gcc dot gnu dot org
@ 2008-01-09 10:32 ` jaydub66 at gmail dot com
2008-01-09 10:32 ` rguenther at suse dot de
` (12 subsequent siblings)
49 siblings, 0 replies; 51+ messages in thread
From: jaydub66 at gmail dot com @ 2008-01-09 10:32 UTC (permalink / raw)
To: gcc-bugs
------- Comment #36 from jaydub66 at gmail dot com 2008-01-09 09:38 -------
> How does the trunk compare now to the numbers mentioned in comment #16 ?
Compiling with rev. 131414 gives:
-O1 -fstrict-aliasing: 33sec, 438197 kB
-O2: 97sec, 669637 kB
-O3: 50sec, 392669 kB
This is of course better than a few days ago (thanks to Richard's patches). But
it's not even close to the numbers in comment #16.
What worries me about this is not the 33sec compile time (I could live with
that, if I'd have to), but the really large amount of memory. On a machine with
only 256MB compiling takes several minutes, due to massive swapping.
Richard:
You said that the remaining issue is probably not gonna be fixed for 4.3. I
don't quite see the reason for this, since this apparently *was* fixed at some
point in the 4.3 trunk (see comment 16).
Also this would really be a shame since it would make GCC 4.3 practically
unusable for our project (http://gibuu.physik.uni-giessen.de/GiBUU/).
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34683
^ permalink raw reply [flat|nested] 51+ messages in thread
* [Bug tree-optimization/34683] [4.3 Regression] SSA rewriting in the loop unroller causes quadratic behavior
2008-01-05 23:51 [Bug tree-optimization/34683] New: compile-time problem with -fstrict-aliasing jaydub66 at gmail dot com
` (37 preceding siblings ...)
2008-01-09 10:32 ` rguenther at suse dot de
@ 2008-01-09 10:35 ` steven at gcc dot gnu dot org
2008-01-10 15:05 ` rguenth at gcc dot gnu dot org
` (10 subsequent siblings)
49 siblings, 0 replies; 51+ messages in thread
From: steven at gcc dot gnu dot org @ 2008-01-09 10:35 UTC (permalink / raw)
To: gcc-bugs
--
steven at gcc dot gnu dot org changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|WAITING |NEW
Last reconfirmed|2008-01-06 00:36:29 |2008-01-09 09:56:00
date| |
Summary|[4.3 Regression] Fortran FE |[4.3 Regression] SSA
|generated IL pessimizes |rewriting in the loop
|middle-end IL and analysis |unroller causes quadratic
| |behavior
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34683
^ permalink raw reply [flat|nested] 51+ messages in thread
* [Bug tree-optimization/34683] [4.3 Regression] SSA rewriting in the loop unroller causes quadratic behavior
2008-01-05 23:51 [Bug tree-optimization/34683] New: compile-time problem with -fstrict-aliasing jaydub66 at gmail dot com
` (38 preceding siblings ...)
2008-01-09 10:35 ` [Bug tree-optimization/34683] [4.3 Regression] SSA rewriting in the loop unroller causes quadratic behavior steven at gcc dot gnu dot org
@ 2008-01-10 15:05 ` rguenth at gcc dot gnu dot org
2008-01-10 15:29 ` rguenth at gcc dot gnu dot org
` (9 subsequent siblings)
49 siblings, 0 replies; 51+ messages in thread
From: rguenth at gcc dot gnu dot org @ 2008-01-10 15:05 UTC (permalink / raw)
To: gcc-bugs
------- Comment #38 from rguenth at gcc dot gnu dot org 2008-01-10 14:36 -------
Created an attachment (id=14908)
--> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=14908&action=view)
hack to disable rewriting of VOPs
I tried to change the unroller doing only SSA_NAME replacement update after
each unrolling and leave rewriting the virtual operands to after the whole
pass but either I messed up or this even makes it worse ;) See the prototype
hack.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34683
^ permalink raw reply [flat|nested] 51+ messages in thread
* [Bug tree-optimization/34683] [4.3 Regression] SSA rewriting in the loop unroller causes quadratic behavior
2008-01-05 23:51 [Bug tree-optimization/34683] New: compile-time problem with -fstrict-aliasing jaydub66 at gmail dot com
` (39 preceding siblings ...)
2008-01-10 15:05 ` rguenth at gcc dot gnu dot org
@ 2008-01-10 15:29 ` rguenth at gcc dot gnu dot org
2008-01-10 16:00 ` rguenth at gcc dot gnu dot org
` (8 subsequent siblings)
49 siblings, 0 replies; 51+ messages in thread
From: rguenth at gcc dot gnu dot org @ 2008-01-10 15:29 UTC (permalink / raw)
To: gcc-bugs
------- Comment #39 from rguenth at gcc dot gnu dot org 2008-01-10 15:01 -------
Hmm, looks I was compeltely wrong about the cause of the slowdown. We actually
run cfg_cleanup after cunroll and merge blocks like
<BB1>
...
<BB2>
# SFT.1_2 = PHI <SFT.1_1 (BB1)>
...
# SFT.1000_2 = PHI <SFT.1000_1 (BB1)>
# SFT.1_3 = VDEF <SFT.1_2>
...
# SFT.1000_3 = VDEF <SFT.1_2>
*mem = x;
and in merging the blocks we do (tree_merge_blocks):
/* Remove all single-valued PHI nodes from block B of the form
V_i = PHI <V_j> by propagating V_j to all the uses of V_i. */
for (phi = phi_nodes (b); phi; phi = phi_nodes (b))
{
...
replace_uses_by (def, use);
remove_phi_node (phi, NULL, true);
BUT! replace_uses_by will for _each_ phi-node we replace its uses update
the target stmt! And fold it! We can do better with VOPs
Preparing a patch.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34683
^ permalink raw reply [flat|nested] 51+ messages in thread
* [Bug tree-optimization/34683] [4.3 Regression] SSA rewriting in the loop unroller causes quadratic behavior
2008-01-05 23:51 [Bug tree-optimization/34683] New: compile-time problem with -fstrict-aliasing jaydub66 at gmail dot com
` (40 preceding siblings ...)
2008-01-10 15:29 ` rguenth at gcc dot gnu dot org
@ 2008-01-10 16:00 ` rguenth at gcc dot gnu dot org
2008-01-10 17:57 ` rguenth at gcc dot gnu dot org
` (7 subsequent siblings)
49 siblings, 0 replies; 51+ messages in thread
From: rguenth at gcc dot gnu dot org @ 2008-01-10 16:00 UTC (permalink / raw)
To: gcc-bugs
------- Comment #40 from rguenth at gcc dot gnu dot org 2008-01-10 15:32 -------
After you fix this, we're back to SCCVN is slow with lots of virtual operands.
For -O2 (which appearantly is the worst case) we now have the following
profile:
Each sample counts as 0.01 seconds.
% cumulative self self total
time seconds seconds calls s/call s/call name
10.33 3.08 3.08 313563133 0.00 0.00 VN_INFO
8.85 5.72 2.64 365 0.01 0.04 DFS
7.17 7.86 2.14 96368081 0.00 0.00 iterative_hash_expr
5.77 9.58 1.72 472085 0.00 0.00 valueize_vuses
5.43 11.20 1.62 216407 0.00 0.00 visit_reference_op_store
4.93 12.67 1.47 2093493 0.00 0.00 ggc_alloc_stat
3.15 13.61 0.94 17042636 0.00 0.00 set_bb_for_stmt
2.63 14.40 0.79 2442685 0.00 0.00 add_vars_for_offset
1.74 14.92 0.52 operand_build_cmp
1.54 15.38 0.46 226140 0.00 0.00 shared_vuses_from_stmt
1.48 15.82 0.44 245792 0.00 0.00 vn_reference_lookup
1.27 16.20 0.38 5272581 0.00 0.00 mark_all_vars_used_1
1.24 16.57 0.37 16115332 0.00 0.00 bitmap_set_bit
1.14 16.91 0.34 58 0.01 0.02 remove_unused_locals
1.14 17.25 0.34 6878571 0.00 0.00 htab_find_slot_with_hash
1.04 17.56 0.31 4286747 0.00 0.00 add_virtual_operand
1.04 17.87 0.31 255362 0.00 0.00 vec_gc_o_reserve_1
...
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34683
^ permalink raw reply [flat|nested] 51+ messages in thread
* [Bug tree-optimization/34683] [4.3 Regression] SSA rewriting in the loop unroller causes quadratic behavior
2008-01-05 23:51 [Bug tree-optimization/34683] New: compile-time problem with -fstrict-aliasing jaydub66 at gmail dot com
` (41 preceding siblings ...)
2008-01-10 16:00 ` rguenth at gcc dot gnu dot org
@ 2008-01-10 17:57 ` rguenth at gcc dot gnu dot org
2008-01-10 18:05 ` rguenth at gcc dot gnu dot org
` (6 subsequent siblings)
49 siblings, 0 replies; 51+ messages in thread
From: rguenth at gcc dot gnu dot org @ 2008-01-10 17:57 UTC (permalink / raw)
To: gcc-bugs
------- Comment #41 from rguenth at gcc dot gnu dot org 2008-01-10 16:59 -------
Subject: Bug 34683
Author: rguenth
Date: Thu Jan 10 16:59:06 2008
New Revision: 131446
URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=131446
Log:
2008-01-10 Richard Guenther <rguenther@suse.de>
PR middle-end/34683
* tree-cfg.c (tree_merge_blocks): Do not go through the
full-blown folding and stmt updating path if we just deal
with virtual operands.
* tree-ssa-copy.c (may_propagate_copy): Do not short-cut
test for abnormal SSA_NAMEs.
Modified:
trunk/gcc/ChangeLog
trunk/gcc/tree-cfg.c
trunk/gcc/tree-ssa-copy.c
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34683
^ permalink raw reply [flat|nested] 51+ messages in thread
* [Bug tree-optimization/34683] [4.3 Regression] SSA rewriting in the loop unroller causes quadratic behavior
2008-01-05 23:51 [Bug tree-optimization/34683] New: compile-time problem with -fstrict-aliasing jaydub66 at gmail dot com
` (42 preceding siblings ...)
2008-01-10 17:57 ` rguenth at gcc dot gnu dot org
@ 2008-01-10 18:05 ` rguenth at gcc dot gnu dot org
2008-01-10 18:08 ` [Bug tree-optimization/34683] " rguenth at gcc dot gnu dot org
` (5 subsequent siblings)
49 siblings, 0 replies; 51+ messages in thread
From: rguenth at gcc dot gnu dot org @ 2008-01-10 18:05 UTC (permalink / raw)
To: gcc-bugs
------- Comment #42 from rguenth at gcc dot gnu dot org 2008-01-10 17:02 -------
This is probably all we can get for now - maybe another few % with minor
tweaks,
but nothing earth-shattering. After all, we _do_ have a much larger IL due
to the number of VOPs not partitioned.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34683
^ permalink raw reply [flat|nested] 51+ messages in thread
* [Bug tree-optimization/34683] SSA rewriting in the loop unroller causes quadratic behavior
2008-01-05 23:51 [Bug tree-optimization/34683] New: compile-time problem with -fstrict-aliasing jaydub66 at gmail dot com
` (43 preceding siblings ...)
2008-01-10 18:05 ` rguenth at gcc dot gnu dot org
@ 2008-01-10 18:08 ` rguenth at gcc dot gnu dot org
2008-01-10 18:24 ` steven at gcc dot gnu dot org
` (4 subsequent siblings)
49 siblings, 0 replies; 51+ messages in thread
From: rguenth at gcc dot gnu dot org @ 2008-01-10 18:08 UTC (permalink / raw)
To: gcc-bugs
------- Comment #43 from rguenth at gcc dot gnu dot org 2008-01-10 17:05 -------
Actually this is not a regression against a released compiler, as both the
4.1 and the 4.2 branches ICE for this testcase:
gfortran-4.2 -S -o /dev/null t.f90 -O -fstrict-aliasing
t.f90: In function 'ampli':
t.f90:83: internal compiler error: in gfc_trans_array_constructor, at
fortran/trans-array.c:1466
Please submit a full bug report,
with preprocessed source if appropriate.
See <URL:http://bugs.opensuse.org> for instructions.
so, can we declare this as fixed (the huge number of VOPs we have special
PRs for)?
--
rguenth at gcc dot gnu dot org changed:
What |Removed |Added
----------------------------------------------------------------------------
Summary|[4.3 Regression] SSA |SSA rewriting in the loop
|rewriting in the loop |unroller causes quadratic
|unroller causes quadratic |behavior
|behavior |
Target Milestone|4.3.0 |---
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34683
^ permalink raw reply [flat|nested] 51+ messages in thread
* [Bug tree-optimization/34683] SSA rewriting in the loop unroller causes quadratic behavior
2008-01-05 23:51 [Bug tree-optimization/34683] New: compile-time problem with -fstrict-aliasing jaydub66 at gmail dot com
` (44 preceding siblings ...)
2008-01-10 18:08 ` [Bug tree-optimization/34683] " rguenth at gcc dot gnu dot org
@ 2008-01-10 18:24 ` steven at gcc dot gnu dot org
2008-01-10 22:19 ` jaydub66 at gmail dot com
` (3 subsequent siblings)
49 siblings, 0 replies; 51+ messages in thread
From: steven at gcc dot gnu dot org @ 2008-01-10 18:24 UTC (permalink / raw)
To: gcc-bugs
------- Comment #44 from steven at gcc dot gnu dot org 2008-01-10 17:39 -------
Yes, fixed.
--
steven at gcc dot gnu dot org changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|NEW |RESOLVED
Resolution| |FIXED
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34683
^ permalink raw reply [flat|nested] 51+ messages in thread
* [Bug tree-optimization/34683] SSA rewriting in the loop unroller causes quadratic behavior
2008-01-05 23:51 [Bug tree-optimization/34683] New: compile-time problem with -fstrict-aliasing jaydub66 at gmail dot com
` (45 preceding siblings ...)
2008-01-10 18:24 ` steven at gcc dot gnu dot org
@ 2008-01-10 22:19 ` jaydub66 at gmail dot com
2008-01-10 23:45 ` steven at gcc dot gnu dot org
` (2 subsequent siblings)
49 siblings, 0 replies; 51+ messages in thread
From: jaydub66 at gmail dot com @ 2008-01-10 22:19 UTC (permalink / raw)
To: gcc-bugs
------- Comment #45 from jaydub66 at gmail dot com 2008-01-10 20:14 -------
(In reply to comment #42)
> This is probably all we can get for now - maybe another few % with minor
> tweaks,
> but nothing earth-shattering. After all, we _do_ have a much larger IL due
> to the number of VOPs not partitioned.
Well, it may not be earth-shattering, but it's still a solid improvement - of
more than a few %! With rev. 131446 I get the following numbers:
-O1 -fstrict-aliasing: 11sec, 201825 kB
-O2: 22sec, 392166 kB
-O3: 24sec, 388802 kB
Compile time is reduced by a factor of 4 for -O2, and memory usage is halved
for -O1! Nice job, Richard.
So I'd say compile time really isn't an issue any more. But memory consumption
is still a factor of 10 above what it was last year. Most of that comes from
'tree FRE' (~140MB), but all the junk in 'tree operand scan' is gone. For -O2
and -O3 another large chunk appears in 'tree PRE' (~160MB).
Are you sure you want to close this bug? For me memory consumption is still an
issue, and I would really like to see this fixed at some point. And if there is
another bug which is a duplicate of this, could you point me to it?
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34683
^ permalink raw reply [flat|nested] 51+ messages in thread
* [Bug tree-optimization/34683] SSA rewriting in the loop unroller causes quadratic behavior
2008-01-05 23:51 [Bug tree-optimization/34683] New: compile-time problem with -fstrict-aliasing jaydub66 at gmail dot com
` (46 preceding siblings ...)
2008-01-10 22:19 ` jaydub66 at gmail dot com
@ 2008-01-10 23:45 ` steven at gcc dot gnu dot org
2008-01-11 13:58 ` rguenth at gcc dot gnu dot org
2008-01-12 8:39 ` rakdver at gcc dot gnu dot org
49 siblings, 0 replies; 51+ messages in thread
From: steven at gcc dot gnu dot org @ 2008-01-10 23:45 UTC (permalink / raw)
To: gcc-bugs
------- Comment #46 from steven at gcc dot gnu dot org 2008-01-10 20:21 -------
See e.g. bug 27004 and bug 33974.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34683
^ permalink raw reply [flat|nested] 51+ messages in thread
* [Bug tree-optimization/34683] SSA rewriting in the loop unroller causes quadratic behavior
2008-01-05 23:51 [Bug tree-optimization/34683] New: compile-time problem with -fstrict-aliasing jaydub66 at gmail dot com
` (47 preceding siblings ...)
2008-01-10 23:45 ` steven at gcc dot gnu dot org
@ 2008-01-11 13:58 ` rguenth at gcc dot gnu dot org
2008-01-12 8:39 ` rakdver at gcc dot gnu dot org
49 siblings, 0 replies; 51+ messages in thread
From: rguenth at gcc dot gnu dot org @ 2008-01-11 13:58 UTC (permalink / raw)
To: gcc-bugs
------- Comment #47 from rguenth at gcc dot gnu dot org 2008-01-11 13:29 -------
Most of the PRE/FRE memory is spent in copied VOPs VECs. Unfortunately we
cannot move them to heap memory easily as the get shared in the PRE tables...
I tried to be explicit in managing the memory, but it gets really hard (which
is probably the reason we stick it in GC memory in the first place).
So we could use two variants of all routines, one using GC allocated VECs and
one using heap allocated VECs and use the GC allocated one only for the final
SCC iteration and the PRE tables.
Smells like a task for C++ templates or a really ugly hack in C.
Obviously this is not going to happen for 4.3 :/
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34683
^ permalink raw reply [flat|nested] 51+ messages in thread
* [Bug tree-optimization/34683] SSA rewriting in the loop unroller causes quadratic behavior
2008-01-05 23:51 [Bug tree-optimization/34683] New: compile-time problem with -fstrict-aliasing jaydub66 at gmail dot com
` (48 preceding siblings ...)
2008-01-11 13:58 ` rguenth at gcc dot gnu dot org
@ 2008-01-12 8:39 ` rakdver at gcc dot gnu dot org
49 siblings, 0 replies; 51+ messages in thread
From: rakdver at gcc dot gnu dot org @ 2008-01-12 8:39 UTC (permalink / raw)
To: gcc-bugs
------- Comment #48 from rakdver at gcc dot gnu dot org 2008-01-12 03:59 -------
(In reply to comment #47)
> Most of the PRE/FRE memory is spent in copied VOPs VECs. Unfortunately we
> cannot move them to heap memory easily as the get shared in the PRE tables...
> I tried to be explicit in managing the memory, but it gets really hard (which
> is probably the reason we stick it in GC memory in the first place).
in fact, it copies the VOPS list of each store statement 50 times on average
(once for PRE and once for FRE, i.e., about 100x total). Cannot sccvn be
throttled down somehow so that it does not spend so much memory this way?
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34683
^ permalink raw reply [flat|nested] 51+ messages in thread
end of thread, other threads:[~2008-01-12 4:00 UTC | newest]
Thread overview: 51+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2008-01-05 23:51 [Bug tree-optimization/34683] New: compile-time problem with -fstrict-aliasing jaydub66 at gmail dot com
2008-01-06 0:37 ` [Bug tree-optimization/34683] " jaydub66 at gmail dot com
2008-01-06 0:55 ` pinskia at gcc dot gnu dot org
2008-01-06 6:47 ` jaydub66 at gmail dot com
2008-01-06 8:08 ` jaydub66 at gmail dot com
2008-01-06 9:09 ` jvdelisle at gcc dot gnu dot org
2008-01-06 10:08 ` rguenth at gcc dot gnu dot org
2008-01-06 10:35 ` [Bug fortran/34683] Fortran FE generated IL pessimizes middle-end IL and analysis rguenth at gcc dot gnu dot org
2008-01-06 11:10 ` burnus at gcc dot gnu dot org
2008-01-06 11:14 ` rguenth at gcc dot gnu dot org
2008-01-06 11:46 ` rguenth at gcc dot gnu dot org
2008-01-06 12:02 ` rguenth at gcc dot gnu dot org
2008-01-06 12:02 ` rguenth at gcc dot gnu dot org
2008-01-06 12:08 ` rguenth at gcc dot gnu dot org
2008-01-06 12:19 ` rguenth at gcc dot gnu dot org
2008-01-06 12:30 ` rguenth at gcc dot gnu dot org
2008-01-06 22:15 ` jaydub66 at gmail dot com
2008-01-06 22:24 ` rguenther at suse dot de
2008-01-06 22:24 ` [Bug middle-end/34683] [4.3 Regression] " pinskia at gcc dot gnu dot org
2008-01-07 14:05 ` [Bug tree-optimization/34683] " rguenth at gcc dot gnu dot org
2008-01-07 15:00 ` rguenth at gcc dot gnu dot org
2008-01-07 15:20 ` rguenth at gcc dot gnu dot org
2008-01-07 16:18 ` rguenth at gcc dot gnu dot org
2008-01-07 16:41 ` rguenth at gcc dot gnu dot org
2008-01-07 17:28 ` rguenth at gcc dot gnu dot org
2008-01-08 10:18 ` rguenth at gcc dot gnu dot org
2008-01-08 10:27 ` rguenth at gcc dot gnu dot org
2008-01-08 12:18 ` rguenth at gcc dot gnu dot org
2008-01-08 12:22 ` rguenth at gcc dot gnu dot org
2008-01-08 12:31 ` steven at gcc dot gnu dot org
2008-01-08 12:41 ` steven at gcc dot gnu dot org
2008-01-08 12:57 ` rguenth at gcc dot gnu dot org
2008-01-08 13:02 ` rguenth at gcc dot gnu dot org
2008-01-08 13:56 ` steven at gcc dot gnu dot org
2008-01-08 16:50 ` rguenth at gcc dot gnu dot org
2008-01-08 21:36 ` rguenth at gcc dot gnu dot org
2008-01-09 1:09 ` steven at gcc dot gnu dot org
2008-01-09 10:32 ` jaydub66 at gmail dot com
2008-01-09 10:32 ` rguenther at suse dot de
2008-01-09 10:35 ` [Bug tree-optimization/34683] [4.3 Regression] SSA rewriting in the loop unroller causes quadratic behavior steven at gcc dot gnu dot org
2008-01-10 15:05 ` rguenth at gcc dot gnu dot org
2008-01-10 15:29 ` rguenth at gcc dot gnu dot org
2008-01-10 16:00 ` rguenth at gcc dot gnu dot org
2008-01-10 17:57 ` rguenth at gcc dot gnu dot org
2008-01-10 18:05 ` rguenth at gcc dot gnu dot org
2008-01-10 18:08 ` [Bug tree-optimization/34683] " rguenth at gcc dot gnu dot org
2008-01-10 18:24 ` steven at gcc dot gnu dot org
2008-01-10 22:19 ` jaydub66 at gmail dot com
2008-01-10 23:45 ` steven at gcc dot gnu dot org
2008-01-11 13:58 ` rguenth at gcc dot gnu dot org
2008-01-12 8:39 ` rakdver at gcc dot gnu dot org
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).