public inbox for gcc-bugs@sourceware.org
* [Bug c/59802] New: excessive compile time in loop unswitching
From: dcb314 at hotmail dot com @ 2014-01-14 8:49 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59802
Bug ID: 59802
Summary: excessive compile time in loop unswitching
Product: gcc
Version: 4.9.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: c
Assignee: unassigned at gcc dot gnu.org
Reporter: dcb314 at hotmail dot com
Created attachment 31830
--> http://gcc.gnu.org/bugzilla/attachment.cgi?id=31830&action=edit
gzipped C++ source code
I just compiled the attached code with gcc trunk 20140112 on an x86_64
box with flag -O3 and it took over eight minutes. Using only -O2 took
a more reasonable 2 minutes 38 seconds.
For reference, the Red Hat version of gcc 4.8.2 took 2 minutes 32 seconds
for -O3 and 2 minutes 11 seconds for -O2.
I can see that for -O2, trunk is using about 30 seconds more CPU time,
which is fine, but for -O3 it uses over 5 minutes more.
I tried flag -ftime-report and here are all the times > 1%.
Execution times (seconds)
 phase opt and generate  : 465.18 (100%) usr   0.50 (57%) sys 468.04 (100%) wall  130935 kB (59%) ggc
 loop invariant motion   :  22.50 ( 5%) usr    0.01 ( 1%) sys  22.85 ( 5%) wall        2 kB ( 0%) ggc
 loop unswitching        : 302.37 (65%) usr    0.01 ( 1%) sys 303.82 (65%) wall       72 kB ( 0%) ggc
 CPROP                   :  85.02 (18%) usr    0.09 (10%) sys  85.52 (18%) wall     4445 kB ( 2%) ggc
 TOTAL                   : 466.12              0.88           469.55              221219 kB
Suggest code rework for trunk for -O3, maybe in the area of loop unswitching.
This bug may be a duplicate of bug 38518
* [Bug rtl-optimization/59802] excessive compile time in RTL optimizers (loop unswitching, CPROP)
From: rguenth at gcc dot gnu.org @ 2014-01-14 11:20 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59802
Richard Biener <rguenth at gcc dot gnu.org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |compile-time-hog
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2014-01-14
          Component|c                           |rtl-optimization
            Summary|excessive compile time in   |excessive compile time in
                   |loop unswitching            |RTL optimizers (loop
                   |                            |unswitching, CPROP)
     Ever confirmed|0                           |1
--- Comment #1 from Richard Biener <rguenth at gcc dot gnu.org> ---
GCC 4.8 shows
 CPROP                   :  45.00 (57%) usr   0.02 ( 4%) sys  45.01 (57%) wall    4016 kB ( 2%) ggc
 TOTAL                   :  78.48             0.57            79.04             213705 kB
while GCC 4.9 has
 loop invariant motion   :  10.11 (11%) usr   0.01 ( 2%) sys  10.16 (11%) wall       2 kB ( 0%) ggc
 loop unswitching        :   9.81 (11%) usr   0.00 ( 0%) sys   9.83 (11%) wall       1 kB ( 0%) ggc
 CPROP                   :  48.16 (54%) usr   0.04 ( 7%) sys  48.20 (54%) wall    4444 kB ( 2%) ggc
so I can't really confirm the unswitching slowness (this is r205857 which
is somewhat older than your test).
Generally I think we should probably consider removing RTL unswitching,
there is not a single loop unswitched by RTL for this testcase.
* [Bug rtl-optimization/59802] excessive compile time in RTL optimizers (loop unswitching, CPROP)
From: rguenth at gcc dot gnu.org @ 2014-01-14 11:21 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59802
--- Comment #2 from Richard Biener <rguenth at gcc dot gnu.org> ---
Oh, did you configure with --enable-checking=release for 4.9? (I did)
* [Bug rtl-optimization/59802] excessive compile time in RTL optimizers (loop unswitching, CPROP)
From: dcb314 at hotmail dot com @ 2014-01-14 12:04 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59802
--- Comment #3 from David Binderman <dcb314 at hotmail dot com> ---
(In reply to Richard Biener from comment #2)
> Oh, did you configure with --enable-checking=release for 4.9? (I did)
No, I used --enable-checking=yes.
* [Bug rtl-optimization/59802] excessive compile time in RTL optimizers (loop unswitching, CPROP)
From: rguenth at gcc dot gnu.org @ 2014-01-14 13:24 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59802
Richard Biener <rguenth at gcc dot gnu.org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |steven at gcc dot gnu.org
--- Comment #4 from Richard Biener <rguenth at gcc dot gnu.org> ---
(In reply to David Binderman from comment #3)
> (In reply to Richard Biener from comment #2)
> > Oh, did you configure with --enable-checking=release for 4.9? (I did)
>
> No, I used --enable-checking=yes.
That makes the comparison to 4.8 invalid (4.8 uses --enable-checking=release
by default).
Btw, callgrind shows that compile-time is dominated by
bitmap_intersection_of_preds (and bitmap_ior_and_compl),
called from lcm.c:compute_available. LCM works with
sbitmaps which can be very expensive for large functions.
tree PRE uses regular bitmaps, but it seems that LCM can
end up using the full bitmap via returning bitmap_ones
from bitmap_intersection_of_preds (for a block with no preds).
It seems compute_available doesn't use optimal iteration order
and that explicitly representing the maximum set instead of
handling unvisited preds makes things more expensive (need to
use sbitmaps).
Iterating in inverted postorder gets me
 CPROP                   :   2.13 ( 5%) usr   0.06 (10%) sys   2.20 ( 5%) wall    4444 kB ( 2%) ggc
with no changes in generated code ...
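
To illustrate why the iteration order matters, here is a minimal Python sketch of a worklist-based available-expressions solver. It is not the lcm.c code: the function names, the toy six-block chain CFG, and the expression strings are all assumptions made up for the example, and Python sets stand in for sbitmaps. It seeds the same solver once in reverse postorder and once in postorder and counts block visits.

```python
from collections import deque

def reverse_postorder(succs, entry):
    """Return the blocks reachable from entry in reverse postorder."""
    seen, post = set(), []
    def dfs(block):
        seen.add(block)
        for s in succs[block]:
            if s not in seen:
                dfs(s)
        post.append(block)
    dfs(entry)
    return post[::-1]

def compute_available(succs, gen, kill, universe, entry, initial_order):
    """Forward 'available expressions' dataflow with a FIFO worklist.

    Returns (avout, visits), where visits counts processed blocks.
    """
    preds = {b: [] for b in succs}
    for b, ss in succs.items():
        for s in ss:
            preds[s].append(b)
    # Optimistic start: every block's out-set is the explicit maximum set.
    avout = {b: set(universe) for b in succs}
    work = deque(initial_order)
    queued = set(initial_order)
    visits = 0
    while work:
        b = work.popleft()
        queued.discard(b)
        visits += 1
        if b == entry:
            avin = set()              # nothing is available on entry
        else:
            avin = set(universe)      # intersection over all predecessors
            for p in preds[b]:
                avin &= avout[p]
        new_out = gen[b] | (avin - kill[b])
        if new_out != avout[b]:
            avout[b] = new_out
            for s in succs[b]:        # re-queue successors on change
                if s not in queued:
                    work.append(s)
                    queued.add(s)
    return avout, visits

# Hypothetical CFG: a straight chain 0 -> 1 -> ... -> 5.
succs = {0: [1], 1: [2], 2: [3], 3: [4], 4: [5], 5: []}
universe = {"a+b", "c*d"}
gen = {b: set() for b in succs}
kill = {b: set() for b in succs}
gen[0] = {"a+b"}      # block 0 computes a+b
kill[3] = {"a+b"}     # block 3 clobbers an operand of a+b

rpo = reverse_postorder(succs, 0)
out_rpo, visits_rpo = compute_available(succs, gen, kill, universe, 0, rpo)
out_po, visits_po = compute_available(succs, gen, kill, universe, 0, rpo[::-1])
print("visits in reverse postorder:", visits_rpo)
print("visits in postorder:        ", visits_po)
print("same solution:", out_rpo == out_po)
```

On this acyclic chain the reverse-postorder run settles every block on its first visit, while the postorder run has to revisit blocks once facts from the entry finally reach them. Both orders converge to the same sets, which is in line with the observation above that the generated code does not change.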
* [Bug rtl-optimization/59802] excessive compile time in RTL optimizers (loop unswitching, CPROP)
From: rguenth at gcc dot gnu.org @ 2014-01-14 13:54 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59802
--- Comment #5 from Richard Biener <rguenth at gcc dot gnu.org> ---
http://gcc.gnu.org/ml/gcc-patches/2014-01/msg00780.html
Even better would be to get rid of the explicit maximum set (just ignore
incoming edges with the maximum set, aka 'unvisited' edges during
bitmap_intersection_of_preds). Basically follow what tree PRE does
for antic-in compute. That would make using regular bitmaps possible
(if that is a win - at least computing the changed bit is free). Also
queuing succs at the end of the worklist messes up iteration order for
everything but the first iteration. PRE uses a sbitmap that records
whether a BB was changed.
Anyway, the above simple patch dramatically improves the numbers for this
testcase.
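
The idea of dropping the explicit maximum set can be sketched roughly as follows. This is a hedged Python illustration, not the patch linked above and not the tree PRE code: `intersect_visited_preds`, `solve`, the toy CFG with one loop, and the use of `None` as the "unvisited" marker are all assumptions introduced for the example.

```python
def intersect_visited_preds(preds, avout, block):
    """Intersect the out-sets of already visited predecessors only.

    An unvisited predecessor (avout is None) stands for the implicit
    maximum set, which is the identity for intersection, so it is skipped.
    Returns None if no predecessor has been visited yet.
    """
    result = None
    for p in preds[block]:
        if avout[p] is None:
            continue
        result = set(avout[p]) if result is None else result & avout[p]
    return result

def solve(succs, gen, kill, entry, order):
    """Iterate over `order` until nothing changes; no universe set needed."""
    preds = {b: [] for b in succs}
    for b, ss in succs.items():
        for s in ss:
            preds[s].append(b)
    avout = {b: None for b in succs}   # None == not visited yet
    changed = True
    while changed:
        changed = False
        for b in order:
            if b == entry:
                avin = set()
            else:
                avin = intersect_visited_preds(preds, avout, b)
                if avin is None:
                    continue           # nothing known yet, stay unvisited
            new_out = gen[b] | (avin - kill[b])
            if new_out != avout[b]:
                avout[b] = new_out
                changed = True         # the "changed" flag falls out for free
    return avout

# Hypothetical CFG with a loop: 0 -> 1, 1 -> 2, 2 -> 1 (back edge), 1 -> 3.
succs = {0: [1], 1: [2, 3], 2: [1], 3: []}
gen = {0: {"a+b"}, 1: set(), 2: set(), 3: set()}
kill = {0: set(), 1: set(), 2: {"a+b"}, 3: set()}
order = [0, 1, 3, 2]                   # one reverse postorder of this CFG

avout = solve(succs, gen, kill, entry=0, order=order)
for block in sorted(avout):
    print(block, sorted(avout[block]))
```

On the first pass block 1 only sees block 0, because the back edge from block 2 is still unvisited and is ignored. The second pass then picks up the kill in block 2 and removes `a+b` from blocks 1 and 3. Since no set ever has to be initialised to "all ones", sparse set representations become an option, which is the point made above about regular bitmaps.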
* [Bug rtl-optimization/59802] excessive compile time in RTL optimizers (loop unswitching, CPROP)
From: rguenth at gcc dot gnu.org @ 2014-01-15 12:17 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59802
Richard Biener <rguenth at gcc dot gnu.org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |FIXED
--- Comment #7 from Richard Biener <rguenth at gcc dot gnu.org> ---
Fixed.
* [Bug rtl-optimization/59802] excessive compile time in RTL optimizers (loop unswitching, CPROP)
From: dcb314 at hotmail dot com @ 2014-01-19 21:28 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59802
--- Comment #8 from David Binderman <dcb314 at hotmail dot com> ---
(In reply to Richard Biener from comment #7)
> Fixed.
The results I can report are for trunk dated 20140119:
[dcb@zippy4 foundBugs]$ time ../results/bin/gcc -c bug129.cc
real 0m8.076s
user 0m5.925s
sys 0m0.131s
[dcb@zippy4 foundBugs]$ time ../results/bin/gcc -c -O2 bug129.cc
real 1m0.706s
user 0m57.884s
sys 0m0.402s
[dcb@zippy4 foundBugs]$ time ../results/bin/gcc -c -O3 bug129.cc
real 5m45.982s
user 5m42.793s
sys 0m0.457s
While the first time is trivial, the -O2 time is down by about
60% and the -O3 time is down by about 30%.
Good work!
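
The percentages can be checked with a short calculation. The sketch below assumes the baselines are the figures from the original report in this thread (2 minutes 38 seconds for -O2 and the 468.04 s wall time from -ftime-report for -O3); the message does not say which baseline was used, and the helper names are made up for the example.

```python
def seconds(minutes, secs):
    """Convert a minutes/seconds pair to seconds."""
    return minutes * 60 + secs

def reduction_percent(before, after):
    """Relative drop from before to after, in percent."""
    return 100.0 * (before - after) / before

# Baselines assumed from the original report; new times are the "real" values.
o2_drop = reduction_percent(seconds(2, 38), seconds(1, 0.706))
o3_drop = reduction_percent(468.04, seconds(5, 45.982))

print(f"-O2 wall time reduced by {o2_drop:.1f}%")
print(f"-O3 wall time reduced by {o3_drop:.1f}%")
```

Under these assumptions the -O2 figure comes out slightly above 60% and the -O3 figure a little under 30%, consistent with the rounded numbers quoted above.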