From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (qmail 10572 invoked by alias); 5 Sep 2012 11:01:02 -0000
Received: (qmail 10543 invoked by uid 22791); 5 Sep 2012 11:01:01 -0000
X-SWARE-Spam-Status: No, hits=-4.3 required=5.0 tests=ALL_TRUSTED,AWL,BAYES_00,KHOP_THREADED,TW_TM
X-Spam-Check-By: sourceware.org
Received: from localhost (HELO gcc.gnu.org) (127.0.0.1) by sourceware.org (qpsmtpd/0.43rc1) with ESMTP; Wed, 05 Sep 2012 11:00:46 +0000
From: "rguenth at gcc dot gnu.org"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug tree-optimization/46590] [4.6/4.7/4.8 Regression] long compile time with -O2 and many loops
Date: Wed, 05 Sep 2012 11:01:00 -0000
X-Bugzilla-Reason: CC
X-Bugzilla-Type: changed
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: gcc
X-Bugzilla-Component: tree-optimization
X-Bugzilla-Keywords: compile-time-hog, memory-hog
X-Bugzilla-Severity: normal
X-Bugzilla-Who: rguenth at gcc dot gnu.org
X-Bugzilla-Status: NEW
X-Bugzilla-Priority: P2
X-Bugzilla-Assigned-To: unassigned at gcc dot gnu.org
X-Bugzilla-Target-Milestone: 4.6.4
X-Bugzilla-Changed-Fields:
Message-ID:
In-Reply-To:
References:
X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/
Auto-Submitted: auto-generated
Content-Type: text/plain; charset="UTF-8"
MIME-Version: 1.0
Mailing-List: contact gcc-bugs-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Id:
List-Archive:
List-Post:
List-Help:
Sender: gcc-bugs-owner@gcc.gnu.org
X-SW-Source: 2012-09/txt/msg00346.txt.bz2

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46590

--- Comment #36 from Richard Guenther 2012-09-05 10:59:52 UTC ---
If I fix that (PR54489) by iterating over immediate dominators when
querying AVAIL_OUT instead of accumulating then other loop opts quickly
take over in compile-time, but memory usage stays reasonable at -O1.
LIM is now the pass that pushes memory usage to 1.8GB - all other
optimization passes are happy with just ~800MB.
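
To make the AVAIL_OUT trade-off above concrete, here is a minimal Python
sketch. It is not GCC code and not the actual PR54489 patch. It assumes a
toy model in which `idom` maps each block to its immediate dominator
(None for the entry block) and `gen` holds only the values a block makes
available itself. Those two names and all three functions are
hypothetical, chosen for illustration.

```python
from typing import Dict, Hashable, Optional, Set

Block = Hashable


def accumulated_avail_out(idom: Dict[Block, Optional[Block]],
                          gen: Dict[Block, Set[str]]) -> Dict[Block, Set[str]]:
    """Accumulating variant (toy model): every block stores a full copy of
    everything available along its dominator chain."""
    avail: Dict[Block, Set[str]] = {}
    for block in idom:
        # Collect the part of the dominator chain that has no set yet.
        chain = []
        b = block
        while b is not None and b not in avail:
            chain.append(b)
            b = idom[b]
        inherited = avail[b] if b is not None else set()
        # Fill in sets from the topmost missing dominator downwards.
        for c in reversed(chain):
            inherited = inherited | gen[c]  # new set, so no aliasing
            avail[c] = inherited
    return avail


def is_available(value: str, block: Block,
                 idom: Dict[Block, Optional[Block]],
                 gen: Dict[Block, Set[str]]) -> bool:
    """Query variant (toy model): keep only the per-block sets and walk up
    the immediate dominators at query time."""
    b = block
    while b is not None:
        if value in gen[b]:
            return True
        b = idom[b]
    return False


def stored_entries(sets: Dict[Block, Set[str]]) -> int:
    """Total number of set members kept in memory."""
    return sum(len(s) for s in sets.values())
```

In this model a dominator chain of n blocks that each generate one value
costs n(n+1)/2 stored entries when accumulating, but only n entries when
walking immediate dominators on each query. The price is that a query
becomes a walk up the dominator tree rather than a single set lookup.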

The issue with LIM is that it analyzes the whole function instead of
working on one outermost loop at a time (PR54488).

Then of course IRA comes along and wrecks memory usage again ...
(create_loop_tree_nodes).  One can tame down IRA a bit using
-fno-ira-loop-pressure -fira-region=one.

We then arrive at roughly a constant 900MB memory usage for the full(!)
testcase at -O1 and

Execution times (seconds)
 phase opt and generate : 495.90 (99%) usr   1.98 (98%) sys 499.91 (99%) wall  870508 kB (92%) ggc
 df reaching defs       :  19.16 ( 4%) usr   0.06 ( 3%) sys  19.18 ( 4%) wall       0 kB ( 0%) ggc
 alias stmt walking     :  28.75 ( 6%) usr   0.21 (10%) sys  29.12 ( 6%) wall    2336 kB ( 0%) ggc
 tree SSA rewrite       :  63.42 (13%) usr   0.02 ( 1%) sys  63.77 (13%) wall   18830 kB ( 2%) ggc
 tree SSA incremental   :  74.64 (15%) usr   0.03 ( 1%) sys  74.44 (15%) wall   25886 kB ( 3%) ggc
 dominance frontiers    : 101.71 (20%) usr   0.09 ( 4%) sys 102.17 (20%) wall       0 kB ( 0%) ggc
 dominance computation  :  52.56 (11%) usr   0.09 ( 4%) sys  53.35 (11%) wall       0 kB ( 0%) ggc
 loop invariant motion  : 101.20 (20%) usr   0.10 ( 5%) sys 101.75 (20%) wall    2700 kB ( 0%) ggc
 TOTAL                  : 498.79             2.03           502.87             947764 kB

(all entries > 10s)

The incremental SSA stuff is complete loop unrolling / IV
canonicalization, which does an SSA update once per loop (similar to what
loop header copying formerly did).  Fixing that leads to

Execution times (seconds)
 phase opt and generate : 214.62 (99%) usr   1.53 (96%) sys 217.41 (99%) wall  870508 kB (92%) ggc
 df reaching defs       :  23.07 (11%) usr   0.01 ( 1%) sys  23.10 (10%) wall       0 kB ( 0%) ggc
 alias stmt walking     :  28.51 (13%) usr   0.23 (14%) sys  28.93 (13%) wall    2336 kB ( 0%) ggc
 loop invariant motion  : 105.43 (48%) usr   0.01 ( 1%) sys 106.22 (48%) wall    2700 kB ( 0%) ggc
 TOTAL                  : 217.56             1.59           220.44             947764 kB

so RTL invariant motion is now the main offender ;)