From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (qmail 12094 invoked by alias); 14 Jan 2014 13:24:05 -0000
Mailing-List: contact gcc-bugs-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Id:
List-Archive:
List-Post:
List-Help:
Sender: gcc-bugs-owner@gcc.gnu.org
Received: (qmail 12035 invoked by uid 48); 14 Jan 2014 13:24:01 -0000
From: "rguenth at gcc dot gnu.org"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug rtl-optimization/59802] excessive compile time in RTL optimizers (loop unswitching, CPROP)
Date: Tue, 14 Jan 2014 13:24:00 -0000
X-Bugzilla-Reason: CC
X-Bugzilla-Type: changed
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: gcc
X-Bugzilla-Component: rtl-optimization
X-Bugzilla-Version: 4.9.0
X-Bugzilla-Keywords: compile-time-hog
X-Bugzilla-Severity: normal
X-Bugzilla-Who: rguenth at gcc dot gnu.org
X-Bugzilla-Status: NEW
X-Bugzilla-Priority: P3
X-Bugzilla-Assigned-To: unassigned at gcc dot gnu.org
X-Bugzilla-Target-Milestone: ---
X-Bugzilla-Flags:
X-Bugzilla-Changed-Fields: cc
Message-ID:
In-Reply-To:
References:
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: 7bit
X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/
Auto-Submitted: auto-generated
MIME-Version: 1.0
X-SW-Source: 2014-01/txt/msg01459.txt.bz2

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59802

Richard Biener changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |steven at gcc dot gnu.org

--- Comment #4 from Richard Biener ---
(In reply to David Binderman from comment #3)
> (In reply to Richard Biener from comment #2)
> > Oh, did you configure with --enable-checking=release for 4.9? (I did)
>
> No, I used --enable-checking=yes.

That makes the comparison to 4.8 invalid (uses --enable-checking=release
by default).

Btw, callgrind shows that compile-time is dominated by
bitmap_intersection_of_preds (and bitmap_ior_and_compl), called from
lcm.c:compute_available.
LCM works with sbitmaps, which can be very expensive for large functions.
Tree PRE uses regular bitmaps, but it seems that LCM can end up using the
full bitmap by returning bitmap_ones from bitmap_intersection_of_preds
(for a block with no preds).

It seems compute_available doesn't use an optimal iteration order, and that
explicitly representing the maximum set instead of handling unvisited
preds makes things more expensive (it forces the use of sbitmaps).

Iterating in inverted postorder gets me

 CPROP : 2.13 ( 5%) usr 0.06 (10%) sys 2.20 ( 5%) wall 4444 kB ( 2%) ggc

with no changes in generated code ...
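
To illustrate the iteration-order point, here is a minimal Python sketch. It is
not GCC code, and every name in it (reverse_postorder, solve_available, the
toy CFG) is made up for the illustration. It solves a forward availability
problem by visiting blocks in inverted (reverse) postorder and simply skipping
predecessors that have not been visited yet, rather than materializing an
all-ones set for them. Python sets stand in for bitmaps, and the sketch
assumes nothing is available on entry to the function.

```python
def reverse_postorder(succs, entry):
    """Return the blocks reachable from `entry` in reverse postorder."""
    seen, postorder = {entry}, []
    stack = [(entry, iter(succs[entry]))]
    while stack:
        block, it = stack[-1]
        for nxt in it:
            if nxt not in seen:
                seen.add(nxt)
                stack.append((nxt, iter(succs[nxt])))
                break
        else:
            # All successors handled: the block is finished.
            postorder.append(block)
            stack.pop()
    return postorder[::-1]


def solve_available(succs, entry, gen, kill):
    """Iterate avout[b] = gen[b] | (avin[b] - kill[b]) to a fixed point.

    avin[b] is the intersection of avout[p] over the preds p that have
    already been visited.  A block missing from `avout` is "unvisited",
    which plays the role of the maximum set without ever building it.
    """
    preds = {b: [] for b in succs}
    for b, ss in succs.items():
        for s in ss:
            preds[s].append(b)

    order = reverse_postorder(succs, entry)
    avout = {}
    passes = 0
    changed = True
    while changed:
        changed = False
        passes += 1
        for b in order:
            visited = [avout[p] for p in preds[b] if p in avout]
            if b == entry or not visited:
                avin = frozenset()  # assumption: nothing available on entry
            else:
                avin = frozenset.intersection(*visited)
            new = frozenset(gen[b]) | (avin - frozenset(kill[b]))
            if avout.get(b) != new:
                avout[b] = new
                changed = True
    return avout, passes


# Toy CFG: a diamond (a -> b/c -> d) inside a loop (d -> a).
succs = {"entry": ["a"], "a": ["b", "c"], "b": ["d"], "c": ["d"],
         "d": ["a", "exit"], "exit": []}
gen = {"entry": {"x"}, "a": set(), "b": {"y"}, "c": {"y"},
       "d": set(), "exit": set()}
kill = {"entry": set(), "a": set(), "b": set(), "c": {"x"},
        "d": set(), "exit": set()}

avout, passes = solve_available(succs, "entry", gen, kill)
```

In reverse postorder every non-entry block has at least one predecessor
visited before it, so the "no visited preds" case only arises for the entry
block and no full set is ever needed; only back-edge predecessors are skipped
on the first pass.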