From: "rguenth at gcc dot gnu.org"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug rtl-optimization/64164] [4.9/5 Regression] one more stack slot used due to one less inlining level
Date: Wed, 03 Dec 2014 10:03:00 -0000

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64164

Richard Biener changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |missed-optimization
   Last reconfirmed|2014-12-03 00:00:00         |
          Component|middle-end                  |rtl-optimization
   Target Milestone|---                         |4.9.3

--- Comment #3 from Richard Biener ---
The difference is in whether there are extra user-named variables in the end
and thus SSA coalescing decision differences:

 stm_load (volatile stm_word_t * addr)
 {
-  stm_word_t l;
-  stm_word_t value;
   stm_word_t version;
   stm_word_t l;
   struct r_entry_t * r;
-  stm_word_t now;
...
+  size_t _32;
+  size_t _33;
+  size_t _34;
...
 Conflict graph:
+1: 3
+3: 1

 After sorting:
 Sorted Coalesce list:
+(16610) _30 <-> _33
 (651) _10 <-> _30
...
-Coalesce list: (10)_10 & (30)_30 [map: 1, 2] : Success -> 1
+Coalesce list: (30)_30 & (33)_33 [map: 2, 3] : Success -> 2
+Coalesce list: (10)_10 & (30)_30 [map: 1, 2] : Fail due to conflict

So it turns out the different coalescing ends up generating worse code.
It would be interesting to see why we decide that coalescing _30 and _33
is so much more beneficial than coalescing _10 and _30.

Ah, it simply uses EDGE_FREQUENCY... and for some reason we predicted
that _33 & 1 != 0 is taken only 10% of the time.

So ... the theory is that the version is faster on the important path?
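
To make the ordering effect in the dump concrete, here is a toy model of cost-ordered greedy coalescing. It is only a sketch, not GCC's actual implementation: the function name greedy_coalesce and its data layout are made up for illustration, and the single conflict between _10 and _33 is my reading of the "1: 3" / "3: 1" conflict graph together with the partition numbers shown in the "[map: ...]" fields. The costs 16610 and 651 are taken from the sorted coalesce list.

```python
def greedy_coalesce(candidates, conflicts):
    """Toy greedy coalescer (hypothetical helper, not GCC code).

    candidates: list of (cost, a, b) coalesce requests.
    conflicts:  set of frozenset({x, y}) pairs that are live at the same time.
    Returns a log of (a, b, outcome) in the order the pairs were tried.
    """
    rep = {}      # name -> parent in a simple union-find forest
    members = {}  # representative -> set of names merged into it

    def find(x):
        rep.setdefault(x, x)
        while rep[x] != x:
            x = rep[x]
        return x

    log = []
    # Highest cost first, mirroring the "Sorted Coalesce list" in the dump.
    for cost, a, b in sorted(candidates, key=lambda c: -c[0]):
        ra, rb = find(a), find(b)
        if ra == rb:
            continue  # already in the same partition
        ma = members.setdefault(ra, {ra})
        mb = members.setdefault(rb, {rb})
        # The merge fails if any member of one partition conflicts
        # with any member of the other.
        if any(frozenset((x, y)) in conflicts for x in ma for y in mb):
            log.append((a, b, "fail"))
        else:
            rep[rb] = ra
            ma |= mb
            del members[rb]
            log.append((a, b, "success"))
    return log


# Assumed conflict: _10 (partition 1) vs _33 (partition 3).
conflicts = {frozenset(("_10", "_33"))}

# New dump: the expensive _30 <-> _33 candidate is tried first.
new = greedy_coalesce([(16610, "_30", "_33"), (651, "_10", "_30")], conflicts)

# Old dump: only the _10 <-> _30 candidate exists.
old = greedy_coalesce([(651, "_10", "_30")], conflicts)

print(new)
print(old)
```

In this model _10 <-> _30 only fails because _30 has already been merged with _33, which conflicts with _10. That matches the "Fail due to conflict" line above, and it shows why the frequency-derived cost of the _30 <-> _33 candidate decides the outcome.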