From: "thomas.preudhomme at arm dot com"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug tree-optimization/60172] [4.9/4.10 Regression] ARM performance regression from trunk@207239
Date: Thu, 15 May 2014 09:51:00 -0000
X-Bugzilla-Status: WAITING
X-Bugzilla-Target-Milestone: 4.9.1

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=60172

--- Comment #20 from Thomas Preud'homme ---
(In reply to rguenther@suse.de from comment #19)
> On Thu, 15 May 2014, thomas.preudhomme at arm dot com wrote:
>
> > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=60172
> >
> > --- Comment #18 from Thomas Preud'homme ---
> > (In reply to Richard Biener from comment #17)
> > >
> > > Citing myself:
> > >
> > > On the GIMPLE level before expansion we have
> > >
> > >   _40 = Arr_2_Par_Ref_22(D) + (_41 + pretmp_20);
> > >
> > >   _51 = Arr_2_Par_Ref_22(D) + (_41 + (pretmp_20 + 1000));
> > >
> > > so if _51 were Arr_2_Par_Ref_22(D) + ((_41 + pretmp_20) + 1000);
> > >
> > > then _41 + pretmp_20 would be fully redundant with the expression needed
> > > by _40.
> >
> > Yes I saw that but I was wondering why would reassoc try this association
> > rather than another since the header of the file doesn't mention any special
> > treatment of explicit integer constants.
> >
> > Besides, wouldn't it still misses that fact that _51 = _40 + 1000?
>
> Yes. But reassoc doesn't associate across POINTER_PLUS_EXPRs.

Is there a reason for that?

>
> RTL CSE could catch it, but for it the association would have to
> be the same for both. If we start from the proposed form
> then at RTL expansion time we could associate
> pointer + (X + CST) to (pointer + X) + CST.

Right.

>
> Feels all somewhat hacky, of course (and relies on TER). There
> may be cases where doing the opposite is better (for example
> if you have ptr1 + (X + 1000) and ptr2 + (X + 1000)). Association
> to make CSE possible is always hard if CSE itself cannot associate
> to maximize the number of CSE opportunities. So at the moment
> any choice is just canonicalization.

Exactly my thought. I'm not sure if that's what you have in mind when you
write about association for CSE, but I was thinking of a scheme that
resembles what tree_to_aff_combination_expand does and organizes all
expanded expressions so that they can be compared easily (read:
efficiently). With such a capability it would no longer be necessary to do
the first replacement with forwprop+reassoc+dom, as everything could be
done in CSE.
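
To make the idea concrete, here is a minimal Python sketch of comparing
expressions after flattening them into an affine form (a constant plus a
multiset of symbols). It is only an illustration of the scheme described
above, not GCC code: the tuple encoding and the helper names to_affine and
constant_offset are hypothetical, POINTER_PLUS_EXPR is treated as an
ordinary addition, and overflow and type widths are ignored.

```python
from collections import Counter

def to_affine(expr):
    """Flatten a nested sum into (constant, Counter of symbol -> coefficient).

    expr is an int (constant), a str (SSA name) or a tuple ("+", lhs, rhs).
    Hypothetical encoding, chosen only for this illustration.
    """
    if isinstance(expr, int):
        return expr, Counter()
    if isinstance(expr, str):
        return 0, Counter({expr: 1})
    op, lhs, rhs = expr
    if op != "+":
        raise ValueError(f"unsupported operator: {op!r}")
    lhs_const, lhs_terms = to_affine(lhs)
    rhs_const, rhs_terms = to_affine(rhs)
    return lhs_const + rhs_const, lhs_terms + rhs_terms

def constant_offset(a, b):
    """Return b - a if a and b differ only by a constant, else None."""
    a_const, a_terms = to_affine(a)
    b_const, b_terms = to_affine(b)
    return b_const - a_const if a_terms == b_terms else None

# The two statements quoted from comment #17.
e40 = ("+", "Arr_2_Par_Ref_22", ("+", "_41", "pretmp_20"))
e51 = ("+", "Arr_2_Par_Ref_22", ("+", "_41", ("+", "pretmp_20", 1000)))
# The alternative association proposed for _51.
e51_alt = ("+", "Arr_2_Par_Ref_22", ("+", ("+", "_41", "pretmp_20"), 1000))

# The case where the opposite association would be preferable.
p1 = ("+", "ptr1", ("+", "X", 1000))
p2 = ("+", "ptr2", ("+", "X", 1000))

print(constant_offset(e40, e51))      # 1000
print(constant_offset(e40, e51_alt))  # 1000
print(constant_offset(p1, p2))        # None
```

Both associations of _51 flatten to the same form, so _51 = _40 + 1000 is
found without canonicalizing first. The ptr1/ptr2 pair shares only the
sub-expression X + 1000, so a whole-expression comparison like this one
does not catch it; a real scheme would also need to look up partial sums.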