From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (qmail 2166 invoked by alias); 27 Jan 2015 07:56:45 -0000
Mailing-List: contact gcc-bugs-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Id:
List-Archive:
List-Post:
List-Help:
Sender: gcc-bugs-owner@gcc.gnu.org
Received: (qmail 2017 invoked by uid 48); 27 Jan 2015 07:56:30 -0000
From: "amker at gcc dot gnu.org"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug tree-optimization/62173] [5.0 regression] 64bit Arch can't ivopt while 32bit Arch can
Date: Tue, 27 Jan 2015 07:56:00 -0000
X-Bugzilla-Reason: CC
X-Bugzilla-Type: changed
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: gcc
X-Bugzilla-Component: tree-optimization
X-Bugzilla-Version: 5.0
X-Bugzilla-Keywords: missed-optimization
X-Bugzilla-Severity: normal
X-Bugzilla-Who: amker at gcc dot gnu.org
X-Bugzilla-Status: ASSIGNED
X-Bugzilla-Priority: P1
X-Bugzilla-Assigned-To: jiwang at gcc dot gnu.org
X-Bugzilla-Target-Milestone: 5.0
X-Bugzilla-Flags:
X-Bugzilla-Changed-Fields:
Message-ID:
In-Reply-To:
References:
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: 7bit
X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/
Auto-Submitted: auto-generated
MIME-Version: 1.0
X-SW-Source: 2015-01/txt/msg03009.txt.bz2

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=62173

--- Comment #27 from amker at gcc dot gnu.org ---
(In reply to Jiong Wang from comment #24)
> (In reply to amker from comment #23)
>
> partially agree.
>
> at least for the single use case given by Seb, I think tree ivopt should do
> it. (I verified clang do ivopt correctly for the case)

LLVM generates correct code, but I am not sure it's because of ivopt.
The dump after ivopt for LLVM looks like this:

; Function Attrs: nounwind
define void @bar(i32 %d) #0 {
entry:
  %A = alloca [10 x i8], align 1
  %cmp2 = icmp sgt i32 %d, 0
  br i1 %cmp2, label %while.body.lr.ph, label %while.end

while.body.lr.ph:                                 ; preds = %entry
  %0 = sext i32 %d to i64
  br label %while.body

while.body:                                       ; preds = %while.body.lr.ph, %while.body
  %indvars.iv = phi i64 [ %0, %while.body.lr.ph ], [ %indvars.iv.next, %while.body ]
  %indvars.iv.next = add nsw i64 %indvars.iv, -1
  %scevgep = getelementptr i8* %A4, i64 %indvars.iv
  %1 = load i8* %scevgep, align 1, !tbaa !1
  tail call void @foo(i8 %1) #2
  %2 = add i64 %indvars.iv.next, 1
  %cmp = icmp sgt i64 %2, 1
  br i1 %cmp, label %while.body, label %while.end.loopexit

The induction variable chosen is actually the original biv (d), just like in
GCC. So even if we fix the idx_find_step issue, GCC's ivopt can still generate
code like the following:

  Loop-preheader
    ...
  Loop-body:
    iv = phi
    tmp = (POINTER_TYPE)&A;
    foo(MEM[base:tmp, index:iv]);

Without proper RTL optimization, the issue with calculating the base address
of A very likely remains.

>
> for the rtl re-associate, it's a little bit painful from my experiment
> experiences. as it's not always good to reassociate virtual_frame + offset,
> we can only benefit if it's in loop, because the re-associate will increase
> register pressure, there will be situations that more callee-saved regs
> used, and finally we run into unnecessary push/pop in pro/epilogue... and I
> haven't found a good place where we can safely re-use existed rtl info and
> do the rtl re-association as I am afraid rebuild those rtl info will cause
> compile time penalty.
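
To make the distinction above concrete, here is a rough C sketch of the two loop shapes being discussed. It is an illustration only. The source loop is reconstructed from the LLVM dump, not copied from the bug report, and the names bar_biv, bar_ptr_iv, fill_A, seen and n_seen are hypothetical. The array is also filled with known values, and foo() is replaced by a recording stub, so that the two variants can be compared; the original test case does neither.

```c
#include <stddef.h>

/* Hypothetical stand-in for the external foo(): it records each byte
   it receives so the two loop shapes can be compared. */
char seen[32];
size_t n_seen;

void foo(char c)
{
  if (n_seen < sizeof seen)
    seen[n_seen++] = c;
}

/* Assumption: give the stack array defined contents ('a', 'b', ...). */
void fill_A(char *A)
{
  for (int i = 0; i < 10; i++)
    A[i] = (char)('a' + i);
}

/* Shape matching the dump: the loop keeps the original counter d (the
   biv), and the address of A[d] is formed from the base of A plus the
   index on every iteration. Requires d <= 9. */
void bar_biv(int d)
{
  char A[10];
  fill_A(A);
  while (d > 0)
    foo(A[d--]);
}

/* Shape with a pointer induction variable: the base address is
   computed once before the loop and only the pointer is stepped. */
void bar_ptr_iv(int d)
{
  char A[10];
  fill_A(A);
  if (d <= 0)
    return;
  const char *p = A + d;   /* computed once, in the preheader */
  do {
    foo(*p);
    p--;
  } while (p != A);
}
```

Both functions pass the same bytes to foo() in the same order; they differ only in whether the address computation for A stays inside the loop body, which is the part that the RTL passes would still have to clean up in the first form.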