From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (qmail 64433 invoked by alias); 13 Aug 2015 05:01:04 -0000
Mailing-List: contact gcc-bugs-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Id:
List-Archive:
List-Post:
List-Help:
Sender: gcc-bugs-owner@gcc.gnu.org
Received: (qmail 64361 invoked by uid 48); 13 Aug 2015 05:01:00 -0000
From: "amker at gcc dot gnu.org"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug target/29256] [4.9/5/6 regression] loop performance regression
Date: Thu, 13 Aug 2015 05:01:00 -0000
X-Bugzilla-Reason: CC
X-Bugzilla-Type: changed
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: gcc
X-Bugzilla-Component: target
X-Bugzilla-Version: 4.2.0
X-Bugzilla-Keywords: missed-optimization
X-Bugzilla-Severity: normal
X-Bugzilla-Who: amker at gcc dot gnu.org
X-Bugzilla-Status: NEW
X-Bugzilla-Resolution:
X-Bugzilla-Priority: P2
X-Bugzilla-Assigned-To: unassigned at gcc dot gnu.org
X-Bugzilla-Target-Milestone: 4.9.4
X-Bugzilla-Flags:
X-Bugzilla-Changed-Fields:
Message-ID:
In-Reply-To:
References:
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: 7bit
X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/
Auto-Submitted: auto-generated
MIME-Version: 1.0
X-SW-Source: 2015-08/txt/msg00846.txt.bz2

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=29256

--- Comment #62 from amker at gcc dot gnu.org ---
(In reply to Bill Schmidt from comment #61)
> (In reply to amker from comment #60)
> > (In reply to Bill Schmidt from comment #59)
> > > We don't have a lot of data yet, but we have seen several examples in SPEC
> > > and other benchmarks where turning on -funroll-loops is helpful, but should
> > > be much more helpful -- in many cases performance improves with a much
> > > higher unroll factor.  However, the effectiveness of unrolling is very much
> > > tied up with these issues in IVOPTS, where we currently end up with too many
> > > separate base registers for IVs.
> > > As we increase the unroll factor, we
> > By this, do you mean too many candidates are chosen? Or the issue just like
> > this PR describes? Thanks.
> >
>
> On the surface, it's the issue from this PR where we have lots of separate
> induction variables with their own index registers each requiring an add
> during each iteration.  The presence of this issue masks whether we have too

IMHO, this issue should be fixed by a gimple unroller before IVO, or in the RTL
unroller.  It's not that practical to fix it in IVO.

> many candidates, but in the sense that we often see register spill
> associated with this kind of code, we do have too many.  I.e., the register
> pressure model may not be in tune with the kind of addressing mode that's
> being selected, but that's just a theory.  Or perhaps pressure is just being
> generically under-predicted for POWER.

IVO's reg-pressure model sometimes fails to preserve a small iv set on aarch64
too.  I have this issue on my list.  On the other hand, the loops I saw are
generally very big, so it might be inappropriate for the RTL unroller to unroll
them in the first place.

>
> Up till now we haven't done a lot of detailed analysis.  Hopefully we can
> free somebody up to start looking at some of our unrolling issues soon.
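
For readers following the thread, here is a minimal source-level sketch of the
two code shapes being discussed. It is an illustration only, not GCC output:
the function names and the unroll factor of 4 are hypothetical, and the real
transformation happens on GIMPLE/RTL rather than in C source. The first
function keeps one induction variable per unrolled access, each needing its own
add per iteration; the second keeps a single induction variable and reaches the
other elements through constant offsets.

```c
#include <stddef.h>

/* Hypothetical illustration: 4x-unrolled sum with four separate
   induction variables.  Each iteration performs four index updates. */
long sum_separate_ivs(const int *a, size_t n)
{
    size_t i0 = 0, i1 = 1, i2 = 2, i3 = 3;
    long sum = 0;

    while (i3 < n) {
        sum += a[i0];
        sum += a[i1];
        sum += a[i2];
        sum += a[i3];
        i0 += 4;   /* one add per induction variable ... */
        i1 += 4;
        i2 += 4;
        i3 += 4;   /* ... four adds per iteration in total */
    }
    /* Remainder: fewer than four elements left, starting at i0. */
    for (; i0 < n; i0++)
        sum += a[i0];
    return sum;
}

/* Hypothetical illustration: the same loop with one shared induction
   variable; the other accesses use constant offsets from it. */
long sum_shared_iv(const int *a, size_t n)
{
    size_t i = 0;
    long sum = 0;

    for (; n - i >= 4; i += 4) {   /* a single add per iteration */
        sum += a[i];
        sum += a[i + 1];
        sum += a[i + 2];
        sum += a[i + 3];
    }
    /* Remainder: fewer than four elements left, starting at i. */
    for (; i < n; i++)
        sum += a[i];
    return sum;
}
```

Both functions compute the same result. Whether the constant offsets in the
second form are actually folded into addressing modes depends on the target.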