From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (qmail 84107 invoked by alias); 12 Aug 2015 07:12:36 -0000
Mailing-List: contact gcc-bugs-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Id:
List-Archive:
List-Post:
List-Help:
Sender: gcc-bugs-owner@gcc.gnu.org
Received: (qmail 84021 invoked by uid 55); 12 Aug 2015 07:12:32 -0000
From: "rguenther at suse dot de"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug target/29256] [4.9/5/6 regression] loop performance regression
Date: Wed, 12 Aug 2015 07:12:00 -0000
X-Bugzilla-Reason: CC
X-Bugzilla-Type: changed
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: gcc
X-Bugzilla-Component: target
X-Bugzilla-Version: 4.2.0
X-Bugzilla-Keywords: missed-optimization
X-Bugzilla-Severity: normal
X-Bugzilla-Who: rguenther at suse dot de
X-Bugzilla-Status: NEW
X-Bugzilla-Resolution:
X-Bugzilla-Priority: P2
X-Bugzilla-Assigned-To: unassigned at gcc dot gnu.org
X-Bugzilla-Target-Milestone: 4.9.4
X-Bugzilla-Flags:
X-Bugzilla-Changed-Fields:
Message-ID:
In-Reply-To:
References:
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: 7bit
X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/
Auto-Submitted: auto-generated
MIME-Version: 1.0
X-SW-Source: 2015-08/txt/msg00749.txt.bz2

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=29256

--- Comment #57 from rguenther at suse dot de ---
On Tue, 11 Aug 2015, wschmidt at gcc dot gnu.org wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=29256
>
> --- Comment #56 from Bill Schmidt ---
> (In reply to Bill Schmidt from comment #53)
> > I'm not a fan of a tree-level unroller.  It's impossible to make good
> > decisions about unroll factors that early.  But your second approach sounds
> > quite promising to me.
>
> I would be willing to soften this statement.
> I think that an early unroller
> might well be a profitable approach for most systems with large caches and so
> forth, where if the unrolling heuristics are not completely accurate we are
> still likely to make a reasonably good decision.  However, I would expect to
> see ports with limited caches/memory to want more accurate control over
> unrolling decisions.  So I could see allowing ports to select between a GIMPLE
> unroller and an RTL unroller (I doubt anybody would want both).
>
> In general it seems like PowerPC could benefit from more aggressive unrolling
> much of the time, provided we can also solve the related IVOPTS problems that
> cause too much register spill.
>
> I may have an interest in working on a GIMPLE unroller, depending on how
> quickly I can complete or shed some other projects...

I think that a separate unrolling pass on GIMPLE would be a hard sell due to
the lack of a good cost model.  _But_ doing unrolling as part of another
transform, like we are doing now, makes sense.  So does eventually moving
parts of an RTL pass involving unrolling to GIMPLE, like modulo scheduling
or SMS (leaving the scheduling part to RTL).

Note that the RTL unroller is not enabled by default at any optimization
level, and that unfortunately the RTL unroller shares flags with the
GIMPLE-level complete peeling (where it mainly controls cost modeling).
Oh, but it's enabled with -fprofile-use.

It's been a long time since I've done SPEC measurements with/without
-funroll-loops (and/or -fpeel-loops).

Note that these flags have secondary effects as well:

toplev.c:    flag_web = flag_unroll_loops || flag_peel_loops;
toplev.c:    flag_rename_registers = flag_unroll_loops || flag_peel_loops;
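
For reference, a minimal source-level sketch of the transform under discussion. It is written by hand and is not GCC output; the function names and the unroll factor of 4 are hypothetical, chosen only for illustration. Picking that factor well is exactly the cost-model question above, since a larger factor saves loop overhead but increases code size and register pressure.

```c
#include <stddef.h>

/* Rolled form: one add plus one compare-and-branch per element. */
long sum_rolled(const int *a, size_t n)
{
    long s = 0;
    for (size_t i = 0; i < n; i++)
        s += a[i];
    return s;
}

/* Hand-unrolled by an assumed factor of 4 (illustrative only): the
   main loop pays the compare-and-branch once per four elements, and
   a remainder loop handles the last n % 4 elements. */
long sum_unrolled4(const int *a, size_t n)
{
    long s = 0;
    size_t i = 0;

    for (; i + 4 <= n; i += 4) {
        s += a[i];
        s += a[i + 1];
        s += a[i + 2];
        s += a[i + 3];
    }
    for (; i < n; i++)          /* remainder iterations */
        s += a[i];
    return s;
}
```

Both functions return the same result for any input; only the loop shape differs.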