From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (qmail 16720 invoked by alias); 19 Aug 2009 11:53:34 -0000
Received: (qmail 16635 invoked by uid 22791); 19 Aug 2009 11:53:33 -0000
X-SWARE-Spam-Status: No, hits=-1.9 required=5.0 tests=AWL,BAYES_00
X-Spam-Check-By: sourceware.org
Received: from mail1-relais-roc.national.inria.fr (HELO mail1-relais-roc.national.inria.fr) (192.134.164.82) by sourceware.org (qpsmtpd/0.43rc1) with ESMTP; Wed, 19 Aug 2009 11:53:25 +0000
Received: from gaia.futurs.inria.fr (HELO [195.83.212.216]) ([195.83.212.216]) by mail1-relais-roc.national.inria.fr with ESMTP/TLS/DHE-RSA-AES256-SHA; 19 Aug 2009 13:53:22 +0200
Message-ID: <4A8BE7B2.7080708@inria.fr>
Date: Wed, 19 Aug 2009 12:13:00 -0000
From: Albert Cohen
User-Agent: Mozilla-Thunderbird 2.0.0.19 (X11/20090103)
MIME-Version: 1.0
To: gcc@gcc.gnu.org
Subject: complete_unrolli / complete_unroll
References:
In-Reply-To:
Content-Type: text/plain; charset=KOI8-R
Content-Transfer-Encoding: 7bit
X-IsSubscribed: yes
Mailing-List: contact gcc-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Id:
List-Archive:
List-Post:
List-Help:
Sender: gcc-owner@gcc.gnu.org
X-SW-Source: 2009-08/txt/msg00331.txt.bz2

When debugging graphite, we ran into code bloat issues caused by
pass_complete_unrolli being called very early in the non-IPA
optimization sequence. Much later, the full-blown pass_complete_unroll
is scheduled, and this one does not do any harm.

Strangely, this early unrolling pass (tuned to unroll only inner loops)
is enabled only at -O3, independently of the -funroll-loops flag.

Does anyone remember why it is there, for which platform it is useful,
and what performance regressions we would see if we removed it?

My guess is that it can only do harm: disabling or reducing the
effectiveness of the (loop-level) vectorizer and increasing compilation
time.
Thanks,
Albert

PS: When this question is solved, it will also be interesting to start a
serious discussion on how to improve the flexibility in customizing pass
ordering and parameterization of passes depending on the target. Grigori
Fursin's work shows the strong benefits and already provides a working
prototype. This question is independent of whether the customization is
done by experts or machine-learning/statistical techniques.