Date: Tue, 11 May 2010 08:34:00 -0000
Subject: Re: IVOPT improvement patch
From: Richard Guenther
To: Xinliang David Li
Cc: GCC Patches
Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm

On Tue, May 11, 2010 at 8:35 AM, Xinliang David Li wrote:
> Hi, IVOPT has been one of the main areas of complaint from gcc users,
> and it is often shut off, or the user is forced to use inline assembly to
> write key kernel loops. The following (resulting from the
> investigation of many user complaints) summarizes some of the key
> problems:
>
> 1) Too many induction variables are used and advanced addressing modes
> are not fully taken advantage of. On the latest Intel CPUs, the increased
> loop size (due to iv updates) can have a very large negative impact on
> performance, e.g., when LSD and uop macro fusion get blocked.
> The root cause of the problem is not the cost model used in IVOPT, but
> the algorithm for finding the 'optimal' assignment of iv candidates
> to uses.
>
> 2) Profile information is not used in cost estimation (e.g. computing
> the cost of loop variants).
>
> 3) For a replaced (original) IV that is only live out of the loop (i.e.
> there are no uses inside the loop), the rewrite of the IV occurs inside
> the loop, which usually results in code more expensive than the
> original iv update statement -- and it is very difficult for later
> phases to sink the computation out of the loop (see PR31792).
> The right solution is to materialize/rewrite such ivs directly outside
> the loop (also to avoid introducing overlapping live ranges).
>
> 4) The iv update statement sometimes blocks the forward
> propagation/combination of the memory ref operation (depending on the
> before IV value) with the loop branch compare. Simple-minded
> propagation will lead to overlapping live ranges and additional copy/move
> instructions being generated.
>
> 5) In estimating the global cost (register pressure), the registers
> resulting from LIM of invariant expressions are not considered.
>
> 6) In MEM_REF creation, loop variants and invariants may be assigned to
> the same part -- which is essentially a re-association blocking LIM.
>
> 7) Intrinsic calls that are essentially memory operations are not
> recognized as uses.

8) Replacement pointer induction variables do not inherit alias
information, pessimizing MEM_REF memory operations.

> The attached patch handles all the problems above except for 7.
>
>
> Bootstrapped and regression tested on linux/x86_64.
>
> The patch was not tuned for SPEC, but SPEC testing was done.
> Observable improvements: gcc 4.85%, vpr 1.53%, bzip2 2.36%, and eon
> 2.43% (Machine CPU: Intel Xeon E5345/2.33GHz, m32 mode).

Can you split the patch into pieces and check SPEC numbers also for
64bit operation?
I assume the powerpc people may want to check the performance
impact as well.

Thanks,
Richard.

> Ok for trunk?
>
> Thanks,
>
> David
>
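
For readers unfamiliar with problem 1 in the quoted list, here is a minimal C sketch of the source-level shapes involved. It is not taken from the patch: the function names and the dot-product kernel are hypothetical, and which form a given compiler actually emits depends on the target and optimization level. The first function spells out separate pointer induction variables; the second uses a single index that a target with base + scaled-index addressing can reuse for both loads.

```c
#include <stddef.h>

/* Hypothetical kernel written with two pointer induction variables
   plus the loop counter: three values are updated per iteration. */
long dot_many_ivs(const int *a, const int *b, size_t n)
{
    long sum = 0;
    const int *pa = a;
    const int *pb = b;
    for (size_t i = 0; i < n; i++) {
        sum += (long)*pa * *pb;
        pa++;   /* iv update 1 */
        pb++;   /* iv update 2 */
    }
    return sum;
}

/* Same kernel with a single induction variable. On a target with
   base + index*scale addressing, both loads can share i, so only
   one iv update per iteration is needed. */
long dot_one_iv(const int *a, const int *b, size_t n)
{
    long sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += (long)a[i] * b[i];
    return sum;
}
```

Both functions compute the same result; the difference is only in how many values the loop body has to advance, which is the loop-size effect the quoted text attributes to iv updates.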