From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <gcc-patches-return-262045-listarch-gcc-patches=gcc.gnu.org@gcc.gnu.org>
Received: (qmail 497 invoked by alias); 11 May 2010 08:34:15 -0000
Received: (qmail 485 invoked by uid 22791); 11 May 2010 08:34:14 -0000
X-SWARE-Spam-Status: No, hits=-1.8 required=5.0	tests=AWL,BAYES_00,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FROM,RCVD_IN_DNSWL_NONE
X-Spam-Check-By: sourceware.org
Received: from mail-gx0-f209.google.com (HELO mail-gx0-f209.google.com) (209.85.217.209)    by sourceware.org (qpsmtpd/0.43rc1) with ESMTP; Tue, 11 May 2010 08:34:09 +0000
Received: by gxk1 with SMTP id 1so2865349gxk.16        for <gcc-patches@gcc.gnu.org>; Tue, 11 May 2010 01:34:07 -0700 (PDT)
MIME-Version: 1.0
Received: by 10.150.100.9 with SMTP id x9mr9308955ybb.99.1273566847349; Tue, 	11 May 2010 01:34:07 -0700 (PDT)
Received: by 10.151.11.21 with HTTP; Tue, 11 May 2010 01:34:07 -0700 (PDT)
In-Reply-To: <AANLkTini_oBL-mdWrt2IrVYw6f0-Mil_hi0AxPmqHjHv@mail.gmail.com>
References: <AANLkTini_oBL-mdWrt2IrVYw6f0-Mil_hi0AxPmqHjHv@mail.gmail.com>
Date: Tue, 11 May 2010 08:34:00 -0000
Message-ID: <AANLkTiluCGvOYgihb0XJTtsk0VrVdY61PsEuoFgFz05w@mail.gmail.com>
Subject: Re: IVOPT improvement patch
From: Richard Guenther <richard.guenther@gmail.com>
To: Xinliang David Li <davidxl@google.com>
Cc: GCC Patches <gcc-patches@gcc.gnu.org>
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: quoted-printable
X-IsSubscribed: yes
Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Id: <gcc-patches.gcc.gnu.org>
List-Archive: <http://gcc.gnu.org/ml/gcc-patches/>
List-Post: <mailto:gcc-patches@gcc.gnu.org>
List-Help: <mailto:gcc-patches-help@gcc.gnu.org>
Sender: gcc-patches-owner@gcc.gnu.org
X-SW-Source: 2010-05/txt/msg00729.txt.bz2

On Tue, May 11, 2010 at 8:35 AM, Xinliang David Li <davidxl@google.com> wrote:
> Hi, IVOPT has been one of the main areas of complaint from gcc users,
> and it is often turned off, or the user is forced to use inline assembly
> to write key kernel loops. The following (resulting from the
> investigation of many user complaints) summarizes some of the key
> problems:
>
> 1) Too many induction variables are used and advanced addressing modes
> are not fully taken advantage of. On the latest Intel CPUs, the increased
> loop size (due to iv updates) can have a very large negative impact on
> performance, e.g., when LSD and uop macro fusion get blocked. The root
> cause of the problem is not in the cost model used in IVOPT, but in
> the algorithm for finding the 'optimal' assignment from iv candidates
> to uses.
>
> 2) Profile information is not used in cost estimation (e.g. computing
> the cost of loop variants).
>
> 3) For a replaced (original) IV that is only live out of the loop (i.e.
> there are no uses inside the loop), the rewrite of the IV occurs inside
> the loop, which usually results in code more expensive than the
> original iv update statement -- and it is very difficult for later
> phases to sink the computation out of the loop (see PR31792).
> The right solution is to materialize/rewrite such ivs directly outside
> the loop (also to avoid introducing overlapping live ranges).
>
> 4) The iv update statement sometimes blocks the forward
> propagation/combination of the memory ref operation (depending on the
> IV value before the update) with the loop branch compare. Simple-minded
> propagation will lead to overlapping live ranges and additional
> copy/move instructions being generated.
>
> 5) In estimating the global cost (register pressure), the registers
> resulting from LIM of invariant expressions are not considered.
>
> 6) In MEM_REF creation, loop variants and invariants may be assigned to
> the same part -- which is essentially a re-association blocking LIM.
>
> 7) Intrinsic calls that are essentially memory operations are not
> recognized as uses.

8) Replacement pointer induction variables do not inherit alias
information, which pessimizes MEM_REF memory operations.
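
To make items 1 and 3 above concrete, here is a minimal C sketch. It is
not taken from the patch or from any of the PRs; the function names and
kernels are hypothetical and only show the source-level shape of the two
problems. Which form IVOPT actually produces depends on the target and
the cost model.

```c
#include <stddef.h>

/* Item 1 (hypothetical kernel): three pointer induction variables plus a
   counter, each needing its own update in every iteration. */
long dot3_many_ivs(const int *a, const int *b, const int *c, size_t n)
{
    long sum = 0;
    const int *pa = a, *pb = b, *pc = c;
    for (size_t i = 0; i < n; i++) {
        sum += (long)*pa * *pb + *pc;
        pa++;
        pb++;
        pc++;
    }
    return sum;
}

/* Same computation with a single index IV. On targets with
   base + index * scale addressing, all three loads can share `i`,
   so only one IV update is needed per iteration. */
long dot3_one_iv(const int *a, const int *b, const int *c, size_t n)
{
    long sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += (long)a[i] * b[i] + c[i];
    return sum;
}

/* Item 3 (hypothetical kernel): `i` has no use inside the loop apart
   from its own update; only its final value is live out. */
size_t find_end_in_loop(const char *s)
{
    size_t i = 0;
    const char *p = s;
    while (*p) {
        p++;
        i++;
    }
    return i;
}

/* Equivalent form where the live-out value is materialized once,
   outside the loop, from the remaining IV `p`. */
size_t find_end_after_loop(const char *s)
{
    const char *p = s;
    while (*p)
        p++;
    return (size_t)(p - s);
}
```

Each pair computes the same result; the second function of each pair is
the shape one would hope to get after IV selection.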

> The attached patch handles all the problems above except for 7.
>
>
> Bootstrapped and regression tested on linux/x86_64.
>
> The patch was not tuned for SPEC, but SPEC testing was done.
> Observable improvements: gcc 4.85%, vpr 1.53%, bzip2 2.36%, and eon
> 2.43% (Machine CPU: Intel Xeon E5345/2.33GHz, -m32 mode).

Can you split the patch into pieces and also check SPEC numbers
for 64-bit operation?  I assume the PowerPC people may want to
check the performance impact as well.

Thanks,
Richard.

> Ok for trunk?
>
> Thanks,
>
> David
>