From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <gcc-patches-return-262093-listarch-gcc-patches=gcc.gnu.org@gcc.gnu.org>
Received: (qmail 1210 invoked by alias); 11 May 2010 17:49:30 -0000
Received: (qmail 1198 invoked by uid 22791); 11 May 2010 17:49:28 -0000
X-SWARE-Spam-Status: No, hits=-2.0 required=5.0	tests=AWL,BAYES_00,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,SPF_HELO_PASS,TW_EG,T_RP_MATCHES_RCVD
X-Spam-Check-By: sourceware.org
Received: from smtp-out.google.com (HELO smtp-out.google.com) (74.125.121.35)    by sourceware.org (qpsmtpd/0.43rc1) with ESMTP; Tue, 11 May 2010 17:49:21 +0000
Received: from wpaz13.hot.corp.google.com (wpaz13.hot.corp.google.com [172.24.198.77])	by smtp-out.google.com with ESMTP id o4BHnHLd024295	for <gcc-patches@gcc.gnu.org>; Tue, 11 May 2010 10:49:18 -0700
Received: from vws8 (vws8.prod.google.com [10.241.21.136])	by wpaz13.hot.corp.google.com with ESMTP id o4BHn48f032168	for <gcc-patches@gcc.gnu.org>; Tue, 11 May 2010 10:49:16 -0700
Received: by vws8 with SMTP id 8so178908vws.36        for <gcc-patches@gcc.gnu.org>; Tue, 11 May 2010 10:49:16 -0700 (PDT)
MIME-Version: 1.0
Received: by 10.220.124.194 with SMTP id v2mr376970vcr.125.1273600156061; Tue, 	11 May 2010 10:49:16 -0700 (PDT)
Received: by 10.220.17.8 with HTTP; Tue, 11 May 2010 10:49:15 -0700 (PDT)
In-Reply-To: <4BE99178.6030007@moene.org>
References: <AANLkTini_oBL-mdWrt2IrVYw6f0-Mil_hi0AxPmqHjHv@mail.gmail.com>	 <4BE99178.6030007@moene.org>
Date: Tue, 11 May 2010 17:49:00 -0000
Message-ID: <AANLkTimOfweSk1yg3daF9K_Cdo4oYw91nOPWO_mSka6K@mail.gmail.com>
Subject: Re: IVOPT improvement patch
From: Xinliang David Li <davidxl@google.com>
To: Toon Moene <toon@moene.org>
Cc: GCC Patches <gcc-patches@gcc.gnu.org>
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: quoted-printable
X-System-Of-Record: true
X-IsSubscribed: yes
Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Id: <gcc-patches.gcc.gnu.org>
List-Archive: <http://gcc.gnu.org/ml/gcc-patches/>
List-Post: <mailto:gcc-patches@gcc.gnu.org>
List-Help: <mailto:gcc-patches-help@gcc.gnu.org>
Sender: gcc-patches-owner@gcc.gnu.org
X-SW-Source: 2010-05/txt/msg00777.txt.bz2

On Tue, May 11, 2010 at 10:18 AM, Toon Moene <toon@moene.org> wrote:
> On 05/11/2010 08:35 AM, Xinliang David Li wrote:
>
>> Hi, IVOPT has been one of the main area of complaints from gcc users
>> and it is often shutdown or user is forced to use inline assembly to
>> write key kernel loops. The following (resulting from the
>> investigation of many user complaints) summarize some of the key
>> problems:
>
>> 6) IN MEM_REF creation, loop variant and invariants may be assigned to
>> the same part -- which is essentially a re-association blocking LIM
>
> On the other hand, some recombination of induction variables is necessary=
 to
> prevent excessive register pressure (and the resulting spills).
>
> From my slides at the May, 1999 Linux Expo:
>
> "Let's turn our attention to the kinetic energy loop again:
> \begin{verbatim}
> =A0 =A0 =A0DO 810 I=3DILONP2,ILNLT
> =A0 =A0 =A0 =A0 ZEK(I) =3D 0.25 *
> =A0 =A0 + =A0 =A0 =A0 =A0( ( PUZ(I-1 =A0 ,K)*PUZ(I-1 =A0 ,K)
> =A0 =A0 + =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 *HYU(I-1 =A0 )
> =A0 =A0 + =A0 =A0 =A0 =A0 =A0+ PUZ(I =A0 =A0 ,K)*PUZ(I =A0 =A0 ,K)
> =A0 =A0 + =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 *HYU(I =A0 =A0 ))*RHYV (I)
> =A0 =A0 + =A0 =A0 =A0 =A0+ ( PVZ(I-ILON,K)*PVZ(I-ILON,K)
> =A0 =A0 + =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 *HXV(I-ILON)
> =A0 =A0 + =A0 =A0 =A0 =A0 =A0+ PVZ(I =A0 =A0 ,K)*PVZ(I =A0 =A0 ,K)
> =A0 =A0 + =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 *HXV(I =A0 =A0 ))*RHXU (I) )
> =A0810 =A0CONTINUE
> \end{verbatim}
> If we strength reduce all induction variables and move
> all loop invariant code out of the loop, we need
> 11 registers to hold the addresses needed to step through
> the arrays.
> \end{slide}
> \begin{slide}{}
> We can do better, by noting that
> \begin{verbatim}
> { PUZ(I-1 =A0 ,K), PUZ(I =A0 =A0 ,K) }
> { PVZ(I-ILON,K), PVZ(I =A0 =A0 ,K) }
> { HYU(I-1 =A0 ) =A0, HYU(I =A0 =A0 ) =A0 }
> { HXV(I-ILON) =A0, HXV(I =A0 =A0 ) =A0 }
> \end{verbatim}
> form 4 {\em equivalence classes} of induction variables
> that differ only by a constant - which means they
> can be written in the form of address-register-with-offset.
>

Finding the optimal partition and iv selection/assignment is the core
part of the IVOPT. GCC IVOPT uses a cost based approach to achieve
that, however there are some issues that prevent the above optimal
solution above. One of the intention of this patch is to address it.

The problem described in 6 is different. Assuming we have the
following linear address expressions:

base + iv + inv*coeff,  where iv is the induction variable, and inv is
loop invariant and coeff is constant 2/4/8, the synthesized mem ref
(without the fix) can be:

MEM_REF(base + iv, inv, coeff)

With the fix,

MEM_REF(base + inv*coeff, iv, 1)

Where base+inv*coeff can be later hoisted out of the loop.

Of course the added cost is the increased register pressure -- but
this is modelled by the so call pseudo invariant.

Thanks,

David


> In doing so, we save 4 registers and need only 7 registers
> for addressing.
>
> Richard Henderson implemented this optimization, which
> will be part of egcs-1.2 ... I mean gcc-2.95."
>
> That was '99, so it was discussing code compiled by g77 and using RTL
> optimization passes only - but the idea is the same.
>
> Kind regards,
>
> --
> Toon Moene - e-mail: toon@moene.org - phone: +31 346 214290
> Saturnushof 14, 3738 XG =A0Maartensdijk, The Netherlands
> At home: http://moene.org/~toon/; weather: http://moene.org/~hirlam/
> Progress of GNU Fortran: http://gcc.gnu.org/gcc-4.5/changes.html#Fortran
>