From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <gcc-patches-return-212647-listarch-gcc-patches=gcc.gnu.org@gcc.gnu.org>
Received: (qmail 18982 invoked by alias); 25 Feb 2008 22:13:01 -0000
Received: (qmail 18972 invoked by uid 22791); 25 Feb 2008 22:13:00 -0000
X-Spam-Check-By: sourceware.org
Received: from mail.suse.de (HELO mx1.suse.de) (195.135.220.2)     by sourceware.org (qpsmtpd/0.31) with ESMTP; Mon, 25 Feb 2008 22:12:33 +0000
Received: from Relay2.suse.de (relay-ext.suse.de [195.135.221.8]) 	(using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) 	(No client certificate requested) 	by mx1.suse.de (Postfix) with ESMTP id AEA11313A0; 	Mon, 25 Feb 2008 23:12:30 +0100 (CET)
Date: Mon, 25 Feb 2008 22:30:00 -0000
From: Richard Guenther <rguenther@suse.de>
To: Zdenek Dvorak <rakdver@kam.mff.cuni.cz>
Cc: gcc-patches@gcc.gnu.org
Subject: Re: [RFC] Change (flatten) representation of memory references
In-Reply-To: <20080225214530.GA31111@kam.mff.cuni.cz>
Message-ID: <Pine.LNX.4.64.0802252258420.5150@zhemvz.fhfr.qr>
References: <Pine.LNX.4.64.0802251706340.5150@zhemvz.fhfr.qr>  <20080225214530.GA31111@kam.mff.cuni.cz>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Id: <gcc-patches.gcc.gnu.org>
List-Archive: <http://gcc.gnu.org/ml/gcc-patches/>
List-Post: <mailto:gcc-patches@gcc.gnu.org>
List-Help: <mailto:gcc-patches-help@gcc.gnu.org>
Sender: gcc-patches-owner@gcc.gnu.org
X-SW-Source: 2008-02/txt/msg01233.txt.bz2

On Mon, 25 Feb 2008, Zdenek Dvorak wrote:

> Hi,
> 
> > I thought about the invariant addresses again (that we allow
> > &a[5] as is_gimple_min_invariant) and came to the conclusion that
> > rather than allowing a nesting of POINTER_PLUS_EXPR <ADDR_EXPR <a>, 5>
> > as invariant we'd extend ADDR_EXPR to adjust the address by an offset,
> > making all MEM_REF, INDIRECT_MEM_REF, POINTER_PLUS_EXPR and ADDR_EXPR
> > "symmetric".
> > 
> > With that addition I wrote up http://gcc.gnu.org/wiki/MemRef
> > 
> > Requests for additions/clarifications welcome.
> 
> it would be nice if IDX_EXPR also kept track of the bounds of the index
> (i.e., two additional fields).  The proposal mentions keeping
> "persistent value range information for i" -- I am not quite sure what
> that is supposed to mean?

I thought that special casing IDX_EXPR as a (possibly redundant) point to
track a value range for the index (the bounds of the index are a value
range after all) would be less nice than to have the ability to remember
a (constant) value range for any SSA_NAME.

That is, if we would add a persistent map of SSA_NAME -> value_range
we can store the lower/upper bounds for i_4 once we see it used as
an array index.  (Yes, I see that we would need to defer MEM_REF lowering
until after we go into SSA in this case, or insert ASSERT_EXPRs during
lowering)

Other than that, extending IDX_EXPR with lower/upper value is the
only way to keep the range information for the outermost index.  (I
suppose for multi-dimensional accesses we can recover the range from
the stride of the next outer IDX_EXPR, right?)

> Also, something should be said about the order of the indices -- is it
> allowed to reassociate IDX_EXPRs?  Is it allowed to access the same
> array with several different steps (IIRC, fortran frontend does some
> such trick to implement transposition of an array)?

IDX_EXPRs should be not re-associated (likewise the operands should
not be swapped) to ease analysis.  Basically the use-def chain of
the offset component should be the same as walking the indices
from innermost to outermost.

It is allowed to access the same memory with different MEM_REF
(thus dimensionality, strides and steps).  At least I don't see
what should prevent that.

> What about the indices whose value is a constant?  These can be
> moved to offset, but that makes dependency analysis harder, so perhaps
> it would make sense to make this lowering only after loop optimizations?

Constant valued indices are at the moment do not create an IDX_EXPR.
This would be easy to change, but I thought this wouldn't complicate
data dependency analysis (at least it shouldn't make it impossible).
But I don't know to much about the workings of our data dependence
analyzer.

If we defer lowering to after loop optimizatins I don't see a reason to
keep IDX_EXPR at all?  I really like to have lowering before our usual
memory optimizers (SCCVN and DSE for the two I looked at).

Looking at an actual example

float A[8][8][8];
float B[8][8][8];

void foo(void)
{
  int i, j, k;
  for (i=0; i<8; ++i)
    for (j=0; j<8; ++j)
      for (k=0; k<8; ++k)
        A[i][j][k] = B[i][2][k];
}

I see that I didn't implement constant-folding of IDX yet, or the
lowering unconditionally emits an IDX expr in this case (possibly
due to the non-constant offset - so we only get no IDX for constant
innermost indices):

  D.1681 = IDX <0 + k.4 * 4>;
  D.1682 = IDX <D.1681 + 2 * 32>;
  D.1683 = IDX <D.1682 + i.3 * 256>;
  D.1679 = MEM <float {0}, &B + D.1683>;
  D.1684 = IDX <0 + k.2 * 4>;
  D.1685 = IDX <D.1684 + j.1 * 32>;
  D.1686 = IDX <D.1685 + i.0 * 256>;
  MEM <float {0}, &A + D.1686> = D.1679;

does this look ok?  (I'll make an updated patch available that passes
the execute torture at -O0 and -O with just a few ICEs at -O)

Thanks,
Richard.