From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <gcc-patches-return-207710-listarch-gcc-patches=gcc.gnu.org@gcc.gnu.org>
Received: (qmail 18003 invoked by alias); 14 Nov 2007 11:17:45 -0000
Received: (qmail 17993 invoked by uid 22791); 14 Nov 2007 11:17:44 -0000
X-Spam-Check-By: sourceware.org
Received: from py-out-1112.google.com (HELO py-out-1112.google.com) (64.233.166.179)     by sourceware.org (qpsmtpd/0.31) with ESMTP; Wed, 14 Nov 2007 11:17:40 +0000
Received: by py-out-1112.google.com with SMTP id a29so3070866pyi         for <gcc-patches@gcc.gnu.org>; Wed, 14 Nov 2007 03:17:38 -0800 (PST)
Received: by 10.65.54.9 with SMTP id g9mr18923167qbk.1195039058556;         Wed, 14 Nov 2007 03:17:38 -0800 (PST)
Received: by 10.65.232.20 with HTTP; Wed, 14 Nov 2007 03:17:38 -0800 (PST)
Message-ID: <84fc9c000711140317o5abefc88n1c01e562d46baed0@mail.gmail.com>
Date: Wed, 14 Nov 2007 12:41:00 -0000
From: "Richard Guenther" <richard.guenther@gmail.com>
To: "Diego Novillo" <dnovillo@google.com>
Subject: Re: Fix PR 33870
Cc: gcc-patches@gcc.gnu.org
In-Reply-To: <84fc9c000711140255m22880809w7611ad878918bf96@mail.gmail.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit
Content-Disposition: inline
References: <b798aad50711071624q3c0ffdf0ib21f115b838a0736@mail.gmail.com> 	 <84fc9c000711080245n4f52e669gbcecaa3485f55161@mail.gmail.com> 	 <4739C673.5080905@google.com> 	 <84fc9c000711130808s5ecc11fcp1df00f20f379b76c@mail.gmail.com> 	 <84fc9c000711140255m22880809w7611ad878918bf96@mail.gmail.com>
X-IsSubscribed: yes
Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Id: <gcc-patches.gcc.gnu.org>
List-Archive: <http://gcc.gnu.org/ml/gcc-patches/>
List-Post: <mailto:gcc-patches@gcc.gnu.org>
List-Help: <mailto:gcc-patches-help@gcc.gnu.org>
Sender: gcc-patches-owner@gcc.gnu.org
X-SW-Source: 2007-11/txt/msg00785.txt.bz2

On Nov 14, 2007 11:55 AM, Richard Guenther <richard.guenther@gmail.com> wrote:
>
> On Nov 13, 2007 5:08 PM, Richard Guenther <richard.guenther@gmail.com> wrote:
> >
> > On Nov 13, 2007 4:44 PM, Diego Novillo <dnovillo@google.com> wrote:
> > > Richard Guenther wrote:
> > >
> > > > taking the address of result.x.pDirty causes the SFT to be marked as
> > > > in substructure and the problem re-surfaces.
> > >
> > > Yes.  Thanks for the test case.  The original patch did not take into
> > > account that the nesting level of the memory reference is also important.
> > >
> > > There are two problems here:
> > >
> > > 1- Given an arbitrary memory expression M dereferencing a pointer P into
> > > a nested structure, the offset computed for M is the offset starting at
> > > P.  However, the offsets computed for the original SFTs are offsets from
> > > the top of the pointed-to memory object O.  So, when we recognize this
> > > situation, we must adjust the offsets.  This was the situation
> > > recognized and fixed by the original patch.
> > >
> > > The bug was in the detection of whether P is "pointing to the middle of
> > > an object".  This is where the nesting level comes in to play.  Suppose
> > > that M was the expression p_3->b[3] and the original pointed-to object
> > > was v.x.y:
> > >
> > >                 struct {
> > >                         ...
> > >                         struct {
> > >                                 ...
> > >                                 struct {
> > >                                         ...
> > >                                         int b[4];
> > >                                 } y;
> > >                         } x;
> > >                 } v;
> > >
> > >                 p_3 = &v.x.y;
> > >
> > > If alias analysis created SFTs for this structure, the array 'b' will
> > > have 4 SFTs associated with it.  Say SFT.12, SFT.13, SFT.14 and SFT.15.
> > >   However, only SFT.12 will be added to the alias set of p_3's name tag.
> > >
> > > So, when we see the expression 'p_3->b[3]', we compute an offset of 96,
> > > but we need SFT.14 which will probably have an offset much higher than
> > > that.  So, we adjust 96 by the offset of SFT.12 and get to SFT.14.
> > >
> > > In the original patch, we recognized this situation by asking whether
> > > the SFT.12 was inside a nested structure.  However, it may happen that
> > > the pointer was actually pointing to the base of the original object
> > > 'v'.  In which case, we would have the expression 'p_3->x.y.b[3]', which
> > > also takes us to SFT.12.
> > >
> > > But in this case, the offset we computed from 'p_3->x.y.b[3]' is
> > > *already* the right offset, so adding the offset of SFT.12 to it gives
> > > us the wrong offset.  So, what this patch does is determine the nesting
> > > level of each SFT and the nesting level of the memory expression.  We
> > > only need to adjust the offset of an SFT if the nesting level of the
> > > memory expression is lower than the nesting level of the SFT that we are
> > > analyzing.
> > >
> > >
> > > 2- The second problem we have with this scheme is that we rely on the
> > > presence of key SFTs to adjust offsets when we detect references to
> > > nested aggregates.  This is something we cannot easily undo because
> > > inside the operand scanner we cannot really start doing use-def chain
> > > chasing to determine the original pointed-to objects.  This work has
> > > already been done by alias analysis, so duplicating this in the operand
> > > scanner would be wasteful.
> > >
> > > This means that this scheme cannot tolerate these key SFTs to be
> > > partitioned away.  If they are partitioned, then inside the operand
> > > scanner is essentially impossible to determine where in the structure is
> > > an arbitrary memory expression referring to.  This is shown in the new
> > > test gcc.dg/tree-ssa/alias-16.c.
> > >
> > > One way to deal with this would be to use the pointed-to set for the
> > > base pointer instead of the alias set of the name tag.  This has the
> > > problem that we'd miss the compile-time benefits of partitioning, as we
> > > would be traversing the original sets instead of the partitioned
> > > (smaller) alias sets.  Also, we may want to get rid of the duplicate
> > > pointed-to/alias sets in the future (not sure if they're both worth
> > > keeping).
> > >
> > > The other way of dealing with this is to recognize which SFTs are
> > > important to not partition and make sure they are never partitioned.
> > > During alias analysis, if a nested SFT is added to the alias set of a
> > > pointer, then it is marked as unpartitionable.  The partitioner, in
> > > turn, gives these SFTs a very high score and makes sure that they are
> > > never added to a partition.  The effects of this are/should be
> > > negligible as it only involves a single SFT and nested structures.
> > >
> > > The new test alias-16.c tests for these edge cases.
> > >
> > > Tested on x86_64 and ppc64.  Committed to trunk.
> >
> > Thanks.  Now, you should be able to do all this without introducing
> > all the nesting-level stuff if you make all pointed-to SFTs unpartitionable.
> >
> > In fact,
> >
> > > -  if (SFT_IN_NESTED_STRUCT (var))
> > > +  if (full_ref
> > > +      && SFT_NESTING_LEVEL (var) > 0
> > > +      && ref_nesting_level (full_ref) < SFT_NESTING_LEVEL (var))
> > >      {
> > >        /* Since VAR is an SFT inside a nested structure, the OFFSET
> > >          computed by get_ref_base_and_extent is the offset from the
> > > -        start of the immediately containing structure.  However, to
> > > -        find out what other SFTs are affected by this reference, we
> > > -        need to know the offsets starting at the root structure in
> > > -        the nesting hierarchy.
> > > +        start of the immediately containing structure.  If VAR is an
> > > +        SFT inside a nested structure, then FULL_REF may be a
> > > +        reference to the structure immediately enclosing SFT, and so
> > > +        OFFSET will be the offset from the start of the immediately
> > > +        enclosing structure.
> > > +
> > > +        However, to find out what other SFTs are affected by this
> > > +        reference, we need to know the offsets starting at the root
> > > +        structure in the nesting hierarchy.
> > >
> > >          For instance, given the following structure:
> >
> > This looks like a pure workaround for that you do not retain the offset
> > zero SFT as unpartitionable.  Otherwise you'd enter this function with
> > it (if it is pointed-to), and you can always add the pointed-to SFT offset.
> >
> > I believe your patch makes it more twisted than it needs to be (it doesn't
> > even look like an optimization, as you are scanning for pointed-to offset
> > zero multiple times, for all pointed-to SFTs with a nesting level >
> > than the ref).

Oh, and as you basically go back towards my first fix that was reverted this
patch causes compile-time regressions and we hit
internal compiler error: in ssa_operand_alloc, at tree-ssa-operands.c:484
more often as well.

(I believe the latter is because the max-aliased-vops limit is a per-function
one while a per-stmt one would be more useful, though I see that it is
difficult to estimate per-stmt VOP reduction during partitioning.  But as
we do the limit per-function we are not efficient in preventing this ICE)

Richard.