From: Richard Guenther
To: "William J. Schmidt"
Cc: gcc-patches@gcc.gnu.org, bergner@vnet.ibm.com, dje.gcc@gmail.com, steven@gcc.gnu.org, law@redhat.com
Date: Tue, 07 Jun 2011 14:49:00 -0000
Subject: Re: [Design notes, RFC] Address-lowering prototype design (PR46556)
In-Reply-To: <1307456077.4798.39.camel@L3G5336.ibm.com>
References: <1307383631.4798.11.camel@L3G5336.ibm.com> <1307456077.4798.39.camel@L3G5336.ibm.com>

On Tue, Jun 7, 2011 at 4:14 PM, William J. Schmidt wrote:
>
> On Tue, 2011-06-07 at 12:06 +0200, Richard Guenther wrote:
>> On Mon, Jun 6, 2011 at 8:07 PM, William J.
Schmidt
>> wrote:
>
>
>
>> >  * If the original expression will be recognized as a "hidden global store" in
>> >    tree-ssa-sink.c:is_hidden_global_store, but the replacement expression will
>> >    not, it is possible for the dead code eliminator to remove the modified
>> >    statement.  It seems to me this shouldn't normally happen in practice.  For
>> >    now I detect this case and refuse to do the replacement, but I suspect a
>> >    possible front-end or upstream-optimization problem here.  The test case
>> >    that fails here is
>> >    libstdc++-v3/testsuite/23_containers/vector/ext_pointer_modifiers/insert.cc.
>> >    More investigation required.
>>
>> That indeed sounds odd.
>
> When I looked into it, the addressing expression was fairly complex
> initially, with templates and inheritance involved.  The code to create
> the affine tree combination was able to collapse a great deal of the
> arithmetic and produce something much simpler that no longer referenced
> the item that had made it look like a global store originally.  (I.e.,
> buried in the expression were an (&x + a) and a -(&x + a) that cancelled
> out and removed all traces of x.)  It appeared to me that this was being
> done correctly, but it was a very complex expression and I might have
> missed something.
>
> The result was that essentially the entire procedure ended up going
> dead.  It's possible that some information was either lost or not
> provided by the front end that should have prevented that.
>
>
>
>> > Loss of aliasing information
>> > ============================
>> > The most serious problem I've run into is degraded performance due to poorer
>> > instruction scheduling choices.  I tracked this down to
>> > alias.c:nonoverlapping_component_refs_p.
>> >
>> > This code proves that two memory accesses don't overlap by attempting to prove
>> > that they access different fields of the same structure.  This is done using
>> > the MEM_EXPRs of the two rtx's, which record the expression trees that were
>> > translated into the rtx's during expand.  When address lowering is not
>> > present, a simple COMPONENT_REF will appear in the MEM_EXPR:  x.a, for
>> > example.  However, address lowering changes the simple COMPONENT_REF into a
>> > [TARGET_]MEM_REF that is no longer necessarily identifiable as a field
>> > reference.  Thus the aliasing machinery can no longer prove that two such
>> > field references are disjoint.
>> >
>> > This has severe consequences for performance, and has to be dealt with if
>> > address lowering is to be successful.
>> >
>> > I've worked around this with an admittedly fragile solution; I'll discuss the
>> > drawbacks below.  The idea is to construct a mapping from replacement mem_refs
>> > to the original expressions that they replaced.  When a MEM_EXPR is being set
>> > during expand, we first look up the mem_ref in the mapping.  If present, the
>> > MEM_EXPR is set to the original expression, rather than to the mem_ref.  This
>> > essentially duplicates the behavior in the absence of address lowering.
>>
>> Ick.  We had this in the past via TMR_ORIGINAL which caused all sorts
>> of problems.  Removing it didn't cause much degradation because we now
>> preserve points-to information.
>>
>> Originally I played with lowering all memory accesses to MEM_REFs
>> (see the old mem-ref branch), and the loss of type-based alias
>> disambiguation was indeed an issue.
>>
>> But - I definitely do not like the idea of preserving something similar
>> to TMR_ORIGINAL.  Instead we can try preserving some information
>> we derive from it.
>> We keep the original access type that we can use
>> for TBAA but do not retain knowledge on whether the type of the
>> MEM_REF is valid for TBAA or if it is view-converted.
>
> Yes, I really don't like what I have at the moment, either.  I put it in
> place as a stopgap to let me proceed to look for other performance
> problems.
>
> The question is how we can infer useful information for TBAA from the
> MEM_REFs and TMRs.  I poked at trying to identify types and offsets from
> the MEM_EXPRs, but this ended up being useless; I had to constrain too
> many cases to maintain correctness, and couldn't prove the type
> information for the important cases in SPEC I was trying to address.
>
> Unfortunately, the whole design goes down the drain if we can't find a
> way to solve the TBAA issue.  The performance degradations are too
> costly.

If you look at what basic TBAA the alias oracle performs, then it boils
down to the fact that get_alias_set for a.b.c might end up using the
alias set of the type of c, but for MEM[&a + 4] it will use the alias
set of the type of a.

The tree alias oracle extracts both alias sets, that of the outermost
valid type and that of the innermost, as both are equally useful.  But
the MEM_REF (or TARGET_MEM_REF) trees only have storage for one such
alias set.  Thus my idea at some point was to store the other one as
well in some form.  It will not be the full information (after all, the
complete access path does provide some extra information - see
aliasing_component_refs_p).

Btw, I'm looking at lowering bitfield accesses to read-modify-write
cycles and in that context also at lowering unaligned accesses for
targets that do not support them.

Richard.

>
>
> Thanks,
> Bill
>
>
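
For illustration only, here is a toy model in C of the alias-set point
made in the reply above: a.b.c can be disambiguated through the alias
set of the innermost type, while MEM[&a + 4] carries only the alias set
of the type of a unless a second set is stored alongside it.  Every name
here (toy_set, toy_ref, toy_sets_conflict, toy_may_alias, toy_check) is
hypothetical and is not GCC's API; the set numbers and the containment
rule are assumptions chosen to keep the sketch small.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical alias-set numbers for the toy model (not GCC's values).
   SET_ANY stands for a set that conflicts with everything.  */
enum toy_set { SET_ANY = 0, SET_INT = 1, SET_FLOAT = 2, SET_OUTER = 3 };

/* Toy access descriptor: the alias set of the outermost (base) type and
   the alias set used for the access itself.  */
struct toy_ref {
    enum toy_set base_set;
    enum toy_set ref_set;
};

/* Assumption: the struct's set contains the sets of its members.  */
bool toy_contains(enum toy_set outer, enum toy_set member)
{
    return outer == SET_OUTER && (member == SET_INT || member == SET_FLOAT);
}

/* Two sets conflict if either is SET_ANY, they are equal, or one
   contains the other.  */
bool toy_sets_conflict(enum toy_set s1, enum toy_set s2)
{
    return s1 == SET_ANY || s2 == SET_ANY || s1 == s2
           || toy_contains(s1, s2) || toy_contains(s2, s1);
}

/* Basic type-based check on the sets used for the two accesses.  */
bool toy_may_alias(struct toy_ref r1, struct toy_ref r2)
{
    return toy_sets_conflict(r1.ref_set, r2.ref_set);
}

/* Walk through the three cases: component reference, lowered reference
   with one set, lowered reference with the second set kept.  */
void toy_check(void)
{
    struct toy_ref int_store   = { SET_INT,   SET_INT   };  /* *p, p an int pointer */
    struct toy_ref component   = { SET_OUTER, SET_FLOAT };  /* a.b.c, c a float */
    struct toy_ref lowered_one = { SET_OUTER, SET_OUTER };  /* MEM[&a + 4], one set */
    struct toy_ref lowered_two = { SET_OUTER, SET_FLOAT };  /* MEM[&a + 4], both sets */

    assert(!toy_may_alias(component, int_store));    /* disambiguated */
    assert(toy_may_alias(lowered_one, int_store));   /* information lost */
    assert(!toy_may_alias(lowered_two, int_store));  /* recovered */
}
```

Calling toy_check() exercises all three cases.  The sketch deliberately
ignores access-path information, which, as noted above, provides more
than the two alias sets alone.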