Subject: Re: [Design notes, RFC] Address-lowering prototype design (PR46556)
From: "William J. Schmidt"
To: Richard Guenther
Cc: gcc-patches@gcc.gnu.org, bergner@vnet.ibm.com, dje.gcc@gmail.com, steven@gcc.gnu.org, law@redhat.com
References: <1307383631.4798.11.camel@L3G5336.ibm.com>
Date: Tue, 07 Jun 2011 14:33:00 -0000
Message-Id: <1307456077.4798.39.camel@L3G5336.ibm.com>
Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm

On Tue, 2011-06-07 at 12:06 +0200, Richard Guenther wrote:
> On Mon, Jun 6, 2011 at 8:07 PM, William J. Schmidt
> wrote:
> > * If the original expression will be recognized as a "hidden global store" in
> >   tree-ssa-sink.c:is_hidden_global_store, but the replacement expression will
> >   not, it is possible for the dead code eliminator to remove the modified
> >   statement. It seems to me this shouldn't normally happen in practice. For
> >   now I detect this case and refuse to do the replacement, but I suspect a
> >   possible front-end or upstream-optimization problem here. The test case
> >   that fails here is libstdc++-v3/testsuite/23_containers/vector/
> >   ext_pointer_modifiers/insert.cc. More investigation required.
>
> That indeed sounds odd.

When I looked into it, the addressing expression was fairly complex
initially, with templates and inheritance involved. The code to create
the affine tree combination was able to collapse a great deal of the
arithmetic and produce something much simpler that no longer referenced
the item that had made it look like a global store originally. (I.e.,
buried in the expression were an (&x + a) and a -(&x + a) that cancelled
out and removed all traces of x.) It appeared to me that this was being
done correctly, but it was a very complex expression and I might have
missed something. The result was that essentially the entire procedure
ended up going dead.
It's possible that some information was either lost or not provided by
the front end that should have prevented that.

> > Loss of aliasing information
> > ============================
> > The most serious problem I've run into is degraded performance due to poorer
> > instruction scheduling choices. I tracked this down to
> > alias.c:nonoverlapping_component_refs_p.
> >
> > This code proves that two memory accesses don't overlap by attempting to prove
> > that they access different fields of the same structure. This is done using
> > the MEM_EXPRs of the two rtx's, which record the expression trees that were
> > translated into the rtx's during expand. When address lowering is not
> > present, a simple COMPONENT_REF will appear in the MEM_EXPR: x.a, for
> > example. However, address lowering changes the simple COMPONENT_REF into a
> > [TARGET_]MEM_REF that is no longer necessarily identifiable as a field
> > reference. Thus the aliasing machinery can no longer prove that two such
> > field references are disjoint.
> >
> > This has severe consequences for performance, and has to be dealt with if
> > address lowering is to be successful.
> >
> > I've worked around this with an admittedly fragile solution; I'll discuss the
> > drawbacks below. The idea is to construct a mapping from replacement mem_refs
> > to the original expressions that they replaced. When a MEM_EXPR is being set
> > during expand, we first look up the mem_ref in the mapping. If present, the
> > MEM_EXPR is set to the original expression, rather than to the mem_ref. This
> > essentially duplicates the behavior in the absence of address lowering.
>
> Ick. We had this in the past via TMR_ORIGINAL which caused all sorts
> of problems. Removing it didn't cause much degradation because we now
> preserve points-to information.
>
> Originally I played with lowering all memory accesses to MEM_REFs
> (see the old mem-ref branch), and the loss of type-based alias
> disambiguation was indeed an issue.
>
> But - I definitely do not like the idea of preserving something similar
> to TMR_ORIGINAL. Instead we can try preserving some information
> we derive from it. We keep the original access type that we can use
> for TBAA but do not retain knowledge on whether the type of the
> MEM_REF is valid for TBAA or if it is view-converted.

Yes, I really don't like what I have at the moment, either. I put it in
place as a stopgap to let me proceed to look for other performance
problems.

The question is how we can infer useful information for TBAA from the
MEM_REFs and TMRs. I poked at trying to identify types and offsets from
the MEM_EXPRs, but this ended up being useless; I had to constrain too
many cases to maintain correctness, and couldn't prove the type
information for the important cases in SPEC I was trying to address.

Unfortunately, the whole design goes down the drain if we can't find a
way to solve the TBAA issue. The performance degradations are too
costly.

Thanks,
Bill
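
To make the aliasing problem discussed in this thread concrete, here is
a toy sketch of a field-based disjointness check and of what happens to
it once the field view is gone. This is not GCC code: toy_access,
toy_nonoverlapping_fields_p, toy_lower and toy_demo are hypothetical
names invented for illustration, and the real logic in
alias.c:nonoverlapping_component_refs_p works on MEM_EXPR trees rather
than on integer ids.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: none of these names exist in GCC.  */

#define TOY_UNKNOWN (-1)

/* What the alias check knows about one memory access.  */
struct toy_access
{
  int record_id;  /* which struct type is accessed, or TOY_UNKNOWN */
  int field_id;   /* which field of that struct, or TOY_UNKNOWN */
};

/* Return true only if X and Y provably touch different fields of the
   same record type.  False means "cannot prove", not "they overlap".  */
bool
toy_nonoverlapping_fields_p (struct toy_access x, struct toy_access y)
{
  if (x.record_id == TOY_UNKNOWN || y.record_id == TOY_UNKNOWN
      || x.field_id == TOY_UNKNOWN || y.field_id == TOY_UNKNOWN)
    return false;
  return x.record_id == y.record_id && x.field_id != y.field_id;
}

/* Model address lowering (an assumption of this sketch): the access
   becomes base + offset, so the record/field view is no longer
   visible to the check above.  */
struct toy_access
toy_lower (struct toy_access a)
{
  (void) a;
  struct toy_access lowered = { TOY_UNKNOWN, TOY_UNKNOWN };
  return lowered;
}

/* Small self-check a reader can call from main.  */
void
toy_demo (void)
{
  struct toy_access x_a = { 0, 0 };  /* models x.a */
  struct toy_access x_b = { 0, 1 };  /* models x.b */

  /* With the field view intact, x.a and x.b are provably disjoint.  */
  assert (toy_nonoverlapping_fields_p (x_a, x_b));

  /* After lowering, the same two accesses can no longer be proven
     disjoint by this check.  */
  assert (!toy_nonoverlapping_fields_p (toy_lower (x_a), toy_lower (x_b)));
}
```

Calling toy_demo () from a main function exercises both cases. The
mapping workaround described in the quoted text would, in this toy
model, amount to remembering the pre-lowering toy_access for each
lowered one and handing that to the check instead.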