From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 18967 invoked by alias); 14 Nov 2013 17:50:09 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Received: (qmail 18953 invoked by uid 89); 14 Nov 2013 17:50:08 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-0.9 required=5.0 tests=AWL,BAYES_50,RDNS_NONE,SPF_HELO_PASS,SPF_PASS,URIBL_BLOCKED autolearn=no version=3.3.2 X-HELO: mx1.redhat.com Received: from Unknown (HELO mx1.redhat.com) (209.132.183.28) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Thu, 14 Nov 2013 17:50:06 +0000 Received: from int-mx11.intmail.prod.int.phx2.redhat.com (int-mx11.intmail.prod.int.phx2.redhat.com [10.5.11.24]) by mx1.redhat.com (8.14.4/8.14.4) with ESMTP id rAEHnrV3005752 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK); Thu, 14 Nov 2013 12:49:54 -0500 Received: from [10.18.25.132] ([10.18.25.132]) by int-mx11.intmail.prod.int.phx2.redhat.com (8.14.4/8.14.4) with ESMTP id rAEHnqmB010024; Thu, 14 Nov 2013 12:49:53 -0500 Message-ID: <1384451362.15325.118.camel@surprise> Subject: Re: [PATCH] Add gimple subclasses for every gimple code (was Re: [PATCH 0/6] Conversion of gimple types to C++ inheritance (v3)) From: David Malcolm To: Jeff Law Cc: Michael Matz , Andrew MacLeod , gcc-patches@gcc.gnu.org Date: Thu, 14 Nov 2013 18:32:00 -0000 In-Reply-To: <5284781F.3000604@redhat.com> References: <5271CBF9.2070005@redhat.com> <1383236801-13234-1-git-send-email-dmalcolm@redhat.com> <52741EE2.3030100@redhat.com> <1383671947.5282.93.camel@surprise> <1383800204.31927.33.camel@surprise> <527B25ED.9030804@redhat.com> <1383937364.31927.66.camel@surprise> <5284781F.3000604@redhat.com> Content-Type: text/plain; charset="UTF-8" Mime-Version: 1.0 Content-Transfer-Encoding: 7bit X-IsSubscribed: yes X-SW-Source: 2013-11/txt/msg01651.txt.bz2 On Thu, 2013-11-14 at 00:13 -0700, Jeff Law wrote: > On 11/08/13 12:02, David Malcolm wrote: > >> I wouldn't mind seeing a small example proof of concept posted to help > >> those who don't see where this is going understand the goal. I would > >> recommend against posting another large patch for inclusion at this time. > > Attached is a proof-of-concept patch which uses the > > gimple_statement_switch subclass (as a "gimple_switch" typedef). This > > is one of the subclasses that the earlier patch added, which has no new > > fields, but which carries the invariant that, if non-NULL, > > gimple_code (gs) == GIMPLE_SWITCH. > [ ... ] > > Thanks. It's pretty much what I expected. Obviously for other codes > there may be a lot more changes that you have to slog through, but I > think this example shows the main concepts. > > Presumably in this new world order, the various gimple statement types > will continue to inherit from a base class. That seems somewhat > inevitable and implies a certain amount of downcasting (via whatever > means we agree upon). The worry, in my mind is how pervasive the > downcasting will be and how much of it we can get rid of over time. > > I may be wrong, but ISTM some of the downcasting is a result of not > providing certain capabilities via (pure?) virtual methods. For > example, expand_gimple_stmt_1 seems ripe for implementing as virtual > methods. ISTM you could also have virtuals to build the statements, > dump/pretty-print them, verify them, branch/edge redirection, > estimations for inlining, etc. ISTM that would eliminate a good chunk > of the downcasting. FWIW, I prefer the downcasts to adding virtual functions; what I've tried to do is create a very direct mapping from the status quo, introducing inheritance to gain the benefits listed earlier in the thread, whilst only changing "surface syntax". It seems to me that we're considering the general problem of type-safe code dispatch: given a type hierarchy, and various sites that want to vary behavior based on what types they see, how best to invoke the appropriate code, ensuring that the code that's called "knows" that its dealing with the appropriate subclass i.e. in a typesafe manner. There are various idioms for doing this kind of dispatch in C++, a non-exhaustive list is: (a) switches and if/then tests on the GIMPLE_CODE (stmt) - the status quo, and what my proposed patch continues to do, albeit gaining some compile-time typechecking using as_a<> for the switch and dyn_cast<> for the if/then. This is changing some surface syntax without making major changes, and gains us compile-time typesafety and IMHO more readable code, though clearly opinions vary here. In my (brief) testing, (a) has no significant effect on compiler performance. (b) adding virtual functions to gimple would be another way to handle type-safe dispatch, but they carry costs: (i) they would implicitly add a vtable ptr to the top of every gimple statement, increasing the memory consumption of the process (ii) it's my belief that a virtual function call is more expensive than the kinds of switch/if+then branching that we're currently doing on the code - though I don't have measurements to back this up (iii) what I call "link-time granularity". A vtable references every method within it. I'd love to have a libgimple.so, but to do so, every vtable would need to be populated with a particular set of operations at link time - where do we draw the line for "core" gimple operations, the dispatches performed by every core pass? The set of operations will never be complete: some plugin may want to add a new set of per-gimple-subclass behaviors for some custom gimple pass. (c) the "Visitor" design pattern [1] - rather than adding virtual functions to gimple, instead add them to a visitor class e.g.: class gimple_visitor { public: /* Visit a statement. This will dispatch to the appropriate handler below, based on GIMPLE_CODE (stmt), encapsulating the appropriate downcast within a big switch statement. */ void visit_stmt (gimple stmt); protected: /* Each gimple code gets its own handler. This class provides an empty implementation of each. If we want to force overrides, we could have an abstract_gimple_visitor base class above this one that has all of these be pure virtual. */ virtual void visit_cond (gimple_cond stmt) {} virtual void visit_switch (gimple_switch stmt) {} virtual void visit_assign (gimple_assign stmt) {} virtual void visit_phi (gimple_phi phi) {} /* etc */ }; Example of a subclass: class gimple_pretty_printer : public gimple_visitor { protected: /* Each of these implements the subclass-specific pretty-printing logic. */ void visit_cond (gimple_cond stmt); void visit_switch (gimple_switch stmt); void visit_assign (gimple_assign stmt); void visit_phi (gimple_phi phi); /* etc */ }; so that the dispatch can be written like this: gimple_pretty_printer pp; pp.visit_stmt (stmt); (the above isn't *exactly* the Visitor pattern from the Gang of Four book, I'm doing things in the visitor in order avoiding adding vfuncs to gimple). This approach avoids adding an implicit vtable field to the top of gimple [(i) above], and keeps the vtables with the code using them [(iii) above]. However it still would mean (ii) changing from switch/if-then control flow to vfunc calls, with unknown impact on performance. I'd be nervous about adding virtual functions anywhere where we're not already jumping though function ptrs. [Aside: this approach also gives us a natural place to store state relating to the dispatch (as fields of the visitor subclass), which may help in removing global state from the compiler. (Doesn't help with GTY state though)] (d) The status quo, with the drawback of doing the type-checking either at run-time (checked build) or not at all (release build). We have 41 gimple codes (and adding new ones is a relatively rare event), but many more dispatch sites, I think - a first estimate might be: [gcc] $ grep -nH -e "gimple_code (" *.c | wc -l 772 So we have (I think) a large number of sites that dispatch to code based on a (relatively) small fixed set of types - this suggests to me the use of the Visitor pattern: (c) above, but I think (a) is the conservative approach that gives many benefits for relatively low risk, and gives us an easy transition path to (c) if measurement establishes the lack of significant performance impact. Such a transition could be done piecemeal. > Just to be clear, I'm not asking you to make those changes, just for > your thoughts on approaches to eliminate the downcasting based on what > we've seen so far. (nods). Note that I don't regard the downcasting as inherently bad, just one approach to the generic issue of typesafe dynamic code dispatch. Yes, in many OO textbooks, it's regarded as a code smell, but then again "goto" has its uses :) > Thanks for pulling this together to help illustrate how some of this > might look in practice. I hope others take the time to look closely as > this example and think about what it means in terms of how we would be > writing code 6 months from now. On the subject of "6 months from now", my dream is that we can have a libgimple.so and a librtl.so (.a in the regular builds). I've love to build a pluggable static analyzer on top of the gimple code - so I'm trying to think of how we can structure the gimple code in such a way that it's reusable in a modular way. FWIW, I don't like the implicit "pointerness" of types within my proposed patch, I'd rather have, say, the "gimple_switch" type be the underlying _struct_, and spell out "gimple_switch *" to make it clear that the latter is a pointer. Doing so would make the is-a.h API somewhat less clunky, so that one could write: if (gimple_switch *swtch = dyn_cast (stmt)) { // do switchy stuff on swtch. } rather than: if (gimple_switch swtch = dyn_cast (stmt)) { // do switchy stuff on swtch. } Another tweak to the patch series could be to only introduce the leaf subclasses one at a time, each time making use of the new subclass. For example, the "gimple_switch" subclass would be added at the same time as a patch that makes use of it to improve compile-time typesafety, so each patch would contain its own justification, as it were (applying the YAGNI principle). Sorry for the long email; thanks for reading to the end. Dave [1] see the Gang of Four "Design Pattens" book, or e.g. http://en.wikipedia.org/wiki/Visitor_pattern