From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 7342 invoked by alias); 4 Apr 2012 08:35:50 -0000 Received: (qmail 6872 invoked by uid 22791); 4 Apr 2012 08:35:49 -0000 X-SWARE-Spam-Status: No, hits=-4.8 required=5.0 tests=AWL,BAYES_00,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FROM,KHOP_RCVD_TRUST,KHOP_THREADED,RCVD_IN_DNSWL_LOW,RCVD_IN_HOSTKARMA_YE,TW_TM X-Spam-Check-By: sourceware.org Received: from mail-iy0-f175.google.com (HELO mail-iy0-f175.google.com) (209.85.210.175) by sourceware.org (qpsmtpd/0.43rc1) with ESMTP; Wed, 04 Apr 2012 08:35:34 +0000 Received: by iaag37 with SMTP id g37so22945iaa.20 for ; Wed, 04 Apr 2012 01:35:34 -0700 (PDT) MIME-Version: 1.0 Received: by 10.50.202.38 with SMTP id kf6mr799936igc.30.1333528534414; Wed, 04 Apr 2012 01:35:34 -0700 (PDT) Received: by 10.42.228.200 with HTTP; Wed, 4 Apr 2012 01:35:34 -0700 (PDT) In-Reply-To: <1333468930.2447.112.camel@surprise> References: <1333387282.2447.60.camel@surprise> <1333468930.2447.112.camel@surprise> Date: Wed, 04 Apr 2012 08:35:00 -0000 Message-ID: Subject: Re: Proposed gcc plugin plugin API mk 2 (this time without camel case!) From: Richard Guenther To: David Malcolm Cc: gcc@gcc.gnu.org Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: quoted-printable X-IsSubscribed: yes Mailing-List: contact gcc-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Archive: List-Post: List-Help: Sender: gcc-owner@gcc.gnu.org X-SW-Source: 2012-04/txt/msg00081.txt.bz2 On Tue, Apr 3, 2012 at 6:02 PM, David Malcolm wrote: > On Tue, 2012-04-03 at 15:23 +0200, Richard Guenther wrote: >> On Tue, Apr 3, 2012 at 12:03 PM, Richard Guenther >> wrote: >> > On Mon, Apr 2, 2012 at 7:21 PM, David Malcolm wr= ote: >> >> I wrote a script and ported my proposed API for GCC plugins from my >> >> CamelCase naming convention to an underscore_based_convention (and >> >> manually fixed up things in a few places also). >> >> >> >> The result compiles and passes the full test suite for the Python >> >> plugin; that said, I'm still breaking the encapsulation in quite a few >> >> places (hey, this is an experimental prototype). >> >> >> >> You can see the latest version of it within the "proposed-plugin-api" >> >> branch of the Python plugin here: >> >> http://git.fedorahosted.org/git/?p=3Dgcc-python-plugin.git;a=3Dshortl= og;h=3Drefs/heads/proposed-plugin-api >> >> >> >> within the "proposed-plugin-api" subdirectory. >> > >> > Hmm, how do I get it? =A0I did >> > >> > git clone http://git.fedorahosted.org/git/proposed-plugin-api >> > >> > but there is nothing in gcc-python-plugin/. =A0And >> > >> > git checkout proposed-plugin-api >> > >> > says I'm already there ...? >> >> Meanwhile the directory magically appeared (heh ...). > > [The ways of git are something of a mystery to me: 95% of the time it's > the best revision control system I've ever used, but 5% of the time the > most obtuse] > >> Overall it looks good > Thanks for taking a look. > >> =A0- but it seems to leak GCC headers into the >> plugin API (via gcc-plugin.h and input.h inclusion). =A0Thus, it >> lacks separating the plugin API headers from the plugin API implementati= on >> headers? > That's true. =A0The big information "leak" happens inside > gcc-semiprivate-types.h, which defines the various structs that act like > pointers, each with a decl like this: > > struct gcc_something { > =A0 some_private_gcc_pointer_type inner; > }; > > It would be possible to make this more opaque like this: > > struct gcc_something { > =A0 struct some_private_gcc_struct *inner; > }; > > given that you then don't need a full definition of that inner struct > visible to users. =A0Though location_t is leaked, and in this approach, > there isn't a way around that, I think. See the reply by Roman. > >> Or maybe I'm not looking at the right place. >> I also dislike the use of GCC_PUBLIC_API, etc. - everything in >> the plugin API headers should be obviously public - and the implementati= on >> detail should be an implementation detail that should not need to care. > > I added that macro based on the example of other plugin/embedding > systems I've seen (e.g. Python), where it's handy to make that macro be > non-trivial on some platforms. =A0See e.g. CPython's pyport.h: > =A0http://hg.python.org/cpython/file/9599f091faa6/Include/pyport.h > where the macros PyAPI_FUNC and PyAPI_DATA have a 44-line definition, > handling various different compatibility cases. > > For example, GCC_PRIVATE_API could have some magic linker flag that lets > us automatically strip out those symbols so that they're not visible > externally after the compiler has been linked. That's for the implementation side - the public API side by definition has only declarations considered public. >> >> If this landed e.g. in GCC 4.8, would this be usable by other plugins? >> >> For sure. =A0I'd say get the copyright stuff sorted out and start pushin= g things >> into the main GCC repository - where the obvious starting point is to >> get the makefile, configure and install parts correct. >> >> I'd concentrate on providing enough API for introspection, like simple >> things, counting basic-blocks, counting calls, etc. =A0I'm not sure we >> want to expose gcc_gimple_walk_tree or gcc_gimple_print (or >> the gcc_printers for a start) - the latter would something that the >> python plugin >> would provide, resorting to introspecting the stmt itself. > FWIW the Python plugin already heavily uses gcc's pretty-printer code, > so that *is* something I'd want to wrap (but it's fairly easy to do so). Ok, fair enough. >> I also wonder about gcc_gimple_phi_upcast - why would you specialize >> PHI nodes but not any others? =A0I'd have exposed gcc_gimple_code. >> In general the plugin API should provide exactly one (and the most flexi= ble) >> way to do a thing (thus, not have gcc_gimple_assign_single_p) and >> the high-level consumers like the python plugin should provide >> nice-to-use interfaces. > > This touches on the biggest thing that's missing in the API: the > multitude of tree types, gimple statement types, and rtl types. =A0All I > added were the "base classes", and gimple-phi is the only instance so > far in the API of a subclass. Well, one way is surely to view it as sort-of "classes", the other way is to view it as tagged structures which you can simply expose. Of course if there will be a day where somebody converts all of gimple and tree to a C++ class hierarchy the class way would be a better representation. > I would like the API to expose these (my plugin uses them to good > effect), but doing it in C is likely to be messy: lots of upcasting and > downcasting functions. I'd not do up/down-casting. I'd at most expose a RTTI-like interface, gcc_gimple_is_phi, etc.. Using an opaque "base" pointer in all functions should be ok (ok, you will miss some type safety - but so does GCC internally). > An alternative approach might be to bite the bullet and go to C++; maybe > something like this (caveat: I last did any C++ about a decade ago, no > idea if the following compiles): > > // Everything exposed in the public headers are pure virtual functions > // leading to abstract classes: no public data > namespace gcc { > > =A0 // Abstract base class for objects managed by GCC's garbage-collector > =A0 class gcmanaged { > =A0 public: > =A0 =A0 =A0virtual void mark_in_use() const =3D 0; > =A0 }; > > =A0 class location : public gcmanaged { > =A0 =A0 =A0 // it isn't gcmanaged yet, but might become so > =A0 public: > =A0 =A0 =A0virtual std::string get_filename() const =3D 0; > =A0 =A0 =A0virtual int get_line() const =3D 0; > =A0 =A0 =A0virtual int get_column() const =3D 0; > =A0 }; > > =A0 namespace gimple { > =A0 =A0 =A0 // enum to assist in RTTI by non-C++ code: > =A0 =A0 =A0 enum code { > =A0 =A0 =A0 =A0 =A0 INVALID_GIMPLE_CODE =3D -1, > =A0 =A0 =A0 =A0 =A0 GIMPLE_NOP =3D 0, > =A0 =A0 =A0 =A0 =A0 GIMPLE_ASSIGNMENT, > =A0 =A0 =A0 =A0 =A0 GIMPLE_SWITCH, > =A0 =A0 =A0 =A0 =A0 //etc; don't have to expose all of the regular gimple= codes. > =A0 =A0 =A0 =A0 =A0 NUM_CODES, > =A0 =A0 =A0 }; > > =A0 =A0 =A0 class statement : public gcmanaged { > =A0 =A0 =A0 public: > =A0 =A0 =A0 =A0 =A0 virtual location *get_location() const =3D 0; > =A0 =A0 =A0 =A0 =A0 virtual enum code get_code() const =3D 0; > =A0 =A0 =A0 }; > > =A0 =A0 =A0 class nop : public statement { > =A0 =A0 =A0 }; > > =A0 =A0 =A0 class phi : public statement { > =A0 =A0 =A0 public: > =A0 =A0 =A0 =A0 =A0 virtual tree::ssaname* get_lhs() const =3D 0; > =A0 =A0 =A0 =A0 =A0 virtual std::list get_rhs() = const =3D 0; > =A0 =A0 =A0 =A0 =A0 // or whatever > =A0 =A0 =A0 }; > =A0 }; > > =A0 namespace tree { > =A0 =A0 =A0 class declaration : public gcmanaged { > =A0 =A0 =A0 public: > =A0 =A0 =A0 =A0 =A0 virtual std::string get_name() const =3D 0; > =A0 =A0 =A0 =A0 =A0 virtual location* =A0 get_location() const =3D 0; > =A0 =A0 =A0 }; > > =A0 =A0 =A0 class funcdecl :: public declaration { > =A0 =A0 =A0 public: > =A0 =A0 =A0 =A0 =A0 virtual function* get_function() const =3D 0; > =A0 =A0 =A0 =A0 =A0 virtual std::list get_arguments() const = =3D 0; > =A0 =A0 =A0 =A0 =A0 virtual resultdecl* get_result() const =3D 0; > =A0 =A0 =A0 =A0 =A0 // etc: accessor functions > =A0 =A0 =A0 }; > > =A0 =A0 =A0 class type : public gcmanaged { > =A0 =A0 =A0 // etc... > =A0 =A0 =A0 }; > > =A0 =A0 =A0 class integertype :: public type { > =A0 =A0 =A0 // etc... > =A0 =A0 =A0 }; > > =A0 =A0 =A0 // does there actually need to be a "tree" base class? =A0I t= hink > =A0 =A0 =A0 // the tcc_codes express the base classes, at a first approxi= mation > =A0 =A0 =A0 // could even lose the "tree" concept altogether > =A0 }; > }; But that of course requires real objects to mediate between the plugin API and GCC internals - something I'd try to avoid hard. Objects should be identified by pointers to the original internal structures - that also makes it simple to eventually add a garbage collector friendly reference taking mechanism (by simply having a new internal GC root being a pointer-map, ptr:refcount and a plugin API that does gcc_push_ref () / gcc_pop_ref ()). > [though I'd greatly prefer to capitalize the classnames (sigh) :) ] > > The implementation would have entirely separate (private) headers, with > something like: > > class location_impl : public gcc::location { > private: > =A0 location_t loc; > > public: > =A0 =A0// ctor: > =A0 =A0location_impl(location_t loc) : loc(loc) {} > > =A0 =A0// implement the public API: > =A0 =A0void mark_in_use() const { > =A0 =A0 =A0 =A0 // empty: location_t currently isn't gc managed > =A0 =A0} > > =A0 =A0std::string get_filename() const; > =A0 =A0int get_line() const; > =A0 =A0int get_column() const; > }; > > class phi_impl : public gcc::gimple::phi { > private: > =A0 =A0gimple stmt; // of correct subtype > > public: > =A0 =A0// gcmanaged: > =A0 =A0void mark_in_use() const; > > =A0 =A0// ctor: > =A0 =A0phi(gimple stmt); > > =A0 =A0// accessors: > =A0 =A0location *get_location() const; > =A0 =A0enum code get_code() const; > > =A0 =A0tree::ssaname *get_lhs() const; > =A0 =A0std::list get_rhs() const; > }; > > That way the implementation is *entirely* hidden, with entirely separate > headers (though during prototyping I'd need an escape hatch). =A0But it > does mean that the API is handing off pointers-to-pointers, rather than > just pointers (allocating lots of tiny wrapper objects, like the ones > above, so that the caller doesn't know the size, and so there can be a > vtable). > > Does this sound sane? No, requiring extra storage allocation will greatly complicate API use. Richard. > Long-term, perhaps these impl classes could perhaps *become* the > implementations - everything's hidden. > > Thanks again for looking at this > Dave >