From: Gabriel Dos Reis
To: Zack Weinberg
Cc: gcc@gcc.gnu.org, jason@redhat.com, mark@codesourcery.com, dberlin@dberlin.org
Subject: Re: Compiling GCC with g++: a report
Date: Tue, 24 May 2005 06:18:00 -0000
References: <1116907280.9577.31.camel@localhost.localdomain>
In-Reply-To: <1116907280.9577.31.camel@localhost.localdomain>

Zack Weinberg writes:

| On Mon, 2005-05-23 at 01:15 -0500, Gabriel Dos Reis wrote:
| > Hi,
| >
| > I spent the week-end trying to get GCC -- mainline -- compilable
| > (i.e. those components written in C) with a C++ compiler (e.g. g++).
|
| These results are very interesting.
|
| As a general observation: A lot of the things you have found to be
| problematic, are in fact preferred idioms for C code.
| For instance, no standard-C programmer would ever write an explicit
| cast on malloc's return value.  I think that we are losing something,
| if only in readability, if we restrict our code to the subset of C
| which is also correct C++.

I think opinions vary a lot here.  If, for example, you take a look at
the examples in The C Programming Language (all editions), you'll find
explicit casts on malloc's return value.  Yet I would refrain from
calling Dennis Ritchie a non-standard-C programmer, or from saying that
his book, TCPL2, does not describe standard C.  Most of the C
programmers I've met learnt from his book.  (Yes, I've also read
comments from some C programmers that nobody should cast the return
value of malloc, but for large-scale software, I have not seen their
opinion dominate.)

The cast you're talking about is buried deep in XNEWVEC, XRESIZEVEC and
such.  It is not anything you'll find in the code directly.  So, in
fact, we do not lose readability as you claim.

| Now, if we were migrating to C++, that would be okay, because we
| would (eventually) get all of the additional expressive power of C++
| in exchange.  However, if we're not migrating to C++, I'm opposed to
| the inclusion of patches that restrict our C code to the subset which
| is correct C++.

The patches are aligning us with our coding standards.  I don't think
it is reasonable to throw roadblocks in the way, especially when they
are contrary to our current coding standards.  The claim that the cast
will obscure the code is unjustified, as the use of the libiberty
macros relieves us of sprinkling casts in the code.  See my previous
patches to libiberty and fixincludes.

I don't think your suggestion of moving to C++ is workable at this
point.  The patches aligning us with the common subset of C90 and C++
follow the consensus we developed as our coding standards.

| Furthermore, as I've said before, I support migrating to C++ -- but
| only if the C++ ABI and libstdc++ soname are first permanently frozen.
| If we do not do that first, we risk being trapped into a situation
| where we need specific versions of GCC to compile specific newer
| versions of GCC, which would be a Bad Thing.

Throwing roadblocks in the way is not going to help the GCC project.
It is unreasonable to do that at this time.

| The C++ ABI seems to be stable at this point, but there is not yet
| consensus that it will never again be changed.  The libstdc++ team is
| currently developing yet another new, incompatible version, so I see
| no hope for a permanent freeze of its soname in the near future.
| Thus, while you've discovered some interesting things by trying this,
| I don't think C++ compatibility patches should be applied now.

The issue of moving to C++ is independent of aligning ourselves with
our coding standards.  I don't believe it is reasonable to block these
patches on the grounds that we could conceivably move to C++ (which is
a controversial issue).  The decision to code at the intersection of
C90 and C++ is a consensus we reached after repeated debates.

| Having said that, some comments on the problems you have found:
|
| > Third, there is some "type-punning" with enums, int and unsigned
| > int, where the middle-end (mostly) relies on implicit conversion
| > from int to enums.
|
| Being allowed to do this is very important.  Some enumerated types
| are to be treated as opaque outside a very narrow context; the only
| way to do that in C is to have (a typedef of) unsigned int as the
| visible type, and only declare the enumerated type in the context
| where it's allowed to be used.

I have looked at every one of those uses, since I went through editing
almost every file needed for compiling the GNU C and GNU C++ compilers.
None of the cases appears important.  The only compelling cases are
those where front-ends (e.g. C or C++) extend them (e.g. c_tree_code or
cplus_tree_code).  However, none of the current approach is necessary.
As RTH pointed out in the past, front-ends should define those
enumerators as a whole by appropriately #including the file.  We can
arrange for that -- in fact I've tested variants of that in my
experiments.  No cast is needed when done properly.

| I want to see more use of this idiom, not less; for example, 'enum
| machine_mode' ought to be a black box to almost the entire compiler.

Me too, but the way to make it a black box is not to cast it to
unsigned int and back willy-nilly -- that does not make it a black box;
on the contrary.  For example, we should be using EXPAND_NORMAL instead
of plain "0".

| I'd be delighted to hear of a more C++-friendly way to code this.

See above.

| Naturally, where the constant is _not_ opaque outside of a defined
| context, but is part of an interface (as your examples seemed to be),
| not using it is just sloppy.
|
| > Fourth, it appears that we're implicitly using C99's semantics of
| > "extern inline" in our source -- when we have a pure C90 compiler
| > that does not understand "inline", we just #define inline to
| > nothing so we don't get into trouble.  With a C++ compiler, we're
| > in trouble because an inline function needs to be defined in every
| > translation unit where it is used.  So, I either move the affected
| > functions to "static inline" or just make them non-inline (cases
| > are in hashtable.c and toplev.c).
|
| Use of bare 'inline' is just plain wrong in our source code; this has
| nothing to do with C++, no two C compilers implement bare 'inline'
| alike.

Well, the way I figured it out was by running the source code through a
C++ compiler.  I'm aware that inline is absent from C90 and that many
of the current compilers that claim to implement C99 have their own
opinions on the matter.  However, what I was reporting is an *actual*
experiment, not a thought experiment.  And it popped up only because I
ran the source code through g++, which I think I should mention.
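
To make the "static inline" option concrete, here is a minimal sketch.
It is not the actual change to hashtable.c or toplev.c: SKETCH_INLINE
and sketch_hash_string are hypothetical names invented for this
illustration, and the hash arithmetic is arbitrary.  The only point is
that a function with internal linkage has the same meaning under C90,
C99 and C++, so the differing "inline" rules no longer matter.

```c
#include <assert.h>
#include <stddef.h>

/* SKETCH_INLINE is a hypothetical helper macro, not something from the
   GCC tree.  With a C++ or C99 compiler it expands to "static inline";
   with a strict C90 compiler, which lacks the keyword, it falls back
   to plain "static".  */
#if defined(__cplusplus) \
    || (defined(__STDC_VERSION__) && __STDC_VERSION__ >= 199901L)
# define SKETCH_INLINE static inline
#else
# define SKETCH_INLINE static
#endif

/* Internal linkage gives every translation unit that includes this
   definition its own copy, so no out-of-line definition has to exist
   elsewhere.  The function itself is illustrative only.  */
SKETCH_INLINE unsigned int
sketch_hash_string (const char *str)
{
  unsigned int hash = 0;
  assert (str != NULL);
  while (*str)
    hash = hash * 67u + (unsigned char) *str++;
  return hash;
}
```

A call such as sketch_hash_string ("ab") then compiles unchanged with
either a C or a C++ compiler.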
| Patches to add 'static' to such functions (AND MAKING NO OTHER
| CHANGES) are preapproved, post-slush.
|
| > Fifth, there is a slight difference between "const" in C and in
| > C++.  In C++, a const variable implicitly has internal linkage; so
| > a C++ compiler tends to optimize it out when its address is not
| > taken (so no storage is wasted).  This is an issue for the objects
| > automatically generated by the gengtype support machinery.  They
| > are supposed to have external linkage, so we need to explicitly say
| > "extern" in their definitions.
|
| Presumably such constants are declared in some header file, with
| external linkage.  It would be better to make that declaration
| visible at the point of definition, rather than marking up the
| declarations with 'extern'.

I'm talking about the various gt_* objects created by gengtype.
Please do have a look at the actual contents of the file and re-read
what I wrote.

| > Sixth, there is a real "mess" about name spaces.  It is true that
| > every C programmer knows the rule saying tags inhabit a different
| > name space than variables or functions.  However, all the C coding
| > standards I've read so far usually suggest
| >
| >    typedef struct foo foo;
| >
| > but *not*
| >
| >    typedef struct foo *foo;
| >
| > i.e. "bringing" the tag-name into the normal name space to name the
| > type structure or enumeration is OK, but not naming a different
| > type!
|
| Ugh.  Where do we do that?

In our source code.  :-)

To name one that comes to mind,

   alias.c:96:typedef struct alias_set_entry *alias_set_entry;

I've also found that we have hash_table from libcpp as a typedef-name
and hash_table as a global (static) variable in cselib.c.

| I will suggest, when you find these, that you tack "_s" on the end of
| the tag-name;

That is what I did in my local tree.  But I believe we need to
standardize on a coherent coding standard, which is why I brought up
the issue.
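
For illustration, a minimal sketch of the "_s" renaming.  This is not
the code from alias.c: sketch_entry, its two fields and the helper
functions are all hypothetical, made up only to show the shape of the
fix.  In C++ a tag and a typedef-name declared in the same scope cannot
name different types, so the pointer typedef needs a tag spelled
differently from itself.

```c
#include <assert.h>
#include <stdlib.h>

/* Rejected by a C++ compiler, because the typedef would redefine the
   class name "sketch_entry" to mean a different (pointer) type:

     typedef struct sketch_entry *sketch_entry;  */

/* With "_s" tacked onto the tag, the two names no longer collide.
   The fields are invented for this sketch.  */
struct sketch_entry_s
{
  int alias_set;
  int has_zero_child;
};
typedef struct sketch_entry_s *sketch_entry;

/* Allocate a zero-initialized entry, or return NULL on failure.  The
   explicit cast keeps the allocation valid in both C and C++.  */
static sketch_entry
sketch_entry_new (int alias_set)
{
  sketch_entry e
    = (sketch_entry) calloc (1, sizeof (struct sketch_entry_s));
  if (e != NULL)
    e->alias_set = alias_set;
  return e;
}

/* Accessor, so that callers need not know the struct layout.  */
static int
sketch_entry_alias_set (sketch_entry e)
{
  assert (e != NULL);
  return e->alias_set;
}
```

Only the tag changes; code that uses the typedef-name stays as it was,
which is why the renaming touches few other places.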
| that doesn't conflict with POSIX, and should require fewer changes
| elsewhere in the code.

-- Gaby
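
As an illustration of the point about XNEWVEC and friends made earlier
in this message, here is a minimal sketch of the idea.  SKETCH_NEWVEC
and sketch_iota are hypothetical names, and the macro uses plain malloc
as an assumption for self-containment; it is not the libiberty
definition.  The cast is written once, inside the macro, so the call
sites carry no cast and still compile as both C and C++.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the XNEWVEC idea: allocate N objects of
   type T and return a correctly typed pointer.  Because it wraps
   plain malloc, it can return NULL.  */
#define SKETCH_NEWVEC(T, N) ((T *) malloc (sizeof (T) * (N)))

/* Return a freshly allocated vector holding 0, 1, ..., n - 1, or NULL
   if allocation fails.  The caller frees the result.  */
static int *
sketch_iota (size_t n)
{
  int *v;
  size_t i;
  assert (n > 0);
  v = SKETCH_NEWVEC (int, n);   /* no cast at the call site */
  if (v == NULL)
    return NULL;
  for (i = 0; i < n; i++)
    v[i] = (int) i;
  return v;
}
```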