From mboxrd@z Thu Jan 1 00:00:00 1970
Mailing-List: contact gcc-help@gcc.gnu.org; run by ezmlm
Sender: gcc-owner@gcc.gnu.org
Date: Mon, 17 Dec 2001 13:43:00 -0000
From: Linus Torvalds
Subject: Re: Big-endian Gcc on Intel IA32
In-Reply-To: <20011217211252.431D3F28BD@nile.gnat.com>
Content-Type: TEXT/PLAIN; charset=US-ASCII
X-SW-Source: 2001-12/txt/msg00943.txt.bz2

On Mon, 17 Dec 2001 dewar@gnat.com wrote:
>
> This is a much trickier language feature to design than you would imagine.
> We have been struggling with this in Ada for a while.

Hmm.. It sounds like one of those "obvious in principle" things, but I can
imagine that it falls afoul of a lot of the gcc optimizations (ie x86.md
has a pattern for doing "load + and $255" with a "movzbl" instruction,
which is legal only on little-endian data: on big-endian you can still do
it, but you have to modify the address).

That's just the _really_ obvious kind of problem I can imagine off-hand.
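To make the "modify the address" remark concrete, here is a minimal sketch in plain C. The helper names (store_le32, store_be32, low_byte_le, low_byte_be) are made up for this illustration and are not taken from gcc or x86.md. It lays out the same 32-bit value in both byte orders and shows that the byte selected by "and $255" sits at offset 0 in little-endian data but at offset 3 in big-endian data.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: store v lowest byte first (little-endian layout). */
static void store_le32(uint8_t *buf, uint32_t v)
{
    for (int i = 0; i < 4; i++)
        buf[i] = (uint8_t)(v >> (8 * i));
}

/* Hypothetical helper: store v highest byte first (big-endian layout). */
static void store_be32(uint8_t *buf, uint32_t v)
{
    for (int i = 0; i < 4; i++)
        buf[i] = (uint8_t)(v >> (8 * (3 - i)));
}

/* "load + and $255" on little-endian data: a byte load from the
 * same address is enough, because the low byte comes first. */
static uint32_t low_byte_le(const uint8_t *buf)
{
    return buf[0];
}

/* The same operation on big-endian data: still a single byte load,
 * but the address has to be moved to the last byte of the word. */
static uint32_t low_byte_be(const uint8_t *buf)
{
    return buf[3];
}

/* Self-check: both layouts must yield (v & 255). */
static void check_low_byte(uint32_t v)
{
    uint8_t le[4], be[4];

    store_le32(le, v);
    store_be32(be, v);
    assert(low_byte_le(le) == (v & 255u));
    assert(low_byte_be(be) == (v & 255u));
}
```

Calling check_low_byte(0x11223344u) from a test program exercises both paths. Reading offset 0 of the big-endian buffer would give 0x11, the high byte, which is why the unadjusted pattern is only valid for little-endian data.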
I assume you've seen many many more..

However, I think that the most _fundamental_ problem is completely
independent of whether a simple and good implementation for gcc is even
feasible: it's not even clear that a byte-order attribute necessarily
helps porting of legacy applications all that much.

The problem is pointers to data - you must _never_ lose the byte-order
attribute by mistake, and you must never mix them. And a compiler (and
particularly a C compiler) has a really hard time asserting that people
don't misuse pointers, with "void *" often being used as a "whatever".

So I realize that a lot of code is byte-order dependent exactly because
the code itself uses the same pointer in different ways (ie what happens
when you pass a byte-order-aware pointer to something like "memcpy()"?
It's ok if _both_ pointers are of the same byte order and the same type,
but not in general. And that's the _easy_ case, with a standard function
that the compiler could check for).

So it may be that the feature itself is simply not very helpful, simply
because it's so hard to retrofit existing programs even if you had some
compiler support for the notion. So the actual _implementation_ on a gcc
level might be the least of your troubles.

That said, it still sounds like one of those dangerously "simple and
clever" ideas.

On a tangential issue: I actually think that it might be equally powerful
to just have a way of "tainting" certain pointers, and disallowing their
use at compile-time unless the recipient claims to accept the specific
form of "tainting". This is, in fact, more-or-less what the "const"
qualifier does, but it might be useful to allow user-defined "taints".
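A rough approximation of such a user-defined "taint" is already possible in standard C by wrapping the value in a distinct struct type. The sketch below is only an illustration under that assumption: the type and function names (be32_t, le32_t, be32_from_host, be32_to_host, read_length) are invented here and are not an existing gcc feature or library API.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical "tainted" types: same size and bits as uint32_t, but
 * distinct struct types, so pointers to them are not interchangeable. */
typedef struct { uint32_t raw; } be32_t;   /* bytes stored big-endian */
typedef struct { uint32_t raw; } le32_t;   /* bytes stored little-endian */

/* Unconditional byte swap of a 32-bit value. */
static uint32_t bswap32(uint32_t v)
{
    return ((v & 0x000000ffu) << 24) | ((v & 0x0000ff00u) << 8) |
           ((v >> 8) & 0x0000ff00u) | (v >> 24);
}

/* Run-time probe of the host byte order (kept simple for the sketch). */
static int host_is_little_endian(void)
{
    const uint16_t probe = 1;
    return *(const unsigned char *)&probe == 1;
}

/* The only sanctioned ways in and out of the tainted type. */
static be32_t be32_from_host(uint32_t v)
{
    be32_t r = { host_is_little_endian() ? bswap32(v) : v };
    return r;
}

static uint32_t be32_to_host(be32_t v)
{
    return host_is_little_endian() ? bswap32(v.raw) : v.raw;
}

/* A recipient that declares which taint it accepts. Passing a
 * `const le32_t *` or a plain `const uint32_t *` here is an
 * incompatible-pointer-type violation that the compiler diagnoses. */
static uint32_t read_length(const be32_t *field)
{
    return be32_to_host(*field);
}
```

With this in place, `be32_t f = be32_from_host(42); read_length(&f);` compiles cleanly, while handing read_length() a pointer of the other byte order does not pass silently. A cast through "void *" still defeats the check, which is exactly the weakness described above, and the wrapper does nothing for code generation: it is purely a type-checking aid.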
The reason this is tangential is that byte-order would be one such
potential use of "tainting" - not so much for compiler-assisted code
generation, but simply for compiler-assisted type-checking: allowing the
person who gets stuck with the job of fixing byte-order problems to
"taint" the pointers with byte-order information, and make the compiler
warn about it when a pointer is ever passed into any function that doesn't
expect that byte-order.

So the byte-order-attribute thing doesn't actually have to affect code
generation to be potentially useful.

(Inside the kernel, I'd love to be able to taint pointers and data that
came from user space, for example, to make sure that the compiler will
refuse to even _compile_ code that uses such data without the proper
safety checks. This is not all that different from keeping track of what
byte-order a specific datum has).

Ehh?

		Linus