public inbox for gcc@gcc.gnu.org
* Linux and aliasing?
@ 1999-06-03 10:23 Chip Salzenberg
  1999-06-03 10:37 ` mark
  1999-06-30 15:43 ` Chip Salzenberg
  0 siblings, 2 replies; 218+ messages in thread
From: Chip Salzenberg @ 1999-06-03 10:23 UTC (permalink / raw)
  To: egcs

Linus continues to complain on linux-kernel that egcs lacks a way to
*selectively* turn off the new stronger alias analysis.  Is this not
easy, or is it just not an important issue to the egcs team?
-- 
Chip Salzenberg      - a.k.a. -      <chip@perlsupport.com>
      "When do you work?"   "Whenever I'm not busy."


* Re: Linux and aliasing?
  1999-06-03 10:23 Linux and aliasing? Chip Salzenberg
@ 1999-06-03 10:37 ` mark
  1999-06-03 11:26   ` David S. Miller
                     ` (2 more replies)
  1999-06-30 15:43 ` Chip Salzenberg
  1 sibling, 3 replies; 218+ messages in thread
From: mark @ 1999-06-03 10:37 UTC (permalink / raw)
  To: chip; +Cc: egcs

>>>>> "Chip" == Chip Salzenberg <chip@perlsupport.com> writes:

    Chip> Linus continues to complain on linux-kernel that egcs lacks
    Chip> a way to *selectively* turn off the new stronger alias
    Chip> analysis.  Is this not easy, or is it just not an important
    Chip> issue to the egcs team?

It's not easy.  

And it's not in any way clear that it's the right thing to do.

And David Miller (IIRC) indicated that the kernel folks would probably
eliminate the non-standard C code in the next major kernel revision.

So, no, I don't think it's a priority for us to make any such change.
I can't really speak for others, but that's my take on the situation.

--
Mark Mitchell                   mark@codesourcery.com
CodeSourcery, LLC               http://www.codesourcery.com


* Re: Linux and aliasing?
  1999-06-03 10:37 ` mark
@ 1999-06-03 11:26   ` David S. Miller
  1999-06-03 12:03     ` mark
  1999-06-30 15:43     ` David S. Miller
  1999-06-03 12:02   ` Andi Kleen
  1999-06-30 15:43   ` mark
  2 siblings, 2 replies; 218+ messages in thread
From: David S. Miller @ 1999-06-03 11:26 UTC (permalink / raw)
  To: mark; +Cc: chip, egcs, torvalds

   From: mark@codesourcery.com
   Date: Thu, 03 Jun 1999 10:40:38 -0700

   And David Miller (IIRC) indicated that the kernel folks would
   probably eliminate the non-standard C code in the next major kernel
   revision.

Actually, I've changed my mind: having to get rid of all such casts
inside the networking code etc. is just an abomination.  There is no
reason anyone should have to use unions to teach the compiler about
what they're actually touching.  If someone casts the thing, they (the
programmer) know what they are doing, and the compiler shouldn't
assume anything.

I'm not saying this should be the normal mode of operation, but some
mechanism needs to exist so that such code can be made valid _without_
resorting to ugly unions.  Consider that our TCP hashing comparison
code in the kernel has gems like this:

#define TCP_IPV4_MATCH(__sk, __cookie, __saddr, __daddr, __ports, __dif)\
	(((*((__u64 *)&((__sk)->daddr)))== (__cookie))	&&		\
	 ((*((__u32 *)&((__sk)->dport)))== (__ports))   &&		\
	 (!((__sk)->bound_dev_if) || ((__sk)->bound_dev_if == (__dif))))

Sorry, I'm not using a union for this, it's totally a performance hack
and I'm not going to uglify the socket structure with silly unions.
And actually in this case, there is no reason the compiler cannot see
what I am up to.  Do powerful optimizations override common sense?

You know exactly what I'm doing there, and so do I, why can't egcs
figure it out that easily as well?

The compiler is just a tool.

Later,
David S. Miller
davem@redhat.com


* Re: Linux and aliasing?
  1999-06-03 10:37 ` mark
  1999-06-03 11:26   ` David S. Miller
@ 1999-06-03 12:02   ` Andi Kleen
  1999-06-03 15:38     ` Martin v. Loewis
  1999-06-30 15:43     ` Andi Kleen
  1999-06-30 15:43   ` mark
  2 siblings, 2 replies; 218+ messages in thread
From: Andi Kleen @ 1999-06-03 12:02 UTC (permalink / raw)
  To: mark; +Cc: egcs

mark@codesourcery.com writes:
> 
> And David Miller (IIRC) indicated that the kernel folks would probably
> eliminate the non-standard C code in the next major kernel revision.

This would be a major rewrite of a lot of code. Linux is full of such
things. Having it all fixed in the next major release (and even in the
next release after that) would be a miracle. 

> So, no, I don't think it's a priority for us to make any such change.
> I can't really speak for others, but that's my take on the situation.

Bad :/. -fno-strict-aliasing is the only alternative then. What a pity,
Linus' proposal looked reasonable. 

-Andi

-- 
This is like TV. I don't like TV.


* Re: Linux and aliasing?
  1999-06-03 11:26   ` David S. Miller
@ 1999-06-03 12:03     ` mark
  1999-06-03 12:25       ` David S. Miller
                         ` (2 more replies)
  1999-06-30 15:43     ` David S. Miller
  1 sibling, 3 replies; 218+ messages in thread
From: mark @ 1999-06-03 12:03 UTC (permalink / raw)
  To: davem; +Cc: chip, egcs, torvalds

>>>>> "David" == David S Miller <davem@redhat.com> writes:

    David> Actually, I've changed my mind.

OK, sorry to misrepresent your position.

    David> I'm not saying this should be the normal mode of operation,
    David> but some mechanism needs to exist so that such code can be
    David> made valid _without_ resorting to ugly unions.

There is one: -fno-strict-aliasing.  

You can turn off the optimization.  Then, however, if you complain
that the kernel would go faster with type-based alias analysis in
use, you're out of luck.  But you are no worse off than you were
before the optimization existed, and Linux worked pretty well in those
days too.

Despite what you say, you could just use some unions.  IMO, it
wouldn't take that much to fix up the TCP_IPV4_MATCH macro.  I'm sorry
the socket structure would become uglier, but, on the other hand, it
would make more obvious what exactly it is.  Right now, some of the
fields in the structure definition are really acting as unions, and
you're not making that clear to the reader of the code.

    David> You know exactly what I'm doing there, and so do I, why
    David> can't egcs figure it out that easily as well?

Good question, but you know very well that it's rhetorical. :-)  There
are lots of situations where it's obvious to the programmer what
should happen, but highly non-trivial to do in the compiler, if
possible at all.

I agree that in this case a special hack that says "if the access is
through the address of a variable/field of one type, but cast to
another type, then the user is doing something fishy, and we should
treat the access as if it were done with char*" is not ridiculous, and
not impossible to implement.  You might consider implementing this, or
hiring someone else to do it for you.
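
For reference, ISO C already offers one way to make an access behave as
if it were done through char*: copy the bytes with memcpy.  A minimal
sketch, assuming a hypothetical struct addr_pair that stands in for the
two adjacent 32-bit address fields read by TCP_IPV4_MATCH (the struct,
load_u64, and addrs_match are illustrative names, not kernel code):

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for two adjacent 32-bit fields of the socket
   structure; the real kernel layout is not reproduced here. */
struct addr_pair {
    uint32_t daddr;
    uint32_t rcv_saddr;
};

/* Read 8 bytes starting at p as a 64-bit value.  memcpy accesses the
   object as bytes, which the aliasing rules always permit. */
uint64_t load_u64(const void *p)
{
    uint64_t v;
    memcpy(&v, p, sizeof v);
    return v;
}

/* Compare both address fields against a precomputed 64-bit cookie. */
int addrs_match(const struct addr_pair *ap, uint64_t cookie)
{
    return load_u64(ap) == cookie;
}
```

Optimizing compilers commonly turn a fixed-size memcpy like this into a
single load, though that is worth checking for the target at hand.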

--
Mark Mitchell                   mark@codesourcery.com
CodeSourcery, LLC               http://www.codesourcery.com


* Re: Linux and aliasing?
  1999-06-03 12:03     ` mark
@ 1999-06-03 12:25       ` David S. Miller
  1999-06-03 20:06         ` craig
  1999-06-30 15:43         ` David S. Miller
  1999-06-03 13:31       ` Andi Kleen
  1999-06-30 15:43       ` mark
  2 siblings, 2 replies; 218+ messages in thread
From: David S. Miller @ 1999-06-03 12:25 UTC (permalink / raw)
  To: mark; +Cc: chip, egcs, torvalds

   From: mark@codesourcery.com
   Date: Thu, 03 Jun 1999 12:07:33 -0700

       David> I'm not saying this should be the normal mode of operation,
       David> but some mechanism needs to exist so that such code can be
       David> made valid _without_ resorting to ugly unions.

   There is one: -fno-strict-aliasing.  

And there is another: if my code is not "standard C" then output an
error instead of silently generating bad code.  Then again, I did not
specify -ansi and -ansi is not the default, and therefore I have not
asked for "standard C", instead I expect to get "GNU C" which
traditionally has been "standard C + common sense" :-)

So in that light, there is:

	If some pointer is cast to another of a different storage
	class or size, I'm doing something strange, and the compiler
	should turn off aliasing optimizations for anything to do with
	that set/class of pointers.

I know it can be detected, and I also know aliasing can be turned off
for a particular class of related pointers just as easily in the
current compiler code.

Common sense should override whatever standards say, where feasible,
and I argue that here it is indeed feasible.

	You might consider implementing this, or hiring someone else
	to do it for you.

Before anyone considers implementing any change, it would be prudent
to make sure most folks agree on the issue and how it should be
solved.

Later,
David S. Miller
davem@redhat.com


* Re: Linux and aliasing?
  1999-06-03 12:03     ` mark
  1999-06-03 12:25       ` David S. Miller
@ 1999-06-03 13:31       ` Andi Kleen
  1999-06-30 15:43         ` Andi Kleen
  1999-06-30 15:43       ` mark
  2 siblings, 1 reply; 218+ messages in thread
From: Andi Kleen @ 1999-06-03 13:31 UTC (permalink / raw)
  To: mark; +Cc: egcs, davem

mark@codesourcery.com writes:

> Despite what you say, you could just use some unions.  IMO, it
> wouldn't take that much to fix up the TCP_IPV4_MATCH macro.  I'm sorry
> the socket structure would become uglier, but, on the other hand, it
> would make more obvious what exactly it is.  Right now, some of the
> fields in the structure definition are really acting as unions, and
> you're not making that clear to the reader of the code.

This macro is just a particular example. The code is full of such stuff.

I actually started with converting some of the alias occurrences in
the 2.2 TCP code to unions, but I abandoned the project, because it
already touched far too much code and because there is no good way to 
limit the changes to specific modules. Also, the unions are really ugly.

Of course it would be nice to move to fewer such casts to allow more
optimizations in the future, but it is not realistic as a short-term
fix, even for 2.3.

If GNU C had anonymous unions like VC++ or plan9 or G++ it would be a lot
easier though, because then tagless structure members could be converted
without requiring global search-replaces. Unfortunately it does not,
and with the current "no more extensions" egcs policy it looks unlikely
(and it would also break Linux's link to gcc 2.7.2, which I fear would cause
a storm in the user and coder base).
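
To illustrate the point about anonymous unions: with them, the overlay
can be added without renaming any existing member.  This is a sketch
under assumptions: the struct and field names are hypothetical, and it
relies on anonymous struct/union members, which were an extension at the
time and were later standardized in C11.  Reading addrpair after writing
the two halves is union type punning, which GCC documents as supported
when the access goes through the union type.

```c
#include <stdint.h>

/* Hypothetical socket-like structure.  The anonymous union overlays a
   64-bit view on two 32-bit fields, so existing code that says
   sk->daddr or sk->rcv_saddr keeps compiling unchanged. */
struct sock_sketch {
    union {
        struct {
            uint32_t daddr;
            uint32_t rcv_saddr;
        };
        uint64_t addrpair;   /* both fields viewed as one 64-bit value */
    };
    int bound_dev_if;
};

/* Match both addresses with a single 64-bit comparison, no pointer cast. */
int sock_addrs_match(const struct sock_sketch *sk, uint64_t cookie)
{
    return sk->addrpair == cookie;
}
```

With a named union instead, every existing use of sk->daddr would have
to be rewritten to go through the union's name, which is the global
search-replace mentioned above.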

-Andi

P.S.: David, at least it would be a good argument to merge tcp_tw_bucket
into sock, even if the cast extension eventually gets in, just to
squeeze some more optimizations out of those important paths. I imagine
that could be important for good performance on IA64, which will probably
be hurt much more by missed optimizations than IA32 or sparc.

-- 
This is like TV. I don't like TV.


* Re: Linux and aliasing?
  1999-06-03 12:02   ` Andi Kleen
@ 1999-06-03 15:38     ` Martin v. Loewis
  1999-06-30 15:43       ` Martin v. Loewis
  1999-06-30 15:43     ` Andi Kleen
  1 sibling, 1 reply; 218+ messages in thread
From: Martin v. Loewis @ 1999-06-03 15:38 UTC (permalink / raw)
  To: ak; +Cc: mark, egcs

> Bad :/. -fno-strict-aliasing is the only alternative then. What a pity,
> Linus' proposal looked reasonable. 

It isn't really that bad. Of course, you lose some optimization
opportunities - but it isn't worse than earlier versions of gcc, which
never considered the type for aliasing, anyway.

Regards,
Martin


* Re: Linux and aliasing?
  1999-06-03 12:25       ` David S. Miller
@ 1999-06-03 20:06         ` craig
  1999-06-03 23:03           ` Linus Torvalds
                             ` (3 more replies)
  1999-06-30 15:43         ` David S. Miller
  1 sibling, 4 replies; 218+ messages in thread
From: craig @ 1999-06-03 20:06 UTC (permalink / raw)
  To: davem; +Cc: craig

>Common sense should override whatever standards say, where feasible,
>and I argue that here it is indeed feasible.

Maybe it is -- I haven't looked into the issues in detail -- but,
generally, it is very hard to implement common sense *in the compiler
itself*.

For all I know, this problem is the result of C, or gcc, being
too permissive about allowing casts across pointers to different
types...in the sense that, if that sort of thing was simply
disallowed, then programmers wouldn't even *think* they "knew what
they were doing", because they'd be getting compile-time diagnostics,
which, as you point out, is what they *should* be getting if the
compiler isn't basically successfully reading the programmer's mind
and implementing his desires.

In particular, while it might make sense for *your* application
to have the compiler "automatically" disable (even localized)
aliasing when it sees certain "suspicious" constructs, how do we
know there won't be people who say "hey, *we* use those constructs,
but we use them *correctly*, and we don't want to lose the
performance those alias assumptions give us", either now or in
the future?  Why should *they* have to pay for their more-
conforming (to the compiler's growing expectations, anyway) usage
by modifying their code, or even their shell scripts?

I'm thinking, more and more, that there really needs to be a
`GNU C--' or similar language for embedded systems, operating
systems like Linux, and so on, because the C standard seems
to be evolving towards making C *more*, not less, of a HLL,
and I doubt gcc (and its maintainers) will be up to the task
of making it fit both needs while evolving to handle new
architectures (e.g. IA64) in an optimal way.

(Or, anyone up for writing a BLISS front end to gcc, along with a
C-to-BLISS converter to be run over, for example, the Linux sources?  ;-)

        tq vm, (burley)


* Re: Linux and aliasing?
  1999-06-03 20:06         ` craig
@ 1999-06-03 23:03           ` Linus Torvalds
  1999-06-03 23:45             ` mark
                               ` (4 more replies)
  1999-06-03 23:53           ` Martin v. Loewis
                             ` (2 subsequent siblings)
  3 siblings, 5 replies; 218+ messages in thread
From: Linus Torvalds @ 1999-06-03 23:03 UTC (permalink / raw)
  To: craig; +Cc: davem, mark, chip, egcs

On 4 Jun 1999 craig@jcb-sc.com wrote:
> 
> Maybe it is -- I haven't looked into the issues in detail -- but,
> generally, it is very hard to implement common sense *in the compiler
> itself*.

Oh, agreed.

But it should be reasonably easy to implement very straightforward rules,
and have the rules themselves make common sense ;)

The extremely straightforward rule that at least I would advocate is _so_
straightforward as to be almost scary:
 - if there is a pointer cast, that pointer cast invalidates all
   type-based alias information.

It wouldn't matter if you cast the pointer to the type it had originally
anyway: a cast is a cast is a cast. If somebody dereferences a casted
value, the type information shouldn't be trusted.

It is a common sense rule at least to me, and it should be very simple for
the compiler too. And it has another advantage: it does not expand the
language in any way, and is obviously entirely ANSI compliant (it's
obviously not a _requirement_ of ANSI, but it is certainly allowed by it). 

_And_ any well-written software that isn't trying to be clever with
pointers would never ever notice anything, because the only reasonable
reason why you would ever use a pointer cast is because you're playing
games with the pointers in question, no?

Having a simple cast rule would make most of the alias issues go away
completely right off the bat, and the ones it wouldn't make go away you
can patch up by hand in the sources by just adding a dummy cast if
somebody is doing something _really_ ugly. 

> For all I know, this problem is the result of C, or gcc, being
> too permissive about allowing casts across pointers to different
> types...in the sense that, if that sort of thing was simply
> disallowed, then programmers wouldn't even *think* they "knew what
> they were doing", because they'd be getting compile-time diagnostics,
> which, as you point out, is what they *should* be getting if the
> compiler isn't basically successfully reading the programmer's mind
> and implementing his desires.

Well, the other way of thinking about this is to just say "oh, the
programmer is casting stuff, let's not trust the type system any more". 

Craig, don't always try to make the programmer look bad. Occasionally you
could just admit to the possibility that the programmer _really_ knows
what he is doing, and the compiler does not. Ok? Instead of blaming the
programmer, please just _allow_ him to say "you're just a stupid compiler,
and you shouldn't be getting too much in my way". Is that so hard to do?

> In particular, while it might make sense for *your* application
> to have the compiler "automatically" disable (even localized)
> aliasing when it sees certain "suspicious" constructs, how do we
> know there won't be people who say "hey, *we* use those constructs,
> but we use them *correctly*, and we don't want to lose the
> performance those alias assumptions give us", either now or in
> the future?  Why should *they* have to pay for their more-
> conforming (to the compiler's growing expectations, anyway) usage
> by modifying their code, or even their shell scripts?

They shouldn't. You should have an option that says

 -fno-strict-aliasing

(you have it already) and you should have an option that says

 -freally-strict-alias

but you should probably default to something that makes sense (and the
ANSI C rules certainly do _not_ qualify - I see how they were created, but
they do not "make sense" in any sense of that expression). Something that
notices that "oh, they're playing with the type system, I probably
shouldn't do aliases here". Ok?

In fact, the only _really_ sensible type-based alias system is probably
one where the _only_ thing that overrides a type-based alias is a pointer
cast. ANSI C has all the "funny" rules about "char *" being special etc,
and that means that you lose a lot of potentially very useful information. 
So you might consider having a mode that is _stricter_ than ANSI C in that
regard (not making "char *" anything special as far as the type alias
logic is concerned, but instead implementing _just_ the cast rule). 

So you can think of it as a sum of two independent yes/no rules: "do we
consider 'char *' to be a global alias killer" and "does a pointer cast
invalidate the type-based alias for the casted access?". So give the user
_two_ options:

	-fcast-invalidates-type-alias
	-fansi-alias-rule-invalidation

where the compiler would default to having both rules enabled (for the
"safest" kind of type-based alias), while people who really feel confident
that their program is entirely ANSI-alias safe would say that they do not
want the "cast-invalidates-type-alias" logic enabled. 

And in contrast, people like me who think the ANSI C rules are completely
arbitrary and much harder to understand than the cast alias rule, would at
least have the _option_ to use the simpler and more straightforward setup. 
No? 

Again, instead of thinking that the compiler always knows best, give the
user a choice. We're not in Windows any more, Tonto. Give the programmer
the gun, and allow him to shoot himself in the head. But give him a
laser-guided nightsight too, in case he wants it.

Don't get caught up in the MS way of doing things, where you not only give
a programmer a gun, you aim it at (roughly) the wrong target, and you
pull the trigger for him too. 

Face it, there are clever programmers out there. You shouldn't make it
illegal to be clever. A standard is not a law of nature, and it's not a
universal excuse to be unfriendly to people who want to go outside the
standard.

		Linus


* Re: Linux and aliasing?
  1999-06-03 23:03           ` Linus Torvalds
@ 1999-06-03 23:45             ` mark
  1999-06-04  0:04               ` Linus Torvalds
  1999-06-30 15:43               ` mark
  1999-06-04  5:47             ` craig
                               ` (3 subsequent siblings)
  4 siblings, 2 replies; 218+ messages in thread
From: mark @ 1999-06-03 23:45 UTC (permalink / raw)
  To: torvalds; +Cc: craig, davem, chip, egcs

I don't think the cast rule is by any means the right obvious default.
For one thing, it pessimizes object-oriented C code that does
downcasts through an inheritance hierarchy.  There's no reason that we
shouldn't be able to use type-based alias analysis in such situations,
but your proposal would make it not happen.
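
As an illustration of the kind of code meant here (a sketch with
hypothetical type names, not taken from any particular project): C
programs often emulate inheritance by making the base struct the first
member of the derived struct and converting between the two pointer
types.  ISO C permits this conversion, and in C the downcast has to be
written as an explicit cast.

```c
/* Hypothetical "base class": every shape starts with this header. */
struct shape {
    int kind;
};

/* "Derived class": the base is the first member, so a pointer to a
   circle can be converted to a pointer to its shape and back. */
struct circle {
    struct shape base;
    double radius;
};

/* Downcast from base to derived.  The cast is explicit, yet the access
   through c is an ordinary access to a struct circle, so type-based
   alias analysis remains valid for it. */
double circle_area(const struct shape *s)
{
    const struct circle *c = (const struct circle *)s;
    return 3.141592653589793 * c->radius * c->radius;
}
```

Under a rule where any pointer cast discards type-based alias
information, the read of c->radius above would have to be treated as
possibly aliasing everything.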

You can use -fno-strict-aliasing to get the "traditional" behavior.
The only effect on your code will be that some optimizations that used
not to happen, but would with -fstrict-aliasing, will still not
happen.  What's the big deal?  If -fstrict-aliasing had never been
implemented, you wouldn't be complaining would you?  So, we've
improved GCC, and we've preserved the old behavior.

GCC has plenty of odd rules and way too many options.  We don't need
more.  The exception, I think, is when there's something that you can
only do with a language extension or special flag.  Extended asm's are
one such; you just can't do it without an extension.  So, we have it.

But, here, you just don't like ANSI/ISO C, and wish it had different
semantics.  You *could* express what you want in legal ANSI/ISO C, and
then GCC would do the right thing, with its default flags.

If we come up with a rule that turns off strict aliasing only for code
which is not legal ANSI/ISO C, then perhaps we should issue a warning
(on the illegal construct) and then turn off strict aliasing.  But, I
don't think you should expect us to do this any time soon (without 
financial incentive, or a noble volunteer).

--
Mark Mitchell                   mark@codesourcery.com
CodeSourcery, LLC               http://www.codesourcery.com


* Re: Linux and aliasing?
  1999-06-03 20:06         ` craig
  1999-06-03 23:03           ` Linus Torvalds
@ 1999-06-03 23:53           ` Martin v. Loewis
  1999-06-30 15:43             ` Martin v. Loewis
       [not found]           ` <v04205101b37d700fbf8d@[192.168.1.254]>
  1999-06-30 15:43           ` craig
  3 siblings, 1 reply; 218+ messages in thread
From: Martin v. Loewis @ 1999-06-03 23:53 UTC (permalink / raw)
  To: craig; +Cc: davem, egcs

> For all I know, this problem is the result of C, or gcc, being
> too permissive about allowing casts across pointers to different
> types...

The problem is that ISO C explicitly allows you to cast pointers
forwards and backwards to completely unrelated types. Only when
you *dereference* the pointer must you be consistent in the type,
or dereference through char*.

If you have a sequence

long foo()
{
   long a = 15;
   long *b = &a;
   void *c = b;
   float *d = c;
   *d = 3.14;
   return a;
}

then the standard says that this code has undefined behaviour, yet
every single statement is ok. The compiler could take the position
that d cannot possibly alias with b, and return 15. (In this example,
analysis detects that they are aliased, anyway)
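
The same hazard can be seen with two pointer parameters, where no local
analysis can tell whether they alias.  A sketch (the function names are
made up for illustration): the first function may be compiled to return
15 without reloading *a, so calling it with d pointing at the same
storage as a is undefined.  The second expresses the byte overwrite with
memcpy, which is defined, although the resulting value of the long is
implementation-specific.

```c
#include <string.h>

/* With type-based alias analysis the compiler may assume the store
   through d cannot change *a, and return 15 without reloading it.
   Only call this with pointers to distinct objects. */
long store_then_read(long *a, float *d)
{
    *a = 15;
    *d = 3.14f;
    return *a;
}

/* Defined alternative: overwrite the leading bytes of *a with the
   bytes of a float.  memcpy performs a byte-wise access, so the
   compiler must assume *a changed and reload it. */
long overwrite_bytes_then_read(long *a, float value)
{
    *a = 15;
    memcpy(a, &value, sizeof value);
    return *a;
}
```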

Now, there are proposals to relax the rules. There is
-fno-strict-aliasing; I can't really see how there is an in-between.

Regards,
Martin


* Re: Linux and aliasing?
  1999-06-03 23:45             ` mark
@ 1999-06-04  0:04               ` Linus Torvalds
  1999-06-04  1:08                 ` Branko Cibej
                                   ` (5 more replies)
  1999-06-30 15:43               ` mark
  1 sibling, 6 replies; 218+ messages in thread
From: Linus Torvalds @ 1999-06-04  0:04 UTC (permalink / raw)
  To: mark; +Cc: craig, davem, chip, egcs

On Thu, 3 Jun 1999 mark@codesourcery.com wrote:
> 
> I don't think the cast rule is by any means the right obvious default.
> For one thing, it pessimizes object-oriented C code that does
> downcasts through an inheritance hierarchy.  There's no reason that we
> shouldn't be able to use type-based alias analysis in such situations,
> but your proposal would make it not happen.

But those downcasts are implicit, not explicit, no? I think only explicit
casts should break the alias rule.

> You can use -fno-strict-aliasing to get the "traditional" behavior.

Yes. But that's a complete on-off switch. 

> The only effect on your code will be that some optimizations that used
> not to happen, but would with -fstrict-aliasing, will still not
> happen.  What's the big deal?  If -fstrict-aliasing had never been
> implemented, you wouldn't be complaining would you?  So, we've
> improved GCC, and we've preserved the old behavior.

Oh, you don't expect me to complain about bad code generation when I know
gcc could do better?

Why do you have a "-O" flag at all if you think people don't care about
performance?

I'd love to have the alias extensions, but I don't think it should be a
per-file global setting. Sure, I can just be silent, but if you expect all
egcs users to just sit idly when you do silly things, why do you bother
making pre-releases available at all? You obviously don't care about the
feedback you get from real users.

> But, here, you just don't like ANSI/ISO C, and wish it had different
> semantics.  You *could* express what you want in legal ANSI/ISO C, and
> then GCC would do the right thing, with its default flags.

Have you actually ever tried? I don't think you realize quite what a
rat-hole it is. It's not worth ANYBODY'S time.

Sure, I can live with -fno-strict-aliasing. But I'm also really saddened
by all the lawyers like you who think that standards are somehow more
important than programmers. 

I can see technical arguments. An argument of "it's really too painful to
do" I can understand (preferably with an explanation, but hey, I don't
mind getting told that it's too hard to explain). I use that argument
every day myself. 

I think it's a damn shame that instead of technical arguments _everything_
revolves around people reading the standard as if it was the bible, and
trying to make people feel guilty for not really caring. It's not a sin to
just want to get good code without having to do magic contortions, guys.

			Linus


* Re: Linux and aliasing?
  1999-06-04  0:04               ` Linus Torvalds
@ 1999-06-04  1:08                 ` Branko Cibej
  1999-06-30 15:43                   ` Branko Cibej
  1999-06-04  1:24                 ` Joe Buck
                                   ` (4 subsequent siblings)
  5 siblings, 1 reply; 218+ messages in thread
From: Branko Cibej @ 1999-06-04  1:08 UTC (permalink / raw)
  To: egcs

Linus Torvalds wrote:

> I think it's a damn shame that instead of technical arguments _everything_
> revolves around people reading the standard as if it was the bible, and
> trying to make people feel guilty for not really caring. It's not a sin to
> just want to get good code without having to do magic contortions, guys.

Why, Linus, it's trivial to do! Use MSVC instead of egcs, then you can do

     #pragma optimize("a", on/off)

around every single statement, if you like. It even supports anonymous unions
_and_ structs, and is only slightly influenced by the ISO C standard.

They'll even sell you a version that can generate code for the alpha!

>:->

    Brane


P.S.: Just for the record: I, a "real user" and (imnsho) a "clever programmer"
who "know what I'm doing", at least most of the time, and who do not "read the
standard as if it were a bible", vote for having a standard-conforming
compiler that's not bloated by a ton of marginally useful (or even useless)
features. If I want featuritis, I know where I can get it.


P.P.S: They can't even spell "optimise" ...

--
Branko Čibej                 <branko.cibej@hermes.si>
HERMES SoftLab, Litijska 51, 1000 Ljubljana, Slovenia
voice: (+386 61) 186 53 49   fax: (+386 61) 186 52 70



* Re: Linux and aliasing?
  1999-06-04  0:04               ` Linus Torvalds
  1999-06-04  1:08                 ` Branko Cibej
@ 1999-06-04  1:24                 ` Joe Buck
  1999-06-04  1:50                   ` Linus Torvalds
  1999-06-30 15:43                   ` Joe Buck
  1999-06-04  5:47                 ` craig
                                   ` (3 subsequent siblings)
  5 siblings, 2 replies; 218+ messages in thread
From: Joe Buck @ 1999-06-04  1:24 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: mark, craig, davem, chip, egcs

Mark writes:
> > The only effect on your code will be that some optimizations that used
> > not to happen, but would with -fstrict-aliasing, will still not
> > happen.  What's the big deal?  If -fstrict-aliasing had never been
> > implemented, you wouldn't be complaining would you?  So, we've
> > improved GCC, and we've preserved the old behavior.

Linus writes:
> Oh, you don't expect me to complain about bad code generation when I know
> gcc could do better?

Oh, don't worry, we expect you to complain, and in a rude and insulting
manner at that.  We're used to it.  It seems that you were a nicer guy
before you had so many worshippers.

> Why do you have a "-O" flag at all if you think people don't care about
> performance?

Mark put in strict-aliasing because it is a big performance win.  Of
course he cares about performance.  The ISO rules were written in the
way they were written precisely to enable this performance improvement.

> I'd love to have the alias extensions, but I don't think it should be a
> per-file global setting. Sure, I can just be silent, but if you expect all
> egcs users to just sit idly when you do silly things, why do you bother
> making pre-releases available at all? You obviously don't care about the
> feedback you get from real users.

More rudeness and insults.  Of course we care.  Why do you insist on
talking to us like that?

> Sure, I can live with -fno-strict-aliasing. But I'm also really saddened
> by all the lawyers like you who think that standards are somehow more
> important than programmers. 

A compiler will do better the more aliasing possibilities it can
eliminate.  Mark used the ISO rules to determine what the set of
eliminatable aliases is.  You want to change this set to a smaller set, so
your programs will continue to work.  I understand that, I even
sympathize.  But you seem blind to the fact that this will inevitably make
some (possibly many) ISO-valid programs slower.  Possibly, with the right
rules, this set of slowed-down programs can be made very small.  Maybe
someone can donate a patch that will do this right.  But it's non-trivial,
and needs to be done carefully.
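
A small illustration of what is at stake (a sketch; scale is a made-up
function): under the ISO rules a store to a float cannot modify an int,
so the compiler is allowed to load *n once before the loop.  Without
type-based alias analysis it has to assume each store to v[i] might have
changed *n and reload it on every iteration.

```c
/* Multiply the first *n elements of v by k. */
void scale(float *v, const int *n, float k)
{
    for (int i = 0; i < *n; i++)
        v[i] *= k;   /* float store: cannot alias the int *n under ISO C */
}
```

Any rule that weakens the analysis for a class of accesses gives up this
kind of hoisting for that class, which is the cost being weighed here.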

> I can see technical arguments. An argument of "it's really too painful to
> do" I can understand (preferably with an explanation, but hey, I don't
> mind getting told that it's too hard to explain). I use that argument
> every day myself. 

See above, or find someone to submit working code (you would ask this
in an equivalent situation on the kernel list).

> I think it's a damn shame that instead of technical arguments _everything_
> revolves around people reading the standard as if it was the bible, and
> trying to make people feel guilty for not really caring.

The standard is not arbitrary: it is the way it is for technical reasons,
specifically to make C a suitable language for numerical computation.
Without such rules serious number-crunchers have to switch to Fortran.

> just want to get good code without having to do magic contortions, guys.

We could flip the default for the flag, so that people have to write
-fstrict-aliasing to get the optimization.  Had we done that, you
never would have noticed.



^ permalink raw reply	[flat|nested] 218+ messages in thread

* Re: Linux and aliasing?
  1999-06-04  1:24                 ` Joe Buck
@ 1999-06-04  1:50                   ` Linus Torvalds
  1999-06-04  5:46                     ` craig
  1999-06-30 15:43                     ` Linus Torvalds
  1999-06-30 15:43                   ` Joe Buck
  1 sibling, 2 replies; 218+ messages in thread
From: Linus Torvalds @ 1999-06-04  1:50 UTC (permalink / raw)
  To: Joe Buck; +Cc: mark, craig, davem, chip, egcs

On Fri, 4 Jun 1999, Joe Buck wrote:
> 
> Oh, don't worry, we expect you to complain, and in a rude and insulting
> manner at that.  We're used to it.  It seems that you were a nicer guy
> before you had so many worshippers.

Ehh, I wasn't exactly known for being polite even before. Why do you think
people still quote the flame wars I had about microkernels?

But point taken.

> Mark put in strict-aliasing because it is a big performance win.  Of
> course he cares about performance.  The ISO rules were written in the
> way they were written precisely to enable this performance improvement.

The ISO rules were not written to "enable" the performance improvement. 
They were written explicitly to _DISable_ it in a number of cases where
the optimization was known to break old code. And not everybody was all
that excited about the rules even when they were written. Understandably,
because they really aren't made to make sense; they are only made to give
at least _some_ way around aliasing issues. 

> A compiler will do better the more aliasing possibilities it can
> eliminate.  Mark used the ISO rules to determine what the set of
> eliminatable aliases is.  You want to change this set to a smaller set, so
> your programs will continue to work.

NO!

I want the _user_ to be able to give input. 

I have one _suggested_ option, that to me has the huge advantage of not
really polluting the language, while being simple and obvious.

I would be happy with a #pragma, or with an attribute. People have been
talking about much more specialized attributes ("naked" etc) that are not
really useful to _any_ normal programs. The alias control feature would be
useful to real users - not just the kernel. At least judging by the
snippets of code I've seen. Code that breaks with the ANSI rules. 

I happen to think that the "explicit cast invalidates the alias
information" rule is the simplest and best one, and gives the user the
best control without adding things like new attributes or other ways to
let the user be in control.

But the details of _how_ that control is achieved are much less important
than the fact that the programmer _should_ be in control.

>					  I understand that, I even
> sympathize.  But you seem blind to the fact that this will inevitably make
> some (possibly many) ISO-valid programs slower.

Did you read my post? I'm arguing against making it something we have no
control over.

I was even arguing for allowing _stricter_ aliases than ANSI allows - the
"char *" thing in ANSI is actually really hard to code around (as far as I
know, the only way to do a one-byte access that still allows alias logic
to work in ANSI C is to do something really ridiculous like

	typedef struct {
		char c;
	} *one_byte_t;

in order to avoid the rule that any char access automatically means that
the compiler can't use the regular alias type rules).
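
[A minimal sketch of the workaround described above, not code from the
thread. The names struct one_byte, clear_plain and clear_wrapped are
hypothetical. The claim that a compiler can keep *len in a register in the
second loop but not the first is the one made in the message; the sketch
only shows that both forms compute the same result, and whether a given
compiler exploits the difference is an assumption.]

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical wrapper type, modelled on the typedef quoted above. */
struct one_byte {
	char c;
};

/* Stores go through a plain char lvalue, which may alias any object,
 * so *len has to be treated as possibly modified by each store. */
void clear_plain(char *dst, const size_t *len)
{
	assert(dst != NULL && len != NULL);
	for (size_t i = 0; i < *len; i++)
		dst[i] = 0;
}

/* Same loop, but the bytes are reached through a distinct struct type. */
void clear_wrapped(struct one_byte *dst, const size_t *len)
{
	assert(dst != NULL && len != NULL);
	for (size_t i = 0; i < *len; i++)
		dst[i].c = 0;
}
```

[Both functions are valid ISO C when dst and len do not overlap; the only
difference is which type the byte stores are expressed through.]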

> > I can see technical arguments. An argument of "it's really too painful to
> > do" I can understand (preferably with an explanation, but hey, I don't
> > mind getting told that it's too hard to explain). I use that argument
> > every day myself. 
> 
> See above, or find someone to submit working code (you would ask this
> in an equivalent situation on the kernel list).

Yes. I would ask the same.

But I do NOT use arguments like "that is undefined by POSIX" unless I have
a damn good reason to. I consider POSIX to be a guide to me, but I do not
consider it to be automatically correct (POSIX has done some major
blunders in its time: outright idiocies that simply could not be
implemented correctly on 64-bit architectures for very simple technical
reasons, for example).

And I would not say "POSIX does not allow you to do that, so why should
you do it"? 

> The standard is not arbitrary: it is the way it is for technical reasons,
> specifically to make C a suitable language for numerical computation.
> Without such rules serious number-crunchers have to switch to Fortran.

Look at the actual rules. Tell me that the "char *" rule makes sense.

The standard _is_ arbitrary. They tried to select a number of special
rules to make it UNLIKELY that old programs break. But the rules _were_
arbitrary. 

Note that "arbitrary" does not imply "random". There are reasons for the
rules. "char *" has historical issues associated with it. But there are
reasons for the extension I suggested too - and they aren't really any
different from the standard reasons.

"Arbitrary" means that you don't have any strong reason to choose one over
the other. So maybe you should allow the user some choice in the matter?

> > just want to get good code without having to do magic contortions, guys.
> 
> We could flip the default for the flag, so that people have to write
> -fstrict-aliasing to get the optimization.  Had we done that, you
> never would have noticed.

I would certainly have complained less, yes. Backwards compatibility is a
strong argument, and the way it is set up now just rubs everybody's nose in
the fact that the compiler behaviour changed. Behaviour you could rely on
according to other (and equally valid) standards of the language - the
alias thing was not even a proposal when I started doing Linux. 

But I would have noticed - I don't think you realize quite how important
generated code quality is to me, and that I actually _am_ aware of the
standard even when I disagree with some of the details in it. I _like_
alias analysis. I just want to have better control over it, because I
happen to think that I can take _advantage_ of it. 

I dislike fascist compilers who think they know better than I do.

And I dislike people who think fascist compilers are a good idea.

			Linus

^ permalink raw reply	[flat|nested] 218+ messages in thread

* Re: Linux and aliasing?
  1999-06-04  1:50                   ` Linus Torvalds
@ 1999-06-04  5:46                     ` craig
  1999-06-04  7:22                       ` burley (was Re: Linux and aliasing?) Mark Hahn
                                         ` (2 more replies)
  1999-06-30 15:43                     ` Linus Torvalds
  1 sibling, 3 replies; 218+ messages in thread
From: craig @ 1999-06-04  5:46 UTC (permalink / raw)
  To: torvalds; +Cc: craig

>I want the _user_ to be able to give input. 
[...]
>I happen to think that the "explicit cast invalidates the alias
>information" rule is the simplest and best one, and gives the user the
>best control without adding things like new attributes or other ways to
>let the user be in control.

In other words, you believe you are a better language designer than
the ISO C people as well as the gcc maintainers, despite the fact
that you know, what, *nothing* about language design, and *nothing*
about compiler design and, especially, long-term maintenance of
compilers?

[...]
>And I would not say "POSIX does not allow you to do that, so why should
>you do it"? 

>I dislike fascist compilers who think they know better than I do.

Then stop using them.

>And I dislike people who think fascist compilers are a good idea.

We dislike people who think the only good compiler is one that compiles
Linux, on the theory that whatever features Linux needs, must be the
exact ones that should have been in the C language in the first place.

*They* appear to be the real fascists to me, showing up on gcc lists
every few months, telling us how stupid, obnoxious, or ignorant we
are to not take their advice by running around like rats implementing
every feature they ask for, but hardly ever listening to our advice.

gcc is a tool, but some people appear to want it to be a wife -- Betty
Crocker in the kitchen, a virgin with the folks, and a hooker in the
bedroom.

        tq vm, (burley)

^ permalink raw reply	[flat|nested] 218+ messages in thread

* Re: Linux and aliasing?
  1999-06-03 23:03           ` Linus Torvalds
  1999-06-03 23:45             ` mark
@ 1999-06-04  5:47             ` craig
  1999-06-04  8:17               ` Linus Torvalds
  1999-06-30 15:43               ` craig
  1999-06-04  8:39             ` Tim Hollebeek
                               ` (2 subsequent siblings)
  4 siblings, 2 replies; 218+ messages in thread
From: craig @ 1999-06-04  5:47 UTC (permalink / raw)
  To: torvalds; +Cc: craig

>Craig, don't always try to make the programmer look bad.

Why would I, since you seem to be doing such a good job of it, by
coming here and lecturing us about how to build compilers?

My goal is to make *great* programmers look *better*.  If gcc is
really a tool, it'll be simple, easy to use properly, and behave
consistently no matter where it is used.  *Great* programmers
prefer that kind of tool.  *Lousy* programmers think a hammer should
behave like a screwdriver so they can save some time changing tools
in the middle of a job, or so it can be "compatible" with the
locals' tradition that something called a "hammah" does the job of
a screwdriver.

>Instead of blaming the
>programmer, please just _allow_ him to say "you're just a stupid compiler,
>and you shouldn't be getting too much in my way". Is that so hard to do?

a) you already have the ability to do that, but you seem to not like
it, and b) yes, when it *is* so hard to do that, when there end up
being 5,000 different controls like that, and we decide that, lo and
behold, there's a *bug* in there somewhere, or, hey, maybe we want
to rewrite a chunk of the compiler, and it seems simple enough to do
that, except, first, we have to do an in-depth study on exactly how
those 5,000 controls might interact (since we have little or no
useful documentation on them, other than existing code that might
break if they stop behaving exactly as they did).

>Again, instead of thinking that the compiler always knows best, give the
>user a choice. We're not in Windows any more, Tonto. Give the programmer
>the gun, and allow him to shoot himself in the head. But give him a
>laser-guided nightsight too, in case he wants it.

Any programmer worth his salt, wanting "a laser-guided nightsight",
and not wanting to tweak (or even rewrite) his code for every new
compiler release, will *not* use a C compiler.  Period.

>Don't get caught up in the MS way of doing things, where you not only give
>a programmer a gun, you aim it at (roughly) the wrong target, and you
>pull the trigger for him too. 

The MS way *is* to offer a product with bazillions of little features
that are not really appropriate to the "core mission" of the product.
Or what do *you* call MS Word -- a word processor?  Nobody I know,
who understands design issues, calls it that.

>Face it, there are clever programmers out there. You shouldn't make it
>illegal to be clever. A standard is not a law of nature, and it's not a
>universal excuse to be unfriendly to people who want to go outside the
>standard.

I would prefer gcc default to catering to *great* programmers.  *Clever*
programmers make the fatal mistakes we all have to live with, like
making distinctions based on content of whitespace, or believing that
a key labeled "backspace" must necessarily generate ASCII BS simply
because they share the same name, or believing that their code will
be rewritten and deployed before Jan 1, 2000.

Put another way: I *know* there are clever programmers out there.  That's
what scares me.

Face it: *you* made the fatal mistake here, by choosing C to implement
an operating system that you wanted to be fast, easy to maintain, and
portable.  Even worse: you chose "GNU C", the C language extended by
people who largely didn't understand what they were doing.  And, I agree
100%, all the people making these mistakes, from yourself to RMS, are
"clever programmers".

But you can mitigate the mistakes *you* made by dropping at least one
of the requirements you appear to have for Linux:

  -  Speed.  Rewrite it to accommodate some reasonable subset of ISO C,
     then live with whatever performance you get by tweaking compiler
     options.  That means no `asm', of course.

  -  Easy to maintain.  Decide that, upon every new release of gcc you
     want to compile Linux, you'll commit significant resources studying
     the effects of new optimizations and rewriting *Linux* -- not *gcc*
     -- to accommodate it.

  -  Portable.  Decide that gcc 2.7.2 will forever be the compiler for
     Linux, and that you'll therefore live with never porting Linux
     to new architectures not supported by that version of gcc.

The above will be recognized as a variation on a well-known theme --
"you can have it Soon, Cheap, and Working; choose *two*".

It would at least help us some, in accommodating cases (which, for all
I know, is true of this particular issue, though other gcc contributors
suggest otherwise) where we really blew it in selecting a default,
if you'd focus on the middle item more, to the extent that it results
in sharing what you learn about the effects of new optimizations, etc.
on the existing Linux code base.

I mean, I do recall some discussions of this issue before, but why
are we discussing it *now*, in a release cycle, when there's *nothing*
we can do about it, given that there's no *bug* here?

        tq vm, (burley)

^ permalink raw reply	[flat|nested] 218+ messages in thread

* Re: Linux and aliasing?
  1999-06-04  0:04               ` Linus Torvalds
  1999-06-04  1:08                 ` Branko Cibej
  1999-06-04  1:24                 ` Joe Buck
@ 1999-06-04  5:47                 ` craig
  1999-06-30 15:43                   ` craig
  1999-06-04  7:11                 ` mark
                                   ` (2 subsequent siblings)
  5 siblings, 1 reply; 218+ messages in thread
From: craig @ 1999-06-04  5:47 UTC (permalink / raw)
  To: torvalds; +Cc: craig

>Sure, I can live with -fno-strict-aliasing. But I'm also really saddened
>by all the lawyers like you who think that standards are somehow more
>important than programmers. 

*Here*'s a clue: WE'RE PROGRAMMERS TOO.  Try reminding yourself of that
before the *next* time you flame our efforts to get a release out the
door without even providing a patch, much less a detailed specification,
to do what you want, okay?

>I think it's a damn shame that instead of technical arguments _everything_
>revolves around people reading the standard as if it was the bible, and
>trying to make people feel guilty for not really caring. It's not a sin to
>just want to get good code without having to do magic contortions, guys.

No, but it's stupid to want to do that in C or C++.  Even Fortran is a
better choice, and *it's* got lots of problems.

Ask Dakota Scientific Systems how they produce some of the most-optimized
numerical libraries on the planet.  They start with the original Fortran
code.  They look at the output from the native compiler for the particular
combination of architecture/CPU/cache-size/memory-latency that they're
targeting.  Then, they tweak the *original* Fortran code in order to
convince the compiler to generate better output for that target.

Fortunately, they seem experienced enough to understand that this process
works only for a *particular* version of a compiler, rather than
believing that their tweaks must be honored for all time by that
compiler.  And, they don't seem to think it's necessary to ask the
compiler folks to add all sorts of fiddly little knobs to do *their*
work for them, based on my impressions from talking with one of the
people there.

But, then, they're *real* programmers.

        tq vm, (burley)

^ permalink raw reply	[flat|nested] 218+ messages in thread

* Re: Linux and aliasing?
       [not found]           ` <v04205101b37d700fbf8d@[192.168.1.254]>
@ 1999-06-04  7:01             ` craig
  1999-06-30 15:43               ` craig
  0 siblings, 1 reply; 218+ messages in thread
From: craig @ 1999-06-04  7:01 UTC (permalink / raw)
  To: lehotsky; +Cc: craig

>	The complaints probably won't go away until we implement
>
>		#pragma dwim;

Indeed.  My own reasons for getting into the compiler "arena" included
many of these same issues -- I knew I wanted better code generated
for my OS/kernel work, but also knew that the more such work could be
leveraged off of what applications people wanted, the better for all.

        tq vm, (burley)

^ permalink raw reply	[flat|nested] 218+ messages in thread

* Re: Linux and aliasing?
  1999-06-04  0:04               ` Linus Torvalds
                                   ` (2 preceding siblings ...)
  1999-06-04  5:47                 ` craig
@ 1999-06-04  7:11                 ` mark
  1999-06-04  8:38                   ` Linus Torvalds
  1999-06-30 15:43                   ` mark
  1999-06-04  8:41                 ` Tim Hollebeek
  1999-06-30 15:43                 ` Linus Torvalds
  5 siblings, 2 replies; 218+ messages in thread
From: mark @ 1999-06-04  7:11 UTC (permalink / raw)
  To: torvalds; +Cc: craig, davem, chip, egcs

>>>>> "Linus" == Linus Torvalds <torvalds@transmeta.com> writes:

    Linus> On Thu, 3 Jun 1999 mark@codesourcery.com wrote:
    >>  I don't think the cast rule is by any means the right obvious
    >> default.  For one thing, it pessimizes object-oriented C code
    >> that does downcasts through an inheritance hierarchy.  There's
    >> no reason that we shouldn't be able to use type-based alias
    >> analysis in such situations, but your proposal would make it
    >> not happen.

    Linus> But those downcasts are implicit, not explicit, no? I think
    Linus> only explicit casts should break the alias rule.

No, they are often explicit.  In C, you don't have base classes, per
se, so you write them explicitly.

    Linus> Have you actually ever tried? I don't think you realize
    Linus> quite what a rat-hole it is. It's not worth ANYBODYS time.

Yes, I have done similar things.

    Linus> I think it's a damn shame that instead of technical
    Linus> arguments _everything_ revolves around people reading the
    Linus> standard as if it was the bible, and trying to make people
    Linus> feel guilty for not really caring. It's not a sin to just
    Linus> want to get good code without having to do magic
    Linus> contortions, guys.

I implemented the code, and I wouldn't say that I ignored real
programmers.  In fact, my work was paid for by real programmers, who
noticed that GCC would generate markedly better code on some examples
they had if type-based aliasing were in use.  

I've expressed the position that if we come up with a reasonable
localized rule, that does not pessimize conforming code, that I would
have no objection.  In fact, I would be perfectly willing to work on
such a project.

With respect to your comments about prereleases, they're simply not
fair.  I see no reason that I, or anyone else, should volunteer our
time to add features.  Had I introduced a bug, I would feel duty-bound
to fix it.  

I do listen to feedback, and I've heard your point of view.  I respect
your opinion.  That doesn't mean I'm going to sit down and do what you
would like on my own time.

--
Mark Mitchell                   mark@codesourcery.com
CodeSourcery, LLC               http://www.codesourcery.com

^ permalink raw reply	[flat|nested] 218+ messages in thread

* re: burley (was Re: Linux and aliasing?)
  1999-06-04  5:46                     ` craig
@ 1999-06-04  7:22                       ` Mark Hahn
  1999-06-04  8:16                         ` craig
  1999-06-30 15:43                         ` Mark Hahn
  1999-06-04  8:35                       ` Linux and aliasing? Linus Torvalds
  1999-06-30 15:43                       ` craig
  2 siblings, 2 replies; 218+ messages in thread
From: Mark Hahn @ 1999-06-04  7:22 UTC (permalink / raw)
  To: egcs

could we please have a separate list for this kind of asinine namecalling?
perhaps call it "hoity-toity-armchair-architects-soapbox"?

> In other words, you believe you are a better language designer than
> the ISO C people as well as the gcc maintainers, despite the fact
> that you know, what, *nothing* about language design, and *nothing*
> about compiler design and, especially, long-term maintenance of
> compilers?

^ permalink raw reply	[flat|nested] 218+ messages in thread

* Re: burley (was Re: Linux and aliasing?)
  1999-06-04  7:22                       ` burley (was Re: Linux and aliasing?) Mark Hahn
@ 1999-06-04  8:16                         ` craig
  1999-06-30 15:43                           ` craig
  1999-06-30 15:43                         ` Mark Hahn
  1 sibling, 1 reply; 218+ messages in thread
From: craig @ 1999-06-04  8:16 UTC (permalink / raw)
  To: hahn; +Cc: craig

>could we please have a separate list for this kind of asinine namecalling?

We do, it's called /dev/null, but Linus continues to insist on using
*this* list.

>perhaps call it "hoity-toity-armchair-architects-soapbox"?
>
>> In other words, you believe you are a better language designer than
>> the ISO C people as well as the gcc maintainers, despite the fact
>> that you know, what, *nothing* about language design, and *nothing*
>> about compiler design and, especially, long-term maintenance of
>> compilers?

Strange you quote *my* email, since I called him no names, like "lawyer",
in it.

        tq vm, (burley)

^ permalink raw reply	[flat|nested] 218+ messages in thread

* Re: Linux and aliasing?
  1999-06-04  5:47             ` craig
@ 1999-06-04  8:17               ` Linus Torvalds
  1999-06-04  8:49                 ` craig
  1999-06-30 15:43                 ` Linus Torvalds
  1999-06-30 15:43               ` craig
  1 sibling, 2 replies; 218+ messages in thread
From: Linus Torvalds @ 1999-06-04  8:17 UTC (permalink / raw)
  To: craig; +Cc: davem, mark, chip, egcs

On 4 Jun 1999 craig@jcb-sc.com wrote:
> 
> Any programmer worth his salt, wanting "a laser-guided nightsight",
> and not wanting to tweak (or even rewrite) his code for every new
> compiler release, will *not* use a C compiler.  Period.

Oh?

That's a new argument. Instead of "don't use that feature", it's now
"don't do anything clever at all".

		Linus

^ permalink raw reply	[flat|nested] 218+ messages in thread

* Re: Linux and aliasing?
  1999-06-04  5:46                     ` craig
  1999-06-04  7:22                       ` burley (was Re: Linux and aliasing?) Mark Hahn
@ 1999-06-04  8:35                       ` Linus Torvalds
  1999-06-04 10:04                         ` Joe Buck
  1999-06-30 15:43                         ` Linus Torvalds
  1999-06-30 15:43                       ` craig
  2 siblings, 2 replies; 218+ messages in thread
From: Linus Torvalds @ 1999-06-04  8:35 UTC (permalink / raw)
  To: craig; +Cc: jbuck, mark, davem, chip, egcs

On 4 Jun 1999 craig@jcb-sc.com wrote:
> 
> In other words, you believe you are a better language designer than
> the ISO C people as well as the gcc maintainers, despite the fact
> that you know, what, *nothing* about language design, and *nothing*
> about compiler design and, especially, long-term maintenance of
> compilers?

Craig, instead of only getting down to personal insults, and how you think
I should just stay with one compiler all my life or rewrite my code every
year, how about you actually face any of the =technical= issues? Too
scary?

I haven't maintained a compiler long-term. I _have_ maintained a larger,
and arguably more complex system with many more degrees of freedom and
thus choices than gcc. I know about maintenance, code-boy. Whether you'll
ever admit to that is irrelevant.

So instead of just spouting off crap, why don't you give a single
technical reason why my suggestion is actually BAD? Instead of talking
about "language design" and trying to set yourself up as the only person
in the world who understands the issues, why don't you just face the
technical issues and get down to details?

_I_ think my simple extension was perfectly legitimate, and a _lot_ more
obvious than a lot of things people are discussing on the lists. So don't
give me that crap about not adding new features outside the standard: 
people in the egcs camp do that all the time, and they usually _like_
doing it, even for much more specialized problems like function prologue
and epilogue code generation. 

And I don't see any "language design" issues either - it's a very clean
extension, and makes complete sense. I bet that if we took any average
C programmer (and most of us do =not= know all that much about aliases),
people would understand the extended semantics a lot more easily than they
understand the basic ANSI rules.

So how about it? Instead of just telling everybody that C isn't a portable
systems language (which it was designed to be, by people I respect a lot
more than you, despite all your rhetoric about being such a good language
person), just tell us why you think the simple "explicit cast invalidates
type information for aliasing" rule is so bad. 

And I realize that people are in a hurry and somewhat stressed to get 2.95
out the door. I do NOT think that anything like this should be a gating
issue - that would just be silly. The current egcs works, albeit with a
too draconian (in my opinion) global flag. Please don't get the feeling
that I'm doing this just to disrupt some release process. 

			Linus

^ permalink raw reply	[flat|nested] 218+ messages in thread

* Re: Linux and aliasing?
  1999-06-04  7:11                 ` mark
@ 1999-06-04  8:38                   ` Linus Torvalds
  1999-06-30 15:43                     ` Linus Torvalds
  1999-06-30 15:43                   ` mark
  1 sibling, 1 reply; 218+ messages in thread
From: Linus Torvalds @ 1999-06-04  8:38 UTC (permalink / raw)
  To: mark; +Cc: craig, davem, chip, egcs

On Fri, 4 Jun 1999 mark@codesourcery.com wrote:
> 
> With respect to your comments about prereleases, they're simply not
> fair.  I see no reason that I, or anyone else, should volunteer our
> time to add features.  Had I introduced a bug, I would feel duty-bound
> to fix it.  

Oh, let me apologize for that comment. Consider me properly chastised: I
obviously use pre-releases all the time myself, and they are the greatest
thing since sliced bread.

> I do listen to feedback, and I've heard your point of view.  I respect
> your opinion.  That doesn't mean I'm going to sit down and do what you
> would like on my own time.

I really don't expect people to code for me. It's damn convenient, though.

I =do= expect people to at least consider the issue seriously, and
seriously dismiss it if they do - and keep it in mind. Instead of
attacking it on some paperwork issue.. Which you do seem to be doing. 

		Linus

^ permalink raw reply	[flat|nested] 218+ messages in thread

* Re: Linux and aliasing?
  1999-06-03 23:03           ` Linus Torvalds
  1999-06-03 23:45             ` mark
  1999-06-04  5:47             ` craig
@ 1999-06-04  8:39             ` Tim Hollebeek
  1999-06-04  8:55               ` Linus Torvalds
  1999-06-30 15:43               ` Tim Hollebeek
  1999-06-04 15:02             ` Richard Henderson
  1999-06-30 15:43             ` Linus Torvalds
  4 siblings, 2 replies; 218+ messages in thread
From: Tim Hollebeek @ 1999-06-04  8:39 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: craig, davem, mark, chip, egcs

Linus Torvalds writes ...
> 
> But it should be reasonably easy to implement very straightforward rules,
> and have the rules themselves make common sense ;)
> 
> The extremely straightforward rule that at least I would advocate is _so_
> straightforward as to be almost scary:
>  - if there is a pointer cast, that pointer cast invalidates all
>    type-based alias information.

I think it's pretty obvious this is the wrong thing to do.

Sure, it does the right thing (for a narrow definition of "right
thing") if your code always uses hairy expressions where all the
nastiness is jumbled in one expression.

If you do the same type tricks but use intermediate variables to
improve readability, you lose.  In fact, simply taking an expression
and decomposing it into constituent parts can change the behavior of
code under this rule.  Absolutely horrible.

Unless you're suggesting data flow analysis to figure out which
pointer values could have been derived from a casted pointer??? ick,
ick, ick.
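
[A minimal sketch of the decomposition problem described above, assuming a
32-bit IEEE-754 float. The function names are hypothetical. bits_inline and
bits_split are NOT valid under the ISO aliasing rules and are shown only to
contrast the two spellings; bits_memcpy is a conforming alternative.]

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

_Static_assert(sizeof(float) == sizeof(uint32_t), "sketch assumes 32-bit float");

/* Cast and dereference in one expression: the proposed rule sees the
 * cast right at the access. Not ISO-valid. */
uint32_t bits_inline(float *f)
{
	return *(uint32_t *)f;
}

/* Same trick through an intermediate variable: the dereference of p no
 * longer contains a cast, so a purely syntactic cast rule would treat
 * it differently from bits_inline. Not ISO-valid. */
uint32_t bits_split(float *f)
{
	uint32_t *p = (uint32_t *)f;
	return *p;
}

/* Conforming: copy the object representation instead of aliasing it. */
uint32_t bits_memcpy(const float *f)
{
	uint32_t u;
	assert(f != NULL);
	memcpy(&u, f, sizeof u);
	return u;
}
```

[Only bits_memcpy has the same defined meaning with or without
-fno-strict-aliasing; the other two depend on the compiler's aliasing
behaviour, which is the point under dispute.]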

-Tim

^ permalink raw reply	[flat|nested] 218+ messages in thread

* Re: Linux and aliasing?
  1999-06-04  0:04               ` Linus Torvalds
                                   ` (3 preceding siblings ...)
  1999-06-04  7:11                 ` mark
@ 1999-06-04  8:41                 ` Tim Hollebeek
  1999-06-04  8:53                   ` Jeffrey A Law
  1999-06-30 15:43                   ` Tim Hollebeek
  1999-06-30 15:43                 ` Linus Torvalds
  5 siblings, 2 replies; 218+ messages in thread
From: Tim Hollebeek @ 1999-06-04  8:41 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: mark, craig, davem, chip, egcs

Linus Torvalds writes ...
> 
> I think it's a damn shame that instead of technical arguments _everything_
> revolves around people reading the standard as if it was the bible, and
> trying to make people feel guilty for not really caring. It's not a sin to
> just want to get good code without having to do magic contortions, guys.

I think it's a damn shame certain people can't be disagreed with without
insulting their opponents.

-Tim

^ permalink raw reply	[flat|nested] 218+ messages in thread

* Re: Linux and aliasing?
  1999-06-04  8:17               ` Linus Torvalds
@ 1999-06-04  8:49                 ` craig
  1999-06-04  8:57                   ` Linus Torvalds
  1999-06-30 15:43                   ` craig
  1999-06-30 15:43                 ` Linus Torvalds
  1 sibling, 2 replies; 218+ messages in thread
From: craig @ 1999-06-04  8:49 UTC (permalink / raw)
  To: torvalds; +Cc: craig

>On 4 Jun 1999 craig@jcb-sc.com wrote:
>> 
>> Any programmer worth his salt, wanting "a laser-guided nightsight",
>> and not wanting to tweak (or even rewrite) his code for every new
>> compiler release, will *not* use a C compiler.  Period.
>
>Oh?
>
>That's a new argument. Instead of "don't use that feature", it's now
>"don't do anything clever at all".

No, you misunderstand.

C is simply a poor language for the task at hand.  It provides too
little low-level control of how a C compiler should do its work,
yet it's not high-level enough to make it practical for the compiler
to do enough of the optimization work for the programmer to satisfy
developers of embedded/OS code.

Your recently-expressed concerns about `volatile' were a perfect
example of that.  You correctly (or pretty nearly so) noted the
distinction you wanted to make between a volatile *reference* to
an object and a reference to an object via a volatile *address*
(pointer).  That's a distinction C apparently doesn't provide,
among many, in a language supposedly "suitable" for "low-level"
coding (and, I admit, it's better than PL/I in that regard).

Not that I have anything great to suggest in place of C, you understand,
and I fully realize that you're not about to rewrite Linux into some
other language anyway.

But what you're essentially asking for is for us to make gcc compile
some language that is less and less like C, one that is more and more
like your particular *vision* of what C should be, which happens to
be quite at odds with the direction C9X and others are taking, if my
impressions of those efforts (based on posts to this list) are
correct.  (Clearly it'd be easier if those working on the upcoming C
standard simply implemented your desires...at least, easier on the
gcc developers.)

Further, you're asking for us to do language design "on the fly", while
implementing a compiler for that language.  In my experience, that
attempt to marry language design and compiler implementation, while
plenty of fun and full of opportunity for cleverness to show itself,
almost always leads to poor language design.

You've already been bitten by hard-to-find bugs stemming from *extensions*
to GNU C that you used, sometimes without regard to the fact that they
were not particularly well documented.  These experiences have led you
to conclude, or at least complain, that gcc was going in a direction
you did not like.  (Worst of all, you often express this by insulting
people like myself, who, especially in my case, aren't the ones *causing*
the trouble, but are simply trying to *explain* it to you!)

Now you are complaining about a *standard* language feature being
implemented in a standard-conforming way by gcc, one which you can
work around by changing your code to be standard-conforming (surely
a SMOP, say a few lines of Perl ;-) *or* by using a command-line
option...but you don't like the pain of the former, or the performance
of the latter.  Welcome to C hell.
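
As an illustration of the first route, here is one standard-conforming
way to reinterpret an object's bytes without a pointer cast.  This is a
sketch, not code from the kernel: the function names are made up, and it
assumes `float' is a 32-bit IEEE 754 type.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Copy the object representation instead of dereferencing a
   uint32_t pointer that really points at a float. */
static uint32_t float_bits(float f)
{
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);
    return bits;
}

/* The inverse conversion, done the same way. */
static float bits_to_float(uint32_t bits)
{
    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}
```

Because no object is accessed through an lvalue of the wrong type, the
type-based alias analysis has nothing to get wrong here.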

I don't know the details of the issues involved, but I trust those that
have spoken against your proposal, that they *do* know them.

What I have been trying to do is get you to see that, at some point,
you have to conclude that you're never going to succeed at making
the edge of *this* particular hammer, gcc, sharp enough to make
nice clean cuts in silk, for any length of time, without a whole lot
of pain, because every time someone uses it as a hammer, those nice
cutting edges get worn off.

        tq vm, (burley)

P.S. Due to my own outbursts on this thread, and the resulting email
I got, I promise to make this the *last* time I will *ever* respond
to queries, complaints, etc. about similar issues coming from the
Linux camp.  Clearly I do not have what it takes to respond to what
I see as extreme (and repeated) childishness without letting myself
be dragged (at least somewhat) down into the muck, a problem I've
long known I've had, but have yet to fully address.

So, those of you who *encouraged* me in private email, thanks, but,
from now on, you're on your own in defending the honor of gcc
developers against the unfounded, and unfair, accusations of people like
Linus Torvalds.  It's not just that I don't have the maturity to
cope with it -- I don't have the patience, and I surely don't have
the time, to keep going over the same ground again and again.

^ permalink raw reply	[flat|nested] 218+ messages in thread

* Re: Linux and aliasing?
  1999-06-04  8:41                 ` Tim Hollebeek
@ 1999-06-04  8:53                   ` Jeffrey A Law
  1999-06-30 15:43                     ` Jeffrey A Law
  1999-06-30 15:43                   ` Tim Hollebeek
  1 sibling, 1 reply; 218+ messages in thread
From: Jeffrey A Law @ 1999-06-04  8:53 UTC (permalink / raw)
  To: Tim Hollebeek; +Cc: Linus Torvalds, mark, craig, davem, chip, egcs

  In message < 199906041541.LAA27121@wagner.Princeton.EDU >you write:
  > I think it's a damn shame certain people can't be disagreed with without
  > insulting their opponents.
I couldn't agree more.

Folks, if you feel you must take a pot-shot at someone, do so privately.  We've
got better things to do than get into a pissing match.

jeff


^ permalink raw reply	[flat|nested] 218+ messages in thread

* Re: Linux and aliasing?
  1999-06-04  8:39             ` Tim Hollebeek
@ 1999-06-04  8:55               ` Linus Torvalds
  1999-06-04 15:20                 ` Richard Henderson
  1999-06-30 15:43                 ` Linus Torvalds
  1999-06-30 15:43               ` Tim Hollebeek
  1 sibling, 2 replies; 218+ messages in thread
From: Linus Torvalds @ 1999-06-04  8:55 UTC (permalink / raw)
  To: Tim Hollebeek; +Cc: craig, davem, mark, chip, egcs

On Fri, 4 Jun 1999, Tim Hollebeek wrote:
> 
> If you do the same type tricks but use intermediate variables to
> improve readability, you lose.  In fact, simply taking an expression
> and decomposing it into constituent parts can change the behavior of
> code under this rule.  Absolutely horrible.

Uhh. You're right. I considered it, but I didn't find it "absolutely
horrible", I thought it could be considered a feature in that it was only
ever entirely local to =one= memory operation. 

That's not something new per se: the gcc __extension__ thing is kind of
similarly meant to silence things up locally to that expression.

But I can see why you wouldn't like it, and I understand your argument. I
don't know how else you would limit the scope of anything like this, though
(scoping it to something larger than a single dereference sounds like a
horrible rat's nest to me, but opinions can certainly differ).

> Unless you're suggesting data flow analysis to figure out which
> pointers values could have been derived from a casted pointer??? ick,
> ick, ick.

Oh, no, no, no. Shudder. I hope nobody took it that way. Barf.

I meant the features as something to expressly allow a local override.
Think of the rule more as an issue of "poisoning" the dereference operator
rather than poisoning the _pointer_. In a kind of silly "precedence rule"
notation, it would be

	*(char *)y

becomes (*(char *)) y where it is the "*(char *)" thing that makes the
alias go away. (Now somebody is going to flame my ass off for mixing C and
a non-C precedence rule). 
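
To make the two readings concrete, here is a small sketch (the function
names are made up for illustration).  It uses a character type, which
ISO C already allows to alias anything, so both forms are well defined
today.  Under the rule proposed here, for other target types only the
first form, where the cast sits directly under the dereference, would
carry the override; the second would not, which is the decomposition
concern quoted above.

```c
#include <assert.h>
#include <stdint.h>

/* Cast and dereference in one expression: the "*(char *)" form. */
static unsigned char first_byte_direct(const uint32_t *y)
{
    return *(const unsigned char *)y;
}

/* The same access split up with an intermediate variable: the cast
   and the dereference are now separate expressions. */
static unsigned char first_byte_via_temp(const uint32_t *y)
{
    const unsigned char *p = (const unsigned char *)y;
    return *p;
}
```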

And maybe the above is hard to do because by the time you actually would
want to do the above logic the information isn't really there any more. 
That's entirely possible, and if people tell me it's a bad idea for reason
X I'll shut up about it, but I'd try to come up with another one. Deal? 

		Linus

^ permalink raw reply	[flat|nested] 218+ messages in thread

* Re: Linux and aliasing?
  1999-06-04  8:49                 ` craig
@ 1999-06-04  8:57                   ` Linus Torvalds
  1999-06-04  9:02                     ` Jean-Pierre Radley
  1999-06-30 15:43                     ` Linus Torvalds
  1999-06-30 15:43                   ` craig
  1 sibling, 2 replies; 218+ messages in thread
From: Linus Torvalds @ 1999-06-04  8:57 UTC (permalink / raw)
  To: craig; +Cc: davem, mark, chip, egcs

On 4 Jun 1999 craig@jcb-sc.com wrote:
> 
> C is simply a poor language for the task at hand.

Well, I do agree. But at the same time I also disagree, for the obvious
reason that it's still the =best= language for the task at hand. So I can
only hope to make it better for it rather than make it worse.

And I do know that other people have other concerns. Which is why I think a
flexible approach which allows people to express those concerns would be
such a nice thing. 

		Linus

^ permalink raw reply	[flat|nested] 218+ messages in thread

* Re: Linux and aliasing?
  1999-06-04  8:57                   ` Linus Torvalds
@ 1999-06-04  9:02                     ` Jean-Pierre Radley
  1999-06-30 15:43                       ` Jean-Pierre Radley
  1999-06-30 15:43                     ` Linus Torvalds
  1 sibling, 1 reply; 218+ messages in thread
From: Jean-Pierre Radley @ 1999-06-04  9:02 UTC (permalink / raw)
  To: EGCS Developers

Linus Torvalds averred (on Fri, Jun 04, 1999 at 08:56:31AM -0700):
| 
| On 4 Jun 1999 craig@jcb-sc.com wrote:
| > 
| > C is simply a poor language for the task at hand.
| 
| Well, I do agree. But at the same time I also disagree, for the obvious
| reason that it's still the =best= language for the task at hand. So I can
| only hope to make it better for it rather than make it worse.

Which brings to mind Winston Churchill's remark to the effect that
democracy is the worst form of government, except for all the rest.

-- 
Jean-Pierre Radley <jpr@jpr.com>  XC/XT Custodian   Sysop, CompuServe SCOForum

^ permalink raw reply	[flat|nested] 218+ messages in thread

* Re: Linux and aliasing?
  1999-06-04  8:35                       ` Linux and aliasing? Linus Torvalds
@ 1999-06-04 10:04                         ` Joe Buck
  1999-06-04 10:22                           ` Jeffrey A Law
                                             ` (3 more replies)
  1999-06-30 15:43                         ` Linus Torvalds
  1 sibling, 4 replies; 218+ messages in thread
From: Joe Buck @ 1999-06-04 10:04 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: craig, jbuck, mark, davem, chip, egcs

Linus writes, to Craig:

> I haven't maintained a compiler long-term. I _have_ maintained a larger,
> and arguably more complex system with many more degrees of freedom and
> thus choices than gcc.

No, gcc is substantially larger than the Linux kernel (on the order of 10x
larger, as anyone can easily verify).  gcc does not need to deal with race
conditions, so in that sense it is simpler, but in other respects it is
far more complex than the kernel.

[ personal insults deleted, perhaps the author is in need of a vacation ]

> _I_ think my simple extension was perfectly legitimate, and a _lot_ more
> obvious than a lot of things people are discussing on the lists.

Your "simple extension" will have the effect of -fno-strict-aliasing
for any function that does any pointer cast (there may be marginal
differences if there are loops before the first cast).  So why not just use
-fno-strict-aliasing and get the same code?

I appreciate your desire for a better solution, but your suggestion
doesn't cut it.

Perhaps we should make -fno-strict-aliasing the default, if many programs
in common use do this kind of type-punning.  But that's about all
we can do at this stage for gcc-2.95.  

writing to Craig, Linus says:
> So how about it? Instead of just telling everybody that C isn't a portable
> systems language (which it was designed to be, by people I respect a lot
> more than you, despite all your rhetoric about being such a good language
> person), just tell us why you think the simple "explicit cast invalidates
> type information for aliasing" rule is so bad. 

C is a reasonably portable systems language, provided that the users use
it correctly: if you don't obey the standards you are in the land of
unspecified behavior, so your code isn't portable.  I think what Craig is
saying is that it's really not correct to think of it as a portable
assembler, and if you want that level of fine control it may not be
suitable.

> And I realize that people are in a hurry and somewhat stressed to get 2.95
> out the door. I do NOT think that anything like this should be a gating
> issue - that would just be silly. The current egcs works, albeit with a
> too draconian (in my opinion) global flag. Please don't get the feeling
> that I'm doing this just to disrupt some release process. 

OK, maybe we can get somewhere.  It seems to me that there are only two
options for gcc-2.95 on this issue:

Either
1. Leave it as it is (the Linux kernel will need -fno-strict-aliasing).

or

2. Don't enable the new optimization for C unless the user says
   -fstrict-aliasing.

   Since C++ is fussier about type safety, we could make it the default for
   C++ (nice to get those C++ critics who falsely claim C++ is slower than
   C ;-).

Which would you recommend?

^ permalink raw reply	[flat|nested] 218+ messages in thread

* Re: Linux and aliasing?
  1999-06-04 10:04                         ` Joe Buck
@ 1999-06-04 10:22                           ` Jeffrey A Law
  1999-06-04 10:31                             ` Joe Buck
                                               ` (2 more replies)
  1999-06-04 11:49                           ` Linus Torvalds
                                             ` (2 subsequent siblings)
  3 siblings, 3 replies; 218+ messages in thread
From: Jeffrey A Law @ 1999-06-04 10:22 UTC (permalink / raw)
  To: Joe Buck; +Cc: Linus Torvalds, craig, mark, davem, chip, egcs

  > Either
  > 1. Leave it as it is (the Linux kernel will need -fno-strict-aliasing).
This is my strong preference.

I see no need to make conforming, portable code run slower.  Lots of folks have
already fixed these problems in their code (in large part because vendor
compilers started doing this kind of alias analysis years ago).

Folks working with non-portable code can use -fno-strict-aliasing and pay
the resulting performance penalty.

jeff



^ permalink raw reply	[flat|nested] 218+ messages in thread

* Re: Linux and aliasing?
  1999-06-04 10:22                           ` Jeffrey A Law
@ 1999-06-04 10:31                             ` Joe Buck
  1999-06-04 10:53                               ` Jeffrey A Law
                                                 ` (2 more replies)
  1999-06-04 11:11                             ` Toon Moene
  1999-06-30 15:43                             ` Jeffrey A Law
  2 siblings, 3 replies; 218+ messages in thread
From: Joe Buck @ 1999-06-04 10:31 UTC (permalink / raw)
  To: law; +Cc: jbuck, torvalds, craig, mark, davem, chip, egcs

>   > Either
>   > 1. Leave it as it is (the Linux kernel will need -fno-strict-aliasing).
> This is my strong preference.

In that case, then all release announcements and NEWS should prominently
mention the effect of this new optimization and the -fno-strict-aliasing
flag, so that everyone has fair warning.

^ permalink raw reply	[flat|nested] 218+ messages in thread

* Re: Linux and aliasing?
  1999-06-04 10:31                             ` Joe Buck
@ 1999-06-04 10:53                               ` Jeffrey A Law
  1999-06-30 15:43                                 ` Jeffrey A Law
  1999-06-30 15:43                               ` Joe Buck
  1999-07-11 10:55                               ` Jeffrey A Law
  2 siblings, 1 reply; 218+ messages in thread
From: Jeffrey A Law @ 1999-06-04 10:53 UTC (permalink / raw)
  To: Joe Buck; +Cc: torvalds, craig, mark, davem, chip, egcs

  In message < 199906041728.KAA19685@atrus.synopsys.com >you write:
  > >   > Either
  > >   > 1. Leave it as it is (the Linux kernel will need -fno-strict-aliasing).
  > > This is my strong preference.
  > 
  > In that case, then all release announcements and NEWS should prominently
  > mention the effect of this new optimization and the -fno-strict-aliasing
  > flag, so that everyone has fair warning.
Agreed.  

jeff

^ permalink raw reply	[flat|nested] 218+ messages in thread

* Re: Linux and aliasing?
  1999-06-04 10:22                           ` Jeffrey A Law
  1999-06-04 10:31                             ` Joe Buck
@ 1999-06-04 11:11                             ` Toon Moene
  1999-06-04 12:20                               ` Jeffrey A Law
                                                 ` (2 more replies)
  1999-06-30 15:43                             ` Jeffrey A Law
  2 siblings, 3 replies; 218+ messages in thread
From: Toon Moene @ 1999-06-04 11:11 UTC (permalink / raw)
  To: law; +Cc: Joe Buck, Linus Torvalds, craig, mark, davem, chip, egcs

Jeffrey A Law wrote:

> Joe Buck wrote:

>   > Either
>   > 1. Leave it as it is (the Linux kernel will need -fno-strict-aliasing).

> This is my strong preference.
> 
> I see no need to make conforming, portable code run slower.

Exactly.  Remember that a standard is a contract between producer and
(end-)user, in our case:  between compiler writer and C programmer.

	"We won't optimize your constructs away as long as you program 
	 according to said standard"

Giving Linus more freedom in getting his C code to compile to the code
he thinks is right will take freedom away from us, the compiler writers.
Unfortunately, this is not according to contract and won't be upheld in
court.

> Folks working with non-portable code can use -fno-strict-aliasing and pay
> the resulting performance penalty.

Or those Fortran users (like me) who still do not understand how a
strictly C performance enhancement can worsen the code generated for
purely Fortran source, like it is the case for me (I use
-fno-strict-aliasing since the end of February - and no, we Fortran
users do not have a problem with aliasing; as I outlined on comp.arch,
we outlawed it).

Cheers.

[Oh, BTW, it doesn't make sense to call me names - I'm a native of
 Amsterdam; if I cared about *that*, I would have been dead for decades,
 now]

-- 
Toon Moene (toon@moene.indiv.nluug.nl)
Saturnushof 14, 3738 XG  Maartensdijk, The Netherlands
Phone: +31 346 214290; Fax: +31 346 214286
GNU Fortran: http://world.std.com/~burley/g77.html

^ permalink raw reply	[flat|nested] 218+ messages in thread

* Re: Linux and aliasing?
  1999-06-04 10:04                         ` Joe Buck
  1999-06-04 10:22                           ` Jeffrey A Law
@ 1999-06-04 11:49                           ` Linus Torvalds
  1999-06-04 13:03                             ` Gabriel Dos_Reis
  1999-06-30 15:43                             ` Linus Torvalds
  1999-06-04 12:59                           ` Alexandre Oliva
  1999-06-30 15:43                           ` Joe Buck
  3 siblings, 2 replies; 218+ messages in thread
From: Linus Torvalds @ 1999-06-04 11:49 UTC (permalink / raw)
  To: Joe Buck; +Cc: craig, mark, davem, chip, egcs

Ok. Only real technical details. Please shoot it down on technical issues.

On Fri, 4 Jun 1999, Joe Buck wrote:
> 
> > _I_ think my simple extension was perfectly legitimate, and a _lot_ more
> > obvious than a lot of things people are discussing on the lists.
> 
> Your "simple extension" will have the effect of -fno-strict-aliasing
> for any function that does any pointer cast (there may be marginal
> differences if there are loops before the first cast).  So why not just use
> -fno-strict-aliasing and get the same code?

Well, the biggest advantage I see for having the alias type checking is
that it allows re-ordering of operations _everywhere_, even if there is no
other a priori reason to allow it. A function that does a pointer cast
would _not_ be affected globally even under my scheme. It would just mean
that that particular access that is affected would be marked as being in
alias set zero. 

Hmm.. I really expected this to be simple, but judging by the amount of
traffic it has generated it is obvious that I went about this the wrong
way. Sorry. I really didn't mean to start a flame war, and let me back up
a bit. 

To me, the fact that the scheme should be easy to implement is actually
really important. I'm not a gcc hacker, and it's been several years since
I actually submitted patches to gcc (and even then they weren't always
accepted, although I can claim credit for some of the alpha fixes). But I
really tried to come up with something that wasn't just easy for the users
but that I thought would be easy to do inside gcc. 

So, with that background: if I _am_ wrong (entirely possible), and it's
a rat's nest to implement in gcc, then it very obviously _should_ be
discarded out-of-hand. It really wasn't a case
of me just trying to make life harder for people. Let me explain what my
concept was on a more technical level, and then people can shoot holes in
it on a technical level and maybe we can avoid too much more flamage.

Sorry. I kind of took the technical part for granted. 

So on a technical level, let me explain it the way I thought gcc might
implement this rather than explaining the end result as I initially went
about.  Maybe people would understand (and accept) what my idea was better
this way. 

What I would actually do if I knew gcc better is do something like this: 

 - add a new type attribute bit. There are plenty of these already, so
   this isn't a big issue. I didn't worry about naming, because although
   it _could_ probably be used in typedefs directly, it probably never
   would be. But who knows? Somebody might have a good reason to make some
   type always have the "this can alias anything" behaviour, and it could
   in fact be used for the C "char *" type, so that the ANSI special "char
   *" rules would be just a subset of this.

   Let's call the attribute "noalias" just for obvious reasons.

 - the attribute bit magically gets set by any explicit cast when a gcc
   option like "-fcast-invalidates-alias" is given (this obviously
   _would_ be controlled by such an option, so people who don't like the
   extended semantics wouldn't be affected). This can be done at parse
   time, it looks pretty trivial to me. I may be wrong.

   Think of this part as another simple rule: a typecast always implies
   the "noalias" attribute if the global flag is set. Nothing else would
   imply that attribute.

 - the attribute percolates down normal pointer arithmetic, but NOTHING
   else. It doesn't inherit across an assignment (although with a named
   attribute the assigned variable might have that attribute natively).
   You already have the notion of attribute inheritance, this is nothing
   new. In fact, I think the inheritance rules are basically the same as
   for the "volatile" attribute, but I haven't really verified that.

 - an alias set query will always return zero for an expression with that
   attribute set.

And that's it. The above doesn't really explain what I'm trying to
_achieve_, it only explains the way I thought those goals would be
achieved. 

So for example, just to make the suggestion more "tangible" to the people
who actually think in terms of gcc code, look at

	 int
	get_alias_set (t)
	     tree t;

in tree.c, and mentally imagine adding a simple condition that just says
something like

   if (!flag_strict_aliasing || !lang_get_alias_set)
     /* If we're not doing any language-specific alias analysis, just
        assume everything aliases everything else.  */
     return 0;
+  else if (lookup_attribute("noalias", t->attribute))
+    return 0;
   else
     return (*lang_get_alias_set) (t);

and that's the only real place where it is tested. 
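
The rules above can be modelled in a few lines of stand-alone C.  This
is a toy sketch, not gcc code: `struct toy_type', `toy_get_alias_set'
and `toy_may_alias' are hypothetical names, and the conflict test is
simplified to "set zero conflicts with everything, otherwise the sets
must be equal" (it ignores any subset relations between alias sets).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a compiler type node; not gcc's tree. */
struct toy_type {
    int  lang_alias_set;  /* set the front end would assign (> 0) */
    bool noalias;         /* proposed attribute, set by an explicit cast */
};

/* Mirrors the shape of the patch: alias set zero when strict aliasing
   is off or the attribute is present, otherwise the front end's set. */
static int toy_get_alias_set(const struct toy_type *t,
                             bool flag_strict_aliasing)
{
    if (!flag_strict_aliasing)
        return 0;
    if (t->noalias)
        return 0;
    return t->lang_alias_set;
}

/* Simplified conflict test: set zero may alias anything. */
static bool toy_may_alias(int set1, int set2)
{
    return set1 == 0 || set2 == 0 || set1 == set2;
}
```

In this model an access to an `int' and an access to a `float' do not
conflict under strict aliasing, but they do as soon as one of them
carries the attribute, and only that one access is affected.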

The attribute is set when parsing the type casting, and in my "clean up
'char *' semantics"  extension it would also always be set for any "char
*" type. In that case the special casing of "char *" can go away, so you'd
actually _remove_ the code in c-common.c that says

      else if (signed_variant == signed_char_type_node)
        /* The C standard guarantees that any object may be accessed
           via an lvalue that has character type.  We don't have to
           check for unsigned_char_type_node or char_type_node because
           we are specifically looking at the signed variant.  */
        TYPE_ALIAS_SET (type) = 0;

but that's a detail that I just show to point out the ramifications of the
_idea_ rather than advocating as something that should necessarily be
done. But it would conceptually put the decision in one place, which is
nice. 

(To me, when I judge people's ideas about kernel changes, a personally
important criterion is always "does it conceptually solve _multiple_
problems?", and an idea that can be used to solve another thing is
something that I consider more interesting and consider to be more
"flexible". I don't know if the egcs people use that same strategy, but I
wanted to point it out in case others have similar decision making methods
to the ones I use). 

> OK, maybe we can get somewhere.  It seems to me that there are only two
> options for gcc-2.95 on this issue:
> 
> Either
> 1. Leave it as it is (the Linux kernel will need -fno-strict-aliasing).
> 
> or
> 
> 2. Don't enable the new optimization for C unless the user says
>    -fstrict-aliasing.
> 
>    Since C++ is fussier about type safety, we could make it the default for
>    C++ (nice to get those C++ critics who falsely claim C++ is slower than
>    C ;-).
> 
> Which would you recommend?

If I were making the same decision, I would just make the new feature the
default, to make sure it got tested. I don't really disagree about that - it
_does_ rub people's faces in the issue, and it _will_ make others complain too
and you'll end up explaining to a lot of people what the aliasing issues
really are, but it still makes sense to enable it by default to get better
coverage on it. 

Have an open mind, and be ready to decide that if there turns out to be a
lot of people that get bitten by it you should just turn it off by default
(and for example the eventual decision might be to only turn it on at
optimization level 3 instead of 2, as -O2 tends to be a fairly common
optimization level because it does so much else on gcc too..). 

I really only want to make sure that _eventually_ I can take advantage of
type-aliasing, while at the same time having a convenient back door for
when the kernel does the ugly things.. 

Do people see any obvious problems in the technical idea above? It looks
very maintainable and clean to me, but I'll readily admit that I only look
at the problem from ten thousand feet when it comes to the compiler side.
Maybe it is just completely unusable for some reason I missed.. 

		Linus

^ permalink raw reply	[flat|nested] 218+ messages in thread

* Re: Linux and aliasing?
  1999-06-04 11:11                             ` Toon Moene
@ 1999-06-04 12:20                               ` Jeffrey A Law
  1999-06-05  5:45                                 ` Toon Moene
  1999-06-30 15:43                                 ` Jeffrey A Law
  1999-06-05  4:05                               ` Andi Kleen
  1999-06-30 15:43                               ` Toon Moene
  2 siblings, 2 replies; 218+ messages in thread
From: Jeffrey A Law @ 1999-06-04 12:20 UTC (permalink / raw)
  To: Toon Moene; +Cc: Joe Buck, Linus Torvalds, craig, mark, davem, chip, egcs

  In message < 375814C8.85CA17C9@moene.indiv.nluug.nl >you write:
  > Or those Fortran users (like me) who still do not understand how a
  > strictly C performance enhancement can worsen the code generated for
  > purely Fortran source, like it is the case for me (I use
  > -fno-strict-aliasing since the end of February - and no, we Fortran
  > users do not have a problem with aliasing; as I outlined on comp.arch,
  > we outlawed it).
I thought this was tracked down to the inability to re-share those auto
arrays on the stack.  I also thought we had turned off strict aliasing
for Fortran for precisely this reason.  Did I misunderstand the end result
of that discussion?

jeff

  > Cheers.
  > 
  > [Oh, BTW, it doesn't make sense to call me names - I'm a native of
  >  Amsterdam; if I cared about *that*, I would have been dead for decades,
  >  now]
  > 
  > -- 
  > Toon Moene (toon@moene.indiv.nluug.nl)
  > Saturnushof 14, 3738 XG  Maartensdijk, The Netherlands
  > Phone: +31 346 214290; Fax: +31 346 214286
  > GNU Fortran: http://world.std.com/~burley/g77.html
  > 


^ permalink raw reply	[flat|nested] 218+ messages in thread

* Re: Linux and aliasing?
  1999-06-04 10:04                         ` Joe Buck
  1999-06-04 10:22                           ` Jeffrey A Law
  1999-06-04 11:49                           ` Linus Torvalds
@ 1999-06-04 12:59                           ` Alexandre Oliva
  1999-06-04 13:29                             ` Joe Buck
  1999-06-30 15:43                             ` Alexandre Oliva
  1999-06-30 15:43                           ` Joe Buck
  3 siblings, 2 replies; 218+ messages in thread
From: Alexandre Oliva @ 1999-06-04 12:59 UTC (permalink / raw)
  To: Joe Buck; +Cc: Linus Torvalds, craig, mark, davem, chip, egcs

On Jun  4, 1999, Joe Buck <jbuck@Synopsys.COM> wrote:

> Linus writes, to Craig:

>> _I_ think my simple extension was perfectly legitimate, and a _lot_ more
>> obvious than a lot of things people are discussing on the lists.

> Your "simple extension" will have the effect of -fno-strict-aliasing
> for any function that does any pointer cast (there may be marginal
> differences if there are loops before the first cast).  So why not
> just use -fno-strict-aliasing and get the same code?

> I appreciate your desire for a better solution, but your suggestion
> doesn't cut it.

AFAICT, in a cast to `(some_type_t *volatile)', the `volatile' doesn't
have any actual effect on the generated code, because the pointer has
already been evaluated.  Couldn't we implement an extension by which
this `volatile' would kind of have the opposite meaning of `restrict'?
It would mean that the resulting pointer may be aliased to anything
else, so the compiler shouldn't move it around nor optimize it ``too
much''.  It seems to me that `volatile' is the right word to mean it,
especially because it would be ignored by compilers that don't support
this extension.

-- 
Alexandre Oliva http://www.dcc.unicamp.br/~oliva IC-Unicamp, Bra[sz]il
{oliva,Alexandre.Oliva}@dcc.unicamp.br  aoliva@{acm.org,computer.org}
oliva@{gnu.org,kaffe.org,{egcs,sourceware}.cygnus.com,samba.org}
*** E-mail about software projects will be forwarded to mailing lists

^ permalink raw reply	[flat|nested] 218+ messages in thread

* Re: Linux and aliasing?
  1999-06-04 11:49                           ` Linus Torvalds
@ 1999-06-04 13:03                             ` Gabriel Dos_Reis
  1999-06-04 13:13                               ` Joe Buck
  1999-06-30 15:43                               ` Gabriel Dos_Reis
  1999-06-30 15:43                             ` Linus Torvalds
  1 sibling, 2 replies; 218+ messages in thread
From: Gabriel Dos_Reis @ 1999-06-04 13:03 UTC (permalink / raw)
  To: egcs

Linus Torvalds <torvalds@transmeta.com> writes:

[...]

| So on a technical level, let me explain it the way I thought gcc might
| implement this rather than explaining the end result as I initially went
| about.

Do you have a complete patch?

-- Gaby

^ permalink raw reply	[flat|nested] 218+ messages in thread

* Re: Linux and aliasing?
  1999-06-04 13:03                             ` Gabriel Dos_Reis
@ 1999-06-04 13:13                               ` Joe Buck
  1999-06-30 15:43                                 ` Joe Buck
  1999-06-30 15:43                               ` Gabriel Dos_Reis
  1 sibling, 1 reply; 218+ messages in thread
From: Joe Buck @ 1999-06-04 13:13 UTC (permalink / raw)
  To: Gabriel Dos_Reis; +Cc: egcs

> Linus Torvalds <torvalds@transmeta.com> writes:
> | So on a technical level, let me explain it the way I thought gcc might
> | implement this rather than explaining the end result as I initially went
> | about.

Gaby writes:
> Do you have a complete patch?

For something like this, it's not best to take a rigid "complete patch or
go away" stance (tempting as it is).  Linus has provided enough of a
skeleton at this point where it's possible to discuss whether the approach
is feasible (though the folks expert on that part of the compiler may not
have the cycles to discuss it in detail just now).

By the way, I said something stupid earlier in this discussion: while
Linux was once an order of magnitude smaller than gcc back in the 1.2
days, now, thanks to tons of device drivers it's about the same size.
Since I never download the whole thing (just patches, and I'm still
running 2.0.3x for some x), I hadn't really noticed this.

So clearly I was wrong to say that gcc is much bigger than Linux.
Sorry, Linus.




^ permalink raw reply	[flat|nested] 218+ messages in thread

* Re: Linux and aliasing?
  1999-06-04 12:59                           ` Alexandre Oliva
@ 1999-06-04 13:29                             ` Joe Buck
  1999-06-04 13:39                               ` Alexandre Oliva
  1999-06-30 15:43                               ` Joe Buck
  1999-06-30 15:43                             ` Alexandre Oliva
  1 sibling, 2 replies; 218+ messages in thread
From: Joe Buck @ 1999-06-04 13:29 UTC (permalink / raw)
  To: Alexandre Oliva; +Cc: jbuck, torvalds, craig, mark, davem, chip, egcs

> AFAICT, in a cast to `(some_type_t *volatile)', the `volatile' doesn't
> have any actual effect on the generated code, because the pointer has
> already been evaluated.  Couldn't we implement an extension by which
> this `volatile' would kind of have the opposite meaning of `restrict'?
> It would mean that the resulting pointer may be aliased to anything
> else, so the compiler shouldn't move it around nor optimize it ``too
> much''.

It's not a matter of not moving it around or optimizing the pointer
itself.  Rather, after someone writes through one of these anti-restrict
pointers, the compiler has to assume the worst, and re-read everything
from memory that it might have cached away in registers, because we're
saying that the pointer could point to *anything*.  Even a read through
such a pointer can impede optimization, as we'd have to flush everything
out to memory before the read, since the pointer might be reading any
object (the old value of which might be in a register).  This kills
performance of loops over arrays.

For this reason, the effect of proposals like this might be similar in
performance to just saying, if a function contains a cast of a pointer
that is later dereferenced, apply -fno-strict-aliasing to the entire
function, or at least a significant chunk of it.
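
A minimal sketch of the cost being described, using `unsigned char *' as a
stand-in for a pointer that may alias anything.  The function names are
made up for illustration, and the comments describe what a compiler is
permitted to assume, not what any particular GCC release actually does:

```c
#include <stddef.h>

/* Under type-based aliasing, a store through `float *out` cannot modify
   the int that `n` points to, so the loop bound may be kept in a register. */
void scale_typed(float *out, const int *n, float k)
{
    for (int i = 0; i < *n; i++)
        out[i] *= k;
}

/* A store through `unsigned char *` is allowed to alias any object,
   including *n, so the bound has to be treated as possibly changed
   after every store unless the compiler can prove otherwise. */
void fill_bytes(unsigned char *out, const int *n, unsigned char v)
{
    for (int i = 0; i < *n; i++)
        out[i] = v;
}
```

Both functions compute the obvious result; the difference is only in how
much the optimizer is allowed to assume about `*n' inside the loop.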

^ permalink raw reply	[flat|nested] 218+ messages in thread

* Re: Linux and aliasing?
  1999-06-04 13:29                             ` Joe Buck
@ 1999-06-04 13:39                               ` Alexandre Oliva
  1999-06-30 15:43                                 ` Alexandre Oliva
  1999-06-30 15:43                               ` Joe Buck
  1 sibling, 1 reply; 218+ messages in thread
From: Alexandre Oliva @ 1999-06-04 13:39 UTC (permalink / raw)
  To: Joe Buck; +Cc: torvalds, craig, mark, davem, chip, egcs

I had written:

>> It would mean that the resulting pointer may be aliased to anything
>> else, so the compiler shouldn't move it around nor optimize it ``too
>> much''.

On Jun  4, 1999, Joe Buck <jbuck@Synopsys.COM> wrote:

> For this reason, the effect of proposals like this might be similar
> in performance to just saying, if a function contains a cast of a
> pointer that is later dereferenced, apply -fno-strict-aliasing to
> the entire function, or at least a significant chunk of it.

The main difference is that the notation I propose could be used in
macros.  It is true that it could have a global effect on the
optimization of any function that uses it, but the requirement
statement would be closely linked to its use, which is good, so that
you wouldn't have to maintain special Makefile rules because such and
such files haven't been `union'ized (yet?) to make them
ANSI-aliasing-safe.

-- 
Alexandre Oliva http://www.dcc.unicamp.br/~oliva IC-Unicamp, Bra[sz]il
{oliva,Alexandre.Oliva}@dcc.unicamp.br  aoliva@{acm.org,computer.org}
oliva@{gnu.org,kaffe.org,{egcs,sourceware}.cygnus.com,samba.org}
*** E-mail about software projects will be forwarded to mailing lists

^ permalink raw reply	[flat|nested] 218+ messages in thread

* Re: Linux and aliasing?
  1999-06-03 23:03           ` Linus Torvalds
                               ` (2 preceding siblings ...)
  1999-06-04  8:39             ` Tim Hollebeek
@ 1999-06-04 15:02             ` Richard Henderson
  1999-06-04 16:50               ` Bernd Schmidt
                                 ` (2 more replies)
  1999-06-30 15:43             ` Linus Torvalds
  4 siblings, 3 replies; 218+ messages in thread
From: Richard Henderson @ 1999-06-04 15:02 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: craig, davem, mark, chip, egcs

This thread is huge, and there is obviously a bit of bile swilling about,
so I probably won't read it all.  However, I will point out one thing --

On Thu, Jun 03, 1999 at 11:02:35PM -0700, Linus Torvalds wrote:
> The extremely straightforward rule that at least I would advocate is _so_
> straightforward as to be almost scary:
>  - if there is a pointer cast, that pointer cast invalidates all
>    type-based alias information.

Doing what you want is actually very hard for GCC right now.  Consider

	int i;
	short s, *ps = (short *)&i;
	i = 0;
	s = *ps;

Due to a long-ago quirk of history, GCC processes the abstract syntax
tree one statement at a time, so the fact of the cast is long gone by
the time we do the dereference.  Mark got around this problem by
annotating the memories as we create them, which is good enough to pass
legal muster, but not good enough for what you want.

To do what you want, we'd have to annotate pointers instead of memories
and then do global data flow analysis to find out what addresses have
been "infected" by the cast.  Doing anything on a local scale wouldn't
be good enough, I don't think, to handle code coming in from inlines.

Now, we do want to do some of this, since if you can do global data
flow analysis, you can propagate points-to data that gets you even
better alias info than what we have now.  We'd just fall back on type
information for lack of interprocedural alias info. 

But something like that is a long way off.


r~

^ permalink raw reply	[flat|nested] 218+ messages in thread

* Re: Linux and aliasing?
  1999-06-04  8:55               ` Linus Torvalds
@ 1999-06-04 15:20                 ` Richard Henderson
  1999-06-05  9:50                   ` Linus Torvalds
  1999-06-30 15:43                   ` Richard Henderson
  1999-06-30 15:43                 ` Linus Torvalds
  1 sibling, 2 replies; 218+ messages in thread
From: Richard Henderson @ 1999-06-04 15:20 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Tim Hollebeek, craig, davem, mark, chip, egcs

On Fri, Jun 04, 1999 at 08:53:47AM -0700, Linus Torvalds wrote:
> > Unless you're suggesting data flow analysis to figure out which
> > pointers values could have been derived from a casted pointer??? ick,
> > ick, ick.
> 
> Oh, no, no, no. Shudder. I hope nobody took it that way. Barf.

I did.  I'm wary that anything less wouldn't be good enough.

> I meant the features as something to expressly allow a local override.
> Think of the rule more as an issue of "poisoning" the dereference operator
> rather than poisoning the _pointer_. In a kind of silly "precedence rule"
> notation, it would be
> 
> 	*(char *)y
> 
> becomes (*(char *)) y where it is the "*(char *)" thing that makes the
> alias go away. (Now somebody is going to flame my ass off for mixing C and
> a non-C precedence rule). 

This doesn't handle

	extern inline int foo(short *ptr)
	{
		return *ptr;
	}

	int bar(void)
	{
		int i;
		i = 0;
		return foo((short *)&i);
	}

Which isn't unlike some uses of inlines in the kernel.
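
For comparison, a sketch of how a helper like this could be written so
that the access stays inside the standard's aliasing rules: copy the bytes
with memcpy instead of dereferencing a cast pointer.  `load_short' is a
hypothetical name, and this is an illustration only, not a description of
what the kernel does:

```c
#include <string.h>

/* Read the first sizeof(short) bytes of any object as a short.
   memcpy copies bytes, so no typed lvalue access to the source
   object takes place. */
static inline short load_short(const void *ptr)
{
    short s;
    memcpy(&s, ptr, sizeof s);
    return s;
}

int bar(void)
{
    int i = 0;
    return load_short(&i);   /* no pointer cast needed at the call site */
}
```

Which bytes of `i' are read still depends on the byte order of the
target; only the aliasing problem goes away.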


r~

^ permalink raw reply	[flat|nested] 218+ messages in thread

* Re: Linux and aliasing?
  1999-06-04 15:02             ` Richard Henderson
@ 1999-06-04 16:50               ` Bernd Schmidt
  1999-06-30 15:43                 ` Bernd Schmidt
  1999-06-05  9:35               ` Linus Torvalds
  1999-06-30 15:43               ` Richard Henderson
  2 siblings, 1 reply; 218+ messages in thread
From: Bernd Schmidt @ 1999-06-04 16:50 UTC (permalink / raw)
  To: Richard Henderson; +Cc: Linus Torvalds, craig, davem, mark, chip, egcs

On Fri, 4 Jun 1999, Richard Henderson wrote:
> On Thu, Jun 03, 1999 at 11:02:35PM -0700, Linus Torvalds wrote:
> > The extremely straightforward rule that at least I would advocate is _so_
> > straightforward as to be almost scary:
> >  - if there is a pointer cast, that pointer cast invalidates all
> >    type-based alias information.
> 
> Doing what you want is actually very hard for GCC right now.  Consider
> 
> 	int i;
> 	short s, *ps = (short *)&i;
> 	i = 0;
> 	s = *ps;
> 
> Due to a long-ago quirk of history, GCC processes the abstract syntax
> tree one statement at a time, so the fact of the cast is long gone by
> the time we do the dereference.  Mark got around this problem by
> annotating the memories as we create them, which is good enough to pass
> legal muster, but not good enough for what you want.
> 
> To do what you want, we'd have to annotate pointers instead of memories
> and then do global data flow analysis to find out what addresses have
> been "infected" by the cast.  Doing anything on a local scale wouldn't
> be good enough, I don't think, to handle code coming in from inlines.

Instead of trying to re-create this information with flow analysis,
couldn't we solve this by adding syntax to create a special "aliased" pointer
type, rather than just using a cast to a regular pointer type?

The cast would then read e.g. "(short * __attribute__ ((aliased)))&i"; and we
could declare the variable "ps" to be of the same type.  Each time a pointer
is dereferenced, we know its type, so we can tell whether it has the "aliased"
attribute.  If it does, we need to avoid setting alias information for the
memory reference we create.
For assignments (or casts) between pointers there could be a warning when the
user tries to convert an "aliased" pointer back to a normal one.
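
As an illustration of the idea, current GCC documents a `may_alias' type
attribute with much the same purpose.  The sketch below uses that
attribute, not the "aliased" spelling suggested above, and assumes a
GCC-compatible compiler; `u16_alias' and `first_half' are made-up names:

```c
#include <stdint.h>

/* A 16-bit type whose accesses are exempt from type-based alias
   analysis (GCC extension; treated like a character-type access). */
typedef uint16_t __attribute__((may_alias)) u16_alias;

/* Read the first 16 bits of a 32-bit object through the may_alias type.
   The property travels with the pointer's type, so it is known at the
   point of the dereference without any flow analysis. */
uint16_t first_half(const uint32_t *word)
{
    const u16_alias *p = (const u16_alias *)word;
    return *p;
}
```

Which half of the word is returned depends on the target's byte order.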

Bernd

^ permalink raw reply	[flat|nested] 218+ messages in thread

* Re: Linux and aliasing?
  1999-06-04 11:11                             ` Toon Moene
  1999-06-04 12:20                               ` Jeffrey A Law
@ 1999-06-05  4:05                               ` Andi Kleen
  1999-06-30 15:43                                 ` Andi Kleen
  1999-06-30 15:43                               ` Toon Moene
  2 siblings, 1 reply; 218+ messages in thread
From: Andi Kleen @ 1999-06-05  4:05 UTC (permalink / raw)
  To: Toon Moene; +Cc: egcs, torvalds, law

toon@moene.indiv.nluug.nl (Toon Moene) writes:

> Jeffrey A Law wrote:
> 
> > Mark Mitchell wrote:
> 
> >   > Either
> >   > 1. Leave it as it is (the Linux kernel will need -fno-strict-aliasing).
> 
> > This is my strong preference.
> > 
> > I see no need to make conforming, portable code run slower.
> 
> Exactly.  Remember that a standard is a contract between producer and
> (end-)user, in our case:  between compiler writer and C programmer.
> 
> 	"We won't optimize your constructs away as long as you program 
> 	 according to said standard"


Erm, there seem to be some misunderstandings about the C standard in
this discussion.

My C9x draft says: 

6.2.6.1
 [#5] Certain object representations  need  not  represent  a
       value  of the object type.  If the stored value of an object
       has such a representation  and  is  accessed  by  an  lvalue
       expression  that  does not have character type, the behavior
       is undefined.  If such a representation  is  produced  by  a
       side  effect  that modifies all or any part of the object by
       an lvalue expression that does not have character type,  the
       behavior is undefined.37)  Such a representation is called a
       trap representation.


Now it says undefined behaviour is:

       3.18
       [#1] undefined behavior
       behavior, upon use of a  nonportable  or  erroneous  program
       construct,  of  erroneous data, or of indeterminately valued
       objects, for which this International  Standard  imposes  no
       requirements

So it imposes no requirements on what to do when it happens. This means gcc
is free to do what it wants. This includes unreasonable things, or reasonable
things. I think turning alias analysis off in this case is reasonable, and
of course fully standards compliant. Also the argument "that will slow
down legal programs" is nonsense, because there are no strictly conforming
programs which can do this. 


-Andi

P.S.: Toon, this is not Fortran ;)

-- 
This is like TV. I don't like TV.

^ permalink raw reply	[flat|nested] 218+ messages in thread

* Re: Linux and aliasing?
  1999-06-04 12:20                               ` Jeffrey A Law
@ 1999-06-05  5:45                                 ` Toon Moene
  1999-06-05  6:23                                   ` Andi Kleen
                                                     ` (3 more replies)
  1999-06-30 15:43                                 ` Jeffrey A Law
  1 sibling, 4 replies; 218+ messages in thread
From: Toon Moene @ 1999-06-05  5:45 UTC (permalink / raw)
  To: law; +Cc: Joe Buck, Linus Torvalds, craig, mark, davem, chip, egcs, Andi Kleen

Jeffrey A Law wrote:

>   I wrote:

>   > Or those Fortran users (like me) who still do not understand how a
>   > strictly C performance enhancement can worsen the code generated for
>   > purely Fortran source, like it is the case for me (I use
>   > -fno-strict-aliasing since the end of February - and no, we Fortran
>   > users do not have a problem with aliasing; as I outlined on comp.arch,
>   > we outlawed it).

> I though this was tracked down to the inability to re-share those auto
> arrays on the stack.  I also thought we had turned off strict aliasing
> for Fortran for precisely this reason.  Did I misunderstand the end result
> of that discussion?

This was suggested, but I replied that I didn't believe that to be the
reason.  Note that Fortran basically only has one "scope" for automatic
variables (whether arrays or scalars):  The complete subprogram (i.e.
subroutine or function).

That means that in the scope Mark's alias analysis works in, automatic
arrays are created precisely once (at the beginning of that scope) and
destroyed exactly once (at the end of said scope); hence, there is no
opportunity to re-use stack slots.

Strict aliasing isn't turned off, yet (quoting f/com.c):

  /* Set default options for Fortran.  */
  flag_move_all_movables = 1;
  flag_reduce_all_givs = 1;
  flag_argument_noalias = 2;
  flag_errno_math = 0;
  flag_complex_divide_method = 1;

I also feel uneasy about just turning it off - I prefer to first *know*
why it generates worse code.

Andi Kleen wrote:

> I wrote:

>> Exactly.  Remember that a standard is a contract between producer and
>> (end-)user, in our case:  between compiler writer and C programmer.
>> 
>>       "We won't optimize your constructs away as long as you program 
>>        according to said standard"

> Erm, there seem to be some misunderstandings about the C standard in
> this discussion.

Yep, that's what you get when you want to summarize standardese in
one-liners.  Note that I later wrote about freedom for the compiler
writer vs. freedom for the programmer (I think that better catches the
spirit of the Standard).

> So it imposes no requirements on what to do when it happen. This means 
> gcc is free to do what it wants. This includes unreasonable things, or 
> reasonable things. I think turning alias analysis off in this case is 
> reasonable, and of course fully standards compliant.

Ah, yes, but the discussion is whether we should have gcc generate
"reasonable" behaviour where "reasonable" is defined by a small group of
users.  Note that all "behaviours" not explicitly required by the
Standard are prone to:

1. Erosion (within a decade, gcc maintainers forget why we did this in
   the first place: "Hey, look at this code - what hair - and it is
   undefined behaviour according to the Standard in the first place;
   rip it out")

2. Contradiction (the C0X Standard defines the previously undefined
   behaviour, but in a way incompatible with the "reasonable" behaviour
   we thought up here).

Cheers,

[In 24 hours I'm off for my first X3J3 meeting - it shows, doesn't it?]

-- 
Toon Moene (toon@moene.indiv.nluug.nl)
Saturnushof 14, 3738 XG  Maartensdijk, The Netherlands
Phone: +31 346 214290; Fax: +31 346 214286
GNU Fortran: http://world.std.com/~burley/g77.html

^ permalink raw reply	[flat|nested] 218+ messages in thread

* Re: Linux and aliasing?
  1999-06-05  5:45                                 ` Toon Moene
@ 1999-06-05  6:23                                   ` Andi Kleen
  1999-06-05 10:32                                     ` Toon Moene
                                                       ` (2 more replies)
  1999-06-06 23:12                                   ` f77 vs type based alias analysis Jeffrey A Law
                                                     ` (2 subsequent siblings)
  3 siblings, 3 replies; 218+ messages in thread
From: Andi Kleen @ 1999-06-05  6:23 UTC (permalink / raw)
  To: Toon Moene
  Cc: law, Joe Buck, Linus Torvalds, craig, mark, davem, chip, egcs,
	Andi Kleen, mark

On Sat, Jun 05, 1999 at 02:37:20PM +0200, Toon Moene wrote:
> Ah, yes, but the discussion is whether we should have gcc generate
> "reasonable" behaviour where "reasonable" is defined by a small group of
> users.  Note that all "behaviours" not explicitly required by the
> Standard are prone to:

Generating faulty code is in my book always unreasonable, even when
the source is not strictly conforming (and the compiler has a realistic
chance to detect it).  

The argument that it may inhibit some optimizations for strictly conforming 
programs I also cannot follow. As I understand it there are basically two 
cases:


1. One casts a pointer to an object to some other non-char pointer and doesn't
access it. This, although strictly conforming, is rather useless and should
be optimized away anyway. This case is not interesting.

2. One casts a pointer to an object to some other non-char pointer, and
uses the new pointer to access the object. The standard says that is undefined.
Strictly conforming programs cannot do that. Currently gcc generates code
for it that most likely will result in a bug in the program. The casting
proposal turns the "wrong code" interpretation of undefined into something
that has a good chance to make a lot of old programs work again. 

Because case (1) is not interesting (it is a noop) I don't think worrying
about missing optimizations in noops is a good use of one's time.

Now of course I agree that it is a good idea to convert the code in the
long run to be strict-aliasing safe, simply to give the optimizer more
information. For some projects like Linux it is, however, a long and
difficult road. I think the best compromise would be to turn -fstrict-aliasing
off by default (as has already been proposed) and to offer a new
-flose-aliasing switch that turns the "turn off alias analysis for casts"
off.

I'm playing a bit with a patch that implements just that and works in a
way similar to what Linus outlined. I am not sure whether it is worth trying
to detect case (1) (casting, but the result is not directly accessed), or
whether to simply set the alias set to 0 for a pointer cast. I think it is not.

Mark, even if you don't like it, would you as alias-expert-in-residence
think that the basic strategy is workable?



> 
> 1. Erosion (within a decade, gcc maintainers forget why we did this in
>    the first place: "Hey, look at this code - what hair - and it is
>    undefined behaviour according to the Standard in the first place;
>    rip it out")

If it is clearly documented, that will not happen.


> 
> 2. Contradiction (the C0X Standard defines the previously undefined
>    behaviour, but in a way incompatible with the "reasonable" behaviour
>    we thought up here).

I don't think that such a vague possibility should guide a gcc design 
decision ("in 30 years an asteroid may crash onto earth and ruin your 
whole day - don't implement it because the exception handlers don't handle 
that event").  Also, there is no hint in the future directions that this may
happen. In any case it wouldn't strike me as a strong enough argument
to suppress a useful feature.


> 
> Cheers,
> 
> [In 24 hours I'm off for my first X3J3 meeting - it shows, doesn't it?]

Definitely.


-Andi

-- 
This is like TV. I don't like TV.

^ permalink raw reply	[flat|nested] 218+ messages in thread

* Re: Linux and aliasing?
  1999-06-04 15:02             ` Richard Henderson
  1999-06-04 16:50               ` Bernd Schmidt
@ 1999-06-05  9:35               ` Linus Torvalds
  1999-06-05 13:34                 ` Richard Henderson
  1999-06-30 15:43                 ` Linus Torvalds
  1999-06-30 15:43               ` Richard Henderson
  2 siblings, 2 replies; 218+ messages in thread
From: Linus Torvalds @ 1999-06-05  9:35 UTC (permalink / raw)
  To: Richard Henderson; +Cc: craig, davem, mark, chip, egcs

On Fri, 4 Jun 1999, Richard Henderson wrote:
> 
> Doing what you want is actually very hard for GCC right now.  Consider
> 
> 	int i;
> 	short s, *ps = (short *)&i;
> 	i = 0;
> 	s = *ps;

Note that while the kernel may contain constructs like the above, I never
meant for the "extended rule" to cover them. We'd have to fix them up.

The "pointer cast rule" was meant to allow people to know about and
override the type-based aliasing - it wasn't meant to handle every single
pointer cast automatically being non-aliased. I would consider such
behaviour to be basically (a) unimplementable and (b) too non-local.

I obviously didn't explain that very well, although I hope my later email
about the _implementation_ side explained the details more clearly.

The concept was never meant to avoid alias information on any global
scale. I think type-based alias information is important. It was meant to
be a syntactically simple way to override =specific= instances where the
programmer knows he is playing games with typing.

As an example, the above sequence obviously has an alias problem as it
stands now. My suggestion would _not_ make the above code generate
anything different at all. The only thing my suggestion really does is
give the programmer a chance to say "oh, I see: the above worked in the
original ANSI C, but it does not work with the new one, and I only care
about gcc anyway, so I can do the quick fix by just adding the cast":

	s = *(short *)ps;

Note that the cast above in C terms is a no-op: it casts a short pointer
to a short pointer, but it would be a way to tell gcc that this access
should not be aliased.

> Due to a long-ago quirk of history, GCC processes the abstract syntax
> tree one statement at a time, so the fact of the cast is long gone by
> the time we do the dereference.

I agree 100% with the concern you raise, and I'd just like to say that
that was never the intention. Having some kind of complete flow would
obviously be a very broken concept, and I fully understand the horror
people felt if they thought that was what I proposed.

		Linus

^ permalink raw reply	[flat|nested] 218+ messages in thread

* Re: Linux and aliasing?
  1999-06-04 15:20                 ` Richard Henderson
@ 1999-06-05  9:50                   ` Linus Torvalds
  1999-06-05 11:00                     ` mark
  1999-06-30 15:43                     ` Linus Torvalds
  1999-06-30 15:43                   ` Richard Henderson
  1 sibling, 2 replies; 218+ messages in thread
From: Linus Torvalds @ 1999-06-05  9:50 UTC (permalink / raw)
  To: Richard Henderson; +Cc: Tim Hollebeek, craig, davem, mark, chip, egcs

On Fri, 4 Jun 1999, Richard Henderson wrote:
> 
> This doesn't handle
> 
> 	extern inline int foo(short *ptr)
> 	{
> 		return *ptr;
> 	}
> 
> 	int bar(void)
> 	{
> 		int i;
> 		i = 0;
> 		return foo((short *)&i);
> 	}
> 
> Which isn't unlike some uses of inlines in the kernel.

Again - I _really_ didn't mean for this to be some Linux kernel specific
hack.

Think of it as a larger problem than just the kernel. Think of it as the
problem of "we have a huge code-base, and it wasn't written with
type-based alias in mind - and the new ANSI rules are fairly cumbersome,
but we'd like to have access to the new optimization: not only is it the 
default, but it does generate better code too!". 

It's not that changes wouldn't be needed, it's the fact that the ANSI
rules really only give you two ways to overcome the alias issue:

 - "char *" - which is just unbearably slow, and obviously not really an
   option for many things. You're better off just disabling the alias
   logic altogether.

 - using a union - which does work, but is just incredibly horrible syntax
   if you don't have just one well-defined case and/or designed with it in
   mind.

For example, the union approach is obviously acceptable if you have the
specific case of once in a blue moon (a few times in a large project) a
need to convert floating point to the integer bit pattern representation
and back. And that's obviously what ANSI was concerned with. 
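
A minimal sketch of that union idiom, assuming a 32-bit `float' (the
type and function names are illustrative only):

```c
#include <stdint.h>

/* Both members share the same storage, so writing one and reading
   the other reinterprets the bytes without any pointer cast. */
union float_bits {
    float    f;
    uint32_t u;
};

/* Return the bit pattern of a float as an unsigned 32-bit integer. */
uint32_t float_to_bits(float f)
{
    union float_bits fb;
    fb.f = f;
    return fb.u;
}

/* Inverse conversion: build a float from a 32-bit pattern. */
float bits_to_float(uint32_t u)
{
    union float_bits fb;
    fb.u = u;
    return fb.f;
}
```

This is tidy when there is one well-defined conversion like this; the
complaint above is about retrofitting it onto code that was not designed
around it.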

The "cast invalidates alias information" is a _syntactical_ thing to do
the same thing much more simply. I find it much more natural anyway, and
as I tried to show it /should/ be straightforward to implement.

GCC has several of these extensions to ANSI C that are really purely
syntactical:

 - "a ? : b" is just syntactical sugar for "a ? a : b"
 - pointer casting lvalues is syntactical sugar for something that is
   actually reasonably hard to do, but occasionally useful.

In fact, think of it the same way as the casting of lvalues: sure, you CAN
do it with standard ANSI C, but it is cumbersome.

		Linus

^ permalink raw reply	[flat|nested] 218+ messages in thread

* Re: Linux and aliasing?
  1999-06-05  6:23                                   ` Andi Kleen
@ 1999-06-05 10:32                                     ` Toon Moene
  1999-06-05 13:26                                       ` Jamie Lokier
                                                         ` (2 more replies)
  1999-06-05 10:37                                     ` mark
  1999-06-30 15:43                                     ` Andi Kleen
  2 siblings, 3 replies; 218+ messages in thread
From: Toon Moene @ 1999-06-05 10:32 UTC (permalink / raw)
  To: Andi Kleen; +Cc: law, Joe Buck, Linus Torvalds, craig, mark, davem, chip, egcs

Andi Kleen wrote:

> On Sat, Jun 05, 1999 at 02:37:20PM +0200, I wrote:

> > Ah, yes, but the discussion is whether we should have gcc generate
> > "reasonable" behaviour where "reasonable" is defined by a small group of
> > users.  Note that all "behaviours" not explicitly required by the
> > Standard are prone to:

> Generating faulty code is in my book always unreasonable, even when
> the source is not strictly conforming (and the compiler has a realistic
> chance to detect it).

Be careful to not run in circles here:  gcc generates "some" code that's
allowed because the construct invokes `undefined behaviour'.  That
doesn't make it "faulty" - just undefined.

> The argument that it may inhibit some optimizations for strictly conforming
> programs I also cannot follow. As I understand it there are basically two
> cases:

It does if you have to apply a compiler option to prevent this
optimisation - because in that case the optimisation will be prevented
for the whole compilation unit (a source file)

> > 1. Erosion (within a decade, gcc maintainers forget why we did this in
> >    the first place: "Hey, look at this code - what hair - and it is
> >    undefined behaviour according to the Standard in the first place;
> >    rip it out")

> If it is clearly documented that will not happen.

Yeah, sure.  Unfortunately, if the "correct" treatment of this feature
means to change a dozen source files (and rth's comments make me fear
that that's the case), the chance that someone, somewhere forgets to say
exactly why these changes were necessary (and on what other changes in
other files they depend) is far larger than I want to consider.  We've
seen this before.  Lucky we are that some long time gcc-hackers are
still among us, who might remark:  Oh yes, that's undefined by the C
Standard, but it happens to be an extension gcc supports ...  Seen that,
got the T-shirt.

> > 2. Contradiction (the C0X Standard defines the previously undefined
> >    behaviour, but in a way incompatible with the "reasonable" behaviour
> >    we thought up here).
> 
> I don't think that such a vague possibility should guide a gcc design
> decision ("in 30 years an asteroid may crash onto earth and ruin your
> whole day - don't implement it because the exception handlers don't handle
> that event") Also there is no cue in the future directions that that may
> happen. In any case it wouldn't strike me as a strong enough argument
> to suppress a useful feature.

If you think so, bring it up in comp.std.c.  At least that's the
ultimate criterion I use:  If I can explain an extension to the Fortran
Standard coherently on comp.lang.fortran (where all the J3 members
listen in), and no-one shoots it down in two weeks time, it might indeed
have some value.

Good luck!

[No, there's no smiley here - I really think you should try that route,
 because it's the only sane way.]

-- 
Toon Moene (toon@moene.indiv.nluug.nl)
Saturnushof 14, 3738 XG  Maartensdijk, The Netherlands
Phone: +31 346 214290; Fax: +31 346 214286
GNU Fortran: http://world.std.com/~burley/g77.html

^ permalink raw reply	[flat|nested] 218+ messages in thread

* Re: Linux and aliasing?
  1999-06-05  6:23                                   ` Andi Kleen
  1999-06-05 10:32                                     ` Toon Moene
@ 1999-06-05 10:37                                     ` mark
  1999-06-05 11:09                                       ` David S. Miller
                                                         ` (3 more replies)
  1999-06-30 15:43                                     ` Andi Kleen
  2 siblings, 4 replies; 218+ messages in thread
From: mark @ 1999-06-05 10:37 UTC (permalink / raw)
  To: ak; +Cc: toon, law, jbuck, torvalds, craig, davem, chip, egcs

>>>>> "Andi" == Andi Kleen <ak@muc.de> writes:

    Andi> Mark, even when you don't like it, would you as
    Andi> alias-expert-in-residence think that the basic strategy is
    Andi> workable?

I don't know what workable means.  

But, I would argue against your patch.  There are cases where a
pointer is cast to one type, and then cast back to another, and then
used.  These cases are conforming, and I think that Linus' proposal
will disable alias analysis in these cases.  That's bad.  Especially
since often these casts are to `void*' for the express purpose in
storing them in some kind of generic data structure.
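
A small sketch of the conforming pattern meant here, with hypothetical
names (`struct slot', `slot_put', `slot_get'): the pointer passes through
`void*' and is cast back to its original type before being dereferenced,
so the object is only ever accessed as what it really is:

```c
#include <stddef.h>

/* A one-element "generic container": it only stores an untyped pointer. */
struct slot {
    void *payload;
};

/* Store a pointer to an int; the conversion int* -> void* is implicit. */
void slot_put(struct slot *s, int *value)
{
    s->payload = value;
}

/* Cast back to the original type and dereference.  The int is accessed
   as an int, so type-based alias analysis remains valid here. */
int slot_get(const struct slot *s)
{
    return *(int *)s->payload;
}
```

A rule that discards alias information at every pointer cast would, as I
read it, also hit the cast in `slot_get', even though nothing
nonconforming is going on.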

Note that I made an alternate, more circumspect, proposal, which has
been ignored by both Linus and yourself up until now, although there's
been so much traffic that one couldn't really expect anyone to keep up
with all of it:

  Put expressions of the form `*((foo*) (&x))' in alias set zero if
  x does not have type foo, or one of the types that is allowed
  to alias it.

This proposal only affects nonconforming code, and thus changing the
behavior of the compiler will not pessimize any conforming code.  It
is important that `x' be a variable, or a field of a variable, not an
arbitrary expression.  (For example, I don't think this should apply
to `*((foo*) (f()))' since that might be conforming.)  But, if
`x' is a variable, or of the form `x->y' or `x.y', then we should be
OK (it's not legal to talk about `x->y' if `x' is not of the right
type).

So, this proposal is, IMO, a workable extension of the standard
semantics.  I don't know if this covers all the cases in the kernel,
but it should be easier to change Linux to fit this model than the
strictly conforming one.

I'm also not sure if this is a good idea.  If we don't document this
behavior, we're not promising it to Linux.  So, we might break it
later.  If we *do* document it, then we have to promise to maintain
this behavior.  That's extra work for us; we have to be convinced
there's a good enough reason, and I'm not convinced yet.  The
questions are:

  o How badly does Linux need the extra cycles that might be squeezed
    out by this extra alias analysis?  How much faster will the 
    average Linux system go?

  o How hard would it be to fix the kernel?

  o How hard will this be to do in GCC?
  
  o What will the maintenance costs be?

  o What else could we all do with our time that would either improve
    Linux or GNU CC?

Your answers probably won't change my mind.  Not because I don't
respect them, but because I usually need to reach my own conclusions.
Hard numbers might change my mind, but I bet we don't have them.  We
can't know answers to the last four questions.  The first one could be
numerically estimated.  But, if you find a hot-spot in the kernel, you
could always make just enough of the kernel conforming to turn on
strict-aliasing there.

Question four is one of the most important, and historically has been
all too often ignored by GCC developers.  *Your* convenience now is
traded against *our* convenience later.  Today's easy hack may be
tomorrow's maintenance nightmare.  Overall, Linux and GCC are both
part of the GNU project (technically, I know that Linux may not be,
but in practice we're all on the same side), so we have to do what's
best for the project *as a whole*.

Technically, I suspect that code transformations in the front-ends
(and yes, I'm planning some, like inlining on trees) will make doing
this analysis in the middle-end difficult; we could miss in both
directions, putting things in alias set zero when we should not, and
vice versa.

I think to make the semantics robust, this analysis would have to be
done in the front-ends.  Note that this is a *syntactic* thing, not a
*semantic* thing; it's the use of cast syntax in expressions of a
particular form.  In other words:

  inline foo* f(bar* x) { return (foo*) x; }
  *(f(&x)) /* Does not go in alias set zero, even if all 
              inlining is done.  */

So, in summary, I think:

  o It's not clear we want this behavior that badly.
  o A correct implementation will be difficult.
  o There will be maintenance headaches.

Furthermore, I bet that by now, if all this energy had been spent
fixing the code in the kernel, you'd have made good headway on some of
the most prominent data structures.  Yes, this will be a tedious
chore, but it's an easy one: you enclose things in a union, compile,
see what doesn't, fix it, and go on.  

--
Mark Mitchell                   mark@codesourcery.com
CodeSourcery, LLC               http://www.codesourcery.com

^ permalink raw reply	[flat|nested] 218+ messages in thread

* Re: Linux and aliasing?
  1999-06-05  9:50                   ` Linus Torvalds
@ 1999-06-05 11:00                     ` mark
  1999-06-06 10:30                       ` Linus Torvalds
  1999-06-30 15:43                       ` mark
  1999-06-30 15:43                     ` Linus Torvalds
  1 sibling, 2 replies; 218+ messages in thread
From: mark @ 1999-06-05 11:00 UTC (permalink / raw)
  To: torvalds; +Cc: rth, tim, craig, davem, chip, egcs

>>>>> "Linus" == Linus Torvalds <torvalds@transmeta.com> writes:

    Linus>  - "char *" - which is just unbearably slow, and obviously
    Linus> not really an option for many things. You're better off
    Linus> just disabling the alias logic altogether.

Not really always true.  You can use `memcpy (target, src, sizeof
(x))' and if the alignments of the src and target are known to the
compiler you *should* get optimal code.  (I don't know if GCC does
this at present, but it could, and that would clearly be a good
improvement.)
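
A minimal sketch of the memcpy approach described above.  The helper
names load_short and check_load_short are made up for illustration and
appear nowhere in the kernel or in GCC.  Whether a given compiler turns
the copy into a single load is an assumption to check in the generated
code, not something this sketch demonstrates.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: fetch a short from arbitrary storage by copying
   its bytes, so no lvalue of the wrong type is ever dereferenced.  */
short load_short(const void *src)
{
    short s;
    memcpy(&s, src, sizeof s);
    return s;
}

/* Round trip: the bytes of a short, parked in a char buffer, come
   back as the same value.  */
void check_load_short(short value)
{
    unsigned char buf[sizeof value];
    memcpy(buf, &value, sizeof value);
    assert(load_short(buf) == value);
}
```

The point of the design is that the copy goes through memcpy, which is
defined for any object type, so the result does not depend on how the
compiler assigns alias sets.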

--
Mark Mitchell                   mark@codesourcery.com
CodeSourcery, LLC               http://www.codesourcery.com

^ permalink raw reply	[flat|nested] 218+ messages in thread

* Re: Linux and aliasing?
  1999-06-05 10:37                                     ` mark
@ 1999-06-05 11:09                                       ` David S. Miller
  1999-06-05 12:11                                         ` Toon Moene
                                                           ` (2 more replies)
  1999-06-05 11:35                                       ` Andi Kleen
                                                         ` (2 subsequent siblings)
  3 siblings, 3 replies; 218+ messages in thread
From: David S. Miller @ 1999-06-05 11:09 UTC (permalink / raw)
  To: mark; +Cc: ak, toon, law, jbuck, torvalds, craig, chip, egcs

   From: mark@codesourcery.com
   Date: Sat, 05 Jun 1999 10:41:07 -0700

   Furthermore, I bet that by now, if all this energy had been spent
   fixing the code in the kernel, you'd have made good headway on some
   of the most prominent data structures.  Yes, this will be a tedious
   chore, but it's an easy one: you enclose things in a union,
   compile, see what doesn't, fix it, and go on.

What seems to be ignored are the future maintenance costs incurred by
this set of changes to the kernel, as if "do it and get it over right
now" is some triviality.  Effort has been expended already to make
attempts to do this (mentioned here by Andi Kleen who did a run at it
for the networking), and the findings made there support the
non-triviality claim; in Andi's case he tossed the work midstream due
to the non-stop overwhelming accumulation of issues.

Also some of the datastructures one would need to change are included
by userspace applications, especially for some of the networking
instances, and thus one would have ABI issues to worry about if one
were to go and perform these transformations.  It is much more than a
tedious chore.  One could certainly create another header
file, leave the old one alone with the same name, and use only the new
one inside the kernel, but does it make sense to have two copies and
maintain them?

However the headerfile interface issue is cleanly handled if only the
offending code in the kernel is changed (changes thus which are
invisible to the user headerfile ABI) to adhere to the proposed gcc
cast aliasing behavior.

This argument is orthogonal to your proposed possible future
maintenance costs gcc might incur due to the implementation of cast
aliasing behavior.

Later,
David S. Miller
davem@redhat.com

^ permalink raw reply	[flat|nested] 218+ messages in thread

* Re: Linux and aliasing?
  1999-06-05 10:37                                     ` mark
  1999-06-05 11:09                                       ` David S. Miller
@ 1999-06-05 11:35                                       ` Andi Kleen
  1999-06-30 15:43                                         ` Andi Kleen
  1999-06-05 12:41                                       ` Jamie Lokier
  1999-06-30 15:43                                       ` mark
  3 siblings, 1 reply; 218+ messages in thread
From: Andi Kleen @ 1999-06-05 11:35 UTC (permalink / raw)
  To: mark; +Cc: ak, toon, law, jbuck, torvalds, craig, davem, chip, egcs

On Sat, Jun 05, 1999 at 07:41:07PM +0200, mark@codesourcery.com wrote:
> >>>>> "Andi" == Andi Kleen <ak@muc.de> writes:
> 
>     Andi> Mark, even when you don't like it, would you as
>     Andi> alias-expert-in-residence think that the basic strategy is
>     Andi> workable?
> 
> I don't know what workable means.  
> 
> But, I would argue against your patch.  There are cases where a
> pointer is cast to one type, and then cast back to another, and then
> used.  These cases are conforming, and I think that Linus' proposal
> will disable alias analysis in these cases.  That's bad.  Especially
> since often these casts are to `void*' for the express purpose in
> storing them in some kind of generic data structure.

I agree.

> 
> Note that I made an alternate, more circumspect, proposal, which has
> been ignored by both Linus and yourself up until now, although there's
> been so much traffic that one couldn't really expect anyone to keep up
> with all of it:

Sorry, I must have missed it.

> 
>   Put expressions of the form `*((foo*) (&x))' in alias set zero if
>   x does not have type foo, or one of the types that is allowed
>   to alias it.
> 
> This proposal only affects nonconforming code, and thus changing the
> behavior of the compiler will not pessimize any conforming code.  It
> is important that `x' be a variable, or a field of a variable, not an
> arbitrary expression.  (For example, I don't think this should apply
> to `*((foo*) (f()))' since that might be conforming.)  But, if
> `x' is a variable, or of the form `x->y' or `x.y', then we should be
> OK (it's not legal to talk about `x->y' if `x' is not of the right
> type).

The "only with variable rule" makes it a bit more complicated and arbitrary
than I hoped (e.g. I don't see the difference between
*((foo*)f()) = 1; and { foo *x=(foo*)f(); *x=1; }),
but I could live with that if it is the compromise needed for a
consensus.

I think the kernel has some of the first cases, so it may be helpful to have
an optional (=not in -Wall) warning at least for the function case so that 
someone could go through the code base and fix it.


> So, this proposal is, IMO, a workable extension of the standard
> semantics.  I don't know if this covers all the cases in the kernel,
> but it should be easier to change Linux to fit this model than the
> strictly conforming one.
> 
> I'm also not sure if this is a good idea.  If we don't document this
> behavior, we're not promising it to Linux.  So, we might break it
> later.  If we *do* document it, then we have to promise to maintain
> this behavior.  That's extra work for us; we have to be convinced
> there's a good enough reason, and I'm not convinced yet.  The
> questions are:
> 
>   o How badly does Linux need the extra cycles that might be squeezed
>     out by this extra alias analysis?  How much faster will the 
>     average Linux system go?

There are some hot paths (e.g. TCP input packet processing) that would
benefit from it. The average Linux box is a work station that is mostly
idle (:@), but for high load servers and applications like Beowulf clusters
where latency counts it is helpful. Also  I think it will be more important
in the future (e.g. on Linux/IA64), where the CPU needs much more compiler
support for good performance.
> 
>   o How hard would it be to fix the kernel?

Very hard. I just tried to fix it in a small part of the TCP code, and it
already involved major changes. The main problem is that these generally
cannot be encapsulated in modules, it has to be changed globally, which
can be a big problem in a system with lots of external code and complicated
dependencies like Linux.


> So, in summary, I think:
> 
>   o It's not clear we want this behavior that badly.
>   o A correct implementation will be difficult.
>   o There will be maintenance headaches.
> 
> Furthermore, I bet that by now, if all this energy had been spent
> fixing the code in the kernel, you'd have made good headway on some of
> the most prominent data structures.  Yes, this will be a tedious
> chore, but it's an easy one: you enclose things in a union, compile,
> see what doesn't, fix it, and go on.  

Erm no, it isn't that easy. There are no warnings and these casts could hide
everywhere. Someone would basically have to carefully audit about 3.5M LOCs of
kernel source. And you probably know how hard it is to coordinate such mega
patches with multiple (in case of Linux hundreds) maintainers. e.g. I already
had to discard the nowhere near complete TCP alias fix work from my working
tree again, because David would most likely not have accepted it at this
point because of the major changes involved, and keeping it would have required
substantial continuous effort to hand integrate most new patches because
of the rejects.  Doing it in a crash effort is logistically not possible
I think.  The only way to do it is continuous slow incremental changes,
and the proposed gcc extension would make it a lot easier I think.



-Andi
-- 
This is like TV. I don't like TV.

^ permalink raw reply	[flat|nested] 218+ messages in thread

* Re: Linux and aliasing?
  1999-06-05 11:09                                       ` David S. Miller
@ 1999-06-05 12:11                                         ` Toon Moene
  1999-06-05 12:21                                           ` David S. Miller
  1999-06-30 15:43                                           ` Toon Moene
  1999-06-07  6:01                                         ` Joern Rennecke
  1999-06-30 15:43                                         ` David S. Miller
  2 siblings, 2 replies; 218+ messages in thread
From: Toon Moene @ 1999-06-05 12:11 UTC (permalink / raw)
  To: David S. Miller; +Cc: mark, ak, law, jbuck, torvalds, craig, chip, egcs

David S. Miller wrote:

>    From: mark@codesourcery.com
>    Date: Sat, 05 Jun 1999 10:41:07 -0700

>    Furthermore, I bet that by now, if all this energy had been spent
>    fixing the code in the kernel, you'd have made good headway on some
>    of the most prominent data structures.  Yes, this will be a tedious
>    chore, but it's an easy one: you enclose things in a union,
>    compile, see what doesn't, fix it, and go on.

> What seems to be ignored are the future maintenance costs incurred by
> this set of changes to the kernel, as if "do it and get it over right
> now" is some triviality.  Effort has been expended already to make
> attempts to do this (mentioned here by Andi Kleen who did a run at it
> for the networking), and the findings made there support the
> non-triviality claim, in Andi's case he tossed the work midstream due
> to the non-stop overwhelming accumulation of issues.

If these issues are so pervasive, isn't it easier to use the compiler
flag -fno-strict-aliasing and document *that* ?

I mean, if this sort of trickery permeates the Linux kernel, you won't
get any mileage out of the new optimization anyway, so you could just as
well disable it.

[ Before anyone thinks *I* am a language purist:  I know and have been
  contributing code to our Numerical Weather Prediction programs that
  willfully break Fortran alias assumptions.  We get away with it because
  it is mostly of the form "two arrays overlap completely", which -
  up till now - doesn't seem to be fouled up by optimization passes in
  existing Fortran compilers.  That doesn't mean that I would beat a
  compiler vendor over the head with a blunt object *before* I would
  have checked that our sloppiness is not the cause of our troubles ]

-- 
Toon Moene (toon@moene.indiv.nluug.nl)
Saturnushof 14, 3738 XG  Maartensdijk, The Netherlands
Phone: +31 346 214290; Fax: +31 346 214286
GNU Fortran: http://world.std.com/~burley/g77.html

^ permalink raw reply	[flat|nested] 218+ messages in thread

* Re: Linux and aliasing?
  1999-06-05 12:11                                         ` Toon Moene
@ 1999-06-05 12:21                                           ` David S. Miller
  1999-06-05 16:51                                             ` mark
  1999-06-30 15:43                                             ` David S. Miller
  1999-06-30 15:43                                           ` Toon Moene
  1 sibling, 2 replies; 218+ messages in thread
From: David S. Miller @ 1999-06-05 12:21 UTC (permalink / raw)
  To: toon; +Cc: mark, ak, law, jbuck, torvalds, craig, chip, egcs

   Date: Sat, 05 Jun 1999 21:02:27 +0200
   From: Toon Moene <toon@moene.indiv.nluug.nl>

   I mean, if this sort of trickery permeates the Linux kernel, you
   won't get any mileage out of the new optimization anyway, so you
   could just as well disable it.

It is not believed that this is the case.

Networking will be used as an example.

The core of the fast paths in interrupt level processing consists of
parsing and verifying packet header data.  These are the areas where
non-alias-friendly casts are used to speed up the header inspection
(to decrease the number of load instructions and also decrease the
number of comparisons executed).

Yet in the user side portion of networking, and to a decent amount in
the packet processing once we've obtained the header data, the bulk of
the work consists of updating state in the per-connection data
structures, where a plethora of alias analysis benefits exist.

So the situation here is quite the contrary to the assertion: one of
the most worrisome areas of the kernel, with respect to the
union'ization of data structures to remove non-alias-friendly casts,
is also the place where alias analysis would be highly beneficial.

Later,
David S. Miller
davem@redhat.com

^ permalink raw reply	[flat|nested] 218+ messages in thread

* Re: Linux and aliasing?
  1999-06-05 10:37                                     ` mark
  1999-06-05 11:09                                       ` David S. Miller
  1999-06-05 11:35                                       ` Andi Kleen
@ 1999-06-05 12:41                                       ` Jamie Lokier
  1999-06-05 14:43                                         ` Martin v. Loewis
                                                           ` (2 more replies)
  1999-06-30 15:43                                       ` mark
  3 siblings, 3 replies; 218+ messages in thread
From: Jamie Lokier @ 1999-06-05 12:41 UTC (permalink / raw)
  To: mark; +Cc: ak, toon, law, jbuck, torvalds, craig, davem, chip, egcs

mark@codesourcery.com suggests:

>   Put expressions of the form `*((foo*) (&x))' in alias set zero if
>   x does not have type foo, or one of the types that is allowed
>   to alias it.
> 
> This proposal only affects nonconforming code, and thus changing the
> behavior of the compiler will not pessimize any conforming code.

I don't like this because it's not what we do in C++.
In C++ when we want to do these naughty things, we used to do:

  *(foo*)(void*)(&x)

I bet there's still a fair bit of that around.
In the modern world we've got *reinterpret_cast<foo*>(&x), which
is presumably treated specially w.r.t. aliases.

[Could someone tell me if reinterpret_cast does the right thing with
aliases please?]

thanks,
-- Jamie

^ permalink raw reply	[flat|nested] 218+ messages in thread

* Re: Linux and aliasing?
  1999-06-05 10:32                                     ` Toon Moene
@ 1999-06-05 13:26                                       ` Jamie Lokier
  1999-06-05 19:35                                         ` Linus Torvalds
  1999-06-30 15:43                                         ` Jamie Lokier
  1999-06-05 18:48                                       ` Linus Torvalds
  1999-06-30 15:43                                       ` Toon Moene
  2 siblings, 2 replies; 218+ messages in thread
From: Jamie Lokier @ 1999-06-05 13:26 UTC (permalink / raw)
  To: Toon Moene
  Cc: Andi Kleen, law, Joe Buck, Linus Torvalds, craig, mark, davem,
	chip, egcs

Jamie's suggestion du jour.

First let me cover the bases.

  Linus wants non-union (ie. non-ugly) casts to do the sensible thing.
  What that is isn't quite clear -- after thinking it through.
  If it's just aesthetics, I don't see why a macro wouldn't do ;-)

  Dave Miller notes that Linux does want to take advantage of full alias
  analysis, alongside code that does hacks to minimise loads and stores
  on current architectures.  I think Dave's trying to eat two cakes at
  once, but since I hack network hardware for profit, I know exactly
  where he's coming from.

  Other folks whose names (I must apologise) seem to have blended
  together in my head right now, have to write compilers.  Good
  compilers that produce the best possible code for standards-conforming
  programs.  And then degrade gracefully for non-conforming programs ;-)

  In particular, a cast may be conforming, in which case the compiler
  should strive to generate the best allowed code (unless it's
  pathological).  A series of non-conforming looking casts might be
  conforming too -- cast to void * and back was pointed out as
  conforming _and_ commonplace.

  Issues raised: 

  - Lots of legacy code uses casts, assuming nothing weird will happen.
  - Weird things now happen.

  - "non-conforming cast implies pointer may alias all" + "full flow
    analysis" got proposed.  No one likes it.  Bin.

  - "non-... all" + "*no* flow analysis" got proposed by Linus.  It's a
    simple special case, arguably syntactic.  But it has semantic warts:
       *(foo_t*) &bar = foo; 
    now means something different than:
       { foo_t* p = (foo_t*) &bar; *p = foo; }

    I submit that those two forms ought to be "trivially equivalent" --
    we think in terms of data flow when we write code.  We also think
    that way when we think about how compilers expand expressions, how
    common subexpressions are combined and so on.  A fine distinction
    would, IMO, be a major language misfeature and a cause of many
    subtle bugs in future.

  - "type attribute" got proposed by Linus too.  This I like.

    Type attribute means you can write:
       *(foo_t __alias*)(&bar) = foo;
    and the equivalent:
       { foo_t __alias* p = (foo_t __alias*) &bar; *p = foo; }

    See how these _are_ trivially equivalent?
    The problem with this is that there are 2 times 10^30 ugly casts
    in the C cosmos that don't have such an attribute, of which 2 times
    10^29 are in the Linux kernel.
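
    For illustration only: the `__alias' keyword above is a proposal
    in this thread, not an existing feature.  Later GCC releases
    document a type attribute named may_alias with a similar intent,
    and the sketch below uses that attribute as a stand-in.  The names
    u16_alias and first_half are hypothetical, and the code is
    GCC-specific rather than portable C.

```c
#include <assert.h>
#include <string.h>

/* GCC extension: accesses through this type are treated like char
   accesses for alias analysis, so they may alias any object.  */
typedef unsigned short __attribute__((__may_alias__)) u16_alias;

/* Hypothetical helper: read the first two bytes of an unsigned int
   through the attributed type.  Spelling the cast inline or through
   a named pointer gives the same meaning, because the attribute
   lives in the type and not in the cast syntax.  */
unsigned short first_half(const unsigned int *word)
{
    const u16_alias *p = (const u16_alias *) word;
    return *p;
}

/* Same access written as a single expression.  */
unsigned short first_half_inline(const unsigned int *word)
{
    return *(const u16_alias *) word;
}
```

    Because the property is carried by the pointer's type, the two
    spellings stay "trivially equivalent", which is the point made
    above about the attribute form.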

Jamie's thought of the day
--------------------------

    It looks like the compiler could spot the dodgy casts, including
    some standard-conforming ones, based solely on the types of the
    casts (with attributes).  It could warn that you may be getting
    non-alias optimisations you weren't expecting.  The word access
    casts in the Linux kernel (and lots of other code) would be prime
    candidates.  Hard-core fortran coders have this warning switched
    off.  The compiler proceeds to optimise anyway (you were warned).

    But if you include a suitable type attribute, possible aliasing is
    implied and you don't get the warning.  You get sensible code out.

    There's an alternative attribute, for programs that mix and match,
    where possible aliasing is _not_ implied but you still don't get
    the warning.  (A bit like __attribute__((unused)) is used solely to
    suppress warnings).

    Thus your thoroughly non-conforming kernel code starts with lots of
    warnings.  You add the __mayalias (or whatever) keyword into all the
    structure manipulation casts -- which the compiler helpfully pointed
    you to.  You add the __doesntalias keyword (is it the same as
    __restrict?) to all those places where you wrote conforming code and
    really do want it fully optimised.  So you get the best of all
    worlds and the compiler helps you get there.

    This way we get:

       - Full optimisation of conforming code.
       - Best optimisation of mostly-conforming code with dodgy casts.
       - No dubious alias flow analysis -- keeps things simple.
       - Code transformations such as rearranging intermediate
         values between expressions, extracting intermediates or merging
         them, continue to be valid transformations.  (IMO v.important).
       - The compiler tells us where to think about aliasing issues.
       - When all the warnings have gone away, then you _know_ it's safe to
         actually use the output of -fstrict-aliasing.
       - If you're confident the code is conformant anyway, turn off
         the warning.

    Does this seem (a) implementable, (b) a good, incremental
    maintenance path for the kernel authors?

Enjoy,
-- Jamie

^ permalink raw reply	[flat|nested] 218+ messages in thread

* Re: Linux and aliasing?
  1999-06-05  9:35               ` Linus Torvalds
@ 1999-06-05 13:34                 ` Richard Henderson
  1999-06-05 18:40                   ` Linus Torvalds
                                     ` (2 more replies)
  1999-06-30 15:43                 ` Linus Torvalds
  1 sibling, 3 replies; 218+ messages in thread
From: Richard Henderson @ 1999-06-05 13:34 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: craig, davem, mark, chip, egcs

On Sat, Jun 05, 1999 at 09:34:26AM -0700, Linus Torvalds wrote:
> As an example, the above sequence obviously has an alias problem as it
> stands now. My suggestion would _not_ make the above code generate
> anything different at all. The only thing my suggestion really does is
> give the programmer a chance to say "oh, I see: the above worked in the
> original ANSI C, but it does not work with the new one, and I only care
> about gcc anyway, so I can do the quick fix by just adding the cast":
> 
> 	s = *(short *)ps;

So what you're saying is, you don't mind fixing up alias
problems on a local scale?  You're not expecting to get 
away with no source code changes?

If this is all you want, you can get this with a union and
judicious use of macros --

  #define noalias(type, ptr) (((union { type __x__; } *)(ptr))->__x__)

  s = noalias(short, ps);

Which doesn't strike me as too horrible syntax for public
consumption.  Note that this works because it is the access
to the union's member that nulls the alias set, not the
cast to the union type.


r~

^ permalink raw reply	[flat|nested] 218+ messages in thread

* Re: Linux and aliasing?
  1999-06-05 12:41                                       ` Jamie Lokier
@ 1999-06-05 14:43                                         ` Martin v. Loewis
  1999-06-30 15:43                                           ` Martin v. Loewis
  1999-06-05 16:53                                         ` mark
  1999-06-30 15:43                                         ` Jamie Lokier
  2 siblings, 1 reply; 218+ messages in thread
From: Martin v. Loewis @ 1999-06-05 14:43 UTC (permalink / raw)
  To: egcs; +Cc: mark, egcs

> [Could someone tell me if reinterpret_cast does the right thing with
> aliases please?]

No, it won't. The C++ standard says

>> A pointer to an object can be explicitly converted to a pointer to
>> an object of different type. Except that converting an rvalue of
>> type "pointer to T1" to the type "pointer to T2" (where T1 and T2
>> are object types and where the alignment requirements of T2 are no
>> stricter than those of T1) and back to its original type yields the
>> original pointer value, the result of such a pointer conversion is
>> unspecified.

In g++, conversion to a different pointer type in reinterpret_cast
will always yield a pointer that has the same internal representation.

However, dereferencing such a pointer has an undefined result. You must
not access an object through a pointer to a different type, period (1).
This is a very easy rule (despite Linus' saying that it is very
complicated), and it is the foundation for allowing type-based alias
analysis optimizations.

Of course, the compiler could provide the local-overriding mechanism
that Linus proposed. It currently does not do so, neither for plain
casts, nor for reinterpret_casts.

Regards,
Martin

(1) except if that different type is char.
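
A small sketch of the char exception in footnote (1): any object may
be inspected through a pointer to a character type.  The function name
same_bytes is made up for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: compare two objects byte by byte.  Reading
   the storage through unsigned char is permitted whatever the real
   type of the objects is.  */
int same_bytes(const void *a, const void *b, size_t n)
{
    const unsigned char *pa = a;
    const unsigned char *pb = b;
    size_t i;

    for (i = 0; i < n; i++)
        if (pa[i] != pb[i])
            return 0;   /* first differing byte */
    return 1;           /* all n bytes matched */
}
```

The exception is about character types only; casting the same storage
to short or int and dereferencing it is still covered by the rule
above.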

^ permalink raw reply	[flat|nested] 218+ messages in thread

* Re: Linux and aliasing?
  1999-06-05 12:21                                           ` David S. Miller
@ 1999-06-05 16:51                                             ` mark
  1999-06-30 15:43                                               ` mark
  1999-06-30 15:43                                             ` David S. Miller
  1 sibling, 1 reply; 218+ messages in thread
From: mark @ 1999-06-05 16:51 UTC (permalink / raw)
  To: davem; +Cc: toon, ak, law, jbuck, torvalds, craig, chip, egcs

>>>>> "David" == David S Miller <davem@redhat.com> writes:

    David> So the situation here is quite the contrary to the
    David> assertion, one of the most worrisome areas of the kernel,
    David> with respect to the union'ization of data structures to
    David> remove non-alias-friendly casts, is also the place where
    David> alias analysis would be highly beneficial.

Note, however, that placing *anything* in alias set zero tends to
largely botch type-based alias analysis.  Since alias set zero means
that anything can be changed, no code on a path after the alias set
zero access can use type-based alias analysis from before that point.
So, even with Linus' proposal, or my less intrusive proposal, this
code may not benefit that greatly.

As I pointed out before, you can introduce unions.  You will then get
compile-time errors.  You can then run through and fix them all.  This
will be tedious.  Yes, it will be a large patch, but not a complex
one.  Do one data structure at a time to make it simpler.  Such a
patch may not be appropriate for the stable kernels, but it should be
tolerable on the unstable kernels.  I fail to see what is so *hard*
about this; I do see that it will take effort.

Your argument about user-space headers is valid.  However, you can
always do:

  /* user-space header */
  struct s { int i; int j; }; /* But sometimes really the whole 
	                         thing is a double.  */

  /* kernel header */
  union ks {
    struct s s;
    double d;
  };

That doesn't require changing the user-space headers.  Yes, you
introduce some duplication.  That's the price of not writing ANSI/ISO
C from the standard *and* wanting type-based alias analysis to work
correctly *and* wanting to keep user-space headers intact.
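
As a rough sketch of how kernel code might then use such a union: the
types below repeat the ones above, and split_double is a hypothetical
helper.  It relies on reading a union member other than the one last
written, which GCC documents as supported when the access goes through
the union type, and it assumes that struct s is no larger than a
double on the target.

```c
#include <assert.h>
#include <string.h>

/* user-space header, unchanged */
struct s { int i; int j; };

/* kernel-only view of the same storage */
union ks {
    struct s s;
    double d;
};

/* Hypothetical helper: view the bytes of a double as the two ints.
   Every access goes through a union ks object, never through a
   cast pointer.  */
struct s split_double(double value)
{
    union ks u;
    u.d = value;
    return u.s;
}
```

The user-space definition of struct s is untouched; only code that
needs the double view has to name union ks.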

Quite frankly, I still don't fully buy the "it's too hard to fix, but
where it's too hard is exactly where we need it" conundrum.  There are
a variety of things you can do: unions, macros to access the fields by
a type-safe means, memcpy and friends (see my earlier post for why
this should be fast), etc.  I believe you that it's a pain in the neck
to fix this stuff; just not that it's simultaneously as hard *and* as
important as you claim.

If there are really hot spots in the kernel that need to go fast, you
can hand-code them in assembly, or rewrite them in portable C.  Yes,
that will be a pain.  The benefits may or may not outweigh the cost.
That's a decision you have to make.  But, it's not clear that
introducing some extension, perhaps with gotchas we cannot foresee, to
GCC will have more benefits than costs, either.

In some sense, Linux is coded in a dialect of C.  In this dialect, you
can do funny casts and things still work.  That's not ANSI/ISO C, and
it's not something that GNU CC ever promised to support (unlike
extended asms, for example.)  But, the "Linux dialect" *is* still
supported with -fno-strict-aliasing.

You want the new optimization, but to retain the old dialect.  In
short, you want to have your cake and eat it too.  That's natural, but
not necessarily reasonable or realistic.

If someone contributes a patch that:

  o Provides some "anti-aliasing" :-) behavior for non-conforming
    programs in a well thought-out way.
  o Does not affect conforming programs.
  o Is easy to maintain.  See my earlier post for some of the
    problems that must be solved.
  o Looks to be useful to projects outside of the Linux kernel.

then, I would expect that the GCC maintainers would react favorably.

But, frankly, I bet there are a lot of projects on which we could
spend our time that would have more widespread benefits, for GCC, the
kernel, and the community.  For example, better x86 scheduling would
allow *all* applications, probably including Linux, to run
significantly faster.

I am just not persuaded that this is a good place to spend my time,
especially what little I can afford to volunteer for free.  I am not
persuaded that it is a good use of anyone else's time, either,
including yours, David, or yours, Andi.  You are excellent programmers,
and your contributions to many projects are valuable; I bet this isn't
the most useful thing you could do.

Unfortunately, (or perhaps fortunately for the readers of this list!),
I can't afford to spend any more time arguing the point.  I hope that
doesn't offend anyone.  I do understand the Linux issues, and why
you're arguing for what you're arguing for.  Let's agree to disagree,
at least until someone produces a patch for GCC that we can start a
fresh argument about. :-)

--
Mark Mitchell                   mark@codesourcery.com
CodeSourcery, LLC               http://www.codesourcery.com

^ permalink raw reply	[flat|nested] 218+ messages in thread

* Re: Linux and aliasing?
  1999-06-05 12:41                                       ` Jamie Lokier
  1999-06-05 14:43                                         ` Martin v. Loewis
@ 1999-06-05 16:53                                         ` mark
  1999-06-07  2:36                                           ` Jamie Lokier
  1999-06-30 15:43                                           ` mark
  1999-06-30 15:43                                         ` Jamie Lokier
  2 siblings, 2 replies; 218+ messages in thread
From: mark @ 1999-06-05 16:53 UTC (permalink / raw)
  To: egcs; +Cc: ak, toon, law, jbuck, torvalds, craig, davem, chip, egcs

>>>>> "Jamie" == Jamie Lokier <egcs@tantalophile.demon.co.uk> writes:

    Jamie> I don't like this because it's not what we do in C++.  In
    Jamie> C++ when we want to do these naughty things, we used to do:

    Jamie>   *(foo*)(void*)(&x)

I intended this to be covered by my proposal.  This would officially
be a "funny cast", and considered able to alias anything, provided
that x is a variable or an expression of the form a->b or a.b.

    Jamie> I bet there's still a fair bit of that around.  In the
    Jamie> modern world we've got *reinterpret_cast<foo*>(&x), which
    Jamie> is presumably treated specially w.r.t. aliases.

No, it does not.  The use of reinterpret_cast does not exempt a
standard-conforming program from the rules about using an lvalue of
the wrong type to access storage.

--
Mark Mitchell                   mark@codesourcery.com
CodeSourcery, LLC               http://www.codesourcery.com

^ permalink raw reply	[flat|nested] 218+ messages in thread

* Re: Linux and aliasing?
  1999-06-05 13:34                 ` Richard Henderson
@ 1999-06-05 18:40                   ` Linus Torvalds
  1999-06-30 15:43                     ` Linus Torvalds
  1999-06-05 21:38                   ` Jakub Jelinek
  1999-06-30 15:43                   ` Richard Henderson
  2 siblings, 1 reply; 218+ messages in thread
From: Linus Torvalds @ 1999-06-05 18:40 UTC (permalink / raw)
  To: Richard Henderson; +Cc: craig, davem, mark, chip, egcs

On Sat, 5 Jun 1999, Richard Henderson wrote:
> 
> So what you're saying is, you don't mind fixing up alias
> problems on a local scale?  You're not expecting to get 
> away with no source code changes?

Right. I expect to get away with fairly minimal source code changes, and
I'd expect that some of the common codes just will work the way the
programmer intended them without people having to even worry about it.

The problem I have with the union approach (even when hidden behind a
macro like yours - which makes things better) is that I'd be much happier
if the "obvious" code just worked. The less people have to really be aware
of the alias issues, the better.

		Linus

^ permalink raw reply	[flat|nested] 218+ messages in thread

* Re: Linux and aliasing?
  1999-06-05 10:32                                     ` Toon Moene
  1999-06-05 13:26                                       ` Jamie Lokier
@ 1999-06-05 18:48                                       ` Linus Torvalds
  1999-06-30 15:43                                         ` Linus Torvalds
  1999-06-30 15:43                                       ` Toon Moene
  2 siblings, 1 reply; 218+ messages in thread
From: Linus Torvalds @ 1999-06-05 18:48 UTC (permalink / raw)
  To: Toon Moene; +Cc: Andi Kleen, law, Joe Buck, craig, mark, davem, chip, egcs

On Sat, 5 Jun 1999, Toon Moene wrote:
> 
> Be careful to not run in circles here:  gcc generates "some" code that's
> allowed because the construct invokes `undefined behaviour'.  That
> doesn't make it "faulty" - just undefined.

Sure. But wouldn't it be nice if the undefined behaviour did what the
programmer obviously meant?

You can see it as a quality of implementation issue - you're _allowed_ to
do anything under the standard, and the ANSI C standard doesn't for
example _require_ that any compiler generate efficient code - but a quality
of implementation obviously means that you want to not just say "the
standard doesn't say that you have to generate good code, so we don't
optimize".

A quality of implementation issue says that you'd want to not just do what
the standard requires, but that there are other issues that the standard
just leaves at the discretion of the compiler implementer.

> If you think so, bring it up in comp.std.c.  At least that's the
> ultimate criterion I use:  If I can explain an extension to the Fortran
> Standard coherently on comp.lang.fortran (where all the J3 members
> listen in), and no-one shoots it down in two weeks time, it might indeed
> have some value.
> 
> Success !

That's probably a good idea.

		Linus

^ permalink raw reply	[flat|nested] 218+ messages in thread

* Re: Linux and aliasing?
  1999-06-05 13:26                                       ` Jamie Lokier
@ 1999-06-05 19:35                                         ` Linus Torvalds
  1999-06-06  1:18                                           ` Martin v. Loewis
  1999-06-30 15:43                                           ` Linus Torvalds
  1999-06-30 15:43                                         ` Jamie Lokier
  1 sibling, 2 replies; 218+ messages in thread
From: Linus Torvalds @ 1999-06-05 19:35 UTC (permalink / raw)
  To: Jamie Lokier
  Cc: Toon Moene, Andi Kleen, law, Joe Buck, craig, mark, davem, chip, egcs

On Sat, 5 Jun 1999, Jamie Lokier wrote:
> Jamie's suggestion du jour.

I like your suggestion - it does sound like a lot more work especially for
the compiler than my simplistic one, but the fact that we would at least
get warnings from the compiler about them means that we wouldn't have to
rely on somebody going through 30MB worth of sources by hand..

>   Linus wants non-union (ie. non-ugly) casts to do the sensible thing.
>   What that is isn't quite clear -- after thinking it through.
>   If it's just aesthetics, I don't see why a macro wouldn't do ;-)

A macro can do the same thing (the same way I think the current gcc lvalue
cast could be done with a macro), but my approach has the in my opinion
very useful behaviour that it makes most "normal" type cast problems just
automatically do the right thing. So in many cases it would work as-is
(not just for kernel code), and in cases where it does not (ie the cast is
non-local) my proposal has a way out (add another cast that _is_ local
to the actual de-reference).

I don't really see why people hate the proposal so much, but maybe that's
just my personal coding style. I do not consider pointer casts (or any
other kinds of casts) acceptable programming practice for any normal
cases, so just about =all= the casts I ever see are of the type where
alias information should obviously be disabled. So to me it sounds like a
"natural" way of doing things.

(Just to clarify - it's not as if the linux kernel does a _lot_ of ugly
pointer stuff. It's just that it does happen, and it isn't done in one
well-defined area or similar.)

It seems that other people use more casts for "normal" things, and are
actually afraid of my proposal for performance reasons. I'm surprised:
people that do things like that are usually not the people who complain
about others' coding standards ;)

Anyway, I grepped the kernel for "likely" places where my change would
make a difference by using the following heuristic grep:

	grep '\*(.*\* *)' */*.c

and in basically all cases the compiler would have done the right thing if
it had followed my proposal.

THAT is why I like it. It does the RightThing(tm), with basically zero
complexity for either the user or the compiler. It is a "do what I mean" 
kind of patch. 

"Do what I mean" is a quality of implementation thing. Yes, all of this is
obviously not defined by the standard. But exactly because it is NOT
defined by the standard, it's very good if the behaviour is what you'd
expect.

The people who worry about the thing being a performance problem for them:
try the above grep and see what it shows you. No, the grep doesn't really
catch all the cases that the compiler change would impact, but it should
give you a rough idea.

In particular, if the grep comes up empty (ie "Well written code without
any strange casts"), you probably wouldn't actually be impacted by the
"Linus proposal" at all. 

>   In particular, a cast may be conforming, in which case the compiler
>   should strive to generate the best allowed code (unless it's
>   pathological).

"may be conforming", yes. Are there any real life cases where it really
matters? The case where my rule kicks in is definitely "suspicious" - I
agree that it _may_ conform, but do people actually ever write code like
that in strictly conforming programs? That's why I'd like to see what the
grep above shows people..

>   Issues raised: 
> 
>   - Lots of legacy code uses casts, assuming nothing weird will happen.
>   - Weird things now happen.

Right. The "Linus proposal" would not make that go away completely, but it
would make a large percentage of the weird cases do what the old code
expected.

>   - "non-conforming cast implies pointer may alias all" + "full flow
>     analysis" got proposed.  No one likes it.  Bin.

Yes.

>   - "non-... all" + "*no* flow analysis" got proposed by Linus.  It's a
>     simple special case, arguably syntactic.  But it has semantic warts:
>        *(foo_t*) &bar = foo; 
>     now means something different than:
>        { foo_t* p = (foo_t*) &bar; *p = foo; }

I understand that people can see this as a wart, but if you consider it
syntactic then you shouldn't even _expect_ the above to be the same thing.

In fact, I'd like to consider it a bonus that you will =not= get the
looser alias semantics for the case where you actually assign the pointer.
So you can use the second version as a way to _avoid_ the "Linus rule" if
you like it in general but in a specific case want to disable it.

I guess I'm not convincing you.

>   - "type attribute" got proposed by Linus too.  This I like.

Well, that one is just the "implementation part" of my basic proposal.

It can be done on its own without the "implied no-alias" of course. But I
_meant_ it to be done in conjunction with my other proposal, just
explaining how it would be implemented.

> Jamie's thought of the day
> --------------------------

[ deleted ]

Hey, works for me. It seems to do the Linus proposal in a "warning sense",
if I understood you correctly, with a way to just force whichever actual
semantics you want. Right?

		Linus

^ permalink raw reply	[flat|nested] 218+ messages in thread

* Re: Linux and aliasing?
  1999-06-05 13:34                 ` Richard Henderson
  1999-06-05 18:40                   ` Linus Torvalds
@ 1999-06-05 21:38                   ` Jakub Jelinek
  1999-06-30 15:43                     ` Jakub Jelinek
  1999-06-30 15:43                   ` Richard Henderson
  2 siblings, 1 reply; 218+ messages in thread
From: Jakub Jelinek @ 1999-06-05 21:38 UTC (permalink / raw)
  To: Richard Henderson; +Cc: Linus Torvalds, craig, davem, mark, chip, egcs

> If this is all you want, you can get this with a union and
> judicious use of macros --
> 
>   #define noalias(type, ptr) (((union { type __x__; } *)(ptr))->__x__)
> 
>   s = noalias(short, ps);
> 
> Which doesn't strike me as too horrible syntax for public
> consumption.  Note that this works because it is the access
> to the union's member that nulls the alias set, not the
> cast to the union type.

I would not mind changing code to look like this, but I think it would be
much better if Mark or somebody else implemented putting the problematic
cast dereferences into alias set zero AND provided some warning option
which would trigger a warning in such a case. Thus, when somebody writes
non-conforming code, it would work even with -fstrict-aliasing, albeit
slower, but if he cared about performance, he could inspect the warnings
after specifically enabling this kind of warning and use either noalias
macro or rewrite things using unions to speed things up.
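
As a rough illustration of "rewrite things using unions", here is a
minimal sketch.  It is not the macro quoted above: it puns a value
through a union object instead of casting a pointer to a union type,
because later GCC manuals describe only the former as supported under
-fstrict-aliasing.  The names `union float_pun' and `float_bits' are
made up for the example, and a 32-bit IEEE 754 `float' is assumed.

```c
#include <stdint.h>

/* Hypothetical helper type: two views of the same four bytes.
   Assumes float is 32 bits wide (IEEE 754 single precision).  */
union float_pun {
  float f;
  uint32_t u;
};

/* Return the bit pattern of VALUE by storing one union member and
   reading the other; the access goes through the union object itself.  */
static uint32_t float_bits(float value)
{
  union float_pun pun;
  pun.f = value;   /* store as a float */
  return pun.u;    /* reload the same bytes as an integer */
}
```

The cost is a copy into a local object, which an optimizing compiler
can usually remove.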

Cheers,
    Jakub
___________________________________________________________________
Jakub Jelinek | jj@sunsite.mff.cuni.cz | http://sunsite.mff.cuni.cz
Administrator of SunSITE Czech Republic, MFF, Charles University
___________________________________________________________________
UltraLinux  |  http://ultra.linux.cz/  |  http://ultra.penguin.cz/
Linux version 2.3.4 on a sparc64 machine (1343.49 BogoMips)
___________________________________________________________________

^ permalink raw reply	[flat|nested] 218+ messages in thread

* Re: Linux and aliasing?
  1999-06-05 19:35                                         ` Linus Torvalds
@ 1999-06-06  1:18                                           ` Martin v. Loewis
  1999-06-06 10:46                                             ` Linus Torvalds
                                                               ` (2 more replies)
  1999-06-30 15:43                                           ` Linus Torvalds
  1 sibling, 3 replies; 218+ messages in thread
From: Martin v. Loewis @ 1999-06-06  1:18 UTC (permalink / raw)
  To: torvalds; +Cc: egcs

> It seems that other people use more casts for "normal" things, and are
> actually afraid of my proposal for performance reasons. I'm surprised:
> people that do things like that are usually not the people who complain
> about others' coding standards ;)

Well, no. The 'normal' kind of cast is very common, and frequently
used in the Linux kernel. For example, if a tty driver routine is
called (e.g. drivers/char/rocket.c :-), it fetches driver_data and
casts it to the device-specific type (i.e. (struct r_port *)).

In these cases, people typically save the cast result in a variable
instead of dereferencing it, so they would not suffer from your
anti-aliasing mechanism. These uses of casts are conforming C code:
The driver put an r_port pointer into driver_data earlier on.

> Anyway, I grepped the kernel for "likely" places where my change would
> make a difference by using the following heuristic grep:
> 
> 	grep '\*(.*\* *)' */*.c
> 
> and in basically all cases the compiler would have done the right thing if
> it had followed my proposal.

It is actually the other casts that the Linux contributors need to
worry about. Alias problems are very hard to find (as you pointed
out), and somebody will have to go over the complete kernel source and
investigate every single cast - if you ever plan to turn-on
-fstrict-aliasing.

I did the inverse grep

       grep '(.*\* *)' */*.c | grep -v '\*(.*\* *)'

and found only one place (fs/binfmt_aout.c:create_aout_tables) where
pointers are aliased in different types, and dereferenced later. The
hidden treasures are probably in the header files (as earlier examples
indicate).

Regards,
Martin

^ permalink raw reply	[flat|nested] 218+ messages in thread

* Re: Linux and aliasing?
  1999-06-05 11:00                     ` mark
@ 1999-06-06 10:30                       ` Linus Torvalds
  1999-06-06 10:44                         ` mark
  1999-06-30 15:43                         ` Linus Torvalds
  1999-06-30 15:43                       ` mark
  1 sibling, 2 replies; 218+ messages in thread
From: Linus Torvalds @ 1999-06-06 10:30 UTC (permalink / raw)
  To: mark; +Cc: rth, tim, craig, davem, chip, egcs

On Sat, 5 Jun 1999 mark@codesourcery.com wrote:
> 
> Not really always true.  You can use `memcpy (target, src, sizeof
> (x))' and if the alignments of the src and target are known to the
> compiler you *should* get optimal code.  (I don't know if GCC does
> this at present, but it could, and that would clearly be a good
> improvement.)

Only if that's assuming that it _is_ a memcpy.

Think of things like

	a = ntohl(*(u32 *)p);

etc - which is _not_ just a copy.

Current gcc versions do pretty well on the pure memcpy() case, I agree. A
lot of the Linux memcpy() logic is because gcc historically did _not_ do
any of the optimizations people felt really had to be done.

		Linus

^ permalink raw reply	[flat|nested] 218+ messages in thread

* Re: Linux and aliasing?
  1999-06-06 10:30                       ` Linus Torvalds
@ 1999-06-06 10:44                         ` mark
  1999-06-06 14:17                           ` Linus Torvalds
  1999-06-30 15:43                           ` mark
  1999-06-30 15:43                         ` Linus Torvalds
  1 sibling, 2 replies; 218+ messages in thread
From: mark @ 1999-06-06 10:44 UTC (permalink / raw)
  To: torvalds; +Cc: rth, tim, craig, davem, chip, egcs

>>>>> "Linus" == Linus Torvalds <torvalds@transmeta.com> writes:

    Linus> On Sat, 5 Jun 1999 mark@codesourcery.com wrote:
    >>  Not really always true.  You can use `memcpy (target, src,
    >> sizeof (x))' and if the alignments of the src and target are
    >> known to the compiler you *should* get optimal code.  (I don't
    >> know if GCC does this at present, but it could, and that would
    >> clearly be a good improvement.)

    Linus> Only if that's assuming that it _is_ a memcpy.

    Linus> Think of things like

    Linus> 	a = ntohl(*(u32 *)p);

    Linus> etc - which is _not_ just a copy.

Right.  But the part that's causing aliasing issues is just a memcpy;
that's the `*(u32 *) p' bit.   You could write:

  memcpy (&a, p, sizeof (a));
  a = ntohl (a);

I would argue that GCC *should* generate the code you want for this.
GCC may not.  Fixing it might be difficult.  But, if GCC does not
generate good code for this case, improving it would be an
optimization of general utility.
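
A minimal sketch of that two-step form wrapped in a helper, so the call
site stays a one-liner.  The name `load_net32' is hypothetical, and
`uint32_t' stands in for the kernel's `u32'; nothing here is taken from
the kernel sources.

```c
#include <arpa/inet.h>  /* ntohl */
#include <stdint.h>
#include <string.h>     /* memcpy */

/* Hypothetical helper: read a 32-bit network-order value from an
   arbitrary buffer without dereferencing a cast pointer.  */
static uint32_t load_net32(const void *p)
{
  uint32_t raw;
  memcpy(&raw, p, sizeof raw);  /* the only access to *p is a byte copy */
  return ntohl(raw);            /* network order to host order */
}
```

With this, `a = load_net32 (p);' replaces `a = ntohl (*(u32 *) p);'.
Whether the copy is optimized into a single load depends on the
compiler, as noted above.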

Yes, I recognize that this is not as compact a coding style as you
are used to.  All the previous discussion about difficulty of
conversion stands; both your opinions and mine.

--
Mark Mitchell                   mark@codesourcery.com
CodeSourcery, LLC               http://www.codesourcery.com

^ permalink raw reply	[flat|nested] 218+ messages in thread

* Re: Linux and aliasing?
  1999-06-06  1:18                                           ` Martin v. Loewis
@ 1999-06-06 10:46                                             ` Linus Torvalds
  1999-06-30 15:43                                               ` Linus Torvalds
  1999-06-06 17:56                                             ` Jason Merrill
  1999-06-30 15:43                                             ` Martin v. Loewis
  2 siblings, 1 reply; 218+ messages in thread
From: Linus Torvalds @ 1999-06-06 10:46 UTC (permalink / raw)
  To: Martin v. Loewis; +Cc: egcs

On Sun, 6 Jun 1999, Martin v. Loewis wrote:
> 
> Well, no. The 'normal' kind of cast is very common, and frequently
> used in the Linux kernel. For example, if a tty driver routine is
> called (e.g. drivers/char/rocket.c :-), it fetches driver_data and
> casts it to the device-specific type (i.e. (struct r_port *)).

No, but that "normal" kind of cast does not imply an immediate _dereference_
(which is the only case where my new rule would kick in).

I agree that it is perfectly normal (and unavoidable) to do pointer
casting like

	struct specific_struct * mystruct;

	mystruct = (struct specific_struct *) data_struct->private_member;

and then use "mystruct". No argument. Linux (and tons of other programs) 
do this all over the place, exactly because you have "anonymous"  generic
pointers whose usage depends on who is the "owner" of that pointer. 

The other obvious case is things like

	mystruct = (struct specific_struct *) malloc();

which kind of falls under the same header.

But my proposal would only change cases where you actually dereference
such a cast without ever using the cast for anything else, which I
consider to be "dodgy" code unless the cast is there explicitly as a type
forcing conversion (that would imply no alias information). 

"Normal" use like the two examples above would (and should) NOT be
impacted by any proposal of mine.
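
For concreteness, a minimal sketch of that "normal" pattern, with
made-up types (`struct generic_dev' and `struct port_state' are
assumptions for illustration, not kernel structures).  The pointer is
cast back to the type that was originally stored, kept in a variable,
and only then dereferenced.

```c
/* Hypothetical device-specific state.  */
struct port_state {
  int line;
};

/* Hypothetical generic object carrying an owner-defined pointer.  */
struct generic_dev {
  void *private_data;   /* the generic layer never looks inside */
};

static void attach_port(struct generic_dev *dev, struct port_state *port)
{
  dev->private_data = port;   /* struct port_state * converts to void * */
}

static int port_line(const struct generic_dev *dev)
{
  /* Cast back to the stored type and keep the result in a variable;
     the object really is a struct port_state, so this is conforming.  */
  struct port_state *port = (struct port_state *) dev->private_data;
  return port->line;
}
```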

> In these cases, people typically save the cast result in a variable
> instead of dereferencing it, so they would not suffer from your
> anti-aliasing mechanism. These uses of casts are conforming C code:
> The driver put an r_port pointer into driver_data earlier on.

Indeed. 

Maybe people worried that those kinds of uses would be changed by the
change I proposed. They wouldn't. I would be upset if they were, and I
would understand that others would be upset if they were. That would imply
a real lack of alias information. 


> > 	grep '\*(.*\* *)' */*.c
> 
> It is actually the other casts that the Linux contributors need to
> worry about. Alias problems are very hard to find (as you pointed
> out), and somebody will have to go over the complete kernel source and
> investigate every single cast - if you ever plan to turn-on
> -fstrict-aliasing.

I agree. We need to be careful. But my proposal has two advantages: 

 - it takes care of the obvious cases (not just for the kernel, but for a
   ton of other programs), and has a nice "do what I mean" kind of
   semantic for all the cases I found.

   In fact, try the above "grep" on the gcc sources themselves. You'll see
   code like

	*((EMUSHORT *) r) = w[3];
	*((EMUSHORT *) r + 1) = w[2];
	*((EMUSHORT *) r + 2) = w[1];
	*((EMUSHORT *) r + 3) = w[0];

   as part of the "PUT_REAL()" macro, and then you'll see usage like

	PUT_REAL (g, &r);
	return (r);

   which is ILLEGAL because strict-aliasing might decide that "r" could be
   loaded before PUT_REAL has changed it, because "EMUSHORT" cannot alias
   with "double". But it's _exactly_ the kind of code that my proposal
   would just automatically do the right thing for. 

   It's "Do what I mean!"

   Do you see? Gcc _itself_ wouldn't mind having the feature I propose.
   Does that make people more likely to realize why I'm proposing it? 
   Maybe people still thought that this was something kernel-specific? 

 - The other advantage is that when going through the other cases more
   carefully, my proposal would make it trivial to fix them up by just
   adding a cast. I consider this part of the proposal to be a smaller
   advantage, though.
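
For comparison, a minimal sketch of how a store into a "double" can be
written without the EMUSHORT cast, using the rule that an unsigned char
pointer may access any object.  The helper names are hypothetical and
the word reordering that PUT_REAL performs is deliberately left out;
this only shows the access pattern.

```c
#include <stddef.h>

/* Hypothetical helper: fill *DST from raw bytes through an unsigned
   char pointer, which is allowed to alias an object of any type.  */
static void put_bytes(double *dst, const unsigned char *src)
{
  unsigned char *out = (unsigned char *) dst;
  size_t i;
  for (i = 0; i < sizeof *dst; i++)
    out[i] = src[i];
}

/* Hypothetical inverse: copy the bytes of *SRC out the same way.  */
static void get_bytes(unsigned char *dst, const double *src)
{
  const unsigned char *in = (const unsigned char *) src;
  size_t i;
  for (i = 0; i < sizeof *src; i++)
    dst[i] = in[i];
}
```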

Oh, well. I don't seem to be convincing people who have all dug themselves
into their own view.

		Linus

^ permalink raw reply	[flat|nested] 218+ messages in thread

* Re: Linux and aliasing?
  1999-06-06 10:44                         ` mark
@ 1999-06-06 14:17                           ` Linus Torvalds
  1999-06-06 17:41                             ` mark
  1999-06-30 15:43                             ` Linus Torvalds
  1999-06-30 15:43                           ` mark
  1 sibling, 2 replies; 218+ messages in thread
From: Linus Torvalds @ 1999-06-06 14:17 UTC (permalink / raw)
  To: mark; +Cc: rth, tim, craig, davem, chip, egcs

On Sun, 6 Jun 1999 mark@codesourcery.com wrote:
> 
> Right.  But the part that's causing aliasing issues is just a memcpy;
> that's the `*(u32 *) p' bit.   You could write:
> 
>   memcpy (&a, p, sizeof (a));
>   a = ntohl (a);

Which is crap.

And a compiler that requires you to write code like that is, by
implication..

If it comes to examples like the above, then "generated code" is not the
major point of contention any more. Bad syntax and requiring programmers
to do ludicrous things _is_ the issue. 

Language design is not just about making it easy for the compiler.  It's
also about making things easy to do for the programmer.  The above is BAD! 

If you can't see why

	a = ntohl(*(u32 *) p);

is better than the horrible thing you're suggesting (regardless of whether
the code generated is the same or not), then I might as well throw in the
towel immediately. The whole point of my suggestion was to make good code
generation possible with an interface that you can actually use without
barfing..

		Linus

^ permalink raw reply	[flat|nested] 218+ messages in thread

* Re: Linux and aliasing?
  1999-06-06 14:17                           ` Linus Torvalds
@ 1999-06-06 17:41                             ` mark
  1999-06-07  8:58                               ` Linus Torvalds
  1999-06-30 15:43                               ` mark
  1999-06-30 15:43                             ` Linus Torvalds
  1 sibling, 2 replies; 218+ messages in thread
From: mark @ 1999-06-06 17:41 UTC (permalink / raw)
  To: torvalds; +Cc: rth, tim, craig, davem, chip, egcs

>>>>> "Linus" == Linus Torvalds <torvalds@transmeta.com> writes:

    Linus> On Sun, 6 Jun 1999 mark@codesourcery.com wrote:
    >>  Right.  But the part that's causing aliasing issues is just a
    >> memcpy; that's the `*(u32 *) p' bit.  You could write:
    >> 
    >> memcpy (&a, p, sizeof (a)); a = ntohl (a);

    Linus> Which is crap.

I think that I freely admitted in the posting that this approach is
not as convenient as what you had.  I think you can also see how to
wrap this up in a macro (probably using the already documented, and
hence guaranteed, statement-expression extension).

    Linus> And a compiler that requires you to write code like that
    Linus> is, by implication..

Was that really necessary? :-)

    Linus> If you can't see why

    Linus> 	a = ntohl(*(u32 *) p);

    Linus> is better than the horrible thing you're suggesting
    Linus> (regardless of whether the code generated is the same or
    Linus> not), then I might as well throw in the towel
    Linus> immediately.

I can see that the "horrible" thing is less convenient for you.  I've
also made sound (in my opinion, naturally) technical arguments against
your proposal, on grounds having not only to do with maintenance of
GCC, but also to do with the impact on code-generation for conforming
programs.  (For example, your proposal, as written, pessimizes:

  int i;	   
  int *ip = &i;
  void *vp = ip;
  *((int*) vp) = 3;

You can amend your proposal to handle the void (and perhaps char?)
case specially, but what about structures with common initial
segments, as used in object-oriented C?  TCL, for example, is one
program that uses this kind of thing heavily.  The Xt toolkit is
another, and there's something that is often performance-critical on a
Linux system.)
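
A minimal sketch of one form of that idiom, with made-up types (these
are not the actual TCL or Xt declarations).  The cast in `button_kind'
is dereferenced immediately, yet it is conforming, because a pointer to
a structure may be converted to a pointer to its first member.

```c
/* Hypothetical "base class": every widget starts with this header.  */
struct widget {
  int kind;
};

/* Hypothetical "derived class".  */
struct button {
  struct widget base;   /* must be the first member */
  int pressed;
};

static int button_kind(const struct button *b)
{
  /* A cast followed directly by a dereference: the shape the proposal
     would treat as aliasing everything, although it is valid C.  */
  return ((const struct widget *) b)->kind;
}
```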

So, unless I'm overruled, *something* is going to have to change in

  a = ntohl(*(u32 *) p);

if you're going to enable type-based alias analysis.

BTW, I've been notified in private mail that you pointed out a bug in
GCC's real.c, involving exactly the kinds of casts we're arguing about.
(I somehow missed that message from you.)   Thanks for pointing that
out!  I'll fix it soon.

I gather that you suggested your proposal would avoid changing GCC.
But, it wouldn't, since GCC's first stage is compiled with a (possibly
non-GCC) host compiler.  Thus, GCC *must* be written in legal ANSI/ISO
C. 

Even in the kernel, your proposal will lead to a confusing situation.
You claim it's DWIM, but there the "I" really is "Linus Torvalds", and
not necessarily the rest of us.  People used to the ANSI/ISO C
aliasing rules will have to read the GCC manual very carefully to
figure out the meaning of your code.

I think by now you've been presented with a variety of strategies for
solving the problem in the kernel, including more than one idea for
macros that you could use like:

  ALIASING_CAST (type, x)

that would do what you want.  I believe Richard Henderson suggested
one involving local unions; you could also use memcpy as I suggested.
(There may be alignment issues that make my suggestion better, or
maybe not.  I'm not sure.) Using this approach would make your code
clear and self-documenting.  (It would be DWIS!)  This approach is
better than my earlier suggestion (using unions in header files): it
does not require header-file duplication, and requires only local
changes to the kernel.  (What function really could be improved by
type-based alias analysis?  Put it in a separate file.  Use
ALIASING_CAST in it.  Compile *that file* with -fstrict-aliasing.
Performance win, little additional maintenance cost, no impact on the
rest of the kernel.)
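
A minimal sketch of what the memcpy flavour of such a macro might look
like.  The name `ALIASING_LOAD' is hypothetical, it only covers reads
(not the lvalue use that a real ALIASING_CAST might need), and it
relies on the GNU C statement-expression extension, so it is not
portable ISO C.

```c
#include <string.h>

/* Hypothetical macro, GNU C only: read an object of TYPE from PTR by
   copying bytes instead of dereferencing a cast pointer.  */
#define ALIASING_LOAD(type, ptr)                          \
  ({ type aliasing_tmp_;                                  \
     memcpy(&aliasing_tmp_, (ptr), sizeof aliasing_tmp_); \
     aliasing_tmp_; })

/* Example use: fetch the first unsigned short stored in an unsigned int.  */
static unsigned short first_half(const unsigned int *word)
{
  return ALIASING_LOAD(unsigned short, word);
}
```

Because the copy is bytewise, the source pointer needs no particular
alignment, which may matter for the alignment question raised above.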

Even if we implemented your proposal you'd have to audit all your code
to make sure that all the technically invalid casts come in
expressions that are immediately dereferenced, and not stored in
temporaries.

At this point, I strongly suggest you abandon your proposal.  Nobody
looks likely to implement it (at least on a volunteer basis), and I've
pointed out that it will be hard to do so, even if it was agreed that
it was a good thing to do.  Sorry.

--
Mark Mitchell                   mark@codesourcery.com
CodeSourcery, LLC               http://www.codesourcery.com

^ permalink raw reply	[flat|nested] 218+ messages in thread

* Re: Linux and aliasing?
  1999-06-06  1:18                                           ` Martin v. Loewis
  1999-06-06 10:46                                             ` Linus Torvalds
@ 1999-06-06 17:56                                             ` Jason Merrill
  1999-06-06 19:24                                               ` Tim Hollebeek
                                                                 ` (3 more replies)
  1999-06-30 15:43                                             ` Martin v. Loewis
  2 siblings, 4 replies; 218+ messages in thread
From: Jason Merrill @ 1999-06-06 17:56 UTC (permalink / raw)
  To: Martin v. Loewis; +Cc: torvalds, egcs

It seems to me that this issue is broader than the kernel; any code that
uses casts to, say, get at the bitwise representation of a floating point
value is likely to break.  This seems like a very unfortunate state of
affairs, because this sort of failure is inherently hard to track down.  We
can't just silently emit bad code and say that the standard allows us to do
that; as Linus says, it's a QOI issue.

One simple and safe solution would be to turn off -fstrict-aliasing in any
function which contains a pointer cast (reinterpret_casts only in C++),
along with a warning.  Power users could override this behavior with a flag.

Jason

^ permalink raw reply	[flat|nested] 218+ messages in thread

* Re: Linux and aliasing?
  1999-06-06 17:56                                             ` Jason Merrill
@ 1999-06-06 19:24                                               ` Tim Hollebeek
  1999-06-30 15:43                                                 ` Tim Hollebeek
  1999-06-06 22:23                                               ` Jeffrey A Law
                                                                 ` (2 subsequent siblings)
  3 siblings, 1 reply; 218+ messages in thread
From: Tim Hollebeek @ 1999-06-06 19:24 UTC (permalink / raw)
  To: Jason Merrill; +Cc: martin, torvalds, egcs

Jason Merrill writes ...
> 
> It seems to me that this issue is broader than the kernel; any code that
> uses casts to, say, get at the bitwise representation of a floating point
> value is likely to break.  This seems like a very unfortunate state of
> affairs, because this sort of failure is inherently hard to track down.  We
> can't just silently emit bad code and say that the standard allows us to do
> that; as Linus says, it's a QOI issue.
> 
> One simple and safe solution would be to turn off -fstrict-aliasing in any
> function which contains a pointer cast (reinterpret_casts only in C++),
> along with a warning.  Power users could override this behavior with a flag.

No, this is not a simple and safe solution.  Simply because it doesn't work.

Consider:

extern int *get_representation(float *);

float fiddle() {
    float d = 3.14159;
    int *s;

    s = get_representation(&d); /* Look, Ma, no casts! */
    (*s) |= 0x00300;
    return d; /* oops! */
}

-fno-strict-aliasing is the only safe solution, if you want to
guarantee this definition of "bad" code never occurs.  Sure, there are
cases where you can prove pointer aliasing can't happen, but then we
don't need the ANSI rules at all anyway!

The interesting case is where the ANSI rules say something can't be
aliased, but it is impossible or very hard for gcc to say whether this
assertion is correct or not (the above case falls in the impossible
category).  If you want to be safe, -fno-strict-aliasing is the
solution.  But when it's on, you can't warn about such pointer stores
since you'd have to warn about any store you can't prove points to an
object of the same type.  Requires data flow -> zillions of false
positives.

-Tim

^ permalink raw reply	[flat|nested] 218+ messages in thread

* Re: Linux and aliasing?
  1999-06-06 17:56                                             ` Jason Merrill
  1999-06-06 19:24                                               ` Tim Hollebeek
@ 1999-06-06 22:23                                               ` Jeffrey A Law
  1999-06-30 15:43                                                 ` Jeffrey A Law
       [not found]                                               ` <199906070645.IAA00615@mira.isdn.cs.tu-berlin.de>
  1999-06-30 15:43                                               ` Jason Merrill
  3 siblings, 1 reply; 218+ messages in thread
From: Jeffrey A Law @ 1999-06-06 22:23 UTC (permalink / raw)
  To: Jason Merrill; +Cc: Martin v. Loewis, torvalds, egcs

  In message < u9lndw7rr9.fsf@yorick.cygnus.com >you write:
  > It seems to me that this issue is broader than the kernel; any code that
  > uses casts to, say, get at the bitwise representation of a floating point
  > value is likely to break.
This has been the most common problem I've seen over the years with type
based alias analysis.  However, as I've mentioned before, most folks have
already fixed their code so that it would work with various vendor compilers
that added type based alias analysis years ago.


jeff


^ permalink raw reply	[flat|nested] 218+ messages in thread

* Re: f77 vs type based alias analysis
  1999-06-05  5:45                                 ` Toon Moene
  1999-06-05  6:23                                   ` Andi Kleen
@ 1999-06-06 23:12                                   ` Jeffrey A Law
  1999-06-30 15:43                                     ` Jeffrey A Law
  1999-06-06 23:20                                   ` Linux and aliasing? Jeffrey A Law
  1999-06-30 15:43                                   ` Toon Moene
  3 siblings, 1 reply; 218+ messages in thread
From: Jeffrey A Law @ 1999-06-06 23:12 UTC (permalink / raw)
  To: Toon Moene; +Cc: craig, egcs

Note, we've got two threads in this message (my fault actually)...  I'll
split the Fortran thread off from the main aliasing thread.

  In message < 37591A00.ACE54339@moene.indiv.nluug.nl >you write:
  > This was suggested, but I replied that I didn't believe that to be the
  > reason.  Note that Fortran basically only has one "scope" for automatic
  > variables (whether arrays or scalars):  The complete subprogram (i.e.
  > subroutine or function).
  > 
  > That means that in the scope Mark's alias analysis works in, automatic
  > arrays are created precisely once (at the beginning of that scope) and
  > destroyed exactly once (at the end of said scope); hence, there is no
  > opportunity to re-use stack slots.
Hmmm.  Well, there's a reasonably easy way to verify if it is the stack
slot issue.

Replace "flag_strict_aliasing" with "0" in function.c and benchmark against
the unmodified version of the compiler.

That allows us to isolate the slot combination/reuse issues from the rest of
the strict aliasing changes.  While there is the slight chance you'll get wrong
code, I doubt it'll happen in practice with Fortran.  And at this point we're
just trying to find out why the code is slower when strict aliasing is enabled,
so 100% correctness isn't needed.


jeff

 

^ permalink raw reply	[flat|nested] 218+ messages in thread

* Re: Linux and aliasing?
  1999-06-05  5:45                                 ` Toon Moene
  1999-06-05  6:23                                   ` Andi Kleen
  1999-06-06 23:12                                   ` f77 vs type based alias analysis Jeffrey A Law
@ 1999-06-06 23:20                                   ` Jeffrey A Law
  1999-06-30 15:43                                     ` Jeffrey A Law
  1999-06-30 15:43                                   ` Toon Moene
  3 siblings, 1 reply; 218+ messages in thread
From: Jeffrey A Law @ 1999-06-06 23:20 UTC (permalink / raw)
  To: Toon Moene
  Cc: Joe Buck, Linus Torvalds, craig, mark, davem, chip, egcs, Andi Kleen

  In message < 37591A00.ACE54339@moene.indiv.nluug.nl >you write:
  > Ah, yes, but the discussion is whether we should have gcc generate
  > "reasonable" behaviour where "reasonable" is defined by a small group of
  > users.  Note that all "behaviours" not explicitly required by the
  > Standard are prone to:
  > 
  > 1. Erosion (within a decade, gcc maintainers forget why we did this in
  >    the first place: "Hey, look at this code - what hair - and it is
  >    undefined behaviour according to the Standard in the first place;
  >    rip it out")
  > 
  > 2. Contradiction (the C0X Standard defines the previously undefined
  >    behaviour, but in a way incompatible with the "reasonable" behaviour
  >    we thought up here).
I can't agree more.  I haven't caught up on the whole thread yet, but in
general it seems like a mistake from a design standpoint to extend GCC in
the manner that I've seen suggested here.

If the Linux kernel folks don't want to change their (non-conforming) code,
then they should use -fno-strict-aliasing.  Yes it will inhibit some opts,
but that's the price one pays for writing non-conforming code.

jeff

^ permalink raw reply	[flat|nested] 218+ messages in thread

* Re: Linux and aliasing?
       [not found]                                               ` <199906070645.IAA00615@mira.isdn.cs.tu-berlin.de>
@ 1999-06-07  2:14                                                 ` Jason Merrill
  1999-06-07  8:02                                                   ` mark
                                                                     ` (2 more replies)
  0 siblings, 3 replies; 218+ messages in thread
From: Jason Merrill @ 1999-06-07  2:14 UTC (permalink / raw)
  To: Martin v. Loewis; +Cc: egcs

>>>>> Martin v Loewis <martin@mira.isdn.cs.tu-berlin.de> writes:

 > We don't really emit *bad* code: garbage in, garbage out. People will
 > run into problems, yes. We will advertise this new feature in big
 > letters, and people will recompile with -fno-strict-aliasing, and then
 > see whether it still breaks (for some other reason). If it was an
 > aliasing problem, we can tell them that their code was not C.

My problem with this is that only people who read the release notes for
*this release* will see the big letters.  Meanwhile, people who start with
a later version or aren't involved in deploying the tools or whatever won't
see the warning.  Meanwhile, casts are an intuitive way to achieve the
desired effect, much more obvious than unions, so people who haven't been
explicitly warned will continue to write code that uses unsafe casts,
without realizing that it will break.

Saying "garbage in, garbage out" is a cop-out.  If we're going to let code
like this break, we need to emit a warning so that people know that they
have a problem, rather than leaving them to debug obscure problems.

BTW, the union trick isn't part of C, either.  It's a GCC implementation
choice.

Jason

^ permalink raw reply	[flat|nested] 218+ messages in thread

* Re: Linux and aliasing?
  1999-06-05 16:53                                         ` mark
@ 1999-06-07  2:36                                           ` Jamie Lokier
  1999-06-07  8:04                                             ` mark
  1999-06-30 15:43                                             ` Jamie Lokier
  1999-06-30 15:43                                           ` mark
  1 sibling, 2 replies; 218+ messages in thread
From: Jamie Lokier @ 1999-06-07  2:36 UTC (permalink / raw)
  To: mark; +Cc: ak, toon, law, jbuck, torvalds, craig, davem, chip, egcs

mark@codesourcery.com wrote:
>     Jamie>   *(foo*)(void*)(&x)
> 
> I intended this to be covered by my proposal.  This would officially
> be a "funny cast", and considered able to alias anything, provided
> that x is a variable or an expression of the form a->b or a.b.

Why the restriction on x?
Things I've seen around, that are outside your proposal:

 - accessing an integer/float as a struct, to access individual parts.
 - vice versa to generate hash values / do vector operations.
 - accessing an integer array as different size integers for fast
   vector operations (e.g. image processing).

Image processing is a particular pain.  As are optimised implementations
of strlen, strcpy, memcpy etc.

We could just tell people their code will work if the pointed-to entity
happens to be a struct member.  We could tell them to use a union like
they're supposed to.

But simply fixing a vector processing kernel to use unions won't
guarantee correct code: all the callers must be changed to use the union
representation too, because the image processing ops may get inlined.
Hence the special char * exception so things like memcpy work I suppose.
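
A rough sketch of the two conforming idioms this points at: inspecting memory through unsigned char *, which the character-type exception always permits, and loading a wider word from a byte buffer with memcpy instead of casting to uint32_t *. The function names byte_sum and load_u32 are made up for illustration.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Reading any object through unsigned char * is always permitted,
   which is what lets routines like memcpy be written in C at all. */
unsigned long byte_sum(const void *p, size_t n)
{
    const unsigned char *b = p;
    unsigned long sum = 0;
    size_t i;

    for (i = 0; i < n; i++)
        sum += b[i];
    return sum;
}

/* Load four bytes as a 32-bit word in host byte order.  memcpy avoids
   the uint32_t * cast; many current compilers reduce it to one load. */
uint32_t load_u32(const unsigned char *p)
{
    uint32_t w;

    memcpy(&w, p, sizeof w);
    return w;
}
```

Callers keep their plain byte or integer arrays; nothing has to be redeclared as a union for these helpers to stay valid when inlined.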

>     Jamie> I bet there's still a fair bit of that around.  In the
>     Jamie> modern world we've got *reinterpret_cast<foo*>(&x), which
>     Jamie> is presumably treated specially w.r.t. aliases.
> 
> No, it does not.  The use of reinterpret_cast does not exempt a
> standard-conforming program from the rules about using an lvalue of
> the wrong type to access storage.

I meant more along the lines of "what GCC does" than "what the standard
says" on this.  I realise this gives undefined behaviour standard-wise.

Presumably reinterpret_cast is equivalent to one of your "funny casts"?

have a nice day,
-- Jamie

^ permalink raw reply	[flat|nested] 218+ messages in thread

* Re: Linux and aliasing?
  1999-06-05 11:09                                       ` David S. Miller
  1999-06-05 12:11                                         ` Toon Moene
@ 1999-06-07  6:01                                         ` Joern Rennecke
  1999-06-30 15:43                                           ` Joern Rennecke
  1999-06-30 15:43                                         ` David S. Miller
  2 siblings, 1 reply; 218+ messages in thread
From: Joern Rennecke @ 1999-06-07  6:01 UTC (permalink / raw)
  To: David S. Miller; +Cc: mark, ak, toon, law, jbuck, torvalds, craig, chip, egcs

> Also some of the datastructures one would need to change are included
> by userspace applications, especially for some of the networking
> instances, and thus one would have ABI issues to concern themselves
> about if they were to go and perform these transformations.  Much more
> is it than a tedious chore.  One could certainly create another header
> file, leave the old one alone with the same name, and use only the new
> one inside the kernel, but does it make sense to have two copies and
> maintain them?

No.  You could still have a single header file, and control the variant
portions with #ifdef __KERNEL__ / #else / #endif .
Or if you have some recurring common type mix, you could use some macros
in the declarations that are defined differently for kernel and user space.
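
A sketch of the single-header approach described above. The struct net_addr type and its member names are hypothetical and exist only to illustrate the technique; on typical ABIs both variants have the same size and layout, so user-space programs compiled against the old declaration should be unaffected.

```c
#include <stdint.h>

/* Hypothetical structure shared between kernel and user space. */
struct net_addr {
#ifdef __KERNEL__
    /* Kernel view: a union makes the differently typed accesses explicit. */
    union {
        uint32_t word;
        uint16_t half[2];
    } a;
#else
    /* User-space view: the layout applications were compiled against. */
    uint32_t a;
#endif
    uint16_t port;
};
```

Only one file has to be maintained; the kernel build defines __KERNEL__ and sees the union, everything else sees the plain integer.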

^ permalink raw reply	[flat|nested] 218+ messages in thread

* Re: Linux and aliasing?
  1999-06-07  2:14                                                 ` Jason Merrill
@ 1999-06-07  8:02                                                   ` mark
  1999-06-07  8:41                                                     ` David S. Miller
  1999-06-30 15:43                                                     ` mark
  1999-06-07 13:11                                                   ` Jeffrey A Law
  1999-06-30 15:43                                                   ` Jason Merrill
  2 siblings, 2 replies; 218+ messages in thread
From: mark @ 1999-06-07  8:02 UTC (permalink / raw)
  To: jason; +Cc: martin, egcs

>>>>> "Jason" == Jason Merrill <jason@cygnus.com> writes:

    Jason> My problem with this is that only people who read the
    Jason> release notes for *this release* will see the big letters.
    Jason> Meanwhile, people who start with a later version or aren't
    Jason> involved in deploying the tools or whatever won't see the
    Jason> warning. 

This is a valid point.

    Jason> Saying "garbage in, garbage out" is a cop-out.  If we're
    Jason> going to let code like this break, we need to emit a
    Jason> warning so that people know that they have a problem,
    Jason> rather than leaving them to debug obscure problems.

Any warning will yield many false positives.  For example, there are
places in GCC where we store a `foo *' in a `tree' slot in a data
structure.  We (hopefully) never access it as a tree; we just cast it
back and forth.  That's legal.  It's also perhaps worth warning about.
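
To make the false-positive case concrete, here is a small sketch of that pattern. The types are simplified stand-ins invented for illustration (GCC's real tree type is far richer): a pointer is parked in a slot of another pointer type and converted back before use, and the stored pointer is never dereferenced as the wrong type.

```c
/* Simplified stand-ins, not GCC's actual declarations. */
struct tree_node { int code; };
typedef struct tree_node *tree;

struct foo { int value; };

struct slot { tree t; };

/* Park a `struct foo *` in a slot declared as `tree`. */
void slot_put(struct slot *s, struct foo *p)
{
    s->t = (tree)p;
}

/* Convert back to the original type before any dereference. */
struct foo *slot_get(const struct slot *s)
{
    return (struct foo *)s->t;
}
```

A warning keyed on the casts alone would fire on both functions even though no object is ever accessed through an lvalue of the wrong type.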

But, warning like mad over every object-oriented C program seems
annoying.  Of course, a warning that's part of -W or some such is
probably OK.  But, then it won't accomplish what you want.  The
warning has to be on by default to overcome the problem you describe.
It's a reasonable point of view to argue that this is a better QOI
than the current state, but it's reasonable to argue the opposite as
well: false positive warnings are pretty annoying, and obscure the
real warnings.

We've had -fstrict-aliasing on in the tree for a long time, and had
very few bugs that we tracked down to the kind of thing you are
talking about.  But, admittedly, we haven't yet had it on in a general
release, so it's a relatively small sample size.

It's reasonable to argue that -fstrict-aliasing should be off by
default.  Then, you have to turn it on to get the benefits; when you
do so, it's fair to expect that you figured out what it did before you
did it, and it's easy to realize that if something went wrong after
you turned it on that it had to do with -fstrict-aliasing.
Unfortunately, this alternative means that most code will never be
compiled with this optimization.  I'm not sure what to think; QOI also
involves generating code that goes as fast as possible when presented
with conformant code, and not requiring people to root around the
manual to find funny flags to write into their makefiles to make the
code go fast.  It's a trade-off; reasonable people can certainly
disagree on whether or not -fstrict-aliasing should be on by default.

Note that this choice would have absolutely no bearing on the
*original* discussion regarding the Linux kernel; they want
-fstrict-aliasing *on* and still to have their code work as they
expect.

    Jason> BTW, the union trick isn't part of C, either.  It's a GCC
    Jason> implementation choice.

True enough.  This is implementation-defined behavior.  (Not
"undefined", but still "implementation-defined".)  I believe the
"memcpy trick" is the only 100% portable solution.

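As a rough illustration of the two tricks side by side (the function names are invented here, and float is assumed to be 32 bits wide): the union version depends on what the implementation promises, which GCC documents as supported, while the memcpy version only copies bytes between two objects.

```c
#include <stdint.h>
#include <string.h>

/* Union trick: write one member, read another.  Whether this gives the
   expected bits is up to the implementation; GCC documents that it does. */
uint32_t float_bits_union(float f)
{
    union { float f; uint32_t u; } pun;

    pun.f = f;
    return pun.u;
}

/* memcpy trick: copy the object representation into an integer. */
uint32_t float_bits_memcpy(float f)
{
    uint32_t u;

    memcpy(&u, &f, sizeof u);
    return u;
}
```
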
--
Mark Mitchell                   mark@codesourcery.com
CodeSourcery, LLC               http://www.codesourcery.com

^ permalink raw reply	[flat|nested] 218+ messages in thread

* Re: Linux and aliasing?
  1999-06-07  2:36                                           ` Jamie Lokier
@ 1999-06-07  8:04                                             ` mark
  1999-06-30 15:43                                               ` mark
  1999-06-30 15:43                                             ` Jamie Lokier
  1 sibling, 1 reply; 218+ messages in thread
From: mark @ 1999-06-07  8:04 UTC (permalink / raw)
  To: egcs; +Cc: ak, toon, law, jbuck, torvalds, craig, davem, chip, egcs

>>>>> "Jamie" == Jamie Lokier <egcs@tantalophile.demon.co.uk> writes:

    Jamie> mark@codesourcery.com wrote: *(foo*)(void*)(&x)
    >>  I intended this to be covered by my proposal.  This would
    >> officially be a "funny cast", and considered able to alias
    >> anything, provided that x is a variable or an expression of the
    >> form a->b or a.b.

    Jamie> Why the restriction on x?  Things I've seen around, that
    Jamie> are outside your proposal:

So that we can be sure this code is non-conforming before we pessimize
it.

    Jamie> Presumably reinterpret_cast is equivalent to one of your
    Jamie> "funny casts"?

I hadn't intended that.

--
Mark Mitchell                   mark@codesourcery.com
CodeSourcery, LLC               http://www.codesourcery.com

^ permalink raw reply	[flat|nested] 218+ messages in thread

* Re: Linux and aliasing?
  1999-06-07  8:02                                                   ` mark
@ 1999-06-07  8:41                                                     ` David S. Miller
  1999-06-07  9:24                                                       ` Jeffrey A Law
                                                                         ` (2 more replies)
  1999-06-30 15:43                                                     ` mark
  1 sibling, 3 replies; 218+ messages in thread
From: David S. Miller @ 1999-06-07  8:41 UTC (permalink / raw)
  To: mark; +Cc: jason, martin, egcs

   From: mark@codesourcery.com
   Date: Mon, 07 Jun 1999 08:05:31 -0700

   We've had -fstrict-aliasing on in the tree for a long time, and had
   very few bugs that we tracked down to the kind of thing you are
   talking about.  But, admittedly, we haven't yet had it on in a
   general release, so it's a relatively small sample size.

True.

One issue which seems to not be mentioned explicitly, is that such a
change is typically not of the "flag day" variety, which turning it on
for the next release seems to imply.

Other compiler vendors seem to have done it in two stages:

1) Ok, the strict aliasing is there, but not a default optimization,
   you have to enable it explicitly.  But come next release it will be
   on by default and thus you have ample time to fixup your code.

2) It's on by default in this new subsequent release, we warned you.

The time between two compiler releases is more than sufficient time
for both ends of the equation (the compiler and its users) to work
out the issue.

Compiler vendors who have done this typically provide the default
compiler for a single system.  For EGCS we know of at least 4 whole
systems (Linux and the 3 publicly available BSD variants) which use
gcc as the default compiler.

Later,
David S. Miller
davem@redhat.com

^ permalink raw reply	[flat|nested] 218+ messages in thread

* Re: Linux and aliasing?
  1999-06-06 17:41                             ` mark
@ 1999-06-07  8:58                               ` Linus Torvalds
  1999-06-07  9:18                                 ` mark
                                                   ` (2 more replies)
  1999-06-30 15:43                               ` mark
  1 sibling, 3 replies; 218+ messages in thread
From: Linus Torvalds @ 1999-06-07  8:58 UTC (permalink / raw)
  To: mark; +Cc: rth, tim, craig, davem, chip, egcs

On Sun, 6 Jun 1999 mark@codesourcery.com wrote:
> 
> BTW, I've been notified in private mail that you pointed out a bug in
> GCC's real.c, involving exactly the kinds of casts we're arguing about.
> (I somehow missed that message from you.)   Thanks for pointing that
> out!  I'll fix it soon.

Note that I didn't point it out as a kind of "nyaah, nyaah!" kind of
thing: it just happens that I had the gcc sources on-line and thought I'd
idly check whether it looked like it could have problems just to make
people realize how PERMEATING this is.

> I gather that you suggested your proposal would avoid changing GCC.
> But, it wouldn't, since GCC's first stage is compiled with a (possibly
> non-GCC) host compiler.  Thus, GCC *must* be written in legal ANSI/ISO
> C. 

My proposal is not just about "avoiding changing X", whether X be gcc, the
kernel, or anything else.

What I _really_ wanted to point out is that even among the people who (a)
should know and (b) now quote the standard as a legal reason to do
anything, these kinds of things happen. 

My proposal is really a way of saying "ok, there is old code out there,
and we want to try to be as graceful about it as we can".

In the case of the Linux kernel, that "gracefulness" would be something I
would be really happy to take advantage of, as I don't expect to compile
the kernel with much else.

> Even in the kernel, your proposal will lead to a confusing situation.
> You claim it's DWIM, but there the "I" really is "Linus Torvalds", and
> not necessarily the rest of us.  People used to the ANSI/ISO C
> aliasing rules will have to read the GCC manual very carefully to
> figure out the meaning of your code.

No. People used to the ANSI/ISO C aliasing rules (all five of them) will
just point to the code and say it is not strictly conforming, and then
they will go back to building their ivory towers. 

It's not just the kernel. It's not just gcc. I bet there are things like
this in just about all major projects - some of which we'll never see
source code for. 

My proposal might mean that fewer people will use the "-fno-strict-alias"
switch, because they won't have to. I don't think you realize how most
professional software projects work. The "professional" part means that
people are under a deadline and don't really care about your standards
conformance, they want things to WORK. 

That may not be your definition of professional, but it's a fact of life. 

That means that I suspect that if there isn't some simple workaround (like
mine), then it's not just the kernel project that uses the disable switch.
Is that what you want?

Flexibility is a GOOD thing. Even if that flexibility means "Oh, you don't
_have_ to program to the standard, and I'll still try to do the best I
can". 

Think of it this way: you still support "-traditional -O2" - you try to
generate good code even when presented with C that isn't even called C any
more.

Why? Because the code is out there, and it's not worth changing thousands
of software packages when you can instead change one: the compiler.

> I think by now you've been presented with a variety of strategies for
> solving the problem in the kernel, including more than one idea for
> macros that you could use like:
> 
>   ALIASING_CAST (type, x)

I've been told in private email, that the proposed macro wasn't even
standards conforming in the sense that it doesn't guarantee that the
compiler couldn't decide it aliases (because in order to guarantee that
the union should contain all possible types). It happens to work for gcc. 

I don't know whether that is true - I don't have the official standard
around. But you might want to check that out.

> that would do what you want.  I believe Richard Henderson suggested
> one involving local unions; you could also use memcpy as I suggested.

Or I could use "-fno-strict-alias" which is actually preferable to
starting to introduce ugly code.

I think it was Craig who complained about maintenance. "Ugly code" is a
big maintenance issue, and it's always much much better if the "obvious"
code works even if it is not "strictly conforming". The kernel doesn't try
to be strictly conforming anyway, we use tons of other things.

> Even if we implemented your proposal you'd have to audit all your code
> to make sure that all the technically invalid casts come in
> expressions that are immediately dereferenced, and not stored in
> temporaries.

Sure. But it wouldn't result in horribly ugly code.

I'm not using egcs at the moment. As such, I'm just seeing reports saying
that it's broken wrt the kernel. My reaction is still that people should
just use gcc-2.7.2, because it's just too _painful_ to upgrade to egcs.
Oh, well..

> At this point, I strongly suggest you abandon your proposal.  Nobody
> looks likely to implement it (at least on a volunteer basis),

Andy Kleen already said he was playing with patches that implemented it,
but just ignore that, like you ignore all the other arguments I've
presented. Sorry,

		Linus

^ permalink raw reply	[flat|nested] 218+ messages in thread

* Re: Linux and aliasing?
  1999-06-07  8:58                               ` Linus Torvalds
@ 1999-06-07  9:18                                 ` mark
  1999-06-07  9:29                                   ` Linus Torvalds
  1999-06-30 15:43                                   ` mark
  1999-06-07 13:34                                 ` Jamie Lokier
  1999-06-30 15:43                                 ` Linus Torvalds
  2 siblings, 2 replies; 218+ messages in thread
From: mark @ 1999-06-07  9:18 UTC (permalink / raw)
  To: torvalds; +Cc: rth, tim, craig, davem, chip, egcs

>>>>> "Linus" == Linus Torvalds <torvalds@transmeta.com> writes:

    Linus> My proposal might mean that fewer people will use the
    Linus> "-fno-strict-alias" switch, because they won't have to. I
    Linus> don't think you realize how most professional software
    Linus> projects work. The "professional" part means that people
    Linus> are under a deadline and don't really care about your
    Linus> standards conformance, they want things to WORK.

Please don't make these kinds of statements.  They're not becoming.

I am a professional developer, paid for my work.  Some of that work is
on free software, some is not.  Before my current job, I worked as a
technical lead in a midsized software corporation, where I, and my team,
were all professional developers.  I brought two or three products to
release, and I'm well aware of the pressures, both technical and
otherwise, that accompany such a project.

    Linus> Andy Kleen already said he was playing with patches that
    Linus> implemented it, but just ignore that, like you ignore all
    Linus> the other arguments I've presented. Sorry,

I did not ignore Andy.  Indeed, I responded to him, both personally
and to the list.  I discussed his/your proposal with him, and pointed
out technical flaws both in your proposal, and in the obvious way of
implementing it.  (I haven't seen Andy's patches, so I don't know what
approach he took; all I said was why the obvious one won't work.)

--
Mark Mitchell                   mark@codesourcery.com
CodeSourcery, LLC               http://www.codesourcery.com

^ permalink raw reply	[flat|nested] 218+ messages in thread

* Re: Linux and aliasing?
  1999-06-07  8:41                                                     ` David S. Miller
@ 1999-06-07  9:24                                                       ` Jeffrey A Law
  1999-06-07  9:29                                                         ` David S. Miller
  1999-06-30 15:43                                                         ` Jeffrey A Law
  1999-06-07  9:32                                                       ` Joe Buck
  1999-06-30 15:43                                                       ` David S. Miller
  2 siblings, 2 replies; 218+ messages in thread
From: Jeffrey A Law @ 1999-06-07  9:24 UTC (permalink / raw)
  To: David S. Miller; +Cc: mark, jason, martin, egcs

  In message < 199906071545.IAA07099@pizda.davem.net >you write:
  > Other compiler vendors seem to have done it in two stages:
  > 
  > 1) Ok, the strict aliasing is there, but not a default optimization,
  >    you have to enable it explicitly.  But come next release it will be
  >    on by default and thus you have ample time to fixup your code.
Err, we had this in egcs-1.1.  One had to enable type based alias analysis
explicitly via -fstrict-aliasing.


  > 2) It's on by default in this new subsequent release, we warned you.
With the full intention of doing this for egcs-1.2 (now gcc-2.95).  It
probably was even mentioned in the release notes for egcs-1.1.

jeff


^ permalink raw reply	[flat|nested] 218+ messages in thread

* Re: Linux and aliasing?
  1999-06-07  9:18                                 ` mark
@ 1999-06-07  9:29                                   ` Linus Torvalds
  1999-06-07  9:38                                     ` Tim Hollebeek
  1999-06-30 15:43                                     ` Linus Torvalds
  1999-06-30 15:43                                   ` mark
  1 sibling, 2 replies; 218+ messages in thread
From: Linus Torvalds @ 1999-06-07  9:29 UTC (permalink / raw)
  To: mark; +Cc: rth, tim, craig, davem, chip, egcs

On Mon, 7 Jun 1999 mark@codesourcery.com wrote:
> 
>     Linus> My proposal might mean that fewer people will use the
>     Linus> "-fno-strict-alias" switch, because they won't have to. I
>     Linus> don't think you realize how most professional software
>     Linus> projects work. The "professional" part means that people
>     Linus> are under a deadline and don't really care about your
>     Linus> standards conformance, they want things to WORK.
> 
> Please don't make these kinds of statements.  They're not becoming.

What? It's not about being becoming. It's about how things work. Not in
all places, I'll give you that, and not in the best places. But I think
you have rose-coloured glasses if you think most software projects are
going to spend a lot of time to try to get everything to run perfectly.

People write so-so code, and then they hope that by using -O2 the compiler
will make it good. When it doesn't, and when they find out it was due to
strict aliasing rules (if they ever do - the more likely scenario is
that they'll ship the binary compiled with just -O), they'll turn it off
rather than fight it.

You obviously disagree. This isn't technology, so there isn't "one right
answer",

		Linus

^ permalink raw reply	[flat|nested] 218+ messages in thread

* Re: Linux and aliasing?
  1999-06-07  9:24                                                       ` Jeffrey A Law
@ 1999-06-07  9:29                                                         ` David S. Miller
  1999-06-30 15:43                                                           ` David S. Miller
  1999-06-30 15:43                                                         ` Jeffrey A Law
  1 sibling, 1 reply; 218+ messages in thread
From: David S. Miller @ 1999-06-07  9:29 UTC (permalink / raw)
  To: law; +Cc: mark, jason, martin, egcs

   Date: Mon, 07 Jun 1999 10:14:11 -0600
   From: Jeffrey A Law <law@cygnus.com>

     > 2) It's on by default in this new subsequent release, we warned you.

   With the full intention of doing this for egcs-1.2 (now gcc-2.95).  It
   probably was even mentioned in the release notes for egcs-1.1.

It was?  If so, I stand totally corrected, thanks.

Later,
David S. Miller
davem@redhat.com

^ permalink raw reply	[flat|nested] 218+ messages in thread

* Re: Linux and aliasing?
  1999-06-07  8:41                                                     ` David S. Miller
  1999-06-07  9:24                                                       ` Jeffrey A Law
@ 1999-06-07  9:32                                                       ` Joe Buck
  1999-06-30 15:43                                                         ` Joe Buck
  1999-06-30 15:43                                                       ` David S. Miller
  2 siblings, 1 reply; 218+ messages in thread
From: Joe Buck @ 1999-06-07  9:32 UTC (permalink / raw)
  To: David S. Miller; +Cc: mark, jason, martin, egcs

David Miller writes:

> One issue which seems to not be mentioned explicitly, is that such a
> change is typically not of the "flag day" variety, which turning it on
> for the next release seems to imply.
> 
> Other compiler vendors seem to have done it in two stages:

> 1) Ok, the strict aliasing is there, but not a default optimization,
>    you have to enable it explicitly.  But come next release it will be
>    on by default and thus you have ample time to fixup your code.

> 2) It's on by default in this new subsequent release, we warned you.

I'm inclined to agree with you.  This is why I suggested that it perhaps
shouldn't be the default for gcc-2.95.  This is still a possibility (it's
a couple of lines of code at most to flip the default).

One question we don't really have data on is how many programs will break.
If it's only the Linux kernel and a couple of others, we can just use
-fno-strict-aliasing for the affected programs. If many programs are
affected, we don't want to break them all, causing massive inconvenience
to users and developers and damaging the reputation of egcs/gcc.

^ permalink raw reply	[flat|nested] 218+ messages in thread

* Re: Linux and aliasing?
  1999-06-07  9:29                                   ` Linus Torvalds
@ 1999-06-07  9:38                                     ` Tim Hollebeek
  1999-06-07 10:05                                       ` Jamie Lokier
                                                         ` (2 more replies)
  1999-06-30 15:43                                     ` Linus Torvalds
  1 sibling, 3 replies; 218+ messages in thread
From: Tim Hollebeek @ 1999-06-07  9:38 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: mark, rth, craig, davem, chip, egcs

Linus Torvalds writes ...
> 
> People write so-so code, and then they hope that by using -O2 the compiler
> will make it good. When it doesn't, and when they find out it was due to
> strict aliasing rules (if they ever do - the more likely scenario is
> that they'll ship the binary compiled with just -O), they'll turn it off
> rather than fight it.

This is going to happen even with the Torvalds hack.  If they are
writing code that ignores the aliasing rules, not every single
instance will conform to the Torvalds "all pointer trickery happens in
a single expression" coding style.  Hence their binary will still fail.

Then we'll have to explain two things to them instead of just one: the
ANSI rules, and the extra Torvalds non-ANSI rules.

-Tim

^ permalink raw reply	[flat|nested] 218+ messages in thread

* Re: Linux and aliasing?
  1999-06-07  9:38                                     ` Tim Hollebeek
@ 1999-06-07 10:05                                       ` Jamie Lokier
  1999-06-30 15:43                                         ` Jamie Lokier
  1999-06-07 10:44                                       ` Linus Torvalds
  1999-06-30 15:43                                       ` Tim Hollebeek
  2 siblings, 1 reply; 218+ messages in thread
From: Jamie Lokier @ 1999-06-07 10:05 UTC (permalink / raw)
  To: Tim Hollebeek; +Cc: Linus Torvalds, mark, rth, craig, davem, chip, egcs

Tim Hollebeek wrote:
> This is going to happen even with the Torvalds hack.  If they are
> writing code that ignores the aliasing rules, not every single
> instance will conform to the Torvalds "all pointer trickery happens in
> a single expression" coding style.  Hence their binary will still fail.
> 
> Then we'll have to explain two things to them instead of just one: the
> ANSI rules, and the extra Torvalds non-ANSI rules.

Which is why we should make the compiler emit a warning for anything
that _looks_ like it might break because of the aliasing rules.

This will include some conforming code.  Such is life.  It can be
disabled case by case -- just as you can write `if ((x & 1))' or `int
x __attribute__ ((unused))' to disable some other useful warnings.

My proposal for a type attribute handles this quite generally -- no
Torvalds style required.
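
For illustration: GCC releases later than the ones discussed here document
a `may_alias` type attribute that takes this per-type approach.  A minimal
sketch, assuming a GCC-compatible compiler and a 32-bit IEEE 754 `float`;
the names `u32_alias` and `float_bits` are invented for this example, and
the attribute is not necessarily the one proposed in this thread:

```c
#include <stdint.h>

/* GCC extension: accesses through this type are treated like char
   accesses by alias analysis, so they may alias any object. */
typedef uint32_t __attribute__((__may_alias__)) u32_alias;

/* Return the object representation of a float as a 32-bit integer.
   Assumes sizeof(float) == sizeof(uint32_t) and compatible alignment. */
uint32_t float_bits(const float *f)
{
    return *(const u32_alias *)f;
}
```

Only accesses made through `u32_alias` are exempted; the rest of the
translation unit can still be compiled with strict aliasing enabled.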

have a nice day,
-- Jamie

^ permalink raw reply	[flat|nested] 218+ messages in thread

* Re: Linux and aliasing?
  1999-06-07  9:38                                     ` Tim Hollebeek
  1999-06-07 10:05                                       ` Jamie Lokier
@ 1999-06-07 10:44                                       ` Linus Torvalds
  1999-06-07 11:22                                         ` Jeffrey A Law
  1999-06-30 15:43                                         ` Linus Torvalds
  1999-06-30 15:43                                       ` Tim Hollebeek
  2 siblings, 2 replies; 218+ messages in thread
From: Linus Torvalds @ 1999-06-07 10:44 UTC (permalink / raw)
  To: Tim Hollebeek; +Cc: mark, rth, craig, davem, chip, egcs

On Mon, 7 Jun 1999, Tim Hollebeek wrote:
> 
> This is going to happen even with the Torvalds hack.  If they are
> writing code that ignores the aliasing rules, not every single
> instance will conform to the Torvalds "all pointer trickery happens in
> a single expression" coding style.  Hence their binary will still fail.

Yes. But it's less likely to fail.

However, somebody else did suggest just a warning (for the "torvalds case"
and potentially for other cases that are deemed suspect), and I certainly
agree with that as a kind of "uhhuh, somebody is doing something
dangerous". 

However, even then I think we'd _also_ need to have a syntactically
cleaner way of fixing it - if a warning is generated it obviously would
need to have some way of disabling the warning on a case-by-case basis
(with either saying "it's ok to alias - don't warn me" or a "oh, you were
right, this could alias, please consider it to be in alias set zero").

I completely agree with that kind of extension - it's obviously better
than mine, but it's also much more ambitious than my quick and simple
hack.

> Then we'll have to explain two things to them instead of just one: the
> ANSI rules, and the extra Torvalds non-ANSI rules. 

Just explain it as "dangerous code", and give examples. There are
certainly bound to be other cases, although the "torvalds case" is the
obvious and most common one. 

		Linus

^ permalink raw reply	[flat|nested] 218+ messages in thread

* Re: Linux and aliasing?
  1999-06-07 10:44                                       ` Linus Torvalds
@ 1999-06-07 11:22                                         ` Jeffrey A Law
  1999-06-08  1:34                                           ` Nick Ing-Simmons
  1999-06-30 15:43                                           ` Jeffrey A Law
  1999-06-30 15:43                                         ` Linus Torvalds
  1 sibling, 2 replies; 218+ messages in thread
From: Jeffrey A Law @ 1999-06-07 11:22 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Tim Hollebeek, mark, rth, craig, davem, chip, egcs

  In message < Pine.LNX.3.95.990607103826.22680A-100000@penguin.transmeta.com >you write:
  > > Then we'll have to explain two things to them instead of just one: the
  > > ANSI rules, and the extra Torvalds non-ANSI rules. 
  > 
  > Just explain it as "dangerous code", and give examples. There are
  > certainly bound to be other cases, although the "torvalds case" is the
  > obvious and most common one. 
Building a new set of aliasing rules which are only going to be used by the
Linux kernel to avoid making their code standards compliant is simply dumb.
We should follow the ISO/ANSI rules and be done with it.

I'm not going to approve any changes of this nature.
jeff



^ permalink raw reply	[flat|nested] 218+ messages in thread

* Re: Linux and aliasing?
  1999-06-07  2:14                                                 ` Jason Merrill
  1999-06-07  8:02                                                   ` mark
@ 1999-06-07 13:11                                                   ` Jeffrey A Law
  1999-06-30 15:43                                                     ` Jeffrey A Law
  1999-06-30 15:43                                                   ` Jason Merrill
  2 siblings, 1 reply; 218+ messages in thread
From: Jeffrey A Law @ 1999-06-07 13:11 UTC (permalink / raw)
  To: Jason Merrill; +Cc: Martin v. Loewis, egcs

  In message < u9hfok74ok.fsf@yorick.cygnus.com >you write:
  > Saying "garbage in, garbage out" is a cop-out.  If we're going to let code
  > like this break, we need to emit a warning so that people know that they
  > have a problem, rather than leaving them to debug obscure problems.
Anyone want to work on this?  If we can spit out a warning without getting
too many false positives it would be a win.    I'm not familiar enough with
the front-ends to even guess how much work it might be.

jeff

^ permalink raw reply	[flat|nested] 218+ messages in thread

* Re: Linux and aliasing?
  1999-06-07  8:58                               ` Linus Torvalds
  1999-06-07  9:18                                 ` mark
@ 1999-06-07 13:34                                 ` Jamie Lokier
  1999-06-30 15:43                                   ` Jamie Lokier
  1999-06-30 15:43                                 ` Linus Torvalds
  2 siblings, 1 reply; 218+ messages in thread
From: Jamie Lokier @ 1999-06-07 13:34 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: mark, rth, tim, craig, davem, chip, egcs

Linus Torvalds wrote:
> > I think by now you've been presented with a variety of strategies for
> > solving the problem in the kernel, including more than one idea for
> > macros that you could use like:
> > 
> >   ALIASING_CAST (type, x)
> 
> I've been told in private email, that the proposed macro wasn't even
> standards conforming in the sense that it doesn't guarantee that the
> compiler couldn't decide it aliases (because in order to guarantee that
> the union should contain all possible types). It happens to work for gcc.

Luckily, GCC has __typeof__.
So you can just put the target type and __typeof__(x) in the union --
should work shouldn't it?
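
A rough sketch of that idea, assuming GCC's `__typeof__` extension and C99
compound literals.  The macro name `PUN_CAST` and the helper `float_to_bits`
are hypothetical, and this converts a value rather than a pointer, so it is
not the ALIASING_CAST macro mentioned above, whose definition is not shown
here:

```c
#include <stdint.h>

/* Hypothetical macro: reinterpret the value x as `type` through a union
   that holds exactly the source type and the target type.  Both types
   are assumed to have the same size. */
#define PUN_CAST(type, x) \
    (((union { __typeof__(x) from; type to; }){ .from = (x) }).to)

/* Example use: fetch the bit pattern of a float (assumes 32-bit float). */
uint32_t float_to_bits(float f)
{
    return PUN_CAST(uint32_t, f);
}
```

GCC documents reading a union member other than the one last written as
supported type punning, which is what the sketch relies on.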

-- Jamie

^ permalink raw reply	[flat|nested] 218+ messages in thread

* Re: Linux and aliasing?
  1999-06-07 11:22                                         ` Jeffrey A Law
@ 1999-06-08  1:34                                           ` Nick Ing-Simmons
  1999-06-08  1:48                                             ` Jeffrey A Law
  1999-06-30 15:43                                             ` Nick Ing-Simmons
  1999-06-30 15:43                                           ` Jeffrey A Law
  1 sibling, 2 replies; 218+ messages in thread
From: Nick Ing-Simmons @ 1999-06-08  1:34 UTC (permalink / raw)
  To: law; +Cc: chip, davem, rth, craig, egcs, Linus Torvalds, Tim Hollebeek, mark

Jeffrey A Law <law@cygnus.com> writes:
>  In message < Pine.LNX.3.95.990607103826.22680A-100000@penguin.transmeta.com >you write:
>  > > Then we'll have to explain two things to them instead of just one: the
>  > > ANSI rules, and the extra Torvalds non-ANSI rules. 
>  > 
>  > Just explain it as "dangerous code", and give examples. There are
>  > certainly bound to be other cases, although the "torvalds case" is the
>  > obvious and most common one. 
>Building a new set of aliasing rules which are only going to be used by the
>Linux kernel to avoid making their code standards compliant is simply dumb.

It is not just the Linux kernel, it is _any_ kernel and 
aliasing tricks are _everywhere_ - most embedded C has them too, 
I would be surprised if X11 did not have them, ...

-- 
Nick Ing-Simmons <nik@tiuk.ti.com>
Via, but not speaking for: Texas Instruments Ltd.

^ permalink raw reply	[flat|nested] 218+ messages in thread

* Re: Linux and aliasing?
  1999-06-08  1:34                                           ` Nick Ing-Simmons
@ 1999-06-08  1:48                                             ` Jeffrey A Law
  1999-06-30 15:43                                               ` Jeffrey A Law
  1999-06-30 15:43                                             ` Nick Ing-Simmons
  1 sibling, 1 reply; 218+ messages in thread
From: Jeffrey A Law @ 1999-06-08  1:48 UTC (permalink / raw)
  To: Nick Ing-Simmons
  Cc: chip, davem, rth, craig, egcs, Linus Torvalds, Tim Hollebeek, mark

  In message < 199906080832.JAA09214@tiuk.ti.com >you write:
  > It is not just the Linux kernel, it is _any_ kernel and 
  > aliasing tricks are _everywhere_ - most embedded C has them too, 
  > I would be surprised if X11 did not have them, ...
And such code is perfectly welcome to use -fno-strict-aliasing to avoid
problems with their non-portable code.  Non-conforming code of this nature
is in the vast minority relative to conforming code.

I would be amazed if X11 did not already fix this -- X11 has been building
with vendor compilers that have been doing type based alias analysis for years.

jeff


^ permalink raw reply	[flat|nested] 218+ messages in thread

* Re: Linux and aliasing?
  1999-06-04  5:47                 ` craig
@ 1999-06-30 15:43                   ` craig
  0 siblings, 0 replies; 218+ messages in thread
From: craig @ 1999-06-30 15:43 UTC (permalink / raw)
  To: torvalds; +Cc: craig

>Sure, I can live with -fno-strict-aliasing. But I'm also really saddened
>by all the lawyers like you who think that standards are somehow more
>important than programmers. 

*Here*'s a clue: WE'RE PROGRAMMERS TOO.  Try reminding yourself of that
before the *next* time you flame our efforts to get a release out the
door without even providing a patch, much less a detailed specification,
to do what you want, okay?

>I think it's a damn shame that instead of technical arguments _everything_
>revolves around people reading the standard as if it was the bible, and
>trying to make people feel guilty for not really caring. It's not a sin to
>just want to get good code without having to do magic contortions, guys.

No, but it's stupid to want to do that in C or C++.  Even Fortran is a
better choice, and *it's* got lots of problems.

Ask Dakota Scientific Systems how they produce some of the most-optimized
numerical libraries on the planet.  They start with the original Fortran
code.  They look at the output from the native compiler for the particular
combination of architecture/CPU/cache-size/memory-latency that they're
targeting.  Then, they tweak the *original* Fortran code in order to
convince the compiler to generate better output for that target.

Fortunately, they seem experienced enough to understand that this process
works only for a *particular* version of a compiler, rather than
believing that their tweaks must be honored for all time by that
compiler.  And, they don't seem to think it's necessary to ask the
compiler folks to add all sorts of fiddly little knobs to do *their*
work for them, based on my impressions from talking with one of the
people there.

But, then, they're *real* programmers.

        tq vm, (burley)

^ permalink raw reply	[flat|nested] 218+ messages in thread

* Re: Linux and aliasing?
  1999-06-06 14:17                           ` Linus Torvalds
  1999-06-06 17:41                             ` mark
@ 1999-06-30 15:43                             ` Linus Torvalds
  1 sibling, 0 replies; 218+ messages in thread
From: Linus Torvalds @ 1999-06-30 15:43 UTC (permalink / raw)
  To: mark; +Cc: rth, tim, craig, davem, chip, egcs

On Sun, 6 Jun 1999 mark@codesourcery.com wrote:
> 
> Right.  But the part that's causing aliasing issues is just a memcpy;
> that's the `*(u32 *) p' bit.   You could write:
> 
>   memcpy (&a, p, sizeof (a));
>   a = ntohl (a);

Which is crap.

And a compiler that requires you to write code like that is, by
implication..

If it comes to examples like the above, then "generated code" is not the
major point of contention any more. Bad syntax and requiring programmers
to do ludicrous things _is_ the issue. 

Language design is not just about making it easy for the compiler.  It's
also about making things easy to do for the programmer.  The above is BAD! 

If you can't see why

	a = ntohl(*(u32 *) p);

is better than the horrible thing you're suggesting (regardless of whether
the code generated is the same or not), then I might as well throw in the
towel immediately. The whole point of my suggestion was to make good code
generation possible with an interface that you can actually use without
barfing..
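
For reference, a minimal sketch of the memcpy variant under discussion,
wrapped in a helper so that the call site stays about as short as the cast.
The name `load_be32` is made up for this example, and `ntohl` is assumed to
come from the POSIX header <arpa/inet.h>:

```c
#include <arpa/inet.h>   /* ntohl (POSIX) */
#include <stdint.h>
#include <string.h>      /* memcpy */

/* Read a 32-bit network-order value from an arbitrary byte pointer. */
uint32_t load_be32(const void *p)
{
    uint32_t a;
    memcpy(&a, p, sizeof a);   /* byte copy; no typed access through p */
    return ntohl(a);
}
```

With such a helper the call becomes `a = load_be32(p);`.  Whether the
compiler turns the memcpy into a single load depends on the compiler and
target, so the generated code is not guaranteed to match the cast version.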

		Linus

^ permalink raw reply	[flat|nested] 218+ messages in thread

* Re: Linux and aliasing?
  1999-06-04 10:22                           ` Jeffrey A Law
  1999-06-04 10:31                             ` Joe Buck
  1999-06-04 11:11                             ` Toon Moene
@ 1999-06-30 15:43                             ` Jeffrey A Law
  2 siblings, 0 replies; 218+ messages in thread
From: Jeffrey A Law @ 1999-06-30 15:43 UTC (permalink / raw)
  To: Joe Buck; +Cc: Linus Torvalds, craig, mark, davem, chip, egcs

  > Either
  > 1. Leave it as it is (the Linux kernel will need -fno-strict-aliasing).
This is my strong preference.

I see no need to make conforming, portable code run slower.  Lots of folks have
already fixed these problems in their code (in large part because vendor
compilers started doing this kind of alias analysis years ago).

Folks working with non-portable code can use -fno-strict-aliasing and pay
the resulting performance penalty.

jeff



^ permalink raw reply	[flat|nested] 218+ messages in thread

* Re: Linux and aliasing?
  1999-06-05  6:23                                   ` Andi Kleen
  1999-06-05 10:32                                     ` Toon Moene
  1999-06-05 10:37                                     ` mark
@ 1999-06-30 15:43                                     ` Andi Kleen
  2 siblings, 0 replies; 218+ messages in thread
From: Andi Kleen @ 1999-06-30 15:43 UTC (permalink / raw)
  To: Toon Moene
  Cc: law, Joe Buck, Linus Torvalds, craig, mark, davem, chip, egcs,
	Andi Kleen, mark

On Sat, Jun 05, 1999 at 02:37:20PM +0200, Toon Moene wrote:
> Ah, yes, but the discussion is whether we should have gcc generate
> "reasonable" behaviour where "reasonable" is defined by a small group of
> users.  Note that all "behaviours" not explicitly required by the
> Standard are prone to:

Generating faulty code is in my book always unreasonable, even when
the source is not strictly conforming (and the compiler has a realistic
chance to detect it).  

The argument that it may inhibit some optimizations for strictly conforming 
programs I also cannot follow. As I understand it there are basically two 
cases:


1. One casts a pointer to an object to some other non-char pointer and doesn't
access it. This, although strictly conforming, is rather useless, and should
be optimized away anyway. This case is not interesting.

2. One casts a pointer to an object to some other non-char pointer, and
uses the new pointer to access the object. The standard says that is undefined.
Strictly conforming programs cannot do that. Currently gcc generates code
for it that most likely will result in a bug in the program. The casting
proposal turns the "wrong code" interpretation of undefined into something
that has a good chance to make a lot of old programs work again. 

Because case (1) is not interesting (it is a noop) I don't think worrying
about missing optimizations in noops is a good use of one's time.
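
The two cases can be sketched as follows.  The function names are invented
for illustration, and the first function assumes that `long` is at least as
strictly aligned as `float`, so that the pointer conversion itself is valid:

```c
/* Case 1: cast away and back without ever accessing the object through
   the foreign type.  Conforming, and effectively a no-op. */
long read_via_round_trip(long *p)
{
    float *f = (float *)p;   /* conversion only, no access */
    long *q = (long *)f;     /* converting back yields the original pointer */
    return *q;               /* the access uses the object's real type */
}

/* Case 2: access the object through the foreign type.  Undefined under
   the ISO rules; shown for contrast only and never meant to be called. */
float read_as_float(long *p)
{
    return *(float *)p;
}
```

Only the second function is affected by the choice between strict aliasing
and the proposed cast rule; the first behaves the same either way.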

Now of course I agree that it is a good idea to convert the code in the
long run to be strict-aliasing safe, simply to give the optimizer more
information. For some projects like Linux, though, that is a long and
difficult road. I think the best compromise would be to turn -fstrict-aliasing
off by default (as has already been proposed) and to offer a new
-flose-aliasing switch that turns the "turn off alias analysis for casts"
off.  

I'm playing a bit with a patch that implements just that and works in a
similar way to what Linus outlined. I am not sure whether it is worth trying
to detect case (1) (casting, but the result is not directly accessed), or
whether to simply set the alias set to 0 for a pointer cast. I think it is not.

Mark, even if you don't like it, would you as alias-expert-in-residence
think that the basic strategy is workable?



> 
> 1. Erosion (within a decade, gcc maintainers forget why we did this in
>    the first place: "Hey, look at this code - what hair - and it is
>    undefined behaviour according to the Standard in the first place;
>    rip it out")

If it is clearly documented that will not happen.


> 
> 2. Contradiction (the C0X Standard defines the previously undefined
>    behaviour, but in a way incompatible with the "reasonable" behaviour
>    we thought up here).

I don't think that such a vague possibility should guide a gcc design
decision ("in 30 years an asteroid may crash onto earth and ruin your
whole day - don't implement it because the exception handlers don't handle
that event"). Also, there is no clue in the future directions that this may
happen. In any case it wouldn't strike me as a strong enough argument
to suppress a useful feature.


> 
> Cheers,
> 
> [In 24 hours I'm off for my first X3J3 meeting - it shows, doesn't it?]

Definitely.


-Andi

-- 
This is like TV. I don't like TV.

^ permalink raw reply	[flat|nested] 218+ messages in thread

* Re: Linux and aliasing?
  1999-06-05 12:11                                         ` Toon Moene
  1999-06-05 12:21                                           ` David S. Miller
@ 1999-06-30 15:43                                           ` Toon Moene
  1 sibling, 0 replies; 218+ messages in thread
From: Toon Moene @ 1999-06-30 15:43 UTC (permalink / raw)
  To: David S. Miller; +Cc: mark, ak, law, jbuck, torvalds, craig, chip, egcs

David S. Miller wrote:

>    From: mark@codesourcery.com
>    Date: Sat, 05 Jun 1999 10:41:07 -0700

>    Furthermore, I bet that by now, if all this energy had been spent
>    fixing the code in the kernel, you'd have made good headway on some
>    of the most prominent data structures.  Yes, this will be a tedious
>    chore, but it's an easy one: you enclose things in a union,
>    compile, see what doesn't, fix it, and go on.

> What seems to be ignored are the future maintenance costs incurred by
> this set of changes to the kernel, as if "do it and get it over right
> now" is some triviality.  Effort has been expended already to make
> attempts to do this (mentioned here by Andi Kleen who did a run at it
> for the networking), and the findings made there support the
> non-triviality claim, in Andi's case he tossed the work midstream due
> to the non-stop overwhelming accumulation of issues.

If these issues are so pervasive, isn't it easier to use the compiler
flag -fno-strict-aliasing and document *that* ?

I mean, if this sort of trickery permeates the Linux kernel, you won't
get any mileage out of the new optimization anyway, so you could just as
well disable it.

[ Before anyone thinks *I* am a language purist:  I know and have been
  contributing code to our Numerical Weather Prediction programs that
  willfully break Fortran alias assumptions.  We get away with it because
  it is mostly of the form "two arrays overlap completely", which -
  up till now - doesn't seem to be fouled up by optimization passes in
  existing Fortran compilers.  That doesn't mean that I would beat a
  compiler vendor over the head with a blunt object *before* I would
  have checked that our sloppiness is not the cause of our troubles ]

-- 
Toon Moene (toon@moene.indiv.nluug.nl)
Saturnushof 14, 3738 XG  Maartensdijk, The Netherlands
Phone: +31 346 214290; Fax: +31 346 214286
GNU Fortran: http://world.std.com/~burley/g77.html

^ permalink raw reply	[flat|nested] 218+ messages in thread

* Re: Linux and aliasing?
  1999-06-04 10:04                         ` Joe Buck
                                             ` (2 preceding siblings ...)
  1999-06-04 12:59                           ` Alexandre Oliva
@ 1999-06-30 15:43                           ` Joe Buck
  3 siblings, 0 replies; 218+ messages in thread
From: Joe Buck @ 1999-06-30 15:43 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: craig, jbuck, mark, davem, chip, egcs

Linus writes, to Craig:

> I haven't maintained a compiler long-term. I _have_ maintained a larger,
> and arguably more complex system with many more degrees of freedom and
> thus choices than gcc.

No, gcc is substantially larger than the Linux kernel (on the order of 10x
larger, as anyone can easily verify).  gcc does not need to deal with race
conditions, so in that sense it is simpler, but in other respects it is
far more complex than the kernel.

[ personal insults deleted, perhaps the author is in need of a vacation ]

> _I_ think my simple extension was perfectly legitimate, and a _lot_ more
> obvious than a lot of things people are discussing on the lists.

Your "simple extension" will have the effect of -fno-strict-aliasing
for any function that does any pointer cast (there may be marginal
differences if there are loops before the first cast).  So why not just use
-fno-strict-aliasing and get the same code?

I appreciate your desire for a better solution, but your suggestion
doesn't cut it.

Perhaps we should make -fno-strict-aliasing the default, if many programs
in common use do this kind of type-punning.  But that's about all
we can do at this stage for gcc-2.95.  

writing to Craig, Linus says:
> So how about it? Instead of just telling everybody that C isn't a portable
> systems language (which it was designed to be, by people I respect a lot
> more than you, despite all your rhetoric about being such a good language
> person), just tell us why you think the simple "explicit cast invalidates
> type information for aliasing" rule is so bad. 

C is a reasonably portable systems language, provided that the users use
it correctly: if you don't obey the standards you are in the land of
undefined behavior, so your code isn't portable.  I think what Craig is
saying is that it's really not correct to think of it as a portable
assembler, and if you want that level of fine control it may not be
suitable.

> And I realize that people are in a hurry and somewhat stressed to get 2.95
> out the door. I do NOT think that anything like this should be a gating
> issue - that would just be silly. The current egcs works, albeit with a
> too draconian (in my opinion) global flag. Please don't get the feeling
> that I'm doing this just to disrupt some release process. 

OK, maybe we can get somewhere.  It seems to me that there are only two
options for gcc-2.95 on this issue:

Either
1. Leave it as it is (the Linux kernel will need -fno-strict-aliasing).

or

2. Don't enable the new optimization for C unless the user says
   -fstrict-aliasing.

   Since C++ is fussier about type safety, we could make it the default for
   C++ (nice to get those C++ critics who falsely claim C++ is slower than
   C ;-).

Which would you recommend?

^ permalink raw reply	[flat|nested] 218+ messages in thread

* Re: Linux and aliasing?
  1999-06-04  5:47             ` craig
  1999-06-04  8:17               ` Linus Torvalds
@ 1999-06-30 15:43               ` craig
  1 sibling, 0 replies; 218+ messages in thread
From: craig @ 1999-06-30 15:43 UTC (permalink / raw)
  To: torvalds; +Cc: craig

>Craig, don't always try to make the programmer look bad.

Why would I, since you seem to be doing such a good job of it, by
coming here and lecturing us about how to build compilers?

My goal is to make *great* programmers look *better*.  If gcc is
really a tool, it'll be simple, easy to use properly, and behave
consistently no matter where it is used.  *Great* programmers
prefer that kind of tool.  *Lousy* programmers think a hammer should
behave like a screwdriver so they can save some time changing tools
in the middle of a job, or so it can be "compatible" with the
locals' tradition that something called a "hammah" does the job of
a screwdriver.

>Instead of blaming the
>programmer, please just _allow_ him to say "you're just a stupid compiler,
>and you shouldn't be getting too much in my way". Is that so hard to do?

a) you already have the ability to do that, but you seem to not like
it, and b) yes, when it *is* so hard to do that, when there end up
being 5,000 different controls like that, and we decide that, lo and
behold, there's a *bug* in there somewhere, or, hey, maybe we want
to rewrite a chunk of the compiler, and it seems simple enough to do
that, except, first, we have to do an in-depth study on exactly how
those 5,000 controls might interact (since we have little or no
useful documentation on them, other than existing code that might
break if they stop behaving exactly as they did).

>Again, instead of thinking that the compiler always knows best, give the
>user a choice. We're not in Windows any more, Tonto. Give the programmer
>the gun, and allow him to shoot himself in the head. But give him a
>laser-guided nightsight too, in case he wants it.

Any programmer worth his salt, wanting "a laser-guided nightsight",
and not wanting to tweak (or even rewrite) his code for every new
compiler release, will *not* use a C compiler.  Period.

>Don't get caught up in the MS way of doing things, where you not only give
>a programmer a gun, you aim it at (roughly) the wrong target, and you
>pull the trigger for him too. 

The MS way *is* to offer a product with bazillions of little features
that are not really appropriate to the "core mission" of the product.
Or what do *you* call MS Word -- a word processor?  Nobody I know,
who understands design issues, calls it that.

>Face it, there are clever programmers out there. You shouldn't make it
>illegal to be clever. A standard is not a law of nature, and it's not a
>universal excuse to be unfriendly to people who want to go outside the
>standard.

I would prefer gcc default to catering to *great* programmers.  *Clever*
programmers make the fatal mistakes we all have to live with, like
making distinctions based on content of whitespace, or believing that
a key labeled "backspace" must necessarily generate ASCII BS simply
because they share the same name, or believing that their code will
be rewritten and deployed before Jan 1, 2000.

Put another way: I *know* there are clever programmers out there.  That's
what scares me.

Face it: *you* made the fatal mistake here, by choosing C to implement
an operating system that you wanted to be fast, easy to maintain, and
portable.  Even worse: you chose "GNU C", the C language extended by
people who largely didn't understand what they were doing.  And, I agree
100%, all the people making these mistakes, from yourself to RMS, are
"clever programmers".

But you can mitigate the mistakes *you* made by dropping at least one
of the requirements you appear to have for Linux:

  -  Speed.  Rewrite it to accommodate some reasonable subset of ISO C,
     then live with whatever performance you get by tweaking compiler
     options.  That means no `asm', of course.

  -  Easy to maintain.  Decide that, upon every new release of gcc you
     want to compile Linux, you'll commit significant resources studying
     the effects of new optimizations and rewriting *Linux* -- not *gcc*
     -- to accommodate it.

  -  Portable.  Decide that gcc 2.7.2 will forever be the compiler for
     Linux, and that you'll therefore live with never porting Linux
     to new architectures not supported by that version of gcc.

The above will be recognized as a variation on a well-known theme --
"you can have it Soon, Cheap, and Working; choose *two*".

It would at least help us some, in accommodating cases (which, for all
I know, is true of this particular issue, though other gcc contributors
suggest otherwise) where we really blew it in selecting a default,
if you'd focus on the middle item more, to the extent that it results
in sharing what you learn about the effects of new optimizations, etc.
on the existing Linux code base.

I mean, I do recall some discussions of this issue before, but why
are we discussing it *now*, in a release cycle, when there's *nothing*
we can do about it, given that there's no *bug* here?

        tq vm, (burley)

^ permalink raw reply	[flat|nested] 218+ messages in thread

* Re: Linux and aliasing?
  1999-06-03 23:53           ` Martin v. Loewis
@ 1999-06-30 15:43             ` Martin v. Loewis
  0 siblings, 0 replies; 218+ messages in thread
From: Martin v. Loewis @ 1999-06-30 15:43 UTC (permalink / raw)
  To: craig; +Cc: davem, egcs

> For all I know, this problem is the result of C, or gcc, being
> too permissive about allowing casts across pointers to different
> types...

The problem is that ISO C explicitly allows you to cast pointers
back and forth between completely unrelated types. Only when
you *dereference* the pointer must you be consistent in the type,
or dereference through char*.

If you have a sequence

long foo()
{
   long a = 15;
   long *b = &a;
   void *c = b;
   float *d = c;
   *d = 3.14;
   return a;
}

then the standard says that this code has undefined behaviour, yet
every single statement is ok. The compiler could take the position
that d cannot possibly alias with b, and return 15. (In this example,
analysis detects that they are aliased, anyway)
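
By contrast, a small sketch of the access that the rules do permit, namely
inspecting an object through a character type.  The function name `byte_sum`
is made up for illustration, and the comment about the result assumes that
`long` has no padding bits:

```c
#include <stddef.h>

/* Sum the bytes of a long by inspecting it through unsigned char,
   which the ISO rules allow for an object of any type. */
unsigned long byte_sum(const long *obj)
{
    const unsigned char *bytes = (const unsigned char *)obj;
    unsigned long sum = 0;
    for (size_t i = 0; i < sizeof *obj; i++)
        sum += bytes[i];
    return sum;
}
```

For `long a = 15;` the sum is 15 regardless of byte order, because only one
byte of the representation is non-zero.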

Now, there are proposals to relax the rules. There is
-fno-strict-aliasing; I can't really see how there is an in-between.

Regards,
Martin

^ permalink raw reply	[flat|nested] 218+ messages in thread

* Re: Linux and aliasing?
  1999-06-04 12:59                           ` Alexandre Oliva
  1999-06-04 13:29                             ` Joe Buck
@ 1999-06-30 15:43                             ` Alexandre Oliva
  1 sibling, 0 replies; 218+ messages in thread
From: Alexandre Oliva @ 1999-06-30 15:43 UTC (permalink / raw)
  To: Joe Buck; +Cc: Linus Torvalds, craig, mark, davem, chip, egcs

On Jun  4, 1999, Joe Buck <jbuck@Synopsys.COM> wrote:

> Linus writes, to Craig:

>> _I_ think my simple extension was perfectly legitimate, and a _lot_ more
>> obvious than a lot of things people are discussing on the lists.

> Your "simple extension" will have the effect of -fno-strict-aliasing
> for any function that does any pointer cast (there may be marginal
> differences if there are loops before the first cast).  So why not
> just use -fno-strict-aliasing and get the same code?

> I appreciate your desire for a better solution, but your suggestion
> doesn't cut it.

AFAICT, in a cast to `(some_type_t *volatile)', the `volatile' doesn't
have any actual effect on the generated code, because the pointer has
already been evaluated.  Couldn't we implement an extension by which
this `volatile' would kind of have the opposite meaning of `restrict'?
It would mean that the resulting pointer may be aliased to anything
else, so the compiler shouldn't move it around nor optimize it ``too
much''.  It seems to me that `volatile' is the right word to mean it,
especially because it would be ignored by compilers that don't support
this extension.

-- 
Alexandre Oliva http://www.dcc.unicamp.br/~oliva IC-Unicamp, Bra[sz]il
{oliva,Alexandre.Oliva}@dcc.unicamp.br  aoliva@{acm.org,computer.org}
oliva@{gnu.org,kaffe.org,{egcs,sourceware}.cygnus.com,samba.org}
*** E-mail about software projects will be forwarded to mailing lists

^ permalink raw reply	[flat|nested] 218+ messages in thread

* Re: Linux and aliasing?
  1999-06-06 22:23                                               ` Jeffrey A Law
@ 1999-06-30 15:43                                                 ` Jeffrey A Law
  0 siblings, 0 replies; 218+ messages in thread
From: Jeffrey A Law @ 1999-06-30 15:43 UTC (permalink / raw)
  To: Jason Merrill; +Cc: Martin v. Loewis, torvalds, egcs

  In message < u9lndw7rr9.fsf@yorick.cygnus.com >you write:
  > It seems to me that this issue is broader than the kernel; any code that
  > uses casts to, say, get at the bitwise representation of a floating point
  > value is likely to break.
This has been the most common problem I've seen over the years with type
based alias analysis.  However, as I've mentioned before, most folks have
already fixed their code so that it would work with various vendor compilers
that added type based alias analysis years ago.
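
A sketch of the two usual ways such code gets fixed, assuming a 32-bit
IEEE 754 float; the function names are hypothetical. The memcpy form
relies only on ISO C, while the union form relies on the compiler
supporting reads of a union member other than the one last written
(GCC documents that it does):

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Copy the object representation; no pointer of the "wrong" type is
   ever dereferenced. */
uint32_t float_bits_memcpy(float f)
{
    uint32_t u;
    assert(sizeof u == sizeof f);   /* assumes a 32-bit float */
    memcpy(&u, &f, sizeof u);
    return u;
}

/* Type punning through a union instead of a pointer cast. */
uint32_t float_bits_union(float f)
{
    union { float f; uint32_t u; } pun;
    pun.f = f;
    return pun.u;
}
```

Either function can replace an expression such as *(uint32_t *)&f
without needing -fno-strict-aliasing.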


jeff


^ permalink raw reply	[flat|nested] 218+ messages in thread

* Re: Linux and aliasing?
  1999-06-07 13:11                                                   ` Jeffrey A Law
@ 1999-06-30 15:43                                                     ` Jeffrey A Law
  0 siblings, 0 replies; 218+ messages in thread
From: Jeffrey A Law @ 1999-06-30 15:43 UTC (permalink / raw)
  To: Jason Merrill; +Cc: Martin v. Loewis, egcs

  In message < u9hfok74ok.fsf@yorick.cygnus.com >you write:
  > Saying "garbage in, garbage out" is a cop-out.  If we're going to let code
  > like this break, we need to emit a warning so that people know that they
  > have a problem, rather than leaving them to debug obscure problems.
Anyone want to work on this?  If we can spit out a warning without getting
too many false positives it would be a win.    I'm not familiar enough with
the front-ends to even guess how much work it might be.

jeff

^ permalink raw reply	[flat|nested] 218+ messages in thread

* Re: Linux and aliasing?
  1999-06-03 12:25       ` David S. Miller
  1999-06-03 20:06         ` craig
@ 1999-06-30 15:43         ` David S. Miller
  1 sibling, 0 replies; 218+ messages in thread
From: David S. Miller @ 1999-06-30 15:43 UTC (permalink / raw)
  To: mark; +Cc: chip, egcs, torvalds

   From: mark@codesourcery.com
   Date: Thu, 03 Jun 1999 12:07:33 -0700

       David> I'm not saying this should be the normal mode of operation,
       David> but some mechanism needs to exist so that such code can be
       David> made valid _without_ resorting to ugly unions.

   There is one: -fno-strict-aliasing.  

And there is another: if my code is not "standard C" then output an
error instead of silently generating bad code.  Then again, I did not
specify -ansi and -ansi is not the default, and therefore I have not
asked for "standard C", instead I expect to get "GNU C" which
traditionally has been "standard C + common sense" :-)

So in that light, there is:

	If some pointer is cast to another of a different storage
	class or size, I'm doing something strange, and the compiler
	should turn off aliasing optimizations for anything to do with
	that set/class of pointers.

I know it can be detected, and I also know aliasing can be turned off
for a particular class of related pointers just as easily in the
current compiler code.

Common sense should override whatever standards say, where feasible,
and I argue that here it is indeed feasible.

	You might consider implementing this, or hiring someone else
	to do it for you.

Before anyone considers implementing any change, it would be prudent
to make sure most folks agree on the issue and how it should be
solved.

Later,
David S. Miller
davem@redhat.com

^ permalink raw reply	[flat|nested] 218+ messages in thread

* Re: Linux and aliasing?
  1999-06-04  1:08                 ` Branko Cibej
@ 1999-06-30 15:43                   ` Branko Cibej
  0 siblings, 0 replies; 218+ messages in thread
From: Branko Cibej @ 1999-06-30 15:43 UTC (permalink / raw)
  To: egcs

Linus Torvalds wrote:

> I think it's a damn shame that instead of technical arguments _everything_
> revolves around people reading the standard as if it was the bible, and
> trying to make people feel guilty for not really caring. It's not a sin to
> just want to get good code without having to do magic contortions, guys.

Why, Linus, it's trivial to do! Use MSVC instead of egcs, then you can do

     #pragma optimize("a", on/off)

around every single statement, if you like. It even supports anonymous unions
_and_ structs, and is only slightly influenced by the ISO C standard.

They'll even sell you a version that can generate code for the alpha!

>:->

    Brane


P.S.: Just for the record: I, a "real user" and (imnsho) a "clever programmer"
who "know what I'm doing", at least most of the time, and who do not "read the
standard as if it were a bible", vote for having a standard-conforming
compiler that's not bloated by a ton of marginally useful (or even useless)
features. If I want featuritis, I know where I can get it.


P.P.S: They can't even spell "optimise" ...

--
Branko Čibej                 <branko.cibej@hermes.si>
HERMES SoftLab, Litijska 51, 1000 Ljubljana, Slovenia
voice: (+386 61) 186 53 49   fax: (+386 61) 186 52 70


^ permalink raw reply	[flat|nested] 218+ messages in thread

* Re: Linux and aliasing?
  1999-06-06 17:56                                             ` Jason Merrill
                                                                 ` (2 preceding siblings ...)
       [not found]                                               ` <199906070645.IAA00615@mira.isdn.cs.tu-berlin.de>
@ 1999-06-30 15:43                                               ` Jason Merrill
  3 siblings, 0 replies; 218+ messages in thread
From: Jason Merrill @ 1999-06-30 15:43 UTC (permalink / raw)
  To: Martin v. Loewis; +Cc: torvalds, egcs

It seems to me that this issue is broader than the kernel; any code that
uses casts to, say, get at the bitwise representation of a floating point
value is likely to break.  This seems like a very unfortunate state of
affairs, because this sort of failure is inherently hard to track down.  We
can't just silently emit bad code and say that the standard allows us to do
that; as Linus says, it's a QOI issue.

One simple and safe solution would be to turn off -fstrict-aliasing in any
function which contains a pointer cast (reinterpret_casts only in C++),
along with a warning.  Power users could override this behavior with a flag.

Jason

^ permalink raw reply	[flat|nested] 218+ messages in thread

* Re: Linux and aliasing?
  1999-06-07 10:05                                       ` Jamie Lokier
@ 1999-06-30 15:43                                         ` Jamie Lokier
  0 siblings, 0 replies; 218+ messages in thread
From: Jamie Lokier @ 1999-06-30 15:43 UTC (permalink / raw)
  To: Tim Hollebeek; +Cc: Linus Torvalds, mark, rth, craig, davem, chip, egcs

Tim Hollebeek wrote:
> This is going to happen even with the Torvalds hack.  If they are
> writing code that ignores the aliasing rules, not every single
> instance will conform to the Torvalds "all pointer trickery happens in
> a single expression" coding style.  Hence their binary will still fail.
> 
> Then we'll have to explain two things to them instead of just one: the
> ANSI rules, and the extra Torvalds non-ANSI rules.

Which is why we should make the compiler emit a warning for anything
that _looks_ like it might break because of the aliasing rules.

This will include some conforming code.  Such is life.  It can be
disabled case by case -- just as you can write `if ((x & 1))' or `int
x __attribute__ ((unused))' to disable some other useful warnings.

My proposal for a type attribute handles this quite generally -- no
Torvalds style required.

have a nice day,
-- Jamie

^ permalink raw reply	[flat|nested] 218+ messages in thread

* Re: Linux and aliasing?
  1999-06-05 12:21                                           ` David S. Miller
  1999-06-05 16:51                                             ` mark
@ 1999-06-30 15:43                                             ` David S. Miller
  1 sibling, 0 replies; 218+ messages in thread
From: David S. Miller @ 1999-06-30 15:43 UTC (permalink / raw)
  To: toon; +Cc: mark, ak, law, jbuck, torvalds, craig, chip, egcs

   Date: Sat, 05 Jun 1999 21:02:27 +0200
   From: Toon Moene <toon@moene.indiv.nluug.nl>

   I mean, if this sort of trickery permeates the Linux kernel, you
   won't get any mileage out of the new optimization anyway, so you
   could just as well disable it.

It is not believed that this is the case.

Networking will be used as an example.

The core of the fast paths in interrupt level processing consists of
parsing and verifying packet header data.  These are the areas where
non-alias-friendly casts are used to speed up the header inspection
(to decrease the number of load instructions and also decrease the
number of comparisons executed).

Yet in the user side portion of networking, and to a decent amount in
the packet processing once we've obtained the header data, the bulk of
the work consists of updating state in the per-connection data
structures, where a plethora of alias analysis benefits exist.

So the situation here is quite the contrary to the assertion: one of
the most worrisome areas of the kernel, with respect to the
union'ization of data structures to remove non-alias-friendly casts,
is also the place where alias analysis would be highly beneficial.
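
To make the header-inspection case concrete, here is a minimal sketch
of comparing four header bytes with a single word comparison while
staying inside the aliasing rules. The function name and the 4-byte
prefix are hypothetical and are not taken from the kernel sources;
whether the fixed-size memcpy calls compile down to plain loads depends
on the compiler and target:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Compare the first four bytes of a packet against an expected prefix
   using one 32-bit comparison instead of four byte comparisons. */
int header_matches(const unsigned char *pkt, const unsigned char *expect)
{
    uint32_t got, want;

    assert(pkt != NULL && expect != NULL);
    memcpy(&got, pkt, sizeof got);      /* no (uint32_t *) cast needed */
    memcpy(&want, expect, sizeof want);
    return got == want;
}
```

The non-alias-friendly form would instead read *(uint32_t *)pkt
directly, which is the kind of cast under discussion here.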

Later,
David S. Miller
davem@redhat.com

^ permalink raw reply	[flat|nested] 218+ messages in thread

* Re: Linux and aliasing?
  1999-06-04  8:35                       ` Linux and aliasing? Linus Torvalds
  1999-06-04 10:04                         ` Joe Buck
@ 1999-06-30 15:43                         ` Linus Torvalds
  1 sibling, 0 replies; 218+ messages in thread
From: Linus Torvalds @ 1999-06-30 15:43 UTC (permalink / raw)
  To: craig; +Cc: jbuck, mark, davem, chip, egcs

On 4 Jun 1999 craig@jcb-sc.com wrote:
> 
> In other words, you believe you are a better language designer than
> the ISO C people as well as the gcc maintainers, despite the fact
> that you know, what, *nothing* about language design, and *nothing*
> about compiler design and, especially, long-term maintenance of
> compilers?

Craig, instead of only getting down to personal insults, and how you think
I should just stay with one compiler all my life or rewrite my code every
year, how about you actually face any of the =technical= issues? Too
scary?

I haven't maintained a compiler long-term. I _have_ maintained a larger,
and arguably more complex system with many more degrees of freedom and
thus choices than gcc. I know about maintenance, code-boy. Whether you'll
ever admit to that is irrelevant.

So instead of just spouting off crap, why don't you give a single
technical reason why my suggestion is actually BAD? Instead of talking
about "language design" and trying to set yourself up as the only person
in the world who understands the issues, why don't you just face the
technical issues and get down to details?

_I_ think my simple extension was perfectly legitimate, and a _lot_ more
obvious than a lot of things people are discussing on the lists. So don't
give me that crap about not adding new features outside the standard: 
people in the egcs camp do that all the time, and they usually _like_
doing it, even for much more specialized problems like function prologue
and epilogue code generation. 

And I don't see any "language design" issues either - it's a very clean
extension, and makes complete sense. I bet that if we took any average
C programmer (and most of us do =not= know all that much about aliases),
people would understand the extended semantics a lot more easily than they
understand the basic ANSI rules.

So how about it? Instead of just telling everybody that C isn't a portable
systems language (which it was designed to be, by people I respect a lot
more than you, despite all your rhetoric about being such a good language
person), just tell us why you think the simple "explicit cast invalidates
type information for aliasing" rule is so bad. 

And I realize that people are in a hurry and somewhat stressed to get 2.95
out the door. I do NOT think that anything like this should be a gating
issue - that would just be silly. The current egcs works, albeit with a
too draconian (in my opinion) global flag. Please don't get the feeling
that I'm doing this just to disrupt some release process. 

			Linus

^ permalink raw reply	[flat|nested] 218+ messages in thread

* Re: Linux and aliasing?
  1999-06-03 23:03           ` Linus Torvalds
                               ` (3 preceding siblings ...)
  1999-06-04 15:02             ` Richard Henderson
@ 1999-06-30 15:43             ` Linus Torvalds
  4 siblings, 0 replies; 218+ messages in thread
From: Linus Torvalds @ 1999-06-30 15:43 UTC (permalink / raw)
  To: craig; +Cc: davem, mark, chip, egcs

On 4 Jun 1999 craig@jcb-sc.com wrote:
> 
> Maybe it is -- I haven't looked into the issues in detail -- but,
> generally, it is very hard to implement common sense *in the compiler
> itself*.

Oh, agreed.

But it should be reasonably easy to implement very straightforward rules,
and have the rules themselves make common sense ;)

The extremely straightforward rule that at least I would advocate is _so_
straightforward as to be almost scary:
 - if there is a pointer cast, that pointer cast invalidates all
   type-based alias information.

It wouldn't matter if you cast the pointer to the type it had originally
anyway: a cast is a cast is a cast. If somebody dereferences a casted
value, the type information shouldn't be trusted.

It is a common sense rule at least to me, and it should be very simple for
the compiler too. And it has another advantage: it does not expand the
language in any way, and is obviously entirely ANSI compliant (it's
obviously not a _requirement_ of ANSI, but it is certainly allowed by it). 

_And_ any well-written software that isn't trying to be clever with
pointers would never ever notice anything, because the only reasonable
reason why you would ever use a pointer cast is because you're playing
games with the pointers in question, no?

Having a simple cast rule would make most of the alias issues go away
completely right off the bat, and the ones it wouldn't make go away you
can patch up by hand in the sources by just adding a dummy cast if
somebody is doing something _really_ ugly. 

> For all I know, this problem is the result of C, or gcc, being
> too permissive about allowing casts across pointers to different
> types...in the sense that, if that sort of thing was simply
> disallowed, then programmers wouldn't even *think* they "knew what
> they were doing", because they'd be getting compile-time diagnostics,
> which, as you point out, is what they *should* be getting if the
> compiler isn't basically successfully reading the programmer's mind
> and implementing his desires.

Well, the other way of thinking about this is to just say "oh, the
programmer is casting stuff, let's not trust the type system any more". 

Craig, don't always try to make the programmer look bad. Occasionally you
could just admit to the possibility that the programmer _really_ knows
what he is doing, and the compiler does not. Ok? Instead of blaming the
programmer, please just _allow_ him to say "you're just a stupid compiler,
and you shouldn't be getting too much in my way". Is that so hard to do?

> In particular, while it might make sense for *your* application
> to have the compiler "automatically" disable (even localized)
> aliasing when it sees certain "suspicious" constructs, how do we
> know there won't be people who say "hey, *we* use those constructs,
> but we use them *correctly*, and we don't want to lose the
> performance those alias assumptions give us", either now or in
> the future?  Why should *they* have to pay for their more-
> conforming (to the compiler's growing expectations, anyway) usage
> by modifying their code, or even their shell scripts?

They shouldn't. You should have an option that says

 -fno-strict-aliasing

(you have it already) and you should have an option that says

 -freally-strict-alias

but you should probably default to something that makes sense (and the
ANSI C rules certainly do _not_ qualify - I see how they were created, but
they do not "make sense" in any sense of that expression). Something that
notices that "oh, they're playing with the type system, I probably
shouldn't do aliases here". Ok?

In fact, the only _really_ sensible type-based alias system is probably
one where the _only_ thing that overrides a type-based alias is a pointer
cast. ANSI C has all the "funny" rules about "char *" being special etc,
and that means that you lose a lot of potentially very useful information. 
So you might consider having a mode that is _stricter_ than ANSI C in that
regard (not making "char *" anything special as far as the type alias
logic is concerned, but instead implementing _just_ the cast rule). 

So you can think of it as a sum of two independent yes/no rules: "do we
consider 'char *' to be a global alias killer" and "does a pointer cast
invalidate the type-based alias for the casted access?". So give the user
_two_ options:

	-fcast-invalidates-type-alias
	-fansi-alias-rule-invalidation

where the compiler would default to having both rules enabled (for the
"safest" kind of type-based alias), while people who really feel confident
that their program is entirely ANSI-alias safe would say that they do not
want the "cast-invalidates-type-alias" logic enabled. 

And in contrast, people like me who think the ANSI C rules are completely
arbitrary and much harder to understand than the cast alias rule, would at
least have the _option_ to use the simpler and more straightforward setup. 
No? 

Again, instead of thinking that the compiler always knows best, give the
user a choice. We're not in Windows any more, Tonto. Give the programmer
the gun, and allow him to shoot himself in the head. But give him a
laser-guided nightsight too, in case he wants it.

Don't get caught up in the MS way of doing things, where you not only give
a programmer a gun, you aim it at (roughly) the wrong target, and you
pull the trigger for him too. 

Face it, there are clever programmers out there. You shouldn't make it
illegal to be clever. A standard is not a law of nature, and it's not a
universal excuse to be unfriendly to people who want to go outside the
standard.

		Linus

^ permalink raw reply	[flat|nested] 218+ messages in thread

* Re: Linux and aliasing?
  1999-06-06 10:46                                             ` Linus Torvalds
@ 1999-06-30 15:43                                               ` Linus Torvalds
  0 siblings, 0 replies; 218+ messages in thread
From: Linus Torvalds @ 1999-06-30 15:43 UTC (permalink / raw)
  To: Martin v. Loewis; +Cc: egcs

On Sun, 6 Jun 1999, Martin v. Loewis wrote:
> 
> Well, no. The 'normal' kind of cast is very common, and frequently
> used in the Linux kernel. For example, if a tty driver routine is
> called (e.g. drivers/char/rocket.c :-), it fetches driver_data and
> casts it to the device-specific type (i.e. (struct r_port *)).

No, but that "normal" kind of cast does not imply an immediate _dereference_
(which is the only case where my new rule would kick in).

I agree that it is perfectly normal (and unavoidable) to do pointer
casting like

	struct specific_struct * mystruct;

	mystruct = (struct specific_struct *) data_struct->private_member;

and then use "mystruct". No argument. Linux (and tons of other programs) 
do this all over the place, exactly because you have "anonymous"  generic
pointers whose usage depends on who is the "owner" of that pointer. 

The other obvious case is things like

	mystruct = (struct specific_struct *) malloc();

which kind of falls under the same header.

But my proposal would only change cases where you actually dereference
such a cast without ever using the cast for anything else, which I
consider to be "dodgy" code unless the cast is there explicitly as a type
forcing conversion (that would imply no alias information). 

"Normal" use like the two examples above would (and should) NOT be
impacted by any proposal of mine.

> In these cases, people typically save the cast result in a variable
> instead of dereferencing it, so they would not suffer from your
> anti-aliasing mechanism. These uses of casts are conforming C code:
> The driver put an r_port pointer into driver_data earlier on.

Indeed. 

Maybe people worried that those kinds of uses would be changed by the
change I proposed. They wouldn't. I would be upset if they were, and I
would understand that others would be upset if they were. That would imply
a real lack of alias information. 


> > 	grep '\*(.*\* *)' */*.c
> 
> It is actually the other casts that the Linux contributors need to
> worry about. Alias problems are very hard to find (as you pointed
> out), and somebody will have to go over the complete kernel source and
> investigate every single cast - if you ever plan to turn-on
> -fstrict-aliasing.

I agree. We need to be careful. But my proposal has two advantages: 

 - it takes care of the obvious cases (not just for the kernel, but for a
   ton of other programs), and has a nice "do what I mean" kind of
   semantic for all the cases I found.

   In fact, try the above "grep" on the gcc sources themselves. You'll see
   code like

	*((EMUSHORT *) r) = w[3];
	*((EMUSHORT *) r + 1) = w[2];
	*((EMUSHORT *) r + 2) = w[1];
	*((EMUSHORT *) r + 3) = w[0];

   as part of the "PUT_REAL()" macro, and then you'll see usage like

	PUT_REAL (g, &r);
	return (r);

   which is ILLEGAL because strict-aliasing might decide that "r" could be
   loaded before PUT_REAL has changed it, because "EMUSHORT" cannot alias
   with "double". But it's _exactly_ the kind of code that my proposal
   would just automatically do the right thing for. 

   It's "Do what I mean!"

   Do you see? Gcc _itself_ wouldn't mind having the feature I propose.
   Does that make people more likely to realize why I'm proposing it? 
   Maybe people still thought that this was something kernel-specific? 

 - The other advantage is that when going through the other cases more
   carefully, my proposal would make it trivial to fix them up by just
   adding a cast. I consider this part of the proposal to be a smaller
   advantage, though.
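
For comparison, a sketch of how a PUT_REAL-style store can be written
so that it is valid under the strict rules as they stand. The names
put_real and get_real are hypothetical stand-ins, not the gcc macros,
and the sketch assumes that EMUSHORT is a 16-bit unsigned type and that
double is 64 bits wide:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Store four 16-bit words into a double, in the same reversed order as
   the quoted macro, without dereferencing a uint16_t pointer that
   points at the double. */
void put_real(const uint16_t w[4], double *r)
{
    uint16_t tmp[4];

    assert(sizeof tmp == sizeof *r);   /* assumes a 64-bit double */
    tmp[0] = w[3];
    tmp[1] = w[2];
    tmp[2] = w[1];
    tmp[3] = w[0];
    memcpy(r, tmp, sizeof tmp);
}

/* Inverse operation: split a double back into four 16-bit words. */
void get_real(const double *r, uint16_t w[4])
{
    uint16_t tmp[4];

    assert(sizeof tmp == sizeof *r);
    memcpy(tmp, r, sizeof tmp);
    w[3] = tmp[0];
    w[2] = tmp[1];
    w[1] = tmp[2];
    w[0] = tmp[3];
}
```

A caller that does put_real(g, &r) followed by return r then reads a
value the compiler has to treat as modified, since memcpy may write to
any object.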

Oh, well. I don't seem to be convincing people who have all dug themselves
into their own view.

		Linus

^ permalink raw reply	[flat|nested] 218+ messages in thread

* Re: Linux and aliasing?
  1999-06-06 19:24                                               ` Tim Hollebeek
@ 1999-06-30 15:43                                                 ` Tim Hollebeek
  0 siblings, 0 replies; 218+ messages in thread
From: Tim Hollebeek @ 1999-06-30 15:43 UTC (permalink / raw)
  To: Jason Merrill; +Cc: martin, torvalds, egcs

Jason Merrill writes ...
> 
> It seems to me that this issue is broader than the kernel; any code that
> uses casts to, say, get at the bitwise representation of a floating point
> value is likely to break.  This seems like a very unfortunate state of
> affairs, because this sort of failure is inherently hard to track down.  We
> can't just silently emit bad code and say that the standard allows us to do
> that; as Linus says, it's a QOI issue.
> 
> One simple and safe solution would be to turn off -fstrict-aliasing in any
> function which contains a pointer cast (reinterpret_casts only in C++),
> along with a warning.  Power users could override this behavior with a flag.

No, this is not a simple and safe solution.  Simply because it doesn't work.

Consider:

extern int *get_representation(float *);

float fiddle() {
    float d = 3.14159;
    int *s;

    s = get_representation(&d); /* Look, Ma, no casts! */
    (*s) |= 0x00300;
    return d; /* oops! */
}

-fno-strict-aliasing is the only safe solution, if you want to
guarantee this definition of "bad" code never occurs.  Sure, there are
cases where you can prove pointer aliasing can't happen, but then we
don't need the ANSI rules at all anyway!

The interesting case is where the ANSI rules say something can't be
aliased, but it is impossible or very hard for gcc to say whether this
assertion is correct or not (the above case falls in the impossible
category).  If you want to be safe, -fno-strict-aliasing is the
solution.  But when it's on, you can't warn about such pointer stores
since you'd have to warn about any store you can't prove points to an
object of the same type.  Requires data flow -> zillions of false
positives.

-Tim

^ permalink raw reply	[flat|nested] 218+ messages in thread

* Re: Linux and aliasing?
  1999-06-04  0:04               ` Linus Torvalds
                                   ` (4 preceding siblings ...)
  1999-06-04  8:41                 ` Tim Hollebeek
@ 1999-06-30 15:43                 ` Linus Torvalds
  5 siblings, 0 replies; 218+ messages in thread
From: Linus Torvalds @ 1999-06-30 15:43 UTC (permalink / raw)
  To: mark; +Cc: craig, davem, chip, egcs

On Thu, 3 Jun 1999 mark@codesourcery.com wrote:
> 
> I don't think the cast rule is by any means the right obvious default.
> For one thing, it pessimizes object-oriented C code that does
> downcasts through an inheritance hierarchy.  There's no reason that we
> shouldn't be able to use type-based alias analysis in such situations,
> but your proposal would make it not happen.

But those downcasts are implicit, not explicit, no? I think only explicit
casts should break the alias rule.

> You can use -fno-strict-aliasing to get the "traditional" behavior.

Yes. But that's a complete on-off switch. 

> The only effect on your code will be that some optimizations that used
> not to happen, but would with -fstrict-aliasing, will still not
> happen.  What's the big deal?  If -fstrict-aliasing had never been
> implemented, you wouldn't be complaining would you?  So, we've
> improved GCC, and we've preserved the old behavior.

Oh, you don't expect me to complain about bad code generation when I know
gcc could do better?

Why do you have a "-O" flag at all if you think people don't care about
performance?

I'd love to have the alias extensions, but I don't think it should be a
per-file global setting. Sure, I can just be silent, but if you expect all
egcs users to just sit idly when you do silly things, why do you bother
making pre-releases available at all? You obviously don't care about the
feedback you get from real users.

> But, here, you just don't like ANSI/ISO C, and wish it had different
> semantics.  You *could* express what you want in legal ANSI/ISO C, and
> then GCC would do the right thing, with its default flags.

Have you actually ever tried? I don't think you realize quite what a
rat-hole it is. It's not worth ANYBODY'S time.

Sure, I can live with -fno-strict-aliasing. But I'm also really saddened
by all the lawyers like you who think that standards are somehow more
important than programmers. 

I can see technical arguments. An argument of "it's really too painful to
do" I can understand (preferably with an explanation, but hey, I don't
mind getting told that it's too hard to explain). I use that argument
every day myself. 

I think it's a damn shame that instead of technical arguments _everything_
revolves around people reading the standard as if it was the bible, and
trying to make people feel guilty for not really caring. It's not a sin to
just want to get good code without having to do magic contortions, guys.

			Linus

^ permalink raw reply	[flat|nested] 218+ messages in thread

* Re: Linux and aliasing?
  1999-06-04  8:39             ` Tim Hollebeek
  1999-06-04  8:55               ` Linus Torvalds
@ 1999-06-30 15:43               ` Tim Hollebeek
  1 sibling, 0 replies; 218+ messages in thread
From: Tim Hollebeek @ 1999-06-30 15:43 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: craig, davem, mark, chip, egcs

Linus Torvalds writes ...
> 
> But it should be reasonably easy to implement very straightforward rules,
> and have the rules themselves make common sense ;)
> 
> The extremely straightforward rule that at least I would advocate is _so_
> straightforward as to be almost scary:
>  - if there is a pointer cast, that pointer cast invalidates all
>    type-based alias information.

I think it's pretty obvious this is the wrong thing to do.

Sure, it does the right thing (for a narrow definition of "right
thing") if your code always uses hairy expressions where all the
nastiness is jumbled in one expression.

If you do the same type tricks but use intermediate variables to
improve readability, you lose.  In fact, simply taking an expression
and decomposing it into constituent parts can change the behavior of
code under this rule.  Absolutely horrible.

Unless you're suggesting data flow analysis to figure out which
pointer values could have been derived from a casted pointer??? ick,
ick, ick.

-Tim

^ permalink raw reply	[flat|nested] 218+ messages in thread

* Re: Linux and aliasing?
  1999-06-04 16:50               ` Bernd Schmidt
@ 1999-06-30 15:43                 ` Bernd Schmidt
  0 siblings, 0 replies; 218+ messages in thread
From: Bernd Schmidt @ 1999-06-30 15:43 UTC (permalink / raw)
  To: Richard Henderson; +Cc: Linus Torvalds, craig, davem, mark, chip, egcs

On Fri, 4 Jun 1999, Richard Henderson wrote:
> On Thu, Jun 03, 1999 at 11:02:35PM -0700, Linus Torvalds wrote:
> > The extremely straightforward rule that at least I would advocate is _so_
> > straightforward as to be almost scary:
> >  - if there is a pointer cast, that pointer cast invalidates all
> >    type-based alias information.
> 
> Doing what you want is actually very hard for GCC right now.  Consider
> 
> 	int i;
> 	short s, *ps = (short *)&i;
> 	i = 0;
> 	s = *ps;
> 
> Due to a long-ago quirk of history, GCC processes the abstract syntax
> tree one statement at a time, so the fact of the cast is long gone by
> the time we do the dereference.  Mark got around this problem by
> annotating the memories as we create them, which is good enough to pass
> legal muster, but not good enough for what you want.
> 
> To do what you want, we'd have to annotate pointers instead of memories
> and then do global data flow analysis to find out what addresses have
> been "infected" by the cast.  Doing anything on a local scale wouldn't
> be good enough, I don't think, to handle code coming in from inlines.

Instead of trying to re-create this information with flow analysis,
couldn't we solve this by adding syntax to create a special "aliased" pointer
type, rather than just using a cast to a regular pointer type?

The cast would then read e.g. "(short * __attribute__ ((aliased)))&i"; and we
could declare the variable "ps" to be of the same type.  Each time a pointer
is dereferenced, we know its type, so we can tell whether it has the "aliased"
attribute.  If it does, we need to avoid setting alias information for the
memory reference we create.
For assignments (or casts) between pointers there could be a warning when the
user tries to convert an aliased pointer back to a normal one.
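
For illustration only: later GCC releases document a type attribute named
may_alias that behaves much like the "aliased" pointer type sketched
above.  The sketch below assumes a GCC-compatible compiler; the names
short_a and store_then_load are made up for this example and are not
part of the proposal.

```c
#include <assert.h>

/* GCC extension: accesses through short_a are not subject to type-based
   alias analysis, much like accesses through a character type. */
typedef short __attribute__((__may_alias__)) short_a;

/* Mirrors the quoted example: store through an int lvalue, then load the
   leading short-sized piece through the attributed pointer type. */
short store_then_load(int value)
{
    int i;
    short_a *ps = (short_a *)&i;

    assert(sizeof(short) <= sizeof(int));
    i = value;
    return *ps;   /* must observe the store to i above */
}
```

Because the attribute lives in the pointed-to type, it survives being
copied into intermediate variables, which is the property the plain
cast lacks.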

Bernd

^ permalink raw reply	[flat|nested] 218+ messages in thread

* Re: Linux and aliasing?
  1999-06-04  8:55               ` Linus Torvalds
  1999-06-04 15:20                 ` Richard Henderson
@ 1999-06-30 15:43                 ` Linus Torvalds
  1 sibling, 0 replies; 218+ messages in thread
From: Linus Torvalds @ 1999-06-30 15:43 UTC (permalink / raw)
  To: Tim Hollebeek; +Cc: craig, davem, mark, chip, egcs

On Fri, 4 Jun 1999, Tim Hollebeek wrote:
> 
> If you do the same type tricks but use intermediate variables to
> improve readability, you lose.  In fact, simply taking an expression
> and decomposing it into constituent parts can change the behavior of
> code under this rule.  Absolutely horrible.

Uhh. You're right. I considered it, but I didn't find it "absolutely
horrible", I thought it could be considered a feature in that it was only
ever entirely local to =one= memory operation. 

That's not something new per se: the gcc __extension__ thing is kind of
similarly meant to silence things up locally to that expression.

But I can see why you wouldn't like it, and I understand your argument. I
don't know how else you would limit the scope of anything like this, though
(scoping it to something larger than a single dereference sounds like a
horrible rat's nest to me, but opinions can certainly differ).

> Unless you're suggesting data flow analysis to figure out which
> pointer values could have been derived from a casted pointer??? ick,
> ick, ick.

Oh, no, no, no. Shudder. I hope nobody took it that way. Barf.

I meant the features as something to expressly allow a local override.
Think of the rule more as an issue of "poisoning" the dereference operator
rather than poisoning the _pointer_. In a kind of silly "precedence rule"
notation, it would be

	*(char *)y

becomes (*(char *)) y where it is the "*(char *)" thing that makes the
alias go away. (Now somebody is going to flame my ass off for mixing C and
a non-C precedence rule). 

And maybe the above is hard to do because by the time you actually would
want to do the above logic the information isn't really there any more. 
That's entirely possible, and if people tell me it's a bad idea for reason
X I'll shut up about it, but I'd try to come up with another one. Deal? 

		Linus

^ permalink raw reply	[flat|nested] 218+ messages in thread

* Re: Linux and aliasing?
  1999-06-07  6:01                                         ` Joern Rennecke
@ 1999-06-30 15:43                                           ` Joern Rennecke
  0 siblings, 0 replies; 218+ messages in thread
From: Joern Rennecke @ 1999-06-30 15:43 UTC (permalink / raw)
  To: David S. Miller; +Cc: mark, ak, toon, law, jbuck, torvalds, craig, chip, egcs

> Also some of the datastructures one would need to change are included
> by userspace applications, especially for some of the networking
> instances, and thus one would have ABI issues to concern themselves
> about if they were to go and perform these transformations.  Much more
> is it than a tedious chore.  One could certainly create another header
> file, leave the old one alone with the same name, and use only the new
> one inside the kernel, but does it make sense to have two copies and
> maintain them?

No.  You could still have a single header file, and control the variant
portions with #ifdef __KERNEL__ / #else / #endif .
Or if you have some recurring common type mix, you could use some macros
in the declarations that are defined differently for kernel and user space.
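
A rough sketch of what such a single header might look like.  All names
here (pkt_hdr, pkt_hdr_words, check_pkt_hdr_layout) are hypothetical and
not taken from any real kernel header; the point is only that both
variants keep the same size, so user space sees an unchanged layout.

```c
#include <assert.h>

/* Hypothetical payload shared by both views. */
struct pkt_hdr_words { unsigned int w0; unsigned int w1; };

#ifdef __KERNEL__
/* Kernel view: a union adds a second, alias-safe way to reach the bytes. */
struct pkt_hdr {
    union {
        struct pkt_hdr_words words;
        unsigned char bytes[sizeof(struct pkt_hdr_words)];
    } u;
};
#else
/* User-space view: the declaration applications have always seen. */
struct pkt_hdr {
    struct pkt_hdr_words words;
};
#endif

/* Sanity check that the variant in effect did not change the size. */
void check_pkt_hdr_layout(void)
{
    assert(sizeof(struct pkt_hdr) == sizeof(struct pkt_hdr_words));
}
```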

^ permalink raw reply	[flat|nested] 218+ messages in thread

* Re: Linux and aliasing?
  1999-06-04 13:13                               ` Joe Buck
@ 1999-06-30 15:43                                 ` Joe Buck
  0 siblings, 0 replies; 218+ messages in thread
From: Joe Buck @ 1999-06-30 15:43 UTC (permalink / raw)
  To: Gabriel Dos_Reis; +Cc: egcs

> Linus Torvalds <torvalds@transmeta.com> writes:
> | So on a technical level, let me explain it the way I thought gcc might
> | implement this rather than explaining the end result as I initially went
> | about.

Gaby writes:
> Do you have a complete patch?

For something like this, it's not best to take a rigid "complete patch or
go away" stance (tempting as it is).  Linus has provided enough of a
skeleton at this point where it's possible to discuss whether the approach
is feasible (though the folks expert on that part of the compiler may not
have the cycles to discuss it in detail just now).

By the way, I said something stupid earlier in this discussion: while
Linux was once an order of magnitude smaller than gcc back in the 1.2
days, now, thanks to tons of device drivers it's about the same size.
Since I never download the whole thing (just patches, and I'm still
running 2.0.3x for some x), I hadn't really noticed this.

So clearly I was wrong to say that gcc is much bigger than Linux.
Sorry, Linus.




^ permalink raw reply	[flat|nested] 218+ messages in thread

* Re: Linux and aliasing?
  1999-06-07  9:29                                                         ` David S. Miller
@ 1999-06-30 15:43                                                           ` David S. Miller
  0 siblings, 0 replies; 218+ messages in thread
From: David S. Miller @ 1999-06-30 15:43 UTC (permalink / raw)
  To: law; +Cc: mark, jason, martin, egcs

   Date: Mon, 07 Jun 1999 10:14:11 -0600
   From: Jeffrey A Law <law@cygnus.com>

     > 2) It's on by default in this new subsequent release, we warned you.

   With the full intention of doing this for egcs-1.2 (now gcc-2.95).  It
   probably was even mentioned in the release notes for egcs-1.1.

It was?  If so, I stand totally corrected, thanks.

Later,
David S. Miller
davem@redhat.com

^ permalink raw reply	[flat|nested] 218+ messages in thread

* Re: Linux and aliasing?
  1999-06-08  1:48                                             ` Jeffrey A Law
@ 1999-06-30 15:43                                               ` Jeffrey A Law
  0 siblings, 0 replies; 218+ messages in thread
From: Jeffrey A Law @ 1999-06-30 15:43 UTC (permalink / raw)
  To: Nick Ing-Simmons
  Cc: chip, davem, rth, craig, egcs, Linus Torvalds, Tim Hollebeek, mark

  In message < 199906080832.JAA09214@tiuk.ti.com >you write:
  > It is not just the Linux kernel, it is _any_ kernel and 
  > aliasing tricks are _everywhere_ - most embedded C has them too, 
  > I would be surprised if X11 did not have them, ...
And such code is perfectly welcome to use -fno-strict-aliasing to avoid
problems with their non-portable code.  Non-conforming code of this nature
is in the vast minority relative to conforming code.

I would be amazed if X11 did not already fix this -- X11 has been building
with vendor compilers that have been doing type based alias analysis for years.

jeff


^ permalink raw reply	[flat|nested] 218+ messages in thread

* Re: Linux and aliasing?
  1999-06-04  7:01             ` craig
@ 1999-06-30 15:43               ` craig
  0 siblings, 0 replies; 218+ messages in thread
From: craig @ 1999-06-30 15:43 UTC (permalink / raw)
  To: lehotsky; +Cc: craig

>	The complaints probably won't go away until we implement
>
>		#pragma dwim;

Indeed.  My own reasons for getting into the compiler "arena" included
many of these same issues -- I knew I wanted better code generated
for my OS/kernel work, but also knew that the more such work could be
leveraged off of what applications people wanted, the better for all.

        tq vm, (burley)

^ permalink raw reply	[flat|nested] 218+ messages in thread

* Re: Linux and aliasing?
  1999-06-06 10:44                         ` mark
  1999-06-06 14:17                           ` Linus Torvalds
@ 1999-06-30 15:43                           ` mark
  1 sibling, 0 replies; 218+ messages in thread
From: mark @ 1999-06-30 15:43 UTC (permalink / raw)
  To: torvalds; +Cc: rth, tim, craig, davem, chip, egcs

>>>>> "Linus" == Linus Torvalds <torvalds@transmeta.com> writes:

    Linus> On Sat, 5 Jun 1999 mark@codesourcery.com wrote:
    >>  Not really always true.  You can use `memcpy (target, src,
    >> sizeof (x))' and if the alignments of the src and target are
    >> known to the compiler you *should* get optimal code.  (I don't
    >> know if GCC does this at present, but it could, and that would
    >> clearly be a good improvement.)

    Linus> Only if that's assuming that it _is_ a memcpy.

    Linus> Think of things like

    Linus> 	a = ntohl(*(u32 *)p);

    Linus> etc - which is _not_ just a copy.

Right.  But the part that's causing aliasing issues is just a memcpy;
that's the `*(u32 *) p' bit.   You could write:

  memcpy (&a, p, sizeof (a));
  a = ntohl (a);

I would argue that GCC *should* generate the code you want for this.
GCC may not.  Fixing it might be difficult.  But, if GCC does not
generate good code for this case, improving it would be an
optimization of general utility.

Yes, I recognize that this is not as a compact a coding style as you
are used to.  All the previous discussion about difficulty of
conversion stands; both your opinions and mine.
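
Spelled out as a self-contained function, the memcpy form might look
like the sketch below.  It assumes a POSIX system for ntohl; the name
load_be32 is invented here, and whether a given compiler turns the
memcpy into a single load is not something this example demonstrates.

```c
#include <arpa/inet.h>  /* ntohl (POSIX) */
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Reads a 32-bit big-endian value from an arbitrary buffer without
   dereferencing a cast pointer, so no aliasing rule is violated. */
uint32_t load_be32(const void *p)
{
    uint32_t a;

    assert(p != NULL);
    memcpy(&a, p, sizeof a);   /* the "just a memcpy" part */
    return ntohl(a);           /* the part that is more than a copy */
}
```

This also sidesteps alignment concerns, since p need not be suitably
aligned for a 32-bit access.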

--
Mark Mitchell                   mark@codesourcery.com
CodeSourcery, LLC               http://www.codesourcery.com

^ permalink raw reply	[flat|nested] 218+ messages in thread

* Re: Linux and aliasing?
  1999-06-07  9:38                                     ` Tim Hollebeek
  1999-06-07 10:05                                       ` Jamie Lokier
  1999-06-07 10:44                                       ` Linus Torvalds
@ 1999-06-30 15:43                                       ` Tim Hollebeek
  2 siblings, 0 replies; 218+ messages in thread
From: Tim Hollebeek @ 1999-06-30 15:43 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: mark, rth, craig, davem, chip, egcs

Linus Torvalds writes ...
> 
> People write so-so code, and then they hope that by using -O2 the compiler
> will make it good. When it doesn't, and when they find out it was due to
> strict aliasing rules (if they ever do - the more likely scenario is
> that they'll ship the binary compiled with just -O), they'll turn it off
> rather than fight it.

This is going to happen even with the Torvalds hack.  If they are
writing code that ignores the aliasing rules, not every single
instance will conform to the Torvalds "all pointer trickery happens in
a single expression" coding style.  Hence their binary will still fail.

Then we'll have to explain two things to them instead of just one: the
ANSI rules, and the extra Torvalds non-ANSI rules.

-Tim

^ permalink raw reply	[flat|nested] 218+ messages in thread

* Re: Linux and aliasing?
  1999-06-05  4:05                               ` Andi Kleen
@ 1999-06-30 15:43                                 ` Andi Kleen
  0 siblings, 0 replies; 218+ messages in thread
From: Andi Kleen @ 1999-06-30 15:43 UTC (permalink / raw)
  To: Toon Moene; +Cc: egcs, torvalds, law

toon@moene.indiv.nluug.nl (Toon Moene) writes:

> Jeffrey A Law wrote:
> 
> > Mark Mitchell wrote:
> 
> >   > Either
> >   > 1. Leave it as it is (the Linux kernel will need -fno-strict-aliasing).
> 
> > This is my strong preference.
> > 
> > I see no need to make conforming, portable code run slower.
> 
> Exactly.  Remember that a standard is a contract between producer and
> (end-)user, in our case:  between compiler writer and C programmer.
> 
> 	"We won't optimize your constructs away as long as you program 
> 	 according to said standard"


Erm, there seem to be some misunderstandings about the C standard in
this discussion.

My C9x draft says: 

6.2.6.1
 [#5] Certain object representations  need  not  represent  a
       value  of the object type.  If the stored value of an object
       has such a representation  and  is  accessed  by  an  lvalue
       expression  that  does not have character type, the behavior
       is undefined.  If such a representation  is  produced  by  a
       side  effect  that modifies all or any part of the object by
       an lvalue expression that does not have character type,  the
       behavior is undefined.37)  Such a representation is called a
       trap representation.


Now it says undefined behaviour is:

       3.18
       [#1] undefined behavior
       behavior, upon use of a  nonportable  or  erroneous  program
       construct,  of  erroneous data, or of indeterminately valued
       objects, for which this International  Standard  imposes  no
       requirements

So it imposes no requirements on what to do when it happens. This means gcc
is free to do what it wants. This includes unreasonable things, or reasonable
things. I think turning alias analysis off in this case is reasonable, and
of course fully standards compliant. Also the argument "that will slow
down legal programs" is nonsense, because there are no strictly conforming
programs which can do this.


-Andi

P.S.: Toon, this is not Fortran ;)

-- 
This is like TV. I don't like TV.

^ permalink raw reply	[flat|nested] 218+ messages in thread

* Re: Linux and aliasing?
  1999-06-05 11:09                                       ` David S. Miller
  1999-06-05 12:11                                         ` Toon Moene
  1999-06-07  6:01                                         ` Joern Rennecke
@ 1999-06-30 15:43                                         ` David S. Miller
  2 siblings, 0 replies; 218+ messages in thread
From: David S. Miller @ 1999-06-30 15:43 UTC (permalink / raw)
  To: mark; +Cc: ak, toon, law, jbuck, torvalds, craig, chip, egcs

   From: mark@codesourcery.com
   Date: Sat, 05 Jun 1999 10:41:07 -0700

   Furthermore, I bet that by now, if all this energy had been spent
   fixing the code in the kernel, you'd have made good headway on some
   of the most prominent data structures.  Yes, this will be a tedious
   chore, but it's an easy one: you enclose things in a union,
   compile, see what doesn't, fix it, and go on.

What seems to be ignored are the future maintenance costs incurred by
this set of changes to the kernel, as if "do it and get it over right
now" is some triviality.  Effort has been expended already to make
attempts to do this (mentioned here by Andi Kleen who did a run at it
for the networking), and the findings made there support the
non-triviality claim, in Andi's case he tossed the work midstream due
to the non-stop overwhelming accumulation of issues.

Also some of the datastructures one would need to change are included
by userspace applications, especially for some of the networking
instances, and thus one would have ABI issues to concern themselves
about if they were to go and perform these transformations.  Much more
is it than a tedious chore.  One could certainly create another header
file, leave the old one alone with the same name, and use only the new
one inside the kernel, but does it make sense to have two copies and
maintain them?

However the headerfile interface issue is cleanly handled if only the
offending code in the kernel is changed (changes thus which are
invisible to the user headerfile ABI) to adhere to the proposed gcc
cast aliasing behavior.

This argument is orthogonal to your proposed possible future
maintenance costs gcc might incur due to the implementation of cast
aliasing behavior.

Later,
David S. Miller
davem@redhat.com

^ permalink raw reply	[flat|nested] 218+ messages in thread

* Re: Linux and aliasing?
  1999-06-04  9:02                     ` Jean-Pierre Radley
@ 1999-06-30 15:43                       ` Jean-Pierre Radley
  0 siblings, 0 replies; 218+ messages in thread
From: Jean-Pierre Radley @ 1999-06-30 15:43 UTC (permalink / raw)
  To: EGCS Developers

Linus Torvalds averred (on Fri, Jun 04, 1999 at 08:56:31AM -0700):
| 
| On 4 Jun 1999 craig@jcb-sc.com wrote:
| > 
| > C is simply a poor language for the task at hand.
| 
| Well, I do agree. But at the same time I also disagree, for the obvious
| reason that it's still the =best= language for the task at hand. So I can
| only hope to make it better for it rather than make it worse.

Which brings to mind Winston Churchill's remark to the effect that
democracy is the worst form of government, except for all the rest.

-- 
Jean-Pierre Radley <jpr@jpr.com>  XC/XT Custodian   Sysop, CompuServe SCOForum

^ permalink raw reply	[flat|nested] 218+ messages in thread

* Re: Linux and aliasing?
  1999-06-05 16:51                                             ` mark
@ 1999-06-30 15:43                                               ` mark
  0 siblings, 0 replies; 218+ messages in thread
From: mark @ 1999-06-30 15:43 UTC (permalink / raw)
  To: davem; +Cc: toon, ak, law, jbuck, torvalds, craig, chip, egcs

>>>>> "David" == David S Miller <davem@redhat.com> writes:

    David> So the situation here is quite the contrary to the
    David> assertion, one of the most worrisome areas of the kernel,
    David> with respect to the union'ization of data structures to
    David> remove non-alias-friendly casts, is also the place where
    David> alias analysis would be highly beneficial.

Note, however, that placing *anything* in alias set zero tends to
largely botch type-based alias analysis.  Since alias set zero means
that anything can be changed, no code on a path after the alias set
zero access can use type-based alias analysis from before that point.
So, even with Linus' proposal, or my less intrusive proposal, this
code may not benefit that greatly.

As I pointed out before, you can introduce unions.  You will then get
compile-time errors.  You can then run through and fix them all.  This
will be tedious.  Yes, it will be a large patch, but not a complex
one.  Do one data structure at a time to make it simpler.  Such a
patch may not be appropriate for the stable kernels, but it should be
tolerable on the unstable kernels.  I fail to see what is so *hard*
about this; I do see that it will take effort.

Your argument about user-space headers is valid.  However, you can
always do:

  /* user-space header */
  struct s { int i; int j; }; /* But sometimes really the whole 
	                         thing is a double.  */

  /* kernel header */
  union ks {
    struct s s;
    double d;
  };

That doesn't require changing the user-space headers.  Yes, you
introduce some duplication.  That's the price of not writing ANSI/ISO
C from the standard *and* wanting type-based alias analysis to work
correctly *and* wanting to keep user-space headers intact.
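
A small sketch of how kernel code might then use the union.  The helper
names are invented for illustration.  GCC documents reading a union
member other than the one last written as supported when the access
goes through the union type; the memcpy variant is included only as a
reference to compare against.

```c
#include <assert.h>
#include <string.h>

/* user-space header */
struct s { int i; int j; };

/* kernel header */
union ks {
    struct s s;
    double d;
};

/* Store the whole thing as a double, read the first int back
   through the union rather than through a cast pointer. */
int ks_first_int(double d)
{
    union ks k;

    assert(sizeof(int) <= sizeof(double));
    k.d = d;
    return k.s.i;
}

/* Reference implementation using memcpy on the same bytes. */
int ks_first_int_memcpy(double d)
{
    int i;

    memcpy(&i, &d, sizeof i);
    return i;
}
```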

Quite frankly, I still don't fully buy the "it's too hard to fix, but
where it's too hard is exactly where we need it" conundrum.  There are
a variety of things you can do: unions, macros to access the fields by
a type-safe means, memcpy and friends (see my earlier post for why
this should be fast), etc.  I believe you that it's a pain in the neck
to fix this stuff; just not that it's simultaneously as hard *and* as
important as you claim.

If there are really hot spots in the kernel that need to go fast, you
can hand-code them in assembly, or rewrite them in portable C.  Yes,
that will be a pain.  The benefits may or may not outweigh the cost.
That's a decision you have to make.  But, it's not clear that
introducing some extension, perhaps with gotchas we cannot foresee, to
GCC will have more benefits than costs, either.

In some sense, Linux is coded in a dialect of C.  In this dialect, you
can do funny casts and things still work.  That's not ANSI/ISO C, and
it's not something that GNU CC ever promised to support (unlike
extended asms, for example.)  But, the "Linux dialect" *is* still
supported with -fno-strict-aliasing.

You want the new optimization, but to retain the old dialect.  In
short, you want to have your cake and eat it too.  That's natural, but
not necessarily reasonable or realistic.

If someone contributes a patch that:

  o Provides some "anti-aliasing" :-) behavior for non-conforming
    programs in a well thought-out way.
  o Does not affect conforming programs.
  o Is easy to maintain.  See my earlier post for some of the
    problems that must be solved.
  o Looks to be useful to projects outside of the Linux kernel.

then, I would expect that the GCC maintainers would react favorably.

But, frankly, I bet there are a lot of projects on which we could
spend our time that would have more widespread benefits, for GCC, the
kernel, and the community.  For example, better x86 scheduling would
allow *all* applications, probably including Linux, to run
significantly faster.

I am just not persuaded that this is a good place to spend my time,
especially what little I can afford to volunteer for free.  I am not
persuaded that it is a good use of anyone else's time, either,
including yours, David, or yours, Andi.  You are excellent programmers,
and your contributions to many projects are valuable; I bet this isn't
the most useful thing you could do.

Unfortunately, (or perhaps fortunately for the readers of this list!),
I can't afford to spend any more time arguing the point.  I hope that
doesn't offend anyone.  I do understand the Linux issues, and why
you're arguing for what you're arguing for.  Let's agree to disagree,
at least until someone produces a patch for GCC that we can start a
fresh argument about. :-)

--
Mark Mitchell                   mark@codesourcery.com
CodeSourcery, LLC               http://www.codesourcery.com

^ permalink raw reply	[flat|nested] 218+ messages in thread

* Re: f77 vs type based alias analysis
  1999-06-06 23:12                                   ` f77 vs type based alias analysis Jeffrey A Law
@ 1999-06-30 15:43                                     ` Jeffrey A Law
  0 siblings, 0 replies; 218+ messages in thread
From: Jeffrey A Law @ 1999-06-30 15:43 UTC (permalink / raw)
  To: Toon Moene; +Cc: craig, egcs

Note, we've got two threads in this message (my fault actually)...  I'll
split the Fortran thread off from the main aliasing thread.

  In message < 37591A00.ACE54339@moene.indiv.nluug.nl >you write:
  > This was suggested, but I replied that I didn't believe that to be the
  > reason.  Note that Fortran basically only has one "scope" for automatic
  > variables (whether arrays or scalars):  The complete subprogram (i.e.
  > subroutine or function).
  > 
  > That means that in the scope Mark's alias analysis works in, automatic
  > arrays are created precisely once (at the beginning of that scope) and
  > destroyed exactly once (at the end of said scope); hence, there is no
  > opportunity to re-use stack slots.
Hmmm.  Well, there's a reasonably easy way to verify if it is the stack
slot issue.

Replace "flag_strict_aliasing" with "0" in function.c and benchmark against
the unmodified version of the compiler.

That allows us to isolate the slot combination/reuse issues from the rest of
the strict aliasing changes.  While there is the slight chance you'll get wrong
code, I doubt it'll happen in practice with Fortran.  And at this point we're
just trying to find out why the code is slower when strict aliasing is enabled,
so 100% correctness isn't needed.


jeff

 

^ permalink raw reply	[flat|nested] 218+ messages in thread

* Re: Linux and aliasing?
  1999-06-04 10:53                               ` Jeffrey A Law
@ 1999-06-30 15:43                                 ` Jeffrey A Law
  0 siblings, 0 replies; 218+ messages in thread
From: Jeffrey A Law @ 1999-06-30 15:43 UTC (permalink / raw)
  To: Joe Buck; +Cc: torvalds, craig, mark, davem, chip, egcs

  In message < 199906041728.KAA19685@atrus.synopsys.com >you write:
  > >   > Either
  > >   > 1. Leave it as it is (the Linux kernel will need -fno-strict-aliasing).
  > > This is my strong preference.
  > 
  > In that case, then all release announcements and NEWS should prominently
  > mention the effect of this new optimization and the -fno-strict-aliasing
  > flag, so that everyone has fair warning.
Agreed.  

jeff

^ permalink raw reply	[flat|nested] 218+ messages in thread

* Re: Linux and aliasing?
  1999-06-04  8:17               ` Linus Torvalds
  1999-06-04  8:49                 ` craig
@ 1999-06-30 15:43                 ` Linus Torvalds
  1 sibling, 0 replies; 218+ messages in thread
From: Linus Torvalds @ 1999-06-30 15:43 UTC (permalink / raw)
  To: craig; +Cc: davem, mark, chip, egcs

On 4 Jun 1999 craig@jcb-sc.com wrote:
> 
> Any programmer worth his salt, wanting "a laser-guided nightsight",
> and not wanting to tweak (or even rewrite) his code for every new
> compiler release, will *not* use a C compiler.  Period.

Oh?

That's a new argument. Instead of "don't use that feature", it's now
"don't do anything clever at all".

		Linus

^ permalink raw reply	[flat|nested] 218+ messages in thread

* Re: Linux and aliasing?
  1999-06-05 19:35                                         ` Linus Torvalds
  1999-06-06  1:18                                           ` Martin v. Loewis
@ 1999-06-30 15:43                                           ` Linus Torvalds
  1 sibling, 0 replies; 218+ messages in thread
From: Linus Torvalds @ 1999-06-30 15:43 UTC (permalink / raw)
  To: Jamie Lokier
  Cc: Toon Moene, Andi Kleen, law, Joe Buck, craig, mark, davem, chip, egcs

On Sat, 5 Jun 1999, Jamie Lokier wrote:
> Jamie's suggestion du jour.

I like your suggestion - it does sound like a lot more work especially for
the compiler than my simplistic one, but the fact that we would at least
get warnings from the compiler about them means that we wouldn't have to
rely on somebody going through 30MB worth of sources by hand..

>   Linus wants non-union (ie. non-ugly) casts to do the sensible thing.
>   What that is isn't quite clear -- after thinking it through.
>   If it's just aesthetics, I don't see why a macro wouldn't do ;-)

A macro can do the same thing (the same way I think the current gcc lvalue
cast could be done with a macro), but my approach has the in my opinion
very useful behaviour that it makes most "normal" type cast problems just
automatically do the right thing. So in many cases it would work as-is
(not just for kernel code), and in cases where it does not (ie the cast is
non-local) my proposal has a way out (add another cast that _is_ local
to the actual de-reference).

I don't really see why people hate the proposal so much, but maybe that's
just my personal coding style. I do not consider pointer casts (or any
other kinds of casts) acceptable programming practice for any normal
cases, so just about =all= the casts I ever see are of the type where
alias information should obviously be disabled. So to me it sounds like a
"natural" way of doing things.

(Just to clarify - it's not as if the linux kernel does a _lot_ of ugly
pointer stuff. It's just that it does happen, and it isn't done in one
well-defined area or similar.)

It seems that other people use more casts for "normal" things, and are
actually afraid of my proposal for performance reasons. I'm surprised:
people that do things like that are usually not the people who complain
about others coding standards ;)

Anyway, I grepped the kernel for "likely" places where my change would
make a difference by using the following heuristic grep:

	grep '\*(.*\* *)' */*.c

and in basically all cases the compiler would have done the right thing if
it had followed my proposal.

THAT is why I like it. It does the RightThing(tm), with basically zero
complexity for either the user or the compiler. It is a "do what I mean" 
kind of patch. 

"Do what I mean" is a quality of implementation thing. Yes, all of this is
obviously not defined by the standard. But exactly because it is NOT
defined by the standard, it's very good if the behaviour is what you'd
expect.

The people who worry about the thing being a performance problem for them:
try the above grep and see what it shows you. No, the grep doesn't really
catch all the cases that the compiler change would impact, but it should
give you a rough idea.

In particular, if the grep comes up empty (ie "Well written code without
any strange casts"), you probably wouldn't actually be impacted by the
"Linus proposal" at all. 

>   In particular, a cast may be conforming, in which case the compiler
>   should strive to generate the best allowed code (unless it's
>   pathological).

"may be conforming", yes. Are there any real life cases where it really
matters? The case where my rule kicks in is definitely "suspicious" - I
agree that it _may_ conform, but do people actually ever write code like
that in strictly conforming programs? That's why I'd like to see what the
grep above shows people..

>   Issues raised: 
> 
>   - Lots of legacy code uses casts, assuming nothing weird will happen.
>   - Weird things now happen.

Right. The "Linus proposal" would not make that go away completely, but it
would make a large percentage of the weird cases do what the old code
expected.

>   - "non-conforming cast implies pointer may alias all" + "full flow
>     analysis" got proposed.  No one likes it.  Bin.

Yes.

>   - "non-... all" + "*no* flow analysis" got proposed by Linus.  It's a
>     simple special case, arguably syntactic.  But it has semantic warts:
>        *(foo_t*) &bar = foo; 
>     now means something different than:
>        { foo_t* p = (foo_t*) &bar; *p = foo; }

I understand that people can see this as a wart, but if you consider it
syntactic then you shouldn't even _expect_ the above to be the same thing.

In fact, I'd like to consider it a bonus that you will =not= get the
looser alias semantics for the case where you actually assign the pointer.
So you can use the second version as a way to _avoid_ the "Linus rule" if
you like it in general but in a specific case want to disable it.

I guess I'm not convincing you.

>   - "type attribute" got proposed by Linus too.  This I like.

Well, that one is just the "implementation part" of my basic proposal.

It can be done on its own without the "implied no-alias" of course. But I
_meant_ it to be done in conjunction with my other proposal, just
explaining how it would be implemented.

> Jamie's thought of the day
> --------------------------

[ deleted ]

Hey, works for me. It seems to do the Linus proposal in a "warning sense",
if I understood you correctly, with a way to just force whichever actual
semantics you want. Right?

		Linus

^ permalink raw reply	[flat|nested] 218+ messages in thread

* Re: Linux and aliasing?
  1999-06-07 10:44                                       ` Linus Torvalds
  1999-06-07 11:22                                         ` Jeffrey A Law
@ 1999-06-30 15:43                                         ` Linus Torvalds
  1 sibling, 0 replies; 218+ messages in thread
From: Linus Torvalds @ 1999-06-30 15:43 UTC (permalink / raw)
  To: Tim Hollebeek; +Cc: mark, rth, craig, davem, chip, egcs

On Mon, 7 Jun 1999, Tim Hollebeek wrote:
> 
> This is going to happen even with the Torvalds hack.  If they are
> writing code that ignores the aliasing rules, not every single
> instance will conform to the Torvalds "all pointer trickery happens in
> a single expression" coding style.  Hence their binary will still fail.

Yes. But it's less likely to fail.

However, somebody else did suggest just a warning (for the "torvalds case"
and potentially for other cases that are deemed suspect), and I certainly
agree with that as a kind of "uhhuh, somebody is doing something
dangerous". 

However, even then I think we'd _also_ need to have a syntactically
cleaner way of fixing it - if a warning is generated it obviously would
need to have some way of disabling the warning on a case-by-case basis
(with either saying "it's ok to alias - don't warn me" or a "oh, you were
right, this could alias, please consider it to be in alias set zero").

I completely agree with that kind of extension - it's obviously better
than mine, but it's also much more ambitious than my quick and simple
hack.

> Then we'll have to explain two things to them instead of just one: the
> ANSI rules, and the extra Torvalds non-ANSI rules. 

Just explain it as "dangerous code", and give examples. There are
certainly bound to be other cases, although the "torvalds case" is the
obvious and most common one. 

		Linus

^ permalink raw reply	[flat|nested] 218+ messages in thread

* Re: Linux and aliasing?
  1999-06-07 11:22                                         ` Jeffrey A Law
  1999-06-08  1:34                                           ` Nick Ing-Simmons
@ 1999-06-30 15:43                                           ` Jeffrey A Law
  1 sibling, 0 replies; 218+ messages in thread
From: Jeffrey A Law @ 1999-06-30 15:43 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Tim Hollebeek, mark, rth, craig, davem, chip, egcs

  In message < Pine.LNX.3.95.990607103826.22680A-100000@penguin.transmeta.com >you write:
  > > Then we'll have to explain two things to them instead of just one: the
  > > ANSI rules, and the extra Torvalds non-ANSI rules. 
  > 
  > Just explain it as "dangerous code", and give examples. There are
  > certainly bound to be other cases, although the "torvalds case" is the
  > obvious and most common one. 
Building a new set of aliasing rules which are only going to be used by the
Linux kernel to avoid making their code standards compliant is simply dumb.
We should follow the ISO/ANSI rules and be done with it.

I'm not going to approve any changes of this nature.
jeff



^ permalink raw reply	[flat|nested] 218+ messages in thread

* Re: Linux and aliasing?
  1999-06-03 15:38     ` Martin v. Loewis
@ 1999-06-30 15:43       ` Martin v. Loewis
  0 siblings, 0 replies; 218+ messages in thread
From: Martin v. Loewis @ 1999-06-30 15:43 UTC (permalink / raw)
  To: ak; +Cc: mark, egcs

> Bad :/. -fno-strict-aliasing is the only alternative then. What a pity,
> Linus' proposal looked reasonable. 

It isn't really that bad. Of course, you lose some optimization
opportunities - but it isn't worse than earlier versions of gcc, which
never considered the type for aliasing, anyway.

Regards,
Martin

^ permalink raw reply	[flat|nested] 218+ messages in thread

* Re: Linux and aliasing?
  1999-06-07  9:24                                                       ` Jeffrey A Law
  1999-06-07  9:29                                                         ` David S. Miller
@ 1999-06-30 15:43                                                         ` Jeffrey A Law
  1 sibling, 0 replies; 218+ messages in thread
From: Jeffrey A Law @ 1999-06-30 15:43 UTC (permalink / raw)
  To: David S. Miller; +Cc: mark, jason, martin, egcs

  In message < 199906071545.IAA07099@pizda.davem.net >you write:
  > Other compiler vendors seem to have done it in two stages:
  > 
  > 1) Ok, the strict aliasing is there, but not a default optimization,
  >    you have to enable it explicitly.  But come next release it will be
  >    on by default and thus you have ample time to fixup your code.
Err, we had this in egcs-1.1.  One had to enable type based alias analysis
explicitly via -fstrict-aliasing.


  > 2) It's on by default in this new subsequent release, we warned you.
With the full intention of doing this for egcs-1.2 (now gcc-2.95).  It
probably was even mentioned in the release notes for egcs-1.1.

jeff


^ permalink raw reply	[flat|nested] 218+ messages in thread

* Re: Linux and aliasing?
  1999-06-04  5:46                     ` craig
  1999-06-04  7:22                       ` burley (was Re: Linux and aliasing?) Mark Hahn
  1999-06-04  8:35                       ` Linux and aliasing? Linus Torvalds
@ 1999-06-30 15:43                       ` craig
  2 siblings, 0 replies; 218+ messages in thread
From: craig @ 1999-06-30 15:43 UTC (permalink / raw)
  To: torvalds; +Cc: craig

>I want the _user_ to be able to give input. 
[...]
>I happen to think that the "explicit cast invalidates the alias
>information" rule is the simplest and best one, and gives the user the
>best control without adding things like new attributes or other ways to
>let the user be in control.

In other words, you believe you are a better language designer than
the ISO C people as well as the gcc maintainers, despite the fact
that you know, what, *nothing* about language design, and *nothing*
about compiler design and, especially, long-term maintenance of
compilers?

[...]
>And I would not say "POSIX does not allow you to do that, so why should
>you do it"? 

>I dislike fascist compilers who think they know better than I do.

Then stop using them.

>And I dislike people who think fascist compilers are a good idea.

We dislike people who think the only good compiler is one that compiles
Linux, on the theory that whatever features Linux needs, must be the
exact ones that should have been in the C language in the first place.

*They* appear to be the real fascists to me, showing up on gcc lists
every few months, telling us how stupid, obnoxious, or ignorant we
are to not take their advice by running around like rats implementing
every feature they ask for, but hardly ever listening to our advice.

gcc is a tool, but some people appear to want it to be a wife -- Betty
Crocker in the kitchen, a virgin with the folks, and a hooker in the
bedroom.

        tq vm, (burley)

^ permalink raw reply	[flat|nested] 218+ messages in thread

* Re: Linux and aliasing?
  1999-06-04  8:41                 ` Tim Hollebeek
  1999-06-04  8:53                   ` Jeffrey A Law
@ 1999-06-30 15:43                   ` Tim Hollebeek
  1 sibling, 0 replies; 218+ messages in thread
From: Tim Hollebeek @ 1999-06-30 15:43 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: mark, craig, davem, chip, egcs

Linus Torvalds writes ...
> 
> I think it's a damn shame that instead of technical arguments _everything_
> revolves around people reading the standard as if it was the bible, and
> trying to make people feel guilty for not really caring. It's not a sin to
> just want to get good code without having to do magic contortions, guys.

I think it's a damn shame certain people can't be disagreed with without
insulting their opponents.

-Tim

^ permalink raw reply	[flat|nested] 218+ messages in thread

* Re: Linux and aliasing?
  1999-06-04 12:20                               ` Jeffrey A Law
  1999-06-05  5:45                                 ` Toon Moene
@ 1999-06-30 15:43                                 ` Jeffrey A Law
  1 sibling, 0 replies; 218+ messages in thread
From: Jeffrey A Law @ 1999-06-30 15:43 UTC (permalink / raw)
  To: Toon Moene; +Cc: Joe Buck, Linus Torvalds, craig, mark, davem, chip, egcs

  In message < 375814C8.85CA17C9@moene.indiv.nluug.nl >you write:
  > Or those Fortran users (like me) who still do not understand how a
  > strictly C performance enhancement can worsen the code generated for
  > purely Fortran source, like it is the case for me (I use
  > -fno-strict-aliasing since the end of February - and no, we Fortran
  > users do not have a problem with aliasing; as I outlined on comp.arch,
  > we outlawed it).
I thought this was tracked down to the inability to re-share those auto
arrays on the stack.  I also thought we had turned off strict aliasing
for Fortran for precisely this reason.  Did I misunderstand the end result
of that discussion?

jeff

  > Cheers.
  > 
  > [Oh, BTW, it doesn't make sense to call me names - I'm a native of
  >  Amsterdam; if I cared about *that*, I would have been dead for decades,
  >  now]
  > 
  > -- 
  > Toon Moene (toon@moene.indiv.nluug.nl)
  > Saturnushof 14, 3738 XG  Maartensdijk, The Netherlands
  > Phone: +31 346 214290; Fax: +31 346 214286
  > GNU Fortran: http://world.std.com/~burley/g77.html
  > 


^ permalink raw reply	[flat|nested] 218+ messages in thread

* Re: Linux and aliasing?
  1999-06-04 11:11                             ` Toon Moene
  1999-06-04 12:20                               ` Jeffrey A Law
  1999-06-05  4:05                               ` Andi Kleen
@ 1999-06-30 15:43                               ` Toon Moene
  2 siblings, 0 replies; 218+ messages in thread
From: Toon Moene @ 1999-06-30 15:43 UTC (permalink / raw)
  To: law; +Cc: Joe Buck, Linus Torvalds, craig, mark, davem, chip, egcs

Jeffrey A Law wrote:

> Mark Mitchell wrote:

>   > Either
>   > 1. Leave it as it is (the Linux kernel will need -fno-strict-aliasing).

> This is my strong preference.
> 
> I see no need to make conforming, portable code run slower.

Exactly.  Remember that a standard is a contract between producer and
(end-)user, in our case:  between compiler writer and C programmer.

	"We won't optimize your constructs away as long as you program 
	 according to said standard"

Giving Linus more freedom in getting his C code to compile to the code
he thinks is right will take freedom away from us, the compiler writers.
Unfortunately, this is not according to contract and won't be upheld in
court.

> Folks working with non-portable code can use -fno-strict-aliasing and pay
> the resulting performance penalty.

Or those Fortran users (like me) who still do not understand how a
strictly C performance enhancement can worsen the code generated for
purely Fortran source, like it is the case for me (I use
-fno-strict-aliasing since the end of February - and no, we Fortran
users do not have a problem with aliasing; as I outlined on comp.arch,
we outlawed it).

Cheers.

[Oh, BTW, it doesn't make sense to call me names - I'm a native of
 Amsterdam; if I cared about *that*, I would have been dead for decades,
 now]

-- 
Toon Moene (toon@moene.indiv.nluug.nl)
Saturnushof 14, 3738 XG  Maartensdijk, The Netherlands
Phone: +31 346 214290; Fax: +31 346 214286
GNU Fortran: http://world.std.com/~burley/g77.html

^ permalink raw reply	[flat|nested] 218+ messages in thread

* Re: Linux and aliasing?
  1999-06-04 13:29                             ` Joe Buck
  1999-06-04 13:39                               ` Alexandre Oliva
@ 1999-06-30 15:43                               ` Joe Buck
  1 sibling, 0 replies; 218+ messages in thread
From: Joe Buck @ 1999-06-30 15:43 UTC (permalink / raw)
  To: Alexandre Oliva; +Cc: jbuck, torvalds, craig, mark, davem, chip, egcs

> AFAICT, in a cast to `(some_type_t *volatile)', the `volatile' doesn't
> have any actual effect on the generated code, because the pointer has
> already been evaluated.  Couldn't we implement an extension by which
> this `volatile' would kind of have the opposite meaning of `restrict'?
> It would mean that the resulting pointer may be aliased to anything
> else, so the compiler shouldn't move it around nor optimize it ``too
> much''.

It's not a matter of not moving it around or optimizing the pointer
itself.  Rather, after someone writes through one of these anti-restrict
pointers, the compiler has to assume the worst, and re-read everything
from memory that it might have cached away in registers, because we're
saying that the pointer could point to *anything*.  Even a read through
such a pointer can impede optimization, as we'd have to flush everything
out to memory before the read, since the pointer might be reading any
object (the old value of which might be in a register).  This kills
performance of loops over arrays.

For this reason, the effect of proposals like this might be similar in
performance to just saying, if a function contains a cast of a pointer
that is later dereferenced, apply -fno-strict-aliasing to the entire
function, or at least a significant chunk of it.
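
To make the cost concrete, here is a minimal sketch. The function name `zero_fill` is made up for illustration and does not come from any code discussed in this thread. Under the ISO rules a store through a `float *` cannot modify an `int` object, so the compiler may load `*n` once before the loop. If `a` were one of the proposed may-alias-anything pointers, `*n` would have to be reloaded after every store.

```c
/* Hypothetical example.  With type-based alias analysis the compiler may
   keep *n in a register for the whole loop, because a store to a float
   cannot legally change an int.  If `a` had to be treated as aliasing
   anything, *n would need to be reloaded on every iteration.  */
int zero_fill(float *a, const int *n)
{
    int i;
    for (i = 0; i < *n; i++)
        a[i] = 0.0f;
    return i;   /* number of elements written */
}
```

The source is the same either way; only the compiler's freedom to hoist the load of `*n` changes, which is why the effect shows up mostly in loops over arrays.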

^ permalink raw reply	[flat|nested] 218+ messages in thread

* Re: Linux and aliasing?
  1999-06-06 23:20                                   ` Linux and aliasing? Jeffrey A Law
@ 1999-06-30 15:43                                     ` Jeffrey A Law
  0 siblings, 0 replies; 218+ messages in thread
From: Jeffrey A Law @ 1999-06-30 15:43 UTC (permalink / raw)
  To: Toon Moene
  Cc: Joe Buck, Linus Torvalds, craig, mark, davem, chip, egcs, Andi Kleen

  In message < 37591A00.ACE54339@moene.indiv.nluug.nl >you write:
  > Ah, yes, but the discussion is whether we should have gcc generate
  > "reasonable" behaviour where "reasonable" is defined by a small group of
  > users.  Note that all "behaviours" not explicitly required by the
  > Standard are prone to:
  > 
  > 1. Erosion (within a decade, gcc maintainers forget why we did this in
  >    the first place: "Hey, look at this code - what hair - and it is
  >    undefined behaviour according to the Standard in the first place;
  >    rip it out")
  > 
  > 2. Contradiction (the C0X Standard defines the previously undefined
  >    behaviour, but in a way incompatible with the "reasonable" behaviour
  >    we thought up here).
I can't agree more.  I haven't caught up on the whole thread yet, but in
general it seems like a mistake from a design standpoint to extend GCC in
the manner that I've seen suggested here.

If the Linux kernel folks don't want to change their (non-conforming) code,
then they should use -fno-strict-aliasing.  Yes it will inhibit some opts,
but that's the price one pays for writing non-conforming code.

jeff

^ permalink raw reply	[flat|nested] 218+ messages in thread

* Re: Linux and aliasing?
  1999-06-07 13:34                                 ` Jamie Lokier
@ 1999-06-30 15:43                                   ` Jamie Lokier
  0 siblings, 0 replies; 218+ messages in thread
From: Jamie Lokier @ 1999-06-30 15:43 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: mark, rth, tim, craig, davem, chip, egcs

Linus Torvalds wrote:
> > I think by now you've been presented with a variety of strategies for
> > solving the problem in the kernel, including more than one idea for
> > macros that you could use like:
> > 
> >   ALIASING_CAST (type, x)
> 
> I've been told in private email, that the proposed macro wasn't even
> standards conforming in the sense that it doesn't guarantee that the
> compiler couldn't decide it aliases (because in order to guarantee that
> the union should contain all possible types). It happens to work for gcc.

Luckily, GCC has __typeof__.
So you can just put the target type and __typeof__(x) in the union --
should work shouldn't it?
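
A rough sketch of what such a macro could look like follows. This is an assumption about the idea, not the macro that was actually circulated: it relies on GNU C (`__typeof__`) plus C99 compound literals, it yields a value rather than an lvalue referring to `x`, and the helper `float_bits` is invented for illustration and assumes a 32-bit IEEE float.

```c
#include <stdint.h>

/* Hypothetical sketch of an ALIASING_CAST-style macro, GNU C only
   (__typeof__).  It copies x into a union holding exactly the source and
   target types and reads it back as `type`.  It yields a value, not an
   lvalue referring to x.  */
#define ALIASING_CAST(type, x) \
    (((union { __typeof__(x) src; type dst; }){ .src = (x) }).dst)

/* Illustrative use: the bit pattern of a float (assumes a 32-bit float). */
uint32_t float_bits(float f)
{
    return ALIASING_CAST(uint32_t, f);
}
```

Because the union is built from `__typeof__(x)` and the target type, it never needs to list every possible type up front.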

-- Jamie

^ permalink raw reply	[flat|nested] 218+ messages in thread

* re: burley (was Re: Linux and aliasing?)
  1999-06-04  7:22                       ` burley (was Re: Linux and aliasing?) Mark Hahn
  1999-06-04  8:16                         ` craig
@ 1999-06-30 15:43                         ` Mark Hahn
  1 sibling, 0 replies; 218+ messages in thread
From: Mark Hahn @ 1999-06-30 15:43 UTC (permalink / raw)
  To: egcs

could we please have a separate list for this kind of asinine namecalling?
perhaps call it "hoity-toity-armchair-architects-soapbox"?

> In other words, you believe you are a better language designer than
> the ISO C people as well as the gcc maintainers, despite the fact
> that you know, what, *nothing* about language design, and *nothing*
> about compiler design and, especially, long-term maintenance of
> compilers?

^ permalink raw reply	[flat|nested] 218+ messages in thread

* Re: Linux and aliasing?
  1999-06-03 10:37 ` mark
  1999-06-03 11:26   ` David S. Miller
  1999-06-03 12:02   ` Andi Kleen
@ 1999-06-30 15:43   ` mark
  2 siblings, 0 replies; 218+ messages in thread
From: mark @ 1999-06-30 15:43 UTC (permalink / raw)
  To: chip; +Cc: egcs

>>>>> "Chip" == Chip Salzenberg <chip@perlsupport.com> writes:

    Chip> Linus continues to complain on linux-kernel that egcs lacks
    Chip> a way to *selectively* turn off the new stronger alias
    Chip> analysis.  Is this not easy, or is it just not an important
    Chip> issue to the egcs team?  -- Chip Salzenberg - a.k.a. -
    Chip> <chip@perlsupport.com> "When do you work?"  "Whenever I'm
    Chip> not busy."  

It's not easy.  

And it's not in any way clear that it's the right thing to do.

And David Miller (IIRC) indicated that the kernel folks would probably
eliminate the non-standard C code in the next major kernel revision.

So, no, I don't think it's a priority for us to make any such change.
I can't really speak for others, but that's my take on the situation.

--
Mark Mitchell                   mark@codesourcery.com
CodeSourcery, LLC               http://www.codesourcery.com

^ permalink raw reply	[flat|nested] 218+ messages in thread

* Re: Linux and aliasing?
  1999-06-04  1:24                 ` Joe Buck
  1999-06-04  1:50                   ` Linus Torvalds
@ 1999-06-30 15:43                   ` Joe Buck
  1 sibling, 0 replies; 218+ messages in thread
From: Joe Buck @ 1999-06-30 15:43 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: mark, craig, davem, chip, egcs

Mark writes:
> > The only effect on your code will be that some optimizations that used
> > not to happen, but would with -fstrict-aliasing, will still not
> > happen.  What's the big deal?  If -fstrict-aliasing had never been
> > implemented, you wouldn't be complaining would you?  So, we've
> > improved GCC, and we've preserved the old behavior.

Linus writes:
> Oh, you don't expect me to complain about bad code generation when I know
> gcc could do better?

Oh, don't worry, we expect you to complain, and in a rude and insulting
manner at that.  We're used to it.  It seems that you were a nicer guy
before you had so many worshippers.

> Why do you have a "-O" flag at all if you think people don't care about
> performance?

Mark put in strict-aliasing because it is a big performance win.  Of
course he cares about performance.  The ISO rules were written in the
way they were written precisely to enable this performance improvement.

> I'd love to have the alias extensions, but I don't think it should be a
> per-file global setting. Sure, I can just be silent, but if you expect all
> egcs users to just sit idly when you do silly things, why do you bother
> making pre-releases available at all? You obviously don't care about the
> feedback you get from real users.

More rudeness and insults.  Of course we care.  Why do you insist on
talking to us like that?

> Sure, I can live with -fno-strict-aliasing. But I'm also really saddened
> by all the lawyers like you who think that standards are somehow more
> important than programmers. 

A compiler will do better the more aliasing possibilities it can
eliminate.  Mark used the ISO rules to determine what the set of
eliminatable aliases is.  You want to change this set to a smaller set, so
your programs will continue to work.  I understand that, I even
sympathize.  But you seem blind to the fact that this will inevitably make
some (possibly many) ISO-valid programs slower.  Possibly, with the right
rules, this set of slowed-down programs can be made very small.  Maybe
someone can donate a patch that will do this right.  But it's non-trivial,
and needs to be done carefully.

> I can see technical arguments. An argument of "it's really too painful to
> do" I can understand (preferably with an explanation, but hey, I don't
> mind getting told that it's too hard to explain). I use that argument
> every day myself. 

See above, or find someone to submit working code (you would ask this
in an equivalent situation on the kernel list).

> I think it's a damn shame that instead of technical arguments _everything_
> revolves around people reading the standard as if it was the bible, and
> trying to make people feel guilty for not really caring.

The standard is not arbitrary: it is the way it is for technical reasons,
specifically to make C a suitable language for numerical computation.
Without such rules serious number-crunchers have to switch to Fortran.

> just want to get good code without having to do magic contortions, guys.

We could flip the default for the flag, so that people have to write
-fstrict-aliasing to get the optimization.  Had we done that, you
never would have noticed.



^ permalink raw reply	[flat|nested] 218+ messages in thread

* Re: Linux and aliasing?
  1999-06-07  8:04                                             ` mark
@ 1999-06-30 15:43                                               ` mark
  0 siblings, 0 replies; 218+ messages in thread
From: mark @ 1999-06-30 15:43 UTC (permalink / raw)
  To: egcs; +Cc: ak, toon, law, jbuck, torvalds, craig, davem, chip, egcs

>>>>> "Jamie" == Jamie Lokier <egcs@tantalophile.demon.co.uk> writes:

    Jamie> mark@codesourcery.com wrote: *(foo*)(void*)(&x)
    >>  I intended this to be covered by my proposal.  This would
    >> officially be a "funny cast", and considered able to alias
    >> anything, provided that x is a variable or an expression of the
    >> form a->b or a.b.

    Jamie> Why the restriction on x?  Things I've seen around, that
    Jamie> are outside your proposal:

So that we can be sure this code is non-conforming before we pessimize
it.

    Jamie> Presumably reinterpret_cast is equivalent to one of your
    Jamie> "funny casts"?

I hadn't intended that.

--
Mark Mitchell                   mark@codesourcery.com
CodeSourcery, LLC               http://www.codesourcery.com

^ permalink raw reply	[flat|nested] 218+ messages in thread

* Re: Linux and aliasing?
  1999-06-04 10:31                             ` Joe Buck
  1999-06-04 10:53                               ` Jeffrey A Law
@ 1999-06-30 15:43                               ` Joe Buck
  1999-07-11 10:55                               ` Jeffrey A Law
  2 siblings, 0 replies; 218+ messages in thread
From: Joe Buck @ 1999-06-30 15:43 UTC (permalink / raw)
  To: law; +Cc: jbuck, torvalds, craig, mark, davem, chip, egcs

>   > Either
>   > 1. Leave it as it is (the Linux kernel will need -fno-strict-aliasing).
> This is my strong preference.

In that case, then all release announcements and NEWS should prominently
mention the effect of this new optimization and the -fno-strict-aliasing
flag, so that everyone has fair warning.

^ permalink raw reply	[flat|nested] 218+ messages in thread

* Re: Linux and aliasing?
  1999-06-05 10:37                                     ` mark
                                                         ` (2 preceding siblings ...)
  1999-06-05 12:41                                       ` Jamie Lokier
@ 1999-06-30 15:43                                       ` mark
  3 siblings, 0 replies; 218+ messages in thread
From: mark @ 1999-06-30 15:43 UTC (permalink / raw)
  To: ak; +Cc: toon, law, jbuck, torvalds, craig, davem, chip, egcs

>>>>> "Andi" == Andi Kleen <ak@muc.de> writes:

    Andi> Mark, even when you don't like it, would you as
    Andi> alias-expert-in-residence think that the basic strategy is
    Andi> workable?

I don't know what workable means.  

But, I would argue against your patch.  There are cases where a
pointer is cast to one type, and then cast back to another, and then
used.  These cases are conforming, and I think that Linus' proposal
will disable alias analysis in these cases.  That's bad.  Especially
since often these casts are to `void*' for the express purpose in
storing them in some kind of generic data structure.
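
For illustration, a minimal sketch of that conforming pattern (the `slot` type and its helpers are hypothetical names, not taken from any real code base): a pointer is cast to `void*` for storage and cast back to its original type before use.

```c
/* Hypothetical generic container slot: stores any object pointer as void *. */
struct slot {
    void *item;
};

void slot_put(struct slot *s, int *p)
{
    s->item = (void *) p;       /* int * -> void * */
}

int slot_get_value(const struct slot *s)
{
    int *p = (int *) s->item;   /* void * -> int *, the original type */
    return *p;                  /* conforming: the object really is an int */
}
```

A rule that treated every dereference after a cast as "may alias anything" would also pessimize `slot_get_value`, even though nothing non-conforming happens here.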

Note that I made an alternate, more circumspect, proposal, which has
been ignored by both Linus and yourself up until now, although there's
been so much traffic that one couldn't really expect anyone to keep up
with all of it:

  Put expressions of the form `*((foo*) (&x))' in alias set zero if
  x does not have type foo, or one of the types that is allowed
  to alias it.

This proposal only affects nonconforming code, and thus changing the
behavior of the compiler will not pessimize any conforming code.  It
is important that `x' be a variable, or a field of a variable, not an
arbitrary expression.  (For example, I don't think this should apply
to `*((foo*) (f()))' since that might be conforming.)  But, if
`x' is a variable, or of the form `x->y' or `x.y', then we should be
OK (it's not legal to talk about `x->y' if `x' is not of the right
type).
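
As a sketch of where the line would fall (the function names are invented for illustration, and a 32-bit `float` is assumed): the non-conforming form is shown only in a comment, next to two accesses that are already conforming and so would be unaffected.

```c
#include <stdint.h>
#include <string.h>

/* Covered by the proposal (non-conforming, shown only as a comment):
       float f = 1.0f;
       uint32_t u = *((uint32_t *) (&f));    -- f is not a uint32_t
   Not covered, because the access is already conforming: character
   types are allowed to alias any object.  */
unsigned char first_byte(float x)
{
    return *((unsigned char *) (&x));
}

/* A conforming way to obtain the same bits without relying on the
   proposal: copy the object representation (assumes sizeof(float) == 4). */
uint32_t bits_of(float f)
{
    uint32_t u;
    memcpy(&u, &f, sizeof u);
    return u;
}
```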

So, this proposal is, IMO, a workable extension of the standard
semantics.  I don't know if this covers all the cases in the kernel,
but it should be easier to change Linux to fit this model than the
strictly conforming one.

I'm also not sure if this is a good idea.  If we don't document this
behavior, we're not promising it to Linux.  So, we might break it
later.  If we *do* document it, then we have to promise to maintain
this behavior.  That's extra work for us; we have to be convinced
there's a good enough reason, and I'm not convinced yet.  The
questions are:

  o How badly does Linux need the extra cycles that might be squeezed
    out by this extra alias analysis?  How much faster will the 
    average Linux system go?

  o How hard would it be to fix the kernel?

  o How hard will this be to do in GCC?
  
  o What will the maintenance costs be?

  o What else could we all do with our time that would either improve
    Linux or GNU CC?

Your answers probably won't change my mind.  Not because I don't
respect them, but because I usually need to reach my own conclusions.
Hard numbers might change my mind, but I bet we don't have them.  We
can't know answers to the last four questions.  The first one could be
numerically estimated.  But, if you find a hot-spot in the kernel, you
could always make just enough of the kernel conforming to turn on
strict-aliasing there.

Question four is one of the most important, and historically has been
all too often ignored by GCC developers.  *Your* convenience now is
traded against *our* convenience later.  Today's easy hack may be
tomorrow's maintenance nightmare.  Overall, Linux and GCC are both
part of the GNU project (technically, I know that Linux may not be,
but in practice we're all on the same side), so we have to do what's
best for the project *as a whole*.

Technically, I suspect that code transformations in the front-ends
(and yes, I'm planning some, like inlining on trees) will make doing
this analysis in the middle-end difficult; we could miss in both
directions, putting things in alias set zero when we should not, and
vice versa.

I think to make the semantics robust, this analysis would have to be
done in the front-ends.  Note that this is a *syntactic* thing, not a
*semantic* thing; it's the use of cast syntax in expressions of a
particular form.  In other words:

  inline foo* f(bar* x) { return (foo*) x; }
  *(f(&x)) /* Does not go in alias set zero, even if all 
              inlining is done.  */

So, in summary, I think:

  o It's not clear we want this behavior that badly.
  o A correct implementation will be difficult.
  o There will be maintenance headaches.

Furthermore, I bet that by now, if all this energy had been spent
fixing the code in the kernel, you'd have made good headway on some of
the most prominent data structures.  Yes, this will be a tedious
chore, but it's an easy one: you enclose things in a union, compile,
see what doesn't, fix it, and go on.  
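
A small sketch of that kind of fix, under stated assumptions: `union addr32` and `sum_halves` are hypothetical names, not kernel code, and reading a union member other than the one last written is relied on here as GCC documents it.

```c
#include <stdint.h>

/* Hypothetical field that is sometimes handled as one 32-bit word and
   sometimes as two 16-bit halves.  Declaring both views in a union
   replaces code such as  *(uint16_t *) &word .  */
union addr32 {
    uint32_t word;
    uint16_t half[2];
};

/* Sum the two 16-bit halves of a word; the result does not depend on
   byte order.  */
uint32_t sum_halves(uint32_t w)
{
    union addr32 a;
    a.word = w;
    return (uint32_t) a.half[0] + a.half[1];
}
```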

--
Mark Mitchell                   mark@codesourcery.com
CodeSourcery, LLC               http://www.codesourcery.com

^ permalink raw reply	[flat|nested] 218+ messages in thread

* Re: Linux and aliasing?
  1999-06-05  5:45                                 ` Toon Moene
                                                     ` (2 preceding siblings ...)
  1999-06-06 23:20                                   ` Linux and aliasing? Jeffrey A Law
@ 1999-06-30 15:43                                   ` Toon Moene
  3 siblings, 0 replies; 218+ messages in thread
From: Toon Moene @ 1999-06-30 15:43 UTC (permalink / raw)
  To: law; +Cc: Joe Buck, Linus Torvalds, craig, mark, davem, chip, egcs, Andi Kleen

Jeffrey A Law wrote:

>   I wrote:

>   > Or those Fortran users (like me) who still do not understand how a
>   > strictly C performance enhancement can worsen the code generated for
>   > purely Fortran source, like it is the case for me (I use
>   > -fno-strict-aliasing since the end of February - and no, we Fortran
>   > users do not have a problem with aliasing; as I outlined on comp.arch,
>   > we outlawed it).

> I thought this was tracked down to the inability to re-share those auto
> arrays on the stack.  I also thought we had turned off strict aliasing
> for Fortran for precisely this reason.  Did I misunderstand the end result
> of that discussion?

This was suggested, but I replied that I didn't believe that to be the
reason.  Note that Fortran basically only has one "scope" for automatic
variables (whether arrays or scalars):  The complete subprogram (i.e.
subroutine or function).

That means that in the scope Mark's alias analysis works in, automatic
arrays are created precisely once (at the beginning of that scope) and
destroyed exactly once (at the end of said scope); hence, there is no
opportunity to re-use stack slots.

Strict aliasing isn't turned off, yet (quoting f/com.c):

  /* Set default options for Fortran.  */
  flag_move_all_movables = 1;
  flag_reduce_all_givs = 1;
  flag_argument_noalias = 2;
  flag_errno_math = 0;
  flag_complex_divide_method = 1;

I also feel uneasy about just turning it off - I prefer to first *know*
why it generates worse code.

Andi Kleen wrote:

> I wrote:

>> Exactly.  Remember that a standard is a contract between producer and
>> (end-)user, in our case:  between compiler writer and C programmer.
>> 
>>       "We won't optimize your constructs away as long as you program 
>>        according to said standard"

> Erm, there seem to be some misunderstandings about the C standard in
> this discussion.

Yep, that's what you get when you want to summarize standardese in
one-liners.  Note that I later wrote about freedom for the compiler
writer vs. freedom for the programmer (I think that better catches the
spirit of the Standard).

> So it imposes no requirements on what to do when it happen. This means 
> gcc is free to do what it wants. This includes unreasonable things, or 
> reasonable things. I think turning alias analysis off in this case is 
> reasonable, and of course fully standards compliant.

Ah, yes, but the discussion is whether we should have gcc generate
"reasonable" behaviour where "reasonable" is defined by a small group of
users.  Note that all "behaviours" not explicitly required by the
Standard are prone to:

1. Erosion (within a decade, gcc maintainers forget why we did this in
   the first place: "Hey, look at this code - what hair - and it is
   undefined behaviour according to the Standard in the first place;
   rip it out")

2. Contradiction (the C0X Standard defines the previously undefined
   behaviour, but in a way incompatible with the "reasonable" behaviour
   we thought up here).

Cheers,

[In 24 hours I'm off for my first X3J3 meeting - it shows, doesn't it?]

-- 
Toon Moene (toon@moene.indiv.nluug.nl)
Saturnushof 14, 3738 XG  Maartensdijk, The Netherlands
Phone: +31 346 214290; Fax: +31 346 214286
GNU Fortran: http://world.std.com/~burley/g77.html

^ permalink raw reply	[flat|nested] 218+ messages in thread

* Re: Linux and aliasing?
  1999-06-05 13:26                                       ` Jamie Lokier
  1999-06-05 19:35                                         ` Linus Torvalds
@ 1999-06-30 15:43                                         ` Jamie Lokier
  1 sibling, 0 replies; 218+ messages in thread
From: Jamie Lokier @ 1999-06-30 15:43 UTC (permalink / raw)
  To: Toon Moene
  Cc: Andi Kleen, law, Joe Buck, Linus Torvalds, craig, mark, davem,
	chip, egcs

Jamie's suggestion du jour.

First let me cover the bases.

  Linus wants non-union (ie. non-ugly) casts to do the sensible thing.
  What that is isn't quite clear -- after thinking it through.
  If it's just aesthetics, I don't see why a macro wouldn't do ;-)

  Dave Miller notes that Linux does want to take advantage of full alias
  analysis, alongside code that does hacks to minimise loads and stores
  on current architectures.  I think Dave's trying to eat two cakes at
  once, but since I hack network hardware for profit, I know exactly
  where he's coming from.

  Other folks whose names (I must apologise) seem to have blended
  together in my head right now, have to write compilers.  Good
  compilers that produce the best possible code for standards-conforming
  programs.  And then degrade gracefully for non-conforming programs ;-)

  In particular, a cast may be conforming, in which case the compiler
  should strive to generate the best allowed code (unless it's
  pathological).  A series of non-conforming looking casts might be
  conforming too -- cast to void * and back was pointed out as
  conforming _and_ commonplace.

  Issues raised: 

  - Lots of legacy code uses casts, assuming nothing weird will happen.
  - Weird things now happen.

  - "non-conforming cast implies pointer may alias all" + "full flow
    analysis" got proposed.  No one likes it.  Bin.

  - "non-... all" + "*no* flow analysis" got proposed by Linus.  It's a
    simple special case, arguably syntactic.  But it has semantic warts:
       *(foo_t*) &bar = foo; 
    now means something different than:
       { foo_t* p = (foo_t*) &bar; *p = foo; }

    I submit that those two forms ought to be "trivially equivalent" --
    we think in terms of data flow when we write code.  We also think
    that way when we think about how compilers expand expressions, how
    common subexpressions are combined and so on.  A fine distinction
    would, IMO, be a major language misfeature and a cause of many
    subtle bugs in future.

  - "type attribute" got proposed by Linus too.  This I like.

    Type attribute means you can write:
       *(foo_t __alias*)(&bar) = foo;
    and the equivalent:
       { foo_t __alias* p = (foo_t __alias*) &bar; *p = foo; }

    See how these _are_ trivially equivalent?
    The problem with this is that there are 2 times 10^30 ugly casts
    in the C cosmos that don't have such an attribute, of which 2 times
    10^29 are in the Linux kernel.
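
    A sketch of what the type-attribute form can look like in practice.
    GCC later documented a type attribute called may_alias that plays
    roughly the role of the __alias keyword discussed here.  It is a GCC
    extension, not ISO C, and the typedef and function names below are
    made up for illustration.  The two functions are meant to show the
    one-line cast and the temporary-pointer form behaving identically:

```c
#include <stdint.h>

/* GCC extension: accesses through this type are treated like character
   accesses for alias analysis.  The typedef name is hypothetical. */
typedef uint16_t __attribute__((__may_alias__)) u16_alias;

/* One-line form: cast and store in a single expression. */
void store_direct(uint32_t *word, uint16_t value)
{
    *(u16_alias *)word = value;
}

/* Same store, with the cast pointer held in a temporary first. */
void store_via_temp(uint32_t *word, uint16_t value)
{
    u16_alias *p = (u16_alias *)word;
    *p = value;
}
```

    Which half of the word is overwritten depends on the byte order of
    the machine; the point is only that both forms carry the attribute
    in the pointer type, so neither relies on where the cast appears.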

Jamie's thought of the day
--------------------------

    It looks like the compiler could spot the dodgy casts, including
    some standard-conforming ones, based solely on the types of the
    casts (with attributes).  It could warn that you may be getting
    non-alias optimisations you weren't expecting.  The word access
    casts in the Linux kernel (and lots of other code) would be prime
    candidates.  Hard-core fortran coders have this warning switched
    off.  The compiler proceeds to optimise anyway (you were warned).

    But if you include a suitable type attribute, possible aliasing is
    implied and you don't get the warning.  You get sensible code out.

    There's an alternative attribute, for programs that mix and match,
    where possible aliasing is _not_ implied but you still don't get
    the warning.  (A bit like __attribute__((unused)) is used solely to
    suppress warnings).

    Thus your thoroughly non-conforming kernel code starts with lots of
    warnings.  You add the __mayalias (or whatever) keyword into all the
    structure manipulation casts -- which the compiler helpfully pointed
    you to.  You add the __doesntalias keyword (is it the same as
    __restrict?) to all those places where you wrote conforming code and
    really do want it fully optimised.  So you get the best of all
    worlds and the compiler helps you get there.

    This way we get:

       - Full optimisation of conforming code.
       - Best optimisation of mostly-conforming code with dodgy casts.
       - No dubious alias flow analysis -- keeps things simple.
       - Code transformations such as rearranging intermediate
         values between expressions, extracting intermediates or merging
         them, continue to be valid transformations.  (IMO v.important).
       - The compiler tells us where to think about aliasing issues.
       - When all the warnings have gone away, then you _know_ it's safe to
         actually use the output of -fstrict-aliasing.
       - If you're confident the code is conformant anyway, turn off
         the warning.

    Does this seem (a) implementable, (b) a good, incremental
    maintenance path for the kernel authors?

Enjoy,
-- Jamie

* Re: Linux and aliasing?
  1999-06-05 12:41                                       ` Jamie Lokier
  1999-06-05 14:43                                         ` Martin v. Loewis
  1999-06-05 16:53                                         ` mark
@ 1999-06-30 15:43                                         ` Jamie Lokier
  2 siblings, 0 replies; 218+ messages in thread
From: Jamie Lokier @ 1999-06-30 15:43 UTC (permalink / raw)
  To: mark; +Cc: ak, toon, law, jbuck, torvalds, craig, davem, chip, egcs

mark@codesourcery.com suggests:

>   Put expressions of the form `*((foo*) (&x))' in alias set zero if
>   x does not have type foo, or one of the types that is allowed
>   to alias it.
> 
> This proposal only affects nonconforming code, and thus changing the
> behavior of the compiler will not pessimize any conforming code.

I don't like this because it's not what we do in C++.
In C++ when we want to do these naughty things, we used to do:

  *(foo*)(void*)(&x)

I bet there's still a fair bit of that around.
In the modern world we've got *reinterpret_cast<foo*>(&x), which
is presumably treated specially w.r.t. aliases.

[Could someone tell me if reinterpret_cast does the right thing with
aliases please?]

thanks,
-- Jamie

* Re: Linux and aliasing?
  1999-06-03 23:45             ` mark
  1999-06-04  0:04               ` Linus Torvalds
@ 1999-06-30 15:43               ` mark
  1 sibling, 0 replies; 218+ messages in thread
From: mark @ 1999-06-30 15:43 UTC (permalink / raw)
  To: torvalds; +Cc: craig, davem, chip, egcs

I don't think the cast rule is by any means the right obvious default.
For one thing, it pessimizes object-oriented C code that does
downcasts through an inheritance hierarchy.  There's no reason that we
shouldn't be able to use type-based alias analysis in such situations,
but your proposal would make it not happen.

You can use -fno-strict-aliasing to get the "traditional" behavior.
The only effect on your code will be that some optimizations that used
not to happen, but would with -fstrict-aliasing, will still not
happen.  What's the big deal?  If -fstrict-aliasing had never been
implemented, you wouldn't be complaining would you?  So, we've
improved GCC, and we've preserved the old behavior.

GCC has plenty of odd rules and way too many options.  We don't need
more.  The exception, I think, is when there's something that you can
only do with a language extension or special flag.  Extended asm's are
one such; you just can't do it without an extension.  So, we have it.

But, here, you just don't like ANSI/ISO C, and wish it had different
semantics.  You *could* express what you want in legal ANSI/ISO C, and
then GCC would do the right thing, with its default flags.

If we come up with a rule that turns off strict aliasing only for code
which is not legal ANSI/ISO C, then perhaps we should issue a warning
(on the illegal construct) and then turn off strict aliasing.  But, I
don't think you should expect us to do this any time soon (without 
financial incentive, or a noble volunteer).

--
Mark Mitchell                   mark@codesourcery.com
CodeSourcery, LLC               http://www.codesourcery.com

* Re: Linux and aliasing?
  1999-06-07  9:32                                                       ` Joe Buck
@ 1999-06-30 15:43                                                         ` Joe Buck
  0 siblings, 0 replies; 218+ messages in thread
From: Joe Buck @ 1999-06-30 15:43 UTC (permalink / raw)
  To: David S. Miller; +Cc: mark, jason, martin, egcs

David Miller writes:

> One issue which seems to not be mentioned explicitly, is that such a
> change is typically not of the "flag day" variety, which turning it on
> for the next release seems to imply.
> 
> Other compiler vendors seem to have done it in two stages:

> 1) Ok, the strict aliasing is there, but not a default optimization,
>    you have to enable it explicitly.  But come next release it will be
>    on by default and thus you have ample time to fixup your code.

> 2) It's on by default in this new subsequent release, we warned you.

I'm inclined to agree with you.  This is why I suggested that it perhaps
shouldn't be the default for gcc-2.95.  This is still a possibility (it's
a couple of lines of code at most to flip the default).

One question we don't really have data on is how many programs will break.
If it's only the Linux kernel and a couple of others, we can just use
-fno-strict-aliasing for the affected programs. If many programs are
affected, we don't want to break them all, causing massive inconvenience
to users and developers and damaging the reputation of egcs/gcc.

* Re: Linux and aliasing?
  1999-06-05 18:48                                       ` Linus Torvalds
@ 1999-06-30 15:43                                         ` Linus Torvalds
  0 siblings, 0 replies; 218+ messages in thread
From: Linus Torvalds @ 1999-06-30 15:43 UTC (permalink / raw)
  To: Toon Moene; +Cc: Andi Kleen, law, Joe Buck, craig, mark, davem, chip, egcs

On Sat, 5 Jun 1999, Toon Moene wrote:
> 
> Be careful to not run in circles here:  gcc generates "some" code that's
> allowed because the construct invokes `undefined behaviour'.  That
> doesn't make it "faulty" - just undefined.

Sure. But wouldn't it be nice if the undefined behaviour did what the
programmer obviously meant?

You can see it as a quality of implementation issue - you're _allowed_ to
do anything under the standard, and the ANSI C standard doesn't for
example _require_ that any compiler generate efficient code - but a quality
of implementation obviously means that you want to not just say "the
standard doesn't say that you have to generate good code, so we don't
optimize".

A quality of implementation issue says that you'd want to not just do what
the standard requires, but that there are other issues that the standard
just leaves at the discretion of the compiler implementer.

> If you think so, bring it up in comp.std.c.  At least that's the
> ultimate criterion I use:  If I can explain an extension to the Fortran
> Standard coherently on comp.lang.fortran (where all the J3 members
> listen in), and no-one shoots it down in two weeks time, it might indeed
> have some value.
> 
> Success !

That's probably a good idea.

		Linus

* Re: Linux and aliasing?
  1999-06-05 21:38                   ` Jakub Jelinek
@ 1999-06-30 15:43                     ` Jakub Jelinek
  0 siblings, 0 replies; 218+ messages in thread
From: Jakub Jelinek @ 1999-06-30 15:43 UTC (permalink / raw)
  To: Richard Henderson; +Cc: Linus Torvalds, craig, davem, mark, chip, egcs

> If this is all you want, you can get this with a union and
> judicious use of macros --
> 
>   #define noalias(type, ptr) (((union { type __x__; } *)(ptr))->__x__)
> 
>   s = noalias(short, ps);
> 
> Which doesn't strike me as too horrible syntax for public
> consumption.  Note that this works because it is the access
> to the union's member that nulls the alias set, not the
> cast to the union type.

I would not mind changing code to look like this, but I think it would be
much better if Mark or somebody else implemented putting the problematic
cast dereferences into alias set zero AND provided some warning option
which would trigger a warning in such a case. Thus, when somebody writes
non-conforming code, it would work even with -fstrict-aliasing, albeit
slower, but if he cared about performance, he could inspect the warnings
after specifically enabling this kind of warning and use either noalias
macro or rewrite things using unions to speed things up.
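
A minimal sketch of the "rewrite things using unions" route, assuming
GCC's documented behaviour that reading a union member other than the one
last written reinterprets the stored bytes, provided the access goes
through the union type.  The union and function names are hypothetical:

```c
/* Overlay an int with an array of shorts covering the same storage. */
union int_shorts {
    int   i;
    short s[sizeof(int) / sizeof(short)];
};

/* Return the first short-sized piece of value, read through the union
   member rather than through a cast pointer. */
short first_short_of(int value)
{
    union int_shorts u;
    u.i = value;      /* write as int */
    return u.s[0];    /* read the same bytes back as short */
}
```

Which piece of the int comes back as s[0] depends on byte order, so
code like this still needs care when it is meant to be portable.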

Cheers,
    Jakub
___________________________________________________________________
Jakub Jelinek | jj@sunsite.mff.cuni.cz | http://sunsite.mff.cuni.cz
Administrator of SunSITE Czech Republic, MFF, Charles University
___________________________________________________________________
UltraLinux  |  http://ultra.linux.cz/  |  http://ultra.penguin.cz/
Linux version 2.3.4 on a sparc64 machine (1343.49 BogoMips)
___________________________________________________________________

* Re: Linux and aliasing?
  1999-06-07  9:29                                   ` Linus Torvalds
  1999-06-07  9:38                                     ` Tim Hollebeek
@ 1999-06-30 15:43                                     ` Linus Torvalds
  1 sibling, 0 replies; 218+ messages in thread
From: Linus Torvalds @ 1999-06-30 15:43 UTC (permalink / raw)
  To: mark; +Cc: rth, tim, craig, davem, chip, egcs

On Mon, 7 Jun 1999 mark@codesourcery.com wrote:
> 
>     Linus> My proposal might mean that fewer people will use the
>     Linus> "-fno-strict-aliasing" switch, because they won't have to. I
>     Linus> don't think you realize how most professional software
>     Linus> projects work. The "professional" part means that people
>     Linus> are under a deadline and don't really care about your
>     Linus> standards conformance, they want things to WORK.
> 
> Please don't make these kinds of statements.  They're not becoming.

What? It's not about being becoming. It's about how things work. Not in
all places, I'll give you that, and not in the best places. But I think
you have rose-coloured glasses if you think most software projects are
going to spend a lot of time to try to get everything to run perfectly.

People write so-so code, and then they hope that by using -O2 the compiler
will make it good. When it doesn't, and when they find out it was due to
strict aliasing rules (if they ever do - the more likely scenario is
that they'll ship the binary compiled with just -O), they'll turn it off
rather than fight it.

You obviously disagree. This isn't technology, so there isn't "one right
answer",

		Linus

* Re: Linux and aliasing?
  1999-06-05 18:40                   ` Linus Torvalds
@ 1999-06-30 15:43                     ` Linus Torvalds
  0 siblings, 0 replies; 218+ messages in thread
From: Linus Torvalds @ 1999-06-30 15:43 UTC (permalink / raw)
  To: Richard Henderson; +Cc: craig, davem, mark, chip, egcs

On Sat, 5 Jun 1999, Richard Henderson wrote:
> 
> So what you're saying is, you don't mind fixing up alias
> problems on a local scale?  You're not expecting to get 
> away with no source code changes?

Right. I expect to get away with fairly minimal source code changes, and
I'd expect that some of the common codes just will work the way the
programmer intended them without people having to even worry about it.

The problem I have with the union approach (even when hidden behind a
macro like yours - which makes things better) is that I'd be much happier
if the "obvious" code just worked. The less people have to really be aware
of the alias issues, the better.

		Linus

* Re: Linux and aliasing?
  1999-06-05 10:32                                     ` Toon Moene
  1999-06-05 13:26                                       ` Jamie Lokier
  1999-06-05 18:48                                       ` Linus Torvalds
@ 1999-06-30 15:43                                       ` Toon Moene
  2 siblings, 0 replies; 218+ messages in thread
From: Toon Moene @ 1999-06-30 15:43 UTC (permalink / raw)
  To: Andi Kleen; +Cc: law, Joe Buck, Linus Torvalds, craig, mark, davem, chip, egcs

Andi Kleen wrote:

> On Sat, Jun 05, 1999 at 02:37:20PM +0200, I wrote:

> > Ah, yes, but the discussion is whether we should have gcc generate
> > "reasonable" behaviour where "reasonable" is defined by a small group of
> > users.  Note that all "behaviours" not explicitly required by the
> > Standard are prone to:

> Generating faulty code is in my book always unreasonable, even when
> the source is not strictly conforming (and the compiler has a realistic
> chance to detect it).

Be careful to not run in circles here:  gcc generates "some" code that's
allowed because the construct invokes `undefined behaviour'.  That
doesn't make it "faulty" - just undefined.

> The argument that it may inhibit some optimizations for strictly conforming
> programs I also cannot follow. As I understand it there are basically two
> cases:

It does if you have to apply a compiler option to prevent this
optimisation - because in that case the optimisation will be prevented
for the whole compilation unit (a source file).

> > 1. Erosion (within a decade, gcc maintainers forget why we did this in
> >    the first place: "Hey, look at this code - what hair - and it is
> >    undefined behaviour according to the Standard in the first place;
> >    rip it out")

> If it is clearly documented that will not happen.

Yeah, sure.  Unfortunately, if the "correct" treatment of this feature
means to change a dozen source files (and rth's comments make me fear
that that's the case), the chance that someone, somewhere forgets to say
exactly why these changes were necessary (and on what other changes in
other files they depend) is far larger than I want to consider.  We've
seen this before.  Lucky we are that some long time gcc-hackers are
still among us, who might remark:  Oh yes, that's undefined by the C
Standard, but it happens to be an extension gcc supports ...  Seen that,
got the T-shirt.

> > 2. Contradiction (the C0X Standard defines the previously undefined
> >    behaviour, but in a way incompatible with the "reasonable" behaviour
> >    we thought up here).
> 
> I don't think that such a vague possibility should guide a gcc design
> decision ("in 30 years an asteroid may crash onto earth and ruin your
> whole day - don't implement it because the exception handlers don't handle
> that event") Also there is no cue in the future directions that that may
> happen. In any case it wouldn't strike me as a strong enough argument
> to suppress a useful feature.

If you think so, bring it up in comp.std.c.  At least that's the
ultimate criterion I use:  If I can explain an extension to the Fortran
Standard coherently on comp.lang.fortran (where all the J3 members
listen in), and no-one shoots it down in two weeks time, it might indeed
have some value.

Success !

[No, there's no smiley here - I really think you should try that route,
 because it's the only sane way.]

-- 
Toon Moene (toon@moene.indiv.nluug.nl)
Saturnushof 14, 3738 XG  Maartensdijk, The Netherlands
Phone: +31 346 214290; Fax: +31 346 214286
GNU Fortran: http://world.std.com/~burley/g77.html

* Re: Linux and aliasing?
  1999-06-07  2:14                                                 ` Jason Merrill
  1999-06-07  8:02                                                   ` mark
  1999-06-07 13:11                                                   ` Jeffrey A Law
@ 1999-06-30 15:43                                                   ` Jason Merrill
  2 siblings, 0 replies; 218+ messages in thread
From: Jason Merrill @ 1999-06-30 15:43 UTC (permalink / raw)
  To: Martin v. Loewis; +Cc: egcs

>>>>> Martin v Loewis <martin@mira.isdn.cs.tu-berlin.de> writes:

 > We don't really emit *bad* code: garbage in, garbage out. People will
 > run into problems, yes. We will advertise this new feature in big
 > letters, and people will recompile with -fno-strict-aliasing, and then
 > see whether it still breaks (for some other reason). If it was an
 > aliasing problem, we can tell them that their code was not C.

My problem with this is that only people who read the release notes for
*this release* will see the big letters.  Meanwhile, people who start with
a later version or aren't involved in deploying the tools or whatever won't
see the warning.  Meanwhile, casts are an intuitive way to achieve the
desired effect, much more obvious than unions, so people who haven't been
explicitly warned will continue to write code that uses unsafe casts,
without realizing that it will break.

Saying "garbage in, garbage out" is a cop-out.  If we're going to let code
like this break, we need to emit a warning so that people know that they
have a problem, rather than leaving them to debug obscure problems.

BTW, the union trick isn't part of C, either.  It's a GCC implementation
choice.

Jason

* Re: Linux and aliasing?
  1999-06-04 13:39                               ` Alexandre Oliva
@ 1999-06-30 15:43                                 ` Alexandre Oliva
  0 siblings, 0 replies; 218+ messages in thread
From: Alexandre Oliva @ 1999-06-30 15:43 UTC (permalink / raw)
  To: Joe Buck; +Cc: torvalds, craig, mark, davem, chip, egcs

I had written:

>> It would mean that the resulting pointer may be aliased to anything
>> else, so the compiler shouldn't move it around nor optimize it ``too
>> much''.

On Jun  4, 1999, Joe Buck <jbuck@Synopsys.COM> wrote:

> For this reason, the effect of proposals like this might be similar
> in performance to just saying, if a function contains a cast of a
> pointer that is later dereferenced, apply -fno-strict-aliasing to
> the entire function, or at least a significant chunk of it.

The main difference is that the notation I propose could be used in
macros.  It is true that it could have a global effect on the
optimization of any function that uses it, but the requirement
statement would be closely linked to its use, which is good, so that
you wouldn't have to maintain special Makefile rules because such and
such files haven't been `union'ized (yet?) to make them
ANSI-aliasing-safe.

-- 
Alexandre Oliva http://www.dcc.unicamp.br/~oliva IC-Unicamp, Bra[sz]il
{oliva,Alexandre.Oliva}@dcc.unicamp.br  aoliva@{acm.org,computer.org}
oliva@{gnu.org,kaffe.org,{egcs,sourceware}.cygnus.com,samba.org}
*** E-mail about software projects will be forwarded to mailing lists

* Re: Linux and aliasing?
  1999-06-04 15:20                 ` Richard Henderson
  1999-06-05  9:50                   ` Linus Torvalds
@ 1999-06-30 15:43                   ` Richard Henderson
  1 sibling, 0 replies; 218+ messages in thread
From: Richard Henderson @ 1999-06-30 15:43 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Tim Hollebeek, craig, davem, mark, chip, egcs

On Fri, Jun 04, 1999 at 08:53:47AM -0700, Linus Torvalds wrote:
> > Unless you're suggesting data flow analysis to figure out which
> > pointers values could have been derived from a casted pointer??? ick,
> > ick, ick.
> 
> Oh, no, no, no. Shudder. I hope nobody took it that way. Barf.

I did.  I'm wary that anything less wouldn't be good enough.

> I meant the features as something to expressly allow a local override.
> Think of the rule more as an issue of "poisoning" the dereference operator
> rather than poisoning the _pointer_. In a kind of silly "precedence rule"
> notation, it would be
> 
> 	*(char *)y
> 
> becomes (*(char *)) y where it is the "*(char *)" thing that makes the
> alias go away. (Now somebody is going to flame my ass off for mixing C and
> a non-C precedence rule). 

This doesn't handle

	extern inline int foo(short *ptr)
	{
		return *ptr;
	}

	int bar(void)
	{
		int i;
		i = 0;
		return foo((short *)&i);
	}

Which isn't unlike some uses of inlines in the kernel.
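
One conforming way to write such a helper, assuming the memcpy approach
suggested elsewhere in this thread, is to copy the bytes instead of
dereferencing a cast pointer.  The helper name is hypothetical and this
is only a sketch, not what the kernel does:

```c
#include <string.h>

/* Read a short from arbitrary storage without a typed dereference. */
static inline int read_short(const void *ptr)
{
	short s;
	memcpy(&s, ptr, sizeof s);	/* byte copy; no short lvalue touches *ptr */
	return s;
}

int bar(void)
{
	int i;
	i = 0;
	return read_short(&i);	/* no cast needed: any object pointer converts to void * */
}
```

Because the callee never forms a short lvalue for the int object, it no
longer matters whether the call is inlined or where the cast sits.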


r~

* Re: Linux and aliasing?
  1999-06-03 12:03     ` mark
  1999-06-03 12:25       ` David S. Miller
  1999-06-03 13:31       ` Andi Kleen
@ 1999-06-30 15:43       ` mark
  2 siblings, 0 replies; 218+ messages in thread
From: mark @ 1999-06-30 15:43 UTC (permalink / raw)
  To: davem; +Cc: chip, egcs, torvalds

>>>>> "David" == David S Miller <davem@redhat.com> writes:

    David> Actually, I've changed my mind.

OK, sorry to misrepresent your position.

    David> I'm not saying this should be the normal mode of operation,
    David> but some mechanism needs to exist so that such code can be
    David> made valid _without_ resorting to ugly unions.

There is one: -fno-strict-aliasing.  

You can turn off the optimization.  Then, however, if you complain
that the kernel would go faster if type-based alias analysis is in
use, you're out of luck.  But, this is no worse off than you were
before the optimization existed, and Linux worked pretty well in those
days too.

Despite what you say, you could just use some unions.  IMO, it
wouldn't take that much to fix up the TCP_IPV4_MATCH macro.  I'm sorry
the socket structure would become uglier, but, on the other hand, it
would make more obvious what exactly it is.  Right now, some of the
fields in the structure definition are really acting as unions, and
you're not making that clear to the reader of the code.

    David> You know exactly what I'm doing there, and so do I, why
    David> can't egcs figure it out that easily as well?

Good question, but you know very well that it's rhetorical. :-)  There
are lots of situations where it's obvious what should happen to the
programmer, but highly non-trivial to do in the compiler, if possible
at all.

I agree that in this case a special hack that says "if the access is
through the address of a variable/field of one type, but cast to
another type, then the user is doing something fishy, and we should
treat the access as if it were done with char*" is not ridiculous, and
not impossible to implement.  You might consider implementing this, or
hiring someone else to do it for you.
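
For context on "as if it were done with char*": character-type accesses
are the case the standard already allows to alias any object.  A small
sketch with a hypothetical function name:

```c
#include <stddef.h>

/* Count the nonzero bytes of an object by viewing it as unsigned char,
   which is permitted to alias any object type. */
size_t count_nonzero_bytes(const void *obj, size_t size)
{
    const unsigned char *bytes = (const unsigned char *)obj;
    size_t count = 0;
    size_t k;

    for (k = 0; k < size; k++) {
        if (bytes[k] != 0)
            count++;
    }
    return count;
}
```

The hack described above would give a fishy cast access the same
"may touch anything" treatment that these byte reads get.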

--
Mark Mitchell                   mark@codesourcery.com
CodeSourcery, LLC               http://www.codesourcery.com

* Re: Linux and aliasing?
  1999-06-04  8:53                   ` Jeffrey A Law
@ 1999-06-30 15:43                     ` Jeffrey A Law
  0 siblings, 0 replies; 218+ messages in thread
From: Jeffrey A Law @ 1999-06-30 15:43 UTC (permalink / raw)
  To: Tim Hollebeek; +Cc: Linus Torvalds, mark, craig, davem, chip, egcs

  In message < 199906041541.LAA27121@wagner.Princeton.EDU >you write:
  > I think it's a damn shame certain people can't be disagreed with without
  > insulting their opponents.
I couldn't agree more.

Folks, if you feel you must take a pot-shot at someone, do so privately.  We've
got better things to do than get into a pissing match.

jeff


* Re: Linux and aliasing?
  1999-06-07  8:58                               ` Linus Torvalds
  1999-06-07  9:18                                 ` mark
  1999-06-07 13:34                                 ` Jamie Lokier
@ 1999-06-30 15:43                                 ` Linus Torvalds
  2 siblings, 0 replies; 218+ messages in thread
From: Linus Torvalds @ 1999-06-30 15:43 UTC (permalink / raw)
  To: mark; +Cc: rth, tim, craig, davem, chip, egcs

On Sun, 6 Jun 1999 mark@codesourcery.com wrote:
> 
> BTW, I've been notified in private mail that you pointed out a bug in
> GCC's real.c, involving exactly the kinds of casts we're arguing about.
> (I somehow missed that message from you.)   Thanks for pointing that
> out!  I'll fix it soon.

Note that I didn't point it out as a kind of "nyaah, nyaah!" kind of
thing: it just happens that I had the gcc sources on-line and thought I'd
idly check whether it looked like it could have problems just to make
people realize how PERMEATING this is.

> I gather that you suggested your proposal would avoid changing GCC.
> But, it wouldn't, since GCC's first stage is compiled with a (possibly
> non-GCC) host compiler.  Thus, GCC *must* be written in legal ANSI/ISO
> C. 

My proposal is not just about "avoiding changing X", whether X be gcc, the
kernel, or anything else.

What I _really_ wanted to point out is that even among the people who (a)
should know and (b) now quote the standard as a legal reason to do
anything, these kinds of things happen. 

My proposal is really a way of saying "ok, there is old code out there,
and we want to try to be as graceful about it as we can".

In the case of the Linux kernel, that "gracefulness" would be something I
would be really happy to take advantage of, as I don't expect to compile
the kernel with much else.

> Even in the kernel, your proposal will lead to a confusing situation.
> You claim it's DWIM, but there the "I" really is "Linus Torvalds", and
> not necessarily the rest of us.  People used to the ANSI/ISO C
> aliasing rules will have to read the GCC manual very carefully to
> figure out the meaning of your code.

No. People used to the ANSI/ISO C aliasing rules (all five of them) will
just point to the code and say it is not strictly conforming, and then
they will go back to building their ivory towers. 

It's not just the kernel. It's not just gcc. I bet there are things like
this in just about all major projects - some of which we'll never see
source code for. 

My proposal might mean that fewer people will use the "-fno-strict-aliasing"
switch, because they won't have to. I don't think you realize how most
professional software projects work. The "professional" part means that
people are under a deadline and don't really care about your standards
conformance, they want things to WORK. 

That may not be your definition of professional, but it's a fact of life. 

That means that I suspect that if there isn't some simple workaround (like
mine), then it's not just the kernel project that uses the disable switch.
Is that what you want?

Flexibility is a GOOD thing. Even if that flexibility means "Oh, you don't
_have_ to program to the standard, and I'll still try to do the best I
can". 

Think of it this way: you still support "-traditional -O2" - you try to
generate good code even when presented with C that isn't even called C any
more.

Why? Because the code is out there, and it's not worth changing thousands
of software packages when you can instead change one: the compiler.

> I think by now you've been presented with a variety of strategies for
> solving the problem in the kernel, including more than one idea for
> macros that you could use like:
> 
>   ALIASING_CAST (type, x)

I've been told in private email, that the proposed macro wasn't even
standards conforming in the sense that it doesn't guarantee that the
compiler couldn't decide it aliases (because in order to guarantee that
the union should contain all possible types). It happens to work for gcc. 

I don't know whether that is true - I don't have the official standard
around. But you might want to check that out.

> that would do what you want.  I believe Richard Henderson suggested
> one involving local unions; you could also use memcpy as I suggested.

Or I could use "-fno-strict-aliasing" which is actually preferable to
starting to introduce ugly code.

I think it was Craig who complained about maintenance. "Ugly code" is a
big maintenance issue, and it's always much much better if the "obvious"
code works even if it is not "strictly conforming". The kernel doesn't try
to be strictly conforming anyway, we use tons of other things.

> Even if we implemented your proposal you'd have to audit all your code
> to make sure that all the technically invalid casts come in
> expressions that are immediately dereferenced, and not stored in
> temporaries.

Sure. But it wouldn't result in horribly ugly code.

I'm not using egcs at the moment. As such, I'm just seeing reports saying
that it's broken wrt the kernel. My reaction is still that people should
just use gcc-2.7.2, because it's just too _painful_ to upgrade to egcs.
Oh, well..

> At this point, I strongly suggest you abandon your proposal.  Nobody
> looks likely to implement it (at least on a volunteer basis),

Andi Kleen already said he was playing with patches that implemented it,
but just ignore that, like you ignore all the other arguments I've
presented. Sorry,

		Linus

^ permalink raw reply	[flat|nested] 218+ messages in thread

* Re: Linux and aliasing?
  1999-06-05  9:35               ` Linus Torvalds
  1999-06-05 13:34                 ` Richard Henderson
@ 1999-06-30 15:43                 ` Linus Torvalds
  1 sibling, 0 replies; 218+ messages in thread
From: Linus Torvalds @ 1999-06-30 15:43 UTC (permalink / raw)
  To: Richard Henderson; +Cc: craig, davem, mark, chip, egcs

On Fri, 4 Jun 1999, Richard Henderson wrote:
> 
> Doing what you want is actually very hard for GCC right now.  Consider
> 
> 	int i;
> 	short s, *ps = (short *)&i;
> 	i = 0;
> 	s = *ps;

Note that while the kernel may contain constructs like the above, I never
meant for the "extended rule" to cover them. We'd have to fix them up.

The "pointer cast rule" was meant to allow people to know about and
override the type-based aliasing - it wasn't meant to handle every single
pointer cast automatically being non-aliased. I would consider such
behaviour to be basically (a) unimplementable and (b) too non-local.

I obviously didn't explain that very well, although I hope my later email
about the _implementation_ side explained the details more clearly.

The concept was never meant to avoid alias information on any global
scale. I think type-based alias information is important. It was meant to
be a syntactically simple way to override =specific= instances where the
programmer knows he is playing games with typing.

As an example, the above sequence obviously has an alias problem as it
stands now. My suggestion would _not_ make the above code generate
anything different at all. The only thing my suggestion really does is
give the programmer a chance to say "oh, I see: the above worked in the
original ANSI C, but it does not work with the new one, and I only care
about gcc anyway, so I can do the quick fix by just adding the cast":

	s = *(short *)ps;

Note that the cast above in C terms is a no-op: it casts a short pointer
to a short pointer, but it would be a way to tell gcc that this access
should not be aliased.
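
For comparison, the memcpy route Mark suggested earlier in the thread reads
the same bytes without dereferencing a pointer of the wrong type. This is
only a minimal sketch: the helper name read_first_short is made up for
illustration and comes from neither the kernel nor gcc.

```c
#include <string.h>

/* Hypothetical helper: return the first sizeof(short) bytes of *ip
   as a short.  memcpy copies the object representation, so no lvalue
   of the wrong type is ever dereferenced. */
static short read_first_short(const int *ip)
{
    short s;

    memcpy(&s, ip, sizeof s);
    return s;
}
```

Which half of the int this returns depends on the byte order of the
machine, exactly as with the cast version.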

> Due to a long-ago quirk of history, GCC processes the abstract syntax
> tree one statement at a time, so the fact of the cast is long gone by
> the time we do the dereference.

I agree 100% with the concern you raise, and I'd just like to say that
that was never the intention. Having some kind of complete flow would
obviously be a very broken concept, and I fully understand the horror
people felt if they thought that was what I proposed.

		Linus

^ permalink raw reply	[flat|nested] 218+ messages in thread

* Re: Linux and aliasing?
  1999-06-06  1:18                                           ` Martin v. Loewis
  1999-06-06 10:46                                             ` Linus Torvalds
  1999-06-06 17:56                                             ` Jason Merrill
@ 1999-06-30 15:43                                             ` Martin v. Loewis
  2 siblings, 0 replies; 218+ messages in thread
From: Martin v. Loewis @ 1999-06-30 15:43 UTC (permalink / raw)
  To: torvalds; +Cc: egcs

> It seems that other people use more casts for "normal" things, and are
> actually afraid of my proposal for performance reasons. I'm surprised:
> people that do things like that are usually not the people who complain
> about others' coding standards ;)

Well, no. The 'normal' kind of cast is very common, and frequently
used in the Linux kernel. For example, if a tty driver routine is
called (e.g. drivers/char/rocket.c :-), it fetches driver_data and
casts it to the device-specific type (i.e. (struct r_port *)).

In these cases, people typically save the cast result in a variable
instead of dereferencing it, so they would not suffer from your
anti-aliasing mechanism. These uses of casts are conforming C code:
The driver put an r_port pointer into driver_data earlier on.
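
A minimal sketch of that conforming pattern, with made-up stand-in types
(port_like and tty_like are hypothetical; the real kernel structures have
many more fields):

```c
/* Hypothetical stand-ins for a tty structure and a driver-private
   port structure. */
struct port_like { int line; };
struct tty_like { void *driver_data; };

/* The driver stores a pointer to its own structure earlier on. */
static void attach_port(struct tty_like *tty, struct port_like *port)
{
    tty->driver_data = port;
}

/* Later the pointer is cast back to the type that was stored and kept
   in a variable.  The object is only ever accessed as a port_like,
   so type-based alias analysis has nothing to object to. */
static int port_line(const struct tty_like *tty)
{
    struct port_like *info = (struct port_like *)tty->driver_data;

    return info ? info->line : -1;
}
```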

> Anyway, I grepped the kernel for "likely" places where my change would
> make a difference by using the following heuristic grep:
> 
> 	grep '\*(.*\* *)' */*.c
> 
> and in basically all cases the compiler would have done the right thing if
> it had followed my proposal.

It is actually the other casts that the Linux contributors need to
worry about. Alias problems are very hard to find (as you pointed
out), and somebody will have to go over the complete kernel source and
investigate every single cast - if you ever plan to turn-on
-fstrict-aliasing.

I did the inverse grep

       grep '(.*\* *)' */*.c | grep -v '\*(.*\* *)'

and found only one place (fs/binfmt_aout.c:create_aout_tables) where
pointers are aliased in different types, and dereferenced later. The
hidden treasures are probably in the header files (as earlier examples
indicate).

Regards,
Martin

^ permalink raw reply	[flat|nested] 218+ messages in thread

* Re: Linux and aliasing?
  1999-06-04  8:57                   ` Linus Torvalds
  1999-06-04  9:02                     ` Jean-Pierre Radley
@ 1999-06-30 15:43                     ` Linus Torvalds
  1 sibling, 0 replies; 218+ messages in thread
From: Linus Torvalds @ 1999-06-30 15:43 UTC (permalink / raw)
  To: craig; +Cc: davem, mark, chip, egcs

On 4 Jun 1999 craig@jcb-sc.com wrote:
> 
> C is simply a poor language for the task at hand.

Well, I do agree. But at the same time I also disagree, for the obvious
reason that it's still the =best= language for the task at hand. So I can
only hope to make it better for it rather than make it worse.

And I do know that other people have other concerns. Which is why I think a
flexible approach which allows people to express those concerns would be
such a nice thing. 

		Linus

^ permalink raw reply	[flat|nested] 218+ messages in thread

* Re: Linux and aliasing?
  1999-06-05 14:43                                         ` Martin v. Loewis
@ 1999-06-30 15:43                                           ` Martin v. Loewis
  0 siblings, 0 replies; 218+ messages in thread
From: Martin v. Loewis @ 1999-06-30 15:43 UTC (permalink / raw)
  To: egcs; +Cc: mark, egcs

> [Could someone tell me if reinterpret_cast does the right thing with
> aliases please?]

No, it won't. The C++ standard says

>> A pointer to an object can be explicitly converted to a pointer to
>> an object of different type. Except that converting an rvalue of
>> type "pointer to T1" to the type "pointer to T2" (where T1 and T2
>> are object types and where the alignment requirements of T2 are no
>> stricter than those of T1) and back to its original type yields the
>> original pointer value, the result of such a pointer conversion is
>> unspecified.

In g++, conversion to a different pointer type in reinterpret_cast
will always yield a pointer that has the same internal representation.

However, dereferencing such a pointer has undefined result. You must
not access an object through a pointer to a different type, period (1).
This is a very easy rule (despite Linus' saying that it is very
complicated), and it is the foundation for allowing type-based alias
analysis optimizations.

Of course, the compiler could provide the local-overriding mechanism
that Linus proposed. It currently does not do so, neither for plain
casts, nor for reinterpret_casts.

Regards,
Martin

(1) except if that different type is char.
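
The char exception in footnote (1) is what makes byte-wise inspection of
any object valid. A small sketch in C, assuming nothing beyond the
standard library (the function name is invented for illustration):

```c
#include <stddef.h>

/* Count the non-zero bytes in the representation of any object.
   Accessing the object through unsigned char is permitted whatever
   its declared type is. */
static size_t count_nonzero_bytes(const void *obj, size_t size)
{
    const unsigned char *bytes = obj;
    size_t n = 0;

    for (size_t i = 0; i < size; i++)
        if (bytes[i] != 0)
            n++;
    return n;
}
```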

^ permalink raw reply	[flat|nested] 218+ messages in thread

* Re: Linux and aliasing?
  1999-06-03 13:31       ` Andi Kleen
@ 1999-06-30 15:43         ` Andi Kleen
  0 siblings, 0 replies; 218+ messages in thread
From: Andi Kleen @ 1999-06-30 15:43 UTC (permalink / raw)
  To: mark; +Cc: egcs, davem

mark@codesourcery.com writes:

> Despite what you say, you could just use some unions.  IMO, it
> wouldn't take that much to fix up the TCP_IPV4_MATCH macro.  I'm sorry
> the socket structure would become uglier, but, on the other hand, it
> would make more obvious what exactly it is.  Right now, some of the
> fields in the structure definition are really acting as unions, and
> you're not making that clear to the reader of the code.

This macro is just a particular example. The code is full of such stuff.

I actually started with converting some of the alias occurrences in
the 2.2 TCP code to unions, but I abandoned the project, because it
already touched far too much code and because there is no good way to 
limit the changes to specific modules. Also the unions are really ugly.

Of course it would be nice to move to fewer such casts to allow more
optimizations in the future, but it is not realistic in a short term
fix, even for 2.3.

If GNU C had anonymous unions like VC++ or plan9 or G++ it would be a lot
easier though, because then tagless structure members could be converted
without requiring global search-replaces. Unfortunately it has not, 
and with the current "no more extensions" egcs policy it looks unlikely
(and also it would break Linux's link to gcc 2.7.2, which I fear would cause
a storm in the user and coder base)
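
For illustration, this is roughly what the anonymous-union conversion
would look like. C11 later standardised anonymous structures and unions,
and GCC documents reading a union member other than the one last written
as supported type punning. The structure and field names below are
hypothetical and are not the real socket layout.

```c
#include <stdint.h>

/* Hypothetical socket-like structure.  Existing code keeps writing
   sk->daddr and sk->rcv_saddr unchanged, while a fast-path compare
   can use the combined 64-bit view of the same storage. */
struct sock_like {
    union {
        struct {
            uint32_t daddr;
            uint32_t rcv_saddr;
        };
        uint64_t addrpair;
    };
    uint16_t dport;
};

/* Compare both addresses with a single 64-bit comparison. */
static int same_addresses(const struct sock_like *a,
                          const struct sock_like *b)
{
    return a->addrpair == b->addrpair;
}
```

Because the members are tagless, no global search-and-replace of field
accesses is needed, which is the point made above.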

-Andi

P.S.: David, at least it would be a good argument to merge tcp_tw_bucket
into sock, even if the cast extension would eventually get in, just to
squeeze some more optimizations out of those important paths. I imagine
that could be important for good performance on IA64, which will probably 
be hurt much more by missed optimizations than IA32 or sparc.

-- 
This is like TV. I don't like TV.

^ permalink raw reply	[flat|nested] 218+ messages in thread

* Re: Linux and aliasing?
  1999-06-03 20:06         ` craig
                             ` (2 preceding siblings ...)
       [not found]           ` <v04205101b37d700fbf8d@[192.168.1.254]>
@ 1999-06-30 15:43           ` craig
  3 siblings, 0 replies; 218+ messages in thread
From: craig @ 1999-06-30 15:43 UTC (permalink / raw)
  To: davem; +Cc: craig

>Common sense should override whatever standards say, where feasible,
>and I argue that here it is indeed feasible.

Maybe it is -- I haven't looked into the issues in detail -- but,
generally, it is very hard to implement common sense *in the compiler
itself*.

For all I know, this problem is the result of C, or gcc, being
too permissive about allowing casts across pointers to different
types...in the sense that, if that sort of thing was simply
disallowed, then programmers wouldn't even *think* they "knew what
they were doing", because they'd be getting compile-time diagnostics,
which, as you point out, is what they *should* be getting if the
compiler isn't basically successfully reading the programmer's mind
and implementing his desires.

In particular, while it might make sense for *your* application
to have the compiler "automatically" disable (even localized)
aliasing when it sees certain "suspicious" constructs, how do we
know there won't be people who say "hey, *we* use those constructs,
but we use them *correctly*, and we don't want to lose the
performance those alias assumptions give us", either now or in
the future?  Why should *they* have to pay for their more-
conforming (to the compiler's growing expectations, anyway) usage
by modifying their code, or even their shell scripts?

I'm thinking, more and more, that there really needs to be a
`GNU C--' or similar language for embedded systems, operating
systems like Linux, and so on, because the C standard seems
to be evolving towards making C *more*, not less, of a HLL,
and I doubt gcc (and its maintainers) will be up to the task
of making it fit both needs while evolving to handle new
architectures (e.g. IA64) in an optimal way.

(Or, anyone up for writing a BLISS front end to gcc, along with a
C-to-BLISS converter to be run over, for example, the Linux sources?  ;-)

        tq vm, (burley)

^ permalink raw reply	[flat|nested] 218+ messages in thread

* Re: Linux and aliasing?
  1999-06-06 10:30                       ` Linus Torvalds
  1999-06-06 10:44                         ` mark
@ 1999-06-30 15:43                         ` Linus Torvalds
  1 sibling, 0 replies; 218+ messages in thread
From: Linus Torvalds @ 1999-06-30 15:43 UTC (permalink / raw)
  To: mark; +Cc: rth, tim, craig, davem, chip, egcs

On Sat, 5 Jun 1999 mark@codesourcery.com wrote:
> 
> Not really always true.  You can use `memcpy (target, src, sizeof
> (x))' and if the alignments of the src and target are known to the
> compiler you *should* get optimal code.  (I don't know if GCC does
> this at present, but it could, and that would clearly be a good
> improvement.)

Only if that's assuming that it _is_ a memcpy.

Think of things like

	a = ntohl(*(u32 *)p);

etc - which is _not_ just a copy.
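
One way to get the same value without the pointer cast is to assemble it
from character accesses, which may alias anything. This is only a sketch
(load_be32 is an invented name), and whether a given compiler turns it
into a single load plus byte swap is a separate question.

```c
#include <stdint.h>

/* Read a 32-bit big-endian (network order) value from a byte buffer.
   For network-order data this yields the same result as
   ntohl(*(u32 *)p), on hosts of either byte order. */
static uint32_t load_be32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) |
           ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |
            (uint32_t)p[3];
}
```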

Current gcc versions do pretty well on the pure memcpy() case, I agree. A
lot of the Linux memcpy() logic is because gcc historically did _not_ do
any of the optimizations people felt really had to be done.

		Linus

^ permalink raw reply	[flat|nested] 218+ messages in thread

* Re: Linux and aliasing?
  1999-06-04  1:50                   ` Linus Torvalds
  1999-06-04  5:46                     ` craig
@ 1999-06-30 15:43                     ` Linus Torvalds
  1 sibling, 0 replies; 218+ messages in thread
From: Linus Torvalds @ 1999-06-30 15:43 UTC (permalink / raw)
  To: Joe Buck; +Cc: mark, craig, davem, chip, egcs

On Fri, 4 Jun 1999, Joe Buck wrote:
> 
> Oh, don't worry, we expect you to complain, and in a rude and insulting
> manner at that.  We're used to it.  It seems that you were a nicer guy
> before you had so many worshippers.

Ehh, I wasn't exactly known for being polite even before. Why do you think
people still quote the flame wars I had about microkernels?

But point taken.

> Mark put in strict-aliasing because it is a big performance win.  Of
> course he cares about performance.  The ISO rules were written in the
> way they were written precisely to enable this performance improvement.

The ISO rules were not written to "enable" the performance improvement. 
They were written explicitly to _DISable_ it in a number of cases where
the optimization was known to break old code. And not everybody was all
that excited about the rules even when they were written. Understandably,
because they really aren't made to make sense; they are only made to give
at least _some_ way around aliasing issues. 

> A compiler will do better the more aliasing possibilities it can
> eliminate.  Mark used the ISO rules to determine what the set of
> eliminatable aliases is.  You want to change this set to a smaller set, so
> your programs will continue to work.

NO!

I want the _user_ to be able to give input. 

I have one _suggested_ option, that to me has the huge advantage of not
really polluting the language, while being simple and obvious.

I would be happy with a #pragma, or with an attribute. People have been
talking about much more specialized attributes ("naked" etc) that are not
really useful to _any_ normal programs. The alias control feature would be
useful to real users - not just the kernel. At least judging by the
snippets of code I've seen. Code that breaks with the ANSI rules. 

I happen to think that the "explicit cast invalidates the alias
information" rule is the simplest and best one, and gives the user the
best control without adding things like new attributes or other ways to
let the user be in control.

But the details of _how_ that control is achieved are much less important
than the fact that the programmer _should_ be in control.
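
Later GCC releases document a "may_alias" type attribute that gives
roughly this kind of per-type control. A sketch assuming a GNU-compatible
compiler; the typedef and function names are illustrative only.

```c
#include <stdint.h>

/* A 16-bit type whose accesses are treated like char accesses for
   alias analysis: they are assumed able to alias any other object. */
typedef uint16_t __attribute__((__may_alias__)) u16_may_alias;

/* Read the first 16 bits of a 32-bit word through the may_alias type. */
static uint16_t first_u16(const uint32_t *word)
{
    const u16_may_alias *half = (const u16_may_alias *)word;

    return half[0];
}
```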

>					  I understand that, I even
> sympathize.  But you seem blind to the fact that this will inevitably make
> some (possibly many) ISO-valid programs slower.

Did you read my post? I'm arguing against making it something we have no
control over.

I was even arguing for allowing _stricter_ aliases than ANSI allows - the
"char *" thing in ANSI is actually really hard to code around (as far as I
know, the only way to do a one-byte access that still allows alias logic
to work in ANSI C is to do something really ridiculous like

	typedef struct {
		char c;
	} *one_byte_t;

in order to avoid the rule that any char access automatically means that
the compiler can't use the regular alias type rules).

> > I can see technical arguments. An argument of "it's really too painful to
> > do" I can understand (preferably with an explanation, but hey, I don't
> > mind getting told that it's too hard to explain). I use that argument
> > every day myself. 
> 
> See above, or find someone to submit working code (you would ask this
> in an equivalent situation on the kernel list).

Yes. I would ask the same.

But I do NOT use arguments like "that is undefined by POSIX" unless I have
a damn good reason to. I consider POSIX to be a guide to me, but I do not
consider it to be automatically correct (POSIX has done some major
blunders in its time: outright idiocies that simply could not be
implemented correctly on 64-bit architectures for very simple technical
reasons, for example).

And I would not say "POSIX does not allow you to do that, so why should
you do it"? 

> The standard is not arbitrary: it is the way it is for technical reasons,
> specifically to make C a suitable language for numerical computation.
> Without such rules serious number-crunchers have to switch to Fortran.

Look at the actual rules. Tell me that the "char *" rule makes sense.

The standard _is_ arbitrary. They tried to select a number of special
rules to make it UNLIKELY that old programs break. But the rules _were_
arbitrary. 

Note that "arbitrary" does not imply "random". There are reasons for the
rules. "char *" has historical issues associated with it. But there are
reasons for the extension I suggested too - and they aren't really any
different from the standard reasons.

"Arbitrary" means that you don't have any strong reason to choose one over
the other. So maybe you should allow the user some choice in the matter?

> > just want to get good code without having to do magic contortions, guys.
> 
> We could flip the default for the flag, so that people have to write
> -fstrict-aliasing to get the optimization.  Had we done that, you
> never would have noticed.

I would certainly have complained less, yes. Backwards compatibility is a
strong argument, and the way it is set up now just rubs everybody's nose in
the fact that the compiler behaviour changed. Behaviour you could rely on
according to other (and equally valid) standards of the language - the
alias thing was not even a proposal when I started doing Linux. 

But I would have noticed - I don't think you realize quite how important
generated code quality is to me, and that I actually _am_ aware of the
standard even when I disagree with some of the details in it. I _like_
alias analysis. I just want to have better control over it, because I
happen to think that I can take _advantage_ of it. 

I dislike fascist compilers who think they know better than I do.

And I dislike people who think fascist compilers are a good idea.

			Linus

^ permalink raw reply	[flat|nested] 218+ messages in thread

* Re: Linux and aliasing?
  1999-06-05 16:53                                         ` mark
  1999-06-07  2:36                                           ` Jamie Lokier
@ 1999-06-30 15:43                                           ` mark
  1 sibling, 0 replies; 218+ messages in thread
From: mark @ 1999-06-30 15:43 UTC (permalink / raw)
  To: egcs; +Cc: ak, toon, law, jbuck, torvalds, craig, davem, chip, egcs

>>>>> "Jamie" == Jamie Lokier <egcs@tantalophile.demon.co.uk> writes:

    Jamie> I don't like this because it's not what we do in C++.  In
    Jamie> C++ when we want to do these naughty things, we used to do:

    Jamie>   *(foo*)(void*)(&x)

I intended this to be covered by my proposal.  This would officially
be a "funny cast", and considered able to alias anything, provided
that x is a variable or an expression of the form a->b or a.b.

    Jamie> I bet there's still a fair bit of that around.  In the
    Jamie> modern world we've got *reinterpret_cast<foo*>(&x), which
    Jamie> is presumably treated specially w.r.t. aliases.

No, it is not.  The use of reinterpret_cast does not exempt a
standard-conforming program from the rules about using an lvalue of
the wrong type to access storage.

--
Mark Mitchell                   mark@codesourcery.com
CodeSourcery, LLC               http://www.codesourcery.com

^ permalink raw reply	[flat|nested] 218+ messages in thread

* Re: Linux and aliasing?
  1999-06-04  8:49                 ` craig
  1999-06-04  8:57                   ` Linus Torvalds
@ 1999-06-30 15:43                   ` craig
  1 sibling, 0 replies; 218+ messages in thread
From: craig @ 1999-06-30 15:43 UTC (permalink / raw)
  To: torvalds; +Cc: craig

>On 4 Jun 1999 craig@jcb-sc.com wrote:
>> 
>> Any programmer worth his salt, wanting "a laser-guided nightsight",
>> and not wanting to tweak (or even rewrite) his code for every new
>> compiler release, will *not* use a C compiler.  Period.
>
>Oh?
>
>That's a new argument. Instead of "don't use that feature", it's now
>"don't do anything clever at all".

No, you misunderstand.

C is simply a poor language for the task at hand.  It provides too
little low-level control of how a C compiler should do its work,
yet it's not high-level enough to make it practical for the compiler
to do enough of the optimization work for the programmer to satisfy
developers of embedded/OS code.

Your recently-expressed concerns about `volatile' were a perfect
example of that.  You correctly (or pretty nearly so) noted the
distinction you wanted to make between a volatile *reference* to
an object and a reference to an object via a volatile *address*
(pointer).  That's a distinction C apparently doesn't provide,
among many, in a language supposedly "suitable" for "low-level"
coding (and, I admit, it's better than PL/I in that regard).

Not that I have anything great to suggest in place of C, you understand,
and I fully realize that you're not about to rewrite Linux into some
other language anyway.

But what you're essentially asking for is for us to make gcc compile
some language that is less and less like C, one that is more and more
like your particular *vision* of what C should be, which happens to
be quite at odds with the direction C9X and others are taking, if my
impressions of those efforts (based on posts to this list) are
correct.  (Clearly it'd be easier if those working on the upcoming C
standard simply implemented your desires...at least, easier on the
gcc developers.)

Further, you're asking for us to do language design "on the fly", while
implementing a compiler for that language.  In my experience, that
attempt to marry language design and compiler implementation, while
plenty of fun and full of opportunity for cleverness to show itself,
almost always leads to poor language design.

You've already been bitten by hard-to-find bugs stemming from *extensions*
to GNU C that you used, sometimes without regard to the fact that they
were not particularly well documented.  These experiences have led you
to conclude, or at least complain, that gcc was going in a direction
you did not like.  (Worst of all, you often express this by insulting
people like myself, who, especially in my case, aren't the ones *causing*
the trouble, but are simply trying to *explain* it to you!)

Now you are complaining about a *standard* language feature being
implemented in a standard-conforming way by gcc, one which you can
work around by changing your code to be standard-conforming (surely
a SMOP, say a few lines of Perl ;-) *or* by using a command-line
option...but you don't like the pain of the former, or the performance
of the latter.  Welcome to C hell.

I don't know the details of the issues involved, but I trust those that
have spoken against your proposal, that they *do* know them.

What I have been trying to do is get you to see that, at some point,
you have to conclude that you're never going to succeed at making
the edge of *this* particular hammer, gcc, sharp enough to make
nice clean cuts in silk, for any length of time, without a whole lot
of pain, because every time someone uses it as a hammer, those nice
cutting edges get worn off.

        tq vm, (burley)

P.S. Due to my own outbursts on this thread, and the resulting email
I got, I promise to make this the *last* time I will *ever* respond
to queries, complaints, etc. about similar issues coming from the
Linux camp.  Clearly I do not have what it takes to respond to what
I see as extreme (and repeated) childishness without letting myself
be dragged (at least somewhat) down into the muck, a problem I've
long known I've had, but have yet to fully address.

So, those of you who *encouraged* me in private email, thanks, but,
from now on, you're on your own in defending the honor of gcc
developers against the unfounded, and unfair, accusations of people like
Linus Torvalds.  It's not just that I don't have the maturity to
cope with it -- I don't have the patience, and I surely don't have
the time, to keep going over the same ground again and again.

^ permalink raw reply	[flat|nested] 218+ messages in thread

* Re: Linux and aliasing?
  1999-06-04 11:49                           ` Linus Torvalds
  1999-06-04 13:03                             ` Gabriel Dos_Reis
@ 1999-06-30 15:43                             ` Linus Torvalds
  1 sibling, 0 replies; 218+ messages in thread
From: Linus Torvalds @ 1999-06-30 15:43 UTC (permalink / raw)
  To: Joe Buck; +Cc: craig, mark, davem, chip, egcs

Ok. Only real technical details. Please shoot it down on technical issues.

On Fri, 4 Jun 1999, Joe Buck wrote:
> 
> > _I_ think my simple extension was perfectly legitimate, and a _lot_ more
> > obvious than a lot of things people are discussing on the lists.
> 
> Your "simple extension" will have the effect of -fno-strict-aliasing
> for any function that does any pointer cast (there may be marginal
> differences if there are loops before the first cast).  So why not just use
> -fno-strict-aliasing and get the same code?

Well, the biggest advantage I see for having the alias type checking is
that it allows re-ordering of operations _everywhere_, even if there is no
other a priori reason to allow it. A function that does a pointer cast
would _not_ be affected globally even under my scheme. It would just mean
that that particular access that is affected would be marked as being in
alias set zero. 

Hmm.. I really expected this to be simple, but judging by the amount of
traffic it has generated it is obvious that I went about this the wrong
way. Sorry. I really didn't mean to start a flame war, and let me back up
a bit. 

To me, the fact that the scheme should be easy to implement is actually
really important. I'm not a gcc hacker, and it's been several years since
I actually submitted patches to gcc (and even then they weren't always
accepted, although I can claim credit for some of the alpha fixes). But I
really tried to come up with something that wasn't just easy for the users
but that I thought would be easy to do inside gcc. 

So with that background, let me add that if I _am_ wrong (entirely
possible), and it's a rat's nest to implement in gcc, then it very
obviously _should_ be discarded out-of-hand. It really wasn't a case
of me just trying to make life harder for people. Let me explain what my
concept was on a more technical level, and then people can shoot holes in
it on a technical level and maybe we can avoid too much more flamage.

Sorry. I kind of took the technical part for granted. 

So on a technical level, let me explain it the way I thought gcc might
implement this rather than explaining the end result as I initially
did.  Maybe people would understand (and accept) what my idea was better
this way.

What I would actually do if I knew gcc better is do something like this: 

 - add a new type attribute bit. There are plenty of these already, so
   this isn't a big issue. I didn't worry about naming, because although
   it _could_ probably be used in typedefs directly, it probably never
   would be. But who knows? Somebody might have a good reason to make some
   type always have the "this can alias anything" behaviour, and it could
   in fact be used for the C "char *" type, so that the ANSI special "char
   *" rules would be just a subset of this.

   Let's call the attribute "noalias" just for obvious reasons.

 - the attribute bit magically gets set by any explicit cast when the
   feature is enabled (this obviously _would_ be controlled by a gcc
   option like "-fcast-invalidates-alias", so people who don't like the
   extended semantics wouldn't be affected). This can be done at parse
   time; it looks pretty trivial to me. I may be wrong.

   Think of this part as another simple rule: a typecast always implies
   the "noalias" attribute if the global flag is set. Nothing else would
   imply that attribute.

 - the attribute percolates down normal pointer arithmetic, but NOTHING
   else. It doesn't inherit across an assignment (although with a named
   attribute the assigned variable might have that attribute natively).
   You already have the notion of attribute inheritance, this is nothing
   new. In fact, I think the inheritance rules are basically the same as
   for the "volatile" attribute, but I haven't really verified that.

 - an alias set query will always return zero for an expression with that
   attribute set.

And that's it. The above doesn't really explain what I'm trying to
_achieve_, it only explains the way I thought those goals would be
achieved. 

So for example, just to make the suggestion more "tangible" to the people
who actually think in terms of gcc code, look at

	 int
	get_alias_set (t)
	     tree t;

in tree.c, and mentally imagine adding a simple condition that just says
something like

   if (!flag_strict_aliasing || !lang_get_alias_set)
     /* If we're not doing any language-specific alias analysis, just
        assume everything aliases everything else.  */
     return 0;
+  else if (lookup_attribute("noalias", t->attribute))
+    return 0;
   else
     return (*lang_get_alias_set) (t);

and that's the only real place where it is tested. 
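
The effect of that one extra test can be shown with a toy model. This is
not gcc code: the struct, the function names, and the simplified rule
that two accesses can conflict only when their alias sets are equal or
either one is zero are all assumptions made for the sketch.

```c
#include <stdbool.h>

/* Toy stand-in for a type node: an alias set number plus the
   proposed "noalias" attribute bit. */
struct toy_type {
    int  alias_set;
    bool noalias;
};

/* Mirrors the shape of the modified get_alias_set() above. */
static int toy_get_alias_set(const struct toy_type *t, bool strict_aliasing)
{
    if (!strict_aliasing)
        return 0;            /* everything aliases everything else */
    if (t->noalias)
        return 0;            /* the explicit cast opted this access out */
    return t->alias_set;
}

/* Simplified conflict test: set zero conflicts with every set. */
static bool toy_may_conflict(const struct toy_type *a,
                             const struct toy_type *b,
                             bool strict_aliasing)
{
    int sa = toy_get_alias_set(a, strict_aliasing);
    int sb = toy_get_alias_set(b, strict_aliasing);

    return sa == 0 || sb == 0 || sa == sb;
}
```

In this model an int access and a plain short access never conflict
under strict aliasing, while the same short access carrying the bit
conflicts with everything, and only that one access is affected.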

The attribute is set when parsing the type casting, and in my "clean up
'char *' semantics"  extension it would also always be set for any "char
*" type. In that case the special casing of "char *" can go away, so you'd
actually _remove_ the code in c-common.c that says

      else if (signed_variant == signed_char_type_node)
        /* The C standard guarantees that any object may be accessed
           via an lvalue that has character type.  We don't have to
           check for unsigned_char_type_node or char_type_node because
           we are specifically looking at the signed variant.  */
        TYPE_ALIAS_SET (type) = 0;

but that's a detail that I just show to point out the ramifications of the
_idea_ rather than advocating as something that should necessarily be
done. But it would conceptually put the decision in one place, which is
nice. 

(To me, when I judge people's ideas about kernel changes, a personally
important criterion is always "does it conceptually solve _multiple_
problems?", and an idea that can be used to solve another thing is
something that I consider more interesting and consider to be more
"flexible". I don't know if the egcs people use that same strategy, but I
wanted to point it out in case others have similar decision making methods
to the ones I use). 

> OK, maybe we can get somewhere.  It seems to me that there are only two
> options for gcc-2.95 on this issue:
> 
> Either
> 1. Leave it as it is (the Linux kernel will need -fno-strict-aliasing).
> 
> or
> 
> 2. Don't enable the new optimization for C unless the user says
>    -fstrict-aliasing.
> 
>    Since C++ is fussier about type safety, we could make it the default for
>    C++ (nice to get those C++ critics who falsely claim C++ is slower than
>    C ;-).
> 
> Which would you recommend?

If I were making the same decision, I would just make the new feature the
default, to make sure it got tested. I don't really disagree about that - it
_does_ rub people's faces into the issue, and it _will_ make others complain too
and you'll end up explaining to a lot of people what the aliasing issues
really are, but it still makes sense to enable it by default to get better
coverage on it. 

Have an open mind, and be ready to decide that if there turns out to be a
lot of people that get bitten by it you should just turn it off by default
(and for example the eventual decision might be to only turn it on at
optimization level 3 instead of 2, as -O2 tends to be a fairly common
optimization level because it does so much else on gcc too..). 

I really only want to make sure that _eventually_ I can take advantage of
type-aliasing, while at the same time having a convenient back door for
when the kernel does the ugly things.. 

Do people see any obvious problems in the technical idea above? It looks
very maintainable and clean to me, but I'll readily admit that I only look
at the problem from ten thousand feet when it comes to the compiler side.
Maybe it is just completely unusable for some reason I missed.. 

		Linus


* Re: Linux and aliasing?
  1999-06-05 11:00                     ` mark
  1999-06-06 10:30                       ` Linus Torvalds
@ 1999-06-30 15:43                       ` mark
  1 sibling, 0 replies; 218+ messages in thread
From: mark @ 1999-06-30 15:43 UTC (permalink / raw)
  To: torvalds; +Cc: rth, tim, craig, davem, chip, egcs

>>>>> "Linus" == Linus Torvalds <torvalds@transmeta.com> writes:

    Linus>  - "char *" - which is just unbearably slow, and obviously
    Linus> not really an option for many things. You're better off
    Linus> just disabling the alias logic altogether.

Not really always true.  You can use `memcpy (target, src, sizeof
(x))' and if the alignments of the src and target are known to the
compiler you *should* get optimal code.  (I don't know if GCC does
this at present, but it could, and that would clearly be a good
improvement.)
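
A minimal sketch of the memcpy approach described above, assuming the
fixed-width types from <stdint.h>.  The helper names load_u32 and
store_u32 are illustrative only and do not appear anywhere in this
thread; whether a given GCC version turns the memcpy into a single
load or store is not something the sketch can promise.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not from the thread): read a 32-bit value from
   storage of any declared type without breaking the aliasing rules. */
static inline uint32_t load_u32(const void *src)
{
    uint32_t value;

    assert(src != NULL);
    /* memcpy copies the object representation byte by byte, so the
       declared type of the source storage does not matter, and the
       source does not need to be aligned for uint32_t. */
    memcpy(&value, src, sizeof value);
    return value;
}

/* Hypothetical counterpart: store a 32-bit value into arbitrary storage. */
static inline void store_u32(void *dst, uint32_t value)
{
    assert(dst != NULL);
    memcpy(dst, &value, sizeof value);
}
```

Because the copy goes through memcpy rather than through an lvalue of
the "wrong" type, the code stays conforming whether or not
-fstrict-aliasing is in effect.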

--
Mark Mitchell                   mark@codesourcery.com
CodeSourcery, LLC               http://www.codesourcery.com


* Re: Linux and aliasing?
  1999-06-07  8:41                                                     ` David S. Miller
  1999-06-07  9:24                                                       ` Jeffrey A Law
  1999-06-07  9:32                                                       ` Joe Buck
@ 1999-06-30 15:43                                                       ` David S. Miller
  2 siblings, 0 replies; 218+ messages in thread
From: David S. Miller @ 1999-06-30 15:43 UTC (permalink / raw)
  To: mark; +Cc: jason, martin, egcs

   From: mark@codesourcery.com
   Date: Mon, 07 Jun 1999 08:05:31 -0700

   We've had -fstrict-aliasing on in the tree for a long time, and had
   very few bugs that we tracked down to the kind of thing you are
   talking about.  But, admittedly, we haven't yet had it on in a
   general release, so it's a relatively small sample size.

True.

One issue which seems to not be mentioned explicitly, is that such a
change is typically not of the "flag day" variety, which turning it on
for the next release seems to imply.

Other compiler vendors seem to have done it in two stages:

1) Ok, the strict aliasing is there, but not a default optimization,
   you have to enable it explicitly.  But come next release it will be
   on by default and thus you have ample time to fixup your code.

2) It's on by default in this new subsequent release, we warned you.

The time between two compiler releases is more than sufficient time
for both ends of the equation (the compiler and its users) to work
out the issue.

Compiler vendors who have done this typically supply the default
compiler for a single system.  For EGCS we know of at least 4 whole
systems (Linux and the 3 publicly available BSD variants) which use
gcc as the default compiler.

Later,
David S. Miller
davem@redhat.com


* Re: Linux and aliasing?
  1999-06-08  1:34                                           ` Nick Ing-Simmons
  1999-06-08  1:48                                             ` Jeffrey A Law
@ 1999-06-30 15:43                                             ` Nick Ing-Simmons
  1 sibling, 0 replies; 218+ messages in thread
From: Nick Ing-Simmons @ 1999-06-30 15:43 UTC (permalink / raw)
  To: law; +Cc: chip, davem, rth, craig, egcs, Linus Torvalds, Tim Hollebeek, mark

Jeffrey A Law <law@cygnus.com> writes:
>  In message < Pine.LNX.3.95.990607103826.22680A-100000@penguin.transmeta.com >you write:
>  > > Then we'll have to explain two things to them instead of just one: the
>  > > ANSI rules, and the extra Torvalds non-ANSI rules. 
>  > 
>  > Just explain it as "dangerous code", and give examples. There are
>  > certainly bound to be other cases, although the "torvalds case" is the
>  > obvious and most common one. 
>Building a new set of aliasing rules which are only going to be used by the
>Linux kernel to avoid making their code standards compliant is simply dumb.

It is not just the Linux kernel, it is _any_ kernel and 
aliasing tricks are _everywhere_ - most embedded C has them too, 
I would be surprised if X11 did not have them, ...

-- 
Nick Ing-Simmons <nik@tiuk.ti.com>
Via, but not speaking for: Texas Instruments Ltd.


* Re: Linux and aliasing?
  1999-06-05 13:34                 ` Richard Henderson
  1999-06-05 18:40                   ` Linus Torvalds
  1999-06-05 21:38                   ` Jakub Jelinek
@ 1999-06-30 15:43                   ` Richard Henderson
  2 siblings, 0 replies; 218+ messages in thread
From: Richard Henderson @ 1999-06-30 15:43 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: craig, davem, mark, chip, egcs

On Sat, Jun 05, 1999 at 09:34:26AM -0700, Linus Torvalds wrote:
> As an example, the above sequence obviously has an alias problem as it
> stands now. My suggestion would _not_ make the above code generate
> anything different at all. The only thing my suggestion really does is
> give the programmer a chance to say "oh, I see: the above worked in the
> original ANSI C, but it does not work with the new one, and I only care
> about gcc anyway, so I can do the quick fix by just adding the cast":
> 
> 	s = *(short *)ps;

So what you're saying is, you don't mind fixing up alias
problems on a local scale?  You're not expecting to get 
away with no source code changes?

If this is all you want, you can get this with a union and
judicious use of macros --

  #define noalias(type, ptr) (((union { type __x__; } *)(ptr))->__x__)

  s = noalias(short, ps);

Which doesn't strike me as too horrible syntax for public
consumption.  Note that this works because it is the access
to the union's member that nulls the alias set, not the
cast to the union type.


r~


* Re: Linux and aliasing?
  1999-06-04 13:03                             ` Gabriel Dos_Reis
  1999-06-04 13:13                               ` Joe Buck
@ 1999-06-30 15:43                               ` Gabriel Dos_Reis
  1 sibling, 0 replies; 218+ messages in thread
From: Gabriel Dos_Reis @ 1999-06-30 15:43 UTC (permalink / raw)
  To: egcs

Linus Torvalds <torvalds@transmeta.com> writes:

[...]

| So on a technical level, let me explain it the way I thought gcc might
| implement this rather than explaining the end result as I initially went
| about.

Do you have a complete patch?

-- Gaby


* Re: Linux and aliasing?
  1999-06-05  9:50                   ` Linus Torvalds
  1999-06-05 11:00                     ` mark
@ 1999-06-30 15:43                     ` Linus Torvalds
  1 sibling, 0 replies; 218+ messages in thread
From: Linus Torvalds @ 1999-06-30 15:43 UTC (permalink / raw)
  To: Richard Henderson; +Cc: Tim Hollebeek, craig, davem, mark, chip, egcs

On Fri, 4 Jun 1999, Richard Henderson wrote:
> 
> This doesn't handle
> 
> 	extern inline int foo(short *ptr)
> 	{
> 		return *ptr;
> 	}
> 
> 	int bar(void)
> 	{
> 		int i;
> 		i = 0;
> 		return foo((short *)&i);
> 	}
> 
> Which isn't unlike some uses of inlines in the kernel.

Again - I _really_ didn't mean for this to be some Linux kernel specific
hack.

Think of it as a larger problem than just the kernel. Think of it as the
problem of "we have a huge code-base, and it wasn't written with
type-based alias in mind - and the new ANSI rules are fairly cumbersome,
but we'd like to have access to the new optimization: not only is it the 
default, but it does generate better code too!". 

It's not that changes wouldn't be needed, it's the fact that the ANSI
rules really only give you two ways to overcome the alias issue:

 - "char *" - which is just unbearably slow, and obviously not really an
   option for many things. You're better off just disabling the alias
   logic altogether.

 - using a union - which does work, but is just incredibly horrible syntax
   if you don't have just one well-defined case and/or designed with it in
   mind.

For example, the union approach is obviously acceptable if you have the
specific case of once in a blue moon (a few times in a large project) a
need to convert floating point to the integer bit pattern representation
and back. And that's obviously what ANSI was concerned with. 
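
The float/integer case mentioned above can be sketched with a union as
follows.  The union and function names are hypothetical, and the sketch
assumes that float is 32 bits wide (checked at run time).  GCC documents
this kind of type punning as supported when the access goes through the
union type; how far ISO C itself guarantees it is disputed elsewhere in
this thread.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical union for illustration; the names are assumptions. */
union float_bits {
    float    f;
    uint32_t u;
};

/* Return the bit pattern of a float as a 32-bit integer. */
static inline uint32_t float_to_bits(float f)
{
    union float_bits fb;

    assert(sizeof fb.f == sizeof fb.u);
    fb.f = f;
    return fb.u;   /* read the member that was not written last */
}

/* Rebuild a float from its 32-bit pattern. */
static inline float bits_to_float(uint32_t u)
{
    union float_bits fb;

    assert(sizeof fb.f == sizeof fb.u);
    fb.u = u;
    return fb.f;
}
```

This works well for one well-defined conversion like this one; it is the
many ad hoc cases scattered through a large code base where the union
syntax becomes a burden.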

The "cast invalidates alias information" is a _syntactical_ thing to do
the same thing much more simply. I find it much more natural anyway, and
as I tried to show it /should/ be straightforward to implement.

Gcc has various of these ANSI extensions that are really purely
syntactical:

 - "a ? : b" is just syntactical sugar for "a ? a : b"
 - pointer casting lvalues is syntactical sugar for something that is
   actually reasonably hard to do, but occasionally useful.

In fact, think of it the same way as the casting of lvalues: sure, you CAN
do it with standard ANSI C, but it's cumbersome.

		Linus


* Re: Linux and aliasing?
  1999-06-07  8:02                                                   ` mark
  1999-06-07  8:41                                                     ` David S. Miller
@ 1999-06-30 15:43                                                     ` mark
  1 sibling, 0 replies; 218+ messages in thread
From: mark @ 1999-06-30 15:43 UTC (permalink / raw)
  To: jason; +Cc: martin, egcs

>>>>> "Jason" == Jason Merrill <jason@cygnus.com> writes:

    Jason> My problem with this is that only people who read the
    Jason> release notes for *this release* will see the big letters.
    Jason> Meanwhile, people who start with a later version or aren't
    Jason> involved in deploying the tools or whatever won't see the
    Jason> warning. 

This is a valid point.

    Jason> Saying "garbage in, garbage out" is a cop-out.  If we're
    Jason> going to let code like this break, we need to emit a
    Jason> warning so that people know that they have a problem,
    Jason> rather than leaving them to debug obscure problems.

Any warning will yield many false positives.  For example, there are
places in GCC where we store a `foo *' in a `tree' slot in a data
structure.  We (hopefully) never access it as a tree; we just cast it
back and forth.  That's legal.  It's also perhaps worth warning about.

But, warning like mad over every object-oriented C program seems
annoying.  Of course, a warning that's part of -W or some such is
probably OK.  But, then it won't accomplish what you want.  The
warning has to be on by default to overcome the problem you describe.
It's a reasonable point of view to argue that this is a better QOI
than the current state, but it's reasonable to argue the opposite as
well: false positive warnings are pretty annoying, and obscure the
real warnings.

We've had -fstrict-aliasing on in the tree for a long time, and had
very few bugs that we tracked down to the kind of thing you are
talking about.  But, admittedly, we haven't yet had it on in a general
release, so it's a relatively small sample size.

It's reasonable to argue that -fstrict-aliasing should be off by
default.  Then, you have to turn it on to get the benefits; when you
do so, it's fair to expect that you figured out what it did before you
did it, and it's easy to realize that if something went wrong after
you turned it on that it had to do with -fstrict-aliasing.
Unfortunately, this alternative means that most code will never be
compiled with this optimization.  I'm not sure what to think; QOI also
involves generating code that goes as fast as possible when presented
with conformant code, and not requiring people to root around the
manual to find funny flags to write into their makefiles to make the
code go fast.  It's a trade-off; reasonable people can certainly
disagree on whether or not -fstrict-aliasing should be on by default.

Note that this choice would have absolutely no bearing on the
*original* discussion regarding the Linux kernel; they want
-fstrict-aliasing *on* and still to have their code work as they
expect.

    Jason> BTW, the union trick isn't part of C, either.  It's a GCC
    Jason> implementation choice.

True enough.  This is implementation-defined behavior.  (Not
"undefined", but still "implementation-defined".)  I believe the
"memcpy trick" is the only 100% portable solution.

--
Mark Mitchell                   mark@codesourcery.com
CodeSourcery, LLC               http://www.codesourcery.com


* Re: Linux and aliasing?
  1999-06-07  2:36                                           ` Jamie Lokier
  1999-06-07  8:04                                             ` mark
@ 1999-06-30 15:43                                             ` Jamie Lokier
  1 sibling, 0 replies; 218+ messages in thread
From: Jamie Lokier @ 1999-06-30 15:43 UTC (permalink / raw)
  To: mark; +Cc: ak, toon, law, jbuck, torvalds, craig, davem, chip, egcs

mark@codesourcery.com wrote:
>     Jamie>   *(foo*)(void*)(&x)
> 
> I intended this to be covered by my proposal.  This would officially
> be a "funny cast", and considered able to alias anything, provided
> that x is a variable or an expression of the form a->b or a.b.

Why the restriction on x?
Things I've seen around, that are outside your proposal:

 - accessing an integer/float as a struct, to access individual parts.
 - vice versa to generate hash values / do vector operations.
 - accessing an integer array as different size integers for fast
   vector operations (e.g. image processing).

Image processing is a particular pain.  As are optimised implementations
of strlen, strcpy, memcpy etc.

We could just tell people their code will work if the pointed-to entity
happens to be a struct member.  We could tell them to use a union like
they're supposed to.

But simply fixing a vector processing kernel to use unions won't
guarantee correct code: all the callers must be changed to use the union
representation too, because the image processing ops may get inlined.
Hence the special char * exception so things like memcpy work I suppose.

>     Jamie> I bet there's still a fair bit of that around.  In the
>     Jamie> modern world we've got *reinterpret_cast<foo*>(&x), which
>     Jamie> is presumably treated specially w.r.t. aliases.
> 
> No, it does not.  The use of reinterpret_cast does not exempt a
> standard-conforming program from the rules about using an lvalue of
> the wrong type to access storage.

I meant more along the lines of "what GCC does" than "what the standard
says" on this.  I realise this gives undefined behaviour standard-wise.

Presumably reinterpret_cast is equivalent to one of your "funny casts"?

have nice day,
-- Jamie


* Re: Linux and aliasing?
  1999-06-03 11:26   ` David S. Miller
  1999-06-03 12:03     ` mark
@ 1999-06-30 15:43     ` David S. Miller
  1 sibling, 0 replies; 218+ messages in thread
From: David S. Miller @ 1999-06-30 15:43 UTC (permalink / raw)
  To: mark; +Cc: chip, egcs, torvalds

   From: mark@codesourcery.com
   Date: Thu, 03 Jun 1999 10:40:38 -0700

   And David Miller (IIRC) indicated that the kernel folks would
   probably eliminate the non-standard C code in the next major kernel
   revision.

Actually, I've changed my mind, having to get rid of all such types of
casts inside the networking etc. is just an abomination.  There is no
reason anyone should have to use unions to teach the compiler about
what they're actually touching, if someone casts the thing they (the
programmer) know what they are doing, the compiler shouldn't assume
anything.

I'm not saying this should be the normal mode of operation, but some
mechanism needs to exist so that such code can be made valid _without_
resorting to ugly unions.  Consider our TCP hashing comparison code
in the kernel has gems like this:

#define TCP_IPV4_MATCH(__sk, __cookie, __saddr, __daddr, __ports, __dif)\
	(((*((__u64 *)&((__sk)->daddr)))== (__cookie))	&&		\
	 ((*((__u32 *)&((__sk)->dport)))== (__ports))   &&		\
	 (!((__sk)->bound_dev_if) || ((__sk)->bound_dev_if == (__dif))))

Sorry, I'm not using a union for this, it's totally a performance hack
and I'm not going to uglify the socket structure with silly unions.
And actually in this case, there is no reason the compiler cannot see
what I am up to.  Do powerful optimizations override common sense?

You know exactly what I'm doing there, and so do I, why can't egcs
figure it out that easily as well?

The compiler is just a tool.

Later,
David S. Miller
davem@redhat.com


* Re: Linux and aliasing?
  1999-06-05 11:35                                       ` Andi Kleen
@ 1999-06-30 15:43                                         ` Andi Kleen
  0 siblings, 0 replies; 218+ messages in thread
From: Andi Kleen @ 1999-06-30 15:43 UTC (permalink / raw)
  To: mark; +Cc: ak, toon, law, jbuck, torvalds, craig, davem, chip, egcs

On Sat, Jun 05, 1999 at 07:41:07PM +0200, mark@codesourcery.com wrote:
> >>>>> "Andi" == Andi Kleen <ak@muc.de> writes:
> 
>     Andi> Mark, even when you don't like it, would you as
>     Andi> alias-expert-in-residence think that the basic strategy is
>     Andi> workable?
> 
> I don't know what workable means.  
> 
> But, I would argue against your patch.  There are cases where a
> pointer is cast to one type, and then cast back to another, and then
> used.  These cases are conforming, and I think that Linus' proposal
> will disable alias analysis in these cases.  That's bad.  Especially
> since often these casts are to `void*' for the express purpose in
> storing them in some kind of generic data structure.

I agree.

> 
> Note that I made an alternate, more circumspect, proposal, which has
> been ignored by both Linus and yourself up until now, although there's
> been so much traffic that one couldn't really expect anyone to keep up
> with all of it:

Sorry, I must have missed it.

> 
>   Put expressions of the form `*((foo*) (&x))' in alias set zero if
>   x does not have type foo, or one of the types that is allowed
>   to alias it.
> 
> This proposal only affects nonconforming code, and thus changing the
> behavior of the compiler will not pessimize any conforming code.  It
> is important that `x' be a variable, or a field of a variable, not an
> arbitrary expression.  (For example, I don't think this should apply
> to `*((foo*) (f()))' since that might be conforming.)  But, if
> `x' is a variable, or of the form `x->y' or `x.y', then we should be
> OK (it's not legal to talk about `x->y' if `x' is not of the right
> type).

The "only with variable rule" makes it a bit more complicated and arbitrary
than I hoped (e.g. I don't see
the difference between *((foo*)f()) = 1; and { foo *x=(foo*)f(); *x=1 }), 
but I could live with that if it is needed for the compromise needed for a
consensus. 

I think the kernel has some of the first cases, so it may be helpful to have
an optional (=not in -Wall) warning at least for the function case so that 
someone could go through the code base and fix it.


> So, this proposal is, IMO, a workable extension of the standard
> semantics.  I don't know if this covers all the cases in the kernel,
> but it should be easier to change Linux to fit this model than the
> strictly conforming one.
> 
> I'm also not sure if this is a good idea.  If we don't document this
> behavior, we're not promising it to Linux.  So, we might break it
> later.  If we *do* document it, then we have to promise to maintain
> this behavior.  That's extra work for us; we have to be convinced
> there's a good enough reason, and I'm not convinced yet.  The
> questions are:
> 
>   o How badly does Linux need the extra cycles that might be squeezed
>     out by this extra alias analysis?  How much faster will the 
>     average Linux system go?

There are some hot paths (e.g. TCP input packet processing) that would
benefit from it. The average Linux box is a work station that is mostly
idle (:@), but for high load servers and applications like Beowulf clusters
where latency counts it is helpful. Also  I think it will be more important
in the future (e.g. on Linux/IA64), where the CPU needs much more compiler
support for good performance.
> 
>   o How hard would it be to fix the kernel?

Very hard. I just tried to fix it in a small part of the TCP code, and it
already involved major changes. The main problem is that these generally
cannot be encapsulated in modules, it has to be changed globally, which
can be a big problem in a system with lots of external code and complicated
dependencies like Linux.


> So, in summary, I think:
> 
>   o It's not clear we want this behavior that badly.
>   o A correct implementation will be difficult.
>   o There will be maintenance headaches.
> 
> Furthermore, I bet that by now, if all this energy had been spent
> fixing the code in the kernel, you'd have made good headway on some of
> the most prominent data structures.  Yes, this will be a tedious
> chore, but it's an easy one: you enclose things in a union, compile,
> see what doesn't, fix it, and go on.  

Erm no, it isn't that easy. There are no warnings and these casts could hide
everywhere. Someone would basically have to carefully audit about 3.5M LOCs of 
kernel source. And you probably know how hard it is to coordinate such mega
patches with multiple (in case of Linux hundreds) maintainers. e.g. I already 
had to discard the nowhere near complete TCP alias fix work from my working
tree again, because David would most likely not have accepted it at this 
point because of the major changes involved, and keeping it would have required
substantial continuous effort to hand integrate most new patches because
of the rejects.  Doing it in a crash effort is logistically not possible
I think.  The only way to do it are continuous slow incremental changes,
and the proposed gcc extension would make it a lot easier I think.



-Andi
-- 
This is like TV. I don't like TV.


* Re: Linux and aliasing?
  1999-06-06 17:41                             ` mark
  1999-06-07  8:58                               ` Linus Torvalds
@ 1999-06-30 15:43                               ` mark
  1 sibling, 0 replies; 218+ messages in thread
From: mark @ 1999-06-30 15:43 UTC (permalink / raw)
  To: torvalds; +Cc: rth, tim, craig, davem, chip, egcs

>>>>> "Linus" == Linus Torvalds <torvalds@transmeta.com> writes:

    Linus> On Sun, 6 Jun 1999 mark@codesourcery.com wrote:
    >>  Right.  But the part that's causing aliasing issues is just a
    >> memcpy; that's the `*(u32 *) p' bit.  You could write:
    >> 
    >> memcpy (&a, p, sizeof (a)); a = ntohl (a);

    Linus> Which is crap.

I think that I freely admitted in the posting that this approach is
not as convenient as what you had.  I think you can also see how to
wrap this up in a macro (probably using the already documented, and
hence guaranteed, statement-expression extension).
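
One way such a macro might look is sketched below.  ALIASING_LOAD and
read_be32 are hypothetical names, not taken from the kernel or from GCC,
and the macro relies on the GNU statement-expression extension, so it
needs GCC or a compatible compiler.  ntohl comes from the POSIX header
<arpa/inet.h>.

```c
#include <arpa/inet.h>   /* ntohl (POSIX) */
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical macro: fetch an object of the given type from any
   address by copying its bytes, yielding the value as an expression.
   Uses the GNU statement-expression extension. */
#define ALIASING_LOAD(type, ptr)                                 \
    ({                                                           \
        type aliasing_tmp_;                                      \
        memcpy(&aliasing_tmp_, (ptr), sizeof aliasing_tmp_);     \
        aliasing_tmp_;                                           \
    })

/* Read a 32-bit big-endian (network order) field from a packet buffer. */
static inline uint32_t read_be32(const unsigned char *p)
{
    assert(p != NULL);
    return ntohl(ALIASING_LOAD(uint32_t, p));
}
```

The call site then reads almost like the cast it replaces, for example
`a = read_be32(p);`, while the access itself stays conforming.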

    Linus> And a compiler that requires you to write code like that
    Linus> is, by implication..

Was that really necessary? :-)

    Linus> If you can't see why

    Linus> 	a = ntohl(*(u32 *) p);

    Linus> is better than the horrible thing you're suggesting
    Linus> (regardless of whether the code generated is the same or
    Linus> not), then I might as well throw in the towel
    Linus> immediately.

I can see that the "horrible" thing is less convenient for you.  I've
also made sound (in my opinion, naturally) technical arguments against
your proposal, on grounds having not only to do with maintenance of
GCC, but also to do with the impact on code-generation for conforming
programs.  (For example, your proposal, as written, pessimizes:

  int i;	   
  int *ip = &i;
  void *vp = ip;
  *((int*) vp) = 3;

You can amend your proposal to handle the void (and perhaps char?)
case specially, but what about structures with common initial
segments, as used in object-oriented C?  TCL, for example, is one
program that uses this kind of thing heavily.  The Xt toolkit is
another, and there's something that is often performance-critical on a
Linux system.)

So, unless I'm overruled, *something* is going to have to change in

  a = ntohl(*(u32*) p);

if you're going to enable type-based alias analysis.

BTW, I've been notified in private mail that you pointed out a bug in
GCC's real.c, involving exactly the kinds of casts we're arguing about.
(I somehow missed that message from you.)   Thanks for pointing that
out!  I'll fix it soon.

I gather that you suggested your proposal would avoid changing GCC.
But, it wouldn't, since GCC's first stage is compiled with a (possibly
non-GCC) host compiler.  Thus, GCC *must* be written in legal ANSI/ISO
C. 

Even in the kernel, your proposal will lead to a confusing situation.
You claim it's DWIM, but there the "I" really is "Linus Torvalds", and
not necessarily the rest of us.  People used to the ANSI/ISO C
aliasing rules will have to read the GCC manual very carefully to
figure out the meaning of your code.

I think by now you've been presented with a variety of strategies for
solving the problem in the kernel, including more than one idea for
macros that you could use like:

  ALIASING_CAST (type, x)

that would do what you want.  I believe Richard Henderson suggested
one involving local unions; you could also use memcpy as I suggested.
(There may be alignment issues that make my suggestion better, or
maybe not.  I'm not sure.) Using this approach would make your code
clear and self-documenting.  (It would be DWIS!)  This approach is
better than my earlier suggestion (using unions in header files): it
does not require header-file duplication, and requires only local
changes to the kernel.  (What function really could be improved by
type-based alias analysis?  Put it in a separate file.  Use
ALIASING_CAST in it.  Compile *that file* with -fstrict-aliasing.
Performance win, little additional maintenance cost, no impact on the
rest of the kernel.)

Even if we implemented your proposal you'd have to audit all your code
to make sure that all the technically invalid casts come in
expressions that are immediately dereferenced, and not stored in
temporaries.

At this point, I strongly suggest you abandon your proposal.  Nobody
looks likely to implement it (at least on a volunteer basis), and I've
pointed out that it will be hard to do so, even if it was agreed that
it was a good thing to do.  Sorry.

--
Mark Mitchell                   mark@codesourcery.com
CodeSourcery, LLC               http://www.codesourcery.com


* Re: Linux and aliasing?
  1999-06-03 12:02   ` Andi Kleen
  1999-06-03 15:38     ` Martin v. Loewis
@ 1999-06-30 15:43     ` Andi Kleen
  1 sibling, 0 replies; 218+ messages in thread
From: Andi Kleen @ 1999-06-30 15:43 UTC (permalink / raw)
  To: mark; +Cc: egcs

mark@codesourcery.com writes:
> 
> And David Miller (IIRC) indicated that the kernel folks would probably
> eliminate the non-standard C code in the next major kernel revision.

This would be a major rewrite of a lot of code. Linux is full of such
things. Having it all fixed in the next major release (and even in the
next release after that) would be a miracle. 

> So, no, I don't think it's a priority for us to make any such change.
> I can't really speak for others, but that's my take on the situation.

Bad :/. -fno-strict-aliasing is the only alternative then. What a pity,
Linus' proposal looked reasonable. 

-Andi

-- 
This is like TV. I don't like TV.


* Re: burley (was Re: Linux and aliasing?)
  1999-06-04  8:16                         ` craig
@ 1999-06-30 15:43                           ` craig
  0 siblings, 0 replies; 218+ messages in thread
From: craig @ 1999-06-30 15:43 UTC (permalink / raw)
  To: hahn; +Cc: craig

>could we please have a separate list for this kind of asinine namecalling?

We do, it's called /dev/null, but Linus continues to insist on using
*this* list.

>perhaps call it "hoity-toity-armchair-architects-soapbox"?
>
>> In other words, you believe you are a better language designer than
>> the ISO C people as well as the gcc maintainers, despite the fact
>> that you know, what, *nothing* about language design, and *nothing*
>> about compiler design and, especially, long-term maintenance of
>> compilers?

Strange you quote *my* email, since I called him no names, like "lawyer",
in it.

        tq vm, (burley)


* Re: Linux and aliasing?
  1999-06-04 15:02             ` Richard Henderson
  1999-06-04 16:50               ` Bernd Schmidt
  1999-06-05  9:35               ` Linus Torvalds
@ 1999-06-30 15:43               ` Richard Henderson
  2 siblings, 0 replies; 218+ messages in thread
From: Richard Henderson @ 1999-06-30 15:43 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: craig, davem, mark, chip, egcs

This thread is huge, and there is obviously a bit of bile swilling about,
so I probably won't read it all.  However, I will point out one thing --

On Thu, Jun 03, 1999 at 11:02:35PM -0700, Linus Torvalds wrote:
> The extremely straightforward rule that at least I would advocate is _so_
> straightforward as to be almost scary:
>  - if there is a pointer cast, that pointer cast invalidates all
>    type-based alias information.

Doing what you want is actually very hard for GCC right now.  Consider

	int i;
	short s, *ps = (short *)&i;
	i = 0;
	s = *ps;

Due to a long-ago quirk of history, GCC processes the abstract syntax
tree one statement at a time, so the fact of the cast is long gone by
the time we do the dereference.  Mark got around this problem by
annotating the memories as we create them, which is good enough to pass
legal muster, but not good enough for what you want.

To do what you want, we'd have to annotate pointers instead of memories
and then do global data flow analysis to find out what addresses have
been "infected" by the cast.  Doing anything on a local scale wouldn't
be good enough, I don't think, to handle code coming in from inlines.

Now, we do want to do some of this, since if you can do global data
flow analysis, you can propagate points-to data that gets you even 
better alias info than what we have now.  We'd just fall back on type
information for lack of interprocedural alias info. 

But something like that is a long way off.


r~

^ permalink raw reply	[flat|nested] 218+ messages in thread

* Re: Linux and aliasing?
  1999-06-04  7:11                 ` mark
  1999-06-04  8:38                   ` Linus Torvalds
@ 1999-06-30 15:43                   ` mark
  1 sibling, 0 replies; 218+ messages in thread
From: mark @ 1999-06-30 15:43 UTC (permalink / raw)
  To: torvalds; +Cc: craig, davem, chip, egcs

>>>>> "Linus" == Linus Torvalds <torvalds@transmeta.com> writes:

    Linus> On Thu, 3 Jun 1999 mark@codesourcery.com wrote:
    >>  I don't think the cast rule is by any means the right obvious
    >> default.  For one thing, it pessimizes object-oriented C code
    >> that does downcasts through an inheritance hierarchy.  There's
    >> no reason that we shouldn't be able to use type-based alias
    >> analysis in such situations, but your proposal would make it
    >> not happen.

    Linus> But those downcasts are implicit, not explicit, no? I think
    Linus> only explicit casts should break the alias rule.

No, they are often explicit.  In C, you don't have base classes, per
se, so you write them explicitly.
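
A small sketch of the kind of code meant here, with hypothetical type names (`shape`, `circle`) that do not come from the thread. The "base class" is simply the first member of the "derived" struct, and the downcast is an explicit pointer cast:

```c
/* "Base class": every shape starts with this header. */
struct shape {
    int kind;
};

/* "Derived class": embeds the base as its first member. */
struct circle {
    struct shape base;
    int radius;
};

/* Explicit downcast.  This is valid C when s really points at the
   base member of a struct circle, because a pointer to a struct and a
   pointer to its first member convert to each other. */
int circle_radius(struct shape *s)
{
    struct circle *c = (struct circle *)s;
    return c->radius;
}
```

Under a rule where any pointer cast discards type-based alias information, accesses through `c` here would lose that information even though the code is conforming.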

    Linus> Have you actually ever tried? I don't think you realize
    Linus> quite what a rat-hole it is. It's not worth ANYBODYS time.

Yes, I have done similar things.

    Linus> I think it's a damn shame that instead of technical
    Linus> arguments _everything_ revolves around people reading the
    Linus> standard as if it was the bible, and trying to make people
    Linus> feel guilty for not really caring. It's not a sin to just
    Linus> want to get good code without having to do magic
    Linus> contortions, guys.

I implemented the code, and I wouldn't say that I ignored real
programmers.  In fact, my work was paid for by real programmers, who
noticed that GCC would generate markedly better code on some examples
they had if type-based aliasing were in use.  

I've expressed the position that if we come up with a reasonable
localized rule, that does not pessimize conforming code, that I would
have no objection.  In fact, I would be perfectly willing to work on
such a project.

With respect to your comments about prereleases, they're simply not
fair.  I see no reason that I, or anyone else, should volunteer our
time to add features.  Had I introduced a bug, I would feel duty-bound
to fix it.  

I do listen to feedback, and I've heard your point of view.  I respect
your opinion.  That doesn't mean I'm going to sit down and do what you
would like on my own time.

--
Mark Mitchell                   mark@codesourcery.com
CodeSourcery, LLC               http://www.codesourcery.com

^ permalink raw reply	[flat|nested] 218+ messages in thread

* Re: Linux and aliasing?
  1999-06-04  8:38                   ` Linus Torvalds
@ 1999-06-30 15:43                     ` Linus Torvalds
  0 siblings, 0 replies; 218+ messages in thread
From: Linus Torvalds @ 1999-06-30 15:43 UTC (permalink / raw)
  To: mark; +Cc: craig, davem, chip, egcs

On Fri, 4 Jun 1999 mark@codesourcery.com wrote:
> 
> With respect to your comments about prereleases, they're simply not
> fair.  I see no reason that I, or anyone else, should volunteer our
> time to add features.  Had I introduced a bug, I would feel duty-bound
> to fix it.  

Oh, let me apologize for that comment. Consider me properly chastised: I
obviously use pre-releases all the time myself, and they are the greatest
thing since sliced bread.

> I do listen to feedback, and I've heard your point of view.  I respect
> your opinion.  That doesn't mean I'm going to sit down and do what you
> would like on my own time.

I really don't expect people to code for me. It's damn convenient, though.

I =do= expect people to at least consider the issue seriously, and
seriously dismiss it if they do - and keep it in mind. Instead of
attacking it on some paperwork issue.. Which you do seem to be doing. 

		Linus

^ permalink raw reply	[flat|nested] 218+ messages in thread

* Re: Linux and aliasing?
  1999-06-07  9:18                                 ` mark
  1999-06-07  9:29                                   ` Linus Torvalds
@ 1999-06-30 15:43                                   ` mark
  1 sibling, 0 replies; 218+ messages in thread
From: mark @ 1999-06-30 15:43 UTC (permalink / raw)
  To: torvalds; +Cc: rth, tim, craig, davem, chip, egcs

>>>>> "Linus" == Linus Torvalds <torvalds@transmeta.com> writes:

    Linus> My proposal might mean that fewer people will use the
    Linus> "-fno-strict-aliasing" switch, because they won't have to. I
    Linus> don't think you realize how most professional software
    Linus> projects work. The "professional" part means that people
    Linus> are under a deadline and don't really care about your
    Linus> standards conformance, they want things to WORK.

Please don't make these kinds of statements.  They're not becoming.

I am a professional developer, paid for my work.  Some of that work is
on free software, some is not.  Before my current job, I worked as a
technical lead in a midsized software corporation, where I, and my team,
were all professional developers.  I brought two or three products to
release, and I'm well aware of the pressures, both technical and
otherwise, that accompany such a project.

    Linus> Andi Kleen already said he was playing with patches that
    Linus> implemented it, but just ignore that, like you ignore all
    Linus> the other arguments I've presented. Sorry,

I did not ignore Andi.  Indeed, I responded to him, both personally
and to the list.  I discussed his/your proposal with him, and pointed
out technical flaws both in your proposal, and in the obvious way of
implementing it.  (I haven't seen Andi's patches, so I don't know what
approach he took; all I said was why the obvious one won't work.)

--
Mark Mitchell                   mark@codesourcery.com
CodeSourcery, LLC               http://www.codesourcery.com

^ permalink raw reply	[flat|nested] 218+ messages in thread

* Re: Linux and aliasing?
  1999-06-04 10:31                             ` Joe Buck
  1999-06-04 10:53                               ` Jeffrey A Law
  1999-06-30 15:43                               ` Joe Buck
@ 1999-07-11 10:55                               ` Jeffrey A Law
  1999-07-31 23:33                                 ` Jeffrey A Law
  2 siblings, 1 reply; 218+ messages in thread
From: Jeffrey A Law @ 1999-07-11 10:55 UTC (permalink / raw)
  To: Joe Buck; +Cc: torvalds, craig, mark, davem, chip, egcs

  In message <199906041728.KAA19685@atrus.synopsys.com>you write:
  > >   > Either
  > >   > 1. Leave it as it is (the Linux kernel will need -fno-strict-aliasing
  > ).
  > > This is my strong preference.
  > 
  > In that case, then all release announcements and NEWS should prominently
  > mention the effect of this new optimization and the -fno-strict-aliasing
  > flag, so that everyone has fair warning.
We've got a preliminary entry in the FAQ.  That entry is also referenced by the
prototype gcc-2.95 features page as well as the prototype gcc-2.95 caveats
page.

I hope to improve the FAQ entry before the actual release.  In particular we
want to include a better explanation of the generic problem (along with a
code sample?) as well as various options one could use to fix the code.

Then we just mention the Linux kernel as an example of a package which is known
to have this particular problem.

This is basically what we did with the asm clobbers information.  We've got a
generic entry which describes the problem and it references a Linux kernel
specific entry.

jeff


^ permalink raw reply	[flat|nested] 218+ messages in thread

* Re: Linux and aliasing?
  1999-07-11 10:55                               ` Jeffrey A Law
@ 1999-07-31 23:33                                 ` Jeffrey A Law
  0 siblings, 0 replies; 218+ messages in thread
From: Jeffrey A Law @ 1999-07-31 23:33 UTC (permalink / raw)
  To: Joe Buck; +Cc: torvalds, craig, mark, davem, chip, egcs

  In message <199906041728.KAA19685@atrus.synopsys.com>you write:
  > >   > Either
  > >   > 1. Leave it as it is (the Linux kernel will need -fno-strict-aliasing
  > ).
  > > This is my strong preference.
  > 
  > In that case, then all release announcements and NEWS should prominently
  > mention the effect of this new optimization and the -fno-strict-aliasing
  > flag, so that everyone has fair warning.
We've got a preliminary entry in the FAQ.  That entry is also referenced by the
prototype gcc-2.95 features page as well as the prototype gcc-2.95 caveats
page.

I hope to improve the FAQ entry before the actual release.  In particular we
want to include a better explanation of the generic problem (along with a
code sample?) as well as various options one could use to fix the code.

Then we just mention the Linux kernel as an example of a package which is known
to have this particular problem.

This is basically what we did with the asm clobbers information.  We've got a
generic entry which describes the problem and it references a Linux kernel
specific entry.

jeff


^ permalink raw reply	[flat|nested] 218+ messages in thread

* Re: Linux and aliasing?
  1999-06-04 11:54 Mike Stump
  1999-06-04 12:13 ` Jeffrey A Law
@ 1999-06-30 15:43 ` Mike Stump
  1 sibling, 0 replies; 218+ messages in thread
From: Mike Stump @ 1999-06-30 15:43 UTC (permalink / raw)
  To: jbuck, law; +Cc: chip, craig, davem, egcs, mark, torvalds

> From: Joe Buck <jbuck@Synopsys.COM>
> Date: Fri, 4 Jun 99 10:28:34 PDT

> In that case, then all release announcements and NEWS should prominently
> mention the effect of this new optimization and the -fno-strict-aliasing
> flag, so that everyone has fair warning.

I agree.  I am trying to figure out how to alert my entire company of
this new feature.  I am sure we have old networking code and tons of
programmers that probably aren't used to optimizing compilers.  :-)

I can see them wanting to rip the skin off my body if I tried to
spring this one on them, then I would have to do all this
backpedaling....

The recent mini-flame war is useful to me, in that it re-alerted me to
the `problem' (one of education from my perspective).

^ permalink raw reply	[flat|nested] 218+ messages in thread

* Re: Linux and aliasing?
  1999-06-04 12:13 ` Jeffrey A Law
  1999-06-04 13:25   ` Sylvain Pion
@ 1999-06-30 15:43   ` Jeffrey A Law
  1 sibling, 0 replies; 218+ messages in thread
From: Jeffrey A Law @ 1999-06-30 15:43 UTC (permalink / raw)
  To: Mike Stump; +Cc: jbuck, chip, craig, davem, egcs, mark, torvalds

  In message < 199906041854.LAA05487@kankakee.wrs.com >you write:

  > > In that case, then all release announcements and NEWS should prominently
  > > mention the effect of this new optimization and the -fno-strict-aliasing
  > > flag, so that everyone has fair warning.
  > 
  > I agree.  I am trying to figure out how to alert my entire company of
  > this new feature.  I am sure we have old networking code and tons of
  > programmers that probably aren't used to optimizing compilers.  :-)
I suspect you're not totally alone :-)    If you write up any kind of summary
with examples of dangerous code I think everyone would find it useful.  Any
chance you could post it so that we can put it on the web?

jeff

^ permalink raw reply	[flat|nested] 218+ messages in thread

* Re: Linux and aliasing?
  1999-06-06 15:08 Ross Harvey
  1999-06-06 15:46 ` Linus Torvalds
  1999-06-06 17:29 ` David S. Miller
@ 1999-06-30 15:43 ` Ross Harvey
  2 siblings, 0 replies; 218+ messages in thread
From: Ross Harvey @ 1999-06-30 15:43 UTC (permalink / raw)
  To: mark, torvalds; +Cc: chip, craig, davem, egcs, rth, tim

> From: Linus Torvalds <torvalds@transmeta.com>
>
> On Sun, 6 Jun 1999 mark@codesourcery.com wrote:
> > 
> > Right.  But the part that's causing aliasing issues is just a memcpy;
> > that's the `*(u32 *) p' bit.   You could write:
> > 
> >   memcpy (&a, p, sizeof (a));
> >   a = ntohl (a);
>
> Which is crap.
>
> And a compiler that requires you to write code like that is, by
> implication..
>:::
> If you can't see why
>
> 	a = ntohl((u32 *) p);
>
> is better than the horrible thing you're suggesting (regardless of whether
> the code generated is the same or not), then I might as well throw in the
> towel immediately. The whole point of my suggestion was to make good code
> generation possible with an interface that you can actually use without
> barfing..
>
> 		Linus

Umm, the ice is getting thin, here. From time to time, I have to change
kernel code FROM things like ``a = ntohl((u32 *) p);'' TO things like
``memcpy (&a, p, sizeof a);  a = ntohl (a);''.

Why? Because it's illegal for a reason, and the alpha platform I support
has alignment requirements for which I have a guarantee that the memcpy
approach will work and be reasonably efficient. The cast can generate a
kernel alignment fault and will either panic or have a hideous run-time
fixup cost.  I know we are talking about aliasing, but in terms of why the
rules are there, I think this is an illuminating example.

LP64 is coming to the PeeCee world, and even if the fixup is in hardware,
you still have a runtime penalty for making addressability assumptions that
are properly the domain of the compiler.
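
A minimal sketch of the memcpy form described above, wrapped in a hypothetical helper (`load_net32` is an illustrative name, not kernel code). It assumes a POSIX system for ntohl() and a buffer of at least four readable bytes:

```c
#include <arpa/inet.h>  /* ntohl(); POSIX, not ISO C */
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: read a 32-bit network-order value from a byte
   buffer that may be unaligned.  memcpy copies bytes, so no uint32_t
   lvalue ever refers to the buffer and no aligned load is required. */
uint32_t load_net32(const unsigned char *p)
{
    uint32_t a;
    memcpy(&a, p, sizeof a);
    return ntohl(a);
}
```

Whether the compiler turns the fixed-size memcpy into a plain load is a code-generation question, and the thread itself disagrees about how well gcc does this.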

	Ross.Harvey@Computer.Org

^ permalink raw reply	[flat|nested] 218+ messages in thread

* Re: Linux and aliasing?
  1999-06-06 17:29 ` David S. Miller
@ 1999-06-30 15:43   ` David S. Miller
  0 siblings, 0 replies; 218+ messages in thread
From: David S. Miller @ 1999-06-30 15:43 UTC (permalink / raw)
  To: ross; +Cc: mark, torvalds, chip, craig, egcs, rth, tim

   Date: Sun, 6 Jun 1999 15:07:18 -0700 (PDT)
   From: Ross Harvey <ross@ghs.com>

   Umm, the ice is getting thin, here. From time to time, I have to change
   kernel code FROM things like ``a = ntohl((u32 *) p);'' TO things like
   ``memcpy (&a, p, sizeof a);  a = ntohl (a);''.

   Why? Because it's illegal for a reason, and the alpha platform I support
   has alignment requirements for which I have a guarantee that the memcpy
   approach will work and be reasonably efficient. The cast can generate a
   kernel alignment fault and will either panic or have a hideous run-time
   fixup cost.

This assumes you haven't set up your packet input processing to make
the data end up aligned by the time the header parsing gets at it.

The cast references work just fine and incur no traps on Alpha in the
Linux networking, so I have no idea what you are talking about.

Later,
David S. Miller
davem@redhat.com

^ permalink raw reply	[flat|nested] 218+ messages in thread

* Re: Linux and aliasing?
  1999-06-04 13:25   ` Sylvain Pion
  1999-06-04 13:32     ` Jeffrey A Law
@ 1999-06-30 15:43     ` Sylvain Pion
  1 sibling, 0 replies; 218+ messages in thread
From: Sylvain Pion @ 1999-06-30 15:43 UTC (permalink / raw)
  To: Jeffrey A Law; +Cc: EGCS list

On Fri, Jun 04, 1999 at 01:08:15PM -0600, Jeffrey A Law wrote:
>   In message < 199906041854.LAA05487@kankakee.wrs.com >you write:
>   > > In that case, then all release announcements and NEWS should prominently
>   > > mention the effect of this new optimization and the -fno-strict-aliasing
>   > > flag, so that everyone has fair warning.
>   > 
>   > I agree.  I am trying to figure out how to alert my entire company of
>   > this new feature.  I am sure we have old networking code and tons of
>   > programmers that probably aren't used to optimizing compilers.  :-)
> I suspect you're not totally alone :-)    If you write up any kind of summary
> with examples of dangerous code I think everyone would find it useful.  Any
> chance you could post it so that we can put it on the web?

Is there a way to emit a warning when dangerous code is used ?
That would greatly help.

-- 
Sylvain

^ permalink raw reply	[flat|nested] 218+ messages in thread

* Re: Linux and aliasing?
  1999-06-06 15:46 ` Linus Torvalds
@ 1999-06-30 15:43   ` Linus Torvalds
  0 siblings, 0 replies; 218+ messages in thread
From: Linus Torvalds @ 1999-06-30 15:43 UTC (permalink / raw)
  To: Ross Harvey; +Cc: mark, chip, craig, davem, egcs, rth, tim

On Sun, 6 Jun 1999, Ross Harvey wrote:
> 
> Why? Because it's illegal for a reason, and the alpha platform I support
> has alignment requirements for which I have a guarantee that the memcpy
> approach will work and be reasonably efficient. The cast can generate a
> kernel alignment fault and will either panic or have a hideous run-time
> fixup cost.

Indeed. Which is why we have "get_unaligned()" for example - which does
what the name suggests. Exactly because memcpy() is _not_ acceptable for a
fairly obvious syntactic reason. 
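
The kernel's get_unaligned() itself is not shown in this thread. Purely as an illustration of an accessor in that spirit, here is a hypothetical function (`load_be32_bytes`, an assumed name) that reads a big-endian 32-bit value one byte at a time, so the caller writes a single expression and no wide load ever touches the possibly unaligned address:

```c
#include <stdint.h>

/* Hypothetical accessor, NOT the kernel's get_unaligned(): assemble a
   big-endian 32-bit value from four byte loads.  Byte accesses have no
   alignment requirement and are always permitted to alias. */
uint32_t load_be32_bytes(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24)
         | ((uint32_t)p[1] << 16)
         | ((uint32_t)p[2] << 8)
         |  (uint32_t)p[3];
}
```

A call site then reads `a = load_be32_bytes(p);`, which keeps the one-expression shape argued for above.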

I certainly would not knowingly ever apply a patch that adds memcpy's like
in the example. They may exist in drivers where I don't care what stupid
things people do, but it's not something I consider acceptable coding
practice for any regular stuff.

I'd be pretty impressed if gcc _were_ able to generate the correct code
for the example specified - that would be fairly impressive in itself. 
Currently that's not the case. And if I were a compiler guy, I'd try my
best to encourage people to use other constructs and make sure they work
better.

		Linus

^ permalink raw reply	[flat|nested] 218+ messages in thread

* Re: Linux and aliasing?
  1999-06-04 13:32     ` Jeffrey A Law
@ 1999-06-30 15:43       ` Jeffrey A Law
  0 siblings, 0 replies; 218+ messages in thread
From: Jeffrey A Law @ 1999-06-30 15:43 UTC (permalink / raw)
  To: Sylvain Pion; +Cc: EGCS list

  In message < 19990604222537.B12018@rigel.inria.fr >you write:
  > Is there a way to emit a warning when dangerous code is used ?
  > That would greatly help.
Possibly.  I'm not well versed enough in the front-end to really know.

One might be able to start by looking at how -Wcast-align works and see if
there's enough information at that time to warn without giving too many
false positives.
jeff

^ permalink raw reply	[flat|nested] 218+ messages in thread

* Re: Linux and aliasing?
  1999-06-06 15:08 Ross Harvey
  1999-06-06 15:46 ` Linus Torvalds
@ 1999-06-06 17:29 ` David S. Miller
  1999-06-30 15:43   ` David S. Miller
  1999-06-30 15:43 ` Ross Harvey
  2 siblings, 1 reply; 218+ messages in thread
From: David S. Miller @ 1999-06-06 17:29 UTC (permalink / raw)
  To: ross; +Cc: mark, torvalds, chip, craig, egcs, rth, tim

   Date: Sun, 6 Jun 1999 15:07:18 -0700 (PDT)
   From: Ross Harvey <ross@ghs.com>

   Umm, the ice is getting thin, here. From time to time, I have to change
   kernel code FROM things like ``a = ntohl((u32 *) p);'' TO things like
   ``memcpy (&a, p, sizeof a);  a = ntohl (a);''.

   Why? Because it's illegal for a reason, and the alpha platform I support
   has alignment requirements for which I have a guarantee that the memcpy
   approach will work and be reasonably efficient. The cast can generate a
   kernel alignment fault and will either panic or have a hideous run-time
   fixup cost.

This assumes you haven't set up your packet input processing to make
the data end up aligned by the time the header parsing gets at it.

The cast references work just fine and incur no traps on Alpha in the
Linux networking, so I have no idea what you are talking about.

Later,
David S. Miller
davem@redhat.com

^ permalink raw reply	[flat|nested] 218+ messages in thread

* Re: Linux and aliasing?
  1999-06-06 15:08 Ross Harvey
@ 1999-06-06 15:46 ` Linus Torvalds
  1999-06-30 15:43   ` Linus Torvalds
  1999-06-06 17:29 ` David S. Miller
  1999-06-30 15:43 ` Ross Harvey
  2 siblings, 1 reply; 218+ messages in thread
From: Linus Torvalds @ 1999-06-06 15:46 UTC (permalink / raw)
  To: Ross Harvey; +Cc: mark, chip, craig, davem, egcs, rth, tim

On Sun, 6 Jun 1999, Ross Harvey wrote:
> 
> Why? Because it's illegal for a reason, and the alpha platform I support
> has alignment requirements for which I have a guarantee that the memcpy
> approach will work and be reasonably efficient. The cast can generate a
> kernel alignment fault and will either panic or have a hideous run-time
> fixup cost.

Indeed. Which is why we have "get_unaligned()" for example - which does
what the name suggests. Exactly because memcpy() is _not_ acceptable for a
fairly obvious syntactic reason. 

I certainly would not knowingly ever apply a patch that adds memcpy's like
in the example. They may exist in drivers where I don't care what stupid
things people do, but it's not something I consider acceptable coding
practice for any regular stuff.

I'd be pretty impressed if gcc _were_ able to generate the correct code
for the example specified - that would be fairly impressive in itself. 
Currently that's not the case. And if I were a compiler guy, I'd try my
best to encourage people to use other constructs and make sure they work
better.

		Linus

^ permalink raw reply	[flat|nested] 218+ messages in thread

* Re: Linux and aliasing?
@ 1999-06-06 15:08 Ross Harvey
  1999-06-06 15:46 ` Linus Torvalds
                   ` (2 more replies)
  0 siblings, 3 replies; 218+ messages in thread
From: Ross Harvey @ 1999-06-06 15:08 UTC (permalink / raw)
  To: mark, torvalds; +Cc: chip, craig, davem, egcs, rth, tim

> From: Linus Torvalds <torvalds@transmeta.com>
>
> On Sun, 6 Jun 1999 mark@codesourcery.com wrote:
> > 
> > Right.  But the part that's causing aliasing issues is just a memcpy;
> > that's the `*(u32 *) p' bit.   You could write:
> > 
> >   memcpy (&a, p, sizeof (a));
> >   a = ntohl (a);
>
> Which is crap.
>
> And a compiler that requires you to write code like that is, by
> implication..
>:::
> If you can't see why
>
> 	a = ntohl((u32 *) p);
>
> is better than the horrible thing you're suggesting (regardless of whether
> the code generated is the same or not), then I might as well throw in the
> towel immediately. The whole point of my suggestion was to make good code
> generation possible with an interface that you can actually use without
> barfing..
>
> 		Linus

Umm, the ice is getting thin, here. From time to time, I have to change
kernel code FROM things like ``a = ntohl((u32 *) p);'' TO things like
``memcpy (&a, p, sizeof a);  a = ntohl (a);''.

Why? Because it's illegal for a reason, and the alpha platform I support
has alignment requirements for which I have a guarantee that the memcpy
approach will work and be reasonably efficient. The cast can generate a
kernel alignment fault and will either panic or have a hideous run-time
fixup cost.  I know we are talking about aliasing, but in terms of why the
rules are there, I think this is an illuminating example.

LP64 is coming to the PeeCee world, and even if the fixup is in hardware,
you still have a runtime penalty for making addressability assumptions that
are properly the domain of the compiler.

	Ross.Harvey@Computer.Org

^ permalink raw reply	[flat|nested] 218+ messages in thread

* Re: Linux and aliasing?
  1999-06-04 13:25   ` Sylvain Pion
@ 1999-06-04 13:32     ` Jeffrey A Law
  1999-06-30 15:43       ` Jeffrey A Law
  1999-06-30 15:43     ` Sylvain Pion
  1 sibling, 1 reply; 218+ messages in thread
From: Jeffrey A Law @ 1999-06-04 13:32 UTC (permalink / raw)
  To: Sylvain Pion; +Cc: EGCS list

  In message < 19990604222537.B12018@rigel.inria.fr >you write:
  > Is there a way to emit a warning when dangerous code is used ?
  > That would greatly help.
Possibly.  I'm not well versed enough in the front-end to really know.

One might be able to start by looking at how -Wcast-align works and see if
there's enough information at that time to warn without giving too many
false positives.
jeff

^ permalink raw reply	[flat|nested] 218+ messages in thread

* Re: Linux and aliasing?
  1999-06-04 12:13 ` Jeffrey A Law
@ 1999-06-04 13:25   ` Sylvain Pion
  1999-06-04 13:32     ` Jeffrey A Law
  1999-06-30 15:43     ` Sylvain Pion
  1999-06-30 15:43   ` Jeffrey A Law
  1 sibling, 2 replies; 218+ messages in thread
From: Sylvain Pion @ 1999-06-04 13:25 UTC (permalink / raw)
  To: Jeffrey A Law; +Cc: EGCS list

On Fri, Jun 04, 1999 at 01:08:15PM -0600, Jeffrey A Law wrote:
>   In message < 199906041854.LAA05487@kankakee.wrs.com >you write:
>   > > In that case, then all release announcements and NEWS should prominently
>   > > mention the effect of this new optimization and the -fno-strict-aliasing
>   > > flag, so that everyone has fair warning.
>   > 
>   > I agree.  I am trying to figure out how to alert my entire company of
>   > this new feature.  I am sure we have old networking code and tons of
>   > programmers that probably aren't used to optimizing compilers.  :-)
> I suspect you're not totally alone :-)    If you write up any kind of summary
> with examples of dangerous code I think everyone would find it useful.  Any
> chance you could post it so that we can put it on the web?

Is there a way to emit a warning when dangerous code is used ?
That would greatly help.

-- 
Sylvain

^ permalink raw reply	[flat|nested] 218+ messages in thread

* Re: Linux and aliasing?
  1999-06-04 11:54 Mike Stump
@ 1999-06-04 12:13 ` Jeffrey A Law
  1999-06-04 13:25   ` Sylvain Pion
  1999-06-30 15:43   ` Jeffrey A Law
  1999-06-30 15:43 ` Mike Stump
  1 sibling, 2 replies; 218+ messages in thread
From: Jeffrey A Law @ 1999-06-04 12:13 UTC (permalink / raw)
  To: Mike Stump; +Cc: jbuck, chip, craig, davem, egcs, mark, torvalds

  In message < 199906041854.LAA05487@kankakee.wrs.com >you write:

  > > In that case, then all release announcements and NEWS should prominently
  > > mention the effect of this new optimization and the -fno-strict-aliasing
  > > flag, so that everyone has fair warning.
  > 
  > I agree.  I am trying to figure out how to alert my entire company of
  > this new feature.  I am sure we have old networking code and tons of
  > programmers that probably aren't used to optimizing compilers.  :-)
I suspect you're not totally alone :-)    If you write up any kind of summary
with examples of dangerous code I think everyone would find it useful.  Any
chance you could post it so that we can put it on the web?

jeff

^ permalink raw reply	[flat|nested] 218+ messages in thread

* Re: Linux and aliasing?
@ 1999-06-04 11:54 Mike Stump
  1999-06-04 12:13 ` Jeffrey A Law
  1999-06-30 15:43 ` Mike Stump
  0 siblings, 2 replies; 218+ messages in thread
From: Mike Stump @ 1999-06-04 11:54 UTC (permalink / raw)
  To: jbuck, law; +Cc: chip, craig, davem, egcs, mark, torvalds

> From: Joe Buck <jbuck@Synopsys.COM>
> Date: Fri, 4 Jun 99 10:28:34 PDT

> In that case, then all release announcements and NEWS should prominently
> mention the effect of this new optimization and the -fno-strict-aliasing
> flag, so that everyone has fair warning.

I agree.  I am trying to figure out how to alert my entire company of
this new feature.  I am sure we have old networking code and tons of
programmers that probably aren't used to optimizing compilers.  :-)

I can see them wanting to rip the skin off my body if I tried to
spring this one on them, then I would have to do all this
backpedaling....

The recent mini-flame war is useful to me, in that it re-alerted me to
the `problem' (one of education from my perspective).

^ permalink raw reply	[flat|nested] 218+ messages in thread

end of thread, other threads:[~1999-07-31 23:33 UTC | newest]

Thread overview: 218+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
1999-06-03 10:23 Linux and aliasing? Chip Salzenberg
1999-06-03 10:37 ` mark
1999-06-03 11:26   ` David S. Miller
1999-06-03 12:03     ` mark
1999-06-03 12:25       ` David S. Miller
1999-06-03 20:06         ` craig
1999-06-03 23:03           ` Linus Torvalds
1999-06-03 23:45             ` mark
1999-06-04  0:04               ` Linus Torvalds
1999-06-04  1:08                 ` Branko Cibej
1999-06-30 15:43                   ` Branko Cibej
1999-06-04  1:24                 ` Joe Buck
1999-06-04  1:50                   ` Linus Torvalds
1999-06-04  5:46                     ` craig
1999-06-04  7:22                       ` burley (was Re: Linux and aliasing?) Mark Hahn
1999-06-04  8:16                         ` craig
1999-06-30 15:43                           ` craig
1999-06-30 15:43                         ` Mark Hahn
1999-06-04  8:35                       ` Linux and aliasing? Linus Torvalds
1999-06-04 10:04                         ` Joe Buck
1999-06-04 10:22                           ` Jeffrey A Law
1999-06-04 10:31                             ` Joe Buck
1999-06-04 10:53                               ` Jeffrey A Law
1999-06-30 15:43                                 ` Jeffrey A Law
1999-06-30 15:43                               ` Joe Buck
1999-07-11 10:55                               ` Jeffrey A Law
1999-07-31 23:33                                 ` Jeffrey A Law
1999-06-04 11:11                             ` Toon Moene
1999-06-04 12:20                               ` Jeffrey A Law
1999-06-05  5:45                                 ` Toon Moene
1999-06-05  6:23                                   ` Andi Kleen
1999-06-05 10:32                                     ` Toon Moene
1999-06-05 13:26                                       ` Jamie Lokier
1999-06-05 19:35                                         ` Linus Torvalds
1999-06-06  1:18                                           ` Martin v. Loewis
1999-06-06 10:46                                             ` Linus Torvalds
1999-06-30 15:43                                               ` Linus Torvalds
1999-06-06 17:56                                             ` Jason Merrill
1999-06-06 19:24                                               ` Tim Hollebeek
1999-06-30 15:43                                                 ` Tim Hollebeek
1999-06-06 22:23                                               ` Jeffrey A Law
1999-06-30 15:43                                                 ` Jeffrey A Law
     [not found]                                               ` <199906070645.IAA00615@mira.isdn.cs.tu-berlin.de>
1999-06-07  2:14                                                 ` Jason Merrill
1999-06-07  8:02                                                   ` mark
1999-06-07  8:41                                                     ` David S. Miller
1999-06-07  9:24                                                       ` Jeffrey A Law
1999-06-07  9:29                                                         ` David S. Miller
1999-06-30 15:43                                                           ` David S. Miller
1999-06-30 15:43                                                         ` Jeffrey A Law
1999-06-07  9:32                                                       ` Joe Buck
1999-06-30 15:43                                                         ` Joe Buck
1999-06-30 15:43                                                       ` David S. Miller
1999-06-30 15:43                                                     ` mark
1999-06-07 13:11                                                   ` Jeffrey A Law
1999-06-30 15:43                                                     ` Jeffrey A Law
1999-06-30 15:43                                                   ` Jason Merrill
1999-06-30 15:43                                               ` Jason Merrill
1999-06-30 15:43                                             ` Martin v. Loewis
1999-06-30 15:43                                           ` Linus Torvalds
1999-06-30 15:43                                         ` Jamie Lokier
1999-06-05 18:48                                       ` Linus Torvalds
1999-06-30 15:43                                         ` Linus Torvalds
1999-06-30 15:43                                       ` Toon Moene
1999-06-05 10:37                                     ` mark
1999-06-05 11:09                                       ` David S. Miller
1999-06-05 12:11                                         ` Toon Moene
1999-06-05 12:21                                           ` David S. Miller
1999-06-05 16:51                                             ` mark
1999-06-30 15:43                                               ` mark
1999-06-30 15:43                                             ` David S. Miller
1999-06-30 15:43                                           ` Toon Moene
1999-06-07  6:01                                         ` Joern Rennecke
1999-06-30 15:43                                           ` Joern Rennecke
1999-06-30 15:43                                         ` David S. Miller
1999-06-05 11:35                                       ` Andi Kleen
1999-06-30 15:43                                         ` Andi Kleen
1999-06-05 12:41                                       ` Jamie Lokier
1999-06-05 14:43                                         ` Martin v. Loewis
1999-06-30 15:43                                           ` Martin v. Loewis
1999-06-05 16:53                                         ` mark
1999-06-07  2:36                                           ` Jamie Lokier
1999-06-07  8:04                                             ` mark
1999-06-30 15:43                                               ` mark
1999-06-30 15:43                                             ` Jamie Lokier
1999-06-30 15:43                                           ` mark
1999-06-30 15:43                                         ` Jamie Lokier
1999-06-30 15:43                                       ` mark
1999-06-30 15:43                                     ` Andi Kleen
1999-06-06 23:12                                   ` f77 vs type based alias analysis Jeffrey A Law
1999-06-30 15:43                                     ` Jeffrey A Law
1999-06-06 23:20                                   ` Linux and aliasing? Jeffrey A Law
1999-06-30 15:43                                     ` Jeffrey A Law
1999-06-30 15:43                                   ` Toon Moene
1999-06-30 15:43                                 ` Jeffrey A Law
1999-06-05  4:05                               ` Andi Kleen
1999-06-30 15:43                                 ` Andi Kleen
1999-06-30 15:43                               ` Toon Moene
1999-06-30 15:43                             ` Jeffrey A Law
1999-06-04 11:49                           ` Linus Torvalds
1999-06-04 13:03                             ` Gabriel Dos_Reis
1999-06-04 13:13                               ` Joe Buck
1999-06-30 15:43                                 ` Joe Buck
1999-06-30 15:43                               ` Gabriel Dos_Reis
1999-06-30 15:43                             ` Linus Torvalds
1999-06-04 12:59                           ` Alexandre Oliva
1999-06-04 13:29                             ` Joe Buck
1999-06-04 13:39                               ` Alexandre Oliva
1999-06-30 15:43                                 ` Alexandre Oliva
1999-06-30 15:43                               ` Joe Buck
1999-06-30 15:43                             ` Alexandre Oliva
1999-06-30 15:43                           ` Joe Buck
1999-06-30 15:43                         ` Linus Torvalds
1999-06-30 15:43                       ` craig
1999-06-30 15:43                     ` Linus Torvalds
1999-06-30 15:43                   ` Joe Buck
1999-06-04  5:47                 ` craig
1999-06-30 15:43                   ` craig
1999-06-04  7:11                 ` mark
1999-06-04  8:38                   ` Linus Torvalds
1999-06-30 15:43                     ` Linus Torvalds
1999-06-30 15:43                   ` mark
1999-06-04  8:41                 ` Tim Hollebeek
1999-06-04  8:53                   ` Jeffrey A Law
1999-06-30 15:43                     ` Jeffrey A Law
1999-06-30 15:43                   ` Tim Hollebeek
1999-06-30 15:43                 ` Linus Torvalds
1999-06-30 15:43               ` mark
1999-06-04  5:47             ` craig
1999-06-04  8:17               ` Linus Torvalds
1999-06-04  8:49                 ` craig
1999-06-04  8:57                   ` Linus Torvalds
1999-06-04  9:02                     ` Jean-Pierre Radley
1999-06-30 15:43                       ` Jean-Pierre Radley
1999-06-30 15:43                     ` Linus Torvalds
1999-06-30 15:43                   ` craig
1999-06-30 15:43                 ` Linus Torvalds
1999-06-30 15:43               ` craig
1999-06-04  8:39             ` Tim Hollebeek
1999-06-04  8:55               ` Linus Torvalds
1999-06-04 15:20                 ` Richard Henderson
1999-06-05  9:50                   ` Linus Torvalds
1999-06-05 11:00                     ` mark
1999-06-06 10:30                       ` Linus Torvalds
1999-06-06 10:44                         ` mark
1999-06-06 14:17                           ` Linus Torvalds
1999-06-06 17:41                             ` mark
1999-06-07  8:58                               ` Linus Torvalds
1999-06-07  9:18                                 ` mark
1999-06-07  9:29                                   ` Linus Torvalds
1999-06-07  9:38                                     ` Tim Hollebeek
1999-06-07 10:05                                       ` Jamie Lokier
1999-06-30 15:43                                         ` Jamie Lokier
1999-06-07 10:44                                       ` Linus Torvalds
1999-06-07 11:22                                         ` Jeffrey A Law
1999-06-08  1:34                                           ` Nick Ing-Simmons
1999-06-08  1:48                                             ` Jeffrey A Law
1999-06-30 15:43                                               ` Jeffrey A Law
1999-06-30 15:43                                             ` Nick Ing-Simmons
1999-06-30 15:43                                           ` Jeffrey A Law
1999-06-30 15:43                                         ` Linus Torvalds
1999-06-30 15:43                                       ` Tim Hollebeek
1999-06-30 15:43                                     ` Linus Torvalds
1999-06-30 15:43                                   ` mark
1999-06-07 13:34                                 ` Jamie Lokier
1999-06-30 15:43                                   ` Jamie Lokier
1999-06-30 15:43                                 ` Linus Torvalds
1999-06-30 15:43                               ` mark
1999-06-30 15:43                             ` Linus Torvalds
1999-06-30 15:43                           ` mark
1999-06-30 15:43                         ` Linus Torvalds
1999-06-30 15:43                       ` mark
1999-06-30 15:43                     ` Linus Torvalds
1999-06-30 15:43                   ` Richard Henderson
1999-06-30 15:43                 ` Linus Torvalds
1999-06-30 15:43               ` Tim Hollebeek
1999-06-04 15:02             ` Richard Henderson
1999-06-04 16:50               ` Bernd Schmidt
1999-06-30 15:43                 ` Bernd Schmidt
1999-06-05  9:35               ` Linus Torvalds
1999-06-05 13:34                 ` Richard Henderson
1999-06-05 18:40                   ` Linus Torvalds
1999-06-30 15:43                     ` Linus Torvalds
1999-06-05 21:38                   ` Jakub Jelinek
1999-06-30 15:43                     ` Jakub Jelinek
1999-06-30 15:43                   ` Richard Henderson
1999-06-30 15:43                 ` Linus Torvalds
1999-06-30 15:43               ` Richard Henderson
1999-06-30 15:43             ` Linus Torvalds
1999-06-03 23:53           ` Martin v. Loewis
1999-06-30 15:43             ` Martin v. Loewis
     [not found]           ` <v04205101b37d700fbf8d@[192.168.1.254]>
1999-06-04  7:01             ` craig
1999-06-30 15:43               ` craig
1999-06-30 15:43           ` craig
1999-06-30 15:43         ` David S. Miller
1999-06-03 13:31       ` Andi Kleen
1999-06-30 15:43         ` Andi Kleen
1999-06-30 15:43       ` mark
1999-06-30 15:43     ` David S. Miller
1999-06-03 12:02   ` Andi Kleen
1999-06-03 15:38     ` Martin v. Loewis
1999-06-30 15:43       ` Martin v. Loewis
1999-06-30 15:43     ` Andi Kleen
1999-06-30 15:43   ` mark
1999-06-30 15:43 ` Chip Salzenberg
1999-06-04 11:54 Mike Stump
1999-06-04 12:13 ` Jeffrey A Law
1999-06-04 13:25   ` Sylvain Pion
1999-06-04 13:32     ` Jeffrey A Law
1999-06-30 15:43       ` Jeffrey A Law
1999-06-30 15:43     ` Sylvain Pion
1999-06-30 15:43   ` Jeffrey A Law
1999-06-30 15:43 ` Mike Stump
1999-06-06 15:08 Ross Harvey
1999-06-06 15:46 ` Linus Torvalds
1999-06-30 15:43   ` Linus Torvalds
1999-06-06 17:29 ` David S. Miller
1999-06-30 15:43   ` David S. Miller
1999-06-30 15:43 ` Ross Harvey
