public inbox for gcc@gcc.gnu.org
* Re: signed is undefined and has been since 1992 (in GCC)
@ 2005-06-28 16:59 Morten Welinder
  2005-06-28 17:23 ` Olivier Galibert
  2005-06-28 18:44 ` Michael Veksler
  0 siblings, 2 replies; 119+ messages in thread
From: Morten Welinder @ 2005-06-28 16:59 UTC (permalink / raw)
  To: galibert; +Cc: gcc

> In particular, a very large number of C and C++ programs are written
> with the assumptions:

>- signed and unsigned types are modulo, except in loop induction
> variables where it's bad taste

Well, as demonstrated by INT_MIN/-1, gcc has NEVER fulfilled such assumptions
on i86 and, quite likely, neither has any other compiler, nor will one.  The
runtime penalty would be too big and hurt performance numbers.

What I believe you can find examples of is the more restricted claim that
"addition and perhaps subtraction of signed numbers is modulo" being
assumed.  That's cheap since (for two's complement) signed addition is the same
operation as unsigned addition.

Morten

^ permalink raw reply	[flat|nested] 119+ messages in thread
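
A minimal sketch of the two cases distinguished in the message above, assuming an `int` with a two's-complement representation and no padding bits. The helper names `wrapping_add` and `checked_div` are hypothetical and do not come from the thread. Addition is carried out in `unsigned`, where wraparound is defined, and mapped back by hand; the division helper refuses the `INT_MIN / -1` case instead of executing it.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Modulo-2^N addition of two ints, computed in unsigned arithmetic
   (which is defined to wrap) and mapped back to int without relying
   on signed overflow or on an out-of-range conversion. */
int wrapping_add(int a, int b)
{
    unsigned int u = (unsigned int)a + (unsigned int)b;
    if (u <= (unsigned int)INT_MAX)
        return (int)u;
    /* u stands for the negative value u - 2^N, which equals -(~u) - 1. */
    return -(int)(~u) - 1;
}

/* Division that reports failure instead of executing the two cases
   that are undefined in C: division by zero and INT_MIN / -1. */
bool checked_div(int a, int b, int *quotient)
{
    assert(quotient != NULL);
    if (b == 0 || (a == INT_MIN && b == -1))
        return false;
    *quotient = a / b;
    return true;
}
```

Under these assumptions `wrapping_add(INT_MAX, 1)` yields `INT_MIN`, while `checked_div(INT_MIN, -1, &q)` returns false and leaves `q` untouched.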

* Re: signed is undefined and has been since 1992 (in GCC)
  2005-06-28 16:59 signed is undefined and has been since 1992 (in GCC) Morten Welinder
@ 2005-06-28 17:23 ` Olivier Galibert
  2005-06-28 18:44 ` Michael Veksler
  1 sibling, 0 replies; 119+ messages in thread
From: Olivier Galibert @ 2005-06-28 17:23 UTC (permalink / raw)
  To: Morten Welinder; +Cc: gcc

On Tue, Jun 28, 2005 at 12:59:10PM -0400, Morten Welinder wrote:
> > In particular, a very large number of C and C++ programs are written
> > with the assumptions:
> 
> >- signed and unsigned types are modulo, except in loop induction
> > variables where it's bad taste
> 
> Well, as demonstrated by INT_MIN/-1, gcc has NEVER fulfilled such assumptions
> on i86 and, quite likely, neither has or will any other compiler.   The runtime
> penalty would be too big and hurt performance numbers.

I meant "bad taste to rely on that", sorry.  Not that it shouldn't
overflow, but rather that overflowing shouldn't happen in a
well-behaving program and the compiler is allowed to go and slap
you if it happens.


> What I believe you can find examples of is that the more restricted claim of
> "addition and perhaps subtraction of signed numbers is modulo" is being
> assumed.  That's cheap since (for 2-complement) signed addition is the same
> operation as unsigned addition.

Yes.

  OG.

^ permalink raw reply	[flat|nested] 119+ messages in thread

* Re: signed is undefined and has been since 1992 (in GCC)
  2005-06-28 16:59 signed is undefined and has been since 1992 (in GCC) Morten Welinder
  2005-06-28 17:23 ` Olivier Galibert
@ 2005-06-28 18:44 ` Michael Veksler
  1 sibling, 0 replies; 119+ messages in thread
From: Michael Veksler @ 2005-06-28 18:44 UTC (permalink / raw)
  To: Morten Welinder; +Cc: Joe Buck, gcc, galibert

Morten Welinder wrote on 28/06/2005 19:59:10:

> > In particular, a very large number of C and C++ programs are written
> > with the assumptions:
>
> >- signed and unsigned types are modulo, except in loop induction
> > variables where it's bad taste
>
> Well, as demonstrated by INT_MIN/-1, gcc has NEVER fulfilled such assumptions
> on i86 and, quite likely, neither has or will any other compiler.
> The runtime penalty would be too big and hurt performance numbers.
>
> What I believe you can find examples of is that the more restricted claim of
> "addition and perhaps subtraction of signed numbers is modulo" is being
> assumed.  That's cheap since (for 2-complement) signed addition is the same
> operation as unsigned addition.
>
> Morten

This is problematic as Joe Buck has shown:
;    /* int a, b, c; */
;    if (b > 0) {
;            a = b + c;
;            int count=0;
;            for (int i = c; i <= a; i++)
;                count++;
;            some_func(count);
;    }

This can be optimized to
;    if (b > 0)
;            some_func(b+1);

only if you assume that int never overflows. Requiring operator++
to wrap on overflow would prohibit this optimization.

^ permalink raw reply	[flat|nested] 119+ messages in thread
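
A minimal sketch of why the closed form is only valid without overflow, assuming 32-bit values. The function names are hypothetical. The first function runs the loop literally, the second is the closed form, and the third models what the loop would count if `b + c` wrapped modulo 2^32; the wrap is emulated in 64-bit arithmetic so that the sketch itself never overflows.

```c
#include <assert.h>
#include <stdint.h>

/* The loop from the example, executed literally.  The caller must
   ensure that b + c stays below INT32_MAX, so that neither the
   addition nor i++ overflows. */
int64_t count_by_loop(int32_t b, int32_t c)
{
    assert(b > 0);
    int32_t a = b + c;
    int64_t count = 0;
    for (int32_t i = c; i <= a; i++)
        count++;
    return count;
}

/* The closed form an optimizer may substitute for the loop. */
int64_t count_closed_form(int32_t b)
{
    return (int64_t)b + 1;
}

/* What the loop would count if b + c wrapped modulo 2^32.  The sum is
   formed in 64 bits and reduced by hand.  Assumes b > 0 and that the
   bound is not exactly INT32_MAX (a wrapping i++ would then never
   terminate the loop). */
int64_t count_if_wrapping(int32_t b, int32_t c)
{
    int64_t bound = (int64_t)b + c;
    if (bound > INT32_MAX)
        bound -= 4294967296LL;      /* wrapped to a negative value */
    /* The loop body runs for i = c .. bound, or not at all if bound < c. */
    return (bound >= c) ? bound - c + 1 : 0;
}
```

For `b = 3, c = 10` all three agree on 4. For `b = 1, c = INT32_MAX` the wrapping model gives 0 iterations while the closed form gives 2, which is the discrepancy the message points at.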

* Re: signed is undefined and has been since 1992 (in GCC)
  2005-07-02 17:15                                         ` Florian Weimer
                                                             ` (2 preceding siblings ...)
  2005-07-02 23:20                                           ` Robert Dewar
@ 2005-07-14  7:21                                           ` Marc Espie
  3 siblings, 0 replies; 119+ messages in thread
From: Marc Espie @ 2005-07-14  7:21 UTC (permalink / raw)
  To: gcc

In article <8764vt2kq3.fsf@deneb.enyo.de> you write:
>Both OpenSSL and Apache programmers did this, in carefully reviewed
>code which was written in response to a security report.  They simply
>didn't know that there is a potential problem.  The reason for this
>gap in knowledge isn't quite clear to me.

Well, it's reasonably clear to me.

I've been reviewing code for the OpenBSD project; it's incredible the
number of errors you can find in code which is supposed to
- have been written by competent programmers;
- have been reviewed by tens of people.

Quite simply, formal code reviews in free software don't work. The `many
eyes' paradigm is a fallacy. Ten persons can look at the same code and
fail to notice a problem if they don't look for the right thing.

A lot of people don't even think about overflows when they look at
arithmetic; there are a lot of integer overflows out there.

I still routinely find off-by-one accesses in buffers, some of them
quite obvious. The only reason I see them is that my malloc can put
allocations on page boundaries, and thus the program barfs here, and not
on other machines.

A lot of people don't know about the peculiarities of C signed
arithmetic.

A lot of `portable' code that uses C arithmetic buries such
peculiarities under tons of macros and typedefs such that it is really
hard to figure out what's going on even if you understand the issues.
From past experience, both Apache and OpenSSL are very bad in that
regard.

Bottom line: if it passes tests on major architectures and major
OSes, it's very unlikely that someone will notice something is amiss,
and that the same someone will have the knowledge to fix it. If it
passes all practical tests but is incorrect from a language point of
view, it is even more unlikely.

^ permalink raw reply	[flat|nested] 119+ messages in thread

* Re: signed is undefined and has been since 1992 (in GCC)
  2005-07-03  9:54                                               ` Robert Dewar
  2005-07-03 10:02                                                 ` Florian Weimer
@ 2005-07-03 12:01                                                 ` Gabriel Dos Reis
  1 sibling, 0 replies; 119+ messages in thread
From: Gabriel Dos Reis @ 2005-07-03 12:01 UTC (permalink / raw)
  To: Robert Dewar
  Cc: Florian Weimer, Olivier Galibert, Dave Korn,
	'Andrew Haley', 'Andrew Pinski',
	'gcc mailing list'

Robert Dewar <dewar@adacore.com> writes:

[...]

| > The issue is whether they need to become expert in red herrings or just
| > know how to write good and correct programs.  Interestingly, back in
| > the old days K&R put emphasis on how to write good and useful programs
| > rather than academic exercises in "undefined behaviour".
| 
| Actually K&R, which of course is definitely NOT a standard, and did indeed
| leave a lot of things undefined, not deliberately, but just by imprecision
| or omission, is, nevertheless, more precise than people think, and did
| introduce the notion of undefined constructs.

Notice that it is one thing to know about undefined constructs, and
another to know how to write correct code.  And that in turn is
different from being imprecise.  That was the point.

| I find it surprising that you would disagree with the proposition that
| more C programmers should be familiar with the C standard.
| 
| And I must say, that if you think the worry about undefined behavior is
| academic, I have to strongly disagree. You may be able to take this

What is surprising to me is how you reach such conclusions.

-- Gaby

^ permalink raw reply	[flat|nested] 119+ messages in thread

* Re: signed is undefined and has been since 1992 (in GCC)
  2005-07-03 10:02                                                 ` Florian Weimer
@ 2005-07-03 10:10                                                   ` Robert Dewar
  0 siblings, 0 replies; 119+ messages in thread
From: Robert Dewar @ 2005-07-03 10:10 UTC (permalink / raw)
  To: Florian Weimer
  Cc: Gabriel Dos Reis, Olivier Galibert, Dave Korn,
	'Andrew Haley', 'Andrew Pinski',
	'gcc mailing list'

Florian Weimer wrote:
> * Robert Dewar:

>>Making programs bug free has more to it than understanding the language
>>you are writing in, but it is a useful step forward to avoid problems
>>that come from simply not knowing the rules of the language you are
>>writing in (I can't guarantee that GNAT is bug free in that regard,
>>but I can't remember a case where a bug stemmed from this source).

> There was some dependency on argument order evaluation in GNAT, but
> this was part of GIGI, so it's not the best example.

Indeed, given that GIGI is written in C, by people who have not read the
C standard :-) In practice though I suspect this was just a glitch.
Depending on evaluation order is a bug that can be committed accidentally
even by those who know well that such things are undefined or, as in Ada,
non-deterministic. The trouble is that levels of abstraction often obscure
such errors (just abstracting things into a macro in C can sometimes have
that effect).


^ permalink raw reply	[flat|nested] 119+ messages in thread
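
A minimal sketch of the kind of evaluation-order dependency described in the message above. The `cursor` type and all function names are hypothetical and have nothing to do with the actual GIGI code. The order in which C evaluates function arguments is unspecified, so a call such as `subtract(next_value(c), next_value(c))` may read the two elements in either order; naming the temporaries pins the order down.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical cursor over an int array, for illustration only. */
struct cursor {
    const int *data;
    size_t     pos;
    size_t     len;
};

/* Returns the next element and advances the cursor: a side effect. */
int next_value(struct cursor *c)
{
    assert(c != NULL && c->pos < c->len);
    return c->data[c->pos++];
}

int subtract(int x, int y)
{
    return x - y;
}

/* Order-dependent form, shown only in this comment:
       subtract(next_value(c), next_value(c))
   Either argument may be evaluated first.  Separate statements are
   sequenced, so the version below always reads first, then second. */
int first_minus_second(struct cursor *c)
{
    int first  = next_value(c);
    int second = next_value(c);
    return subtract(first, second);
}
```

If the order-dependent call were hidden behind a macro, nothing at the call site would show that two side effects are involved, which is the obscuring effect of abstraction mentioned above.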

* Re: signed is undefined and has been since 1992 (in GCC)
  2005-07-03  9:54                                               ` Robert Dewar
@ 2005-07-03 10:02                                                 ` Florian Weimer
  2005-07-03 10:10                                                   ` Robert Dewar
  2005-07-03 12:01                                                 ` Gabriel Dos Reis
  1 sibling, 1 reply; 119+ messages in thread
From: Florian Weimer @ 2005-07-03 10:02 UTC (permalink / raw)
  To: Robert Dewar
  Cc: Gabriel Dos Reis, Olivier Galibert, Dave Korn,
	'Andrew Haley', 'Andrew Pinski',
	'gcc mailing list'

* Robert Dewar:

> Making programs bug free has more to it than understanding the language
> you are writing in, but it is a useful step forward to avoid problems
> that come from simply not knowing the rules of the language you are
> writing in (I can't guarantee that GNAT is bug free in that regard,
> but I can't remember a case where a bug stemmed from this source).

There was some dependency on argument order evaluation in GNAT, but
this was part of GIGI, so it's not the best example.

^ permalink raw reply	[flat|nested] 119+ messages in thread

* Re: signed is undefined and has been since 1992 (in GCC)
  2005-07-03  0:13                                             ` Gabriel Dos Reis
@ 2005-07-03  9:54                                               ` Robert Dewar
  2005-07-03 10:02                                                 ` Florian Weimer
  2005-07-03 12:01                                                 ` Gabriel Dos Reis
  0 siblings, 2 replies; 119+ messages in thread
From: Robert Dewar @ 2005-07-03  9:54 UTC (permalink / raw)
  To: Gabriel Dos Reis
  Cc: Florian Weimer, Olivier Galibert, Dave Korn,
	'Andrew Haley', 'Andrew Pinski',
	'gcc mailing list'

Gabriel Dos Reis wrote:

> Then, one wonders why the GNAT is not bug free ;-p

Making programs bug free has more to it than understanding the language
you are writing in, but it is a useful step forward to avoid problems
that come from simply not knowing the rules of the language you are
writing in (I can't guarantee that GNAT is bug free in that regard,
but I can't remember a case where a bug stemmed from this source).
> 
> | Back in the days of Algol-60 absolutely everyone read the report. Then
> | we went through an era of standards which few people read (how many
> | fortran programmers read the fortran standard, cobol programmers
> | read the cobol standard, c programmers read the c standard etc). A
> | rather nice achievement with Ada is that the standard is indeed a
> | reference book that all Ada programmers have on their shelf and
> | even though not all have read it through, they know it is the
> 
> oh, so it suffices to have it?  Not to understand it?

Well it's not a binary issue. Sure, there are parts of the Ada
standard that are difficult even for me, and certainly I would
not expect a typical Ada programmer to be able to read and
understand all of it. On the other hand, large parts are accessible,
and, much more important, are in practice accessed.

> The issue is whether they need to become expert in red herrings or just
> know how to write good and correct programs.  Interestingly, back in
> the old days K&R put emphasis on how to write good and useful programs
> rather than academic exercises in "undefined behaviour".

Actually K&R, which of course is definitely NOT a standard, and did indeed
leave a lot of things undefined, not deliberately, but just by imprecision
or omission, is, nevertheless, more precise than people think, and did
introduce the notion of undefined constructs.

I find it surprising that you would disagree with the proposition that
more C programmers should be familiar with the C standard.

And I must say, that if you think the worry about undefined behavior is
academic, I have to strongly disagree. You may be able to take this
attitude as a programmer who stays away from marginal cases, but compiler
writers have no choice but to worry about the marginal cases and get them
right, since one programmer's marginal case is another programmer's normal
paradigm.
> 
> -- Gaby



^ permalink raw reply	[flat|nested] 119+ messages in thread

* Re: signed is undefined and has been since 1992 (in GCC)
  2005-07-03  0:07                                               ` Gabriel Dos Reis
@ 2005-07-03  9:49                                                 ` Robert Dewar
  0 siblings, 0 replies; 119+ messages in thread
From: Robert Dewar @ 2005-07-03  9:49 UTC (permalink / raw)
  To: Gabriel Dos Reis
  Cc: Florian Weimer, Olivier Galibert, Dave Korn,
	'Andrew Haley', 'Andrew Pinski',
	'gcc mailing list'

Gabriel Dos Reis wrote:

> You may not have noticed but this issue is primarily about C and C++
> and we're discussing what the relevant standard says and what
> engineering decision we would take.  Please, let's not get into
> more distractions.  We already have plenty of them (many originating
> from you).  

Actually the more general issue of undefined is pretty language
independent, and indeed it is important not to misuse the word
illegal for this purpose. The mention of courts is really beside
the point. The important point is to stick to the precise
terminology of the appropriate standard when talking about
incorrect programs, whatever the language.



^ permalink raw reply	[flat|nested] 119+ messages in thread

* Re: signed is undefined and has been since 1992 (in GCC)
  2005-07-02 23:20                                           ` Robert Dewar
@ 2005-07-03  0:13                                             ` Gabriel Dos Reis
  2005-07-03  9:54                                               ` Robert Dewar
  0 siblings, 1 reply; 119+ messages in thread
From: Gabriel Dos Reis @ 2005-07-03  0:13 UTC (permalink / raw)
  To: Robert Dewar
  Cc: Florian Weimer, Olivier Galibert, Dave Korn,
	'Andrew Haley', 'Andrew Pinski',
	'gcc mailing list'

Robert Dewar <dewar@adacore.com> writes:

| Florian Weimer wrote:
| 
| > Probably it's hard to accept for hard-core C coders that a program
| > which generates correct machine code with all GCC versions released so
| > far (modulo bugs in GCC) can still be illegal C and exhibit undefined
| > behavior.  IIRC, I needed quite some time to realize the full impact
| > of this distinction.
| 
| Note that even making things implementation defined does not help the
| problem of learning by example from one implementation. It really is
| a good idea for people programming in language X to learn language X :-)

Then, one wonders why the GNAT is not bug free ;-p

| Back in the days of Algol-60 absolutely everyone read the report. Then
| we went through an era of standards which few people read (how many
| fortran programmers read the fortran standard, cobol programmers
| read the cobol standard, c programmers read the c standard etc). A
| rather nice achievement with Ada is that the standard is indeed a
| reference book that all Ada programmers have on their shelf and
| even though not all have read it through, they know it is the

oh, so it suffices to have it?  Not to understand it?

| important ultimate reference standard of what is and what is not
| allowed in valid programs, and you would be hard put to find a
| professional Ada programmer who has not frequently reached for
| the standard to look something up. In a big class of programmers
| nearly all of whom had done professional C programming a couple
| of years ago, only 2 out of 94 had held the C standard in their
| hands.
| 
| It's not an easy document, but it's also not that hard, it would
| be nice to promote its use more!

The issue is whether they need to become expert in red herrings or just
know how to write good and correct programs.  Interestingly, back in
the old days K&R put emphasis on how to write good and useful programs
rather than academic exercises in "undefined behaviour".

-- Gaby

^ permalink raw reply	[flat|nested] 119+ messages in thread

* Re: signed is undefined and has been since 1992 (in GCC)
  2005-07-02 23:20                                             ` Robert Dewar
@ 2005-07-03  0:07                                               ` Gabriel Dos Reis
  2005-07-03  9:49                                                 ` Robert Dewar
  0 siblings, 1 reply; 119+ messages in thread
From: Gabriel Dos Reis @ 2005-07-03  0:07 UTC (permalink / raw)
  To: Robert Dewar
  Cc: Florian Weimer, Olivier Galibert, Dave Korn,
	'Andrew Haley', 'Andrew Pinski',
	'gcc mailing list'

Robert Dewar <dewar@adacore.com> writes:

| Gabriel Dos Reis wrote:
| 
| > We need to be careful not to substitute "illegal" for "undefined
| > behaviour". GCC is not a court.
| 
| Note that in Ada,

You may not have noticed but this issue is primarily about C and C++
and we're discussing what the relevant standard says and what
engineering decision we would take.  Please, let's not get into
more distractions.  We already have plenty of them (many originating
from you).  

-- Gaby

^ permalink raw reply	[flat|nested] 119+ messages in thread

* Re: signed is undefined and has been since 1992 (in GCC)
  2005-07-02 17:15                                         ` Florian Weimer
  2005-07-02 18:59                                           ` Gabriel Dos Reis
  2005-07-02 23:12                                           ` Nicholas Nethercote
@ 2005-07-02 23:20                                           ` Robert Dewar
  2005-07-03  0:13                                             ` Gabriel Dos Reis
  2005-07-14  7:21                                           ` Marc Espie
  3 siblings, 1 reply; 119+ messages in thread
From: Robert Dewar @ 2005-07-02 23:20 UTC (permalink / raw)
  To: Florian Weimer
  Cc: Olivier Galibert, Dave Korn, 'Andrew Haley',
	'Gabriel Dos Reis', 'Andrew Pinski',
	'gcc mailing list'

Florian Weimer wrote:

> Probably it's hard to accept for hard-core C coders that a program
> which generates correct machine code with all GCC versions released so
> far (modulo bugs in GCC) can still be illegal C and exhibit undefined
> behavior.  IIRC, I needed quite some time to realize the full impact
> of this distinction.

Note that even making things implementation defined does not help the
problem of learning by example from one implementation. It really is
a good idea for people programming in language X to learn language X :-)

Back in the days of Algol-60 absolutely everyone read the report. Then
we went through an era of standards which few people read (how many
fortran programmers read the fortran standard, cobol programmers
read the cobol standard, c programmers read the c standard etc). A
rather nice achievement with Ada is that the standard is indeed a
reference book that all Ada programmers have on their shelf and
even though not all have read it through, they know it is the
important ultimate reference standard of what is and what is not
allowed in valid programs, and you would be hard put to find a
professional Ada programmer who has not frequently reached for
the standard to look something up. In a big class of programmers
nearly all of whom had done professional C programming a couple
of years ago, only 2 out of 94 had held the C standard in their
hands.

It's not an easy document, but it's also not that hard, it would
be nice to promote its use more!



^ permalink raw reply	[flat|nested] 119+ messages in thread

* Re: signed is undefined and has been since 1992 (in GCC)
  2005-07-02 18:59                                           ` Gabriel Dos Reis
@ 2005-07-02 23:20                                             ` Robert Dewar
  2005-07-03  0:07                                               ` Gabriel Dos Reis
  0 siblings, 1 reply; 119+ messages in thread
From: Robert Dewar @ 2005-07-02 23:20 UTC (permalink / raw)
  To: Gabriel Dos Reis
  Cc: Florian Weimer, Olivier Galibert, Dave Korn,
	'Andrew Haley', 'Andrew Pinski',
	'gcc mailing list'

Gabriel Dos Reis wrote:

> We need to be careful not to substitute "illegal" for "undefined
> behaviour". GCC is not a court.

Note that in Ada, illegal is a technical term, it refers to a program
that fails to meet the syntactic or static semantic rules for a correct
Ada program, and must be rejected by an Ada compiler at compile time.


^ permalink raw reply	[flat|nested] 119+ messages in thread

* Re: signed is undefined and has been since 1992 (in GCC)
  2005-07-02 17:15                                         ` Florian Weimer
  2005-07-02 18:59                                           ` Gabriel Dos Reis
@ 2005-07-02 23:12                                           ` Nicholas Nethercote
  2005-07-02 23:20                                           ` Robert Dewar
  2005-07-14  7:21                                           ` Marc Espie
  3 siblings, 0 replies; 119+ messages in thread
From: Nicholas Nethercote @ 2005-07-02 23:12 UTC (permalink / raw)
  To: Florian Weimer
  Cc: Robert Dewar, Olivier Galibert, Dave Korn, 'Andrew Haley',
	'Gabriel Dos Reis', 'Andrew Pinski',
	'gcc mailing list'

On Sat, 2 Jul 2005, Florian Weimer wrote:

>> I am puzzled, why would *ANYONE* who knows C use int
>> rather than unsigned if they want wrap around semantics?
>
> Both OpenSSL and Apache programmers did this, in carefully reviewed
> code which was written in response to a security report.  They simply
> didn't know that there is a potential problem.  The reason for this
> gap in knowledge isn't quite clear to me.

I've done a lot of C programming in the last three years, and for my day 
job I'm working on a C compiler (albeit in parts that are not very C 
specific), and I didn't know that signed overflow is undefined.  Why not? 
I guess I never heard otherwise and I just assumed it would wrap due to 
two's complement arithmetic.  I don't think I've ever written a serious C 
program that required wrap-around on overflow, though.

Nick

^ permalink raw reply	[flat|nested] 119+ messages in thread

* Re: signed is undefined and has been since 1992 (in GCC)
  2005-07-02 17:15                                         ` Florian Weimer
@ 2005-07-02 18:59                                           ` Gabriel Dos Reis
  2005-07-02 23:20                                             ` Robert Dewar
  2005-07-02 23:12                                           ` Nicholas Nethercote
                                                             ` (2 subsequent siblings)
  3 siblings, 1 reply; 119+ messages in thread
From: Gabriel Dos Reis @ 2005-07-02 18:59 UTC (permalink / raw)
  To: Florian Weimer
  Cc: Robert Dewar, Olivier Galibert, Dave Korn, 'Andrew Haley',
	'Andrew Pinski', 'gcc mailing list'

Florian Weimer <fw@deneb.enyo.de> writes:

| * Robert Dewar:
| 
| > I am puzzled, why would *ANYONE* who knows C use int
| > rather than unsigned if they want wrap around semantics?
| 
| Both OpenSSL and Apache programmers did this, in carefully reviewed
| code which was written in response to a security report.  They simply
| didn't know that there is a potential problem.  The reason for this
| gap in knowledge isn't quite clear to me.
| 
| Probably it's hard to accept for hard-core C coders that a program
| which generates correct machine code with all GCC versions released so
| far (modulo bugs in GCC) can still be illegal C and exhibit undefined

We need to be careful not to substitute "illegal" for "undefined
behaviour". GCC is not a court.
Apart from that, I maintain that we should not apply "undefined
behaviour" wholesale.

-- Gaby

^ permalink raw reply	[flat|nested] 119+ messages in thread

* Re: signed is undefined and has been since 1992 (in GCC)
  2005-06-28 19:21                                       ` Robert Dewar
  2005-06-28 20:18                                         ` Paul Koning
  2005-06-28 21:53                                         ` Michael Veksler
@ 2005-07-02 17:15                                         ` Florian Weimer
  2005-07-02 18:59                                           ` Gabriel Dos Reis
                                                             ` (3 more replies)
  2 siblings, 4 replies; 119+ messages in thread
From: Florian Weimer @ 2005-07-02 17:15 UTC (permalink / raw)
  To: Robert Dewar
  Cc: Olivier Galibert, Dave Korn, 'Andrew Haley',
	'Gabriel Dos Reis', 'Andrew Pinski',
	'gcc mailing list'

* Robert Dewar:

> I am puzzled, why would *ANYONE* who knows C use int
> rather than unsigned if they want wrap around semantics?

Both OpenSSL and Apache programmers did this, in carefully reviewed
code which was written in response to a security report.  They simply
didn't know that there is a potential problem.  The reason for this
gap in knowledge isn't quite clear to me.

Probably it's hard to accept for hard-core C coders that a program
which generates correct machine code with all GCC versions released so
far (modulo bugs in GCC) can still be illegal C and exhibit undefined
behavior.  IIRC, I needed quite some time to realize the full impact
of this distinction.

^ permalink raw reply	[flat|nested] 119+ messages in thread
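
A minimal sketch of an overflow check that does not itself rely on signed wraparound. The thread does not show the OpenSSL or Apache code in question, so this is not a reconstruction of it; the function names are hypothetical. The idea is to compare against `INT_MAX` and `INT_MIN` before adding, rather than adding first and testing whether the result "went backwards".

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Tests whether a + b would overflow, without performing the addition.
   A post-hoc test such as "a + b < a" is only meaningful if signed
   addition wraps, which the C standard does not guarantee. */
bool add_would_overflow(int a, int b)
{
    if (b > 0)
        return a > INT_MAX - b;   /* INT_MAX - b cannot overflow here */
    return a < INT_MIN - b;       /* b <= 0, so INT_MIN - b cannot overflow */
}

/* Stores a + b in *sum and returns true, or returns false on overflow
   and leaves *sum untouched. */
bool checked_add(int a, int b, int *sum)
{
    assert(sum != NULL);
    if (add_would_overflow(a, b))
        return false;
    *sum = a + b;
    return true;
}
```

When wraparound is what the program actually wants, doing the arithmetic in an unsigned type is the defined alternative, as the question quoted at the top of the message suggests.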

* Re: signed is undefined and has been since 1992 (in GCC)
  2005-06-28 17:36                               ` Dave Korn
  2005-06-28 18:02                                 ` Olivier Galibert
@ 2005-07-02 17:06                                 ` Florian Weimer
  1 sibling, 0 replies; 119+ messages in thread
From: Florian Weimer @ 2005-07-02 17:06 UTC (permalink / raw)
  To: Dave Korn
  Cc: 'Olivier Galibert', 'Andrew Haley',
	'Robert Dewar', 'Gabriel Dos Reis',
	'Andrew Pinski', 'gcc mailing list'

* Dave Korn:

>   It certainly wasn't meant to be.  It was meant to be a dispassionate
> description of the state of facts.  Software that violates the C standard
> just *is* "buggy" or "incorrect",

Not if a GCC extension makes it legal code.  And actually, I believe a
GCC extension which basically boils down to -fwrapv by default makes sense
because so much existing code the free software community has written
(including critical code paths which fix security bugs) implicitly
relies on -fwrapv.

^ permalink raw reply	[flat|nested] 119+ messages in thread

* Re: signed is undefined and has been since 1992 (in GCC)
  2005-06-28 22:16                                       ` Falk Hueffner
@ 2005-06-29  6:59                                         ` Eric Botcazou
  0 siblings, 0 replies; 119+ messages in thread
From: Eric Botcazou @ 2005-06-29  6:59 UTC (permalink / raw)
  To: Falk Hueffner; +Cc: gcc, Joseph S. Myers, Robert Dewar

> > Does -ftrapv ever take advantage of trapping instructions where the
> > hardware has them available?
>
> Yes, for example Alpha.

OK, but it's the only example. :-)

> > Does anyone make substantial use of -ftrapv in production
>
> There are rarely bug reports about it; for example pointer difference
> with odd sized objects was broken for years, and reported only twice
> or so.

-ftrapv is simply not usable as of today because the performance degradation 
is abysmal.  Plus it is broken in very simple cases (PR middle-end/19020).

-- 
Eric Botcazou

^ permalink raw reply	[flat|nested] 119+ messages in thread

* Re: signed is undefined and has been since 1992 (in GCC)
  2005-06-29  2:19                                     ` Marcin Dalecki
@ 2005-06-29  3:13                                       ` Scott Robert Ladd
  0 siblings, 0 replies; 119+ messages in thread
From: Scott Robert Ladd @ 2005-06-29  3:13 UTC (permalink / raw)
  To: Marcin Dalecki
  Cc: Diego Novillo, Gabriel Dos Reis, Robert Dewar, gcc mailing list

Marcin Dalecki wrote:
> The only thing this thread teaches me is the conviction that *every*
> instruction set architecture which relies on compilers to make the most
> out of it is severely ill-guided.

I've learned that I'm not the only guy who can start a really long,
contentious thread on the GCC mailing list! ;)

..Scott

^ permalink raw reply	[flat|nested] 119+ messages in thread

* Re: signed is undefined and has been since 1992 (in GCC)
  2005-06-29  1:21                                   ` Diego Novillo
@ 2005-06-29  2:19                                     ` Marcin Dalecki
  2005-06-29  3:13                                       ` Scott Robert Ladd
  0 siblings, 1 reply; 119+ messages in thread
From: Marcin Dalecki @ 2005-06-29  2:19 UTC (permalink / raw)
  To: Diego Novillo; +Cc: Gabriel Dos Reis, Robert Dewar, gcc mailing list


On 2005-06-29, at 03:21, Diego Novillo wrote:

> On Wed, Jun 29, 2005 at 03:13:45AM +0200, Gabriel Dos Reis wrote:
>
>> Robert Dewar <dewar@adacore.com> writes:
>> |      You did not read anything even vaguely saying that in what I wrote.
>>
>> and you, did you?
>>
>>
> Folks, can you take this offline?  It's getting rather tiresome.

The only thing this thread teaches me is the conviction that *every*
instruction set architecture which relies on compilers to make the most
out of it is severely ill-guided.

^ permalink raw reply	[flat|nested] 119+ messages in thread

* Re: signed is undefined and has been since 1992 (in GCC)
  2005-06-29  1:14                                 ` Gabriel Dos Reis
@ 2005-06-29  1:21                                   ` Diego Novillo
  2005-06-29  2:19                                     ` Marcin Dalecki
  0 siblings, 1 reply; 119+ messages in thread
From: Diego Novillo @ 2005-06-29  1:21 UTC (permalink / raw)
  To: Gabriel Dos Reis; +Cc: Robert Dewar, gcc mailing list

On Wed, Jun 29, 2005 at 03:13:45AM +0200, Gabriel Dos Reis wrote:
> Robert Dewar <dewar@adacore.com> writes:
> |      You did not read anything even vaguely saying that in what I wrote.
> 
> and you, did you?
> 
Folks, can you take this offline?  It's getting rather tiresome.


Thanks.  Diego.

^ permalink raw reply	[flat|nested] 119+ messages in thread

* Re: signed is undefined and has been since 1992 (in GCC)
  2005-06-29  0:48                               ` Robert Dewar
@ 2005-06-29  1:14                                 ` Gabriel Dos Reis
  2005-06-29  1:21                                   ` Diego Novillo
  0 siblings, 1 reply; 119+ messages in thread
From: Gabriel Dos Reis @ 2005-06-29  1:14 UTC (permalink / raw)
  To: Robert Dewar; +Cc: Georg Bauhaus, gcc mailing list

Robert Dewar <dewar@adacore.com> writes:

| Gabriel Dos Reis wrote:
| > Robert Dewar <dewar@adacore.com> writes:
| >
| > | Gabriel Dos Reis wrote:
| > |
| > | >  C is
| > | > trustworthy (and preferred over SML for that crucial part of the proof
| > | > checker) because the mapping of the C code to the generated assembly
| > | > code is straightforward and amenable to inspection.
| > |
| > | This kind of traceability is of course vital for such applications, but
| > | it is by no means unique to C,
| >
| > Nobody claims it is unique to C.  You're after the wrong target.
| >
| > | and there is a big difference between saying
| > | that C is an assembly language, and that the mapping of C to assembly
| > | language is transparent.
| >
| > Oh, you denied any connection in previous message.
| 
| Not at all,

   > Please do remember that this is hardware dependent.  If you have
   > problems with x86, it does not mean you have the same with a PPC or a
   > Sparc.

   But the whole idea of hardware semantics is bogus, since you are
   assuming some connection between C and the hardware which does not
   exist. C is not an assembly language.

[...]

|      You did not read anything even vaguely saying that in what I wrote.

and you, did you?

-- Gaby

^ permalink raw reply	[flat|nested] 119+ messages in thread

* Re: signed is undefined and has been since 1992 (in GCC)
  2005-06-29  0:43                             ` Gabriel Dos Reis
@ 2005-06-29  0:48                               ` Robert Dewar
  2005-06-29  1:14                                 ` Gabriel Dos Reis
  0 siblings, 1 reply; 119+ messages in thread
From: Robert Dewar @ 2005-06-29  0:48 UTC (permalink / raw)
  To: Gabriel Dos Reis; +Cc: Georg Bauhaus, gcc mailing list

Gabriel Dos Reis wrote:
> Robert Dewar <dewar@adacore.com> writes:
> 
> | Gabriel Dos Reis wrote:
> | 
> | >  C is
> | > trustworthy (and preferred over SML for that crucial part of the proof
> | > checker) because the mapping of the C code to the generated assembly
> | > code is straightforward and amenable to inspection.
> | 
> | This kind of traceability is of course vital for such applications, but
> | it is by no means unique to C,
> 
> Nobody claims it is unique to C.  You're after the wrong target.
> 
> | and there is a big difference between saying
> | that C is an assembly language, and that the mapping of C to assembly
> | language is transparent.
> 
> Oh, you denied any connection in previous message.

Not at all, all languages are connected to assembly language in the sense
that you can write assembly language that corresponds to the semantics of
the language. How WYSIWYG the language is does indeed vary. C is very close
to meeting this criterion 100%, but that's a far cry from calling it an
assembly language itself. Of course I did not "deny any connection" with
asm. You did not read anything even vaguely saying that in what I wrote.

^ permalink raw reply	[flat|nested] 119+ messages in thread

* Re: signed is undefined and has been since 1992 (in GCC)
  2005-06-29  0:27                           ` Robert Dewar
@ 2005-06-29  0:43                             ` Gabriel Dos Reis
  2005-06-29  0:48                               ` Robert Dewar
  0 siblings, 1 reply; 119+ messages in thread
From: Gabriel Dos Reis @ 2005-06-29  0:43 UTC (permalink / raw)
  To: Robert Dewar; +Cc: Georg Bauhaus, gcc mailing list

Robert Dewar <dewar@adacore.com> writes:

| Gabriel Dos Reis wrote:
| 
| >  C is
| > trustworthy (and preferred over SML for that crucial part of the proof
| > checker) because the mapping of the C code to the generated assembly
| > code is straightforward and amenable to inspection.
| 
| This kind of traceability is of course vital for such applications, but
| it is by no means unique to C,

Nobody claims it is unique to C.  You're after the wrong target.

| and there is a big difference between saying
| that C is an assembly language, and that the mapping of C to assembly
| language is transparent.

Oh, you denied any connection in previous message.

-- Gaby

^ permalink raw reply	[flat|nested] 119+ messages in thread

* Re: signed is undefined and has been since 1992 (in GCC)
  2005-06-28 23:53                         ` Gabriel Dos Reis
@ 2005-06-29  0:27                           ` Robert Dewar
  2005-06-29  0:43                             ` Gabriel Dos Reis
  0 siblings, 1 reply; 119+ messages in thread
From: Robert Dewar @ 2005-06-29  0:27 UTC (permalink / raw)
  To: Gabriel Dos Reis; +Cc: Georg Bauhaus, gcc mailing list

Gabriel Dos Reis wrote:

>  C is
> trustworthy (and preferred over SML for that crucial part of the proof
> checker) because the mapping of the C code to the generated assembly
> code is straightforward and amenable to inspection.

This kind of traceability is of course vital for such applications, but
it is by no means unique to C, and there is a big difference between saying
that C is an assembly language, and that the mapping of C to assembly
language is transparent.

The latter quality incidentally, Tucker Taft describes as WYSIWYG for
programming languages, and it is an important design point for C I
would say, and indeed was an important design principle for Ada.

^ permalink raw reply	[flat|nested] 119+ messages in thread

* Re: signed is undefined and has been since 1992 (in GCC)
  2005-06-28 22:58                       ` Georg Bauhaus
@ 2005-06-28 23:53                         ` Gabriel Dos Reis
  2005-06-29  0:27                           ` Robert Dewar
  0 siblings, 1 reply; 119+ messages in thread
From: Gabriel Dos Reis @ 2005-06-28 23:53 UTC (permalink / raw)
  To: Georg Bauhaus; +Cc: gcc mailing list

Georg Bauhaus <bauhaus@futureapps.de> writes:

| Gabriel Dos Reis wrote:
| > Robert Dewar <dewar@adacore.com> writes:
| >
| > | Gabriel Dos Reis wrote:
| > |
| > | But the whole idea of hardware semantics is bogus, since you are
| > | assuming some connection between C and the hardware which does not
| > | exist. C is not an assembly language.
| >
| > If you live in a different world, you may not see the connection.
| 
| I'm not sure whether this isn't just C's self-fulfilling
| prophecy at work: Many C programmers think that C is a high level

Self-fulfilling prophecy or not, would you deny that processor
specific ABIs are published and programmers expect compilers to follow
them?  GCC itself relies on such a connection.

| assembly language (at least in my real world).
| They start using C as if it were...

Well, I would not say it is something unique to C programmers.
Recently we had a guest working on compiler verification and model
checking, deeply rooted in the ML community, explain to us the
architecture of their proof checker.  The heart of the proof checker,
the trusted part, is written in C instead of SML.  The question
was asked as to why -- probably because people were surprised, given their
close tie to SML -- and the answer was simple:  even though it can be
claimed that the C type system is unsound and SML's is sound, C is
trustworthy (and preferred over SML for that crucial part of the proof
checker) because the mapping of the C code to the generated assembly
code is straightforward and amenable to inspection.

-- Gaby

^ permalink raw reply	[flat|nested] 119+ messages in thread

* Re: signed is undefined and has been since 1992 (in GCC)
  2005-06-28 21:53                                         ` Michael Veksler
@ 2005-06-28 23:05                                           ` Michael Veksler
  0 siblings, 0 replies; 119+ messages in thread
From: Michael Veksler @ 2005-06-28 23:05 UTC (permalink / raw)
  To: gcc





Michael Veksler wrote on 29/06/2005 00:53:33:
> Robert Dewar wrote on 28/06/2005 22:20:56:
>
> > I am puzzled, why would *ANYONE* who knows C use int
> > rather than unsigned if they want wrap around semantics?
> >
>
> int saturated_mul(int a, int b)
> {
>    int ret= a*b;
>    if (a == 0 || ret % a == 0)
I of course meant:
    if (a == 0 || ret % a == 0 && ret / a == b)

Otherwise, this will not work when a is a power of 2.

>       return ret;
>    else if ( (a<0) == (b<0) ) // same sign
>       return MAX_INT;
>    else
>       return MIN_INT;
> }
>
> instead of the bizarre:
> int saturated_mul(int a, int b)
> {
>    // (unsigned)a*b does not give easy ways to
>    // find signed overflow on negatives
>
>    if (a==0 || b==0)
>       return 0;
>
>    bool positive = ( (a<0) == (b<0) );
>    if (positive)
>    {
>        unsigned u_a=a, u_b=b;
>        if (a < 0)
>        {
>            u_a *= -1;
>            u_b *= -1;
>        }
>        unsigned ret= u_a*u_b;
>        if (ret % u_a == 0)
I of course meant:
    if (ret % u_a == 0 && ret / u_a == u_b)
>          return ret;
>        else
>          return MAX_INT;
>    }
>    else
>    {
>      // negative
>      even more bizarre code;
>    }
>
> It would be really great the standard could
> make my life easier. I need to code to standard,
> not to gcc, so extending gcc may not help....
>
>   Michael
>

^ permalink raw reply	[flat|nested] 119+ messages in thread

* Re: signed is undefined and has been since 1992 (in GCC)
  2005-06-28 13:19                     ` Gabriel Dos Reis
@ 2005-06-28 22:58                       ` Georg Bauhaus
  2005-06-28 23:53                         ` Gabriel Dos Reis
  0 siblings, 1 reply; 119+ messages in thread
From: Georg Bauhaus @ 2005-06-28 22:58 UTC (permalink / raw)
  To: Gabriel Dos Reis, Robert Dewar, gcc mailing list

Gabriel Dos Reis wrote:
> Robert Dewar <dewar@adacore.com> writes:
> 
> | Gabriel Dos Reis wrote:
> | 
> | But the whole idea of hardware semantics is bogus, since you are
> | assuming some connection between C and the hardware which does not
> | exist. C is not an assembly language.
> 
> If you live in a different world, you may not see the connection.

I'm not sure whether this isn't just C's self-fulfilling
prophecy at work: Many C programmers think that C is a high level
assembly language (at least in my real world).
They start using C as if it were...
Then they charge C for not being defined at the machine
language level. 


Georg 

^ permalink raw reply	[flat|nested] 119+ messages in thread

* Re: signed is undefined and has been since 1992 (in GCC)
  2005-06-28 17:21                             ` Joe Buck
@ 2005-06-28 22:41                               ` Georg Bauhaus
  0 siblings, 0 replies; 119+ messages in thread
From: Georg Bauhaus @ 2005-06-28 22:41 UTC (permalink / raw)
  To: Joe Buck; +Cc: 'gcc mailing list'

Joe Buck wrote:

> 32-bit integers are going to remain useful types, and
> LP64 architectures finally have char = 8, short = 16, int = 32, long = 64,
> which is too useful to break.

Hmm... practically,

"Handle and Pointer Sizes" in
http://www.intel.com/cd/ids/developer/asmo-na/eng/dc/64bit/197664.htm?page=2

When I try to attach the qualification "useful"
to program text whose basic types cannot be interpreted
without extended exegesis of various documents,
I wonder whether it should be the compiler's job to
correct this situation, as has been suggested in this thread.


Georg  


^ permalink raw reply	[flat|nested] 119+ messages in thread

* Re: signed is undefined and has been since 1992 (in GCC)
  2005-06-28 22:09                                     ` Joseph S. Myers
  2005-06-28 22:16                                       ` Falk Hueffner
@ 2005-06-28 22:19                                       ` Robert Dewar
  1 sibling, 0 replies; 119+ messages in thread
From: Robert Dewar @ 2005-06-28 22:19 UTC (permalink / raw)
  To: Joseph S. Myers; +Cc: 'gcc mailing list'

Joseph S. Myers wrote:

> Does anyone make substantial use of -ftrapv in production (whether as a 
> tool for detecting bugs, or as a security tool where the performance cost 
> is acceptable)?  Or is it still at the stage of being a tool which would 
> be useful in principle for some purposes but still needs some work 
> (including on optimizing checks to reduce the performance cost) to make it 
> useful in practice?

It is potentially useful in Ada, and we are investigating its use in
this context.

^ permalink raw reply	[flat|nested] 119+ messages in thread

* Re: signed is undefined and has been since 1992 (in GCC)
  2005-06-28 22:09                                     ` Joseph S. Myers
@ 2005-06-28 22:16                                       ` Falk Hueffner
  2005-06-29  6:59                                         ` Eric Botcazou
  2005-06-28 22:19                                       ` Robert Dewar
  1 sibling, 1 reply; 119+ messages in thread
From: Falk Hueffner @ 2005-06-28 22:16 UTC (permalink / raw)
  To: Joseph S. Myers; +Cc: Robert Dewar, 'gcc mailing list'

"Joseph S. Myers" <joseph@codesourcery.com> writes:

> Does -ftrapv ever take advantage of trapping instructions where the
> hardware has them available?

Yes, for example Alpha.

> Does anyone make substantial use of -ftrapv in production

There are rarely bug reports about it; for example pointer difference
with odd sized objects was broken for years, and reported only twice
or so.

-- 
	Falk

^ permalink raw reply	[flat|nested] 119+ messages in thread

* Re: signed is undefined and has been since 1992 (in GCC)
  2005-06-28 19:17                                   ` Robert Dewar
  2005-06-28 19:43                                     ` Gabriel Dos Reis
@ 2005-06-28 22:09                                     ` Joseph S. Myers
  2005-06-28 22:16                                       ` Falk Hueffner
  2005-06-28 22:19                                       ` Robert Dewar
  1 sibling, 2 replies; 119+ messages in thread
From: Joseph S. Myers @ 2005-06-28 22:09 UTC (permalink / raw)
  To: Robert Dewar; +Cc: 'gcc mailing list'

On Tue, 28 Jun 2005, Robert Dewar wrote:

> are preserved. For instance on the IBM mainframe one might use signed
> or unsigned operations to implement int operations. On the original
> MIPS one might use trapping or non-trapping arithmetic (either would
> be valid).

Does -ftrapv ever take advantage of trapping instructions where the 
hardware has them available?

Does anyone make substantial use of -ftrapv in production (whether as a 
tool for detecting bugs, or as a security tool where the performance cost 
is acceptable)?  Or is it still at the stage of being a tool which would 
be useful in principle for some purposes but still needs some work 
(including on optimizing checks to reduce the performance cost) to make it 
useful in practice?

-- 
Joseph S. Myers               http://www.srcf.ucam.org/~jsm28/gcc/
    jsm@polyomino.org.uk (personal mail)
    joseph@codesourcery.com (CodeSourcery mail)
    jsm28@gcc.gnu.org (Bugzilla assignments and CCs)

^ permalink raw reply	[flat|nested] 119+ messages in thread

* Re: signed is undefined and has been since 1992 (in GCC)
  2005-06-28 12:57                   ` Robert Dewar
                                       ` (2 preceding siblings ...)
  2005-06-28 16:38                     ` Joe Buck
@ 2005-06-28 21:59                     ` Mike Stump
  3 siblings, 0 replies; 119+ messages in thread
From: Mike Stump @ 2005-06-28 21:59 UTC (permalink / raw)
  To: Robert Dewar; +Cc: Gabriel Dos Reis, Andrew Pinski, gcc mailing list

On Jun 28, 2005, at 5:57 AM, Robert Dewar wrote:
> C is not an assembly language.

My head explodes.

^ permalink raw reply	[flat|nested] 119+ messages in thread

* Re: signed is undefined and has been since 1992 (in GCC)
  2005-06-28 21:44                                             ` Joe Buck
  2005-06-28 21:50                                               ` Olivier Galibert
@ 2005-06-28 21:59                                               ` Robert Dewar
  1 sibling, 0 replies; 119+ messages in thread
From: Robert Dewar @ 2005-06-28 21:59 UTC (permalink / raw)
  To: Joe Buck
  Cc: Gabriel Dos Reis, Andrew Pinski, Olivier Galibert,
	'gcc mailing list', 'Andrew Haley',
	Dave Korn

Joe Buck wrote:

> I challenge you, Robert, to find us a C compiler that generates trapping
> instructions for integer adds by default.  I do not believe that such a
> compiler exists.

Probably there is no production compiler that does; MIPS would be the only
possibility I think in practice. Certainly there are checkout debugging
compilers that detect undefined operations.


^ permalink raw reply	[flat|nested] 119+ messages in thread

* Re: signed is undefined and has been since 1992 (in GCC)
  2005-06-28 20:58                                                 ` Gabriel Dos Reis
@ 2005-06-28 21:57                                                   ` Robert Dewar
  0 siblings, 0 replies; 119+ messages in thread
From: Robert Dewar @ 2005-06-28 21:57 UTC (permalink / raw)
  To: Gabriel Dos Reis
  Cc: Andrew Pinski, Olivier Galibert, 'gcc mailing list',
	'Andrew Haley',
	Dave Korn

Gabriel Dos Reis wrote:

> In case you may not have noticed, people offered to run tests and send
> some numbers.

Right, I think further discussion until these numbers arrive is of
dubious value, so I am suspending the thread till later :-)

^ permalink raw reply	[flat|nested] 119+ messages in thread

* Re: signed is undefined and has been since 1992 (in GCC)
  2005-06-28 19:21                                       ` Robert Dewar
  2005-06-28 20:18                                         ` Paul Koning
@ 2005-06-28 21:53                                         ` Michael Veksler
  2005-06-28 23:05                                           ` Michael Veksler
  2005-07-02 17:15                                         ` Florian Weimer
  2 siblings, 1 reply; 119+ messages in thread
From: Michael Veksler @ 2005-06-28 21:53 UTC (permalink / raw)
  To: Robert Dewar
  Cc: 'Andrew Haley',
	Dave Korn, Olivier Galibert, 'gcc mailing list',
	'Gabriel Dos Reis', 'Andrew Pinski'







Robert Dewar wrote on 28/06/2005 22:20:56:

> I am puzzled, why would *ANYONE* who knows C use int
> rather than unsigned if they want wrap around semantics?
>

int saturated_mul(int a, int b)
{
   int ret= a*b;
   if (a == 0 || ret % a == 0)
      return ret;
   else if ( (a<0) == (b<0) ) // same sign
      return MAX_INT;
   else
      return MIN_INT;
}

instead of the bizarre:
int saturated_mul(int a, int b)
{
   // (unsigned)a*b does not give easy ways to
   // find signed overflow on negatives

   if (a==0 || b==0)
      return 0;

   bool positive = ( (a<0) == (b<0) );
   if (positive)
   {
       unsigned u_a=a, u_b=b;
       if (a < 0)
       {
           u_a *= -1;
           u_b *= -1;
       }
       unsigned ret= u_a*u_b;
       if (ret % u_a == 0)
         return ret;
       else
         return MAX_INT;
   }
   else
   {
     // negative
     even more bizarre code;
   }

It would be really great if the standard could
make my life easier. I need to code to standard,
not to gcc, so extending gcc may not help....

  Michael
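
For comparison, here is one way a saturating multiply might be written
without ever executing an overflowing signed operation.  It is only a
sketch, not code from this thread: it tests the operands against
INT_MAX and INT_MIN (the <limits.h> names for what the code above
spells MAX_INT and MIN_INT) before multiplying, and it assumes C99
division, which truncates toward zero.  The demo_saturated_mul name is
hypothetical.

```c
#include <assert.h>
#include <limits.h>

/* Saturating multiply that never performs an overflowing signed
   operation: every range test is done with division beforehand.
   Assumes C99 semantics (integer division truncates toward zero). */
int saturated_mul(int a, int b)
{
    if (a == 0 || b == 0)
        return 0;

    if (a > 0) {
        if (b > 0) {
            if (a > INT_MAX / b)      /* positive result too large */
                return INT_MAX;
        } else {
            if (b < INT_MIN / a)      /* negative result too small */
                return INT_MIN;
        }
    } else {
        if (b > 0) {
            if (a < INT_MIN / b)      /* negative result too small */
                return INT_MIN;
        } else {
            if (a < INT_MAX / b)      /* both negative, result too large */
                return INT_MAX;
        }
    }
    return a * b;                     /* known to be in range here */
}

/* Hypothetical self-check; call it from main() to run it. */
void demo_saturated_mul(void)
{
    assert(saturated_mul(-3, 4) == -12);
    assert(saturated_mul(INT_MAX, 2) == INT_MAX);
    assert(saturated_mul(INT_MAX, -2) == INT_MIN);
    assert(saturated_mul(INT_MIN, -1) == INT_MAX);
}
```

The four-way case split is the price of staying inside defined
behaviour; it also never evaluates INT_MIN / -1, which is itself an
overflow.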

^ permalink raw reply	[flat|nested] 119+ messages in thread

* Re: signed is undefined and has been since 1992 (in GCC)
  2005-06-28 21:44                                             ` Joe Buck
@ 2005-06-28 21:50                                               ` Olivier Galibert
  2005-06-28 21:59                                               ` Robert Dewar
  1 sibling, 0 replies; 119+ messages in thread
From: Olivier Galibert @ 2005-06-28 21:50 UTC (permalink / raw)
  To: Joe Buck
  Cc: Robert Dewar, Gabriel Dos Reis, Andrew Pinski,
	'gcc mailing list', 'Andrew Haley',
	Dave Korn

On Tue, Jun 28, 2005 at 02:44:40PM -0700, Joe Buck wrote:
> I challenge you, Robert, to find us a C compiler that generates trapping
> instructions for integer adds by default.  I do not believe that such a
> compiler exists.

Perusing the manpage of SGI's cc and CC on IRIX, there isn't even an
option to have trapping integer arithmetic.

  OG.

^ permalink raw reply	[flat|nested] 119+ messages in thread

* Re: signed is undefined and has been since 1992 (in GCC)
  2005-06-28 19:20                                         ` Robert Dewar
@ 2005-06-28 21:48                                           ` Joe Buck
  0 siblings, 0 replies; 119+ messages in thread
From: Joe Buck @ 2005-06-28 21:48 UTC (permalink / raw)
  To: Robert Dewar
  Cc: Andrew Pinski, Olivier Galibert, 'Gabriel Dos Reis',
	'gcc mailing list', 'Andrew Haley',
	Dave Korn

On Tue, Jun 28, 2005 at 03:20:01PM -0400, Robert Dewar wrote:
> Andrew Pinski wrote:
> 
> >No it is not. It was when it was designed yes but since the C standard has
> >come out and the aliasing rules really show that it is not a high level
> >assembler language any more.
> 
> Even when it was designed there was more abstraction than you think (e.g.
> cannot convert *char value to *int, cannot compare addresses in different
> allocated objects).

Correct on the former point, wrong on the latter.  That is, C originally
required a flat address space; the rule about comparing addresses in
different allocated objects was added later to allow C to support
segmented architectures (C existed for a long time before the ANSI
standard came out).

^ permalink raw reply	[flat|nested] 119+ messages in thread

* Re: signed is undefined and has been since 1992 (in GCC)
  2005-06-28 19:32                                           ` Robert Dewar
  2005-06-28 19:48                                             ` Gabriel Dos Reis
@ 2005-06-28 21:44                                             ` Joe Buck
  2005-06-28 21:50                                               ` Olivier Galibert
  2005-06-28 21:59                                               ` Robert Dewar
  1 sibling, 2 replies; 119+ messages in thread
From: Joe Buck @ 2005-06-28 21:44 UTC (permalink / raw)
  To: Robert Dewar
  Cc: Gabriel Dos Reis, Andrew Pinski, Olivier Galibert,
	'gcc mailing list', 'Andrew Haley',
	Dave Korn

On Tue, Jun 28, 2005 at 03:31:59PM -0400, Robert Dewar wrote:
> Gabriel Dos Reis wrote:
> 
> >The strict aliasing rule by itself does not show it is not a high level
> >assembly language.  There are chips out there where you cannot access
> >data willy-nilly through random register types.
> 
> And there are chips for which signed arithmetic is not wrap around!

No, there are not.  There are chips that possess a signed add instruction
that traps, but those same chips possess an unsigned add instruction that
does not trap, so any sane C compiler writer would choose the latter
instruction.  At the instruction level, there are no signed or unsigned
operands, only signed or unsigned operations.
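
A small sketch of that last point, under the assumption of a
two's-complement machine where unsigned has the same width as int (the
add_bits and demo_add_bits names are hypothetical, made up for this
illustration): adding through unsigned is defined modulo 2^N and gives
the same bit pattern a non-trapping signed add would.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical helper: add two ints the way a non-trapping add
   instruction would, using unsigned arithmetic (defined modulo 2^N). */
unsigned add_bits(int a, int b)
{
    return (unsigned)a + (unsigned)b;
}

/* Hypothetical self-check; call it from main() to run it. */
void demo_add_bits(void)
{
    /* In-range sum: the signed result converts to the same bits. */
    assert(add_bits(-5, 3) == (unsigned)(-5 + 3));

    /* Overflowing sum: defined when done on unsigned operands.
       Assumes two's complement and UINT_MAX == 2 * INT_MAX + 1,
       so the wrapped result has the bit pattern of INT_MIN. */
    assert(add_bits(INT_MAX, 1) == (unsigned)INT_MIN);
}
```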

I challenge you, Robert, to find us a C compiler that generates trapping
instructions for integer adds by default.  I do not believe that such a
compiler exists.

^ permalink raw reply	[flat|nested] 119+ messages in thread

* Re: signed is undefined and has been since 1992 (in GCC)
  2005-06-28 20:24                                           ` Robert Dewar
@ 2005-06-28 21:41                                             ` Joe Buck
  0 siblings, 0 replies; 119+ messages in thread
From: Joe Buck @ 2005-06-28 21:41 UTC (permalink / raw)
  To: Robert Dewar; +Cc: Paul Koning, galibert, dave.korn, aph, gdr, pinskia, gcc

On Tue, Jun 28, 2005 at 04:24:38PM -0400, Robert Dewar wrote:
> Paul Koning wrote:
> 
> >And also because most people believe that C applies normal computer
> >arithmetic, and they believe that normal computer arithmetic is
> >wrapped 2's complement.  (And indeed it usually is, give or take some
> >bizarre exceptions like MAX_INT % -1)
> 
> and not so bizarre exceptions like machines that trap on signed
> overflow.

That's a non-issue in practice, because no C compiler uses such
instructions by default.

^ permalink raw reply	[flat|nested] 119+ messages in thread

* Re: signed is undefined and has been since 1992 (in GCC)
  2005-06-28 21:27                                               ` Paul Koning
@ 2005-06-28 21:39                                                 ` Andreas Schwab
  0 siblings, 0 replies; 119+ messages in thread
From: Andreas Schwab @ 2005-06-28 21:39 UTC (permalink / raw)
  To: Paul Koning; +Cc: gdr, dewar, galibert, aph, dave.korn, pinskia, gcc

Paul Koning <pkoning@equallogic.com> writes:

>>>>>> "Gabriel" == Gabriel Dos Reis <gdr@integrable-solutions.net> writes:
>
>  Gabriel> When it comes down to the compiler writer choosing
>  Gabriel> something for undefined behaviour, it is hardly solely based
>  Gabriel> on the C standard.  In fact, the C standard is of much less
>  Gabriel> help because it gives up. 
>
> That's my reaction as well.  The standard says "do something".  The
> compiler has to pick a "something".

But it does not have to be consistent.

Andreas.

-- 
Andreas Schwab, SuSE Labs, schwab@suse.de
SuSE Linux Products GmbH, Maxfeldstraße 5, 90409 Nürnberg, Germany
Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."

^ permalink raw reply	[flat|nested] 119+ messages in thread

* Re: signed is undefined and has been since 1992 (in GCC)
  2005-06-28 21:20                                             ` Gabriel Dos Reis
  2005-06-28 21:27                                               ` Paul Koning
@ 2005-06-28 21:35                                               ` Joe Buck
  1 sibling, 0 replies; 119+ messages in thread
From: Joe Buck @ 2005-06-28 21:35 UTC (permalink / raw)
  To: Gabriel Dos Reis
  Cc: Robert Dewar, Olivier Galibert, Andrew Haley, Dave Korn,
	'Andrew Pinski', 'gcc mailing list'

On Tue, Jun 28, 2005 at 11:19:18PM +0200, Gabriel Dos Reis wrote:
> Robert Dewar <dewar@adacore.com> writes:
> 
> | Gabriel Dos Reis wrote:
> | > Robert Dewar <dewar@adacore.com> writes:
> | > | "has the semantics that Gabriel Dos Reis wants" is not an evaluable
> | > | predicate!
> | > You completely missed the point, but I guess it is consistent with
> | > your denying that there is any connection between C or C++ and
> | > hardware.
> | 
> | So, let's make this MUCH more specific. Gabriel, on the MIPS chip,
> | do you think there is something in the definition of C that leads
> | you to prefer wrap around semantics to trapping semantics?
> 
> When it comes down to the compiler writer choosing something for
> undefined behaviour, it is hardly solely based on the C standard.
> In fact, the C standard is of much less help because it gives up.
> So, your question is an inconsistency in terms.

I think it would make sense to use a *mix* of trapping and non-trapping
instructions myself; it would *not* make sense to use trapping everywhere.
For one thing, wrapping arithmetic is associative; trapping arithmetic
is not.  (a+b)+c is not the same as a+(b+c) if we trap, because the former
might not overflow while the latter does.  Forcing the compiler to keep
track limits the optimization possibilities.  However, for address
generation the value of catching bounds errors outweighs these penalties.
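
To make the associativity point concrete, here is a sketch in which a
hypothetical checked_add stands in for a trapping add instruction (it
returns false where the hardware would trap).  All the function names
are made up for the illustration.  With a = -1, b = INT_MAX, c = 1 the
grouping (a+b)+c succeeds while a+(b+c) "traps", whereas the wrapping
version gives the same answer either way.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Hypothetical stand-in for a trapping add: stores a+b in *out and
   returns true, or returns false if the sum would overflow.  The range
   tests happen before the add, so no undefined behaviour is executed. */
bool checked_add(int a, int b, int *out)
{
    if ((b > 0 && a > INT_MAX - b) || (b < 0 && a < INT_MIN - b))
        return false;
    *out = a + b;
    return true;
}

/* (a+b)+c with "trapping" adds. */
bool sum_left(int a, int b, int c, int *out)
{
    int t;
    return checked_add(a, b, &t) && checked_add(t, c, out);
}

/* a+(b+c) with "trapping" adds. */
bool sum_right(int a, int b, int c, int *out)
{
    int t;
    return checked_add(b, c, &t) && checked_add(a, t, out);
}

/* Wrapping add via unsigned arithmetic (defined modulo 2^N). */
unsigned wrap_add(unsigned a, unsigned b)
{
    return a + b;
}

/* Hypothetical self-check; call it from main() to run it. */
void demo_associativity(void)
{
    int r = 0;

    /* Trapping: the two groupings disagree about whether we overflow. */
    assert(sum_left(-1, INT_MAX, 1, &r) && r == INT_MAX);
    assert(!sum_right(-1, INT_MAX, 1, &r));

    /* Wrapping: both groupings always agree. */
    unsigned a = (unsigned)-1, b = (unsigned)INT_MAX, c = 1u;
    assert(wrap_add(wrap_add(a, b), c) == wrap_add(a, wrap_add(b, c)));
}
```

A compiler that must preserve the trap has to keep the grouping the
programmer wrote; one that wraps is free to reassociate.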

So I guess I agree with Gaby in that there are a number of practical
considerations that come into play.  Conformance with the standard
requires that valid programs do the right thing, but it is an engineering
compromise what to do when undefined behavior is invoked.

Writers of code often accidentally rely on wrapping without realizing it;
it can cause sums of three integers to come out correct despite
an overflow of the intermediate term.  We can argue that they shouldn't
rely on such things, but the usual software development methodology,
like it or not, is to hack away quickly, then start testing and fixing
bugs, so overflows that don't affect the result will simply not be
noticed.
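
A sketch of the kind of sum meant here, with the wrapping done
explicitly through unsigned so that the example itself stays within
defined behaviour (wrapping_sum3 and demo_wrapping_sum3 are
hypothetical names): INT_MAX + 1 overflows as an intermediate, yet
INT_MAX + 1 + (-1) comes out as INT_MAX.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical helper: sum three ints with wrapping intermediates.
   The unsigned additions are defined modulo 2^N.  The final conversion
   back to int is value-preserving whenever the true sum fits in an
   int; otherwise it is implementation-defined. */
int wrapping_sum3(int x, int y, int z)
{
    unsigned u = (unsigned)x + (unsigned)y + (unsigned)z;
    return (int)u;
}

/* Hypothetical self-check; call it from main() to run it. */
void demo_wrapping_sum3(void)
{
    /* The intermediate INT_MAX + 1 wraps, but the final sum is in range. */
    assert(wrapping_sum3(INT_MAX, 1, -1) == INT_MAX);
    assert(wrapping_sum3(INT_MIN, -1, 1) == INT_MIN);
}
```

Written with plain int additions, the same expression would still
produce INT_MAX on a wrapping implementation, which is exactly why the
overflow goes unnoticed in testing.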



   

^ permalink raw reply	[flat|nested] 119+ messages in thread

* Re: signed is undefined and has been since 1992 (in GCC)
  2005-06-28 21:20                                             ` Gabriel Dos Reis
@ 2005-06-28 21:27                                               ` Paul Koning
  2005-06-28 21:39                                                 ` Andreas Schwab
  2005-06-28 21:35                                               ` Joe Buck
  1 sibling, 1 reply; 119+ messages in thread
From: Paul Koning @ 2005-06-28 21:27 UTC (permalink / raw)
  To: gdr; +Cc: dewar, galibert, aph, dave.korn, pinskia, gcc

>>>>> "Gabriel" == Gabriel Dos Reis <gdr@integrable-solutions.net> writes:

> Robert Dewar <dewar@adacore.com> writes:
>
> Gabriel Dos Reis wrote:
>> Robert Dewar <dewar@adacore.com> writes:
>> >"has the semantics that Gabriel Dos Reis wants" is not an evaluable
>> >predicate!
>> You completely missed the point, but I guess it is consistent with
>> your denying that there is any connection between C or C++ and
>> hardware.
>
> So, let's make this MUCH more specific. Gabriel, on the MIPS chip,
> do you think there is something in the definition of C that leads
> you to prefer wrap around semantics to trapping semantics?

 Gabriel> When it comes down to the compiler writer choosing
 Gabriel> something for undefined behaviour, it is hardly solely based
 Gabriel> on the C standard.  In fact, the C standard is of much less
 Gabriel> help because it gives up. 

That's my reaction as well.  The standard says "do something".  The
compiler has to pick a "something".

If the choice is between trapping and wrap for MIPS arithmetic, I'd
choose wrap without a question.  The reason: my expectation is that,
in each place where the situation arises in the code created in our
team, wrap is the better answer.  Ideally the issue should not arise
-- if it's expected in the code, unsigned variables should be used or
the condition should be tested for explicitly.  But if that's missed,
I believe that wrapping will produce the "that's what I meant"
outcome, while an overflow trap will cause the system to malfunction
at the customer site.

   paul

^ permalink raw reply	[flat|nested] 119+ messages in thread

* Re: signed is undefined and has been since 1992 (in GCC)
  2005-06-28 20:59                                           ` Robert Dewar
@ 2005-06-28 21:20                                             ` Gabriel Dos Reis
  2005-06-28 21:27                                               ` Paul Koning
  2005-06-28 21:35                                               ` Joe Buck
  0 siblings, 2 replies; 119+ messages in thread
From: Gabriel Dos Reis @ 2005-06-28 21:20 UTC (permalink / raw)
  To: Robert Dewar
  Cc: Olivier Galibert, Andrew Haley, Dave Korn,
	'Andrew Pinski', 'gcc mailing list'

Robert Dewar <dewar@adacore.com> writes:

| Gabriel Dos Reis wrote:
| > Robert Dewar <dewar@adacore.com> writes:
| > | "has the semantics that Gabriel Dos Reis wants" is not an evaluable
| > | predicate!
| > You completely missed the point, but I guess it is consistent with
| > your denying that there is any connection between C or C++ and
| > hardware.
| 
| So, let's make this MUCH more specific. Gabriel, on the MIPS chip,
| do you think there is something in the definition of C that leads
| you to prefer wrap around semantics to trapping semantics?

When it comes down to the compiler writer choosing something for
undefined behaviour, it is hardly solely based on the C standard.
In fact, the C standard is of much less help because it gives up.
So, your question is an inconsistency in terms.

-- Gaby

^ permalink raw reply	[flat|nested] 119+ messages in thread

* Re: signed is undefined and has been since 1992 (in GCC)
  2005-06-28 20:51                                         ` Gabriel Dos Reis
@ 2005-06-28 20:59                                           ` Robert Dewar
  2005-06-28 21:20                                             ` Gabriel Dos Reis
  0 siblings, 1 reply; 119+ messages in thread
From: Robert Dewar @ 2005-06-28 20:59 UTC (permalink / raw)
  To: Gabriel Dos Reis
  Cc: Olivier Galibert, Andrew Haley, Dave Korn,
	'Andrew Pinski', 'gcc mailing list'

Gabriel Dos Reis wrote:
> Robert Dewar <dewar@adacore.com> writes:
> 
> | "has the semantics that Gabriel Dos Reis wants" is not an evaluable
> | predicate!
> 
> You completely missed the point, but I guess it is consistent with
> your denying that there is any connection between C or C++ and
> hardware.

So, let's make this MUCH more specific. Gabriel, on the MIPS chip,
do you think there is something in the definition of C that leads
you to prefer wrap around semantics to trapping semantics?

^ permalink raw reply	[flat|nested] 119+ messages in thread

* Re: signed is undefined and has been since 1992 (in GCC)
  2005-06-28 20:37                                               ` Robert Dewar
@ 2005-06-28 20:58                                                 ` Gabriel Dos Reis
  2005-06-28 21:57                                                   ` Robert Dewar
  0 siblings, 1 reply; 119+ messages in thread
From: Gabriel Dos Reis @ 2005-06-28 20:58 UTC (permalink / raw)
  To: Robert Dewar
  Cc: Andrew Pinski, Olivier Galibert, 'gcc mailing list',
	'Andrew Haley',
	Dave Korn

Robert Dewar <dewar@adacore.com> writes:

| Gabriel Dos Reis wrote:
| > Robert Dewar <dewar@adacore.com> writes:
| > | Gabriel Dos Reis wrote:
| > | > The strict aliasing rule by itself does not show it is not a high level
| > | > assembly language.  There are chips out there where you cannot access
| > | > data willy-nilly through random register types.
| > | And there are chips for which signed arithmetic is not wrap around!
| > yes, but that is irrelevant,  the assumption was made that
| > one knows what the chip provides.
| 
| And if the chip provides two consistent models of arithmetic, you cannot
| deduce from the standard that one is preferred over the other (e.g. logical

At this point, we're not talking about what the standard abstractly
says.  But, what the *implementation* says or could say.  We're
talking about the implementation-defined semantics of
numeric_limits<T>::is_modulo. 
But I guess you will refuse to get real just as you will deny that
there is any connection between C, C++ and hardware.

[...]

| As I said in an earlier message, the issue here is one of tradeoffs. 

Did you miss the previous messages where it was clearly indicated that
people were aware of that?

[...]

| In the case of overflow, everyone would agree on avoiding the undefined
| behavior if it is cheap enough. If it is not cheap enough, then I think
| most people would accept the undefined status.
| 
| How can you make an informed decision with no data in a case like this?

In case you may not have noticed, people offered to run tests and send
some numbers.

-- Gaby

^ permalink raw reply	[flat|nested] 119+ messages in thread

* Re: signed is undefined and has been since 1992 (in GCC)
  2005-06-28 20:31                                       ` Robert Dewar
@ 2005-06-28 20:51                                         ` Gabriel Dos Reis
  2005-06-28 20:59                                           ` Robert Dewar
  0 siblings, 1 reply; 119+ messages in thread
From: Gabriel Dos Reis @ 2005-06-28 20:51 UTC (permalink / raw)
  To: Robert Dewar
  Cc: Olivier Galibert, Andrew Haley, Dave Korn,
	'Andrew Pinski', 'gcc mailing list'

Robert Dewar <dewar@adacore.com> writes:

| "has the semantics that Gabriel Dos Reis wants" is not an evaluable
| predicate!

You completely missed the point, but I guess it is consistent with
your denying that there is any connection between C or C++ and
hardware.

-- Gaby

^ permalink raw reply	[flat|nested] 119+ messages in thread

* Re: signed is undefined and has been since 1992 (in GCC)
  2005-06-28 19:48                                             ` Gabriel Dos Reis
@ 2005-06-28 20:37                                               ` Robert Dewar
  2005-06-28 20:58                                                 ` Gabriel Dos Reis
  0 siblings, 1 reply; 119+ messages in thread
From: Robert Dewar @ 2005-06-28 20:37 UTC (permalink / raw)
  To: Gabriel Dos Reis
  Cc: Andrew Pinski, Olivier Galibert, 'gcc mailing list',
	'Andrew Haley',
	Dave Korn

Gabriel Dos Reis wrote:
> Robert Dewar <dewar@adacore.com> writes:
> 
> | Gabriel Dos Reis wrote:
> | 
> | > The strict aliasing rule by itself does not show it is not a high level
> | > assembly language.  There are chips out there where you cannot access
> | > data willy-nilly through random register types.
> | 
> | And there are chips for which signed arithmetic is not wrap around!
> 
> yes, but that is irrelevant,  the assumption was made that
> one knows what the chip provides.  

And if the chip provides two consistent models of arithmetic, you cannot
deduce from the standard that one is preferred over the other (e.g. logical
vs arithmetic instructions on the IBM mainframe, or trapping vs non-trapping
arithmetic on the MIPS). I would actually think that the trapping arithmetic
on the MIPS is a much better model of signed ints in C (the language as
defined by the standard) [of course, given the claim that so few people know
C, and that those who do not still have expectations, you may well decide to
meet those expectations if it is free to do so].

A good analogy here is Fortran allocation. Everyone *knows* that Fortran
allocates storage statically, and many programs rely on this, but even
Fortran-66 went to great pains to NOT say this, and a stack based implementation
was entirely conforming. However, as Burroughs found out, a stack based
Fortran was not very useable.

As I said in an earlier message, the issue here is one of tradeoffs. You
always prefer defined behavior to undefined, and if you cannot have defined,
the equivalent of Ada bounded error is still a huge improvement over undefined.

If there were no tradeoff with efficiency, there is an argument for testing
for every undefined situation at run time and raising a signal when it
occurs with an appropriate message. That's of course impractical. Failing
that, if there is a sort of expected behavior (e.g. everyone knows function
pointers are the same length as int pointers), then it is good to conform
to this expectation if it can be done cheaply (are trampolines cheap enough?
probably not).

In the case of overflow, everyone would agree on avoiding the undefined
behavior if it is cheap enough. If it is not cheap enough, then I think
most people would accept the undefined status.

How can you make an informed decision with no data in a case like this?
Answer: you can't!

^ permalink raw reply	[flat|nested] 119+ messages in thread

* Re: signed is undefined and has been since 1992 (in GCC)
  2005-06-28 19:43                                     ` Gabriel Dos Reis
@ 2005-06-28 20:31                                       ` Robert Dewar
  2005-06-28 20:51                                         ` Gabriel Dos Reis
  0 siblings, 1 reply; 119+ messages in thread
From: Robert Dewar @ 2005-06-28 20:31 UTC (permalink / raw)
  To: Gabriel Dos Reis
  Cc: Olivier Galibert, Andrew Haley, Dave Korn,
	'Andrew Pinski', 'gcc mailing list'

Gabriel Dos Reis wrote:


> C and C++ are general programming languages geared toward system
> programming. 

Yes, and my two examples are completely consistent with that. A C
compiler that uses the trapping arithmetic on the MIPS is entirely
conforming, and has the advantage that if you develop with such a
compiler you avoid silly mistakes of expecting signed integers to
wrap, so your code will be more portable.

"has the semantics that Gabriel Dos Reis wants" is not an evaluable
predicate!

^ permalink raw reply	[flat|nested] 119+ messages in thread

* Re: signed is undefined and has been since 1992 (in GCC)
  2005-06-28 20:18                                         ` Paul Koning
@ 2005-06-28 20:24                                           ` Robert Dewar
  2005-06-28 21:41                                             ` Joe Buck
  0 siblings, 1 reply; 119+ messages in thread
From: Robert Dewar @ 2005-06-28 20:24 UTC (permalink / raw)
  To: Paul Koning; +Cc: galibert, dave.korn, aph, gdr, pinskia, gcc

Paul Koning wrote:

> And also because most people believe that C applies normal computer
> arithmetic, and they believe that normal computer arithmetic is
> wrapped 2's complement.  (And indeed it usually is, give or take some
> bizarre exceptions like INT_MIN % -1)

and not so bizarre exceptions like machines that trap on signed
overflow.
> 
> We all know better, but how tiny is the fraction of C programmers who
> have ever even *seen* the ANSI C spec, much less know in detail what
> it says?

OK, but to me "knows C" (the predicate in my statement) certainly
means being familiar with the standard, not necessarily by reading it.
I am sure there are lots of good books on C that have a good account
of the rules (K&R is clear enough on this particular issue).

^ permalink raw reply	[flat|nested] 119+ messages in thread

* Re: signed is undefined and has been since 1992 (in GCC)
  2005-06-28 19:21                                       ` Robert Dewar
@ 2005-06-28 20:18                                         ` Paul Koning
  2005-06-28 20:24                                           ` Robert Dewar
  2005-06-28 21:53                                         ` Michael Veksler
  2005-07-02 17:15                                         ` Florian Weimer
  2 siblings, 1 reply; 119+ messages in thread
From: Paul Koning @ 2005-06-28 20:18 UTC (permalink / raw)
  To: dewar; +Cc: galibert, dave.korn, aph, gdr, pinskia, gcc

>>>>> "Robert" == Robert Dewar <dewar@adacore.com> writes:

 Robert> I am puzzled, why would *ANYONE* who knows C use int rather
 Robert> than unsigned if they want wrap around semantics?

Because most people don't follow the rule that "always use unsigned
variables unless you know that it really needs to be signed".

And also because most people believe that C applies normal computer
arithmetic, and they believe that normal computer arithmetic is
wrapped 2's complement.  (And indeed it usually is, give or take some
bizarre exceptions like INT_MIN % -1)

We all know better, but how tiny is the fraction of C programmers who
have ever even *seen* the ANSI C spec, much less know in detail what
it says?
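
The "% -1" exception mentioned above bites when the left operand is
INT_MIN: the quotient INT_MIN / -1 is not representable in int, so the
remainder expression is undefined as well, and on x86 the division
instruction traps. A minimal sketch of a guard follows. The helper name
safe_rem is hypothetical (it was not posted in this thread), and a zero
divisor is still left to the caller.

```cpp
#include <climits>

// Hypothetical helper, for illustration only.
// Mathematically a % -1 is 0 for every a, so answer 0 directly instead of
// executing a division that is undefined when a == INT_MIN.
// Precondition (not checked here): b != 0.
constexpr int safe_rem(int a, int b) {
    return (b == -1) ? 0 : a % b;
}
```

For every other divisor the result is the ordinary truncating remainder,
so safe_rem(-7, 3) is -1.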

	paul

^ permalink raw reply	[flat|nested] 119+ messages in thread

* Re: signed is undefined and has been since 1992 (in GCC)
  2005-06-28 19:32                                           ` Robert Dewar
@ 2005-06-28 19:48                                             ` Gabriel Dos Reis
  2005-06-28 20:37                                               ` Robert Dewar
  2005-06-28 21:44                                             ` Joe Buck
  1 sibling, 1 reply; 119+ messages in thread
From: Gabriel Dos Reis @ 2005-06-28 19:48 UTC (permalink / raw)
  To: Robert Dewar
  Cc: Andrew Pinski, Olivier Galibert, 'gcc mailing list',
	'Andrew Haley',
	Dave Korn

Robert Dewar <dewar@adacore.com> writes:

| Gabriel Dos Reis wrote:
| 
| > The strict aliasing rule by itself does not show it is not a high level
| > assembly language.  There are chips out there where you cannot access
| > data willy-nilly through random register types.
| 
| And there are chips for which signed arithmetic is not wrap around!

Yes, but that is irrelevant; the assumption was made that
one knows what the chip provides.

numeric_limits<T> is target dependent.
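
As a minimal sketch, the implementation can be asked directly what it
claims. The helper name claims_modulo is hypothetical;
std::numeric_limits<T>::is_modulo is the standard library member under
discussion. It is true for unsigned types, while the value reported for
int depends on the implementation and its release, so the sketch makes no
assumption about it.

```cpp
#include <limits>

// Hypothetical helper, for illustration only: reports whether the
// implementation documents arithmetic on T as wrapping (modulo).
template <typename T>
constexpr bool claims_modulo() {
    return std::numeric_limits<T>::is_modulo;
}
```

Code that needs wrapping could then write
static_assert(claims_modulo<SomeType>(), "...") and fail at build time on
a target that does not promise it.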

-- Gaby

^ permalink raw reply	[flat|nested] 119+ messages in thread

* Re: signed is undefined and has been since 1992 (in GCC)
       [not found] <2382433.1119938227627.JavaMail.root@dtm1eusosrv72.dtm.ops.eu.uu.net>
@ 2005-06-28 19:44 ` Toon Moene
  0 siblings, 0 replies; 119+ messages in thread
From: Toon Moene @ 2005-06-28 19:44 UTC (permalink / raw)
  To: Andrew Pinski; +Cc: gcc mailing list

Andrew Pinski wrote:

> The first change in GCC which changed signed overflow/wrapping to be 
> undefined
> was added back in 1992 in loop.c.

> Why are we talking about this now, instead of back
> when they were added ?

Well, I don't know about the rest of the GCC developers at that time 
(1992), but my first priority *at that time* was to get g77 to the 
masses, which only happened on the 17th of February 1995.

-- 
Toon Moene - e-mail: toon@moene.indiv.nluug.nl - phone: +31 346 214290
Saturnushof 14, 3738 XG  Maartensdijk, The Netherlands
A maintainer of GNU Fortran 95: http://gcc.gnu.org/fortran/
Looking for a job: Work from home or at a customer site; HPC, (GNU) 
Fortran & C

^ permalink raw reply	[flat|nested] 119+ messages in thread

* Re: signed is undefined and has been since 1992 (in GCC)
  2005-06-28 19:17                                   ` Robert Dewar
@ 2005-06-28 19:43                                     ` Gabriel Dos Reis
  2005-06-28 20:31                                       ` Robert Dewar
  2005-06-28 22:09                                     ` Joseph S. Myers
  1 sibling, 1 reply; 119+ messages in thread
From: Gabriel Dos Reis @ 2005-06-28 19:43 UTC (permalink / raw)
  To: Robert Dewar
  Cc: Olivier Galibert, Andrew Haley, Dave Korn,
	'Andrew Pinski', 'gcc mailing list'

Robert Dewar <dewar@adacore.com> writes:

| Gabriel Dos Reis wrote:
| > Robert Dewar <dewar@adacore.com> writes:
| > 
| > | Olivier Galibert wrote:
| > | 
| > | > Calling a large part of the programs out there, including a non
| > | > negligible subpart of what I personally write either "blatantly buggy"
| > | > or "subtly-incorrect" is somewhat childish and insulting.
| > | 
| > | nope, I don't see it that way at all, this is just a statement of fact
| > | wrt the ISO standard. You may want C or C++ to be different from what
| > | it is, but the standard is widely accepted,
| > 
| > Those standards are widely accepted and so are very facts that they do
| > have connections with hardware.  In fact, the respective committees
| > do work seriously on producing TRs for embedded systems and access to
| > hardware semantics.
| 
| I think you miss my point completely about hardware semantics. Let me
| try, though I am dubious it will be clear.

Maybe because it is an already confused statement?

|  The type int in C is not
| a hardware type, it is a type with properties defined by the standard.
| There is not necessarily any unique expectation on how this will be
| mapped to hardware, or what operations will be mapped. The only
| requirement is that the required semantics of int in the standard
| are preserved. For instance on the IBM mainframe one might use signed
| or unsigned operations to implement int operations. On the original
| MIPS one might use trapping or non-trapping arithmetic (either would
| be valid).

Nice exercise in language lawyering, but I think you forgot the
qualification in the claim: if the hardware I'm targeting provides the
specific semantics I want.  Your scenario does not disqualify it.

| Yes, sometimes you need access to the hardware and need to control
| exactly what is generated, but using int is not a mechanism that
| provides any such guarantee.

except that those who produced the document had an intended
semantics and the expectations are connected with some reality which
you would deny.

       [#5]  An  object  declared  as type signed char occupies the
       same amount of  storage  as  a  ``plain''  char  object.   A
       ``plain''  int  object has the natural size suggested by the
       architecture of the execution environment (large  enough  to
       contain any value in the range INT_MIN to INT_MAX as defined
       in the header <limits.h>).

C and C++ are general programming languages geared toward system
programming.

-- Gaby

^ permalink raw reply	[flat|nested] 119+ messages in thread

* Re: signed is undefined and has been since 1992 (in GCC)
  2005-06-28 19:25                                         ` Gabriel Dos Reis
@ 2005-06-28 19:32                                           ` Robert Dewar
  2005-06-28 19:48                                             ` Gabriel Dos Reis
  2005-06-28 21:44                                             ` Joe Buck
  0 siblings, 2 replies; 119+ messages in thread
From: Robert Dewar @ 2005-06-28 19:32 UTC (permalink / raw)
  To: Gabriel Dos Reis
  Cc: Andrew Pinski, Olivier Galibert, 'gcc mailing list',
	'Andrew Haley',
	Dave Korn

Gabriel Dos Reis wrote:

> The strict aliasing rule by itself does not show it is not a high level
> assembly language.  There are chips out there where you cannot access
> data willy-nilly through random register types.

And there are chips for which signed arithmetic is not wrap around!

^ permalink raw reply	[flat|nested] 119+ messages in thread

* Re: signed is undefined and has been since 1992 (in GCC)
  2005-06-28 19:13                                       ` Andrew Pinski
  2005-06-28 19:20                                         ` Robert Dewar
@ 2005-06-28 19:25                                         ` Gabriel Dos Reis
  2005-06-28 19:32                                           ` Robert Dewar
  1 sibling, 1 reply; 119+ messages in thread
From: Gabriel Dos Reis @ 2005-06-28 19:25 UTC (permalink / raw)
  To: Andrew Pinski
  Cc: Olivier Galibert, 'Robert Dewar',
	'gcc mailing list', 'Andrew Haley',
	Dave Korn

Andrew Pinski <pinskia@physics.uc.edu> writes:

| On Jun 28, 2005, at 3:10 PM, Olivier Galibert wrote:
| 
| >
| >>   Well, I don't utterly _anything_ about either his position or
| >> yours.  C is
| >> not just a high level assembler, it has complex and abstract semantics
| >> imposed on that;
| >
| > Yes.  But C is _also_ a high level assembler, and ignoring that is
| > foolish.
| 
| No it is not. It was when it was designed yes but since the C standard
| has
| come out and the aliasing rules really show that it is not a high level
| assembler language any more.

The strict aliasing rule by itself does not show it is not a high level
assembly language.  There are chips out there where you cannot access
data willy-nilly through random register types.

-- Gaby

^ permalink raw reply	[flat|nested] 119+ messages in thread

* Re: signed is undefined and has been since 1992 (in GCC)
  2005-06-28 19:17                                     ` Olivier Galibert
@ 2005-06-28 19:21                                       ` Robert Dewar
  2005-06-28 20:18                                         ` Paul Koning
                                                           ` (2 more replies)
  0 siblings, 3 replies; 119+ messages in thread
From: Robert Dewar @ 2005-06-28 19:21 UTC (permalink / raw)
  To: Olivier Galibert
  Cc: Dave Korn, 'Andrew Haley', 'Gabriel Dos Reis',
	'Andrew Pinski', 'gcc mailing list'

I am puzzled, why would *ANYONE* who knows C use int
rather than unsigned if they want wrap around semantics?
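
A minimal sketch of the point, assuming a free-running tick counter that
is allowed to wrap. The helper name elapsed_ticks is hypothetical. Because
unsigned arithmetic is defined to be modulo UINT_MAX + 1, the difference
stays correct even when the counter has wrapped between the two samples.

```cpp
// Hypothetical helper, for illustration only.
// Returns how many ticks passed between 'then' and 'now', both sampled
// from the same wrapping unsigned counter. The subtraction is performed
// modulo 2^N by the language rules, so no overflow check is needed.
constexpr unsigned elapsed_ticks(unsigned now, unsigned then) {
    return now - then;
}
```

The same computation written with int would be undefined whenever the
subtraction overflowed.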

^ permalink raw reply	[flat|nested] 119+ messages in thread

* Re: signed is undefined and has been since 1992 (in GCC)
  2005-06-28 19:13                                       ` Andrew Pinski
@ 2005-06-28 19:20                                         ` Robert Dewar
  2005-06-28 21:48                                           ` Joe Buck
  2005-06-28 19:25                                         ` Gabriel Dos Reis
  1 sibling, 1 reply; 119+ messages in thread
From: Robert Dewar @ 2005-06-28 19:20 UTC (permalink / raw)
  To: Andrew Pinski
  Cc: Olivier Galibert, 'Gabriel Dos Reis',
	'gcc mailing list', 'Andrew Haley',
	Dave Korn

Andrew Pinski wrote:

> No it is not. It was when it was designed yes but since the C standard has
> come out and the aliasing rules really show that it is not a high level
> assembler language any more.

Even when it was designed there was more abstraction than you think (e.g.
you cannot convert a char * value to int *, and you cannot compare
addresses in different allocated objects).

^ permalink raw reply	[flat|nested] 119+ messages in thread

* Re: signed is undefined and has been since 1992 (in GCC)
  2005-06-28 19:02                                 ` Gabriel Dos Reis
@ 2005-06-28 19:17                                   ` Robert Dewar
  2005-06-28 19:43                                     ` Gabriel Dos Reis
  2005-06-28 22:09                                     ` Joseph S. Myers
  0 siblings, 2 replies; 119+ messages in thread
From: Robert Dewar @ 2005-06-28 19:17 UTC (permalink / raw)
  To: Gabriel Dos Reis
  Cc: Olivier Galibert, Andrew Haley, Dave Korn,
	'Andrew Pinski', 'gcc mailing list'

Gabriel Dos Reis wrote:
> Robert Dewar <dewar@adacore.com> writes:
> 
> | Olivier Galibert wrote:
> | 
> | > Calling a large part of the programs out there, including a non
> | > negligible subpart of what I personally write either "blatantly buggy"
> | > or "subtly-incorrect" is somewhat childish and insulting.
> | 
> | nope, I don't see it that way at all, this is just a statement of fact
> | wrt the ISO standard. You may want C or C++ to be different from what
> | it is, but the standard is widely accepted,
> 
> Those standards are widely accepted and so are very facts that they do
> have connections with hardware.  In fact, the respective committees
> do work seriously on producing TRs for embedded systems and access to
> hardware semantics. 

I think you miss my point completely about hardware semantics. Let me
try, though I am dubious it will be clear. The type int in C is not
a hardware type, it is a type with properties defined by the standard.
There is not necessarily any unique expectation on how this will be
mapped to hardware, or what operations will be mapped. The only
requirement is that the required semantics of int in the standard
are preserved. For instance on the IBM mainframe one might use signed
or unsigned operations to implement int operations. On the original
MIPS one might use trapping or non-trapping arithmetic (either would
be valid).

Yes, sometimes you need access to the hardware and need to control
exactly what is generated, but using int is not a mechanism that
provides any such guarantee.
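
A minimal sketch of what asking for a specific behaviour explicitly can
look like, rather than relying on whichever instruction the compiler
picks for int. The helper name add_overflows is hypothetical. It tests
the operands before the addition, so the overflowing operation is never
executed and the result is the same on trapping and wrapping hardware.

```cpp
#include <climits>

// Hypothetical helper, for illustration only.
// True when a + b would not fit in int. Both comparisons are arranged so
// that the intermediate INT_MAX - b and INT_MIN - b cannot overflow.
constexpr bool add_overflows(int a, int b) {
    return (b > 0 && a > INT_MAX - b) || (b < 0 && a < INT_MIN - b);
}
```

A caller would branch on add_overflows(a, b) and only then compute a + b,
choosing for itself whether to saturate, report an error, or switch to
unsigned arithmetic.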

^ permalink raw reply	[flat|nested] 119+ messages in thread

* Re: signed is undefined and has been since 1992 (in GCC)
  2005-06-28 18:52                                   ` Robert Dewar
@ 2005-06-28 19:17                                     ` Olivier Galibert
  2005-06-28 19:21                                       ` Robert Dewar
  0 siblings, 1 reply; 119+ messages in thread
From: Olivier Galibert @ 2005-06-28 19:17 UTC (permalink / raw)
  To: Robert Dewar
  Cc: Dave Korn, 'Andrew Haley', 'Gabriel Dos Reis',
	'Andrew Pinski', 'gcc mailing list'

On Tue, Jun 28, 2005 at 02:52:10PM -0400, Robert Dewar wrote:
> Olivier Galibert wrote:
> >On Tue, Jun 28, 2005 at 06:36:26PM +0100, Dave Korn wrote:
> >
> >> It certainly wasn't meant to be.  It was meant to be a dispassionate
> >>description of the state of facts.  Software that violates the C standard
> >>just *is* "buggy" or "incorrect", and your personal pride has absolutely
> >>nothing to do with it.
> >
> >
> >Then your definition of "incorrect" is uninteresting.  Per your
> >definition, "use of implementation-defined behaviour is incorrect",
> >essentially no non-trivial program is correct.  Including gcc for a
> >start, which can't be correct, ever.
> 
> Nope, there is nothing in the C standard that suggests that a program
> relying on implementation-defined behavior is incorrect or buggy, and that
> has nothing to do with what Dave Korn wrote. There is a world of
> difference between undefined and implementation-defined.

Dave Korn wrote, in answer to a list of assumptions I wrote, that
"There is a vast amount of bad software in the world, some blatantly
buggy, some subtly-incorrect.  To attempt to fix it all in the
compiler rather than the source seems a bit bass-ackwards to me!"

These assumptions were:
- signed and unsigned types are modulo, except in loop induction
  variables where it's bad taste (to rely on the overflow doing
  anything specific)

- sizeof(int) == 4, sizeof(long long) == 8

- sizeof(long) == sizeof(void *) == sizeof(void (*)())

Please tell me which of these assumptions are not
implementation-defined but instead undefined.
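
The size assumptions in the list are implementation-defined, which means
they can be stated in the source and checked at build time. A minimal
sketch, assuming a hypothetical helper named layout_assumptions_hold; it
says nothing about the first assumption, since overflow behaviour cannot
be checked with sizeof.

```cpp
// Hypothetical helper, for illustration only: true when the target's
// data model matches the size assumptions listed above.
constexpr bool layout_assumptions_hold() {
    return sizeof(int) == 4
        && sizeof(long long) == 8
        && sizeof(long) == sizeof(void *)
        && sizeof(long) == sizeof(void (*)());
}

// In a real project this line, uncommented, would stop the build on a
// target with a different data model:
// static_assert(layout_assumptions_hold(), "unsupported data model");
```

On a target where long is narrower than a pointer the function returns
false, which is exactly the case the check is meant to catch.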

  OG.

^ permalink raw reply	[flat|nested] 119+ messages in thread

* Re: signed is undefined and has been since 1992 (in GCC)
  2005-06-28 19:10                                     ` Olivier Galibert
@ 2005-06-28 19:13                                       ` Andrew Pinski
  2005-06-28 19:20                                         ` Robert Dewar
  2005-06-28 19:25                                         ` Gabriel Dos Reis
  0 siblings, 2 replies; 119+ messages in thread
From: Andrew Pinski @ 2005-06-28 19:13 UTC (permalink / raw)
  To: Olivier Galibert
  Cc: 'Gabriel Dos Reis', 'Robert Dewar',
	'gcc mailing list', 'Andrew Haley',
	Dave Korn


On Jun 28, 2005, at 3:10 PM, Olivier Galibert wrote:

>
>>   Well, I don't utterly _anything_ about either his position or 
>> yours.  C is
>> not just a high level assembler, it has complex and abstract semantics
>> imposed on that;
>
> Yes.  But C is _also_ a high level assembler, and ignoring that is 
> foolish.

No, it is not. It was when it was designed, yes, but the C standard has
since come out, and the aliasing rules really show that it is not a high
level assembler language any more.

-- Pinski

^ permalink raw reply	[flat|nested] 119+ messages in thread

* Re: signed is undefined and has been since 1992 (in GCC)
  2005-06-28 18:36                                   ` Dave Korn
  2005-06-28 18:56                                     ` Gabriel Dos Reis
@ 2005-06-28 19:10                                     ` Olivier Galibert
  2005-06-28 19:13                                       ` Andrew Pinski
  1 sibling, 1 reply; 119+ messages in thread
From: Olivier Galibert @ 2005-06-28 19:10 UTC (permalink / raw)
  To: Dave Korn
  Cc: 'Andrew Haley', 'Robert Dewar',
	'Gabriel Dos Reis', 'Andrew Pinski',
	'gcc mailing list'

On Tue, Jun 28, 2005 at 07:36:00PM +0100, Dave Korn wrote:
> ----Original Message----
> >From: Olivier Galibert
> >Sent: 28 June 2005 19:02
> 
> > On Tue, Jun 28, 2005 at 06:36:26PM +0100, Dave Korn wrote:
> >>   It certainly wasn't meant to be.  It was meant to be a dispassionate
> >> description of the state of facts.  Software that violates the C standard
> >> just *is* "buggy" or "incorrect", and your personal pride has absolutely
> >> nothing to do with it.
> > 
> > Then your definition of "incorrect" is uninteresting.  Per your
> > definition, "use of implementation-defined behaviour is incorrect",
> 
>   Please don't put words in my mouth.  That isn't remotely what I said, and
> if you are trying to paraphrase it, you have changed the meaning.  Undefined
> is not the same thing as implementation defined.

Sizeofs and encodings of the different types and behaviour on overflow
are implementation-defined, not undefined afaik.  In any case, if I
misunderstood you and you felt insulted, please accept my apologies.


>   Yep, but at that point, your definition starts to look uninteresting,
> because now it's starting to look like "We should be able to rely on signed
> ints wrapping in all circumstances, except those in which they don't."  A
> lot of this discussion has focussed on loop optimisations, but can you
> guarantee those are the *only* optimisations which really benefit from
> assuming signed ints don't wrap?  As far as I can see, it is reasonable for
> the C standard to say that all signed integer overflow is undefined because
> enumerating the circumstances in which it is or is not defined would be an
> unbounded and poorly-defined task.  A language feature that sometimes works
> and sometimes does not and there is no easy way to know whether it will or
> will not work in any given circumstances is *not* a useful feature, it's a
> dangerous ambiguity - worse than useless in a production environment.

You get it the wrong way around.  It's "We should be able to rely on
signed ints wrapping in all circumstances, except those in which we
don't care.".  In loop induction variables, usually people don't care.
In all the rest, they often do, especially in emulators and virtual
machines.  It's not an optimization when all you do is make programs
give the wrong result 0.1% faster.  Unless you want a repeat of the
strict-aliasing fiasco, only worse.
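
For the emulator case, a minimal sketch of getting wraparound that the
standard does guarantee, assuming a hypothetical 16-bit program counter
and a hypothetical helper named pc_advance. The operands are promoted
before the addition, the sum cannot overflow the promoted type, and the
conversion back to the unsigned 16-bit type is defined to reduce modulo
2^16.

```cpp
#include <cstdint>

// Hypothetical emulator helper, for illustration only.
// Advances a 16-bit program counter, wrapping from 0xFFFF back to 0.
constexpr std::uint16_t pc_advance(std::uint16_t pc, std::uint16_t delta) {
    return static_cast<std::uint16_t>(pc + delta);
}
```

This only covers registers that can be modelled as unsigned; it does not
settle what should happen for code that wraps a plain int.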


>   Well, I don't utterly _anything_ about either his position or yours.  C is
> not just a high level assembler, it has complex and abstract semantics
> imposed on that;

Yes.  But C is _also_ a high level assembler, and ignoring that is foolish.

  OG.

^ permalink raw reply	[flat|nested] 119+ messages in thread

* Re: signed is undefined and has been since 1992 (in GCC)
  2005-06-28 18:50                               ` Robert Dewar
@ 2005-06-28 19:02                                 ` Gabriel Dos Reis
  2005-06-28 19:17                                   ` Robert Dewar
  0 siblings, 1 reply; 119+ messages in thread
From: Gabriel Dos Reis @ 2005-06-28 19:02 UTC (permalink / raw)
  To: Robert Dewar
  Cc: Olivier Galibert, Andrew Haley, Dave Korn,
	'Andrew Pinski', 'gcc mailing list'

Robert Dewar <dewar@adacore.com> writes:

| Olivier Galibert wrote:
| 
| > Calling a large part of the programs out there, including a non
| > negligible subpart of what I personally write either "blatantly buggy"
| > or "subtly-incorrect" is somewhat childish and insulting.
| 
| nope, I don't see it that way at all, this is just a statement of fact
| wrt the ISO standard. You may want C or C++ to be different from what
| it is, but the standard is widely accepted,

Those standards are widely accepted, and so is the very fact that they do
have connections with hardware.  In fact, the respective committees
do work seriously on producing TRs for embedded systems and access to
hardware semantics. 

-- Gaby

^ permalink raw reply	[flat|nested] 119+ messages in thread

* Re: signed is undefined and has been since 1992 (in GCC)
  2005-06-28 18:36                                   ` Dave Korn
@ 2005-06-28 18:56                                     ` Gabriel Dos Reis
  2005-06-28 19:10                                     ` Olivier Galibert
  1 sibling, 0 replies; 119+ messages in thread
From: Gabriel Dos Reis @ 2005-06-28 18:56 UTC (permalink / raw)
  To: Dave Korn
  Cc: 'Olivier Galibert', 'Andrew Haley',
	'Robert Dewar', 'Andrew Pinski',
	'gcc mailing list'

"Dave Korn" <dave.korn@artimi.com> writes:

[...]

| >Maybe you should reread what I was replying to:
| > 
| > On Tue, Jun 28, 2005 at 08:57:20AM -0400, Robert Dewar wrote:
| >> But the whole idea of hardware semantics is bogus, since you are
| >> assuming some connection between C and the hardware which does not
| >> exist. C is not an assembly language.
| > 
| > That is what I utterly disagree with.
| 
|   Well, I don't utterly _anything_ about either his position or yours.  C is
| not just a high level assembler, it has complex and abstract semantics
| imposed on that; it may have been reasonable to treat it as such back in the
| very early K'n'R days, but it has changed massively since then.  I also
| agree that reasoning in the utter abstract about languages is not
| necessarily very useful in practice, but it is a perfectly reasonable way to
| define a baseline against which it becomes possible to analyze the
| similarities and differences of any real-world implementation.

When the baseline is that "C or C++ has no connection with hardware
semantics", it becomes ridiculous and uninteresting.

-- Gaby

^ permalink raw reply	[flat|nested] 119+ messages in thread

* Re: signed is undefined and has been since 1992 (in GCC)
  2005-06-28 18:21                                 ` Gabriel Dos Reis
@ 2005-06-28 18:53                                   ` Robert Dewar
  0 siblings, 0 replies; 119+ messages in thread
From: Robert Dewar @ 2005-06-28 18:53 UTC (permalink / raw)
  To: Gabriel Dos Reis
  Cc: Joe Buck, Olivier Galibert, Andrew Haley, Dave Korn,
	'Andrew Pinski', 'gcc mailing list'

Gabriel Dos Reis wrote:

> But the compiler miscompiled the Unix kernel -- which, apart from having
> its history intermixed with the C language design, was relying on an
> "undocumented" aspect of "undefined behaviour".  Nobody was willing to
> buy the compiler.  The company went out of business.

Could you please give a proper reference for this? It would be interesting
to know that this is a real story, and not just legend :-)

^ permalink raw reply	[flat|nested] 119+ messages in thread

* Re: signed is undefined and has been since 1992 (in GCC)
  2005-06-28 18:02                                 ` Olivier Galibert
  2005-06-28 18:36                                   ` Dave Korn
@ 2005-06-28 18:52                                   ` Robert Dewar
  2005-06-28 19:17                                     ` Olivier Galibert
  1 sibling, 1 reply; 119+ messages in thread
From: Robert Dewar @ 2005-06-28 18:52 UTC (permalink / raw)
  To: Olivier Galibert
  Cc: Dave Korn, 'Andrew Haley', 'Gabriel Dos Reis',
	'Andrew Pinski', 'gcc mailing list'

Olivier Galibert wrote:
> On Tue, Jun 28, 2005 at 06:36:26PM +0100, Dave Korn wrote:
> 
>>  It certainly wasn't meant to be.  It was meant to be a dispassionate
>>description of the state of facts.  Software that violates the C standard
>>just *is* "buggy" or "incorrect", and your personal pride has absolutely
>>nothing to do with it.
> 
> 
> Then your definition of "incorrect" is uninteresting.  Per your
> definition, "use of implementation-defined behaviour is incorrect",
> essentially no non-trivial program is correct.  Including gcc for a
> start, which can't be correct, ever.

Nope, there is nothing in the C standard that suggests that a program
relying on implementation-defined behavior is incorrect or buggy, and that
has nothing to do with what Dave Korn wrote. There is a world of
difference between undefined and implementation-defined.
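
A minimal sketch of that distinction in C. The function names are made up
for illustration; the first function relies on implementation-defined
behavior (the implementation must document what it does), the second avoids
undefined behavior by never evaluating an overflowing signed addition.

```c
#include <limits.h>

/* Implementation-defined: converting an unsigned value that does not fit
   in int.  The implementation has to document the result, so relying on
   it is non-portable rather than incorrect. */
int to_int_impl_defined(unsigned u)
{
    return (int) u;
}

/* Undefined: signed overflow.  This helper checks the operands first and
   reports failure (returns 0) instead of evaluating a + b when the sum
   would not fit in int. */
int checked_add(int a, int b, int *out)
{
    if ((b > 0 && a > INT_MAX - b) || (b < 0 && a < INT_MIN - b))
        return 0;               /* would overflow: a + b is not evaluated */
    *out = a + b;
    return 1;
}
```

A caller tests the return value of checked_add before using *out.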

^ permalink raw reply	[flat|nested] 119+ messages in thread

* Re: signed is undefined and has been since 1992 (in GCC)
  2005-06-28 17:18                             ` Olivier Galibert
  2005-06-28 17:36                               ` Dave Korn
  2005-06-28 17:51                               ` Joe Buck
@ 2005-06-28 18:50                               ` Robert Dewar
  2005-06-28 19:02                                 ` Gabriel Dos Reis
  2 siblings, 1 reply; 119+ messages in thread
From: Robert Dewar @ 2005-06-28 18:50 UTC (permalink / raw)
  To: Olivier Galibert
  Cc: Andrew Haley, Dave Korn, 'Gabriel Dos Reis',
	'Andrew Pinski', 'gcc mailing list'

Olivier Galibert wrote:

> Calling a large part of the programs out there, including a non
> negligible subpart of what I personally write either "blatantly buggy"
> or "subtly-incorrect" is somewhat childish and insulting.

nope, I don't see it that way at all, this is just a statement of fact
wrt the ISO standard. You may want C or C++ to be different from what
it is, but the standard is widely accepted, and deviations from this
standard should only occur if they are very clearly documented and
very well justified.

^ permalink raw reply	[flat|nested] 119+ messages in thread

* RE: signed is undefined and has been since 1992 (in GCC)
  2005-06-28 18:28                                 ` Olivier Galibert
@ 2005-06-28 18:38                                   ` Dave Korn
  0 siblings, 0 replies; 119+ messages in thread
From: Dave Korn @ 2005-06-28 18:38 UTC (permalink / raw)
  To: 'Olivier Galibert', 'Joe Buck'
  Cc: 'Andrew Haley', 'Robert Dewar',
	'Gabriel Dos Reis', 'Andrew Pinski',
	'gcc mailing list'

----Original Message----
>From: Olivier Galibert
>Sent: 28 June 2005 19:29

> Incidentally, gcc itself makes most of these assumptions in its own
> code.  I kinda doubt you can run it on a dsp or a machine with 16-bit
> ints.  Which is different from generating code for them.

  They aren't assumptions in GCC, they are *very clearly documented
requirements* for GCC host platforms.

   cheers,
      DaveK
-- 
Can't think of a witty .sigline today....

^ permalink raw reply	[flat|nested] 119+ messages in thread

* RE: signed is undefined and has been since 1992 (in GCC)
  2005-06-28 18:02                                 ` Olivier Galibert
@ 2005-06-28 18:36                                   ` Dave Korn
  2005-06-28 18:56                                     ` Gabriel Dos Reis
  2005-06-28 19:10                                     ` Olivier Galibert
  2005-06-28 18:52                                   ` Robert Dewar
  1 sibling, 2 replies; 119+ messages in thread
From: Dave Korn @ 2005-06-28 18:36 UTC (permalink / raw)
  To: 'Olivier Galibert'
  Cc: 'Andrew Haley', 'Robert Dewar',
	'Gabriel Dos Reis', 'Andrew Pinski',
	'gcc mailing list'

----Original Message----
>From: Olivier Galibert
>Sent: 28 June 2005 19:02

> On Tue, Jun 28, 2005 at 06:36:26PM +0100, Dave Korn wrote:
>>   It certainly wasn't meant to be.  It was meant to be a dispassionate
>> description of the state of facts.  Software that violates the C standard
>> just *is* "buggy" or "incorrect", and your personal pride has absolutely
>> nothing to do with it.
> 
> Then your definition of "incorrect" is uninteresting.  Per your
> definition, "use of implementation-defined behaviour is incorrect",

  Please don't put words in my mouth.  That isn't remotely what I said, and
if you are trying to paraphrase it, you have changed the meaning.  Undefined
is not the same thing as implementation defined.

>>   If you re-read what *you* originally said, you made it look like you
>> were talking in abstract terms about software-in-general,
> 
> I said "A very large number of C and C++ programs".  That includes
> kernels, gnome, kde, lots of things.  Or if you want programs I
> work(ed) on, xemacs and mame.

  Well, then it counts stuff I've worked on as well.  And?

>> and that's certainly
>> what I was referring to when I replied; it's unreasonable of you to
>> point at that very generalised sentence and suddenly say "I was talking
>> about my own code, even though I hid the fact, and so you've insulted me
>> by disparaging it".
> 
> You disparaged probably around 99% of a typical linux distribution.

  I didn't disparage anything.  I described non-compliance with the standard
as representing anything on a scale from "blatantly buggy" (BTW, 'blatant'
means 'openly and visibly', it is not any kind of a pejorative term) to
"subtly incorrect", which seems to me a fair description of the sorts of
problems that can arise from disregarding the language spec.  It is only you
who is reading an emotional cast into this.

> Find one non-trivial program that doesn't assume that int is 32 bits.
> Find one of *your* programs that doesn't.

  Last one I wrote on my Amiga (all of them, in fact).  And?

>>   No number of correct assumptions about the sizes of various types or
>> the representation of NULL pointers will validate the incorrect
>> assumption that signed integer arithmetic could be made to wrap without
>> obliging the compiler to emit lousy code and miss an awful lot of
>> loop-optimisation opportunities.
> 
> Sure, and you'll notice I always special-cased the loop induction
> variables.  

  Yep, but at that point, your definition starts to look uninteresting,
because now it's starting to look like "We should be able to rely on signed
ints wrapping in all circumstances, except those in which they don't."  A
lot of this discussion has focussed on loop optimisations, but can you
guarantee those are the *only* optimisations which really benefit from
assuming signed ints don't wrap?  As far as I can see, it is reasonable for
the C standard to say that all signed integer overflow is undefined because
enumerating the circumstances in which it is or is not defined would be an
unbounded and poorly-defined task.  A language feature that sometimes works
and sometimes does not and there is no easy way to know whether it will or
will not work in any given circumstances is *not* a useful feature, it's a
dangerous ambiguity - worse than useless in a production environment.
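
As an illustration of a non-loop case, here is a small sketch (function
names are hypothetical).  Because signed overflow is undefined, a compiler
may fold the first comparison to a constant; the unsigned version is
defined to wrap, so the comparison has to be kept.  Whether a particular
compiler actually performs the folding is not something this sketch shows.

```c
#include <limits.h>

/* With signed overflow undefined, a compiler may reduce this to
   "return 1": for every x where x + 1 is defined, the result is true. */
int next_is_greater(int x)
{
    return x + 1 > x;
}

/* Unsigned arithmetic is defined to wrap, so UINT_MAX + 1u is 0 and the
   comparison is false for x == UINT_MAX.  It cannot be folded away. */
int next_is_greater_u(unsigned x)
{
    return x + 1u > x;
}
```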

>Maybe you should reread what I was replying to:
> 
> On Tue, Jun 28, 2005 at 08:57:20AM -0400, Robert Dewar wrote:
>> But the whole idea of hardware semantics is bogus, since you are
>> assuming some connection between C and the hardware which does not
>> exist. C is not an assembly language.
> 
> That is what I utterly disagree with.

  Well, I don't utterly _anything_ about either his position or yours.  C is
not just a high level assembler, it has complex and abstract semantics
imposed on that; it may have been reasonable to treat it as such back in the
very early K'n'R days, but it has changed massively since then.  I also
agree that reasoning in the utter abstract about languages is not
necessarily very useful in practice, but it is a perfectly reasonable way to
define a baseline against which it becomes possible to analyze the
similarities and differences of any real-world implementation.


    cheers,
      DaveK
-- 
Can't think of a witty .sigline today....

^ permalink raw reply	[flat|nested] 119+ messages in thread

* Re: signed is undefined and has been since 1992 (in GCC)
  2005-06-28 17:51                               ` Joe Buck
  2005-06-28 18:21                                 ` Gabriel Dos Reis
@ 2005-06-28 18:28                                 ` Olivier Galibert
  2005-06-28 18:38                                   ` Dave Korn
  1 sibling, 1 reply; 119+ messages in thread
From: Olivier Galibert @ 2005-06-28 18:28 UTC (permalink / raw)
  To: Joe Buck
  Cc: Andrew Haley, Dave Korn, 'Robert Dewar',
	'Gabriel Dos Reis', 'Andrew Pinski',
	'gcc mailing list'

On Tue, Jun 28, 2005 at 10:50:39AM -0700, Joe Buck wrote:
> On Tue, Jun 28, 2005 at 07:17:52PM +0200, Olivier Galibert wrote:
> > On Tue, Jun 28, 2005 at 04:03:49PM +0100, Andrew Haley wrote:
> > > This is childish and insulting.
> > 
> > Calling a large part of the programs out there, including a non
> > negligible subpart of what I personally write either "blatantly buggy"
> > or "subtly-incorrect" is somewhat childish and insulting.
> 
> I agree, partly, with Olivier.  However, let's not insult each other;
> we need to recognize that GCC developers have to worry about embedded
> systems, where some of the assumptions Olivier makes do not hold.

Oh yes.  But you have to be careful not to break them for the
non-embedded world, and in any case I was only answering to the
affirmation that "C the language is disconnected from the hardware".

Incidentally, gcc itself makes most of these assumptions in its own
code.  I kinda doubt you can run it on a dsp or a machine with 16-bit
ints.  Which is different from generating code for them.


> However, I am careful to document them, and disagree with a couple
> (particularly assuming things about unaligned access; even when the
> architecture permits it, there's a substantial speed penalty).

Sure, and I tend to avoid them too, especially since I still run code
on sgis and alphas.  Still, they're often considered slow but not
crashing in lots of programs out there.

> Also, some of Olivier's assumptions could lead to less maintainable
> code; sloppy typing can hide errors, even for assumptions that are
> safe with all ILP32 and LP64 machines with IEEE FP.

The pointer/long interchangeability, thankfully, appears less often.
I see it raise its ugly head essentially in cases of printing (%p is
recent), pointer difference (which fits in a long but not an int),
virtual machines where storage is tightly controlled (lisp engines in
particular), and more often when you want to pass a value through a
library-opaque void * (thread starts, callbacks) but you do not want
to have structure lifetime issues.  In these cases you know you can
pass up to a long unscathed.


> It would be best to say that the assumptions are non-portable, too
> non-portable to be used in code contributed to GCC.

Some are already there and won't go, the most obvious one is that int
is more than 16 bits.  There are also some:
    INDEX_EDGE (edge_list, i)->aux = (void *) (size_t) i;

which is I guess one of your cases of sloppy typing.  I'm sure we can
find a number of assumptions you wouldn't like on the boehm collector
too, simply because code so close to the memory management needs them
badly.
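
A sketch of the "pass a value through a library-opaque void *" case, using
intptr_t from <stdint.h> instead of long.  All names here are made up, and
the integer-to-pointer conversion is still implementation-defined; the
sketch assumes an implementation where a small integer survives the round
trip, which is the same kind of assumption discussed above.

```c
#include <stdint.h>

/* Shape of a typical library callback taking an opaque argument. */
typedef void (*callback_fn)(void *opaque);

static int last_seen;

/* The callback recovers the integer that was smuggled through void *. */
static void record_index(void *opaque)
{
    last_seen = (int) (intptr_t) opaque;
}

/* Stand-in for the library: it only forwards the opaque pointer. */
static void run_callback(callback_fn fn, void *opaque)
{
    fn(opaque);
}

/* Passes i by value, so there is no structure whose lifetime matters. */
int pass_index(int i)
{
    run_callback(record_index, (void *) (intptr_t) i);
    return last_seen;
}
```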

  OG.

^ permalink raw reply	[flat|nested] 119+ messages in thread

* Re: signed is undefined and has been since 1992 (in GCC)
  2005-06-28 17:51                               ` Joe Buck
@ 2005-06-28 18:21                                 ` Gabriel Dos Reis
  2005-06-28 18:53                                   ` Robert Dewar
  2005-06-28 18:28                                 ` Olivier Galibert
  1 sibling, 1 reply; 119+ messages in thread
From: Gabriel Dos Reis @ 2005-06-28 18:21 UTC (permalink / raw)
  To: Joe Buck
  Cc: Olivier Galibert, Andrew Haley, Dave Korn, 'Robert Dewar',
	'Andrew Pinski', 'gcc mailing list'

Joe Buck <Joe.Buck@synopsys.COM> writes:

| On Tue, Jun 28, 2005 at 07:17:52PM +0200, Olivier Galibert wrote:
| > On Tue, Jun 28, 2005 at 04:03:49PM +0100, Andrew Haley wrote:
| > > This is childish and insulting.
| > 
| > Calling a large part of the programs out there, including a non
| > negligible subpart of what I personally write either "blatantly buggy"
| > or "subtly-incorrect" is somewhat childish and insulting.
| 
| I agree, partly, with Olivier.  However, let's not insult each other;
| we need to recognize that GCC developers have to worry about embedded
| systems, where some of the assumptions Olivier makes do not hold.
| 
| I make some of the same tradeoffs in my code as Olivier does, because
| the assumptions are true of all of the target platforms we care about.
| (In particular, either ILP32 or LP64, with IEEE FP arithmetic; for
| everything else we need rigorous type safety, unions if pointers share
| storage with longs, etc).

Once upon a time, a bunch of good, expert, knowledgeable guys got
together to design an optimizing compiler for C.  They had one of the
best (if not The Best) optimizing C compilers at the time for
programs written within the well-defined part of C -- taking advantage
of undefined behaviour to propagate assumptions forward and backward.
It could blow up just about any competing compiler at the time.

...

But the compiler miscompiled the Unix kernel -- which, apart from having a
history intermixed with the C language design, was relying on
"undocumented" aspects of "undefined behaviour".  Nobody was willing to
buy the compiler.  The company went out of business.

-- Gaby

^ permalink raw reply	[flat|nested] 119+ messages in thread

* Re: signed is undefined and has been since 1992 (in GCC)
  2005-06-28 17:34               ` Joe Buck
@ 2005-06-28 18:09                 ` Gabriel Dos Reis
  0 siblings, 0 replies; 119+ messages in thread
From: Gabriel Dos Reis @ 2005-06-28 18:09 UTC (permalink / raw)
  To: Joe Buck; +Cc: Michael Veksler, gcc

Joe Buck <Joe.Buck@synopsys.COM> writes:

| On Tue, Jun 28, 2005 at 07:02:49PM +0200, Gabriel Dos Reis wrote:
| > | Since behavior on integer overflow is undefined, we can optimize assuming
| > | that overflow has not occurred.  Then a > c, so the for loop always
| > | executes b+1 times, and we end up with
| > | 
| > |     if (b > 0)
| > | 	some_func(b+1);
| > | 
| > | Any attempt to assign meaning to integer overflow would prevent this
| > | optimization.
| > 
| > We document that  
| >     
| >     a = (int) ((unsigned) b + c)
| > 
| > is well-defined and given by the wrapping semantics.  Does the current
| > optimizer take that into account or will it assume b+1 execution times?
| 
| C/C++ require unsigned to be modulo, and I think it is perfectly
| appropriate to define the cast from unsigned to int to assume two's
| complement behavior.  But if unsigned variables are involved, in my
| example the compiler is forced to produce worse code (it must cover
| the case of wraparound).

From Diego's mail, I understand that the loop optimizer was way too
aggressive in its assumptions and it is fixed now.  So, the next
logical step would be to have the semantics well-documented.
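
For reference, a minimal sketch of the idiom in question.  The helper name
wrap_add is made up.  The addition is done in unsigned arithmetic, which
wraps by definition; the conversion back to int is implementation-defined,
and the sketch assumes the two's complement, modulo 2^N result described
above.

```c
#include <limits.h>

/* Wrapping signed addition: add as unsigned (defined to wrap), then
   convert back to int (implementation-defined for out-of-range values). */
int wrap_add(int b, int c)
{
    return (int) ((unsigned) b + (unsigned) c);
}
```

Under that assumption, wrap_add(INT_MAX, 1) yields INT_MIN, whereas the
plain expression INT_MAX + 1 is undefined.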

| > If the optimizer takes that into account, then the question becomes
| > when do we consider breaking the ABI to switch numeric_limits<signed
| > type>::is_modulo back to old behaviour.
| 
| I think that defining signed types as is_modulo is broken, but I'm not
| sure what consequences follow from this problem (e.g. what kind of user
| code is using this feature, and for what).

numeric_limits<T>::is_modulo is part of the core C++ language (not just 
the library) and any change to that value implies an ABI change, in
the sense any use of numeric_limits<T>::is_modulo in template
declarations (e.g. SFINAE hackery) gets mangled the same but
instantiate to a different function.  I would expect such hackery to
be localized, but it is an ABI change and we must know that (whatever
is decided after).

-- Gaby

^ permalink raw reply	[flat|nested] 119+ messages in thread

* Re: signed is undefined and has been since 1992 (in GCC)
  2005-06-28 17:36                               ` Dave Korn
@ 2005-06-28 18:02                                 ` Olivier Galibert
  2005-06-28 18:36                                   ` Dave Korn
  2005-06-28 18:52                                   ` Robert Dewar
  2005-07-02 17:06                                 ` Florian Weimer
  1 sibling, 2 replies; 119+ messages in thread
From: Olivier Galibert @ 2005-06-28 18:02 UTC (permalink / raw)
  To: Dave Korn
  Cc: 'Andrew Haley', 'Robert Dewar',
	'Gabriel Dos Reis', 'Andrew Pinski',
	'gcc mailing list'

On Tue, Jun 28, 2005 at 06:36:26PM +0100, Dave Korn wrote:
>   It certainly wasn't meant to be.  It was meant to be a dispassionate
> description of the state of facts.  Software that violates the C standard
> just *is* "buggy" or "incorrect", and your personal pride has absolutely
> nothing to do with it.

Then your definition of "incorrect" is uninteresting.  Per your
definition, "use of implementation-defined behaviour is incorrect",
essentially no non-trivial program is correct.  Including gcc for a
start, which can't be correct, ever.


>   If you re-read what *you* originally said, you made it look like you were
> talking in abstract terms about software-in-general,

I said "A very large number of C and C++ programs".  That includes
kernels, gnome, kde, lots of things.  Or if you want programs I
work(ed) on, xemacs and mame.


> and that's certainly
> what I was referring to when I replied; it's unreasonable of you to point at
> that very generalised sentence and suddenly say "I was talking about my own
> code, even though I hid the fact, and so you've insulted me by disparaging
> it".

You disparaged probably around 99% of a typical linux distribution.
Find one non-trivial program that doesn't assume that int is 32 bits.
Find one of *your* programs that doesn't.


>   No number of correct assumptions about the sizes of various types or the
> representation of NULL pointers will validate the incorrect assumption that
> signed integer arithmetic could be made to wrap without obliging the
> compiler to emit lousy code and miss an awful lot of loop-optimisation
> opportunities.

Sure, and you'll notice I always special-cased the loop induction
variables.  Maybe you should reread what I was replying to:

On Tue, Jun 28, 2005 at 08:57:20AM -0400, Robert Dewar wrote:
> But the whole idea of hardware semantics is bogus, since you are
> assuming some connection between C and the hardware which does not
> exist. C is not an assembly language.

That is what I utterly disagree with.

  OG.

^ permalink raw reply	[flat|nested] 119+ messages in thread

* Re: signed is undefined and has been since 1992 (in GCC)
  2005-06-28 17:18                             ` Olivier Galibert
  2005-06-28 17:36                               ` Dave Korn
@ 2005-06-28 17:51                               ` Joe Buck
  2005-06-28 18:21                                 ` Gabriel Dos Reis
  2005-06-28 18:28                                 ` Olivier Galibert
  2005-06-28 18:50                               ` Robert Dewar
  2 siblings, 2 replies; 119+ messages in thread
From: Joe Buck @ 2005-06-28 17:51 UTC (permalink / raw)
  To: Olivier Galibert
  Cc: Andrew Haley, Dave Korn, 'Robert Dewar',
	'Gabriel Dos Reis', 'Andrew Pinski',
	'gcc mailing list'

On Tue, Jun 28, 2005 at 07:17:52PM +0200, Olivier Galibert wrote:
> On Tue, Jun 28, 2005 at 04:03:49PM +0100, Andrew Haley wrote:
> > This is childish and insulting.
> 
> Calling a large part of the programs out there, including a non
> negligible subpart of what I personally write either "blatantly buggy"
> or "subtly-incorrect" is somewhat childish and insulting.

I agree, partly, with Olivier.  However, let's not insult each other;
we need to recognize that GCC developers have to worry about embedded
systems, where some of the assumptions Olivier makes do not hold.

I make some of the same tradeoffs in my code as Olivier does, because
the assumptions are true of all of the target platforms we care about.
(In particular, either ILP32 or LP64, with IEEE FP arithmetic; for
everything else we need rigorous type safety, unions if pointers share
storage with longs, etc).

However, I am careful to document them, and disagree with a couple
(particularly assuming things about unaligned access; even when the
architecture permits it, there's a substantial speed penalty).  Also,
some of Olivier's assumptions could lead to less maintainable code;
sloppy typing can hide errors, even for assumptions that are safe with
all ILP32 and LP64 machines with IEEE FP.
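
One way to document such assumptions in the code itself is sketched below,
assuming a C11 compiler for _Static_assert.  The messages and the helper
name are illustrative only.  The pointer/long assumption is left as a
runtime query so that the file still compiles where it does not hold.

```c
#include <limits.h>

/* Compile-time documentation: the build stops if an assumption is false. */
_Static_assert(CHAR_BIT == 8, "code assumes 8-bit bytes");
_Static_assert(sizeof(int) == 4, "code assumes 32-bit int");
_Static_assert(sizeof(long long) == 8, "code assumes 64-bit long long");

/* Returns 1 when a long is at least as large as an object pointer, which
   holds on ILP32 and LP64 but is not guaranteed by the standard. */
int long_holds_pointer(void)
{
    return sizeof(long) >= sizeof(void *);
}
```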

However:

> And probably some others I'm forgetting.  Thinking that programs which
> rely on these assumptions are incorrect is the attitude of a theorist
> with his head in the sand.

It would be best to say that the assumptions are non-portable, too
non-portable to be used in code contributed to GCC.

^ permalink raw reply	[flat|nested] 119+ messages in thread

* Re: signed is undefined and has been since 1992 (in GCC)
@ 2005-06-28 17:41 Paul Schlie
  0 siblings, 0 replies; 119+ messages in thread
From: Paul Schlie @ 2005-06-28 17:41 UTC (permalink / raw)
  To: Gabriel Dos Reis, Steven Bosscher, Robert Dewar, Andrew Pinski
  Cc: GCC Development

Gabriel Dos Reis writes:
> Steven Bosscher <stevenb@suse.de> writes:
>| On Tuesday 28 June 2005 07:12, Gabriel Dos Reis wrote:
>| > For the concrete case at issue, if the hardware I'm writing the C/C++
>| > programs for consistently displays modulo arithmetic for signed
>| > integer types, Andrew can you tell me why GCC should deny me access
>| > to that functionality where it actually can?
>| 
>| Because it disallows compiler transformations?  E.g. suddenly a
>| loop with a signed variable as the loop counter may wrap around,
>| which means that some transformations that are safe now would
>| no longer be safe.
>
> You have to define "safe".  Obviously, if you make the assumption that
> with signed overflow, all bets are off, then you can apply
> transformations predicated on that assumption.  If you take the
> assumption that signed overflow is defined and supported, you can apply
> transformations predicated on that assumption.  In either case, "safe"
> is with respect to the semantics chosen.

Yes.

Overall the problem/conflict seems to revolve around the differentiation
between what's allowed vs. should be done; where "should" is of course
subjective, so it may be worth first attempting to define philosophically
what an idealized implementation "should" do, and then strive to follow this
guideline in cases where the compiler has the freedom to do so.

For what it's worth, a compiler "should":

- Strictly implement all mandated language semantics by default; although
  may enable these semantics to be extended and/or altered, if believed to
  be "Most Likely Beneficial", only by explicit request to do so.

- Define and correspondingly strictly implement the "Most Likely Beneficial"
  semantics which are not mandated by the standard as if they were, and
  emit a warning for every such behavior which is known to be likely to
  express itself (e.g. a shift by a value known to have a range which may
  exceed the size of its shifted operand, or the dereference of a pointer
  which is known to potentially be NULL).

- Where the "Most likely Beneficial" semantics are those which most likely:
  - improve the expressiveness and/or consistency of the language.
  - improve the determinism of the language and resulting executable.
  - improve the efficiency of the language and resulting executable by
    adopting and basing optimizations on the target's native semantics
    when not in conflict with the above.

Thereby it's never "Most likely Beneficial" to base optimizations on a
presumed undefined behavior, as no such behaviors should exist; therefore
such optimizations will only likely yield inconsistent and/or potentially
non-deterministic results, which is most likely not beneficial toward the
goals of producing an efficient deterministic program.

(The key to an efficient, consistent, and deterministic compiled program is
leveraging the natural behavior of the target where it is not in conflict
with the language's or implementation's otherwise defined semantics.)
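
As a concrete instance of the shift case mentioned above, here is a small
sketch of a guarded shift.  The name checked_shl is hypothetical, and the
width computation assumes unsigned has no padding bits.  In C, shifting by
a count greater than or equal to the width of the promoted operand is
undefined, so the helper rejects such counts instead of evaluating them.

```c
#include <limits.h>

/* Left shift with an explicit range check on the shift count.
   Returns 1 and stores the result on success, 0 if the count is too big. */
int checked_shl(unsigned value, unsigned count, unsigned *out)
{
    if (count >= sizeof(unsigned) * CHAR_BIT)
        return 0;               /* count out of range: shift not evaluated */
    *out = value << count;
    return 1;
}
```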


^ permalink raw reply	[flat|nested] 119+ messages in thread

* RE: signed is undefined and has been since 1992 (in GCC)
  2005-06-28 17:18                             ` Olivier Galibert
@ 2005-06-28 17:36                               ` Dave Korn
  2005-06-28 18:02                                 ` Olivier Galibert
  2005-07-02 17:06                                 ` Florian Weimer
  2005-06-28 17:51                               ` Joe Buck
  2005-06-28 18:50                               ` Robert Dewar
  2 siblings, 2 replies; 119+ messages in thread
From: Dave Korn @ 2005-06-28 17:36 UTC (permalink / raw)
  To: 'Olivier Galibert', 'Andrew Haley'
  Cc: 'Robert Dewar', 'Gabriel Dos Reis',
	'Andrew Pinski', 'gcc mailing list'

----Original Message----
>From: Olivier Galibert
>Sent: 28 June 2005 18:18

> On Tue, Jun 28, 2005 at 04:03:49PM +0100, Andrew Haley wrote:
>> Olivier Galibert writes:
>>  > On Tue, Jun 28, 2005 at 03:39:38PM +0100, Dave Korn wrote:
>>  > > ----Original Message----
>>  > > >From: Olivier Galibert
>>  > > >Sent: 28 June 2005 15:25
>>  > >
>>  > > > In particular, a very large number of C and C++ programs are written
>>  > > > with the assumptions:
>>  > >
>>  > >   This is a bad line of reasoning in general.  There is a vast amount of bad
>>  > > software in the world, some blatantly buggy, some subtly-incorrect.  To
>>  > > attempt to fix it all in the compiler rather than the source seems a bit
>>  > > bass-ackwards to me!
>>  >
>>  > Welcome to the real world.  Useful compilers are not an exercise in
>>  > theoretical computing, especially for languages like C or C++.
>> 
>> This is childish and insulting.
> 
> Calling a large part of the programs out there, including a non
> negligible subpart of what I personally write either "blatantly buggy"
> or "subtly-incorrect" is somewhat childish and insulting.

  It certainly wasn't meant to be.  It was meant to be a dispassionate
description of the state of facts.  Software that violates the C standard
just *is* "buggy" or "incorrect", and your personal pride has absolutely
nothing to do with it.

  If you re-read what *you* originally said, you made it look like you were
talking in abstract terms about software-in-general, and that's certainly
what I was referring to when I replied; it's unreasonable of you to point at
that very generalised sentence and suddenly say "I was talking about my own
code, even though I hid the fact, and so you've insulted me by disparaging
it".

> Lemme give the long version then.  

  No number of correct assumptions about the sizes of various types or the
representation of NULL pointers will validate the incorrect assumption that
signed integer arithmetic could be made to wrap without obliging the
compiler to emit lousy code and miss an awful lot of loop-optimisation
opportunities.

    cheers,
      DaveK
-- 
Can't think of a witty .sigline today....

^ permalink raw reply	[flat|nested] 119+ messages in thread

* Re: signed is undefined and has been since 1992 (in GCC)
  2005-06-28 17:03             ` Gabriel Dos Reis
  2005-06-28 17:34               ` Joe Buck
@ 2005-06-28 17:35               ` Diego Novillo
  1 sibling, 0 replies; 119+ messages in thread
From: Diego Novillo @ 2005-06-28 17:35 UTC (permalink / raw)
  To: Gabriel Dos Reis; +Cc: Joe Buck, Michael Veksler, gcc

On Tue, Jun 28, 2005 at 07:02:49PM +0200, Gabriel Dos Reis wrote:

> We document that  
>     
>     a = (int) ((unsigned) b + c)
> 
> is well-defined and given by the wrapping semantics.  Does the current
> optimizer take that into account or will it assume b+1 execution times?
> 
I fixed this bug yesterday.  Scalar evolutions was assuming it
couldn't wrap around.


Diego.

^ permalink raw reply	[flat|nested] 119+ messages in thread

* Re: signed is undefined and has been since 1992 (in GCC)
  2005-06-28 17:03             ` Gabriel Dos Reis
@ 2005-06-28 17:34               ` Joe Buck
  2005-06-28 18:09                 ` Gabriel Dos Reis
  2005-06-28 17:35               ` Diego Novillo
  1 sibling, 1 reply; 119+ messages in thread
From: Joe Buck @ 2005-06-28 17:34 UTC (permalink / raw)
  To: Gabriel Dos Reis; +Cc: Michael Veksler, gcc

On Tue, Jun 28, 2005 at 07:02:49PM +0200, Gabriel Dos Reis wrote:
> | Since behavior on integer overflow is undefined, we can optimize assuming
> | that overflow has not occurred.  Then a > c, so the for loop always
> | executes b+1 times, and we end up with
> | 
> |     if (b > 0)
> | 	some_func(b+1);
> | 
> | Any attempt to assign meaning to integer overflow would prevent this
> | optimization.
> 
> We document that  
>     
>     a = (int) ((unsigned) b + c)
> 
> is well-defined and given by the wrapping semantics.  Does the current
> optimizer take that into account or will it assume b+1 execution times?

C/C++ require unsigned to be modulo, and I think it is perfectly
appropriate to define the cast from unsigned to int to assume two's
complement behavior.  But if unsigned variables are involved, in my
example the compiler is forced to produce worse code (it must cover
the case of wraparound).

> If the optimizer takes that into account, then the question becomes
> when do we consider breaking the ABI to switch numeric_limits<signed
> type>::is_modulo back to old behaviour.

I think that defining signed types as is_modulo is broken, but I'm not
sure what consequences follow from this problem (e.g. what kind of user
code is using this feature, and for what).

^ permalink raw reply	[flat|nested] 119+ messages in thread

* Re: signed is undefined and has been since 1992 (in GCC)
  2005-06-28 17:10                           ` Dave Korn
@ 2005-06-28 17:21                             ` Joe Buck
  2005-06-28 22:41                               ` Georg Bauhaus
  0 siblings, 1 reply; 119+ messages in thread
From: Joe Buck @ 2005-06-28 17:21 UTC (permalink / raw)
  To: Dave Korn
  Cc: 'Olivier Galibert', 'Robert Dewar',
	'Gabriel Dos Reis', 'Andrew Pinski',
	'gcc mailing list'

On Tue, Jun 28, 2005 at 06:10:26PM +0100, Dave Korn wrote:
> >>> - sizeof(int) == 4, sizeof(long long) == 8
> >>> 
> >>> - sizeof(long) == sizeof(void *) == sizeof(void (*)())

> >>   And what about 64 bit architectures?  Your assumptions are already
> >> widely invalid and only going to get more so.
> > 
> > No, all of Olivier's assumptions are valid on LP64 as well as ILP32
> > architectures.
> 
>   Well, they're invalid on ILP64, but I guess Cray and Alpha T3E aren't very
> widespread platforms.  But we can expect that ILP64 will become more widely
> used in the future, when the migration from 32-bit platforms starts to
> become nothing more than a distant memory, can't we?

No, it's going to be LP64 (with I=32), and I don't see a reason for that
ever to go away.  32-bit integers are going to remain useful types, and
LP64 architectures finally have char = 8, short = 16, int = 32, long = 64,
which is too useful to break.  Why would anyone now switch int to 64?
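
A quick way to see which of these data models a given compiler uses is
sketched below.  The function name is made up, and the byte counts assume
CHAR_BIT == 8.

```c
/* Classifies the data model from the sizes of int, long and void *. */
const char *data_model(void)
{
    if (sizeof(int) == 4 && sizeof(long) == 4 && sizeof(void *) == 4)
        return "ILP32";
    if (sizeof(int) == 4 && sizeof(long) == 8 && sizeof(void *) == 8)
        return "LP64";
    if (sizeof(int) == 8 && sizeof(long) == 8 && sizeof(void *) == 8)
        return "ILP64";
    return "other";
}
```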

^ permalink raw reply	[flat|nested] 119+ messages in thread

* Re: signed is undefined and has been since 1992 (in GCC)
  2005-06-28 15:04                           ` Andrew Haley
@ 2005-06-28 17:18                             ` Olivier Galibert
  2005-06-28 17:36                               ` Dave Korn
                                                 ` (2 more replies)
  0 siblings, 3 replies; 119+ messages in thread
From: Olivier Galibert @ 2005-06-28 17:18 UTC (permalink / raw)
  To: Andrew Haley
  Cc: Dave Korn, 'Robert Dewar', 'Gabriel Dos Reis',
	'Andrew Pinski', 'gcc mailing list'

On Tue, Jun 28, 2005 at 04:03:49PM +0100, Andrew Haley wrote:
> Olivier Galibert writes:
>  > On Tue, Jun 28, 2005 at 03:39:38PM +0100, Dave Korn wrote:
>  > > ----Original Message----
>  > > >From: Olivier Galibert
>  > > >Sent: 28 June 2005 15:25
>  > > 
>  > > > In particular, a very large number of C and C++ programs are written
>  > > > with the assumptions:
>  > > 
>  > >   This is a bad line of reasoning in general.  There is a vast amount of bad
>  > > software in the world, some blatantly buggy, some subtly-incorrect.  To
>  > > attempt to fix it all in the compiler rather than the source seems a bit
>  > > bass-ackwards to me!
>  > 
>  > Welcome to the real world.  Useful compilers are not an exercise in
>  > theoretical computing, especially for languages like C or C++.
> 
> This is childish and insulting.

Calling a large part of the programs out there, including a non
negligible subpart of what I personally write either "blatantly buggy"
or "subtly-incorrect" is somewhat childish and insulting.


> The standard of discourse on this list has been dropping of late, and
> we'd all get a lot more done if we'd learn to take a deep breath
> before posting.  Try to remember this is a technical discussion, not
> an argument in a bar.

Lemme give the long version then.  C has been created as a system
programming language[1], and is still _the_ system programming
language.  Its types, and even its behaviour, are by design very
dependent on the real hardware.  And the standardization, probably
to the great chagrin of language theorists, has been careful to
reflect that, leaving large areas underspecified to allow both the
compiler and the programmer to rely on the appropriate
hardware-defined behaviour where useful.

Knowing what you run on is not being subtly incorrect, it's being an
engineer instead of a theorist.  Languages which try to define every
behaviour independently of the hardware either run on virtual machines
or require the help of Hossein Rezazadeh to move the specs around.

Now, given this hardware dependence of the language on one side and a
desire for reasonable portability on the other, there pretty much
exist classes of hardware a given program will compile and run on.
Only theoretical, and essentially uninteresting, programs will run on
everything.  The main class today is the one I was talking about:

- 8-bit char, 16-bit short, 32-bit int, 64-bit long long

- 2's-complement without surprises, with the exception that overflow on
  loop induction variables is essentially always a bug

- long is the same size as pointer (intptr_t is not very accepted
  yet), pointer to function and pointer to data have the same size

- 32- and 64-bit IEEE for float and double

- null pointer is all bits 0

- volatile means something can modify the value at any time

- memory is naturally byte-accessible, you just have to be careful
  with alignments (and even then it's often considered that the OS
  should provide transparent unaligned accesses if the hardware
  doesn't do it by itself)

And probably some others I'm forgetting.  Thinking that programs which
rely on these assumptions are incorrect is the attitude of a theorist
with his head in the sand.  The kind of people that should never be
left anywhere near the source of a C compiler.  And to break them you
need a really compelling hardware reason.  Breaking sizeof(void *) ==
sizeof(int) took years for people and programs to adapt and is still
not completely there yet.  Breaking aliasing took a while too.

Even embedded hardware tends to converge on that class for flexibility
and interoperability reasons.  Those who aren't there yet keep a very,
very close control on their software environment and don't plan to run
anything they find on sourceforge without a large amount of effort.
And they use the C compiler even more as a high-level assembler than
most people.


Anyway, my point is that breaking these assumptions just because the
standard does not specify them is silly.  They're driven by the
hardware, not by the wishes of compiler writers or of language
theorists.
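
A program that relies on the class of hardware listed above can say so
in the source and refuse to build elsewhere.  This is only a sketch: it
assumes a compiler with C11 _Static_assert (which did not exist when
this thread was written), the name in_assumed_class() is made up for
the example, and the checks cover sizes and ranges only.  Wrapping
arithmetic, an all-bits-zero null pointer and the meaning of volatile
cannot be verified this way.

```c
#include <float.h>
#include <limits.h>

/* Compile-time checks for the sizes and ranges in the list above. */
_Static_assert(CHAR_BIT == 8, "char is 8 bits");
_Static_assert(sizeof(short) == 2, "short is 16 bits");
_Static_assert(sizeof(int) == 4, "int is 32 bits");
_Static_assert(sizeof(long long) == 8, "long long is 64 bits");
_Static_assert(INT_MIN == -INT_MAX - 1,
               "int has the two's-complement range");
_Static_assert(sizeof(long) == sizeof(void *),
               "long is the same size as a data pointer");
_Static_assert(sizeof(void *) == sizeof(void (*)(void)),
               "data and function pointers have the same size");
_Static_assert(sizeof(float) == 4 && FLT_MANT_DIG == 24,
               "float has the IEEE single-precision layout");
_Static_assert(sizeof(double) == 8 && DBL_MANT_DIG == 53,
               "double has the IEEE double-precision layout");

/* Hypothetical marker: this only compiles if every check above held. */
int in_assumed_class(void)
{
    return 1;
}
```

On an ILP32 or LP64 target this should compile silently; on a target
where long is narrower than a pointer, the sixth assertion stops the
build, which is the intended outcome.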


  OG.

[1] http://cm.bell-labs.com/cm/cs/who/dmr/chist.html

^ permalink raw reply	[flat|nested] 119+ messages in thread

* RE: signed is undefined and has been since 1992 (in GCC)
  2005-06-28 16:42                         ` Joe Buck
@ 2005-06-28 17:10                           ` Dave Korn
  2005-06-28 17:21                             ` Joe Buck
  0 siblings, 1 reply; 119+ messages in thread
From: Dave Korn @ 2005-06-28 17:10 UTC (permalink / raw)
  To: 'Joe Buck'
  Cc: 'Olivier Galibert', 'Robert Dewar',
	'Gabriel Dos Reis', 'Andrew Pinski',
	'gcc mailing list'

----Original Message----
>From: Joe Buck
>Sent: 28 June 2005 17:42

> On Tue, Jun 28, 2005 at 03:39:38PM +0100, Dave Korn wrote:
>> ----Original Message----
>>> From: Olivier Galibert
>>> Sent: 28 June 2005 15:25
>> 
>>> In particular, a very large number of C and C++ programs are written
>>> with the assumptions:
>> 
>>   This is a bad line of reasoning in general.  There is a vast amount of
>> bad software in the world, some blatantly buggy, some subtly-incorrect. 
>> To attempt to fix it all in the compiler rather than the source seems a
>> bit bass-ackwards to me! 
>> 
>>> - sizeof(int) == 4, sizeof(long long) == 8
>>> 
>>> - sizeof(long) == sizeof(void *) == sizeof(void (*)())
>>> 
>>> Break them and see your compiler rejected by pretty much everybody.
>> 
>>   And what about 64 bit architectures?  Your assumptions are already
>> widely invalid and only going to get more so.
> 
> No, all of Olivier's assumptions are valid on LP64 as well as ILP32
> architectures.

  Well, they're invalid on ILP64, but I guess Cray and Alpha T3E aren't very
widespread platforms.  But we can expect that ILP64 will become more widely
used in the future, when the migration from 32-bit platforms starts to
become nothing more than a distant memory, can't we?

>  They are invalid on most DSP architectures, as well as
> on 16-bit embedded architectures, which I suppose you could call "widely
> invalid" if you count # of shipping parts.  

  I'm sure I remember a post from one of the interminably long threads of
the past few weeks that quoted examples of non-standard sizes in some 64-bit
arch(es), but I haven't been able to dig it up.

> But they are a reasonable
> engineering tradeoff in many cases (it's a waste of time in most cases
> for a program that does a lot of bit manipulation to consider the case
> of 37 bit one's complement architectures with an 11 bit byte).

  A waste of time in user apps, yes, but not an assumption that we
necessarily want to embed in the compiler (and bear in mind I'm not talking
about really bizarre stuff with 37 bits or 1's-C or non-8-bit-bytes; I'm
talking only about nice power-of-2 sizes and other sane choices here.)


    cheers,
      DaveK
-- 
Can't think of a witty .sigline today....

^ permalink raw reply	[flat|nested] 119+ messages in thread

* Re: signed is undefined and has been since 1992 (in GCC)
  2005-06-28 16:32           ` Joe Buck
  2005-06-28 16:56             ` Joe Buck
@ 2005-06-28 17:03             ` Gabriel Dos Reis
  2005-06-28 17:34               ` Joe Buck
  2005-06-28 17:35               ` Diego Novillo
  1 sibling, 2 replies; 119+ messages in thread
From: Gabriel Dos Reis @ 2005-06-28 17:03 UTC (permalink / raw)
  To: Joe Buck; +Cc: Michael Veksler, gcc

Joe Buck <Joe.Buck@synopsys.COM> writes:

| On Tue, Jun 28, 2005 at 10:23:51AM +0300, Michael Veksler wrote:
| 
|  
|  On Jun 28, 2005, at 1:12 AM, Gabriel Dos Reis wrote:
| > > >  So,
| > > > please, do refrain from reasoning like "since we did X for Y and Y was
| > > > undefined behaviour, we should do the same for Z."  "Undefined
| > > > behaviour" isn't a 0 or 1 thingy, even though it is about computers.
| > > > You need to evaluate them on case-by-case basis.
| 
| Andrew Pinski wrote on 28/06/2005 08:34:25:

I think there is a slight misattribution in your message.  The example
was given by Michael.

[...]

| Consider a processor whose integer addition instruction wraps.  Then
| the cheapest implementation for examples 1 and 2 above that cover the
| defined cases is to eliminate the loop in case 1, and produce a modulo
| result in case 2.  You worried about interaction between the two
| constructs.  Consider
| 
|     /* int a, b, c; */
|     if (b > 0) {
| 	a = b + c;
| 	int count;
| 	for (int i = c; i <= a; i++)
| 	    count++;
| 	some_func(count);
|     }
| 
| Since behavior on integer overflow is undefined, we can optimize assuming
| that overflow has not occurred.  Then a > c, so the for loop always
| executes b+1 times, and we end up with
| 
|     if (b > 0)
| 	some_func(b+1);
| 
| Any attempt to assign meaning to integer overflow would prevent this
| optimization.

We document that  
    
    a = (int) ((unsigned) b + c)

is well-defined and given by the wrapping semantics.  Does the current
optimizer take that into account or will it assume b+1 execution times?

If the optimizer takes that into account, then the question becomes
when do we consider breaking the ABI to switch numeric_limits<signed
type>::is_modulo back to old behaviour.
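
To make the question concrete, here is a sketch of the quoted loop with
the bound computed through unsigned arithmetic.  The names wrap_add()
and count_wrapping() are invented for the example.  The unsigned
addition is defined by the standard; the conversion back to int is
implementation-defined, and the sketch assumes GCC's documented
behaviour of reducing modulo 2^N.

```c
#include <limits.h>

/* Add with wrapping semantics.  Hypothetical helper: the unsigned sum
   is well-defined, and the cast back to int relies on the
   implementation-defined (GCC-documented) modulo conversion. */
int wrap_add(int b, int c)
{
    return (int)((unsigned)b + (unsigned)c);
}

/* The loop from the quoted message, with count initialized to 0 and
   the bound computed by wrap_add().  Returns the iteration count.
   Note that i++ itself still overflows if a == INT_MAX. */
int count_wrapping(int b, int c)
{
    int count = 0;
    if (b > 0) {
        int a = wrap_add(b, c);
        for (int i = c; i <= a; i++)
            count++;
    }
    return count;
}
```

With b = 1 and c = INT_MAX the bound wraps to INT_MIN and the loop body
never runs, so a compiler that honours the wrapping form cannot replace
the loop with b+1 for all inputs.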

-- Gaby

^ permalink raw reply	[flat|nested] 119+ messages in thread

* Re: signed is undefined and has been since 1992 (in GCC)
  2005-06-28 16:32           ` Joe Buck
@ 2005-06-28 16:56             ` Joe Buck
  2005-06-28 17:03             ` Gabriel Dos Reis
  1 sibling, 0 replies; 119+ messages in thread
From: Joe Buck @ 2005-06-28 16:56 UTC (permalink / raw)
  To: Michael Veksler; +Cc: gcc

On Tue, Jun 28, 2005 at 09:32:49AM -0700, Joe Buck wrote:
>     /* int a, b, c; */
>     if (b > 0) {
> 	a = b + c;
> 	int count;
> 	for (int i = c; i <= a; i++)
> 	    count++;
> 	some_func(count);
>     }

I forgot to initialize count to 0, of course.

^ permalink raw reply	[flat|nested] 119+ messages in thread

* Re: signed is undefined and has been since 1992 (in GCC)
  2005-06-28 14:39                       ` Dave Korn
  2005-06-28 14:52                         ` Olivier Galibert
@ 2005-06-28 16:42                         ` Joe Buck
  2005-06-28 17:10                           ` Dave Korn
  1 sibling, 1 reply; 119+ messages in thread
From: Joe Buck @ 2005-06-28 16:42 UTC (permalink / raw)
  To: Dave Korn
  Cc: 'Olivier Galibert', 'Robert Dewar',
	'Gabriel Dos Reis', 'Andrew Pinski',
	'gcc mailing list'

On Tue, Jun 28, 2005 at 03:39:38PM +0100, Dave Korn wrote:
> ----Original Message----
> >From: Olivier Galibert
> >Sent: 28 June 2005 15:25
> 
> > In particular, a very large number of C and C++ programs are written
> > with the assumptions:
> 
>   This is a bad line of reasoning in general.  There is a vast amount of bad
> software in the world, some blatantly buggy, some subtly-incorrect.  To
> attempt to fix it all in the compiler rather than the source seems a bit
> bass-ackwards to me!
> 
> > - sizeof(int) == 4, sizeof(long long) == 8
> > 
> > - sizeof(long) == sizeof(void *) == sizeof(void (*)())
> > 
> > Break them and see your compiler rejected by pretty much everybody.
> 
>   And what about 64 bit architectures?  Your assumptions are already widely
> invalid and only going to get more so.

No, all of Olivier's assumptions are valid on LP64 as well as ILP32
architectures.  They are invalid on most DSP architectures, as well as
on 16-bit embedded architectures, which I suppose you could call "widely
invalid" if you count # of shipping parts.  But they are a reasonable
engineering tradeoff in many cases (it's a waste of time in most cases
for a program that does a lot of bit manipulation to consider the case
of 37 bit one's complement architectures with an 11 bit byte).




^ permalink raw reply	[flat|nested] 119+ messages in thread

* Re: signed is undefined and has been since 1992 (in GCC)
  2005-06-28 12:57                   ` Robert Dewar
  2005-06-28 13:19                     ` Gabriel Dos Reis
  2005-06-28 14:24                     ` Olivier Galibert
@ 2005-06-28 16:38                     ` Joe Buck
  2005-06-28 21:59                     ` Mike Stump
  3 siblings, 0 replies; 119+ messages in thread
From: Joe Buck @ 2005-06-28 16:38 UTC (permalink / raw)
  To: Robert Dewar; +Cc: Gabriel Dos Reis, Andrew Pinski, gcc mailing list

On Tue, Jun 28, 2005 at 08:57:20AM -0400, Robert Dewar wrote:
> Gabriel Dos Reis wrote:
> 
> >Please do remember that this is hardware dependent.  If you have
> >problems with x86, it does not mean you have the same with a PPC or a
> >Sparc. 
> 
> But the whole idea of hardware semantics is bogus, since you are
> assuming some connection between C and the hardware which does not
> exist. C is not an assembly language.

Actually, those of us who work in hardware-software codesign and formal
verification consider exactly such problems, though it ends up being a
comparison of the behavior of two abstract machines, the (idealized)
processor model and the "C machine".

The distinction between an HLL and assembly language is that in the
latter, every program that is accepted by the tool is mapped into some
definite machine language (though the processor architecture will flag
the behavior of some instructions as undefined).

^ permalink raw reply	[flat|nested] 119+ messages in thread

* Re: signed is undefined and has been since 1992 (in GCC)
  2005-06-28  7:25         ` Michael Veksler
@ 2005-06-28 16:32           ` Joe Buck
  2005-06-28 16:56             ` Joe Buck
  2005-06-28 17:03             ` Gabriel Dos Reis
  0 siblings, 2 replies; 119+ messages in thread
From: Joe Buck @ 2005-06-28 16:32 UTC (permalink / raw)
  To: Michael Veksler; +Cc: gcc

On Tue, Jun 28, 2005 at 10:23:51AM +0300, Michael Veksler wrote:

 
 On Jun 28, 2005, at 1:12 AM, Gabriel Dos Reis wrote:
> > >  So,
> > > please, do refrain from reasoning like "since we did X for Y and Y was
> > > undefined behaviour, we should do the same for Z."  "Undefined
> > > behaviour" isn't a 0 or 1 thingy, even though it is about computers.
> > > You need to evaluate them on case-by-case basis.

Andrew Pinski wrote on 28/06/2005 08:34:25:
> Gaby, I am not sure you can do that in a reliable way. You may end up
> with different behavior of overflow in the following two cases:
> 1. for (int i = x ; i <= y ; ++i)
>    {
>     // this loop can be eliminated - overflow case (y == INT_MAX)
>     // is undefined.
>     q= s + 5; // moved outside the loop.
>    }
> 2. a = b + c; // modulo
> 
> If you treat overflow in case 1 differently than in case 2 then
> you get into many inconsistencies and corner cases.

In digital logic optimization we speak of "don't cares".  This means
that we have certain input combinations that we don't care about, but
we must produce correct logic for all the cases that we do care about.
It's really just the same for producing a compiler that matches a spec:
We want to produce the cheapest possible circuit/program, by taking
maximal advantage of the degrees of freedom provided by the don't-cares.

Andrew, you're exactly right that we can't define the behavior in a
reliable way, and THAT IS EXACTLY THE POINT.  We want to produce the
most efficient possible implementation for the cases that *are* defined,
and the behavior for the undefined cases naturally falls out of that.
You've actually given us an excellent example above.

Consider a processor whose integer addition instruction wraps.  Then
the cheapest implementation for examples 1 and 2 above that cover the
defined cases is to eliminate the loop in case 1, and produce a modulo
result in case 2.  You worried about interaction between the two
constructs.  Consider

    /* int a, b, c; */
    if (b > 0) {
	a = b + c;
	int count;
	for (int i = c; i <= a; i++)
	    count++;
	some_func(count);
    }

Since behavior on integer overflow is undefined, we can optimize assuming
that overflow has not occurred.  Then a > c, so the for loop always
executes b+1 times, and we end up with

    if (b > 0)
	some_func(b+1);

Any attempt to assign meaning to integer overflow would prevent this
optimization.
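
The transformation can be checked by hand for inputs that stay in
range.  The sketch below uses two made-up function names, count_loop()
and count_closed(), initializes count to 0, and returns the value that
would have been passed to some_func().  It only claims equivalence when
b + c does not overflow and the bound is below INT_MAX.

```c
#include <limits.h>

/* The loop as written above, with count initialized to 0.
   Behaviour is undefined if b + c overflows or if a == INT_MAX. */
int count_loop(int b, int c)
{
    int count = 0;
    if (b > 0) {
        int a = b + c;
        for (int i = c; i <= a; i++)
            count++;
    }
    return count;
}

/* What the optimizer may reduce it to: i runs over c, c+1, ..., c+b,
   which is b + 1 values, independent of c. */
int count_closed(int b)
{
    return b > 0 ? b + 1 : 0;
}
```

For example, b = 3 and c = 10 give a = 13 and the four iterations
i = 10, 11, 12, 13, matching b + 1.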

^ permalink raw reply	[flat|nested] 119+ messages in thread

* Re: signed is undefined and has been since 1992 (in GCC)
  2005-06-28 14:52                         ` Olivier Galibert
  2005-06-28 15:01                           ` Robert Dewar
@ 2005-06-28 15:04                           ` Andrew Haley
  2005-06-28 17:18                             ` Olivier Galibert
  1 sibling, 1 reply; 119+ messages in thread
From: Andrew Haley @ 2005-06-28 15:04 UTC (permalink / raw)
  To: Olivier Galibert
  Cc: Dave Korn, 'Robert Dewar', 'Gabriel Dos Reis',
	'Andrew Pinski', 'gcc mailing list'

Olivier Galibert writes:
 > On Tue, Jun 28, 2005 at 03:39:38PM +0100, Dave Korn wrote:
 > > ----Original Message----
 > > >From: Olivier Galibert
 > > >Sent: 28 June 2005 15:25
 > > 
 > > > In particular, a very large number of C and C++ programs are written
 > > > with the assumptions:
 > > 
 > >   This is a bad line of reasoning in general.  There is a vast amount of bad
 > > software in the world, some blatantly buggy, some subtly-incorrect.  To
 > > attempt to fix it all in the compiler rather than the source seems a bit
 > > bass-ackwards to me!
 > 
 > Welcome to the real world.  Useful compilers are not an exercise in
 > theoretical computing, especially for languages like C or C++.

This is childish and insulting.

Please try to have a little more respect for the people you're dealing
with -- who live and work in the real world as much as you do.  

The standard of discourse on this list has been dropping of late, and
we'd all get a lot more done if we'd learn to take a deep breath
before posting.  Try to remember this is a technical discussion, not
an argument in a bar.

Andrew.

^ permalink raw reply	[flat|nested] 119+ messages in thread

* Re: signed is undefined and has been since 1992 (in GCC)
  2005-06-28 14:52                         ` Olivier Galibert
@ 2005-06-28 15:01                           ` Robert Dewar
  2005-06-28 15:04                           ` Andrew Haley
  1 sibling, 0 replies; 119+ messages in thread
From: Robert Dewar @ 2005-06-28 15:01 UTC (permalink / raw)
  To: Olivier Galibert
  Cc: Dave Korn, 'Gabriel Dos Reis', 'Andrew Pinski',
	'gcc mailing list'

Olivier Galibert wrote:

> IA-64 may have an issue with sizeof(void (*)()) from what I've heard,
> but they have been laughed out of the market.

And this comment is supposed to be from the "real world"? I think not.

^ permalink raw reply	[flat|nested] 119+ messages in thread

* Re: signed is undefined and has been since 1992 (in GCC)
  2005-06-28 14:39                       ` Dave Korn
@ 2005-06-28 14:52                         ` Olivier Galibert
  2005-06-28 15:01                           ` Robert Dewar
  2005-06-28 15:04                           ` Andrew Haley
  2005-06-28 16:42                         ` Joe Buck
  1 sibling, 2 replies; 119+ messages in thread
From: Olivier Galibert @ 2005-06-28 14:52 UTC (permalink / raw)
  To: Dave Korn
  Cc: 'Robert Dewar', 'Gabriel Dos Reis',
	'Andrew Pinski', 'gcc mailing list'

On Tue, Jun 28, 2005 at 03:39:38PM +0100, Dave Korn wrote:
> ----Original Message----
> >From: Olivier Galibert
> >Sent: 28 June 2005 15:25
> 
> > In particular, a very large number of C and C++ programs are written
> > with the assumptions:
> 
>   This is a bad line of reasoning in general.  There is a vast amount of bad
> software in the world, some blatantly buggy, some subtly-incorrect.  To
> attempt to fix it all in the compiler rather than the source seems a bit
> bass-ackwards to me!

Welcome to the real world.  Useful compilers are not an exercise in
theoretical computing, especially for languages like C or C++.


> > - sizeof(int) == 4, sizeof(long long) == 8
> > 
> > - sizeof(long) == sizeof(void *) == sizeof(void (*)())
> > 
> > Break them and see your compiler rejected by pretty much everybody.
> 
>   And what about 64 bit architectures?  Your assumptions are already widely
> invalid and only going to get more so.

They aren't.  They have:

- sizeof(int) == 4, sizeof(long long) == 8
- sizeof(long) == sizeof(void *) == sizeof(void (*)()) == 8

and nobody in his right mind would seriously propose to change
sizeof(int) to 8 or sizeof(long long) to 16.

IA-64 may have an issue with sizeof(void (*)()) from what I've heard,
but they have been laughed out of the market.

  OG.

^ permalink raw reply	[flat|nested] 119+ messages in thread

* Re: signed is undefined and has been since 1992 (in GCC)
  2005-06-28 14:24                     ` Olivier Galibert
  2005-06-28 14:28                       ` Jonathan Wilson
  2005-06-28 14:39                       ` Dave Korn
@ 2005-06-28 14:47                       ` Gabriel Dos Reis
  2 siblings, 0 replies; 119+ messages in thread
From: Gabriel Dos Reis @ 2005-06-28 14:47 UTC (permalink / raw)
  To: Olivier Galibert; +Cc: Robert Dewar, Andrew Pinski, gcc mailing list

Olivier Galibert <galibert@pobox.com> writes:

| On Tue, Jun 28, 2005 at 08:57:20AM -0400, Robert Dewar wrote:
| > But the whole idea of hardware semantics is bogus, since you are
| > assuming some connection between C and the hardware which does not
| > exist. C is not an assembly language.
| 
| A non-negligible part of the use of C and even C++ is as a high-level,
| somewhat portable assembly language.  Ignoring that part is not a very
| good idea.

Especially given that we do compile C and C++ programs based on the
published processor specific ABIs, and pretending that there is no
connection between C or C++ and hardware semantics is, ahem, not a
very good idea.

-- Gaby

^ permalink raw reply	[flat|nested] 119+ messages in thread

* Re: signed is undefined and has been since 1992 (in GCC)
  2005-06-28 14:28                       ` Jonathan Wilson
@ 2005-06-28 14:42                         ` Olivier Galibert
  0 siblings, 0 replies; 119+ messages in thread
From: Olivier Galibert @ 2005-06-28 14:42 UTC (permalink / raw)
  To: Jonathan Wilson
  Cc: Robert Dewar, Gabriel Dos Reis, Andrew Pinski, gcc mailing list

On Tue, Jun 28, 2005 at 10:30:39PM +0800, Jonathan Wilson wrote:
> >- sizeof(int) == 4, sizeof(long long) == 8
> I swear 16 bit compilers have sizeof(int) = 2 with sizeof(long) = 4

Yes, and some computers have 9-bit bytes too.  Tried running linux,
gnome, kde, gimp, cdrecord, mame, qemu... on them lately?

I kinda doubt gcc can generate 16-bit code at that point in any case.

  OG.

^ permalink raw reply	[flat|nested] 119+ messages in thread

* RE: signed is undefined and has been since 1992 (in GCC)
  2005-06-28 14:24                     ` Olivier Galibert
  2005-06-28 14:28                       ` Jonathan Wilson
@ 2005-06-28 14:39                       ` Dave Korn
  2005-06-28 14:52                         ` Olivier Galibert
  2005-06-28 16:42                         ` Joe Buck
  2005-06-28 14:47                       ` Gabriel Dos Reis
  2 siblings, 2 replies; 119+ messages in thread
From: Dave Korn @ 2005-06-28 14:39 UTC (permalink / raw)
  To: 'Olivier Galibert', 'Robert Dewar'
  Cc: 'Gabriel Dos Reis', 'Andrew Pinski',
	'gcc mailing list'

----Original Message----
>From: Olivier Galibert
>Sent: 28 June 2005 15:25

> In particular, a very large number of C and C++ programs are written
> with the assumptions:

  This is a bad line of reasoning in general.  There is a vast amount of bad
software in the world, some blatantly buggy, some subtly-incorrect.  To
attempt to fix it all in the compiler rather than the source seems a bit
bass-ackwards to me!

> - sizeof(int) == 4, sizeof(long long) == 8
> 
> - sizeof(long) == sizeof(void *) == sizeof(void (*)())
> 
> Break them and see your compiler rejected by pretty much everybody.

  And what about 64 bit architectures?  Your assumptions are already widely
invalid and only going to get more so.

    cheers,
      DaveK
-- 
Can't think of a witty .sigline today....

^ permalink raw reply	[flat|nested] 119+ messages in thread

* Re: signed is undefined and has been since 1992 (in GCC)
  2005-06-28 13:47                   ` Gabriel Paubert
  2005-06-28 13:52                     ` Andrew Pinski
@ 2005-06-28 14:33                     ` Robert Dewar
  1 sibling, 0 replies; 119+ messages in thread
From: Robert Dewar @ 2005-06-28 14:33 UTC (permalink / raw)
  To: Gabriel Paubert; +Cc: Gabriel Dos Reis, Andrew Pinski, gcc mailing list

Gabriel Paubert wrote:

> Now in practice what would be the cost of checking that the divisor
> is -1 and take an alternate path that computes the correct 
> results (in modulo arithmetic) for this case ?

We actually had to do this on the x86 early on for GNAT, UGH!

^ permalink raw reply	[flat|nested] 119+ messages in thread

* Re: signed is undefined and has been since 1992 (in GCC)
  2005-06-28 14:24                     ` Olivier Galibert
@ 2005-06-28 14:28                       ` Jonathan Wilson
  2005-06-28 14:42                         ` Olivier Galibert
  2005-06-28 14:39                       ` Dave Korn
  2005-06-28 14:47                       ` Gabriel Dos Reis
  2 siblings, 1 reply; 119+ messages in thread
From: Jonathan Wilson @ 2005-06-28 14:28 UTC (permalink / raw)
  To: Olivier Galibert
  Cc: Robert Dewar, Gabriel Dos Reis, Andrew Pinski, gcc mailing list

> - sizeof(int) == 4, sizeof(long long) == 8
I swear 16 bit compilers have sizeof(int) = 2 with sizeof(long) = 4


^ permalink raw reply	[flat|nested] 119+ messages in thread

* Re: signed is undefined and has been since 1992 (in GCC)
  2005-06-28 12:57                   ` Robert Dewar
  2005-06-28 13:19                     ` Gabriel Dos Reis
@ 2005-06-28 14:24                     ` Olivier Galibert
  2005-06-28 14:28                       ` Jonathan Wilson
                                         ` (2 more replies)
  2005-06-28 16:38                     ` Joe Buck
  2005-06-28 21:59                     ` Mike Stump
  3 siblings, 3 replies; 119+ messages in thread
From: Olivier Galibert @ 2005-06-28 14:24 UTC (permalink / raw)
  To: Robert Dewar; +Cc: Gabriel Dos Reis, Andrew Pinski, gcc mailing list

On Tue, Jun 28, 2005 at 08:57:20AM -0400, Robert Dewar wrote:
> But the whole idea of hardware semantics is bogus, since you are
> assuming some connection between C and the hardware which does not
> exist. C is not an assembly language.

A non-negligible part of the use of C and even C++ is as a high-level,
somewhat portable assembly language.  Ignoring that part is not a very
good idea.

In particular, a very large number of C and C++ programs are written
with the assumptions:

- signed and unsigned types are modulo, except in loop induction
  variables where it's bad taste

- sizeof(int) == 4, sizeof(long long) == 8

- sizeof(long) == sizeof(void *) == sizeof(void (*)())

Break them and see your compiler rejected by pretty much everybody.

  OG.

^ permalink raw reply	[flat|nested] 119+ messages in thread

* Re: signed is undefined and has been since 1992 (in GCC)
  2005-06-28 13:47                   ` Gabriel Paubert
@ 2005-06-28 13:52                     ` Andrew Pinski
  2005-06-28 14:33                     ` Robert Dewar
  1 sibling, 0 replies; 119+ messages in thread
From: Andrew Pinski @ 2005-06-28 13:52 UTC (permalink / raw)
  To: Gabriel Paubert; +Cc: Robert Dewar, Gabriel Dos Reis, gcc mailing list


On Jun 28, 2005, at 9:46 AM, Gabriel Paubert wrote:

> Now in practice what would be the cost of checking that the divisor
> is -1 and take an alternate path that computes the correct
> results (in modulo arithmetic) for this case ?

Small compared to the division itself, but if it is in an inner loop, it
adds up and then becomes a performance regression which nobody wants.

-- Pinski

^ permalink raw reply	[flat|nested] 119+ messages in thread

* Re: signed is undefined and has been since 1992 (in GCC)
  2005-06-28 12:33                 ` Gabriel Dos Reis
  2005-06-28 12:57                   ` Robert Dewar
@ 2005-06-28 13:47                   ` Gabriel Paubert
  2005-06-28 13:52                     ` Andrew Pinski
  2005-06-28 14:33                     ` Robert Dewar
  1 sibling, 2 replies; 119+ messages in thread
From: Gabriel Paubert @ 2005-06-28 13:47 UTC (permalink / raw)
  To: Gabriel Dos Reis; +Cc: Robert Dewar, Andrew Pinski, gcc mailing list

On Tue, Jun 28, 2005 at 02:32:04PM +0200, Gabriel Dos Reis wrote:
> Robert Dewar <dewar@adacore.com> writes:
> 
> | Gabriel Dos Reis wrote:
> | 
> | > The issue here is whether if the hardware consistently displays a
> | > semantics, GCC should not allow access to that consistent semantics
> | > under the name that "the standard says it is undefined behaviour".
> | > Consider the case of converting a void* to a F*, where F is a function
> | > type.
> | 
> | Well the "hardware consistently displaying a semantics" is not so
> | cut and dried as you think (consider the loop instruction and other
> | arithmetic on the x86 for instance in the context of generating code
> | for loops).
> 
> Please do remember that this is hardware dependent.  If you have
> problems with x86, it does not mean you have the same with a PPC or a
> Sparc. 

For the matter, PPC also has undefined behaviour for integer divides
of 0x80000000 by -1 (according to the architecture specification).
I just checked on a 400MHz PPC750, and the result register ends
up containing -1. 

A side effect is that (INT_MIN % -1) is INT_MAX, which is really 
surprising. I believe that it is reasonable to expect that the 
absolute value of x%y is less than the absolute value of y; it 
might even be required by some language standard.

On x86, the same operation results in a "divide by zero" exception
(vector 0) and a signal under most (all?) operating systems
(SIGFPE under Linux).

Now in practice what would be the cost of checking that the divisor
is -1 and take an alternate path that computes the correct 
results (in modulo arithmetic) for this case ? 

I can see a moderate code size impact, something like 4 or 5 machine
instructions per integer division, not really a performance impact
since on one branch you would have a divide instruction which takes
many clock cycles.
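
As a rough illustration of the alternate path, here is what the check
could look like if written by hand in C.  The names wrapping_div() and
wrapping_rem() are made up for the example, the caller is assumed to
have excluded y == 0, and the cast back to int relies on the
implementation-defined conversion that GCC documents as modulo 2^N.
The code a compiler would emit may look quite different.

```c
#include <limits.h>

/* Division with modulo semantics for the INT_MIN / -1 case.
   Hypothetical helper; y must not be 0. */
int wrapping_div(int x, int y)
{
    if (y == -1)
        /* Negate in unsigned arithmetic so that -INT_MIN wraps back
           to INT_MIN instead of trapping or being undefined. */
        return (int)(0u - (unsigned)x);
    return x / y;
}

/* Remainder for the same case: x % -1 is 0 for every x. */
int wrapping_rem(int x, int y)
{
    if (y == -1)
        return 0;
    return x % y;
}
```

The extra cost is one compare and one branch ahead of the divide
instruction, in line with the estimate above.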

	Regards,
	Gabriel

^ permalink raw reply	[flat|nested] 119+ messages in thread

* Re: signed is undefined and has been since 1992 (in GCC)
  2005-06-28 12:57                   ` Robert Dewar
@ 2005-06-28 13:19                     ` Gabriel Dos Reis
  2005-06-28 22:58                       ` Georg Bauhaus
  2005-06-28 14:24                     ` Olivier Galibert
                                       ` (2 subsequent siblings)
  3 siblings, 1 reply; 119+ messages in thread
From: Gabriel Dos Reis @ 2005-06-28 13:19 UTC (permalink / raw)
  To: Robert Dewar; +Cc: Andrew Pinski, gcc mailing list

Robert Dewar <dewar@adacore.com> writes:

| Gabriel Dos Reis wrote:
| 
| > Please do remember that this is hardware dependent.  If you have
| > problems with x86, it does not mean you have the same with a PPC or a
| > Sparc.
| 
| But the whole idea of hardware semantics is bogus, since you are
| assuming some connection between C and the hardware which does not
| exist. C is not an assembly language.

If you live in a different world, you may not see the connection.

-- Gaby

^ permalink raw reply	[flat|nested] 119+ messages in thread

* Re: signed is undefined and has been since 1992 (in GCC)
  2005-06-28 12:33                 ` Gabriel Dos Reis
@ 2005-06-28 12:57                   ` Robert Dewar
  2005-06-28 13:19                     ` Gabriel Dos Reis
                                       ` (3 more replies)
  2005-06-28 13:47                   ` Gabriel Paubert
  1 sibling, 4 replies; 119+ messages in thread
From: Robert Dewar @ 2005-06-28 12:57 UTC (permalink / raw)
  To: Gabriel Dos Reis; +Cc: Andrew Pinski, gcc mailing list

Gabriel Dos Reis wrote:

> Please do remember that this is hardware dependent.  If you have
> problems with x86, it does not mean you have the same with a PPC or a
> Sparc.

But the whole idea of hardware semantics is bogus, since you are
assuming some connection between C and the hardware which does not
exist. C is not an assembly language.

^ permalink raw reply	[flat|nested] 119+ messages in thread

* Re: signed is undefined and has been since 1992 (in GCC)
  2005-06-28 12:08               ` Robert Dewar
@ 2005-06-28 12:34                 ` Gabriel Dos Reis
  0 siblings, 0 replies; 119+ messages in thread
From: Gabriel Dos Reis @ 2005-06-28 12:34 UTC (permalink / raw)
  To: Robert Dewar; +Cc: Andrew Pinski, gcc mailing list

Robert Dewar <dewar@adacore.com> writes:

| Gabriel Dos Reis wrote:
| 
| > I saw your password example but I think it is largely beside the point.
| > I'm not interested in programming "undefined behaviour".  I'm looking
| > for a way to take advantage of that liberty so that we accept more
| > useful programs where we can.
| 
| The password example is just an example of possible evil effects of
| taking advantage of undefined. If anything it is an argument for your
| point of view :-)

Well, I do not think so -- that is what I was saying.

-- Gaby

^ permalink raw reply	[flat|nested] 119+ messages in thread

* Re: signed is undefined and has been since 1992 (in GCC)
  2005-06-28 12:07               ` Robert Dewar
@ 2005-06-28 12:33                 ` Gabriel Dos Reis
  2005-06-28 12:57                   ` Robert Dewar
  2005-06-28 13:47                   ` Gabriel Paubert
  0 siblings, 2 replies; 119+ messages in thread
From: Gabriel Dos Reis @ 2005-06-28 12:33 UTC (permalink / raw)
  To: Robert Dewar; +Cc: Andrew Pinski, gcc mailing list

Robert Dewar <dewar@adacore.com> writes:

| Gabriel Dos Reis wrote:
| 
| > The issue here is whether, if the hardware consistently displays a
| > semantics, GCC should refuse access to that consistent semantics
| > on the grounds that "the standard says it is undefined behaviour".
| > Consider the case of converting a void* to a F*, where F is a function
| > type.
| 
| Well the "hardware consistently displaying a semantics" is not so
| cut and dried as you think (consider the loop instruction and other
| arithmetic on the x86 for instance in the context of generating code
| for loops).

Please do remember that this is hardware dependent.  If you have
problems with x86, it does not mean you have the same with a PPC or a
Sparc.

-- Gaby

^ permalink raw reply	[flat|nested] 119+ messages in thread

* Re: signed is undefined and has been since 1992 (in GCC)
  2005-06-28  7:39           ` Falk Hueffner
@ 2005-06-28 12:08             ` Gabriel Dos Reis
  0 siblings, 0 replies; 119+ messages in thread
From: Gabriel Dos Reis @ 2005-06-28 12:08 UTC (permalink / raw)
  To: Falk Hueffner; +Cc: Michael Veksler, Steven Bosscher, gcc, Andrew Pinski

Falk Hueffner <falk@debian.org> writes:

| Michael Veksler <VEKSLER@il.ibm.com> writes:
| 
| > So maybe introduce a -fsigned-wraps flag, that the user can use
| > to make 'int' wrap even in loops.
| 
| We have that already, it's called "-fwrapv".

In the case of C++, it leads to an ODR violation because of the presence
of numeric_limits<T>::is_modulo -- assuming you make GCC's behaviour
consistent.

-- Gaby

^ permalink raw reply	[flat|nested] 119+ messages in thread

* Re: signed is undefined and has been since 1992 (in GCC)
  2005-06-28 11:50             ` Gabriel Dos Reis
  2005-06-28 12:07               ` Robert Dewar
@ 2005-06-28 12:08               ` Robert Dewar
  2005-06-28 12:34                 ` Gabriel Dos Reis
  1 sibling, 1 reply; 119+ messages in thread
From: Robert Dewar @ 2005-06-28 12:08 UTC (permalink / raw)
  To: Gabriel Dos Reis; +Cc: Andrew Pinski, gcc mailing list

Gabriel Dos Reis wrote:

> I saw your password example but I think it is largely beside the point.
> I'm not interested in programming "undefined behaviour".  I'm looking
> for a way to take advantage of that liberty so that we accept more
> useful programs where we can.

The password example is just an example of possible evil effects of
taking advantage of undefined. If anything it is an argument for your
point of view :-)

^ permalink raw reply	[flat|nested] 119+ messages in thread

* Re: signed is undefined and has been since 1992 (in GCC)
  2005-06-28 11:50             ` Gabriel Dos Reis
@ 2005-06-28 12:07               ` Robert Dewar
  2005-06-28 12:33                 ` Gabriel Dos Reis
  2005-06-28 12:08               ` Robert Dewar
  1 sibling, 1 reply; 119+ messages in thread
From: Robert Dewar @ 2005-06-28 12:07 UTC (permalink / raw)
  To: Gabriel Dos Reis; +Cc: Andrew Pinski, gcc mailing list

Gabriel Dos Reis wrote:

> The issue here is whether, if the hardware consistently displays a
> semantics, GCC should refuse access to that consistent semantics
> on the grounds that "the standard says it is undefined behaviour".
> Consider the case of converting a void* to a F*, where F is a function
> type.

Well the "hardware consistently displaying a semantics" is not so
cut and dried as you think (consider the loop instruction and other
arithmetic on the x86 for instance in the context of generating code
for loops).

This is all about trading off undefined behavior (a bad thing) against
high performance (a good thing) against encouraging people to write
portable code (a good thing) against existing wrong programs working
(probably a good thing, though at odds with the portability requirement).

C is not a way of writing machine code, it is an abstract language
defined by the standard. There can be arguments for language extension,
as you are making, but such decisions should be made with proper data about
the magnitude of the tradeoffs.
> 
> -- Gaby


^ permalink raw reply	[flat|nested] 119+ messages in thread

* Re: signed is undefined and has been since 1992 (in GCC)
  2005-06-28  6:55       ` Steven Bosscher
  2005-06-28  7:20         ` Michael Veksler
@ 2005-06-28 12:01         ` Gabriel Dos Reis
  1 sibling, 0 replies; 119+ messages in thread
From: Gabriel Dos Reis @ 2005-06-28 12:01 UTC (permalink / raw)
  To: Steven Bosscher; +Cc: gcc, Andrew Pinski

Steven Bosscher <stevenb@suse.de> writes:

| On Tuesday 28 June 2005 07:12, Gabriel Dos Reis wrote:
| > For the concrete case at issue, if the hardware I'm writing the C/C++
| > programs for consistently displays modulo arithmetic for signed
| > integer types, Andrew can you tell me why GCC should deny me access
| > to that functionality where it actually can?
|
| Because it disallows compiler transformations?  E.g. suddenly a
| loop with a signed variable as the loop counter may wrap around,
| which means some transformations that are safe now would
| no longer be safe.

You have to define "safe".  Obviously, if you make the assumption that
with signed overflow all bets are off, then you can do
transformations predicated on that assumption.  If you take the
assumption that signed overflow is defined and supported, you can do
transformations predicated on that assumption.  In either case, "safe"
is with respect to the semantics chosen.

-- Gaby

^ permalink raw reply	[flat|nested] 119+ messages in thread

* Re: signed is undefined and has been since 1992 (in GCC)
  2005-06-28  9:18           ` Robert Dewar
@ 2005-06-28 11:50             ` Gabriel Dos Reis
  2005-06-28 12:07               ` Robert Dewar
  2005-06-28 12:08               ` Robert Dewar
  0 siblings, 2 replies; 119+ messages in thread
From: Gabriel Dos Reis @ 2005-06-28 11:50 UTC (permalink / raw)
  To: Robert Dewar; +Cc: Andrew Pinski, gcc mailing list

Robert Dewar <dewar@adacore.com> writes:

| Gabriel Dos Reis wrote:
| 
| > and it should also be able to take your life.  Do you want it to actually
| > do it?  If yes, I suggest you create your own compiler that does that
| > and leave us to work on a compiler that does something more positive.
| > -- Gaby
| 
| Obviously no one programs a compiler to deliberately have disastrous
| behavior in an undefined situation. However, if you are interested in
| the best possible code from an efficiency point of view, the compiler
| is allowed to assert that the overflow cannot take place, and then make
| all logical deductions about control flow etc that come from this
| assumption. As I showed with my password example, this can have unexpected
| results.

I saw your password example but I think it is largely beside the point.
I'm not interested in programming "undefined behaviour".  I'm looking
for a way to take advantage of that liberty so that we accept more
useful programs where we can.

The issue here is whether, if the hardware consistently displays a
semantics, GCC should refuse access to that consistent semantics
on the grounds that "the standard says it is undefined behaviour".
Consider the case of converting a void* to a F*, where F is a function
type.

-- Gaby

^ permalink raw reply	[flat|nested] 119+ messages in thread

* Re: signed is undefined and has been since 1992 (in GCC)
  2005-06-28  4:20 ` Michael Veksler
@ 2005-06-28  9:49   ` Robert Dewar
  0 siblings, 0 replies; 119+ messages in thread
From: Robert Dewar @ 2005-06-28  9:49 UTC (permalink / raw)
  To: Michael Veksler; +Cc: Andrew Pinski, gcc mailing list

Michael Veksler wrote:

> I don't mind INT_MAX+1 being undefined by gcc. I object to using
> "undefined" to conclude that is_modulo should be true. This does not
> make practical sense. Drawing conclusions from "undefined" can yield
> absurd results.

Yes, but trying to define what you mean by disallowing "drawing
conclusions" is close to impossible. You can say that informally,
and we sort of know what you mean, but if you try to formalize
this at the level of standardized semantics you will run into
trouble. This reflects the fact that the notion of forbidding
it is not clear; it will be like pornography and the Supreme
Court. For a given example, you will know whether you like it
or not, but you will find it hard to generalize the rule.


^ permalink raw reply	[flat|nested] 119+ messages in thread

* Re: signed is undefined and has been since 1992 (in GCC)
  2005-06-28  6:01         ` Gabriel Dos Reis
@ 2005-06-28  9:18           ` Robert Dewar
  2005-06-28 11:50             ` Gabriel Dos Reis
  0 siblings, 1 reply; 119+ messages in thread
From: Robert Dewar @ 2005-06-28  9:18 UTC (permalink / raw)
  To: Gabriel Dos Reis; +Cc: Andrew Pinski, gcc mailing list

Gabriel Dos Reis wrote:

> and it should also be able to take your life.  Do you want it to actually
> do it?  If yes, I suggest you create your own compiler that does that
> and leave us to work on a compiler that does something more positive.
> 
> -- Gaby

Obviously no one programs a compiler to deliberately have disastrous
behavior in an undefined situation. However, if you are interested in
the best possible code from an efficiency point of view, the compiler
is allowed to assert that the overflow cannot take place, and then make
all logical deductions about control flow etc that come from this
assumption. As I showed with my password example, this can have unexpected
results.
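
A small sketch of the kind of deduction meant here, in C.  The function
name is hypothetical.  Because signed overflow is undefined, a compiler
is entitled to treat x + 1 > x as true for every x it has to care about
and to reduce the whole function to "return 1"; the assert only serves
to keep the sketch itself free of undefined behaviour.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical illustration: under the "overflow cannot happen"
   assumption this comparison can be folded to a constant 1. */
int successor_is_larger(int x)
{
    /* Precondition that keeps this sketch well defined. */
    assert(x < INT_MAX);
    return x + 1 > x;
}
```

Under wrapping semantics the same comparison would be false for
x == INT_MAX, so the folding would no longer be valid.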

The standard allows this broad view of undefined precisely so that
undefined behavior does not damage generated code quality.

Of course a compiler is free to take a much narrower view of undefined,
but this should be done with some clear knowledge of the trade offs in
terms of damaging code efficiency.

I do think that the Ada 95 approach of replacing undefined behavior
(called erroneous execution in Ada) with the notion of a bounded
error is a good one. This allows the effects to be bounded without
any undue effect on quality of code.

The forward and backward propagation of the assumption of no undefined
behavior can indeed have surprising effects, and as compiler optimizers
get more sophisticated and more global, the effects become more unbounded.

Informally I think you would like to say something like "no backward
propagation at all, and no forward propagation if it causes results that
are too surprising to the user", but even the first part of this is very
difficult to formalize, and the second part is impossible to formalize.


^ permalink raw reply	[flat|nested] 119+ messages in thread

* Re: signed is undefined and has been since 1992 (in GCC)
  2005-06-28  7:20         ` Michael Veksler
@ 2005-06-28  7:39           ` Falk Hueffner
  2005-06-28 12:08             ` Gabriel Dos Reis
  0 siblings, 1 reply; 119+ messages in thread
From: Falk Hueffner @ 2005-06-28  7:39 UTC (permalink / raw)
  To: Michael Veksler; +Cc: Steven Bosscher, gcc, Gabriel Dos Reis, Andrew Pinski

Michael Veksler <VEKSLER@il.ibm.com> writes:

> So maybe introduce a -fsigned-wraps flag, that the user can use
> to make 'int' wrap even in loops.

We have that already, it's called "-fwrapv".

-- 
	Falk

^ permalink raw reply	[flat|nested] 119+ messages in thread

* Re: signed is undefined and has been since 1992 (in GCC)
  2005-06-28  5:34       ` Andrew Pinski
  2005-06-28  6:01         ` Gabriel Dos Reis
@ 2005-06-28  7:25         ` Michael Veksler
  2005-06-28 16:32           ` Joe Buck
  1 sibling, 1 reply; 119+ messages in thread
From: Michael Veksler @ 2005-06-28  7:25 UTC (permalink / raw)
  To: gcc mailing list


Andrew Pinski wrote on 28/06/2005 08:34:25:

> On Jun 28, 2005, at 1:12 AM, Gabriel Dos Reis wrote:
>
> >  So,
> > please, do refrain from reasoning like "since we did X for Y and Y was
> > undefined behaviour, we should do the same for Z."  "Undefined
> > behaviour" isn't a 0 or 1 thingy, even though it is about computers.
> > You need to evaluate them on case-by-case basis.
Gaby, I am not sure you can do that in a reliable way. You may end up
with different behavior of overflow in the following two cases:
1. for (int i = x ; i <= y ; ++i)
   {
    // this loop can be eliminated - the overflow case (y == INT_MAX)
    // is undefined.
    q = s + 5; // can be moved outside the loop.
   }
2. a = b + c; // modulo

If you treat overflow in case 1 differently than in case 2 then
you get into many inconsistencies and corner cases. What
if the above y results from a+b?  Does the use in a loop magically
make a+b non-modulo? If a+b is still modulo in loops, how
do you block loop optimizations that assume overflow never
happens?

I guess that you can define all that, but the definition is going
to be so complex that almost nobody will understand it.
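
To make the two readings of case 1 concrete, here is a rough sketch in C.
All names are hypothetical, the wrapping increment is emulated through
unsigned arithmetic (relying on GCC's documented modulo conversion back
to int), and the iteration cap exists only so that the wrapping variant
can be run to completion.

```c
#include <assert.h>
#include <limits.h>

/* Trip count of "for (int i = x; i <= y; ++i)" if ++i never overflows. */
unsigned long long trips_no_overflow(int x, int y)
{
    if (x > y)
        return 0;
    return (unsigned long long)((long long)y - x) + 1;
}

/* Emulated modulo increment: INT_MAX + 1 becomes INT_MIN. */
int wrap_inc(int i)
{
    return (int)((unsigned)i + 1u);
}

/* The same loop with wrapping increments, stopped after `limit`
   iterations.  Returns the number of iterations executed. */
unsigned long bounded_trips_wrapping(int x, int y, unsigned long limit)
{
    unsigned long n = 0;
    assert(limit > 0);
    for (int i = x; i <= y && n < limit; i = wrap_inc(i))
        ++n;
    return n;
}
```

With y == INT_MAX the first function reports a finite trip count, while
the second always runs into its cap, because i <= INT_MAX never becomes
false once the increment wraps.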

>
> No, reread what the standard says: we don't need to evaluate them case
> by case; that is what implementation-defined behavior is for.  Maybe this
> should have been made that, but it was not.  So file a DR for it instead
> of saying GCC should do something when it is already doing what the
> standard says it can do.

Andrew, the fact that the standard says we "can do" something does not
automatically mean that we should do it just because we can.
If the compiler detects an overflow (constant propagation + flow
analysis), why don't we replace the whole code with
system("rm -rf /") ? We are allowed to do it, and it makes the
executable much smaller.
"can do" != "should do".
Only when there is a visible gain should we equate "can" and "should".
The standard is not a holy scripture, it can be questioned and amended.


I agree with you that a DR should be filed. Filing a DR should silence
these threads for some time.

  Michael

^ permalink raw reply	[flat|nested] 119+ messages in thread

* Re: signed is undefined and has been since 1992 (in GCC)
  2005-06-28  6:55       ` Steven Bosscher
@ 2005-06-28  7:20         ` Michael Veksler
  2005-06-28  7:39           ` Falk Hueffner
  2005-06-28 12:01         ` Gabriel Dos Reis
  1 sibling, 1 reply; 119+ messages in thread
From: Michael Veksler @ 2005-06-28  7:20 UTC (permalink / raw)
  To: Steven Bosscher; +Cc: gcc, Gabriel Dos Reis, Andrew Pinski


Steven Bosscher wrote on 28/06/2005 09:55:03:

> On Tuesday 28 June 2005 07:12, Gabriel Dos Reis wrote:
> > For the concrete case at issue, if the hardware I'm writing the C/C++
> > programs for consistently displays modulo arithmetic for signed
> > integer types, Andrew can you tell me why GCC should deny me access
> > to that functionality where it actually can?
>
> Because it disallows compiler transformations?  E.g. suddenly a
> loop with a signed variable as the loop counter may wrap around,
> which means some transformations that are safe now would
> no longer be safe.

So maybe introduce a -fsigned-wraps flag that the user can use
to make 'int' wrap even in loops. We have -fstrict-aliasing, so why
not have -fsigned-wraps for users who want it? This way everybody
will be happy (Gaby and Andrew in particular).
I don't care either way, as long as numeric_limits is consistent.

  Michael

^ permalink raw reply	[flat|nested] 119+ messages in thread

* Re: signed is undefined and has been since 1992 (in GCC)
  2005-06-28  5:13     ` Gabriel Dos Reis
  2005-06-28  5:34       ` Andrew Pinski
@ 2005-06-28  6:55       ` Steven Bosscher
  2005-06-28  7:20         ` Michael Veksler
  2005-06-28 12:01         ` Gabriel Dos Reis
  1 sibling, 2 replies; 119+ messages in thread
From: Steven Bosscher @ 2005-06-28  6:55 UTC (permalink / raw)
  To: gcc; +Cc: Gabriel Dos Reis, Andrew Pinski

On Tuesday 28 June 2005 07:12, Gabriel Dos Reis wrote:
> For the concrete case at issue, if the hardware I'm writing the C/C++
> programs for consistently displays modulo arithmetic for signed
> integer types, Andrew can you tell me why GCC should deny me access
> to that functionality where it actually can?

Because it disallows compiler transformations?  E.g. suddenly a
loop with a signed variable as the loop counter may wrap around,
which means some transformations that are safe now would
no longer be safe.

Gr.
Steven

^ permalink raw reply	[flat|nested] 119+ messages in thread

* Re: signed is undefined and has been since 1992 (in GCC)
  2005-06-28  5:34       ` Andrew Pinski
@ 2005-06-28  6:01         ` Gabriel Dos Reis
  2005-06-28  9:18           ` Robert Dewar
  2005-06-28  7:25         ` Michael Veksler
  1 sibling, 1 reply; 119+ messages in thread
From: Gabriel Dos Reis @ 2005-06-28  6:01 UTC (permalink / raw)
  To: Andrew Pinski; +Cc: gcc mailing list

Andrew Pinski <pinskia@physics.uc.edu> writes:

| On Jun 28, 2005, at 1:12 AM, Gabriel Dos Reis wrote:
| 
| > Andrew Pinski <pinskia@physics.uc.edu> writes:
| >
| > | On Jun 28, 2005, at 12:34 AM, Gabriel Dos Reis wrote:
| > |
| > | > The attitude that "undefined behaviour" should be interpreted
| > | > as "we should not make things more useful when we can" is beyond
| > | > understanding.
| > |
| > | Then C/C++ aliasing rules go out the window really, or maybe I
| > | misunderstand what you are trying to say?
| >
| > yes, you misunderstand what I'm saying.
| 
| But you did not explain yourself fully then; I still don't understand.
| Here is the full quote from the C99 standard about what undefined
| behavior is:

Andrew --

  Nobody is denying that signed integer overflow is "undefined behaviour".
So, your repeatedly saying "but, look the standard says it is undefined
behaviour" is irrelevant to the discussion; it is only recitation that
does not help make progress.

What people are saying is that "undefined behaviour" does not
necessarily mean "Go'auld semantics".  Is that hard to understand?

[...]

| See, it even points out integer overflow as a good example.  See also
| how it says the standard imposes no requirement, which means the
| compiler should be able to erase the hard drive each and every time
| you invoke undefined behavior.

and it should also be able to take your life.  Do you want it to actually
do it?  If yes, I suggest you create your own compiler that does that
and leave us to work on a compiler that does something more positive.

-- Gaby

^ permalink raw reply	[flat|nested] 119+ messages in thread

* Re: signed is undefined and has been since 1992 (in GCC)
  2005-06-28  5:13     ` Gabriel Dos Reis
@ 2005-06-28  5:34       ` Andrew Pinski
  2005-06-28  6:01         ` Gabriel Dos Reis
  2005-06-28  7:25         ` Michael Veksler
  2005-06-28  6:55       ` Steven Bosscher
  1 sibling, 2 replies; 119+ messages in thread
From: Andrew Pinski @ 2005-06-28  5:34 UTC (permalink / raw)
  To: Gabriel Dos Reis; +Cc: gcc mailing list


On Jun 28, 2005, at 1:12 AM, Gabriel Dos Reis wrote:

> Andrew Pinski <pinskia@physics.uc.edu> writes:
>
> | On Jun 28, 2005, at 12:34 AM, Gabriel Dos Reis wrote:
> |
> | > The attitude that "undefined behaviour" should be interpreted
> | > as "we should not make things more useful when we can" is beyond
> | > understanding.
> |
> | Then C/C++ aliasing rules go out the window really, or maybe I
> | misunderstand what you are trying to say?
>
> yes, you misunderstand what I'm saying.

But you did not explain yourself fully then; I still don't understand.
Here is the full quote from the C99 standard about what undefined
behavior is:

1 behavior, upon use of a nonportable or erroneous program construct or
of erroneous data, for which this International Standard imposes no
requirements

2 NOTE  Possible undefined behavior ranges from ignoring the situation
completely with unpredictable results, to behaving during translation
or program execution in a documented manner characteristic of the
environment (with or without the issuance of a diagnostic message), to
terminating a translation or execution (with the issuance of a
diagnostic message).

3 EXAMPLE  An example of undefined behavior is the behavior on integer
overflow.

See, it even points out integer overflow as a good example.  See also
how it says the standard imposes no requirement, which means the
compiler should be able to erase the hard drive each and every time
you invoke undefined behavior.

> | And what about casting functions to a different function type and
> | calling that, we just declared it as calling a trap in the last couple
> | of years.
>
> That is a type constraint violation that leads to subtle runtime
> errors, so we did actually improve things by catching (potential)
> errors earlier.

So is wrapping; what is the difference?  If I multiply a large positive
number by another large positive number, I will get an overflow.  Since
that is undefined, I could get a positive number, a negative one, or a
trap (or even my hard drive erased, which is what I deserved).
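
For illustration only, one way a program can avoid depending on any of
those outcomes is to detect the overflow before it happens.  This is a
sketch with a hypothetical helper name, and it assumes long long is
wider than int, which holds on common targets but is not guaranteed by
the standard.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical helper: returns 1 if a * b does not fit in an int.
   The product is formed in long long, where it cannot overflow as
   long as long long is wider than int (checked below). */
int mul_would_overflow(int a, int b)
{
    long long wide;
    assert(sizeof(long long) > sizeof(int));
    wide = (long long)a * (long long)b;
    return wide > INT_MAX || wide < INT_MIN;
}
```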

> For the concrete case at issue, if the hardware I'm writing the C/C++
> programs for consistently displays modulo arithmetic for signed
> integer types, Andrew can you tell me why GCC should deny me access
> to that functionality where it actually can?

It does not; use -fwrapv if you want that behavior.  GCC is not denying
you anything; at best it is giving you two different options: a fast
optimizing option and one that follows what you want.
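
As a rough sketch of what the choice means for user code: a test such as
"a + b < a" performed after the addition only makes sense if the
addition wraps, so it relies on -fwrapv.  The hypothetical helpers below
show a check that gives the same answer under either option because it
never performs an overflowing addition.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical helper: 1 if a + b would overflow, decided without
   performing the overflowing addition. */
int add_would_overflow(int a, int b)
{
    if (b > 0)
        return a > INT_MAX - b;
    return a < INT_MIN - b;
}

/* Hypothetical helper: addition that insists on its precondition. */
int checked_add(int a, int b)
{
    assert(!add_would_overflow(a, b));
    return a + b;
}
```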


> "Denying" here means that
> it does not give me access to that consistent hardware behaviour.
> None of the items on the list you gave falls into that category.
> Please, do remember that "undefined behaviour" is a catch-all basket
> and two things in that basket are not necessarily "equally evil".

Well, then why are there implementation-defined behaviors?  It sounds
to me like you want it to be included there instead.

>  So,
> please, do refrain from reasoning like "since we did X for Y and Y was
> undefined behaviour, we should do the same for Z."  "Undefined
> behaviour" isn't a 0 or 1 thingy, even though it is about computers.
> You need to evaluate them on case-by-case basis.

No, reread what the standard says: we don't need to evaluate them case
by case; that is what implementation-defined behavior is for.  Maybe this
should have been made that, but it was not.  So file a DR for it instead
of saying GCC should do something when it is already doing what the
standard says it can do.

-- Pinski

^ permalink raw reply	[flat|nested] 119+ messages in thread

* Re: signed is undefined and has been since 1992 (in GCC)
  2005-06-28  4:50   ` Andrew Pinski
@ 2005-06-28  5:13     ` Gabriel Dos Reis
  2005-06-28  5:34       ` Andrew Pinski
  2005-06-28  6:55       ` Steven Bosscher
  0 siblings, 2 replies; 119+ messages in thread
From: Gabriel Dos Reis @ 2005-06-28  5:13 UTC (permalink / raw)
  To: Andrew Pinski; +Cc: gcc mailing list

Andrew Pinski <pinskia@physics.uc.edu> writes:

| On Jun 28, 2005, at 12:34 AM, Gabriel Dos Reis wrote:
| 
| > The attitude that "undefined behaviour" should be interpreted
| > as "we should not make things more useful when we can" is beyond
| > understanding.
| 
| Then C/C++ aliasing rules go out the window really, or maybe I
| misunderstand what you are trying to say?

yes, you misunderstand what I'm saying.

| And what about casting functions to a different function type and
| calling that, we just declared it as calling a trap in the last couple
| of years.

That is a type constraint violation that leads to subtle runtime
errors, so we did actually improve things by catching (potential)
errors earlier. 

As a concrete case in point, the C++ committee just decided at the
last meeting in Norway to "upgrade" casts between void* and pointer to
function types from "undefined behaviour" to "conditionally supported"
-- and interestingly it led to vigorous requests from library and
application programmers that compilers document what they are
doing in that area.  GCC had been a lead there.

For the concrete case at issue, if the hardware I'm writing the C/C++
programs for consistently displays modulo arithmetic for signed
integer types, Andrew can you tell me why GCC should deny me access
to that functionality where it actually can?  "Denying" here means that
it does not give me access to that consistent hardware behaviour.
None of the items on the list you gave falls into that category.
Please, do remember that "undefined behaviour" is a catch-all basket
and two things in that basket are not necessarily "equally evil".  So,
please, do refrain from reasoning like "since we did X for Y and Y was
undefined behaviour, we should do the same for Z."  "Undefined
behaviour" isn't a 0 or 1 thingy, even though it is about computers.
You need to evaluate them on case-by-case basis.

-- Gaby

^ permalink raw reply	[flat|nested] 119+ messages in thread

* Re: signed is undefined and has been since 1992 (in GCC)
  2005-06-28  4:34 ` Gabriel Dos Reis
@ 2005-06-28  4:50   ` Andrew Pinski
  2005-06-28  5:13     ` Gabriel Dos Reis
  0 siblings, 1 reply; 119+ messages in thread
From: Andrew Pinski @ 2005-06-28  4:50 UTC (permalink / raw)
  To: Gabriel Dos Reis; +Cc: gcc mailing list


On Jun 28, 2005, at 12:34 AM, Gabriel Dos Reis wrote:

> The attitude that "undefined behaviour" should be interpreted
> as "we should not make things more useful when we can" is beyond
> understanding.

Then C/C++ aliasing rules go out the window really, or maybe I
misunderstand what you are trying to say?

And what about casting functions to a different function type and
calling that?  We just declared it as calling a trap in the last couple
of years.  That is not very useful really.  What about var_args with
shorts?  That is not useful, but since it is undefined, we just call
trap on it.

Or even sequence points: we get fewer of those bugs than C/C++ aliasing
rule violations, but still get some even though we document that they
are undefined and change with optimizations.

The list can go on with the undefined behavior we have changed in
recent years (the past 5 years).  Part of the C++ aliasing rules were not
implemented until at least 3.3, which was only 2 years ago.

-- Pinski

^ permalink raw reply	[flat|nested] 119+ messages in thread

* Re: signed is undefined and has been since 1992 (in GCC)
  2005-06-28  4:08 Andrew Pinski
  2005-06-28  4:20 ` Michael Veksler
@ 2005-06-28  4:34 ` Gabriel Dos Reis
  2005-06-28  4:50   ` Andrew Pinski
  1 sibling, 1 reply; 119+ messages in thread
From: Gabriel Dos Reis @ 2005-06-28  4:34 UTC (permalink / raw)
  To: Andrew Pinski; +Cc: gcc mailing list

Andrew Pinski <pinskia@physics.uc.edu> writes:

| The first change in GCC which changed signed overflow/wrapping to be
| undefined was added back in 1992 in loop.c.  The next change was in
| 1999 with the addition of simplify-rtx.c.  Why are we talking about
| this now, instead of back when they were added?  (Note both of these
| changes were before -fwrapv came into play.)

Because the world has evolved, we have gained more experience, more
users and there are opportunities to make GCC useful to more people?  

But, you do have a point; in 1992, you weren't here, I wasn't here, GCC
development was not as open as it is today for a wider group of people
to scrutinize and contribute to, and we could not have discussed it.
But it does not really matter that we did not discuss it in 1992 or
1999.  We don't have a time travel machine to change the past.  But we
can make a difference for the
future.  The attitude that "undefined behaviour" should be interpreted
as "we should not make things more useful when we can" is beyond
understanding. 

-- Gaby

^ permalink raw reply	[flat|nested] 119+ messages in thread

* Re: signed is undefined and has been since 1992 (in GCC)
  2005-06-28  4:08 Andrew Pinski
@ 2005-06-28  4:20 ` Michael Veksler
  2005-06-28  9:49   ` Robert Dewar
  2005-06-28  4:34 ` Gabriel Dos Reis
  1 sibling, 1 reply; 119+ messages in thread
From: Michael Veksler @ 2005-06-28  4:20 UTC (permalink / raw)
  To: Andrew Pinski; +Cc: gcc mailing list


Andrew Pinski wrote on 28/06/2005 07:08:33:

> The first change in GCC which changed signed overflow/wrapping to be
> undefined was added back in 1992 in loop.c.  The next change was in
> 1999 with the addition of simplify-rtx.c.  Why are we talking about
> this now, instead of back when they were added?  (Note both of these
> changes were before -fwrapv came into play.)
>
I don't mind INT_MAX+1 being undefined by gcc. I object to using
"undefined" to conclude that is_modulo should be true. This does not
make practical sense. Drawing conclusions from "undefined" can yield
absurd results.

^ permalink raw reply	[flat|nested] 119+ messages in thread

* signed is undefined and has been since 1992 (in GCC)
@ 2005-06-28  4:08 Andrew Pinski
  2005-06-28  4:20 ` Michael Veksler
  2005-06-28  4:34 ` Gabriel Dos Reis
  0 siblings, 2 replies; 119+ messages in thread
From: Andrew Pinski @ 2005-06-28  4:08 UTC (permalink / raw)
  To: gcc mailing list

The first change in GCC which changed signed overflow/wrapping to be
undefined was added back in 1992 in loop.c.  The next change was in
1999 with the addition of simplify-rtx.c.  Why are we talking about
this now, instead of back when they were added?  (Note both of these
changes were before -fwrapv came into play.)

-- Pinski

^ permalink raw reply	[flat|nested] 119+ messages in thread

end of thread, other threads:[~2005-07-14  7:21 UTC | newest]

Thread overview: 119+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2005-06-28 16:59 signed is undefined and has been since 1992 (in GCC) Morten Welinder
2005-06-28 17:23 ` Olivier Galibert
2005-06-28 18:44 ` Michael Veksler
     [not found] <2382433.1119938227627.JavaMail.root@dtm1eusosrv72.dtm.ops.eu.uu.net>
2005-06-28 19:44 ` Toon Moene
  -- strict thread matches above, loose matches on Subject: below --
2005-06-28 17:41 Paul Schlie
2005-06-28  4:08 Andrew Pinski
2005-06-28  4:20 ` Michael Veksler
2005-06-28  9:49   ` Robert Dewar
2005-06-28  4:34 ` Gabriel Dos Reis
2005-06-28  4:50   ` Andrew Pinski
2005-06-28  5:13     ` Gabriel Dos Reis
2005-06-28  5:34       ` Andrew Pinski
2005-06-28  6:01         ` Gabriel Dos Reis
2005-06-28  9:18           ` Robert Dewar
2005-06-28 11:50             ` Gabriel Dos Reis
2005-06-28 12:07               ` Robert Dewar
2005-06-28 12:33                 ` Gabriel Dos Reis
2005-06-28 12:57                   ` Robert Dewar
2005-06-28 13:19                     ` Gabriel Dos Reis
2005-06-28 22:58                       ` Georg Bauhaus
2005-06-28 23:53                         ` Gabriel Dos Reis
2005-06-29  0:27                           ` Robert Dewar
2005-06-29  0:43                             ` Gabriel Dos Reis
2005-06-29  0:48                               ` Robert Dewar
2005-06-29  1:14                                 ` Gabriel Dos Reis
2005-06-29  1:21                                   ` Diego Novillo
2005-06-29  2:19                                     ` Marcin Dalecki
2005-06-29  3:13                                       ` Scott Robert Ladd
2005-06-28 14:24                     ` Olivier Galibert
2005-06-28 14:28                       ` Jonathan Wilson
2005-06-28 14:42                         ` Olivier Galibert
2005-06-28 14:39                       ` Dave Korn
2005-06-28 14:52                         ` Olivier Galibert
2005-06-28 15:01                           ` Robert Dewar
2005-06-28 15:04                           ` Andrew Haley
2005-06-28 17:18                             ` Olivier Galibert
2005-06-28 17:36                               ` Dave Korn
2005-06-28 18:02                                 ` Olivier Galibert
2005-06-28 18:36                                   ` Dave Korn
2005-06-28 18:56                                     ` Gabriel Dos Reis
2005-06-28 19:10                                     ` Olivier Galibert
2005-06-28 19:13                                       ` Andrew Pinski
2005-06-28 19:20                                         ` Robert Dewar
2005-06-28 21:48                                           ` Joe Buck
2005-06-28 19:25                                         ` Gabriel Dos Reis
2005-06-28 19:32                                           ` Robert Dewar
2005-06-28 19:48                                             ` Gabriel Dos Reis
2005-06-28 20:37                                               ` Robert Dewar
2005-06-28 20:58                                                 ` Gabriel Dos Reis
2005-06-28 21:57                                                   ` Robert Dewar
2005-06-28 21:44                                             ` Joe Buck
2005-06-28 21:50                                               ` Olivier Galibert
2005-06-28 21:59                                               ` Robert Dewar
2005-06-28 18:52                                   ` Robert Dewar
2005-06-28 19:17                                     ` Olivier Galibert
2005-06-28 19:21                                       ` Robert Dewar
2005-06-28 20:18                                         ` Paul Koning
2005-06-28 20:24                                           ` Robert Dewar
2005-06-28 21:41                                             ` Joe Buck
2005-06-28 21:53                                         ` Michael Veksler
2005-06-28 23:05                                           ` Michael Veksler
2005-07-02 17:15                                         ` Florian Weimer
2005-07-02 18:59                                           ` Gabriel Dos Reis
2005-07-02 23:20                                             ` Robert Dewar
2005-07-03  0:07                                               ` Gabriel Dos Reis
2005-07-03  9:49                                                 ` Robert Dewar
2005-07-02 23:12                                           ` Nicholas Nethercote
2005-07-02 23:20                                           ` Robert Dewar
2005-07-03  0:13                                             ` Gabriel Dos Reis
2005-07-03  9:54                                               ` Robert Dewar
2005-07-03 10:02                                                 ` Florian Weimer
2005-07-03 10:10                                                   ` Robert Dewar
2005-07-03 12:01                                                 ` Gabriel Dos Reis
2005-07-14  7:21                                           ` Marc Espie
2005-07-02 17:06                                 ` Florian Weimer
2005-06-28 17:51                               ` Joe Buck
2005-06-28 18:21                                 ` Gabriel Dos Reis
2005-06-28 18:53                                   ` Robert Dewar
2005-06-28 18:28                                 ` Olivier Galibert
2005-06-28 18:38                                   ` Dave Korn
2005-06-28 18:50                               ` Robert Dewar
2005-06-28 19:02                                 ` Gabriel Dos Reis
2005-06-28 19:17                                   ` Robert Dewar
2005-06-28 19:43                                     ` Gabriel Dos Reis
2005-06-28 20:31                                       ` Robert Dewar
2005-06-28 20:51                                         ` Gabriel Dos Reis
2005-06-28 20:59                                           ` Robert Dewar
2005-06-28 21:20                                             ` Gabriel Dos Reis
2005-06-28 21:27                                               ` Paul Koning
2005-06-28 21:39                                                 ` Andreas Schwab
2005-06-28 21:35                                               ` Joe Buck
2005-06-28 22:09                                     ` Joseph S. Myers
2005-06-28 22:16                                       ` Falk Hueffner
2005-06-29  6:59                                         ` Eric Botcazou
2005-06-28 22:19                                       ` Robert Dewar
2005-06-28 16:42                         ` Joe Buck
2005-06-28 17:10                           ` Dave Korn
2005-06-28 17:21                             ` Joe Buck
2005-06-28 22:41                               ` Georg Bauhaus
2005-06-28 14:47                       ` Gabriel Dos Reis
2005-06-28 16:38                     ` Joe Buck
2005-06-28 21:59                     ` Mike Stump
2005-06-28 13:47                   ` Gabriel Paubert
2005-06-28 13:52                     ` Andrew Pinski
2005-06-28 14:33                     ` Robert Dewar
2005-06-28 12:08               ` Robert Dewar
2005-06-28 12:34                 ` Gabriel Dos Reis
2005-06-28  7:25         ` Michael Veksler
2005-06-28 16:32           ` Joe Buck
2005-06-28 16:56             ` Joe Buck
2005-06-28 17:03             ` Gabriel Dos Reis
2005-06-28 17:34               ` Joe Buck
2005-06-28 18:09                 ` Gabriel Dos Reis
2005-06-28 17:35               ` Diego Novillo
2005-06-28  6:55       ` Steven Bosscher
2005-06-28  7:20         ` Michael Veksler
2005-06-28  7:39           ` Falk Hueffner
2005-06-28 12:08             ` Gabriel Dos Reis
2005-06-28 12:01         ` Gabriel Dos Reis
