New no-undefined-overflow branch

public inbox for gcc@gcc.gnu.org
 help / color / mirror / Atom feed

* New no-undefined-overflow branch
@ 2009-02-26 11:05 Richard Guenther
  2009-02-26 11:28 ` Manuel López-Ibáñez
                   ` (4 more replies)
  0 siblings, 5 replies; 52+ messages in thread
From: Richard Guenther @ 2009-02-26 11:05 UTC (permalink / raw)
  To: gcc

There have been numerous issues in the past with respect to the
undefinedness of signed overflow in our IL, with the non-semantics
of sizetype and friends and with the pessimization of using
unsigned arithmetic vs. signed arithmetic in loop optimizations.

The goal of the no-undefined-overflow branch is to address all
these problems by tackling the single worst problem, imposing

  There shall be no construct in the GIMPLE IL that invokes
  undefined behavior.

Thus, from now on integer overflow is defined and will wrap with
the usual twos-complement semantics.  Thus, for the middle-end
-fwrapv is always in effect.  This extends to pointer overflow.

To support languages that have undefined semantics on overflowing
operations the middle-end gets new unary and binary operators
that implicitly encode value-range information about their operands
noting that the operation does not overflow.  These does-not-overflow
operators transform the undefined behavior into a valid assumption
and thus make the GIMPLE IL fully defined.

Thus it is now up to the frontend(s) and value-range analysis to
figure out if operations overflow and accordingly change the IL
to retain optimizations that were possible before.

As a start there will be no-overflow variants of NEGATE_EXPR,
PLUS_EXPR, MINUS_EXPR, MULT_EXPR and POINTER_PLUS_EXPR.

The sizetypes will simply be operated on in no-overflow variants
by default (by size_binop and friends).

Naming suggestions welcome, at the moment I consider NEGATEV_EXPR
(thus appending V to the first part).

Richard.

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: New no-undefined-overflow branch
  2009-02-26 11:05 New no-undefined-overflow branch Richard Guenther
@ 2009-02-26 11:28 ` Manuel López-Ibáñez
  2009-02-26 12:02   ` Richard Guenther
  2009-02-26 13:07 ` Joseph S. Myers
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 52+ messages in thread
From: Manuel López-Ibáñez @ 2009-02-26 11:28 UTC (permalink / raw)
  To: Richard Guenther; +Cc: gcc

2009/2/26 Richard Guenther <rguenther@suse.de>:
>
>  There shall be no construct in the GIMPLE IL that invokes
>  undefined behavior.

What about uninitialized variables?

Cheers,

Manuel.

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: New no-undefined-overflow branch
  2009-02-26 11:28 ` Manuel López-Ibáñez
@ 2009-02-26 12:02   ` Richard Guenther
  0 siblings, 0 replies; 52+ messages in thread
From: Richard Guenther @ 2009-02-26 12:02 UTC (permalink / raw)
  To: Manuel López-Ibáñez; +Cc: gcc

[-- Attachment #1: Type: TEXT/PLAIN, Size: 299 bytes --]

On Thu, 26 Feb 2009, Manuel LÃ³pez-IbÃ¡Ã±ez wrote:

> 2009/2/26 Richard Guenther <rguenther@suse.de>:
> >
> > Â There shall be no construct in the GIMPLE IL that invokes
> > Â undefined behavior.
> 
> What about uninitialized variables?

That's unrelated and something I do not care about.

Richard.

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: New no-undefined-overflow branch
  2009-02-26 11:05 New no-undefined-overflow branch Richard Guenther
  2009-02-26 11:28 ` Manuel López-Ibáñez
@ 2009-02-26 13:07 ` Joseph S. Myers
  2009-02-26 13:22   ` Richard Guenther
  2009-02-26 14:16 ` Dave Korn
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 52+ messages in thread
From: Joseph S. Myers @ 2009-02-26 13:07 UTC (permalink / raw)
  To: Richard Guenther; +Cc: gcc

On Thu, 26 Feb 2009, Richard Guenther wrote:

> As a start there will be no-overflow variants of NEGATE_EXPR,
> PLUS_EXPR, MINUS_EXPR, MULT_EXPR and POINTER_PLUS_EXPR.

So you would have both wrapping and no-overflow versions of these codes 
(with it of course always being valid for a transformation to convert the 
no-overflow version into a wrapping one if in a particular case that is 
beneficial), and the gimplification process would select which codes to 
use based on the -fwrapv setting?  Whereas trapping versions (any 
reimplementation of -ftrapv) would be expected to be implemented at 
gimplification time or before with explicit checks in GIMPLE rather than 
further variant codes?

(RTL has only the overflow-wraps operation variants, as well as having 
signed/unsigned operations where necessary instead of signed/unsigned 
types.  (It has overflow-traps versions as well in the present -ftrapv 
implementation.))

If anyone were to wish to make -fwrapv get INT_MIN / -1 and INT_MIN % -1 
right (see the discussions in PR 30484), where would this fit in this 
scheme?  Different operations for division and modulo operations according 
to whether these cases are defined, and a lowering stage carried out on 
GIMPLE at some point to convert the versions where they are defined to the 
versions where they aren't (inserting explicit checks for -1) if the 
underlying target instructions do not make these operations defined?  
(Both variants would embed an assumption that you are not dividing by 
integer 0.)

Making these semantics explicit in the IL instead of in global flags is of 
course a good thing.

-- 
Joseph S. Myers
joseph@codesourcery.com

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: New no-undefined-overflow branch
  2009-02-26 13:07 ` Joseph S. Myers
@ 2009-02-26 13:22   ` Richard Guenther
  0 siblings, 0 replies; 52+ messages in thread
From: Richard Guenther @ 2009-02-26 13:22 UTC (permalink / raw)
  To: Joseph S. Myers; +Cc: gcc

On Thu, 26 Feb 2009, Joseph S. Myers wrote:

> On Thu, 26 Feb 2009, Richard Guenther wrote:
> 
> > As a start there will be no-overflow variants of NEGATE_EXPR,
> > PLUS_EXPR, MINUS_EXPR, MULT_EXPR and POINTER_PLUS_EXPR.
> 
> So you would have both wrapping and no-overflow versions of these codes 
> (with it of course always being valid for a transformation to convert the 
> no-overflow version into a wrapping one if in a particular case that is 
> beneficial), and the gimplification process would select which codes to 
> use based on the -fwrapv setting?

Yes, there is a wrapping and a no-overflow variant (though the
no-overflow variant also wraps if it overflows - indeed, this is
so that existing build2 calls for example for PLUS_EXPR are
always valid).  I am thinking of not selecting the codes during
gimplification but instead force the frontends to choose whatever
they think match their semantics.

> Whereas trapping versions (any 
> reimplementation of -ftrapv) would be expected to be implemented at 
> gimplification time or before with explicit checks in GIMPLE rather than 
> further variant codes?

Yes.  I was thinking about overloading the no-overflow variants
for floating-point types to mean non-trapping, non-inf, non-nan
(thus a "fast-math" operation).  But that's for future thoughts.

> (RTL has only the overflow-wraps operation variants, as well as having 
> signed/unsigned operations where necessary instead of signed/unsigned 
> types.  (It has overflow-traps versions as well in the present -ftrapv 
> implementation.)

Yes, I am aware of that.

> If anyone were to wish to make -fwrapv get INT_MIN / -1 and INT_MIN % -1 
> right (see the discussions in PR 30484), where would this fit in this 
> scheme?  Different operations for division and modulo operations according 
> to whether these cases are defined, and a lowering stage carried out on 
> GIMPLE at some point to convert the versions where they are defined to the 
> versions where they aren't (inserting explicit checks for -1) if the 
> underlying target instructions do not make these operations defined?  
> (Both variants would embed an assumption that you are not dividing by 
> integer 0.)

The canonical way would be to have TRUNC_DIV_EXPR insert proper checks
at expansion time and TRUNC_DIVV_EXPR assume that the dividend isn't
INT_MIN at the same time as the divisor is -1 (I wonder if the
same implementation problem may exist for negating INT_MIN?  Or does
all hardware get this right?).

There is of course the question on whether we want to pay the
penalty of expanding TRUNC_DIV_EXPR "correct" or not.  I support the
idea of a separate target flag for this.

Note that I think this issue perfectly fits in the wrapping / no-overflow
scheme, just it isn't automagically solved ;)  (but we can use
value-range analysis to optimize the cases where we know we don't need
the checks).

> Making these semantics explicit in the IL instead of in global flags is of 
> course a good thing.

It's even needed for cross-language (and cross-TU) LTO.  I also think
it will end up creating more optimization opportunities rather than
less.

As with the first answer I hope that I get help from the frontend
maintainers properly emitting the no-overflow variants where they
think their language allows it.

Thanks,
Richard.

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: New no-undefined-overflow branch
  2009-02-26 11:05 New no-undefined-overflow branch Richard Guenther
  2009-02-26 11:28 ` Manuel López-Ibáñez
  2009-02-26 13:07 ` Joseph S. Myers
@ 2009-02-26 14:16 ` Dave Korn
  2009-02-26 23:08 ` Zdenek Dvorak
  2009-02-27 23:07 ` Diego Novillo
  4 siblings, 0 replies; 52+ messages in thread
From: Dave Korn @ 2009-02-26 14:16 UTC (permalink / raw)
  To: Richard Guenther; +Cc: gcc

Richard Guenther wrote:

> As a start there will be no-overflow variants of NEGATE_EXPR,
> PLUS_EXPR, MINUS_EXPR, MULT_EXPR and POINTER_PLUS_EXPR.

> Naming suggestions welcome, at the moment I consider NEGATEV_EXPR
> (thus appending V to the first part).

  Wow, an actual /invitation/ to bikeshed!  Ok then, I'd suggest that since
"V" means "overflow" in CPU flags, you want to add "NV" instead, no?

    cheers,
      DaveK

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: New no-undefined-overflow branch
  2009-02-26 11:05 New no-undefined-overflow branch Richard Guenther
                   ` (2 preceding siblings ...)
  2009-02-26 14:16 ` Dave Korn
@ 2009-02-26 23:08 ` Zdenek Dvorak
  2009-02-27 10:16   ` Richard Guenther
  2009-02-27 23:07 ` Diego Novillo
  4 siblings, 1 reply; 52+ messages in thread
From: Zdenek Dvorak @ 2009-02-26 23:08 UTC (permalink / raw)
  To: Richard Guenther; +Cc: gcc

Hi,

in general, I like this proposal a lot.  However,

> As a start there will be no-overflow variants of NEGATE_EXPR,
> PLUS_EXPR, MINUS_EXPR, MULT_EXPR and POINTER_PLUS_EXPR.
> 
> The sizetypes will simply be operated on in no-overflow variants
> by default (by size_binop and friends).
> 
> Naming suggestions welcome, at the moment I consider NEGATEV_EXPR
> (thus appending V to the first part).

introducing new codes seems like a bad idea to me.  There are many
places that do not care about the distinction between PLUS_EXPR and
PLUSV_EXPR, and handling both cases will complicate the code (see eg.
the problems caused by introducing POINTER_PLUS_EXPR vs PLUS_EXPR
distinction).  Why not just use a flag to mark the operation as
non-overflowing?

Zdenek

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: New no-undefined-overflow branch
  2009-02-26 23:08 ` Zdenek Dvorak
@ 2009-02-27 10:16   ` Richard Guenther
  2009-02-27 12:07     ` Dave Korn
  2009-02-27 17:58     ` Zdenek Dvorak
  0 siblings, 2 replies; 52+ messages in thread
From: Richard Guenther @ 2009-02-27 10:16 UTC (permalink / raw)
  To: Zdenek Dvorak; +Cc: gcc

On Fri, 27 Feb 2009, Zdenek Dvorak wrote:

> Hi,
> 
> in general, I like this proposal a lot.  However,
> 
> > As a start there will be no-overflow variants of NEGATE_EXPR,
> > PLUS_EXPR, MINUS_EXPR, MULT_EXPR and POINTER_PLUS_EXPR.
> > 
> > The sizetypes will simply be operated on in no-overflow variants
> > by default (by size_binop and friends).
> > 
> > Naming suggestions welcome, at the moment I consider NEGATEV_EXPR
> > (thus appending V to the first part).

First, for the bikeshedding part I agree that NV is probably a better
idea.

> introducing new codes seems like a bad idea to me.  There are many
> places that do not care about the distinction between PLUS_EXPR and
> PLUSV_EXPR, and handling both cases will complicate the code (see eg.
> the problems caused by introducing POINTER_PLUS_EXPR vs PLUS_EXPR
> distinction).  Why not just use a flag to mark the operation as
> non-overflowing?

I obviously thought about this.  The issue with using a flag is
that there is no convenient place to stick it and that it makes
the distinction between the two variants less visible.  Consider
the folding routines that take split trees for a start.

IMHO using new tree-codes is much less complicated than carrying
around flags.  I did consider putting flags on the tree code
itself, but that isn't going to make the changes easier either.

Richard.

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: New no-undefined-overflow branch
  2009-02-27 10:16   ` Richard Guenther
@ 2009-02-27 12:07     ` Dave Korn
  2009-02-27 13:17       ` Richard Guenther
  2009-02-27 18:07       ` Zdenek Dvorak
  2009-02-27 17:58     ` Zdenek Dvorak
  1 sibling, 2 replies; 52+ messages in thread
From: Dave Korn @ 2009-02-27 12:07 UTC (permalink / raw)
  To: Richard Guenther; +Cc: Zdenek Dvorak, gcc

Richard Guenther wrote:
> On Fri, 27 Feb 2009, Zdenek Dvorak wrote:

>> introducing new codes seems like a bad idea to me.  There are many
>> places that do not care about the distinction between PLUS_EXPR and
>> PLUSV_EXPR, and handling both cases will complicate the code (see eg.
>> the problems caused by introducing POINTER_PLUS_EXPR vs PLUS_EXPR
>> distinction).  Why not just use a flag to mark the operation as
>> non-overflowing?
> 
> I obviously thought about this.  The issue with using a flag is
> that there is no convenient place to stick it and that it makes
> the distinction between the two variants less visible.  Consider
> the folding routines that take split trees for a start.
> 
> IMHO using new tree-codes is much less complicated than carrying
> around flags.  I did consider putting flags on the tree code
> itself, but that isn't going to make the changes easier either.

  I think it's probably a far safer design too.

  If you suddenly introduce a new semantic for an existing code, suddenly
every current check for that tree code becomes a potential bug site that has
to be manually inspected to see if it's overflow-sensitive or not and
therefore whether it needs to test the new flag or not; people who don't know
about the new flag will carry on adding code that processes those tree codes
without knowing to test the flag; and the bugs that do arise will be hard to
find and result in silent bad codegen.

  If OTOH you add new tree codes, the meaning will be explicit, nobody using
them can be under any misapprehension about the overflow semantics, nobody can
forget to check the flag, and any places where they should be handled but
aren't will very quickly draw themselves to our attention by ICEing, or if
they don't will simply fall back to the safe option of not doing any
processing on a tree code that they don't recognize; that might lead to
pessimal code, but it shouldn't generate incorrect code.

    cheers,
      DaveK

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: New no-undefined-overflow branch
  2009-02-27 12:07     ` Dave Korn
@ 2009-02-27 13:17       ` Richard Guenther
  2009-02-27 13:35         ` Dave Korn
  2009-02-27 18:07       ` Zdenek Dvorak
  1 sibling, 1 reply; 52+ messages in thread
From: Richard Guenther @ 2009-02-27 13:17 UTC (permalink / raw)
  To: Dave Korn; +Cc: Zdenek Dvorak, gcc

On Fri, 27 Feb 2009, Dave Korn wrote:

> Richard Guenther wrote:
> > On Fri, 27 Feb 2009, Zdenek Dvorak wrote:
> 
> >> introducing new codes seems like a bad idea to me.  There are many
> >> places that do not care about the distinction between PLUS_EXPR and
> >> PLUSV_EXPR, and handling both cases will complicate the code (see eg.
> >> the problems caused by introducing POINTER_PLUS_EXPR vs PLUS_EXPR
> >> distinction).  Why not just use a flag to mark the operation as
> >> non-overflowing?
> > 
> > I obviously thought about this.  The issue with using a flag is
> > that there is no convenient place to stick it and that it makes
> > the distinction between the two variants less visible.  Consider
> > the folding routines that take split trees for a start.
> > 
> > IMHO using new tree-codes is much less complicated than carrying
> > around flags.  I did consider putting flags on the tree code
> > itself, but that isn't going to make the changes easier either.
> 
>   I think it's probably a far safer design too.
> 
>   If you suddenly introduce a new semantic for an existing code, suddenly
> every current check for that tree code becomes a potential bug site that has
> to be manually inspected to see if it's overflow-sensitive or not and
> therefore whether it needs to test the new flag or not; people who don't know
> about the new flag will carry on adding code that processes those tree codes
> without knowing to test the flag; and the bugs that do arise will be hard to
> find and result in silent bad codegen.
> 
>   If OTOH you add new tree codes, the meaning will be explicit, nobody using
> them can be under any misapprehension about the overflow semantics, nobody can
> forget to check the flag, and any places where they should be handled but
> aren't will very quickly draw themselves to our attention by ICEing, or if
> they don't will simply fall back to the safe option of not doing any
> processing on a tree code that they don't recognize; that might lead to
> pessimal code, but it shouldn't generate incorrect code.

It's definitely safer.  Still we have to carefully modify existing
code to deal with the new tree codes as most of it carelessly
transitiones old codes to new trees.  For example re-associating
(a +/nv b) + c to a +/nv (b + c) is wrong.

Richard.


^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: New no-undefined-overflow branch
  2009-02-27 13:17       ` Richard Guenther
@ 2009-02-27 13:35         ` Dave Korn
  2009-02-27 13:45           ` Richard Guenther
  0 siblings, 1 reply; 52+ messages in thread
From: Dave Korn @ 2009-02-27 13:35 UTC (permalink / raw)
  To: Richard Guenther; +Cc: Dave Korn, Zdenek Dvorak, gcc

Richard Guenther wrote:

> It's definitely safer.  Still we have to carefully modify existing
> code to deal with the new tree codes as most of it carelessly
> transitiones old codes to new trees.  For example re-associating
> (a +/nv b) + c to a +/nv (b + c) is wrong.

  Yes, of course we have to take care when processing the new codes, but the
wonderful thing that I was pointing out is that none of the current
fold/simplify code will attempt to reassociate the new codes at first, until
it's been explicitly taught about them, whereas if we had used a flag it would
plough straight ahead under the misapprehension that it understood the
semantics of the tree code, that's all.

  Anyway, your plan gets a big thumbs-up from me.  The noobs will be endlessly
glad when signed ints start overflowing in the way they'd hoped, and the
fortran-wielding-heavy-duty-massive-array-number-crunching-academic types can
use a commandline option to make sure they still get their "loop optimiser
assumes your index won't overflow so loop must terminate" optimisations, won't
they?

    cheers,
      DaveK

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: New no-undefined-overflow branch
  2009-02-27 13:35         ` Dave Korn
@ 2009-02-27 13:45           ` Richard Guenther
  0 siblings, 0 replies; 52+ messages in thread
From: Richard Guenther @ 2009-02-27 13:45 UTC (permalink / raw)
  To: Dave Korn; +Cc: Zdenek Dvorak, gcc

On Fri, 27 Feb 2009, Dave Korn wrote:

> Richard Guenther wrote:
> 
> > It's definitely safer.  Still we have to carefully modify existing
> > code to deal with the new tree codes as most of it carelessly
> > transitiones old codes to new trees.  For example re-associating
> > (a +/nv b) + c to a +/nv (b + c) is wrong.
> 
>   Yes, of course we have to take care when processing the new codes, but the
> wonderful thing that I was pointing out is that none of the current
> fold/simplify code will attempt to reassociate the new codes at first, until
> it's been explicitly taught about them, whereas if we had used a flag it would
> plough straight ahead under the misapprehension that it understood the
> semantics of the tree code, that's all.

Correct.

>   Anyway, your plan gets a big thumbs-up from me.  The noobs will be endlessly
> glad when signed ints start overflowing in the way they'd hoped, and the

Heh - this will only happen if the frontend does not indicate there
isn't overflow with signed operations.  Which of course it will unless
you specify -fwrapv.

> fortran-wielding-heavy-duty-massive-array-number-crunching-academic types can
> use a commandline option to make sure they still get their "loop optimiser
> assumes your index won't overflow so loop must terminate" optimisations, won't
> they?

Yes.  -fwrapv also shouldn't be as bad as we can re-instantiate the
no-wrapping operation codes during value-range analysis.

Richard.

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: New no-undefined-overflow branch
  2009-02-27 10:16   ` Richard Guenther
  2009-02-27 12:07     ` Dave Korn
@ 2009-02-27 17:58     ` Zdenek Dvorak
  2009-02-27 18:17       ` Richard Guenther
  1 sibling, 1 reply; 52+ messages in thread
From: Zdenek Dvorak @ 2009-02-27 17:58 UTC (permalink / raw)
  To: Richard Guenther; +Cc: gcc

Hi,

> > introducing new codes seems like a bad idea to me.  There are many
> > places that do not care about the distinction between PLUS_EXPR and
> > PLUSV_EXPR, and handling both cases will complicate the code (see eg.
> > the problems caused by introducing POINTER_PLUS_EXPR vs PLUS_EXPR
> > distinction).  Why not just use a flag to mark the operation as
> > non-overflowing?
> 
> I obviously thought about this.  The issue with using a flag is
> that there is no convenient place to stick it and that it makes
> the distinction between the two variants less visible.  Consider
> the folding routines that take split trees for a start.
> 
> IMHO using new tree-codes is much less complicated than carrying
> around flags.  I did consider putting flags on the tree code
> itself, but that isn't going to make the changes easier either.

OK, then what about this: introduce accessor functions like

tree_code get_operation_semantics (tree_code)
  -- returns PLUS_EXPR for PLUS_EXPR and PLUSNV_EXPR, etc.
bool get_operation_overflow (tree_code)
  -- obvious
tree_code operation_code (tree_code, bool)
  -- PLUS_EXPR, false -> PLUS_EXPR
  -- PLUS_EXPR, true -> PLUSNV_EXPR
  -- PLUSNV_EXPR, * -> abort
  etc.

(possibly with more clear names), and change the code to always
use them?

Zdenek

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: New no-undefined-overflow branch
  2009-02-27 12:07     ` Dave Korn
  2009-02-27 13:17       ` Richard Guenther
@ 2009-02-27 18:07       ` Zdenek Dvorak
  1 sibling, 0 replies; 52+ messages in thread
From: Zdenek Dvorak @ 2009-02-27 18:07 UTC (permalink / raw)
  To: Dave Korn; +Cc: Richard Guenther, gcc

Hi,

> > I obviously thought about this.  The issue with using a flag is
> > that there is no convenient place to stick it and that it makes
> > the distinction between the two variants less visible.  Consider
> > the folding routines that take split trees for a start.
> > 
> > IMHO using new tree-codes is much less complicated than carrying
> > around flags.  I did consider putting flags on the tree code
> > itself, but that isn't going to make the changes easier either.
> 
>   I think it's probably a far safer design too.
> 
>   If you suddenly introduce a new semantic for an existing code, suddenly
> every current check for that tree code becomes a potential bug site that has
> to be manually inspected to see if it's overflow-sensitive or not and
> therefore whether it needs to test the new flag or not; people who don't know
> about the new flag will carry on adding code that processes those tree codes
> without knowing to test the flag; and the bugs that do arise will be hard to
> find and result in silent bad codegen.
> 
>   If OTOH you add new tree codes, the meaning will be explicit, nobody using
> them can be under any misapprehension about the overflow semantics, nobody can
> forget to check the flag, and any places where they should be handled but
> aren't will very quickly draw themselves to our attention by ICEing, or if
> they don't will simply fall back to the safe option of not doing any
> processing on a tree code that they don't recognize; that might lead to
> pessimal code, but it shouldn't generate incorrect code.

you are way too optimistic.  Regardless of whether you introduce the new
codes or represent the new information in another way, the fact is
that this changes semantics of PLUS_EXPR in some cases, and it will
lead to a wave of bugs from places that you forgot to update or updated
incorrectly.  Also, adding new codes usually will not lead to ICEs on
places that do not handle them, just to missed optimizations (most
places that handle expressions have some kind of "default" -- if
this is not any of the operations that we handle specially, do
something generic).

Using special codes is in no way safer than having an extra flag -- even
with the extra flag, the default behavior for the newly created
operations would be that they may wrap, so there is really no
difference,

Zdenek

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: New no-undefined-overflow branch
  2009-02-27 17:58     ` Zdenek Dvorak
@ 2009-02-27 18:17       ` Richard Guenther
  0 siblings, 0 replies; 52+ messages in thread
From: Richard Guenther @ 2009-02-27 18:17 UTC (permalink / raw)
  To: Zdenek Dvorak; +Cc: gcc

On Fri, 27 Feb 2009, Zdenek Dvorak wrote:

> Hi,
> 
> > > introducing new codes seems like a bad idea to me.  There are many
> > > places that do not care about the distinction between PLUS_EXPR and
> > > PLUSV_EXPR, and handling both cases will complicate the code (see eg.
> > > the problems caused by introducing POINTER_PLUS_EXPR vs PLUS_EXPR
> > > distinction).  Why not just use a flag to mark the operation as
> > > non-overflowing?
> > 
> > I obviously thought about this.  The issue with using a flag is
> > that there is no convenient place to stick it and that it makes
> > the distinction between the two variants less visible.  Consider
> > the folding routines that take split trees for a start.
> > 
> > IMHO using new tree-codes is much less complicated than carrying
> > around flags.  I did consider putting flags on the tree code
> > itself, but that isn't going to make the changes easier either.
> 
> OK, then what about this: introduce accessor functions like
> 
> tree_code get_operation_semantics (tree_code)
>   -- returns PLUS_EXPR for PLUS_EXPR and PLUSNV_EXPR, etc.
> bool get_operation_overflow (tree_code)
>   -- obvious
> tree_code operation_code (tree_code, bool)
>   -- PLUS_EXPR, false -> PLUS_EXPR
>   -- PLUS_EXPR, true -> PLUSNV_EXPR
>   -- PLUSNV_EXPR, * -> abort
>   etc.
> 
> (possibly with more clear names), and change the code to always
> use them?

Yes.  In playing with an initial patch I see the need for these
kind of helpers.  Sofar I tried to limit myself to PLUS_EXPR_P ()
covering both kinds, but that certainly is not enough.

So I see the need for at least

  tree_code strip_nv (tree_code);
   -- returns PLUS_EXPR for PLUS_EXPR and PLUSNV_EXPR, etc.

I didn't yet come along a need for your last one. Maybe similar to the
second one is the requirement to somehow have a test when we want
to emit strict-overflow warnings - which we want for the NV codes
operating on signed types, possibly using TREE_NO_WARNING for
avoiding warnings for NV codes that value-range analysis inserted
(and, unfortunately there's no suitable place to put TREE_NO_WARNING
on fold arguments ... :/).

As of using them always - I will see how it works out.  For a start
I am able to miscompile gcc after just minimal fold surgery... :/

Thanks,
Richard.

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: New no-undefined-overflow branch
  2009-02-26 11:05 New no-undefined-overflow branch Richard Guenther
                   ` (3 preceding siblings ...)
  2009-02-26 23:08 ` Zdenek Dvorak
@ 2009-02-27 23:07 ` Diego Novillo
  2009-02-28  0:04   ` Joseph S. Myers
  2009-02-28 10:25   ` Richard Guenther
  4 siblings, 2 replies; 52+ messages in thread
From: Diego Novillo @ 2009-02-27 23:07 UTC (permalink / raw)
  To: Richard Guenther; +Cc: gcc

On Thu, Feb 26, 2009 at 06:05, Richard Guenther <rguenther@suse.de> wrote:


>  There shall be no construct in the GIMPLE IL that invokes
>  undefined behavior.

Excellent!  Thanks for starting this branch.

> Thus, from now on integer overflow is defined and will wrap with
> the usual twos-complement semantics.  Thus, for the middle-end
> -fwrapv is always in effect.  This extends to pointer overflow.

You mean that -fwrapv is always in effect for the *V_EXPR operations,
right?  PLUS_EXPR will always be defined as not overflowing.  It's up
to the front end to emit one or the other.

How should we define merging two different TUs compiled with different
-fwrapv settings?  This is something that may be common in the context
of LTO:

$ gcc -flto -fwrapv f1.c
$ gcc -flto f2.c
$ gcc -o f -O2 f1.o f2.o

We will be reading IL containing both overflow and non-overflow
operations.  We should define the combination rules for them.

> Naming suggestions welcome, at the moment I consider NEGATEV_EXPR
> (thus appending V to the first part).

As good a convention as any.


Diego.

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: New no-undefined-overflow branch
  2009-02-27 23:07 ` Diego Novillo
@ 2009-02-28  0:04   ` Joseph S. Myers
  2009-02-28 10:29     ` Richard Guenther
  2009-02-28 10:25   ` Richard Guenther
  1 sibling, 1 reply; 52+ messages in thread
From: Joseph S. Myers @ 2009-02-28  0:04 UTC (permalink / raw)
  To: Diego Novillo; +Cc: Richard Guenther, gcc

On Fri, 27 Feb 2009, Diego Novillo wrote:

> We will be reading IL containing both overflow and non-overflow
> operations.  We should define the combination rules for them.

The rules are simple:

* No transformation (of arithmetic operations, which is what we are 
discussing here) may change defined behavior for given inputs to undefined 
behavior for those same inputs.  (The reverse transformation is permitted.  
For example, a mixed set of addition/subtraction operations might usefully 
be converted to all overflow-defined operations to allow the operands to 
be rearranged and cancellation to take place.  For example, if both 
versions of a+b with the same operands are in use at some point, the 
overflow-undefined version need not be computed, only the overflow-wraps 
version.)

* No transformation may change defined behavior for given inputs to 
different, defined behavior for those inputs.

-- 
Joseph S. Myers
joseph@codesourcery.com

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: New no-undefined-overflow branch
  2009-02-27 23:07 ` Diego Novillo
  2009-02-28  0:04   ` Joseph S. Myers
@ 2009-02-28 10:25   ` Richard Guenther
  1 sibling, 0 replies; 52+ messages in thread
From: Richard Guenther @ 2009-02-28 10:25 UTC (permalink / raw)
  To: Diego Novillo; +Cc: gcc

[-- Attachment #1: Type: TEXT/PLAIN, Size: 1850 bytes --]

On Fri, 27 Feb 2009, Diego Novillo wrote:

> On Thu, Feb 26, 2009 at 06:05, Richard Guenther <rguenther@suse.de> wrote:
> 
> 
> > Â There shall be no construct in the GIMPLE IL that invokes
> > Â undefined behavior.
> 
> Excellent!  Thanks for starting this branch.
> 
> > Thus, from now on integer overflow is defined and will wrap with
> > the usual twos-complement semantics. Â Thus, for the middle-end
> > -fwrapv is always in effect. Â This extends to pointer overflow.
> 
> You mean that -fwrapv is always in effect for the *V_EXPR operations,
> right?  PLUS_EXPR will always be defined as not overflowing.  It's up
> to the front end to emit one or the other.

-fwrapv is always in effect for the regular variants (PLUS_EXPR).
It is also in effect for the *NV_EXPR variants (yeah, the *V naming
was confusing...), but we know that overflow doesn't happen for them
so it doesn't matter what semantics overflow has - thus you can as
well say -fno-wrapv is in effect for them - but! - the *NV_EXPR
variants also exist for unsigned arithmetic just specifying the
same restriction - no overflow.

> How should we define merging two different TUs compiled with different
> -fwrapv settings?  This is something that may be common in the context
> of LTO:
> 
> $ gcc -flto -fwrapv f1.c
> $ gcc -flto f2.c
> $ gcc -o f -O2 f1.o f2.o
> 
> We will be reading IL containing both overflow and non-overflow
> operations.  We should define the combination rules for them.

There will be no use of flag_wrapv left after the frontends finished,
everything is encoded in the IL.  So combination of -fwrapv and
-fno-wrapv CUs is just "combine them".

> > Naming suggestions welcome, at the moment I consider NEGATEV_EXPR
> > (thus appending V to the first part).
> 
> As good a convention as any.

As was noted, NEGATENV_EXPR is less confusing.

Richard.

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: New no-undefined-overflow branch
  2009-02-28  0:04   ` Joseph S. Myers
@ 2009-02-28 10:29     ` Richard Guenther
  2009-03-05 17:24       ` Geert Bosch
  0 siblings, 1 reply; 52+ messages in thread
From: Richard Guenther @ 2009-02-28 10:29 UTC (permalink / raw)
  To: Joseph S. Myers; +Cc: Diego Novillo, gcc

On Sat, 28 Feb 2009, Joseph S. Myers wrote:

> On Fri, 27 Feb 2009, Diego Novillo wrote:
> 
> > We will be reading IL containing both overflow and non-overflow
> > operations.  We should define the combination rules for them.
> 
> The rules are simple:
> 
> * No transformation (of arithmetic operations, which is what we are 
> discussing here) may change defined behavior for given inputs to undefined 
> behavior for those same inputs.  (The reverse transformation is permitted.  
> For example, a mixed set of addition/subtraction operations might usefully 
> be converted to all overflow-defined operations to allow the operands to 
> be rearranged and cancellation to take place.  For example, if both 
> versions of a+b with the same operands are in use at some point, the 
> overflow-undefined version need not be computed, only the overflow-wraps 
> version.)
> 
> * No transformation may change defined behavior for given inputs to 
> different, defined behavior for those inputs.

Indeed.  A safe way of combining mixed trees is to just drop the
*NV_EXPR variants to *_EXPR variants on the result.  Consider
(a -/nv 20) +/nv (b -/nv 20) where we see a re-association
opportunity to combine both constants.  The result is (a + b) - 40,
_not_ (a +/nv b) -/nv 40.  On trunk we have to disable this
re-association because the result would still be undefined overflow
(we don't have means to avoid this - other than doing the arithmetic
unsigned and perform conversions of course).

Richard.

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: New no-undefined-overflow branch
  2009-02-28 10:29     ` Richard Guenther
@ 2009-03-05 17:24       ` Geert Bosch
  2009-03-06 12:28         ` Richard Guenther
  0 siblings, 1 reply; 52+ messages in thread
From: Geert Bosch @ 2009-03-05 17:24 UTC (permalink / raw)
  To: Richard Guenther; +Cc: Joseph S. Myers, Diego Novillo, gcc

Hi Richard,

Great to see that you're addressing this issue. If I understand  
correctly,
for RTL all operations are always wrapping, right?

I have been considering adding "V" variants for operations that trap on
overflow. The main reason I have not (yet) pursued this, is the daunting
task of teaching the folders about all these new codes. initially I'd
like to lower the "V" operations to explicit checks (calling abort() or
raising an exception, depending on the language) during gimplification,
but with the idea of eventually delaying expansion as more code learns
how to handle the new expressions. I already have most of the code  
necessary
for the expansions, as I now do them during translation of Ada trees to
GENERIC trees. Actually, the new and saner wrapping semantics for
{PLUS,MINUS,MULT}_EXPR simplify these a bit, by avoiding the need to use
unsigned types.

As you obviously doing something very similar now by introducing "NV"  
variants,
do you think this would fit in with your scheme? If so, I'd be happy  
to try
and expand on your work to have, wrapping, no-overflow and overflow- 
checking
variants for basic arithmetic.

   -Geert

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: New no-undefined-overflow branch
  2009-03-05 17:24       ` Geert Bosch
@ 2009-03-06 12:28         ` Richard Guenther
  2009-03-06 13:54           ` Joseph S. Myers
                             ` (2 more replies)
  0 siblings, 3 replies; 52+ messages in thread
From: Richard Guenther @ 2009-03-06 12:28 UTC (permalink / raw)
  To: Geert Bosch; +Cc: Joseph S. Myers, Diego Novillo, gcc

On Thu, 5 Mar 2009, Geert Bosch wrote:

> Hi Richard,
> 
> Great to see that you're addressing this issue. If I understand correctly,
> for RTL all operations are always wrapping, right?

That's true (we don't have signedness information there) and I don't
plan to change that.

> I have been considering adding "V" variants for operations that trap on
> overflow. The main reason I have not (yet) pursued this, is the daunting
> task of teaching the folders about all these new codes. initially I'd
> like to lower the "V" operations to explicit checks (calling abort() or
> raising an exception, depending on the language) during gimplification,
> but with the idea of eventually delaying expansion as more code learns
> how to handle the new expressions. I already have most of the code necessary
> for the expansions, as I now do them during translation of Ada trees to
> GENERIC trees. Actually, the new and saner wrapping semantics for
> {PLUS,MINUS,MULT}_EXPR simplify these a bit, by avoiding the need to use
> unsigned types.

Correct.

> As you obviously doing something very similar now by introducing "NV"
> variants,
> do you think this would fit in with your scheme? If so, I'd be happy to try
> and expand on your work to have, wrapping, no-overflow and overflow-checking
> variants for basic arithmetic.

I didn't spend too much time thinking about the trapping variants
(well, I believe it isn't that important ;)).  But in general we would
have to expand the non-NV variants via the trapping expanders
if flag_trapv was true (so yeah, combining TUs with different flag_trapv
settings would be difficult again and it would ask for explicitly
encoding this variant in the IL ...).

There is of course the problem that we have to be careful not to
introduce new traps via folding, a problem that doesn't exist with
the no-overflow variants (I can simply drop to the wrapping variants).
With for example  (a -/v 10) +/v 10 would you want to preserve
the possibly trapping a -/v 10?  For (a -/v 10) +/v (b -/v 10) do
we want to be careful to not introduce extra traps when simplifying
to (a +/v b) -/v 20?

So while trapping variants can certainly be introduced it looks like
this task may be more difficult.  So lowering them early during
gimplification looks like a more reasonable plan IMHO.

Richard.

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: New no-undefined-overflow branch
  2009-03-06 12:28         ` Richard Guenther
@ 2009-03-06 13:54           ` Joseph S. Myers
  2009-03-06 14:10             ` Richard Guenther
  2009-03-06 14:44           ` Paolo Bonzini
  2009-03-06 17:21           ` Geert Bosch
  2 siblings, 1 reply; 52+ messages in thread
From: Joseph S. Myers @ 2009-03-06 13:54 UTC (permalink / raw)
  To: Richard Guenther; +Cc: Geert Bosch, Diego Novillo, gcc

On Fri, 6 Mar 2009, Richard Guenther wrote:

> There is of course the problem that we have to be careful not to
> introduce new traps via folding, a problem that doesn't exist with
> the no-overflow variants (I can simply drop to the wrapping variants).
> With for example  (a -/v 10) +/v 10 would you want to preserve
> the possibly trapping a -/v 10?  For (a -/v 10) +/v (b -/v 10) do
> we want to be careful to not introduce extra traps when simplifying
> to (a +/v b) -/v 20?

I think we should preserve the property of whether the code evaluated 
between sequence points causes no traps, or at least one trap.  (But need 
not preserve the difference between one trap and two, for example.  This 
is the same as C99 Annex F requirements for floating-point exceptions: the 
set of exceptions produced between calls to fenv.h functions checking or 
modifying exception state needs to be preserved, but not the number (> 0) 
of times each exception is produced.)

This probably means that most folding should not apply to trapping 
variants.

(In principle I believe folding should be done on GIMPLE rather than 
folding functions being called from the front ends on trees.  If you do 
always lower to explicit overflow checks and only run folding after that 
lowering stage, the problem of changing whether a trap occurs does not 
arise.  But with trapping variants in GIMPLE, the folding code would still 
need to allow for them even with folding all happening on GIMPLE.)

> So while trapping variants can certainly be introduced it looks like
> this task may be more difficult.  So lowering them early during
> gimplification looks like a more reasonable plan IMHO.

If you lower to whatever forms of checks for overflow can be conveniently 
expressed in GIMPLE then you have the interesting problem of 
reconstituting trapping operations for those targets that have trapping 
instructions or instructions setting overflow flags that can be checked 
directly.  It's possible that you want to lower to explicit overflow 
checks for targets without the operations, but to trapping operation codes 
for targets with the operations.  (Which would suggest starting with 
trapping variants, and then lowering to explicit checks part way through 
the GIMPLE optimizations, only on targets without suitable instructions 
for the relevant type precision and depending on -Os to determine whether 
inline code or libgcc function calls might be better.)

-- 
Joseph S. Myers
joseph@codesourcery.com

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: New no-undefined-overflow branch
  2009-03-06 13:54           ` Joseph S. Myers
@ 2009-03-06 14:10             ` Richard Guenther
  2009-03-06 14:29               ` Joseph S. Myers
  0 siblings, 1 reply; 52+ messages in thread
From: Richard Guenther @ 2009-03-06 14:10 UTC (permalink / raw)
  To: Joseph S. Myers; +Cc: Geert Bosch, Diego Novillo, gcc

On Fri, 6 Mar 2009, Joseph S. Myers wrote:

> On Fri, 6 Mar 2009, Richard Guenther wrote:
> 
> > There is of course the problem that we have to be careful not to
> > introduce new traps via folding, a problem that doesn't exist with
> > the no-overflow variants (I can simply drop to the wrapping variants).
> > With for example  (a -/v 10) +/v 10 would you want to preserve
> > the possibly trapping a -/v 10?  For (a -/v 10) +/v (b -/v 10) do
> > we want to be careful to not introduce extra traps when simplifying
> > to (a +/v b) -/v 20?
> 
> I think we should preserve the property of whether the code evaluated 
> between sequence points causes no traps, or at least one trap.  (But need 
> not preserve the difference between one trap and two, for example.  This 
> is the same as C99 Annex F requirements for floating-point exceptions: the 
> set of exceptions produced between calls to fenv.h functions checking or 
> modifying exception state needs to be preserved, but not the number (> 0) 
> of times each exception is produced.)

That sounds sensible.  Of course we don't know about sequence points
during folding (as that may get trees combined from several GIMPLE
statements).

> This probably means that most folding should not apply to trapping 
> variants.

Indeed.

> (In principle I believe folding should be done on GIMPLE rather than 
> folding functions being called from the front ends on trees.  If you do 
> always lower to explicit overflow checks and only run folding after that 
> lowering stage, the problem of changing whether a trap occurs does not 
> arise.  But with trapping variants in GIMPLE, the folding code would still 
> need to allow for them even with folding all happening on GIMPLE.)

Yes.

> > So while trapping variants can certainly be introduced it looks like
> > this task may be more difficult.  So lowering them early during
> > gimplification looks like a more reasonable plan IMHO.
> 
> If you lower to whatever forms of checks for overflow can be conveniently 
> expressed in GIMPLE then you have the interesting problem of 
> reconstituting trapping operations for those targets that have trapping 
> instructions or instructions setting overflow flags that can be checked 
> directly.  It's possible that you want to lower to explicit overflow 
> checks for targets without the operations, but to trapping operation codes 
> for targets with the operations.  (Which would suggest starting with 
> trapping variants, and then lowering to explicit checks part way through 
> the GIMPLE optimizations, only on targets without suitable instructions 
> for the relevant type precision and depending on -Os to determine whether 
> inline code or libgcc function calls might be better.)

Ok, I didn't think of generating optimal code for the overflow checks,
but it should be reasonably possible to if-convert the explicit
overflow checks back to builtins that can be specially expanded
(that would also help explicit overflow checks in source code).

So my opinion still holds that explicit overflow checks would be
prefered.

Richard.

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: New no-undefined-overflow branch
  2009-03-06 14:10             ` Richard Guenther
@ 2009-03-06 14:29               ` Joseph S. Myers
  2009-03-06 17:17                 ` Geert Bosch
  0 siblings, 1 reply; 52+ messages in thread
From: Joseph S. Myers @ 2009-03-06 14:29 UTC (permalink / raw)
  To: Richard Guenther; +Cc: Geert Bosch, Diego Novillo, gcc

On Fri, 6 Mar 2009, Richard Guenther wrote:

> Ok, I didn't think of generating optimal code for the overflow checks,
> but it should be reasonably possible to if-convert the explicit
> overflow checks back to builtins that can be specially expanded
> (that would also help explicit overflow checks in source code).

It looks like only alpha and pa presently have insn patterns such as 
addvsi3 that would be used by the present -ftrapv code, but I expect 
several other processors also have instructions that would help in 
overflow-checking code.  (For example, Power Architecture has versions of 
many arithmetic instructions that set overflow flags, so such instructions 
could be used followed by conditional traps.)

-- 
Joseph S. Myers
joseph@codesourcery.com

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: New no-undefined-overflow branch
  2009-03-06 12:28         ` Richard Guenther
  2009-03-06 13:54           ` Joseph S. Myers
@ 2009-03-06 14:44           ` Paolo Bonzini
  2009-03-06 15:10             ` Richard Guenther
                               ` (2 more replies)
  2009-03-06 17:21           ` Geert Bosch
  2 siblings, 3 replies; 52+ messages in thread
From: Paolo Bonzini @ 2009-03-06 14:44 UTC (permalink / raw)
  To: gcc


> So while trapping variants can certainly be introduced it looks like
> this task may be more difficult.

I don't think you need to introduce trapping tree codes.  You can
introduce them directly in the front-end as

   s = x +nv y
   (((s ^ x) & (s ^ y)) < 0) ? trap () : s

   d = x -nv y
   (((d ^ x) & (x ^ y)) < 0) ? trap () : d

   (b == INT_MIN ? trap () : -nv b)

   (int)((long long) a * (long long) b) == a *nv b ? trap () : a *nv b

Making sure they are compiled efficiently is another story, but
especially for the sake of LTO I think this is the way to go.

Paolo

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: New no-undefined-overflow branch
  2009-03-06 14:44           ` Paolo Bonzini
@ 2009-03-06 15:10             ` Richard Guenther
  2009-03-06 15:23               ` Paolo Bonzini
  2009-03-06 15:58               ` Ian Lance Taylor
  2009-03-06 17:25             ` Joseph S. Myers
  2009-03-27 13:11             ` Eus
  2 siblings, 2 replies; 52+ messages in thread
From: Richard Guenther @ 2009-03-06 15:10 UTC (permalink / raw)
  To: gcc

On Fri, Mar 6, 2009 at 3:29 PM, Paolo Bonzini <bonzini@gnu.org> wrote:
>
>> So while trapping variants can certainly be introduced it looks like
>> this task may be more difficult.
>
> I don't think you need to introduce trapping tree codes.  You can
> introduce them directly in the front-end as
>
>   s = x +nv y

I think this should be

  s = x + y

otherwise the compiler can assume that for the following check
the addition did not overflow.

>   (((s ^ x) & (s ^ y)) < 0) ? trap () : s
>
>   d = x -nv y
>   (((d ^ x) & (x ^ y)) < 0) ? trap () : d
>
>   (b == INT_MIN ? trap () : -nv b)
>
>   (int)((long long) a * (long long) b) == a *nv b ? trap () : a *nv b
>
> Making sure they are compiled efficiently is another story, but
> especially for the sake of LTO I think this is the way to go.

I agree.  Btw, for the addition case we generate

        leal    (%rsi,%rdi), %eax
        xorl    %eax, %esi
        xorl    %eax, %edi
        testl   %edi, %esi
        jns     .L2
                .value  0x0b0f
.L2:
        rep
        ret

which isn't too bad.

Richard.

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: New no-undefined-overflow branch
  2009-03-06 15:10             ` Richard Guenther
@ 2009-03-06 15:23               ` Paolo Bonzini
  2009-03-06 15:39                 ` Paolo Bonzini
                                   ` (3 more replies)
  2009-03-06 15:58               ` Ian Lance Taylor
  1 sibling, 4 replies; 52+ messages in thread
From: Paolo Bonzini @ 2009-03-06 15:23 UTC (permalink / raw)
  To: gcc

Richard Guenther wrote:
> On Fri, Mar 6, 2009 at 3:29 PM, Paolo Bonzini <bonzini@gnu.org> wrote:
>>> So while trapping variants can certainly be introduced it looks like
>>> this task may be more difficult.
>> I don't think you need to introduce trapping tree codes.  You can
>> introduce them directly in the front-end as
>>
>>   s = x +nv y
> 
> I think this should be
> 
>   s = x + y
>   (((s ^ x) & (s ^ y)) < 0) ? trap () : s
> 
> otherwise the compiler can assume that for the following check
> the addition did not overflow.

Ah yeah I've not yet looked at the patches and I did not know which one
was which.  I actually wrote x + y first and then went back to carefully
check them. :-P

>> Making sure they are compiled efficiently is another story, but
>> especially for the sake of LTO I think this is the way to go.
> 
> I agree.  Btw, for the addition case we generate
> 
>         leal    (%rsi,%rdi), %eax
>         xorl    %eax, %esi
>         xorl    %eax, %edi
>         testl   %edi, %esi
>         jns     .L2
>                 .value  0x0b0f
> .L2:
>         rep
>         ret
> 
> which isn't too bad.

Well, for x86 it requires the addends to die.

This is unfortunately four insns, and combine has a limit of three.  but
maybe you could make combine recognize the check and turn it to an addv
pattern (with the add result unused!); and then CSE or maybe combine as
well would, well, eliminate the duplicate ADD...

If this does not work, on ARM you can also hope for something like this:

     ADD    R0, R1, R2
     XORS   R0, R2, R3
     XORSMI R1, R2, R3
     SWIMI  #trap

But hey, whatever you get, it's anyway faster than a libcall. :-)

Of course there are better choices for x+CONSTANT; using (b == INT_MIN ?
trap () : -b) for negation is one example.

Paolo

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: New no-undefined-overflow branch
  2009-03-06 15:23               ` Paolo Bonzini
@ 2009-03-06 15:39                 ` Paolo Bonzini
  2009-03-06 16:19                 ` Richard Guenther
                                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 52+ messages in thread
From: Paolo Bonzini @ 2009-03-06 15:39 UTC (permalink / raw)
  To: gcc

Richard Guenther wrote:
> On Fri, Mar 6, 2009 at 3:29 PM, Paolo Bonzini <bonzini@gnu.org> wrote:
>>> So while trapping variants can certainly be introduced it looks like
>>> this task may be more difficult.
>> I don't think you need to introduce trapping tree codes.  You can
>> introduce them directly in the front-end as
>>
>>   s = x +nv y
> 
> I think this should be
> 
>   s = x + y
>   (((s ^ x) & (s ^ y)) < 0) ? trap () : s
> 
> otherwise the compiler can assume that for the following check
> the addition did not overflow.

Ah yeah I've not yet looked at the patches and I did not know which one
was which.  I actually wrote x + y first and then went back to carefully
check them. :-P

>> Making sure they are compiled efficiently is another story, but
>> especially for the sake of LTO I think this is the way to go.
> 
> I agree.  Btw, for the addition case we generate
> 
>         leal    (%rsi,%rdi), %eax
>         xorl    %eax, %esi
>         xorl    %eax, %edi
>         testl   %edi, %esi
>         jns     .L2
>                 .value  0x0b0f
> .L2:
>         rep
>         ret
> 
> which isn't too bad.

Well, for x86 it requires the addends to die.

This is unfortunately four insns, and combine has a limit of three.  but
maybe you could make combine recognize the check and turn it to an addv
pattern (with the add result unused!); and then CSE or maybe combine as
well would, well, eliminate the duplicate ADD...

If this does not work, on ARM you can also hope for something like this:

     ADD    R0, R1, R2
     XORS   R0, R2, R3
     XORSMI R1, R2, R3
     SWIMI  #trap

But hey, whatever you get, it's anyway faster than a libcall. :-)

Of course there are better choices for x+CONSTANT; using (b == INT_MIN ?
trap () : -b) for negation is one example.

Paolo

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: New no-undefined-overflow branch
  2009-03-06 15:10             ` Richard Guenther
  2009-03-06 15:23               ` Paolo Bonzini
@ 2009-03-06 15:58               ` Ian Lance Taylor
  1 sibling, 0 replies; 52+ messages in thread
From: Ian Lance Taylor @ 2009-03-06 15:58 UTC (permalink / raw)
  To: gcc

Richard Guenther <richard.guenther@gmail.com> writes:

> I agree.  Btw, for the addition case we generate
>
>         leal    (%rsi,%rdi), %eax
>         xorl    %eax, %esi
>         xorl    %eax, %edi
>         testl   %edi, %esi
>         jns     .L2
>                 .value  0x0b0f
> .L2:
>         rep
>         ret
>
> which isn't too bad.

But I think it would normally be faster to do something like
    movl    %esi,%eax
    addl    %edi,%eax
    jo      .Ltrap
    ret
.Ltrap:
    diediedie

Ian

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: New no-undefined-overflow branch
  2009-03-06 15:23               ` Paolo Bonzini
  2009-03-06 15:39                 ` Paolo Bonzini
@ 2009-03-06 16:19                 ` Richard Guenther
  2009-03-06 17:23                   ` Joseph S. Myers
  2009-03-06 18:54                 ` Richard Earnshaw
       [not found]                 ` <1236364189.1654.2.camel@pc960.cambridge.arm.com>
  3 siblings, 1 reply; 52+ messages in thread
From: Richard Guenther @ 2009-03-06 16:19 UTC (permalink / raw)
  To: gcc

On Fri, Mar 6, 2009 at 4:09 PM, Paolo Bonzini <bonzini@gnu.org> wrote:
> Richard Guenther wrote:
>> On Fri, Mar 6, 2009 at 3:29 PM, Paolo Bonzini <bonzini@gnu.org> wrote:
>>>> So while trapping variants can certainly be introduced it looks like
>>>> this task may be more difficult.
>>> I don't think you need to introduce trapping tree codes.  You can
>>> introduce them directly in the front-end as
>>>
>>>   s = x +nv y
>>
>> I think this should be
>>
>>   s = x + y
>>   (((s ^ x) & (s ^ y)) < 0) ? trap () : s
>>
>> otherwise the compiler can assume that for the following check
>> the addition did not overflow.
>
> Ah yeah I've not yet looked at the patches and I did not know which one
> was which.  I actually wrote x + y first and then went back to carefully
> check them. :-P
>
>>> Making sure they are compiled efficiently is another story, but
>>> especially for the sake of LTO I think this is the way to go.
>>
>> I agree.  Btw, for the addition case we generate
>>
>>         leal    (%rsi,%rdi), %eax
>>         xorl    %eax, %esi
>>         xorl    %eax, %edi
>>         testl   %edi, %esi
>>         jns     .L2
>>                 .value  0x0b0f
>> .L2:
>>         rep
>>         ret
>>
>> which isn't too bad.
>
> Well, for x86 it requires the addends to die.
>
> This is unfortunately four insns, and combine has a limit of three.  but
> maybe you could make combine recognize the check and turn it to an addv
> pattern (with the add result unused!); and then CSE or maybe combine as
> well would, well, eliminate the duplicate ADD...

Well, I was thinking about detecting the pattern on the tree level instead.

  s_6 = x.0_2 + y.1_4;
  D.1597_7 = s_6 ^ x_1(D);
  D.1598_8 = s_6 ^ y_3(D);
  D.1599_9 = D.1597_7 & D.1598_8;
  if (D.1599_9 < 0)
    goto <bb 3>;
  else
    goto <bb 4>;

<bb 3>:
  __builtin_trap ();

<bb 4>:

This should be recognizable in the ifcombine pass for example, which
recognizes CFG patterns.  Transforming it to just

  s_6 = __builtin_addv (x.0_2, y.1_4);

<bb 4>:

Only ifcombine runs a little too early for that maybe.

Richard.

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: New no-undefined-overflow branch
  2009-03-06 14:29               ` Joseph S. Myers
@ 2009-03-06 17:17                 ` Geert Bosch
  2009-03-06 17:42                   ` Joseph S. Myers
  0 siblings, 1 reply; 52+ messages in thread
From: Geert Bosch @ 2009-03-06 17:17 UTC (permalink / raw)
  To: Joseph S. Myers; +Cc: Richard Guenther, Diego Novillo, gcc

On Mar 6, 2009, at 09:15, Joseph S. Myers wrote:
> It looks like only alpha and pa presently have insn patterns such as
> addvsi3 that would be used by the present -ftrapv code, but I expect
> several other processors also have instructions that would help in
> overflow-checking code.  (For example, Power Architecture has  
> versions of
> many arithmetic instructions that set overflow flags, so such  
> instructions
> could be used followed by conditional traps.)

Most architectures have similar flags or conditions, but during my
work on more efficient overflow checking for Ada I have become less
convinced of the need for them. For Ada code, the overflow
checking is now less expensive than many other checks.

In most cases, one of the operands will be either constant or have
a known sign. Then the overflow check can be expanded as a simple
comparison. The benefit of this is that later optimization passes
can use these tests to derive range information, combine checks,
or fold them. When using some opaque construct, removing it will be  
hard.
Also, for many languages calling abort() for overflow would not be
desirable, and an exception should be raised. Doing this directly
from expanded code rather than using a trap handler will avoid the
need to write target-specific trap handlers and has the advantage
that a message with location information can easily be included.

In any case, while I'd really like to move the checked signed
integer overflow from Gigi (GNAT-to-GNU tree translator) to
the language independent part of GCC, I want to have the absolute
minimum amount of changes that is necessary to achieve this goal.

Since only Ada uses integer overflow checking at this point, any
non-standard GIMPLE will be a maintenance burden and is likely
to be mishandled by later optimizers. When we lower all checks
during gimplification, we can re-implement -ftrapv while avoiding
all of the pitfalls the old implementation had. In particular,
there has to be a clear distinction between which signed integer
operations must be checked and which not. During compilation many
new operations are created and these operations need to have
clear semantics. To me that is the great improvement of the
no-undefined-overflow branch.

   -Geert

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: New no-undefined-overflow branch
  2009-03-06 12:28         ` Richard Guenther
  2009-03-06 13:54           ` Joseph S. Myers
  2009-03-06 14:44           ` Paolo Bonzini
@ 2009-03-06 17:21           ` Geert Bosch
  2009-03-06 17:38             ` Joseph S. Myers
  2 siblings, 1 reply; 52+ messages in thread
From: Geert Bosch @ 2009-03-06 17:21 UTC (permalink / raw)
  To: Richard Guenther; +Cc: Joseph S. Myers, Diego Novillo, gcc


On Mar 6, 2009, at 04:11, Richard Guenther wrote:

> I didn't spend too much time thinking about the trapping variants
> (well, I believe it isn't that important ;)).  But in general we would
> have to expand the non-NV variants via the trapping expanders
> if flag_trapv was true (so yeah, combining TUs with different  
> flag_trapv
> settings would be difficult again and it would ask for explicitly
> encoding this variant in the IL ...).
The non-NV variants have wrap-around semantics on the no-undefined- 
overflow
branch, right? I'm not about to change that based on some global  
flag! :)
I'm proposing something like:
   {PLUS,MINUS,MULT,NEGATE}_EXPR:
	- signed integer operation with wrap-around
   {PLUS,MINUS,MULT,NEGATE)NV_EXPR
	- signed integer operations known to not overflow
   {PLUS,MINUS,MULT,NEGATE)V_EXPR
	- signed integer operation with overflow check that traps,
           aborts or raises an exception on overflow

> There is of course the problem that we have to be careful not to
> introduce new traps via folding, a problem that doesn't exist with
> the no-overflow variants (I can simply drop to the wrapping variants).
> With for example  (a -/v 10) +/v 10 would you want to preserve
> the possibly trapping a -/v 10?  For (a -/v 10) +/v (b -/v 10) do
> we want to be careful to not introduce extra traps when simplifying
> to (a +/v b) -/v 20?
Indeed, there has to be a very clear boundary as we'd be changing
semantics. The original -ftrapv implementation muddled that issue,
something I absolutely want to avoid

> So while trapping variants can certainly be introduced it looks like

> this task may be more difficult.  So lowering them early during
> gimplification looks like a more reasonable plan IMHO.

Right, that was my intention. Still, I'll need to add code to
handle the new tree codes in fold(), right?

   -Geert 

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: New no-undefined-overflow branch
  2009-03-06 16:19                 ` Richard Guenther
@ 2009-03-06 17:23                   ` Joseph S. Myers
  0 siblings, 0 replies; 52+ messages in thread
From: Joseph S. Myers @ 2009-03-06 17:23 UTC (permalink / raw)
  To: Richard Guenther; +Cc: gcc

On Fri, 6 Mar 2009, Richard Guenther wrote:

> Well, I was thinking about detecting the pattern on the tree level instead.
> 
>   s_6 = x.0_2 + y.1_4;
>   D.1597_7 = s_6 ^ x_1(D);
>   D.1598_8 = s_6 ^ y_3(D);
>   D.1599_9 = D.1597_7 & D.1598_8;
>   if (D.1599_9 < 0)
>     goto <bb 3>;
>   else
>     goto <bb 4>;
> 
> <bb 3>:
>   __builtin_trap ();
> 
> <bb 4>:
> 
> This should be recognizable in the ifcombine pass for example, which
> recognizes CFG patterns.  Transforming it to just
> 
>   s_6 = __builtin_addv (x.0_2, y.1_4);
> 
> <bb 4>:
> 
> Only ifcombine runs a little too early for that maybe.

In addition to detecting to transform into something like the above (for 
addv insn patterns or libgcc function), you may also want to detect when a 
constant has been propagated into the above and make sure it can get 
optimized into a range check on the non-constant operand.  (I don't know 
if existing optimizers will be able to handle the above with a constant 
for one of the operands and convert it to a range check, or whether 
special code would be needed.)

-- 
Joseph S. Myers
joseph@codesourcery.com

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: New no-undefined-overflow branch
  2009-03-06 14:44           ` Paolo Bonzini
  2009-03-06 15:10             ` Richard Guenther
@ 2009-03-06 17:25             ` Joseph S. Myers
  2009-03-06 17:52               ` Paolo Bonzini
  2009-03-27 13:11             ` Eus
  2 siblings, 1 reply; 52+ messages in thread
From: Joseph S. Myers @ 2009-03-06 17:25 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: gcc

On Fri, 6 Mar 2009, Paolo Bonzini wrote:

> I don't think you need to introduce trapping tree codes.  You can
> introduce them directly in the front-end as

Multiple front ends want the same thing.  This is why it would be better 
to introduce the codes in GENERIC and have the language-independent 
gimplifier contain the code to lower them, even if they don't become part 
of GIMPLE.

>    (int)((long long) a * (long long) b) == a *nv b ? trap () : a *nv b

This is not a solution for trapping multiplication in the widest supported 
type.  I'm sure you can represent that with e.g. some checks for whether 
values are positive or negative and then dividing the largest / smallest 
values of the type by one of the arguments, but you might not want to lose 
symmetry like that until after constant propagation has had a chance to 
make one of the arguments constant (since if that happens you get a range 
check on the other argument).

-- 
Joseph S. Myers
joseph@codesourcery.com

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: New no-undefined-overflow branch
  2009-03-06 17:21           ` Geert Bosch
@ 2009-03-06 17:38             ` Joseph S. Myers
  2009-03-06 18:47               ` Paolo Bonzini
  2009-03-06 22:46               ` Geert Bosch
  0 siblings, 2 replies; 52+ messages in thread
From: Joseph S. Myers @ 2009-03-06 17:38 UTC (permalink / raw)
  To: Geert Bosch; +Cc: Richard Guenther, Diego Novillo, gcc

On Fri, 6 Mar 2009, Geert Bosch wrote:

> > this task may be more difficult.  So lowering them early during
> > gimplification looks like a more reasonable plan IMHO.
> 
> Right, that was my intention. Still, I'll need to add code to
> handle the new tree codes in fold(), right?

If you add new trapping codes to GENERIC I'd recommend *not* making fold() 
handle them.  I don't think much folding is safe for the trapping codes 
when you want to avoid either removing or introducing traps.  Either lower 
the codes in gimplification, or handle them explicitly in a few GIMPLE 
optimizations e.g. when constants are propagated in, but avoid general 
folding for them.

Front ends should set TREE_SIDE_EFFECTS on trapping expressions so that 
fold knows it can't discard a subexpression (whose value doesn't matter to 
the value of the final expression) containing a trapping expression, e.g. 
0 * (a +trap b) needs to evaluate (a +trap b) for its side effects.  With 
this done, front ends generating trapping codes for -ftrapv and fold not 
trying to optimize the trapping codes, I'd hope fold and the rest of the 
language-independent compiler could stop caring about flag_trapv.

-- 
Joseph S. Myers
joseph@codesourcery.com

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: New no-undefined-overflow branch
  2009-03-06 17:17                 ` Geert Bosch
@ 2009-03-06 17:42                   ` Joseph S. Myers
  0 siblings, 0 replies; 52+ messages in thread
From: Joseph S. Myers @ 2009-03-06 17:42 UTC (permalink / raw)
  To: Geert Bosch; +Cc: Richard Guenther, Diego Novillo, gcc

On Fri, 6 Mar 2009, Geert Bosch wrote:

> In any case, while I'd really like to move the checked signed
> integer overflow from Gigi (GNAT-to-GNU tree translator) to
> the language independent part of GCC, I want to have the absolute
> minimum amount of changes that is necessary to achieve this goal.

I think the absolute minimum is that new codes for trapping operations are 
added to GENERIC but not GIMPLE, that Gigi generates those codes rather 
than explicit checks, and the lowering to explicit checks is done in the 
gimplifier (possibly based on code presently in Gigi).

-- 
Joseph S. Myers
joseph@codesourcery.com

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: New no-undefined-overflow branch
  2009-03-06 17:25             ` Joseph S. Myers
@ 2009-03-06 17:52               ` Paolo Bonzini
  2009-03-06 18:36                 ` Joseph S. Myers
  0 siblings, 1 reply; 52+ messages in thread
From: Paolo Bonzini @ 2009-03-06 17:52 UTC (permalink / raw)
  To: Joseph S. Myers; +Cc: gcc

Joseph S. Myers wrote:
> On Fri, 6 Mar 2009, Paolo Bonzini wrote:
> 
>> I don't think you need to introduce trapping tree codes.  You can
>> introduce them directly in the front-end as
> 
> Multiple front ends want the same thing.  This is why it would be better 
> to introduce the codes in GENERIC and have the language-independent 
> gimplifier contain the code to lower them, even if they don't become part 
> of GIMPLE.

I see your point.  What I'm worried of, is that this codes would be
tested more lightly and, until folding is a middle-end thing only, the
risk of unwanted optimization on -ftrapv code would be high.

You can have common code shared by front-ends.  They could apply it at
GENERICization time (Fortran, Ada) or directly while parsing (C, C++).

>>    (int)((long long) a * (long long) b) == a *nv b ? trap () : a *nv b
> 
> This is not a solution for trapping multiplication in the widest supported 
> type.

There's always range checking, I was pointing out optimization
possibilities; the above one can be optimized like

  (h,l) = a*b
  if (h != l >> 31) trap ();    // signed shift

Paolo

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: New no-undefined-overflow branch
  2009-03-06 17:52               ` Paolo Bonzini
@ 2009-03-06 18:36                 ` Joseph S. Myers
  0 siblings, 0 replies; 52+ messages in thread
From: Joseph S. Myers @ 2009-03-06 18:36 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: gcc

On Fri, 6 Mar 2009, Paolo Bonzini wrote:

> Joseph S. Myers wrote:
> > On Fri, 6 Mar 2009, Paolo Bonzini wrote:
> > 
> >> I don't think you need to introduce trapping tree codes.  You can
> >> introduce them directly in the front-end as
> > 
> > Multiple front ends want the same thing.  This is why it would be better 
> > to introduce the codes in GENERIC and have the language-independent 
> > gimplifier contain the code to lower them, even if they don't become part 
> > of GIMPLE.
> 
> I see your point.  What I'm worried of, is that this codes would be
> tested more lightly and, until folding is a middle-end thing only, the
> risk of unwanted optimization on -ftrapv code would be high.

The risk (and fact) of unwanted optimization is there at present.  Simply 
using new codes and not teaching fold about them will reduce the risk over 
the present state where many bits of code have to check flag_trapv.

-- 
Joseph S. Myers
joseph@codesourcery.com

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: New no-undefined-overflow branch
  2009-03-06 17:38             ` Joseph S. Myers
@ 2009-03-06 18:47               ` Paolo Bonzini
  2009-03-07  3:00                 ` Joseph S. Myers
  2009-03-06 22:46               ` Geert Bosch
  1 sibling, 1 reply; 52+ messages in thread
From: Paolo Bonzini @ 2009-03-06 18:47 UTC (permalink / raw)
  To: gcc, Joseph S. Myers, Geert Bosch, Richard Guenther, Diego Novillo

Joseph S. Myers wrote:
> On Fri, 6 Mar 2009, Geert Bosch wrote:
> 
>>> this task may be more difficult.  So lowering them early during
>>> gimplification looks like a more reasonable plan IMHO.
>> Right, that was my intention. Still, I'll need to add code to
>> handle the new tree codes in fold(), right?
> 
> If you add new trapping codes to GENERIC I'd recommend *not* making fold() 
> handle them.

Constant folding should be done for them, though.

> Either lower 
> the codes in gimplification, or handle them explicitly in a few GIMPLE 
> optimizations e.g. when constants are propagated in, but avoid general 
> folding for them.

Definitely the former.

Paolo

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: New no-undefined-overflow branch
  2009-03-06 15:23               ` Paolo Bonzini
  2009-03-06 15:39                 ` Paolo Bonzini
  2009-03-06 16:19                 ` Richard Guenther
@ 2009-03-06 18:54                 ` Richard Earnshaw
  2009-03-06 19:53                   ` Ian Lance Taylor
       [not found]                 ` <1236364189.1654.2.camel@pc960.cambridge.arm.com>
  3 siblings, 1 reply; 52+ messages in thread
From: Richard Earnshaw @ 2009-03-06 18:54 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: gcc

On Fri, 2009-03-06 at 16:09 +0100, Paolo Bonzini wrote:
> If this does not work, on ARM you can also hope for something like this:
> 
>      ADD    R0, R1, R2
>      XORS   R0, R2, R3
>      XORSMI R1, R2, R3
>      SWIMI  #trap

On ARM you can just check for overflow directly...

	ADDS	R0, R1, R2
	SWIVS	#trap

:=)

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: New no-undefined-overflow branch
       [not found]                 ` <1236364189.1654.2.camel@pc960.cambridge.arm.com>
@ 2009-03-06 19:34                   ` Paolo Bonzini
  0 siblings, 0 replies; 52+ messages in thread
From: Paolo Bonzini @ 2009-03-06 19:34 UTC (permalink / raw)
  To: Richard Earnshaw; +Cc: gcc


>> If this does not work, on ARM you can also hope for something like this:
>>
>>      ADD    R0, R1, R2
>>      XORS   R0, R2, R3
>>      XORSMI R1, R2, R3
>>      SWIMI  #trap
> 
> On ARM you can just check for overflow directly...
> 
> 	ADDS	R0, R1, R2
> 	SWIVS	#trap

Of course, I was thinking explicitly of what happens with no MD support.

Paolo

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: New no-undefined-overflow branch
  2009-03-06 18:54                 ` Richard Earnshaw
@ 2009-03-06 19:53                   ` Ian Lance Taylor
  2009-03-06 20:42                     ` Robert Dewar
  0 siblings, 1 reply; 52+ messages in thread
From: Ian Lance Taylor @ 2009-03-06 19:53 UTC (permalink / raw)
  To: gcc

Richard Earnshaw <rearnsha@arm.com> writes:

> On Fri, 2009-03-06 at 16:09 +0100, Paolo Bonzini wrote:
>> If this does not work, on ARM you can also hope for something like this:
>> 
>>      ADD    R0, R1, R2
>>      XORS   R0, R2, R3
>>      XORSMI R1, R2, R3
>>      SWIMI  #trap
>
> On ARM you can just check for overflow directly...
>
> 	ADDS	R0, R1, R2
> 	SWIVS	#trap

This point should not be missed.  Some processors (MIPS) have trapping
arithmetic instructions, but many processors have an overflow flag which
can be tested.  Any useful design for -ftrapv needs to make it possible
to use that overflow flag in the generated code.  It will always be more
efficient than using arithmetic to check for overflow.

Ian

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: New no-undefined-overflow branch
  2009-03-06 19:53                   ` Ian Lance Taylor
@ 2009-03-06 20:42                     ` Robert Dewar
  0 siblings, 0 replies; 52+ messages in thread
From: Robert Dewar @ 2009-03-06 20:42 UTC (permalink / raw)
  To: gcc

Ian Lance Taylor wrote:
> Richard Earnshaw <rearnsha@arm.com> writes:
> 
>> On Fri, 2009-03-06 at 16:09 +0100, Paolo Bonzini wrote:
>>> If this does not work, on ARM you can also hope for something like this:
>>>
>>>      ADD    R0, R1, R2
>>>      XORS   R0, R2, R3
>>>      XORSMI R1, R2, R3
>>>      SWIMI  #trap
>> On ARM you can just check for overflow directly...
>>
>> 	ADDS	R0, R1, R2
>> 	SWIVS	#trap
> 
> This point should not be missed.  Some processors (MIPS) have trapping
> arithmetic instructions, but many processors have an overflow flag which
> can be tested.  Any useful design for -ftrapv needs to make it possible
> to use that overflow flag in the generated code.  It will always be more
> efficient than using arithmetic to check for overflow.

One needs to be careful on efficiency here, thinking about pipelines
etc. for example, into looks nice on the x86, but can be expensive to
use compared with explicit tests of the overflow flag.
> 
> Ian

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: New no-undefined-overflow branch
  2009-03-06 17:38             ` Joseph S. Myers
  2009-03-06 18:47               ` Paolo Bonzini
@ 2009-03-06 22:46               ` Geert Bosch
  2009-03-07  1:21                 ` Richard Guenther
  1 sibling, 1 reply; 52+ messages in thread
From: Geert Bosch @ 2009-03-06 22:46 UTC (permalink / raw)
  To: Joseph S. Myers; +Cc: Richard Guenther, Diego Novillo, gcc


On Mar 6, 2009, at 12:22, Joseph S. Myers wrote:
> If you add new trapping codes to GENERIC I'd recommend *not* making  
> fold()
> handle them.  I don't think much folding is safe for the trapping  
> codes
> when you want to avoid either removing or introducing traps.  Either  
> lower
> the codes in gimplification, or handle them explicitly in a few GIMPLE
> optimizations e.g. when constants are propagated in, but avoid general
> folding for them.

The point here is not to think in terms of the old -trapv and trapping
instructions, but instead at the slightly higher level of a well-defined
model of signed integer arithmetic. That is why signed integer overflow
checking and the no-undefined-overflow branch are closely related.

There are essentially two models to evaluate a signed integer  
expression.
The one Ada uses is that a check may be omitted if the value of the  
expression,
in absence of the check, would have no effect on the external  
interactions
of the program.

> An implementation need not always raise an exception when a language- 
> defined check
> fails.  Instead, the operation that failed the check can simply  
> yield an undefined
> result. The exception  need be raised by the implementation only if,  
> in the absence
> of raising it, the value of this  undefined result would have some  
> effect on the
> external interactions of the program. In determining this, the  
> implementation shall
> not presume that an undefined result has a value that  belongs to  
> its subtype,
> nor even to the base range of its type, if scalar. Having removed  
> the raise of
> the exception, the canonical semantics will in general allow the  
> implementation
> to omit the  code for the check, and some or all of the operation  
> itself.

The other one is the one you suggest:
> Front ends should set TREE_SIDE_EFFECTS on trapping expressions so  
> that
> fold knows it can't discard a subexpression (whose value doesn't  
> matter to
> the value of the final expression) containing a trapping expression,  
> e.g.
> 0 * (a +trap b) needs to evaluate (a +trap b) for its side effects.   
> With
> this done, front ends generating trapping codes for -ftrapv and fold  
> not
> trying to optimize the trapping codes, I'd hope fold and the rest of  
> the
> language-independent compiler could stop caring about flag_trapv.

Setting the TREE_SIDE_EFFECTS seriously limits optimizations. Also, as  
quality
of implementation issue, while an expression may have an undefined  
result,
if that result is not used, removing the entire computation is generally
preferable over raising an exception. Arguments can be made for and  
against both
models, so probably we could make setting of TREE_SIDE_EFFECTS optional.

   -Geert

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: New no-undefined-overflow branch
  2009-03-06 22:46               ` Geert Bosch
@ 2009-03-07  1:21                 ` Richard Guenther
  0 siblings, 0 replies; 52+ messages in thread
From: Richard Guenther @ 2009-03-07  1:21 UTC (permalink / raw)
  To: Geert Bosch; +Cc: Joseph S.Myers, Diego Novillo, gcc

On Fri, 6 Mar 2009, Geert Bosch wrote:

> 
> On Mar 6, 2009, at 12:22, Joseph S. Myers wrote:
> > If you add new trapping codes to GENERIC I'd recommend *not* making fold()
> > handle them.  I don't think much folding is safe for the trapping codes
> > when you want to avoid either removing or introducing traps.  Either lower
> > the codes in gimplification, or handle them explicitly in a few GIMPLE
> > optimizations e.g. when constants are propagated in, but avoid general
> > folding for them.
> 
> The point here is not to think in terms of the old -trapv and trapping
> instructions, but instead at the slightly higher level of a well-defined
> model of signed integer arithmetic. That is why signed integer overflow
> checking and the no-undefined-overflow branch are closely related.
> 
> There are essentially two models to evaluate a signed integer expression.
> The one Ada uses is that a check may be omitted if the value of the
> expression,
> in absence of the check, would have no effect on the external interactions
> of the program.
> 
> > An implementation need not always raise an exception when a language-defined
> > check
> > fails.  Instead, the operation that failed the check can simply yield an
> > undefined
> > result. The exception  need be raised by the implementation only if, in the
> > absence
> > of raising it, the value of this  undefined result would have some effect on
> > the
> > external interactions of the program. In determining this, the
> > implementation shall
> > not presume that an undefined result has a value that  belongs to its
> > subtype,
> > nor even to the base range of its type, if scalar. Having removed the raise
> > of
> > the exception, the canonical semantics will in general allow the
> > implementation
> > to omit the  code for the check, and some or all of the operation itself.
> 
> The other one is the one you suggest:
> > Front ends should set TREE_SIDE_EFFECTS on trapping expressions so that
> > fold knows it can't discard a subexpression (whose value doesn't matter to
> > the value of the final expression) containing a trapping expression, e.g.
> > 0 * (a +trap b) needs to evaluate (a +trap b) for its side effects.  With
> > this done, front ends generating trapping codes for -ftrapv and fold not
> > trying to optimize the trapping codes, I'd hope fold and the rest of the
> > language-independent compiler could stop caring about flag_trapv.
> 
> Setting the TREE_SIDE_EFFECTS seriously limits optimizations. Also, as quality
> of implementation issue, while an expression may have an undefined result,
> if that result is not used, removing the entire computation is generally
> preferable over raising an exception. Arguments can be made for and against
> both
> models, so probably we could make setting of TREE_SIDE_EFFECTS optional.

Yes.  If we introduce new GENERIC tree codes for overflow checking
operations that is certainly useful.  Though once you lowered GENERIC
to explicit checks and trapping (or raising exceptions) then this
side-effect is explicit and we probably won't remove them anymore.

I played with the idea to just implement a -ftrapv substitute by
lowering non-NV operation variants during gimplification, but of
course new GENERIC tree codes are more flexible.

You can use no-undefined-overflow branch to commit patches once
you have them.

Thanks,
Richard.

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: New no-undefined-overflow branch
  2009-03-06 18:47               ` Paolo Bonzini
@ 2009-03-07  3:00                 ` Joseph S. Myers
  0 siblings, 0 replies; 52+ messages in thread
From: Joseph S. Myers @ 2009-03-07  3:00 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: gcc, Geert Bosch, Richard Guenther, Diego Novillo

On Fri, 6 Mar 2009, Paolo Bonzini wrote:

> Joseph S. Myers wrote:
> > On Fri, 6 Mar 2009, Geert Bosch wrote:
> > 
> >>> this task may be more difficult.  So lowering them early during
> >>> gimplification looks like a more reasonable plan IMHO.
> >> Right, that was my intention. Still, I'll need to add code to
> >> handle the new tree codes in fold(), right?
> > 
> > If you add new trapping codes to GENERIC I'd recommend *not* making fold() 
> > handle them.
> 
> Constant folding should be done for them, though.

Yes, constant folding should be done, so that very tiny subset of fold 
needs to handle them.  The vast bulk of fold that does other more 
complicated transformations should not, in my view.  (There may be 
isolated transformations that are safe for these codes.  For example, you 
could transform (x *trap 2) *trap 2 to x *trap 4 (this is a case where two 
traps can become one), but not (x *trap 2) *trap 0 to x *trap 0 or (x 
*trap -1) *trap -1 to x *trap 1.  I think however it would be reasonable 
to replace -ftrapv without going through all fold transformations to 
enable the safe ones for the new tree codes; it's better to start by 
making it reliable and then look at cases where it can be made to generate 
better code.  And I expect that a large proportion of what fold does is 
not safe for trapping codes.)

-- 
Joseph S. Myers
joseph@codesourcery.com

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: New no-undefined-overflow branch
  2009-03-06 14:44           ` Paolo Bonzini
  2009-03-06 15:10             ` Richard Guenther
  2009-03-06 17:25             ` Joseph S. Myers
@ 2009-03-27 13:11             ` Eus
  2009-03-27 18:07               ` Paolo Bonzini
  2 siblings, 1 reply; 52+ messages in thread
From: Eus @ 2009-03-27 13:11 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: gcc

Hi Ho!

On Fri, 2009-03-06 at 15:29 +0100, Paolo Bonzini wrote:
> > So while trapping variants can certainly be introduced it looks like
> > this task may be more difficult.
> 
> I don't think you need to introduce trapping tree codes.  You can
> introduce them directly in the front-end as
> 
>    s = x +nv y
>    (((s ^ x) & (s ^ y)) < 0) ? trap () : s

Could you please kindly explain a bit about it?

Suppose s, x, and y are 8-bit signed integer.
If x and y are -128, s = (-128) + (-128), which will overflow.
Now if suppose s is 0, ((0 ^ -128) & (0 ^ -128)) == -128, which is less
than zero and will trap. This is correct.
But, if s is -128, ((-128 ^ -128) & (-128 ^ -128)) is zero and will not
trap. This is incorrect, isn't this?

In short, why should s be included in the calculation?
Is it assumed that s is always zero?

Thank you very much for the explanation.

>    d = x -nv y
>    (((d ^ x) & (x ^ y)) < 0) ? trap () : d
> 
>    (b == INT_MIN ? trap () : -nv b)
> 
>    (int)((long long) a * (long long) b) == a *nv b ? trap () : a *nv b
> 
> Making sure they are compiled efficiently is another story, but
> especially for the sake of LTO I think this is the way to go.
> 
> Paolo

-- 
Best regards,
Eus (FSF member #4445)

In this digital era, where computing technology is pervasive, your
freedom depends on the software controlling those computing devices.

Join free software movement today! It is free as in freedom, not as in
free beer!

Join: http://www.fsf.org/jf?referrer=4445

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: New no-undefined-overflow branch
  2009-03-27 13:11             ` Eus
@ 2009-03-27 18:07               ` Paolo Bonzini
  2009-03-31  4:55                 ` Eus
  0 siblings, 1 reply; 52+ messages in thread
From: Paolo Bonzini @ 2009-03-27 18:07 UTC (permalink / raw)
  To: eus; +Cc: gcc

Eus wrote:
> Hi Ho!
> 
> On Fri, 2009-03-06 at 15:29 +0100, Paolo Bonzini wrote:
>>> So while trapping variants can certainly be introduced it looks like
>>> this task may be more difficult.
>> I don't think you need to introduce trapping tree codes.  You can
>> introduce them directly in the front-end as
>>
>>    s = x +nv y
>>    (((s ^ x) & (s ^ y)) < 0) ? trap () : s
> 
> Could you please kindly explain a bit about it?
> 
> Suppose s, x, and y are 8-bit signed integer.
> If x and y are -128, s = (-128) + (-128), which will overflow.
> Now if suppose s is 0, ((0 ^ -128) & (0 ^ -128)) == -128, which is less
> than zero and will trap. This is correct.
> But, if s is -128, ((-128 ^ -128) & (-128 ^ -128)) is zero and will not
> trap. This is incorrect, isn't this?

Don't look at it as an arithmetic test, "< 0" is just a shortcut for
"the most significant bit of the first operand is 1".  So the formula's
outcome only depends on the ANDs and XORs of the sign bits; it means,
"trap if the result's sign is different from both addends' signs" or
equivalently "check that the result's sign matches at least one addend's
sign".

You can also say "((s ^ x) & ~(x ^ y)) < 0", or equivalently "((s ^ y) &
~(x ^ y)) < 0".  These mean "trap if the two addends have the same sign
and the sum has a different sign".  It is easy to prove that they are
equivalent:

  s    x    y     s^x    s^y     ~(x^y)      equal?
  0    0    0      0      0         1          yes (0)
  0    0    1      0      1         0          yes (0)
  0    1    0      1      0         0          yes (0)
  0    1    1      1      1         1          yes (1)
  1    0    0      1      1         1          yes (1)
  1    0    1      1      0         0          yes (0)
  1    1    0      0      1         0          yes (0)
  1    1    1      0      0         1          yes (0)

Paolo

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: New no-undefined-overflow branch
  2009-03-27 18:07               ` Paolo Bonzini
@ 2009-03-31  4:55                 ` Eus
  0 siblings, 0 replies; 52+ messages in thread
From: Eus @ 2009-03-31  4:55 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: gcc

Hi Ho!

On Fri, 2009-03-27 at 13:49 +0100, Paolo Bonzini wrote:
> Eus wrote:
> > Hi Ho!
> > 
> > On Fri, 2009-03-06 at 15:29 +0100, Paolo Bonzini wrote:
> >>> So while trapping variants can certainly be introduced it looks like
> >>> this task may be more difficult.
> >> I don't think you need to introduce trapping tree codes.  You can
> >> introduce them directly in the front-end as
> >>
> >>    s = x +nv y
> >>    (((s ^ x) & (s ^ y)) < 0) ? trap () : s
> > 
> > Could you please kindly explain a bit about it?
> > 
> > Suppose s, x, and y are 8-bit signed integer.
> > If x and y are -128, s = (-128) + (-128), which will overflow.
> > Now if suppose s is 0, ((0 ^ -128) & (0 ^ -128)) == -128, which is less
> > than zero and will trap. This is correct.
> > But, if s is -128, ((-128 ^ -128) & (-128 ^ -128)) is zero and will not
> > trap. This is incorrect, isn't this?
> 
> Don't look at it as an arithmetic test, "< 0" is just a shortcut for
> "the most significant bit of the first operand is 1".  So the formula's
> outcome only depends on the ANDs and XORs of the sign bits; it means,
> "trap if the result's sign is different from both addends' signs" or
> equivalently "check that the result's sign matches at least one addend's
> sign".

Ah, I got it now! :-)
Yes, so after I perform the addition, I check for the signs.
It was my mistake to think that the check was done without performing
the addition first. I mean,
`s = x +nv y' == `(((s ^ x) & (s ^ y)) < 0) ? trap () : s'.
But, they should be performed one after another:
1) s = x +nv y
2) (((s ^ x) & (s ^ y)) < 0) ? trap () : s

> You can also say "((s ^ x) & ~(x ^ y)) < 0", or equivalently "((s ^ y) &
> ~(x ^ y)) < 0".  These mean "trap if the two addends have the same sign
> and the sum has a different sign".  It is easy to prove that they are
> equivalent:
> 
>   s    x    y     s^x    s^y     ~(x^y)      equal?
>   0    0    0      0      0         1          yes (0)
>   0    0    1      0      1         0          yes (0)
>   0    1    0      1      0         0          yes (0)
>   0    1    1      1      1         1          yes (1)
>   1    0    0      1      1         1          yes (1)
>   1    0    1      1      0         0          yes (0)
>   1    1    0      0      1         0          yes (0)
>   1    1    1      0      0         1          yes (0)

Yes, I really understand it now :-)
Thank you very much for your kind explanation.
I really appreciate it.

> Paolo

-- 
Best regards,
Eus (FSF member #4445)

In this digital era, where computing technology is pervasive, your
freedom depends on the software controlling those computing devices.

Join free software movement today! It is free as in freedom, not as in
free beer!

Join: http://www.fsf.org/jf?referrer=4445

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: New no-undefined-overflow branch
  2009-02-27 19:58 Jay Foad
  2009-02-27 22:41 ` Richard Guenther
@ 2009-02-27 23:00 ` Ian Lance Taylor
  1 sibling, 0 replies; 52+ messages in thread
From: Ian Lance Taylor @ 2009-02-27 23:00 UTC (permalink / raw)
  To: Jay Foad; +Cc: rguenther, gcc

Jay Foad <jay.foad@gmail.com> writes:

From an optimisation pass's point of view, what's the difference between:
>
> 1. a PLUS expression that gives an undefined result on overflow, and
> 2. a PLUS expression with a guarantee that the result won't overflow.
>
> I can't see how they will be handled any differently in practice.

I don't think that's the right way to describe the alternatives.  The
choices are:

1. A PLUS expression which gives an undefined result on overflow;
2. A PLUS expression which wraps on overflow.

Those two cases can be optimized differently.  For example,
    (a +undef-overflow C1) < C2
where C1 and C2 are constants, may be optimized to
    a < C2 - C1

That is not true of
    (a +wrapping-overflow C1) < C2

Ian

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: New no-undefined-overflow branch
  2009-02-27 19:58 Jay Foad
@ 2009-02-27 22:41 ` Richard Guenther
  2009-02-27 23:00 ` Ian Lance Taylor
  1 sibling, 0 replies; 52+ messages in thread
From: Richard Guenther @ 2009-02-27 22:41 UTC (permalink / raw)
  To: Jay Foad; +Cc: gcc

On Fri, 27 Feb 2009, Jay Foad wrote:

> > To support languages that have undefined semantics on overflowing
> > operations the middle-end gets new unary and binary operators
> > that implicitly encode value-range information about their operands
> > noting that the operation does not overflow.  These does-not-overflow
> > operators transform the undefined behavior into a valid assumption
> > and thus make the GIMPLE IL fully defined.
> 
> From an optimisation pass's point of view, what's the difference between:
> 
> 1. a PLUS expression that gives an undefined result on overflow, and
> 2. a PLUS expression with a guarantee that the result won't overflow.
> 
> I can't see how they will be handled any differently in practice.

There is no difference in practice other than that we do a consistent
interpretation of undefined.  Note that the whole point of the branch
is to make overflow behavior explicit per operation, not globally
set per flag and for signed types only.

Richard.

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: New no-undefined-overflow branch
@ 2009-02-27 19:58 Jay Foad
  2009-02-27 22:41 ` Richard Guenther
  2009-02-27 23:00 ` Ian Lance Taylor
  0 siblings, 2 replies; 52+ messages in thread
From: Jay Foad @ 2009-02-27 19:58 UTC (permalink / raw)
  To: rguenther; +Cc: gcc

> To support languages that have undefined semantics on overflowing
> operations the middle-end gets new unary and binary operators
> that implicitly encode value-range information about their operands
> noting that the operation does not overflow.  These does-not-overflow
> operators transform the undefined behavior into a valid assumption
> and thus make the GIMPLE IL fully defined.

From an optimisation pass's point of view, what's the difference between:

1. a PLUS expression that gives an undefined result on overflow, and
2. a PLUS expression with a guarantee that the result won't overflow.

I can't see how they will be handled any differently in practice.

Thanks,
Jay.

^ permalink raw reply	[flat|nested] 52+ messages in thread

end of thread, other threads:[~2009-03-31  3:05 UTC | newest]

Thread overview: 52+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2009-02-26 11:05 New no-undefined-overflow branch Richard Guenther
2009-02-26 11:28 ` Manuel López-Ibáñez
2009-02-26 12:02   ` Richard Guenther
2009-02-26 13:07 ` Joseph S. Myers
2009-02-26 13:22   ` Richard Guenther
2009-02-26 14:16 ` Dave Korn
2009-02-26 23:08 ` Zdenek Dvorak
2009-02-27 10:16   ` Richard Guenther
2009-02-27 12:07     ` Dave Korn
2009-02-27 13:17       ` Richard Guenther
2009-02-27 13:35         ` Dave Korn
2009-02-27 13:45           ` Richard Guenther
2009-02-27 18:07       ` Zdenek Dvorak
2009-02-27 17:58     ` Zdenek Dvorak
2009-02-27 18:17       ` Richard Guenther
2009-02-27 23:07 ` Diego Novillo
2009-02-28  0:04   ` Joseph S. Myers
2009-02-28 10:29     ` Richard Guenther
2009-03-05 17:24       ` Geert Bosch
2009-03-06 12:28         ` Richard Guenther
2009-03-06 13:54           ` Joseph S. Myers
2009-03-06 14:10             ` Richard Guenther
2009-03-06 14:29               ` Joseph S. Myers
2009-03-06 17:17                 ` Geert Bosch
2009-03-06 17:42                   ` Joseph S. Myers
2009-03-06 14:44           ` Paolo Bonzini
2009-03-06 15:10             ` Richard Guenther
2009-03-06 15:23               ` Paolo Bonzini
2009-03-06 15:39                 ` Paolo Bonzini
2009-03-06 16:19                 ` Richard Guenther
2009-03-06 17:23                   ` Joseph S. Myers
2009-03-06 18:54                 ` Richard Earnshaw
2009-03-06 19:53                   ` Ian Lance Taylor
2009-03-06 20:42                     ` Robert Dewar
     [not found]                 ` <1236364189.1654.2.camel@pc960.cambridge.arm.com>
2009-03-06 19:34                   ` Paolo Bonzini
2009-03-06 15:58               ` Ian Lance Taylor
2009-03-06 17:25             ` Joseph S. Myers
2009-03-06 17:52               ` Paolo Bonzini
2009-03-06 18:36                 ` Joseph S. Myers
2009-03-27 13:11             ` Eus
2009-03-27 18:07               ` Paolo Bonzini
2009-03-31  4:55                 ` Eus
2009-03-06 17:21           ` Geert Bosch
2009-03-06 17:38             ` Joseph S. Myers
2009-03-06 18:47               ` Paolo Bonzini
2009-03-07  3:00                 ` Joseph S. Myers
2009-03-06 22:46               ` Geert Bosch
2009-03-07  1:21                 ` Richard Guenther
2009-02-28 10:25   ` Richard Guenther
2009-02-27 19:58 Jay Foad
2009-02-27 22:41 ` Richard Guenther
2009-02-27 23:00 ` Ian Lance Taylor

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).