* New no-undefined-overflow branch @ 2009-02-26 11:05 Richard Guenther 2009-02-26 11:28 ` Manuel López-Ibáñez ` (4 more replies) 0 siblings, 5 replies; 52+ messages in thread From: Richard Guenther @ 2009-02-26 11:05 UTC (permalink / raw) To: gcc There have been numerous issues in the past with respect to the undefinedness of signed overflow in our IL, with the non-semantics of sizetype and friends and with the pessimization of using unsigned arithmetic vs. signed arithmetic in loop optimizations. The goal of the no-undefined-overflow branch is to address all these problems by tackling the single worst problem, imposing There shall be no construct in the GIMPLE IL that invokes undefined behavior. Thus, from now on integer overflow is defined and will wrap with the usual twos-complement semantics. Thus, for the middle-end -fwrapv is always in effect. This extends to pointer overflow. To support languages that have undefined semantics on overflowing operations the middle-end gets new unary and binary operators that implicitly encode value-range information about their operands noting that the operation does not overflow. These does-not-overflow operators transform the undefined behavior into a valid assumption and thus make the GIMPLE IL fully defined. Thus it is now up to the frontend(s) and value-range analysis to figure out if operations overflow and accordingly change the IL to retain optimizations that were possible before. As a start there will be no-overflow variants of NEGATE_EXPR, PLUS_EXPR, MINUS_EXPR, MULT_EXPR and POINTER_PLUS_EXPR. The sizetypes will simply be operated on in no-overflow variants by default (by size_binop and friends). Naming suggestions welcome, at the moment I consider NEGATEV_EXPR (thus appending V to the first part). Richard. ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: New no-undefined-overflow branch 2009-02-26 11:05 New no-undefined-overflow branch Richard Guenther @ 2009-02-26 11:28 ` Manuel López-Ibáñez 2009-02-26 12:02 ` Richard Guenther 2009-02-26 13:07 ` Joseph S. Myers ` (3 subsequent siblings) 4 siblings, 1 reply; 52+ messages in thread From: Manuel López-Ibáñez @ 2009-02-26 11:28 UTC (permalink / raw) To: Richard Guenther; +Cc: gcc 2009/2/26 Richard Guenther <rguenther@suse.de>: > > There shall be no construct in the GIMPLE IL that invokes > undefined behavior. What about uninitialized variables? Cheers, Manuel. ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: New no-undefined-overflow branch 2009-02-26 11:28 ` Manuel López-Ibáñez @ 2009-02-26 12:02 ` Richard Guenther 0 siblings, 0 replies; 52+ messages in thread From: Richard Guenther @ 2009-02-26 12:02 UTC (permalink / raw) To: Manuel López-Ibáñez; +Cc: gcc [-- Attachment #1: Type: TEXT/PLAIN, Size: 299 bytes --] On Thu, 26 Feb 2009, Manuel López-Ibáñez wrote: > 2009/2/26 Richard Guenther <rguenther@suse.de>: > > > >  There shall be no construct in the GIMPLE IL that invokes > >  undefined behavior. > > What about uninitialized variables? That's unrelated and something I do not care about. Richard. ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: New no-undefined-overflow branch 2009-02-26 11:05 New no-undefined-overflow branch Richard Guenther 2009-02-26 11:28 ` Manuel López-Ibáñez @ 2009-02-26 13:07 ` Joseph S. Myers 2009-02-26 13:22 ` Richard Guenther 2009-02-26 14:16 ` Dave Korn ` (2 subsequent siblings) 4 siblings, 1 reply; 52+ messages in thread From: Joseph S. Myers @ 2009-02-26 13:07 UTC (permalink / raw) To: Richard Guenther; +Cc: gcc On Thu, 26 Feb 2009, Richard Guenther wrote: > As a start there will be no-overflow variants of NEGATE_EXPR, > PLUS_EXPR, MINUS_EXPR, MULT_EXPR and POINTER_PLUS_EXPR. So you would have both wrapping and no-overflow versions of these codes (with it of course always being valid for a transformation to convert the no-overflow version into a wrapping one if in a particular case that is beneficial), and the gimplification process would select which codes to use based on the -fwrapv setting? Whereas trapping versions (any reimplementation of -ftrapv) would be expected to be implemented at gimplification time or before with explicit checks in GIMPLE rather than further variant codes? (RTL has only the overflow-wraps operation variants, as well as having signed/unsigned operations where necessary instead of signed/unsigned types. (It has overflow-traps versions as well in the present -ftrapv implementation.)) If anyone were to wish to make -fwrapv get INT_MIN / -1 and INT_MIN % -1 right (see the discussions in PR 30484), where would this fit in this scheme? Different operations for division and modulo operations according to whether these cases are defined, and a lowering stage carried out on GIMPLE at some point to convert the versions where they are defined to the versions where they aren't (inserting explicit checks for -1) if the underlying target instructions do not make these operations defined? (Both variants would embed an assumption that you are not dividing by integer 0.) Making these semantics explicit in the IL instead of in global flags is of course a good thing. -- Joseph S. Myers joseph@codesourcery.com ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: New no-undefined-overflow branch 2009-02-26 13:07 ` Joseph S. Myers @ 2009-02-26 13:22 ` Richard Guenther 0 siblings, 0 replies; 52+ messages in thread From: Richard Guenther @ 2009-02-26 13:22 UTC (permalink / raw) To: Joseph S. Myers; +Cc: gcc On Thu, 26 Feb 2009, Joseph S. Myers wrote: > On Thu, 26 Feb 2009, Richard Guenther wrote: > > > As a start there will be no-overflow variants of NEGATE_EXPR, > > PLUS_EXPR, MINUS_EXPR, MULT_EXPR and POINTER_PLUS_EXPR. > > So you would have both wrapping and no-overflow versions of these codes > (with it of course always being valid for a transformation to convert the > no-overflow version into a wrapping one if in a particular case that is > beneficial), and the gimplification process would select which codes to > use based on the -fwrapv setting? Yes, there is a wrapping and a no-overflow variant (though the no-overflow variant also wraps if it overflows - indeed, this is so that existing build2 calls for example for PLUS_EXPR are always valid). I am thinking of not selecting the codes during gimplification but instead force the frontends to choose whatever they think match their semantics. > Whereas trapping versions (any > reimplementation of -ftrapv) would be expected to be implemented at > gimplification time or before with explicit checks in GIMPLE rather than > further variant codes? Yes. I was thinking about overloading the no-overflow variants for floating-point types to mean non-trapping, non-inf, non-nan (thus a "fast-math" operation). But that's for future thoughts. > (RTL has only the overflow-wraps operation variants, as well as having > signed/unsigned operations where necessary instead of signed/unsigned > types. (It has overflow-traps versions as well in the present -ftrapv > implementation.) Yes, I am aware of that. > If anyone were to wish to make -fwrapv get INT_MIN / -1 and INT_MIN % -1 > right (see the discussions in PR 30484), where would this fit in this > scheme? Different operations for division and modulo operations according > to whether these cases are defined, and a lowering stage carried out on > GIMPLE at some point to convert the versions where they are defined to the > versions where they aren't (inserting explicit checks for -1) if the > underlying target instructions do not make these operations defined? > (Both variants would embed an assumption that you are not dividing by > integer 0.) The canonical way would be to have TRUNC_DIV_EXPR insert proper checks at expansion time and TRUNC_DIVV_EXPR assume that the dividend isn't INT_MIN at the same time as the divisor is -1 (I wonder if the same implementation problem may exist for negating INT_MIN? Or does all hardware get this right?). There is of course the question on whether we want to pay the penalty of expanding TRUNC_DIV_EXPR "correct" or not. I support the idea of a separate target flag for this. Note that I think this issue perfectly fits in the wrapping / no-overflow scheme, just it isn't automagically solved ;) (but we can use value-range analysis to optimize the cases where we know we don't need the checks). > Making these semantics explicit in the IL instead of in global flags is of > course a good thing. It's even needed for cross-language (and cross-TU) LTO. I also think it will end up creating more optimization opportunities rather than less. As with the first answer I hope that I get help from the frontend maintainers properly emitting the no-overflow variants where they think their language allows it. Thanks, Richard. ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: New no-undefined-overflow branch 2009-02-26 11:05 New no-undefined-overflow branch Richard Guenther 2009-02-26 11:28 ` Manuel López-Ibáñez 2009-02-26 13:07 ` Joseph S. Myers @ 2009-02-26 14:16 ` Dave Korn 2009-02-26 23:08 ` Zdenek Dvorak 2009-02-27 23:07 ` Diego Novillo 4 siblings, 0 replies; 52+ messages in thread From: Dave Korn @ 2009-02-26 14:16 UTC (permalink / raw) To: Richard Guenther; +Cc: gcc Richard Guenther wrote: > As a start there will be no-overflow variants of NEGATE_EXPR, > PLUS_EXPR, MINUS_EXPR, MULT_EXPR and POINTER_PLUS_EXPR. > Naming suggestions welcome, at the moment I consider NEGATEV_EXPR > (thus appending V to the first part). Wow, an actual /invitation/ to bikeshed! Ok then, I'd suggest that since "V" means "overflow" in CPU flags, you want to add "NV" instead, no? cheers, DaveK ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: New no-undefined-overflow branch 2009-02-26 11:05 New no-undefined-overflow branch Richard Guenther ` (2 preceding siblings ...) 2009-02-26 14:16 ` Dave Korn @ 2009-02-26 23:08 ` Zdenek Dvorak 2009-02-27 10:16 ` Richard Guenther 2009-02-27 23:07 ` Diego Novillo 4 siblings, 1 reply; 52+ messages in thread From: Zdenek Dvorak @ 2009-02-26 23:08 UTC (permalink / raw) To: Richard Guenther; +Cc: gcc Hi, in general, I like this proposal a lot. However, > As a start there will be no-overflow variants of NEGATE_EXPR, > PLUS_EXPR, MINUS_EXPR, MULT_EXPR and POINTER_PLUS_EXPR. > > The sizetypes will simply be operated on in no-overflow variants > by default (by size_binop and friends). > > Naming suggestions welcome, at the moment I consider NEGATEV_EXPR > (thus appending V to the first part). introducing new codes seems like a bad idea to me. There are many places that do not care about the distinction between PLUS_EXPR and PLUSV_EXPR, and handling both cases will complicate the code (see eg. the problems caused by introducing POINTER_PLUS_EXPR vs PLUS_EXPR distinction). Why not just use a flag to mark the operation as non-overflowing? Zdenek ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: New no-undefined-overflow branch 2009-02-26 23:08 ` Zdenek Dvorak @ 2009-02-27 10:16 ` Richard Guenther 2009-02-27 12:07 ` Dave Korn 2009-02-27 17:58 ` Zdenek Dvorak 0 siblings, 2 replies; 52+ messages in thread From: Richard Guenther @ 2009-02-27 10:16 UTC (permalink / raw) To: Zdenek Dvorak; +Cc: gcc On Fri, 27 Feb 2009, Zdenek Dvorak wrote: > Hi, > > in general, I like this proposal a lot. However, > > > As a start there will be no-overflow variants of NEGATE_EXPR, > > PLUS_EXPR, MINUS_EXPR, MULT_EXPR and POINTER_PLUS_EXPR. > > > > The sizetypes will simply be operated on in no-overflow variants > > by default (by size_binop and friends). > > > > Naming suggestions welcome, at the moment I consider NEGATEV_EXPR > > (thus appending V to the first part). First, for the bikeshedding part I agree that NV is probably a better idea. > introducing new codes seems like a bad idea to me. There are many > places that do not care about the distinction between PLUS_EXPR and > PLUSV_EXPR, and handling both cases will complicate the code (see eg. > the problems caused by introducing POINTER_PLUS_EXPR vs PLUS_EXPR > distinction). Why not just use a flag to mark the operation as > non-overflowing? I obviously thought about this. The issue with using a flag is that there is no convenient place to stick it and that it makes the distinction between the two variants less visible. Consider the folding routines that take split trees for a start. IMHO using new tree-codes is much less complicated than carrying around flags. I did consider putting flags on the tree code itself, but that isn't going to make the changes easier either. Richard. ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: New no-undefined-overflow branch 2009-02-27 10:16 ` Richard Guenther @ 2009-02-27 12:07 ` Dave Korn 2009-02-27 13:17 ` Richard Guenther 2009-02-27 18:07 ` Zdenek Dvorak 2009-02-27 17:58 ` Zdenek Dvorak 1 sibling, 2 replies; 52+ messages in thread From: Dave Korn @ 2009-02-27 12:07 UTC (permalink / raw) To: Richard Guenther; +Cc: Zdenek Dvorak, gcc Richard Guenther wrote: > On Fri, 27 Feb 2009, Zdenek Dvorak wrote: >> introducing new codes seems like a bad idea to me. There are many >> places that do not care about the distinction between PLUS_EXPR and >> PLUSV_EXPR, and handling both cases will complicate the code (see eg. >> the problems caused by introducing POINTER_PLUS_EXPR vs PLUS_EXPR >> distinction). Why not just use a flag to mark the operation as >> non-overflowing? > > I obviously thought about this. The issue with using a flag is > that there is no convenient place to stick it and that it makes > the distinction between the two variants less visible. Consider > the folding routines that take split trees for a start. > > IMHO using new tree-codes is much less complicated than carrying > around flags. I did consider putting flags on the tree code > itself, but that isn't going to make the changes easier either. I think it's probably a far safer design too. If you suddenly introduce a new semantic for an existing code, suddenly every current check for that tree code becomes a potential bug site that has to be manually inspected to see if it's overflow-sensitive or not and therefore whether it needs to test the new flag or not; people who don't know about the new flag will carry on adding code that processes those tree codes without knowing to test the flag; and the bugs that do arise will be hard to find and result in silent bad codegen. If OTOH you add new tree codes, the meaning will be explicit, nobody using them can be under any misapprehension about the overflow semantics, nobody can forget to check the flag, and any places where they should be handled but aren't will very quickly draw themselves to our attention by ICEing, or if they don't will simply fall back to the safe option of not doing any processing on a tree code that they don't recognize; that might lead to pessimal code, but it shouldn't generate incorrect code. cheers, DaveK ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: New no-undefined-overflow branch 2009-02-27 12:07 ` Dave Korn @ 2009-02-27 13:17 ` Richard Guenther 2009-02-27 13:35 ` Dave Korn 2009-02-27 18:07 ` Zdenek Dvorak 1 sibling, 1 reply; 52+ messages in thread From: Richard Guenther @ 2009-02-27 13:17 UTC (permalink / raw) To: Dave Korn; +Cc: Zdenek Dvorak, gcc On Fri, 27 Feb 2009, Dave Korn wrote: > Richard Guenther wrote: > > On Fri, 27 Feb 2009, Zdenek Dvorak wrote: > > >> introducing new codes seems like a bad idea to me. There are many > >> places that do not care about the distinction between PLUS_EXPR and > >> PLUSV_EXPR, and handling both cases will complicate the code (see eg. > >> the problems caused by introducing POINTER_PLUS_EXPR vs PLUS_EXPR > >> distinction). Why not just use a flag to mark the operation as > >> non-overflowing? > > > > I obviously thought about this. The issue with using a flag is > > that there is no convenient place to stick it and that it makes > > the distinction between the two variants less visible. Consider > > the folding routines that take split trees for a start. > > > > IMHO using new tree-codes is much less complicated than carrying > > around flags. I did consider putting flags on the tree code > > itself, but that isn't going to make the changes easier either. > > I think it's probably a far safer design too. > > If you suddenly introduce a new semantic for an existing code, suddenly > every current check for that tree code becomes a potential bug site that has > to be manually inspected to see if it's overflow-sensitive or not and > therefore whether it needs to test the new flag or not; people who don't know > about the new flag will carry on adding code that processes those tree codes > without knowing to test the flag; and the bugs that do arise will be hard to > find and result in silent bad codegen. > > If OTOH you add new tree codes, the meaning will be explicit, nobody using > them can be under any misapprehension about the overflow semantics, nobody can > forget to check the flag, and any places where they should be handled but > aren't will very quickly draw themselves to our attention by ICEing, or if > they don't will simply fall back to the safe option of not doing any > processing on a tree code that they don't recognize; that might lead to > pessimal code, but it shouldn't generate incorrect code. It's definitely safer. Still we have to carefully modify existing code to deal with the new tree codes as most of it carelessly transitiones old codes to new trees. For example re-associating (a +/nv b) + c to a +/nv (b + c) is wrong. Richard. ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: New no-undefined-overflow branch 2009-02-27 13:17 ` Richard Guenther @ 2009-02-27 13:35 ` Dave Korn 2009-02-27 13:45 ` Richard Guenther 0 siblings, 1 reply; 52+ messages in thread From: Dave Korn @ 2009-02-27 13:35 UTC (permalink / raw) To: Richard Guenther; +Cc: Dave Korn, Zdenek Dvorak, gcc Richard Guenther wrote: > It's definitely safer. Still we have to carefully modify existing > code to deal with the new tree codes as most of it carelessly > transitiones old codes to new trees. For example re-associating > (a +/nv b) + c to a +/nv (b + c) is wrong. Yes, of course we have to take care when processing the new codes, but the wonderful thing that I was pointing out is that none of the current fold/simplify code will attempt to reassociate the new codes at first, until it's been explicitly taught about them, whereas if we had used a flag it would plough straight ahead under the misapprehension that it understood the semantics of the tree code, that's all. Anyway, your plan gets a big thumbs-up from me. The noobs will be endlessly glad when signed ints start overflowing in the way they'd hoped, and the fortran-wielding-heavy-duty-massive-array-number-crunching-academic types can use a commandline option to make sure they still get their "loop optimiser assumes your index won't overflow so loop must terminate" optimisations, won't they? cheers, DaveK ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: New no-undefined-overflow branch 2009-02-27 13:35 ` Dave Korn @ 2009-02-27 13:45 ` Richard Guenther 0 siblings, 0 replies; 52+ messages in thread From: Richard Guenther @ 2009-02-27 13:45 UTC (permalink / raw) To: Dave Korn; +Cc: Zdenek Dvorak, gcc On Fri, 27 Feb 2009, Dave Korn wrote: > Richard Guenther wrote: > > > It's definitely safer. Still we have to carefully modify existing > > code to deal with the new tree codes as most of it carelessly > > transitiones old codes to new trees. For example re-associating > > (a +/nv b) + c to a +/nv (b + c) is wrong. > > Yes, of course we have to take care when processing the new codes, but the > wonderful thing that I was pointing out is that none of the current > fold/simplify code will attempt to reassociate the new codes at first, until > it's been explicitly taught about them, whereas if we had used a flag it would > plough straight ahead under the misapprehension that it understood the > semantics of the tree code, that's all. Correct. > Anyway, your plan gets a big thumbs-up from me. The noobs will be endlessly > glad when signed ints start overflowing in the way they'd hoped, and the Heh - this will only happen if the frontend does not indicate there isn't overflow with signed operations. Which of course it will unless you specify -fwrapv. > fortran-wielding-heavy-duty-massive-array-number-crunching-academic types can > use a commandline option to make sure they still get their "loop optimiser > assumes your index won't overflow so loop must terminate" optimisations, won't > they? Yes. -fwrapv also shouldn't be as bad as we can re-instantiate the no-wrapping operation codes during value-range analysis. Richard. ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: New no-undefined-overflow branch 2009-02-27 12:07 ` Dave Korn 2009-02-27 13:17 ` Richard Guenther @ 2009-02-27 18:07 ` Zdenek Dvorak 1 sibling, 0 replies; 52+ messages in thread From: Zdenek Dvorak @ 2009-02-27 18:07 UTC (permalink / raw) To: Dave Korn; +Cc: Richard Guenther, gcc Hi, > > I obviously thought about this. The issue with using a flag is > > that there is no convenient place to stick it and that it makes > > the distinction between the two variants less visible. Consider > > the folding routines that take split trees for a start. > > > > IMHO using new tree-codes is much less complicated than carrying > > around flags. I did consider putting flags on the tree code > > itself, but that isn't going to make the changes easier either. > > I think it's probably a far safer design too. > > If you suddenly introduce a new semantic for an existing code, suddenly > every current check for that tree code becomes a potential bug site that has > to be manually inspected to see if it's overflow-sensitive or not and > therefore whether it needs to test the new flag or not; people who don't know > about the new flag will carry on adding code that processes those tree codes > without knowing to test the flag; and the bugs that do arise will be hard to > find and result in silent bad codegen. > > If OTOH you add new tree codes, the meaning will be explicit, nobody using > them can be under any misapprehension about the overflow semantics, nobody can > forget to check the flag, and any places where they should be handled but > aren't will very quickly draw themselves to our attention by ICEing, or if > they don't will simply fall back to the safe option of not doing any > processing on a tree code that they don't recognize; that might lead to > pessimal code, but it shouldn't generate incorrect code. you are way too optimistic. Regardless of whether you introduce the new codes or represent the new information in another way, the fact is that this changes semantics of PLUS_EXPR in some cases, and it will lead to a wave of bugs from places that you forgot to update or updated incorrectly. Also, adding new codes usually will not lead to ICEs on places that do not handle them, just to missed optimizations (most places that handle expressions have some kind of "default" -- if this is not any of the operations that we handle specially, do something generic). Using special codes is in no way safer than having an extra flag -- even with the extra flag, the default behavior for the newly created operations would be that they may wrap, so there is really no difference, Zdenek ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: New no-undefined-overflow branch 2009-02-27 10:16 ` Richard Guenther 2009-02-27 12:07 ` Dave Korn @ 2009-02-27 17:58 ` Zdenek Dvorak 2009-02-27 18:17 ` Richard Guenther 1 sibling, 1 reply; 52+ messages in thread From: Zdenek Dvorak @ 2009-02-27 17:58 UTC (permalink / raw) To: Richard Guenther; +Cc: gcc Hi, > > introducing new codes seems like a bad idea to me. There are many > > places that do not care about the distinction between PLUS_EXPR and > > PLUSV_EXPR, and handling both cases will complicate the code (see eg. > > the problems caused by introducing POINTER_PLUS_EXPR vs PLUS_EXPR > > distinction). Why not just use a flag to mark the operation as > > non-overflowing? > > I obviously thought about this. The issue with using a flag is > that there is no convenient place to stick it and that it makes > the distinction between the two variants less visible. Consider > the folding routines that take split trees for a start. > > IMHO using new tree-codes is much less complicated than carrying > around flags. I did consider putting flags on the tree code > itself, but that isn't going to make the changes easier either. OK, then what about this: introduce accessor functions like tree_code get_operation_semantics (tree_code) -- returns PLUS_EXPR for PLUS_EXPR and PLUSNV_EXPR, etc. bool get_operation_overflow (tree_code) -- obvious tree_code operation_code (tree_code, bool) -- PLUS_EXPR, false -> PLUS_EXPR -- PLUS_EXPR, true -> PLUSNV_EXPR -- PLUSNV_EXPR, * -> abort etc. (possibly with more clear names), and change the code to always use them? Zdenek ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: New no-undefined-overflow branch 2009-02-27 17:58 ` Zdenek Dvorak @ 2009-02-27 18:17 ` Richard Guenther 0 siblings, 0 replies; 52+ messages in thread From: Richard Guenther @ 2009-02-27 18:17 UTC (permalink / raw) To: Zdenek Dvorak; +Cc: gcc On Fri, 27 Feb 2009, Zdenek Dvorak wrote: > Hi, > > > > introducing new codes seems like a bad idea to me. There are many > > > places that do not care about the distinction between PLUS_EXPR and > > > PLUSV_EXPR, and handling both cases will complicate the code (see eg. > > > the problems caused by introducing POINTER_PLUS_EXPR vs PLUS_EXPR > > > distinction). Why not just use a flag to mark the operation as > > > non-overflowing? > > > > I obviously thought about this. The issue with using a flag is > > that there is no convenient place to stick it and that it makes > > the distinction between the two variants less visible. Consider > > the folding routines that take split trees for a start. > > > > IMHO using new tree-codes is much less complicated than carrying > > around flags. I did consider putting flags on the tree code > > itself, but that isn't going to make the changes easier either. > > OK, then what about this: introduce accessor functions like > > tree_code get_operation_semantics (tree_code) > -- returns PLUS_EXPR for PLUS_EXPR and PLUSNV_EXPR, etc. > bool get_operation_overflow (tree_code) > -- obvious > tree_code operation_code (tree_code, bool) > -- PLUS_EXPR, false -> PLUS_EXPR > -- PLUS_EXPR, true -> PLUSNV_EXPR > -- PLUSNV_EXPR, * -> abort > etc. > > (possibly with more clear names), and change the code to always > use them? Yes. In playing with an initial patch I see the need for these kind of helpers. Sofar I tried to limit myself to PLUS_EXPR_P () covering both kinds, but that certainly is not enough. So I see the need for at least tree_code strip_nv (tree_code); -- returns PLUS_EXPR for PLUS_EXPR and PLUSNV_EXPR, etc. I didn't yet come along a need for your last one. Maybe similar to the second one is the requirement to somehow have a test when we want to emit strict-overflow warnings - which we want for the NV codes operating on signed types, possibly using TREE_NO_WARNING for avoiding warnings for NV codes that value-range analysis inserted (and, unfortunately there's no suitable place to put TREE_NO_WARNING on fold arguments ... :/). As of using them always - I will see how it works out. For a start I am able to miscompile gcc after just minimal fold surgery... :/ Thanks, Richard. ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: New no-undefined-overflow branch 2009-02-26 11:05 New no-undefined-overflow branch Richard Guenther ` (3 preceding siblings ...) 2009-02-26 23:08 ` Zdenek Dvorak @ 2009-02-27 23:07 ` Diego Novillo 2009-02-28 0:04 ` Joseph S. Myers 2009-02-28 10:25 ` Richard Guenther 4 siblings, 2 replies; 52+ messages in thread From: Diego Novillo @ 2009-02-27 23:07 UTC (permalink / raw) To: Richard Guenther; +Cc: gcc On Thu, Feb 26, 2009 at 06:05, Richard Guenther <rguenther@suse.de> wrote: > There shall be no construct in the GIMPLE IL that invokes > undefined behavior. Excellent! Thanks for starting this branch. > Thus, from now on integer overflow is defined and will wrap with > the usual twos-complement semantics. Thus, for the middle-end > -fwrapv is always in effect. This extends to pointer overflow. You mean that -fwrapv is always in effect for the *V_EXPR operations, right? PLUS_EXPR will always be defined as not overflowing. It's up to the front end to emit one or the other. How should we define merging two different TUs compiled with different -fwrapv settings? This is something that may be common in the context of LTO: $ gcc -flto -fwrapv f1.c $ gcc -flto f2.c $ gcc -o f -O2 f1.o f2.o We will be reading IL containing both overflow and non-overflow operations. We should define the combination rules for them. > Naming suggestions welcome, at the moment I consider NEGATEV_EXPR > (thus appending V to the first part). As good a convention as any. Diego. ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: New no-undefined-overflow branch 2009-02-27 23:07 ` Diego Novillo @ 2009-02-28 0:04 ` Joseph S. Myers 2009-02-28 10:29 ` Richard Guenther 2009-02-28 10:25 ` Richard Guenther 1 sibling, 1 reply; 52+ messages in thread From: Joseph S. Myers @ 2009-02-28 0:04 UTC (permalink / raw) To: Diego Novillo; +Cc: Richard Guenther, gcc On Fri, 27 Feb 2009, Diego Novillo wrote: > We will be reading IL containing both overflow and non-overflow > operations. We should define the combination rules for them. The rules are simple: * No transformation (of arithmetic operations, which is what we are discussing here) may change defined behavior for given inputs to undefined behavior for those same inputs. (The reverse transformation is permitted. For example, a mixed set of addition/subtraction operations might usefully be converted to all overflow-defined operations to allow the operands to be rearranged and cancellation to take place. For example, if both versions of a+b with the same operands are in use at some point, the overflow-undefined version need not be computed, only the overflow-wraps version.) * No transformation may change defined behavior for given inputs to different, defined behavior for those inputs. -- Joseph S. Myers joseph@codesourcery.com ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: New no-undefined-overflow branch 2009-02-28 0:04 ` Joseph S. Myers @ 2009-02-28 10:29 ` Richard Guenther 2009-03-05 17:24 ` Geert Bosch 0 siblings, 1 reply; 52+ messages in thread From: Richard Guenther @ 2009-02-28 10:29 UTC (permalink / raw) To: Joseph S. Myers; +Cc: Diego Novillo, gcc On Sat, 28 Feb 2009, Joseph S. Myers wrote: > On Fri, 27 Feb 2009, Diego Novillo wrote: > > > We will be reading IL containing both overflow and non-overflow > > operations. We should define the combination rules for them. > > The rules are simple: > > * No transformation (of arithmetic operations, which is what we are > discussing here) may change defined behavior for given inputs to undefined > behavior for those same inputs. (The reverse transformation is permitted. > For example, a mixed set of addition/subtraction operations might usefully > be converted to all overflow-defined operations to allow the operands to > be rearranged and cancellation to take place. For example, if both > versions of a+b with the same operands are in use at some point, the > overflow-undefined version need not be computed, only the overflow-wraps > version.) > > * No transformation may change defined behavior for given inputs to > different, defined behavior for those inputs. Indeed. A safe way of combining mixed trees is to just drop the *NV_EXPR variants to *_EXPR variants on the result. Consider (a -/nv 20) +/nv (b -/nv 20) where we see a re-association opportunity to combine both constants. The result is (a + b) - 40, _not_ (a +/nv b) -/nv 40. On trunk we have to disable this re-association because the result would still be undefined overflow (we don't have means to avoid this - other than doing the arithmetic unsigned and perform conversions of course). Richard. ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: New no-undefined-overflow branch 2009-02-28 10:29 ` Richard Guenther @ 2009-03-05 17:24 ` Geert Bosch 2009-03-06 12:28 ` Richard Guenther 0 siblings, 1 reply; 52+ messages in thread From: Geert Bosch @ 2009-03-05 17:24 UTC (permalink / raw) To: Richard Guenther; +Cc: Joseph S. Myers, Diego Novillo, gcc Hi Richard, Great to see that you're addressing this issue. If I understand correctly, for RTL all operations are always wrapping, right? I have been considering adding "V" variants for operations that trap on overflow. The main reason I have not (yet) pursued this, is the daunting task of teaching the folders about all these new codes. initially I'd like to lower the "V" operations to explicit checks (calling abort() or raising an exception, depending on the language) during gimplification, but with the idea of eventually delaying expansion as more code learns how to handle the new expressions. I already have most of the code necessary for the expansions, as I now do them during translation of Ada trees to GENERIC trees. Actually, the new and saner wrapping semantics for {PLUS,MINUS,MULT}_EXPR simplify these a bit, by avoiding the need to use unsigned types. As you obviously doing something very similar now by introducing "NV" variants, do you think this would fit in with your scheme? If so, I'd be happy to try and expand on your work to have, wrapping, no-overflow and overflow- checking variants for basic arithmetic. -Geert ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: New no-undefined-overflow branch 2009-03-05 17:24 ` Geert Bosch @ 2009-03-06 12:28 ` Richard Guenther 2009-03-06 13:54 ` Joseph S. Myers ` (2 more replies) 0 siblings, 3 replies; 52+ messages in thread From: Richard Guenther @ 2009-03-06 12:28 UTC (permalink / raw) To: Geert Bosch; +Cc: Joseph S. Myers, Diego Novillo, gcc On Thu, 5 Mar 2009, Geert Bosch wrote: > Hi Richard, > > Great to see that you're addressing this issue. If I understand correctly, > for RTL all operations are always wrapping, right? That's true (we don't have signedness information there) and I don't plan to change that. > I have been considering adding "V" variants for operations that trap on > overflow. The main reason I have not (yet) pursued this, is the daunting > task of teaching the folders about all these new codes. initially I'd > like to lower the "V" operations to explicit checks (calling abort() or > raising an exception, depending on the language) during gimplification, > but with the idea of eventually delaying expansion as more code learns > how to handle the new expressions. I already have most of the code necessary > for the expansions, as I now do them during translation of Ada trees to > GENERIC trees. Actually, the new and saner wrapping semantics for > {PLUS,MINUS,MULT}_EXPR simplify these a bit, by avoiding the need to use > unsigned types. Correct. > As you obviously doing something very similar now by introducing "NV" > variants, > do you think this would fit in with your scheme? If so, I'd be happy to try > and expand on your work to have, wrapping, no-overflow and overflow-checking > variants for basic arithmetic. I didn't spend too much time thinking about the trapping variants (well, I believe it isn't that important ;)). But in general we would have to expand the non-NV variants via the trapping expanders if flag_trapv was true (so yeah, combining TUs with different flag_trapv settings would be difficult again and it would ask for explicitly encoding this variant in the IL ...). There is of course the problem that we have to be careful not to introduce new traps via folding, a problem that doesn't exist with the no-overflow variants (I can simply drop to the wrapping variants). With for example (a -/v 10) +/v 10 would you want to preserve the possibly trapping a -/v 10? For (a -/v 10) +/v (b -/v 10) do we want to be careful to not introduce extra traps when simplifying to (a +/v b) -/v 20? So while trapping variants can certainly be introduced it looks like this task may be more difficult. So lowering them early during gimplification looks like a more reasonable plan IMHO. Richard. ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: New no-undefined-overflow branch 2009-03-06 12:28 ` Richard Guenther @ 2009-03-06 13:54 ` Joseph S. Myers 2009-03-06 14:10 ` Richard Guenther 2009-03-06 14:44 ` Paolo Bonzini 2009-03-06 17:21 ` Geert Bosch 2 siblings, 1 reply; 52+ messages in thread From: Joseph S. Myers @ 2009-03-06 13:54 UTC (permalink / raw) To: Richard Guenther; +Cc: Geert Bosch, Diego Novillo, gcc On Fri, 6 Mar 2009, Richard Guenther wrote: > There is of course the problem that we have to be careful not to > introduce new traps via folding, a problem that doesn't exist with > the no-overflow variants (I can simply drop to the wrapping variants). > With for example (a -/v 10) +/v 10 would you want to preserve > the possibly trapping a -/v 10? For (a -/v 10) +/v (b -/v 10) do > we want to be careful to not introduce extra traps when simplifying > to (a +/v b) -/v 20? I think we should preserve the property of whether the code evaluated between sequence points causes no traps, or at least one trap. (But need not preserve the difference between one trap and two, for example. This is the same as C99 Annex F requirements for floating-point exceptions: the set of exceptions produced between calls to fenv.h functions checking or modifying exception state needs to be preserved, but not the number (> 0) of times each exception is produced.) This probably means that most folding should not apply to trapping variants. (In principle I believe folding should be done on GIMPLE rather than folding functions being called from the front ends on trees. If you do always lower to explicit overflow checks and only run folding after that lowering stage, the problem of changing whether a trap occurs does not arise. But with trapping variants in GIMPLE, the folding code would still need to allow for them even with folding all happening on GIMPLE.) > So while trapping variants can certainly be introduced it looks like > this task may be more difficult. So lowering them early during > gimplification looks like a more reasonable plan IMHO. If you lower to whatever forms of checks for overflow can be conveniently expressed in GIMPLE then you have the interesting problem of reconstituting trapping operations for those targets that have trapping instructions or instructions setting overflow flags that can be checked directly. It's possible that you want to lower to explicit overflow checks for targets without the operations, but to trapping operation codes for targets with the operations. (Which would suggest starting with trapping variants, and then lowering to explicit checks part way through the GIMPLE optimizations, only on targets without suitable instructions for the relevant type precision and depending on -Os to determine whether inline code or libgcc function calls might be better.) -- Joseph S. Myers joseph@codesourcery.com ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: New no-undefined-overflow branch 2009-03-06 13:54 ` Joseph S. Myers @ 2009-03-06 14:10 ` Richard Guenther 2009-03-06 14:29 ` Joseph S. Myers 0 siblings, 1 reply; 52+ messages in thread From: Richard Guenther @ 2009-03-06 14:10 UTC (permalink / raw) To: Joseph S. Myers; +Cc: Geert Bosch, Diego Novillo, gcc On Fri, 6 Mar 2009, Joseph S. Myers wrote: > On Fri, 6 Mar 2009, Richard Guenther wrote: > > > There is of course the problem that we have to be careful not to > > introduce new traps via folding, a problem that doesn't exist with > > the no-overflow variants (I can simply drop to the wrapping variants). > > With for example (a -/v 10) +/v 10 would you want to preserve > > the possibly trapping a -/v 10? For (a -/v 10) +/v (b -/v 10) do > > we want to be careful to not introduce extra traps when simplifying > > to (a +/v b) -/v 20? > > I think we should preserve the property of whether the code evaluated > between sequence points causes no traps, or at least one trap. (But need > not preserve the difference between one trap and two, for example. This > is the same as C99 Annex F requirements for floating-point exceptions: the > set of exceptions produced between calls to fenv.h functions checking or > modifying exception state needs to be preserved, but not the number (> 0) > of times each exception is produced.) That sounds sensible. Of course we don't know about sequence points during folding (as that may get trees combined from several GIMPLE statements). > This probably means that most folding should not apply to trapping > variants. Indeed. > (In principle I believe folding should be done on GIMPLE rather than > folding functions being called from the front ends on trees. If you do > always lower to explicit overflow checks and only run folding after that > lowering stage, the problem of changing whether a trap occurs does not > arise. But with trapping variants in GIMPLE, the folding code would still > need to allow for them even with folding all happening on GIMPLE.) Yes. > > So while trapping variants can certainly be introduced it looks like > > this task may be more difficult. So lowering them early during > > gimplification looks like a more reasonable plan IMHO. > > If you lower to whatever forms of checks for overflow can be conveniently > expressed in GIMPLE then you have the interesting problem of > reconstituting trapping operations for those targets that have trapping > instructions or instructions setting overflow flags that can be checked > directly. It's possible that you want to lower to explicit overflow > checks for targets without the operations, but to trapping operation codes > for targets with the operations. (Which would suggest starting with > trapping variants, and then lowering to explicit checks part way through > the GIMPLE optimizations, only on targets without suitable instructions > for the relevant type precision and depending on -Os to determine whether > inline code or libgcc function calls might be better.) Ok, I didn't think of generating optimal code for the overflow checks, but it should be reasonably possible to if-convert the explicit overflow checks back to builtins that can be specially expanded (that would also help explicit overflow checks in source code). So my opinion still holds that explicit overflow checks would be prefered. Richard. ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: New no-undefined-overflow branch 2009-03-06 14:10 ` Richard Guenther @ 2009-03-06 14:29 ` Joseph S. Myers 2009-03-06 17:17 ` Geert Bosch 0 siblings, 1 reply; 52+ messages in thread From: Joseph S. Myers @ 2009-03-06 14:29 UTC (permalink / raw) To: Richard Guenther; +Cc: Geert Bosch, Diego Novillo, gcc On Fri, 6 Mar 2009, Richard Guenther wrote: > Ok, I didn't think of generating optimal code for the overflow checks, > but it should be reasonably possible to if-convert the explicit > overflow checks back to builtins that can be specially expanded > (that would also help explicit overflow checks in source code). It looks like only alpha and pa presently have insn patterns such as addvsi3 that would be used by the present -ftrapv code, but I expect several other processors also have instructions that would help in overflow-checking code. (For example, Power Architecture has versions of many arithmetic instructions that set overflow flags, so such instructions could be used followed by conditional traps.) -- Joseph S. Myers joseph@codesourcery.com ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: New no-undefined-overflow branch 2009-03-06 14:29 ` Joseph S. Myers @ 2009-03-06 17:17 ` Geert Bosch 2009-03-06 17:42 ` Joseph S. Myers 0 siblings, 1 reply; 52+ messages in thread From: Geert Bosch @ 2009-03-06 17:17 UTC (permalink / raw) To: Joseph S. Myers; +Cc: Richard Guenther, Diego Novillo, gcc On Mar 6, 2009, at 09:15, Joseph S. Myers wrote: > It looks like only alpha and pa presently have insn patterns such as > addvsi3 that would be used by the present -ftrapv code, but I expect > several other processors also have instructions that would help in > overflow-checking code. (For example, Power Architecture has > versions of > many arithmetic instructions that set overflow flags, so such > instructions > could be used followed by conditional traps.) Most architectures have similar flags or conditions, but during my work on more efficient overflow checking for Ada I have become less convinced of the need for them. For Ada code, the overflow checking is now less expensive than many other checks. In most cases, one of the operands will be either constant or have a known sign. Then the overflow check can be expanded as a simple comparison. The benefit of this is that later optimization passes can use these tests to derive range information, combine checks, or fold them. When using some opaque construct, removing it will be hard. Also, for many languages calling abort() for overflow would not be desirable, and an exception should be raised. Doing this directly from expanded code rather than using a trap handler will avoid the need to write target-specific trap handlers and has the advantage that a message with location information can easily be included. In any case, while I'd really like to move the checked signed integer overflow from Gigi (GNAT-to-GNU tree translator) to the language independent part of GCC, I want to have the absolute minimum amount of changes that is necessary to achieve this goal. Since only Ada uses integer overflow checking at this point, any non-standard GIMPLE will be a maintenance burden and is likely to be mishandled by later optimizers. When we lower all checks during gimplification, we can re-implement -ftrapv while avoiding all of the pitfalls the old implementation had. In particular, there has to be a clear distinction between which signed integer operations must be checked and which not. During compilation many new operations are created and these operations need to have clear semantics. To me that is the great improvement of the no-undefined-overflow branch. -Geert ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: New no-undefined-overflow branch 2009-03-06 17:17 ` Geert Bosch @ 2009-03-06 17:42 ` Joseph S. Myers 0 siblings, 0 replies; 52+ messages in thread From: Joseph S. Myers @ 2009-03-06 17:42 UTC (permalink / raw) To: Geert Bosch; +Cc: Richard Guenther, Diego Novillo, gcc On Fri, 6 Mar 2009, Geert Bosch wrote: > In any case, while I'd really like to move the checked signed > integer overflow from Gigi (GNAT-to-GNU tree translator) to > the language independent part of GCC, I want to have the absolute > minimum amount of changes that is necessary to achieve this goal. I think the absolute minimum is that new codes for trapping operations are added to GENERIC but not GIMPLE, that Gigi generates those codes rather than explicit checks, and the lowering to explicit checks is done in the gimplifier (possibly based on code presently in Gigi). -- Joseph S. Myers joseph@codesourcery.com ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: New no-undefined-overflow branch 2009-03-06 12:28 ` Richard Guenther 2009-03-06 13:54 ` Joseph S. Myers @ 2009-03-06 14:44 ` Paolo Bonzini 2009-03-06 15:10 ` Richard Guenther ` (2 more replies) 2009-03-06 17:21 ` Geert Bosch 2 siblings, 3 replies; 52+ messages in thread From: Paolo Bonzini @ 2009-03-06 14:44 UTC (permalink / raw) To: gcc > So while trapping variants can certainly be introduced it looks like > this task may be more difficult. I don't think you need to introduce trapping tree codes. You can introduce them directly in the front-end as s = x +nv y (((s ^ x) & (s ^ y)) < 0) ? trap () : s d = x -nv y (((d ^ x) & (x ^ y)) < 0) ? trap () : d (b == INT_MIN ? trap () : -nv b) (int)((long long) a * (long long) b) == a *nv b ? trap () : a *nv b Making sure they are compiled efficiently is another story, but especially for the sake of LTO I think this is the way to go. Paolo ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: New no-undefined-overflow branch 2009-03-06 14:44 ` Paolo Bonzini @ 2009-03-06 15:10 ` Richard Guenther 2009-03-06 15:23 ` Paolo Bonzini 2009-03-06 15:58 ` Ian Lance Taylor 2009-03-06 17:25 ` Joseph S. Myers 2009-03-27 13:11 ` Eus 2 siblings, 2 replies; 52+ messages in thread From: Richard Guenther @ 2009-03-06 15:10 UTC (permalink / raw) To: gcc On Fri, Mar 6, 2009 at 3:29 PM, Paolo Bonzini <bonzini@gnu.org> wrote: > >> So while trapping variants can certainly be introduced it looks like >> this task may be more difficult. > > I don't think you need to introduce trapping tree codes. You can > introduce them directly in the front-end as > > s = x +nv y I think this should be s = x + y otherwise the compiler can assume that for the following check the addition did not overflow. > (((s ^ x) & (s ^ y)) < 0) ? trap () : s > > d = x -nv y > (((d ^ x) & (x ^ y)) < 0) ? trap () : d > > (b == INT_MIN ? trap () : -nv b) > > (int)((long long) a * (long long) b) == a *nv b ? trap () : a *nv b > > Making sure they are compiled efficiently is another story, but > especially for the sake of LTO I think this is the way to go. I agree. Btw, for the addition case we generate leal (%rsi,%rdi), %eax xorl %eax, %esi xorl %eax, %edi testl %edi, %esi jns .L2 .value 0x0b0f .L2: rep ret which isn't too bad. Richard. ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: New no-undefined-overflow branch 2009-03-06 15:10 ` Richard Guenther @ 2009-03-06 15:23 ` Paolo Bonzini 2009-03-06 15:39 ` Paolo Bonzini ` (3 more replies) 2009-03-06 15:58 ` Ian Lance Taylor 1 sibling, 4 replies; 52+ messages in thread From: Paolo Bonzini @ 2009-03-06 15:23 UTC (permalink / raw) To: gcc Richard Guenther wrote: > On Fri, Mar 6, 2009 at 3:29 PM, Paolo Bonzini <bonzini@gnu.org> wrote: >>> So while trapping variants can certainly be introduced it looks like >>> this task may be more difficult. >> I don't think you need to introduce trapping tree codes. You can >> introduce them directly in the front-end as >> >> s = x +nv y > > I think this should be > > s = x + y > (((s ^ x) & (s ^ y)) < 0) ? trap () : s > > otherwise the compiler can assume that for the following check > the addition did not overflow. Ah yeah I've not yet looked at the patches and I did not know which one was which. I actually wrote x + y first and then went back to carefully check them. :-P >> Making sure they are compiled efficiently is another story, but >> especially for the sake of LTO I think this is the way to go. > > I agree. Btw, for the addition case we generate > > leal (%rsi,%rdi), %eax > xorl %eax, %esi > xorl %eax, %edi > testl %edi, %esi > jns .L2 > .value 0x0b0f > .L2: > rep > ret > > which isn't too bad. Well, for x86 it requires the addends to die. This is unfortunately four insns, and combine has a limit of three. but maybe you could make combine recognize the check and turn it to an addv pattern (with the add result unused!); and then CSE or maybe combine as well would, well, eliminate the duplicate ADD... If this does not work, on ARM you can also hope for something like this: ADD R0, R1, R2 XORS R0, R2, R3 XORSMI R1, R2, R3 SWIMI #trap But hey, whatever you get, it's anyway faster than a libcall. :-) Of course there are better choices for x+CONSTANT; using (b == INT_MIN ? trap () : -b) for negation is one example. Paolo ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: New no-undefined-overflow branch 2009-03-06 15:23 ` Paolo Bonzini @ 2009-03-06 15:39 ` Paolo Bonzini 2009-03-06 16:19 ` Richard Guenther ` (2 subsequent siblings) 3 siblings, 0 replies; 52+ messages in thread From: Paolo Bonzini @ 2009-03-06 15:39 UTC (permalink / raw) To: gcc Richard Guenther wrote: > On Fri, Mar 6, 2009 at 3:29 PM, Paolo Bonzini <bonzini@gnu.org> wrote: >>> So while trapping variants can certainly be introduced it looks like >>> this task may be more difficult. >> I don't think you need to introduce trapping tree codes. You can >> introduce them directly in the front-end as >> >> s = x +nv y > > I think this should be > > s = x + y > (((s ^ x) & (s ^ y)) < 0) ? trap () : s > > otherwise the compiler can assume that for the following check > the addition did not overflow. Ah yeah I've not yet looked at the patches and I did not know which one was which. I actually wrote x + y first and then went back to carefully check them. :-P >> Making sure they are compiled efficiently is another story, but >> especially for the sake of LTO I think this is the way to go. > > I agree. Btw, for the addition case we generate > > leal (%rsi,%rdi), %eax > xorl %eax, %esi > xorl %eax, %edi > testl %edi, %esi > jns .L2 > .value 0x0b0f > .L2: > rep > ret > > which isn't too bad. Well, for x86 it requires the addends to die. This is unfortunately four insns, and combine has a limit of three. but maybe you could make combine recognize the check and turn it to an addv pattern (with the add result unused!); and then CSE or maybe combine as well would, well, eliminate the duplicate ADD... If this does not work, on ARM you can also hope for something like this: ADD R0, R1, R2 XORS R0, R2, R3 XORSMI R1, R2, R3 SWIMI #trap But hey, whatever you get, it's anyway faster than a libcall. :-) Of course there are better choices for x+CONSTANT; using (b == INT_MIN ? trap () : -b) for negation is one example. Paolo ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: New no-undefined-overflow branch 2009-03-06 15:23 ` Paolo Bonzini 2009-03-06 15:39 ` Paolo Bonzini @ 2009-03-06 16:19 ` Richard Guenther 2009-03-06 17:23 ` Joseph S. Myers 2009-03-06 18:54 ` Richard Earnshaw [not found] ` <1236364189.1654.2.camel@pc960.cambridge.arm.com> 3 siblings, 1 reply; 52+ messages in thread From: Richard Guenther @ 2009-03-06 16:19 UTC (permalink / raw) To: gcc On Fri, Mar 6, 2009 at 4:09 PM, Paolo Bonzini <bonzini@gnu.org> wrote: > Richard Guenther wrote: >> On Fri, Mar 6, 2009 at 3:29 PM, Paolo Bonzini <bonzini@gnu.org> wrote: >>>> So while trapping variants can certainly be introduced it looks like >>>> this task may be more difficult. >>> I don't think you need to introduce trapping tree codes. You can >>> introduce them directly in the front-end as >>> >>> s = x +nv y >> >> I think this should be >> >> s = x + y >> (((s ^ x) & (s ^ y)) < 0) ? trap () : s >> >> otherwise the compiler can assume that for the following check >> the addition did not overflow. > > Ah yeah I've not yet looked at the patches and I did not know which one > was which. I actually wrote x + y first and then went back to carefully > check them. :-P > >>> Making sure they are compiled efficiently is another story, but >>> especially for the sake of LTO I think this is the way to go. >> >> I agree. Btw, for the addition case we generate >> >> leal (%rsi,%rdi), %eax >> xorl %eax, %esi >> xorl %eax, %edi >> testl %edi, %esi >> jns .L2 >> .value 0x0b0f >> .L2: >> rep >> ret >> >> which isn't too bad. > > Well, for x86 it requires the addends to die. > > This is unfortunately four insns, and combine has a limit of three. but > maybe you could make combine recognize the check and turn it to an addv > pattern (with the add result unused!); and then CSE or maybe combine as > well would, well, eliminate the duplicate ADD... Well, I was thinking about detecting the pattern on the tree level instead. s_6 = x.0_2 + y.1_4; D.1597_7 = s_6 ^ x_1(D); D.1598_8 = s_6 ^ y_3(D); D.1599_9 = D.1597_7 & D.1598_8; if (D.1599_9 < 0) goto <bb 3>; else goto <bb 4>; <bb 3>: __builtin_trap (); <bb 4>: This should be recognizable in the ifcombine pass for example, which recognizes CFG patterns. Transforming it to just s_6 = __builtin_addv (x.0_2, y.1_4); <bb 4>: Only ifcombine runs a little too early for that maybe. Richard. ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: New no-undefined-overflow branch 2009-03-06 16:19 ` Richard Guenther @ 2009-03-06 17:23 ` Joseph S. Myers 0 siblings, 0 replies; 52+ messages in thread From: Joseph S. Myers @ 2009-03-06 17:23 UTC (permalink / raw) To: Richard Guenther; +Cc: gcc On Fri, 6 Mar 2009, Richard Guenther wrote: > Well, I was thinking about detecting the pattern on the tree level instead. > > s_6 = x.0_2 + y.1_4; > D.1597_7 = s_6 ^ x_1(D); > D.1598_8 = s_6 ^ y_3(D); > D.1599_9 = D.1597_7 & D.1598_8; > if (D.1599_9 < 0) > goto <bb 3>; > else > goto <bb 4>; > > <bb 3>: > __builtin_trap (); > > <bb 4>: > > This should be recognizable in the ifcombine pass for example, which > recognizes CFG patterns. Transforming it to just > > s_6 = __builtin_addv (x.0_2, y.1_4); > > <bb 4>: > > Only ifcombine runs a little too early for that maybe. In addition to detecting to transform into something like the above (for addv insn patterns or libgcc function), you may also want to detect when a constant has been propagated into the above and make sure it can get optimized into a range check on the non-constant operand. (I don't know if existing optimizers will be able to handle the above with a constant for one of the operands and convert it to a range check, or whether special code would be needed.) -- Joseph S. Myers joseph@codesourcery.com ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: New no-undefined-overflow branch 2009-03-06 15:23 ` Paolo Bonzini 2009-03-06 15:39 ` Paolo Bonzini 2009-03-06 16:19 ` Richard Guenther @ 2009-03-06 18:54 ` Richard Earnshaw 2009-03-06 19:53 ` Ian Lance Taylor [not found] ` <1236364189.1654.2.camel@pc960.cambridge.arm.com> 3 siblings, 1 reply; 52+ messages in thread From: Richard Earnshaw @ 2009-03-06 18:54 UTC (permalink / raw) To: Paolo Bonzini; +Cc: gcc On Fri, 2009-03-06 at 16:09 +0100, Paolo Bonzini wrote: > If this does not work, on ARM you can also hope for something like this: > > ADD R0, R1, R2 > XORS R0, R2, R3 > XORSMI R1, R2, R3 > SWIMI #trap On ARM you can just check for overflow directly... ADDS R0, R1, R2 SWIVS #trap :=) ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: New no-undefined-overflow branch 2009-03-06 18:54 ` Richard Earnshaw @ 2009-03-06 19:53 ` Ian Lance Taylor 2009-03-06 20:42 ` Robert Dewar 0 siblings, 1 reply; 52+ messages in thread From: Ian Lance Taylor @ 2009-03-06 19:53 UTC (permalink / raw) To: gcc Richard Earnshaw <rearnsha@arm.com> writes: > On Fri, 2009-03-06 at 16:09 +0100, Paolo Bonzini wrote: >> If this does not work, on ARM you can also hope for something like this: >> >> ADD R0, R1, R2 >> XORS R0, R2, R3 >> XORSMI R1, R2, R3 >> SWIMI #trap > > On ARM you can just check for overflow directly... > > ADDS R0, R1, R2 > SWIVS #trap This point should not be missed. Some processors (MIPS) have trapping arithmetic instructions, but many processors have an overflow flag which can be tested. Any useful design for -ftrapv needs to make it possible to use that overflow flag in the generated code. It will always be more efficient than using arithmetic to check for overflow. Ian ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: New no-undefined-overflow branch 2009-03-06 19:53 ` Ian Lance Taylor @ 2009-03-06 20:42 ` Robert Dewar 0 siblings, 0 replies; 52+ messages in thread From: Robert Dewar @ 2009-03-06 20:42 UTC (permalink / raw) To: gcc Ian Lance Taylor wrote: > Richard Earnshaw <rearnsha@arm.com> writes: > >> On Fri, 2009-03-06 at 16:09 +0100, Paolo Bonzini wrote: >>> If this does not work, on ARM you can also hope for something like this: >>> >>> ADD R0, R1, R2 >>> XORS R0, R2, R3 >>> XORSMI R1, R2, R3 >>> SWIMI #trap >> On ARM you can just check for overflow directly... >> >> ADDS R0, R1, R2 >> SWIVS #trap > > This point should not be missed. Some processors (MIPS) have trapping > arithmetic instructions, but many processors have an overflow flag which > can be tested. Any useful design for -ftrapv needs to make it possible > to use that overflow flag in the generated code. It will always be more > efficient than using arithmetic to check for overflow. One needs to be careful on efficiency here, thinking about pipelines etc. for example, into looks nice on the x86, but can be expensive to use compared with explicit tests of the overflow flag. > > Ian ^ permalink raw reply [flat|nested] 52+ messages in thread
[parent not found: <1236364189.1654.2.camel@pc960.cambridge.arm.com>]
* Re: New no-undefined-overflow branch [not found] ` <1236364189.1654.2.camel@pc960.cambridge.arm.com> @ 2009-03-06 19:34 ` Paolo Bonzini 0 siblings, 0 replies; 52+ messages in thread From: Paolo Bonzini @ 2009-03-06 19:34 UTC (permalink / raw) To: Richard Earnshaw; +Cc: gcc >> If this does not work, on ARM you can also hope for something like this: >> >> ADD R0, R1, R2 >> XORS R0, R2, R3 >> XORSMI R1, R2, R3 >> SWIMI #trap > > On ARM you can just check for overflow directly... > > ADDS R0, R1, R2 > SWIVS #trap Of course, I was thinking explicitly of what happens with no MD support. Paolo ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: New no-undefined-overflow branch 2009-03-06 15:10 ` Richard Guenther 2009-03-06 15:23 ` Paolo Bonzini @ 2009-03-06 15:58 ` Ian Lance Taylor 1 sibling, 0 replies; 52+ messages in thread From: Ian Lance Taylor @ 2009-03-06 15:58 UTC (permalink / raw) To: gcc Richard Guenther <richard.guenther@gmail.com> writes: > I agree. Btw, for the addition case we generate > > leal (%rsi,%rdi), %eax > xorl %eax, %esi > xorl %eax, %edi > testl %edi, %esi > jns .L2 > .value 0x0b0f > .L2: > rep > ret > > which isn't too bad. But I think it would normally be faster to do something like movl %esi,%eax addl %edi,%eax jo .Ltrap ret .Ltrap: diediedie Ian ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: New no-undefined-overflow branch 2009-03-06 14:44 ` Paolo Bonzini 2009-03-06 15:10 ` Richard Guenther @ 2009-03-06 17:25 ` Joseph S. Myers 2009-03-06 17:52 ` Paolo Bonzini 2009-03-27 13:11 ` Eus 2 siblings, 1 reply; 52+ messages in thread From: Joseph S. Myers @ 2009-03-06 17:25 UTC (permalink / raw) To: Paolo Bonzini; +Cc: gcc On Fri, 6 Mar 2009, Paolo Bonzini wrote: > I don't think you need to introduce trapping tree codes. You can > introduce them directly in the front-end as Multiple front ends want the same thing. This is why it would be better to introduce the codes in GENERIC and have the language-independent gimplifier contain the code to lower them, even if they don't become part of GIMPLE. > (int)((long long) a * (long long) b) == a *nv b ? trap () : a *nv b This is not a solution for trapping multiplication in the widest supported type. I'm sure you can represent that with e.g. some checks for whether values are positive or negative and then dividing the largest / smallest values of the type by one of the arguments, but you might not want to lose symmetry like that until after constant propagation has had a chance to make one of the arguments constant (since if that happens you get a range check on the other argument). -- Joseph S. Myers joseph@codesourcery.com ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: New no-undefined-overflow branch 2009-03-06 17:25 ` Joseph S. Myers @ 2009-03-06 17:52 ` Paolo Bonzini 2009-03-06 18:36 ` Joseph S. Myers 0 siblings, 1 reply; 52+ messages in thread From: Paolo Bonzini @ 2009-03-06 17:52 UTC (permalink / raw) To: Joseph S. Myers; +Cc: gcc Joseph S. Myers wrote: > On Fri, 6 Mar 2009, Paolo Bonzini wrote: > >> I don't think you need to introduce trapping tree codes. You can >> introduce them directly in the front-end as > > Multiple front ends want the same thing. This is why it would be better > to introduce the codes in GENERIC and have the language-independent > gimplifier contain the code to lower them, even if they don't become part > of GIMPLE. I see your point. What I'm worried of, is that this codes would be tested more lightly and, until folding is a middle-end thing only, the risk of unwanted optimization on -ftrapv code would be high. You can have common code shared by front-ends. They could apply it at GENERICization time (Fortran, Ada) or directly while parsing (C, C++). >> (int)((long long) a * (long long) b) == a *nv b ? trap () : a *nv b > > This is not a solution for trapping multiplication in the widest supported > type. There's always range checking, I was pointing out optimization possibilities; the above one can be optimized like (h,l) = a*b if (h != l >> 31) trap (); // signed shift Paolo ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: New no-undefined-overflow branch 2009-03-06 17:52 ` Paolo Bonzini @ 2009-03-06 18:36 ` Joseph S. Myers 0 siblings, 0 replies; 52+ messages in thread From: Joseph S. Myers @ 2009-03-06 18:36 UTC (permalink / raw) To: Paolo Bonzini; +Cc: gcc On Fri, 6 Mar 2009, Paolo Bonzini wrote: > Joseph S. Myers wrote: > > On Fri, 6 Mar 2009, Paolo Bonzini wrote: > > > >> I don't think you need to introduce trapping tree codes. You can > >> introduce them directly in the front-end as > > > > Multiple front ends want the same thing. This is why it would be better > > to introduce the codes in GENERIC and have the language-independent > > gimplifier contain the code to lower them, even if they don't become part > > of GIMPLE. > > I see your point. What I'm worried of, is that this codes would be > tested more lightly and, until folding is a middle-end thing only, the > risk of unwanted optimization on -ftrapv code would be high. The risk (and fact) of unwanted optimization is there at present. Simply using new codes and not teaching fold about them will reduce the risk over the present state where many bits of code have to check flag_trapv. -- Joseph S. Myers joseph@codesourcery.com ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: New no-undefined-overflow branch 2009-03-06 14:44 ` Paolo Bonzini 2009-03-06 15:10 ` Richard Guenther 2009-03-06 17:25 ` Joseph S. Myers @ 2009-03-27 13:11 ` Eus 2009-03-27 18:07 ` Paolo Bonzini 2 siblings, 1 reply; 52+ messages in thread From: Eus @ 2009-03-27 13:11 UTC (permalink / raw) To: Paolo Bonzini; +Cc: gcc Hi Ho! On Fri, 2009-03-06 at 15:29 +0100, Paolo Bonzini wrote: > > So while trapping variants can certainly be introduced it looks like > > this task may be more difficult. > > I don't think you need to introduce trapping tree codes. You can > introduce them directly in the front-end as > > s = x +nv y > (((s ^ x) & (s ^ y)) < 0) ? trap () : s Could you please kindly explain a bit about it? Suppose s, x, and y are 8-bit signed integer. If x and y are -128, s = (-128) + (-128), which will overflow. Now if suppose s is 0, ((0 ^ -128) & (0 ^ -128)) == -128, which is less than zero and will trap. This is correct. But, if s is -128, ((-128 ^ -128) & (-128 ^ -128)) is zero and will not trap. This is incorrect, isn't this? In short, why should s be included in the calculation? Is it assumed that s is always zero? Thank you very much for the explanation. > d = x -nv y > (((d ^ x) & (x ^ y)) < 0) ? trap () : d > > (b == INT_MIN ? trap () : -nv b) > > (int)((long long) a * (long long) b) == a *nv b ? trap () : a *nv b > > Making sure they are compiled efficiently is another story, but > especially for the sake of LTO I think this is the way to go. > > Paolo -- Best regards, Eus (FSF member #4445) In this digital era, where computing technology is pervasive, your freedom depends on the software controlling those computing devices. Join free software movement today! It is free as in freedom, not as in free beer! Join: http://www.fsf.org/jf?referrer=4445 ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: New no-undefined-overflow branch 2009-03-27 13:11 ` Eus @ 2009-03-27 18:07 ` Paolo Bonzini 2009-03-31 4:55 ` Eus 0 siblings, 1 reply; 52+ messages in thread From: Paolo Bonzini @ 2009-03-27 18:07 UTC (permalink / raw) To: eus; +Cc: gcc Eus wrote: > Hi Ho! > > On Fri, 2009-03-06 at 15:29 +0100, Paolo Bonzini wrote: >>> So while trapping variants can certainly be introduced it looks like >>> this task may be more difficult. >> I don't think you need to introduce trapping tree codes. You can >> introduce them directly in the front-end as >> >> s = x +nv y >> (((s ^ x) & (s ^ y)) < 0) ? trap () : s > > Could you please kindly explain a bit about it? > > Suppose s, x, and y are 8-bit signed integer. > If x and y are -128, s = (-128) + (-128), which will overflow. > Now if suppose s is 0, ((0 ^ -128) & (0 ^ -128)) == -128, which is less > than zero and will trap. This is correct. > But, if s is -128, ((-128 ^ -128) & (-128 ^ -128)) is zero and will not > trap. This is incorrect, isn't this? Don't look at it as an arithmetic test, "< 0" is just a shortcut for "the most significant bit of the first operand is 1". So the formula's outcome only depends on the ANDs and XORs of the sign bits; it means, "trap if the result's sign is different from both addends' signs" or equivalently "check that the result's sign matches at least one addend's sign". You can also say "((s ^ x) & ~(x ^ y)) < 0", or equivalently "((s ^ y) & ~(x ^ y)) < 0". These mean "trap if the two addends have the same sign and the sum has a different sign". It is easy to prove that they are equivalent: s x y s^x s^y ~(x^y) equal? 0 0 0 0 0 1 yes (0) 0 0 1 0 1 0 yes (0) 0 1 0 1 0 0 yes (0) 0 1 1 1 1 1 yes (1) 1 0 0 1 1 1 yes (1) 1 0 1 1 0 0 yes (0) 1 1 0 0 1 0 yes (0) 1 1 1 0 0 1 yes (0) Paolo ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: New no-undefined-overflow branch 2009-03-27 18:07 ` Paolo Bonzini @ 2009-03-31 4:55 ` Eus 0 siblings, 0 replies; 52+ messages in thread From: Eus @ 2009-03-31 4:55 UTC (permalink / raw) To: Paolo Bonzini; +Cc: gcc Hi Ho! On Fri, 2009-03-27 at 13:49 +0100, Paolo Bonzini wrote: > Eus wrote: > > Hi Ho! > > > > On Fri, 2009-03-06 at 15:29 +0100, Paolo Bonzini wrote: > >>> So while trapping variants can certainly be introduced it looks like > >>> this task may be more difficult. > >> I don't think you need to introduce trapping tree codes. You can > >> introduce them directly in the front-end as > >> > >> s = x +nv y > >> (((s ^ x) & (s ^ y)) < 0) ? trap () : s > > > > Could you please kindly explain a bit about it? > > > > Suppose s, x, and y are 8-bit signed integer. > > If x and y are -128, s = (-128) + (-128), which will overflow. > > Now if suppose s is 0, ((0 ^ -128) & (0 ^ -128)) == -128, which is less > > than zero and will trap. This is correct. > > But, if s is -128, ((-128 ^ -128) & (-128 ^ -128)) is zero and will not > > trap. This is incorrect, isn't this? > > Don't look at it as an arithmetic test, "< 0" is just a shortcut for > "the most significant bit of the first operand is 1". So the formula's > outcome only depends on the ANDs and XORs of the sign bits; it means, > "trap if the result's sign is different from both addends' signs" or > equivalently "check that the result's sign matches at least one addend's > sign". Ah, I got it now! :-) Yes, so after I perform the addition, I check for the signs. It was my mistake to think that the check was done without performing the addition first. I mean, `s = x +nv y' == `(((s ^ x) & (s ^ y)) < 0) ? trap () : s'. But, they should be performed one after another: 1) s = x +nv y 2) (((s ^ x) & (s ^ y)) < 0) ? trap () : s > You can also say "((s ^ x) & ~(x ^ y)) < 0", or equivalently "((s ^ y) & > ~(x ^ y)) < 0". These mean "trap if the two addends have the same sign > and the sum has a different sign". It is easy to prove that they are > equivalent: > > s x y s^x s^y ~(x^y) equal? > 0 0 0 0 0 1 yes (0) > 0 0 1 0 1 0 yes (0) > 0 1 0 1 0 0 yes (0) > 0 1 1 1 1 1 yes (1) > 1 0 0 1 1 1 yes (1) > 1 0 1 1 0 0 yes (0) > 1 1 0 0 1 0 yes (0) > 1 1 1 0 0 1 yes (0) Yes, I really understand it now :-) Thank you very much for your kind explanation. I really appreciate it. > Paolo -- Best regards, Eus (FSF member #4445) In this digital era, where computing technology is pervasive, your freedom depends on the software controlling those computing devices. Join free software movement today! It is free as in freedom, not as in free beer! Join: http://www.fsf.org/jf?referrer=4445 ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: New no-undefined-overflow branch 2009-03-06 12:28 ` Richard Guenther 2009-03-06 13:54 ` Joseph S. Myers 2009-03-06 14:44 ` Paolo Bonzini @ 2009-03-06 17:21 ` Geert Bosch 2009-03-06 17:38 ` Joseph S. Myers 2 siblings, 1 reply; 52+ messages in thread From: Geert Bosch @ 2009-03-06 17:21 UTC (permalink / raw) To: Richard Guenther; +Cc: Joseph S. Myers, Diego Novillo, gcc On Mar 6, 2009, at 04:11, Richard Guenther wrote: > I didn't spend too much time thinking about the trapping variants > (well, I believe it isn't that important ;)). But in general we would > have to expand the non-NV variants via the trapping expanders > if flag_trapv was true (so yeah, combining TUs with different > flag_trapv > settings would be difficult again and it would ask for explicitly > encoding this variant in the IL ...). The non-NV variants have wrap-around semantics on the no-undefined- overflow branch, right? I'm not about to change that based on some global flag! :) I'm proposing something like: {PLUS,MINUS,MULT,NEGATE}_EXPR: - signed integer operation with wrap-around {PLUS,MINUS,MULT,NEGATE)NV_EXPR - signed integer operations known to not overflow {PLUS,MINUS,MULT,NEGATE)V_EXPR - signed integer operation with overflow check that traps, aborts or raises an exception on overflow > There is of course the problem that we have to be careful not to > introduce new traps via folding, a problem that doesn't exist with > the no-overflow variants (I can simply drop to the wrapping variants). > With for example (a -/v 10) +/v 10 would you want to preserve > the possibly trapping a -/v 10? For (a -/v 10) +/v (b -/v 10) do > we want to be careful to not introduce extra traps when simplifying > to (a +/v b) -/v 20? Indeed, there has to be a very clear boundary as we'd be changing semantics. The original -ftrapv implementation muddled that issue, something I absolutely want to avoid > So while trapping variants can certainly be introduced it looks like > this task may be more difficult. So lowering them early during > gimplification looks like a more reasonable plan IMHO. Right, that was my intention. Still, I'll need to add code to handle the new tree codes in fold(), right? -Geert ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: New no-undefined-overflow branch 2009-03-06 17:21 ` Geert Bosch @ 2009-03-06 17:38 ` Joseph S. Myers 2009-03-06 18:47 ` Paolo Bonzini 2009-03-06 22:46 ` Geert Bosch 0 siblings, 2 replies; 52+ messages in thread From: Joseph S. Myers @ 2009-03-06 17:38 UTC (permalink / raw) To: Geert Bosch; +Cc: Richard Guenther, Diego Novillo, gcc On Fri, 6 Mar 2009, Geert Bosch wrote: > > this task may be more difficult. So lowering them early during > > gimplification looks like a more reasonable plan IMHO. > > Right, that was my intention. Still, I'll need to add code to > handle the new tree codes in fold(), right? If you add new trapping codes to GENERIC I'd recommend *not* making fold() handle them. I don't think much folding is safe for the trapping codes when you want to avoid either removing or introducing traps. Either lower the codes in gimplification, or handle them explicitly in a few GIMPLE optimizations e.g. when constants are propagated in, but avoid general folding for them. Front ends should set TREE_SIDE_EFFECTS on trapping expressions so that fold knows it can't discard a subexpression (whose value doesn't matter to the value of the final expression) containing a trapping expression, e.g. 0 * (a +trap b) needs to evaluate (a +trap b) for its side effects. With this done, front ends generating trapping codes for -ftrapv and fold not trying to optimize the trapping codes, I'd hope fold and the rest of the language-independent compiler could stop caring about flag_trapv. -- Joseph S. Myers joseph@codesourcery.com ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: New no-undefined-overflow branch 2009-03-06 17:38 ` Joseph S. Myers @ 2009-03-06 18:47 ` Paolo Bonzini 2009-03-07 3:00 ` Joseph S. Myers 2009-03-06 22:46 ` Geert Bosch 1 sibling, 1 reply; 52+ messages in thread From: Paolo Bonzini @ 2009-03-06 18:47 UTC (permalink / raw) To: gcc, Joseph S. Myers, Geert Bosch, Richard Guenther, Diego Novillo Joseph S. Myers wrote: > On Fri, 6 Mar 2009, Geert Bosch wrote: > >>> this task may be more difficult. So lowering them early during >>> gimplification looks like a more reasonable plan IMHO. >> Right, that was my intention. Still, I'll need to add code to >> handle the new tree codes in fold(), right? > > If you add new trapping codes to GENERIC I'd recommend *not* making fold() > handle them. Constant folding should be done for them, though. > Either lower > the codes in gimplification, or handle them explicitly in a few GIMPLE > optimizations e.g. when constants are propagated in, but avoid general > folding for them. Definitely the former. Paolo ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: New no-undefined-overflow branch 2009-03-06 18:47 ` Paolo Bonzini @ 2009-03-07 3:00 ` Joseph S. Myers 0 siblings, 0 replies; 52+ messages in thread From: Joseph S. Myers @ 2009-03-07 3:00 UTC (permalink / raw) To: Paolo Bonzini; +Cc: gcc, Geert Bosch, Richard Guenther, Diego Novillo On Fri, 6 Mar 2009, Paolo Bonzini wrote: > Joseph S. Myers wrote: > > On Fri, 6 Mar 2009, Geert Bosch wrote: > > > >>> this task may be more difficult. So lowering them early during > >>> gimplification looks like a more reasonable plan IMHO. > >> Right, that was my intention. Still, I'll need to add code to > >> handle the new tree codes in fold(), right? > > > > If you add new trapping codes to GENERIC I'd recommend *not* making fold() > > handle them. > > Constant folding should be done for them, though. Yes, constant folding should be done, so that very tiny subset of fold needs to handle them. The vast bulk of fold that does other more complicated transformations should not, in my view. (There may be isolated transformations that are safe for these codes. For example, you could transform (x *trap 2) *trap 2 to x *trap 4 (this is a case where two traps can become one), but not (x *trap 2) *trap 0 to x *trap 0 or (x *trap -1) *trap -1 to x *trap 1. I think however it would be reasonable to replace -ftrapv without going through all fold transformations to enable the safe ones for the new tree codes; it's better to start by making it reliable and then look at cases where it can be made to generate better code. And I expect that a large proportion of what fold does is not safe for trapping codes.) -- Joseph S. Myers joseph@codesourcery.com ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: New no-undefined-overflow branch 2009-03-06 17:38 ` Joseph S. Myers 2009-03-06 18:47 ` Paolo Bonzini @ 2009-03-06 22:46 ` Geert Bosch 2009-03-07 1:21 ` Richard Guenther 1 sibling, 1 reply; 52+ messages in thread From: Geert Bosch @ 2009-03-06 22:46 UTC (permalink / raw) To: Joseph S. Myers; +Cc: Richard Guenther, Diego Novillo, gcc On Mar 6, 2009, at 12:22, Joseph S. Myers wrote: > If you add new trapping codes to GENERIC I'd recommend *not* making > fold() > handle them. I don't think much folding is safe for the trapping > codes > when you want to avoid either removing or introducing traps. Either > lower > the codes in gimplification, or handle them explicitly in a few GIMPLE > optimizations e.g. when constants are propagated in, but avoid general > folding for them. The point here is not to think in terms of the old -trapv and trapping instructions, but instead at the slightly higher level of a well-defined model of signed integer arithmetic. That is why signed integer overflow checking and the no-undefined-overflow branch are closely related. There are essentially two models to evaluate a signed integer expression. The one Ada uses is that a check may be omitted if the value of the expression, in absence of the check, would have no effect on the external interactions of the program. > An implementation need not always raise an exception when a language- > defined check > fails. Instead, the operation that failed the check can simply > yield an undefined > result. The exception need be raised by the implementation only if, > in the absence > of raising it, the value of this undefined result would have some > effect on the > external interactions of the program. In determining this, the > implementation shall > not presume that an undefined result has a value that belongs to > its subtype, > nor even to the base range of its type, if scalar. Having removed > the raise of > the exception, the canonical semantics will in general allow the > implementation > to omit the code for the check, and some or all of the operation > itself. The other one is the one you suggest: > Front ends should set TREE_SIDE_EFFECTS on trapping expressions so > that > fold knows it can't discard a subexpression (whose value doesn't > matter to > the value of the final expression) containing a trapping expression, > e.g. > 0 * (a +trap b) needs to evaluate (a +trap b) for its side effects. > With > this done, front ends generating trapping codes for -ftrapv and fold > not > trying to optimize the trapping codes, I'd hope fold and the rest of > the > language-independent compiler could stop caring about flag_trapv. Setting the TREE_SIDE_EFFECTS seriously limits optimizations. Also, as quality of implementation issue, while an expression may have an undefined result, if that result is not used, removing the entire computation is generally preferable over raising an exception. Arguments can be made for and against both models, so probably we could make setting of TREE_SIDE_EFFECTS optional. -Geert ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: New no-undefined-overflow branch 2009-03-06 22:46 ` Geert Bosch @ 2009-03-07 1:21 ` Richard Guenther 0 siblings, 0 replies; 52+ messages in thread From: Richard Guenther @ 2009-03-07 1:21 UTC (permalink / raw) To: Geert Bosch; +Cc: Joseph S.Myers, Diego Novillo, gcc On Fri, 6 Mar 2009, Geert Bosch wrote: > > On Mar 6, 2009, at 12:22, Joseph S. Myers wrote: > > If you add new trapping codes to GENERIC I'd recommend *not* making fold() > > handle them. I don't think much folding is safe for the trapping codes > > when you want to avoid either removing or introducing traps. Either lower > > the codes in gimplification, or handle them explicitly in a few GIMPLE > > optimizations e.g. when constants are propagated in, but avoid general > > folding for them. > > The point here is not to think in terms of the old -trapv and trapping > instructions, but instead at the slightly higher level of a well-defined > model of signed integer arithmetic. That is why signed integer overflow > checking and the no-undefined-overflow branch are closely related. > > There are essentially two models to evaluate a signed integer expression. > The one Ada uses is that a check may be omitted if the value of the > expression, > in absence of the check, would have no effect on the external interactions > of the program. > > > An implementation need not always raise an exception when a language-defined > > check > > fails. Instead, the operation that failed the check can simply yield an > > undefined > > result. The exception need be raised by the implementation only if, in the > > absence > > of raising it, the value of this undefined result would have some effect on > > the > > external interactions of the program. In determining this, the > > implementation shall > > not presume that an undefined result has a value that belongs to its > > subtype, > > nor even to the base range of its type, if scalar. Having removed the raise > > of > > the exception, the canonical semantics will in general allow the > > implementation > > to omit the code for the check, and some or all of the operation itself. > > The other one is the one you suggest: > > Front ends should set TREE_SIDE_EFFECTS on trapping expressions so that > > fold knows it can't discard a subexpression (whose value doesn't matter to > > the value of the final expression) containing a trapping expression, e.g. > > 0 * (a +trap b) needs to evaluate (a +trap b) for its side effects. With > > this done, front ends generating trapping codes for -ftrapv and fold not > > trying to optimize the trapping codes, I'd hope fold and the rest of the > > language-independent compiler could stop caring about flag_trapv. > > Setting the TREE_SIDE_EFFECTS seriously limits optimizations. Also, as quality > of implementation issue, while an expression may have an undefined result, > if that result is not used, removing the entire computation is generally > preferable over raising an exception. Arguments can be made for and against > both > models, so probably we could make setting of TREE_SIDE_EFFECTS optional. Yes. If we introduce new GENERIC tree codes for overflow checking operations that is certainly useful. Though once you lowered GENERIC to explicit checks and trapping (or raising exceptions) then this side-effect is explicit and we probably won't remove them anymore. I played with the idea to just implement a -ftrapv substitute by lowering non-NV operation variants during gimplification, but of course new GENERIC tree codes are more flexible. You can use no-undefined-overflow branch to commit patches once you have them. Thanks, Richard. ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: New no-undefined-overflow branch 2009-02-27 23:07 ` Diego Novillo 2009-02-28 0:04 ` Joseph S. Myers @ 2009-02-28 10:25 ` Richard Guenther 1 sibling, 0 replies; 52+ messages in thread From: Richard Guenther @ 2009-02-28 10:25 UTC (permalink / raw) To: Diego Novillo; +Cc: gcc [-- Attachment #1: Type: TEXT/PLAIN, Size: 1850 bytes --] On Fri, 27 Feb 2009, Diego Novillo wrote: > On Thu, Feb 26, 2009 at 06:05, Richard Guenther <rguenther@suse.de> wrote: > > > > Â There shall be no construct in the GIMPLE IL that invokes > > Â undefined behavior. > > Excellent! Thanks for starting this branch. > > > Thus, from now on integer overflow is defined and will wrap with > > the usual twos-complement semantics. Â Thus, for the middle-end > > -fwrapv is always in effect. Â This extends to pointer overflow. > > You mean that -fwrapv is always in effect for the *V_EXPR operations, > right? PLUS_EXPR will always be defined as not overflowing. It's up > to the front end to emit one or the other. -fwrapv is always in effect for the regular variants (PLUS_EXPR). It is also in effect for the *NV_EXPR variants (yeah, the *V naming was confusing...), but we know that overflow doesn't happen for them so it doesn't matter what semantics overflow has - thus you can as well say -fno-wrapv is in effect for them - but! - the *NV_EXPR variants also exist for unsigned arithmetic just specifying the same restriction - no overflow. > How should we define merging two different TUs compiled with different > -fwrapv settings? This is something that may be common in the context > of LTO: > > $ gcc -flto -fwrapv f1.c > $ gcc -flto f2.c > $ gcc -o f -O2 f1.o f2.o > > We will be reading IL containing both overflow and non-overflow > operations. We should define the combination rules for them. There will be no use of flag_wrapv left after the frontends finished, everything is encoded in the IL. So combination of -fwrapv and -fno-wrapv CUs is just "combine them". > > Naming suggestions welcome, at the moment I consider NEGATEV_EXPR > > (thus appending V to the first part). > > As good a convention as any. As was noted, NEGATENV_EXPR is less confusing. Richard. ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: New no-undefined-overflow branch
@ 2009-02-27 19:58 Jay Foad
2009-02-27 22:41 ` Richard Guenther
2009-02-27 23:00 ` Ian Lance Taylor
0 siblings, 2 replies; 52+ messages in thread
From: Jay Foad @ 2009-02-27 19:58 UTC (permalink / raw)
To: rguenther; +Cc: gcc
> To support languages that have undefined semantics on overflowing
> operations the middle-end gets new unary and binary operators
> that implicitly encode value-range information about their operands
> noting that the operation does not overflow. These does-not-overflow
> operators transform the undefined behavior into a valid assumption
> and thus make the GIMPLE IL fully defined.
From an optimisation pass's point of view, what's the difference between:
1. a PLUS expression that gives an undefined result on overflow, and
2. a PLUS expression with a guarantee that the result won't overflow.
I can't see how they will be handled any differently in practice.
Thanks,
Jay.
^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: New no-undefined-overflow branch 2009-02-27 19:58 Jay Foad @ 2009-02-27 22:41 ` Richard Guenther 2009-02-27 23:00 ` Ian Lance Taylor 1 sibling, 0 replies; 52+ messages in thread From: Richard Guenther @ 2009-02-27 22:41 UTC (permalink / raw) To: Jay Foad; +Cc: gcc On Fri, 27 Feb 2009, Jay Foad wrote: > > To support languages that have undefined semantics on overflowing > > operations the middle-end gets new unary and binary operators > > that implicitly encode value-range information about their operands > > noting that the operation does not overflow. These does-not-overflow > > operators transform the undefined behavior into a valid assumption > > and thus make the GIMPLE IL fully defined. > > From an optimisation pass's point of view, what's the difference between: > > 1. a PLUS expression that gives an undefined result on overflow, and > 2. a PLUS expression with a guarantee that the result won't overflow. > > I can't see how they will be handled any differently in practice. There is no difference in practice other than that we do a consistent interpretation of undefined. Note that the whole point of the branch is to make overflow behavior explicit per operation, not globally set per flag and for signed types only. Richard. ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: New no-undefined-overflow branch 2009-02-27 19:58 Jay Foad 2009-02-27 22:41 ` Richard Guenther @ 2009-02-27 23:00 ` Ian Lance Taylor 1 sibling, 0 replies; 52+ messages in thread From: Ian Lance Taylor @ 2009-02-27 23:00 UTC (permalink / raw) To: Jay Foad; +Cc: rguenther, gcc Jay Foad <jay.foad@gmail.com> writes: From an optimisation pass's point of view, what's the difference between: > > 1. a PLUS expression that gives an undefined result on overflow, and > 2. a PLUS expression with a guarantee that the result won't overflow. > > I can't see how they will be handled any differently in practice. I don't think that's the right way to describe the alternatives. The choices are: 1. A PLUS expression which gives an undefined result on overflow; 2. A PLUS expression which wraps on overflow. Those two cases can be optimized differently. For example, (a +undef-overflow C1) < C2 where C1 and C2 are constants, may be optimized to a < C2 - C1 That is not true of (a +wrapping-overflow C1) < C2 Ian ^ permalink raw reply [flat|nested] 52+ messages in thread
end of thread, other threads:[~2009-03-31 3:05 UTC | newest] Thread overview: 52+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2009-02-26 11:05 New no-undefined-overflow branch Richard Guenther 2009-02-26 11:28 ` Manuel López-Ibáñez 2009-02-26 12:02 ` Richard Guenther 2009-02-26 13:07 ` Joseph S. Myers 2009-02-26 13:22 ` Richard Guenther 2009-02-26 14:16 ` Dave Korn 2009-02-26 23:08 ` Zdenek Dvorak 2009-02-27 10:16 ` Richard Guenther 2009-02-27 12:07 ` Dave Korn 2009-02-27 13:17 ` Richard Guenther 2009-02-27 13:35 ` Dave Korn 2009-02-27 13:45 ` Richard Guenther 2009-02-27 18:07 ` Zdenek Dvorak 2009-02-27 17:58 ` Zdenek Dvorak 2009-02-27 18:17 ` Richard Guenther 2009-02-27 23:07 ` Diego Novillo 2009-02-28 0:04 ` Joseph S. Myers 2009-02-28 10:29 ` Richard Guenther 2009-03-05 17:24 ` Geert Bosch 2009-03-06 12:28 ` Richard Guenther 2009-03-06 13:54 ` Joseph S. Myers 2009-03-06 14:10 ` Richard Guenther 2009-03-06 14:29 ` Joseph S. Myers 2009-03-06 17:17 ` Geert Bosch 2009-03-06 17:42 ` Joseph S. Myers 2009-03-06 14:44 ` Paolo Bonzini 2009-03-06 15:10 ` Richard Guenther 2009-03-06 15:23 ` Paolo Bonzini 2009-03-06 15:39 ` Paolo Bonzini 2009-03-06 16:19 ` Richard Guenther 2009-03-06 17:23 ` Joseph S. Myers 2009-03-06 18:54 ` Richard Earnshaw 2009-03-06 19:53 ` Ian Lance Taylor 2009-03-06 20:42 ` Robert Dewar [not found] ` <1236364189.1654.2.camel@pc960.cambridge.arm.com> 2009-03-06 19:34 ` Paolo Bonzini 2009-03-06 15:58 ` Ian Lance Taylor 2009-03-06 17:25 ` Joseph S. Myers 2009-03-06 17:52 ` Paolo Bonzini 2009-03-06 18:36 ` Joseph S. Myers 2009-03-27 13:11 ` Eus 2009-03-27 18:07 ` Paolo Bonzini 2009-03-31 4:55 ` Eus 2009-03-06 17:21 ` Geert Bosch 2009-03-06 17:38 ` Joseph S. Myers 2009-03-06 18:47 ` Paolo Bonzini 2009-03-07 3:00 ` Joseph S. Myers 2009-03-06 22:46 ` Geert Bosch 2009-03-07 1:21 ` Richard Guenther 2009-02-28 10:25 ` Richard Guenther 2009-02-27 19:58 Jay Foad 2009-02-27 22:41 ` Richard Guenther 2009-02-27 23:00 ` Ian Lance Taylor
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).