* types in GIMPLE IR @ 2023-06-28 9:15 Krister Walfridsson 2023-06-28 15:46 ` Michael Matz 0 siblings, 1 reply; 8+ messages in thread From: Krister Walfridsson @ 2023-06-28 9:15 UTC (permalink / raw) To: gcc I have some random questions concerning types in the GIMPLE IR that I would appreciate some clarification on. Type safety ----------- Some transformations treat 1-bit types as a synonym of _Bool and mix the types in expressions, such as: <unnamed-unsigned:1> _2; _Bool _3; _Bool _4; ... _4 = _2 ^ _3; and similarly mixing _Bool and enum enum E:bool { E0, E1 }; in one operation. I had expected this to be invalid... What are the type safety rules in the GIMPLE IR? Somewhat related, gcc.c-torture/compile/pr96796.c performs a VIEW_CONVERT_EXPR from struct S1 { long f3; char f4; } g_3_4; to an int p_51_9 = VIEW_CONVERT_EXPR<int>(g_3_4); That must be wrong? Semantics of <signed-boolean:32> -------------------------------- "Wide" Booleans, such as <signed-boolean:32>, seems to allow more values than 0 and 1. For example, I've seen some IR operations like: _66 = _16 ? _Literal (<signed-boolean:32>) -1 : 0; But I guess there must be some semantic difference between <signed-boolean:32> and a 32-bit int, otherwise the wide Boolean type would not be needed... So what are the difference? /Krister ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: types in GIMPLE IR 2023-06-28 9:15 types in GIMPLE IR Krister Walfridsson @ 2023-06-28 15:46 ` Michael Matz 2023-06-29 6:21 ` Richard Biener 0 siblings, 1 reply; 8+ messages in thread From: Michael Matz @ 2023-06-28 15:46 UTC (permalink / raw) To: Krister Walfridsson; +Cc: gcc Hello, On Wed, 28 Jun 2023, Krister Walfridsson via Gcc wrote: > Type safety > ----------- > Some transformations treat 1-bit types as a synonym of _Bool and mix the types > in expressions, such as: > > <unnamed-unsigned:1> _2; > _Bool _3; > _Bool _4; > ... > _4 = _2 ^ _3; > > and similarly mixing _Bool and enum > > enum E:bool { E0, E1 }; > > in one operation. > > I had expected this to be invalid... What are the type safety rules in the > GIMPLE IR? Type safety in gimple is defined in terms of type compatiblity, with _many_ exceptions for specific types of statements. Generally stuff is verified in verify_gimple_seq., in this case of a binary assign statement in verify_gimple_assign_binary. As you can see there the normal rules for bread-and-butter binary assigns is simply that RHS, LHS1 and LHS2 must all be type-compatible. T1 and T2 are compatible if conversions from T1 to T2 are useless and conversion from T2 to T1 are also useless. (types_compatible_p) The meat for that is all in gimple-expr.cc:useless_type_conversion_p. For this specific case again we have: /* Preserve conversions to/from BOOLEAN_TYPE if types are not of precision one. */ if (((TREE_CODE (inner_type) == BOOLEAN_TYPE) != (TREE_CODE (outer_type) == BOOLEAN_TYPE)) && TYPE_PRECISION (outer_type) != 1) return false; So, yes, booleans and 1-bit types can be compatible (under certain other conditions, see the function). > Somewhat related, gcc.c-torture/compile/pr96796.c performs a VIEW_CONVERT_EXPR > from > > struct S1 { > long f3; > char f4; > } g_3_4; > > to an int > > p_51_9 = VIEW_CONVERT_EXPR<int>(g_3_4); > > That must be wrong? VIEW_CONVERT_EXPR is _very_ generous. See verify_types_in_gimple_reference: if (TREE_CODE (expr) == VIEW_CONVERT_EXPR) { /* For VIEW_CONVERT_EXPRs which are allowed here too, we only check that their operand is not a register an invariant when requiring an lvalue (this usually means there is a SRA or IPA-SRA bug). Otherwise there is nothing to verify, gross mismatches at most invoke undefined behavior. */ if (require_lvalue && (is_gimple_reg (op) || is_gimple_min_invariant (op))) { error ("conversion of %qs on the left hand side of %qs", get_tree_code_name (TREE_CODE (op)), code_name); debug_generic_stmt (expr); return true; } else if (is_gimple_reg (op) && TYPE_SIZE (TREE_TYPE (expr)) != TYPE_SIZE (TREE_TYPE (op))) { error ("conversion of register to a different size in %qs", code_name); debug_generic_stmt (expr); return true; } } Here the operand is not a register (but a global memory object), so everything goes. It should be said that over the years gimples type system became stricter and stricter, but it started as mostly everything-goes, so making it stricter is a bumpy road that isn't fully travelled yet, because checking types often results in testcase regressions :-) > Semantics of <signed-boolean:32> > -------------------------------- > "Wide" Booleans, such as <signed-boolean:32>, seems to allow more values than > 0 and 1. For example, I've seen some IR operations like: > > _66 = _16 ? _Literal (<signed-boolean:32>) -1 : 0; > > But I guess there must be some semantic difference between > <signed-boolean:32> and a 32-bit int, otherwise the wide Boolean type > would not be needed... So what are the difference? See above, normally conversions to booleans that are wider than 1 bit are _not_ useless (because they require booleanization to true/false). In the above case the not-useless cast is within a COND_EXPR, so it's quite possible that the gimplifier didn't look hard enough to split this out into a proper conversion statement. (The verifier doesn't look inside the expressions of the COND_EXPR, so also doesn't catch this one) If that turns out to be true and the above still happens when -1 is replaced by (say) 42, then it might be possible to construct a wrong-code testcase based on the fact that _66 as boolean should contain only two observable values (true/false), but could then contain 42. OTOH, it might also not be possible to create such testcase, namely when GCC is internally too conservative in handling wide bools :-) In that case we probably have a missed optimization somewhere, which when implemented would enable construction of such wrong-code testcase ;) (I'm saying that -1 should be replaced by something else for a wrong-code testcase, because -1 is special and could justifieably be special-cased in GCC: -1 is the proper arithmetic value for a signed boolean that is "true"). Ciao, Michael. ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: types in GIMPLE IR 2023-06-28 15:46 ` Michael Matz @ 2023-06-29 6:21 ` Richard Biener 2023-06-29 12:06 ` Krister Walfridsson 0 siblings, 1 reply; 8+ messages in thread From: Richard Biener @ 2023-06-29 6:21 UTC (permalink / raw) To: Michael Matz; +Cc: Krister Walfridsson, gcc On Wed, Jun 28, 2023 at 5:47 PM Michael Matz via Gcc <gcc@gcc.gnu.org> wrote: > > Hello, > > On Wed, 28 Jun 2023, Krister Walfridsson via Gcc wrote: > > > Type safety > > ----------- > > Some transformations treat 1-bit types as a synonym of _Bool and mix the types > > in expressions, such as: > > > > <unnamed-unsigned:1> _2; > > _Bool _3; > > _Bool _4; > > ... > > _4 = _2 ^ _3; > > > > and similarly mixing _Bool and enum > > > > enum E:bool { E0, E1 }; > > > > in one operation. > > > > I had expected this to be invalid... What are the type safety rules in the > > GIMPLE IR? > > Type safety in gimple is defined in terms of type compatiblity, with > _many_ exceptions for specific types of statements. Generally stuff is > verified in verify_gimple_seq., in this case of a binary assign statement > in verify_gimple_assign_binary. As you can see there the normal rules for > bread-and-butter binary assigns is simply that RHS, LHS1 and LHS2 must > all be type-compatible. > > T1 and T2 are compatible if conversions from T1 to T2 are useless and > conversion from T2 to T1 are also useless. (types_compatible_p) The meat > for that is all in gimple-expr.cc:useless_type_conversion_p. For this > specific case again we have: > > /* Preserve conversions to/from BOOLEAN_TYPE if types are not > of precision one. */ > if (((TREE_CODE (inner_type) == BOOLEAN_TYPE) > != (TREE_CODE (outer_type) == BOOLEAN_TYPE)) > && TYPE_PRECISION (outer_type) != 1) > return false; > > So, yes, booleans and 1-bit types can be compatible (under certain other > conditions, see the function). > > > Somewhat related, gcc.c-torture/compile/pr96796.c performs a VIEW_CONVERT_EXPR > > from > > > > struct S1 { > > long f3; > > char f4; > > } g_3_4; > > > > to an int > > > > p_51_9 = VIEW_CONVERT_EXPR<int>(g_3_4); > > > > That must be wrong? > > VIEW_CONVERT_EXPR is _very_ generous. See > verify_types_in_gimple_reference: Yep. In general these cases should rather use a BIT_FIELD_REF to select a same sized subpart and only then do the rvalue conversion. But as Micha says below making the IL stricter isn't an easy task. > if (TREE_CODE (expr) == VIEW_CONVERT_EXPR) > { > /* For VIEW_CONVERT_EXPRs which are allowed here too, we only check > that their operand is not a register an invariant when > requiring an lvalue (this usually means there is a SRA or IPA-SRA > bug). Otherwise there is nothing to verify, gross mismatches at > most invoke undefined behavior. */ > if (require_lvalue > && (is_gimple_reg (op) || is_gimple_min_invariant (op))) > { > error ("conversion of %qs on the left hand side of %qs", > get_tree_code_name (TREE_CODE (op)), code_name); > debug_generic_stmt (expr); > return true; > } > else if (is_gimple_reg (op) > && TYPE_SIZE (TREE_TYPE (expr)) != TYPE_SIZE (TREE_TYPE (op))) > { > error ("conversion of register to a different size in %qs", > code_name); > debug_generic_stmt (expr); > return true; > } > } > > Here the operand is not a register (but a global memory object), so > everything goes. > > It should be said that over the years gimples type system became stricter > and stricter, but it started as mostly everything-goes, so making it > stricter is a bumpy road that isn't fully travelled yet, because checking > types often results in testcase regressions :-) > > > Semantics of <signed-boolean:32> > > -------------------------------- > > "Wide" Booleans, such as <signed-boolean:32>, seems to allow more values than > > 0 and 1. For example, I've seen some IR operations like: > > > > _66 = _16 ? _Literal (<signed-boolean:32>) -1 : 0; > > > > But I guess there must be some semantic difference between > > <signed-boolean:32> and a 32-bit int, otherwise the wide Boolean type > > would not be needed... So what are the difference? > > See above, normally conversions to booleans that are wider than 1 bit are > _not_ useless (because they require booleanization to true/false). In the > above case the not-useless cast is within a COND_EXPR, so it's quite > possible that the gimplifier didn't look hard enough to split this out > into a proper conversion statement. (The verifier doesn't look inside > the expressions of the COND_EXPR, so also doesn't catch this one) Note the above is GIMPLE FE syntax for a constant of a specific type, a regular dump would just show _16 ? -1 : 0; The thing with signed bools is that the two relevant values are -1 (true) and 0 (false), those are used for vector bool components where we also need them to be of wider type (32bits in this case). Richard. > If that turns out to be true and the above still happens when -1 is > replaced by (say) 42, then it might be possible to construct a > wrong-code testcase based on the fact that _66 as boolean should contain > only two observable values (true/false), but could then contain 42. OTOH, > it might also not be possible to create such testcase, namely when GCC is > internally too conservative in handling wide bools :-) In that case we > probably have a missed optimization somewhere, which when implemented > would enable construction of such wrong-code testcase ;) > > (I'm saying that -1 should be replaced by something else for a wrong-code > testcase, because -1 is special and could justifieably be special-cased in > GCC: -1 is the proper arithmetic value for a signed boolean that is > "true"). > > > Ciao, > Michael. ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: types in GIMPLE IR 2023-06-29 6:21 ` Richard Biener @ 2023-06-29 12:06 ` Krister Walfridsson 2023-06-29 12:24 ` Michael Matz 2023-06-29 13:19 ` Richard Biener 0 siblings, 2 replies; 8+ messages in thread From: Krister Walfridsson @ 2023-06-29 12:06 UTC (permalink / raw) To: Richard Biener; +Cc: Michael Matz, Krister Walfridsson, gcc On Thu, 29 Jun 2023, Richard Biener wrote: > The thing with signed bools is that the two relevant values are -1 (true) > and 0 (false), those are used for vector bool components where we also > need them to be of wider type (32bits in this case). My main confusion comes from seeing IR doing arithmetic such as <signed-boolean:32> _127; <signed-boolean:32> _169; ... _169 = _127 + -1; or <signed-boolean:32> _127; <signed-boolean:32> _169; ... _169 = -_127; and it was unclear to me what kind of arithmetic is allowed. I have now verified that all cases seems to be just one operation of this form (where _127 has the value 0 or 1), so it cannot construct values such as 42. But the wide signed Boolean can have the three different values 1, 0, and -1, which I still think is at least one too many. :) I'll update my tool to complain if the value is outside the range [-1, 1]. /Krister ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: types in GIMPLE IR 2023-06-29 12:06 ` Krister Walfridsson @ 2023-06-29 12:24 ` Michael Matz 2023-06-29 13:19 ` Richard Biener 1 sibling, 0 replies; 8+ messages in thread From: Michael Matz @ 2023-06-29 12:24 UTC (permalink / raw) To: Krister Walfridsson; +Cc: Richard Biener, gcc Hello, On Thu, 29 Jun 2023, Krister Walfridsson wrote: > > The thing with signed bools is that the two relevant values are -1 (true) > > and 0 (false), those are used for vector bool components where we also > > need them to be of wider type (32bits in this case). > > My main confusion comes from seeing IR doing arithmetic such as > > <signed-boolean:32> _127; > <signed-boolean:32> _169; > ... > _169 = _127 + -1; > > or > > <signed-boolean:32> _127; > <signed-boolean:32> _169; > ... > _169 = -_127; > > and it was unclear to me what kind of arithmetic is allowed. > > I have now verified that all cases seems to be just one operation of this form > (where _127 has the value 0 or 1), so it cannot construct values such as 42. > But the wide signed Boolean can have the three different values 1, 0, and -1, > which I still think is at least one too many. :) It definitely is. For signed bool it should be -1 and 0, for unsigned bool 1 and 0. And of course, arithmetic on bools is always dubious, that should all be logical operations. Modulo-arithmetic (mod 2) could be made to work, but then we would have to give up the idea of signed bools and always use conversions to signed int to get a bitmaks of all-ones. And as mod-2-arithmetic is equivalent to logical ops it seems a bit futile to go that way. Of course, enforcing this all might lead to a surprising heap of errors, but one has to start somewhere, so ... > I'll update my tool to complain if the value is outside the range [-1, > 1]. ... maybe not do that, at least optionally, that maybe somewhen someone can look into fixing that all up? :-) -fdubious-bools? Ciao, Michael. ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: types in GIMPLE IR 2023-06-29 12:06 ` Krister Walfridsson 2023-06-29 12:24 ` Michael Matz @ 2023-06-29 13:19 ` Richard Biener 2023-06-29 19:08 ` Krister Walfridsson 1 sibling, 1 reply; 8+ messages in thread From: Richard Biener @ 2023-06-29 13:19 UTC (permalink / raw) To: Krister Walfridsson; +Cc: Michael Matz, gcc On Thu, Jun 29, 2023 at 2:06 PM Krister Walfridsson <krister.walfridsson@gmail.com> wrote: > > On Thu, 29 Jun 2023, Richard Biener wrote: > > > The thing with signed bools is that the two relevant values are -1 (true) > > and 0 (false), those are used for vector bool components where we also > > need them to be of wider type (32bits in this case). > > My main confusion comes from seeing IR doing arithmetic such as > > <signed-boolean:32> _127; > <signed-boolean:32> _169; > ... > _169 = _127 + -1; > > or > > <signed-boolean:32> _127; > <signed-boolean:32> _169; > ... > _169 = -_127; > > and it was unclear to me what kind of arithmetic is allowed. IIRC we have some simplification rules that turn bit operations into arithmetics. Arithmetic is allowed if it keeps the values inside [-1,0] for signed bools or [0, 1] for unsigned bools. > I have now verified that all cases seems to be just one operation of this > form (where _127 has the value 0 or 1), so it cannot construct values > such as 42. But the wide signed Boolean can have the three different > values 1, 0, and -1, which I still think is at least one too many. :) Yeah, I'd be interested in a testcase that shows this behavior. Richard. > I'll update my tool to complain if the value is outside the range [-1, 1]. > > /Krister ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: types in GIMPLE IR 2023-06-29 13:19 ` Richard Biener @ 2023-06-29 19:08 ` Krister Walfridsson 2023-06-29 19:32 ` Andrew Pinski 0 siblings, 1 reply; 8+ messages in thread From: Krister Walfridsson @ 2023-06-29 19:08 UTC (permalink / raw) To: Richard Biener; +Cc: Krister Walfridsson, Michael Matz, gcc On Thu, 29 Jun 2023, Richard Biener wrote: > IIRC we have some simplification rules that turn bit operations into > arithmetics. Arithmetic is allowed if it keeps the values inside > [-1,0] for signed bools or [0, 1] for unsigned bools. > >> I have now verified that all cases seems to be just one operation of this >> form (where _127 has the value 0 or 1), so it cannot construct values >> such as 42. But the wide signed Boolean can have the three different >> values 1, 0, and -1, which I still think is at least one too many. :) > > Yeah, I'd be interested in a testcase that shows this behavior. I created PR 110487 with one example. >> I'll update my tool to complain if the value is outside the range [-1, 1]. It is likely that all issues I have seen so far are due to PR 110487, so I'll keep the current behavior that complains if the value is outside the range [-1, 0]. /Krister ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: types in GIMPLE IR 2023-06-29 19:08 ` Krister Walfridsson @ 2023-06-29 19:32 ` Andrew Pinski 0 siblings, 0 replies; 8+ messages in thread From: Andrew Pinski @ 2023-06-29 19:32 UTC (permalink / raw) To: Krister Walfridsson; +Cc: Richard Biener, Michael Matz, gcc On Thu, Jun 29, 2023 at 12:10 PM Krister Walfridsson via Gcc <gcc@gcc.gnu.org> wrote: > > On Thu, 29 Jun 2023, Richard Biener wrote: > > > IIRC we have some simplification rules that turn bit operations into > > arithmetics. Arithmetic is allowed if it keeps the values inside > > [-1,0] for signed bools or [0, 1] for unsigned bools. > > > >> I have now verified that all cases seems to be just one operation of this > >> form (where _127 has the value 0 or 1), so it cannot construct values > >> such as 42. But the wide signed Boolean can have the three different > >> values 1, 0, and -1, which I still think is at least one too many. :) > > > > Yeah, I'd be interested in a testcase that shows this behavior. > > I created PR 110487 with one example. > > > >> I'll update my tool to complain if the value is outside the range [-1, 1]. > > It is likely that all issues I have seen so far are due to PR 110487, so > I'll keep the current behavior that complains if the value is outside the > range [-1, 0]. Yes there are many similar to this all over GCC's folding. In this case checking TYPE_PRECISION as described in the match.pd is not even the right approach. The whole TYPE_PRECISION on boolean types is definitely a big can of worms. https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102622 is related but that was signed boolean:1. https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106053 is another related case. For this weekend, I am going to audit some of the match patterns for these issues. > > /Krister ^ permalink raw reply [flat|nested] 8+ messages in thread
end of thread, other threads:[~2023-06-29 19:32 UTC | newest] Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2023-06-28 9:15 types in GIMPLE IR Krister Walfridsson 2023-06-28 15:46 ` Michael Matz 2023-06-29 6:21 ` Richard Biener 2023-06-29 12:06 ` Krister Walfridsson 2023-06-29 12:24 ` Michael Matz 2023-06-29 13:19 ` Richard Biener 2023-06-29 19:08 ` Krister Walfridsson 2023-06-29 19:32 ` Andrew Pinski
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).