* types in GIMPLE IR
@ 2023-06-28 9:15 Krister Walfridsson
2023-06-28 15:46 ` Michael Matz
0 siblings, 1 reply; 8+ messages in thread
From: Krister Walfridsson @ 2023-06-28 9:15 UTC
To: gcc
I have some random questions concerning types in the GIMPLE IR that I
would appreciate some clarification on.
Type safety
-----------
Some transformations treat 1-bit types as a synonym of _Bool and mix the
types in expressions, such as:
<unnamed-unsigned:1> _2;
_Bool _3;
_Bool _4;
...
_4 = _2 ^ _3;
and similarly mixing _Bool and enum
enum E:bool { E0, E1 };
in one operation.
I had expected this to be invalid... What are the type safety rules in the
GIMPLE IR?
Somewhat related, gcc.c-torture/compile/pr96796.c performs a
VIEW_CONVERT_EXPR from
struct S1 {
long f3;
char f4;
} g_3_4;
to an int
p_51_9 = VIEW_CONVERT_EXPR<int>(g_3_4);
That must be wrong?
Semantics of <signed-boolean:32>
--------------------------------
"Wide" Booleans, such as <signed-boolean:32>, seem to allow more values
than 0 and 1. For example, I've seen some IR operations like:
_66 = _16 ? _Literal (<signed-boolean:32>) -1 : 0;
But I guess there must be some semantic difference between
<signed-boolean:32> and a 32-bit int, otherwise the wide Boolean type
would not be needed... So what is the difference?
/Krister
* Re: types in GIMPLE IR
2023-06-28 9:15 types in GIMPLE IR Krister Walfridsson
@ 2023-06-28 15:46 ` Michael Matz
2023-06-29 6:21 ` Richard Biener
0 siblings, 1 reply; 8+ messages in thread
From: Michael Matz @ 2023-06-28 15:46 UTC
To: Krister Walfridsson; +Cc: gcc
Hello,
On Wed, 28 Jun 2023, Krister Walfridsson via Gcc wrote:
> Type safety
> -----------
> Some transformations treat 1-bit types as a synonym of _Bool and mix the types
> in expressions, such as:
>
> <unnamed-unsigned:1> _2;
> _Bool _3;
> _Bool _4;
> ...
> _4 = _2 ^ _3;
>
> and similarly mixing _Bool and enum
>
> enum E:bool { E0, E1 };
>
> in one operation.
>
> I had expected this to be invalid... What are the type safety rules in the
> GIMPLE IR?
Type safety in GIMPLE is defined in terms of type compatibility, with
_many_ exceptions for specific kinds of statements. Generally stuff is
verified in verify_gimple_seq; for a binary assign statement like this one,
in verify_gimple_assign_binary. As you can see there, the normal rule for
bread-and-butter binary assigns is simply that LHS, RHS1 and RHS2 must
all be type-compatible.
T1 and T2 are compatible if conversions from T1 to T2 are useless and
conversions from T2 to T1 are also useless (types_compatible_p). The meat
of that is all in gimple-expr.cc:useless_type_conversion_p. For this
specific case we have:
  /* Preserve conversions to/from BOOLEAN_TYPE if types are not
     of precision one. */
  if (((TREE_CODE (inner_type) == BOOLEAN_TYPE)
       != (TREE_CODE (outer_type) == BOOLEAN_TYPE))
      && TYPE_PRECISION (outer_type) != 1)
    return false;
So, yes, booleans and 1-bit types can be compatible (under certain other
conditions, see the function).
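As a rough illustration of the rule above, here is a toy model in C. It is not GCC code: the toy_type struct and the toy_* functions are hypothetical stand-ins, and the sign and precision check is only my simplified reading of the "certain other conditions"; the real function handles many more cases.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for GCC's type nodes; not GCC data structures.  */
enum toy_code { TOY_INTEGER, TOY_BOOLEAN, TOY_ENUMERAL };

struct toy_type {
  enum toy_code code;
  unsigned precision;   /* number of value bits */
  bool is_unsigned;
};

/* Simplified model: converting INNER to OUTER is "useless" when sign and
   precision agree and the conversion does not cross the BOOLEAN_TYPE
   boundary at a precision other than one.  */
static bool
toy_useless_conversion_p (struct toy_type outer, struct toy_type inner)
{
  if (outer.is_unsigned != inner.is_unsigned
      || outer.precision != inner.precision)
    return false;
  if ((inner.code == TOY_BOOLEAN) != (outer.code == TOY_BOOLEAN)
      && outer.precision != 1)
    return false;
  return true;
}

/* Two types are compatible when the conversion is useless both ways.  */
static bool
toy_types_compatible_p (struct toy_type a, struct toy_type b)
{
  return toy_useless_conversion_p (a, b) && toy_useless_conversion_p (b, a);
}
```

Under this model an unsigned 1-bit integer and an unsigned 1-bit boolean come out compatible, while a signed 32-bit boolean and a signed 32-bit integer do not.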
> Somewhat related, gcc.c-torture/compile/pr96796.c performs a VIEW_CONVERT_EXPR
> from
>
> struct S1 {
> long f3;
> char f4;
> } g_3_4;
>
> to an int
>
> p_51_9 = VIEW_CONVERT_EXPR<int>(g_3_4);
>
> That must be wrong?
VIEW_CONVERT_EXPR is _very_ generous. See
verify_types_in_gimple_reference:
  if (TREE_CODE (expr) == VIEW_CONVERT_EXPR)
    {
      /* For VIEW_CONVERT_EXPRs which are allowed here too, we only check
         that their operand is not a register an invariant when
         requiring an lvalue (this usually means there is a SRA or IPA-SRA
         bug). Otherwise there is nothing to verify, gross mismatches at
         most invoke undefined behavior. */
      if (require_lvalue
          && (is_gimple_reg (op) || is_gimple_min_invariant (op)))
        {
          error ("conversion of %qs on the left hand side of %qs",
                 get_tree_code_name (TREE_CODE (op)), code_name);
          debug_generic_stmt (expr);
          return true;
        }
      else if (is_gimple_reg (op)
               && TYPE_SIZE (TREE_TYPE (expr)) != TYPE_SIZE (TREE_TYPE (op)))
        {
          error ("conversion of register to a different size in %qs",
                 code_name);
          debug_generic_stmt (expr);
          return true;
        }
    }
Here the operand is not a register (but a global memory object), so
everything goes.
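To make the decision logic easier to see, here is a small standalone C sketch of it. The toy_vce struct and toy_verify_vce are hypothetical names for illustration only; the real verifier works on tree nodes and compares TYPE_SIZE nodes rather than plain bit counts.

```c
#include <stdbool.h>

/* Hypothetical summary of a VIEW_CONVERT_EXPR and its operand.  */
struct toy_vce {
  bool op_is_register;       /* stands in for is_gimple_reg (op) */
  bool op_is_invariant;      /* stands in for is_gimple_min_invariant (op) */
  unsigned op_size_bits;     /* size of the operand's type */
  unsigned result_size_bits; /* size of the type converted to */
};

/* Returns true when the toy verifier would report an error, following
   the "return true on error" convention of the quoted code.  */
static bool
toy_verify_vce (struct toy_vce v, bool require_lvalue)
{
  /* A register or invariant must not be view-converted on the LHS.  */
  if (require_lvalue && (v.op_is_register || v.op_is_invariant))
    return true;
  /* A register may only be view-converted to a type of the same size.  */
  if (v.op_is_register && v.op_size_bits != v.result_size_bits)
    return true;
  /* Memory operands are not checked any further.  */
  return false;
}
```

With a memory operand such as g_3_4, neither branch triggers regardless of the two sizes, which is why the struct-to-int conversion passes verification.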
It should be said that over the years GIMPLE's type system became stricter
and stricter, but it started as mostly everything-goes, so making it
stricter is a bumpy road that isn't fully travelled yet, because checking
types often results in testcase regressions :-)
> Semantics of <signed-boolean:32>
> --------------------------------
> "Wide" Booleans, such as <signed-boolean:32>, seem to allow more values than
> 0 and 1. For example, I've seen some IR operations like:
>
> _66 = _16 ? _Literal (<signed-boolean:32>) -1 : 0;
>
> But I guess there must be some semantic difference between
> <signed-boolean:32> and a 32-bit int, otherwise the wide Boolean type
> would not be needed... So what is the difference?
See above, normally conversions to booleans that are wider than 1 bit are
_not_ useless (because they require booleanization to true/false). In the
above case the not-useless cast is within a COND_EXPR, so it's quite
possible that the gimplifier didn't look hard enough to split this out
into a proper conversion statement. (The verifier doesn't look inside
the expressions of the COND_EXPR, so also doesn't catch this one)
If that turns out to be true and the above still happens when -1 is
replaced by (say) 42, then it might be possible to construct a
wrong-code testcase based on the fact that _66 as boolean should contain
only two observable values (true/false), but could then contain 42. OTOH,
it might also not be possible to create such testcase, namely when GCC is
internally too conservative in handling wide bools :-) In that case we
probably have a missed optimization somewhere, which when implemented
would enable construction of such wrong-code testcase ;)
(I'm saying that -1 should be replaced by something else for a wrong-code
testcase, because -1 is special and could justifiably be special-cased in
GCC: -1 is the proper arithmetic value for a signed boolean that is
"true").
Ciao,
Michael.
* Re: types in GIMPLE IR
2023-06-28 15:46 ` Michael Matz
@ 2023-06-29 6:21 ` Richard Biener
2023-06-29 12:06 ` Krister Walfridsson
0 siblings, 1 reply; 8+ messages in thread
From: Richard Biener @ 2023-06-29 6:21 UTC
To: Michael Matz; +Cc: Krister Walfridsson, gcc
On Wed, Jun 28, 2023 at 5:47 PM Michael Matz via Gcc <gcc@gcc.gnu.org> wrote:
>
> Hello,
>
> On Wed, 28 Jun 2023, Krister Walfridsson via Gcc wrote:
>
> > Type safety
> > -----------
> > Some transformations treat 1-bit types as a synonym of _Bool and mix the types
> > in expressions, such as:
> >
> > <unnamed-unsigned:1> _2;
> > _Bool _3;
> > _Bool _4;
> > ...
> > _4 = _2 ^ _3;
> >
> > and similarly mixing _Bool and enum
> >
> > enum E:bool { E0, E1 };
> >
> > in one operation.
> >
> > I had expected this to be invalid... What are the type safety rules in the
> > GIMPLE IR?
>
> Type safety in GIMPLE is defined in terms of type compatibility, with
> _many_ exceptions for specific kinds of statements. Generally stuff is
> verified in verify_gimple_seq; for a binary assign statement like this one,
> in verify_gimple_assign_binary. As you can see there, the normal rule for
> bread-and-butter binary assigns is simply that LHS, RHS1 and RHS2 must
> all be type-compatible.
>
> T1 and T2 are compatible if conversions from T1 to T2 are useless and
> conversions from T2 to T1 are also useless (types_compatible_p). The meat
> of that is all in gimple-expr.cc:useless_type_conversion_p. For this
> specific case we have:
>
>   /* Preserve conversions to/from BOOLEAN_TYPE if types are not
>      of precision one. */
>   if (((TREE_CODE (inner_type) == BOOLEAN_TYPE)
>        != (TREE_CODE (outer_type) == BOOLEAN_TYPE))
>       && TYPE_PRECISION (outer_type) != 1)
>     return false;
>
> So, yes, booleans and 1-bit types can be compatible (under certain other
> conditions, see the function).
>
> > Somewhat related, gcc.c-torture/compile/pr96796.c performs a VIEW_CONVERT_EXPR
> > from
> >
> > struct S1 {
> > long f3;
> > char f4;
> > } g_3_4;
> >
> > to an int
> >
> > p_51_9 = VIEW_CONVERT_EXPR<int>(g_3_4);
> >
> > That must be wrong?
>
> VIEW_CONVERT_EXPR is _very_ generous. See
> verify_types_in_gimple_reference:
Yep. In general these cases should rather use a BIT_FIELD_REF to select
a same sized subpart and only then do the rvalue conversion. But as Micha says
below making the IL stricter isn't an easy task.
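As a source-level analogy only (this is C, not what GCC emits in the IL), selecting a same-sized subpart first and then reinterpreting it could look like the sketch below. The helper name leading_bytes_as_int is hypothetical, and the sketch assumes the selected subpart starts at offset 0 of the object.

```c
#include <string.h>

struct S1 {
  long f3;
  char f4;
};

/* The int-sized subpart we select must fit inside the struct.  */
_Static_assert (sizeof (int) <= sizeof (struct S1), "int must fit in S1");

/* Select the leading sizeof (int) bytes of *s and reinterpret exactly
   those bytes as an int, so source and destination sizes match.  */
static int
leading_bytes_as_int (const struct S1 *s)
{
  int result;
  memcpy (&result, s, sizeof result);
  return result;
}
```

Which bytes of f3 end up in the result depends on the target's endianness, so only byte patterns such as all-zeros or all-ones give a portable value.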
>   if (TREE_CODE (expr) == VIEW_CONVERT_EXPR)
>     {
>       /* For VIEW_CONVERT_EXPRs which are allowed here too, we only check
>          that their operand is not a register an invariant when
>          requiring an lvalue (this usually means there is a SRA or IPA-SRA
>          bug). Otherwise there is nothing to verify, gross mismatches at
>          most invoke undefined behavior. */
>       if (require_lvalue
>           && (is_gimple_reg (op) || is_gimple_min_invariant (op)))
>         {
>           error ("conversion of %qs on the left hand side of %qs",
>                  get_tree_code_name (TREE_CODE (op)), code_name);
>           debug_generic_stmt (expr);
>           return true;
>         }
>       else if (is_gimple_reg (op)
>                && TYPE_SIZE (TREE_TYPE (expr)) != TYPE_SIZE (TREE_TYPE (op)))
>         {
>           error ("conversion of register to a different size in %qs",
>                  code_name);
>           debug_generic_stmt (expr);
>           return true;
>         }
>     }
>
> Here the operand is not a register (but a global memory object), so
> everything goes.
>
> It should be said that over the years GIMPLE's type system became stricter
> and stricter, but it started as mostly everything-goes, so making it
> stricter is a bumpy road that isn't fully travelled yet, because checking
> types often results in testcase regressions :-)
>
> > Semantics of <signed-boolean:32>
> > --------------------------------
> > "Wide" Booleans, such as <signed-boolean:32>, seem to allow more values than
> > 0 and 1. For example, I've seen some IR operations like:
> >
> > _66 = _16 ? _Literal (<signed-boolean:32>) -1 : 0;
> >
> > But I guess there must be some semantic difference between
> > <signed-boolean:32> and a 32-bit int, otherwise the wide Boolean type
> > would not be needed... So what is the difference?
>
> See above, normally conversions to booleans that are wider than 1 bit are
> _not_ useless (because they require booleanization to true/false). In the
> above case the not-useless cast is within a COND_EXPR, so it's quite
> possible that the gimplifier didn't look hard enough to split this out
> into a proper conversion statement. (The verifier doesn't look inside
> the expressions of the COND_EXPR, so also doesn't catch this one)
Note the above is GIMPLE FE syntax for a constant of a specific type,
a regular dump would just show _16 ? -1 : 0;
The thing with signed bools is that the two relevant values are -1 (true)
and 0 (false), those are used for vector bool components where we also
need them to be of wider type (32bits in this case).
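To illustrate why -1 rather than 1 is the useful "true" value for a wide signed bool, here is a scalar C sketch of the usual mask-select idiom, applied to one 32-bit lane. The toy_* names are hypothetical and this is only an analogy for the vector case, not GCC code.

```c
#include <stdint.h>

/* Build a 32-bit "signed boolean": all-ones (-1) for true, 0 for false.  */
static int32_t
toy_mask_from_cond (int cond)
{
  return cond ? -1 : 0;
}

/* Branch-free select: returns if_true where the mask bits are set and
   if_false where they are clear.  Only correct for masks of 0 or -1.  */
static int32_t
toy_select (int32_t mask, int32_t if_true, int32_t if_false)
{
  return (if_true & mask) | (if_false & ~mask);
}
```

With a mask of 1 instead of -1 the select mixes bits from both inputs, for example toy_select (1, 6, 9) gives 8, which is neither operand.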
Richard.
> If that turns out to be true and the above still happens when -1 is
> replaced by (say) 42, then it might be possible to construct a
> wrong-code testcase based on the fact that _66 as boolean should contain
> only two observable values (true/false), but could then contain 42. OTOH,
> it might also not be possible to create such testcase, namely when GCC is
> internally too conservative in handling wide bools :-) In that case we
> probably have a missed optimization somewhere, which when implemented
> would enable construction of such wrong-code testcase ;)
>
> (I'm saying that -1 should be replaced by something else for a wrong-code
> testcase, because -1 is special and could justifiably be special-cased in
> GCC: -1 is the proper arithmetic value for a signed boolean that is
> "true").
>
>
> Ciao,
> Michael.
* Re: types in GIMPLE IR
2023-06-29 6:21 ` Richard Biener
@ 2023-06-29 12:06 ` Krister Walfridsson
2023-06-29 12:24 ` Michael Matz
2023-06-29 13:19 ` Richard Biener
0 siblings, 2 replies; 8+ messages in thread
From: Krister Walfridsson @ 2023-06-29 12:06 UTC
To: Richard Biener; +Cc: Michael Matz, Krister Walfridsson, gcc
On Thu, 29 Jun 2023, Richard Biener wrote:
> The thing with signed bools is that the two relevant values are -1 (true)
> and 0 (false), those are used for vector bool components where we also
> need them to be of wider type (32bits in this case).
My main confusion comes from seeing IR doing arithmetic such as
<signed-boolean:32> _127;
<signed-boolean:32> _169;
...
_169 = _127 + -1;
or
<signed-boolean:32> _127;
<signed-boolean:32> _169;
...
_169 = -_127;
and it was unclear to me what kind of arithmetic is allowed.
I have now verified that all cases seem to be just one operation of this
form (where _127 has the value 0 or 1), so it cannot construct values
such as 42. But the wide signed Boolean can have the three different
values 1, 0, and -1, which I still think is at least one too many. :)
I'll update my tool to complain if the value is outside the range [-1, 1].
/Krister
* Re: types in GIMPLE IR
2023-06-29 12:06 ` Krister Walfridsson
@ 2023-06-29 12:24 ` Michael Matz
2023-06-29 13:19 ` Richard Biener
1 sibling, 0 replies; 8+ messages in thread
From: Michael Matz @ 2023-06-29 12:24 UTC
To: Krister Walfridsson; +Cc: Richard Biener, gcc
Hello,
On Thu, 29 Jun 2023, Krister Walfridsson wrote:
> > The thing with signed bools is that the two relevant values are -1 (true)
> > and 0 (false), those are used for vector bool components where we also
> > need them to be of wider type (32bits in this case).
>
> My main confusion comes from seeing IR doing arithmetic such as
>
> <signed-boolean:32> _127;
> <signed-boolean:32> _169;
> ...
> _169 = _127 + -1;
>
> or
>
> <signed-boolean:32> _127;
> <signed-boolean:32> _169;
> ...
> _169 = -_127;
>
> and it was unclear to me what kind of arithmetic is allowed.
>
> I have now verified that all cases seem to be just one operation of this form
> (where _127 has the value 0 or 1), so it cannot construct values such as 42.
> But the wide signed Boolean can have the three different values 1, 0, and -1,
> which I still think is at least one too many. :)
It definitely is. For signed bool it should be -1 and 0, for unsigned
bool 1 and 0. And of course, arithmetic on bools is always dubious, that
should all be logical operations. Modulo-arithmetic (mod 2) could be
made to work, but then we would have to give up the idea of signed bools
and always use conversions to signed int to get a bitmask of all-ones.
And as mod-2-arithmetic is equivalent to logical ops it seems a bit futile
to go that way.
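The equivalence is easy to check exhaustively. The following C sketch (hypothetical helper names, plain C rather than GIMPLE) verifies that addition and multiplication mod 2 coincide with XOR and AND on {0, 1}, and shows the conversion to a signed int that yields an all-ones mask.

```c
#include <stdbool.h>

/* Check over all inputs in {0, 1} that addition mod 2 is XOR and
   multiplication mod 2 is AND.  */
static bool
mod2_matches_logic (void)
{
  for (unsigned a = 0; a <= 1; a++)
    for (unsigned b = 0; b <= 1; b++)
      {
        if (((a + b) % 2) != (a ^ b))
          return false;
        if (((a * b) % 2) != (a & b))
          return false;
      }
  return true;
}

/* Turn an unsigned boolean (0 or 1) into a mask of all-zeros or
   all-ones by converting to signed int and negating.  */
static int
mask_from_bool (unsigned b)
{
  return -(int) b;
}
```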
Of course, enforcing this all might lead to a surprising heap of errors,
but one has to start somewhere, so ...
> I'll update my tool to complain if the value is outside the range [-1,
> 1].
... maybe don't do that, at least optionally, so that maybe at some point
someone can look into fixing that all up? :-) -fdubious-bools?
Ciao,
Michael.
* Re: types in GIMPLE IR
2023-06-29 12:06 ` Krister Walfridsson
2023-06-29 12:24 ` Michael Matz
@ 2023-06-29 13:19 ` Richard Biener
2023-06-29 19:08 ` Krister Walfridsson
1 sibling, 1 reply; 8+ messages in thread
From: Richard Biener @ 2023-06-29 13:19 UTC
To: Krister Walfridsson; +Cc: Michael Matz, gcc
On Thu, Jun 29, 2023 at 2:06 PM Krister Walfridsson
<krister.walfridsson@gmail.com> wrote:
>
> On Thu, 29 Jun 2023, Richard Biener wrote:
>
> > The thing with signed bools is that the two relevant values are -1 (true)
> > and 0 (false), those are used for vector bool components where we also
> > need them to be of wider type (32bits in this case).
>
> My main confusion comes from seeing IR doing arithmetic such as
>
> <signed-boolean:32> _127;
> <signed-boolean:32> _169;
> ...
> _169 = _127 + -1;
>
> or
>
> <signed-boolean:32> _127;
> <signed-boolean:32> _169;
> ...
> _169 = -_127;
>
> and it was unclear to me what kind of arithmetic is allowed.
IIRC we have some simplification rules that turn bit operations into
arithmetic. Arithmetic is allowed if it keeps the values inside
[-1,0] for signed bools or [0, 1] for unsigned bools.
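A checker for that rule can be very small. The C sketch below uses a hypothetical helper name and is not taken from any existing tool; it just encodes the two ranges stated above.

```c
#include <stdbool.h>

/* Returns true if VALUE is a valid boolean value: 0 or 1 for unsigned
   bools, 0 or -1 for signed bools.  */
static bool
toy_bool_value_ok (long value, bool is_unsigned)
{
  return is_unsigned ? (value == 0 || value == 1)
                     : (value == 0 || value == -1);
}
```

For a signed bool, `_127 + -1` stays in range when _127 is 0 (giving -1) but not when _127 is -1 (giving -2), and an input value of 1 is already out of range before any arithmetic happens.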
> I have now verified that all cases seem to be just one operation of this
> form (where _127 has the value 0 or 1), so it cannot construct values
> such as 42. But the wide signed Boolean can have the three different
> values 1, 0, and -1, which I still think is at least one too many. :)
Yeah, I'd be interested in a testcase that shows this behavior.
Richard.
> I'll update my tool to complain if the value is outside the range [-1, 1].
>
> /Krister
* Re: types in GIMPLE IR
2023-06-29 13:19 ` Richard Biener
@ 2023-06-29 19:08 ` Krister Walfridsson
2023-06-29 19:32 ` Andrew Pinski
0 siblings, 1 reply; 8+ messages in thread
From: Krister Walfridsson @ 2023-06-29 19:08 UTC
To: Richard Biener; +Cc: Krister Walfridsson, Michael Matz, gcc
On Thu, 29 Jun 2023, Richard Biener wrote:
> IIRC we have some simplification rules that turn bit operations into
> arithmetic. Arithmetic is allowed if it keeps the values inside
> [-1,0] for signed bools or [0, 1] for unsigned bools.
>
>> I have now verified that all cases seem to be just one operation of this
>> form (where _127 has the value 0 or 1), so it cannot construct values
>> such as 42. But the wide signed Boolean can have the three different
>> values 1, 0, and -1, which I still think is at least one too many. :)
>
> Yeah, I'd be interested in a testcase that shows this behavior.
I created PR 110487 with one example.
>> I'll update my tool to complain if the value is outside the range [-1, 1].
It is likely that all issues I have seen so far are due to PR 110487, so
I'll keep the current behavior that complains if the value is outside the
range [-1, 0].
/Krister
* Re: types in GIMPLE IR
2023-06-29 19:08 ` Krister Walfridsson
@ 2023-06-29 19:32 ` Andrew Pinski
0 siblings, 0 replies; 8+ messages in thread
From: Andrew Pinski @ 2023-06-29 19:32 UTC
To: Krister Walfridsson; +Cc: Richard Biener, Michael Matz, gcc
On Thu, Jun 29, 2023 at 12:10 PM Krister Walfridsson via Gcc
<gcc@gcc.gnu.org> wrote:
>
> On Thu, 29 Jun 2023, Richard Biener wrote:
>
> > IIRC we have some simplification rules that turn bit operations into
> > arithmetic. Arithmetic is allowed if it keeps the values inside
> > [-1,0] for signed bools or [0, 1] for unsigned bools.
> >
> >> I have now verified that all cases seem to be just one operation of this
> >> form (where _127 has the value 0 or 1), so it cannot construct values
> >> such as 42. But the wide signed Boolean can have the three different
> >> values 1, 0, and -1, which I still think is at least one too many. :)
> >
> > Yeah, I'd be interested in a testcase that shows this behavior.
>
> I created PR 110487 with one example.
>
>
> >> I'll update my tool to complain if the value is outside the range [-1, 1].
>
> It is likely that all issues I have seen so far are due to PR 110487, so
> I'll keep the current behavior that complains if the value is outside the
> range [-1, 0].
Yes, there are many issues similar to this all over GCC's folding.
In this case checking TYPE_PRECISION as described in match.pd is
not even the right approach.
The whole TYPE_PRECISION on boolean types is definitely a big can of worms.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102622 is related but
that was signed boolean:1.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106053 is another related case.
This weekend, I am going to audit some of the match patterns for
these issues.
>
> /Krister
Thread overview: 8+ messages
2023-06-28 9:15 types in GIMPLE IR Krister Walfridsson
2023-06-28 15:46 ` Michael Matz
2023-06-29 6:21 ` Richard Biener
2023-06-29 12:06 ` Krister Walfridsson
2023-06-29 12:24 ` Michael Matz
2023-06-29 13:19 ` Richard Biener
2023-06-29 19:08 ` Krister Walfridsson
2023-06-29 19:32 ` Andrew Pinski