public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* HELP: Questions on unshare_expr
@ 2024-01-12 15:54 Qing Zhao
  2024-01-12 16:28 ` Richard Biener
  0 siblings, 1 reply; 18+ messages in thread
From: Qing Zhao @ 2024-01-12 15:54 UTC (permalink / raw)
  To: gcc Patches

Hi, 

I have some questions on using the utility routine “unshare_expr”:

From my understanding, there should be NO shared nodes in a GENERIC function. 
 Otherwise, gimplication might fail. 

Therefore, when we insert new tree nodes manually into the GENERIC function, we should
Make sure there is no shared nodes introduced. 

1. Is the above understanding correct?
2. Is there any tool to check there is no shared nodes in the GENERIC function?
3. Are there any tree nodes that are allowed to be shared in a GENERIC function? If so, what are they?

4. For the following:

If both “op1” and “op2” are existing tree nodes in the current GENERIC function, 
and we will insert a new tree node:

tree  new_tree = build2 (CODE, TYPE, op1, op2)


Should we add “unshare_expr” on both “op1” and “op2” as:

Tree new_tree = build2 (CODE, TYPE, unshare_expr (op1), unshare_expr (op2))
?

If op2 is a node that is allowed to be shared, whether the additional “unshare_expr” on it trigger any potential problem?

Thanks a lot for your help.

Qing 






^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: HELP: Questions on unshare_expr
  2024-01-12 15:54 HELP: Questions on unshare_expr Qing Zhao
@ 2024-01-12 16:28 ` Richard Biener
  2024-01-12 17:30   ` Qing Zhao
  0 siblings, 1 reply; 18+ messages in thread
From: Richard Biener @ 2024-01-12 16:28 UTC (permalink / raw)
  To: Qing Zhao; +Cc: gcc Patches



> Am 12.01.2024 um 16:55 schrieb Qing Zhao <qing.zhao@oracle.com>:
> 
> Hi,
> 
> I have some questions on using the utility routine “unshare_expr”:
> 
> From my understanding, there should be NO shared nodes in a GENERIC function.
> Otherwise, gimplication might fail.

There is sharing and this is why we unshare everything before gimplification.

> Therefore, when we insert new tree nodes manually into the GENERIC function, we should
> Make sure there is no shared nodes introduced.
> 
> 1. Is the above understanding correct?

No

> 2. Is there any tool to check there is no shared nodes in the GENERIC function?
> 3. Are there any tree nodes that are allowed to be shared in a GENERIC function? If so, what are they?

There’s some allowed sharing on GIMPLE and a verifier.

> 4. For the following:
> 
> If both “op1” and “op2” are existing tree nodes in the current GENERIC function,
> and we will insert a new tree node:
> 
> tree  new_tree = build2 (CODE, TYPE, op1, op2)
> 
> 
> Should we add “unshare_expr” on both “op1” and “op2” as:
> 
> Tree new_tree = build2 (CODE, TYPE, unshare_expr (op1), unshare_expr (op2))
> ?

Not necessarily but instead you have to watch for evaluating side-effects only once.  See save_expr.

> 
> If op2 is a node that is allowed to be shared, whether the additional “unshare_expr” on it trigger any potential problem?

If you unshare side-effects that’s generating wrong-code.  Otherwise unsharing is safe.

Richard 

> Thanks a lot for your help.
> 
> Qing
> 
> 
> 
> 
> 

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: HELP: Questions on unshare_expr
  2024-01-12 16:28 ` Richard Biener
@ 2024-01-12 17:30   ` Qing Zhao
  2024-01-15  8:13     ` Eric Botcazou
  2024-01-15  9:31     ` Richard Biener
  0 siblings, 2 replies; 18+ messages in thread
From: Qing Zhao @ 2024-01-12 17:30 UTC (permalink / raw)
  To: Richard Biener; +Cc: gcc Patches

Thanks a lot for the reply.  

> On Jan 12, 2024, at 11:28 AM, Richard Biener <richard.guenther@gmail.com> wrote:
> 
> 
> 
>> Am 12.01.2024 um 16:55 schrieb Qing Zhao <qing.zhao@oracle.com>:
>> 
>> Hi,
>> 
>> I have some questions on using the utility routine “unshare_expr”:
>> 
>> From my understanding, there should be NO shared nodes in a GENERIC function.
>> Otherwise, gimplication might fail.
> 
> There is sharing and this is why we unshare everything before gimplification.

Okay, so, the "unsharing everything” is done automatically by the compiler before gimplification? 
I don’t need to worry about this?

I see  many places in FE where “unshare_expr” is used, for example, “ubsan_instrument_division”,
 “ubsan_instrument_shift”, etc. 

So, usually, when should “unshare_expr” be used? 

>> Therefore, when we insert new tree nodes manually into the GENERIC function, we should
>> Make sure there is no shared nodes introduced.
>> 
>> 1. Is the above understanding correct?
> 
> No
> 
>> 2. Is there any tool to check there is no shared nodes in the GENERIC function?
>> 3. Are there any tree nodes that are allowed to be shared in a GENERIC function? If so, what are they?
> 
> There’s some allowed sharing on GIMPLE and a verifier.
What’s the name of the verifier that I can search and check? 
> 
>> 4. For the following:
>> 
>> If both “op1” and “op2” are existing tree nodes in the current GENERIC function,
>> and we will insert a new tree node:
>> 
>> tree  new_tree = build2 (CODE, TYPE, op1, op2)
>> 
>> 
>> Should we add “unshare_expr” on both “op1” and “op2” as:
>> 
>> Tree new_tree = build2 (CODE, TYPE, unshare_expr (op1), unshare_expr (op2))
>> ?
> 
> Not necessarily but instead you have to watch for evaluating side-effects only once.  See save_expr.

Okay.  I see.
> 
>> 
>> If op2 is a node that is allowed to be shared, whether the additional “unshare_expr” on it trigger any potential problem?
> 
> If you unshare side-effects that’s generating wrong-code.  Otherwise unsharing is safe.

Okay. 
Will unnecessary unshareing produce redundant IRs?

All my questions for unshare_expr relate to a  LTO bug that I currently stuck with 
when using .ACCESS_WITH_SIZE in bound sanitizer (only with -flto, without -flto, no issue):

[opc@qinzhao-aarch64-ol8 gcc]$ sh t
during IPA pass: modref
t.c:20:1: internal compiler error: tree code ‘ssa_name’ is not supported in LTO streams
0x14c3993 lto_write_tree
	../../latest-gcc-write/gcc/lto-streamer-out.cc:561
0x14c3aeb lto_output_tree_1

And the value of the tree node that triggered the ICE is:
(gdb) call debug_tree(expr)
 <ssa_name 0xfffff5761e60 type <error_mark 0xfffff56c0e58>
    nothrow
    def_stmt 
    version:13 in-free-list>

Is there any good way to debug LTO bug?

Thanks a lot for the help.

Qing


> 
> Richard 
> 
>> Thanks a lot for your help.
>> 
>> Qing


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: HELP: Questions on unshare_expr
  2024-01-12 17:30   ` Qing Zhao
@ 2024-01-15  8:13     ` Eric Botcazou
  2024-01-15 16:42       ` Qing Zhao
  2024-01-15  9:31     ` Richard Biener
  1 sibling, 1 reply; 18+ messages in thread
From: Eric Botcazou @ 2024-01-15  8:13 UTC (permalink / raw)
  To: Qing Zhao; +Cc: Richard Biener, gcc-patches

> Okay, so, the "unsharing everything” is done automatically by the compiler
> before gimplification? 

See the blurb at gimplify.cc:835 and below about this.

-- 
Eric Botcazou



^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: HELP: Questions on unshare_expr
  2024-01-12 17:30   ` Qing Zhao
  2024-01-15  8:13     ` Eric Botcazou
@ 2024-01-15  9:31     ` Richard Biener
  2024-01-15 14:54       ` Qing Zhao
  2024-01-16 20:25       ` Qing Zhao
  1 sibling, 2 replies; 18+ messages in thread
From: Richard Biener @ 2024-01-15  9:31 UTC (permalink / raw)
  To: Qing Zhao; +Cc: gcc Patches

On Fri, Jan 12, 2024 at 6:30 PM Qing Zhao <qing.zhao@oracle.com> wrote:
>
> Thanks a lot for the reply.
>
> > On Jan 12, 2024, at 11:28 AM, Richard Biener <richard.guenther@gmail.com> wrote:
> >
> >
> >
> >> Am 12.01.2024 um 16:55 schrieb Qing Zhao <qing.zhao@oracle.com>:
> >>
> >> Hi,
> >>
> >> I have some questions on using the utility routine “unshare_expr”:
> >>
> >> From my understanding, there should be NO shared nodes in a GENERIC function.
> >> Otherwise, gimplication might fail.
> >
> > There is sharing and this is why we unshare everything before gimplification.
>
> Okay, so, the "unsharing everything” is done automatically by the compiler before gimplification?
> I don’t need to worry about this?
>
> I see  many places in FE where “unshare_expr” is used, for example, “ubsan_instrument_division”,
>  “ubsan_instrument_shift”, etc.

It's likely doing sth during gimplification.

> So, usually, when should “unshare_expr” be used?

You should usually unshare when you are putting the same 'tree' into multiple
operands.  Using a SAVE_EXPR avoids redundant code but it also requires
that the SAVE_EXPR uses are ordered.

> >> Therefore, when we insert new tree nodes manually into the GENERIC function, we should
> >> Make sure there is no shared nodes introduced.
> >>
> >> 1. Is the above understanding correct?
> >
> > No
> >
> >> 2. Is there any tool to check there is no shared nodes in the GENERIC function?
> >> 3. Are there any tree nodes that are allowed to be shared in a GENERIC function? If so, what are they?
> >
> > There’s some allowed sharing on GIMPLE and a verifier.
> What’s the name of the verifier that I can search and check?

verify_node_sharing

> >
> >> 4. For the following:
> >>
> >> If both “op1” and “op2” are existing tree nodes in the current GENERIC function,
> >> and we will insert a new tree node:
> >>
> >> tree  new_tree = build2 (CODE, TYPE, op1, op2)
> >>
> >>
> >> Should we add “unshare_expr” on both “op1” and “op2” as:
> >>
> >> Tree new_tree = build2 (CODE, TYPE, unshare_expr (op1), unshare_expr (op2))
> >> ?
> >
> > Not necessarily but instead you have to watch for evaluating side-effects only once.  See save_expr.
>
> Okay.  I see.
> >
> >>
> >> If op2 is a node that is allowed to be shared, whether the additional “unshare_expr” on it trigger any potential problem?
> >
> > If you unshare side-effects that’s generating wrong-code.  Otherwise unsharing is safe.
>
> Okay.
> Will unnecessary unshareing produce redundant IRs?

Yes.

> All my questions for unshare_expr relate to a  LTO bug that I currently stuck with
> when using .ACCESS_WITH_SIZE in bound sanitizer (only with -flto, without -flto, no issue):
>
> [opc@qinzhao-aarch64-ol8 gcc]$ sh t
> during IPA pass: modref
> t.c:20:1: internal compiler error: tree code ‘ssa_name’ is not supported in LTO streams
> 0x14c3993 lto_write_tree
>         ../../latest-gcc-write/gcc/lto-streamer-out.cc:561
> 0x14c3aeb lto_output_tree_1
>
> And the value of the tree node that triggered the ICE is:
> (gdb) call debug_tree(expr)
>  <ssa_name 0xfffff5761e60 type <error_mark 0xfffff56c0e58>
>     nothrow
>     def_stmt
>     version:13 in-free-list>
>
> Is there any good way to debug LTO bug?

This happens usually when you have a VLA type and its type fields are not
properly gimplified which usually happens because the frontend fails to
insert a gimplification point for it (a DECL_EXPR).

> Thanks a lot for the help.
>
> Qing
>
>
> >
> > Richard
> >
> >> Thanks a lot for your help.
> >>
> >> Qing
>

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: HELP: Questions on unshare_expr
  2024-01-15  9:31     ` Richard Biener
@ 2024-01-15 14:54       ` Qing Zhao
  2024-01-15 15:06         ` Jakub Jelinek
  2024-01-16 20:25       ` Qing Zhao
  1 sibling, 1 reply; 18+ messages in thread
From: Qing Zhao @ 2024-01-15 14:54 UTC (permalink / raw)
  To: Richard Biener; +Cc: gcc Patches



> On Jan 15, 2024, at 4:31 AM, Richard Biener <richard.guenther@gmail.com> wrote:
> 
> On Fri, Jan 12, 2024 at 6:30 PM Qing Zhao <qing.zhao@oracle.com> wrote:
>> 
>> Thanks a lot for the reply.
>> 
>>> On Jan 12, 2024, at 11:28 AM, Richard Biener <richard.guenther@gmail.com> wrote:
>>> 
>>> 
>>> 
>>>> Am 12.01.2024 um 16:55 schrieb Qing Zhao <qing.zhao@oracle.com>:
>>>> 
>>>> Hi,
>>>> 
>>>> I have some questions on using the utility routine “unshare_expr”:
>>>> 
>>>> From my understanding, there should be NO shared nodes in a GENERIC function.
>>>> Otherwise, gimplication might fail.
>>> 
>>> There is sharing and this is why we unshare everything before gimplification.
>> 
>> Okay, so, the "unsharing everything” is done automatically by the compiler before gimplification?
>> I don’t need to worry about this?
>> 
>> I see  many places in FE where “unshare_expr” is used, for example, “ubsan_instrument_division”,
>> “ubsan_instrument_shift”, etc.
> 
> It's likely doing sth during gimplification.

So, before gimplification,  when inserting tree node, we don’t need manually
 add unshare_expr since the gimplification will automatically unshare nodes. 

However, during or after gimplfication, when inserting nodes, we should manually
 add unshare_expr when we put the same “tree” into multiple operands.

Is this understanding correct?

>> So, usually, when should “unshare_expr” be used?
> 
> You should usually unshare when you are putting the same 'tree' into multiple
> operands.  

Okay, I see.

> Using a SAVE_EXPR avoids redundant code but it also requires
> that the SAVE_EXPR uses are ordered.

“Require the SAVE_EXPR uses are ordered”, does this mean that 
SAVE_EXPRs for the same node should be in a correct order? Or something else?


> 
>>>> Therefore, when we insert new tree nodes manually into the GENERIC function, we should
>>>> Make sure there is no shared nodes introduced.
>>>> 
>>>> 1. Is the above understanding correct?
>>> 
>>> No
>>> 
>>>> 2. Is there any tool to check there is no shared nodes in the GENERIC function?
>>>> 3. Are there any tree nodes that are allowed to be shared in a GENERIC function? If so, what are they?
>>> 
>>> There’s some allowed sharing on GIMPLE and a verifier.
>> What’s the name of the verifier that I can search and check?
> 
> verify_node_sharing

Okay, thanks. 

> 
>>> 
>>>> 4. For the following:
>>>> 
>>>> If both “op1” and “op2” are existing tree nodes in the current GENERIC function,
>>>> and we will insert a new tree node:
>>>> 
>>>> tree  new_tree = build2 (CODE, TYPE, op1, op2)
>>>> 
>>>> 
>>>> Should we add “unshare_expr” on both “op1” and “op2” as:
>>>> 
>>>> Tree new_tree = build2 (CODE, TYPE, unshare_expr (op1), unshare_expr (op2))
>>>> ?
>>> 
>>> Not necessarily but instead you have to watch for evaluating side-effects only once.  See save_expr.
>> 
>> Okay.  I see.
>>> 
>>>> 
>>>> If op2 is a node that is allowed to be shared, whether the additional “unshare_expr” on it trigger any potential problem?
>>> 
>>> If you unshare side-effects that’s generating wrong-code.  Otherwise unsharing is safe.
>> 
>> Okay.
>> Will unnecessary unshareing produce redundant IRs?
> 
> Yes.
> 
>> All my questions for unshare_expr relate to a  LTO bug that I currently stuck with
>> when using .ACCESS_WITH_SIZE in bound sanitizer (only with -flto, without -flto, no issue):
>> 
>> [opc@qinzhao-aarch64-ol8 gcc]$ sh t
>> during IPA pass: modref
>> t.c:20:1: internal compiler error: tree code ‘ssa_name’ is not supported in LTO streams
>> 0x14c3993 lto_write_tree
>>        ../../latest-gcc-write/gcc/lto-streamer-out.cc:561
>> 0x14c3aeb lto_output_tree_1
>> 
>> And the value of the tree node that triggered the ICE is:
>> (gdb) call debug_tree(expr)
>> <ssa_name 0xfffff5761e60 type <error_mark 0xfffff56c0e58>
>>    nothrow
>>    def_stmt
>>    version:13 in-free-list>
>> 
>> Is there any good way to debug LTO bug?
> 
> This happens usually when you have a VLA type and its type fields are not
> properly gimplified which usually happens because the frontend fails to
> insert a gimplification point for it (a DECL_EXPR).
Thanks for the info. 
This is happening for a structure TYPE with FAM (I guess similar as VLA?)
Usually what’s the good solution to it?

thanks.

Qing
> 
>> Thanks a lot for the help.
>> 
>> Qing
>> 
>> 
>>> 
>>> Richard
>>> 
>>>> Thanks a lot for your help.
>>>> 
>>>> Qing
>> 


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: HELP: Questions on unshare_expr
  2024-01-15 14:54       ` Qing Zhao
@ 2024-01-15 15:06         ` Jakub Jelinek
  2024-01-15 16:41           ` Qing Zhao
  0 siblings, 1 reply; 18+ messages in thread
From: Jakub Jelinek @ 2024-01-15 15:06 UTC (permalink / raw)
  To: Qing Zhao; +Cc: Richard Biener, gcc Patches

On Mon, Jan 15, 2024 at 02:54:26PM +0000, Qing Zhao wrote:
> So, before gimplification,  when inserting tree node, we don’t need manually
>  add unshare_expr since the gimplification will automatically unshare nodes. 

There are cases where unshare_expr is needed even then, such as the uses in
the sanitizer, because code is then modifying suboperands in place later on
and if things are shared bad things happen.  If trees can be shared until
they are unshared before gimplification, one doesn't need to worry about it,
sure.

> However, during or after gimplfication, when inserting nodes, we should manually
>  add unshare_expr when we put the same “tree” into multiple operands.

Yes.

> > Using a SAVE_EXPR avoids redundant code but it also requires
> > that the SAVE_EXPR uses are ordered.
> 
> “Require the SAVE_EXPR uses are ordered”, does this mean that 
> SAVE_EXPRs for the same node should be in a correct order? Or something else?

The basic requirement is that SAVE_EXPR is evaluated somewhere in a code
which dominates all other uses of the SAVE_EXPR.
Say
SAVE_EXPR <something_complex>, if (x) use1 (SAVE_EXPR <something_complex>); else use2 (SAVE_EXPR <something_complex>);
is fine, but
if (x) use1 (SAVE_EXPR <something_complex>); else use2 (SAVE_EXPR <something_complex>);
is not.  Because in the latter case, it will be gimplified into evaluating
the complex expression in the conditional code guarded on if (x != 0), save
into some temporary variable and then in the else code just use that
temporary variable, except it is uninitialized then.

	Jakub


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: HELP: Questions on unshare_expr
  2024-01-15 15:06         ` Jakub Jelinek
@ 2024-01-15 16:41           ` Qing Zhao
  0 siblings, 0 replies; 18+ messages in thread
From: Qing Zhao @ 2024-01-15 16:41 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: Richard Biener, gcc Patches



> On Jan 15, 2024, at 10:06 AM, Jakub Jelinek <jakub@redhat.com> wrote:
> 
> On Mon, Jan 15, 2024 at 02:54:26PM +0000, Qing Zhao wrote:
>> So, before gimplification,  when inserting tree node, we don’t need manually
>> add unshare_expr since the gimplification will automatically unshare nodes. 
> 
> There are cases where unshare_expr is needed even then, such as the uses in
> the sanitizer, because code is then modifying suboperands in place later on
> and if things are shared bad things happen.

for my case, it’s in bound sanitizer, and the instrumentation happens 
during “c_genericize”, which seems before gimplfication. 

So,  when adding instrumentation for bound sanitizer, we still need to 
manually unshare expr even it’s before gimpflication?


If trees can be shared until
> they are unshared before gimplification, one doesn't need to worry about it,
> sure.
> 
>> However, during or after gimplfication, when inserting nodes, we should manually
>> add unshare_expr when we put the same “tree” into multiple operands.
> 
> Yes.
> 
>>> Using a SAVE_EXPR avoids redundant code but it also requires
>>> that the SAVE_EXPR uses are ordered.
>> 
>> “Require the SAVE_EXPR uses are ordered”, does this mean that 
>> SAVE_EXPRs for the same node should be in a correct order? Or something else?
> 
> The basic requirement is that SAVE_EXPR is evaluated somewhere in a code
> which dominates all other uses of the SAVE_EXPR.
> Say
> SAVE_EXPR <something_complex>, if (x) use1 (SAVE_EXPR <something_complex>); else use2 (SAVE_EXPR <something_complex>);
> is fine, but
> if (x) use1 (SAVE_EXPR <something_complex>); else use2 (SAVE_EXPR <something_complex>);
> is not.  Because in the latter case, it will be gimplified into evaluating
> the complex expression in the conditional code guarded on if (x != 0), save
> into some temporary variable and then in the else code just use that
> temporary variable, except it is uninitialized then.

Okay, I see.

Is there utility tool to check for any violation of this order? Or I have to manually check the order myself?

Thanks a lot for the help.

Qing
> 
> 	Jakub
> 


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: HELP: Questions on unshare_expr
  2024-01-15  8:13     ` Eric Botcazou
@ 2024-01-15 16:42       ` Qing Zhao
  0 siblings, 0 replies; 18+ messages in thread
From: Qing Zhao @ 2024-01-15 16:42 UTC (permalink / raw)
  To: Eric Botcazou; +Cc: Richard Biener, gcc-patches



> On Jan 15, 2024, at 3:13 AM, Eric Botcazou <botcazou@adacore.com> wrote:
> 
>> Okay, so, the "unsharing everything” is done automatically by the compiler
>> before gimplification? 
> 
> See the blurb at gimplify.cc:835 and below about this.

Thanks a lot for the info.  (I read this paragraph before sending the questions…)

Qing
> 
> -- 
> Eric Botcazou
> 
> 


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: HELP: Questions on unshare_expr
  2024-01-15  9:31     ` Richard Biener
  2024-01-15 14:54       ` Qing Zhao
@ 2024-01-16 20:25       ` Qing Zhao
  2024-01-17  6:42         ` Richard Biener
  1 sibling, 1 reply; 18+ messages in thread
From: Qing Zhao @ 2024-01-16 20:25 UTC (permalink / raw)
  To: Richard Biener, jakub; +Cc: gcc Patches



> On Jan 15, 2024, at 4:31 AM, Richard Biener <richard.guenther@gmail.com> wrote:
> 
>> All my questions for unshare_expr relate to a  LTO bug that I currently stuck with
>> when using .ACCESS_WITH_SIZE in bound sanitizer (only with -flto, without -flto, no issue):
>> 
>> [opc@qinzhao-aarch64-ol8 gcc]$ sh t
>> during IPA pass: modref
>> t.c:20:1: internal compiler error: tree code ‘ssa_name’ is not supported in LTO streams
>> 0x14c3993 lto_write_tree
>>        ../../latest-gcc-write/gcc/lto-streamer-out.cc:561
>> 0x14c3aeb lto_output_tree_1
>> 
>> And the value of the tree node that triggered the ICE is:
>> (gdb) call debug_tree(expr)
>> <ssa_name 0xfffff5761e60 type <error_mark 0xfffff56c0e58>
>>    nothrow
>>    def_stmt
>>    version:13 in-free-list>
>> 
>> Is there any good way to debug LTO bug?
> 
> This happens usually when you have a VLA type and its type fields are not
> properly gimplified which usually happens because the frontend fails to
> insert a gimplification point for it (a DECL_EXPR).

I found an old gcc bug 
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97172
ICE: tree code ‘ssa_name’ is not supported in LTO streams since r11-3303-g6450f07388f9fe57

Which is very similar to the bug I am having right now. 

After further study, I suspect that the issue I am having right now with the LTO streaming also 
relate to “unshare_expr”, “save_expr”, and the combination of these two, I suspect that
the current gcc cannot handle the combination of these two correctly for my case. 

My testing case is:

#include <stdlib.h>
void __attribute__((__noinline__)) setup_and_test_vla (int n1, int n2, int m)
{
   struct foo {
       int n;
       int p[][n2][n1] __attribute__((counted_by(n)));
   } *f;

   f = (struct foo *) malloc (sizeof(struct foo) + m*sizeof(int[n2][n1]));
   f->n = m;
   f->p[m][n2][n1]=1;
   return;
}

int main(int argc, char *argv[])
{
  setup_and_test_vla (10, 11, 20);
  return 0;
}

Failed with 
my_gcc -Os -fsanitize=bounds -flto

If changing either n1 or n2 to a constant, the testing passed. 
If deleting -flto, the testing passed too. 

I double checked my code per the suggestions provided by you and Jakub in this
email thread, and I think the code should be fine.

The code is following:

=====
504 /* Instrument array bounds for INDIRECT_REFs whose pointers are
505    POINTER_PLUS_EXPRs of calls to .ACCESS_WITH_SIZE. We create special
506    builtins that gets expanded in the sanopt pass, and make an array
507    dimension of it.  ARRAY is the pointer to the base of the array,
508    which is a call to .ACCESS_WITH_SIZE, *OFFSET is the offset to the
509    beginning of array.
510    Return NULL_TREE if no instrumentation is emitted.  */
511  
512 tree
513 ubsan_instrument_bounds_indirect_ref (location_t loc, tree array, tree *offset)
514 {
515   if (!is_access_with_size_p (array))
516     return NULL_TREE;
517   tree bound = get_bound_from_access_with_size (array);
518   /* The type of the call to .ACCESS_WITH_SIZE is a pointer type to
519      the element of the array.  */
520   tree element_size = TYPE_SIZE_UNIT (TREE_TYPE (TREE_TYPE (array)));
521   gcc_assert (bound);
522 
523   /* Given the offset, and the size of each element, the index can be
524      computed as: offset/element_size.  */
525   *offset = save_expr (*offset);
526   tree index = fold_build2 (EXACT_DIV_EXPR,
527                            sizetype, *offset,
528                            unshare_expr (element_size));
529   /* Create a "(T *) 0" tree node to describe the original array type.
530      We get the original array type from the first argument of the call to
531      .ACCESS_WITH_SIZE (REF, COUNTED_BY_REF, 1, num_bytes, -1).
532  
533      Originally, REF is a COMPONENT_REF with the original array type,
534      it was converted to a pointer to an ADDR_EXPR, and the ADDR_EXPR's
535      first operand is the original COMPONENT_REF.  */
536   tree ref = CALL_EXPR_ARG (array, 0);
537   tree array_type
538     = unshare_expr (TREE_TYPE (TREE_OPERAND (TREE_OPERAND(ref, 0), 0)));
539   tree zero_with_type = build_int_cst (build_pointer_type (array_type), 0);
540   return build_call_expr_internal_loc (loc, IFN_UBSAN_BOUNDS,
541                                        void_type_node, 3, zero_with_type,
542                                        index, bound);
543 }

=====

Inside gdb, the guilty IR failed in LTO streaming is from the above line 520:
TYPE_SIZE_UNIT (TREE_TYPE (TREE_TYPE (array))),

When I use this tree node as an operand of the expression at line 526, I added 
unshare_expr. 

However, I still see the guilty IR as in gdb:

            unit-size <mult_expr 0xfffff5aabf90 type <integer_type 0xfffff57c0000 sizetype>
                side-effects
                arg:0 <mult_expr 0xfffff5aabf68 type <integer_type 0xfffff57c0000 sizetype>

                    arg:0 <ssa_name 0xfffff5761e18 type <error_mark 0xfffff56c0e58>
                        nothrow
                        def_stmt
                        version:12 in-free-list>
                    arg:1 <ssa_name 0xfffff5761e60 type <error_mark 0xfffff56c0e58>
                        nothrow
                        def_stmt
                        version:13 in-free-list>>
                arg:1 <integer_cst 0xfffff56c10c8 constant 4>>


I have been stuck with this bug for quite some time. 
Any help is helpful.

Qing

> 
>> Thanks a lot for the help.
>> 
>> Qing
>> 


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: HELP: Questions on unshare_expr
  2024-01-16 20:25       ` Qing Zhao
@ 2024-01-17  6:42         ` Richard Biener
  2024-01-17  6:43           ` Richard Biener
  0 siblings, 1 reply; 18+ messages in thread
From: Richard Biener @ 2024-01-17  6:42 UTC (permalink / raw)
  To: Qing Zhao; +Cc: jakub, gcc Patches

On Tue, Jan 16, 2024 at 9:26 PM Qing Zhao <qing.zhao@oracle.com> wrote:
>
>
>
> > On Jan 15, 2024, at 4:31 AM, Richard Biener <richard.guenther@gmail.com> wrote:
> >
> >> All my questions for unshare_expr relate to a  LTO bug that I currently stuck with
> >> when using .ACCESS_WITH_SIZE in bound sanitizer (only with -flto, without -flto, no issue):
> >>
> >> [opc@qinzhao-aarch64-ol8 gcc]$ sh t
> >> during IPA pass: modref
> >> t.c:20:1: internal compiler error: tree code ‘ssa_name’ is not supported in LTO streams
> >> 0x14c3993 lto_write_tree
> >>        ../../latest-gcc-write/gcc/lto-streamer-out.cc:561
> >> 0x14c3aeb lto_output_tree_1
> >>
> >> And the value of the tree node that triggered the ICE is:
> >> (gdb) call debug_tree(expr)
> >> <ssa_name 0xfffff5761e60 type <error_mark 0xfffff56c0e58>
> >>    nothrow
> >>    def_stmt
> >>    version:13 in-free-list>
> >>
> >> Is there any good way to debug LTO bug?
> >
> > This happens usually when you have a VLA type and its type fields are not
> > properly gimplified which usually happens because the frontend fails to
> > insert a gimplification point for it (a DECL_EXPR).
>
> I found an old gcc bug
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97172
> ICE: tree code ‘ssa_name’ is not supported in LTO streams since r11-3303-g6450f07388f9fe57
>
> Which is very similar to the bug I am having right now.
>
> After further study, I suspect that the issue I am having right now with the LTO streaming also
> relate to “unshare_expr”, “save_expr”, and the combination of these two, I suspect that
> the current gcc cannot handle the combination of these two correctly for my case.
>
> My testing case is:
>
> #include <stdlib.h>
> void __attribute__((__noinline__)) setup_and_test_vla (int n1, int n2, int m)
> {
>    struct foo {
>        int n;
>        int p[][n2][n1] __attribute__((counted_by(n)));
>    } *f;
>
>    f = (struct foo *) malloc (sizeof(struct foo) + m*sizeof(int[n2][n1]));
>    f->n = m;
>    f->p[m][n2][n1]=1;
>    return;
> }
>
> int main(int argc, char *argv[])
> {
>   setup_and_test_vla (10, 11, 20);
>   return 0;
> }
>
> Failed with
> my_gcc -Os -fsanitize=bounds -flto
>
> If changing either n1 or n2 to a constant, the testing passed.
> If deleting -flto, the testing passed too.
>
> I double checked my code per the suggestions provided by you and Jakub in this
> email thread, and I think the code should be fine.
>
> The code is following:
>
> =====
> 504 /* Instrument array bounds for INDIRECT_REFs whose pointers are
> 505    POINTER_PLUS_EXPRs of calls to .ACCESS_WITH_SIZE. We create special
> 506    builtins that gets expanded in the sanopt pass, and make an array
> 507    dimension of it.  ARRAY is the pointer to the base of the array,
> 508    which is a call to .ACCESS_WITH_SIZE, *OFFSET is the offset to the
> 509    beginning of array.
> 510    Return NULL_TREE if no instrumentation is emitted.  */
> 511
> 512 tree
> 513 ubsan_instrument_bounds_indirect_ref (location_t loc, tree array, tree *offset)
> 514 {
> 515   if (!is_access_with_size_p (array))
> 516     return NULL_TREE;
> 517   tree bound = get_bound_from_access_with_size (array);
> 518   /* The type of the call to .ACCESS_WITH_SIZE is a pointer type to
> 519      the element of the array.  */
> 520   tree element_size = TYPE_SIZE_UNIT (TREE_TYPE (TREE_TYPE (array)));
> 521   gcc_assert (bound);
> 522
> 523   /* Given the offset, and the size of each element, the index can be
> 524      computed as: offset/element_size.  */
> 525   *offset = save_expr (*offset);
> 526   tree index = fold_build2 (EXACT_DIV_EXPR,
> 527                            sizetype, *offset,
> 528                            unshare_expr (element_size));
> 529   /* Create a "(T *) 0" tree node to describe the original array type.
> 530      We get the original array type from the first argument of the call to
> 531      .ACCESS_WITH_SIZE (REF, COUNTED_BY_REF, 1, num_bytes, -1).
> 532
> 533      Originally, REF is a COMPONENT_REF with the original array type,
> 534      it was converted to a pointer to an ADDR_EXPR, and the ADDR_EXPR's
> 535      first operand is the original COMPONENT_REF.  */
> 536   tree ref = CALL_EXPR_ARG (array, 0);
> 537   tree array_type
> 538     = unshare_expr (TREE_TYPE (TREE_OPERAND (TREE_OPERAND(ref, 0), 0)));
> 539   tree zero_with_type = build_int_cst (build_pointer_type (array_type), 0);
> 540   return build_call_expr_internal_loc (loc, IFN_UBSAN_BOUNDS,
> 541                                        void_type_node, 3, zero_with_type,
> 542                                        index, bound);
> 543 }
>
> =====
>
> Inside gdb, the guilty IR failed in LTO streaming is from the above line 520:
> TYPE_SIZE_UNIT (TREE_TYPE (TREE_TYPE (array))),
>
> When I use this tree node as an operand of the expression at line 526, I added
> unshare_expr.
>
> However, I still see the guilty IR as in gdb:
>
>             unit-size <mult_expr 0xfffff5aabf90 type <integer_type 0xfffff57c0000 sizetype>
>                 side-effects
>                 arg:0 <mult_expr 0xfffff5aabf68 type <integer_type 0xfffff57c0000 sizetype>
>
>                     arg:0 <ssa_name 0xfffff5761e18 type <error_mark 0xfffff56c0e58>
>                         nothrow
>                         def_stmt
>                         version:12 in-free-list>
>                     arg:1 <ssa_name 0xfffff5761e60 type <error_mark 0xfffff56c0e58>
>                         nothrow
>                         def_stmt
>                         version:13 in-free-list>>
>                 arg:1 <integer_cst 0xfffff56c10c8 constant 4>>
>
>
> I have been stuck with this bug for quite some time.
> Any help is helpful.

The above hasn't been gimplified correctly, you'd instead see
a D.1234 in there, not an expression with SSA names.  That happens
when the frontend fails to emit a DECL_EXPR for a decl with this
type.

>
> Qing
>
> >
> >> Thanks a lot for the help.
> >>
> >> Qing
> >>
>

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: HELP: Questions on unshare_expr
  2024-01-17  6:42         ` Richard Biener
@ 2024-01-17  6:43           ` Richard Biener
  2024-01-18 14:45             ` Qing Zhao
  0 siblings, 1 reply; 18+ messages in thread
From: Richard Biener @ 2024-01-17  6:43 UTC (permalink / raw)
  To: Qing Zhao; +Cc: jakub, gcc Patches

On Wed, Jan 17, 2024 at 7:42 AM Richard Biener
<richard.guenther@gmail.com> wrote:
>
> On Tue, Jan 16, 2024 at 9:26 PM Qing Zhao <qing.zhao@oracle.com> wrote:
> >
> >
> >
> > > On Jan 15, 2024, at 4:31 AM, Richard Biener <richard.guenther@gmail.com> wrote:
> > >
> > >> All my questions for unshare_expr relate to a  LTO bug that I currently stuck with
> > >> when using .ACCESS_WITH_SIZE in bound sanitizer (only with -flto, without -flto, no issue):
> > >>
> > >> [opc@qinzhao-aarch64-ol8 gcc]$ sh t
> > >> during IPA pass: modref
> > >> t.c:20:1: internal compiler error: tree code ‘ssa_name’ is not supported in LTO streams
> > >> 0x14c3993 lto_write_tree
> > >>        ../../latest-gcc-write/gcc/lto-streamer-out.cc:561
> > >> 0x14c3aeb lto_output_tree_1
> > >>
> > >> And the value of the tree node that triggered the ICE is:
> > >> (gdb) call debug_tree(expr)
> > >> <ssa_name 0xfffff5761e60 type <error_mark 0xfffff56c0e58>
> > >>    nothrow
> > >>    def_stmt
> > >>    version:13 in-free-list>
> > >>
> > >> Is there any good way to debug LTO bug?
> > >
> > > This happens usually when you have a VLA type and its type fields are not
> > > properly gimplified which usually happens because the frontend fails to
> > > insert a gimplification point for it (a DECL_EXPR).
> >
> > I found an old gcc bug
> > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97172
> > ICE: tree code ‘ssa_name’ is not supported in LTO streams since r11-3303-g6450f07388f9fe57
> >
> > Which is very similar to the bug I am having right now.
> >
> > After further study, I suspect that the issue I am having right now with the LTO streaming also
> > relate to “unshare_expr”, “save_expr”, and the combination of these two, I suspect that
> > the current gcc cannot handle the combination of these two correctly for my case.
> >
> > My testing case is:
> >
> > #include <stdlib.h>
> > void __attribute__((__noinline__)) setup_and_test_vla (int n1, int n2, int m)
> > {
> >    struct foo {
> >        int n;
> >        int p[][n2][n1] __attribute__((counted_by(n)));
> >    } *f;
> >
> >    f = (struct foo *) malloc (sizeof(struct foo) + m*sizeof(int[n2][n1]));
> >    f->n = m;
> >    f->p[m][n2][n1]=1;
> >    return;
> > }
> >
> > int main(int argc, char *argv[])
> > {
> >   setup_and_test_vla (10, 11, 20);
> >   return 0;
> > }
> >
> > Failed with
> > my_gcc -Os -fsanitize=bounds -flto
> >
> > If changing either n1 or n2 to a constant, the testing passed.
> > If deleting -flto, the testing passed too.
> >
> > I double checked my code per the suggestions provided by you and Jakub in this
> > email thread, and I think the code should be fine.
> >
> > The code is following:
> >
> > =====
> > 504 /* Instrument array bounds for INDIRECT_REFs whose pointers are
> > 505    POINTER_PLUS_EXPRs of calls to .ACCESS_WITH_SIZE. We create special
> > 506    builtins that gets expanded in the sanopt pass, and make an array
> > 507    dimension of it.  ARRAY is the pointer to the base of the array,
> > 508    which is a call to .ACCESS_WITH_SIZE, *OFFSET is the offset to the
> > 509    beginning of array.
> > 510    Return NULL_TREE if no instrumentation is emitted.  */
> > 511
> > 512 tree
> > 513 ubsan_instrument_bounds_indirect_ref (location_t loc, tree array, tree *offset)
> > 514 {
> > 515   if (!is_access_with_size_p (array))
> > 516     return NULL_TREE;
> > 517   tree bound = get_bound_from_access_with_size (array);
> > 518   /* The type of the call to .ACCESS_WITH_SIZE is a pointer type to
> > 519      the element of the array.  */
> > 520   tree element_size = TYPE_SIZE_UNIT (TREE_TYPE (TREE_TYPE (array)));
> > 521   gcc_assert (bound);
> > 522
> > 523   /* Given the offset, and the size of each element, the index can be
> > 524      computed as: offset/element_size.  */
> > 525   *offset = save_expr (*offset);
> > 526   tree index = fold_build2 (EXACT_DIV_EXPR,
> > 527                            sizetype, *offset,
> > 528                            unshare_expr (element_size));
> > 529   /* Create a "(T *) 0" tree node to describe the original array type.
> > 530      We get the original array type from the first argument of the call to
> > 531      .ACCESS_WITH_SIZE (REF, COUNTED_BY_REF, 1, num_bytes, -1).
> > 532
> > 533      Originally, REF is a COMPONENT_REF with the original array type,
> > 534      it was converted to a pointer to an ADDR_EXPR, and the ADDR_EXPR's
> > 535      first operand is the original COMPONENT_REF.  */
> > 536   tree ref = CALL_EXPR_ARG (array, 0);
> > 537   tree array_type
> > 538     = unshare_expr (TREE_TYPE (TREE_OPERAND (TREE_OPERAND(ref, 0), 0)));
> > 539   tree zero_with_type = build_int_cst (build_pointer_type (array_type), 0);
> > 540   return build_call_expr_internal_loc (loc, IFN_UBSAN_BOUNDS,
> > 541                                        void_type_node, 3, zero_with_type,
> > 542                                        index, bound);
> > 543 }
> >
> > =====
> >
> > Inside gdb, the guilty IR failed in LTO streaming is from the above line 520:
> > TYPE_SIZE_UNIT (TREE_TYPE (TREE_TYPE (array))),
> >
> > When I use this tree node as an operand of the expression at line 526, I added
> > unshare_expr.
> >
> > However, I still see the guilty IR as in gdb:
> >
> >             unit-size <mult_expr 0xfffff5aabf90 type <integer_type 0xfffff57c0000 sizetype>
> >                 side-effects
> >                 arg:0 <mult_expr 0xfffff5aabf68 type <integer_type 0xfffff57c0000 sizetype>
> >
> >                     arg:0 <ssa_name 0xfffff5761e18 type <error_mark 0xfffff56c0e58>
> >                         nothrow
> >                         def_stmt
> >                         version:12 in-free-list>
> >                     arg:1 <ssa_name 0xfffff5761e60 type <error_mark 0xfffff56c0e58>
> >                         nothrow
> >                         def_stmt
> >                         version:13 in-free-list>>
> >                 arg:1 <integer_cst 0xfffff56c10c8 constant 4>>
> >
> >
> > I have been stuck with this bug for quite some time.
> > Any help is helpful.
>
> The above hasn't been gimplified correctly, you'd instead see
> a D.1234 in there, not an expression with SSA names.  That happens
> when the frontend fails to emit a DECL_EXPR for a decl with this
> type.

.. which then also results in missing unsharing of this expression
(so the SSA names leak in)

>
> >
> > Qing
> >
> > >
> > >> Thanks a lot for the help.
> > >>
> > >> Qing
> > >>
> >

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: HELP: Questions on unshare_expr
  2024-01-17  6:43           ` Richard Biener
@ 2024-01-18 14:45             ` Qing Zhao
  2024-01-19  9:30               ` Richard Biener
  0 siblings, 1 reply; 18+ messages in thread
From: Qing Zhao @ 2024-01-18 14:45 UTC (permalink / raw)
  To: Richard Biener; +Cc: jakub, gcc Patches



> On Jan 17, 2024, at 1:43 AM, Richard Biener <richard.guenther@gmail.com> wrote:
> 
> On Wed, Jan 17, 2024 at 7:42 AM Richard Biener
> <richard.guenther@gmail.com> wrote:
>> 
>> On Tue, Jan 16, 2024 at 9:26 PM Qing Zhao <qing.zhao@oracle.com> wrote:
>>> 
>>> 
>>> 
>>>> On Jan 15, 2024, at 4:31 AM, Richard Biener <richard.guenther@gmail.com> wrote:
>>>> 
>>>>> All my questions for unshare_expr relate to a  LTO bug that I currently stuck with
>>>>> when using .ACCESS_WITH_SIZE in bound sanitizer (only with -flto, without -flto, no issue):
>>>>> 
>>>>> [opc@qinzhao-aarch64-ol8 gcc]$ sh t
>>>>> during IPA pass: modref
>>>>> t.c:20:1: internal compiler error: tree code ‘ssa_name’ is not supported in LTO streams
>>>>> 0x14c3993 lto_write_tree
>>>>>       ../../latest-gcc-write/gcc/lto-streamer-out.cc:561
>>>>> 0x14c3aeb lto_output_tree_1
>>>>> 
>>>>> And the value of the tree node that triggered the ICE is:
>>>>> (gdb) call debug_tree(expr)
>>>>> <ssa_name 0xfffff5761e60 type <error_mark 0xfffff56c0e58>
>>>>>   nothrow
>>>>>   def_stmt
>>>>>   version:13 in-free-list>
>>>>> 
>>>>> Is there any good way to debug LTO bug?
>>>> 
>>>> This happens usually when you have a VLA type and its type fields are not
>>>> properly gimplified which usually happens because the frontend fails to
>>>> insert a gimplification point for it (a DECL_EXPR).
>>> 
>>> I found an old gcc bug
>>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97172
>>> ICE: tree code ‘ssa_name’ is not supported in LTO streams since r11-3303-g6450f07388f9fe57
>>> 
>>> Which is very similar to the bug I am having right now.
>>> 
>>> After further study, I suspect that the issue I am having right now with the LTO streaming also
>>> relate to “unshare_expr”, “save_expr”, and the combination of these two, I suspect that
>>> the current gcc cannot handle the combination of these two correctly for my case.
>>> 
>>> My testing case is:
>>> 
>>> #include <stdlib.h>
>>> void __attribute__((__noinline__)) setup_and_test_vla (int n1, int n2, int m)
>>> {
>>>   struct foo {
>>>       int n;
>>>       int p[][n2][n1] __attribute__((counted_by(n)));
>>>   } *f;
>>> 
>>>   f = (struct foo *) malloc (sizeof(struct foo) + m*sizeof(int[n2][n1]));
>>>   f->n = m;
>>>   f->p[m][n2][n1]=1;
>>>   return;
>>> }
>>> 
>>> int main(int argc, char *argv[])
>>> {
>>>  setup_and_test_vla (10, 11, 20);
>>>  return 0;
>>> }
>>> 
>>> Failed with
>>> my_gcc -Os -fsanitize=bounds -flto
>>> 
>>> If changing either n1 or n2 to a constant, the testing passed.
>>> If deleting -flto, the testing passed too.
>>> 
>>> I double checked my code per the suggestions provided by you and Jakub in this
>>> email thread, and I think the code should be fine.
>>> 
>>> The code is following:
>>> 
>>> =====
>>> 504 /* Instrument array bounds for INDIRECT_REFs whose pointers are
>>> 505    POINTER_PLUS_EXPRs of calls to .ACCESS_WITH_SIZE. We create special
>>> 506    builtins that gets expanded in the sanopt pass, and make an array
>>> 507    dimension of it.  ARRAY is the pointer to the base of the array,
>>> 508    which is a call to .ACCESS_WITH_SIZE, *OFFSET is the offset to the
>>> 509    beginning of array.
>>> 510    Return NULL_TREE if no instrumentation is emitted.  */
>>> 511
>>> 512 tree
>>> 513 ubsan_instrument_bounds_indirect_ref (location_t loc, tree array, tree *offset)
>>> 514 {
>>> 515   if (!is_access_with_size_p (array))
>>> 516     return NULL_TREE;
>>> 517   tree bound = get_bound_from_access_with_size (array);
>>> 518   /* The type of the call to .ACCESS_WITH_SIZE is a pointer type to
>>> 519      the element of the array.  */
>>> 520   tree element_size = TYPE_SIZE_UNIT (TREE_TYPE (TREE_TYPE (array)));
>>> 521   gcc_assert (bound);
>>> 522
>>> 523   /* Given the offset, and the size of each element, the index can be
>>> 524      computed as: offset/element_size.  */
>>> 525   *offset = save_expr (*offset);
>>> 526   tree index = fold_build2 (EXACT_DIV_EXPR,
>>> 527                            sizetype, *offset,
>>> 528                            unshare_expr (element_size));
>>> 529   /* Create a "(T *) 0" tree node to describe the original array type.
>>> 530      We get the original array type from the first argument of the call to
>>> 531      .ACCESS_WITH_SIZE (REF, COUNTED_BY_REF, 1, num_bytes, -1).
>>> 532
>>> 533      Originally, REF is a COMPONENT_REF with the original array type,
>>> 534      it was converted to a pointer to an ADDR_EXPR, and the ADDR_EXPR's
>>> 535      first operand is the original COMPONENT_REF.  */
>>> 536   tree ref = CALL_EXPR_ARG (array, 0);
>>> 537   tree array_type
>>> 538     = unshare_expr (TREE_TYPE (TREE_OPERAND (TREE_OPERAND(ref, 0), 0)));
>>> 539   tree zero_with_type = build_int_cst (build_pointer_type (array_type), 0);
>>> 540   return build_call_expr_internal_loc (loc, IFN_UBSAN_BOUNDS,
>>> 541                                        void_type_node, 3, zero_with_type,
>>> 542                                        index, bound);
>>> 543 }
>>> 
>>> =====
>>> 
>>> Inside gdb, the guilty IR failed in LTO streaming is from the above line 520:
>>> TYPE_SIZE_UNIT (TREE_TYPE (TREE_TYPE (array))),
>>> 
>>> When I use this tree node as an operand of the expression at line 526, I added
>>> unshare_expr.
>>> 
>>> However, I still see the guilty IR as in gdb:
>>> 
>>>            unit-size <mult_expr 0xfffff5aabf90 type <integer_type 0xfffff57c0000 sizetype>
>>>                side-effects
>>>                arg:0 <mult_expr 0xfffff5aabf68 type <integer_type 0xfffff57c0000 sizetype>
>>> 
>>>                    arg:0 <ssa_name 0xfffff5761e18 type <error_mark 0xfffff56c0e58>
>>>                        nothrow
>>>                        def_stmt
>>>                        version:12 in-free-list>
>>>                    arg:1 <ssa_name 0xfffff5761e60 type <error_mark 0xfffff56c0e58>
>>>                        nothrow
>>>                        def_stmt
>>>                        version:13 in-free-list>>
>>>                arg:1 <integer_cst 0xfffff56c10c8 constant 4>>
>>> 
>>> 
>>> I have been stuck with this bug for quite some time.
>>> Any help is helpful.
>> 
>> The above hasn't been gimplified correctly, you'd instead see
>> a D.1234 in there, not an expression with SSA names.  That happens
>> when the frontend fails to emit a DECL_EXPR for a decl with this
>> type.
> 
> .. which then also results in missing unsharing of this expression
> (so the SSA names leak in)

Thanks a lot for the hints.

One correction first, the LTO bug is not related to -fsanitize=bounds.  Deleting -fsanitize=bounds still can
repeat the failure.

After further debugging into the gimplification phase related with the SAVE_EXPR, I finally locate the place 
where the unshareing of the expression is missing.   This is in the routine “pointer_int_sum” of c-family/c-common.cc:

3330     {
3331       if (!complain && !COMPLETE_TYPE_P (TREE_TYPE (result_type)))
3332         return error_mark_node;
3333       size_exp = size_in_bytes_loc (loc, TREE_TYPE (result_type));
3334       /* Wrap the pointer expression in a SAVE_EXPR to make sure it
3335          is evaluated first when the size expression may depend
3336          on it for VM types.  */
3337       if (TREE_SIDE_EFFECTS (size_exp)
3338           && TREE_SIDE_EFFECTS (ptrop)
3339           && variably_modified_type_p (TREE_TYPE (ptrop), NULL))
3340         {
3341           ptrop = save_expr (ptrop);
3342           size_exp = build2 (COMPOUND_EXPR, TREE_TYPE (intop), ptrop, size_exp);
3343         }
3344     }

In the above, at line 3333, the tree node, TYPE_SIZE_UNIT (TREE_TYPE(result_type)), is returned directly as
the size_exp, 

 (gdb) call debug_tree(size_exp)
 <mult_expr 0xfffff5a6f910
    type <integer_type 0xfffff57c0000 sizetype public unsigned DI
        size <integer_cst 0xfffff56c0e70 constant 64>
        unit-size <integer_cst 0xfffff56c0e88 constant 8>
        align:64 warn_if_not_align:0 symtab:0 alias-set -1 canonical-type 0xfffff57c0000 precision:64 min <integer_cst 0xfffff56c0ea0 0> max <integer_cst 0xfffff56d05e0 18446744073709551615>>
    side-effects
    arg:0 <mult_expr 0xfffff5a6f8e8 type <integer_type 0xfffff57c0000 sizetype>
        side-effects
        arg:0 <nop_expr 0xfffff56dc540 type <integer_type 0xfffff57c0000 sizetype>
            side-effects
            arg:0 <save_expr 0xfffff56dc4c0 type <integer_type 0xfffff57c05e8 int>
                side-effects arg:0 <parm_decl 0xfffff76b6f80 n1>>>
        arg:1 <nop_expr 0xfffff56dc600 type <integer_type 0xfffff57c0000 sizetype>
            side-effects
            arg:0 <save_expr 0xfffff56dc580 type <integer_type 0xfffff57c05e8 int>
                side-effects arg:0 <parm_decl 0xfffff76b7000 n2>>>>
    arg:1 <integer_cst 0xfffff56c10c8 type <integer_type 0xfffff57c0000 sizetype> constant 4>>


Without unshare_expr to this size_exp, the above TYPE_SIZE_UNIT node containing SAVE_EXPRs 
is gimpflified to expressions with SSA_NAME during gimplification.  (This is unaccepted by LTO).

Adding an unshare_expr (size_exp) resolved this problem. 

Although I still think that there might be potential issue with the gimpflication of SAVE_EXPRs, I dare not
to modify that part of the code. 

At this moment, I will add unshare_expr to the routine “pointer_int_sum” to workaround this issue.

Let me know if you have any comment and suggestion.

Thanks a lot.

Qing

>> 
>>> 
>>> Qing
>>> 
>>>> 
>>>>> Thanks a lot for the help.
>>>>> 
>>>>> Qing



^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: HELP: Questions on unshare_expr
  2024-01-18 14:45             ` Qing Zhao
@ 2024-01-19  9:30               ` Richard Biener
  2024-01-19 16:25                 ` Qing Zhao
  0 siblings, 1 reply; 18+ messages in thread
From: Richard Biener @ 2024-01-19  9:30 UTC (permalink / raw)
  To: Qing Zhao; +Cc: jakub, gcc Patches

On Thu, Jan 18, 2024 at 3:46 PM Qing Zhao <qing.zhao@oracle.com> wrote:
>
>
>
> > On Jan 17, 2024, at 1:43 AM, Richard Biener <richard.guenther@gmail.com> wrote:
> >
> > On Wed, Jan 17, 2024 at 7:42 AM Richard Biener
> > <richard.guenther@gmail.com> wrote:
> >>
> >> On Tue, Jan 16, 2024 at 9:26 PM Qing Zhao <qing.zhao@oracle.com> wrote:
> >>>
> >>>
> >>>
> >>>> On Jan 15, 2024, at 4:31 AM, Richard Biener <richard.guenther@gmail.com> wrote:
> >>>>
> >>>>> All my questions for unshare_expr relate to a  LTO bug that I currently stuck with
> >>>>> when using .ACCESS_WITH_SIZE in bound sanitizer (only with -flto, without -flto, no issue):
> >>>>>
> >>>>> [opc@qinzhao-aarch64-ol8 gcc]$ sh t
> >>>>> during IPA pass: modref
> >>>>> t.c:20:1: internal compiler error: tree code ‘ssa_name’ is not supported in LTO streams
> >>>>> 0x14c3993 lto_write_tree
> >>>>>       ../../latest-gcc-write/gcc/lto-streamer-out.cc:561
> >>>>> 0x14c3aeb lto_output_tree_1
> >>>>>
> >>>>> And the value of the tree node that triggered the ICE is:
> >>>>> (gdb) call debug_tree(expr)
> >>>>> <ssa_name 0xfffff5761e60 type <error_mark 0xfffff56c0e58>
> >>>>>   nothrow
> >>>>>   def_stmt
> >>>>>   version:13 in-free-list>
> >>>>>
> >>>>> Is there any good way to debug LTO bug?
> >>>>
> >>>> This happens usually when you have a VLA type and its type fields are not
> >>>> properly gimplified which usually happens because the frontend fails to
> >>>> insert a gimplification point for it (a DECL_EXPR).
> >>>
> >>> I found an old gcc bug
> >>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97172
> >>> ICE: tree code ‘ssa_name’ is not supported in LTO streams since r11-3303-g6450f07388f9fe57
> >>>
> >>> Which is very similar to the bug I am having right now.
> >>>
> >>> After further study, I suspect that the issue I am having right now with the LTO streaming also
> >>> relate to “unshare_expr”, “save_expr”, and the combination of these two, I suspect that
> >>> the current gcc cannot handle the combination of these two correctly for my case.
> >>>
> >>> My testing case is:
> >>>
> >>> #include <stdlib.h>
> >>> void __attribute__((__noinline__)) setup_and_test_vla (int n1, int n2, int m)
> >>> {
> >>>   struct foo {
> >>>       int n;
> >>>       int p[][n2][n1] __attribute__((counted_by(n)));
> >>>   } *f;
> >>>
> >>>   f = (struct foo *) malloc (sizeof(struct foo) + m*sizeof(int[n2][n1]));
> >>>   f->n = m;
> >>>   f->p[m][n2][n1]=1;
> >>>   return;
> >>> }
> >>>
> >>> int main(int argc, char *argv[])
> >>> {
> >>>  setup_and_test_vla (10, 11, 20);
> >>>  return 0;
> >>> }
> >>>
> >>> Failed with
> >>> my_gcc -Os -fsanitize=bounds -flto
> >>>
> >>> If changing either n1 or n2 to a constant, the testing passed.
> >>> If deleting -flto, the testing passed too.
> >>>
> >>> I double checked my code per the suggestions provided by you and Jakub in this
> >>> email thread, and I think the code should be fine.
> >>>
> >>> The code is following:
> >>>
> >>> =====
> >>> 504 /* Instrument array bounds for INDIRECT_REFs whose pointers are
> >>> 505    POINTER_PLUS_EXPRs of calls to .ACCESS_WITH_SIZE. We create special
> >>> 506    builtins that gets expanded in the sanopt pass, and make an array
> >>> 507    dimension of it.  ARRAY is the pointer to the base of the array,
> >>> 508    which is a call to .ACCESS_WITH_SIZE, *OFFSET is the offset to the
> >>> 509    beginning of array.
> >>> 510    Return NULL_TREE if no instrumentation is emitted.  */
> >>> 511
> >>> 512 tree
> >>> 513 ubsan_instrument_bounds_indirect_ref (location_t loc, tree array, tree *offset)
> >>> 514 {
> >>> 515   if (!is_access_with_size_p (array))
> >>> 516     return NULL_TREE;
> >>> 517   tree bound = get_bound_from_access_with_size (array);
> >>> 518   /* The type of the call to .ACCESS_WITH_SIZE is a pointer type to
> >>> 519      the element of the array.  */
> >>> 520   tree element_size = TYPE_SIZE_UNIT (TREE_TYPE (TREE_TYPE (array)));
> >>> 521   gcc_assert (bound);
> >>> 522
> >>> 523   /* Given the offset, and the size of each element, the index can be
> >>> 524      computed as: offset/element_size.  */
> >>> 525   *offset = save_expr (*offset);
> >>> 526   tree index = fold_build2 (EXACT_DIV_EXPR,
> >>> 527                            sizetype, *offset,
> >>> 528                            unshare_expr (element_size));
> >>> 529   /* Create a "(T *) 0" tree node to describe the original array type.
> >>> 530      We get the original array type from the first argument of the call to
> >>> 531      .ACCESS_WITH_SIZE (REF, COUNTED_BY_REF, 1, num_bytes, -1).
> >>> 532
> >>> 533      Originally, REF is a COMPONENT_REF with the original array type,
> >>> 534      it was converted to a pointer to an ADDR_EXPR, and the ADDR_EXPR's
> >>> 535      first operand is the original COMPONENT_REF.  */
> >>> 536   tree ref = CALL_EXPR_ARG (array, 0);
> >>> 537   tree array_type
> >>> 538     = unshare_expr (TREE_TYPE (TREE_OPERAND (TREE_OPERAND(ref, 0), 0)));
> >>> 539   tree zero_with_type = build_int_cst (build_pointer_type (array_type), 0);
> >>> 540   return build_call_expr_internal_loc (loc, IFN_UBSAN_BOUNDS,
> >>> 541                                        void_type_node, 3, zero_with_type,
> >>> 542                                        index, bound);
> >>> 543 }
> >>>
> >>> =====
> >>>
> >>> Inside gdb, the guilty IR failed in LTO streaming is from the above line 520:
> >>> TYPE_SIZE_UNIT (TREE_TYPE (TREE_TYPE (array))),
> >>>
> >>> When I use this tree node as an operand of the expression at line 526, I added
> >>> unshare_expr.
> >>>
> >>> However, I still see the guilty IR as in gdb:
> >>>
> >>>            unit-size <mult_expr 0xfffff5aabf90 type <integer_type 0xfffff57c0000 sizetype>
> >>>                side-effects
> >>>                arg:0 <mult_expr 0xfffff5aabf68 type <integer_type 0xfffff57c0000 sizetype>
> >>>
> >>>                    arg:0 <ssa_name 0xfffff5761e18 type <error_mark 0xfffff56c0e58>
> >>>                        nothrow
> >>>                        def_stmt
> >>>                        version:12 in-free-list>
> >>>                    arg:1 <ssa_name 0xfffff5761e60 type <error_mark 0xfffff56c0e58>
> >>>                        nothrow
> >>>                        def_stmt
> >>>                        version:13 in-free-list>>
> >>>                arg:1 <integer_cst 0xfffff56c10c8 constant 4>>
> >>>
> >>>
> >>> I have been stuck with this bug for quite some time.
> >>> Any help is helpful.
> >>
> >> The above hasn't been gimplified correctly, you'd instead see
> >> a D.1234 in there, not an expression with SSA names.  That happens
> >> when the frontend fails to emit a DECL_EXPR for a decl with this
> >> type.
> >
> > .. which then also results in missing unsharing of this expression
> > (so the SSA names leak in)
>
> Thanks a lot for the hints.
>
> One correction first, the LTO bug is not related to -fsanitize=bounds.  Deleting -fsanitize=bounds still can
> repeat the failure.
>
> After further debugging into the gimplification phase related with the SAVE_EXPR, I finally locate the place
> where the unshareing of the expression is missing.   This is in the routine “pointer_int_sum” of c-family/c-common.cc:
>
> 3330     {
> 3331       if (!complain && !COMPLETE_TYPE_P (TREE_TYPE (result_type)))
> 3332         return error_mark_node;
> 3333       size_exp = size_in_bytes_loc (loc, TREE_TYPE (result_type));
> 3334       /* Wrap the pointer expression in a SAVE_EXPR to make sure it
> 3335          is evaluated first when the size expression may depend
> 3336          on it for VM types.  */
> 3337       if (TREE_SIDE_EFFECTS (size_exp)
> 3338           && TREE_SIDE_EFFECTS (ptrop)
> 3339           && variably_modified_type_p (TREE_TYPE (ptrop), NULL))
> 3340         {
> 3341           ptrop = save_expr (ptrop);
> 3342           size_exp = build2 (COMPOUND_EXPR, TREE_TYPE (intop), ptrop, size_exp);
> 3343         }
> 3344     }
>
> In the above, at line 3333, the tree node, TYPE_SIZE_UNIT (TREE_TYPE(result_type)), is returned directly as
> the size_exp,
>
>  (gdb) call debug_tree(size_exp)
>  <mult_expr 0xfffff5a6f910
>     type <integer_type 0xfffff57c0000 sizetype public unsigned DI
>         size <integer_cst 0xfffff56c0e70 constant 64>
>         unit-size <integer_cst 0xfffff56c0e88 constant 8>
>         align:64 warn_if_not_align:0 symtab:0 alias-set -1 canonical-type 0xfffff57c0000 precision:64 min <integer_cst 0xfffff56c0ea0 0> max <integer_cst 0xfffff56d05e0 18446744073709551615>>
>     side-effects
>     arg:0 <mult_expr 0xfffff5a6f8e8 type <integer_type 0xfffff57c0000 sizetype>
>         side-effects
>         arg:0 <nop_expr 0xfffff56dc540 type <integer_type 0xfffff57c0000 sizetype>
>             side-effects
>             arg:0 <save_expr 0xfffff56dc4c0 type <integer_type 0xfffff57c05e8 int>
>                 side-effects arg:0 <parm_decl 0xfffff76b6f80 n1>>>
>         arg:1 <nop_expr 0xfffff56dc600 type <integer_type 0xfffff57c0000 sizetype>
>             side-effects
>             arg:0 <save_expr 0xfffff56dc580 type <integer_type 0xfffff57c05e8 int>
>                 side-effects arg:0 <parm_decl 0xfffff76b7000 n2>>>>
>     arg:1 <integer_cst 0xfffff56c10c8 type <integer_type 0xfffff57c0000 sizetype> constant 4>>
>
>
> Without unshare_expr to this size_exp, the above TYPE_SIZE_UNIT node containing SAVE_EXPRs
> is gimpflified to expressions with SSA_NAME during gimplification.  (This is unaccepted by LTO).
>
> Adding an unshare_expr (size_exp) resolved this problem.
>
> Although I still think that there might be potential issue with the gimpflication of SAVE_EXPRs, I dare not
> to modify that part of the code.
>
> At this moment, I will add unshare_expr to the routine “pointer_int_sum” to workaround this issue.

It's only a workaround mind you.  The bug is that the frontend fails
to emit a DECL_EXPR which would
trigger both unsharing and proper gimplification of the type size.

> Let me know if you have any comment and suggestion.
>
> Thanks a lot.
>
> Qing
>
> >>
> >>>
> >>> Qing
> >>>
> >>>>
> >>>>> Thanks a lot for the help.
> >>>>>
> >>>>> Qing
>
>

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: HELP: Questions on unshare_expr
  2024-01-19  9:30               ` Richard Biener
@ 2024-01-19 16:25                 ` Qing Zhao
  2024-01-22  7:40                   ` Richard Biener
  0 siblings, 1 reply; 18+ messages in thread
From: Qing Zhao @ 2024-01-19 16:25 UTC (permalink / raw)
  To: Richard Biener; +Cc: jakub, gcc Patches



> On Jan 19, 2024, at 4:30 AM, Richard Biener <richard.guenther@gmail.com> wrote:
> 
> On Thu, Jan 18, 2024 at 3:46 PM Qing Zhao <qing.zhao@oracle.com> wrote:
>> 
>> 
>> 
>>> On Jan 17, 2024, at 1:43 AM, Richard Biener <richard.guenther@gmail.com> wrote:
>>> 
>>> On Wed, Jan 17, 2024 at 7:42 AM Richard Biener
>>> <richard.guenther@gmail.com> wrote:
>>>> 
>>>> On Tue, Jan 16, 2024 at 9:26 PM Qing Zhao <qing.zhao@oracle.com> wrote:
>>>>> 
>>>>> 
>>>>> 
>>>>>> On Jan 15, 2024, at 4:31 AM, Richard Biener <richard.guenther@gmail.com> wrote:
>>>>>> 
>>>>>>> All my questions for unshare_expr relate to a  LTO bug that I currently stuck with
>>>>>>> when using .ACCESS_WITH_SIZE in bound sanitizer (only with -flto, without -flto, no issue):
>>>>>>> 
>>>>>>> [opc@qinzhao-aarch64-ol8 gcc]$ sh t
>>>>>>> during IPA pass: modref
>>>>>>> t.c:20:1: internal compiler error: tree code ‘ssa_name’ is not supported in LTO streams
>>>>>>> 0x14c3993 lto_write_tree
>>>>>>>      ../../latest-gcc-write/gcc/lto-streamer-out.cc:561
>>>>>>> 0x14c3aeb lto_output_tree_1
>>>>>>> 
>>>>>>> And the value of the tree node that triggered the ICE is:
>>>>>>> (gdb) call debug_tree(expr)
>>>>>>> <ssa_name 0xfffff5761e60 type <error_mark 0xfffff56c0e58>
>>>>>>>  nothrow
>>>>>>>  def_stmt
>>>>>>>  version:13 in-free-list>
>>>>>>> 
>>>>>>> Is there any good way to debug LTO bug?
>>>>>> 
>>>>>> This happens usually when you have a VLA type and its type fields are not
>>>>>> properly gimplified which usually happens because the frontend fails to
>>>>>> insert a gimplification point for it (a DECL_EXPR).
>>>>> 
>>>>> I found an old gcc bug
>>>>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97172
>>>>> ICE: tree code ‘ssa_name’ is not supported in LTO streams since r11-3303-g6450f07388f9fe57
>>>>> 
>>>>> Which is very similar to the bug I am having right now.
>>>>> 
>>>>> After further study, I suspect that the issue I am having right now with the LTO streaming also
>>>>> relate to “unshare_expr”, “save_expr”, and the combination of these two, I suspect that
>>>>> the current gcc cannot handle the combination of these two correctly for my case.
>>>>> 
>>>>> My testing case is:
>>>>> 
>>>>> #include <stdlib.h>
>>>>> void __attribute__((__noinline__)) setup_and_test_vla (int n1, int n2, int m)
>>>>> {
>>>>>  struct foo {
>>>>>      int n;
>>>>>      int p[][n2][n1] __attribute__((counted_by(n)));
>>>>>  } *f;
>>>>> 
>>>>>  f = (struct foo *) malloc (sizeof(struct foo) + m*sizeof(int[n2][n1]));
>>>>>  f->n = m;
>>>>>  f->p[m][n2][n1]=1;
>>>>>  return;
>>>>> }
>>>>> 
>>>>> int main(int argc, char *argv[])
>>>>> {
>>>>> setup_and_test_vla (10, 11, 20);
>>>>> return 0;
>>>>> }
>>>>> 
>>>>> Failed with
>>>>> my_gcc -Os -fsanitize=bounds -flto
>>>>> 
>>>>> If changing either n1 or n2 to a constant, the testing passed.
>>>>> If deleting -flto, the testing passed too.
>>>>> 
>>>>> I double checked my code per the suggestions provided by you and Jakub in this
>>>>> email thread, and I think the code should be fine.
>>>>> 
>>>>> The code is following:
>>>>> 
>>>>> =====
>>>>> 504 /* Instrument array bounds for INDIRECT_REFs whose pointers are
>>>>> 505    POINTER_PLUS_EXPRs of calls to .ACCESS_WITH_SIZE. We create special
>>>>> 506    builtins that gets expanded in the sanopt pass, and make an array
>>>>> 507    dimension of it.  ARRAY is the pointer to the base of the array,
>>>>> 508    which is a call to .ACCESS_WITH_SIZE, *OFFSET is the offset to the
>>>>> 509    beginning of array.
>>>>> 510    Return NULL_TREE if no instrumentation is emitted.  */
>>>>> 511
>>>>> 512 tree
>>>>> 513 ubsan_instrument_bounds_indirect_ref (location_t loc, tree array, tree *offset)
>>>>> 514 {
>>>>> 515   if (!is_access_with_size_p (array))
>>>>> 516     return NULL_TREE;
>>>>> 517   tree bound = get_bound_from_access_with_size (array);
>>>>> 518   /* The type of the call to .ACCESS_WITH_SIZE is a pointer type to
>>>>> 519      the element of the array.  */
>>>>> 520   tree element_size = TYPE_SIZE_UNIT (TREE_TYPE (TREE_TYPE (array)));
>>>>> 521   gcc_assert (bound);
>>>>> 522
>>>>> 523   /* Given the offset, and the size of each element, the index can be
>>>>> 524      computed as: offset/element_size.  */
>>>>> 525   *offset = save_expr (*offset);
>>>>> 526   tree index = fold_build2 (EXACT_DIV_EXPR,
>>>>> 527                            sizetype, *offset,
>>>>> 528                            unshare_expr (element_size));
>>>>> 529   /* Create a "(T *) 0" tree node to describe the original array type.
>>>>> 530      We get the original array type from the first argument of the call to
>>>>> 531      .ACCESS_WITH_SIZE (REF, COUNTED_BY_REF, 1, num_bytes, -1).
>>>>> 532
>>>>> 533      Originally, REF is a COMPONENT_REF with the original array type,
>>>>> 534      it was converted to a pointer to an ADDR_EXPR, and the ADDR_EXPR's
>>>>> 535      first operand is the original COMPONENT_REF.  */
>>>>> 536   tree ref = CALL_EXPR_ARG (array, 0);
>>>>> 537   tree array_type
>>>>> 538     = unshare_expr (TREE_TYPE (TREE_OPERAND (TREE_OPERAND(ref, 0), 0)));
>>>>> 539   tree zero_with_type = build_int_cst (build_pointer_type (array_type), 0);
>>>>> 540   return build_call_expr_internal_loc (loc, IFN_UBSAN_BOUNDS,
>>>>> 541                                        void_type_node, 3, zero_with_type,
>>>>> 542                                        index, bound);
>>>>> 543 }
>>>>> 
>>>>> =====
>>>>> 
>>>>> Inside gdb, the guilty IR failed in LTO streaming is from the above line 520:
>>>>> TYPE_SIZE_UNIT (TREE_TYPE (TREE_TYPE (array))),
>>>>> 
>>>>> When I use this tree node as an operand of the expression at line 526, I added
>>>>> unshare_expr.
>>>>> 
>>>>> However, I still see the guilty IR as in gdb:
>>>>> 
>>>>>           unit-size <mult_expr 0xfffff5aabf90 type <integer_type 0xfffff57c0000 sizetype>
>>>>>               side-effects
>>>>>               arg:0 <mult_expr 0xfffff5aabf68 type <integer_type 0xfffff57c0000 sizetype>
>>>>> 
>>>>>                   arg:0 <ssa_name 0xfffff5761e18 type <error_mark 0xfffff56c0e58>
>>>>>                       nothrow
>>>>>                       def_stmt
>>>>>                       version:12 in-free-list>
>>>>>                   arg:1 <ssa_name 0xfffff5761e60 type <error_mark 0xfffff56c0e58>
>>>>>                       nothrow
>>>>>                       def_stmt
>>>>>                       version:13 in-free-list>>
>>>>>               arg:1 <integer_cst 0xfffff56c10c8 constant 4>>
>>>>> 
>>>>> 
>>>>> I have been stuck with this bug for quite some time.
>>>>> Any help is helpful.
>>>> 
>>>> The above hasn't been gimplified correctly, you'd instead see
>>>> a D.1234 in there, not an expression with SSA names.  That happens
>>>> when the frontend fails to emit a DECL_EXPR for a decl with this
>>>> type.
>>> 
>>> .. which then also results in missing unsharing of this expression
>>> (so the SSA names leak in)
>> 
>> Thanks a lot for the hints.
>> 
>> One correction first, the LTO bug is not related to -fsanitize=bounds.  Deleting -fsanitize=bounds still can
>> repeat the failure.
>> 
>> After further debugging into the gimplification phase related with the SAVE_EXPR, I finally locate the place
>> where the unshareing of the expression is missing.   This is in the routine “pointer_int_sum” of c-family/c-common.cc:
>> 
>> 3330     {
>> 3331       if (!complain && !COMPLETE_TYPE_P (TREE_TYPE (result_type)))
>> 3332         return error_mark_node;
>> 3333       size_exp = size_in_bytes_loc (loc, TREE_TYPE (result_type));
>> 3334       /* Wrap the pointer expression in a SAVE_EXPR to make sure it
>> 3335          is evaluated first when the size expression may depend
>> 3336          on it for VM types.  */
>> 3337       if (TREE_SIDE_EFFECTS (size_exp)
>> 3338           && TREE_SIDE_EFFECTS (ptrop)
>> 3339           && variably_modified_type_p (TREE_TYPE (ptrop), NULL))
>> 3340         {
>> 3341           ptrop = save_expr (ptrop);
>> 3342           size_exp = build2 (COMPOUND_EXPR, TREE_TYPE (intop), ptrop, size_exp);
>> 3343         }
>> 3344     }
>> 
>> In the above, at line 3333, the tree node, TYPE_SIZE_UNIT (TREE_TYPE(result_type)), is returned directly as
>> the size_exp,
>> 
>> (gdb) call debug_tree(size_exp)
>> <mult_expr 0xfffff5a6f910
>>    type <integer_type 0xfffff57c0000 sizetype public unsigned DI
>>        size <integer_cst 0xfffff56c0e70 constant 64>
>>        unit-size <integer_cst 0xfffff56c0e88 constant 8>
>>        align:64 warn_if_not_align:0 symtab:0 alias-set -1 canonical-type 0xfffff57c0000 precision:64 min <integer_cst 0xfffff56c0ea0 0> max <integer_cst 0xfffff56d05e0 18446744073709551615>>
>>    side-effects
>>    arg:0 <mult_expr 0xfffff5a6f8e8 type <integer_type 0xfffff57c0000 sizetype>
>>        side-effects
>>        arg:0 <nop_expr 0xfffff56dc540 type <integer_type 0xfffff57c0000 sizetype>
>>            side-effects
>>            arg:0 <save_expr 0xfffff56dc4c0 type <integer_type 0xfffff57c05e8 int>
>>                side-effects arg:0 <parm_decl 0xfffff76b6f80 n1>>>
>>        arg:1 <nop_expr 0xfffff56dc600 type <integer_type 0xfffff57c0000 sizetype>
>>            side-effects
>>            arg:0 <save_expr 0xfffff56dc580 type <integer_type 0xfffff57c05e8 int>
>>                side-effects arg:0 <parm_decl 0xfffff76b7000 n2>>>>
>>    arg:1 <integer_cst 0xfffff56c10c8 type <integer_type 0xfffff57c0000 sizetype> constant 4>>
>> 
>> 
>> Without unshare_expr to this size_exp, the above TYPE_SIZE_UNIT node containing SAVE_EXPRs
>> is gimpflified to expressions with SSA_NAME during gimplification.  (This is unaccepted by LTO).
>> 
>> Adding an unshare_expr (size_exp) resolved this problem.
>> 
>> Although I still think that there might be potential issue with the gimpflication of SAVE_EXPRs, I dare not
>> to modify that part of the code.
>> 
>> At this moment, I will add unshare_expr to the routine “pointer_int_sum” to workaround this issue.
> 
> It's only a workaround mind you.  The bug is that the frontend fails
> to emit a DECL_EXPR which would
> trigger both unsharing and proper gimplification of the type size.

For a simple testing case:

$ cat test.c
struct annotated {
  unsigned int foo;
  char b;
  int array[] __attribute__((counted_by (foo)));
};
extern struct annotated * alloc_buf (int index);

static void bar ()
{
  struct annotated *p2 = alloc_buf (10);
  p2->array[11] = 0;
  return;
}

The C FE generates the following IR:

[opc@qinzhao-ol8u3-x86 108896]$ cat test.c.005t.original   
;; Function bar (null)
;; enabled by -tree-original


{
  struct annotated * p2 = alloc_buf (10);

    struct annotated * p2 = alloc_buf (10);
  *(.ACCESS_WITH_SIZE ((int *) &p2->array, &p2->foo, 1, 32, -1) + 44) = 0;
  return;
}

Do you see any obvious IR issue in the above? Do I miss to generate any DECL_EXPR in the above IR?

Thanks.

Qing


I compared it with the following testing case without the “counted-by” annotation 
and use an user-defined “access_with_size” function, The IR looks like exactly the same:

$ cat test_1.c
struct annotated {
  unsigned int foo;
  char b;
  int array[];
};
extern struct annotated *alloc_buf (int);
extern int *access_with_size (int * ref, unsigned int * size, int a, int b, int c);

static void bar ()
{
  struct annotated *p2 = alloc_buf (10);
  access_with_size ((int *) &p2->array, &p2->foo, 1, 32, -1)[11] = 0;
  return;
}
[opc@qinzhao-ol8u3-x86 108896]$ cat test_1.c.005t.original

;; Function bar (null)
;; enabled by -tree-original


{
  struct annotated * p2 = alloc_buf (10);

    struct annotated * p2 = alloc_buf (10);
  *(access_with_size ((int *) &p2->array, &p2->foo, 1, 32, -1) + 44) = 0;
  return;
}



> 
>> Let me know if you have any comment and suggestion.
>> 
>> Thanks a lot.
>> 
>> Qing
>> 
>>>> 
>>>>> 
>>>>> Qing
>>>>> 
>>>>>> 
>>>>>>> Thanks a lot for the help.
>>>>>>> 
>>>>>>> Qing



^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: HELP: Questions on unshare_expr
  2024-01-19 16:25                 ` Qing Zhao
@ 2024-01-22  7:40                   ` Richard Biener
  2024-01-22 14:52                     ` Qing Zhao
  0 siblings, 1 reply; 18+ messages in thread
From: Richard Biener @ 2024-01-22  7:40 UTC (permalink / raw)
  To: Qing Zhao; +Cc: jakub, gcc Patches

On Fri, Jan 19, 2024 at 5:26 PM Qing Zhao <qing.zhao@oracle.com> wrote:
>
>
>
> > On Jan 19, 2024, at 4:30 AM, Richard Biener <richard.guenther@gmail.com> wrote:
> >
> > On Thu, Jan 18, 2024 at 3:46 PM Qing Zhao <qing.zhao@oracle.com> wrote:
> >>
> >>
> >>
> >>> On Jan 17, 2024, at 1:43 AM, Richard Biener <richard.guenther@gmail.com> wrote:
> >>>
> >>> On Wed, Jan 17, 2024 at 7:42 AM Richard Biener
> >>> <richard.guenther@gmail.com> wrote:
> >>>>
> >>>> On Tue, Jan 16, 2024 at 9:26 PM Qing Zhao <qing.zhao@oracle.com> wrote:
> >>>>>
> >>>>>
> >>>>>
> >>>>>> On Jan 15, 2024, at 4:31 AM, Richard Biener <richard.guenther@gmail.com> wrote:
> >>>>>>
> >>>>>>> All my questions for unshare_expr relate to a  LTO bug that I currently stuck with
> >>>>>>> when using .ACCESS_WITH_SIZE in bound sanitizer (only with -flto, without -flto, no issue):
> >>>>>>>
> >>>>>>> [opc@qinzhao-aarch64-ol8 gcc]$ sh t
> >>>>>>> during IPA pass: modref
> >>>>>>> t.c:20:1: internal compiler error: tree code ‘ssa_name’ is not supported in LTO streams
> >>>>>>> 0x14c3993 lto_write_tree
> >>>>>>>      ../../latest-gcc-write/gcc/lto-streamer-out.cc:561
> >>>>>>> 0x14c3aeb lto_output_tree_1
> >>>>>>>
> >>>>>>> And the value of the tree node that triggered the ICE is:
> >>>>>>> (gdb) call debug_tree(expr)
> >>>>>>> <ssa_name 0xfffff5761e60 type <error_mark 0xfffff56c0e58>
> >>>>>>>  nothrow
> >>>>>>>  def_stmt
> >>>>>>>  version:13 in-free-list>
> >>>>>>>
> >>>>>>> Is there any good way to debug LTO bug?
> >>>>>>
> >>>>>> This happens usually when you have a VLA type and its type fields are not
> >>>>>> properly gimplified which usually happens because the frontend fails to
> >>>>>> insert a gimplification point for it (a DECL_EXPR).
> >>>>>
> >>>>> I found an old gcc bug
> >>>>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97172
> >>>>> ICE: tree code ‘ssa_name’ is not supported in LTO streams since r11-3303-g6450f07388f9fe57
> >>>>>
> >>>>> Which is very similar to the bug I am having right now.
> >>>>>
> >>>>> After further study, I suspect that the issue I am having right now with the LTO streaming also
> >>>>> relate to “unshare_expr”, “save_expr”, and the combination of these two, I suspect that
> >>>>> the current gcc cannot handle the combination of these two correctly for my case.
> >>>>>
> >>>>> My testing case is:
> >>>>>
> >>>>> #include <stdlib.h>
> >>>>> void __attribute__((__noinline__)) setup_and_test_vla (int n1, int n2, int m)
> >>>>> {
> >>>>>  struct foo {
> >>>>>      int n;
> >>>>>      int p[][n2][n1] __attribute__((counted_by(n)));
> >>>>>  } *f;
> >>>>>
> >>>>>  f = (struct foo *) malloc (sizeof(struct foo) + m*sizeof(int[n2][n1]));
> >>>>>  f->n = m;
> >>>>>  f->p[m][n2][n1]=1;
> >>>>>  return;
> >>>>> }
> >>>>>
> >>>>> int main(int argc, char *argv[])
> >>>>> {
> >>>>> setup_and_test_vla (10, 11, 20);
> >>>>> return 0;
> >>>>> }
> >>>>>
> >>>>> Failed with
> >>>>> my_gcc -Os -fsanitize=bounds -flto
> >>>>>
> >>>>> If changing either n1 or n2 to a constant, the testing passed.
> >>>>> If deleting -flto, the testing passed too.
> >>>>>
> >>>>> I double checked my code per the suggestions provided by you and Jakub in this
> >>>>> email thread, and I think the code should be fine.
> >>>>>
> >>>>> The code is following:
> >>>>>
> >>>>> =====
> >>>>> 504 /* Instrument array bounds for INDIRECT_REFs whose pointers are
> >>>>> 505    POINTER_PLUS_EXPRs of calls to .ACCESS_WITH_SIZE. We create special
> >>>>> 506    builtins that gets expanded in the sanopt pass, and make an array
> >>>>> 507    dimension of it.  ARRAY is the pointer to the base of the array,
> >>>>> 508    which is a call to .ACCESS_WITH_SIZE, *OFFSET is the offset to the
> >>>>> 509    beginning of array.
> >>>>> 510    Return NULL_TREE if no instrumentation is emitted.  */
> >>>>> 511
> >>>>> 512 tree
> >>>>> 513 ubsan_instrument_bounds_indirect_ref (location_t loc, tree array, tree *offset)
> >>>>> 514 {
> >>>>> 515   if (!is_access_with_size_p (array))
> >>>>> 516     return NULL_TREE;
> >>>>> 517   tree bound = get_bound_from_access_with_size (array);
> >>>>> 518   /* The type of the call to .ACCESS_WITH_SIZE is a pointer type to
> >>>>> 519      the element of the array.  */
> >>>>> 520   tree element_size = TYPE_SIZE_UNIT (TREE_TYPE (TREE_TYPE (array)));
> >>>>> 521   gcc_assert (bound);
> >>>>> 522
> >>>>> 523   /* Given the offset, and the size of each element, the index can be
> >>>>> 524      computed as: offset/element_size.  */
> >>>>> 525   *offset = save_expr (*offset);
> >>>>> 526   tree index = fold_build2 (EXACT_DIV_EXPR,
> >>>>> 527                            sizetype, *offset,
> >>>>> 528                            unshare_expr (element_size));
> >>>>> 529   /* Create a "(T *) 0" tree node to describe the original array type.
> >>>>> 530      We get the original array type from the first argument of the call to
> >>>>> 531      .ACCESS_WITH_SIZE (REF, COUNTED_BY_REF, 1, num_bytes, -1).
> >>>>> 532
> >>>>> 533      Originally, REF is a COMPONENT_REF with the original array type,
> >>>>> 534      it was converted to a pointer to an ADDR_EXPR, and the ADDR_EXPR's
> >>>>> 535      first operand is the original COMPONENT_REF.  */
> >>>>> 536   tree ref = CALL_EXPR_ARG (array, 0);
> >>>>> 537   tree array_type
> >>>>> 538     = unshare_expr (TREE_TYPE (TREE_OPERAND (TREE_OPERAND(ref, 0), 0)));
> >>>>> 539   tree zero_with_type = build_int_cst (build_pointer_type (array_type), 0);
> >>>>> 540   return build_call_expr_internal_loc (loc, IFN_UBSAN_BOUNDS,
> >>>>> 541                                        void_type_node, 3, zero_with_type,
> >>>>> 542                                        index, bound);
> >>>>> 543 }
> >>>>>
> >>>>> =====
> >>>>>
> >>>>> Inside gdb, the guilty IR failed in LTO streaming is from the above line 520:
> >>>>> TYPE_SIZE_UNIT (TREE_TYPE (TREE_TYPE (array))),
> >>>>>
> >>>>> When I use this tree node as an operand of the expression at line 526, I added
> >>>>> unshare_expr.
> >>>>>
> >>>>> However, I still see the guilty IR as in gdb:
> >>>>>
> >>>>>           unit-size <mult_expr 0xfffff5aabf90 type <integer_type 0xfffff57c0000 sizetype>
> >>>>>               side-effects
> >>>>>               arg:0 <mult_expr 0xfffff5aabf68 type <integer_type 0xfffff57c0000 sizetype>
> >>>>>
> >>>>>                   arg:0 <ssa_name 0xfffff5761e18 type <error_mark 0xfffff56c0e58>
> >>>>>                       nothrow
> >>>>>                       def_stmt
> >>>>>                       version:12 in-free-list>
> >>>>>                   arg:1 <ssa_name 0xfffff5761e60 type <error_mark 0xfffff56c0e58>
> >>>>>                       nothrow
> >>>>>                       def_stmt
> >>>>>                       version:13 in-free-list>>
> >>>>>               arg:1 <integer_cst 0xfffff56c10c8 constant 4>>
> >>>>>
> >>>>>
> >>>>> I have been stuck with this bug for quite some time.
> >>>>> Any help is helpful.
> >>>>
> >>>> The above hasn't been gimplified correctly, you'd instead see
> >>>> a D.1234 in there, not an expression with SSA names.  That happens
> >>>> when the frontend fails to emit a DECL_EXPR for a decl with this
> >>>> type.
> >>>
> >>> .. which then also results in missing unsharing of this expression
> >>> (so the SSA names leak in)
> >>
> >> Thanks a lot for the hints.
> >>
> >> One correction first, the LTO bug is not related to -fsanitize=bounds.  Deleting -fsanitize=bounds still can
> >> repeat the failure.
> >>
> >> After further debugging into the gimplification phase related with the SAVE_EXPR, I finally locate the place
> >> where the unshareing of the expression is missing.   This is in the routine “pointer_int_sum” of c-family/c-common.cc:
> >>
> >> 3330     {
> >> 3331       if (!complain && !COMPLETE_TYPE_P (TREE_TYPE (result_type)))
> >> 3332         return error_mark_node;
> >> 3333       size_exp = size_in_bytes_loc (loc, TREE_TYPE (result_type));
> >> 3334       /* Wrap the pointer expression in a SAVE_EXPR to make sure it
> >> 3335          is evaluated first when the size expression may depend
> >> 3336          on it for VM types.  */
> >> 3337       if (TREE_SIDE_EFFECTS (size_exp)
> >> 3338           && TREE_SIDE_EFFECTS (ptrop)
> >> 3339           && variably_modified_type_p (TREE_TYPE (ptrop), NULL))
> >> 3340         {
> >> 3341           ptrop = save_expr (ptrop);
> >> 3342           size_exp = build2 (COMPOUND_EXPR, TREE_TYPE (intop), ptrop, size_exp);
> >> 3343         }
> >> 3344     }
> >>
> >> In the above, at line 3333, the tree node, TYPE_SIZE_UNIT (TREE_TYPE(result_type)), is returned directly as
> >> the size_exp,
> >>
> >> (gdb) call debug_tree(size_exp)
> >> <mult_expr 0xfffff5a6f910
> >>    type <integer_type 0xfffff57c0000 sizetype public unsigned DI
> >>        size <integer_cst 0xfffff56c0e70 constant 64>
> >>        unit-size <integer_cst 0xfffff56c0e88 constant 8>
> >>        align:64 warn_if_not_align:0 symtab:0 alias-set -1 canonical-type 0xfffff57c0000 precision:64 min <integer_cst 0xfffff56c0ea0 0> max <integer_cst 0xfffff56d05e0 18446744073709551615>>
> >>    side-effects
> >>    arg:0 <mult_expr 0xfffff5a6f8e8 type <integer_type 0xfffff57c0000 sizetype>
> >>        side-effects
> >>        arg:0 <nop_expr 0xfffff56dc540 type <integer_type 0xfffff57c0000 sizetype>
> >>            side-effects
> >>            arg:0 <save_expr 0xfffff56dc4c0 type <integer_type 0xfffff57c05e8 int>
> >>                side-effects arg:0 <parm_decl 0xfffff76b6f80 n1>>>
> >>        arg:1 <nop_expr 0xfffff56dc600 type <integer_type 0xfffff57c0000 sizetype>
> >>            side-effects
> >>            arg:0 <save_expr 0xfffff56dc580 type <integer_type 0xfffff57c05e8 int>
> >>                side-effects arg:0 <parm_decl 0xfffff76b7000 n2>>>>
> >>    arg:1 <integer_cst 0xfffff56c10c8 type <integer_type 0xfffff57c0000 sizetype> constant 4>>
> >>
> >>
> >> Without unshare_expr to this size_exp, the above TYPE_SIZE_UNIT node containing SAVE_EXPRs
> >> is gimpflified to expressions with SSA_NAME during gimplification.  (This is unaccepted by LTO).
> >>
> >> Adding an unshare_expr (size_exp) resolved this problem.
> >>
> >> Although I still think that there might be potential issue with the gimpflication of SAVE_EXPRs, I dare not
> >> to modify that part of the code.
> >>
> >> At this moment, I will add unshare_expr to the routine “pointer_int_sum” to workaround this issue.
> >
> > It's only a workaround mind you.  The bug is that the frontend fails
> > to emit a DECL_EXPR which would
> > trigger both unsharing and proper gimplification of the type size.
>
> For a simple testing case:
>
> $ cat test.c
> struct annotated {
>   unsigned int foo;
>   char b;
>   int array[] __attribute__((counted_by (foo)));
> };
> extern struct annotated * alloc_buf (int index);
>
> static void bar ()
> {
>   struct annotated *p2 = alloc_buf (10);
>   p2->array[11] = 0;
>   return;
> }
>
> The C FE generates the following IR:
>
> [opc@qinzhao-ol8u3-x86 108896]$ cat test.c.005t.original
> ;; Function bar (null)
> ;; enabled by -tree-original
>
>
> {
>   struct annotated * p2 = alloc_buf (10);
>
>     struct annotated * p2 = alloc_buf (10);
>   *(.ACCESS_WITH_SIZE ((int *) &p2->array, &p2->foo, 1, 32, -1) + 44) = 0;
>   return;
> }
>
> Do you see any obvious IR issue in the above? Do I miss to generate any DECL_EXPR in the above IR?

It's an interesting question - I don't see where the gimplifier would
need to access DECL/TYPE_SIZE
so the mistake, if any, should be that you need to unshare the size
expressions you are using as
argument to .ACCESS_WITH_SIZE?  Mind, you are replacing an ARRAY_REF
with a pointer indirection
as well - IMO we shouldn't replace accesses this way but instead make
it possible for analysis to
discover the base/size values?

> Thanks.
>
> Qing
>
>
> I compared it with the following testing case without the “counted-by” annotation
> and use an user-defined “access_with_size” function, The IR looks like exactly the same:
>
> $ cat test_1.c
> struct annotated {
>   unsigned int foo;
>   char b;
>   int array[];
> };
> extern struct annotated *alloc_buf (int);
> extern int *access_with_size (int * ref, unsigned int * size, int a, int b, int c);
>
> static void bar ()
> {
>   struct annotated *p2 = alloc_buf (10);
>   access_with_size ((int *) &p2->array, &p2->foo, 1, 32, -1)[11] = 0;
>   return;
> }
> [opc@qinzhao-ol8u3-x86 108896]$ cat test_1.c.005t.original
>
> ;; Function bar (null)
> ;; enabled by -tree-original
>
>
> {
>   struct annotated * p2 = alloc_buf (10);
>
>     struct annotated * p2 = alloc_buf (10);
>   *(access_with_size ((int *) &p2->array, &p2->foo, 1, 32, -1) + 44) = 0;
>   return;
> }
>
>
>
> >
> >> Let me know if you have any comment and suggestion.
> >>
> >> Thanks a lot.
> >>
> >> Qing
> >>
> >>>>
> >>>>>
> >>>>> Qing
> >>>>>
> >>>>>>
> >>>>>>> Thanks a lot for the help.
> >>>>>>>
> >>>>>>> Qing
>
>

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: HELP: Questions on unshare_expr
  2024-01-22  7:40                   ` Richard Biener
@ 2024-01-22 14:52                     ` Qing Zhao
  2024-01-22 16:54                       ` Qing Zhao
  0 siblings, 1 reply; 18+ messages in thread
From: Qing Zhao @ 2024-01-22 14:52 UTC (permalink / raw)
  To: Richard Biener; +Cc: jakub, gcc Patches



> On Jan 22, 2024, at 2:40 AM, Richard Biener <richard.guenther@gmail.com> wrote:
> 
> On Fri, Jan 19, 2024 at 5:26 PM Qing Zhao <qing.zhao@oracle.com> wrote:
>> 
>> 
>> 
>>> On Jan 19, 2024, at 4:30 AM, Richard Biener <richard.guenther@gmail.com> wrote:
>>> 
>>> On Thu, Jan 18, 2024 at 3:46 PM Qing Zhao <qing.zhao@oracle.com> wrote:
>>>> 
>>>> 
>>>> 
>>>>> On Jan 17, 2024, at 1:43 AM, Richard Biener <richard.guenther@gmail.com> wrote:
>>>>> 
>>>>> On Wed, Jan 17, 2024 at 7:42 AM Richard Biener
>>>>> <richard.guenther@gmail.com> wrote:
>>>>>> 
>>>>>> On Tue, Jan 16, 2024 at 9:26 PM Qing Zhao <qing.zhao@oracle.com> wrote:
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>>> On Jan 15, 2024, at 4:31 AM, Richard Biener <richard.guenther@gmail.com> wrote:
>>>>>>>> 
>>>>>>>>> All my questions for unshare_expr relate to a  LTO bug that I currently stuck with
>>>>>>>>> when using .ACCESS_WITH_SIZE in bound sanitizer (only with -flto, without -flto, no issue):
>>>>>>>>> 
>>>>>>>>> [opc@qinzhao-aarch64-ol8 gcc]$ sh t
>>>>>>>>> during IPA pass: modref
>>>>>>>>> t.c:20:1: internal compiler error: tree code ‘ssa_name’ is not supported in LTO streams
>>>>>>>>> 0x14c3993 lto_write_tree
>>>>>>>>>     ../../latest-gcc-write/gcc/lto-streamer-out.cc:561
>>>>>>>>> 0x14c3aeb lto_output_tree_1
>>>>>>>>> 
>>>>>>>>> And the value of the tree node that triggered the ICE is:
>>>>>>>>> (gdb) call debug_tree(expr)
>>>>>>>>> <ssa_name 0xfffff5761e60 type <error_mark 0xfffff56c0e58>
>>>>>>>>> nothrow
>>>>>>>>> def_stmt
>>>>>>>>> version:13 in-free-list>
>>>>>>>>> 
>>>>>>>>> Is there any good way to debug LTO bug?
>>>>>>>> 
>>>>>>>> This happens usually when you have a VLA type and its type fields are not
>>>>>>>> properly gimplified which usually happens because the frontend fails to
>>>>>>>> insert a gimplification point for it (a DECL_EXPR).
>>>>>>> 
>>>>>>> I found an old gcc bug
>>>>>>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97172
>>>>>>> ICE: tree code ‘ssa_name’ is not supported in LTO streams since r11-3303-g6450f07388f9fe57
>>>>>>> 
>>>>>>> Which is very similar to the bug I am having right now.
>>>>>>> 
>>>>>>> After further study, I suspect that the issue I am having right now with the LTO streaming also
>>>>>>> relate to “unshare_expr”, “save_expr”, and the combination of these two, I suspect that
>>>>>>> the current gcc cannot handle the combination of these two correctly for my case.
>>>>>>> 
>>>>>>> My testing case is:
>>>>>>> 
>>>>>>> #include <stdlib.h>
>>>>>>> void __attribute__((__noinline__)) setup_and_test_vla (int n1, int n2, int m)
>>>>>>> {
>>>>>>> struct foo {
>>>>>>>     int n;
>>>>>>>     int p[][n2][n1] __attribute__((counted_by(n)));
>>>>>>> } *f;
>>>>>>> 
>>>>>>> f = (struct foo *) malloc (sizeof(struct foo) + m*sizeof(int[n2][n1]));
>>>>>>> f->n = m;
>>>>>>> f->p[m][n2][n1]=1;
>>>>>>> return;
>>>>>>> }
>>>>>>> 
>>>>>>> int main(int argc, char *argv[])
>>>>>>> {
>>>>>>> setup_and_test_vla (10, 11, 20);
>>>>>>> return 0;
>>>>>>> }
>>>>>>> 
>>>>>>> Failed with
>>>>>>> my_gcc -Os -fsanitize=bounds -flto
>>>>>>> 
>>>>>>> If changing either n1 or n2 to a constant, the testing passed.
>>>>>>> If deleting -flto, the testing passed too.
>>>>>>> 
>>>>>>> I double checked my code per the suggestions provided by you and Jakub in this
>>>>>>> email thread, and I think the code should be fine.
>>>>>>> 
>>>>>>> The code is following:
>>>>>>> 
>>>>>>> =====
>>>>>>> 504 /* Instrument array bounds for INDIRECT_REFs whose pointers are
>>>>>>> 505    POINTER_PLUS_EXPRs of calls to .ACCESS_WITH_SIZE. We create special
>>>>>>> 506    builtins that gets expanded in the sanopt pass, and make an array
>>>>>>> 507    dimension of it.  ARRAY is the pointer to the base of the array,
>>>>>>> 508    which is a call to .ACCESS_WITH_SIZE, *OFFSET is the offset to the
>>>>>>> 509    beginning of array.
>>>>>>> 510    Return NULL_TREE if no instrumentation is emitted.  */
>>>>>>> 511
>>>>>>> 512 tree
>>>>>>> 513 ubsan_instrument_bounds_indirect_ref (location_t loc, tree array, tree *offset)
>>>>>>> 514 {
>>>>>>> 515   if (!is_access_with_size_p (array))
>>>>>>> 516     return NULL_TREE;
>>>>>>> 517   tree bound = get_bound_from_access_with_size (array);
>>>>>>> 518   /* The type of the call to .ACCESS_WITH_SIZE is a pointer type to
>>>>>>> 519      the element of the array.  */
>>>>>>> 520   tree element_size = TYPE_SIZE_UNIT (TREE_TYPE (TREE_TYPE (array)));
>>>>>>> 521   gcc_assert (bound);
>>>>>>> 522
>>>>>>> 523   /* Given the offset, and the size of each element, the index can be
>>>>>>> 524      computed as: offset/element_size.  */
>>>>>>> 525   *offset = save_expr (*offset);
>>>>>>> 526   tree index = fold_build2 (EXACT_DIV_EXPR,
>>>>>>> 527                            sizetype, *offset,
>>>>>>> 528                            unshare_expr (element_size));
>>>>>>> 529   /* Create a "(T *) 0" tree node to describe the original array type.
>>>>>>> 530      We get the original array type from the first argument of the call to
>>>>>>> 531      .ACCESS_WITH_SIZE (REF, COUNTED_BY_REF, 1, num_bytes, -1).
>>>>>>> 532
>>>>>>> 533      Originally, REF is a COMPONENT_REF with the original array type,
>>>>>>> 534      it was converted to a pointer to an ADDR_EXPR, and the ADDR_EXPR's
>>>>>>> 535      first operand is the original COMPONENT_REF.  */
>>>>>>> 536   tree ref = CALL_EXPR_ARG (array, 0);
>>>>>>> 537   tree array_type
>>>>>>> 538     = unshare_expr (TREE_TYPE (TREE_OPERAND (TREE_OPERAND(ref, 0), 0)));
>>>>>>> 539   tree zero_with_type = build_int_cst (build_pointer_type (array_type), 0);
>>>>>>> 540   return build_call_expr_internal_loc (loc, IFN_UBSAN_BOUNDS,
>>>>>>> 541                                        void_type_node, 3, zero_with_type,
>>>>>>> 542                                        index, bound);
>>>>>>> 543 }
>>>>>>> 
>>>>>>> =====
>>>>>>> 
>>>>>>> Inside gdb, the guilty IR failed in LTO streaming is from the above line 520:
>>>>>>> TYPE_SIZE_UNIT (TREE_TYPE (TREE_TYPE (array))),
>>>>>>> 
>>>>>>> When I use this tree node as an operand of the expression at line 526, I added
>>>>>>> unshare_expr.
>>>>>>> 
>>>>>>> However, I still see the guilty IR as in gdb:
>>>>>>> 
>>>>>>>          unit-size <mult_expr 0xfffff5aabf90 type <integer_type 0xfffff57c0000 sizetype>
>>>>>>>              side-effects
>>>>>>>              arg:0 <mult_expr 0xfffff5aabf68 type <integer_type 0xfffff57c0000 sizetype>
>>>>>>> 
>>>>>>>                  arg:0 <ssa_name 0xfffff5761e18 type <error_mark 0xfffff56c0e58>
>>>>>>>                      nothrow
>>>>>>>                      def_stmt
>>>>>>>                      version:12 in-free-list>
>>>>>>>                  arg:1 <ssa_name 0xfffff5761e60 type <error_mark 0xfffff56c0e58>
>>>>>>>                      nothrow
>>>>>>>                      def_stmt
>>>>>>>                      version:13 in-free-list>>
>>>>>>>              arg:1 <integer_cst 0xfffff56c10c8 constant 4>>
>>>>>>> 
>>>>>>> 
>>>>>>> I have been stuck with this bug for quite some time.
>>>>>>> Any help is helpful.
>>>>>> 
>>>>>> The above hasn't been gimplified correctly, you'd instead see
>>>>>> a D.1234 in there, not an expression with SSA names.  That happens
>>>>>> when the frontend fails to emit a DECL_EXPR for a decl with this
>>>>>> type.
>>>>> 
>>>>> .. which then also results in missing unsharing of this expression
>>>>> (so the SSA names leak in)
>>>> 
>>>> Thanks a lot for the hints.
>>>> 
>>>> One correction first, the LTO bug is not related to -fsanitize=bounds.  Deleting -fsanitize=bounds still can
>>>> repeat the failure.
>>>> 
>>>> After further debugging into the gimplification phase related with the SAVE_EXPR, I finally locate the place
>>>> where the unshareing of the expression is missing.   This is in the routine “pointer_int_sum” of c-family/c-common.cc:
>>>> 
>>>> 3330     {
>>>> 3331       if (!complain && !COMPLETE_TYPE_P (TREE_TYPE (result_type)))
>>>> 3332         return error_mark_node;
>>>> 3333       size_exp = size_in_bytes_loc (loc, TREE_TYPE (result_type));
>>>> 3334       /* Wrap the pointer expression in a SAVE_EXPR to make sure it
>>>> 3335          is evaluated first when the size expression may depend
>>>> 3336          on it for VM types.  */
>>>> 3337       if (TREE_SIDE_EFFECTS (size_exp)
>>>> 3338           && TREE_SIDE_EFFECTS (ptrop)
>>>> 3339           && variably_modified_type_p (TREE_TYPE (ptrop), NULL))
>>>> 3340         {
>>>> 3341           ptrop = save_expr (ptrop);
>>>> 3342           size_exp = build2 (COMPOUND_EXPR, TREE_TYPE (intop), ptrop, size_exp);
>>>> 3343         }
>>>> 3344     }
>>>> 
>>>> In the above, at line 3333, the tree node, TYPE_SIZE_UNIT (TREE_TYPE(result_type)), is returned directly as
>>>> the size_exp,
>>>> 
>>>> (gdb) call debug_tree(size_exp)
>>>> <mult_expr 0xfffff5a6f910
>>>>   type <integer_type 0xfffff57c0000 sizetype public unsigned DI
>>>>       size <integer_cst 0xfffff56c0e70 constant 64>
>>>>       unit-size <integer_cst 0xfffff56c0e88 constant 8>
>>>>       align:64 warn_if_not_align:0 symtab:0 alias-set -1 canonical-type 0xfffff57c0000 precision:64 min <integer_cst 0xfffff56c0ea0 0> max <integer_cst 0xfffff56d05e0 18446744073709551615>>
>>>>   side-effects
>>>>   arg:0 <mult_expr 0xfffff5a6f8e8 type <integer_type 0xfffff57c0000 sizetype>
>>>>       side-effects
>>>>       arg:0 <nop_expr 0xfffff56dc540 type <integer_type 0xfffff57c0000 sizetype>
>>>>           side-effects
>>>>           arg:0 <save_expr 0xfffff56dc4c0 type <integer_type 0xfffff57c05e8 int>
>>>>               side-effects arg:0 <parm_decl 0xfffff76b6f80 n1>>>
>>>>       arg:1 <nop_expr 0xfffff56dc600 type <integer_type 0xfffff57c0000 sizetype>
>>>>           side-effects
>>>>           arg:0 <save_expr 0xfffff56dc580 type <integer_type 0xfffff57c05e8 int>
>>>>               side-effects arg:0 <parm_decl 0xfffff76b7000 n2>>>>
>>>>   arg:1 <integer_cst 0xfffff56c10c8 type <integer_type 0xfffff57c0000 sizetype> constant 4>>
>>>> 
>>>> 
>>>> Without unshare_expr to this size_exp, the above TYPE_SIZE_UNIT node containing SAVE_EXPRs
>>>> is gimpflified to expressions with SSA_NAME during gimplification.  (This is unaccepted by LTO).
>>>> 
>>>> Adding an unshare_expr (size_exp) resolved this problem.
>>>> 
>>>> Although I still think that there might be potential issue with the gimpflication of SAVE_EXPRs, I dare not
>>>> to modify that part of the code.
>>>> 
>>>> At this moment, I will add unshare_expr to the routine “pointer_int_sum” to workaround this issue.
>>> 
>>> It's only a workaround mind you.  The bug is that the frontend fails
>>> to emit a DECL_EXPR which would
>>> trigger both unsharing and proper gimplification of the type size.
>> 
>> For a simple testing case:
>> 
>> $ cat test.c
>> struct annotated {
>>  unsigned int foo;
>>  char b;
>>  int array[] __attribute__((counted_by (foo)));
>> };
>> extern struct annotated * alloc_buf (int index);
>> 
>> static void bar ()
>> {
>>  struct annotated *p2 = alloc_buf (10);
>>  p2->array[11] = 0;
>>  return;
>> }
>> 
>> The C FE generates the following IR:
>> 
>> [opc@qinzhao-ol8u3-x86 108896]$ cat test.c.005t.original
>> ;; Function bar (null)
>> ;; enabled by -tree-original
>> 
>> 
>> {
>>  struct annotated * p2 = alloc_buf (10);
>> 
>>    struct annotated * p2 = alloc_buf (10);
>>  *(.ACCESS_WITH_SIZE ((int *) &p2->array, &p2->foo, 1, 32, -1) + 44) = 0;
>>  return;
>> }
>> 
>> Do you see any obvious IR issue in the above? Do I miss to generate any DECL_EXPR in the above IR?
> 
> It's an interesting question - I don't see where the gimplifier would
> need to access DECL/TYPE_SIZE

My bad, the above simple example did not expose the LTO error. Just used to show the IR generated for the array access.

The LTO error only happens with multi-dimension array, whose inner dimensions are VLA.  And the outermost dimension is flexible array member.  Like following:


   void __attribute__((__noinline__)) setup_and_test_vla (int n1, int n2, int m)
{
   struct foo {
       int n;
       int p[][n2][n1] __attribute__((counted_by(n)));
   } *f;

   f = (struct foo *) malloc (sizeof(struct foo) + m*sizeof(int[n2][n1]));
   f->n = m;
   f->p[m][n2][n1]=1;
   return;
}
 
And the IR for the routine “setup_and_test_vla” is:

{
  typedef struct foo struct struct foo;
  struct foo * f;

  SAVE_EXPR <n1>;, SAVE_EXPR <n2>;;, {
        typedef struct foo struct struct foo;
  };
    struct foo * f;
  f = (struct foo *) malloc (0, SAVE_EXPR <n1>;, SAVE_EXPR <n2>;;, ((long unsigned int) m * (long unsigned int) ((sizetype) SAVE_EXPR <n1> * (sizetype) SAVE_EXPR <n2>) + 1) * 4;);
  f->n = m;
  (*(.ACCESS_WITH_SIZE ((int[0:(sizetype) ((long int) SAVE_EXPR <n2> + -1)][0:(sizetype) ((long int) SAVE_EXPR <n1> + -1)] *) &f->p, &f->n, 1, 32, -1) + (sizetype) (((long unsigned int) m * (long unsigned int) ((sizetype) SAVE_EXPR <n1> * (sizetype) SAVE_EXPR <n2>)) * 4)))[n2][n1] = 1;
  return;
}



> so the mistake, if any, should be that you need to unshare the size
> expressions you are using as
> argument to .ACCESS_WITH_SIZE?

Hm, I tried that in the beginning, but didn’t work. I will check again. 

>  Mind, you are replacing an ARRAY_REF
> with a pointer indirection
> as well

Yes, this is for resolving a very early gimplification issue as I reported last Nov:
https://gcc.gnu.org/pipermail/gcc-patches/2023-November/638793.html

Since no-one responded at that time, I fixed the issue by replacing the ARRAY_REF
With a pointer indirection:
https://gcc.gnu.org/pipermail/gcc-patches/2023-December/639605.html

The reason for such change is:  return a flexible array member TYPE is not allowed
 by C language (our gimplification follows this rule), so, we have to return a pointer TYPE instead. 


> - IMO we shouldn't replace accesses this way but instead make
> it possible for analysis to
> discover the base/size values?

Yes, if keeping the ARRAY_REF, then bound sanitizer instrumentation will be much simpler since the INDEX
Information is kept in the TYPE NODE of the ARRAY_REF.  
However, due to the above reason, we have to replace the ARRAY_REF with a pointer INDIRECT_REF. 
Such change made the bound sanitizer more difficult due to the INDEX was lost when the ARRAY_REF was
Converted to the INDIRECT_REF.

I resolved this issue by adding a new argument to .ACCESS_WITH_SIZE to record the INDEX when converting
The ARRAY_REF to INDIRECT_REF during C FE.

Let me know if you have any comment.

Thanks.

Qing
> 
>> Thanks.
>> 
>> Qing
>> 
>> 
>> I compared it with the following testing case without the “counted-by” annotation
>> and use an user-defined “access_with_size” function, The IR looks like exactly the same:
>> 
>> $ cat test_1.c
>> struct annotated {
>>  unsigned int foo;
>>  char b;
>>  int array[];
>> };
>> extern struct annotated *alloc_buf (int);
>> extern int *access_with_size (int * ref, unsigned int * size, int a, int b, int c);
>> 
>> static void bar ()
>> {
>>  struct annotated *p2 = alloc_buf (10);
>>  access_with_size ((int *) &p2->array, &p2->foo, 1, 32, -1)[11] = 0;
>>  return;
>> }
>> [opc@qinzhao-ol8u3-x86 108896]$ cat test_1.c.005t.original
>> 
>> ;; Function bar (null)
>> ;; enabled by -tree-original
>> 
>> 
>> {
>>  struct annotated * p2 = alloc_buf (10);
>> 
>>    struct annotated * p2 = alloc_buf (10);
>>  *(access_with_size ((int *) &p2->array, &p2->foo, 1, 32, -1) + 44) = 0;
>>  return;
>> }
>> 
>> 
>> 
>>> 
>>>> Let me know if you have any comment and suggestion.
>>>> 
>>>> Thanks a lot.
>>>> 
>>>> Qing
>>>> 
>>>>>> 
>>>>>>> 
>>>>>>> Qing
>>>>>>> 
>>>>>>>> 
>>>>>>>>> Thanks a lot for the help.
>>>>>>>>> 
>>>>>>>>> Qing



^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: HELP: Questions on unshare_expr
  2024-01-22 14:52                     ` Qing Zhao
@ 2024-01-22 16:54                       ` Qing Zhao
  0 siblings, 0 replies; 18+ messages in thread
From: Qing Zhao @ 2024-01-22 16:54 UTC (permalink / raw)
  To: Richard Biener; +Cc: jakub, gcc Patches

One update, last Friday, I merged all my patches for counted-by support 
(including the Patch to workaround the LTO issue)  with the latest trunk, bootstrapped
 and run the testing, everything is good.

Today, when I disabled the Patch that workaround the LTO issue, surprisingly, I cannot
repeat the LTO issue anymore with the latest trunk + my counted-by support patches.
I.e., without the LTO workaround, everything works just fine with the latest gcc.

I suspected that some change in the latest GCC “fixed” (or hide) the issue. 

Qing

> On Jan 22, 2024, at 9:52 AM, Qing Zhao <qing.zhao@oracle.com> wrote:
> 
> 
> 
>> On Jan 22, 2024, at 2:40 AM, Richard Biener <richard.guenther@gmail.com> wrote:
>> 
>> On Fri, Jan 19, 2024 at 5:26 PM Qing Zhao <qing.zhao@oracle.com> wrote:
>>> 
>>> 
>>> 
>>>> On Jan 19, 2024, at 4:30 AM, Richard Biener <richard.guenther@gmail.com> wrote:
>>>> 
>>>> On Thu, Jan 18, 2024 at 3:46 PM Qing Zhao <qing.zhao@oracle.com> wrote:
>>>>> 
>>>>> 
>>>>> 
>>>>>> On Jan 17, 2024, at 1:43 AM, Richard Biener <richard.guenther@gmail.com> wrote:
>>>>>> 
>>>>>> On Wed, Jan 17, 2024 at 7:42 AM Richard Biener
>>>>>> <richard.guenther@gmail.com> wrote:
>>>>>>> 
>>>>>>> On Tue, Jan 16, 2024 at 9:26 PM Qing Zhao <qing.zhao@oracle.com> wrote:
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>>> On Jan 15, 2024, at 4:31 AM, Richard Biener <richard.guenther@gmail.com> wrote:
>>>>>>>>> 
>>>>>>>>>> All my questions for unshare_expr relate to a  LTO bug that I currently stuck with
>>>>>>>>>> when using .ACCESS_WITH_SIZE in bound sanitizer (only with -flto, without -flto, no issue):
>>>>>>>>>> 
>>>>>>>>>> [opc@qinzhao-aarch64-ol8 gcc]$ sh t
>>>>>>>>>> during IPA pass: modref
>>>>>>>>>> t.c:20:1: internal compiler error: tree code ‘ssa_name’ is not supported in LTO streams
>>>>>>>>>> 0x14c3993 lto_write_tree
>>>>>>>>>>    ../../latest-gcc-write/gcc/lto-streamer-out.cc:561
>>>>>>>>>> 0x14c3aeb lto_output_tree_1
>>>>>>>>>> 
>>>>>>>>>> And the value of the tree node that triggered the ICE is:
>>>>>>>>>> (gdb) call debug_tree(expr)
>>>>>>>>>> <ssa_name 0xfffff5761e60 type <error_mark 0xfffff56c0e58>
>>>>>>>>>> nothrow
>>>>>>>>>> def_stmt
>>>>>>>>>> version:13 in-free-list>
>>>>>>>>>> 
>>>>>>>>>> Is there any good way to debug LTO bug?
>>>>>>>>> 
>>>>>>>>> This happens usually when you have a VLA type and its type fields are not
>>>>>>>>> properly gimplified which usually happens because the frontend fails to
>>>>>>>>> insert a gimplification point for it (a DECL_EXPR).
>>>>>>>> 
>>>>>>>> I found an old gcc bug
>>>>>>>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97172
>>>>>>>> ICE: tree code ‘ssa_name’ is not supported in LTO streams since r11-3303-g6450f07388f9fe57
>>>>>>>> 
>>>>>>>> Which is very similar to the bug I am having right now.
>>>>>>>> 
>>>>>>>> After further study, I suspect that the issue I am having right now with the LTO streaming also
>>>>>>>> relate to “unshare_expr”, “save_expr”, and the combination of these two, I suspect that
>>>>>>>> the current gcc cannot handle the combination of these two correctly for my case.
>>>>>>>> 
>>>>>>>> My testing case is:
>>>>>>>> 
>>>>>>>> #include <stdlib.h>
>>>>>>>> void __attribute__((__noinline__)) setup_and_test_vla (int n1, int n2, int m)
>>>>>>>> {
>>>>>>>> struct foo {
>>>>>>>>    int n;
>>>>>>>>    int p[][n2][n1] __attribute__((counted_by(n)));
>>>>>>>> } *f;
>>>>>>>> 
>>>>>>>> f = (struct foo *) malloc (sizeof(struct foo) + m*sizeof(int[n2][n1]));
>>>>>>>> f->n = m;
>>>>>>>> f->p[m][n2][n1]=1;
>>>>>>>> return;
>>>>>>>> }
>>>>>>>> 
>>>>>>>> int main(int argc, char *argv[])
>>>>>>>> {
>>>>>>>> setup_and_test_vla (10, 11, 20);
>>>>>>>> return 0;
>>>>>>>> }
>>>>>>>> 
>>>>>>>> Failed with
>>>>>>>> my_gcc -Os -fsanitize=bounds -flto
>>>>>>>> 
>>>>>>>> If changing either n1 or n2 to a constant, the testing passed.
>>>>>>>> If deleting -flto, the testing passed too.
>>>>>>>> 
>>>>>>>> I double checked my code per the suggestions provided by you and Jakub in this
>>>>>>>> email thread, and I think the code should be fine.
>>>>>>>> 
>>>>>>>> The code is following:
>>>>>>>> 
>>>>>>>> =====
>>>>>>>> 504 /* Instrument array bounds for INDIRECT_REFs whose pointers are
>>>>>>>> 505    POINTER_PLUS_EXPRs of calls to .ACCESS_WITH_SIZE. We create special
>>>>>>>> 506    builtins that gets expanded in the sanopt pass, and make an array
>>>>>>>> 507    dimension of it.  ARRAY is the pointer to the base of the array,
>>>>>>>> 508    which is a call to .ACCESS_WITH_SIZE, *OFFSET is the offset to the
>>>>>>>> 509    beginning of array.
>>>>>>>> 510    Return NULL_TREE if no instrumentation is emitted.  */
>>>>>>>> 511
>>>>>>>> 512 tree
>>>>>>>> 513 ubsan_instrument_bounds_indirect_ref (location_t loc, tree array, tree *offset)
>>>>>>>> 514 {
>>>>>>>> 515   if (!is_access_with_size_p (array))
>>>>>>>> 516     return NULL_TREE;
>>>>>>>> 517   tree bound = get_bound_from_access_with_size (array);
>>>>>>>> 518   /* The type of the call to .ACCESS_WITH_SIZE is a pointer type to
>>>>>>>> 519      the element of the array.  */
>>>>>>>> 520   tree element_size = TYPE_SIZE_UNIT (TREE_TYPE (TREE_TYPE (array)));
>>>>>>>> 521   gcc_assert (bound);
>>>>>>>> 522
>>>>>>>> 523   /* Given the offset, and the size of each element, the index can be
>>>>>>>> 524      computed as: offset/element_size.  */
>>>>>>>> 525   *offset = save_expr (*offset);
>>>>>>>> 526   tree index = fold_build2 (EXACT_DIV_EXPR,
>>>>>>>> 527                            sizetype, *offset,
>>>>>>>> 528                            unshare_expr (element_size));
>>>>>>>> 529   /* Create a "(T *) 0" tree node to describe the original array type.
>>>>>>>> 530      We get the original array type from the first argument of the call to
>>>>>>>> 531      .ACCESS_WITH_SIZE (REF, COUNTED_BY_REF, 1, num_bytes, -1).
>>>>>>>> 532
>>>>>>>> 533      Originally, REF is a COMPONENT_REF with the original array type,
>>>>>>>> 534      it was converted to a pointer to an ADDR_EXPR, and the ADDR_EXPR's
>>>>>>>> 535      first operand is the original COMPONENT_REF.  */
>>>>>>>> 536   tree ref = CALL_EXPR_ARG (array, 0);
>>>>>>>> 537   tree array_type
>>>>>>>> 538     = unshare_expr (TREE_TYPE (TREE_OPERAND (TREE_OPERAND(ref, 0), 0)));
>>>>>>>> 539   tree zero_with_type = build_int_cst (build_pointer_type (array_type), 0);
>>>>>>>> 540   return build_call_expr_internal_loc (loc, IFN_UBSAN_BOUNDS,
>>>>>>>> 541                                        void_type_node, 3, zero_with_type,
>>>>>>>> 542                                        index, bound);
>>>>>>>> 543 }
>>>>>>>> 
>>>>>>>> =====
>>>>>>>> 
>>>>>>>> Inside gdb, the guilty IR failed in LTO streaming is from the above line 520:
>>>>>>>> TYPE_SIZE_UNIT (TREE_TYPE (TREE_TYPE (array))),
>>>>>>>> 
>>>>>>>> When I use this tree node as an operand of the expression at line 526, I added
>>>>>>>> unshare_expr.
>>>>>>>> 
>>>>>>>> However, I still see the guilty IR as in gdb:
>>>>>>>> 
>>>>>>>>         unit-size <mult_expr 0xfffff5aabf90 type <integer_type 0xfffff57c0000 sizetype>
>>>>>>>>             side-effects
>>>>>>>>             arg:0 <mult_expr 0xfffff5aabf68 type <integer_type 0xfffff57c0000 sizetype>
>>>>>>>> 
>>>>>>>>                 arg:0 <ssa_name 0xfffff5761e18 type <error_mark 0xfffff56c0e58>
>>>>>>>>                     nothrow
>>>>>>>>                     def_stmt
>>>>>>>>                     version:12 in-free-list>
>>>>>>>>                 arg:1 <ssa_name 0xfffff5761e60 type <error_mark 0xfffff56c0e58>
>>>>>>>>                     nothrow
>>>>>>>>                     def_stmt
>>>>>>>>                     version:13 in-free-list>>
>>>>>>>>             arg:1 <integer_cst 0xfffff56c10c8 constant 4>>
>>>>>>>> 
>>>>>>>> 
>>>>>>>> I have been stuck with this bug for quite some time.
>>>>>>>> Any help is helpful.
>>>>>>> 
>>>>>>> The above hasn't been gimplified correctly, you'd instead see
>>>>>>> a D.1234 in there, not an expression with SSA names.  That happens
>>>>>>> when the frontend fails to emit a DECL_EXPR for a decl with this
>>>>>>> type.
>>>>>> 
>>>>>> .. which then also results in missing unsharing of this expression
>>>>>> (so the SSA names leak in)
>>>>> 
>>>>> Thanks a lot for the hints.
>>>>> 
>>>>> One correction first, the LTO bug is not related to -fsanitize=bounds.  Deleting -fsanitize=bounds still can
>>>>> repeat the failure.
>>>>> 
>>>>> After further debugging into the gimplification phase related with the SAVE_EXPR, I finally locate the place
>>>>> where the unshareing of the expression is missing.   This is in the routine “pointer_int_sum” of c-family/c-common.cc:
>>>>> 
>>>>> 3330     {
>>>>> 3331       if (!complain && !COMPLETE_TYPE_P (TREE_TYPE (result_type)))
>>>>> 3332         return error_mark_node;
>>>>> 3333       size_exp = size_in_bytes_loc (loc, TREE_TYPE (result_type));
>>>>> 3334       /* Wrap the pointer expression in a SAVE_EXPR to make sure it
>>>>> 3335          is evaluated first when the size expression may depend
>>>>> 3336          on it for VM types.  */
>>>>> 3337       if (TREE_SIDE_EFFECTS (size_exp)
>>>>> 3338           && TREE_SIDE_EFFECTS (ptrop)
>>>>> 3339           && variably_modified_type_p (TREE_TYPE (ptrop), NULL))
>>>>> 3340         {
>>>>> 3341           ptrop = save_expr (ptrop);
>>>>> 3342           size_exp = build2 (COMPOUND_EXPR, TREE_TYPE (intop), ptrop, size_exp);
>>>>> 3343         }
>>>>> 3344     }
>>>>> 
>>>>> In the above, at line 3333, the tree node, TYPE_SIZE_UNIT (TREE_TYPE(result_type)), is returned directly as
>>>>> the size_exp,
>>>>> 
>>>>> (gdb) call debug_tree(size_exp)
>>>>> <mult_expr 0xfffff5a6f910
>>>>>  type <integer_type 0xfffff57c0000 sizetype public unsigned DI
>>>>>      size <integer_cst 0xfffff56c0e70 constant 64>
>>>>>      unit-size <integer_cst 0xfffff56c0e88 constant 8>
>>>>>      align:64 warn_if_not_align:0 symtab:0 alias-set -1 canonical-type 0xfffff57c0000 precision:64 min <integer_cst 0xfffff56c0ea0 0> max <integer_cst 0xfffff56d05e0 18446744073709551615>>
>>>>>  side-effects
>>>>>  arg:0 <mult_expr 0xfffff5a6f8e8 type <integer_type 0xfffff57c0000 sizetype>
>>>>>      side-effects
>>>>>      arg:0 <nop_expr 0xfffff56dc540 type <integer_type 0xfffff57c0000 sizetype>
>>>>>          side-effects
>>>>>          arg:0 <save_expr 0xfffff56dc4c0 type <integer_type 0xfffff57c05e8 int>
>>>>>              side-effects arg:0 <parm_decl 0xfffff76b6f80 n1>>>
>>>>>      arg:1 <nop_expr 0xfffff56dc600 type <integer_type 0xfffff57c0000 sizetype>
>>>>>          side-effects
>>>>>          arg:0 <save_expr 0xfffff56dc580 type <integer_type 0xfffff57c05e8 int>
>>>>>              side-effects arg:0 <parm_decl 0xfffff76b7000 n2>>>>
>>>>>  arg:1 <integer_cst 0xfffff56c10c8 type <integer_type 0xfffff57c0000 sizetype> constant 4>>
>>>>> 
>>>>> 
>>>>> Without unshare_expr to this size_exp, the above TYPE_SIZE_UNIT node containing SAVE_EXPRs
>>>>> is gimpflified to expressions with SSA_NAME during gimplification.  (This is unaccepted by LTO).
>>>>> 
>>>>> Adding an unshare_expr (size_exp) resolved this problem.
>>>>> 
>>>>> Although I still think that there might be potential issue with the gimpflication of SAVE_EXPRs, I dare not
>>>>> to modify that part of the code.
>>>>> 
>>>>> At this moment, I will add unshare_expr to the routine “pointer_int_sum” to workaround this issue.
>>>> 
>>>> It's only a workaround mind you.  The bug is that the frontend fails
>>>> to emit a DECL_EXPR which would
>>>> trigger both unsharing and proper gimplification of the type size.
>>> 
>>> For a simple testing case:
>>> 
>>> $ cat test.c
>>> struct annotated {
>>> unsigned int foo;
>>> char b;
>>> int array[] __attribute__((counted_by (foo)));
>>> };
>>> extern struct annotated * alloc_buf (int index);
>>> 
>>> static void bar ()
>>> {
>>> struct annotated *p2 = alloc_buf (10);
>>> p2->array[11] = 0;
>>> return;
>>> }
>>> 
>>> The C FE generates the following IR:
>>> 
>>> [opc@qinzhao-ol8u3-x86 108896]$ cat test.c.005t.original
>>> ;; Function bar (null)
>>> ;; enabled by -tree-original
>>> 
>>> 
>>> {
>>> struct annotated * p2 = alloc_buf (10);
>>> 
>>>   struct annotated * p2 = alloc_buf (10);
>>> *(.ACCESS_WITH_SIZE ((int *) &p2->array, &p2->foo, 1, 32, -1) + 44) = 0;
>>> return;
>>> }
>>> 
>>> Do you see any obvious IR issue in the above? Do I miss to generate any DECL_EXPR in the above IR?
>> 
>> It's an interesting question - I don't see where the gimplifier would
>> need to access DECL/TYPE_SIZE
> 
> My bad, the above simple example did not expose the LTO error. Just used to show the IR generated for the array access.
> 
> The LTO error only happens with multi-dimension array, whose inner dimensions are VLA.  And the outermost dimension is flexible array member.  Like following:
> 
> 
>   void __attribute__((__noinline__)) setup_and_test_vla (int n1, int n2, int m)
> {
>   struct foo {
>       int n;
>       int p[][n2][n1] __attribute__((counted_by(n)));
>   } *f;
> 
>   f = (struct foo *) malloc (sizeof(struct foo) + m*sizeof(int[n2][n1]));
>   f->n = m;
>   f->p[m][n2][n1]=1;
>   return;
> }
> 
> And the IR for the routine “setup_and_test_vla” is:
> 
> {
>  typedef struct foo struct struct foo;
>  struct foo * f;
> 
>  SAVE_EXPR <n1>;, SAVE_EXPR <n2>;;, {
>        typedef struct foo struct struct foo;
>  };
>    struct foo * f;
>  f = (struct foo *) malloc (0, SAVE_EXPR <n1>;, SAVE_EXPR <n2>;;, ((long unsigned int) m * (long unsigned int) ((sizetype) SAVE_EXPR <n1> * (sizetype) SAVE_EXPR <n2>) + 1) * 4;);
>  f->n = m;
>  (*(.ACCESS_WITH_SIZE ((int[0:(sizetype) ((long int) SAVE_EXPR <n2> + -1)][0:(sizetype) ((long int) SAVE_EXPR <n1> + -1)] *) &f->p, &f->n, 1, 32, -1) + (sizetype) (((long unsigned int) m * (long unsigned int) ((sizetype) SAVE_EXPR <n1> * (sizetype) SAVE_EXPR <n2>)) * 4)))[n2][n1] = 1;
>  return;
> }
> 
> 
> 
>> so the mistake, if any, should be that you need to unshare the size
>> expressions you are using as
>> argument to .ACCESS_WITH_SIZE?
> 
> Hm, I tried that in the beginning, but didn’t work. I will check again. 
> 
>> Mind, you are replacing an ARRAY_REF
>> with a pointer indirection
>> as well
> 
> Yes, this is for resolving a very early gimplification issue as I reported last Nov:
> https://gcc.gnu.org/pipermail/gcc-patches/2023-November/638793.html
> 
> Since no-one responded at that time, I fixed the issue by replacing the ARRAY_REF
> With a pointer indirection:
> https://gcc.gnu.org/pipermail/gcc-patches/2023-December/639605.html
> 
> The reason for such change is:  return a flexible array member TYPE is not allowed
> by C language (our gimplification follows this rule), so, we have to return a pointer TYPE instead. 
> 
> 
>> - IMO we shouldn't replace accesses this way but instead make
>> it possible for analysis to
>> discover the base/size values?
> 
> Yes, if keeping the ARRAY_REF, then bound sanitizer instrumentation will be much simpler since the INDEX
> Information is kept in the TYPE NODE of the ARRAY_REF.  
> However, due to the above reason, we have to replace the ARRAY_REF with a pointer INDIRECT_REF. 
> Such change made the bound sanitizer more difficult due to the INDEX was lost when the ARRAY_REF was
> Converted to the INDIRECT_REF.
> 
> I resolved this issue by adding a new argument to .ACCESS_WITH_SIZE to record the INDEX when converting
> The ARRAY_REF to INDIRECT_REF during C FE.
> 
> Let me know if you have any comment.
> 
> Thanks.
> 
> Qing
>> 
>>> Thanks.
>>> 
>>> Qing
>>> 
>>> 
>>> I compared it with the following testing case without the “counted-by” annotation
>>> and use an user-defined “access_with_size” function, The IR looks like exactly the same:
>>> 
>>> $ cat test_1.c
>>> struct annotated {
>>> unsigned int foo;
>>> char b;
>>> int array[];
>>> };
>>> extern struct annotated *alloc_buf (int);
>>> extern int *access_with_size (int * ref, unsigned int * size, int a, int b, int c);
>>> 
>>> static void bar ()
>>> {
>>> struct annotated *p2 = alloc_buf (10);
>>> access_with_size ((int *) &p2->array, &p2->foo, 1, 32, -1)[11] = 0;
>>> return;
>>> }
>>> [opc@qinzhao-ol8u3-x86 108896]$ cat test_1.c.005t.original
>>> 
>>> ;; Function bar (null)
>>> ;; enabled by -tree-original
>>> 
>>> 
>>> {
>>> struct annotated * p2 = alloc_buf (10);
>>> 
>>>   struct annotated * p2 = alloc_buf (10);
>>> *(access_with_size ((int *) &p2->array, &p2->foo, 1, 32, -1) + 44) = 0;
>>> return;
>>> }
>>> 
>>> 
>>> 
>>>> 
>>>>> Let me know if you have any comment and suggestion.
>>>>> 
>>>>> Thanks a lot.
>>>>> 
>>>>> Qing
>>>>> 
>>>>>>> 
>>>>>>>> 
>>>>>>>> Qing
>>>>>>>> 
>>>>>>>>> 
>>>>>>>>>> Thanks a lot for the help.
>>>>>>>>>> 
>>>>>>>>>> Qing



^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2024-01-22 16:54 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-01-12 15:54 HELP: Questions on unshare_expr Qing Zhao
2024-01-12 16:28 ` Richard Biener
2024-01-12 17:30   ` Qing Zhao
2024-01-15  8:13     ` Eric Botcazou
2024-01-15 16:42       ` Qing Zhao
2024-01-15  9:31     ` Richard Biener
2024-01-15 14:54       ` Qing Zhao
2024-01-15 15:06         ` Jakub Jelinek
2024-01-15 16:41           ` Qing Zhao
2024-01-16 20:25       ` Qing Zhao
2024-01-17  6:42         ` Richard Biener
2024-01-17  6:43           ` Richard Biener
2024-01-18 14:45             ` Qing Zhao
2024-01-19  9:30               ` Richard Biener
2024-01-19 16:25                 ` Qing Zhao
2024-01-22  7:40                   ` Richard Biener
2024-01-22 14:52                     ` Qing Zhao
2024-01-22 16:54                       ` Qing Zhao

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).