public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* Another bug for __builtin_object_size? (Or expected behavior)
@ 2023-08-16 15:59 Qing Zhao
  2023-08-16 20:16 ` Qing Zhao
  2023-08-17 11:00 ` Siddhesh Poyarekar
  0 siblings, 2 replies; 13+ messages in thread
From: Qing Zhao @ 2023-08-16 15:59 UTC (permalink / raw)
  To: jakub Jelinek, Siddhesh Poyarekar; +Cc: gcc-patches

Jakub and Sid,

During my study, I found an interesting behavior for the following small testing case:

#include <stddef.h>
#include <stdio.h>

struct fixed {
  size_t foo;
  char b;
  char array[10]; 
} q = {};

#define noinline __attribute__((__noinline__))

static void noinline bar ()
{
  struct fixed *p = &q;

  printf("the__bos of MAX p->array sub is %d \n", __builtin_object_size(p->array, 1)); 
  printf("the__bos of MIN p->array sub is %d \n", __builtin_object_size(p->array, 3)); 

  return;
}

int main ()
{
  bar ();
  return 0;
}
[opc@qinzhao-aarch64-ol8 108896]$ sh t
/home/opc/Install/latest-d/bin/gcc -O -fstrict-flex-arrays=3 t2.c
the__bos of MAX p->array sub is 10 
the__bos of MIN p->array sub is 15 

I assume that the Minimum size in the sub-object should be 10 too (i.e __builtin_object_size(p->array, 3) should be 10 too). 

So, first question: Is this correct or wrong behavior for __builtin_object_size(p->array, 3)?

The second question is, when I debugged into why __builtin_object_size(p->array, 3) returns 15 instead of 10, I observed the following:

1. In “early_objz” phase, The IR for p->array is:
(gdb) call debug_generic_expr(ptr)
&p_5->array

And the pt_var is:
(gdb) call debug_generic_expr(pt_var)
*p_5

As a result, the following condition in tree-object-size.cc:

 585   if (pt_var != TREE_OPERAND (ptr, 0))

Was satisfied, and then the algorithm for computing the SUBOBJECT was invoked and the size of the subobject 10 was used. 

and then an MAX_EXPR was inserted after the __builtin_object_size call as:
  _3 = &p_5->array;
  _10 = __builtin_object_size (_3, 3);
  _4 = MAX_EXPR <_10, 10>;

Till now, everything looks fine.

2. within “ccp1” phase, when folding the call  to __builtin_object_size, the IR for the p-:>array is:
(gdb) call debug_generic_expr(ptr)
&MEM <char[10]> [(void *)&q + 9B]

And the pt_var is:
(gdb) call debug_generic_expr(pt_var)
MEM <char[10]> [(void *)&q + 9B]

As a result, the following condition in tree-object-size.cc:

 585   if (pt_var != TREE_OPERAND (ptr, 0))

Was NOT satisfied, therefore the algorithm for computing the SUBOBJECT was NOT invoked at all, as a result, the size in the whole object, 15, was used. 

And then finally, MAX_EXPR (_10, 10) becomes MAX_EXPR (15, 10), 15 is the final result.

Based on the above, is there any issue with the current algorithm?

Thanks a lot for the help.

Qing 



^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: Another bug for __builtin_object_size? (Or expected behavior)
  2023-08-16 15:59 Another bug for __builtin_object_size? (Or expected behavior) Qing Zhao
@ 2023-08-16 20:16 ` Qing Zhao
  2023-08-17 11:00 ` Siddhesh Poyarekar
  1 sibling, 0 replies; 13+ messages in thread
From: Qing Zhao @ 2023-08-16 20:16 UTC (permalink / raw)
  To: Jakub Jelinek, Siddhesh Poyarekar; +Cc: gcc-patches

FYI, I filed a new PR https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111040
to record this issue. 

Qing
> On Aug 16, 2023, at 11:59 AM, Qing Zhao via Gcc-patches <gcc-patches@gcc.gnu.org> wrote:
> 
> Jakub and Sid,
> 
> During my study, I found an interesting behavior for the following small testing case:
> 
> #include <stddef.h>
> #include <stdio.h>
> 
> struct fixed {
>  size_t foo;
>  char b;
>  char array[10]; 
> } q = {};
> 
> #define noinline __attribute__((__noinline__))
> 
> static void noinline bar ()
> {
>  struct fixed *p = &q;
> 
>  printf("the__bos of MAX p->array sub is %d \n", __builtin_object_size(p->array, 1)); 
>  printf("the__bos of MIN p->array sub is %d \n", __builtin_object_size(p->array, 3)); 
> 
>  return;
> }
> 
> int main ()
> {
>  bar ();
>  return 0;
> }
> [opc@qinzhao-aarch64-ol8 108896]$ sh t
> /home/opc/Install/latest-d/bin/gcc -O -fstrict-flex-arrays=3 t2.c
> the__bos of MAX p->array sub is 10 
> the__bos of MIN p->array sub is 15 
> 
> I assume that the Minimum size in the sub-object should be 10 too (i.e __builtin_object_size(p->array, 3) should be 10 too). 
> 
> So, first question: Is this correct or wrong behavior for __builtin_object_size(p->array, 3)?
> 
> The second question is, when I debugged into why __builtin_object_size(p->array, 3) returns 15 instead of 10, I observed the following:
> 
> 1. In “early_objz” phase, The IR for p->array is:
> (gdb) call debug_generic_expr(ptr)
> &p_5->array
> 
> And the pt_var is:
> (gdb) call debug_generic_expr(pt_var)
> *p_5
> 
> As a result, the following condition in tree-object-size.cc:
> 
> 585   if (pt_var != TREE_OPERAND (ptr, 0))
> 
> Was satisfied, and then the algorithm for computing the SUBOBJECT was invoked and the size of the subobject 10 was used. 
> 
> and then an MAX_EXPR was inserted after the __builtin_object_size call as:
>  _3 = &p_5->array;
>  _10 = __builtin_object_size (_3, 3);
>  _4 = MAX_EXPR <_10, 10>;
> 
> Till now, everything looks fine.
> 
> 2. within “ccp1” phase, when folding the call  to __builtin_object_size, the IR for the p-:>array is:
> (gdb) call debug_generic_expr(ptr)
> &MEM <char[10]> [(void *)&q + 9B]
> 
> And the pt_var is:
> (gdb) call debug_generic_expr(pt_var)
> MEM <char[10]> [(void *)&q + 9B]
> 
> As a result, the following condition in tree-object-size.cc:
> 
> 585   if (pt_var != TREE_OPERAND (ptr, 0))
> 
> Was NOT satisfied, therefore the algorithm for computing the SUBOBJECT was NOT invoked at all, as a result, the size in the whole object, 15, was used. 
> 
> And then finally, MAX_EXPR (_10, 10) becomes MAX_EXPR (15, 10), 15 is the final result.
> 
> Based on the above, is there any issue with the current algorithm?
> 
> Thanks a lot for the help.
> 
> Qing 
> 
> 


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: Another bug for __builtin_object_size? (Or expected behavior)
  2023-08-16 15:59 Another bug for __builtin_object_size? (Or expected behavior) Qing Zhao
  2023-08-16 20:16 ` Qing Zhao
@ 2023-08-17 11:00 ` Siddhesh Poyarekar
  2023-08-17 13:58   ` Qing Zhao
  1 sibling, 1 reply; 13+ messages in thread
From: Siddhesh Poyarekar @ 2023-08-17 11:00 UTC (permalink / raw)
  To: Qing Zhao, jakub Jelinek; +Cc: gcc-patches

On 2023-08-16 11:59, Qing Zhao wrote:
> Jakub and Sid,
> 
> During my study, I found an interesting behavior for the following small testing case:
> 
> #include <stddef.h>
> #include <stdio.h>
> 
> struct fixed {
>    size_t foo;
>    char b;
>    char array[10];
> } q = {};
> 
> #define noinline __attribute__((__noinline__))
> 
> static void noinline bar ()
> {
>    struct fixed *p = &q;
> 
>    printf("the__bos of MAX p->array sub is %d \n", __builtin_object_size(p->array, 1));
>    printf("the__bos of MIN p->array sub is %d \n", __builtin_object_size(p->array, 3));
> 
>    return;
> }
> 
> int main ()
> {
>    bar ();
>    return 0;
> }
> [opc@qinzhao-aarch64-ol8 108896]$ sh t
> /home/opc/Install/latest-d/bin/gcc -O -fstrict-flex-arrays=3 t2.c
> the__bos of MAX p->array sub is 10
> the__bos of MIN p->array sub is 15
> 
> I assume that the Minimum size in the sub-object should be 10 too (i.e __builtin_object_size(p->array, 3) should be 10 too).
> 
> So, first question: Is this correct or wrong behavior for __builtin_object_size(p->array, 3)?
> 
> The second question is, when I debugged into why __builtin_object_size(p->array, 3) returns 15 instead of 10, I observed the following:
> 
> 1. In “early_objz” phase, The IR for p->array is:
> (gdb) call debug_generic_expr(ptr)
> &p_5->array
> 
> And the pt_var is:
> (gdb) call debug_generic_expr(pt_var)
> *p_5
> 
> As a result, the following condition in tree-object-size.cc:
> 
>   585   if (pt_var != TREE_OPERAND (ptr, 0))
> 
> Was satisfied, and then the algorithm for computing the SUBOBJECT was invoked and the size of the subobject 10 was used.
> 
> and then an MAX_EXPR was inserted after the __builtin_object_size call as:
>    _3 = &p_5->array;
>    _10 = __builtin_object_size (_3, 3);
>    _4 = MAX_EXPR <_10, 10>;
> 
> Till now, everything looks fine.
> 
> 2. within “ccp1” phase, when folding the call  to __builtin_object_size, the IR for the p-:>array is:
> (gdb) call debug_generic_expr(ptr)
> &MEM <char[10]> [(void *)&q + 9B]
> 
> And the pt_var is:
> (gdb) call debug_generic_expr(pt_var)
> MEM <char[10]> [(void *)&q + 9B]
> 
> As a result, the following condition in tree-object-size.cc:
> 
>   585   if (pt_var != TREE_OPERAND (ptr, 0))
> 
> Was NOT satisfied, therefore the algorithm for computing the SUBOBJECT was NOT invoked at all, as a result, the size in the whole object, 15, was used.
> 
> And then finally, MAX_EXPR (_10, 10) becomes MAX_EXPR (15, 10), 15 is the final result.
> 
> Based on the above, is there any issue with the current algorithm?

So this is a (sort of) known issue, which necessitated the early_objsz 
pass to get an estimate before a subobject reference was optimized to a 
MEM_REF.  However it looks like the MIN/MAX hack doesn't work in this 
case for OST_MINIMUM; it should probably get the minimum of the two 
passes if both passes were successful, or only the result of the pass 
that was successful.

Thanks,
Sid

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: Another bug for __builtin_object_size? (Or expected behavior)
  2023-08-17 11:00 ` Siddhesh Poyarekar
@ 2023-08-17 13:58   ` Qing Zhao
  2023-08-17 17:49     ` Siddhesh Poyarekar
  0 siblings, 1 reply; 13+ messages in thread
From: Qing Zhao @ 2023-08-17 13:58 UTC (permalink / raw)
  To: Siddhesh Poyarekar; +Cc: jakub Jelinek, gcc-patches



> On Aug 17, 2023, at 7:00 AM, Siddhesh Poyarekar <siddhesh@gotplt.org> wrote:
> 
> On 2023-08-16 11:59, Qing Zhao wrote:
>> Jakub and Sid,
>> During my study, I found an interesting behavior for the following small testing case:
>> #include <stddef.h>
>> #include <stdio.h>
>> struct fixed {
>>   size_t foo;
>>   char b;
>>   char array[10];
>> } q = {};
>> #define noinline __attribute__((__noinline__))
>> static void noinline bar ()
>> {
>>   struct fixed *p = &q;
>>   printf("the__bos of MAX p->array sub is %d \n", __builtin_object_size(p->array, 1));
>>   printf("the__bos of MIN p->array sub is %d \n", __builtin_object_size(p->array, 3));
>>   return;
>> }
>> int main ()
>> {
>>   bar ();
>>   return 0;
>> }
>> [opc@qinzhao-aarch64-ol8 108896]$ sh t
>> /home/opc/Install/latest-d/bin/gcc -O -fstrict-flex-arrays=3 t2.c
>> the__bos of MAX p->array sub is 10
>> the__bos of MIN p->array sub is 15
>> I assume that the Minimum size in the sub-object should be 10 too (i.e __builtin_object_size(p->array, 3) should be 10 too).
>> So, first question: Is this correct or wrong behavior for __builtin_object_size(p->array, 3)?
>> The second question is, when I debugged into why __builtin_object_size(p->array, 3) returns 15 instead of 10, I observed the following:
>> 1. In “early_objz” phase, The IR for p->array is:
>> (gdb) call debug_generic_expr(ptr)
>> &p_5->array
>> And the pt_var is:
>> (gdb) call debug_generic_expr(pt_var)
>> *p_5
>> As a result, the following condition in tree-object-size.cc:
>>  585   if (pt_var != TREE_OPERAND (ptr, 0))
>> Was satisfied, and then the algorithm for computing the SUBOBJECT was invoked and the size of the subobject 10 was used.
>> and then an MAX_EXPR was inserted after the __builtin_object_size call as:
>>   _3 = &p_5->array;
>>   _10 = __builtin_object_size (_3, 3);
>>   _4 = MAX_EXPR <_10, 10>;
>> Till now, everything looks fine.
>> 2. within “ccp1” phase, when folding the call  to __builtin_object_size, the IR for the p-:>array is:
>> (gdb) call debug_generic_expr(ptr)
>> &MEM <char[10]> [(void *)&q + 9B]
>> And the pt_var is:
>> (gdb) call debug_generic_expr(pt_var)
>> MEM <char[10]> [(void *)&q + 9B]
>> As a result, the following condition in tree-object-size.cc:
>>  585   if (pt_var != TREE_OPERAND (ptr, 0))
>> Was NOT satisfied, therefore the algorithm for computing the SUBOBJECT was NOT invoked at all, as a result, the size in the whole object, 15, was used.
>> And then finally, MAX_EXPR (_10, 10) becomes MAX_EXPR (15, 10), 15 is the final result.
>> Based on the above, is there any issue with the current algorithm?
> 
> So this is a (sort of) known issue, which necessitated the early_objsz pass to get an estimate before a subobject reference was optimized to a MEM_REF.

Do you mean that after a subobject reference was optimized to a MEM_REF, there is no way to compute the size of the subobject anymore?

>  However it looks like the MIN/MAX hack doesn't work in this case for OST_MINIMUM; it should probably get the minimum of the two passes if both passes were successful, or only the result of the pass that was successful.

You mean that the following line:
2053   enum tree_code code = object_size_type & OST_MINIMUM ? MAX_EXPR : MIN_EXPR;
Might need to be changed to:
2053   enum tree_code code =  MIN_EXPR;

?

thanks.

Qing
> 
> Thanks,
> Sid


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: Another bug for __builtin_object_size? (Or expected behavior)
  2023-08-17 13:58   ` Qing Zhao
@ 2023-08-17 17:49     ` Siddhesh Poyarekar
  2023-08-17 19:27       ` Qing Zhao
  0 siblings, 1 reply; 13+ messages in thread
From: Siddhesh Poyarekar @ 2023-08-17 17:49 UTC (permalink / raw)
  To: Qing Zhao; +Cc: jakub Jelinek, gcc-patches

On 2023-08-17 09:58, Qing Zhao wrote:
>> So this is a (sort of) known issue, which necessitated the early_objsz pass to get an estimate before a subobject reference was optimized to a MEM_REF.
> 
> Do you mean that after a subobject reference was optimized to a MEM_REF, there is no way to compute the size of the subobject anymore?

Yes, in cases where the TYPE_SIZE is lost and there's no other 
allocation information to fall back on.

>>   However it looks like the MIN/MAX hack doesn't work in this case for OST_MINIMUM; it should probably get the minimum of the two passes if both passes were successful, or only the result of the pass that was successful.
> 
> You mean that the following line:
> 2053   enum tree_code code = object_size_type & OST_MINIMUM ? MAX_EXPR : MIN_EXPR;
> Might need to be changed to:
> 2053   enum tree_code code =  MIN_EXPR;

Yes, that's it.  Maybe it's more correct if instead of MAX_EXPR if for 
OST_MINIMUM we stick with the early_objsz answer if it's non-zero.  I'm 
not sure if that's the case for maximum size though, my gut says it isn't.

Thanks,
Sid

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: Another bug for __builtin_object_size? (Or expected behavior)
  2023-08-17 17:49     ` Siddhesh Poyarekar
@ 2023-08-17 19:27       ` Qing Zhao
  2023-08-17 19:59         ` Siddhesh Poyarekar
  0 siblings, 1 reply; 13+ messages in thread
From: Qing Zhao @ 2023-08-17 19:27 UTC (permalink / raw)
  To: Siddhesh Poyarekar; +Cc: jakub Jelinek, gcc-patches



> On Aug 17, 2023, at 1:49 PM, Siddhesh Poyarekar <siddhesh@gotplt.org> wrote:
> 
> On 2023-08-17 09:58, Qing Zhao wrote:
>>> So this is a (sort of) known issue, which necessitated the early_objsz pass to get an estimate before a subobject reference was optimized to a MEM_REF.
>> Do you mean that after a subobject reference was optimized to a MEM_REF, there is no way to compute the size of the subobject anymore?
> 
> Yes, in cases where the TYPE_SIZE is lost and there's no other allocation information to fall back on.

Okay, I see.

> 
>>>  However it looks like the MIN/MAX hack doesn't work in this case for OST_MINIMUM; it should probably get the minimum of the two passes if both passes were successful, or only the result of the pass that was successful.
>> You mean that the following line:
>> 2053   enum tree_code code = object_size_type & OST_MINIMUM ? MAX_EXPR : MIN_EXPR;
>> Might need to be changed to:
>> 2053   enum tree_code code =  MIN_EXPR;
> 
> Yes, that's it.  Maybe it's more correct if instead of MAX_EXPR if for OST_MINIMUM we stick with the early_objsz answer if it's non-zero.  I'm not sure if that's the case for maximum size though, my gut says it isn't.

So, the major purpose for adding the early object size phase is for computing SUBobjects size more precisely before the subobject information lost?

Then, I think whatever MIN or MAX, the early phase has more precise information than the later phase, we should use its result if it’s NOT UNKNOWN?

Qing
> 
> Thanks,
> Sid


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: Another bug for __builtin_object_size? (Or expected behavior)
  2023-08-17 19:27       ` Qing Zhao
@ 2023-08-17 19:59         ` Siddhesh Poyarekar
  2023-08-17 20:23           ` Qing Zhao
  0 siblings, 1 reply; 13+ messages in thread
From: Siddhesh Poyarekar @ 2023-08-17 19:59 UTC (permalink / raw)
  To: Qing Zhao; +Cc: jakub Jelinek, gcc-patches

On 2023-08-17 15:27, Qing Zhao wrote:
>> Yes, that's it.  Maybe it's more correct if instead of MAX_EXPR if for OST_MINIMUM we stick with the early_objsz answer if it's non-zero.  I'm not sure if that's the case for maximum size though, my gut says it isn't.
> 
> So, the major purpose for adding the early object size phase is for computing SUBobjects size more precisely before the subobject information lost?

I suppose it's more about being able to do it at all, rather than precision.

> Then, I think whatever MIN or MAX, the early phase has more precise information than the later phase, we should use its result if it’s NOT UNKNOWN?

We can't be sure about that though, can we?  For example for something 
like this:

struct S
{
   int a;
   char b[10];
   int c;
};

size_t foo (struct S *s)
{
   return __builtin_object_size (s->b, 1);
}

size_t bar ()
{
   struct S *in = malloc (8);

   return foo (in);
}

returns 10 for __builtin_object_size in early_objsz but when it sees the 
malloc in the later objsz pass, it returns 4:

$ gcc/cc1 -fdump-tree-objsz-details -quiet -o - -O bug.c
...
foo:
.LFB0:
	.cfi_startproc
	movl	$10, %eax
	ret
	.cfi_endproc
...
bar:
.LFB1:
	.cfi_startproc
	movl	$4, %eax
	ret
	.cfi_endproc
...

In fact, this ends up returning the wrong result for OST_MINIMUM:

$ gcc/cc1 -fdump-tree-objsz-details -quiet -o - -O bug.c
...
foo:
.LFB0:
	.cfi_startproc
	movl	$10, %eax
	ret
	.cfi_endproc
...
bar:
.LFB1:
	.cfi_startproc
	movl	$10, %eax
	ret
	.cfi_endproc
...

bar ought to have returned 4 too (and I'm betting the later objsz must 
have seen that) but it got overridden by the earlier estimate of 10.

We probably need smarter heuristics on choosing between the estimate of 
the early_objsz and late objsz passes each by itself isn't good enough 
for subobjects.

Thanks,
Sid

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: Another bug for __builtin_object_size? (Or expected behavior)
  2023-08-17 19:59         ` Siddhesh Poyarekar
@ 2023-08-17 20:23           ` Qing Zhao
  2023-08-17 20:57             ` Siddhesh Poyarekar
  0 siblings, 1 reply; 13+ messages in thread
From: Qing Zhao @ 2023-08-17 20:23 UTC (permalink / raw)
  To: Siddhesh Poyarekar; +Cc: jakub Jelinek, gcc-patches



> On Aug 17, 2023, at 3:59 PM, Siddhesh Poyarekar <siddhesh@gotplt.org> wrote:
> 
> On 2023-08-17 15:27, Qing Zhao wrote:
>>> Yes, that's it.  Maybe it's more correct if instead of MAX_EXPR if for OST_MINIMUM we stick with the early_objsz answer if it's non-zero.  I'm not sure if that's the case for maximum size though, my gut says it isn't.
>> So, the major purpose for adding the early object size phase is for computing SUBobjects size more precisely before the subobject information lost?
> 
> I suppose it's more about being able to do it at all, rather than precision.

Without the subobject information in IR, our later object size phase uses the whole object size as an estimation as it currently does. -:)
> 
>> Then, I think whatever MIN or MAX, the early phase has more precise information than the later phase, we should use its result if it’s NOT UNKNOWN?
> 
> We can't be sure about that though, can we?  For example for something like this:
> 
> struct S
> {
>  int a;
>  char b[10];
>  int c;
> };
> 
> size_t foo (struct S *s)
> {
>  return __builtin_object_size (s->b, 1);
> }
> 
> size_t bar ()
> {
>  struct S *in = malloc (8);
> 
>  return foo (in);
> }
> 
> returns 10 for __builtin_object_size in early_objsz but when it sees the malloc in the later objsz pass, it returns 4:
> 
> $ gcc/cc1 -fdump-tree-objsz-details -quiet -o - -O bug.c
> ...
> foo:
> .LFB0:
> 	.cfi_startproc
> 	movl	$10, %eax
> 	ret
> 	.cfi_endproc
> ...
> bar:
> .LFB1:
> 	.cfi_startproc
> 	movl	$4, %eax
> 	ret
> 	.cfi_endproc
> ...
> 
> In fact, this ends up returning the wrong result for OST_MINIMUM:
> 
> $ gcc/cc1 -fdump-tree-objsz-details -quiet -o - -O bug.c
> ...
> foo:
> .LFB0:
> 	.cfi_startproc
> 	movl	$10, %eax
> 	ret
> 	.cfi_endproc
> ...
> bar:
> .LFB1:
> 	.cfi_startproc
> 	movl	$10, %eax
> 	ret
> 	.cfi_endproc
> ...
> 
> bar ought to have returned 4 too (and I'm betting the later objsz must have seen that) but it got overridden by the earlier estimate of 10.

Okay, I see. 

Then is this the similar issue we discussed previously?  (As following:)

"
> Hi, Sid and Jakub,
> I have a question in the following source portion of the routine “addr_object_size” of gcc/tree-object-size.cc:
>  743       bytes = compute_object_offset (TREE_OPERAND (ptr, 0), var);
>  744       if (bytes != error_mark_node)
>  745         {
>  746           bytes = size_for_offset (var_size, bytes);
>  747           if (var != pt_var && pt_var_size && TREE_CODE (pt_var) == MEM_REF)
>  748             {
>  749               tree bytes2 = compute_object_offset (TREE_OPERAND (ptr, 0),
>  750                                                    pt_var);
>  751               if (bytes2 != error_mark_node)
>  752                 {
>  753                   bytes2 = size_for_offset (pt_var_size, bytes2);
>  754                   bytes = size_binop (MIN_EXPR, bytes, bytes2);
>  755                 }
>  756             }
>  757         }
> At line 754, why we always use “MIN_EXPR” whenever it’s for OST_MINIMUM or not?
> Shall we use
> (object_size_type & OST_MINIMUM
>                             ? MIN_EXPR : MAX_EXPR)

That MIN_EXPR is not for OST_MINIMUM.  It is to cater for allocations like this:

typedef struct
{
 int a;
} A;

size_t f()
{
 A *p = malloc (1);

 return __builtin_object_size (p, 0);
}

where the returned size should be 1 and not sizeof (int).  The mode doesn't really matter in this case.
“

If this is the same issue, I think we can use the same solution: always use MIN_EXPR, 
What do you think?

Qing

> 
> We probably need smarter heuristics on choosing between the estimate of the early_objsz and late objsz passes each by itself isn't good enough for subobjects.
> 
> Thanks,
> Sid


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: Another bug for __builtin_object_size? (Or expected behavior)
  2023-08-17 20:23           ` Qing Zhao
@ 2023-08-17 20:57             ` Siddhesh Poyarekar
  2023-08-17 21:25               ` Qing Zhao
  0 siblings, 1 reply; 13+ messages in thread
From: Siddhesh Poyarekar @ 2023-08-17 20:57 UTC (permalink / raw)
  To: Qing Zhao; +Cc: jakub Jelinek, gcc-patches

On 2023-08-17 16:23, Qing Zhao wrote:
>>> Then, I think whatever MIN or MAX, the early phase has more precise information than the later phase, we should use its result if it’s NOT UNKNOWN?
>>
>> We can't be sure about that though, can we?  For example for something like this:
>>
>> struct S
>> {
>>   int a;
>>   char b[10];
>>   int c;
>> };
>>
>> size_t foo (struct S *s)
>> {
>>   return __builtin_object_size (s->b, 1);
>> }
>>
>> size_t bar ()
>> {
>>   struct S *in = malloc (8);
>>
>>   return foo (in);
>> }
>>
>> returns 10 for __builtin_object_size in early_objsz but when it sees the malloc in the later objsz pass, it returns 4:
>>
>> $ gcc/cc1 -fdump-tree-objsz-details -quiet -o - -O bug.c
>> ...
>> foo:
>> .LFB0:
>> 	.cfi_startproc
>> 	movl	$10, %eax
>> 	ret
>> 	.cfi_endproc
>> ...
>> bar:
>> .LFB1:
>> 	.cfi_startproc
>> 	movl	$4, %eax
>> 	ret
>> 	.cfi_endproc
>> ...
>>
>> In fact, this ends up returning the wrong result for OST_MINIMUM:
>>
>> $ gcc/cc1 -fdump-tree-objsz-details -quiet -o - -O bug.c
>> ...
>> foo:
>> .LFB0:
>> 	.cfi_startproc
>> 	movl	$10, %eax
>> 	ret
>> 	.cfi_endproc
>> ...
>> bar:
>> .LFB1:
>> 	.cfi_startproc
>> 	movl	$10, %eax
>> 	ret
>> 	.cfi_endproc
>> ...
>>
>> bar ought to have returned 4 too (and I'm betting the later objsz must have seen that) but it got overridden by the earlier estimate of 10.
> 
> Okay, I see.
> 
> Then is this the similar issue we discussed previously?  (As following:)
> 
> "
>> Hi, Sid and Jakub,
>> I have a question in the following source portion of the routine “addr_object_size” of gcc/tree-object-size.cc:
>>   743       bytes = compute_object_offset (TREE_OPERAND (ptr, 0), var);
>>   744       if (bytes != error_mark_node)
>>   745         {
>>   746           bytes = size_for_offset (var_size, bytes);
>>   747           if (var != pt_var && pt_var_size && TREE_CODE (pt_var) == MEM_REF)
>>   748             {
>>   749               tree bytes2 = compute_object_offset (TREE_OPERAND (ptr, 0),
>>   750                                                    pt_var);
>>   751               if (bytes2 != error_mark_node)
>>   752                 {
>>   753                   bytes2 = size_for_offset (pt_var_size, bytes2);
>>   754                   bytes = size_binop (MIN_EXPR, bytes, bytes2);
>>   755                 }
>>   756             }
>>   757         }
>> At line 754, why we always use “MIN_EXPR” whenever it’s for OST_MINIMUM or not?
>> Shall we use
>> (object_size_type & OST_MINIMUM
>>                              ? MIN_EXPR : MAX_EXPR)
> 
> That MIN_EXPR is not for OST_MINIMUM.  It is to cater for allocations like this:
> 
> typedef struct
> {
>   int a;
> } A;
> 
> size_t f()
> {
>   A *p = malloc (1);
> 
>   return __builtin_object_size (p, 0);
> }
> 
> where the returned size should be 1 and not sizeof (int).  The mode doesn't really matter in this case.
> “
> 
> If this is the same issue, I think we can use the same solution: always use MIN_EXPR,
> What do you think?

It's not exactly the same issue, the earlier discussion was about 
choosing sizes in the same pass while the current one is about choosing 
between passes, but I agree it "rhymes".  This is what I was alluding to 
originally (for OST_MINIMUM use MIN_EXPR if both passes returned a pass) 
but I haven't thought about it hard enough to be 100% confident that 
it's the better solution, especially for OST_MAXIMUM.

Thanks,
Sid

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: Another bug for __builtin_object_size? (Or expected behavior)
  2023-08-17 20:57             ` Siddhesh Poyarekar
@ 2023-08-17 21:25               ` Qing Zhao
  2023-08-17 21:32                 ` Siddhesh Poyarekar
  0 siblings, 1 reply; 13+ messages in thread
From: Qing Zhao @ 2023-08-17 21:25 UTC (permalink / raw)
  To: Siddhesh Poyarekar; +Cc: jakub Jelinek, gcc-patches



> On Aug 17, 2023, at 4:57 PM, Siddhesh Poyarekar <siddhesh@gotplt.org> wrote:
> 
> On 2023-08-17 16:23, Qing Zhao wrote:
>>>> Then, I think whatever MIN or MAX, the early phase has more precise information than the later phase, we should use its result if it’s NOT UNKNOWN?
>>> 
>>> We can't be sure about that though, can we?  For example for something like this:
>>> 
>>> struct S
>>> {
>>>  int a;
>>>  char b[10];
>>>  int c;
>>> };
>>> 
>>> size_t foo (struct S *s)
>>> {
>>>  return __builtin_object_size (s->b, 1);
>>> }
>>> 
>>> size_t bar ()
>>> {
>>>  struct S *in = malloc (8);
>>> 
>>>  return foo (in);
>>> }
>>> 
>>> returns 10 for __builtin_object_size in early_objsz but when it sees the malloc in the later objsz pass, it returns 4:
>>> 
>>> $ gcc/cc1 -fdump-tree-objsz-details -quiet -o - -O bug.c
>>> ...
>>> foo:
>>> .LFB0:
>>> 	.cfi_startproc
>>> 	movl	$10, %eax
>>> 	ret
>>> 	.cfi_endproc
>>> ...
>>> bar:
>>> .LFB1:
>>> 	.cfi_startproc
>>> 	movl	$4, %eax
>>> 	ret
>>> 	.cfi_endproc
>>> ...
>>> 
>>> In fact, this ends up returning the wrong result for OST_MINIMUM:
>>> 
>>> $ gcc/cc1 -fdump-tree-objsz-details -quiet -o - -O bug.c
>>> ...
>>> foo:
>>> .LFB0:
>>> 	.cfi_startproc
>>> 	movl	$10, %eax
>>> 	ret
>>> 	.cfi_endproc
>>> ...
>>> bar:
>>> .LFB1:
>>> 	.cfi_startproc
>>> 	movl	$10, %eax
>>> 	ret
>>> 	.cfi_endproc
>>> ...
>>> 
>>> bar ought to have returned 4 too (and I'm betting the later objsz must have seen that) but it got overridden by the earlier estimate of 10.
>> Okay, I see.
>> Then is this the similar issue we discussed previously?  (As following:)
>> "
>>> Hi, Sid and Jakub,
>>> I have a question in the following source portion of the routine “addr_object_size” of gcc/tree-object-size.cc:
>>>  743       bytes = compute_object_offset (TREE_OPERAND (ptr, 0), var);
>>>  744       if (bytes != error_mark_node)
>>>  745         {
>>>  746           bytes = size_for_offset (var_size, bytes);
>>>  747           if (var != pt_var && pt_var_size && TREE_CODE (pt_var) == MEM_REF)
>>>  748             {
>>>  749               tree bytes2 = compute_object_offset (TREE_OPERAND (ptr, 0),
>>>  750                                                    pt_var);
>>>  751               if (bytes2 != error_mark_node)
>>>  752                 {
>>>  753                   bytes2 = size_for_offset (pt_var_size, bytes2);
>>>  754                   bytes = size_binop (MIN_EXPR, bytes, bytes2);
>>>  755                 }
>>>  756             }
>>>  757         }
>>> At line 754, why we always use “MIN_EXPR” whenever it’s for OST_MINIMUM or not?
>>> Shall we use
>>> (object_size_type & OST_MINIMUM
>>>                             ? MIN_EXPR : MAX_EXPR)
>> That MIN_EXPR is not for OST_MINIMUM.  It is to cater for allocations like this:
>> typedef struct
>> {
>>  int a;
>> } A;
>> size_t f()
>> {
>>  A *p = malloc (1);
>>  return __builtin_object_size (p, 0);
>> }
>> where the returned size should be 1 and not sizeof (int).  The mode doesn't really matter in this case.
>> “
>> If this is the same issue, I think we can use the same solution: always use MIN_EXPR,
>> What do you think?
> 
> It's not exactly the same issue, the earlier discussion was about choosing sizes in the same pass while the current one is about choosing between passes, but I agree it "rhymes".  This is what I was alluding to originally (for OST_MINIMUM use MIN_EXPR if both passes returned a pass) but I haven't thought about it hard enough to be 100% confident that it's the better solution, especially for OST_MAXIMUM.

We have two different sources to get SIZE information for the subobject:

1. From the TYPESIZE information embedded in the IR;
2. From the initialization information propagated from data flow, this includes both malloc call and the DECL_INIT. 

We need to choose between these two when both available, (these two information could be
 in the same pass as we discussed before, or in different passes which is shown in this discussion). 

I think that the MIN_EXPR might be the right choice (especially for OST_MAXIMUM) -:)

Qing
> 
> Thanks,
> Sid


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: Another bug for __builtin_object_size? (Or expected behavior)
  2023-08-17 21:25               ` Qing Zhao
@ 2023-08-17 21:32                 ` Siddhesh Poyarekar
  2023-08-18 16:00                   ` Qing Zhao
  0 siblings, 1 reply; 13+ messages in thread
From: Siddhesh Poyarekar @ 2023-08-17 21:32 UTC (permalink / raw)
  To: Qing Zhao; +Cc: jakub Jelinek, gcc-patches

On 2023-08-17 17:25, Qing Zhao wrote:
>> It's not exactly the same issue, the earlier discussion was about choosing sizes in the same pass while the current one is about choosing between passes, but I agree it "rhymes".  This is what I was alluding to originally (for OST_MINIMUM use MIN_EXPR if both passes returned a pass) but I haven't thought about it hard enough to be 100% confident that it's the better solution, especially for OST_MAXIMUM.
> 
> We have two different sources to get SIZE information for the subobject:
> 
> 1. From the TYPESIZE information embedded in the IR;
> 2. From the initialization information propagated from data flow, this includes both malloc call and the DECL_INIT.
> 
> We need to choose between these two when both available, (these two information could be
>   in the same pass as we discussed before, or in different passes which is shown in this discussion).
> 
> I think that the MIN_EXPR might be the right choice (especially for OST_MAXIMUM) -:)

It's worth a shot I guess.  We could emit something like the following 
in early_object_sizes_execute_one:

   sz = (__bos(o->sub, ost) == unknown
         ? early_size
         : MIN_EXPR (__bos(o->sub, ost), early_size));

and see if it sticks.

Thanks,
Sid

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: Another bug for __builtin_object_size? (Or expected behavior)
  2023-08-17 21:32                 ` Siddhesh Poyarekar
@ 2023-08-18 16:00                   ` Qing Zhao
  2023-08-23 16:40                     ` Qing Zhao
  0 siblings, 1 reply; 13+ messages in thread
From: Qing Zhao @ 2023-08-18 16:00 UTC (permalink / raw)
  To: Siddhesh Poyarekar; +Cc: jakub Jelinek, gcc-patches



> On Aug 17, 2023, at 5:32 PM, Siddhesh Poyarekar <siddhesh@gotplt.org> wrote:
> 
> On 2023-08-17 17:25, Qing Zhao wrote:
>>> It's not exactly the same issue, the earlier discussion was about choosing sizes in the same pass while the current one is about choosing between passes, but I agree it "rhymes".  This is what I was alluding to originally (for OST_MINIMUM use MIN_EXPR if both passes returned a pass) but I haven't thought about it hard enough to be 100% confident that it's the better solution, especially for OST_MAXIMUM.
>> We have two different sources to get SIZE information for the subobject:
>> 1. From the TYPESIZE information embedded in the IR;
>> 2. From the initialization information propagated from data flow, this includes both malloc call and the DECL_INIT.
>> We need to choose between these two when both available, (these two information could be
>>  in the same pass as we discussed before, or in different passes which is shown in this discussion).
>> I think that the MIN_EXPR might be the right choice (especially for OST_MAXIMUM) -:)
> 
> It's worth a shot I guess.  We could emit something like the following in early_object_sizes_execute_one:
> 
>  sz = (__bos(o->sub, ost) == unknown
>        ? early_size
>        : MIN_EXPR (__bos(o->sub, ost), early_size));
> 
> and see if it sticks.

I came up with the following change for tree-object-size.cc:

diff --git a/gcc/tree-object-size.cc b/gcc/tree-object-size.cc
index a62af0500563..e1b2008c6dcc 100644
--- a/gcc/tree-object-size.cc
+++ b/gcc/tree-object-size.cc
@@ -2016,10 +2016,22 @@ do_valueize (tree t)
   return t;
 }
 
-/* Process a __builtin_object_size or __builtin_dynamic_object_size call in
-   CALL early for subobjects before any object information is lost due to
-   optimization.  Insert a MIN or MAX expression of the result and
-   __builtin_object_size at I so that it may be processed in the second pass.
+/* Process a __builtin_object_size or __builtin_dynamic_object_size call
+   early for subobjects before any object information is lost due to
+   optimization.
+
+   We have two different sources to get the size information for subobjects:
+   A. The TYPE information of the subobject in the IR;
+   B. The initialization information propagated through data flow.
+   In the early pass, only A is available.
+   B might be available in the second pass.
+
+   If both A and B are available, we should choose the minimum one between
+   these two.
+
+   Insert a MIN expression of the result from the early pass and the original
+   __builtin_object_size call at I so that it may be processed in the second pass.
+
    __builtin_dynamic_object_size is treated like __builtin_object_size here
    since we're only looking for constant bounds.  */
 
@@ -2036,7 +2048,7 @@ early_object_sizes_execute_one (gimple_stmt_iterator *i, gimple *call)
   unsigned HOST_WIDE_INT object_size_type = tree_to_uhwi (ost);
   tree ptr = gimple_call_arg (call, 0);
 
-  if (object_size_type != 1 && object_size_type != 3)
+  if (object_size_type & OST_SUBOBJECT == 0)
     return;
 
   if (TREE_CODE (ptr) != ADDR_EXPR && TREE_CODE (ptr) != SSA_NAME)
@@ -2050,9 +2062,8 @@ early_object_sizes_execute_one (gimple_stmt_iterator *i, gimple *call)
 
   tree tem = make_ssa_name (type);
   gimple_call_set_lhs (call, tem);
-  enum tree_code code = object_size_type & OST_MINIMUM ? MAX_EXPR : MIN_EXPR;
   tree cst = fold_convert (type, bytes);
-  gimple *g = gimple_build_assign (lhs, code, tem, cst);
+  gimple *g = gimple_build_assign (lhs, MIN_EXPR, tem, cst);
   gsi_insert_after (i, g, GSI_NEW_STMT);
   update_stmt (call);
 }

Let me know if you see any issue with the change.

thanks.

Qing

> 
> Thanks,
> Sid


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: Another bug for __builtin_object_size? (Or expected behavior)
  2023-08-18 16:00                   ` Qing Zhao
@ 2023-08-23 16:40                     ` Qing Zhao
  0 siblings, 0 replies; 13+ messages in thread
From: Qing Zhao @ 2023-08-23 16:40 UTC (permalink / raw)
  To: Siddhesh Poyarekar; +Cc: jakub Jelinek, gcc-patches



> On Aug 18, 2023, at 12:00 PM, Qing Zhao via Gcc-patches <gcc-patches@gcc.gnu.org> wrote:
> 
> 
> 
>> On Aug 17, 2023, at 5:32 PM, Siddhesh Poyarekar <siddhesh@gotplt.org> wrote:
>> 
>> On 2023-08-17 17:25, Qing Zhao wrote:
>>>> It's not exactly the same issue, the earlier discussion was about choosing sizes in the same pass while the current one is about choosing between passes, but I agree it "rhymes".  This is what I was alluding to originally (for OST_MINIMUM use MIN_EXPR if both passes returned a pass) but I haven't thought about it hard enough to be 100% confident that it's the better solution, especially for OST_MAXIMUM.
>>> We have two different sources to get SIZE information for the subobject:
>>> 1. From the TYPESIZE information embedded in the IR;
>>> 2. From the initialization information propagated from data flow, this includes both malloc call and the DECL_INIT.
>>> We need to choose between these two when both available, (these two information could be
>>> in the same pass as we discussed before, or in different passes which is shown in this discussion).
>>> I think that the MIN_EXPR might be the right choice (especially for OST_MAXIMUM) -:)
>> 
>> It's worth a shot I guess.  We could emit something like the following in early_object_sizes_execute_one:
>> 
>> sz = (__bos(o->sub, ost) == unknown
>>       ? early_size
>>       : MIN_EXPR (__bos(o->sub, ost), early_size));
>> 
>> and see if it sticks.
> 
> I came up with the following change for tree-object-size.cc:
> 
> diff --git a/gcc/tree-object-size.cc b/gcc/tree-object-size.cc
> index a62af0500563..e1b2008c6dcc 100644
> --- a/gcc/tree-object-size.cc
> +++ b/gcc/tree-object-size.cc
> @@ -2016,10 +2016,22 @@ do_valueize (tree t)
>   return t;
> }
> 
> -/* Process a __builtin_object_size or __builtin_dynamic_object_size call in
> -   CALL early for subobjects before any object information is lost due to
> -   optimization.  Insert a MIN or MAX expression of the result and
> -   __builtin_object_size at I so that it may be processed in the second pass.
> +/* Process a __builtin_object_size or __builtin_dynamic_object_size call
> +   early for subobjects before any object information is lost due to
> +   optimization.
> +
> +   We have two different sources to get the size information for subobjects:
> +   A. The TYPE information of the subobject in the IR;
> +   B. The initialization information propagated through data flow.
> +   In the early pass, only A is available.
> +   B might be available in the second pass.
> +
> +   If both A and B are available, we should choose the minimum one between
> +   these two.
> +
> +   Insert a MIN expression of the result from the early pass and the original
> +   __builtin_object_size call at I so that it may be processed in the second pass.
> +
>    __builtin_dynamic_object_size is treated like __builtin_object_size here
>    since we're only looking for constant bounds.  */
> 
> @@ -2036,7 +2048,7 @@ early_object_sizes_execute_one (gimple_stmt_iterator *i, gimple *call)
>   unsigned HOST_WIDE_INT object_size_type = tree_to_uhwi (ost);
>   tree ptr = gimple_call_arg (call, 0);
> 
> -  if (object_size_type != 1 && object_size_type != 3)
> +  if (object_size_type & OST_SUBOBJECT == 0)
>     return;
> 
>   if (TREE_CODE (ptr) != ADDR_EXPR && TREE_CODE (ptr) != SSA_NAME)
> @@ -2050,9 +2062,8 @@ early_object_sizes_execute_one (gimple_stmt_iterator *i, gimple *call)
> 
>   tree tem = make_ssa_name (type);
>   gimple_call_set_lhs (call, tem);
> -  enum tree_code code = object_size_type & OST_MINIMUM ? MAX_EXPR : MIN_EXPR;
>   tree cst = fold_convert (type, bytes);
> -  gimple *g = gimple_build_assign (lhs, code, tem, cst);
> +  gimple *g = gimple_build_assign (lhs, MIN_EXPR, tem, cst);
>   gsi_insert_after (i, g, GSI_NEW_STMT);
>   update_stmt (call);
> }
> 
> Let me know if you see any issue with the change.

I tested the above change, everything is fine except one testing case in gcc.dg/builtin-object-size-4.c

I reduced the failed case to the following small one:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Tests for strdup/strndup.  */
size_t
__attribute__ ((noinline))
test9 (void)
{
  const char *ptr = "abcdefghijklmnopqrstuvwxyz";
  char *res = strndup (ptr, 21);
  int n = 0;
  if ((n = __builtin_object_size (res, 3)) != 22)
    printf("FAIL is %d\n", n);

  free (res);
}

int
main (void)
{
  test9 ();
}
[opc@qinzhao-ol8u3-x86 gcc]$ sh t
FAIL is 1

I debugged into tree-object-size.cc, the routine “strdup_object_size”, and have two questions on two places:

1. For the following:

 844   /* In all other cases, return the size of SRC since the object size cannot
 845      exceed that.  We cannot do this for OST_MINIMUM unless SRC points into a
 846      string constant since otherwise the object size could go all the way down
 847      to zero.  */
…
 864       /* For maximum estimate, our next best guess is the object size of the
 865          source.  */
 866       if (size_unknown_p (sz, object_size_type)
 867           && !(object_size_type & OST_MINIMUM))
 868         compute_builtin_object_size (src, object_size_type, &sz);

I still don’t understand why for OST_MINIMUM, we cannot call “compute_builtin_object_size” at line 868?

2. For the following:

 871   /* String duplication allocates at least one byte, so we should never fail
 872      for OST_MINIMUM.  */
 873   if ((!size_valid_p (sz, object_size_type)
 874        || size_unknown_p (sz, object_size_type))
 875       && (object_size_type & OST_MINIMUM))
 876     sz = size_one_node;

I checked the doc for strdup/strndup, cannot find anyplace mentioning the routine returns at least one byte.

Where the above come from?

thanks.

Qing


> thanks.
> 
> Qing
> 
>> 
>> Thanks,
>> Sid


^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2023-08-23 16:40 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-08-16 15:59 Another bug for __builtin_object_size? (Or expected behavior) Qing Zhao
2023-08-16 20:16 ` Qing Zhao
2023-08-17 11:00 ` Siddhesh Poyarekar
2023-08-17 13:58   ` Qing Zhao
2023-08-17 17:49     ` Siddhesh Poyarekar
2023-08-17 19:27       ` Qing Zhao
2023-08-17 19:59         ` Siddhesh Poyarekar
2023-08-17 20:23           ` Qing Zhao
2023-08-17 20:57             ` Siddhesh Poyarekar
2023-08-17 21:25               ` Qing Zhao
2023-08-17 21:32                 ` Siddhesh Poyarekar
2023-08-18 16:00                   ` Qing Zhao
2023-08-23 16:40                     ` Qing Zhao

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).