public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* RFC (V2) the proposal to resolve the missing dependency issue for counted_by attribute
@ 2023-11-07  0:12 Qing Zhao
  2023-11-09 15:49 ` Qing Zhao
  0 siblings, 1 reply; 5+ messages in thread
From: Qing Zhao @ 2023-11-07  0:12 UTC (permalink / raw)
  To: richard Biener, Joseph Myers, Siddhesh Poyarekar, Martin Uecker,
	Jakub Jelinek
  Cc: Kees Cook, isanbard, GCC Patches

Hi,

Attached is the 2nd version of the proposal based on all the discussion so far.

Let me know if you have more comment and suggestion.

Thanks a lot for all the help.

Qing
===========================================
Represent the missing dependence for the "counted_by" attribute and its consumers 

Qing Zhao

11/06/2023
==============================================

The whole discussion is at:
https://gcc.gnu.org/pipermail/gcc-patches/2023-October/633783.html
https://gcc.gnu.org/pipermail/gcc-patches/2023-October/634844.html

1. The problem

There is a data dependency between the size assignment and the implicit use of the size information in the __builtin_dynamic_object_size that is missing in the IL (line 11 and line 13 in the below example). Such information missing will result incorrect code reordering and other code transformations. 

  1 struct A
  2 {
  3  size_t size;
  4  char buf[] __attribute__((counted_by(size)));
  5 };
  6 
  7 size_t 
  8 foo (size_t sz)
  9 {
 10  struct A *obj = __builtin_malloc (sizeof(struct A) + sz * sizeof(char));
 11  obj->size = sz;
 12  obj->buf[0] = 2;
 13  return __builtin_dynamic_object_size (obj->buf, 1);
 14 }
  
Please see a more complicate example in the Appendex 1.

We need to represent such data dependency correctly in the IL. 

2. The solution:

2.1 Summary

* Add a new internal function ".ACCESS_WITH_SIZE" to carry the size information for every reference to a FAM field;
* In C FE, Replace every reference to a FAM field whose TYPE has the "counted_by" attribute with the new internal function ".ACCESS_WITH_SIZE";
* In every consumer of the size information, for example, BDOS or array bound sanitizer, query the size information or ACCESS_MODE information from the new internal function;
* When the size information and the "ACCESS_MODE" information are not used anymore, possibly at the 2nd object size phase, replace the internal function with the actual reference to the FAM field; 
* Some adjustment to inlining heuristic, ipa alias analysis, and other SSA passes to mitigate the impact to the optimizer and code generation. 

2.2 The new internal function 

  .ACCESS_WITH_SIZE (REF_TO_OBJ, REF_TO_SIZE, CLASS_OF_SIZE, SIZE_OF_SIZE, ACCESS_MODE)

INTERNAL_FN (ACCESS_WITH_SIZE, ECF_LEAF | ECF_NOTHROW, NULL)

which returns the "REF_TO_OBJ" same as the 1st argument;

1st argument "REF_TO_OBJ": The reference to the object;
2nd argument "REF_TO_SIZE": The reference to the size of the object, 
3rd argument "CLASS_OF_SIZE": The size referenced by the REF_TO_SIZE represents 
   0: unknown;
   1: the number of the elements of the object type;
   2: the number of bytes; 
4th argument "SIZE_OF_SIZE": how many bytes is the object that REF_TO_SIZE points;
5th argument "ACCESS_MODE": 
  -1: Unknown access semantics
   0: none
   1: read_only
   2: write_only
   3: read_write

NOTEs, 
  A. This new internal function is intended for a more general use from all the 3 attributes, "access", "alloc_size", and the new "counted_by", to encode the "size" and "access_mode" information to the corresponding pointer. (in order to resolve PR96503, etc. https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96503);
  B. For "counted_by" the 3rd argument will be 1;
  C. For "counted_by" and "alloc_size" attributes, the 5th argument will be -1;   
  D. In this wrieup, we focus on the implementation details for the "counted_by" attribute. However, this function should be ready to be used by "access" and "alloc_size" without issue. 

2.3 A new semantic requirement in the user documentation of "counted_by"

For the following structure including a FAM with a counted_by attribute:

  struct A
  {
   size_t size;
   char buf[] __attribute__((counted_by(size)));
  };

for any object with such type:

  struct A *obj = __builtin_malloc (sizeof(struct A) + sz * sizeof(char));

The initialization to the size field should be done before the first reference to the FAM field,
Otherwise, the behavior is undefined.
Such additional requirement to the user will guarantee that the first reference to the FAM knows the size of the FAM.  

Another thing that need to be clarified is:
A later reference to the FAM field will use the latest value assigned to the size field before that reference. For example, 
 obj->size = val1;
 ref1 (obj->buf);
 obj->size = val2;
 ref2 (obj->buf);
in the above, "ref1" will use val1 and "ref2" will use val2. 
This clarification will inform user that the dynamic array feature is fully supported.

We need to add the above additional requirement and clarification to the user documentation.
The complete user documentation is in Appendix 2. 

2.4 Replace the reference to a FAM field with the new function .ACCESS_WITH_SIZE

In C FE:

for every reference to a FAM, for example, "obj->buf" in the small example,
  check whether the corresponding FIELD_DECL has a "counted_by" attribute?
  if YES, replace the reference to "obj->buf" with a call to
      .ACCESS_WITH_SIZE (obj->buf, &obj->size, 1, sizeof(obj->size), -1); 

This includes the case when the object is statically allocated and initialized as following example:

static struct A x = { sizeof "hello", "hello" };
static char *y = &x.buf;

The reference to x.buf should be replaced by the .ACCESS_WITH_SIZE (&x.buf, &x.size, 1, sizeof(x.size), -1).


2.5 Query the size info 

There are multiple consumers of the size info (and ACCESS_MODE info):

  * __builtin_dynamic_object_size;
  * array bound sanitizer;

in these consumers, get the size info from the 2nd, 3rd, and 4th arguments of the call to
ACCESS_WITH_SIZE (REF_TO_OBJ, REF_TO_SIZE, 1, sizeof(SIZE), -1)

2.6 Eliminate the internal function when not useful anymore

After the last consumer of the size information in the ACCESS_WITH_SIZE, We should replace the internal call with its first argument.

Do it in the 2nd object size phase. 

2.7 Adjustment to inlining heuristic, IPA alias analysis and other IPA analysis

the FE changes:

obj->buf

to

_1 = obj->buf;
_2 = &(obj->size);
.ACCESS_WITH_SIZE (_1, _2, 1, sizeof (obj->size), -1)

there are major two changes:

  A. the # of Insns of the routine will be increased,
  B. additional calls to an non-pure internal routines.

In order to minimize the impact on code generation and optimizaitons, We might need to

  * Adjust Inlining decision to ignore the additional insns and calls;
  * Adjust the routine "call_may_clobber_ref_p_1" in tree-ssa-alias.cc to exclude the new internal .ACCESS_WITH_SIZE from clobbering anything. 


3. The patch sets:

We will provide the following patches:

  3.1 Provide counted_by attribute to flexible array member field;
      which includes:
      * "counted_by" attribute documentation;  
      * C FE handling of the new attribute; 
        syntax checking, error reporting;
      * testing cases; (aditional testing cases for staticly initialized object)

  3.2 Convert "counted_by" attribute to/from .ACCESS_WITH_SIZE.  
      which includes:
      * The definition of the new internal function .ACCESS_WITH_SIZE in internal-fn.def. 
      * C FE converts every reference to a FAM with "counted_by" attribute to a call to the internal function .ACCESS_WITH_SIZE.
        (build_component_ref in c_typeck.cc)
      * Convert every call to .ACCESS_WITH_SIZE to its first argument.
        (After the size and access_mode information are used in the 2nd object size phase, we need to convert .ACCESS_WITH_SIZE back to its first argument as soon as possible in order to minimize the impact to other phases after object size phase. So, do this in the end of the 2nd object size phase)
      * adjust inlining heuristic for the additional call. 
      * adjust alias analysis to exclude the new internal from clobbering anything.
        (call_may_clobber_ref_p_1 in tree-ssa-alias.cc)
      * verify a call to .ACCESS_WITH_SIZE in tree-cfg.cc.
      * provide utility routines to get the size or ACCESS_MODE from the call to .ACCESS_WITH_SIZE. 

  3.3 Use the .ACCESS_WITH_SIZE in builtin object size (sub-object only)
      which includes: 
      * use the size info of the .ACCESS_WITH_SIZE for sub-object.
      * testing cases. (additional testing cases for dynamic array support)

  3.4 Use the .ACCESS_WITH_SIZE in bound sanitizer
      which includes:
      * use the size info of the .ACCESS_WITH_SIZE for bound sanitizer.
      * testing cases. (additional testing cases for dynamic array support)

  3.5 Improve __bdos to use the counted_by info in whole-object size for the structure with FAM. 
  3.6 Emit warnings when the user breaks the requirments for the new counted_by attribute
      compilation time: -Wcounted-by
      run time: -fsanitizer=counted-by
      
      * The initialization to the size field should be done before the first reference to the FAM field. 
      * the array has at least # of elements specified by the size field all the time during the program.  
 

4. Future work
  
  4.1 Use the new .ACCESS_WITH_SIZE for other relative attributes, for example, "access" and "alloc_size", to resolve the known issues for them:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96503

  4.2 Add "counted_by" attribute to general pointers as proposed by the -fbounds-safety proposal to LLVM from Apple (https://discourse.llvm.org/t/rfc-enforcing-bounds-safety-in-c-fbounds-safety/70854).  

the .ACCESS_WITH_SIZE approach should be adopted naturally. 


Appendex 1. A more complicated example: 

  1 #include <malloc.h>
  2 struct A
  3 {
  4  size_t size;
  5  char buf[] __attribute__((counted_by(size)));
  6 };
  7 
  8 static size_t
  9 get_size_from (void *ptr)
 10 {
 11  return __builtin_dynamic_object_size (ptr, 1);
 12 }
 13 
 14 void
 15 foo (size_t sz)
 16 {
 17  struct A *obj = __builtin_malloc (sizeof(struct A) + sz * sizeof(char));
 18  obj->size = sz;
 19  obj->buf[0] = 2;
 20  __builtin_printf (“%d\n", get_size_from (obj->buf));
 21  return;
 22 }
 23 
 24 int main ()
 25 {
 26  foo (20);
 27  return 0;
 28 }

In this example, the instantiation of the object at line 17 is not in the same routine as the _BDOS call at line 11. 

Appendex 2. The user documentation:

'counted_by (COUNT)'
     The 'counted_by' attribute may be attached to the flexible array
     member of a structure.  It indicates that the number of the
     elements of the array is given by the field named "COUNT" in the
     same structure as the flexible array member.  GCC uses this
     information to improve the results of the array bound sanitizer and
     the '__builtin_dynamic_object_size'.

     For instance, the following code:

          struct P {
            size_t count;
            char other;
            char array[] __attribute__ ((counted_by (count)));
          } *p;

     specifies that the 'array' is a flexible array member whose number
     of elements is given by the field 'count' in the same structure.

     The field that represents the number of the elements should have an
     integer type. Otherwise, the compiler will report a warning and ignore 
     the attribute.

     An explicit 'counted_by' annotation defines a relationship between
     two objects, 'p->array' and 'p->count', and there are the following
     requirementthat on the relationship between this pair:

     A. 'p->count' should be initialized before the first reference to 
        'p->array'.
     B. "p->array' has _at least_ 'p->count' number of elements available
        all the time.
        This requirement must hold even after any of these related objects
        are updated during the program.

     It's the user's responsibility to make sure the above requirments to 
     be kept all the time.  Otherwise the compiler will report warnings, 
     at the same time, the results of the array bound sanitizer and the
     '__builtin_dynamic_object_size' is undefined. 

     One important feature of the attribute is, A reference to the flexible
     array member field will use the latest value assigned to the field that
     represent the number of the elements before that reference. For example,
       p->count = val1;
       ref1 (p->array);
       p->count = val2;
       ref2 (p->array);
     in the above, 'ref1' will use 'val1' as the number of the elements in
     'p->array', and "ref2" will use 'val2' as the number of elements in
     'p->array'.


Appendex 3. Other approaches that have been discussed:

A.  Add an additional argument, the size parameter, to the call to __bdos, 
      A.1, during FE;
      A.2, during gimplification phase;
B.  Encode the implicit USE in the type of size, to make the size as “volatile”;
C.  Encode the implicit USE in the type of buf, then update the optimization passes to use this implicit USE encoded in the type of buf.

** Approach A (both A.1 and A.2) will not work if the object instantiation is not in the same routine as the __bdos call as shown in the above example;
** Approach B will have a bigger performance impact due to the "volatile" will inhibit a lot of optimizations; 
** Approach C will involve a lot of changes in GCC, and also not very necessary since the ONLY implicit use of the size parameter is in the __bdos call when __bdos can see the real object.

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: RFC (V2) the proposal to resolve the missing dependency issue for counted_by attribute
  2023-11-07  0:12 RFC (V2) the proposal to resolve the missing dependency issue for counted_by attribute Qing Zhao
@ 2023-11-09 15:49 ` Qing Zhao
  2023-11-09 15:57   ` Jakub Jelinek
  0 siblings, 1 reply; 5+ messages in thread
From: Qing Zhao @ 2023-11-09 15:49 UTC (permalink / raw)
  To: richard Biener, Joseph Myers, Siddhesh Poyarekar, Martin Uecker,
	Jakub Jelinek
  Cc: Kees Cook, isanbard, GCC Patches

Is it reasonable to add one option to disable the “counted_by” attribute?
(then no insertion of the new .ACCESS_WITH_SIZE into IL).  

The major reason is: some users might want to ignore all the “counted_by” attribute added in the source code,
We need to provide them a way to disable this feature.

thanks.

Qing

> On Nov 6, 2023, at 7:12 PM, Qing Zhao <qing.zhao@oracle.com> wrote:
> 
> Hi,
> 
> Attached is the 2nd version of the proposal based on all the discussion so far.
> 
> Let me know if you have more comment and suggestion.
> 
> Thanks a lot for all the help.
> 
> Qing
> ===========================================
> Represent the missing dependence for the "counted_by" attribute and its consumers 
> 
> Qing Zhao
> 
> 11/06/2023
> ==============================================
> 
> The whole discussion is at:
> https://gcc.gnu.org/pipermail/gcc-patches/2023-October/633783.html
> https://gcc.gnu.org/pipermail/gcc-patches/2023-October/634844.html
> 
> 1. The problem
> 
> There is a data dependency between the size assignment and the implicit use of the size information in the __builtin_dynamic_object_size that is missing in the IL (line 11 and line 13 in the below example). Such information missing will result incorrect code reordering and other code transformations. 
> 
>  1 struct A
>  2 {
>  3  size_t size;
>  4  char buf[] __attribute__((counted_by(size)));
>  5 };
>  6 
>  7 size_t 
>  8 foo (size_t sz)
>  9 {
> 10  struct A *obj = __builtin_malloc (sizeof(struct A) + sz * sizeof(char));
> 11  obj->size = sz;
> 12  obj->buf[0] = 2;
> 13  return __builtin_dynamic_object_size (obj->buf, 1);
> 14 }
> 
> Please see a more complicate example in the Appendex 1.
> 
> We need to represent such data dependency correctly in the IL. 
> 
> 2. The solution:
> 
> 2.1 Summary
> 
> * Add a new internal function ".ACCESS_WITH_SIZE" to carry the size information for every reference to a FAM field;
> * In C FE, Replace every reference to a FAM field whose TYPE has the "counted_by" attribute with the new internal function ".ACCESS_WITH_SIZE";
> * In every consumer of the size information, for example, BDOS or array bound sanitizer, query the size information or ACCESS_MODE information from the new internal function;
> * When the size information and the "ACCESS_MODE" information are not used anymore, possibly at the 2nd object size phase, replace the internal function with the actual reference to the FAM field; 
> * Some adjustment to inlining heuristic, ipa alias analysis, and other SSA passes to mitigate the impact to the optimizer and code generation. 
> 
> 2.2 The new internal function 
> 
>  .ACCESS_WITH_SIZE (REF_TO_OBJ, REF_TO_SIZE, CLASS_OF_SIZE, SIZE_OF_SIZE, ACCESS_MODE)
> 
> INTERNAL_FN (ACCESS_WITH_SIZE, ECF_LEAF | ECF_NOTHROW, NULL)
> 
> which returns the "REF_TO_OBJ" same as the 1st argument;
> 
> 1st argument "REF_TO_OBJ": The reference to the object;
> 2nd argument "REF_TO_SIZE": The reference to the size of the object, 
> 3rd argument "CLASS_OF_SIZE": The size referenced by the REF_TO_SIZE represents 
>   0: unknown;
>   1: the number of the elements of the object type;
>   2: the number of bytes; 
> 4th argument "SIZE_OF_SIZE": how many bytes is the object that REF_TO_SIZE points;
> 5th argument "ACCESS_MODE": 
>  -1: Unknown access semantics
>   0: none
>   1: read_only
>   2: write_only
>   3: read_write
> 
> NOTEs, 
>  A. This new internal function is intended for a more general use from all the 3 attributes, "access", "alloc_size", and the new "counted_by", to encode the "size" and "access_mode" information to the corresponding pointer. (in order to resolve PR96503, etc. https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96503);
>  B. For "counted_by" the 3rd argument will be 1;
>  C. For "counted_by" and "alloc_size" attributes, the 5th argument will be -1;   
>  D. In this wrieup, we focus on the implementation details for the "counted_by" attribute. However, this function should be ready to be used by "access" and "alloc_size" without issue. 
> 
> 2.3 A new semantic requirement in the user documentation of "counted_by"
> 
> For the following structure including a FAM with a counted_by attribute:
> 
>  struct A
>  {
>   size_t size;
>   char buf[] __attribute__((counted_by(size)));
>  };
> 
> for any object with such type:
> 
>  struct A *obj = __builtin_malloc (sizeof(struct A) + sz * sizeof(char));
> 
> The initialization to the size field should be done before the first reference to the FAM field,
> Otherwise, the behavior is undefined.
> Such additional requirement to the user will guarantee that the first reference to the FAM knows the size of the FAM.  
> 
> Another thing that need to be clarified is:
> A later reference to the FAM field will use the latest value assigned to the size field before that reference. For example, 
> obj->size = val1;
> ref1 (obj->buf);
> obj->size = val2;
> ref2 (obj->buf);
> in the above, "ref1" will use val1 and "ref2" will use val2. 
> This clarification will inform user that the dynamic array feature is fully supported.
> 
> We need to add the above additional requirement and clarification to the user documentation.
> The complete user documentation is in Appendix 2. 
> 
> 2.4 Replace the reference to a FAM field with the new function .ACCESS_WITH_SIZE
> 
> In C FE:
> 
> for every reference to a FAM, for example, "obj->buf" in the small example,
>  check whether the corresponding FIELD_DECL has a "counted_by" attribute?
>  if YES, replace the reference to "obj->buf" with a call to
>      .ACCESS_WITH_SIZE (obj->buf, &obj->size, 1, sizeof(obj->size), -1); 
> 
> This includes the case when the object is statically allocated and initialized as following example:
> 
> static struct A x = { sizeof "hello", "hello" };
> static char *y = &x.buf;
> 
> The reference to x.buf should be replaced by the .ACCESS_WITH_SIZE (&x.buf, &x.size, 1, sizeof(x.size), -1).
> 
> 
> 2.5 Query the size info 
> 
> There are multiple consumers of the size info (and ACCESS_MODE info):
> 
>  * __builtin_dynamic_object_size;
>  * array bound sanitizer;
> 
> in these consumers, get the size info from the 2nd, 3rd, and 4th arguments of the call to
> ACCESS_WITH_SIZE (REF_TO_OBJ, REF_TO_SIZE, 1, sizeof(SIZE), -1)
> 
> 2.6 Eliminate the internal function when not useful anymore
> 
> After the last consumer of the size information in the ACCESS_WITH_SIZE, We should replace the internal call with its first argument.
> 
> Do it in the 2nd object size phase. 
> 
> 2.7 Adjustment to inlining heuristic, IPA alias analysis and other IPA analysis
> 
> the FE changes:
> 
> obj->buf
> 
> to
> 
> _1 = obj->buf;
> _2 = &(obj->size);
> .ACCESS_WITH_SIZE (_1, _2, 1, sizeof (obj->size), -1)
> 
> there are major two changes:
> 
>  A. the # of Insns of the routine will be increased,
>  B. additional calls to an non-pure internal routines.
> 
> In order to minimize the impact on code generation and optimizaitons, We might need to
> 
>  * Adjust Inlining decision to ignore the additional insns and calls;
>  * Adjust the routine "call_may_clobber_ref_p_1" in tree-ssa-alias.cc to exclude the new internal .ACCESS_WITH_SIZE from clobbering anything. 
> 
> 
> 3. The patch sets:
> 
> We will provide the following patches:
> 
>  3.1 Provide counted_by attribute to flexible array member field;
>      which includes:
>      * "counted_by" attribute documentation;  
>      * C FE handling of the new attribute; 
>        syntax checking, error reporting;
>      * testing cases; (aditional testing cases for staticly initialized object)
> 
>  3.2 Convert "counted_by" attribute to/from .ACCESS_WITH_SIZE.  
>      which includes:
>      * The definition of the new internal function .ACCESS_WITH_SIZE in internal-fn.def. 
>      * C FE converts every reference to a FAM with "counted_by" attribute to a call to the internal function .ACCESS_WITH_SIZE.
>        (build_component_ref in c_typeck.cc)
>      * Convert every call to .ACCESS_WITH_SIZE to its first argument.
>        (After the size and access_mode information are used in the 2nd object size phase, we need to convert .ACCESS_WITH_SIZE back to its first argument as soon as possible in order to minimize the impact to other phases after object size phase. So, do this in the end of the 2nd object size phase)
>      * adjust inlining heuristic for the additional call. 
>      * adjust alias analysis to exclude the new internal from clobbering anything.
>        (call_may_clobber_ref_p_1 in tree-ssa-alias.cc)
>      * verify a call to .ACCESS_WITH_SIZE in tree-cfg.cc.
>      * provide utility routines to get the size or ACCESS_MODE from the call to .ACCESS_WITH_SIZE. 
> 
>  3.3 Use the .ACCESS_WITH_SIZE in builtin object size (sub-object only)
>      which includes: 
>      * use the size info of the .ACCESS_WITH_SIZE for sub-object.
>      * testing cases. (additional testing cases for dynamic array support)
> 
>  3.4 Use the .ACCESS_WITH_SIZE in bound sanitizer
>      which includes:
>      * use the size info of the .ACCESS_WITH_SIZE for bound sanitizer.
>      * testing cases. (additional testing cases for dynamic array support)
> 
>  3.5 Improve __bdos to use the counted_by info in whole-object size for the structure with FAM. 
>  3.6 Emit warnings when the user breaks the requirments for the new counted_by attribute
>      compilation time: -Wcounted-by
>      run time: -fsanitizer=counted-by
> 
>      * The initialization to the size field should be done before the first reference to the FAM field. 
>      * the array has at least # of elements specified by the size field all the time during the program.  
> 
> 
> 4. Future work
> 
>  4.1 Use the new .ACCESS_WITH_SIZE for other relative attributes, for example, "access" and "alloc_size", to resolve the known issues for them:
> 
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96503
> 
>  4.2 Add "counted_by" attribute to general pointers as proposed by the -fbounds-safety proposal to LLVM from Apple (https://discourse.llvm.org/t/rfc-enforcing-bounds-safety-in-c-fbounds-safety/70854).  
> 
> the .ACCESS_WITH_SIZE approach should be adopted naturally. 
> 
> 
> Appendex 1. A more complicated example: 
> 
>  1 #include <malloc.h>
>  2 struct A
>  3 {
>  4  size_t size;
>  5  char buf[] __attribute__((counted_by(size)));
>  6 };
>  7 
>  8 static size_t
>  9 get_size_from (void *ptr)
> 10 {
> 11  return __builtin_dynamic_object_size (ptr, 1);
> 12 }
> 13 
> 14 void
> 15 foo (size_t sz)
> 16 {
> 17  struct A *obj = __builtin_malloc (sizeof(struct A) + sz * sizeof(char));
> 18  obj->size = sz;
> 19  obj->buf[0] = 2;
> 20  __builtin_printf (“%d\n", get_size_from (obj->buf));
> 21  return;
> 22 }
> 23 
> 24 int main ()
> 25 {
> 26  foo (20);
> 27  return 0;
> 28 }
> 
> In this example, the instantiation of the object at line 17 is not in the same routine as the _BDOS call at line 11. 
> 
> Appendex 2. The user documentation:
> 
> 'counted_by (COUNT)'
>     The 'counted_by' attribute may be attached to the flexible array
>     member of a structure.  It indicates that the number of the
>     elements of the array is given by the field named "COUNT" in the
>     same structure as the flexible array member.  GCC uses this
>     information to improve the results of the array bound sanitizer and
>     the '__builtin_dynamic_object_size'.
> 
>     For instance, the following code:
> 
>          struct P {
>            size_t count;
>            char other;
>            char array[] __attribute__ ((counted_by (count)));
>          } *p;
> 
>     specifies that the 'array' is a flexible array member whose number
>     of elements is given by the field 'count' in the same structure.
> 
>     The field that represents the number of the elements should have an
>     integer type. Otherwise, the compiler will report a warning and ignore 
>     the attribute.
> 
>     An explicit 'counted_by' annotation defines a relationship between
>     two objects, 'p->array' and 'p->count', and there are the following
>     requirementthat on the relationship between this pair:
> 
>     A. 'p->count' should be initialized before the first reference to 
>        'p->array'.
>     B. "p->array' has _at least_ 'p->count' number of elements available
>        all the time.
>        This requirement must hold even after any of these related objects
>        are updated during the program.
> 
>     It's the user's responsibility to make sure the above requirments to 
>     be kept all the time.  Otherwise the compiler will report warnings, 
>     at the same time, the results of the array bound sanitizer and the
>     '__builtin_dynamic_object_size' is undefined. 
> 
>     One important feature of the attribute is, A reference to the flexible
>     array member field will use the latest value assigned to the field that
>     represent the number of the elements before that reference. For example,
>       p->count = val1;
>       ref1 (p->array);
>       p->count = val2;
>       ref2 (p->array);
>     in the above, 'ref1' will use 'val1' as the number of the elements in
>     'p->array', and "ref2" will use 'val2' as the number of elements in
>     'p->array'.
> 
> 
> Appendex 3. Other approaches that have been discussed:
> 
> A.  Add an additional argument, the size parameter, to the call to __bdos, 
>      A.1, during FE;
>      A.2, during gimplification phase;
> B.  Encode the implicit USE in the type of size, to make the size as “volatile”;
> C.  Encode the implicit USE in the type of buf, then update the optimization passes to use this implicit USE encoded in the type of buf.
> 
> ** Approach A (both A.1 and A.2) will not work if the object instantiation is not in the same routine as the __bdos call as shown in the above example;
> ** Approach B will have a bigger performance impact due to the "volatile" will inhibit a lot of optimizations; 
> ** Approach C will involve a lot of changes in GCC, and also not very necessary since the ONLY implicit use of the size parameter is in the __bdos call when __bdos can see the real object.


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: RFC (V2) the proposal to resolve the missing dependency issue for counted_by attribute
  2023-11-09 15:49 ` Qing Zhao
@ 2023-11-09 15:57   ` Jakub Jelinek
  2023-11-09 16:50     ` Jose E. Marchesi
  0 siblings, 1 reply; 5+ messages in thread
From: Jakub Jelinek @ 2023-11-09 15:57 UTC (permalink / raw)
  To: Qing Zhao
  Cc: richard Biener, Joseph Myers, Siddhesh Poyarekar, Martin Uecker,
	Kees Cook, isanbard, GCC Patches

On Thu, Nov 09, 2023 at 03:49:49PM +0000, Qing Zhao wrote:
> Is it reasonable to add one option to disable the “counted_by” attribute?
> (then no insertion of the new .ACCESS_WITH_SIZE into IL).  
> 
> The major reason is: some users might want to ignore all the “counted_by” attribute added in the source code,
> We need to provide them a way to disable this feature.

-D'counted_by(x)='
and/or
-D'__counted_by__(x)='
?

	Jakub


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: RFC (V2) the proposal to resolve the missing dependency issue for counted_by attribute
  2023-11-09 15:57   ` Jakub Jelinek
@ 2023-11-09 16:50     ` Jose E. Marchesi
  2023-11-09 17:23       ` Qing Zhao
  0 siblings, 1 reply; 5+ messages in thread
From: Jose E. Marchesi @ 2023-11-09 16:50 UTC (permalink / raw)
  To: Jakub Jelinek
  Cc: Qing Zhao, richard Biener, Joseph Myers, Siddhesh Poyarekar,
	Martin Uecker, Kees Cook, isanbard, GCC Patches


> On Thu, Nov 09, 2023 at 03:49:49PM +0000, Qing Zhao wrote:
>> Is it reasonable to add one option to disable the “counted_by” attribute?
>> (then no insertion of the new .ACCESS_WITH_SIZE into IL).  
>> 
>> The major reason is: some users might want to ignore all the “counted_by” attribute added in the source code,
>> We need to provide them a way to disable this feature.
>
> -D'counted_by(x)='
> and/or
> -D'__counted_by__(x)='
> ?

The insertion of .ACCESS_WITH_SIZE collides with the BPF CO-RE
preserve_access_index implementation.

I don't think this will be a problem in practice (the BPF program can
define counted_by to the empty string as Jakub suggests) but we ought to
at least detect when a data structure featuring a counted_by FMA is
accessed with access index preservation (either attribute or builtin)
and either error out or warning out and try to accomodate by turning the
.ACCESS_WTIH_INDEX back to plain accesses.  We can do either with BPF
specific backend code.

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: RFC (V2) the proposal to resolve the missing dependency issue for counted_by attribute
  2023-11-09 16:50     ` Jose E. Marchesi
@ 2023-11-09 17:23       ` Qing Zhao
  0 siblings, 0 replies; 5+ messages in thread
From: Qing Zhao @ 2023-11-09 17:23 UTC (permalink / raw)
  To: Jose Marchesi
  Cc: Jakub Jelinek, richard Biener, Joseph Myers, Siddhesh Poyarekar,
	Martin Uecker, Kees Cook, isanbard, GCC Patches



> On Nov 9, 2023, at 11:50 AM, Jose Marchesi <jose.marchesi@oracle.com> wrote:
> 
>> 
>> On Thu, Nov 09, 2023 at 03:49:49PM +0000, Qing Zhao wrote:
>>> Is it reasonable to add one option to disable the “counted_by” attribute?
>>> (then no insertion of the new .ACCESS_WITH_SIZE into IL).  
>>> 
>>> The major reason is: some users might want to ignore all the “counted_by” attribute added in the source code,
>>> We need to provide them a way to disable this feature.
>> 
>> -D'counted_by(x)='
>> and/or
>> -D'__counted_by__(x)='
>> ?
> 
> The insertion of .ACCESS_WITH_SIZE collides with the BPF CO-RE
> preserve_access_index implementation.
> 
> I don't think this will be a problem in practice (the BPF program can
> define counted_by to the empty string as Jakub suggests) but we ought to
> at least detect when a data structure featuring a counted_by FMA is
> accessed with access index preservation (either attribute or builtin)
> and either error out or warning out and try to accomodate by turning the
> .ACCESS_WTIH_INDEX back to plain accesses.  We can do either with BPF
> specific backend code.

Yes, I agree that handling this in BPF backend code might be a better approach
 since this is really a BPF CO-RE specific issue.

For the counted_by implementation, I will keep the current design.

But I will add this identified BPF CO-RE issue into the proposal as a known issue for record purpose.

Thanks a lot for raising this issue and the possible solutions.

Qing


^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2023-11-09 17:23 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-11-07  0:12 RFC (V2) the proposal to resolve the missing dependency issue for counted_by attribute Qing Zhao
2023-11-09 15:49 ` Qing Zhao
2023-11-09 15:57   ` Jakub Jelinek
2023-11-09 16:50     ` Jose E. Marchesi
2023-11-09 17:23       ` Qing Zhao

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).