public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* [PATCH v4 0/4]New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)
@ 2024-01-24  0:29 Qing Zhao
  2024-01-24  0:29 ` [PATCH v4 1/4] Provide counted_by attribute to flexible array member field (PR108896) Qing Zhao
                   ` (5 more replies)
  0 siblings, 6 replies; 23+ messages in thread
From: Qing Zhao @ 2024-01-24  0:29 UTC (permalink / raw)
  To: joseph, richard.guenther, jakub, siddhesh, uecker
  Cc: keescook, isanbard, gcc-patches, Qing Zhao

Hi,

This is the 4th version of the patch.

It based on the following proposal:

https://gcc.gnu.org/pipermail/gcc-patches/2023-November/635884.html
Represent the missing dependence for the "counted_by" attribute and its consumers

******The summary of the proposal is:

* Add a new internal function ".ACCESS_WITH_SIZE" to carry the size information for every reference to a FAM field;
* In C FE, Replace every reference to a FAM field whose TYPE has the "counted_by" attribute with the new internal function ".ACCESS_WITH_SIZE";
* In every consumer of the size information, for example, BDOS or array bound sanitizer, query the size information or ACCESS_MODE information from the new internal function;
* When expansing to RTL, replace the internal function with the actual reference to the FAM field;
* Some adjustment to ipa alias analysis, and other SSA passes to mitigate the impact to the optimizer and code generation.


******The new internal function

  .ACCESS_WITH_SIZE (REF_TO_OBJ, REF_TO_SIZE, CLASS_OF_SIZE, SIZE_OF_SIZE, ACCESS_MODE, INDEX)

INTERNAL_FN (ACCESS_WITH_SIZE, ECF_LEAF | ECF_NOTHROW, NULL)

which returns the "REF_TO_OBJ" same as the 1st argument;

Both the return type and the type of the first argument of this function have been converted from the incomplete array type to the corresponding pointer type.

Please see the following link for why:
https://gcc.gnu.org/pipermail/gcc-patches/2023-November/638793.html
https://gcc.gnu.org/pipermail/gcc-patches/2023-December/639605.html

1st argument "REF_TO_OBJ": The reference to the object;
2nd argument "REF_TO_SIZE": The reference to the size of the object,
3rd argument "CLASS_OF_SIZE": The size referenced by the REF_TO_SIZE represents
   0: unknown;
   1: the number of the elements of the object type;
   2: the number of bytes;
4th argument "PRECISION_OF_SIZE": The precision of the integer that REF_TO_SIZE points;
5th argument "ACCESS_MODE":
  -1: Unknown access semantics
   0: none
   1: read_only
   2: write_only
   3: read_write
6th argument "INDEX": the INDEX for the original array reference.
  -1: Unknown

NOTE: The 6th Argument is added for bound sanitizer instrumentation.

****** The Patch sets included:

1. Provide counted_by attribute to flexible array member field;
      which includes:
      * "counted_by" attribute documentation;
      * C FE handling of the new attribute;
        syntax checking, error reporting;
      * testing cases;

2. Convert "counted_by" attribute to/from .ACCESS_WITH_SIZE.
      which includes:
      * The definition of the new internal function .ACCESS_WITH_SIZE in internal-fn.def.
      * C FE converts every reference to a FAM with "counted_by" attribute to a call to the internal function .ACCESS_WITH_SIZE.
        (build_component_ref in c_typeck.cc)
        This includes the case when the object is statically allocated and initialized.
        In order to make this working, we should update initializer_constant_valid_p_1 and output_constant in varasm.cc to include calls to .ACCESS_WITH_SIZE.

        However, for the reference inside "offsetof", ignore the "counted_by" attribute since it's not useful at all. (c_parser_postfix_expression in c/c-parser.cc)

      * Convert every call to .ACCESS_WITH_SIZE to its first argument.
        (expand_ACCESS_WITH_SIZE in internal-fn.cc)
      * adjust alias analysis to exclude the new internal from clobbering anything.
        (ref_maybe_used_by_call_p_1 and call_may_clobber_ref_p_1 in tree-ssa-alias.cc)
      * adjust dead code elimination to eliminate the call to .ACCESS_WITH_SIZE when
        it's LHS is eliminated as dead code.
        (eliminate_unnecessary_stmts in tree-ssa-dce.cc)
      * Provide the utility routines to check the call is .ACCESS_WITH_SIZE and
        get the reference from the call to .ACCESS_WITH_SIZE.
        (is_access_with_size_p and get_ref_from_access_with_size in tree.cc)
      * testing cases. (for offsetof, static initialization, generation of calls to
        .ACCESS_WITH_SIZE, code runs correctly after calls to .ACCESS_WITH_SIZE are
        converted back)

3. Use the .ACCESS_WITH_SIZE in builtin object size (sub-object only)
      which includes:
      * use the size info of the .ACCESS_WITH_SIZE for sub-object.
      * testing cases. 

4 Use the .ACCESS_WITH_SIZE in bound sanitizer
      Since the result type of the call to .ACCESS_WITH_SIZE is a pointer to
        the element type. The original array_ref is converted to an indirect_ref,
        which introduced two issues for the instrumenation of bound sanitizer:

        A. The index for the original array_ref was mixed into the offset
        expression for the new indirect_ref.

        In order to make the instrumentation for the bound sanitizer easier, one
        more argument for the function .ACCESS_WITH_SIZE is added to record this
        original index for the array_ref. then later during bound instrumentation,
        get the index from the additional argument from the call to the function
        .ACCESS_WITH_SIZE.

        B. In the current bound sanitizer, no instrumentation will be inserted for
        an indirect_ref.

        In order to add instrumentation for an indirect_ref with a call to
        .ACCESS_WITH_SIZE, we should specially handle the indirect_ref with a
        call to .ACCESS_WITH_SIZE, and get the index and bound info from the
        arguments of the call.

      which includes:
      * Record the index to the 6th argument of the call to .ACCESS_WITH_SIZE.
      * Instrument indirect_ref with call to .ACCESS_WITH_SIZE for bound sanitizer.
      * testing cases. (additional testing cases for dynamic array support)

******Remaining works: 

5  Improve __bdos to use the counted_by info in whole-object size for the structure with FAM.
6  Emit warnings when the user breaks the requirments for the new counted_by attribute
   compilation time: -Wcounted-by
   run time: -fsanitizer=counted-by
      * The initialization to the size field should be done before the first reference to the FAM field.
      * the array has at least # of elements specified by the size field all the time during the program.

I have bootstrapped and regression tested on both x86 and aarch64, no issue.

Let me know your comments.

thanks.

Qing

^ permalink raw reply	[flat|nested] 23+ messages in thread

* [PATCH v4 1/4] Provide counted_by attribute to flexible array member field (PR108896)
  2024-01-24  0:29 [PATCH v4 0/4]New attribute "counted_by" to annotate bounds for C99 FAM(PR108896) Qing Zhao
@ 2024-01-24  0:29 ` Qing Zhao
  2024-01-24  0:29 ` [PATCH v4 2/4] Convert references with "counted_by" attributes to/from .ACCESS_WITH_SIZE Qing Zhao
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 23+ messages in thread
From: Qing Zhao @ 2024-01-24  0:29 UTC (permalink / raw)
  To: joseph, richard.guenther, jakub, siddhesh, uecker
  Cc: keescook, isanbard, gcc-patches, Qing Zhao

'counted_by (COUNT)'
     The 'counted_by' attribute may be attached to the C99 flexible
     array member of a structure.  It indicates that the number of the
     elements of the array is given by the field named "COUNT" in the
     same structure as the flexible array member.  GCC uses this
     information to improve the results of the array bound sanitizer and
     the '__builtin_dynamic_object_size'.

     For instance, the following code:

          struct P {
            size_t count;
            char other;
            char array[] __attribute__ ((counted_by (count)));
          } *p;

     specifies that the 'array' is a flexible array member whose number
     of elements is given by the field 'count' in the same structure.

     The field that represents the number of the elements should have an
     integer type.  Otherwise, the compiler will report a warning and
     ignore the attribute.

     An explicit 'counted_by' annotation defines a relationship between
     two objects, 'p->array' and 'p->count', and there are the following
     requirementthat on the relationship between this pair:

        * 'p->count' should be initialized before the first reference to
          'p->array';

        * 'p->array' has _at least_ 'p->count' number of elements
          available all the time.  This relationship must hold even
          after any of these related objects are updated during the
          program.

     It's the user's responsibility to make sure the above requirements
     to be kept all the time.  Otherwise the compiler will report
     warnings, at the same time, the results of the array bound
     sanitizer and the '__builtin_dynamic_object_size' is undefined.

     One important feature of the attribute is, a reference to the
     flexible array member field will use the latest value assigned to
     the field that represents the number of the elements before that
     reference.  For example,

            p->count = val1;
            p->array[20] = 0;  // ref1 to p->array
            p->count = val2;
            p->array[30] = 0;  // ref2 to p->array

     in the above, 'ref1' will use 'val1' as the number of the elements
     in 'p->array', and 'ref2' will use 'val2' as the number of elements
     in 'p->array'.

gcc/c-family/ChangeLog:

	PR C/108896
	* c-attribs.cc (handle_counted_by_attribute): New function.
	(attribute_takes_identifier_p): Add counted_by attribute to the list.
	* c-common.cc (c_flexible_array_member_type_p): ...To this.
	* c-common.h (c_flexible_array_member_type_p): New prototype.

gcc/c/ChangeLog:

	PR C/108896
	* c-decl.cc (flexible_array_member_type_p): Renamed and moved to...
	(add_flexible_array_elts_to_size): Use renamed function.
	(is_flexible_array_member_p): Use renamed function.
	(verify_counted_by_attribute): New function.
	(finish_struct): Use renamed function and verify counted_by
	attribute.
	* c-tree.h (lookup_field): New prototype.
	* c-typeck.cc (lookup_field): Expose as extern function.

gcc/ChangeLog:

	PR C/108896
	* doc/extend.texi: Document attribute counted_by.

gcc/testsuite/ChangeLog:

	PR C/108896
	* gcc.dg/flex-array-counted-by.c: New test.
---
 gcc/c-family/c-attribs.cc                    | 54 ++++++++++++-
 gcc/c-family/c-common.cc                     | 13 +++
 gcc/c-family/c-common.h                      |  1 +
 gcc/c/c-decl.cc                              | 85 ++++++++++++++++----
 gcc/c/c-tree.h                               |  1 +
 gcc/c/c-typeck.cc                            |  3 +-
 gcc/doc/extend.texi                          | 62 ++++++++++++++
 gcc/testsuite/gcc.dg/flex-array-counted-by.c | 40 +++++++++
 8 files changed, 239 insertions(+), 20 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/flex-array-counted-by.c

diff --git a/gcc/c-family/c-attribs.cc b/gcc/c-family/c-attribs.cc
index 40a0cf90295d..4395c0656b14 100644
--- a/gcc/c-family/c-attribs.cc
+++ b/gcc/c-family/c-attribs.cc
@@ -105,6 +105,8 @@ static tree handle_warn_if_not_aligned_attribute (tree *, tree, tree,
 						  int, bool *);
 static tree handle_strict_flex_array_attribute (tree *, tree, tree,
 						 int, bool *);
+static tree handle_counted_by_attribute (tree *, tree, tree,
+					   int, bool *);
 static tree handle_weak_attribute (tree *, tree, tree, int, bool *) ;
 static tree handle_noplt_attribute (tree *, tree, tree, int, bool *) ;
 static tree handle_alias_ifunc_attribute (bool, tree *, tree, tree, bool *);
@@ -412,6 +414,8 @@ const struct attribute_spec c_common_gnu_attributes[] =
 			      handle_warn_if_not_aligned_attribute, NULL },
   { "strict_flex_array",      1, 1, true, false, false, false,
 			      handle_strict_flex_array_attribute, NULL },
+  { "counted_by",	      1, 1, true, false, false, false,
+			      handle_counted_by_attribute, NULL },
   { "weak",                   0, 0, true,  false, false, false,
 			      handle_weak_attribute, NULL },
   { "noplt",                   0, 0, true,  false, false, false,
@@ -659,7 +663,8 @@ attribute_takes_identifier_p (const_tree attr_id)
   else if (!strcmp ("mode", spec->name)
 	   || !strcmp ("format", spec->name)
 	   || !strcmp ("cleanup", spec->name)
-	   || !strcmp ("access", spec->name))
+	   || !strcmp ("access", spec->name)
+	   || !strcmp ("counted_by", spec->name))
     return true;
   else
     return targetm.attribute_takes_identifier_p (attr_id);
@@ -2806,6 +2811,53 @@ handle_strict_flex_array_attribute (tree *node, tree name,
   return NULL_TREE;
 }
 
+/* Handle a "counted_by" attribute; arguments as in
+   struct attribute_spec.handler.  */
+
+static tree
+handle_counted_by_attribute (tree *node, tree name,
+			     tree args, int ARG_UNUSED (flags),
+			     bool *no_add_attrs)
+{
+  tree decl = *node;
+  tree argval = TREE_VALUE (args);
+
+  /* This attribute only applies to field decls of a structure.  */
+  if (TREE_CODE (decl) != FIELD_DECL)
+    {
+      error_at (DECL_SOURCE_LOCATION (decl),
+		"%qE attribute may not be specified for non-field"
+		" declaration %q+D", name, decl);
+      *no_add_attrs = true;
+    }
+  /* This attribute only applies to field with array type.  */
+  else if (TREE_CODE (TREE_TYPE (decl)) != ARRAY_TYPE)
+    {
+      error_at (DECL_SOURCE_LOCATION (decl),
+		"%qE attribute may not be specified for a non-array field",
+		name);
+      *no_add_attrs = true;
+    }
+  /* This attribute only applies to a C99 flexible array member type.  */
+  else if (! c_flexible_array_member_type_p (TREE_TYPE (decl)))
+    {
+      error_at (DECL_SOURCE_LOCATION (decl),
+		"%qE attribute may not be specified for a non"
+		" flexible array member field",
+		name);
+      *no_add_attrs = true;
+    }
+  /* The argument should be an identifier.  */
+  else if (TREE_CODE (argval) != IDENTIFIER_NODE)
+    {
+      error_at (DECL_SOURCE_LOCATION (decl),
+		"%<counted_by%> argument not an identifier");
+      *no_add_attrs = true;
+    }
+
+  return NULL_TREE;
+}
+
 /* Handle a "weak" attribute; arguments as in
    struct attribute_spec.handler.  */
 
diff --git a/gcc/c-family/c-common.cc b/gcc/c-family/c-common.cc
index dc67562ce923..8e9f7370abfe 100644
--- a/gcc/c-family/c-common.cc
+++ b/gcc/c-family/c-common.cc
@@ -9903,6 +9903,19 @@ c_common_finalize_early_debug (void)
       (*debug_hooks->early_global_decl) (cnode->decl);
 }
 
+/* Determine whether TYPE is a ISO C99 flexible array memeber type "[]".  */
+bool
+c_flexible_array_member_type_p (const_tree type)
+{
+  if (TREE_CODE (type) == ARRAY_TYPE
+      && TYPE_SIZE (type) == NULL_TREE
+      && TYPE_DOMAIN (type) != NULL_TREE
+      && TYPE_MAX_VALUE (TYPE_DOMAIN (type)) == NULL_TREE)
+    return true;
+
+  return false;
+}
+
 /* Get the LEVEL of the strict_flex_array for the ARRAY_FIELD based on the
    values of attribute strict_flex_array and the flag_strict_flex_arrays.  */
 unsigned int
diff --git a/gcc/c-family/c-common.h b/gcc/c-family/c-common.h
index 2d5f53998855..3e0eed0548b0 100644
--- a/gcc/c-family/c-common.h
+++ b/gcc/c-family/c-common.h
@@ -904,6 +904,7 @@ extern tree fold_for_warn (tree);
 extern tree c_common_get_narrower (tree, int *);
 extern bool get_attribute_operand (tree, unsigned HOST_WIDE_INT *);
 extern void c_common_finalize_early_debug (void);
+extern bool c_flexible_array_member_type_p (const_tree);
 extern unsigned int c_strict_flex_array_level_of (tree);
 extern bool c_option_is_from_cpp_diagnostics (int);
 extern tree c_hardbool_type_attr_1 (tree, tree *, tree *);
diff --git a/gcc/c/c-decl.cc b/gcc/c/c-decl.cc
index 4029bbc59fe2..77130e839a27 100644
--- a/gcc/c/c-decl.cc
+++ b/gcc/c/c-decl.cc
@@ -5301,19 +5301,6 @@ set_array_declarator_inner (struct c_declarator *decl,
   return decl;
 }
 
-/* Determine whether TYPE is a ISO C99 flexible array memeber type "[]".  */
-static bool
-flexible_array_member_type_p (const_tree type)
-{
-  if (TREE_CODE (type) == ARRAY_TYPE
-      && TYPE_SIZE (type) == NULL_TREE
-      && TYPE_DOMAIN (type) != NULL_TREE
-      && TYPE_MAX_VALUE (TYPE_DOMAIN (type)) == NULL_TREE)
-    return true;
-
-  return false;
-}
-
 /* Determine whether TYPE is a one-element array type "[1]".  */
 static bool
 one_element_array_type_p (const_tree type)
@@ -5350,7 +5337,7 @@ add_flexible_array_elts_to_size (tree decl, tree init)
 
   elt = CONSTRUCTOR_ELTS (init)->last ().value;
   type = TREE_TYPE (elt);
-  if (flexible_array_member_type_p (type))
+  if (c_flexible_array_member_type_p (type))
     {
       complete_array_type (&type, elt, false);
       DECL_SIZE (decl)
@@ -9317,7 +9304,7 @@ is_flexible_array_member_p (bool is_last_field,
 
   bool is_zero_length_array = zero_length_array_type_p (TREE_TYPE (x));
   bool is_one_element_array = one_element_array_type_p (TREE_TYPE (x));
-  bool is_flexible_array = flexible_array_member_type_p (TREE_TYPE (x));
+  bool is_flexible_array = c_flexible_array_member_type_p (TREE_TYPE (x));
 
   unsigned int strict_flex_array_level = c_strict_flex_array_level_of (x);
 
@@ -9347,6 +9334,60 @@ is_flexible_array_member_p (bool is_last_field,
   return false;
 }
 
+/* Verify the argument of the counted_by attribute of the flexible array
+   member FIELD_DECL is a valid field of the containing structure,
+   STRUCT_TYPE, Report error and remove this attribute when it's not.  */
+static void
+verify_counted_by_attribute (tree struct_type, tree field_decl)
+{
+  tree attr_counted_by = lookup_attribute ("counted_by",
+					   DECL_ATTRIBUTES (field_decl));
+
+  if (!attr_counted_by)
+    return;
+
+  /* If there is an counted_by attribute attached to the field,
+     verify it.  */
+
+  tree fieldname = TREE_VALUE (TREE_VALUE (attr_counted_by));
+
+  /* Verify the argument of the attrbute is a valid field of the
+     containing structure.  */
+
+  tree counted_by_field = lookup_field (struct_type, fieldname);
+
+  /* Error when the field is not found in the containing structure.  */
+  if (!counted_by_field)
+    {
+      error_at (DECL_SOURCE_LOCATION (field_decl),
+		"%qE attribute argument not a field declaration"
+		" in the same structure, ignore it",
+		(get_attribute_name (attr_counted_by)));
+
+      DECL_ATTRIBUTES (field_decl)
+	= remove_attribute ("counted_by", DECL_ATTRIBUTES (field_decl));
+    }
+  else
+  /* Error when the field is not with an integer type.  */
+    {
+      while (TREE_CHAIN (counted_by_field))
+	counted_by_field = TREE_CHAIN (counted_by_field);
+      tree real_field = TREE_VALUE (counted_by_field);
+
+      if (TREE_CODE (TREE_TYPE (real_field)) != INTEGER_TYPE)
+	{
+	  error_at (DECL_SOURCE_LOCATION (field_decl),
+		     "%qE attribute argument not a field declaration"
+		     " with integer type, ignore it",
+		     (get_attribute_name (attr_counted_by)));
+
+	  DECL_ATTRIBUTES (field_decl)
+	    = remove_attribute ("counted_by", DECL_ATTRIBUTES (field_decl));
+	}
+    }
+
+  return;
+}
 
 /* Fill in the fields of a RECORD_TYPE or UNION_TYPE node, T.
    LOC is the location of the RECORD_TYPE or UNION_TYPE's definition.
@@ -9408,6 +9449,7 @@ finish_struct (location_t loc, tree t, tree fieldlist, tree attributes,
      until now.)  */
 
   bool saw_named_field = false;
+  tree counted_by_fam_field = NULL_TREE;
   for (x = fieldlist; x; x = DECL_CHAIN (x))
     {
       /* Whether this field is the last field of the structure or union.
@@ -9468,7 +9510,7 @@ finish_struct (location_t loc, tree t, tree fieldlist, tree attributes,
 	DECL_PACKED (x) = 1;
 
       /* Detect flexible array member in an invalid context.  */
-      if (flexible_array_member_type_p (TREE_TYPE (x)))
+      if (c_flexible_array_member_type_p (TREE_TYPE (x)))
 	{
 	  if (TREE_CODE (t) == UNION_TYPE)
 	    {
@@ -9489,6 +9531,12 @@ finish_struct (location_t loc, tree t, tree fieldlist, tree attributes,
 			"members");
 	      TREE_TYPE (x) = error_mark_node;
 	    }
+
+	  /* If there is a counted_by attribute attached to this field,
+	     record it here and do more verification later after the
+	     whole structure is complete.  */
+	  if (lookup_attribute ("counted_by", DECL_ATTRIBUTES (x)))
+	    counted_by_fam_field = x;
 	}
 
       if (pedantic && TREE_CODE (t) == RECORD_TYPE
@@ -9503,7 +9551,7 @@ finish_struct (location_t loc, tree t, tree fieldlist, tree attributes,
 	 when x is an array and is the last field.  */
       if (TREE_CODE (TREE_TYPE (x)) == ARRAY_TYPE)
 	TYPE_INCLUDES_FLEXARRAY (t)
-	  = is_last_field && flexible_array_member_type_p (TREE_TYPE (x));
+	  = is_last_field && c_flexible_array_member_type_p (TREE_TYPE (x));
       /* Recursively set TYPE_INCLUDES_FLEXARRAY for the context of x, t
 	 when x is an union or record and is the last field.  */
       else if (RECORD_OR_UNION_TYPE_P (TREE_TYPE (x)))
@@ -9758,6 +9806,9 @@ finish_struct (location_t loc, tree t, tree fieldlist, tree attributes,
 	struct_parse_info->struct_types.safe_push (t);
      }
 
+  if (counted_by_fam_field)
+    verify_counted_by_attribute (t, counted_by_fam_field);
+
   return t;
 }
 
diff --git a/gcc/c/c-tree.h b/gcc/c/c-tree.h
index cf29534c0915..27a5c857dfbd 100644
--- a/gcc/c/c-tree.h
+++ b/gcc/c/c-tree.h
@@ -775,6 +775,7 @@ extern struct c_expr convert_lvalue_to_rvalue (location_t, struct c_expr,
 extern tree decl_constant_value_1 (tree, bool);
 extern void mark_exp_read (tree);
 extern tree composite_type (tree, tree);
+extern tree lookup_field (tree, tree);
 extern tree build_component_ref (location_t, tree, tree, location_t,
 				 location_t);
 extern tree build_array_ref (location_t, tree, tree);
diff --git a/gcc/c/c-typeck.cc b/gcc/c/c-typeck.cc
index 66c6abc9f076..d0895d5d9f28 100644
--- a/gcc/c/c-typeck.cc
+++ b/gcc/c/c-typeck.cc
@@ -101,7 +101,6 @@ static bool function_types_compatible_p (const_tree, const_tree,
 					 struct comptypes_data *);
 static bool type_lists_compatible_p (const_tree, const_tree,
 				     struct comptypes_data *);
-static tree lookup_field (tree, tree);
 static int convert_arguments (location_t, vec<location_t>, tree,
 			      vec<tree, va_gc> *, vec<tree, va_gc> *, tree,
 			      tree);
@@ -2370,7 +2369,7 @@ default_conversion (tree exp)
    the component is embedded within (nested) anonymous structures or
    unions, the list steps down the chain to the component.  */
 
-static tree
+tree
 lookup_field (tree type, tree component)
 {
   tree field;
diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
index d1893ad860c9..64d3dfcefe65 100644
--- a/gcc/doc/extend.texi
+++ b/gcc/doc/extend.texi
@@ -7743,6 +7743,68 @@ align them on any target.
 The @code{aligned} attribute can also be used for functions
 (@pxref{Common Function Attributes}.)
 
+@cindex @code{counted_by} variable attribute
+@item counted_by (@var{count})
+The @code{counted_by} attribute may be attached to the C99 flexible array
+member of a structure.  It indicates that the number of the elements of the
+array is given by the field named "@var{count}" in the same structure as the
+flexible array member.  GCC uses this information to improve the results of
+the array bound sanitizer and the @code{__builtin_dynamic_object_size}.
+
+For instance, the following code:
+
+@smallexample
+struct P @{
+  size_t count;
+  char other;
+  char array[] __attribute__ ((counted_by (count)));
+@} *p;
+@end smallexample
+
+@noindent
+specifies that the @code{array} is a flexible array member whose number of
+elements is given by the field @code{count} in the same structure.
+
+The field that represents the number of the elements should have an
+integer type.  Otherwise, the compiler will report a warning and ignore
+the attribute.
+
+An explicit @code{counted_by} annotation defines a relationship between
+two objects, @code{p->array} and @code{p->count}, and there are the
+following requirementthat on the relationship between this pair:
+
+@itemize @bullet
+@item
+@code{p->count} should be initialized before the first reference to
+@code{p->array};
+
+@item
+@code{p->array} has @emph{at least} @code{p->count} number of elements
+available all the time.  This relationship must hold even after any of
+these related objects are updated during the program.
+@end itemize
+
+It's the user's responsibility to make sure the above requirements to
+be kept all the time.  Otherwise the compiler will report warnings,
+at the same time, the results of the array bound sanitizer and the
+@code{__builtin_dynamic_object_size} is undefined.
+
+One important feature of the attribute is, a reference to the flexible
+array member field will use the latest value assigned to the field that
+represents the number of the elements before that reference.  For example,
+
+@smallexample
+  p->count = val1;
+  p->array[20] = 0;  // ref1 to p->array
+  p->count = val2;
+  p->array[30] = 0;  // ref2 to p->array
+@end smallexample
+
+@noindent
+in the above, @code{ref1} will use @code{val1} as the number of the elements in
+@code{p->array}, and @code{ref2} will use @code{val2} as the number of elements
+in @code{p->array}.
+
 @cindex @code{alloc_size} variable attribute
 @item alloc_size (@var{position})
 @itemx alloc_size (@var{position-1}, @var{position-2})
diff --git a/gcc/testsuite/gcc.dg/flex-array-counted-by.c b/gcc/testsuite/gcc.dg/flex-array-counted-by.c
new file mode 100644
index 000000000000..f8ce9776bf86
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/flex-array-counted-by.c
@@ -0,0 +1,40 @@
+/* testing the correct usage of attribute counted_by.  */   
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+
+#include <wchar.h>
+
+int size;
+int x __attribute ((counted_by (size))); /* { dg-error "attribute may not be specified for non-field declaration" } */
+
+struct trailing {
+  int count;
+  int field __attribute ((counted_by (count))); /* { dg-error "attribute may not be specified for a non-array field" } */
+};
+
+struct trailing_1 {
+  int count;
+  int array_1[0] __attribute ((counted_by (count))); /* { dg-error "attribute may not be specified for a non flexible array member field" } */
+};
+
+int count;
+struct trailing_array_2 {
+  int count;
+  int array_2[] __attribute ((counted_by ("count"))); /* { dg-error "argument not an identifier" } */
+};
+
+struct trailing_array_3 {
+  int other;
+  int array_3[] __attribute ((counted_by (L"count"))); /* { dg-error "argument not an identifier" } */
+};
+
+struct trailing_array_4 {
+  int other;
+  int array_4[] __attribute ((counted_by (count))); /* { dg-error "attribute argument not a field declaration in the same structure, ignore it" } */
+};
+
+int count;
+struct trailing_array_5 {
+  float count;
+  int array_5[] __attribute ((counted_by (count))); /* { dg-error "attribute argument not a field declaration with integer type, ignore it" } */
+};
-- 
2.31.1


^ permalink raw reply	[flat|nested] 23+ messages in thread

* [PATCH v4 2/4] Convert references with "counted_by" attributes to/from .ACCESS_WITH_SIZE.
  2024-01-24  0:29 [PATCH v4 0/4]New attribute "counted_by" to annotate bounds for C99 FAM(PR108896) Qing Zhao
  2024-01-24  0:29 ` [PATCH v4 1/4] Provide counted_by attribute to flexible array member field (PR108896) Qing Zhao
@ 2024-01-24  0:29 ` Qing Zhao
  2024-01-24  0:29 ` [PATCH v4 3/4] Use the .ACCESS_WITH_SIZE in builtin object size Qing Zhao
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 23+ messages in thread
From: Qing Zhao @ 2024-01-24  0:29 UTC (permalink / raw)
  To: joseph, richard.guenther, jakub, siddhesh, uecker
  Cc: keescook, isanbard, gcc-patches, Qing Zhao

Including the following changes:
* The definition of the new internal function .ACCESS_WITH_SIZE
  in internal-fn.def.
* C FE converts every reference to a FAM with a "counted_by" attribute
  to a call to the internal function .ACCESS_WITH_SIZE.
  (build_component_ref in c_typeck.cc)

  This includes the case when the object is statically allocated and
  initialized.
  In order to make this working, the routines initializer_constant_valid_p_1
  and output_constant in varasm.cc are updated to handle calls to
  .ACCESS_WITH_SIZE.
  (initializer_constant_valid_p_1 and output_constant in varasm.c)

  However, for the reference inside "offsetof", the "counted_by" attribute is
  ignored since it's not useful at all.
  (c_parser_postfix_expression in c/c-parser.cc)
* Convert every call to .ACCESS_WITH_SIZE to its first argument.
  (expand_ACCESS_WITH_SIZE in internal-fn.cc)
* Adjust alias analysis to exclude the new internal from clobbering anything.
  (ref_maybe_used_by_call_p_1 and call_may_clobber_ref_p_1 in tree-ssa-alias.cc)
* Adjust dead code elimination to eliminate the call to .ACCESS_WITH_SIZE when
  it's LHS is eliminated as dead code.
  (eliminate_unnecessary_stmts in tree-ssa-dce.cc)
* Provide the utility routines to check the call is .ACCESS_WITH_SIZE and
  get the reference from the call to .ACCESS_WITH_SIZE.
  (is_access_with_size_p and get_ref_from_access_with_size in tree.cc)

gcc/c/ChangeLog:

	* c-parser.cc (c_parser_postfix_expression): Ignore the counted-by
	attribute when build_component_ref inside offsetof operator.
	* c-tree.h (build_component_ref): Add one more parameter.
	* c-typeck.cc (build_counted_by_ref): New function.
	(build_access_with_size_for_counted_by): New function.
	(build_component_ref): Check the counted-by attribute and build
	call to .ACCESS_WITH_SIZE.

gcc/ChangeLog:

	* internal-fn.cc (expand_ACCESS_WITH_SIZE): New function.
	* internal-fn.def (ACCESS_WITH_SIZE): New internal function.
	* tree-ssa-alias.cc (ref_maybe_used_by_call_p_1): Special case
	IFN_ACCESS_WITH_SIZE.
	(call_may_clobber_ref_p_1): Special case IFN_ACCESS_WITH_SIZE.
	* tree-ssa-dce.cc (eliminate_unnecessary_stmts): Eliminate the call
	to .ACCESS_WITH_SIZE when its LHS is dead.
	* tree.cc (process_call_operands): Adjust side effect for function
	.ACCESS_WITH_SIZE.
	(is_access_with_size_p): New function.
	(get_ref_from_access_with_size): New function.
	* tree.h (is_access_with_size_p): New prototype.
	(get_ref_from_access_with_size): New prototype.
	* varasm.cc (initializer_constant_valid_p_1): Handle call to
	.ACCESS_WITH_SIZE.
	(output_constant): Handle call to .ACCESS_WITH_SIZE.

gcc/testsuite/ChangeLog:

	* gcc.dg/flex-array-counted-by-2.c: New test.
---
 gcc/c/c-parser.cc                             |  10 +-
 gcc/c/c-tree.h                                |   2 +-
 gcc/c/c-typeck.cc                             | 108 +++++++++++++++++-
 gcc/internal-fn.cc                            |  35 ++++++
 gcc/internal-fn.def                           |   4 +
 .../gcc.dg/flex-array-counted-by-2.c          |  94 +++++++++++++++
 gcc/tree-ssa-alias.cc                         |   2 +
 gcc/tree-ssa-dce.cc                           |   5 +-
 gcc/tree.cc                                   |  25 +++-
 gcc/tree.h                                    |   8 ++
 gcc/varasm.cc                                 |  10 ++
 11 files changed, 294 insertions(+), 9 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/flex-array-counted-by-2.c

diff --git a/gcc/c/c-parser.cc b/gcc/c/c-parser.cc
index c31349dae2ff..a6ed5ac43bb1 100644
--- a/gcc/c/c-parser.cc
+++ b/gcc/c/c-parser.cc
@@ -10850,9 +10850,12 @@ c_parser_postfix_expression (c_parser *parser)
 	    if (c_parser_next_token_is (parser, CPP_NAME))
 	      {
 		c_token *comp_tok = c_parser_peek_token (parser);
+		/* Ignore the counted_by attribute for reference inside
+		   offsetof since the information is not useful at all.  */
 		offsetof_ref
 		  = build_component_ref (loc, offsetof_ref, comp_tok->value,
-					 comp_tok->location, UNKNOWN_LOCATION);
+					 comp_tok->location, UNKNOWN_LOCATION,
+					 false);
 		c_parser_consume_token (parser);
 		while (c_parser_next_token_is (parser, CPP_DOT)
 		       || c_parser_next_token_is (parser,
@@ -10879,11 +10882,14 @@ c_parser_postfix_expression (c_parser *parser)
 			    break;
 			  }
 			c_token *comp_tok = c_parser_peek_token (parser);
+			/* Ignore the counted_by attribute for reference inside
+			   offsetof since the information is not useful.  */
 			offsetof_ref
 			  = build_component_ref (loc, offsetof_ref,
 						 comp_tok->value,
 						 comp_tok->location,
-						 UNKNOWN_LOCATION);
+						 UNKNOWN_LOCATION,
+						 false);
 			c_parser_consume_token (parser);
 		      }
 		    else
diff --git a/gcc/c/c-tree.h b/gcc/c/c-tree.h
index 27a5c857dfbd..fdda413da498 100644
--- a/gcc/c/c-tree.h
+++ b/gcc/c/c-tree.h
@@ -777,7 +777,7 @@ extern void mark_exp_read (tree);
 extern tree composite_type (tree, tree);
 extern tree lookup_field (tree, tree);
 extern tree build_component_ref (location_t, tree, tree, location_t,
-				 location_t);
+				 location_t, bool = true);
 extern tree build_array_ref (location_t, tree, tree);
 extern tree build_omp_array_section (location_t, tree, tree, tree);
 extern tree build_external_ref (location_t, tree, bool, tree *);
diff --git a/gcc/c/c-typeck.cc b/gcc/c/c-typeck.cc
index d0895d5d9f28..4020bafc8e36 100644
--- a/gcc/c/c-typeck.cc
+++ b/gcc/c/c-typeck.cc
@@ -2541,15 +2541,107 @@ should_suggest_deref_p (tree datum_type)
     return false;
 }
 
+/* For a SUBDATUM field of a structure or union DATUM, generate a REF to
+   the object that represents its counted_by per the attribute counted_by
+   attached to this field if it's a flexible array member field, otherwise
+   return NULL_TREE.
+   set COUNTED_BY_TYPE to the TYPE of the counted_by field.
+   For example, if:
+
+    struct P {
+      int k;
+      int x[] __attribute__ ((counted_by (k)));
+    } *p;
+
+    for:
+    p->x
+
+    the ref to the object that represents its element count will be:
+
+    &(p->k)
+
+*/
+static tree
+build_counted_by_ref (tree datum, tree subdatum, tree *counted_by_type)
+{
+  tree type = TREE_TYPE (datum);
+  if (!(c_flexible_array_member_type_p (TREE_TYPE (subdatum))))
+    return NULL_TREE;
+
+  tree attr_counted_by = lookup_attribute ("counted_by",
+					   DECL_ATTRIBUTES (subdatum));
+  tree counted_by_ref = NULL_TREE;
+  *counted_by_type = NULL_TREE;
+  if (attr_counted_by)
+    {
+      tree field_id = TREE_VALUE (TREE_VALUE (attr_counted_by));
+      counted_by_ref
+	= build_component_ref (UNKNOWN_LOCATION, datum, field_id,
+			       UNKNOWN_LOCATION, UNKNOWN_LOCATION);
+      counted_by_ref = build_fold_addr_expr (counted_by_ref);
+
+      /* get the TYPE of the counted_by field.  */
+      tree counted_by_field = lookup_field (type, field_id);
+      gcc_assert (counted_by_field);
+
+      do
+	{
+	  *counted_by_type = TREE_TYPE (TREE_VALUE (counted_by_field));
+	  counted_by_field = TREE_CHAIN (counted_by_field);
+	}
+      while (counted_by_field);
+    }
+  return counted_by_ref;
+}
+
+/* Given a COMPONENT_REF REF with location LOC, the corresponding
+   COUNTED_BY_REF, and the COUNTED_BY_TYPE, generate a call to the
+   internal function .ACCESS_WITH_SIZE.
+
+   REF
+
+   to:
+
+   .ACCESS_WITH_SIZE (REF, COUNTED_BY_REF, 1, num_bytes, -1)
+
+   NOTE: Both the return type and the type of the first argument
+   of this function have been converted from the incomplete array
+   type to the corresponding pointer type.
+  */
+static tree
+build_access_with_size_for_counted_by (location_t loc, tree ref,
+				       tree counted_by_ref,
+				       tree counted_by_type)
+{
+  gcc_assert (c_flexible_array_member_type_p (TREE_TYPE (ref)));
+  tree result_type = build_pointer_type (TREE_TYPE (TREE_TYPE (ref)));
+  unsigned int counted_by_precision = TYPE_PRECISION (counted_by_type);
+
+  tree call
+    = build_call_expr_internal_loc (loc, IFN_ACCESS_WITH_SIZE,
+				  result_type, 5,
+				  array_to_pointer_conversion (loc, ref),
+				  counted_by_ref,
+				  build_int_cst (integer_type_node, 1),
+				  build_int_cst (integer_type_node,
+						 counted_by_precision),
+				  build_int_cst (integer_type_node, -1));
+  SET_EXPR_LOCATION (call, loc);
+  return call;
+}
+
 /* Make an expression to refer to the COMPONENT field of structure or
    union value DATUM.  COMPONENT is an IDENTIFIER_NODE.  LOC is the
    location of the COMPONENT_REF.  COMPONENT_LOC is the location
    of COMPONENT.  ARROW_LOC is the location of the first -> operand if
-   it is from -> operator.  */
+   it is from -> operator.
+   If HANDLE_COUNTED_BY is true, check the counted_by attribute and generate
+   call to .ACCESS_WITH_SIZE. otherwise, ignore the attribute.  */
 
 tree
 build_component_ref (location_t loc, tree datum, tree component,
-		     location_t component_loc, location_t arrow_loc)
+		     location_t component_loc, location_t arrow_loc,
+		     bool handle_counted_by)
 {
   tree type = TREE_TYPE (datum);
   enum tree_code code = TREE_CODE (type);
@@ -2621,7 +2713,11 @@ build_component_ref (location_t loc, tree datum, tree component,
 	  int quals;
 	  tree subtype;
 	  bool use_datum_quals;
-
+	  tree counted_by_type = NULL_TREE;
+	  tree counted_by_ref = handle_counted_by
+				? build_counted_by_ref (datum, subdatum,
+							&counted_by_type)
+				: NULL_TREE;
 	  if (TREE_TYPE (subdatum) == error_mark_node)
 	    return error_mark_node;
 
@@ -2640,6 +2736,12 @@ build_component_ref (location_t loc, tree datum, tree component,
 	  ref = build3 (COMPONENT_REF, subtype, datum, subdatum,
 			NULL_TREE);
 	  SET_EXPR_LOCATION (ref, loc);
+
+	  if (counted_by_ref)
+	    ref = build_access_with_size_for_counted_by (loc, ref,
+							 counted_by_ref,
+							 counted_by_type);
+
 	  if (TREE_READONLY (subdatum)
 	      || (use_datum_quals && TREE_READONLY (datum)))
 	    TREE_READONLY (ref) = 1;
diff --git a/gcc/internal-fn.cc b/gcc/internal-fn.cc
index a07f25f3aee3..e132117307e6 100644
--- a/gcc/internal-fn.cc
+++ b/gcc/internal-fn.cc
@@ -3393,6 +3393,41 @@ expand_DEFERRED_INIT (internal_fn, gcall *stmt)
     }
 }
 
+/* Expand the IFN_ACCESS_WITH_SIZE function:
+   ACCESS_WITH_SIZE (REF_TO_OBJ, REF_TO_SIZE, TYPE_OF_SIZE,
+		     PREC_OF_SIZE, ACCESS_MODE)
+   which returns the REF_TO_OBJ same as the 1st argument;
+
+   1st argument REF_TO_OBJ: The reference to the object;
+   2nd argument REF_TO_SIZE: The reference to the size of the object,
+   3rd argument TYPE_OF_SIZE: The size referenced by the REF_TO_SIZE represents
+     0: unknown;
+     1: the number of the elements of the object type;
+     2: the number of bytes;
+   4th argument PREC_OF_SIZE: The precision of the object refed by REF_TO_SIZE
+   5th argument ACCESS_MODE:
+    -1: Unknown access semantics
+     0: none
+     1: read_only
+     2: write_only
+     3: read_write
+
+   Both the return type and the type of the first argument of this
+   function have been converted from the incomplete array type to
+   the corresponding pointer type.
+
+   For each call to a .ACCESS_WITH_SIZE, replace it with its 1st argument.  */
+static void
+expand_ACCESS_WITH_SIZE (internal_fn, gcall *stmt)
+{
+  tree lhs = gimple_call_lhs (stmt);
+  tree ref_to_obj = gimple_call_arg (stmt, 0);
+  if (lhs)
+    expand_assignment (lhs, ref_to_obj, false);
+  else
+    emit_insn (expand_normal (ref_to_obj));
+}
+
 /* The size of an OpenACC compute dimension.  */
 
 static void
diff --git a/gcc/internal-fn.def b/gcc/internal-fn.def
index c14d30365c14..0801c8bfe61d 100644
--- a/gcc/internal-fn.def
+++ b/gcc/internal-fn.def
@@ -510,6 +510,10 @@ DEF_INTERNAL_FN (PHI, 0, NULL)
    automatic variable.  */
 DEF_INTERNAL_FN (DEFERRED_INIT, ECF_CONST | ECF_LEAF | ECF_NOTHROW, NULL)
 
+/* A function to associate the access size and access mode information
+   with the corresponding reference to an object.  */
+DEF_INTERNAL_FN (ACCESS_WITH_SIZE, ECF_LEAF | ECF_NOTHROW, NULL)
+
 /* DIM_SIZE and DIM_POS return the size of a particular compute
    dimension and the executing thread's position within that
    dimension.  DIM_POS is pure (and not const) so that it isn't
diff --git a/gcc/testsuite/gcc.dg/flex-array-counted-by-2.c b/gcc/testsuite/gcc.dg/flex-array-counted-by-2.c
new file mode 100644
index 000000000000..d96d025b63d6
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/flex-array-counted-by-2.c
@@ -0,0 +1,94 @@
+/* test the code generation for the new attribute counted_by.
+   and also the offsetof operator on such array.  */
+/* { dg-do run } */
+/* { dg-options "-O2 -fdump-tree-original" } */
+
+#include <stdlib.h>
+
+struct annotated {
+  int b;
+  char c[] __attribute__ ((counted_by (b)));
+} *array_annotated;
+
+static struct annotated static_annotated = { sizeof "hello", "hello" };
+static char *y = static_annotated.c;
+
+struct flex {
+  int b;
+  char c[]; 
+}; 
+
+struct nested_annotated {
+  struct {
+    union {
+      int b;
+      float f;	
+    };
+    int n;
+  };
+  char c[] __attribute__ ((counted_by (b)));
+} *array_nested_annotated;
+
+static struct nested_annotated nested_static_annotated
+				 = { sizeof "hello1", 0, "hello1" };
+static char *nested_y = nested_static_annotated.c;
+
+struct nested_flex {
+  struct {
+    union {
+      int b;
+      float f;	
+    };
+    int n;
+  };
+  char c[];
+};
+
+void __attribute__((__noinline__)) setup (int normal_count, int attr_count)
+{
+  array_annotated
+    = (struct annotated *)malloc (sizeof (struct annotated)
+				  + attr_count *  sizeof (char));
+  array_annotated->b = attr_count;
+
+  array_nested_annotated
+    = (struct nested_annotated *)malloc (sizeof (struct nested_annotated)
+					 + attr_count *  sizeof (char));
+  array_nested_annotated->b = attr_count;
+
+  return;
+}
+
+void __attribute__((__noinline__)) test (char a, char b)
+{
+  if (__builtin_offsetof (struct annotated, c[0])
+      != __builtin_offsetof (struct flex, c[0]))
+    abort ();
+  if (__builtin_offsetof (struct annotated, c[1])
+      != __builtin_offsetof (struct flex, c[1]))
+    abort ();
+  if (__builtin_offsetof (struct nested_annotated, c[0]) 
+      != __builtin_offsetof (struct nested_flex, c[0])) 
+    abort ();
+  if (__builtin_offsetof (struct nested_annotated, c[1]) 
+      != __builtin_offsetof (struct nested_flex, c[1])) 
+    abort ();
+   
+  array_annotated->c[2] = a;
+  array_nested_annotated->c[3] = b;
+
+  if (y[2] != 'l') abort ();
+  if (nested_y[4] !='o') abort ();
+
+}
+
+int main(int argc, char *argv[])
+{
+  setup (10,10);   
+  test ('A', 'B');
+  if (array_annotated->c[2] != 'A') abort ();
+  if (array_nested_annotated->c[3] != 'B') abort ();
+  return 0;
+}
+
+/* { dg-final { scan-tree-dump-times "ACCESS_WITH_SIZE" 4 "original" } } */
diff --git a/gcc/tree-ssa-alias.cc b/gcc/tree-ssa-alias.cc
index e7c1c1aa6243..8c070e173bdd 100644
--- a/gcc/tree-ssa-alias.cc
+++ b/gcc/tree-ssa-alias.cc
@@ -2823,6 +2823,7 @@ ref_maybe_used_by_call_p_1 (gcall *call, ao_ref *ref, bool tbaa_p)
 	return false;
       case IFN_MASK_STORE_LANES:
       case IFN_MASK_LEN_STORE_LANES:
+      case IFN_ACCESS_WITH_SIZE:
 	goto process_args;
       case IFN_MASK_LOAD:
       case IFN_LEN_LOAD:
@@ -3073,6 +3074,7 @@ call_may_clobber_ref_p_1 (gcall *call, ao_ref *ref, bool tbaa_p)
       case IFN_UBSAN_OBJECT_SIZE:
       case IFN_UBSAN_PTR:
       case IFN_ASAN_CHECK:
+      case IFN_ACCESS_WITH_SIZE:
 	return false;
       case IFN_MASK_STORE:
       case IFN_LEN_STORE:
diff --git a/gcc/tree-ssa-dce.cc b/gcc/tree-ssa-dce.cc
index 636c471d4c89..a54fb1b754dd 100644
--- a/gcc/tree-ssa-dce.cc
+++ b/gcc/tree-ssa-dce.cc
@@ -1459,8 +1459,8 @@ eliminate_unnecessary_stmts (bool aggressive)
 		  update_stmt (stmt);
 		  release_ssa_name (name);
 
-		  /* GOMP_SIMD_LANE (unless three argument) or ASAN_POISON
-		     without lhs is not needed.  */
+		  /* GOMP_SIMD_LANE (unless three argument), ASAN_POISON
+		     or .ACCESS_WITH_SIZE without lhs is not needed.  */
 		  if (gimple_call_internal_p (stmt))
 		    switch (gimple_call_internal_fn (stmt))
 		      {
@@ -1470,6 +1470,7 @@ eliminate_unnecessary_stmts (bool aggressive)
 			  break;
 			/* FALLTHRU */
 		      case IFN_ASAN_POISON:
+		      case IFN_ACCESS_WITH_SIZE:
 			remove_dead_stmt (&gsi, bb, to_remove_edges);
 			break;
 		      default:
diff --git a/gcc/tree.cc b/gcc/tree.cc
index 8aee3ef18d82..022010ae78a7 100644
--- a/gcc/tree.cc
+++ b/gcc/tree.cc
@@ -4068,7 +4068,8 @@ process_call_operands (tree t)
   int i = call_expr_flags (t);
 
   /* Calls have side-effects, except those to const or pure functions.  */
-  if ((i & ECF_LOOPING_CONST_OR_PURE) || !(i & (ECF_CONST | ECF_PURE)))
+  if ((i & ECF_LOOPING_CONST_OR_PURE) 
+      || (!(i & (ECF_CONST | ECF_PURE)) && !is_access_with_size_p (t)))
     side_effects = true;
   /* Propagate TREE_READONLY of arguments for const functions.  */
   if (i & ECF_CONST)
@@ -13357,6 +13358,28 @@ component_ref_size (tree ref, special_array_member *sam /* = NULL */)
 	  ? NULL_TREE : size_zero_node);
 }
 
+/* Return true if the given node CALL is a call to a .ACCESS_WITH_SIZE
+   function.  */
+bool
+is_access_with_size_p (tree call)
+{
+  if (TREE_CODE (call) != CALL_EXPR)
+    return false;
+  if (CALL_EXPR_IFN (call) == IFN_ACCESS_WITH_SIZE)
+    return true;
+  return false;
+}
+
+/* Get the corresponding reference from the call to a .ACCESS_WITH_SIZE.
+ * i.e the first argument of this call. return NULL_TREE otherwise.  */
+tree
+get_ref_from_access_with_size (tree call)
+{
+  if (is_access_with_size_p (call))
+    return  CALL_EXPR_ARG (call, 0);
+  return NULL_TREE;
+}
+
 /* Return the machine mode of T.  For vectors, returns the mode of the
    inner type.  The main use case is to feed the result to HONOR_NANS,
    avoiding the BLKmode that a direct TYPE_MODE (T) might return.  */
diff --git a/gcc/tree.h b/gcc/tree.h
index 972a067a1f7a..75c97145cb3c 100644
--- a/gcc/tree.h
+++ b/gcc/tree.h
@@ -5760,6 +5760,14 @@ extern special_array_member component_ref_sam_type (tree);
    cannot be determined.  */
 extern tree component_ref_size (tree, special_array_member * = NULL);
 
+/* Return true if the given node is a call to a .ACCESS_WITH_SIZE
+   function.  */
+extern bool is_access_with_size_p (tree);
+
+/* Get the corresponding reference from the call to a .ACCESS_WITH_SIZE.
+ * i.e the first argument of this call. return NULL_TREE otherwise.  */
+extern tree get_ref_from_access_with_size (tree);
+
 extern int tree_map_base_eq (const void *, const void *);
 extern unsigned int tree_map_base_hash (const void *);
 extern bool tree_map_base_marked_p (const void *);
diff --git a/gcc/varasm.cc b/gcc/varasm.cc
index d2c879b7da44..72fec9b38c73 100644
--- a/gcc/varasm.cc
+++ b/gcc/varasm.cc
@@ -5077,6 +5077,11 @@ initializer_constant_valid_p_1 (tree value, tree endtype, tree *cache)
 	}
       return ret;
 
+    case CALL_EXPR:
+      /* for a call to .ACCESS_WITH_SIZE, check the first argument.  */
+      if (tree ref = get_ref_from_access_with_size (value))
+	return initializer_constant_valid_p_1 (ref, endtype, cache);
+      /* FALLTHROUGH.  */
     default:
       break;
     }
@@ -5271,6 +5276,11 @@ output_constant (tree exp, unsigned HOST_WIDE_INT size, unsigned int align,
 	exp = TREE_OPERAND (exp, 0);
     }
 
+  /* for a call to .ACCESS_WITH_SIZE, check the first argument.  */
+  if (TREE_CODE (exp) == CALL_EXPR)
+    if (tree ref = get_ref_from_access_with_size (exp))
+      exp = ref;
+
   code = TREE_CODE (TREE_TYPE (exp));
   thissize = int_size_in_bytes (TREE_TYPE (exp));
 
-- 
2.31.1


^ permalink raw reply	[flat|nested] 23+ messages in thread

* [PATCH v4 3/4] Use the .ACCESS_WITH_SIZE in builtin object size.
  2024-01-24  0:29 [PATCH v4 0/4]New attribute "counted_by" to annotate bounds for C99 FAM(PR108896) Qing Zhao
  2024-01-24  0:29 ` [PATCH v4 1/4] Provide counted_by attribute to flexible array member field (PR108896) Qing Zhao
  2024-01-24  0:29 ` [PATCH v4 2/4] Convert references with "counted_by" attributes to/from .ACCESS_WITH_SIZE Qing Zhao
@ 2024-01-24  0:29 ` Qing Zhao
  2024-01-24  0:29 ` [PATCH v4 4/4] Use the .ACCESS_WITH_SIZE in bound sanitizer Qing Zhao
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 23+ messages in thread
From: Qing Zhao @ 2024-01-24  0:29 UTC (permalink / raw)
  To: joseph, richard.guenther, jakub, siddhesh, uecker
  Cc: keescook, isanbard, gcc-patches, Qing Zhao

gcc/ChangeLog:

	* tree-object-size.cc (access_with_size_object_size): New function.
	(call_object_size): Call the new function.

gcc/testsuite/ChangeLog:

	* gcc.dg/builtin-object-size-common.h: Add a new macro EXPECT.
	* gcc.dg/flex-array-counted-by-3.c: New test.
	* gcc.dg/flex-array-counted-by-4.c: New test.
---
 .../gcc.dg/builtin-object-size-common.h       |  11 ++
 .../gcc.dg/flex-array-counted-by-3.c          |  63 +++++++
 .../gcc.dg/flex-array-counted-by-4.c          | 178 ++++++++++++++++++
 gcc/tree-object-size.cc                       |  47 +++++
 4 files changed, 299 insertions(+)
 create mode 100644 gcc/testsuite/gcc.dg/flex-array-counted-by-3.c
 create mode 100644 gcc/testsuite/gcc.dg/flex-array-counted-by-4.c

diff --git a/gcc/testsuite/gcc.dg/builtin-object-size-common.h b/gcc/testsuite/gcc.dg/builtin-object-size-common.h
index 66ff7cdd953a..b677067c6e6b 100644
--- a/gcc/testsuite/gcc.dg/builtin-object-size-common.h
+++ b/gcc/testsuite/gcc.dg/builtin-object-size-common.h
@@ -30,3 +30,14 @@ unsigned nfails = 0;
       __builtin_abort ();						      \
     return 0;								      \
   } while (0)
+
+#define EXPECT(p, _v) do {						      \
+  size_t v = _v;							      \
+  if (p == v)								      \
+    __builtin_printf ("ok:  %s == %zd\n", #p, p);			      \
+  else									      \
+    {									      \
+      __builtin_printf ("WAT: %s == %zd (expected %zd)\n", #p, p, v);	      \
+      FAIL ();								      \
+    }									      \
+} while (0);
diff --git a/gcc/testsuite/gcc.dg/flex-array-counted-by-3.c b/gcc/testsuite/gcc.dg/flex-array-counted-by-3.c
new file mode 100644
index 000000000000..0066c32ca808
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/flex-array-counted-by-3.c
@@ -0,0 +1,63 @@
+/* test the attribute counted_by and its usage in
+ * __builtin_dynamic_object_size.  */ 
+/* { dg-do run } */
+/* { dg-options "-O2" } */
+
+#include "builtin-object-size-common.h"
+
+struct flex {
+  int b;
+  int c[];
+} *array_flex;
+
+struct annotated {
+  int b;
+  int c[] __attribute__ ((counted_by (b)));
+} *array_annotated;
+
+struct nested_annotated {
+  struct {
+    union {
+      int b;
+      float f;	
+    };
+    int n;
+  };
+  int c[] __attribute__ ((counted_by (b)));
+} *array_nested_annotated;
+
+void __attribute__((__noinline__)) setup (int normal_count, int attr_count)
+{
+  array_flex
+    = (struct flex *)malloc (sizeof (struct flex)
+			     + normal_count *  sizeof (int));
+  array_flex->b = normal_count;
+
+  array_annotated
+    = (struct annotated *)malloc (sizeof (struct annotated)
+				  + attr_count *  sizeof (int));
+  array_annotated->b = attr_count;
+
+  array_nested_annotated
+    = (struct nested_annotated *)malloc (sizeof (struct nested_annotated)
+					 + attr_count *  sizeof (int));
+  array_nested_annotated->b = attr_count;
+
+  return;
+}
+
+void __attribute__((__noinline__)) test ()
+{
+    EXPECT(__builtin_dynamic_object_size(array_flex->c, 1), -1);
+    EXPECT(__builtin_dynamic_object_size(array_annotated->c, 1),
+	   array_annotated->b * sizeof (int));
+    EXPECT(__builtin_dynamic_object_size(array_nested_annotated->c, 1),
+	   array_nested_annotated->b * sizeof (int));
+}
+
+int main(int argc, char *argv[])
+{
+  setup (10,10);   
+  test ();
+  DONE ();
+}
diff --git a/gcc/testsuite/gcc.dg/flex-array-counted-by-4.c b/gcc/testsuite/gcc.dg/flex-array-counted-by-4.c
new file mode 100644
index 000000000000..3ce7f3545549
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/flex-array-counted-by-4.c
@@ -0,0 +1,178 @@
+/* test the attribute counted_by and its usage in
+__builtin_dynamic_object_size: what's the correct behavior when the
+allocation size mismatched with the value of counted_by attribute?
+we should always use the latest value that is hold by the counted_by
+field.  */
+/* { dg-do run } */
+/* { dg-options "-O -fstrict-flex-arrays=3" } */
+
+#include "builtin-object-size-common.h"
+
+struct annotated {
+  size_t foo;
+  char others;
+  char array[] __attribute__((counted_by (foo)));
+};
+
+#define noinline __attribute__((__noinline__))
+#define SIZE_BUMP 10 
+#define MAX(a, b) ((a) > (b) ? (a) : (b))
+
+/* In general, Due to type casting, the type for the pointee of a pointer
+   does not say anything about the object it points to,
+   So, __builtin_object_size can not directly use the type of the pointee
+   to decide the size of the object the pointer points to.
+
+   there are only two reliable ways:
+   A. observed allocations  (call to the allocation functions in the routine)
+   B. observed accesses     (read or write access to the location of the
+                             pointer points to)
+
+   that provide information about the type/existence of an object at
+   the corresponding address.
+
+   for A, we use the "alloc_size" attribute for the corresponding allocation
+   functions to determine the object size;
+   (We treat counted_by attribute the same as the "alloc_size" attribute)
+
+   For B, we use the SIZE info of the TYPE attached to the corresponding access.
+
+   The only other way in C which ensures that a pointer actually points
+   to an object of the correct type is 'static':
+
+   void foo(struct P *p[static 1]);
+
+   See https://gcc.gnu.org/pipermail/gcc-patches/2023-July/624814.html
+   for more details.  */
+
+/* in the following function, malloc allocated more space than the value
+   of counted_by attribute.  Then what's the correct behavior we expect 
+   the __builtin_dynamic_object_size should have for each of the cases?  */
+
+static struct annotated * noinline alloc_buf_more (size_t index)
+{
+  struct annotated *p;
+  size_t allocated_size
+    = MAX (sizeof (struct annotated),
+	   (__builtin_offsetof (struct annotated, array[0])
+	    + (index + SIZE_BUMP) * sizeof (char)));
+  p = (struct annotated *) malloc (allocated_size);
+
+  p->foo = index;
+
+  /*when checking the observed access p->array, we have info on both
+    observered allocation and observed access,
+    A.1 from observed allocation: 
+    	allocated_size - offsetof (struct annotated, array[0])
+
+    A.2 from the counted-by attribute:
+    	p->foo * sizeof (char)
+
+    we always use the latest value that is hold by the counted-by field.
+   */
+
+  EXPECT(__builtin_dynamic_object_size(p->array, 0),
+	 (p->foo) * sizeof(char));
+
+  EXPECT(__builtin_dynamic_object_size(p->array, 1),
+	 (p->foo) * sizeof(char));
+
+  EXPECT(__builtin_dynamic_object_size(p->array, 2),
+	 (p->foo) * sizeof(char));
+
+  EXPECT(__builtin_dynamic_object_size(p->array, 3),
+	 (p->foo) * sizeof(char));
+
+  /*when checking the pointer p, we only have info on the observed allocation.
+    So, the object size info can only been obtained from the call to malloc.
+    for both MAXIMUM and MINIMUM: A = (index + SIZE_BUMP) * sizeof (char)  */
+  EXPECT(__builtin_dynamic_object_size(p, 0), allocated_size);
+  EXPECT(__builtin_dynamic_object_size(p, 1), allocated_size);
+  EXPECT(__builtin_dynamic_object_size(p, 2), allocated_size);
+  EXPECT(__builtin_dynamic_object_size(p, 3), allocated_size);
+  return p;
+}
+
+/* in the following function, malloc allocated less space than the value
+   of counted_by attribute.  Then what's the correct behavior we expect 
+   the __builtin_dynamic_object_size should have for each of the cases?
+   NOTE: this is an user error, GCC should issue warnings for such case.
+   this is a seperate issue we should address later.  */
+
+static struct annotated * noinline alloc_buf_less (size_t index)
+{
+  struct annotated *p;
+  size_t allocated_size
+    = MAX (sizeof (struct annotated),
+	   (__builtin_offsetof (struct annotated, array[0])
+	    + (index) * sizeof (char)));
+  p = (struct annotated *) malloc (allocated_size);
+
+  p->foo = index + SIZE_BUMP;
+
+  /*when checking the observed access p->array, we have info on both
+    observered allocation and observed access,
+    A.1 from observed allocation:
+    	allocated_size - offsetof (struct annotated, array[0])
+    A.2 from the counted-by attribute:
+    	p->foo * sizeof (char)
+
+    we always use the latest value that is hold by the counted-by field.
+   */
+
+  EXPECT(__builtin_dynamic_object_size(p->array, 0),
+	 (p->foo) * sizeof(char));
+
+  EXPECT(__builtin_dynamic_object_size(p->array, 1),
+	 (p->foo) * sizeof(char));
+
+  EXPECT(__builtin_dynamic_object_size(p->array, 2),
+	 (p->foo) * sizeof(char));
+
+  EXPECT(__builtin_dynamic_object_size(p->array, 3),
+	 (p->foo) * sizeof(char));
+
+  /*when checking the pointer p, we only have info on the observed
+    allocation. So, the object size info can only been obtained from
+    the call to malloc.  */
+  EXPECT(__builtin_dynamic_object_size(p, 0), allocated_size);
+  EXPECT(__builtin_dynamic_object_size(p, 1), allocated_size);
+  EXPECT(__builtin_dynamic_object_size(p, 2), allocated_size);
+  EXPECT(__builtin_dynamic_object_size(p, 3), allocated_size);
+  return p;
+}
+
+int main ()
+{
+  struct annotated *p, *q;
+  p = alloc_buf_more (10);
+  q = alloc_buf_less (10);
+
+  /*when checking the access p->array, we only have info on the counted-by
+    value.  */ 
+  EXPECT(__builtin_dynamic_object_size(p->array, 0), p->foo * sizeof(char));
+  EXPECT(__builtin_dynamic_object_size(p->array, 1), p->foo * sizeof(char));
+  EXPECT(__builtin_dynamic_object_size(p->array, 2), p->foo * sizeof(char));
+  EXPECT(__builtin_dynamic_object_size(p->array, 3), p->foo * sizeof(char));
+  /*when checking the pointer p, we have no observed allocation nor observed
+    access, therefore, we cannot determine the size info here.  */
+  EXPECT(__builtin_dynamic_object_size(p, 0), -1);
+  EXPECT(__builtin_dynamic_object_size(p, 1), -1);
+  EXPECT(__builtin_dynamic_object_size(p, 2), 0);
+  EXPECT(__builtin_dynamic_object_size(p, 3), 0);
+
+  /*when checking the access p->array, we only have info on the counted-by
+    value.  */ 
+  EXPECT(__builtin_dynamic_object_size(q->array, 0), q->foo * sizeof(char));
+  EXPECT(__builtin_dynamic_object_size(q->array, 1), q->foo * sizeof(char));
+  EXPECT(__builtin_dynamic_object_size(q->array, 2), q->foo * sizeof(char));
+  EXPECT(__builtin_dynamic_object_size(q->array, 3), q->foo * sizeof(char));
+  /*when checking the pointer p, we have no observed allocation nor observed
+    access, therefore, we cannot determine the size info here.  */
+  EXPECT(__builtin_dynamic_object_size(q, 0), -1);
+  EXPECT(__builtin_dynamic_object_size(q, 1), -1);
+  EXPECT(__builtin_dynamic_object_size(q, 2), 0);
+  EXPECT(__builtin_dynamic_object_size(q, 3), 0);
+
+  DONE ();
+}
diff --git a/gcc/tree-object-size.cc b/gcc/tree-object-size.cc
index 018fbc30cbb6..74d778f71eed 100644
--- a/gcc/tree-object-size.cc
+++ b/gcc/tree-object-size.cc
@@ -60,6 +60,7 @@ static tree compute_object_offset (tree, const_tree);
 static bool addr_object_size (struct object_size_info *,
 			      const_tree, int, tree *, tree *t = NULL);
 static tree alloc_object_size (const gcall *, int);
+static tree access_with_size_object_size (const gcall *, int);
 static tree pass_through_call (const gcall *);
 static void collect_object_sizes_for (struct object_size_info *, tree);
 static void expr_object_size (struct object_size_info *, tree, tree);
@@ -749,6 +750,48 @@ addr_object_size (struct object_size_info *osi, const_tree ptr,
   return false;
 }
 
+/* Compute __builtin_object_size for a CALL to .ACCESS_WITH_SIZE,
+   OBJECT_SIZE_TYPE is the second argument from __builtin_object_size.
+   the 2nd, 3rd, and the 4th parameters of the call determine the size of
+   the CALL:
+
+   2nd argument REF_TO_SIZE: The reference to the size of the object,
+   3rd argument TYPE_OF_SIZE: The size referenced by the REF_TO_SIZE represents
+     0: unknown;
+     1: the number of the elements of the object type;
+     2: the number of bytes;
+   4th argument PREC_OF_SIZE: The precision of the object refed by REF_TO_SIZE
+
+   the size of the element can be retrived from the result type of the call,
+   which is the pointer to the type of element.  */
+static tree
+access_with_size_object_size (const gcall *call, int object_size_type)
+{
+  gcc_assert (gimple_call_internal_p (call, IFN_ACCESS_WITH_SIZE));
+  tree result_type = gimple_call_return_type (call);
+  gcc_assert (POINTER_TYPE_P (result_type));
+  tree element_size = TYPE_SIZE_UNIT (TREE_TYPE (result_type));
+  tree ref_to_size = gimple_call_arg (call, 1);
+  unsigned int type_of_size = TREE_INT_CST_LOW (gimple_call_arg (call, 2));
+  unsigned int prec_of_size = TREE_INT_CST_LOW (gimple_call_arg (call, 3));
+  tree type = build_nonstandard_integer_type (prec_of_size, 1);
+  tree size = fold_build2 (MEM_REF, type, ref_to_size,
+			   build_int_cst (ptr_type_node, 0));
+
+  if (type_of_size == 0)
+    return size_unknown (object_size_type);
+  else if (type_of_size == 1)
+    size = size_binop (MULT_EXPR,
+		       fold_convert (sizetype, size),
+		       fold_convert (sizetype, element_size));
+  else
+    size = fold_convert (sizetype, size);
+
+  if (!todo)
+    todo = TODO_update_ssa_only_virtuals;
+
+  return size;
+}
 
 /* Compute __builtin_object_size for CALL, which is a GIMPLE_CALL.
    Handles calls to functions declared with attribute alloc_size.
@@ -1350,8 +1393,12 @@ call_object_size (struct object_size_info *osi, tree ptr, gcall *call)
 
   bool is_strdup = gimple_call_builtin_p (call, BUILT_IN_STRDUP);
   bool is_strndup = gimple_call_builtin_p (call, BUILT_IN_STRNDUP);
+  bool is_access_with_size
+	 = gimple_call_internal_p (call, IFN_ACCESS_WITH_SIZE);
   if (is_strdup || is_strndup)
     bytes = strdup_object_size (call, object_size_type, is_strndup);
+  else if (is_access_with_size)
+    bytes = access_with_size_object_size (call, object_size_type);
   else
     bytes = alloc_object_size (call, object_size_type);
 
-- 
2.31.1


^ permalink raw reply	[flat|nested] 23+ messages in thread

* [PATCH v4 4/4] Use the .ACCESS_WITH_SIZE in bound sanitizer.
  2024-01-24  0:29 [PATCH v4 0/4]New attribute "counted_by" to annotate bounds for C99 FAM(PR108896) Qing Zhao
                   ` (2 preceding siblings ...)
  2024-01-24  0:29 ` [PATCH v4 3/4] Use the .ACCESS_WITH_SIZE in builtin object size Qing Zhao
@ 2024-01-24  0:29 ` Qing Zhao
  2024-01-25  0:51 ` [PATCH v4 0/4]New attribute "counted_by" to annotate bounds for C99 FAM(PR108896) Kees Cook
  2024-01-30 16:04 ` Qing Zhao
  5 siblings, 0 replies; 23+ messages in thread
From: Qing Zhao @ 2024-01-24  0:29 UTC (permalink / raw)
  To: joseph, richard.guenther, jakub, siddhesh, uecker
  Cc: keescook, isanbard, gcc-patches, Qing Zhao

Since the result type of the call to .ACCESS_WITH_SIZE is a pointer to
the element type. The original array_ref is converted to an indirect_ref,
which introduced two issues for the instrumenation of bound sanitizer:

A. The index for the original array_ref was mixed into the offset
expression for the new indirect_ref.

In order to make the instrumentation for the bound sanitizer easier, one
more argument for the function .ACCESS_WITH_SIZE is added to record this
original index for the array_ref. then later during bound instrumentation,
get the index from the additional argument from the call to the function
.ACCESS_WITH_SIZE.

B. In the current bound sanitizer, no instrumentation will be inserted for
an indirect_ref.

In order to add instrumentation for an indirect_ref with a call to
.ACCESS_WITH_SIZE, we should specially handle the indirect_ref with a
call to .ACCESS_WITH_SIZE, and get the index and bound info from the
arguments of the call.

gcc/c-family/ChangeLog:

	* c-gimplify.cc (ubsan_walk_array_refs_r): Instrument indirect_ref.
	* c-ubsan.cc (get_bound_from_access_with_size): New function.
	(ubsan_instrument_bounds_indirect_ref): New function.
	(ubsan_indirect_ref_instrumented_p): New function.
	(ubsan_maybe_instrument_indirect_ref): New function.
	* c-ubsan.h (ubsan_maybe_instrument_indirect_ref): New prototype.

gcc/c/ChangeLog:

	* c-typeck.cc (build_counted_by_ref): Minor style fix.
	(build_access_with_size_for_counted_by): Add one more argument.
	(build_array_ref): Set the 5th argument of a call to .ACCESS_WITH_SIZE
	to the index.

gcc/testsuite/ChangeLog:

	* gcc.dg/ubsan/flex-array-counted-by-bounds-2.c: New test.
	* gcc.dg/ubsan/flex-array-counted-by-bounds.c: New test.
---
 gcc/c-family/c-gimplify.cc                    |   2 +
 gcc/c-family/c-ubsan.cc                       | 130 ++++++++++++++++++
 gcc/c-family/c-ubsan.h                        |   1 +
 gcc/c/c-typeck.cc                             |  14 +-
 .../ubsan/flex-array-counted-by-bounds-2.c    |  45 ++++++
 .../ubsan/flex-array-counted-by-bounds.c      |  46 +++++++
 6 files changed, 235 insertions(+), 3 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/ubsan/flex-array-counted-by-bounds-2.c
 create mode 100644 gcc/testsuite/gcc.dg/ubsan/flex-array-counted-by-bounds.c

diff --git a/gcc/c-family/c-gimplify.cc b/gcc/c-family/c-gimplify.cc
index 494da49791d5..25a3ca1a9a99 100644
--- a/gcc/c-family/c-gimplify.cc
+++ b/gcc/c-family/c-gimplify.cc
@@ -108,6 +108,8 @@ ubsan_walk_array_refs_r (tree *tp, int *walk_subtrees, void *data)
     }
   else if (TREE_CODE (*tp) == ARRAY_REF)
     ubsan_maybe_instrument_array_ref (tp, false);
+  else if (TREE_CODE (*tp) == INDIRECT_REF)
+    ubsan_maybe_instrument_indirect_ref (tp);
   else if (TREE_CODE (*tp) == MODIFY_EXPR)
     {
       /* Since r7-1900, we gimplify RHS before LHS.  Consider
diff --git a/gcc/c-family/c-ubsan.cc b/gcc/c-family/c-ubsan.cc
index 940982819ddf..7bb6464eb5b5 100644
--- a/gcc/c-family/c-ubsan.cc
+++ b/gcc/c-family/c-ubsan.cc
@@ -376,6 +376,7 @@ ubsan_instrument_return (location_t loc)
   return build_call_expr_loc (loc, t, 1, build_fold_addr_expr_loc (loc, data));
 }
 
+
 /* Instrument array bounds for ARRAY_REFs.  We create special builtin,
    that gets expanded in the sanopt pass, and make an array dimension
    of it.  ARRAY is the array, *INDEX is an index to the array.
@@ -501,6 +502,68 @@ ubsan_instrument_bounds (location_t loc, tree array, tree *index,
 				       *index, bound);
 }
 
+/* Get the tree that represented the number of counted_by, i.e, the maximum
+   number of the elements of the object that the call to .ACCESS_WITH_SIZE
+   points to, this number will be the bound of the corresponding array.  */
+static tree
+get_bound_from_access_with_size (tree call)
+{
+  if (!is_access_with_size_p (call))
+    return NULL_TREE;
+
+  tree ref_to_size = CALL_EXPR_ARG (call, 1);
+  unsigned int type_of_size = TREE_INT_CST_LOW (CALL_EXPR_ARG (call, 2));
+  unsigned int prec_of_size = TREE_INT_CST_LOW (CALL_EXPR_ARG (call, 3));
+  tree type = build_nonstandard_integer_type (prec_of_size, 1);
+  tree size = fold_build2 (MEM_REF, type, unshare_expr (ref_to_size),
+			   build_int_cst (ptr_type_node, 0));
+  /* Only when type_of_size is 1,i.e, the number of the elements of
+     the object type, return the size.  */
+  if (type_of_size != 1)
+    return NULL_TREE;
+  else
+    size = fold_convert (sizetype, size);
+
+  return size;
+}
+
+/* Instrument array bounds for INDIRECT_REFs whose pointers are
+   POINTER_PLUS_EXPRs of calls to .ACCESS_WITH_SIZE.  We create special
+   builtins that gets expanded in the sanopt pass, and make an array
+   dimension of it.  ARRAY is the pointer to the base of the array,
+   which is a call to .ACCESS_WITH_SIZE.
+   We get the INDEX from the 6th argument of the call to .ACCESS_WITH_SIZE
+   Return NULL_TREE if no instrumentation is emitted.  */
+
+tree
+ubsan_instrument_bounds_indirect_ref (location_t loc, tree array)
+{
+  if (!is_access_with_size_p (array))
+    return NULL_TREE;
+  tree bound = get_bound_from_access_with_size (array);
+  tree index = CALL_EXPR_ARG (array, 5);
+  /* When the index is a constant -1, it's an invalid index.  */
+  if ((TREE_CODE (index) == INTEGER_CST)
+       && TREE_INT_CST_LOW (index) == (long unsigned int) -1)
+    return NULL_TREE;
+  gcc_assert (bound);
+
+  /* Create a "(T *) 0" tree node to describe the original array type.
+     We get the original array type from the first argument of the call to
+     .ACCESS_WITH_SIZE (REF, COUNTED_BY_REF, 1, num_bytes, -1).
+
+     Originally, REF is a COMPONENT_REF with the original array type,
+     it was converted to a pointer to an ADDR_EXPR, and the ADDR_EXPR's
+     first operand is the original COMPONENT_REF.  */
+  tree ref = CALL_EXPR_ARG (array, 0);
+  tree array_type
+    = unshare_expr (TREE_TYPE (TREE_OPERAND (TREE_OPERAND (ref, 0), 0)));
+  tree zero_with_type = build_int_cst (build_pointer_type (array_type), 0);
+  return build_call_expr_internal_loc (loc, IFN_UBSAN_BOUNDS,
+				       void_type_node, 3, zero_with_type,
+				       unshare_expr (index), bound);
+}
+
 /* Return true iff T is an array that was instrumented by SANITIZE_BOUNDS.  */
 
 bool
@@ -536,6 +599,73 @@ ubsan_maybe_instrument_array_ref (tree *expr_p, bool ignore_off_by_one)
     }
 }
 
+/* Return true iff T is an INDIRECT_REF that was instrumented
+   by SANITIZE_BOUNDS.  */
+
+bool
+ubsan_indirect_ref_instrumented_p (const_tree t)
+{
+  if (TREE_CODE (t) != INDIRECT_REF)
+    return false;
+
+  tree pointer = TREE_OPERAND (t, 0);
+  if (TREE_CODE (pointer) != POINTER_PLUS_EXPR)
+    return false;
+  tree offset = NULL_TREE;
+  if (is_access_with_size_p (TREE_OPERAND (pointer, 0)))
+    offset = TREE_OPERAND (pointer, 1);
+  else if (is_access_with_size_p (TREE_OPERAND (pointer, 1)))
+    offset = TREE_OPERAND (pointer, 0);
+  else
+    return false;
+  return TREE_CODE (offset) == COMPOUND_EXPR
+	 && TREE_CODE (TREE_OPERAND (offset, 0)) == CALL_EXPR
+	 && CALL_EXPR_FN (TREE_OPERAND (offset, 0)) == NULL_TREE
+	 && CALL_EXPR_IFN (TREE_OPERAND (offset, 0)) == IFN_UBSAN_BOUNDS;
+}
+
+/* Instrument an INDIRECT_REF, if it hasn't already been instrumented.
+   Right now, we only instrument an INDIRECT_REF when its pointer is a
+   POINTER_PLUS_EXPR, with one operand is a call to .ACCESS_WITH_SIZE,
+   and the other operand is an offset to the array.  We will compute the
+   array index based on the offset and the size of each element, and use
+   the computed index for the instrumentation.  */
+
+void
+ubsan_maybe_instrument_indirect_ref (tree *expr_p)
+{
+  if (!ubsan_indirect_ref_instrumented_p (*expr_p)
+      && sanitize_flags_p (SANITIZE_BOUNDS | SANITIZE_BOUNDS_STRICT)
+      && current_function_decl != NULL_TREE)
+    {
+      tree pointer = TREE_OPERAND (*expr_p, 0);
+      if (TREE_CODE (pointer) != POINTER_PLUS_EXPR)
+	return;
+      tree array = NULL_TREE;
+      tree offset = NULL_TREE;
+      int nth_op = 0;
+      if (is_access_with_size_p (TREE_OPERAND (pointer, 0)))
+	{
+	  array = TREE_OPERAND (pointer, 0);
+	  offset = TREE_OPERAND (pointer, 1);
+	  nth_op = 1;
+	}
+      else if (is_access_with_size_p (TREE_OPERAND (pointer, 1)))
+	{
+	  array = TREE_OPERAND (pointer, 1);
+	  offset = TREE_OPERAND (pointer, 0);
+	}
+      else
+	return;
+
+      tree e = ubsan_instrument_bounds_indirect_ref (EXPR_LOCATION (*expr_p),
+						     array);
+      if (e != NULL_TREE)
+	TREE_OPERAND (pointer, nth_op)
+	  = build2 (COMPOUND_EXPR, TREE_TYPE (offset), e, offset);
+    }
+}
+
 static tree
 ubsan_maybe_instrument_reference_or_call (location_t loc, tree op, tree ptype,
 					  enum ubsan_null_ckind ckind)
diff --git a/gcc/c-family/c-ubsan.h b/gcc/c-family/c-ubsan.h
index 9df03445a2ba..ed495266e82d 100644
--- a/gcc/c-family/c-ubsan.h
+++ b/gcc/c-family/c-ubsan.h
@@ -28,6 +28,7 @@ extern tree ubsan_instrument_return (location_t);
 extern tree ubsan_instrument_bounds (location_t, tree, tree *, bool);
 extern bool ubsan_array_ref_instrumented_p (const_tree);
 extern void ubsan_maybe_instrument_array_ref (tree *, bool);
+extern void ubsan_maybe_instrument_indirect_ref (tree *);
 extern void ubsan_maybe_instrument_reference (tree *);
 extern void ubsan_maybe_instrument_member_call (tree, bool);
 
diff --git a/gcc/c/c-typeck.cc b/gcc/c/c-typeck.cc
index 4020bafc8e36..4fcb99fa0a5d 100644
--- a/gcc/c/c-typeck.cc
+++ b/gcc/c/c-typeck.cc
@@ -2576,7 +2576,8 @@ build_counted_by_ref (tree datum, tree subdatum, tree *counted_by_type)
     {
       tree field_id = TREE_VALUE (TREE_VALUE (attr_counted_by));
       counted_by_ref
-	= build_component_ref (UNKNOWN_LOCATION, datum, field_id,
+	= build_component_ref (UNKNOWN_LOCATION,
+			       datum, field_id,
 			       UNKNOWN_LOCATION, UNKNOWN_LOCATION);
       counted_by_ref = build_fold_addr_expr (counted_by_ref);
 
@@ -2602,11 +2603,15 @@ build_counted_by_ref (tree datum, tree subdatum, tree *counted_by_type)
 
    to:
 
-   .ACCESS_WITH_SIZE (REF, COUNTED_BY_REF, 1, num_bytes, -1)
+   .ACCESS_WITH_SIZE (REF, COUNTED_BY_REF, 1, num_bytes, -1, INDEX)
 
    NOTE: Both the return type and the type of the first argument
    of this function have been converted from the incomplete array
    type to the corresponding pointer type.
+
+   INDEX is -1 when we build the call to .ACCESS_WITH_SIZE. and
+   will be set to the corresponding tree node when we parse the
+   index at build_array_ref.
   */
 static tree
 build_access_with_size_for_counted_by (location_t loc, tree ref,
@@ -2619,12 +2624,13 @@ build_access_with_size_for_counted_by (location_t loc, tree ref,
 
   tree call
     = build_call_expr_internal_loc (loc, IFN_ACCESS_WITH_SIZE,
-				  result_type, 5,
+				  result_type, 6,
 				  array_to_pointer_conversion (loc, ref),
 				  counted_by_ref,
 				  build_int_cst (integer_type_node, 1),
 				  build_int_cst (integer_type_node,
 						 counted_by_precision),
+				  build_int_cst (integer_type_node, -1),
 				  build_int_cst (integer_type_node, -1));
   SET_EXPR_LOCATION (call, loc);
   return call;
@@ -3006,6 +3012,8 @@ build_array_ref (location_t loc, tree array, tree index)
       gcc_assert (TREE_CODE (TREE_TYPE (ar)) == POINTER_TYPE);
       gcc_assert (TREE_CODE (TREE_TYPE (TREE_TYPE (ar))) != FUNCTION_TYPE);
 
+      if (is_access_with_size_p (ar))
+	CALL_EXPR_ARG (ar, 5) = index;
       ret = build_indirect_ref (loc, build_binary_op (loc, PLUS_EXPR, ar,
 						      index, false),
 				RO_ARRAY_INDEXING);
diff --git a/gcc/testsuite/gcc.dg/ubsan/flex-array-counted-by-bounds-2.c b/gcc/testsuite/gcc.dg/ubsan/flex-array-counted-by-bounds-2.c
new file mode 100644
index 000000000000..148934975ee5
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/ubsan/flex-array-counted-by-bounds-2.c
@@ -0,0 +1,45 @@
+/* test the attribute counted_by and its usage in
+   bounds sanitizer combined with VLA.  */
+/* { dg-do run } */
+/* { dg-options "-fsanitize=bounds" } */
+/* { dg-output "index 11 out of bounds for type 'int \\\[\\\*\\\]\\\[\\\*\\\]'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]*index 20 out of bounds for type 'int \\\[\\\*\\\]\\\[\\\*\\\]\\\[\\\*\\\]'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]*index 11 out of bounds for type 'int \\\[\\\*\\\]\\\[\\\*\\\]'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]*index 10 out of bounds for type 'int \\\[\\\*\\\]'\[^\n\r]*(\n|\r\n|\r)" } */
+
+
+#include <stdlib.h>
+
+void __attribute__((__noinline__)) setup_and_test_vla (int n, int m)
+{
+   struct foo {
+       int n;
+       int p[][n] __attribute__((counted_by(n)));
+   } *f;
+
+   f = (struct foo *) malloc (sizeof(struct foo) + m*sizeof(int[n]));
+   f->n = m;
+   f->p[m][n-1]=1;
+   return;
+}
+
+void __attribute__((__noinline__)) setup_and_test_vla_1 (int n1, int n2, int m)
+{
+  struct foo {
+    int n;
+    int p[][n2][n1] __attribute__((counted_by(n)));
+  } *f;
+
+  f = (struct foo *) malloc (sizeof(struct foo) + m*sizeof(int[n2][n1]));
+  f->n = m;
+  f->p[m][n2][n1]=1;
+  return;
+}
+
+int main(int argc, char *argv[])
+{
+  setup_and_test_vla (10, 11);
+  setup_and_test_vla_1 (10, 11, 20);
+  return 0;
+}
+
diff --git a/gcc/testsuite/gcc.dg/ubsan/flex-array-counted-by-bounds.c b/gcc/testsuite/gcc.dg/ubsan/flex-array-counted-by-bounds.c
new file mode 100644
index 000000000000..81eaeb3f2681
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/ubsan/flex-array-counted-by-bounds.c
@@ -0,0 +1,46 @@
+/* test the attribute counted_by and its usage in
+   bounds sanitizer.  */
+/* { dg-do run } */
+/* { dg-options "-fsanitize=bounds" } */
+
+#include <stdlib.h>
+
+struct flex {
+  int b;
+  int c[];
+} *array_flex;
+
+struct annotated {
+  int b;
+  int c[] __attribute__ ((counted_by (b)));
+} *array_annotated;
+
+void __attribute__((__noinline__)) setup (int normal_count, int annotated_count)
+{
+  array_flex
+    = (struct flex *)malloc (sizeof (struct flex)
+			     + normal_count *  sizeof (int));
+  array_flex->b = normal_count;
+
+  array_annotated
+    = (struct annotated *)malloc (sizeof (struct annotated)
+				  + annotated_count *  sizeof (int));
+  array_annotated->b = annotated_count;
+
+  return;
+}
+
+void __attribute__((__noinline__)) test (int normal_index, int annotated_index)
+{
+  array_flex->c[normal_index] = 1;
+  array_annotated->c[annotated_index] = 2;
+}
+
+int main(int argc, char *argv[])
+{
+  setup (10, 10);   
+  test (10, 10);
+  return 0;
+}
+
+/* { dg-output "36:21: runtime error: index 10 out of bounds for type" } */
-- 
2.31.1


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v4 0/4]New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)
  2024-01-24  0:29 [PATCH v4 0/4]New attribute "counted_by" to annotate bounds for C99 FAM(PR108896) Qing Zhao
                   ` (3 preceding siblings ...)
  2024-01-24  0:29 ` [PATCH v4 4/4] Use the .ACCESS_WITH_SIZE in bound sanitizer Qing Zhao
@ 2024-01-25  0:51 ` Kees Cook
  2024-01-25 20:11   ` Qing Zhao
  2024-01-30 16:04 ` Qing Zhao
  5 siblings, 1 reply; 23+ messages in thread
From: Kees Cook @ 2024-01-25  0:51 UTC (permalink / raw)
  To: Qing Zhao
  Cc: joseph, richard.guenther, jakub, siddhesh, uecker, isanbard, gcc-patches

On Wed, Jan 24, 2024 at 12:29:51AM +0000, Qing Zhao wrote:
> This is the 4th version of the patch.

Thanks very much for this!

I tripped over an unexpected behavioral change that the Linux kernel
depends on:

__builtin_types_compatible_p() no longer treats an array marked with
counted_by as different from that array's decayed pointer. Specifically,
the kernel uses these macros:


/*
 * Force a compilation error if condition is true, but also produce a
 * result (of value 0 and type int), so the expression can be used
 * e.g. in a structure initializer (or where-ever else comma expressions
 * aren't permitted).
 */
#define BUILD_BUG_ON_ZERO(e) ((int)(sizeof(struct { int:(-!!(e)); })))

#define __same_type(a, b) __builtin_types_compatible_p(typeof(a), typeof(b))

/* &a[0] degrades to a pointer: a different type from an array */
#define __must_be_array(a)   BUILD_BUG_ON_ZERO(__same_type((a), &(a)[0]))


This gets used in various places to make sure we're dealing with an
array for a macro:

#define ARRAY_SIZE(arr) (sizeof(arr) / sizeof((arr)[0]) + __must_be_array(arr))


So this builds:

struct untracked {
        int size;
        int array[];
} *a;

__must_be_array(a->array)
	=> 0 (as expected)
__builtin_types_compatible_p(typeof(a->array), typeof(&(a->array)[0]))
	=> 0 (as expected, array vs decayed array pointer)


But if counted_by is added, we get a build failure:

struct tracked {
        int size;
        int array[] __counted_by(size);
} *b;

__must_be_array(b->array)
	=> build failure (not expected)
__builtin_types_compatible_p(typeof(b->array), typeof(&(b->array)[0]))
	=> 1 (not expected, both pointers?)




-- 
Kees Cook

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v4 0/4]New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)
  2024-01-25  0:51 ` [PATCH v4 0/4]New attribute "counted_by" to annotate bounds for C99 FAM(PR108896) Kees Cook
@ 2024-01-25 20:11   ` Qing Zhao
  2024-01-26  8:04     ` Martin Uecker
  2024-01-29 16:00     ` Qing Zhao
  0 siblings, 2 replies; 23+ messages in thread
From: Qing Zhao @ 2024-01-25 20:11 UTC (permalink / raw)
  To: Kees Cook
  Cc: joseph, richard.guenther, jakub, siddhesh, uecker, isanbard, gcc-patches

Thanks a lot for the testing.

Yes, I can repeat the issue with the following small example:

#include <stdlib.h>
#include <stddef.h>
#include <stdio.h>

#define MAX(a, b)  ((a) > (b) ? (a) :  (b))

struct untracked {
       int size;
       int array[] __attribute__((counted_by (size)));
} *a;
struct untracked * alloc_buf (int index)
{
  struct untracked *p;
  p = (struct untracked *) malloc (MAX (sizeof (struct untracked),
                                        (offsetof (struct untracked, array[0])
                                         + (index) * sizeof (int))));
  p->size = index;
  return p;
}

int main()
{
  a = alloc_buf(10);
     printf ("same_type is %d\n",
  (__builtin_types_compatible_p(typeof (a->array), typeof (&(a->array)[0]))));
  return 0;
}


/home/opc/Install/latest-d/bin/gcc -O2 btcp.c
same_type is 1

Looks like that the “typeof” operator need to be handled specially in C FE
 for the new internal function .ACCESS_WITH_SIZE. 

(I have specially handle the operator “offsetof” in C FE already).

Will fix this issue.

Thanks.

Qing

> On Jan 24, 2024, at 7:51 PM, Kees Cook <keescook@chromium.org> wrote:
> 
> On Wed, Jan 24, 2024 at 12:29:51AM +0000, Qing Zhao wrote:
>> This is the 4th version of the patch.
> 
> Thanks very much for this!
> 
> I tripped over an unexpected behavioral change that the Linux kernel
> depends on:
> 
> __builtin_types_compatible_p() no longer treats an array marked with
> counted_by as different from that array's decayed pointer. Specifically,
> the kernel uses these macros:
> 
> 
> /*
> * Force a compilation error if condition is true, but also produce a
> * result (of value 0 and type int), so the expression can be used
> * e.g. in a structure initializer (or where-ever else comma expressions
> * aren't permitted).
> */
> #define BUILD_BUG_ON_ZERO(e) ((int)(sizeof(struct { int:(-!!(e)); })))
> 
> #define __same_type(a, b) __builtin_types_compatible_p(typeof(a), typeof(b))
> 
> /* &a[0] degrades to a pointer: a different type from an array */
> #define __must_be_array(a)   BUILD_BUG_ON_ZERO(__same_type((a), &(a)[0]))
> 
> 
> This gets used in various places to make sure we're dealing with an
> array for a macro:
> 
> #define ARRAY_SIZE(arr) (sizeof(arr) / sizeof((arr)[0]) + __must_be_array(arr))
> 
> 
> So this builds:
> 
> struct untracked {
>        int size;
>        int array[];
> } *a;
> 
> __must_be_array(a->array)
> => 0 (as expected)
> __builtin_types_compatible_p(typeof(a->array), typeof(&(a->array)[0]))
> => 0 (as expected, array vs decayed array pointer)
> 
> 
> But if counted_by is added, we get a build failure:
> 
> struct tracked {
>        int size;
>        int array[] __counted_by(size);
> } *b;
> 
> __must_be_array(b->array)
> => build failure (not expected)
> __builtin_types_compatible_p(typeof(b->array), typeof(&(b->array)[0]))
> => 1 (not expected, both pointers?)
> 
> 
> 
> 
> -- 
> Kees Cook


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v4 0/4]New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)
  2024-01-25 20:11   ` Qing Zhao
@ 2024-01-26  8:04     ` Martin Uecker
  2024-01-26 14:33       ` Qing Zhao
  2024-01-29 16:00     ` Qing Zhao
  1 sibling, 1 reply; 23+ messages in thread
From: Martin Uecker @ 2024-01-26  8:04 UTC (permalink / raw)
  To: Qing Zhao, Kees Cook
  Cc: joseph, richard.guenther, jakub, siddhesh, isanbard, gcc-patches


I haven't looked at the patch, but it sounds you give the result
the wrong type. Then patching up all use cases instead of the
type seems wrong.

Martin


Am Donnerstag, dem 25.01.2024 um 20:11 +0000 schrieb Qing Zhao:
> Thanks a lot for the testing.
> 
> Yes, I can repeat the issue with the following small example:
> 
> #include <stdlib.h>
> #include <stddef.h>
> #include <stdio.h>
> 
> #define MAX(a, b)  ((a) > (b) ? (a) :  (b))
> 
> struct untracked {
>        int size;
>        int array[] __attribute__((counted_by (size)));
> } *a;
> struct untracked * alloc_buf (int index)
> {
>   struct untracked *p;
>   p = (struct untracked *) malloc (MAX (sizeof (struct untracked),
>                                         (offsetof (struct untracked, array[0])
>                                          + (index) * sizeof (int))));
>   p->size = index;
>   return p;
> }
> 
> int main()
> {
>   a = alloc_buf(10);
>      printf ("same_type is %d\n",
>   (__builtin_types_compatible_p(typeof (a->array), typeof (&(a->array)[0]))));
>   return 0;
> }
> 
> 
> /home/opc/Install/latest-d/bin/gcc -O2 btcp.c
> same_type is 1
> 
> Looks like that the “typeof” operator need to be handled specially in C FE
>  for the new internal function .ACCESS_WITH_SIZE. 
> 
> (I have specially handle the operator “offsetof” in C FE already).
> 
> Will fix this issue.
> 
> Thanks.
> 
> Qing
> 
> > On Jan 24, 2024, at 7:51 PM, Kees Cook <keescook@chromium.org> wrote:
> > 
> > On Wed, Jan 24, 2024 at 12:29:51AM +0000, Qing Zhao wrote:
> > > This is the 4th version of the patch.
> > 
> > Thanks very much for this!
> > 
> > I tripped over an unexpected behavioral change that the Linux kernel
> > depends on:
> > 
> > __builtin_types_compatible_p() no longer treats an array marked with
> > counted_by as different from that array's decayed pointer. Specifically,
> > the kernel uses these macros:
> > 
> > 
> > /*
> > * Force a compilation error if condition is true, but also produce a
> > * result (of value 0 and type int), so the expression can be used
> > * e.g. in a structure initializer (or where-ever else comma expressions
> > * aren't permitted).
> > */
> > #define BUILD_BUG_ON_ZERO(e) ((int)(sizeof(struct { int:(-!!(e)); })))
> > 
> > #define __same_type(a, b) __builtin_types_compatible_p(typeof(a), typeof(b))
> > 
> > /* &a[0] degrades to a pointer: a different type from an array */
> > #define __must_be_array(a)   BUILD_BUG_ON_ZERO(__same_type((a), &(a)[0]))
> > 
> > 
> > This gets used in various places to make sure we're dealing with an
> > array for a macro:
> > 
> > #define ARRAY_SIZE(arr) (sizeof(arr) / sizeof((arr)[0]) + __must_be_array(arr))
> > 
> > 
> > So this builds:
> > 
> > struct untracked {
> >        int size;
> >        int array[];
> > } *a;
> > 
> > __must_be_array(a->array)
> > => 0 (as expected)
> > __builtin_types_compatible_p(typeof(a->array), typeof(&(a->array)[0]))
> > => 0 (as expected, array vs decayed array pointer)
> > 
> > 
> > But if counted_by is added, we get a build failure:
> > 
> > struct tracked {
> >        int size;
> >        int array[] __counted_by(size);
> > } *b;
> > 
> > __must_be_array(b->array)
> > => build failure (not expected)
> > __builtin_types_compatible_p(typeof(b->array), typeof(&(b->array)[0]))
> > => 1 (not expected, both pointers?)
> > 
> > 
> > 
> > 
> > -- 
> > Kees Cook
> 


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v4 0/4]New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)
  2024-01-26  8:04     ` Martin Uecker
@ 2024-01-26 14:33       ` Qing Zhao
  2024-01-28 10:09         ` Martin Uecker
  0 siblings, 1 reply; 23+ messages in thread
From: Qing Zhao @ 2024-01-26 14:33 UTC (permalink / raw)
  To: Martin Uecker
  Cc: Kees Cook, joseph, richard.guenther, jakub, siddhesh, isanbard,
	gcc-patches



> On Jan 26, 2024, at 3:04 AM, Martin Uecker <uecker@tugraz.at> wrote:
> 
> 
> I haven't looked at the patch, but it sounds you give the result
> the wrong type. Then patching up all use cases instead of the
> type seems wrong.

Yes, this is for resolving a very early gimplification issue as I reported last Nov:
https://gcc.gnu.org/pipermail/gcc-patches/2023-November/638793.html

Since no-one responded at that time, I fixed the issue by replacing the ARRAY_REF
With a pointer indirection:
https://gcc.gnu.org/pipermail/gcc-patches/2023-December/639605.html

The reason for such change is:  return a flexible array member TYPE is not allowed
by C language (our gimplification follows this rule), so, we have to return a pointer TYPE instead. 

******The new internal function

 .ACCESS_WITH_SIZE (REF_TO_OBJ, REF_TO_SIZE, CLASS_OF_SIZE, SIZE_OF_SIZE, ACCESS_MODE, INDEX)

INTERNAL_FN (ACCESS_WITH_SIZE, ECF_LEAF | ECF_NOTHROW, NULL)

which returns the "REF_TO_OBJ" same as the 1st argument;

Both the return type and the type of the first argument of this function have been converted from 
the incomplete array type to the corresponding pointer type.

As a result, the original ARRAY_REF was converted to an INDIRECT_REF, the original INDEX of the ARRAY_REF was lost
when converting from ARRAY_REF to INDIRECT_REF, in order to keep the INDEX for bound sanitizer instrumentation, I added
The 6th argument “INDEX”.

What’s your comment and suggestion on this solution?

Thanks.

Qing


> 
> Martin
> 
> 
> Am Donnerstag, dem 25.01.2024 um 20:11 +0000 schrieb Qing Zhao:
>> Thanks a lot for the testing.
>> 
>> Yes, I can repeat the issue with the following small example:
>> 
>> #include <stdlib.h>
>> #include <stddef.h>
>> #include <stdio.h>
>> 
>> #define MAX(a, b)  ((a) > (b) ? (a) :  (b))
>> 
>> struct untracked {
>>       int size;
>>       int array[] __attribute__((counted_by (size)));
>> } *a;
>> struct untracked * alloc_buf (int index)
>> {
>>  struct untracked *p;
>>  p = (struct untracked *) malloc (MAX (sizeof (struct untracked),
>>                                        (offsetof (struct untracked, array[0])
>>                                         + (index) * sizeof (int))));
>>  p->size = index;
>>  return p;
>> }
>> 
>> int main()
>> {
>>  a = alloc_buf(10);
>>     printf ("same_type is %d\n",
>>  (__builtin_types_compatible_p(typeof (a->array), typeof (&(a->array)[0]))));
>>  return 0;
>> }
>> 
>> 
>> /home/opc/Install/latest-d/bin/gcc -O2 btcp.c
>> same_type is 1
>> 
>> Looks like that the “typeof” operator need to be handled specially in C FE
>> for the new internal function .ACCESS_WITH_SIZE. 
>> 
>> (I have specially handle the operator “offsetof” in C FE already).
>> 
>> Will fix this issue.
>> 
>> Thanks.
>> 
>> Qing
>> 
>>> On Jan 24, 2024, at 7:51 PM, Kees Cook <keescook@chromium.org> wrote:
>>> 
>>> On Wed, Jan 24, 2024 at 12:29:51AM +0000, Qing Zhao wrote:
>>>> This is the 4th version of the patch.
>>> 
>>> Thanks very much for this!
>>> 
>>> I tripped over an unexpected behavioral change that the Linux kernel
>>> depends on:
>>> 
>>> __builtin_types_compatible_p() no longer treats an array marked with
>>> counted_by as different from that array's decayed pointer. Specifically,
>>> the kernel uses these macros:
>>> 
>>> 
>>> /*
>>> * Force a compilation error if condition is true, but also produce a
>>> * result (of value 0 and type int), so the expression can be used
>>> * e.g. in a structure initializer (or where-ever else comma expressions
>>> * aren't permitted).
>>> */
>>> #define BUILD_BUG_ON_ZERO(e) ((int)(sizeof(struct { int:(-!!(e)); })))
>>> 
>>> #define __same_type(a, b) __builtin_types_compatible_p(typeof(a), typeof(b))
>>> 
>>> /* &a[0] degrades to a pointer: a different type from an array */
>>> #define __must_be_array(a)   BUILD_BUG_ON_ZERO(__same_type((a), &(a)[0]))
>>> 
>>> 
>>> This gets used in various places to make sure we're dealing with an
>>> array for a macro:
>>> 
>>> #define ARRAY_SIZE(arr) (sizeof(arr) / sizeof((arr)[0]) + __must_be_array(arr))
>>> 
>>> 
>>> So this builds:
>>> 
>>> struct untracked {
>>>       int size;
>>>       int array[];
>>> } *a;
>>> 
>>> __must_be_array(a->array)
>>> => 0 (as expected)
>>> __builtin_types_compatible_p(typeof(a->array), typeof(&(a->array)[0]))
>>> => 0 (as expected, array vs decayed array pointer)
>>> 
>>> 
>>> But if counted_by is added, we get a build failure:
>>> 
>>> struct tracked {
>>>       int size;
>>>       int array[] __counted_by(size);
>>> } *b;
>>> 
>>> __must_be_array(b->array)
>>> => build failure (not expected)
>>> __builtin_types_compatible_p(typeof(b->array), typeof(&(b->array)[0]))
>>> => 1 (not expected, both pointers?)
>>> 
>>> 
>>> 
>>> 
>>> -- 
>>> Kees Cook
>> 
> 


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v4 0/4]New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)
  2024-01-26 14:33       ` Qing Zhao
@ 2024-01-28 10:09         ` Martin Uecker
  2024-01-29 15:09           ` Qing Zhao
  0 siblings, 1 reply; 23+ messages in thread
From: Martin Uecker @ 2024-01-28 10:09 UTC (permalink / raw)
  To: Qing Zhao
  Cc: Kees Cook, joseph, richard.guenther, jakub, siddhesh, isanbard,
	gcc-patches

Am Freitag, dem 26.01.2024 um 14:33 +0000 schrieb Qing Zhao:
> 
> > On Jan 26, 2024, at 3:04 AM, Martin Uecker <uecker@tugraz.at> wrote:
> > 
> > 
> > I haven't looked at the patch, but it sounds you give the result
> > the wrong type. Then patching up all use cases instead of the
> > type seems wrong.
> 
> Yes, this is for resolving a very early gimplification issue as I reported last Nov:
> https://gcc.gnu.org/pipermail/gcc-patches/2023-November/638793.html
> 
> Since no-one responded at that time, I fixed the issue by replacing the ARRAY_REF
> With a pointer indirection:
> https://gcc.gnu.org/pipermail/gcc-patches/2023-December/639605.html
> 
> The reason for such change is:  return a flexible array member TYPE is not allowed
> by C language (our gimplification follows this rule), so, we have to return a pointer TYPE instead. 
> 
> ******The new internal function
> 
>  .ACCESS_WITH_SIZE (REF_TO_OBJ, REF_TO_SIZE, CLASS_OF_SIZE, SIZE_OF_SIZE, ACCESS_MODE, INDEX)
> 
> INTERNAL_FN (ACCESS_WITH_SIZE, ECF_LEAF | ECF_NOTHROW, NULL)
> 
> which returns the "REF_TO_OBJ" same as the 1st argument;
> 
> Both the return type and the type of the first argument of this function have been converted from 
> the incomplete array type to the corresponding pointer type.
> 
> As a result, the original ARRAY_REF was converted to an INDIRECT_REF, the original INDEX of the ARRAY_REF was lost
> when converting from ARRAY_REF to INDIRECT_REF, in order to keep the INDEX for bound sanitizer instrumentation, I added
> The 6th argument “INDEX”.
> 
> What’s your comment and suggestion on this solution?

I am not entirely sure but changing types in the FE seems
problematic because this breaks language semantics. And
then adding special code everywhere to treat it specially
in the FE does not seem a good way forward.

If I understand correctly, returning an incomplete array 
type is not allowed and then fails during gimplification.
So I would suggest to make it return a pointer to the 
incomplete array (and not the element type) but then wrap
it with an indirection when inserting this code in the FE
so that the full replacement has the correct type again
(of the incomplete array).


Alternatively, one could allow this during gimplification
or add some conversion.

Martin


> 
> Thanks.
> 
> Qing
> 
> 
> > 
> > Martin
> > 
> > 
> > Am Donnerstag, dem 25.01.2024 um 20:11 +0000 schrieb Qing Zhao:
> > > Thanks a lot for the testing.
> > > 
> > > Yes, I can repeat the issue with the following small example:
> > > 
> > > #include <stdlib.h>
> > > #include <stddef.h>
> > > #include <stdio.h>
> > > 
> > > #define MAX(a, b)  ((a) > (b) ? (a) :  (b))
> > > 
> > > struct untracked {
> > >       int size;
> > >       int array[] __attribute__((counted_by (size)));
> > > } *a;
> > > struct untracked * alloc_buf (int index)
> > > {
> > >  struct untracked *p;
> > >  p = (struct untracked *) malloc (MAX (sizeof (struct untracked),
> > >                                        (offsetof (struct untracked, array[0])
> > >                                         + (index) * sizeof (int))));
> > >  p->size = index;
> > >  return p;
> > > }
> > > 
> > > int main()
> > > {
> > >  a = alloc_buf(10);
> > >     printf ("same_type is %d\n",
> > >  (__builtin_types_compatible_p(typeof (a->array), typeof (&(a->array)[0]))));
> > >  return 0;
> > > }
> > > 
> > > 
> > > /home/opc/Install/latest-d/bin/gcc -O2 btcp.c
> > > same_type is 1
> > > 
> > > Looks like that the “typeof” operator need to be handled specially in C FE
> > > for the new internal function .ACCESS_WITH_SIZE. 
> > > 
> > > (I have specially handle the operator “offsetof” in C FE already).
> > > 
> > > Will fix this issue.
> > > 
> > > Thanks.
> > > 
> > > Qing
> > > 
> > > > On Jan 24, 2024, at 7:51 PM, Kees Cook <keescook@chromium.org> wrote:
> > > > 
> > > > On Wed, Jan 24, 2024 at 12:29:51AM +0000, Qing Zhao wrote:
> > > > > This is the 4th version of the patch.
> > > > 
> > > > Thanks very much for this!
> > > > 
> > > > I tripped over an unexpected behavioral change that the Linux kernel
> > > > depends on:
> > > > 
> > > > __builtin_types_compatible_p() no longer treats an array marked with
> > > > counted_by as different from that array's decayed pointer. Specifically,
> > > > the kernel uses these macros:
> > > > 
> > > > 
> > > > /*
> > > > * Force a compilation error if condition is true, but also produce a
> > > > * result (of value 0 and type int), so the expression can be used
> > > > * e.g. in a structure initializer (or where-ever else comma expressions
> > > > * aren't permitted).
> > > > */
> > > > #define BUILD_BUG_ON_ZERO(e) ((int)(sizeof(struct { int:(-!!(e)); })))
> > > > 
> > > > #define __same_type(a, b) __builtin_types_compatible_p(typeof(a), typeof(b))
> > > > 
> > > > /* &a[0] degrades to a pointer: a different type from an array */
> > > > #define __must_be_array(a)   BUILD_BUG_ON_ZERO(__same_type((a), &(a)[0]))
> > > > 
> > > > 
> > > > This gets used in various places to make sure we're dealing with an
> > > > array for a macro:
> > > > 
> > > > #define ARRAY_SIZE(arr) (sizeof(arr) / sizeof((arr)[0]) + __must_be_array(arr))
> > > > 
> > > > 
> > > > So this builds:
> > > > 
> > > > struct untracked {
> > > >       int size;
> > > >       int array[];
> > > > } *a;
> > > > 
> > > > __must_be_array(a->array)
> > > > => 0 (as expected)
> > > > __builtin_types_compatible_p(typeof(a->array), typeof(&(a->array)[0]))
> > > > => 0 (as expected, array vs decayed array pointer)
> > > > 
> > > > 
> > > > But if counted_by is added, we get a build failure:
> > > > 
> > > > struct tracked {
> > > >       int size;
> > > >       int array[] __counted_by(size);
> > > > } *b;
> > > > 
> > > > __must_be_array(b->array)
> > > > => build failure (not expected)
> > > > __builtin_types_compatible_p(typeof(b->array), typeof(&(b->array)[0]))
> > > > => 1 (not expected, both pointers?)
> > > > 
> > > > 
> > > > 
> > > > 
> > > > -- 
> > > > Kees Cook
> > > 
> > 
> 


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v4 0/4]New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)
  2024-01-28 10:09         ` Martin Uecker
@ 2024-01-29 15:09           ` Qing Zhao
  2024-01-29 15:50             ` Martin Uecker
  2024-01-29 20:35             ` Joseph Myers
  0 siblings, 2 replies; 23+ messages in thread
From: Qing Zhao @ 2024-01-29 15:09 UTC (permalink / raw)
  To: Martin Uecker, josmyers, Richard Biener
  Cc: Kees Cook, jakub, siddhesh, isanbard, gcc-patches

Thank you!

Joseph and Richard,  could you also comment on this?

> On Jan 28, 2024, at 5:09 AM, Martin Uecker <uecker@tugraz.at> wrote:
> 
> Am Freitag, dem 26.01.2024 um 14:33 +0000 schrieb Qing Zhao:
>> 
>>> On Jan 26, 2024, at 3:04 AM, Martin Uecker <uecker@tugraz.at> wrote:
>>> 
>>> 
>>> I haven't looked at the patch, but it sounds you give the result
>>> the wrong type. Then patching up all use cases instead of the
>>> type seems wrong.
>> 
>> Yes, this is for resolving a very early gimplification issue as I reported last Nov:
>> https://gcc.gnu.org/pipermail/gcc-patches/2023-November/638793.html
>> 
>> Since no-one responded at that time, I fixed the issue by replacing the ARRAY_REF
>> With a pointer indirection:
>> https://gcc.gnu.org/pipermail/gcc-patches/2023-December/639605.html
>> 
>> The reason for such change is:  return a flexible array member TYPE is not allowed
>> by C language (our gimplification follows this rule), so, we have to return a pointer TYPE instead. 
>> 
>> ******The new internal function
>> 
>> .ACCESS_WITH_SIZE (REF_TO_OBJ, REF_TO_SIZE, CLASS_OF_SIZE, SIZE_OF_SIZE, ACCESS_MODE, INDEX)
>> 
>> INTERNAL_FN (ACCESS_WITH_SIZE, ECF_LEAF | ECF_NOTHROW, NULL)
>> 
>> which returns the "REF_TO_OBJ" same as the 1st argument;
>> 
>> Both the return type and the type of the first argument of this function have been converted from 
>> the incomplete array type to the corresponding pointer type.
>> 
>> As a result, the original ARRAY_REF was converted to an INDIRECT_REF, the original INDEX of the ARRAY_REF was lost
>> when converting from ARRAY_REF to INDIRECT_REF, in order to keep the INDEX for bound sanitizer instrumentation, I added
>> The 6th argument “INDEX”.
>> 
>> What’s your comment and suggestion on this solution?
> 
> I am not entirely sure but changing types in the FE seems
> problematic because this breaks language semantics. And
> then adding special code everywhere to treat it specially
> in the FE does not seem a good way forward.
> 
> If I understand correctly, returning an incomplete array 
> type is not allowed and then fails during gimplification.

Yes, this is the problem in gimplification. 

> So I would suggest to make it return a pointer to the 
> incomplete array (and not the element type)


for the following:

struct annotated {
  unsigned int size;
  int array[] __attribute__((counted_by (size)));
};

  struct annotated * p = ….
  p->array[9] = 0;

The IL for the above array reference p->array[9] is:

1. If the return type is the original incomplete array type, 

.ACCESS_WITH_SIZE ((int *) &p->array, &p->size, 1, 32, -1)[9] = 0;

(this triggered the gimplification failure since the return type cannot be a complete type).

2. When the return type is changed to a pointer to the element type of the incomplete array, (the current patch)
Then the original array reference naturally becomes an indirect reference through the pointer

*(.ACCESS_WITH_SIZE ((int *) &p->array, &p->size, 1, 32, -1, 9) + 36) = 0;

Since the original array reference becomes an indirect reference through the pointer to the element array, the INDEX info 
is mixed into the OFFSET of the indirect reference and lost, so, I added the 6th argument to the routine .ACCESS_WITH_SIZE
to record the INDEX. 

3. With your suggestion, the return type is changed to a pointer to the incomplete array, 
I just tried this to change the result type :


--- a/gcc/c/c-typeck.cc
+++ b/gcc/c/c-typeck.cc
@@ -2619,7 +2619,7 @@ build_access_with_size_for_counted_by (location_t loc, tree ref,
                                       tree counted_by_type)
 {
   gcc_assert (c_flexible_array_member_type_p (TREE_TYPE (ref)));
-  tree result_type = build_pointer_type (TREE_TYPE (TREE_TYPE (ref)));
+  tree result_type = build_pointer_type (TREE_TYPE (ref));

Then, I got the following FE errors:

test.c:10:11: error: invalid use of flexible array member
   10 |   p->array[9] = 0;

The reason for the error is: when the original array_ref becomes an indirect_ref through the pointer to the incomplete array,
During the computation of the OFFSET to the pointer, the TYPE_SIZE_UNIT (type) is invalid since the type is an incomplete array. 
As a result, the OFFSET cannot computed for the indirect_ref.

Looks like even more issues with this approach.


> but then wrap
> it with an indirection when inserting this code in the FE
> so that the full replacement has the correct type again
> (of the incomplete array).

I don’t quite understand the above, could you please explain this in more details? (If possible, could you please use the above small example?)
thanks.

> 
> 
> Alternatively, one could allow this during gimplification
> or add some conversion.

Allowing this in gimplification might trigger some other issues.  I guess that adding conversion 
in the end of the FE or in the beginning of gimplification might be better.

i.e,  in FE, still keep the original incomplete array type as the return type for the routine .ACCESS_WITH_SIZE, i.e

.ACCESS_WITH_SIZE ((int *) &p->array, &p->size, 1, 32, -1)[9] = 0;

But add a conversion from the above array_ref to an indirect_ref in the end of FE or in the beginning of gimplification:

*(.ACCESS_WITH_SIZE ((int *) &p->array, &p->size, 1, 32, -1) + 36) = 0;

With this approach,  during FE, the original ARRAY TYPE and the INDEX can be kept, the array bound sanitizer instrumentation
Will be much easier than my current approach. 

Is this approach reasonable?

If so, where is better to add this conversion, in the end of FE or in the beginning of gimplification?

Thanks.

Qing


> 
> Martin
> 
> 
>> 
>> Thanks.
>> 
>> Qing
>> 
>> 
>>> 
>>> Martin
>>> 
>>> 
>>> Am Donnerstag, dem 25.01.2024 um 20:11 +0000 schrieb Qing Zhao:
>>>> Thanks a lot for the testing.
>>>> 
>>>> Yes, I can repeat the issue with the following small example:
>>>> 
>>>> #include <stdlib.h>
>>>> #include <stddef.h>
>>>> #include <stdio.h>
>>>> 
>>>> #define MAX(a, b)  ((a) > (b) ? (a) :  (b))
>>>> 
>>>> struct untracked {
>>>>      int size;
>>>>      int array[] __attribute__((counted_by (size)));
>>>> } *a;
>>>> struct untracked * alloc_buf (int index)
>>>> {
>>>> struct untracked *p;
>>>> p = (struct untracked *) malloc (MAX (sizeof (struct untracked),
>>>>                                       (offsetof (struct untracked, array[0])
>>>>                                        + (index) * sizeof (int))));
>>>> p->size = index;
>>>> return p;
>>>> }
>>>> 
>>>> int main()
>>>> {
>>>> a = alloc_buf(10);
>>>>    printf ("same_type is %d\n",
>>>> (__builtin_types_compatible_p(typeof (a->array), typeof (&(a->array)[0]))));
>>>> return 0;
>>>> }
>>>> 
>>>> 
>>>> /home/opc/Install/latest-d/bin/gcc -O2 btcp.c
>>>> same_type is 1
>>>> 
>>>> Looks like that the “typeof” operator need to be handled specially in C FE
>>>> for the new internal function .ACCESS_WITH_SIZE. 
>>>> 
>>>> (I have specially handle the operator “offsetof” in C FE already).
>>>> 
>>>> Will fix this issue.
>>>> 
>>>> Thanks.
>>>> 
>>>> Qing
>>>> 
>>>>> On Jan 24, 2024, at 7:51 PM, Kees Cook <keescook@chromium.org> wrote:
>>>>> 
>>>>> On Wed, Jan 24, 2024 at 12:29:51AM +0000, Qing Zhao wrote:
>>>>>> This is the 4th version of the patch.
>>>>> 
>>>>> Thanks very much for this!
>>>>> 
>>>>> I tripped over an unexpected behavioral change that the Linux kernel
>>>>> depends on:
>>>>> 
>>>>> __builtin_types_compatible_p() no longer treats an array marked with
>>>>> counted_by as different from that array's decayed pointer. Specifically,
>>>>> the kernel uses these macros:
>>>>> 
>>>>> 
>>>>> /*
>>>>> * Force a compilation error if condition is true, but also produce a
>>>>> * result (of value 0 and type int), so the expression can be used
>>>>> * e.g. in a structure initializer (or where-ever else comma expressions
>>>>> * aren't permitted).
>>>>> */
>>>>> #define BUILD_BUG_ON_ZERO(e) ((int)(sizeof(struct { int:(-!!(e)); })))
>>>>> 
>>>>> #define __same_type(a, b) __builtin_types_compatible_p(typeof(a), typeof(b))
>>>>> 
>>>>> /* &a[0] degrades to a pointer: a different type from an array */
>>>>> #define __must_be_array(a)   BUILD_BUG_ON_ZERO(__same_type((a), &(a)[0]))
>>>>> 
>>>>> 
>>>>> This gets used in various places to make sure we're dealing with an
>>>>> array for a macro:
>>>>> 
>>>>> #define ARRAY_SIZE(arr) (sizeof(arr) / sizeof((arr)[0]) + __must_be_array(arr))
>>>>> 
>>>>> 
>>>>> So this builds:
>>>>> 
>>>>> struct untracked {
>>>>>      int size;
>>>>>      int array[];
>>>>> } *a;
>>>>> 
>>>>> __must_be_array(a->array)
>>>>> => 0 (as expected)
>>>>> __builtin_types_compatible_p(typeof(a->array), typeof(&(a->array)[0]))
>>>>> => 0 (as expected, array vs decayed array pointer)
>>>>> 
>>>>> 
>>>>> But if counted_by is added, we get a build failure:
>>>>> 
>>>>> struct tracked {
>>>>>      int size;
>>>>>      int array[] __counted_by(size);
>>>>> } *b;
>>>>> 
>>>>> __must_be_array(b->array)
>>>>> => build failure (not expected)
>>>>> __builtin_types_compatible_p(typeof(b->array), typeof(&(b->array)[0]))
>>>>> => 1 (not expected, both pointers?)
>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>> -- 
>>>>> Kees Cook



^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v4 0/4]New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)
  2024-01-29 15:09           ` Qing Zhao
@ 2024-01-29 15:50             ` Martin Uecker
  2024-01-29 16:19               ` Qing Zhao
  2024-01-29 20:35             ` Joseph Myers
  1 sibling, 1 reply; 23+ messages in thread
From: Martin Uecker @ 2024-01-29 15:50 UTC (permalink / raw)
  To: Qing Zhao, josmyers, Richard Biener
  Cc: Kees Cook, jakub, siddhesh, isanbard, gcc-patches

Am Montag, dem 29.01.2024 um 15:09 +0000 schrieb Qing Zhao:
> Thank you!
> 
> Joseph and Richard,  could you also comment on this?
> 
> > On Jan 28, 2024, at 5:09 AM, Martin Uecker <uecker@tugraz.at> wrote:
> > 
> > Am Freitag, dem 26.01.2024 um 14:33 +0000 schrieb Qing Zhao:
> > > 
> > > > On Jan 26, 2024, at 3:04 AM, Martin Uecker <uecker@tugraz.at> wrote:
> > > > 
> > > > 
> > > > I haven't looked at the patch, but it sounds you give the result
> > > > the wrong type. Then patching up all use cases instead of the
> > > > type seems wrong.
> > > 
> > > Yes, this is for resolving a very early gimplification issue as I reported last Nov:
> > > https://gcc.gnu.org/pipermail/gcc-patches/2023-November/638793.html
> > > 
> > > Since no-one responded at that time, I fixed the issue by replacing the ARRAY_REF
> > > With a pointer indirection:
> > > https://gcc.gnu.org/pipermail/gcc-patches/2023-December/639605.html
> > > 
> > > The reason for such change is:  return a flexible array member TYPE is not allowed
> > > by C language (our gimplification follows this rule), so, we have to return a pointer TYPE instead. 
> > > 
> > > ******The new internal function
> > > 
> > > .ACCESS_WITH_SIZE (REF_TO_OBJ, REF_TO_SIZE, CLASS_OF_SIZE, SIZE_OF_SIZE, ACCESS_MODE, INDEX)
> > > 
> > > INTERNAL_FN (ACCESS_WITH_SIZE, ECF_LEAF | ECF_NOTHROW, NULL)
> > > 
> > > which returns the "REF_TO_OBJ" same as the 1st argument;
> > > 
> > > Both the return type and the type of the first argument of this function have been converted from 
> > > the incomplete array type to the corresponding pointer type.
> > > 
> > > As a result, the original ARRAY_REF was converted to an INDIRECT_REF, the original INDEX of the ARRAY_REF was lost
> > > when converting from ARRAY_REF to INDIRECT_REF, in order to keep the INDEX for bound sanitizer instrumentation, I added
> > > The 6th argument “INDEX”.
> > > 
> > > What’s your comment and suggestion on this solution?
> > 
> > I am not entirely sure but changing types in the FE seems
> > problematic because this breaks language semantics. And
> > then adding special code everywhere to treat it specially
> > in the FE does not seem a good way forward.
> > 
> > If I understand correctly, returning an incomplete array 
> > type is not allowed and then fails during gimplification.
> 
> Yes, this is the problem in gimplification. 
> 
> > So I would suggest to make it return a pointer to the 
> > incomplete array (and not the element type)
> 
> 
> for the following:
> 
> struct annotated {
>   unsigned int size;
>   int array[] __attribute__((counted_by (size)));
> };
> 
>   struct annotated * p = ….
>   p->array[9] = 0;
> 
> The IL for the above array reference p->array[9] is:
> 
> 1. If the return type is the original incomplete array type, 
> 
> .ACCESS_WITH_SIZE ((int *) &p->array, &p->size, 1, 32, -1)[9] = 0;
> 
> (this triggered the gimplification failure since the return type cannot be a complete type).
> 
> 2. When the return type is changed to a pointer to the element type of the incomplete array, (the current patch)
> Then the original array reference naturally becomes an indirect reference through the pointer
> 
> *(.ACCESS_WITH_SIZE ((int *) &p->array, &p->size, 1, 32, -1, 9) + 36) = 0;
> 
> Since the original array reference becomes an indirect reference through the pointer to the element array, the INDEX info 
> is mixed into the OFFSET of the indirect reference and lost, so, I added the 6th argument to the routine .ACCESS_WITH_SIZE
> to record the INDEX. 
> 
> 3. With your suggestion, the return type is changed to a pointer to the incomplete array, 
> I just tried this to change the result type :
> 
> 
> --- a/gcc/c/c-typeck.cc
> +++ b/gcc/c/c-typeck.cc
> @@ -2619,7 +2619,7 @@ build_access_with_size_for_counted_by (location_t loc, tree ref,
>                                        tree counted_by_type)
>  {
>    gcc_assert (c_flexible_array_member_type_p (TREE_TYPE (ref)));
> -  tree result_type = build_pointer_type (TREE_TYPE (TREE_TYPE (ref)));
> +  tree result_type = build_pointer_type (TREE_TYPE (ref));
> 
> Then, I got the following FE errors:
> 
> test.c:10:11: error: invalid use of flexible array member
>    10 |   p->array[9] = 0;
> 
> The reason for the error is: when the original array_ref becomes an indirect_ref through the pointer to the incomplete array,
> During the computation of the OFFSET to the pointer, the TYPE_SIZE_UNIT (type) is invalid since the type is an incomplete array. 
> As a result, the OFFSET cannot computed for the indirect_ref.
> 
> Looks like even more issues with this approach.

Yes, but only because the following is missing:

> 
> 
> > but then wrap
> > it with an indirection when inserting this code in the FE
> > so that the full replacement has the correct type again
> > (of the incomplete array).
> 
> I don’t quite understand the above, could you please explain this in more details? (If possible, could you please use the above small example?)
> thanks.

You would need to add an indirection:

(*(.ACCESS_WITH_SIZE ((int *) &p->array, &p->size, 1, 32, -1)))[9] = 0;

if .ACCESS_WITH_SIZE has type    T (*)[], i.e. pointer to incomplete
array of type T, then (*(.ACCESS_WITH_SIZE (...))) has type T[], i.e.
incomplete array of type.   

And you shouldn't even consider array derefencing part at all,
but replace the p->array when the component ref is constructed. 


Martin



> 
> > 
> > 
> > Alternatively, one could allow this during gimplification
> > or add some conversion.
> 
> Allowing this in gimplification might trigger some other issues.  I guess that adding conversion 
> in the end of the FE or in the beginning of gimplification might be better.
> 
> i.e,  in FE, still keep the original incomplete array type as the return type for the routine .ACCESS_WITH_SIZE, i.e
> 
> .ACCESS_WITH_SIZE ((int *) &p->array, &p->size, 1, 32, -1)[9] = 0;
> 
> But add a conversion from the above array_ref to an indirect_ref in the end of FE or in the beginning of gimplification:
> 
> *(.ACCESS_WITH_SIZE ((int *) &p->array, &p->size, 1, 32, -1) + 36) = 0;
> 
> With this approach,  during FE, the original ARRAY TYPE and the INDEX can be kept, the array bound sanitizer instrumentation
> Will be much easier than my current approach. 
> 
> Is this approach reasonable?
> 
> If so, where is better to add this conversion, in the end of FE or in the beginning of gimplification?
> 
> Thanks.
> 
> Qing
> 
> 
> > 
> > Martin
> > 
> > 
> > > 
> > > Thanks.
> > > 
> > > Qing
> > > 
> > > 
> > > > 
> > > > Martin
> > > > 
> > > > 
> > > > Am Donnerstag, dem 25.01.2024 um 20:11 +0000 schrieb Qing Zhao:
> > > > > Thanks a lot for the testing.
> > > > > 
> > > > > Yes, I can repeat the issue with the following small example:
> > > > > 
> > > > > #include <stdlib.h>
> > > > > #include <stddef.h>
> > > > > #include <stdio.h>
> > > > > 
> > > > > #define MAX(a, b)  ((a) > (b) ? (a) :  (b))
> > > > > 
> > > > > struct untracked {
> > > > >      int size;
> > > > >      int array[] __attribute__((counted_by (size)));
> > > > > } *a;
> > > > > struct untracked * alloc_buf (int index)
> > > > > {
> > > > > struct untracked *p;
> > > > > p = (struct untracked *) malloc (MAX (sizeof (struct untracked),
> > > > >                                       (offsetof (struct untracked, array[0])
> > > > >                                        + (index) * sizeof (int))));
> > > > > p->size = index;
> > > > > return p;
> > > > > }
> > > > > 
> > > > > int main()
> > > > > {
> > > > > a = alloc_buf(10);
> > > > >    printf ("same_type is %d\n",
> > > > > (__builtin_types_compatible_p(typeof (a->array), typeof (&(a->array)[0]))));
> > > > > return 0;
> > > > > }
> > > > > 
> > > > > 
> > > > > /home/opc/Install/latest-d/bin/gcc -O2 btcp.c
> > > > > same_type is 1
> > > > > 
> > > > > Looks like that the “typeof” operator need to be handled specially in C FE
> > > > > for the new internal function .ACCESS_WITH_SIZE. 
> > > > > 
> > > > > (I have specially handle the operator “offsetof” in C FE already).
> > > > > 
> > > > > Will fix this issue.
> > > > > 
> > > > > Thanks.
> > > > > 
> > > > > Qing
> > > > > 
> > > > > > On Jan 24, 2024, at 7:51 PM, Kees Cook <keescook@chromium.org> wrote:
> > > > > > 
> > > > > > On Wed, Jan 24, 2024 at 12:29:51AM +0000, Qing Zhao wrote:
> > > > > > > This is the 4th version of the patch.
> > > > > > 
> > > > > > Thanks very much for this!
> > > > > > 
> > > > > > I tripped over an unexpected behavioral change that the Linux kernel
> > > > > > depends on:
> > > > > > 
> > > > > > __builtin_types_compatible_p() no longer treats an array marked with
> > > > > > counted_by as different from that array's decayed pointer. Specifically,
> > > > > > the kernel uses these macros:
> > > > > > 
> > > > > > 
> > > > > > /*
> > > > > > * Force a compilation error if condition is true, but also produce a
> > > > > > * result (of value 0 and type int), so the expression can be used
> > > > > > * e.g. in a structure initializer (or where-ever else comma expressions
> > > > > > * aren't permitted).
> > > > > > */
> > > > > > #define BUILD_BUG_ON_ZERO(e) ((int)(sizeof(struct { int:(-!!(e)); })))
> > > > > > 
> > > > > > #define __same_type(a, b) __builtin_types_compatible_p(typeof(a), typeof(b))
> > > > > > 
> > > > > > /* &a[0] degrades to a pointer: a different type from an array */
> > > > > > #define __must_be_array(a)   BUILD_BUG_ON_ZERO(__same_type((a), &(a)[0]))
> > > > > > 
> > > > > > 
> > > > > > This gets used in various places to make sure we're dealing with an
> > > > > > array for a macro:
> > > > > > 
> > > > > > #define ARRAY_SIZE(arr) (sizeof(arr) / sizeof((arr)[0]) + __must_be_array(arr))
> > > > > > 
> > > > > > 
> > > > > > So this builds:
> > > > > > 
> > > > > > struct untracked {
> > > > > >      int size;
> > > > > >      int array[];
> > > > > > } *a;
> > > > > > 
> > > > > > __must_be_array(a->array)
> > > > > > => 0 (as expected)
> > > > > > __builtin_types_compatible_p(typeof(a->array), typeof(&(a->array)[0]))
> > > > > > => 0 (as expected, array vs decayed array pointer)
> > > > > > 
> > > > > > 
> > > > > > But if counted_by is added, we get a build failure:
> > > > > > 
> > > > > > struct tracked {
> > > > > >      int size;
> > > > > >      int array[] __counted_by(size);
> > > > > > } *b;
> > > > > > 
> > > > > > __must_be_array(b->array)
> > > > > > => build failure (not expected)
> > > > > > __builtin_types_compatible_p(typeof(b->array), typeof(&(b->array)[0]))
> > > > > > => 1 (not expected, both pointers?)
> > > > > > 
> > > > > > 
> > > > > > 
> > > > > > 
> > > > > > -- 
> > > > > > Kees Cook
> 
> 

-- 
Univ.-Prof. Dr. rer. nat. Martin Uecker
Graz University of Technology
Institute of Biomedical Imaging



^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v4 0/4]New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)
  2024-01-25 20:11   ` Qing Zhao
  2024-01-26  8:04     ` Martin Uecker
@ 2024-01-29 16:00     ` Qing Zhao
  2024-01-29 17:25       ` Kees Cook
  1 sibling, 1 reply; 23+ messages in thread
From: Qing Zhao @ 2024-01-29 16:00 UTC (permalink / raw)
  To: Kees Cook, Martin Uecker, josmyers, Richard Biener
  Cc: jakub, siddhesh, isanbard, gcc-patches

An update on the kernel building with my version 4 patch.

Kees reported two FE issues with the current version 4 patch:

1. The operator “typeof” cannot return correct type for a->array;
2. The operator “&” cannot return correct address for a->array;

I fixed both in my local repository. 

With these additional fix.  Kernel with counted-by annotation can be built successfully. 

And then, Kees reported one behavioral issue with the current counted-by:

When the counted-by value is below zero, my current patch 

A. Didn’t report any warning for it.
B. Accepted the negative value as a wrapped size.

i.e. for:

struct foo {
signed char size;
unsigned char array[] __counted_by(size);
} *a;

...
a->size = -3;
report(__builtin_dynamic_object_size(p->array, 1));

this reports 253, rather than 0.

And the array-bounds sanitizer doesn’t catch negative index bounds neither. 

a->size = -3;
report(a->array[1]); // does not trap


So, my questions are:

 How should we handle the negative counted-by value?

 My approach is:

   I think that this is a user error, the compiler need to Issue warning during runtime about this user error.

Since I have one remaining patch that has not been finished yet:

6  Emit warnings when the user breaks the requirments for the new counted_by attribute
  compilation time: -Wcounted-by
  run time: -fsanitizer=counted-by
     * The initialization to the size field should be done before the first reference to the FAM field.
     * the array has at least # of elements specified by the size field all the time during the program.
     * the value of counted-by should not be negative.

Let me know your comment and suggestions.

Thanks

Qing

> On Jan 25, 2024, at 3:11 PM, Qing Zhao <qing.zhao@oracle.com> wrote:
> 
> Thanks a lot for the testing.
> 
> Yes, I can repeat the issue with the following small example:
> 
> #include <stdlib.h>
> #include <stddef.h>
> #include <stdio.h>
> 
> #define MAX(a, b)  ((a) > (b) ? (a) :  (b))
> 
> struct untracked {
>       int size;
>       int array[] __attribute__((counted_by (size)));
> } *a;
> struct untracked * alloc_buf (int index)
> {
>  struct untracked *p;
>  p = (struct untracked *) malloc (MAX (sizeof (struct untracked),
>                                        (offsetof (struct untracked, array[0])
>                                         + (index) * sizeof (int))));
>  p->size = index;
>  return p;
> }
> 
> int main()
> {
>  a = alloc_buf(10);
>     printf ("same_type is %d\n",
>  (__builtin_types_compatible_p(typeof (a->array), typeof (&(a->array)[0]))));
>  return 0;
> }
> 
> 
> /home/opc/Install/latest-d/bin/gcc -O2 btcp.c
> same_type is 1
> 
> Looks like that the “typeof” operator need to be handled specially in C FE
> for the new internal function .ACCESS_WITH_SIZE. 
> 
> (I have specially handle the operator “offsetof” in C FE already).
> 
> Will fix this issue.
> 
> Thanks.
> 
> Qing
> 
>> On Jan 24, 2024, at 7:51 PM, Kees Cook <keescook@chromium.org> wrote:
>> 
>> On Wed, Jan 24, 2024 at 12:29:51AM +0000, Qing Zhao wrote:
>>> This is the 4th version of the patch.
>> 
>> Thanks very much for this!
>> 
>> I tripped over an unexpected behavioral change that the Linux kernel
>> depends on:
>> 
>> __builtin_types_compatible_p() no longer treats an array marked with
>> counted_by as different from that array's decayed pointer. Specifically,
>> the kernel uses these macros:
>> 
>> 
>> /*
>> * Force a compilation error if condition is true, but also produce a
>> * result (of value 0 and type int), so the expression can be used
>> * e.g. in a structure initializer (or where-ever else comma expressions
>> * aren't permitted).
>> */
>> #define BUILD_BUG_ON_ZERO(e) ((int)(sizeof(struct { int:(-!!(e)); })))
>> 
>> #define __same_type(a, b) __builtin_types_compatible_p(typeof(a), typeof(b))
>> 
>> /* &a[0] degrades to a pointer: a different type from an array */
>> #define __must_be_array(a)   BUILD_BUG_ON_ZERO(__same_type((a), &(a)[0]))
>> 
>> 
>> This gets used in various places to make sure we're dealing with an
>> array for a macro:
>> 
>> #define ARRAY_SIZE(arr) (sizeof(arr) / sizeof((arr)[0]) + __must_be_array(arr))
>> 
>> 
>> So this builds:
>> 
>> struct untracked {
>>       int size;
>>       int array[];
>> } *a;
>> 
>> __must_be_array(a->array)
>> => 0 (as expected)
>> __builtin_types_compatible_p(typeof(a->array), typeof(&(a->array)[0]))
>> => 0 (as expected, array vs decayed array pointer)
>> 
>> 
>> But if counted_by is added, we get a build failure:
>> 
>> struct tracked {
>>       int size;
>>       int array[] __counted_by(size);
>> } *b;
>> 
>> __must_be_array(b->array)
>> => build failure (not expected)
>> __builtin_types_compatible_p(typeof(b->array), typeof(&(b->array)[0]))
>> => 1 (not expected, both pointers?)
>> 
>> 
>> 
>> 
>> -- 
>> Kees Cook
> 


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v4 0/4]New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)
  2024-01-29 15:50             ` Martin Uecker
@ 2024-01-29 16:19               ` Qing Zhao
  0 siblings, 0 replies; 23+ messages in thread
From: Qing Zhao @ 2024-01-29 16:19 UTC (permalink / raw)
  To: Martin Uecker
  Cc: josmyers, Richard Biener, Kees Cook, jakub, siddhesh, isanbard,
	gcc-patches



> On Jan 29, 2024, at 10:50 AM, Martin Uecker <uecker@tugraz.at> wrote:
> 
> Am Montag, dem 29.01.2024 um 15:09 +0000 schrieb Qing Zhao:
>> Thank you!
>> 
>> Joseph and Richard,  could you also comment on this?
>> 
>>> On Jan 28, 2024, at 5:09 AM, Martin Uecker <uecker@tugraz.at> wrote:
>>> 
>>> Am Freitag, dem 26.01.2024 um 14:33 +0000 schrieb Qing Zhao:
>>>> 
>>>>> On Jan 26, 2024, at 3:04 AM, Martin Uecker <uecker@tugraz.at> wrote:
>>>>> 
>>>>> 
>>>>> I haven't looked at the patch, but it sounds you give the result
>>>>> the wrong type. Then patching up all use cases instead of the
>>>>> type seems wrong.
>>>> 
>>>> Yes, this is for resolving a very early gimplification issue as I reported last Nov:
>>>> https://gcc.gnu.org/pipermail/gcc-patches/2023-November/638793.html
>>>> 
>>>> Since no-one responded at that time, I fixed the issue by replacing the ARRAY_REF
>>>> With a pointer indirection:
>>>> https://gcc.gnu.org/pipermail/gcc-patches/2023-December/639605.html
>>>> 
>>>> The reason for such change is:  return a flexible array member TYPE is not allowed
>>>> by C language (our gimplification follows this rule), so, we have to return a pointer TYPE instead. 
>>>> 
>>>> ******The new internal function
>>>> 
>>>> .ACCESS_WITH_SIZE (REF_TO_OBJ, REF_TO_SIZE, CLASS_OF_SIZE, SIZE_OF_SIZE, ACCESS_MODE, INDEX)
>>>> 
>>>> INTERNAL_FN (ACCESS_WITH_SIZE, ECF_LEAF | ECF_NOTHROW, NULL)
>>>> 
>>>> which returns the "REF_TO_OBJ" same as the 1st argument;
>>>> 
>>>> Both the return type and the type of the first argument of this function have been converted from 
>>>> the incomplete array type to the corresponding pointer type.
>>>> 
>>>> As a result, the original ARRAY_REF was converted to an INDIRECT_REF, the original INDEX of the ARRAY_REF was lost
>>>> when converting from ARRAY_REF to INDIRECT_REF, in order to keep the INDEX for bound sanitizer instrumentation, I added
>>>> The 6th argument “INDEX”.
>>>> 
>>>> What’s your comment and suggestion on this solution?
>>> 
>>> I am not entirely sure but changing types in the FE seems
>>> problematic because this breaks language semantics. And
>>> then adding special code everywhere to treat it specially
>>> in the FE does not seem a good way forward.
>>> 
>>> If I understand correctly, returning an incomplete array 
>>> type is not allowed and then fails during gimplification.
>> 
>> Yes, this is the problem in gimplification. 
>> 
>>> So I would suggest to make it return a pointer to the 
>>> incomplete array (and not the element type)
>> 
>> 
>> for the following:
>> 
>> struct annotated {
>>  unsigned int size;
>>  int array[] __attribute__((counted_by (size)));
>> };
>> 
>>  struct annotated * p = ….
>>  p->array[9] = 0;
>> 
>> The IL for the above array reference p->array[9] is:
>> 
>> 1. If the return type is the original incomplete array type, 
>> 
>> .ACCESS_WITH_SIZE ((int *) &p->array, &p->size, 1, 32, -1)[9] = 0;
>> 
>> (this triggered the gimplification failure since the return type cannot be a complete type).
>> 
>> 2. When the return type is changed to a pointer to the element type of the incomplete array, (the current patch)
>> Then the original array reference naturally becomes an indirect reference through the pointer
>> 
>> *(.ACCESS_WITH_SIZE ((int *) &p->array, &p->size, 1, 32, -1, 9) + 36) = 0;
>> 
>> Since the original array reference becomes an indirect reference through the pointer to the element array, the INDEX info 
>> is mixed into the OFFSET of the indirect reference and lost, so, I added the 6th argument to the routine .ACCESS_WITH_SIZE
>> to record the INDEX. 
>> 
>> 3. With your suggestion, the return type is changed to a pointer to the incomplete array, 
>> I just tried this to change the result type :
>> 
>> 
>> --- a/gcc/c/c-typeck.cc
>> +++ b/gcc/c/c-typeck.cc
>> @@ -2619,7 +2619,7 @@ build_access_with_size_for_counted_by (location_t loc, tree ref,
>>                                       tree counted_by_type)
>> {
>>   gcc_assert (c_flexible_array_member_type_p (TREE_TYPE (ref)));
>> -  tree result_type = build_pointer_type (TREE_TYPE (TREE_TYPE (ref)));
>> +  tree result_type = build_pointer_type (TREE_TYPE (ref));
>> 
>> Then, I got the following FE errors:
>> 
>> test.c:10:11: error: invalid use of flexible array member
>>   10 |   p->array[9] = 0;
>> 
>> The reason for the error is: when the original array_ref becomes an indirect_ref through the pointer to the incomplete array,
>> During the computation of the OFFSET to the pointer, the TYPE_SIZE_UNIT (type) is invalid since the type is an incomplete array. 
>> As a result, the OFFSET cannot computed for the indirect_ref.
>> 
>> Looks like even more issues with this approach.
> 
> Yes, but only because the following is missing:
> 
>> 
>> 
>>> but then wrap
>>> it with an indirection when inserting this code in the FE
>>> so that the full replacement has the correct type again
>>> (of the incomplete array).
>> 
>> I don’t quite understand the above, could you please explain this in more details? (If possible, could you please use the above small example?)
>> thanks.
> 
> You would need to add an indirection:
> 
> (*(.ACCESS_WITH_SIZE ((int *) &p->array, &p->size, 1, 32, -1)))[9] = 0;
> 
> if .ACCESS_WITH_SIZE has type    T (*)[], i.e. pointer to incomplete
> array of type T, then (*(.ACCESS_WITH_SIZE (...))) has type T[], i.e.
> incomplete array of type.   
> 
> And you shouldn't even consider array derefencing part at all,
> but replace the p->array when the component ref is constructed. 

Thanks, I see now.

 I just updated the routine “build_access_with_size_for_counted_by” as following:
--- a/gcc/c/c-typeck.cc
+++ b/gcc/c/c-typeck.cc
@@ -2619,7 +2619,7 @@ build_access_with_size_for_counted_by (location_t loc, tree ref,
                                       tree counted_by_type)
 {
   gcc_assert (c_flexible_array_member_type_p (TREE_TYPE (ref)));
-  tree result_type = build_pointer_type (TREE_TYPE (TREE_TYPE (ref)));
+  tree result_type = build_pointer_type (TREE_TYPE (ref));
   unsigned int counted_by_precision = TYPE_PRECISION (counted_by_type);
     tree call
@@ -2632,6 +2632,7 @@ build_access_with_size_for_counted_by (location_t loc, tree ref,
                                                 counted_by_precision),
                                  build_int_cst (integer_type_node, -1),
                                  build_int_cst (integer_type_node, -1));
+  call = build1 (INDIRECT_REF, TREE_TYPE (ref), call);
   SET_EXPR_LOCATION (call, loc);
   return call;
 }

This works for the small testing case, the generated IR is:

  (*.ACCESS_WITH_SIZE ((int *) &p->array, &p->size, 1, 32, -1, -1))[9] = 0;

I will do more testings and adjust my patch on this change.

Thanks a lot for your help.

Qing

> 
> Martin
> 
> 
> 
>> 
>>> 
>>> 
>>> Alternatively, one could allow this during gimplification
>>> or add some conversion.
>> 
>> Allowing this in gimplification might trigger some other issues.  I guess that adding conversion 
>> in the end of the FE or in the beginning of gimplification might be better.
>> 
>> i.e,  in FE, still keep the original incomplete array type as the return type for the routine .ACCESS_WITH_SIZE, i.e
>> 
>> .ACCESS_WITH_SIZE ((int *) &p->array, &p->size, 1, 32, -1)[9] = 0;
>> 
>> But add a conversion from the above array_ref to an indirect_ref in the end of FE or in the beginning of gimplification:
>> 
>> *(.ACCESS_WITH_SIZE ((int *) &p->array, &p->size, 1, 32, -1) + 36) = 0;
>> 
>> With this approach,  during FE, the original ARRAY TYPE and the INDEX can be kept, the array bound sanitizer instrumentation
>> Will be much easier than my current approach. 
>> 
>> Is this approach reasonable?
>> 
>> If so, where is better to add this conversion, in the end of FE or in the beginning of gimplification?
>> 
>> Thanks.
>> 
>> Qing
>> 
>> 
>>> 
>>> Martin
>>> 
>>> 
>>>> 
>>>> Thanks.
>>>> 
>>>> Qing
>>>> 
>>>> 
>>>>> 
>>>>> Martin
>>>>> 
>>>>> 
>>>>> Am Donnerstag, dem 25.01.2024 um 20:11 +0000 schrieb Qing Zhao:
>>>>>> Thanks a lot for the testing.
>>>>>> 
>>>>>> Yes, I can repeat the issue with the following small example:
>>>>>> 
>>>>>> #include <stdlib.h>
>>>>>> #include <stddef.h>
>>>>>> #include <stdio.h>
>>>>>> 
>>>>>> #define MAX(a, b)  ((a) > (b) ? (a) :  (b))
>>>>>> 
>>>>>> struct untracked {
>>>>>>     int size;
>>>>>>     int array[] __attribute__((counted_by (size)));
>>>>>> } *a;
>>>>>> struct untracked * alloc_buf (int index)
>>>>>> {
>>>>>> struct untracked *p;
>>>>>> p = (struct untracked *) malloc (MAX (sizeof (struct untracked),
>>>>>>                                      (offsetof (struct untracked, array[0])
>>>>>>                                       + (index) * sizeof (int))));
>>>>>> p->size = index;
>>>>>> return p;
>>>>>> }
>>>>>> 
>>>>>> int main()
>>>>>> {
>>>>>> a = alloc_buf(10);
>>>>>>   printf ("same_type is %d\n",
>>>>>> (__builtin_types_compatible_p(typeof (a->array), typeof (&(a->array)[0]))));
>>>>>> return 0;
>>>>>> }
>>>>>> 
>>>>>> 
>>>>>> /home/opc/Install/latest-d/bin/gcc -O2 btcp.c
>>>>>> same_type is 1
>>>>>> 
>>>>>> Looks like that the “typeof” operator need to be handled specially in C FE
>>>>>> for the new internal function .ACCESS_WITH_SIZE. 
>>>>>> 
>>>>>> (I have specially handle the operator “offsetof” in C FE already).
>>>>>> 
>>>>>> Will fix this issue.
>>>>>> 
>>>>>> Thanks.
>>>>>> 
>>>>>> Qing
>>>>>> 
>>>>>>> On Jan 24, 2024, at 7:51 PM, Kees Cook <keescook@chromium.org> wrote:
>>>>>>> 
>>>>>>> On Wed, Jan 24, 2024 at 12:29:51AM +0000, Qing Zhao wrote:
>>>>>>>> This is the 4th version of the patch.
>>>>>>> 
>>>>>>> Thanks very much for this!
>>>>>>> 
>>>>>>> I tripped over an unexpected behavioral change that the Linux kernel
>>>>>>> depends on:
>>>>>>> 
>>>>>>> __builtin_types_compatible_p() no longer treats an array marked with
>>>>>>> counted_by as different from that array's decayed pointer. Specifically,
>>>>>>> the kernel uses these macros:
>>>>>>> 
>>>>>>> 
>>>>>>> /*
>>>>>>> * Force a compilation error if condition is true, but also produce a
>>>>>>> * result (of value 0 and type int), so the expression can be used
>>>>>>> * e.g. in a structure initializer (or where-ever else comma expressions
>>>>>>> * aren't permitted).
>>>>>>> */
>>>>>>> #define BUILD_BUG_ON_ZERO(e) ((int)(sizeof(struct { int:(-!!(e)); })))
>>>>>>> 
>>>>>>> #define __same_type(a, b) __builtin_types_compatible_p(typeof(a), typeof(b))
>>>>>>> 
>>>>>>> /* &a[0] degrades to a pointer: a different type from an array */
>>>>>>> #define __must_be_array(a)   BUILD_BUG_ON_ZERO(__same_type((a), &(a)[0]))
>>>>>>> 
>>>>>>> 
>>>>>>> This gets used in various places to make sure we're dealing with an
>>>>>>> array for a macro:
>>>>>>> 
>>>>>>> #define ARRAY_SIZE(arr) (sizeof(arr) / sizeof((arr)[0]) + __must_be_array(arr))
>>>>>>> 
>>>>>>> 
>>>>>>> So this builds:
>>>>>>> 
>>>>>>> struct untracked {
>>>>>>>     int size;
>>>>>>>     int array[];
>>>>>>> } *a;
>>>>>>> 
>>>>>>> __must_be_array(a->array)
>>>>>>> => 0 (as expected)
>>>>>>> __builtin_types_compatible_p(typeof(a->array), typeof(&(a->array)[0]))
>>>>>>> => 0 (as expected, array vs decayed array pointer)
>>>>>>> 
>>>>>>> 
>>>>>>> But if counted_by is added, we get a build failure:
>>>>>>> 
>>>>>>> struct tracked {
>>>>>>>     int size;
>>>>>>>     int array[] __counted_by(size);
>>>>>>> } *b;
>>>>>>> 
>>>>>>> __must_be_array(b->array)
>>>>>>> => build failure (not expected)
>>>>>>> __builtin_types_compatible_p(typeof(b->array), typeof(&(b->array)[0]))
>>>>>>> => 1 (not expected, both pointers?)
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> -- 
>>>>>>> Kees Cook
>> 
>> 
> 
> -- 
> Univ.-Prof. Dr. rer. nat. Martin Uecker
> Graz University of Technology
> Institute of Biomedical Imaging



^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v4 0/4]New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)
  2024-01-29 16:00     ` Qing Zhao
@ 2024-01-29 17:25       ` Kees Cook
  2024-01-29 19:32         ` Qing Zhao
  0 siblings, 1 reply; 23+ messages in thread
From: Kees Cook @ 2024-01-29 17:25 UTC (permalink / raw)
  To: Qing Zhao
  Cc: Martin Uecker, josmyers, Richard Biener, jakub, siddhesh,
	isanbard, gcc-patches

On Mon, Jan 29, 2024 at 04:00:20PM +0000, Qing Zhao wrote:
> An update on the kernel building with my version 4 patch.
> 
> Kees reported two FE issues with the current version 4 patch:
> 
> 1. The operator “typeof” cannot return correct type for a->array;
> 2. The operator “&” cannot return correct address for a->array;
> 
> I fixed both in my local repository. 
> 
> With these additional fix.  Kernel with counted-by annotation can be built successfully. 

Thanks for the fixes!

> 
> And then, Kees reported one behavioral issue with the current counted-by:
> 
> When the counted-by value is below zero, my current patch 
> 
> A. Didn’t report any warning for it.
> B. Accepted the negative value as a wrapped size.
> 
> i.e. for:
> 
> struct foo {
> signed char size;
> unsigned char array[] __counted_by(size);
> } *a;
> 
> ...
> a->size = -3;
> report(__builtin_dynamic_object_size(p->array, 1));
> 
> this reports 253, rather than 0.
> 
> And the array-bounds sanitizer doesn’t catch negative index bounds neither. 
> 
> a->size = -3;
> report(a->array[1]); // does not trap
> 
> 
> So, my questions are:
> 
>  How should we handle the negative counted-by value?

Treat it as always 0-bounded: count < 0 ? 0 : count

> 
>  My approach is:
> 
>    I think that this is a user error, the compiler need to Issue warning during runtime about this user error.
> 
> Since I have one remaining patch that has not been finished yet:
> 
> 6  Emit warnings when the user breaks the requirments for the new counted_by attribute
>   compilation time: -Wcounted-by
>   run time: -fsanitizer=counted-by
>      * The initialization to the size field should be done before the first reference to the FAM field.

I would hope that regular compile-time warnings would catch this.

>      * the array has at least # of elements specified by the size field all the time during the program.
>      * the value of counted-by should not be negative.

This seems reasonable for a very strict program, but it won't work for
the kernel as-is: a negative "count" is sometimes used to carry failure
details back to other users of the structure. This could be refactored in
the kernel, but I'd prefer that even without -fsanitizer=counted-by the
runtime behaviors will be "safe".

It does not seem sensible to me that adding a buffer size validation
primitive to GCC will result in conditions where a size calculation
will wrap around. I prefer no surprises. :)

> Let me know your comment and suggestions.

Clang has implemented the safety logic I'd prefer:

* __bdos will report 0 for any sizing where the "counted_by" count
  variable is negative. Effectively, the count variable is always
  processed as: count < 0 ? 0 : count

  struct foo {
	int count;
	short array[] __counted_by(count);
  } *p;

  __bdos(p->array, 1) ==> sizeof(*p->array) * (count < 0 ? 0 : count)

  The logic for this is that __bdos can be _certain_ that the size is 0
  when the count variable is pathological.

* -fsanitize=array-bounds similarly treats count as above, so that:

  printf("%d\n", p->array[index]); ==> trap when index > (count < 0 ? 0 : count)

  Same logic for the sanitizer: any access to the array when count is
  invalid means the access is invalid and must be trapped.


This means that software can run safely even in pathological conditions.

-Kees

-- 
Kees Cook

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v4 0/4]New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)
  2024-01-29 17:25       ` Kees Cook
@ 2024-01-29 19:32         ` Qing Zhao
  2024-01-29 20:19           ` Kees Cook
  0 siblings, 1 reply; 23+ messages in thread
From: Qing Zhao @ 2024-01-29 19:32 UTC (permalink / raw)
  To: Kees Cook
  Cc: Martin Uecker, josmyers, Richard Biener, jakub, siddhesh,
	isanbard, gcc-patches



> On Jan 29, 2024, at 12:25 PM, Kees Cook <keescook@chromium.org> wrote:
> 
> On Mon, Jan 29, 2024 at 04:00:20PM +0000, Qing Zhao wrote:
>> An update on the kernel building with my version 4 patch.
>> 
>> Kees reported two FE issues with the current version 4 patch:
>> 
>> 1. The operator “typeof” cannot return correct type for a->array;
>> 2. The operator “&” cannot return correct address for a->array;
>> 
>> I fixed both in my local repository. 
>> 
>> With these additional fix.  Kernel with counted-by annotation can be built successfully. 
> 
> Thanks for the fixes!
> 
>> 
>> And then, Kees reported one behavioral issue with the current counted-by:
>> 
>> When the counted-by value is below zero, my current patch 
>> 
>> A. Didn’t report any warning for it.
>> B. Accepted the negative value as a wrapped size.
>> 
>> i.e. for:
>> 
>> struct foo {
>> signed char size;
>> unsigned char array[] __counted_by(size);
>> } *a;
>> 
>> ...
>> a->size = -3;
>> report(__builtin_dynamic_object_size(p->array, 1));
>> 
>> this reports 253, rather than 0.
>> 
>> And the array-bounds sanitizer doesn’t catch negative index bounds neither. 
>> 
>> a->size = -3;
>> report(a->array[1]); // does not trap
>> 
>> 
>> So, my questions are:
>> 
>> How should we handle the negative counted-by value?
> 
> Treat it as always 0-bounded: count < 0 ? 0 : count

Then the size of the object is 0?

> 
>> 
>> My approach is:
>> 
>>   I think that this is a user error, the compiler need to Issue warning during runtime about this user error.
>> 
>> Since I have one remaining patch that has not been finished yet:
>> 
>> 6  Emit warnings when the user breaks the requirments for the new counted_by attribute
>>  compilation time: -Wcounted-by
>>  run time: -fsanitizer=counted-by
>>     * The initialization to the size field should be done before the first reference to the FAM field.
> 
> I would hope that regular compile-time warnings would catch this.
If the value is known at compile-time, then compile-time should catch it.

> 
>>     * the array has at least # of elements specified by the size field all the time during the program.
>>     * the value of counted-by should not be negative.
> 
> This seems reasonable for a very strict program, but it won't work for
> the kernel as-is: a negative "count" is sometimes used to carry failure
> details back to other users of the structure. This could be refactored in
> the kernel, but I'd prefer that even without -fsanitizer=counted-by the
> runtime behaviors will be "safe".

So, In the kernel’s source code, for example:

struct foo {
  int count;
  short array[] __counted_by(count);
};

The field “count” will be used for two purposes:
A. As the counted_by for the “array” when its value > 0;
B. As an errno when its value < 0;  under such condition, the size of “array” is zero. 

Is the understanding correct?

Is doing this for saving space?  (Curious -:)


> 
> It does not seem sensible to me that adding a buffer size validation
> primitive to GCC will result in conditions where a size calculation
> will wrap around. I prefer no surprises. :)

Might be a bug here. I guess. 
> 
>> Let me know your comment and suggestions.
> 
> Clang has implemented the safety logic I'd prefer:
> 
> * __bdos will report 0 for any sizing where the "counted_by" count
>  variable is negative. Effectively, the count variable is always
>  processed as: count < 0 ? 0 : count
> 
>  struct foo {
> int count;
> short array[] __counted_by(count);
>  } *p;
> 
>  __bdos(p->array, 1) ==> sizeof(*p->array) * (count < 0 ? 0 : count)

NOTE,  __bdo will use value 0 as UNKNOWN_SIZE for MINMUM SIZE query, i.e:

size_t __builtin_object_size (const void * ptr, int type)

Will return 0 as UNKNOW_SIZE when type= 2 or 3.

So, I am wondering: should  the 0 here is  UNKNOWN_SIZE or 0 size?

I guess should be the UNKNOWN_SIZE?  (I,e, -1 for MAXIMUM type,  0 for MINIMUM type).

i.e, when the value of “count” is 0 or negative,  the __bdos will return UNKNOWN_SIZE.  Is this correct?

> 
>  The logic for this is that __bdos can be _certain_ that the size is 0
>  when the count variable is pathological.


> 
> * -fsanitize=array-bounds similarly treats count as above, so that:
> 
>  printf("%d\n", p->array[index]); ==> trap when index > (count < 0 ? 0 : count)
> 
>  Same logic for the sanitizer: any access to the array when count is
>  invalid means the access is invalid and must be trapped.

Okay, when the value of “count” is 0 or negative, bound sanitizer will report out-of-bound (or trap) for any access to the array. 
This should be reasonable.

Qing


> 
> 
> This means that software can run safely even in pathological conditions.
> 
> -Kees
> 
> -- 
> Kees Cook



^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v4 0/4]New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)
  2024-01-29 19:32         ` Qing Zhao
@ 2024-01-29 20:19           ` Kees Cook
  2024-01-29 22:45             ` Qing Zhao
  0 siblings, 1 reply; 23+ messages in thread
From: Kees Cook @ 2024-01-29 20:19 UTC (permalink / raw)
  To: Qing Zhao
  Cc: Martin Uecker, josmyers, Richard Biener, jakub, siddhesh,
	isanbard, gcc-patches

On Mon, Jan 29, 2024 at 07:32:06PM +0000, Qing Zhao wrote:
> 
> 
> > On Jan 29, 2024, at 12:25 PM, Kees Cook <keescook@chromium.org> wrote:
> > 
> > On Mon, Jan 29, 2024 at 04:00:20PM +0000, Qing Zhao wrote:
> >> An update on the kernel building with my version 4 patch.
> >> 
> >> Kees reported two FE issues with the current version 4 patch:
> >> 
> >> 1. The operator “typeof” cannot return correct type for a->array;
> >> 2. The operator “&” cannot return correct address for a->array;
> >> 
> >> I fixed both in my local repository. 
> >> 
> >> With these additional fix.  Kernel with counted-by annotation can be built successfully. 
> > 
> > Thanks for the fixes!
> > 
> >> 
> >> And then, Kees reported one behavioral issue with the current counted-by:
> >> 
> >> When the counted-by value is below zero, my current patch 
> >> 
> >> A. Didn’t report any warning for it.
> >> B. Accepted the negative value as a wrapped size.
> >> 
> >> i.e. for:
> >> 
> >> struct foo {
> >> signed char size;
> >> unsigned char array[] __counted_by(size);
> >> } *a;
> >> 
> >> ...
> >> a->size = -3;
> >> report(__builtin_dynamic_object_size(p->array, 1));
> >> 
> >> this reports 253, rather than 0.
> >> 
> >> And the array-bounds sanitizer doesn’t catch negative index bounds neither. 
> >> 
> >> a->size = -3;
> >> report(a->array[1]); // does not trap
> >> 
> >> 
> >> So, my questions are:
> >> 
> >> How should we handle the negative counted-by value?
> > 
> > Treat it as always 0-bounded: count < 0 ? 0 : count
> 
> Then the size of the object is 0?

That would be the purpose, yes. It's possible something else has
happened, but it would mean "the array contents should not be accessed
(since we don't have a valid size)".

> 
> > 
> >> 
> >> My approach is:
> >> 
> >>   I think that this is a user error, the compiler need to Issue warning during runtime about this user error.
> >> 
> >> Since I have one remaining patch that has not been finished yet:
> >> 
> >> 6  Emit warnings when the user breaks the requirments for the new counted_by attribute
> >>  compilation time: -Wcounted-by
> >>  run time: -fsanitizer=counted-by
> >>     * The initialization to the size field should be done before the first reference to the FAM field.
> > 
> > I would hope that regular compile-time warnings would catch this.
> If the value is known at compile-time, then compile-time should catch it.
> 
> > 
> >>     * the array has at least # of elements specified by the size field all the time during the program.
> >>     * the value of counted-by should not be negative.
> > 
> > This seems reasonable for a very strict program, but it won't work for
> > the kernel as-is: a negative "count" is sometimes used to carry failure
> > details back to other users of the structure. This could be refactored in
> > the kernel, but I'd prefer that even without -fsanitizer=counted-by the
> > runtime behaviors will be "safe".
> 
> So, In the kernel’s source code, for example:
> 
> struct foo {
>   int count;
>   short array[] __counted_by(count);
> };
> 
> The field “count” will be used for two purposes:
> A. As the counted_by for the “array” when its value > 0;
> B. As an errno when its value < 0;  under such condition, the size of “array” is zero. 
> 
> Is the understanding correct?

Yes.

> Is doing this for saving space?  (Curious -:)

It seems so, yes.

> > It does not seem sensible to me that adding a buffer size validation
> > primitive to GCC will result in conditions where a size calculation
> > will wrap around. I prefer no surprises. :)
> 
> Might be a bug here. I guess. 
> > 
> >> Let me know your comment and suggestions.
> > 
> > Clang has implemented the safety logic I'd prefer:
> > 
> > * __bdos will report 0 for any sizing where the "counted_by" count
> >  variable is negative. Effectively, the count variable is always
> >  processed as: count < 0 ? 0 : count
> > 
> >  struct foo {
> > int count;
> > short array[] __counted_by(count);
> >  } *p;
> > 
> >  __bdos(p->array, 1) ==> sizeof(*p->array) * (count < 0 ? 0 : count)
> 
> NOTE,  __bdo will use value 0 as UNKNOWN_SIZE for MINMUM SIZE query, i.e:
> 
> size_t __builtin_object_size (const void * ptr, int type)
> 
> Will return 0 as UNKNOW_SIZE when type= 2 or 3.
> 
> So, I am wondering: should  the 0 here is  UNKNOWN_SIZE or 0 size?
> 
> I guess should be the UNKNOWN_SIZE?  (I,e, -1 for MAXIMUM type,  0 for MINIMUM type).
> 
> i.e, when the value of “count” is 0 or negative,  the __bdos will return UNKNOWN_SIZE.  Is this correct?

I would suggest that a negative count should always return 0. The size
isn't "unknown", the "count" has been clamped to 0 to avoid surprises,
so the result is as if the "count" had a zero value.

> Okay, when the value of “count” is 0 or negative, bound sanitizer will report out-of-bound (or trap) for any access to the array. 
> This should be reasonable.

Thanks! And with __bdos() following this logic there won't be a disconnect
between the two. i.e. FORTIFY-style things like memcpy use __bdos for
validating the array size, and direct index walking uses the bounds
sanitizer. These are effectively the same thing, so they need to agree.

i.e. these are the same thing:

	memcpy(p->array, src, bytes_to_copy);

and

	for (i = 0; i < elements_to_copy; i++)
		p->array[i] = src++

so the __bdos() used by the fortified memcpy() needs to agree with the
logic that the bounds sanitizer uses for the for loop's accesses.

-- 
Kees Cook

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v4 0/4]New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)
  2024-01-29 15:09           ` Qing Zhao
  2024-01-29 15:50             ` Martin Uecker
@ 2024-01-29 20:35             ` Joseph Myers
  2024-01-29 22:20               ` Qing Zhao
  1 sibling, 1 reply; 23+ messages in thread
From: Joseph Myers @ 2024-01-29 20:35 UTC (permalink / raw)
  To: Qing Zhao
  Cc: Martin Uecker, Richard Biener, Kees Cook, jakub, siddhesh,
	isanbard, gcc-patches

On Mon, 29 Jan 2024, Qing Zhao wrote:

> Thank you!
> 
> Joseph and Richard,  could you also comment on this?

I think Martin's suggestions are reasonable.

-- 
Joseph S. Myers
josmyers@redhat.com


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v4 0/4]New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)
  2024-01-29 20:35             ` Joseph Myers
@ 2024-01-29 22:20               ` Qing Zhao
  0 siblings, 0 replies; 23+ messages in thread
From: Qing Zhao @ 2024-01-29 22:20 UTC (permalink / raw)
  To: Joseph Myers
  Cc: Martin Uecker, Richard Biener, Kees Cook, jakub, siddhesh,
	isanbard, gcc-patches



> On Jan 29, 2024, at 3:35 PM, Joseph Myers <josmyers@redhat.com> wrote:
> 
> On Mon, 29 Jan 2024, Qing Zhao wrote:
> 
>> Thank you!
>> 
>> Joseph and Richard,  could you also comment on this?
> 
> I think Martin's suggestions are reasonable.

Okay, I will update the patches based on this approach. 

Thanks a lot for the comment.

Qing
> 
> -- 
> Joseph S. Myers
> josmyers@redhat.com
> 


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v4 0/4]New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)
  2024-01-29 20:19           ` Kees Cook
@ 2024-01-29 22:45             ` Qing Zhao
  2024-01-30  5:41               ` Kees Cook
  0 siblings, 1 reply; 23+ messages in thread
From: Qing Zhao @ 2024-01-29 22:45 UTC (permalink / raw)
  To: Kees Cook
  Cc: Martin Uecker, josmyers, Richard Biener, jakub, siddhesh,
	isanbard, gcc-patches



> On Jan 29, 2024, at 3:19 PM, Kees Cook <keescook@chromium.org> wrote:
> 
> On Mon, Jan 29, 2024 at 07:32:06PM +0000, Qing Zhao wrote:
>> 
>> 
>>> On Jan 29, 2024, at 12:25 PM, Kees Cook <keescook@chromium.org> wrote:
>>> 
>>> On Mon, Jan 29, 2024 at 04:00:20PM +0000, Qing Zhao wrote:
>>>> An update on the kernel building with my version 4 patch.
>>>> 
>>>> Kees reported two FE issues with the current version 4 patch:
>>>> 
>>>> 1. The operator “typeof” cannot return correct type for a->array;
>>>> 2. The operator “&” cannot return correct address for a->array;
>>>> 
>>>> I fixed both in my local repository. 
>>>> 
>>>> With these additional fix.  Kernel with counted-by annotation can be built successfully. 
>>> 
>>> Thanks for the fixes!
>>> 
>>>> 
>>>> And then, Kees reported one behavioral issue with the current counted-by:
>>>> 
>>>> When the counted-by value is below zero, my current patch 
>>>> 
>>>> A. Didn’t report any warning for it.
>>>> B. Accepted the negative value as a wrapped size.
>>>> 
>>>> i.e. for:
>>>> 
>>>> struct foo {
>>>> signed char size;
>>>> unsigned char array[] __counted_by(size);
>>>> } *a;
>>>> 
>>>> ...
>>>> a->size = -3;
>>>> report(__builtin_dynamic_object_size(p->array, 1));
>>>> 
>>>> this reports 253, rather than 0.
>>>> 
>>>> And the array-bounds sanitizer doesn’t catch negative index bounds neither. 
>>>> 
>>>> a->size = -3;
>>>> report(a->array[1]); // does not trap
>>>> 
>>>> 
>>>> So, my questions are:
>>>> 
>>>> How should we handle the negative counted-by value?
>>> 
>>> Treat it as always 0-bounded: count < 0 ? 0 : count
>> 
>> Then the size of the object is 0?
> 
> That would be the purpose, yes. It's possible something else has
> happened, but it would mean "the array contents should not be accessed
> (since we don't have a valid size)".

This might be a new concept we need to add, from my understanding,
 C/C++ don’t have the zero-sized object. 
So, I am a little worried about where should we add this concept?

The most reasonable place I am thinking is adding such concept to the 
doc of “counted-by” attribute, but still not very sure on this.
> 
>> 
>>> 
>>>> 
>>>> My approach is:
>>>> 
>>>>  I think that this is a user error, the compiler need to Issue warning during runtime about this user error.
>>>> 
>>>> Since I have one remaining patch that has not been finished yet:
>>>> 
>>>> 6  Emit warnings when the user breaks the requirments for the new counted_by attribute
>>>> compilation time: -Wcounted-by
>>>> run time: -fsanitizer=counted-by
>>>>    * The initialization to the size field should be done before the first reference to the FAM field.
>>> 
>>> I would hope that regular compile-time warnings would catch this.
>> If the value is known at compile-time, then compile-time should catch it.
>> 
>>> 
>>>>    * the array has at least # of elements specified by the size field all the time during the program.
>>>>    * the value of counted-by should not be negative.
>>> 
>>> This seems reasonable for a very strict program, but it won't work for
>>> the kernel as-is: a negative "count" is sometimes used to carry failure
>>> details back to other users of the structure. This could be refactored in
>>> the kernel, but I'd prefer that even without -fsanitizer=counted-by the
>>> runtime behaviors will be "safe".
>> 
>> So, In the kernel’s source code, for example:
>> 
>> struct foo {
>>  int count;
>>  short array[] __counted_by(count);
>> };
>> 
>> The field “count” will be used for two purposes:
>> A. As the counted_by for the “array” when its value > 0;
>> B. As an errno when its value < 0;  under such condition, the size of “array” is zero. 
>> 
>> Is the understanding correct?
> 
> Yes.
> 
>> Is doing this for saving space?  (Curious -:)
> 
> It seems so, yes.
> 
>>> It does not seem sensible to me that adding a buffer size validation
>>> primitive to GCC will result in conditions where a size calculation
>>> will wrap around. I prefer no surprises. :)
>> 
>> Might be a bug here. I guess. 
>>> 
>>>> Let me know your comment and suggestions.
>>> 
>>> Clang has implemented the safety logic I'd prefer:
>>> 
>>> * __bdos will report 0 for any sizing where the "counted_by" count
>>> variable is negative. Effectively, the count variable is always
>>> processed as: count < 0 ? 0 : count
>>> 
>>> struct foo {
>>> int count;
>>> short array[] __counted_by(count);
>>> } *p;
>>> 
>>> __bdos(p->array, 1) ==> sizeof(*p->array) * (count < 0 ? 0 : count)
>> 
>> NOTE,  __bdo will use value 0 as UNKNOWN_SIZE for MINMUM SIZE query, i.e:
>> 
>> size_t __builtin_object_size (const void * ptr, int type)
>> 
>> Will return 0 as UNKNOW_SIZE when type= 2 or 3.
>> 
>> So, I am wondering: should  the 0 here is  UNKNOWN_SIZE or 0 size?
>> 
>> I guess should be the UNKNOWN_SIZE?  (I,e, -1 for MAXIMUM type,  0 for MINIMUM type).
>> 
>> i.e, when the value of “count” is 0 or negative,  the __bdos will return UNKNOWN_SIZE.  Is this correct?
> 
> I would suggest that a negative count should always return 0. The size
> isn't "unknown", the "count" has been clamped to 0 to avoid surprises,
> so the result is as if the "count" had a zero value.

There are two things here. 

1. The value of the “counted-by” is 0; (which is easy to be understood)
2. The result of the _builtin_object_size when see a “counted-by” 0.

For 1, it’s simple, if we see a counted-by value <= 0,  then counted-by is 0;

But for 2, when the _builtin_object_size sees a “counted-by” 0, what’s value it will return for the object size?

 Can we return 0 for the object size? 
(As I mentioned in the previous email, 0 in __builtin_object_size doesn’t mean size 0,
 it means UNKNOWN_SIZE when the type is 2/3, So, what’s value we should return for the size 0?)
https://gcc.gnu.org/onlinedocs/gcc/Object-Size-Checking.html

Hope I am clear this time. -:)

thanks.

Qing
> 
>> Okay, when the value of “count” is 0 or negative, bound sanitizer will report out-of-bound (or trap) for any access to the array. 
>> This should be reasonable.
> 
> Thanks! And with __bdos() following this logic there won't be a disconnect
> between the two. i.e. FORTIFY-style things like memcpy use __bdos for
> validating the array size, and direct index walking uses the bounds
> sanitizer. These are effectively the same thing, so they need to agree.
> 
> i.e. these are the same thing:
> 
> memcpy(p->array, src, bytes_to_copy);
> 
> and
> 
> for (i = 0; i < elements_to_copy; i++)
> p->array[i] = src++
> 
> so the __bdos() used by the fortified memcpy() needs to agree with the
> logic that the bounds sanitizer uses for the for loop's accesses.
> 
> -- 
> Kees Cook



^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v4 0/4]New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)
  2024-01-29 22:45             ` Qing Zhao
@ 2024-01-30  5:41               ` Kees Cook
  2024-01-30 15:43                 ` Qing Zhao
  0 siblings, 1 reply; 23+ messages in thread
From: Kees Cook @ 2024-01-30  5:41 UTC (permalink / raw)
  To: Qing Zhao
  Cc: Martin Uecker, josmyers, Richard Biener, jakub, siddhesh,
	isanbard, gcc-patches

On Mon, Jan 29, 2024 at 10:45:23PM +0000, Qing Zhao wrote:
> There are two things here. 
> 
> 1. The value of the “counted-by” is 0; (which is easy to be understood)
> 2. The result of the _builtin_object_size when see a “counted-by” 0.
> 
> For 1, it’s simple, if we see a counted-by value <= 0,  then counted-by is 0;

Okay, that's good; this matches my understanding. :)

> But for 2, when the _builtin_object_size sees a “counted-by” 0, what’s value it will return for the object size?
> 
>  Can we return 0 for the object size? 

I don't see why not. For example:

// -O2 -fstrict-flex-arrays=3
struct s {
    int a;
    int b[4];
} foo;

#define report(x)   printf("%s: %zu\n", #x, (size_t)(x))

int main(int argc, char *argv[])
{
    struct s foo;
    report(__builtin_dynamic_object_size(&foo.b[4], 0));
    report(__builtin_dynamic_object_size(&foo.b[5], 0));
    report(__builtin_dynamic_object_size(&foo.b[-10], 0));
    report(__builtin_dynamic_object_size(&foo.b[4], 1));
    report(__builtin_dynamic_object_size(&foo.b[5], 1));
    report(__builtin_dynamic_object_size(&foo.b[-10], 1));
    report(__builtin_dynamic_object_size(&foo.b[4], 2));
    report(__builtin_dynamic_object_size(&foo.b[5], 2));
    report(__builtin_dynamic_object_size(&foo.b[-10], 2));
    report(__builtin_dynamic_object_size(&foo.b[4], 3));
    report(__builtin_dynamic_object_size(&foo.b[5], 3));
    report(__builtin_dynamic_object_size(&foo.b[-10], 3));
    return 0;
}

shows:

__builtin_dynamic_object_size(&foo.b[4], 0): 0
__builtin_dynamic_object_size(&foo.b[5], 0): 0
__builtin_dynamic_object_size(&foo.b[-10], 0): 0
__builtin_dynamic_object_size(&foo.b[4], 1): 0
__builtin_dynamic_object_size(&foo.b[5], 1): 0
__builtin_dynamic_object_size(&foo.b[-10], 1): 0
__builtin_dynamic_object_size(&foo.b[4], 2): 0
__builtin_dynamic_object_size(&foo.b[5], 2): 0
__builtin_dynamic_object_size(&foo.b[-10], 2): 0
__builtin_dynamic_object_size(&foo.b[4], 3): 0
__builtin_dynamic_object_size(&foo.b[5], 3): 0
__builtin_dynamic_object_size(&foo.b[-10], 3): 0

This is showing "no bytes left" for the end of the b array, and if this
index keeps going, it still reports 0 if we're past the end of the object
completely. And it is similarly capped for negative indexes. This is
true for all the __bos type bits.

A "counted-by" of 0 (or below) would have the same meaning as an out of
bounds index here.

> (As I mentioned in the previous email, 0 in __builtin_object_size doesn’t mean size 0,
>  it means UNKNOWN_SIZE when the type is 2/3, So, what’s value we should return for the size 0?)
> https://gcc.gnu.org/onlinedocs/gcc/Object-Size-Checking.html

I think I see what you mean, but I still think it should be 0 for 2/3,
regardless of the documented interpretation. If that's the current
response for a pathological index under 2/3, then I think it's totally
reasonable that it should do the same for pathological bounds.


And BTW, it seems there are 0-sized objects, though maybe they're some
kind of special case:

struct s {
    int a;
    struct { } nothing;
    int b;
};

#define report(x)   printf("%s: %zu\n", #x, (size_t)(x))

int main(int argc, char *argv[])
{
    struct s foo;
    report(__builtin_dynamic_object_size(&foo.nothing, 1));
}

shows:

__builtin_dynamic_object_size(&foo.nothing, 1): 0


-Kees

-- 
Kees Cook

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v4 0/4]New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)
  2024-01-30  5:41               ` Kees Cook
@ 2024-01-30 15:43                 ` Qing Zhao
  0 siblings, 0 replies; 23+ messages in thread
From: Qing Zhao @ 2024-01-30 15:43 UTC (permalink / raw)
  To: Kees Cook
  Cc: Martin Uecker, josmyers, Richard Biener, jakub, siddhesh,
	isanbard, gcc-patches



> On Jan 30, 2024, at 12:41 AM, Kees Cook <keescook@chromium.org> wrote:
> 
> On Mon, Jan 29, 2024 at 10:45:23PM +0000, Qing Zhao wrote:
>> There are two things here. 
>> 
>> 1. The value of the “counted-by” is 0; (which is easy to be understood)
>> 2. The result of the _builtin_object_size when see a “counted-by” 0.
>> 
>> For 1, it’s simple, if we see a counted-by value <= 0,  then counted-by is 0;
> 
> Okay, that's good; this matches my understanding. :)
> 
>> But for 2, when the _builtin_object_size sees a “counted-by” 0, what’s value it will return for the object size?
>> 
>> Can we return 0 for the object size? 
> 
> I don't see why not. For example:
> 
> // -O2 -fstrict-flex-arrays=3
> struct s {
>    int a;
>    int b[4];
> } foo;
> 
> #define report(x)   printf("%s: %zu\n", #x, (size_t)(x))
> 
> int main(int argc, char *argv[])
> {
>    struct s foo;
>    report(__builtin_dynamic_object_size(&foo.b[4], 0));
>    report(__builtin_dynamic_object_size(&foo.b[5], 0));
>    report(__builtin_dynamic_object_size(&foo.b[-10], 0));
>    report(__builtin_dynamic_object_size(&foo.b[4], 1));
>    report(__builtin_dynamic_object_size(&foo.b[5], 1));
>    report(__builtin_dynamic_object_size(&foo.b[-10], 1));
>    report(__builtin_dynamic_object_size(&foo.b[4], 2));
>    report(__builtin_dynamic_object_size(&foo.b[5], 2));
>    report(__builtin_dynamic_object_size(&foo.b[-10], 2));
>    report(__builtin_dynamic_object_size(&foo.b[4], 3));
>    report(__builtin_dynamic_object_size(&foo.b[5], 3));
>    report(__builtin_dynamic_object_size(&foo.b[-10], 3));
>    return 0;
> }
> 
> shows:
> 
> __builtin_dynamic_object_size(&foo.b[4], 0): 0
> __builtin_dynamic_object_size(&foo.b[5], 0): 0
> __builtin_dynamic_object_size(&foo.b[-10], 0): 0
> __builtin_dynamic_object_size(&foo.b[4], 1): 0
> __builtin_dynamic_object_size(&foo.b[5], 1): 0
> __builtin_dynamic_object_size(&foo.b[-10], 1): 0
> __builtin_dynamic_object_size(&foo.b[4], 2): 0
> __builtin_dynamic_object_size(&foo.b[5], 2): 0
> __builtin_dynamic_object_size(&foo.b[-10], 2): 0
> __builtin_dynamic_object_size(&foo.b[4], 3): 0
> __builtin_dynamic_object_size(&foo.b[5], 3): 0
> __builtin_dynamic_object_size(&foo.b[-10], 3): 0
> 
> This is showing "no bytes left" for the end of the b array, and if this
> index keeps going, it still reports 0 if we're past the end of the object
> completely. And it is similarly capped for negative indexes. This is
> true for all the __bos type bits.
> 
> A "counted-by" of 0 (or below) would have the same meaning as an out of
> bounds index here.

Okay. I will keep this behavior when counted-by is zero (and negative) for __bos. 
> 
>> (As I mentioned in the previous email, 0 in __builtin_object_size doesn’t mean size 0,
>> it means UNKNOWN_SIZE when the type is 2/3, So, what’s value we should return for the size 0?)
>> https://gcc.gnu.org/onlinedocs/gcc/Object-Size-Checking.html
> 
> I think I see what you mean, but I still think it should be 0 for 2/3,
> regardless of the documented interpretation. If that's the current
> response for a pathological index under 2/3, then I think it's totally
> reasonable that it should do the same for pathological bounds.

Okay, will keep this behavior for “counted-by” zero. 

(But still feel that 0 for 2/3, i.e the MINIMUM size will represent as UNKNOWN_SIZE.
 If that’s the value kernel expected, that’s good)
> 
> 
> And BTW, it seems there are 0-sized objects, though maybe they're some
> kind of special case:
> 
> struct s {
>    int a;
>    struct { } nothing;
>    int b;
> };
> 
> #define report(x)   printf("%s: %zu\n", #x, (size_t)(x))
> 
> int main(int argc, char *argv[])
> {
>    struct s foo;
>    report(__builtin_dynamic_object_size(&foo.nothing, 1));
> }
> 
> shows:
> 
> __builtin_dynamic_object_size(&foo.nothing, 1): 0

Looks like that GCC has such extension: https://gcc.gnu.org/onlinedocs/gcc/Empty-Structures.html

***GCC permits a C structure to have no members:
struct empty {
};

The structure has size zero. In C++, empty structures are part of the language. G++ treats empty structures as if they had a single member of type char.

Thanks.

Qing


> 
> -Kees
> 
> -- 
> Kees Cook


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v4 0/4]New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)
  2024-01-24  0:29 [PATCH v4 0/4]New attribute "counted_by" to annotate bounds for C99 FAM(PR108896) Qing Zhao
                   ` (4 preceding siblings ...)
  2024-01-25  0:51 ` [PATCH v4 0/4]New attribute "counted_by" to annotate bounds for C99 FAM(PR108896) Kees Cook
@ 2024-01-30 16:04 ` Qing Zhao
  5 siblings, 0 replies; 23+ messages in thread
From: Qing Zhao @ 2024-01-30 16:04 UTC (permalink / raw)
  To: Joseph Myers, Richard Biener, Martin Uecker, Kees Cook
  Cc: isanbard, gcc Patches, jakub, Siddhesh Poyarekar

Okay, Based on the comments so far, I will work on the 5th version of the patch, major changes will include:

1. Change the return type of the routine .ACCESS_WITH_SIZE 
    FROM:
      Pointer to the type of the element of the flexible array;
    TO:
       Pointer to the type of the flexible array;

     And then wrap the call with an indirection reference. 

2. Adjust all other parts with this change, (this will simplify the bound sanitizer instrument);

3. Add the fixes to the kernel building failures, which include:

     A. The operator “typeof” cannot return correct type for a->array; 
	(I guess that the above change 1 might automatically resolve this issue)
     B. The operator “&” cannot return correct address for a->array;

4. Correctly handle the case when the value of “counted-by” is zero or negative as following

    4.1 . Update the counted-by doc with the following:

     When the counted-by field is assigned a negative integer value, the compiler will treat the value as zero. 

    4.2.   (possibly) Adjust __bdos and array bound sanitizer to handle correctly when “counted-by” is zero. 

           __bdos will return size 0 when counted-by is zero;

          Array bound sanitizer will report out-of-bound when the counted-by is zero for any array access. 

Let me know if I missed anything.

Thanks a lot for all the comments

 Qing
    

> On Jan 23, 2024, at 7:29 PM, Qing Zhao <qing.zhao@oracle.com> wrote:
> 
> Hi,
> 
> This is the 4th version of the patch.
> 
> It based on the following proposal:
> 
> https://gcc.gnu.org/pipermail/gcc-patches/2023-November/635884.html
> Represent the missing dependence for the "counted_by" attribute and its consumers
> 
> ******The summary of the proposal is:
> 
> * Add a new internal function ".ACCESS_WITH_SIZE" to carry the size information for every reference to a FAM field;
> * In C FE, Replace every reference to a FAM field whose TYPE has the "counted_by" attribute with the new internal function ".ACCESS_WITH_SIZE";
> * In every consumer of the size information, for example, BDOS or array bound sanitizer, query the size information or ACCESS_MODE information from the new internal function;
> * When expansing to RTL, replace the internal function with the actual reference to the FAM field;
> * Some adjustment to ipa alias analysis, and other SSA passes to mitigate the impact to the optimizer and code generation.
> 
> 
> ******The new internal function
> 
>  .ACCESS_WITH_SIZE (REF_TO_OBJ, REF_TO_SIZE, CLASS_OF_SIZE, SIZE_OF_SIZE, ACCESS_MODE, INDEX)
> 
> INTERNAL_FN (ACCESS_WITH_SIZE, ECF_LEAF | ECF_NOTHROW, NULL)
> 
> which returns the "REF_TO_OBJ" same as the 1st argument;
> 
> Both the return type and the type of the first argument of this function have been converted from the incomplete array type to the corresponding pointer type.
> 
> Please see the following link for why:
> https://gcc.gnu.org/pipermail/gcc-patches/2023-November/638793.html
> https://gcc.gnu.org/pipermail/gcc-patches/2023-December/639605.html
> 
> 1st argument "REF_TO_OBJ": The reference to the object;
> 2nd argument "REF_TO_SIZE": The reference to the size of the object,
> 3rd argument "CLASS_OF_SIZE": The size referenced by the REF_TO_SIZE represents
>   0: unknown;
>   1: the number of the elements of the object type;
>   2: the number of bytes;
> 4th argument "PRECISION_OF_SIZE": The precision of the integer that REF_TO_SIZE points;
> 5th argument "ACCESS_MODE":
>  -1: Unknown access semantics
>   0: none
>   1: read_only
>   2: write_only
>   3: read_write
> 6th argument "INDEX": the INDEX for the original array reference.
>  -1: Unknown
> 
> NOTE: The 6th Argument is added for bound sanitizer instrumentation.
> 
> ****** The Patch sets included:
> 
> 1. Provide counted_by attribute to flexible array member field;
>      which includes:
>      * "counted_by" attribute documentation;
>      * C FE handling of the new attribute;
>        syntax checking, error reporting;
>      * testing cases;
> 
> 2. Convert "counted_by" attribute to/from .ACCESS_WITH_SIZE.
>      which includes:
>      * The definition of the new internal function .ACCESS_WITH_SIZE in internal-fn.def.
>      * C FE converts every reference to a FAM with "counted_by" attribute to a call to the internal function .ACCESS_WITH_SIZE.
>        (build_component_ref in c_typeck.cc)
>        This includes the case when the object is statically allocated and initialized.
>        In order to make this working, we should update initializer_constant_valid_p_1 and output_constant in varasm.cc to include calls to .ACCESS_WITH_SIZE.
> 
>        However, for the reference inside "offsetof", ignore the "counted_by" attribute since it's not useful at all. (c_parser_postfix_expression in c/c-parser.cc)
> 
>      * Convert every call to .ACCESS_WITH_SIZE to its first argument.
>        (expand_ACCESS_WITH_SIZE in internal-fn.cc)
>      * adjust alias analysis to exclude the new internal from clobbering anything.
>        (ref_maybe_used_by_call_p_1 and call_may_clobber_ref_p_1 in tree-ssa-alias.cc)
>      * adjust dead code elimination to eliminate the call to .ACCESS_WITH_SIZE when
>        it's LHS is eliminated as dead code.
>        (eliminate_unnecessary_stmts in tree-ssa-dce.cc)
>      * Provide the utility routines to check the call is .ACCESS_WITH_SIZE and
>        get the reference from the call to .ACCESS_WITH_SIZE.
>        (is_access_with_size_p and get_ref_from_access_with_size in tree.cc)
>      * testing cases. (for offsetof, static initialization, generation of calls to
>        .ACCESS_WITH_SIZE, code runs correctly after calls to .ACCESS_WITH_SIZE are
>        converted back)
> 
> 3. Use the .ACCESS_WITH_SIZE in builtin object size (sub-object only)
>      which includes:
>      * use the size info of the .ACCESS_WITH_SIZE for sub-object.
>      * testing cases. 
> 
> 4 Use the .ACCESS_WITH_SIZE in bound sanitizer
>      Since the result type of the call to .ACCESS_WITH_SIZE is a pointer to
>        the element type. The original array_ref is converted to an indirect_ref,
>        which introduced two issues for the instrumenation of bound sanitizer:
> 
>        A. The index for the original array_ref was mixed into the offset
>        expression for the new indirect_ref.
> 
>        In order to make the instrumentation for the bound sanitizer easier, one
>        more argument for the function .ACCESS_WITH_SIZE is added to record this
>        original index for the array_ref. then later during bound instrumentation,
>        get the index from the additional argument from the call to the function
>        .ACCESS_WITH_SIZE.
> 
>        B. In the current bound sanitizer, no instrumentation will be inserted for
>        an indirect_ref.
> 
>        In order to add instrumentation for an indirect_ref with a call to
>        .ACCESS_WITH_SIZE, we should specially handle the indirect_ref with a
>        call to .ACCESS_WITH_SIZE, and get the index and bound info from the
>        arguments of the call.
> 
>      which includes:
>      * Record the index to the 6th argument of the call to .ACCESS_WITH_SIZE.
>      * Instrument indirect_ref with call to .ACCESS_WITH_SIZE for bound sanitizer.
>      * testing cases. (additional testing cases for dynamic array support)
> 
> ******Remaining works: 
> 
> 5  Improve __bdos to use the counted_by info in whole-object size for the structure with FAM.
> 6  Emit warnings when the user breaks the requirments for the new counted_by attribute
>   compilation time: -Wcounted-by
>   run time: -fsanitizer=counted-by
>      * The initialization to the size field should be done before the first reference to the FAM field.
>      * the array has at least # of elements specified by the size field all the time during the program.
> 
> I have bootstrapped and regression tested on both x86 and aarch64, no issue.
> 
> Let me know your comments.
> 
> thanks.
> 
> Qing


^ permalink raw reply	[flat|nested] 23+ messages in thread

end of thread, other threads:[~2024-01-30 16:04 UTC | newest]

Thread overview: 23+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-01-24  0:29 [PATCH v4 0/4]New attribute "counted_by" to annotate bounds for C99 FAM(PR108896) Qing Zhao
2024-01-24  0:29 ` [PATCH v4 1/4] Provide counted_by attribute to flexible array member field (PR108896) Qing Zhao
2024-01-24  0:29 ` [PATCH v4 2/4] Convert references with "counted_by" attributes to/from .ACCESS_WITH_SIZE Qing Zhao
2024-01-24  0:29 ` [PATCH v4 3/4] Use the .ACCESS_WITH_SIZE in builtin object size Qing Zhao
2024-01-24  0:29 ` [PATCH v4 4/4] Use the .ACCESS_WITH_SIZE in bound sanitizer Qing Zhao
2024-01-25  0:51 ` [PATCH v4 0/4]New attribute "counted_by" to annotate bounds for C99 FAM(PR108896) Kees Cook
2024-01-25 20:11   ` Qing Zhao
2024-01-26  8:04     ` Martin Uecker
2024-01-26 14:33       ` Qing Zhao
2024-01-28 10:09         ` Martin Uecker
2024-01-29 15:09           ` Qing Zhao
2024-01-29 15:50             ` Martin Uecker
2024-01-29 16:19               ` Qing Zhao
2024-01-29 20:35             ` Joseph Myers
2024-01-29 22:20               ` Qing Zhao
2024-01-29 16:00     ` Qing Zhao
2024-01-29 17:25       ` Kees Cook
2024-01-29 19:32         ` Qing Zhao
2024-01-29 20:19           ` Kees Cook
2024-01-29 22:45             ` Qing Zhao
2024-01-30  5:41               ` Kees Cook
2024-01-30 15:43                 ` Qing Zhao
2024-01-30 16:04 ` Qing Zhao

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).