public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* [V2][PATCH 0/3] New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)
@ 2023-08-04 19:44 Qing Zhao
  2023-08-04 19:44 ` [V2][PATCH 1/3] Provide counted_by attribute to flexible array member field (PR108896) Qing Zhao
                   ` (4 more replies)
  0 siblings, 5 replies; 33+ messages in thread
From: Qing Zhao @ 2023-08-04 19:44 UTC (permalink / raw)
  To: joseph, richard.guenther, jakub, gcc-patches
  Cc: keescook, siddhesh, uecker, isanbard, Qing Zhao

Hi,

This is the 2nd version of the patch, per our discussion based on the
review comments for the 1st version, the major changes in this version
are:

1. change the name "element_count" to "counted_by";
2. change the parameter for the attribute from a STRING to an
Identifier;
3. Add logic and testing cases to handle anonymous structure/unions;
4. Clarify documentation to permit the situation when the allocation
size is larger than what's specified by "counted_by", at the same time,
it's user's error if allocation size is smaller than what's specified by
"counted_by";
5. Add a complete testing case for using counted_by attribute in
__builtin_dynamic_object_size when there is mismatch between the
allocation size and the value of "counted_by", the expecting behavior
for each case and the explanation on why in the comments. 

As discussed, I plan to add two more separate patch sets after this initial
patch set is approved and committed.

set 1. A new warning option and a new sanitizer option for the user error
       when the allocation size is smaller than the value of "counted_by".
set 2. An improvement to __builtin_dynamic_object_size  for the following
       case:

struct A
{
size_t foo;
int array[] __attribute__((counted_by (foo)));
};

extern struct fix * alloc_buf ();

int main ()
{
struct fix *p = alloc_buf ();
__builtin_object_size(p->array, 0) == sizeof(struct A) + p->foo * sizeof(int);
  /* with the current algorithm, it’s UNKNOWN */ 
__builtin_object_size(p->array, 2) == sizeof(struct A) + p->foo * sizeof(int);
  /* with the current algorithm, it’s UNKNOWN */
}

Bootstrapped and regression tested on both aarch64 and X86, no issue.

Please see more details on the description of this work on:

https://gcc.gnu.org/pipermail/gcc-patches/2023-May/619708.html

Okay for committing?

thanks.

Qing

Qing Zhao (3):
  Provide counted_by attribute to flexible array member field (PR108896)
  Use the counted_by atribute info in builtin object size [PR108896]
  Use the counted_by attribute information in bound sanitizer[PR108896]

 gcc/c-family/c-attribs.cc                     |  54 ++++-
 gcc/c-family/c-common.cc                      |  13 ++
 gcc/c-family/c-common.h                       |   1 +
 gcc/c-family/c-ubsan.cc                       |  16 ++
 gcc/c/c-decl.cc                               |  79 +++++--
 gcc/doc/extend.texi                           |  73 +++++++
 .../gcc.dg/flex-array-counted-by-2.c          |  74 +++++++
 .../gcc.dg/flex-array-counted-by-3.c          | 197 ++++++++++++++++++
 gcc/testsuite/gcc.dg/flex-array-counted-by.c  |  40 ++++
 .../ubsan/flex-array-counted-by-bounds-2.c    |  27 +++
 .../ubsan/flex-array-counted-by-bounds.c      |  46 ++++
 gcc/tree-object-size.cc                       |  37 +++-
 gcc/tree.cc                                   | 133 ++++++++++++
 gcc/tree.h                                    |  15 ++
 14 files changed, 780 insertions(+), 25 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/flex-array-counted-by-2.c
 create mode 100644 gcc/testsuite/gcc.dg/flex-array-counted-by-3.c
 create mode 100644 gcc/testsuite/gcc.dg/flex-array-counted-by.c
 create mode 100644 gcc/testsuite/gcc.dg/ubsan/flex-array-counted-by-bounds-2.c
 create mode 100644 gcc/testsuite/gcc.dg/ubsan/flex-array-counted-by-bounds.c

-- 
2.31.1


^ permalink raw reply	[flat|nested] 33+ messages in thread

* [V2][PATCH 1/3] Provide counted_by attribute to flexible array member field (PR108896)
  2023-08-04 19:44 [V2][PATCH 0/3] New attribute "counted_by" to annotate bounds for C99 FAM(PR108896) Qing Zhao
@ 2023-08-04 19:44 ` Qing Zhao
  2023-08-04 19:44 ` [V2][PATCH 2/3] Use the counted_by atribute info in builtin object size [PR108896] Qing Zhao
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 33+ messages in thread
From: Qing Zhao @ 2023-08-04 19:44 UTC (permalink / raw)
  To: joseph, richard.guenther, jakub, gcc-patches
  Cc: keescook, siddhesh, uecker, isanbard, Qing Zhao

'counted_by (COUNT)'
     The 'counted_by' attribute may be attached to the flexible array
     member of a structure.  It indicates that the number of the
     elements of the array is given by the field named "COUNT" in the
     same structure as the flexible array member.  GCC uses this
     information to improve the results of the array bound sanitizer and
     the '__builtin_dynamic_object_size'.

     For instance, the following code:

          struct P {
            size_t count;
            int array[] __attribute__ ((counted_by (count)));
          } *p;

     specifies that the 'array' is a flexible array member whose number
     of elements is given by the field 'count' in the same structure.

     The field that represents the number of the elements should have an
     integer type.  An explicit 'counted_by' annotation defines a
     relationship between two objects, 'p->array' and 'p->count', that
     'p->array' has _at least_ 'p->count' number of elements available.
     This relationship must hold even after any of these related objects
     are updated.  It's the user's responsibility to make sure this
     relationship to be kept all the time.  Otherwise the results of the
     array bound sanitizer and the '__builtin_dynamic_object_size' might
     be incorrect.

     For instance, in the following example, the allocated array has
     less elements than what's specified by the 'sbuf->count', this is
     an user error.  As a result, out-of-bounds access to the array
     might not be detected.

          #define SIZE_BUMP 10
          struct P *sbuf;
          void alloc_buf (size_t nelems)
          {
            sbuf = (int *) malloc (sizeof (struct P) + sizeof (int) * nelems);
            sbuf->count = nelems + SIZE_BUMP;
            /* This is invalid when the sbuf->array has less than sbuf->count
               elements.  */
          }

     In the following example, the 2nd update to the field 'sbuf->count'
     of the above structure will permit out-of-bounds access to the
     array 'sbuf>array' as well.

          #define SIZE_BUMP 10
          struct P *sbuf;
          void alloc_buf (size_t nelems)
          {
            sbuf = (int *) malloc (sizeof (struct P)
                                 + sizeof (int) * (nelems + SIZE_BUMP));
            sbuf->count = nelems;
            /* This is valid when the sbuf->array has at least sbuf->count
               elements.  */
          }
          void use_buf (int index)
          {
            sbuf->count = sbuf->count + SIZE_BUMP + 1;
            /* Now the value of sbuf->count is larger than the number
               of elements of sbuf->array.  */
            sbuf->array[index] = 0;
            /* then the out-of-bound access to this array
               might not be detected.  */
          }

gcc/c-family/ChangeLog:

	PR C/108896
	* c-attribs.cc (handle_counted_by_attribute): New function.
	(attribute_takes_identifier_p): Add counted_by attribute to the list.
	* c-common.cc (c_flexible_array_member_type_p): ...To this.
	* c-common.h (c_flexible_array_member_type_p): New prototype.

gcc/c/ChangeLog:

	PR C/108896
	* c-decl.cc (flexible_array_member_type_p): Renamed and moved to...
	(add_flexible_array_elts_to_size): Use renamed function.
	(is_flexible_array_member_p): Use renamed function.
	(verify_counted_by_attribute): New function.
	(finish_struct): Use renamed function and verify counted_by
	attribute.

gcc/ChangeLog:

	PR C/108896
	* doc/extend.texi: Document attribute counted_by.
	* tree.cc (get_named_field): New function.
	* tree.h (get_named_field): New prototype.

gcc/testsuite/ChangeLog:

	PR C/108896
	* gcc.dg/flex-array-counted-by.c: New test.
---
 gcc/c-family/c-attribs.cc                    | 54 ++++++++++++-
 gcc/c-family/c-common.cc                     | 13 ++++
 gcc/c-family/c-common.h                      |  1 +
 gcc/c/c-decl.cc                              | 79 +++++++++++++++-----
 gcc/doc/extend.texi                          | 73 ++++++++++++++++++
 gcc/testsuite/gcc.dg/flex-array-counted-by.c | 40 ++++++++++
 gcc/tree.cc                                  | 40 ++++++++++
 gcc/tree.h                                   |  5 ++
 8 files changed, 287 insertions(+), 18 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/flex-array-counted-by.c

diff --git a/gcc/c-family/c-attribs.cc b/gcc/c-family/c-attribs.cc
index e2792ca6898b..65e4f6639109 100644
--- a/gcc/c-family/c-attribs.cc
+++ b/gcc/c-family/c-attribs.cc
@@ -103,6 +103,8 @@ static tree handle_warn_if_not_aligned_attribute (tree *, tree, tree,
 						  int, bool *);
 static tree handle_strict_flex_array_attribute (tree *, tree, tree,
 						 int, bool *);
+static tree handle_counted_by_attribute (tree *, tree, tree,
+					   int, bool *);
 static tree handle_weak_attribute (tree *, tree, tree, int, bool *) ;
 static tree handle_noplt_attribute (tree *, tree, tree, int, bool *) ;
 static tree handle_alias_ifunc_attribute (bool, tree *, tree, tree, bool *);
@@ -373,6 +375,8 @@ const struct attribute_spec c_common_attribute_table[] =
 			      handle_warn_if_not_aligned_attribute, NULL },
   { "strict_flex_array",      1, 1, true, false, false, false,
 			      handle_strict_flex_array_attribute, NULL },
+  { "counted_by",	      1, 1, true, false, false, false,
+			      handle_counted_by_attribute, NULL },
   { "weak",                   0, 0, true,  false, false, false,
 			      handle_weak_attribute, NULL },
   { "noplt",                   0, 0, true,  false, false, false,
@@ -601,7 +605,8 @@ attribute_takes_identifier_p (const_tree attr_id)
   else if (!strcmp ("mode", spec->name)
 	   || !strcmp ("format", spec->name)
 	   || !strcmp ("cleanup", spec->name)
-	   || !strcmp ("access", spec->name))
+	   || !strcmp ("access", spec->name)
+	   || !strcmp ("counted_by", spec->name))
     return true;
   else
     return targetm.attribute_takes_identifier_p (attr_id);
@@ -2555,6 +2560,53 @@ handle_strict_flex_array_attribute (tree *node, tree name,
   return NULL_TREE;
 }
 
+/* Handle a "counted_by" attribute; arguments as in
+   struct attribute_spec.handler.  */
+
+static tree
+handle_counted_by_attribute (tree *node, tree name,
+			     tree args, int ARG_UNUSED (flags),
+			     bool *no_add_attrs)
+{
+  tree decl = *node;
+  tree argval = TREE_VALUE (args);
+
+  /* This attribute only applies to field decls of a structure.  */
+  if (TREE_CODE (decl) != FIELD_DECL)
+    {
+      error_at (DECL_SOURCE_LOCATION (decl),
+		"%qE attribute may not be specified for non-field"
+		" declaration %q+D", name, decl);
+      *no_add_attrs = true;
+    }
+  /* This attribute only applies to field with array type.  */
+  else if (TREE_CODE (TREE_TYPE (decl)) != ARRAY_TYPE)
+    {
+      error_at (DECL_SOURCE_LOCATION (decl),
+		"%qE attribute may not be specified for a non-array field",
+		name);
+      *no_add_attrs = true;
+    }
+  /* This attribute only applies to a C99 flexible array member type.  */
+  else if (! c_flexible_array_member_type_p (TREE_TYPE (decl)))
+    {
+      error_at (DECL_SOURCE_LOCATION (decl),
+		"%qE attribute may not be specified for a non"
+		" flexible array member field",
+		name);
+      *no_add_attrs = true;
+    }
+  /* The argument should be an identifier.  */
+  else if (TREE_CODE (argval) != IDENTIFIER_NODE)
+    {
+      error_at (DECL_SOURCE_LOCATION (decl),
+		"%<counted_by%> argument not an identifier");
+      *no_add_attrs = true;
+    }
+
+  return NULL_TREE;
+}
+
 /* Handle a "weak" attribute; arguments as in
    struct attribute_spec.handler.  */
 
diff --git a/gcc/c-family/c-common.cc b/gcc/c-family/c-common.cc
index 9fbaeb437a12..a18937245c2a 100644
--- a/gcc/c-family/c-common.cc
+++ b/gcc/c-family/c-common.cc
@@ -9521,6 +9521,19 @@ c_common_finalize_early_debug (void)
       (*debug_hooks->early_global_decl) (cnode->decl);
 }
 
+/* Determine whether TYPE is a ISO C99 flexible array memeber type "[]".  */
+bool
+c_flexible_array_member_type_p (const_tree type)
+{
+  if (TREE_CODE (type) == ARRAY_TYPE
+      && TYPE_SIZE (type) == NULL_TREE
+      && TYPE_DOMAIN (type) != NULL_TREE
+      && TYPE_MAX_VALUE (TYPE_DOMAIN (type)) == NULL_TREE)
+    return true;
+
+  return false;
+}
+
 /* Get the LEVEL of the strict_flex_array for the ARRAY_FIELD based on the
    values of attribute strict_flex_array and the flag_strict_flex_arrays.  */
 unsigned int
diff --git a/gcc/c-family/c-common.h b/gcc/c-family/c-common.h
index 78fc5248ba68..c29bb429062b 100644
--- a/gcc/c-family/c-common.h
+++ b/gcc/c-family/c-common.h
@@ -909,6 +909,7 @@ extern tree fold_for_warn (tree);
 extern tree c_common_get_narrower (tree, int *);
 extern bool get_attribute_operand (tree, unsigned HOST_WIDE_INT *);
 extern void c_common_finalize_early_debug (void);
+extern bool c_flexible_array_member_type_p (const_tree);
 extern unsigned int c_strict_flex_array_level_of (tree);
 extern bool c_option_is_from_cpp_diagnostics (int);
 
diff --git a/gcc/c/c-decl.cc b/gcc/c/c-decl.cc
index 1f9eb44dbaa2..e943b49b5230 100644
--- a/gcc/c/c-decl.cc
+++ b/gcc/c/c-decl.cc
@@ -5173,19 +5173,6 @@ set_array_declarator_inner (struct c_declarator *decl,
   return decl;
 }
 
-/* Determine whether TYPE is a ISO C99 flexible array memeber type "[]".  */
-static bool
-flexible_array_member_type_p (const_tree type)
-{
-  if (TREE_CODE (type) == ARRAY_TYPE
-      && TYPE_SIZE (type) == NULL_TREE
-      && TYPE_DOMAIN (type) != NULL_TREE
-      && TYPE_MAX_VALUE (TYPE_DOMAIN (type)) == NULL_TREE)
-    return true;
-
-  return false;
-}
-
 /* Determine whether TYPE is a one-element array type "[1]".  */
 static bool
 one_element_array_type_p (const_tree type)
@@ -5222,7 +5209,7 @@ add_flexible_array_elts_to_size (tree decl, tree init)
 
   elt = CONSTRUCTOR_ELTS (init)->last ().value;
   type = TREE_TYPE (elt);
-  if (flexible_array_member_type_p (type))
+  if (c_flexible_array_member_type_p (type))
     {
       complete_array_type (&type, elt, false);
       DECL_SIZE (decl)
@@ -9094,7 +9081,7 @@ is_flexible_array_member_p (bool is_last_field,
 
   bool is_zero_length_array = zero_length_array_type_p (TREE_TYPE (x));
   bool is_one_element_array = one_element_array_type_p (TREE_TYPE (x));
-  bool is_flexible_array = flexible_array_member_type_p (TREE_TYPE (x));
+  bool is_flexible_array = c_flexible_array_member_type_p (TREE_TYPE (x));
 
   unsigned int strict_flex_array_level = c_strict_flex_array_level_of (x);
 
@@ -9124,6 +9111,61 @@ is_flexible_array_member_p (bool is_last_field,
   return false;
 }
 
+/* Verify the argument of the counted_by attribute of the flexible array
+   member FIELD_DECL is a valid field of the containing structure's fieldlist,
+   FIELDLIST, Report error and remove this attribute when it's not.  */
+static void
+verify_counted_by_attribute (tree fieldlist, tree field_decl)
+{
+  tree attr_counted_by = lookup_attribute ("counted_by",
+					   DECL_ATTRIBUTES (field_decl));
+
+  if (!attr_counted_by)
+    return;
+
+  /* If there is an counted_by attribute attached to the field,
+     verify it.  */
+
+  const char *fieldname
+    = IDENTIFIER_POINTER (TREE_VALUE (TREE_VALUE (attr_counted_by)));
+
+  /* Verify the argument of the attrbute is a valid field of the
+     containing structure.  */
+
+  tree counted_by_field = get_named_field (fieldlist, fieldname);
+
+  /* Error when the field is not found in the containing structure.  */
+  if (!counted_by_field)
+    {
+      error_at (DECL_SOURCE_LOCATION (field_decl),
+		"%qE attribute argument not a field declaration"
+		" in the same structure, ignore it",
+		(get_attribute_name (attr_counted_by)));
+
+      DECL_ATTRIBUTES (field_decl)
+	= remove_attribute ("counted_by", DECL_ATTRIBUTES (field_decl));
+    }
+  else
+  /* Error when the field is not with an integer type.  */
+    {
+      while (TREE_CHAIN (counted_by_field))
+	counted_by_field = TREE_CHAIN (counted_by_field);
+      tree real_field = TREE_VALUE (counted_by_field);
+
+      if (TREE_CODE (TREE_TYPE (real_field)) != INTEGER_TYPE)
+	{
+	  error_at (DECL_SOURCE_LOCATION (field_decl),
+		     "%qE attribute argument not a field declaration"
+		     " with integer type, ignore it",
+		     (get_attribute_name (attr_counted_by)));
+
+	  DECL_ATTRIBUTES (field_decl)
+	    = remove_attribute ("counted_by", DECL_ATTRIBUTES (field_decl));
+	}
+    }
+
+  return;
+}
 
 /* Fill in the fields of a RECORD_TYPE or UNION_TYPE node, T.
    LOC is the location of the RECORD_TYPE or UNION_TYPE's definition.
@@ -9244,7 +9286,7 @@ finish_struct (location_t loc, tree t, tree fieldlist, tree attributes,
 	DECL_PACKED (x) = 1;
 
       /* Detect flexible array member in an invalid context.  */
-      if (flexible_array_member_type_p (TREE_TYPE (x)))
+      if (c_flexible_array_member_type_p (TREE_TYPE (x)))
 	{
 	  if (TREE_CODE (t) == UNION_TYPE)
 	    {
@@ -9265,6 +9307,9 @@ finish_struct (location_t loc, tree t, tree fieldlist, tree attributes,
 			"members");
 	      TREE_TYPE (x) = error_mark_node;
 	    }
+	  /* if there is a counted_by attribute attached to this field,
+	     verify it.  */
+	  verify_counted_by_attribute (fieldlist, x);
 	}
 
       if (pedantic && TREE_CODE (t) == RECORD_TYPE
@@ -9279,7 +9324,7 @@ finish_struct (location_t loc, tree t, tree fieldlist, tree attributes,
 	 when x is an array and is the last field.  */
       if (TREE_CODE (TREE_TYPE (x)) == ARRAY_TYPE)
 	TYPE_INCLUDES_FLEXARRAY (t)
-	  = is_last_field && flexible_array_member_type_p (TREE_TYPE (x));
+	  = is_last_field && c_flexible_array_member_type_p (TREE_TYPE (x));
       /* Recursively set TYPE_INCLUDES_FLEXARRAY for the context of x, t
 	 when x is an union or record and is the last field.  */
       else if (RECORD_OR_UNION_TYPE_P (TREE_TYPE (x)))
diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
index 97eaacf8a7ec..a19cfe7547d8 100644
--- a/gcc/doc/extend.texi
+++ b/gcc/doc/extend.texi
@@ -7617,6 +7617,79 @@ When both the attribute and the option present at the same time, the level of
 the strictness for the specific trailing array field is determined by the
 attribute.
 
+@cindex @code{counted_by} variable attribute
+@item counted_by (@var{count})
+The @code{counted_by} attribute may be attached to the flexible array
+member of a structure.  It indicates that the number of the elements of the
+array is given by the field named "@var{count}" in the same structure as the
+flexible array member.  GCC uses this information to improve the results of
+the array bound sanitizer and the @code{__builtin_dynamic_object_size}.
+
+For instance, the following code:
+
+@smallexample
+struct P @{
+  size_t count;
+  int array[] __attribute__ ((counted_by (count)));
+@} *p;
+@end smallexample
+
+@noindent
+specifies that the @code{array} is a flexible array member whose number of
+elements is given by the field @code{count} in the same structure.
+
+The field that represents the number of the elements should have an integer
+type.  An explicit @code{counted_by} annotation defines a relationship between
+two objects, @code{p->array} and @code{p->count}, that @code{p->array} has
+@emph{at least} @code{p->count} number of elements available.  This relationship
+must hold even after any of these related objects are updated.  It's the user's
+responsibility to make sure this relationship to be kept all the time.
+Otherwise the results of the array bound sanitizer and the
+@code{__builtin_dynamic_object_size} might be incorrect.
+
+For instance, in the following example, the allocated array has less elements
+than what's specified by the @code{sbuf->count}, this is an user error.  As a
+result, out-of-bounds access to the array might not be detected.
+
+@smallexample
+#define SIZE_BUMP 10
+struct P *sbuf;
+void alloc_buf (size_t nelems)
+@{
+  sbuf = (int *) malloc (sizeof (struct P) + sizeof (int) * nelems);
+  sbuf->count = nelems + SIZE_BUMP;
+  /* This is invalid when the sbuf->array has less than sbuf->count
+     elements.  */
+@}
+@end smallexample
+
+In the following example, the 2nd update to the field @code{sbuf->count} of
+the above structure will permit out-of-bounds access to the array
+@code{sbuf>array} as well.
+
+@smallexample
+#define SIZE_BUMP 10
+struct P *sbuf;
+void alloc_buf (size_t nelems)
+@{
+  sbuf = (int *) malloc (sizeof (struct P)
+			 + sizeof (int) * (nelems + SIZE_BUMP));
+  sbuf->count = nelems;
+  /* This is valid when the sbuf->array has at least sbuf->count
+     elements.  */
+@}
+void use_buf (int index)
+@{
+  sbuf->count = sbuf->count + SIZE_BUMP + 1;
+  /* Now the value of sbuf->count is larger than the number
+     of elements of sbuf->array.  */
+  sbuf->array[index] = 0;
+  /* then the out-of-bound access to this array
+     might not be detected.  */
+@}
+@end smallexample
+
+
 @cindex @code{alloc_size} variable attribute
 @item alloc_size (@var{position})
 @itemx alloc_size (@var{position-1}, @var{position-2})
diff --git a/gcc/testsuite/gcc.dg/flex-array-counted-by.c b/gcc/testsuite/gcc.dg/flex-array-counted-by.c
new file mode 100644
index 000000000000..f8ce9776bf86
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/flex-array-counted-by.c
@@ -0,0 +1,40 @@
+/* testing the correct usage of attribute counted_by.  */   
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+
+#include <wchar.h>
+
+int size;
+int x __attribute ((counted_by (size))); /* { dg-error "attribute may not be specified for non-field declaration" } */
+
+struct trailing {
+  int count;
+  int field __attribute ((counted_by (count))); /* { dg-error "attribute may not be specified for a non-array field" } */
+};
+
+struct trailing_1 {
+  int count;
+  int array_1[0] __attribute ((counted_by (count))); /* { dg-error "attribute may not be specified for a non flexible array member field" } */
+};
+
+int count;
+struct trailing_array_2 {
+  int count;
+  int array_2[] __attribute ((counted_by ("count"))); /* { dg-error "argument not an identifier" } */
+};
+
+struct trailing_array_3 {
+  int other;
+  int array_3[] __attribute ((counted_by (L"count"))); /* { dg-error "argument not an identifier" } */
+};
+
+struct trailing_array_4 {
+  int other;
+  int array_4[] __attribute ((counted_by (count))); /* { dg-error "attribute argument not a field declaration in the same structure, ignore it" } */
+};
+
+int count;
+struct trailing_array_5 {
+  float count;
+  int array_5[] __attribute ((counted_by (count))); /* { dg-error "attribute argument not a field declaration with integer type, ignore it" } */
+};
diff --git a/gcc/tree.cc b/gcc/tree.cc
index 420857b110c4..fcd36ae0cd74 100644
--- a/gcc/tree.cc
+++ b/gcc/tree.cc
@@ -12745,6 +12745,46 @@ array_ref_element_size (tree exp)
     return SUBSTITUTE_PLACEHOLDER_IN_EXPR (TYPE_SIZE_UNIT (elmt_type), exp);
 }
 
+/* Given a field list, FIELDLIST, of a structure/union, return a TREE_LIST,
+   with each TREE_VALUE a FIELD_DECL stepping down the chain to the FIELD
+   whose name is FIELDNAME, which is the last TREE_VALUE of the list.
+   return NULL_TREE if such field is not found.  Normally this list is of
+   length one, but if the field is embedded with (nested) anonymous structures
+   or unions, this list steps down the chain to the field.  */
+tree
+get_named_field (tree fieldlist, const char *fieldname)
+{
+  tree named_field = NULL_TREE;
+  for (tree field = fieldlist; field; field = DECL_CHAIN (field))
+    {
+      if (TREE_CODE (field) != FIELD_DECL)
+	continue;
+      if (DECL_NAME (field) != NULL)
+	if (strcmp (IDENTIFIER_POINTER (DECL_NAME (field)), fieldname) == 0)
+	  {
+	    named_field = tree_cons (NULL_TREE, field, named_field);
+	    break;
+	  }
+	else
+	  continue;
+      /* if the field is an anonymous struct/union, we will check the nested
+	 fields inside it recursively.  */
+      else if (RECORD_OR_UNION_TYPE_P (TREE_TYPE (field)))
+	  if ((named_field = get_named_field (TYPE_FIELDS (TREE_TYPE (field)),
+						  fieldname)) != NULL_TREE)
+	    {
+	      named_field = tree_cons (NULL_TREE, field, named_field);
+	      break;
+	    }
+	  else
+	    continue;
+      else
+	continue;
+    }
+  return named_field;
+}
+
+
 /* Return a tree representing the lower bound of the array mentioned in
    EXP, an ARRAY_REF or an ARRAY_RANGE_REF.  */
 
diff --git a/gcc/tree.h b/gcc/tree.h
index 4c04245e2b1b..4859becaa1e7 100644
--- a/gcc/tree.h
+++ b/gcc/tree.h
@@ -5619,6 +5619,11 @@ extern tree get_base_address (tree t);
    of EXP, an ARRAY_REF or an ARRAY_RANGE_REF.  */
 extern tree array_ref_element_size (tree);
 
+/* Given a field list, FIELDLIST, of a structure/union, return the FIELD whose
+   name is FIELDNAME, return NULL_TREE if such field is not found.
+   searching nested anonymous structure/union recursively.  */
+extern tree get_named_field (tree, const char *);
+
 /* Return a typenode for the "standard" C type with a given name.  */
 extern tree get_typenode_from_name (const char *);
 
-- 
2.31.1


^ permalink raw reply	[flat|nested] 33+ messages in thread

* [V2][PATCH 2/3] Use the counted_by atribute info in builtin object size [PR108896]
  2023-08-04 19:44 [V2][PATCH 0/3] New attribute "counted_by" to annotate bounds for C99 FAM(PR108896) Qing Zhao
  2023-08-04 19:44 ` [V2][PATCH 1/3] Provide counted_by attribute to flexible array member field (PR108896) Qing Zhao
@ 2023-08-04 19:44 ` Qing Zhao
  2023-08-04 19:44 ` [V2][PATCH 3/3] Use the counted_by attribute information in bound sanitizer[PR108896] Qing Zhao
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 33+ messages in thread
From: Qing Zhao @ 2023-08-04 19:44 UTC (permalink / raw)
  To: joseph, richard.guenther, jakub, gcc-patches
  Cc: keescook, siddhesh, uecker, isanbard, Qing Zhao

gcc/ChangeLog:

	PR C/108896
	* tree-object-size.cc (addr_object_size): Use the counted_by
	attribute info.
	* tree.cc (component_ref_has_counted_by_p): New function.
	(component_ref_get_counted_by): New function.
	* tree.h (component_ref_has_counted_by_p): New prototype.
	(component_ref_get_counted_by): New prototype.

gcc/testsuite/ChangeLog:

	PR C/108896
	* gcc.dg/flex-array-counted-by-2.c: New test.
	* gcc.dg/flex-array-counted-by-3.c: New test.
---
 .../gcc.dg/flex-array-counted-by-2.c          |  74 +++++++
 .../gcc.dg/flex-array-counted-by-3.c          | 197 ++++++++++++++++++
 gcc/tree-object-size.cc                       |  37 +++-
 gcc/tree.cc                                   |  95 ++++++++-
 gcc/tree.h                                    |  10 +
 5 files changed, 405 insertions(+), 8 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/flex-array-counted-by-2.c
 create mode 100644 gcc/testsuite/gcc.dg/flex-array-counted-by-3.c

diff --git a/gcc/testsuite/gcc.dg/flex-array-counted-by-2.c b/gcc/testsuite/gcc.dg/flex-array-counted-by-2.c
new file mode 100644
index 000000000000..ec580c1f1f01
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/flex-array-counted-by-2.c
@@ -0,0 +1,74 @@
+/* test the attribute counted_by and its usage in
+ * __builtin_dynamic_object_size.  */ 
+/* { dg-do run } */
+/* { dg-options "-O2" } */
+
+#include "builtin-object-size-common.h"
+
+#define expect(p, _v) do { \
+    size_t v = _v; \
+    if (p == v) \
+	__builtin_printf ("ok:  %s == %zd\n", #p, p); \
+    else \
+	{  \
+	  __builtin_printf ("WAT: %s == %zd (expected %zd)\n", #p, p, v); \
+	  FAIL (); \
+	} \
+} while (0);
+
+struct flex {
+  int b;
+  int c[];
+} *array_flex;
+
+struct annotated {
+  int b;
+  int c[] __attribute__ ((counted_by (b)));
+} *array_annotated;
+
+struct nested_annotated {
+  struct {
+    union {
+      int b;
+      float f;	
+    };
+    int n;
+  };
+  int c[] __attribute__ ((counted_by (b)));
+} *array_nested_annotated;
+
+void __attribute__((__noinline__)) setup (int normal_count, int attr_count)
+{
+  array_flex
+    = (struct flex *)malloc (sizeof (struct flex)
+			     + normal_count *  sizeof (int));
+  array_flex->b = normal_count;
+
+  array_annotated
+    = (struct annotated *)malloc (sizeof (struct annotated)
+				  + attr_count *  sizeof (int));
+  array_annotated->b = attr_count;
+
+  array_nested_annotated
+    = (struct nested_annotated *)malloc (sizeof (struct nested_annotated)
+					 + attr_count *  sizeof (int));
+  array_nested_annotated->b = attr_count;
+
+  return;
+}
+
+void __attribute__((__noinline__)) test ()
+{
+    expect(__builtin_dynamic_object_size(array_flex->c, 1), -1);
+    expect(__builtin_dynamic_object_size(array_annotated->c, 1),
+	   array_annotated->b * sizeof (int));
+    expect(__builtin_dynamic_object_size(array_nested_annotated->c, 1),
+	   array_nested_annotated->b * sizeof (int));
+}
+
+int main(int argc, char *argv[])
+{
+  setup (10,10);   
+  test ();
+  DONE ();
+}
diff --git a/gcc/testsuite/gcc.dg/flex-array-counted-by-3.c b/gcc/testsuite/gcc.dg/flex-array-counted-by-3.c
new file mode 100644
index 000000000000..22ef2af31c20
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/flex-array-counted-by-3.c
@@ -0,0 +1,197 @@
+/* test the attribute counted_by and its usage in
+__builtin_dynamic_object_size: what's the correct behavior when the allocaiton
+size mismatched with the value of counted_by attribute?  */
+/* { dg-do run } */
+/* { dg-options "-O -fstrict-flex-arrays=3" } */
+
+#include "builtin-object-size-common.h"
+
+struct annotated {
+  size_t foo;
+  int array[] __attribute__((counted_by (foo)));
+};
+
+#define expect(p, _v) do { \
+    size_t v = _v; \
+    if (p == v) \
+        __builtin_printf ("ok:  %s == %zd\n", #p, p); \
+    else \
+        {  \
+          __builtin_printf ("WAT: %s == %zd (expected %zd)\n", #p, p, v); \
+	  FAIL (); \
+        } \
+} while (0);
+
+#define noinline __attribute__((__noinline__))
+#define SIZE_BUMP 5
+
+/* In general, Due to type casting, the type for the pointee of a pointer
+   does not say anything about the object it points to,
+   So, __builtin_object_size can not directly use the type of the pointee
+   to decide the size of the object the pointer points to.
+
+   there are only two reliable ways:
+   A. observed allocations  (call to the allocation functions in the routine)
+   B. observed accesses     (read or write access to the location of the
+                             pointer points to)
+
+   that provide information about the type/existence of an object at
+   the corresponding address.
+
+   for A, we use the "alloc_size" attribute for the corresponding allocation
+   functions to determine the object size;
+
+   For B, we use the SIZE info of the TYPE attached to the corresponding access.
+   (We treat counted_by attribute as a complement to the SIZE info of the TYPE
+    for FMA)
+
+   The only other way in C which ensures that a pointer actually points
+   to an object of the correct type is 'static':
+
+   void foo(struct P *p[static 1]);
+
+   See https://gcc.gnu.org/pipermail/gcc-patches/2023-July/624814.html
+   for more details.  */
+
+/* in the following function, malloc allocated more space than the value
+   of counted_by attribute.  Then what's the correct behavior we expect
+   the __builtin_dynamic_object_size should have for each of the cases?  */
+
+static struct annotated * noinline alloc_buf_more (int index)
+{
+  struct annotated *p;
+  p = malloc(sizeof (*p) + (index + SIZE_BUMP) * sizeof (int));
+  p->foo = index;
+
+  /*when checking the observed access p->array, we have info on both
+    observered allocation and observed access,
+    A. from observed allocation: (index + SIZE_BUMP) * sizeof (int)
+    B. from observed access: p->foo * sizeof (int)
+
+    in the above, p->foo = index.
+   */
+
+  /* for size in the whole object: always uses A.  */
+  /* for size in the sub-object: chose the smaller of A and B.
+   * Please see https://gcc.gnu.org/pipermail/gcc-patches/2023-July/625891.html
+   * for details on why.  */
+
+  /* for MAXIMUM size in the whole object: use the allocation size
+     for the whole object.  */
+  expect(__builtin_dynamic_object_size(p->array, 0),
+	 (index + SIZE_BUMP) * sizeof(int));
+
+  /* for MAXIMUM size in the sub-object. use the smaller of A and B.  */
+  expect(__builtin_dynamic_object_size(p->array, 1), (p->foo) * sizeof(int));
+
+  /* for MINIMUM size in the whole object: use the allocation size
+     for the whole object.  */
+  expect(__builtin_dynamic_object_size(p->array, 2),
+	 (index + SIZE_BUMP) * sizeof(int));
+
+  /* for MINIMUM size in the sub-object: use the smaller of A and B.  */
+  expect(__builtin_dynamic_object_size(p->array, 3), p->foo * sizeof(int));
+
+  /*when checking the pointer p, we only have info on the observed allocation.
+    So, the object size info can only been obtained from the call to malloc.
+    for both MAXIMUM and MINIMUM: A = (index + SIZE_BUMP) * sizeof (int)  */
+  expect(__builtin_dynamic_object_size(p, 1),
+	 sizeof (*p) + (index + SIZE_BUMP) * sizeof(int));
+  expect(__builtin_dynamic_object_size(p, 0),
+	 sizeof (*p) + (index + SIZE_BUMP) * sizeof(int));
+  expect(__builtin_dynamic_object_size(p, 3),
+	 sizeof (*p) + (index + SIZE_BUMP) * sizeof(int));
+  expect(__builtin_dynamic_object_size(p, 2),
+	 sizeof (*p) + (index + SIZE_BUMP) * sizeof(int));
+  return p;
+}
+
+/* in the following function, malloc allocated less space than the value
+   of counted_by attribute.  Then what's the correct behavior we expect
+   the __builtin_dynamic_object_size should have for each of the cases?
+   NOTE: this is an user error, GCC should issue warnings for such case.
+   this is a seperate issue we should address later.  */
+
+static struct annotated * noinline alloc_buf_less (int index)
+{
+  struct annotated *p;
+  p = malloc(sizeof (*p) + (index) * sizeof (int));
+  p->foo = index + SIZE_BUMP;
+
+  /*when checking the observed access p->array, we have info on both
+    observered allocation and observed access,
+    A. from observed allocation: (index) * sizeof (int)
+    B. from observed access: p->foo * sizeof (int)
+
+    in the above, p->foo = index + SIZE_BUMP.
+   */
+
+  /* for size in the whole object: always uses A.  */
+  /* for size in the sub-object: chose the smaller of A and B.
+   * Please see https://gcc.gnu.org/pipermail/gcc-patches/2023-July/625891.html
+   * for details on why.  */
+
+  /* for MAXIMUM size in the whole object: use the allocation size 
+     for the whole object.  */
+  expect(__builtin_dynamic_object_size(p->array, 0), (index) * sizeof(int));
+
+  /* for MAXIMUM size in the sub-object. use the smaller of A and B.  */ 
+  expect(__builtin_dynamic_object_size(p->array, 1), (index) * sizeof(int));
+
+  /* for MINIMUM size in the whole object: use the allocation size
+     for the whole object.  */
+  expect(__builtin_dynamic_object_size(p->array, 2), (index) * sizeof(int));
+
+  /* for MINIMUM size in the sub-object: use the smaller of A and B.  */
+  expect(__builtin_dynamic_object_size(p->array, 3), (index) * sizeof(int));
+
+  /*when checking the pointer p, we only have info on the observed allocation.
+    So, the object size info can only been obtained from the call to malloc.
+    for both MAXIMUM and MINIMUM: A = (index + SIZE_BUMP) * sizeof (int)  */
+  expect(__builtin_dynamic_object_size(p, 1),
+	 sizeof (*p) + (index) * sizeof(int));
+  expect(__builtin_dynamic_object_size(p, 0),
+	 sizeof (*p) + (index) * sizeof(int));
+  expect(__builtin_dynamic_object_size(p, 3),
+	 sizeof (*p) + (index) * sizeof(int));
+  expect(__builtin_dynamic_object_size(p, 2),
+	 sizeof (*p) + (index) * sizeof(int));
+  return p;
+}
+
+int main ()
+{
+  struct annotated *p, *q;
+  p = alloc_buf_more (10);
+  q = alloc_buf_less (10);
+
+  /*when checking the observed access p->array, we only have info on the
+    observed access, i.e, the TYPE_SIZE info from the access. We don't have
+    info on the whole object.  */
+  expect(__builtin_dynamic_object_size(p->array, 1), p->foo * sizeof(int));
+  expect(__builtin_dynamic_object_size(p->array, 0), -1);
+  expect(__builtin_dynamic_object_size(p->array, 3), p->foo * sizeof(int));
+  expect(__builtin_dynamic_object_size(p->array, 2), 0);
+  /*when checking the pointer p, we have no observed allocation nor observed
+    access, therefore, we cannot determine the size info here.  */
+  expect(__builtin_dynamic_object_size(p, 1), -1);
+  expect(__builtin_dynamic_object_size(p, 0), -1);
+  expect(__builtin_dynamic_object_size(p, 3), 0);
+  expect(__builtin_dynamic_object_size(p, 2), 0);
+
+  /*when checking the observed access p->array, we only have info on the
+    observed access, i.e, the TYPE_SIZE info from the access. We don't have
+    info on the whole object.  */
+  expect(__builtin_dynamic_object_size(q->array, 1), q->foo * sizeof(int));
+  expect(__builtin_dynamic_object_size(q->array, 0), -1);
+  expect(__builtin_dynamic_object_size(q->array, 3), q->foo * sizeof(int));
+  expect(__builtin_dynamic_object_size(q->array, 2), 0);
+  /*when checking the pointer p, we have no observed allocation nor observed
+    access, therefore, we cannot determine the size info here.  */
+  expect(__builtin_dynamic_object_size(q, 1), -1);
+  expect(__builtin_dynamic_object_size(q, 0), -1);
+  expect(__builtin_dynamic_object_size(q, 3), 0);
+  expect(__builtin_dynamic_object_size(q, 2), 0);
+
+  DONE ();
+}
diff --git a/gcc/tree-object-size.cc b/gcc/tree-object-size.cc
index a62af0500563..cf7843c5684b 100644
--- a/gcc/tree-object-size.cc
+++ b/gcc/tree-object-size.cc
@@ -585,6 +585,7 @@ addr_object_size (struct object_size_info *osi, const_tree ptr,
   if (pt_var != TREE_OPERAND (ptr, 0))
     {
       tree var;
+      tree counted_by_ref = NULL_TREE;
 
       if (object_size_type & OST_SUBOBJECT)
 	{
@@ -600,11 +601,12 @@ addr_object_size (struct object_size_info *osi, const_tree ptr,
 	    var = TREE_OPERAND (var, 0);
 	  if (var != pt_var && TREE_CODE (var) == ARRAY_REF)
 	    var = TREE_OPERAND (var, 0);
-	  if (! TYPE_SIZE_UNIT (TREE_TYPE (var))
+	  if (! component_ref_has_counted_by_p (var)
+	     && ((! TYPE_SIZE_UNIT (TREE_TYPE (var))
 	      || ! tree_fits_uhwi_p (TYPE_SIZE_UNIT (TREE_TYPE (var)))
 	      || (pt_var_size && TREE_CODE (pt_var_size) == INTEGER_CST
 		  && tree_int_cst_lt (pt_var_size,
-				      TYPE_SIZE_UNIT (TREE_TYPE (var)))))
+				      TYPE_SIZE_UNIT (TREE_TYPE (var)))))))
 	    var = pt_var;
 	  else if (var != pt_var && TREE_CODE (pt_var) == MEM_REF)
 	    {
@@ -612,6 +614,7 @@ addr_object_size (struct object_size_info *osi, const_tree ptr,
 	      /* For &X->fld, compute object size if fld isn't a flexible array
 		 member.  */
 	      bool is_flexible_array_mem_ref = false;
+
 	      while (v && v != pt_var)
 		switch (TREE_CODE (v))
 		  {
@@ -660,6 +663,8 @@ addr_object_size (struct object_size_info *osi, const_tree ptr,
 		    /* Now the ref is to an array type.  */
 		    gcc_assert (TREE_CODE (TREE_TYPE (v)) == ARRAY_TYPE);
 		    is_flexible_array_mem_ref = array_ref_flexible_size_p (v);
+		    counted_by_ref = component_ref_get_counted_by (v);
+
 		    while (v != pt_var && TREE_CODE (v) == COMPONENT_REF)
 		      if (TREE_CODE (TREE_TYPE (TREE_OPERAND (v, 0)))
 			  != UNION_TYPE
@@ -673,8 +678,11 @@ addr_object_size (struct object_size_info *osi, const_tree ptr,
 			   == RECORD_TYPE)
 		      {
 			/* compute object size only if v is not a
-			   flexible array member.  */
-			if (!is_flexible_array_mem_ref)
+			   flexible array member or the flexible array member
+			   has a known element count indicated by the user
+			   through attribute counted_by.  */
+			if (!is_flexible_array_mem_ref
+			    || counted_by_ref)
 			  {
 			    v = NULL_TREE;
 			    break;
@@ -707,9 +715,24 @@ addr_object_size (struct object_size_info *osi, const_tree ptr,
 
       if (var != pt_var)
 	{
-	  var_size = TYPE_SIZE_UNIT (TREE_TYPE (var));
-	  if (!TREE_CONSTANT (var_size))
-	    var_size = get_or_create_ssa_default_def (cfun, var_size);
+	  if (!counted_by_ref)
+	    {
+	      var_size = TYPE_SIZE_UNIT (TREE_TYPE (var));
+	      if (!TREE_CONSTANT (var_size))
+		var_size = get_or_create_ssa_default_def (cfun, var_size);
+	    }
+	  else
+	    {
+	      gcc_assert (TREE_CODE (var) == COMPONENT_REF
+			  && TREE_CODE (TREE_TYPE (var)) == ARRAY_TYPE);
+	      tree element_size = TYPE_SIZE_UNIT (TREE_TYPE (TREE_TYPE (var)));
+	      var_size
+		= size_binop (MULT_EXPR,
+			      fold_convert (sizetype, counted_by_ref),
+			      fold_convert (sizetype, element_size));
+	      if (!todo)
+		todo = TODO_update_ssa_only_virtuals;
+	    }
 	  if (!var_size)
 	    return false;
 	}
diff --git a/gcc/tree.cc b/gcc/tree.cc
index fcd36ae0cd74..c46b73be3906 100644
--- a/gcc/tree.cc
+++ b/gcc/tree.cc
@@ -12745,6 +12745,32 @@ array_ref_element_size (tree exp)
     return SUBSTITUTE_PLACEHOLDER_IN_EXPR (TYPE_SIZE_UNIT (elmt_type), exp);
 }
 
+/*  For a component_ref that has an array type ARRAY_REF, return TRUE when
+    an counted_by attribute attached to the corresponding FIELD_DECL.
+    return FALSE otherwise.  */
+bool
+component_ref_has_counted_by_p (tree array_ref)
+{
+  if (TREE_CODE (array_ref) != COMPONENT_REF)
+    return false;
+
+  if (TREE_CODE (TREE_TYPE (array_ref)) != ARRAY_TYPE)
+    return false;
+
+  tree struct_object = TREE_OPERAND (array_ref, 0);
+  tree struct_type = TREE_TYPE (struct_object);
+
+  if (!RECORD_OR_UNION_TYPE_P (struct_type))
+    return false;
+  tree field_decl = TREE_OPERAND (array_ref, 1);
+  tree attr_counted_by = lookup_attribute ("counted_by",
+					      DECL_ATTRIBUTES (field_decl));
+
+  if (!attr_counted_by)
+    return false;
+  return true;
+}
+
 /* Given a field list, FIELDLIST, of a structure/union, return a TREE_LIST,
    with each TREE_VALUE a FIELD_DECL stepping down the chain to the FIELD
    whose name is FIELDNAME, which is the last TREE_VALUE of the list.
@@ -12771,7 +12797,7 @@ get_named_field (tree fieldlist, const char *fieldname)
 	 fields inside it recursively.  */
       else if (RECORD_OR_UNION_TYPE_P (TREE_TYPE (field)))
 	  if ((named_field = get_named_field (TYPE_FIELDS (TREE_TYPE (field)),
-						  fieldname)) != NULL_TREE)
+					     fieldname)) != NULL_TREE)
 	    {
 	      named_field = tree_cons (NULL_TREE, field, named_field);
 	      break;
@@ -12784,6 +12810,73 @@ get_named_field (tree fieldlist, const char *fieldname)
   return named_field;
 }
 
+/* For a component_ref that has an array type ARRAY_REF, get the object that
+   represents its counted_by per the attribute counted_by attached to
+   the corresponding FIELD_DECL.  return NULL_TREE when cannot find such
+   object.
+   For example, if:
+
+    struct P {
+      int k;
+      int x[] __attribute__ ((counted_by (k)));
+    } *p;
+
+    for the following reference:
+
+    p->x[b]
+
+    the object that represents its element count will be:
+
+    p->k
+
+    So, when component_ref_get_counted_by (p->x[b]) is called, p->k should be
+    returned.
+*/
+
+tree
+component_ref_get_counted_by (tree array_ref)
+{
+  if (! component_ref_has_counted_by_p (array_ref))
+    return NULL_TREE;
+
+  tree struct_object = TREE_OPERAND (array_ref, 0);
+  tree struct_type = TREE_TYPE (struct_object);
+  tree field_decl = TREE_OPERAND (array_ref, 1);
+  tree attr_counted_by = lookup_attribute ("counted_by",
+					   DECL_ATTRIBUTES (field_decl));
+  gcc_assert (attr_counted_by);
+
+  /* If there is an counted_by attribute attached to the field,
+     get the field that maps to the counted_by.  */
+
+  const char *fieldname
+    = IDENTIFIER_POINTER (TREE_VALUE (TREE_VALUE (attr_counted_by)));
+
+  tree counted_by_field = get_named_field (TYPE_FIELDS (struct_type),
+					   fieldname);
+
+  gcc_assert (counted_by_field);
+
+  /* generate the tree node that represent the counted_by of this array
+     ref.  This is a (possible nested) COMPONENT_REF to the counted_by_field
+     of the containing structure.  */
+
+  tree counted_by_ref = NULL_TREE;
+  tree object = struct_object;
+  do
+    {
+      tree field = TREE_VALUE (counted_by_field);
+
+      counted_by_ref = build3 (COMPONENT_REF,
+				  TREE_TYPE (field),
+				  object, field,
+				  NULL_TREE);
+      object = counted_by_ref;
+      counted_by_field = TREE_CHAIN (counted_by_field);
+    }
+  while (counted_by_field);
+  return counted_by_ref;
+}
 
 /* Return a tree representing the lower bound of the array mentioned in
    EXP, an ARRAY_REF or an ARRAY_RANGE_REF.  */
diff --git a/gcc/tree.h b/gcc/tree.h
index 4859becaa1e7..07eed7219835 100644
--- a/gcc/tree.h
+++ b/gcc/tree.h
@@ -5619,11 +5619,21 @@ extern tree get_base_address (tree t);
    of EXP, an ARRAY_REF or an ARRAY_RANGE_REF.  */
 extern tree array_ref_element_size (tree);
 
+/* Give a component_ref that has an array type, return true when an
+   attribute counted_by attached to the corresponding FIELD_DECL.  */
+extern bool component_ref_has_counted_by_p (tree);
+
 /* Given a field list, FIELDLIST, of a structure/union, return the FIELD whose
    name is FIELDNAME, return NULL_TREE if such field is not found.
    searching nested anonymous structure/union recursively.  */
 extern tree get_named_field (tree, const char *);
 
+/* Give a component_ref that has an array type, return the object that
+   represents its counted_by per the attribute counted_by attached to
+   the corresponding FIELD_DECL.  return NULL_TREE when cannot find such
+   object.  */
+extern tree component_ref_get_counted_by (tree);
+
 /* Return a typenode for the "standard" C type with a given name.  */
 extern tree get_typenode_from_name (const char *);
 
-- 
2.31.1


^ permalink raw reply	[flat|nested] 33+ messages in thread

* [V2][PATCH 3/3] Use the counted_by attribute information in bound sanitizer[PR108896]
  2023-08-04 19:44 [V2][PATCH 0/3] New attribute "counted_by" to annotate bounds for C99 FAM(PR108896) Qing Zhao
  2023-08-04 19:44 ` [V2][PATCH 1/3] Provide counted_by attribute to flexible array member field (PR108896) Qing Zhao
  2023-08-04 19:44 ` [V2][PATCH 2/3] Use the counted_by atribute info in builtin object size [PR108896] Qing Zhao
@ 2023-08-04 19:44 ` Qing Zhao
  2023-08-07 16:16 ` [V2][PATCH 0/3] New attribute "counted_by" to annotate bounds for C99 FAM(PR108896) Kees Cook
  2023-08-17  5:31 ` Kees Cook
  4 siblings, 0 replies; 33+ messages in thread
From: Qing Zhao @ 2023-08-04 19:44 UTC (permalink / raw)
  To: joseph, richard.guenther, jakub, gcc-patches
  Cc: keescook, siddhesh, uecker, isanbard, Qing Zhao

gcc/c-family/ChangeLog:

	PR C/108896
	* c-ubsan.cc (ubsan_instrument_bounds): Use counted_by attribute
	information.

gcc/testsuite/ChangeLog:

	PR C/108896
	* gcc.dg/ubsan/flex-array-counted-by-bounds.c: New test.
	* gcc.dg/ubsan/flex-array-counted-by-bounds-2.c: New test.
---
 gcc/c-family/c-ubsan.cc                       | 16 +++++++
 .../ubsan/flex-array-counted-by-bounds-2.c    | 27 +++++++++++
 .../ubsan/flex-array-counted-by-bounds.c      | 46 +++++++++++++++++++
 3 files changed, 89 insertions(+)
 create mode 100644 gcc/testsuite/gcc.dg/ubsan/flex-array-counted-by-bounds-2.c
 create mode 100644 gcc/testsuite/gcc.dg/ubsan/flex-array-counted-by-bounds.c

diff --git a/gcc/c-family/c-ubsan.cc b/gcc/c-family/c-ubsan.cc
index 51aa83a378d2..a99e8433069f 100644
--- a/gcc/c-family/c-ubsan.cc
+++ b/gcc/c-family/c-ubsan.cc
@@ -362,6 +362,10 @@ ubsan_instrument_bounds (location_t loc, tree array, tree *index,
 {
   tree type = TREE_TYPE (array);
   tree domain = TYPE_DOMAIN (type);
+  /* whether the array ref is a flexible array member with valid counted_by
+     attribute.  */
+  bool fam_has_count_attr = false;
+  tree counted_by = NULL_TREE;
 
   if (domain == NULL_TREE)
     return NULL_TREE;
@@ -375,6 +379,17 @@ ubsan_instrument_bounds (location_t loc, tree array, tree *index,
 	  && COMPLETE_TYPE_P (type)
 	  && integer_zerop (TYPE_SIZE (type)))
 	bound = build_int_cst (TREE_TYPE (TYPE_MIN_VALUE (domain)), -1);
+      /* If the array ref is to flexible array member field which has
+	 counted_by attribute.  We can use the information from the
+	 attribute as the bound to instrument the reference.  */
+      else if ((counted_by = component_ref_get_counted_by (array))
+		!= NULL_TREE)
+	{
+	  fam_has_count_attr = true;
+	  bound = fold_build2 (MINUS_EXPR, TREE_TYPE (counted_by),
+			       counted_by,
+			       build_int_cst (TREE_TYPE (counted_by), 1));
+	}
       else
 	return NULL_TREE;
     }
@@ -387,6 +402,7 @@ ubsan_instrument_bounds (location_t loc, tree array, tree *index,
      -fsanitize=bounds-strict.  */
   tree base = get_base_address (array);
   if (!sanitize_flags_p (SANITIZE_BOUNDS_STRICT)
+      && !fam_has_count_attr
       && TREE_CODE (array) == COMPONENT_REF
       && base && (INDIRECT_REF_P (base) || TREE_CODE (base) == MEM_REF))
     {
diff --git a/gcc/testsuite/gcc.dg/ubsan/flex-array-counted-by-bounds-2.c b/gcc/testsuite/gcc.dg/ubsan/flex-array-counted-by-bounds-2.c
new file mode 100644
index 000000000000..77ec333509d0
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/ubsan/flex-array-counted-by-bounds-2.c
@@ -0,0 +1,27 @@
+/* test the attribute counted_by and its usage in
+   bounds sanitizer combined with VLA.  */
+/* { dg-do run } */
+/* { dg-options "-fsanitize=bounds" } */
+
+#include <stdlib.h>
+
+void __attribute__((__noinline__)) setup_and_test_vla (int n, int m)
+{
+   struct foo {
+       int n;
+       int p[][n] __attribute__((counted_by(n)));
+   } *f;
+
+   f = (struct foo *) malloc (sizeof(struct foo) + m*sizeof(int[n]));
+   f->n = m;
+   f->p[m][n-1]=1;
+   return;
+}
+
+int main(int argc, char *argv[])
+{
+  setup_and_test_vla (10, 11);
+  return 0;
+}
+
+/* { dg-output "17:8: runtime error: index 11 out of bounds for type" } */
diff --git a/gcc/testsuite/gcc.dg/ubsan/flex-array-counted-by-bounds.c b/gcc/testsuite/gcc.dg/ubsan/flex-array-counted-by-bounds.c
new file mode 100644
index 000000000000..81eaeb3f2681
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/ubsan/flex-array-counted-by-bounds.c
@@ -0,0 +1,46 @@
+/* test the attribute counted_by and its usage in
+   bounds sanitizer.  */
+/* { dg-do run } */
+/* { dg-options "-fsanitize=bounds" } */
+
+#include <stdlib.h>
+
+struct flex {
+  int b;
+  int c[];
+} *array_flex;
+
+struct annotated {
+  int b;
+  int c[] __attribute__ ((counted_by (b)));
+} *array_annotated;
+
+void __attribute__((__noinline__)) setup (int normal_count, int annotated_count)
+{
+  array_flex
+    = (struct flex *)malloc (sizeof (struct flex)
+			     + normal_count *  sizeof (int));
+  array_flex->b = normal_count;
+
+  array_annotated
+    = (struct annotated *)malloc (sizeof (struct annotated)
+				  + annotated_count *  sizeof (int));
+  array_annotated->b = annotated_count;
+
+  return;
+}
+
+void __attribute__((__noinline__)) test (int normal_index, int annotated_index)
+{
+  array_flex->c[normal_index] = 1;
+  array_annotated->c[annotated_index] = 2;
+}
+
+int main(int argc, char *argv[])
+{
+  setup (10, 10);   
+  test (10, 10);
+  return 0;
+}
+
+/* { dg-output "36:21: runtime error: index 10 out of bounds for type" } */
-- 
2.31.1


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [V2][PATCH 0/3] New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)
  2023-08-04 19:44 [V2][PATCH 0/3] New attribute "counted_by" to annotate bounds for C99 FAM(PR108896) Qing Zhao
                   ` (2 preceding siblings ...)
  2023-08-04 19:44 ` [V2][PATCH 3/3] Use the counted_by attribute information in bound sanitizer[PR108896] Qing Zhao
@ 2023-08-07 16:16 ` Kees Cook
  2023-08-07 16:33   ` Qing Zhao
  2023-08-08 14:54   ` Martin Uecker
  2023-08-17  5:31 ` Kees Cook
  4 siblings, 2 replies; 33+ messages in thread
From: Kees Cook @ 2023-08-07 16:16 UTC (permalink / raw)
  To: Qing Zhao
  Cc: joseph, richard.guenther, jakub, gcc-patches, siddhesh, uecker, isanbard

On Fri, Aug 04, 2023 at 07:44:28PM +0000, Qing Zhao wrote:
> This is the 2nd version of the patch, per our discussion based on the
> review comments for the 1st version, the major changes in this version
> are:

Thanks for the update!

> 
> 1. change the name "element_count" to "counted_by";
> 2. change the parameter for the attribute from a STRING to an
> Identifier;
> 3. Add logic and testing cases to handle anonymous structure/unions;
> 4. Clarify documentation to permit the situation when the allocation
> size is larger than what's specified by "counted_by", at the same time,
> it's user's error if allocation size is smaller than what's specified by
> "counted_by";
> 5. Add a complete testing case for using counted_by attribute in
> __builtin_dynamic_object_size when there is mismatch between the
> allocation size and the value of "counted_by", the expecting behavior
> for each case and the explanation on why in the comments. 

All the "normal" test cases I have are passing; this is wonderful! :)

I'm still seeing unexpected situations when I've intentionally set
counted_by to be smaller than alloc_size, but I assume it's due to not
yet having the patch you mention below.

> As discussed, I plan to add two more separate patch sets after this initial
> patch set is approved and committed.
> 
> set 1. A new warning option and a new sanitizer option for the user error
>        when the allocation size is smaller than the value of "counted_by".
> set 2. An improvement to __builtin_dynamic_object_size  for the following
>        case:
> 
> struct A
> {
> size_t foo;
> int array[] __attribute__((counted_by (foo)));
> };
> 
> extern struct fix * alloc_buf ();
> 
> int main ()
> {
> struct fix *p = alloc_buf ();
> __builtin_object_size(p->array, 0) == sizeof(struct A) + p->foo * sizeof(int);
>   /* with the current algorithm, it’s UNKNOWN */ 
> __builtin_object_size(p->array, 2) == sizeof(struct A) + p->foo * sizeof(int);
>   /* with the current algorithm, it’s UNKNOWN */
> }

Should the above be bdos instead of bos?

> Bootstrapped and regression tested on both aarch64 and X86, no issue.

I've updated the Linux kernel's macros for the name change and done
build tests with my first pass at "easy" cases for adding counted_by:
https://git.kernel.org/pub/scm/linux/kernel/git/kees/linux.git/commit/?h=devel/counted_by&id=adc5b3cb48a049563dc673f348eab7b6beba8a9b

Everything is working as expected. :)

-Kees

-- 
Kees Cook



^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [V2][PATCH 0/3] New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)
  2023-08-07 16:16 ` [V2][PATCH 0/3] New attribute "counted_by" to annotate bounds for C99 FAM(PR108896) Kees Cook
@ 2023-08-07 16:33   ` Qing Zhao
  2023-08-09 19:17     ` Kees Cook
  2023-08-08 14:54   ` Martin Uecker
  1 sibling, 1 reply; 33+ messages in thread
From: Qing Zhao @ 2023-08-07 16:33 UTC (permalink / raw)
  To: Kees Cook
  Cc: joseph, richard.guenther, jakub, gcc-patches, siddhesh, uecker, isanbard



> On Aug 7, 2023, at 12:16 PM, Kees Cook <keescook@chromium.org> wrote:
> 
> On Fri, Aug 04, 2023 at 07:44:28PM +0000, Qing Zhao wrote:
>> This is the 2nd version of the patch, per our discussion based on the
>> review comments for the 1st version, the major changes in this version
>> are:
> 
> Thanks for the update!
> 
>> 
>> 1. change the name "element_count" to "counted_by";
>> 2. change the parameter for the attribute from a STRING to an
>> Identifier;
>> 3. Add logic and testing cases to handle anonymous structure/unions;
>> 4. Clarify documentation to permit the situation when the allocation
>> size is larger than what's specified by "counted_by", at the same time,
>> it's user's error if allocation size is smaller than what's specified by
>> "counted_by";
>> 5. Add a complete testing case for using counted_by attribute in
>> __builtin_dynamic_object_size when there is mismatch between the
>> allocation size and the value of "counted_by", the expecting behavior
>> for each case and the explanation on why in the comments. 
> 
> All the "normal" test cases I have are passing; this is wonderful! :)
> 
> I'm still seeing unexpected situations when I've intentionally set
> counted_by to be smaller than alloc_size, but I assume it's due to not
> yet having the patch you mention below.

What’s the testing case for the one that failed? 
If it’s 

__builtin_dynamic_object_size(p->array, 0/2) without the allocation information in the routine, 
then with the current algorithm, gcc cannot deduce the size for the whole object.

If not such case, let me know.

> 
>> As discussed, I plan to add two more separate patch sets after this initial
>> patch set is approved and committed.
>> 
>> set 1. A new warning option and a new sanitizer option for the user error
>>       when the allocation size is smaller than the value of "counted_by".
>> set 2. An improvement to __builtin_dynamic_object_size  for the following
>>       case:
>> 
>> struct A
>> {
>> size_t foo;
>> int array[] __attribute__((counted_by (foo)));
>> };
>> 
>> extern struct fix * alloc_buf ();
>> 
>> int main ()
>> {
>> struct fix *p = alloc_buf ();
>> __builtin_object_size(p->array, 0) == sizeof(struct A) + p->foo * sizeof(int);
>>  /* with the current algorithm, it’s UNKNOWN */ 
>> __builtin_object_size(p->array, 2) == sizeof(struct A) + p->foo * sizeof(int);
>>  /* with the current algorithm, it’s UNKNOWN */
>> }
> 
> Should the above be bdos instead of bos?

Yes, sorry for the typo.  -:)
> 
>> Bootstrapped and regression tested on both aarch64 and X86, no issue.
> 
> I've updated the Linux kernel's macros for the name change and done
> build tests with my first pass at "easy" cases for adding counted_by:
> https://git.kernel.org/pub/scm/linux/kernel/git/kees/linux.git/commit/?h=devel/counted_by&id=adc5b3cb48a049563dc673f348eab7b6beba8a9b

> 
> Everything is working as expected. :)

Thanks a lot for the testing!

Qing
> 
> -Kees
> 
> -- 
> Kees Cook
> 
> 


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [V2][PATCH 0/3] New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)
  2023-08-07 16:16 ` [V2][PATCH 0/3] New attribute "counted_by" to annotate bounds for C99 FAM(PR108896) Kees Cook
  2023-08-07 16:33   ` Qing Zhao
@ 2023-08-08 14:54   ` Martin Uecker
  2023-08-08 16:18     ` Michael Matz
                       ` (3 more replies)
  1 sibling, 4 replies; 33+ messages in thread
From: Martin Uecker @ 2023-08-08 14:54 UTC (permalink / raw)
  To: Kees Cook, Qing Zhao
  Cc: joseph, richard.guenther, jakub, gcc-patches, siddhesh, isanbard



I am sure this has been discussed before, but seeing that you
test for a specific formula, let me point out the following:

There at least three different size expression which could
make sense. Consider

short foo { int a; short b; char t[]; }; 

Most people seem to use

sizeof(struct foo) + N * sizeof(foo->t);

which for N == 3 yields 11 bytes on x86-64 because the formula
adds the padding of the original struct. There is an example
in the  C standard that uses this formula.


But he minimum size of an object which stores N elements is

max(sizeof (struct s), offsetof(struct s, t[n]))

which is 9 bytes. 

This is what clang uses for statically allocated objects with
initialization, while GCC uses the rule above (11 bytes). But 
bdos / bos  then returns the smaller size of 9 which is a bit
confusing.


https://godbolt.org/z/K1hvaK1ns

https://github.com/llvm/llvm-project/issues/62929
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109956


Then there is also the size of a similar array where the FAM
is replaced with an array of static size:

struct foo { int a; short b; char t[3]; }; 

This would make the most sense to me, but it has 12 bytes
because the padding is according to the usual alignment
rules.


Martin



Am Montag, dem 07.08.2023 um 09:16 -0700 schrieb Kees Cook:
> On Fri, Aug 04, 2023 at 07:44:28PM +0000, Qing Zhao wrote:
> > This is the 2nd version of the patch, per our discussion based on the
> > review comments for the 1st version, the major changes in this version
> > are:
> 
> Thanks for the update!
> 
> > 
> > 1. change the name "element_count" to "counted_by";
> > 2. change the parameter for the attribute from a STRING to an
> > Identifier;
> > 3. Add logic and testing cases to handle anonymous structure/unions;
> > 4. Clarify documentation to permit the situation when the allocation
> > size is larger than what's specified by "counted_by", at the same time,
> > it's user's error if allocation size is smaller than what's specified by
> > "counted_by";
> > 5. Add a complete testing case for using counted_by attribute in
> > __builtin_dynamic_object_size when there is mismatch between the
> > allocation size and the value of "counted_by", the expecting behavior
> > for each case and the explanation on why in the comments. 
> 
> All the "normal" test cases I have are passing; this is wonderful! :)
> 
> I'm still seeing unexpected situations when I've intentionally set
> counted_by to be smaller than alloc_size, but I assume it's due to not
> yet having the patch you mention below.
> 
> > As discussed, I plan to add two more separate patch sets after this initial
> > patch set is approved and committed.
> > 
> > set 1. A new warning option and a new sanitizer option for the user error
> >        when the allocation size is smaller than the value of "counted_by".
> > set 2. An improvement to __builtin_dynamic_object_size  for the following
> >        case:
> > 
> > struct A
> > {
> > size_t foo;
> > int array[] __attribute__((counted_by (foo)));
> > };
> > 
> > extern struct fix * alloc_buf ();
> > 
> > int main ()
> > {
> > struct fix *p = alloc_buf ();
> > __builtin_object_size(p->array, 0) == sizeof(struct A) + p->foo * sizeof(int);
> >   /* with the current algorithm, it’s UNKNOWN */ 
> > __builtin_object_size(p->array, 2) == sizeof(struct A) + p->foo * sizeof(int);
> >   /* with the current algorithm, it’s UNKNOWN */
> > }
> 
> Should the above be bdos instead of bos?
> 
> > Bootstrapped and regression tested on both aarch64 and X86, no issue.
> 
> I've updated the Linux kernel's macros for the name change and done
> build tests with my first pass at "easy" cases for adding counted_by:
> https://git.kernel.org/pub/scm/linux/kernel/git/kees/linux.git/commit/?h=devel/counted_by&id=adc5b3cb48a049563dc673f348eab7b6beba8a9b
> 
> Everything is working as expected. :)
> 
> -Kees
> 

-- 
Univ.-Prof. Dr. rer. nat. Martin Uecker
Graz University of Technology
Institute of Biomedical Imaging




^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [V2][PATCH 0/3] New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)
  2023-08-08 14:54   ` Martin Uecker
@ 2023-08-08 16:18     ` Michael Matz
  2023-08-08 19:58     ` Kees Cook
                       ` (2 subsequent siblings)
  3 siblings, 0 replies; 33+ messages in thread
From: Michael Matz @ 2023-08-08 16:18 UTC (permalink / raw)
  To: Martin Uecker
  Cc: Kees Cook, Qing Zhao, joseph, richard.guenther, jakub,
	gcc-patches, siddhesh, isanbard

[-- Attachment #1: Type: text/plain, Size: 1422 bytes --]

Hello,

On Tue, 8 Aug 2023, Martin Uecker via Gcc-patches wrote:

> There at least three different size expression which could
> make sense. Consider
> 
> short foo { int a; short b; char t[]; }; 
> 
> Most people seem to use
> 
> sizeof(struct foo) + N * sizeof(foo->t);
> 
> which for N == 3 yields 11 bytes on x86-64 because the formula
> adds the padding of the original struct. There is an example
> in the  C standard that uses this formula.
> 
> 
> But he minimum size of an object which stores N elements is
> 
> max(sizeof (struct s), offsetof(struct s, t[n]))
> 
> which is 9 bytes. 

But should it really?  struct sizes should usually be a multiple of a 
structs alignment, and if int is 4-aligned only size 12 would be correct. 
I don't see why one should deviate from that general rule for sizes for 
FAM structs.  (I realize that the above is not about sizeof(), but rather 
bdos/bos, but I think it's fairly useful that both agree when possbible).

Say, if you were to allocate an array of such struct foos, each having 3 
elements in foo.t[].  You need to add 12 bytes to go to the next array 
element, not 9.  (Of course arrays of FAM-structs are somewhat meh, but 
still).  It's true that you would be allowed to rely on only 9 bytes of 
those 12 bytes (the rest being padding), so perhaps it's really the right 
answer for __bos, but, hmm, 12 seems to be "more correct" to my guts :-)


Ciao,
Michael.

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [V2][PATCH 0/3] New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)
  2023-08-08 14:54   ` Martin Uecker
  2023-08-08 16:18     ` Michael Matz
@ 2023-08-08 19:58     ` Kees Cook
  2023-08-09 16:05     ` Qing Zhao
  2023-08-09 20:34     ` Qing Zhao
  3 siblings, 0 replies; 33+ messages in thread
From: Kees Cook @ 2023-08-08 19:58 UTC (permalink / raw)
  To: Martin Uecker
  Cc: Qing Zhao, joseph, richard.guenther, jakub, gcc-patches,
	siddhesh, isanbard

On Tue, Aug 08, 2023 at 04:54:46PM +0200, Martin Uecker wrote:
> 
> 
> I am sure this has been discussed before, but seeing that you
> test for a specific formula, let me point out the following:
> 
> There at least three different size expression which could
> make sense. Consider
> 
> short foo { int a; short b; char t[]; }; 
> 
> Most people seem to use
> 
> sizeof(struct foo) + N * sizeof(foo->t);

I think this contains a typo. The above should be:

	sizeof(struct foo) + N * sizeof(*foo->t);

> which for N == 3 yields 11 bytes on x86-64 because the formula
> adds the padding of the original struct. There is an example
> in the  C standard that uses this formula.
> 
> 
> But he minimum size of an object which stores N elements is
> 
> max(sizeof (struct s), offsetof(struct s, t[n]))
> 
> which is 9 bytes. 
> 
> This is what clang uses for statically allocated objects with
> initialization, while GCC uses the rule above (11 bytes). But 
> bdos / bos  then returns the smaller size of 9 which is a bit
> confusing.
> 
> https://godbolt.org/z/K1hvaK1ns

And to add to the mess here, Clang used to get this wrong for statically
initialized flexible array members (returning 8):

https://godbolt.org/z/Tee96dazs

But that was fixed recently when I noticed it a few weeks ago.

> https://github.com/llvm/llvm-project/issues/62929
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109956
> 
> 
> Then there is also the size of a similar array where the FAM
> is replaced with an array of static size:
> 
> struct foo { int a; short b; char t[3]; }; 
> 
> This would make the most sense to me, but it has 12 bytes
> because the padding is according to the usual alignment
> rules.

Hm, in my thinking, FAMs are individually padded, so since they don't
"count" to the struct size, I think this is "as expected".

Note that the Linux kernel explicitly chooses to potentially over-estimate
for FAM allocations since there has been inconsistent sizes used depend
on the style of the prior fake FAM usage, the use of unions, etc.

i.e. our macro is, effectively:

#define struct_size(ptr, member, count)	\
	(sizeof(*ptr) + (count) * sizeof(*ptr->member))

since the other case can lead to truncated sizes:

#define struct_size(ptr, member, count) \
	(offsetof(typeof(*ptr), member) + (count) * sizeof(*ptr->member))


struct foo {
	int a, b, c;
	int count;
	union {
		struct {
			char small;
			char char_fam[];
		};
		struct {
			int large;
			int int_fam[];
		};
	};
};

https://godbolt.org/z/b1v74Mzhd

i.e.:
struct_size(x, char_fam, 1) < sizeof(*x)

so accessing "large" fails, etc. Yes, it's all horrible, but we have to
deal with this kind of thing in the kernel. :(

-- 
Kees Cook

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [V2][PATCH 0/3] New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)
  2023-08-08 14:54   ` Martin Uecker
  2023-08-08 16:18     ` Michael Matz
  2023-08-08 19:58     ` Kees Cook
@ 2023-08-09 16:05     ` Qing Zhao
  2023-08-09 16:21       ` Michael Matz
  2023-08-09 20:34     ` Qing Zhao
  3 siblings, 1 reply; 33+ messages in thread
From: Qing Zhao @ 2023-08-09 16:05 UTC (permalink / raw)
  To: Martin Uecker, Kees Cook, Michael Matz
  Cc: Joseph Myers, Richard Biener, jakub Jelinek,
	Qing Zhao via Gcc-patches, Siddhesh Poyarekar, isanbard

Hi, Martin,

Thanks for raising this issue.

Although this is an old FAM related issue that does not relate to my current patch 
(and might need to be resolved in a separate patch).  I think that it’s necessary to have
more discussion on this old issue and resolve it. 

The first thing that I’d like to confirm is:

What the exact memory layout for the following structure x?

struct foo { int a; short b; char t[]; } x = { .t = { 1, 2, 3 } };

And the key that is confusing me is, where should the field “t” start? 

A.  Starting at offset 8 as the following:

a          	4-bytes
b          	2-bytes
padding   2-bytes
t           	3-bytes

B. Starting at offset 6 as the following:

a          	4-bytes
b          	2-bytes
t           	3-bytes

From my understanding, A should be correct. However, when I debugged into gcc, I found that the following

tree
byte_position (const_tree field)
{
  return byte_from_pos (DECL_FIELD_OFFSET (field),
                        DECL_FIELD_BIT_OFFSET (field));
}

Returned 6 for the field “t”:

498	  tree pos = byte_position (last);
(gdb) n
499	  size = fold_build2 (PLUS_EXPR, TREE_TYPE (size), pos, compsize);
(gdb) call debug_generic_expr(pos)
6

So, I suspect that there is a bug in GCC which incorrectly represent the offset of the FAM field in the IR.

Thanks.

Qing
> On Aug 8, 2023, at 10:54 AM, Martin Uecker <uecker@tugraz.at> wrote:
> 
> 
> 
> I am sure this has been discussed before, but seeing that you
> test for a specific formula, let me point out the following:
> 
> There at least three different size expression which could
> make sense. Consider
> 
> short foo { int a; short b; char t[]; }; 
> 
> Most people seem to use
> 
> sizeof(struct foo) + N * sizeof(foo->t);
> 
> which for N == 3 yields 11 bytes on x86-64 because the formula
> adds the padding of the original struct. There is an example
> in the  C standard that uses this formula.
> 
> 
> But he minimum size of an object which stores N elements is
> 
> max(sizeof (struct s), offsetof(struct s, t[n]))
> 
> which is 9 bytes. 
> 
> This is what clang uses for statically allocated objects with
> initialization, while GCC uses the rule above (11 bytes). But 
> bdos / bos  then returns the smaller size of 9 which is a bit
> confusing.
> 
> 
> https://godbolt.org/z/K1hvaK1ns
> 
> https://github.com/llvm/llvm-project/issues/62929
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109956
> 
> 
> Then there is also the size of a similar array where the FAM
> is replaced with an array of static size:
> 
> struct foo { int a; short b; char t[3]; }; 
> 
> This would make the most sense to me, but it has 12 bytes
> because the padding is according to the usual alignment
> rules.
> 
> 
> Martin
> 
> 
> 
> Am Montag, dem 07.08.2023 um 09:16 -0700 schrieb Kees Cook:
>> On Fri, Aug 04, 2023 at 07:44:28PM +0000, Qing Zhao wrote:
>>> This is the 2nd version of the patch, per our discussion based on the
>>> review comments for the 1st version, the major changes in this version
>>> are:
>> 
>> Thanks for the update!
>> 
>>> 
>>> 1. change the name "element_count" to "counted_by";
>>> 2. change the parameter for the attribute from a STRING to an
>>> Identifier;
>>> 3. Add logic and testing cases to handle anonymous structure/unions;
>>> 4. Clarify documentation to permit the situation when the allocation
>>> size is larger than what's specified by "counted_by", at the same time,
>>> it's user's error if allocation size is smaller than what's specified by
>>> "counted_by";
>>> 5. Add a complete testing case for using counted_by attribute in
>>> __builtin_dynamic_object_size when there is mismatch between the
>>> allocation size and the value of "counted_by", the expecting behavior
>>> for each case and the explanation on why in the comments. 
>> 
>> All the "normal" test cases I have are passing; this is wonderful! :)
>> 
>> I'm still seeing unexpected situations when I've intentionally set
>> counted_by to be smaller than alloc_size, but I assume it's due to not
>> yet having the patch you mention below.
>> 
>>> As discussed, I plan to add two more separate patch sets after this initial
>>> patch set is approved and committed.
>>> 
>>> set 1. A new warning option and a new sanitizer option for the user error
>>>        when the allocation size is smaller than the value of "counted_by".
>>> set 2. An improvement to __builtin_dynamic_object_size  for the following
>>>        case:
>>> 
>>> struct A
>>> {
>>> size_t foo;
>>> int array[] __attribute__((counted_by (foo)));
>>> };
>>> 
>>> extern struct fix * alloc_buf ();
>>> 
>>> int main ()
>>> {
>>> struct fix *p = alloc_buf ();
>>> __builtin_object_size(p->array, 0) == sizeof(struct A) + p->foo * sizeof(int);
>>>   /* with the current algorithm, it’s UNKNOWN */ 
>>> __builtin_object_size(p->array, 2) == sizeof(struct A) + p->foo * sizeof(int);
>>>   /* with the current algorithm, it’s UNKNOWN */
>>> }
>> 
>> Should the above be bdos instead of bos?
>> 
>>> Bootstrapped and regression tested on both aarch64 and X86, no issue.
>> 
>> I've updated the Linux kernel's macros for the name change and done
>> build tests with my first pass at "easy" cases for adding counted_by:
>> https://git.kernel.org/pub/scm/linux/kernel/git/kees/linux.git/commit/?h=devel/counted_by&id=adc5b3cb48a049563dc673f348eab7b6beba8a9b
>> 
>> Everything is working as expected. :)
>> 
>> -Kees
>> 
> 
> -- 
> Univ.-Prof. Dr. rer. nat. Martin Uecker
> Graz University of Technology
> Institute of Biomedical Imaging


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [V2][PATCH 0/3] New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)
  2023-08-09 16:05     ` Qing Zhao
@ 2023-08-09 16:21       ` Michael Matz
  2023-08-09 20:10         ` Qing Zhao
  0 siblings, 1 reply; 33+ messages in thread
From: Michael Matz @ 2023-08-09 16:21 UTC (permalink / raw)
  To: Qing Zhao
  Cc: Martin Uecker, Kees Cook, Joseph Myers, Richard Biener,
	jakub Jelinek, Qing Zhao via Gcc-patches, Siddhesh Poyarekar,
	isanbard

[-- Attachment #1: Type: text/plain, Size: 2779 bytes --]

Hello,

On Wed, 9 Aug 2023, Qing Zhao wrote:

> Although this is an old FAM related issue that does not relate to my current patch 
> (and might need to be resolved in a separate patch).  I think that it’s necessary to have
> more discussion on this old issue and resolve it. 
> 
> The first thing that I’d like to confirm is:
> 
> What the exact memory layout for the following structure x?
> 
> struct foo { int a; short b; char t[]; } x = { .t = { 1, 2, 3 } };
> 
> And the key that is confusing me is, where should the field “t” start? 
> 
> A.  Starting at offset 8 as the following:
> 
> a          	4-bytes
> b          	2-bytes
> padding   2-bytes
> t           	3-bytes

Why should there be padding before 't'?  It's a char array (FAM or not), 
so it can be (and should be) placed directly after 'b'.  So ...

> B. Starting at offset 6 as the following:
> 
> a          	4-bytes
> b          	2-bytes
> t           	3-bytes

... this is the correct layout, when seen in isolation.  The discussion 
revolves around what should come after 't': if it's a non-FAM struct (with 
t[3]), then it's clear that there needs to be padding after it, so to pad 
out the whole struct to be 12 bytes long (for sizeof() purpose), as 
required by its alignment (due to the int field 'a').

So, should the equivalent FAM struct also have this sizeof()?  If no: 
there should be a good argument why it shouldn't be similar to the non-FAM 
one.

Then there is an argument that the compiler would be fine, when allocating 
a single object of such type (not as part of an array!), to only reserve 9 
bytes of space for the FAM-struct.  Then the question is: should it also 
do that for a non-FAM struct (based on the argument that the padding 
behind 't' isn't accessible, and hence doesn't need to be alloced).  I 
think it would be quite unexpected for the compiler to actually reserve 
less space than sizeof() implies, so I personally don't buy that argument.  
For FAM or non-FAM structs.

Note that if one choses to allocate less space than sizeof implies that 
this will have quite some consequences for code generation, in that 
sometimes the instruction sequences (e.g. for copying) need to be careful 
to never access tail padding that should be there in array context, but 
isn't there in single-object context.  I think this alone should make it 
clear that it's advisable that sizeof() and allocated size agree.

As in: I think sizeof for both structs should return 12, and 12 bytes 
should be reserved for objects of such types.

And then the next question is what __builtin_object_size should do with 
these: should it return the size with or without padding at end (i.e. 
could/should it return 9 even if sizeof is 12).  I can see arguments for 
both.


Ciao,
Michael.

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [V2][PATCH 0/3] New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)
  2023-08-07 16:33   ` Qing Zhao
@ 2023-08-09 19:17     ` Kees Cook
  0 siblings, 0 replies; 33+ messages in thread
From: Kees Cook @ 2023-08-09 19:17 UTC (permalink / raw)
  To: Qing Zhao
  Cc: joseph, richard.guenther, jakub, gcc-patches, siddhesh, uecker,
	isanbard, morbo

On Mon, Aug 07, 2023 at 04:33:13PM +0000, Qing Zhao wrote:
> What’s the testing case for the one that failed? 
> If it’s 
> 
> __builtin_dynamic_object_size(p->array, 0/2) without the allocation information in the routine, 
> then with the current algorithm, gcc cannot deduce the size for the whole object.
> 
> If not such case, let me know.

I found some more bugs in my tests (now fixed), but I'm left with a single
failure case, which is think it going to boil down to pointer/pointee
issue we discussed earlier.

Using my existing testing tool:
https://github.com/kees/kernel-tools/blob/trunk/fortify/array-bounds.c

I see this error with the "counted_by_seen_by_bdos" case:

Expected __builtin_dynamic_object_size(p, 1) (18446744073709551615) == sizeof(*p) + p->count * sizeof(*p->array) (80)

A reduced view of the code is:

        struct annotated *p;
        int index = MAX_INDEX + unconst;

        p = alloc_annotated(index);

        EXPECT_EQ(__builtin_dynamic_object_size(p, 1), sizeof(*p) + p->count * sizeof(*p->array));

It looks like __bdos will not use the __counted_by information from the
pointee if the argument is only the pointer. i.e. this test works:

        EXPECT_EQ(__builtin_dynamic_object_size(p->array, 1), p->count * sizeof(*p->array));

However, I thought if any part of the pointee was used (e.g. p->count,
p->array), GCC would be happy to start using the pointee details?

And, again, for the initial version at this feature, I'm fine with this
failing test being declared not a valid test. :) But I'll need some
kind of builtin that can correctly interrogate a pointer to find the
full runtime size with the assumption that pointer is valid, but that
can come later.

And as a side note, I am excited to see the very correct warnings for
bad (too late) assignment of the __counted_by member:

	p->array[0] = 0;
	p->count = 1;

array-bounds.c: In function 'invalid_assignment_order':
array-bounds.c:366:17: warning: '*p.count' is used uninitialized [-Wuninitialized]
  366 |         p->array[0] = 0;
      |         ~~~~~~~~^~~

Yay! :)

-Kees

-- 
Kees Cook

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [V2][PATCH 0/3] New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)
  2023-08-09 16:21       ` Michael Matz
@ 2023-08-09 20:10         ` Qing Zhao
  2023-08-10  6:58           ` Martin Uecker
  2023-08-10 13:54           ` Michael Matz
  0 siblings, 2 replies; 33+ messages in thread
From: Qing Zhao @ 2023-08-09 20:10 UTC (permalink / raw)
  To: Michael Matz
  Cc: Martin Uecker, Kees Cook, Joseph Myers, Richard Biener,
	jakub Jelinek, Qing Zhao via Gcc-patches, Siddhesh Poyarekar,
	isanbard



> On Aug 9, 2023, at 12:21 PM, Michael Matz <matz@suse.de> wrote:
> 
> Hello,
> 
> On Wed, 9 Aug 2023, Qing Zhao wrote:
> 
>> Although this is an old FAM related issue that does not relate to my current patch 
>> (and might need to be resolved in a separate patch).  I think that it’s necessary to have
>> more discussion on this old issue and resolve it. 
>> 
>> The first thing that I’d like to confirm is:
>> 
>> What the exact memory layout for the following structure x?
>> 
>> struct foo { int a; short b; char t[]; } x = { .t = { 1, 2, 3 } };
>> 
>> And the key that is confusing me is, where should the field “t” start? 
>> 
>> A.  Starting at offset 8 as the following:
>> 
>> a          	4-bytes
>> b          	2-bytes
>> padding   2-bytes
>> t           	3-bytes
> 
> Why should there be padding before 't'?  It's a char array (FAM or not), 
> so it can be (and should be) placed directly after 'b'.  So ...

Yes, you are right. My mistake -:)
> 
>> B. Starting at offset 6 as the following:
>> 
>> a          	4-bytes
>> b          	2-bytes
>> t           	3-bytes
> 
> ... this is the correct layout, when seen in isolation.  The discussion 
> revolves around what should come after 't': if it's a non-FAM struct (with 
> t[3]), then it's clear that there needs to be padding after it, so to pad 
> out the whole struct to be 12 bytes long (for sizeof() purpose), as 
> required by its alignment (due to the int field 'a').

right.
> 
> So, should the equivalent FAM struct also have this sizeof()?  If no: 
> there should be a good argument why it shouldn't be similar to the non-FAM 
> one.

The sizeof() of a structure with FAM is defined as: (after I searched online,
 I think that the one in Wikipedia is the most reasonable one):
https://en.wikipedia.org/wiki/Flexible_array_member

======
Effect on struct size and padding
The sizeof operator on such a struct gives the size of the structure as if the flexible 
array member were empty. This may include padding added to accommodate the
 flexible member; the compiler is also free to re-use such padding as part of the array itself.[2]

It is common to allocate sizeof(struct) + array_len*sizeof(array element) bytes. 

This is not wrong, but it may allocate a few more bytes than necessary: the compiler
 may be re-purposing some of the padding that is included in sizeof(struct). Should 
this be a concern, macros are available[3] to compute the minimum size while 
ensuring that the compiler's padding is not disrupted. 

As the array may start in the padding before the end of the structure, its content
 should always be accessed via indexing (arr[i]) or  offsetof, not sizeof.
======

By definition, the sizeof() of a struct with FAM might not be the same as the non-FAM one. 
i.e, for the following two structures, one with FAM, the other with fixed array:

struct foo_flex { int a; short b; char t[]; } x = { .t = { 1, 2, 3 } };
struct foo_fix {int a; short b; char t[3]; } 

With current GCC:
sizeof(foo_flex) == 8
sizeof(foo_fix) == 12

I think that the current behavior of sizeof for structure with FAM in GCC is correct. 

The major issue is what was pointed out by Martin in the previous email:

Whether using the following formula is correct to compute the allocation?

sizeof(struct foo_flex) + N * sizeof(foo->t);

As pointed out  in the wikipedia, the value computed by this formula might
 be bigger than the actual size since “sizeof(struct foo_flex)” might include 
paddings that are used as part of the array.

So the more accurate formula should be

offset(struct foo_flex, t[0]) + N * sizeof(foo->t);

With GCC, offset(struct foo_flex,t[0]) == 6, which is also correct. 


> Then there is an argument that the compiler would be fine, when allocating 
> a single object of such type (not as part of an array!), to only reserve 9 
> bytes of space for the FAM-struct.  Then the question is: should it also 
> do that for a non-FAM struct (based on the argument that the padding 
> behind 't' isn't accessible, and hence doesn't need to be alloced).  I 
> think it would be quite unexpected for the compiler to actually reserve 
> less space than sizeof() implies, so I personally don't buy that argument.  
> For FAM or non-FAM structs.

For the question, whether the compiler needs to allocate paddings after the FAM field,
 I don’t know the answer, and it’s not specified in the standard either. 
Does it matter?

> 
> Note that if one choses to allocate less space than sizeof implies that 
> this will have quite some consequences for code generation, in that 
> sometimes the instruction sequences (e.g. for copying) need to be careful 
> to never access tail padding that should be there in array context, but 
> isn't there in single-object context.  I think this alone should make it 
> clear that it's advisable that sizeof() and allocated size agree.

Sizeof by definition is return the size of the TYPE, not the size of the allocated object.

Another thing I need to point out is,  a structure with FAM cannot be an element of an array.
> 
> As in: I think sizeof for both structs should return 12, and 12 bytes 
> should be reserved for objects of such types.
> 
> And then the next question is what __builtin_object_size should do with 
> these: should it return the size with or without padding at end (i.e. 
> could/should it return 9 even if sizeof is 12).  I can see arguments for 
> both.

Currently, GCC’s __builtin_object_size use the following formula to compute the object size for
The structure with FAM:

offset(struct foo_flex, t[0]) + N * sizeof(foo->t);

I think it’s correct. 

I think that the users might need to use this formula to compute the allocation size for a structure with FAM too.

Qing

> 
> 
> Ciao,
> Michael.


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [V2][PATCH 0/3] New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)
  2023-08-08 14:54   ` Martin Uecker
                       ` (2 preceding siblings ...)
  2023-08-09 16:05     ` Qing Zhao
@ 2023-08-09 20:34     ` Qing Zhao
  3 siblings, 0 replies; 33+ messages in thread
From: Qing Zhao @ 2023-08-09 20:34 UTC (permalink / raw)
  To: Martin Uecker
  Cc: Kees Cook, joseph, richard.guenther, jakub, gcc-patches,
	siddhesh, isanbard



> On Aug 8, 2023, at 10:54 AM, Martin Uecker <uecker@tugraz.at> wrote:
> 
> 
> 
> I am sure this has been discussed before, but seeing that you
> test for a specific formula, let me point out the following:
> 
> There at least three different size expression which could
> make sense. Consider
> 
> short foo { int a; short b; char t[]; }; 
> 
> Most people seem to use
> 
> sizeof(struct foo) + N * sizeof(foo->t);
> 
> which for N == 3 yields 11 bytes on x86-64 because the formula
> adds the padding of the original struct. There is an example
> in the  C standard that uses this formula.
> 
> 
> But he minimum size of an object which stores N elements is
> 
> max(sizeof (struct s), offsetof(struct s, t[n]))
> 
> which is 9 bytes. 
> 
> This is what clang uses for statically allocated objects with
> initialization, while GCC uses the rule above (11 bytes). But 
> bdos / bos  then returns the smaller size of 9 which is a bit
> confusing.

As I checked the algorithm for bos in GCC,  it uses a similar formula as the following to compute the object size:

offset(struct foo, t[0]) + N * sizeof(*foo->t);

Which seems correct to me.  (Therefore bos returns 9 for this example).
> 
> 
> https://godbolt.org/z/K1hvaK1ns
> 
> https://github.com/llvm/llvm-project/issues/62929
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109956
> 
> 
> Then there is also the size of a similar array where the FAM
> is replaced with an array of static size:
> 
> struct foo { int a; short b; char t[3]; }; 
> 
> This would make the most sense to me, but it has 12 bytes
> because the padding is according to the usual alignment
> rules.

My understanding is, since a structure with FAM cannot be an element of an array, 
 the tailing padding might not be necessary?

Qing

> 
> 
> Martin
> 
> 
> 
> Am Montag, dem 07.08.2023 um 09:16 -0700 schrieb Kees Cook:
>> On Fri, Aug 04, 2023 at 07:44:28PM +0000, Qing Zhao wrote:
>>> This is the 2nd version of the patch, per our discussion based on the
>>> review comments for the 1st version, the major changes in this version
>>> are:
>> 
>> Thanks for the update!
>> 
>>> 
>>> 1. change the name "element_count" to "counted_by";
>>> 2. change the parameter for the attribute from a STRING to an
>>> Identifier;
>>> 3. Add logic and testing cases to handle anonymous structure/unions;
>>> 4. Clarify documentation to permit the situation when the allocation
>>> size is larger than what's specified by "counted_by", at the same time,
>>> it's user's error if allocation size is smaller than what's specified by
>>> "counted_by";
>>> 5. Add a complete testing case for using counted_by attribute in
>>> __builtin_dynamic_object_size when there is mismatch between the
>>> allocation size and the value of "counted_by", the expecting behavior
>>> for each case and the explanation on why in the comments. 
>> 
>> All the "normal" test cases I have are passing; this is wonderful! :)
>> 
>> I'm still seeing unexpected situations when I've intentionally set
>> counted_by to be smaller than alloc_size, but I assume it's due to not
>> yet having the patch you mention below.
>> 
>>> As discussed, I plan to add two more separate patch sets after this initial
>>> patch set is approved and committed.
>>> 
>>> set 1. A new warning option and a new sanitizer option for the user error
>>>        when the allocation size is smaller than the value of "counted_by".
>>> set 2. An improvement to __builtin_dynamic_object_size  for the following
>>>        case:
>>> 
>>> struct A
>>> {
>>> size_t foo;
>>> int array[] __attribute__((counted_by (foo)));
>>> };
>>> 
>>> extern struct fix * alloc_buf ();
>>> 
>>> int main ()
>>> {
>>> struct fix *p = alloc_buf ();
>>> __builtin_object_size(p->array, 0) == sizeof(struct A) + p->foo * sizeof(int);
>>>   /* with the current algorithm, it’s UNKNOWN */ 
>>> __builtin_object_size(p->array, 2) == sizeof(struct A) + p->foo * sizeof(int);
>>>   /* with the current algorithm, it’s UNKNOWN */
>>> }
>> 
>> Should the above be bdos instead of bos?
>> 
>>> Bootstrapped and regression tested on both aarch64 and X86, no issue.
>> 
>> I've updated the Linux kernel's macros for the name change and done
>> build tests with my first pass at "easy" cases for adding counted_by:
>> https://git.kernel.org/pub/scm/linux/kernel/git/kees/linux.git/commit/?h=devel/counted_by&id=adc5b3cb48a049563dc673f348eab7b6beba8a9b
>> 
>> Everything is working as expected. :)
>> 
>> -Kees
>> 
> 
> -- 
> Univ.-Prof. Dr. rer. nat. Martin Uecker
> Graz University of Technology
> Institute of Biomedical Imaging


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [V2][PATCH 0/3] New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)
  2023-08-09 20:10         ` Qing Zhao
@ 2023-08-10  6:58           ` Martin Uecker
  2023-08-10 13:59             ` Qing Zhao
  2023-08-10 14:02             ` Michael Matz
  2023-08-10 13:54           ` Michael Matz
  1 sibling, 2 replies; 33+ messages in thread
From: Martin Uecker @ 2023-08-10  6:58 UTC (permalink / raw)
  To: Qing Zhao, Michael Matz
  Cc: Kees Cook, Joseph Myers, Richard Biener, jakub Jelinek,
	Qing Zhao via Gcc-patches, Siddhesh Poyarekar, isanbard

Am Mittwoch, dem 09.08.2023 um 20:10 +0000 schrieb Qing Zhao:
> 
> > On Aug 9, 2023, at 12:21 PM, Michael Matz <matz@suse.de> wrote:

...
> 
> By definition, the sizeof() of a struct with FAM might not be the same as the non-FAM one. 
> i.e, for the following two structures, one with FAM, the other with fixed array:
> 
> struct foo_flex { int a; short b; char t[]; } x = { .t = { 1, 2, 3 } };
> struct foo_fix {int a; short b; char t[3]; } 
> 
> With current GCC:
> sizeof(foo_flex) == 8
> sizeof(foo_fix) == 12
> 
> I think that the current behavior of sizeof for structure with FAM in GCC is correct. 

Yes, sadly the sizeof has to be like this as required by ISO C.

> 
> The major issue is what was pointed out by Martin in the previous email:
> 
> Whether using the following formula is correct to compute the allocation?
> 
> sizeof(struct foo_flex) + N * sizeof(foo->t);

That formula is safe for allocation, but might allocate more padding
than the minimum amount and it might allocate less storage than a
similar array with fixed array.

> As pointed out  in the wikipedia, the value computed by this formula might
>  be bigger than the actual size since “sizeof(struct foo_flex)” might include 
> paddings that are used as part of the array.
> 
> So the more accurate formula should be
> 
> offset(struct foo_flex, t[0]) + N * sizeof(foo->t);
> 
> With GCC, offset(struct foo_flex,t[0]) == 6, which is also correct. 

This formula might be considered incorrect / dangerous because
it might allocate less storage than sizeof(struct foo_flex). 


https://godbolt.org/z/8accq75f3
> 
...
> > As in: I think sizeof for both structs should return 12, and 12 bytes 
> > should be reserved for objects of such types.
> > 
> > And then the next question is what __builtin_object_size should do with 
> > these: should it return the size with or without padding at end (i.e. 
> > could/should it return 9 even if sizeof is 12).  I can see arguments for 
> > both.
> 
> Currently, GCC’s __builtin_object_size use the following formula to compute the object size for
> The structure with FAM:
> 
> offset(struct foo_flex, t[0]) + N * sizeof(foo->t);
> 
> I think it’s correct. 
> 
> I think that the users might need to use this formula to compute the allocation size for a structure with FAM too.

I am not sure for the reason given above. The following
code would not work:

struct foo_flex { int a; short b; char t[]; } x;
x.a = 1;
struct foo_flex *p = malloc(sizeof(x) + x.a);
if (!p) abort();
memcpy(p, &x, sizeof(x)); // initialize struct


Martin




^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [V2][PATCH 0/3] New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)
  2023-08-09 20:10         ` Qing Zhao
  2023-08-10  6:58           ` Martin Uecker
@ 2023-08-10 13:54           ` Michael Matz
  1 sibling, 0 replies; 33+ messages in thread
From: Michael Matz @ 2023-08-10 13:54 UTC (permalink / raw)
  To: Qing Zhao
  Cc: Martin Uecker, Kees Cook, Joseph Myers, Richard Biener,
	jakub Jelinek, Qing Zhao via Gcc-patches, Siddhesh Poyarekar,
	isanbard

[-- Attachment #1: Type: text/plain, Size: 5709 bytes --]

Hello,

On Wed, 9 Aug 2023, Qing Zhao wrote:

> > So, should the equivalent FAM struct also have this sizeof()?  If no: 
> > there should be a good argument why it shouldn't be similar to the non-FAM 
> > one.
> 
> The sizeof() of a structure with FAM is defined as: (after I searched online,
>  I think that the one in Wikipedia is the most reasonable one):
> https://en.wikipedia.org/wiki/Flexible_array_member

Well, wikipedia has it's uses.  Here we are in language lawyering land 
together with a discussion what makes most sense in many circumstances.  
FWIW, in this case I think the cited text from wikipedia is correct in the 
sense of "not wrong" but not helpful in the sense of "good advise".

> By definition, the sizeof() of a struct with FAM might not be the same 
> as the non-FAM one. i.e, for the following two structures, one with FAM, 
> the other with fixed array:
> 
> struct foo_flex { int a; short b; char t[]; } x = { .t = { 1, 2, 3 } };
> struct foo_fix {int a; short b; char t[3]; } 
> 
> With current GCC:
> sizeof(foo_flex) == 8
> sizeof(foo_fix) == 12
> 
> I think that the current behavior of sizeof for structure with FAM in 
> GCC is correct.

It is, yes.

> The major issue is what was pointed out by Martin in the previous email:
> 
> Whether using the following formula is correct to compute the 
> allocation?
> 
> sizeof(struct foo_flex) + N * sizeof(foo->t);
> 
> As pointed out  in the wikipedia, the value computed by this formula might
> be bigger than the actual size since “sizeof(struct foo_flex)” might include 
> paddings that are used as part of the array.

That doesn't make the formula incorrect, but rather conservatively 
correct.  If you don't want to be conservative, then yes, you can use a 
different formula if you happen to know the layout rules your compiler at 
hand uses (or the ones prescribed by an ABI, if it does that).  I think 
it would be bad advise to the general population do advertise this scheme 
as better.

> So the more accurate formula should be
> 
> offset(struct foo_flex, t[0]) + N * sizeof(foo->t);

"* sizeof(foo->t[0])", but yes.

> For the question, whether the compiler needs to allocate paddings after 
> the FAM field, I don’t know the answer, and it’s not specified in the 
> standard either. Does it matter?

It matters for two things:

1) Abstract reasons: is there as reason to deviate 
from the normal rules?  If not: it shouldn't deviate.  Future 
extensibility: while it right now is not possible to form an array 
of FMA-structs (in C!), who's to say that it may not be eventually added?  
It seems a natural enough extension of an extension, and while it has 
certain implementation problems (the "real" size of the elements needs to 
be computable, and hence be part of the array type) those could be 
overcome.  At that point you _really_ want to have the elements aligned 
naturally, be compatible with sizeof, and be the same as an individual 
object.

2) Practical reasons: codegeneration works better if the accessible sizes 
of objects are a multiple of their alignment, as often you have 
instructions that can move around alignment-sized blobs (say, words).  If 
you don't pad out objects you have to be careful to use only byte accesses 
when you get to the end of an object.

Let me ask the question in the opposite way: what would you _gain_ by not 
allocating padding?  And is that enough to deviate from the most obvious 
choices?  (Do note that e.g. global or local FMA-typed objects don't exist 
in isolation, and their surrounding objects have their own alignment 
rules, which often will lead to tail padding anyway, even if you would use 
the non-conservative size calculation; the same applies for malloc'ed 
objects).

> > Note that if one choses to allocate less space than sizeof implies that 
> > this will have quite some consequences for code generation, in that 
> > sometimes the instruction sequences (e.g. for copying) need to be careful 
> > to never access tail padding that should be there in array context, but 
> > isn't there in single-object context.  I think this alone should make it 
> > clear that it's advisable that sizeof() and allocated size agree.
> 
> Sizeof by definition is return the size of the TYPE, not the size of the 
> allocated object.

Sure.  Outside special cases like FMA it's the same, though.  And there 
sizeof does include tail padding.

> > And then the next question is what __builtin_object_size should do with 
> > these: should it return the size with or without padding at end (i.e. 
> > could/should it return 9 even if sizeof is 12).  I can see arguments for 
> > both.
> 
> Currently, GCC’s __builtin_object_size use the following formula to 
> compute the object size for The structure with FAM:
> 
> offset(struct foo_flex, t[0]) + N * sizeof(foo->t);
> 
> I think it’s correct.

See above.  It's non-conservatively correct.  And that may be the right 
choice for this builtin, considering its intended use-cases (strict 
checking of allowed access ranges) ...

> I think that the users might need to use this formula to compute the 
> allocation size for a structure with FAM too.

... but that doesn't automatically follows.  There's a difference between 
accessible memory range (for language semantic purposes) and allocated 
memory range.  I could imagine that somewhen __builtin_object_size doesn't 
include tail padding for normal non-FMA types either.  After all you can't 
rely on the contents there, though you won't get segfaults when you access 
it.  sizeof will continue to have to include it, though.  So that's a 
demonstration of why _bos is not the right thing to use for allocation 
purposes.


Ciao,
Michael.

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [V2][PATCH 0/3] New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)
  2023-08-10  6:58           ` Martin Uecker
@ 2023-08-10 13:59             ` Qing Zhao
  2023-08-10 14:38               ` Martin Uecker
  2023-08-10 14:02             ` Michael Matz
  1 sibling, 1 reply; 33+ messages in thread
From: Qing Zhao @ 2023-08-10 13:59 UTC (permalink / raw)
  To: Martin Uecker
  Cc: Michael Matz, Kees Cook, Joseph Myers, Richard Biener,
	jakub Jelinek, Qing Zhao via Gcc-patches, Siddhesh Poyarekar,
	isanbard



> On Aug 10, 2023, at 2:58 AM, Martin Uecker <muecker@gwdg.de> wrote:
> 
> Am Mittwoch, dem 09.08.2023 um 20:10 +0000 schrieb Qing Zhao:
>> 
>>> On Aug 9, 2023, at 12:21 PM, Michael Matz <matz@suse.de> wrote:
> 
> ...
>> 
>> By definition, the sizeof() of a struct with FAM might not be the same as the non-FAM one. 
>> i.e, for the following two structures, one with FAM, the other with fixed array:
>> 
>> struct foo_flex { int a; short b; char t[]; } x = { .t = { 1, 2, 3 } };
>> struct foo_fix {int a; short b; char t[3]; } 
>> 
>> With current GCC:
>> sizeof(foo_flex) == 8
>> sizeof(foo_fix) == 12
>> 
>> I think that the current behavior of sizeof for structure with FAM in GCC is correct. 
> 
> Yes, sadly the sizeof has to be like this as required by ISO C.
Agreed. Yes, if the size information of the FAM can be integrated into the 
TYPE system in C standard, we will not have such issue anymore. 

> 
>> 
>> The major issue is what was pointed out by Martin in the previous email:
>> 
>> Whether using the following formula is correct to compute the allocation?
>> 
>> sizeof(struct foo_flex) + N * sizeof(foo->t);
> 
> That formula is safe for allocation, but might allocate more padding
> than the minimum amount and
Yes. 
> it might allocate less storage than a
> similar array with fixed array.
Yes. 
> 
>> As pointed out  in the wikipedia, the value computed by this formula might
>>  be bigger than the actual size since “sizeof(struct foo_flex)” might include 
>> paddings that are used as part of the array.
>> 
>> So the more accurate formula should be
>> 
>> offset(struct foo_flex, t[0]) + N * sizeof(foo->t);
>> 
>> With GCC, offset(struct foo_flex,t[0]) == 6, which is also correct. 
> 
> This formula might be considered incorrect / dangerous because
> it might allocate less storage than sizeof(struct foo_flex). 
> 
> 
> https://godbolt.org/z/8accq75f3

I see, thanks.
>> 
> ...
>>> As in: I think sizeof for both structs should return 12, and 12 bytes 
>>> should be reserved for objects of such types.
>>> 
>>> And then the next question is what __builtin_object_size should do with 
>>> these: should it return the size with or without padding at end (i.e. 
>>> could/should it return 9 even if sizeof is 12).  I can see arguments for 
>>> both.
>> 
>> Currently, GCC’s __builtin_object_size use the following formula to compute the object size for
>> The structure with FAM:
>> 
>> offset(struct foo_flex, t[0]) + N * sizeof(foo->t);
>> 
>> I think it’s correct. 
>> 
>> I think that the users might need to use this formula to compute the allocation size for a structure with FAM too.
> 
> I am not sure for the reason given above. The following
> code would not work:
> 
> struct foo_flex { int a; short b; char t[]; } x;
> x.a = 1;
> struct foo_flex *p = malloc(sizeof(x) + x.a);
> if (!p) abort();
> memcpy(p, &x, sizeof(x)); // initialize struct
> 
Okay. 
Then, the user still should use the sizeof(struct foo_flex) + N * sizeof(foo->t) for the allocation, even though this might allocate more bytes than necessary. (But this is safe)

Let me know if I still miss anything.

Thanks a lot for the explanation.

Qing
> Martin
> 
> 
> 


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [V2][PATCH 0/3] New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)
  2023-08-10  6:58           ` Martin Uecker
  2023-08-10 13:59             ` Qing Zhao
@ 2023-08-10 14:02             ` Michael Matz
  1 sibling, 0 replies; 33+ messages in thread
From: Michael Matz @ 2023-08-10 14:02 UTC (permalink / raw)
  To: Martin Uecker
  Cc: Qing Zhao, Kees Cook, Joseph Myers, Richard Biener,
	jakub Jelinek, Qing Zhao via Gcc-patches, Siddhesh Poyarekar,
	isanbard

Hey,

On Thu, 10 Aug 2023, Martin Uecker wrote:

> > offset(struct foo_flex, t[0]) + N * sizeof(foo->t);
> > 
> > With GCC, offset(struct foo_flex,t[0]) == 6, which is also correct. 
> 
> This formula might be considered incorrect / dangerous because
> it might allocate less storage than sizeof(struct foo_flex). 

Oh indeed.  I hadn't even considered that.  That could be "fixed" with 
another max(theabove, sizeof(struct foo_flex)), but that starts to become 
silly when the obvious choice works fine.


Ciao,
Michael.

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [V2][PATCH 0/3] New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)
  2023-08-10 13:59             ` Qing Zhao
@ 2023-08-10 14:38               ` Martin Uecker
  2023-08-10 14:42                 ` Jakub Jelinek
  0 siblings, 1 reply; 33+ messages in thread
From: Martin Uecker @ 2023-08-10 14:38 UTC (permalink / raw)
  To: Qing Zhao
  Cc: Michael Matz, Kees Cook, Joseph Myers, Richard Biener,
	jakub Jelinek, Qing Zhao via Gcc-patches, Siddhesh Poyarekar,
	isanbard

Am Donnerstag, dem 10.08.2023 um 13:59 +0000 schrieb Qing Zhao:
> 
> > On Aug 10, 2023, at 2:58 AM, Martin Uecker <muecker@gwdg.de> wrote:
> > 
> > Am Mittwoch, dem 09.08.2023 um 20:10 +0000 schrieb Qing Zhao:
> > > 
> > > > On Aug 9, 2023, at 12:21 PM, Michael Matz <matz@suse.de> wrote:
> > 

> > I am not sure for the reason given above. The following
> > code would not work:
> > 
> > struct foo_flex { int a; short b; char t[]; } x;
> > x.a = 1;
> > struct foo_flex *p = malloc(sizeof(x) + x.a);
> > if (!p) abort();
> > memcpy(p, &x, sizeof(x)); // initialize struct
> > 
> Okay. 
> Then, the user still should use the sizeof(struct foo_flex) + N * sizeof(foo->t) for the allocation, even though this might allocate more bytes than necessary. (But this is safe)
> 
> Let me know if I still miss anything.

The question is not only what the user should use to
allocate, but also what BDOS should return.  In my
example the user uses the sizeof() + N * sizeof
formula and the memcpy is safe, but it would be flagged
as a buffer overrun if BDOS uses the offsetof formula.

Martin



^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [V2][PATCH 0/3] New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)
  2023-08-10 14:38               ` Martin Uecker
@ 2023-08-10 14:42                 ` Jakub Jelinek
  2023-08-10 14:47                   ` Martin Uecker
  0 siblings, 1 reply; 33+ messages in thread
From: Jakub Jelinek @ 2023-08-10 14:42 UTC (permalink / raw)
  To: Martin Uecker
  Cc: Qing Zhao, Michael Matz, Kees Cook, Joseph Myers, Richard Biener,
	Qing Zhao via Gcc-patches, Siddhesh Poyarekar, isanbard

On Thu, Aug 10, 2023 at 04:38:21PM +0200, Martin Uecker wrote:
> Am Donnerstag, dem 10.08.2023 um 13:59 +0000 schrieb Qing Zhao:
> > 
> > > On Aug 10, 2023, at 2:58 AM, Martin Uecker <muecker@gwdg.de> wrote:
> > > 
> > > Am Mittwoch, dem 09.08.2023 um 20:10 +0000 schrieb Qing Zhao:
> > > > 
> > > > > On Aug 9, 2023, at 12:21 PM, Michael Matz <matz@suse.de> wrote:
> > > 
> 
> > > I am not sure for the reason given above. The following
> > > code would not work:
> > > 
> > > struct foo_flex { int a; short b; char t[]; } x;
> > > x.a = 1;
> > > struct foo_flex *p = malloc(sizeof(x) + x.a);
> > > if (!p) abort();
> > > memcpy(p, &x, sizeof(x)); // initialize struct
> > > 
> > Okay. 
> > Then, the user still should use the sizeof(struct foo_flex) + N * sizeof(foo->t) for the allocation, even though this might allocate more bytes than necessary. (But this is safe)
> > 
> > Let me know if I still miss anything.
> 
> The question is not only what the user should use to
> allocate, but also what BDOS should return.  In my
> example the user uses the sizeof() + N * sizeof
> formula and the memcpy is safe, but it would be flagged
> as a buffer overrun if BDOS uses the offsetof formula.

BDOS/BOS (at least the 0 level) should return what is actually
allocated for the var, what size was passed to malloc and if it
is a var with flex array member with initialization what is actually the
size on the stack or in .data/.rodata etc.
And for 1 level the same unless it is just access to some element, then
it should be capped by the size of that element.

	Jakub


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [V2][PATCH 0/3] New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)
  2023-08-10 14:42                 ` Jakub Jelinek
@ 2023-08-10 14:47                   ` Martin Uecker
  2023-08-10 14:58                     ` Siddhesh Poyarekar
  0 siblings, 1 reply; 33+ messages in thread
From: Martin Uecker @ 2023-08-10 14:47 UTC (permalink / raw)
  To: Jakub Jelinek
  Cc: Qing Zhao, Michael Matz, Kees Cook, Joseph Myers, Richard Biener,
	Qing Zhao via Gcc-patches, Siddhesh Poyarekar, isanbard

Am Donnerstag, dem 10.08.2023 um 16:42 +0200 schrieb Jakub Jelinek:
> On Thu, Aug 10, 2023 at 04:38:21PM +0200, Martin Uecker wrote:
> > Am Donnerstag, dem 10.08.2023 um 13:59 +0000 schrieb Qing Zhao:
> > > 
> > > > On Aug 10, 2023, at 2:58 AM, Martin Uecker <muecker@gwdg.de> wrote:
> > > > 
> > > > Am Mittwoch, dem 09.08.2023 um 20:10 +0000 schrieb Qing Zhao:
> > > > > 
> > > > > > On Aug 9, 2023, at 12:21 PM, Michael Matz <matz@suse.de> wrote:
> > > > 
> > 
> > > > I am not sure for the reason given above. The following
> > > > code would not work:
> > > > 
> > > > struct foo_flex { int a; short b; char t[]; } x;
> > > > x.a = 1;
> > > > struct foo_flex *p = malloc(sizeof(x) + x.a);
> > > > if (!p) abort();
> > > > memcpy(p, &x, sizeof(x)); // initialize struct
> > > > 
> > > Okay. 
> > > Then, the user still should use the sizeof(struct foo_flex) + N * sizeof(foo->t) for the allocation, even though this might allocate more bytes than necessary. (But this is safe)
> > > 
> > > Let me know if I still miss anything.
> > 
> > The question is not only what the user should use to
> > allocate, but also what BDOS should return.  In my
> > example the user uses the sizeof() + N * sizeof
> > formula and the memcpy is safe, but it would be flagged
> > as a buffer overrun if BDOS uses the offsetof formula.
> 
> BDOS/BOS (at least the 0 level) should return what is actually
> allocated for the var, what size was passed to malloc and if it
> is a var with flex array member with initialization what is actually the
> size on the stack or in .data/.rodata etc.

Agreed.

But what about a struct with FAM with the new "counted_by" attribute
if the original allocation is not visible?

Martin

> And for 1 level the same unless it is just access to some element, then
> it should be capped by the size of that element.
> 





^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [V2][PATCH 0/3] New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)
  2023-08-10 14:47                   ` Martin Uecker
@ 2023-08-10 14:58                     ` Siddhesh Poyarekar
  2023-08-10 15:18                       ` Martin Uecker
  0 siblings, 1 reply; 33+ messages in thread
From: Siddhesh Poyarekar @ 2023-08-10 14:58 UTC (permalink / raw)
  To: Martin Uecker, Jakub Jelinek
  Cc: Qing Zhao, Michael Matz, Kees Cook, Joseph Myers, Richard Biener,
	Qing Zhao via Gcc-patches, isanbard

On 2023-08-10 10:47, Martin Uecker wrote:
> Am Donnerstag, dem 10.08.2023 um 16:42 +0200 schrieb Jakub Jelinek:
>> On Thu, Aug 10, 2023 at 04:38:21PM +0200, Martin Uecker wrote:
>>> Am Donnerstag, dem 10.08.2023 um 13:59 +0000 schrieb Qing Zhao:
>>>>
>>>>> On Aug 10, 2023, at 2:58 AM, Martin Uecker <muecker@gwdg.de> wrote:
>>>>>
>>>>> Am Mittwoch, dem 09.08.2023 um 20:10 +0000 schrieb Qing Zhao:
>>>>>>
>>>>>>> On Aug 9, 2023, at 12:21 PM, Michael Matz <matz@suse.de> wrote:
>>>>>
>>>
>>>>> I am not sure for the reason given above. The following
>>>>> code would not work:
>>>>>
>>>>> struct foo_flex { int a; short b; char t[]; } x;
>>>>> x.a = 1;
>>>>> struct foo_flex *p = malloc(sizeof(x) + x.a);
>>>>> if (!p) abort();
>>>>> memcpy(p, &x, sizeof(x)); // initialize struct
>>>>>
>>>> Okay.
>>>> Then, the user still should use the sizeof(struct foo_flex) + N * sizeof(foo->t) for the allocation, even though this might allocate more bytes than necessary. (But this is safe)
>>>>
>>>> Let me know if I still miss anything.
>>>
>>> The question is not only what the user should use to
>>> allocate, but also what BDOS should return.  In my
>>> example the user uses the sizeof() + N * sizeof
>>> formula and the memcpy is safe, but it would be flagged
>>> as a buffer overrun if BDOS uses the offsetof formula.
>>
>> BDOS/BOS (at least the 0 level) should return what is actually
>> allocated for the var, what size was passed to malloc and if it
>> is a var with flex array member with initialization what is actually the
>> size on the stack or in .data/.rodata etc.
> 
> Agreed.
> 
> But what about a struct with FAM with the new "counted_by" attribute
> if the original allocation is not visible?

There's precedent for this through the __access__ attribute; __bos 
trusts what the attribute says about the allocation.

Sid

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [V2][PATCH 0/3] New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)
  2023-08-10 14:58                     ` Siddhesh Poyarekar
@ 2023-08-10 15:18                       ` Martin Uecker
  2023-08-10 16:28                         ` Qing Zhao
  2023-08-10 16:30                         ` Siddhesh Poyarekar
  0 siblings, 2 replies; 33+ messages in thread
From: Martin Uecker @ 2023-08-10 15:18 UTC (permalink / raw)
  To: Siddhesh Poyarekar, Jakub Jelinek
  Cc: Qing Zhao, Michael Matz, Kees Cook, Joseph Myers, Richard Biener,
	Qing Zhao via Gcc-patches, isanbard

Am Donnerstag, dem 10.08.2023 um 10:58 -0400 schrieb Siddhesh Poyarekar:
> On 2023-08-10 10:47, Martin Uecker wrote:
> > Am Donnerstag, dem 10.08.2023 um 16:42 +0200 schrieb Jakub Jelinek:
> > > On Thu, Aug 10, 2023 at 04:38:21PM +0200, Martin Uecker wrote:
> > > > Am Donnerstag, dem 10.08.2023 um 13:59 +0000 schrieb Qing Zhao:
> > > > > 
> > > > > > On Aug 10, 2023, at 2:58 AM, Martin Uecker <muecker@gwdg.de> wrote:
> > > > > > 
> > > > > > Am Mittwoch, dem 09.08.2023 um 20:10 +0000 schrieb Qing Zhao:
> > > > > > > 
> > > > > > > > On Aug 9, 2023, at 12:21 PM, Michael Matz <matz@suse.de> wrote:
> > > > > > 
> > > > 
> > > > > > I am not sure for the reason given above. The following
> > > > > > code would not work:
> > > > > > 
> > > > > > struct foo_flex { int a; short b; char t[]; } x;
> > > > > > x.a = 1;
> > > > > > struct foo_flex *p = malloc(sizeof(x) + x.a);
> > > > > > if (!p) abort();
> > > > > > memcpy(p, &x, sizeof(x)); // initialize struct
> > > > > > 
> > > > > Okay.
> > > > > Then, the user still should use the sizeof(struct foo_flex) + N * sizeof(foo->t) for the allocation, even though this might allocate more bytes than necessary. (But this is safe)
> > > > > 
> > > > > Let me know if I still miss anything.
> > > > 
> > > > The question is not only what the user should use to
> > > > allocate, but also what BDOS should return.  In my
> > > > example the user uses the sizeof() + N * sizeof
> > > > formula and the memcpy is safe, but it would be flagged
> > > > as a buffer overrun if BDOS uses the offsetof formula.
> > > 
> > > BDOS/BOS (at least the 0 level) should return what is actually
> > > allocated for the var, what size was passed to malloc and if it
> > > is a var with flex array member with initialization what is actually the
> > > size on the stack or in .data/.rodata etc.
> > 
> > Agreed.
> > 
> > But what about a struct with FAM with the new "counted_by" attribute
> > if the original allocation is not visible?
> 
> There's precedent for this through the __access__ attribute; __bos 
> trusts what the attribute says about the allocation.

The access attribute gives the size directly. The counted_by gives
a length for the array which needs to be translated into a size
via a formula. There are different formulas in use. The question 
is which formula should bdos trust?

Whatever you pick, if this is not consistent with the actual
allocation or use, then it will cause problems either by
breaking code or not detecting buffer overruns.

So it needs to be consistent with what GCC allocates for a
var with FAM and initialization and also the user needs to 
be told what the right choice is so that he can use the right
size for allocation and argument to memcpy / memset etc.

Martin







^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [V2][PATCH 0/3] New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)
  2023-08-10 15:18                       ` Martin Uecker
@ 2023-08-10 16:28                         ` Qing Zhao
  2023-08-10 16:30                         ` Siddhesh Poyarekar
  1 sibling, 0 replies; 33+ messages in thread
From: Qing Zhao @ 2023-08-10 16:28 UTC (permalink / raw)
  To: Martin Uecker
  Cc: Siddhesh Poyarekar, Jakub Jelinek, Michael Matz, Kees Cook,
	Joseph Myers, Richard Biener, Qing Zhao via Gcc-patches,
	isanbard

Hi, Martin,

> On Aug 10, 2023, at 11:18 AM, Martin Uecker <muecker@gwdg.de> wrote:
> 
> Am Donnerstag, dem 10.08.2023 um 10:58 -0400 schrieb Siddhesh Poyarekar:
>> On 2023-08-10 10:47, Martin Uecker wrote:
>>> Am Donnerstag, dem 10.08.2023 um 16:42 +0200 schrieb Jakub Jelinek:
>>>> On Thu, Aug 10, 2023 at 04:38:21PM +0200, Martin Uecker wrote:
>>>>> Am Donnerstag, dem 10.08.2023 um 13:59 +0000 schrieb Qing Zhao:
>>>>>> 
>>>>>>> On Aug 10, 2023, at 2:58 AM, Martin Uecker <muecker@gwdg.de> wrote:
>>>>>>> 
>>>>>>> Am Mittwoch, dem 09.08.2023 um 20:10 +0000 schrieb Qing Zhao:
>>>>>>>> 
>>>>>>>>> On Aug 9, 2023, at 12:21 PM, Michael Matz <matz@suse.de> wrote:
>>>>>>> 
>>>>> 
>>>>>>> I am not sure for the reason given above. The following
>>>>>>> code would not work:
>>>>>>> 
>>>>>>> struct foo_flex { int a; short b; char t[]; } x;
>>>>>>> x.a = 1;
>>>>>>> struct foo_flex *p = malloc(sizeof(x) + x.a);
>>>>>>> if (!p) abort();
>>>>>>> memcpy(p, &x, sizeof(x)); // initialize struct
>>>>>>> 
>>>>>> Okay.
>>>>>> Then, the user still should use the sizeof(struct foo_flex) + N * sizeof(foo->t) for the allocation, even though this might allocate more bytes than necessary. (But this is safe)
>>>>>> 
>>>>>> Let me know if I still miss anything.
>>>>> 
>>>>> The question is not only what the user should use to
>>>>> allocate, but also what BDOS should return.  In my
>>>>> example the user uses the sizeof() + N * sizeof
>>>>> formula and the memcpy is safe, but it would be flagged
>>>>> as a buffer overrun if BDOS uses the offsetof formula.
>>>> 
>>>> BDOS/BOS (at least the 0 level) should return what is actually
>>>> allocated for the var, what size was passed to malloc and if it
>>>> is a var with flex array member with initialization what is actually the
>>>> size on the stack or in .data/.rodata etc.
>>> 
>>> Agreed.
>>> 
>>> But what about a struct with FAM with the new "counted_by" attribute
>>> if the original allocation is not visible?
>> 
>> There's precedent for this through the __access__ attribute; __bos 
>> trusts what the attribute says about the allocation.
> 
> The access attribute gives the size directly. The counted_by gives
> a length for the array which needs to be translated into a size
> via a formula. There are different formulas in use. The question 
> is which formula should bdos trust?
> 
> Whatever you pick, if this is not consistent with the actual
> allocation or use, then it will cause problems either by
> breaking code or not detecting buffer overruns.
> 
> So it needs to be consistent with what GCC allocates for a
> var with FAM and initialization and also the user needs to 
> be told what the right choice is so that he can use the right
> size for allocation and argument to memcpy / memset etc.

All agreed. Thanks a lot for raising these issues and providing helpful suggestions. 
I will double check on these and make sure __bos behaves correctly for structure with FAM.

Might come back with more questions for discussion…-:).

Thanks.

Qing
> 
> Martin


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [V2][PATCH 0/3] New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)
  2023-08-10 15:18                       ` Martin Uecker
  2023-08-10 16:28                         ` Qing Zhao
@ 2023-08-10 16:30                         ` Siddhesh Poyarekar
  2023-08-10 16:39                           ` Jakub Jelinek
  1 sibling, 1 reply; 33+ messages in thread
From: Siddhesh Poyarekar @ 2023-08-10 16:30 UTC (permalink / raw)
  To: Martin Uecker, Jakub Jelinek
  Cc: Qing Zhao, Michael Matz, Kees Cook, Joseph Myers, Richard Biener,
	Qing Zhao via Gcc-patches, isanbard

On 2023-08-10 11:18, Martin Uecker wrote:
> Am Donnerstag, dem 10.08.2023 um 10:58 -0400 schrieb Siddhesh Poyarekar:
>> On 2023-08-10 10:47, Martin Uecker wrote:
>>> Am Donnerstag, dem 10.08.2023 um 16:42 +0200 schrieb Jakub Jelinek:
>>>> On Thu, Aug 10, 2023 at 04:38:21PM +0200, Martin Uecker wrote:
>>>>> Am Donnerstag, dem 10.08.2023 um 13:59 +0000 schrieb Qing Zhao:
>>>>>>
>>>>>>> On Aug 10, 2023, at 2:58 AM, Martin Uecker <muecker@gwdg.de> wrote:
>>>>>>>
>>>>>>> Am Mittwoch, dem 09.08.2023 um 20:10 +0000 schrieb Qing Zhao:
>>>>>>>>
>>>>>>>>> On Aug 9, 2023, at 12:21 PM, Michael Matz <matz@suse.de> wrote:
>>>>>>>
>>>>>
>>>>>>> I am not sure for the reason given above. The following
>>>>>>> code would not work:
>>>>>>>
>>>>>>> struct foo_flex { int a; short b; char t[]; } x;
>>>>>>> x.a = 1;
>>>>>>> struct foo_flex *p = malloc(sizeof(x) + x.a);
>>>>>>> if (!p) abort();
>>>>>>> memcpy(p, &x, sizeof(x)); // initialize struct
>>>>>>>
>>>>>> Okay.
>>>>>> Then, the user still should use the sizeof(struct foo_flex) + N * sizeof(foo->t) for the allocation, even though this might allocate more bytes than necessary. (But this is safe)
>>>>>>
>>>>>> Let me know if I still miss anything.
>>>>>
>>>>> The question is not only what the user should use to
>>>>> allocate, but also what BDOS should return.  In my
>>>>> example the user uses the sizeof() + N * sizeof
>>>>> formula and the memcpy is safe, but it would be flagged
>>>>> as a buffer overrun if BDOS uses the offsetof formula.
>>>>
>>>> BDOS/BOS (at least the 0 level) should return what is actually
>>>> allocated for the var, what size was passed to malloc and if it
>>>> is a var with flex array member with initialization what is actually the
>>>> size on the stack or in .data/.rodata etc.
>>>
>>> Agreed.
>>>
>>> But what about a struct with FAM with the new "counted_by" attribute
>>> if the original allocation is not visible?
>>
>> There's precedent for this through the __access__ attribute; __bos
>> trusts what the attribute says about the allocation.
> 
> The access attribute gives the size directly. The counted_by gives
> a length for the array which needs to be translated into a size
> via a formula. There are different formulas in use. The question
> is which formula should bdos trust?
> 
> Whatever you pick, if this is not consistent with the actual
> allocation or use, then it will cause problems either by
> breaking code or not detecting buffer overruns.
> 
> So it needs to be consistent with what GCC allocates for a
> var with FAM and initialization and also the user needs to
> be told what the right choice is so that he can use the right
> size for allocation and argument to memcpy / memset etc.

We'd rather miss overflow to the extent of padding than to try and be 
overly aggressive; I doubt if we're missing much protection in practice 
by trying to account for the padding.

The definition of __bos/__bdos allows us the freedom to *estimate* 
rather than be precise, so I'd go for sizeof(x) + N * sizeof(*x.a) since 
it's bound to give the more conservative answer of the two.

Thanks,
Sid

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [V2][PATCH 0/3] New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)
  2023-08-10 16:30                         ` Siddhesh Poyarekar
@ 2023-08-10 16:39                           ` Jakub Jelinek
  2023-08-10 17:06                             ` Siddhesh Poyarekar
  2023-08-10 18:18                             ` Qing Zhao
  0 siblings, 2 replies; 33+ messages in thread
From: Jakub Jelinek @ 2023-08-10 16:39 UTC (permalink / raw)
  To: Siddhesh Poyarekar
  Cc: Martin Uecker, Qing Zhao, Michael Matz, Kees Cook, Joseph Myers,
	Richard Biener, Qing Zhao via Gcc-patches, isanbard

On Thu, Aug 10, 2023 at 12:30:06PM -0400, Siddhesh Poyarekar wrote:
> The definition of __bos/__bdos allows us the freedom to *estimate* rather
> than be precise, so I'd go for sizeof(x) + N * sizeof(*x.a) since it's bound
> to give the more conservative answer of the two.

To be precise, we have the 0/1 modes vs. 2/3.  So, when not determining
__bos/__bdos from actual allocation size or size of an stack object or
size of data section object but something else (say counted_by), perhaps
0/1 modes should give the upper estimate of sizeof (x) + N * sizeof(elt)
and 2/3 modes should give a lower estimate, so offsetof + N * sizeof(elt),
then user code can continue testing if both modes are equal to have
exact number.

	Jakub


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [V2][PATCH 0/3] New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)
  2023-08-10 16:39                           ` Jakub Jelinek
@ 2023-08-10 17:06                             ` Siddhesh Poyarekar
  2023-08-16 21:42                               ` Qing Zhao
  2023-08-10 18:18                             ` Qing Zhao
  1 sibling, 1 reply; 33+ messages in thread
From: Siddhesh Poyarekar @ 2023-08-10 17:06 UTC (permalink / raw)
  To: Jakub Jelinek
  Cc: Martin Uecker, Qing Zhao, Michael Matz, Kees Cook, Joseph Myers,
	Richard Biener, Qing Zhao via Gcc-patches, isanbard

On 2023-08-10 12:39, Jakub Jelinek wrote:
> On Thu, Aug 10, 2023 at 12:30:06PM -0400, Siddhesh Poyarekar wrote:
>> The definition of __bos/__bdos allows us the freedom to *estimate* rather
>> than be precise, so I'd go for sizeof(x) + N * sizeof(*x.a) since it's bound
>> to give the more conservative answer of the two.
> 
> To be precise, we have the 0/1 modes vs. 2/3.  So, when not determining
> __bos/__bdos from actual allocation size or size of an stack object or
> size of data section object but something else (say counted_by), perhaps
> 0/1 modes should give the upper estimate of sizeof (x) + N * sizeof(elt)
> and 2/3 modes should give a lower estimate, so offsetof + N * sizeof(elt),
> then user code can continue testing if both modes are equal to have
> exact number.

Ack, that's fair.

Thanks,
Sid

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [V2][PATCH 0/3] New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)
  2023-08-10 16:39                           ` Jakub Jelinek
  2023-08-10 17:06                             ` Siddhesh Poyarekar
@ 2023-08-10 18:18                             ` Qing Zhao
  1 sibling, 0 replies; 33+ messages in thread
From: Qing Zhao @ 2023-08-10 18:18 UTC (permalink / raw)
  To: Jakub Jelinek
  Cc: Siddhesh Poyarekar, Martin Uecker, Michael Matz, Kees Cook,
	Joseph Myers, Richard Biener, Qing Zhao via Gcc-patches,
	isanbard



> On Aug 10, 2023, at 12:39 PM, Jakub Jelinek <jakub@redhat.com> wrote:
> 
> On Thu, Aug 10, 2023 at 12:30:06PM -0400, Siddhesh Poyarekar wrote:
>> The definition of __bos/__bdos allows us the freedom to *estimate* rather
>> than be precise, so I'd go for sizeof(x) + N * sizeof(*x.a) since it's bound
>> to give the more conservative answer of the two.
> 
> To be precise, we have the 0/1 modes vs. 2/3.  So, when not determining
> __bos/__bdos from actual allocation size or size of an stack object or
> size of data section object but something else (say counted_by), perhaps
> 0/1 modes should give the upper estimate of sizeof (x) + N * sizeof(elt)
> and 2/3 modes should give a lower estimate, so offsetof + N * sizeof(elt),
> then user code can continue testing if both modes are equal to have
> exact number.

Yes, this sounds reasonable to me.

Qing
> 
> 	Jakub
> 


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [V2][PATCH 0/3] New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)
  2023-08-10 17:06                             ` Siddhesh Poyarekar
@ 2023-08-16 21:42                               ` Qing Zhao
  0 siblings, 0 replies; 33+ messages in thread
From: Qing Zhao @ 2023-08-16 21:42 UTC (permalink / raw)
  To: Siddhesh Poyarekar, jakub Jelinek, Martin Uecker
  Cc: Michael Matz, Kees Cook, Joseph Myers, Richard Biener,
	Qing Zhao via Gcc-patches, isanbard

Hi, 

After some more studying and consideration, the following is my thoughts:

For a structure with FMA annotated with counted_by attribute: (the following small example)

====
struct annotated {
        size_t foo;
        char b;
        char array[] __attribute__((counted_by (foo)));
};

#define noinline __attribute__((__noinline__))
#define MAX(a, b)  ((a) > (b) ? (a) :  (b))

static struct annotated * noinline alloc_buf (size_t length)
{
  struct annotated *p;
  p = (struct annotated *) malloc (MAX (sizeof (struct annotated),
                                        			(offsetof (struct annotated, array[0])
                                         			+ (length) * sizeof (char)))); 
  p->foo = length;
  return p;
}

int main ()
{
  struct annotated *p = alloc_buf (10);
  printf("the__bdos of max p->array whole is %d \n", __builtin_dynamic_object_size(p->array, 0)); 
  printf("the__bdos of max p->array sub is %d \n", __builtin_dynamic_object_size(p->array, 1));
  printf("the__bdos of min p->array whole is %d \n", __builtin_dynamic_object_size(p->array, 2));
  printf("the__bdos of min p->array sub is %d \n", __builtin_dynamic_object_size(p->array, 3));   
}

=====

 The actual allocation of the structure and the layout of the structure p is fixed at compilation time,
    A. We know the offsetof (p->array) during compilation time, (it’s 9)
    B. We also know the size of the p->array though the counted_by attribute, it’s p->foo * sizeof (char).

  1.  for subobject size (1/3 modes), Both A and B are know at compilation time, whatever it’s MAX or MIN, we 
    can determine  the size of the subobject p->array is:  p->foo * sizeof(char) without estimation. 

  2.  for whole object size (0/2 modes), since we don’t have any info on the actual allocation or structure 
    Initialization, we don’t know the size for the whole object whatever it’s MAX or MIN. 

So, the problem to decide which formula to use ((sizeof (x) + N * sizeof(elt), or offsetof + N * sizeof(elt)) is actually
the programmer’s job when allocating memory for the structure with FMA. (It’s not compiler’s job).  

Since this size computation is really confusing for the structure with FMA, I think that adding some clarification in
the documentation might be necessary to provide more details and guidance to the end-users.

Let me know if I miss anything here.

Thanks a lot.

Qing




> On Aug 10, 2023, at 11:18 AM, Martin Uecker <muecker@gwdg.de> wrote:
> The access attribute gives the size directly. The counted_by gives
> a length for the array which needs to be translated into a size
> via a formula. There are different formulas in use. The question 
> is which formula should bdos trust?
> 
> Whatever you pick, if this is not consistent with the actual
> allocation or use, then it will cause problems either by
> breaking code or not detecting buffer overruns.
> 
> So it needs to be consistent with what GCC allocates for a
> var with FAM and initialization and also the user needs to 
> be told what the right choice is so that he can use the right
> size for allocation and argument to memcpy / memset etc.

> On Aug 10, 2023, at 1:06 PM, Siddhesh Poyarekar <siddhesh@gotplt.org> wrote:
> 
> On 2023-08-10 12:39, Jakub Jelinek wrote:
>> On Thu, Aug 10, 2023 at 12:30:06PM -0400, Siddhesh Poyarekar wrote:
>>> The definition of __bos/__bdos allows us the freedom to *estimate* rather
>>> than be precise, so I'd go for sizeof(x) + N * sizeof(*x.a) since it's bound
>>> to give the more conservative answer of the two.
>> To be precise, we have the 0/1 modes vs. 2/3.  So, when not determining
>> __bos/__bdos from actual allocation size or size of an stack object or
>> size of data section object but something else (say counted_by), perhaps
>> 0/1 modes should give the upper estimate of sizeof (x) + N * sizeof(elt)
>> and 2/3 modes should give a lower estimate, so offsetof + N * sizeof(elt),
>> then user code can continue testing if both modes are equal to have
>> exact number.
> 
> Ack, that's fair.
> 
> Thanks,
> Sid


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [V2][PATCH 0/3] New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)
  2023-08-04 19:44 [V2][PATCH 0/3] New attribute "counted_by" to annotate bounds for C99 FAM(PR108896) Qing Zhao
                   ` (3 preceding siblings ...)
  2023-08-07 16:16 ` [V2][PATCH 0/3] New attribute "counted_by" to annotate bounds for C99 FAM(PR108896) Kees Cook
@ 2023-08-17  5:31 ` Kees Cook
  2023-08-17  6:38   ` Kees Cook
  4 siblings, 1 reply; 33+ messages in thread
From: Kees Cook @ 2023-08-17  5:31 UTC (permalink / raw)
  To: Qing Zhao
  Cc: joseph, richard.guenther, jakub, gcc-patches, siddhesh, uecker, isanbard

On Fri, Aug 04, 2023 at 07:44:28PM +0000, Qing Zhao wrote:
> This is the 2nd version of the patch, per our discussion based on the
> review comments for the 1st version, the major changes in this version

I've been using Coccinelle to find and annotate[1] structures (193 so
far...), and I've encountered 2 cases of GCC internal errors. I'm working
on a minimized test case, but just in case these details are immediately
helpful, here's what I'm seeing:

../drivers/net/wireless/ath/wcn36xx/smd.c: In function 'wcn36xx_smd_rsp_process':
../drivers/net/wireless/ath/wcn36xx/smd.c:3299:5: error: incorrect sharing of tree nodes
 3299 | int wcn36xx_smd_rsp_process(struct rpmsg_device *rpdev,
      |     ^~~~~~~~~~~~~~~~~~~~~~~
MEM[(struct wcn36xx_hal_ind_msg *)_96]
_15 = &MEM[(struct wcn36xx_hal_ind_msg *)_96].msg;
during GIMPLE pass: objsz
../drivers/net/wireless/ath/wcn36xx/smd.c:3299:5: internal compiler error: verify_gimple failed
0xfe97fd verify_gimple_in_cfg(function*, bool, bool)
        ../../../../gcc/gcc/tree-cfg.cc:5646
0xe84894 execute_function_todo
        ../../../../gcc/gcc/passes.cc:2088
0xe84dee execute_todo
        ../../../../gcc/gcc/passes.cc:2142

The associated struct is:

struct wcn36xx_hal_ind_msg {
        struct list_head list;
        size_t msg_len;
        u8 msg[] __counted_by(msg_len);
};



And:

../drivers/usb/gadget/function/f_fs.c: In function '__ffs_epfile_read_data':
../drivers/usb/gadget/function/f_fs.c:900:16: error: incorrect sharing of tree nodes
  900 | static ssize_t __ffs_epfile_read_data(struct ffs_epfile *epfile,
      |                ^~~~~~~~~~~~~~~~~~~~~~
MEM[(struct ffs_buffer *)_67]
_5 = &MEM[(struct ffs_buffer *)_67].storage;
during GIMPLE pass: objsz
../drivers/usb/gadget/function/f_fs.c:900:16: internal compiler error: verify_gimple failed
0xfe97fd verify_gimple_in_cfg(function*, bool, bool)
        ../../../../gcc/gcc/tree-cfg.cc:5646
0xe84894 execute_function_todo
        ../../../../gcc/gcc/passes.cc:2088
0xe84dee execute_todo
        ../../../../gcc/gcc/passes.cc:2142

with:

struct ffs_buffer {
        size_t length;
        char *data;
        char storage[] __counted_by(length);
};


[1] https://github.com/kees/kernel-tools/blob/trunk/coccinelle/examples/counted_by.cocci

-- 
Kees Cook

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [V2][PATCH 0/3] New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)
  2023-08-17  5:31 ` Kees Cook
@ 2023-08-17  6:38   ` Kees Cook
  2023-08-17 13:44     ` Qing Zhao
  0 siblings, 1 reply; 33+ messages in thread
From: Kees Cook @ 2023-08-17  6:38 UTC (permalink / raw)
  To: Qing Zhao
  Cc: joseph, richard.guenther, jakub, gcc-patches, siddhesh, uecker, isanbard

On Wed, Aug 16, 2023 at 10:31:30PM -0700, Kees Cook wrote:
> On Fri, Aug 04, 2023 at 07:44:28PM +0000, Qing Zhao wrote:
> > This is the 2nd version of the patch, per our discussion based on the
> > review comments for the 1st version, the major changes in this version
> 
> I've been using Coccinelle to find and annotate[1] structures (193 so
> far...), and I've encountered 2 cases of GCC internal errors. I'm working
> on a minimized test case, but just in case these details are immediately
> helpful, here's what I'm seeing:

Okay, I got it minimized:

$ cat poc.c
struct a {
  unsigned long c;
  char d[] __attribute__((__counted_by__(c)));
} *b;

void f(long);

void e(void) {
  long g = __builtin_dynamic_object_size(b->d, 1);
  f(g);
}
$ gcc -O2 -c -o /dev/null poc.c
poc.c: In function 'e':
poc.c:8:6: error: incorrect sharing of tree nodes
    8 | void e(void) {
      |      ^
*b.0_1
_2 = &b.0_1->d;
during GIMPLE pass: objsz
poc.c:8:6: internal compiler error: verify_gimple failed
0xfe97fd verify_gimple_in_cfg(function*, bool, bool)
        ../../../../gcc/gcc/tree-cfg.cc:5646
0xe84894 execute_function_todo
        ../../../../gcc/gcc/passes.cc:2088
0xe84dee execute_todo
        ../../../../gcc/gcc/passes.cc:2142

-- 
Kees Cook

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [V2][PATCH 0/3] New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)
  2023-08-17  6:38   ` Kees Cook
@ 2023-08-17 13:44     ` Qing Zhao
  2023-08-17 16:54       ` Kees Cook
  0 siblings, 1 reply; 33+ messages in thread
From: Qing Zhao @ 2023-08-17 13:44 UTC (permalink / raw)
  To: Kees Cook
  Cc: Joseph Myers, Richard Biener, jakub Jelinek, gcc-patches,
	Siddhesh Poyarekar, Martin Uecker, isanbard

Hi, Kees,

Thanks for the testing case. 
Yes, I noticed this issue too, and already fixed it in my private branch. 

With the latest patch, the compilation has no issue:
[opc@qinzhao-ol8u3-x86 108896]$ sh t
/home/opc/Install/latest-d/bin/gcc -O2 -c -o /dev/null bug.c
[opc@qinzhao-ol8u3-x86 108896]$ 

Qing

> On Aug 17, 2023, at 2:38 AM, Kees Cook <keescook@chromium.org> wrote:
> 
> On Wed, Aug 16, 2023 at 10:31:30PM -0700, Kees Cook wrote:
>> On Fri, Aug 04, 2023 at 07:44:28PM +0000, Qing Zhao wrote:
>>> This is the 2nd version of the patch, per our discussion based on the
>>> review comments for the 1st version, the major changes in this version
>> 
>> I've been using Coccinelle to find and annotate[1] structures (193 so
>> far...), and I've encountered 2 cases of GCC internal errors. I'm working
>> on a minimized test case, but just in case these details are immediately
>> helpful, here's what I'm seeing:
> 
> Okay, I got it minimized:
> 
> $ cat poc.c
> struct a {
>  unsigned long c;
>  char d[] __attribute__((__counted_by__(c)));
> } *b;
> 
> void f(long);
> 
> void e(void) {
>  long g = __builtin_dynamic_object_size(b->d, 1);
>  f(g);
> }
> $ gcc -O2 -c -o /dev/null poc.c
> poc.c: In function 'e':
> poc.c:8:6: error: incorrect sharing of tree nodes
>    8 | void e(void) {
>      |      ^
> *b.0_1
> _2 = &b.0_1->d;
> during GIMPLE pass: objsz
> poc.c:8:6: internal compiler error: verify_gimple failed
> 0xfe97fd verify_gimple_in_cfg(function*, bool, bool)
>        ../../../../gcc/gcc/tree-cfg.cc:5646
> 0xe84894 execute_function_todo
>        ../../../../gcc/gcc/passes.cc:2088
> 0xe84dee execute_todo
>        ../../../../gcc/gcc/passes.cc:2142
> 
> -- 
> Kees Cook


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [V2][PATCH 0/3] New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)
  2023-08-17 13:44     ` Qing Zhao
@ 2023-08-17 16:54       ` Kees Cook
  0 siblings, 0 replies; 33+ messages in thread
From: Kees Cook @ 2023-08-17 16:54 UTC (permalink / raw)
  To: Qing Zhao
  Cc: Joseph Myers, Richard Biener, jakub Jelinek, gcc-patches,
	Siddhesh Poyarekar, Martin Uecker, isanbard

On Thu, Aug 17, 2023 at 01:44:42PM +0000, Qing Zhao wrote:
> Thanks for the testing case. 
> Yes, I noticed this issue too, and already fixed it in my private branch. 
> 
> With the latest patch, the compilation has no issue:
> [opc@qinzhao-ol8u3-x86 108896]$ sh t
> /home/opc/Install/latest-d/bin/gcc -O2 -c -o /dev/null bug.c
> [opc@qinzhao-ol8u3-x86 108896]$ 

Great! Thanks. I look forward to v3. For now I'll leave off these 2
annotations in my kernel builds. :)

-Kees

-- 
Kees Cook

^ permalink raw reply	[flat|nested] 33+ messages in thread

end of thread, other threads:[~2023-08-17 16:54 UTC | newest]

Thread overview: 33+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-08-04 19:44 [V2][PATCH 0/3] New attribute "counted_by" to annotate bounds for C99 FAM(PR108896) Qing Zhao
2023-08-04 19:44 ` [V2][PATCH 1/3] Provide counted_by attribute to flexible array member field (PR108896) Qing Zhao
2023-08-04 19:44 ` [V2][PATCH 2/3] Use the counted_by atribute info in builtin object size [PR108896] Qing Zhao
2023-08-04 19:44 ` [V2][PATCH 3/3] Use the counted_by attribute information in bound sanitizer[PR108896] Qing Zhao
2023-08-07 16:16 ` [V2][PATCH 0/3] New attribute "counted_by" to annotate bounds for C99 FAM(PR108896) Kees Cook
2023-08-07 16:33   ` Qing Zhao
2023-08-09 19:17     ` Kees Cook
2023-08-08 14:54   ` Martin Uecker
2023-08-08 16:18     ` Michael Matz
2023-08-08 19:58     ` Kees Cook
2023-08-09 16:05     ` Qing Zhao
2023-08-09 16:21       ` Michael Matz
2023-08-09 20:10         ` Qing Zhao
2023-08-10  6:58           ` Martin Uecker
2023-08-10 13:59             ` Qing Zhao
2023-08-10 14:38               ` Martin Uecker
2023-08-10 14:42                 ` Jakub Jelinek
2023-08-10 14:47                   ` Martin Uecker
2023-08-10 14:58                     ` Siddhesh Poyarekar
2023-08-10 15:18                       ` Martin Uecker
2023-08-10 16:28                         ` Qing Zhao
2023-08-10 16:30                         ` Siddhesh Poyarekar
2023-08-10 16:39                           ` Jakub Jelinek
2023-08-10 17:06                             ` Siddhesh Poyarekar
2023-08-16 21:42                               ` Qing Zhao
2023-08-10 18:18                             ` Qing Zhao
2023-08-10 14:02             ` Michael Matz
2023-08-10 13:54           ` Michael Matz
2023-08-09 20:34     ` Qing Zhao
2023-08-17  5:31 ` Kees Cook
2023-08-17  6:38   ` Kees Cook
2023-08-17 13:44     ` Qing Zhao
2023-08-17 16:54       ` Kees Cook

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).