public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* [V1][PATCH 0/3] New attribute "element_count" to annotate bounds for C99 FAM(PR108896)
@ 2023-05-25 16:14 Qing Zhao
  2023-05-25 16:14 ` [V1][PATCH 1/3] Provide element_count attribute to flexible array member field (PR108896) Qing Zhao
                   ` (4 more replies)
  0 siblings, 5 replies; 52+ messages in thread
From: Qing Zhao @ 2023-05-25 16:14 UTC (permalink / raw)
  To: joseph, richard.guenther, jakub, gcc-patches
  Cc: keescook, siddhesh, uecker, isanbard, Qing Zhao

Hi,

This patch set introduces a new attribute "element_count" to annotate bounds 
for C99 flexible array member.

A gcc bugzilla PR108896 has been created to record this task:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108896

A nice writeup "Bounded Flexible Arrays in C" 
https://people.kernel.org/kees/bounded-flexible-arrays-in-c.
written by Kees Cook, from Kernel Self-Protection Project, provides a solid
background and motivation of this new attribute:

"With flexible arrays now a first-class citizen in Linux and the compilers,
it becomes possible to extend their available diagnostics.  What the compiler
is missing is knowledge of how the length of a given flexible array is tracked.
For well-described flexible array structs, this means associating the member 
holding the element count with the flexible array member. This idea is not new,
though prior implementation (https://www.open-std.org/jtc1/sc22/wg14/www/docs/n2660.pdf)
proposals have wanted to make changes to the C language syntax. A simpler
approach is the addition of struct member attributes, and is under discussion
 and early development by both the GCC and Clang developer communities."

The basic idea is to annotate the flexible array member with a new attribute
 "element_count" to track its number of elements to another field in the same
 structure, for example:

struct object {
..
 size_t count;  /* carries the number of elements info for the FAM flex.  */
 int flex[]; 
};

will become:

struct object {
..
 size_t count:  /* carries the number of elements info for the FAM flex.  */
 int flex[] __attribute__((element_count ("count")));
};

GCC will pass the number of elements info from the attached attribute to both 
__builtin_dynamic_object_size and bounds sanitizer to check the out-of-bounds
or dynamic object size issues during runtime for flexible array members.

This new feature will provide nice protection to flexible array members (which
currently are completely ignored by both __builtin_dynamic_object_size and
bounds sanitizers).

Possible future additions to this initial work include supporting counts from
a variable outside the structure, or a field in the outer structure if needed.  

If the GCC extension works well, this feature might be promoted into new C
 standard in the future.

Clang has a similar initial implemenation which is under review:

https://reviews.llvm.org/D148381

Linux kernel also has a patch to use this new feature:

https://lore.kernel.org/lkml/20230504211827.GA1666363@dev-arch.thelio-3990X/T/

The patch set include 3 patches:

1/3: Provide element_count attribute to flexible array member field (PR108896)
2/3: Use the element_count atribute info in builtin object size [PR108896].
3/3: Use the element_count attribute information in bound sanitizer[PR108896]

bootstrapped and regression tested on aarch64 and x86.

Let me know if you have any comment or suggestion.

Thanks.

Qing 


^ permalink raw reply	[flat|nested] 52+ messages in thread

* [V1][PATCH 1/3] Provide element_count attribute to flexible array member field (PR108896)
  2023-05-25 16:14 [V1][PATCH 0/3] New attribute "element_count" to annotate bounds for C99 FAM(PR108896) Qing Zhao
@ 2023-05-25 16:14 ` Qing Zhao
  2023-05-25 21:02   ` Joseph Myers
  2023-05-25 16:14 ` [V1][PATCH 2/3] Use the element_count atribute info in builtin object size [PR108896] Qing Zhao
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 52+ messages in thread
From: Qing Zhao @ 2023-05-25 16:14 UTC (permalink / raw)
  To: joseph, richard.guenther, jakub, gcc-patches
  Cc: keescook, siddhesh, uecker, isanbard, Qing Zhao

'element_count ("COUNT")'
     The 'element_count' attribute may be attached to the flexible array
     member of a structure.  It indicates that the number of the
     elements of the array is given by the field named "COUNT" in the
     same structure as the flexible array member.  GCC uses this
     information to improve the results of
     '__builtin_dynamic_object_size' and array bound sanitizer.

     For instance, the following declaration:

          struct P {
            size_t count;
            int array[] __attribute__ ((element_count ("count")));
          };

     specify that 'array' is a flexible array member whose number of
     element is given by the field "'count'" in the same structure.

The number of elements information provided by this attribute can be
used by __builtin_dynamic_object_size and array bound sanitizer to detect
out-of-bound errors for flexible array member references.

2023-05-17 Qing Zhao <qing.zhao@oracle.com>

gcc/c-family/ChangeLog:

	PR C/108896
	* c-attribs.cc (handle_element_count_attribute): New function.
	* c-common.cc (c_flexible_array_member_type_p): To this.
	* c-common.h (c_flexible_array_member_type_p): New prototype.

gcc/c/ChangeLog:

	PR C/108896
	* c-decl.cc (flexible_array_member_type_p): Renamed and moved to...
	(add_flexible_array_elts_to_size): Use renamed function.
	(is_flexible_array_member_p): Use renamed function.
	(verify_element_count_attribute): New function.
	(finish_struct): Use renamed function and verify element count
	attribute.

gcc/ChangeLog:

	PR C/108896
	* doc/extend.texi: Document attribute element_count.

gcc/testsuite/ChangeLog:

	PR C/108896
	* gcc.dg/flex-array-element-count.c: New test.
---
 gcc/c-family/c-attribs.cc                     | 51 ++++++++++++++++
 gcc/c-family/c-common.cc                      | 13 ++++
 gcc/c-family/c-common.h                       |  1 +
 gcc/c/c-decl.cc                               | 61 ++++++++++++++-----
 gcc/doc/extend.texi                           | 21 +++++++
 .../gcc.dg/flex-array-element-count.c         | 27 ++++++++
 6 files changed, 158 insertions(+), 16 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/flex-array-element-count.c

diff --git a/gcc/c-family/c-attribs.cc b/gcc/c-family/c-attribs.cc
index 072cfb69147..d45d11077c3 100644
--- a/gcc/c-family/c-attribs.cc
+++ b/gcc/c-family/c-attribs.cc
@@ -103,6 +103,8 @@ static tree handle_warn_if_not_aligned_attribute (tree *, tree, tree,
 						  int, bool *);
 static tree handle_strict_flex_array_attribute (tree *, tree, tree,
 						 int, bool *);
+static tree handle_element_count_attribute (tree *, tree, tree,
+					   int, bool *);
 static tree handle_weak_attribute (tree *, tree, tree, int, bool *) ;
 static tree handle_noplt_attribute (tree *, tree, tree, int, bool *) ;
 static tree handle_alias_ifunc_attribute (bool, tree *, tree, tree, bool *);
@@ -373,6 +375,8 @@ const struct attribute_spec c_common_attribute_table[] =
 			      handle_warn_if_not_aligned_attribute, NULL },
   { "strict_flex_array",      1, 1, true, false, false, false,
 			      handle_strict_flex_array_attribute, NULL },
+  { "element_count",	      1, 1, true, false, false, false,
+			      handle_element_count_attribute, NULL },
   { "weak",                   0, 0, true,  false, false, false,
 			      handle_weak_attribute, NULL },
   { "noplt",                   0, 0, true,  false, false, false,
@@ -2555,6 +2559,53 @@ handle_strict_flex_array_attribute (tree *node, tree name,
   return NULL_TREE;
 }
 
+/* Handle a "element_count" attribute; arguments as in
+   struct attribute_spec.handler.  */
+
+static tree
+handle_element_count_attribute (tree *node, tree name,
+				tree args, int ARG_UNUSED (flags),
+				bool *no_add_attrs)
+{
+  tree decl = *node;
+  tree argval = TREE_VALUE (args);
+
+  /* This attribute only applies to field decls of a structure.  */
+  if (TREE_CODE (decl) != FIELD_DECL)
+    {
+      error_at (DECL_SOURCE_LOCATION (decl),
+		"%qE attribute may not be specified for non-field"
+		" declaration %q+D", name, decl);
+      *no_add_attrs = true;
+    }
+  /* This attribute only applies to field with array type.  */
+  else if (TREE_CODE (TREE_TYPE (decl)) != ARRAY_TYPE)
+    {
+      error_at (DECL_SOURCE_LOCATION (decl),
+		"%qE attribute may not be specified for a non-array field",
+		name);
+      *no_add_attrs = true;
+    }
+  /* This attribute only applies to a C99 flexible array member type.  */
+  else if (! c_flexible_array_member_type_p (TREE_TYPE (decl)))
+    {
+      error_at (DECL_SOURCE_LOCATION (decl),
+		"%qE attribute may not be specified for a non"
+		" flexible array member field",
+		name);
+      *no_add_attrs = true;
+    }
+  /* The argument of the attribute should be a string.  */
+  else if (TREE_CODE (argval) != STRING_CST)
+    {
+      error_at (DECL_SOURCE_LOCATION (decl),
+		"%qE attribute argument not a string", name);
+      *no_add_attrs = true;
+    }
+
+  return NULL_TREE;
+}
+
 /* Handle a "weak" attribute; arguments as in
    struct attribute_spec.handler.  */
 
diff --git a/gcc/c-family/c-common.cc b/gcc/c-family/c-common.cc
index 2b4c82facf7..b2bf7a230e2 100644
--- a/gcc/c-family/c-common.cc
+++ b/gcc/c-family/c-common.cc
@@ -9516,6 +9516,19 @@ c_common_finalize_early_debug (void)
       (*debug_hooks->early_global_decl) (cnode->decl);
 }
 
+/* Determine whether TYPE is a ISO C99 flexible array memeber type "[]".  */
+bool
+c_flexible_array_member_type_p (const_tree type)
+{
+  if (TREE_CODE (type) == ARRAY_TYPE
+      && TYPE_SIZE (type) == NULL_TREE
+      && TYPE_DOMAIN (type) != NULL_TREE
+      && TYPE_MAX_VALUE (TYPE_DOMAIN (type)) == NULL_TREE)
+    return true;
+
+  return false;
+}
+
 /* Get the LEVEL of the strict_flex_array for the ARRAY_FIELD based on the
    values of attribute strict_flex_array and the flag_strict_flex_arrays.  */
 unsigned int
diff --git a/gcc/c-family/c-common.h b/gcc/c-family/c-common.h
index f96350b64af..77f6633b862 100644
--- a/gcc/c-family/c-common.h
+++ b/gcc/c-family/c-common.h
@@ -907,6 +907,7 @@ extern tree fold_for_warn (tree);
 extern tree c_common_get_narrower (tree, int *);
 extern bool get_attribute_operand (tree, unsigned HOST_WIDE_INT *);
 extern void c_common_finalize_early_debug (void);
+extern bool c_flexible_array_member_type_p (const_tree);
 extern unsigned int c_strict_flex_array_level_of (tree);
 extern bool c_option_is_from_cpp_diagnostics (int);
 
diff --git a/gcc/c/c-decl.cc b/gcc/c/c-decl.cc
index f8ede362bfd..0f25fe0be0d 100644
--- a/gcc/c/c-decl.cc
+++ b/gcc/c/c-decl.cc
@@ -5187,19 +5187,6 @@ set_array_declarator_inner (struct c_declarator *decl,
   return decl;
 }
 
-/* Determine whether TYPE is a ISO C99 flexible array memeber type "[]".  */
-static bool
-flexible_array_member_type_p (const_tree type)
-{
-  if (TREE_CODE (type) == ARRAY_TYPE
-      && TYPE_SIZE (type) == NULL_TREE
-      && TYPE_DOMAIN (type) != NULL_TREE
-      && TYPE_MAX_VALUE (TYPE_DOMAIN (type)) == NULL_TREE)
-    return true;
-
-  return false;
-}
-
 /* Determine whether TYPE is a one-element array type "[1]".  */
 static bool
 one_element_array_type_p (const_tree type)
@@ -5236,7 +5223,7 @@ add_flexible_array_elts_to_size (tree decl, tree init)
 
   elt = CONSTRUCTOR_ELTS (init)->last ().value;
   type = TREE_TYPE (elt);
-  if (flexible_array_member_type_p (type))
+  if (c_flexible_array_member_type_p (type))
     {
       complete_array_type (&type, elt, false);
       DECL_SIZE (decl)
@@ -9087,7 +9074,7 @@ is_flexible_array_member_p (bool is_last_field,
 
   bool is_zero_length_array = zero_length_array_type_p (TREE_TYPE (x));
   bool is_one_element_array = one_element_array_type_p (TREE_TYPE (x));
-  bool is_flexible_array = flexible_array_member_type_p (TREE_TYPE (x));
+  bool is_flexible_array = c_flexible_array_member_type_p (TREE_TYPE (x));
 
   unsigned int strict_flex_array_level = c_strict_flex_array_level_of (x);
 
@@ -9117,6 +9104,45 @@ is_flexible_array_member_p (bool is_last_field,
   return false;
 }
 
+/* Verify the argument of the element_count attribute of the flexible array
+   member FIELD_DECL is a valid field of the containing structure's fieldlist,
+   FIELDLIST, Report error when it's not.  */
+static void
+verify_element_count_attribute (tree fieldlist, tree field_decl)
+{
+  tree attr_element_count = lookup_attribute ("element_count",
+					      DECL_ATTRIBUTES (field_decl));
+
+  if (!attr_element_count)
+    return;
+
+  /* If there is an element_count attribute attached to the field,
+     verify it.  */
+
+  const char *fieldname
+    = TREE_STRING_POINTER (TREE_VALUE (TREE_VALUE (attr_element_count)));
+
+  /* Verify the argument of the attrbute is a valid field of the
+     containing structure.  */
+
+  tree element_count_field = NULL_TREE;
+  for (tree field = fieldlist; field; field = DECL_CHAIN (field))
+    if (TREE_CODE (field) == FIELD_DECL
+	&& DECL_NAME (field) != NULL
+	&& strcmp (IDENTIFIER_POINTER (DECL_NAME (field)), fieldname) == 0)
+      {
+	element_count_field = field;
+	break;
+      }
+
+  /* Error when the field is not found in the containing structure.  */
+  if (!element_count_field)
+      error_at (DECL_SOURCE_LOCATION (field_decl),
+		"%qE attribute argument not a field declaration"
+		" in the same structure",
+		(get_attribute_name (attr_element_count)));
+  return;
+}
 
 /* Fill in the fields of a RECORD_TYPE or UNION_TYPE node, T.
    LOC is the location of the RECORD_TYPE or UNION_TYPE's definition.
@@ -9237,7 +9263,7 @@ finish_struct (location_t loc, tree t, tree fieldlist, tree attributes,
 	DECL_PACKED (x) = 1;
 
       /* Detect flexible array member in an invalid context.  */
-      if (flexible_array_member_type_p (TREE_TYPE (x)))
+      if (c_flexible_array_member_type_p (TREE_TYPE (x)))
 	{
 	  if (TREE_CODE (t) == UNION_TYPE)
 	    {
@@ -9258,6 +9284,9 @@ finish_struct (location_t loc, tree t, tree fieldlist, tree attributes,
 			"members");
 	      TREE_TYPE (x) = error_mark_node;
 	    }
+	  /* if there is an element_count attribute attached to this field,
+	     verify it.  */
+	  verify_element_count_attribute (fieldlist, x);
 	}
 
       if (pedantic && TREE_CODE (t) == RECORD_TYPE
diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
index 69b21a75e62..cdbe3923ef1 100644
--- a/gcc/doc/extend.texi
+++ b/gcc/doc/extend.texi
@@ -7501,6 +7501,27 @@ When both the attribute and the option present at the same time, the level of
 the strictness for the specific trailing array field is determined by the
 attribute.
 
+@cindex @code{element_count} variable attribute
+@item element_count ("@var{count}")
+The @code{element_count} attribute may be attached to the flexible array
+member of a structure.  It indicates that the number of the elements of the
+array is given by the field named "@var{count}" in the same structure as the
+flexible array member.  GCC uses this information to improve the results of
+@code{__builtin_dynamic_object_size} and array bound sanitizer.
+
+For instance, the following declaration:
+
+@smallexample
+struct P @{
+  size_t count;
+  int array[] __attribute__ ((element_count ("count")));
+@};
+@end smallexample
+
+@noindent
+specify that @code{array} is a flexible array member whose number of element
+is given by the field "@code{count}" in the same structure.
+
 @cindex @code{alloc_size} variable attribute
 @item alloc_size (@var{position})
 @itemx alloc_size (@var{position-1}, @var{position-2})
diff --git a/gcc/testsuite/gcc.dg/flex-array-element-count.c b/gcc/testsuite/gcc.dg/flex-array-element-count.c
new file mode 100644
index 00000000000..988f41e9f5e
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/flex-array-element-count.c
@@ -0,0 +1,27 @@
+/* testing the correct usage of attribute element_count.  */   
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+
+int size;
+int x __attribute ((element_count ("size"))); /* { dg-error "attribute may not be specified for non-field declaration" } */
+
+struct trailing {
+  int count;
+  int field __attribute ((element_count ("count"))); /* { dg-error "attribute may not be specified for a non-array field" } */
+};
+
+struct trailing_1 {
+  int count;
+  int array_1[0] __attribute ((element_count ("count"))); /* { dg-error "attribute may not be specified for a non flexible array member field" } */
+};
+
+int count;
+struct trailing_array_2 {
+  int count;
+  int array_2[] __attribute ((element_count (count))); /* { dg-error "attribute argument not a string" } */
+};
+
+struct trailing_array_3 {
+  int other;
+  int array_3[] __attribute ((element_count ("count"))); /* { dg-error "attribute argument not a field declaration in the same structure" } */
+};
-- 
2.31.1


^ permalink raw reply	[flat|nested] 52+ messages in thread

* [V1][PATCH 2/3] Use the element_count atribute info in builtin object size [PR108896].
  2023-05-25 16:14 [V1][PATCH 0/3] New attribute "element_count" to annotate bounds for C99 FAM(PR108896) Qing Zhao
  2023-05-25 16:14 ` [V1][PATCH 1/3] Provide element_count attribute to flexible array member field (PR108896) Qing Zhao
@ 2023-05-25 16:14 ` Qing Zhao
  2023-05-27 10:20   ` Martin Uecker
  2023-05-25 16:14 ` [V1][PATCH 3/3] Use the element_count attribute information in bound sanitizer[PR108896] Qing Zhao
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 52+ messages in thread
From: Qing Zhao @ 2023-05-25 16:14 UTC (permalink / raw)
  To: joseph, richard.guenther, jakub, gcc-patches
  Cc: keescook, siddhesh, uecker, isanbard, Qing Zhao

2023-05-17 Qing Zhao <qing.zhao@oracle.com>

gcc/ChangeLog:

	PR C/108896
	* tree-object-size.cc (addr_object_size): Use the element_count
	attribute info.
	* tree.cc (component_ref_has_element_count_p): New function.
	(component_ref_get_element_count): New function.
	* tree.h (component_ref_has_element_count_p): New prototype.
	(component_ref_get_element_count): New prototype.

gcc/testsuite/ChangeLog:

	PR C/108896
	* gcc.dg/flex-array-element-count-2.c: New test.
---
 .../gcc.dg/flex-array-element-count-2.c       | 56 +++++++++++
 gcc/tree-object-size.cc                       | 37 ++++++--
 gcc/tree.cc                                   | 93 +++++++++++++++++++
 gcc/tree.h                                    | 10 ++
 4 files changed, 189 insertions(+), 7 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/flex-array-element-count-2.c

diff --git a/gcc/testsuite/gcc.dg/flex-array-element-count-2.c b/gcc/testsuite/gcc.dg/flex-array-element-count-2.c
new file mode 100644
index 00000000000..5a280e8c731
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/flex-array-element-count-2.c
@@ -0,0 +1,56 @@
+/* test the attribute element_count and its usage in
+ * __builtin_dynamic_object_size.  */ 
+/* { dg-do run } */
+/* { dg-options "-O2" } */
+
+#include "builtin-object-size-common.h"
+
+#define expect(p, _v) do { \
+    size_t v = _v; \
+    if (p == v) \
+	__builtin_printf ("ok:  %s == %zd\n", #p, p); \
+    else \
+	{  \
+	  __builtin_printf ("WAT: %s == %zd (expected %zd)\n", #p, p, v); \
+	  FAIL (); \
+	} \
+} while (0);
+
+struct flex {
+  int b;
+  int c[];
+} *array_flex;
+
+struct annotated {
+  int b;
+  int c[] __attribute__ ((element_count ("b")));
+} *array_annotated;
+
+void __attribute__((__noinline__)) setup (int normal_count, int attr_count)
+{
+  array_flex
+    = (struct flex *)malloc (sizeof (struct flex)
+				       + normal_count *  sizeof (int));
+  array_flex->b = normal_count;
+
+  array_annotated
+    = (struct annotated *)malloc (sizeof (struct annotated)
+					    + attr_count *  sizeof (int));
+  array_annotated->b = attr_count;
+
+  return;
+}
+
+void __attribute__((__noinline__)) test ()
+{
+    expect(__builtin_dynamic_object_size(array_flex->c, 1), -1);
+    expect(__builtin_dynamic_object_size(array_annotated->c, 1),
+	   array_annotated->b * sizeof (int));
+}
+
+int main(int argc, char *argv[])
+{
+  setup (10,10);   
+  test ();
+  DONE ();
+}
diff --git a/gcc/tree-object-size.cc b/gcc/tree-object-size.cc
index 9a936a91983..f9aadd59054 100644
--- a/gcc/tree-object-size.cc
+++ b/gcc/tree-object-size.cc
@@ -585,6 +585,7 @@ addr_object_size (struct object_size_info *osi, const_tree ptr,
   if (pt_var != TREE_OPERAND (ptr, 0))
     {
       tree var;
+      tree element_count_ref = NULL_TREE;
 
       if (object_size_type & OST_SUBOBJECT)
 	{
@@ -600,11 +601,12 @@ addr_object_size (struct object_size_info *osi, const_tree ptr,
 	    var = TREE_OPERAND (var, 0);
 	  if (var != pt_var && TREE_CODE (var) == ARRAY_REF)
 	    var = TREE_OPERAND (var, 0);
-	  if (! TYPE_SIZE_UNIT (TREE_TYPE (var))
+	  if (! component_ref_has_element_count_p (var)
+	     && ((! TYPE_SIZE_UNIT (TREE_TYPE (var))
 	      || ! tree_fits_uhwi_p (TYPE_SIZE_UNIT (TREE_TYPE (var)))
 	      || (pt_var_size && TREE_CODE (pt_var_size) == INTEGER_CST
 		  && tree_int_cst_lt (pt_var_size,
-				      TYPE_SIZE_UNIT (TREE_TYPE (var)))))
+				      TYPE_SIZE_UNIT (TREE_TYPE (var)))))))
 	    var = pt_var;
 	  else if (var != pt_var && TREE_CODE (pt_var) == MEM_REF)
 	    {
@@ -612,6 +614,7 @@ addr_object_size (struct object_size_info *osi, const_tree ptr,
 	      /* For &X->fld, compute object size if fld isn't a flexible array
 		 member.  */
 	      bool is_flexible_array_mem_ref = false;
+
 	      while (v && v != pt_var)
 		switch (TREE_CODE (v))
 		  {
@@ -639,6 +642,8 @@ addr_object_size (struct object_size_info *osi, const_tree ptr,
 			break;
 		      }
 		    is_flexible_array_mem_ref = array_ref_flexible_size_p (v);
+		    element_count_ref = component_ref_get_element_count (v);
+
 		    while (v != pt_var && TREE_CODE (v) == COMPONENT_REF)
 		      if (TREE_CODE (TREE_TYPE (TREE_OPERAND (v, 0)))
 			  != UNION_TYPE
@@ -652,8 +657,11 @@ addr_object_size (struct object_size_info *osi, const_tree ptr,
 			   == RECORD_TYPE)
 		      {
 			/* compute object size only if v is not a
-			   flexible array member.  */
-			if (!is_flexible_array_mem_ref)
+			   flexible array member or the flexible array member
+			   has a known element count indicated by the user
+			   through attribute element_count.  */
+			if (!is_flexible_array_mem_ref
+			    || element_count_ref)
 			  {
 			    v = NULL_TREE;
 			    break;
@@ -686,9 +694,24 @@ addr_object_size (struct object_size_info *osi, const_tree ptr,
 
       if (var != pt_var)
 	{
-	  var_size = TYPE_SIZE_UNIT (TREE_TYPE (var));
-	  if (!TREE_CONSTANT (var_size))
-	    var_size = get_or_create_ssa_default_def (cfun, var_size);
+	  if (!element_count_ref)
+	    {
+	      var_size = TYPE_SIZE_UNIT (TREE_TYPE (var));
+	      if (!TREE_CONSTANT (var_size))
+		var_size = get_or_create_ssa_default_def (cfun, var_size);
+	    }
+	  else
+	    {
+	      gcc_assert (TREE_CODE (var) == COMPONENT_REF
+			  && TREE_CODE (TREE_TYPE (var)) == ARRAY_TYPE);
+	      tree element_size = TYPE_SIZE_UNIT (TREE_TYPE (TREE_TYPE (var)));
+	      var_size
+		= size_binop (MULT_EXPR,
+			      fold_convert (sizetype, element_count_ref),
+			      fold_convert (sizetype, element_size));
+	      if (!todo)
+		todo = TODO_update_ssa_only_virtuals;
+	    }
 	  if (!var_size)
 	    return false;
 	}
diff --git a/gcc/tree.cc b/gcc/tree.cc
index 7e6de288886..48753b94f2c 100644
--- a/gcc/tree.cc
+++ b/gcc/tree.cc
@@ -12745,6 +12745,99 @@ array_ref_element_size (tree exp)
     return SUBSTITUTE_PLACEHOLDER_IN_EXPR (TYPE_SIZE_UNIT (elmt_type), exp);
 }
 
+/*  For a component_ref that has an array type ARRAY_REF, return TRUE when
+    an element_count attribute attached to the corresponding FIELD_DECL.
+    return FALSE otherwise.  */
+bool
+component_ref_has_element_count_p (tree array_ref)
+{
+  if (TREE_CODE (array_ref) != COMPONENT_REF)
+    return false;
+
+  if (TREE_CODE (TREE_TYPE (array_ref)) != ARRAY_TYPE)
+    return false;
+
+  tree struct_object = TREE_OPERAND (array_ref, 0);
+  tree struct_type = TREE_TYPE (struct_object);
+
+  if (!RECORD_OR_UNION_TYPE_P (struct_type))
+    return false;
+  tree field_decl = TREE_OPERAND (array_ref, 1);
+  tree attr_element_count = lookup_attribute ("element_count",
+					      DECL_ATTRIBUTES (field_decl));
+
+  if (!attr_element_count)
+    return false;
+  return true;
+}
+
+
+/* For a component_ref that has an array type ARRAY_REF, get the object that
+   represents its element_count per the attribute element_count attached to
+   the corresponding FIELD_DECL.  return NULL_TREE when cannot find such
+   object.
+   For example, if:
+
+    struct P {
+      int k;
+      int x[] __attribute__ ((element_count ("k")));
+    } *p;
+
+    for the following reference:
+
+    p->x[b]
+
+    the object that represents its element count will be:
+
+    p->k
+
+    So, when component_ref_get_element_count (p->x[b]) is called, p->k should be
+    returned.
+*/
+
+tree
+component_ref_get_element_count (tree array_ref)
+{
+  if (! component_ref_has_element_count_p (array_ref))
+    return NULL_TREE;
+
+  tree struct_object = TREE_OPERAND (array_ref, 0);
+  tree struct_type = TREE_TYPE (struct_object);
+  tree field_decl = TREE_OPERAND (array_ref, 1);
+  tree attr_element_count = lookup_attribute ("element_count",
+					      DECL_ATTRIBUTES (field_decl));
+  gcc_assert (attr_element_count);
+
+  /* If there is an element_count attribute attached to the field,
+     get the field that maps to the element_count.  */
+
+  const char *fieldname
+    = TREE_STRING_POINTER (TREE_VALUE (TREE_VALUE (attr_element_count)));
+
+  tree element_count_field = NULL_TREE;
+  for (tree field = TYPE_FIELDS (struct_type); field;
+       field = DECL_CHAIN (field))
+    if (TREE_CODE (field) == FIELD_DECL
+	&& DECL_NAME (field) != NULL
+	&& strcmp (IDENTIFIER_POINTER (DECL_NAME (field)), fieldname) == 0)
+      {
+	element_count_field = field;
+	break;
+      }
+
+  gcc_assert (element_count_field);
+
+  /* generate the tree node that represent the element_count of this array
+     ref.  This is a COMPONENT_REF to the element_count_field of the
+     containing structure.  */
+
+  tree element_count_ref = build3 (COMPONENT_REF,
+				   TREE_TYPE (element_count_field),
+				   struct_object, element_count_field,
+				   NULL_TREE);
+  return element_count_ref;
+}
+
 /* Return a tree representing the lower bound of the array mentioned in
    EXP, an ARRAY_REF or an ARRAY_RANGE_REF.  */
 
diff --git a/gcc/tree.h b/gcc/tree.h
index 0b72663e6a1..8f36370a02d 100644
--- a/gcc/tree.h
+++ b/gcc/tree.h
@@ -5582,6 +5582,16 @@ extern tree get_base_address (tree t);
    of EXP, an ARRAY_REF or an ARRAY_RANGE_REF.  */
 extern tree array_ref_element_size (tree);
 
+/* Give a component_ref that has an array type, return true when an
+   attribute element_count attached to the corresponding FIELD_DECL.  */
+extern bool component_ref_has_element_count_p (tree);
+
+/* Give a component_ref that has an array type, return the object that
+   represents its element_count per the attribute element_count attached to
+   the corresponding FIELD_DECL.  return NULL_TREE when cannot find such
+   object.  */
+extern tree component_ref_get_element_count (tree);
+
 /* Return a typenode for the "standard" C type with a given name.  */
 extern tree get_typenode_from_name (const char *);
 
-- 
2.31.1


^ permalink raw reply	[flat|nested] 52+ messages in thread

* [V1][PATCH 3/3] Use the element_count attribute information in bound sanitizer[PR108896]
  2023-05-25 16:14 [V1][PATCH 0/3] New attribute "element_count" to annotate bounds for C99 FAM(PR108896) Qing Zhao
  2023-05-25 16:14 ` [V1][PATCH 1/3] Provide element_count attribute to flexible array member field (PR108896) Qing Zhao
  2023-05-25 16:14 ` [V1][PATCH 2/3] Use the element_count atribute info in builtin object size [PR108896] Qing Zhao
@ 2023-05-25 16:14 ` Qing Zhao
  2023-05-26 16:12 ` [V1][PATCH 0/3] New attribute "element_count" to annotate bounds for C99 FAM(PR108896) Kees Cook
  2023-05-26 20:40 ` Kees Cook
  4 siblings, 0 replies; 52+ messages in thread
From: Qing Zhao @ 2023-05-25 16:14 UTC (permalink / raw)
  To: joseph, richard.guenther, jakub, gcc-patches
  Cc: keescook, siddhesh, uecker, isanbard, Qing Zhao

2023-05-17 Qing Zhao <qing.zhao@oracle.com>

gcc/c-family/ChangeLog:

	PR C/108896
	* c-ubsan.cc (ubsan_instrument_bounds): Use element_count attribute
	information.

gcc/testsuite/ChangeLog:

	PR C/108896
	* gcc.dg/ubsan/flex-array-element-count-bounds.c: New test.
---
 gcc/c-family/c-ubsan.cc                       | 16 +++++++
 .../ubsan/flex-array-element-count-bounds.c   | 46 +++++++++++++++++++
 2 files changed, 62 insertions(+)
 create mode 100644 gcc/testsuite/gcc.dg/ubsan/flex-array-element-count-bounds.c

diff --git a/gcc/c-family/c-ubsan.cc b/gcc/c-family/c-ubsan.cc
index cfb7cbf389c..04eb05b2c24 100644
--- a/gcc/c-family/c-ubsan.cc
+++ b/gcc/c-family/c-ubsan.cc
@@ -362,6 +362,10 @@ ubsan_instrument_bounds (location_t loc, tree array, tree *index,
 {
   tree type = TREE_TYPE (array);
   tree domain = TYPE_DOMAIN (type);
+  /* whether the array ref is a flexible array member with valid element_count
+     attribute.  */
+  bool fam_has_count_attr = false;
+  tree element_count = NULL_TREE;
 
   if (domain == NULL_TREE)
     return NULL_TREE;
@@ -375,6 +379,17 @@ ubsan_instrument_bounds (location_t loc, tree array, tree *index,
 	  && COMPLETE_TYPE_P (type)
 	  && integer_zerop (TYPE_SIZE (type)))
 	bound = build_int_cst (TREE_TYPE (TYPE_MIN_VALUE (domain)), -1);
+      /* If the array ref is to flexible array member field which has
+	 element_count attribute.  We can use the information from the
+	 attribute as the bound to instrument the reference.  */
+      else if ((element_count = component_ref_get_element_count (array))
+		!= NULL_TREE)
+	{
+	  fam_has_count_attr = true;
+	  bound = fold_build2 (MINUS_EXPR, TREE_TYPE (element_count),
+			       element_count,
+			       build_int_cst (TREE_TYPE (element_count), 1));
+	}
       else
 	return NULL_TREE;
     }
@@ -387,6 +402,7 @@ ubsan_instrument_bounds (location_t loc, tree array, tree *index,
      -fsanitize=bounds-strict.  */
   tree base = get_base_address (array);
   if (!sanitize_flags_p (SANITIZE_BOUNDS_STRICT)
+      && !fam_has_count_attr
       && TREE_CODE (array) == COMPONENT_REF
       && base && (INDIRECT_REF_P (base) || TREE_CODE (base) == MEM_REF))
     {
diff --git a/gcc/testsuite/gcc.dg/ubsan/flex-array-element-count-bounds.c b/gcc/testsuite/gcc.dg/ubsan/flex-array-element-count-bounds.c
new file mode 100644
index 00000000000..be5ee352144
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/ubsan/flex-array-element-count-bounds.c
@@ -0,0 +1,46 @@
+/* test the attribute element_count and its usage in
+   bounds sanitizer.  */
+/* { dg-do run } */
+/* { dg-options "-fsanitize=bounds" } */
+
+#include <stdlib.h>
+
+struct flex {
+  int b;
+  int c[];
+} *array_flex;
+
+struct annotated {
+  int b;
+  int c[] __attribute__ ((element_count ("b")));
+} *array_annotated;
+
+void __attribute__((__noinline__)) setup (int normal_count, int annotated_count)
+{
+  array_flex
+    = (struct flex *)malloc (sizeof (struct flex)
+			     + normal_count *  sizeof (int));
+  array_flex->b = normal_count;
+
+  array_annotated
+    = (struct annotated *)malloc (sizeof (struct annotated)
+				  + annotated_count *  sizeof (int));
+  array_annotated->b = annotated_count;
+
+  return;
+}
+
+void __attribute__((__noinline__)) test (int normal_index, int annotated_index)
+{
+  array_flex->c[normal_index] = 1;
+  array_annotated->c[annotated_index] = 2;
+}
+
+int main(int argc, char *argv[])
+{
+  setup (10,10);   
+  test (10,10);
+  return 0;
+}
+
+/* { dg-output "36:21: runtime error: index 10 out of bounds for type" } */
-- 
2.31.1


^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [V1][PATCH 1/3] Provide element_count attribute to flexible array member field (PR108896)
  2023-05-25 16:14 ` [V1][PATCH 1/3] Provide element_count attribute to flexible array member field (PR108896) Qing Zhao
@ 2023-05-25 21:02   ` Joseph Myers
  2023-05-26 13:32     ` Qing Zhao
  0 siblings, 1 reply; 52+ messages in thread
From: Joseph Myers @ 2023-05-25 21:02 UTC (permalink / raw)
  To: Qing Zhao
  Cc: richard.guenther, jakub, gcc-patches, keescook, siddhesh, uecker,
	isanbard

What happens if the field giving the number of elements is in a contained 
anonymous structure or union?

struct s {
  struct { size_t count; };
  int array[] __attribute__ ((element_count ("count")));
};

This ought to work - a general principle in C is that anonymous structures 
and unions are transparent as far as name lookup for fields is concerned.  
But I don't see any testcases for it and I'm not sure it would work with 
the present code.

What if the string is a wide string?  I don't expect that to work (either 
as a matter of interface design, or in the present code), but I think that 
case should have a specific check and error.

What happens in the case where -fexec-charset specifies a 
non-ASCII-compatible character set?  I expect that to work OK with the 
existing code, because translation of string literals to the execution 
character set is disabled in __attribute__ parsing, but having a testcase 
for it would be good.

What happens if the field referenced for the element count does not have 
integer type?  I'd expect an error, but don't see one in the code or tests 
here.

-- 
Joseph S. Myers
joseph@codesourcery.com

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [V1][PATCH 1/3] Provide element_count attribute to flexible array member field (PR108896)
  2023-05-25 21:02   ` Joseph Myers
@ 2023-05-26 13:32     ` Qing Zhao
  2023-05-26 18:15       ` Joseph Myers
  0 siblings, 1 reply; 52+ messages in thread
From: Qing Zhao @ 2023-05-26 13:32 UTC (permalink / raw)
  To: Joseph Myers
  Cc: richard.guenther, jakub, gcc-patches, keescook, siddhesh, uecker,
	isanbard



> On May 25, 2023, at 5:02 PM, Joseph Myers <joseph@codesourcery.com> wrote:
> 
> What happens if the field giving the number of elements is in a contained 
> anonymous structure or union?
> 
> struct s {
>  struct { size_t count; };
>  int array[] __attribute__ ((element_count ("count")));
> };
> 
> This ought to work - a general principle in C is that anonymous structures 
> and unions are transparent as far as name lookup for fields is concerned.  
> But I don't see any testcases for it and I'm not sure it would work with 
> the present code.
Will check on this and add testing case for this case.
> 
> What if the string is a wide string?  I don't expect that to work (either 
> as a matter of interface design, or in the present code), but I think that 
> case should have a specific check and error.

Dump question: how to check whether the string is a wide string? -:)

> 
> What happens in the case where -fexec-charset specifies a 
> non-ASCII-compatible character set?  I expect that to work OK with the 
> existing code, because translation of string literals to the execution 
> character set is disabled in __attribute__ parsing, but having a testcase 
> for it would be good.
Okay, will check on this and add test case.
> 
> What happens if the field referenced for the element count does not have 
> integer type?  I'd expect an error, but don't see one in the code or tests 
> here.
Yes, that’s right, will add this.

Thanks a lot.

Qing

> 
> -- 
> Joseph S. Myers
> joseph@codesourcery.com


^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [V1][PATCH 0/3] New attribute "element_count" to annotate bounds for C99 FAM(PR108896)
  2023-05-25 16:14 [V1][PATCH 0/3] New attribute "element_count" to annotate bounds for C99 FAM(PR108896) Qing Zhao
                   ` (2 preceding siblings ...)
  2023-05-25 16:14 ` [V1][PATCH 3/3] Use the element_count attribute information in bound sanitizer[PR108896] Qing Zhao
@ 2023-05-26 16:12 ` Kees Cook
  2023-05-30 21:44   ` Qing Zhao
  2023-05-26 20:40 ` Kees Cook
  4 siblings, 1 reply; 52+ messages in thread
From: Kees Cook @ 2023-05-26 16:12 UTC (permalink / raw)
  To: Qing Zhao
  Cc: joseph, richard.guenther, jakub, gcc-patches, siddhesh, uecker, isanbard

On Thu, May 25, 2023 at 04:14:47PM +0000, Qing Zhao wrote:
> This patch set introduces a new attribute "element_count" to annotate bounds 
> for C99 flexible array member.

Thank you for this work! I'm really excited to start using it in the
Linux kernel. I'll give this a spin, but I know you've already been
testing this series against the test cases I created earlier, so I don't
expect any problems. :)

One bike-shedding note: with the recent "-fbounds-safety" RFC posted for
LLVM, we may want to consider renaming "element_count" to "counted_by":
https://discourse.llvm.org/t/rfc-enforcing-bounds-safety-in-c-fbounds-safety/

Thanks again!

-Kees

-- 
Kees Cook

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [V1][PATCH 1/3] Provide element_count attribute to flexible array member field (PR108896)
  2023-05-26 13:32     ` Qing Zhao
@ 2023-05-26 18:15       ` Joseph Myers
  2023-05-26 19:09         ` Qing Zhao
  2023-06-07 19:59         ` Qing Zhao
  0 siblings, 2 replies; 52+ messages in thread
From: Joseph Myers @ 2023-05-26 18:15 UTC (permalink / raw)
  To: Qing Zhao
  Cc: richard.guenther, jakub, gcc-patches, keescook, siddhesh, uecker,
	isanbard

On Fri, 26 May 2023, Qing Zhao via Gcc-patches wrote:

> > What if the string is a wide string?  I don't expect that to work (either 
> > as a matter of interface design, or in the present code), but I think that 
> > case should have a specific check and error.
> 
> Dump question: how to check whether the string is a wide string? -:)

By examining the element type; the only valid case for the attribute would 
be an element type of (const) char.  (I think it's reasonable to reject 
all of char8_t, char16_t, char32_t, wchar_t strings in this context.)

-- 
Joseph S. Myers
joseph@codesourcery.com

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [V1][PATCH 1/3] Provide element_count attribute to flexible array member field (PR108896)
  2023-05-26 18:15       ` Joseph Myers
@ 2023-05-26 19:09         ` Qing Zhao
  2023-06-07 19:59         ` Qing Zhao
  1 sibling, 0 replies; 52+ messages in thread
From: Qing Zhao @ 2023-05-26 19:09 UTC (permalink / raw)
  To: Joseph Myers
  Cc: richard.guenther, jakub, gcc-patches, keescook, siddhesh, uecker,
	isanbard



> On May 26, 2023, at 2:15 PM, Joseph Myers <joseph@codesourcery.com> wrote:
> 
> On Fri, 26 May 2023, Qing Zhao via Gcc-patches wrote:
> 
>>> What if the string is a wide string?  I don't expect that to work (either 
>>> as a matter of interface design, or in the present code), but I think that 
>>> case should have a specific check and error.
>> 
>> Dump question: how to check whether the string is a wide string? -:)
> 
> By examining the element type; the only valid case for the attribute would 
> be an element type of (const) char.  (I think it's reasonable to reject 
> all of char8_t, char16_t, char32_t, wchar_t strings in this context.)

Okay, will study a little bit and do this.
Thanks

Qing
> 
> -- 
> Joseph S. Myers
> joseph@codesourcery.com


^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [V1][PATCH 0/3] New attribute "element_count" to annotate bounds for C99 FAM(PR108896)
  2023-05-25 16:14 [V1][PATCH 0/3] New attribute "element_count" to annotate bounds for C99 FAM(PR108896) Qing Zhao
                   ` (3 preceding siblings ...)
  2023-05-26 16:12 ` [V1][PATCH 0/3] New attribute "element_count" to annotate bounds for C99 FAM(PR108896) Kees Cook
@ 2023-05-26 20:40 ` Kees Cook
  2023-05-30 15:43   ` Qing Zhao
  2023-07-06 18:56   ` Qing Zhao
  4 siblings, 2 replies; 52+ messages in thread
From: Kees Cook @ 2023-05-26 20:40 UTC (permalink / raw)
  To: Qing Zhao
  Cc: joseph, richard.guenther, jakub, gcc-patches, siddhesh, uecker, isanbard

On Thu, May 25, 2023 at 04:14:47PM +0000, Qing Zhao wrote:
> GCC will pass the number of elements info from the attached attribute to both 
> __builtin_dynamic_object_size and bounds sanitizer to check the out-of-bounds
> or dynamic object size issues during runtime for flexible array members.
> 
> This new feature will provide nice protection to flexible array members (which
> currently are completely ignored by both __builtin_dynamic_object_size and
> bounds sanitizers).

Testing went pretty well, though I think I found some bdos issues:

- some things that bdos can't know the size of, and correctly returned
  SIZE_MAX in the past, now thinks are 0-sized.
- while bdos correctly knows the size of an element_count-annotated
  flexible array, it doesn't know the size of the containing object
  (i.e. it returns SIZE_MAX).

Also, I think I found a precedence issue:

- if both __alloc_size and 'element_count' are in use, the _smallest_
  of the two is what I would expect to be enforced by the sanitizer
  and reported by __bdos. As is, alloc_size appears to be used when
  it is available, regardless of what 'element_count' shows.

I've updated my test cases to show it more clearly, but here is the
before/after:


GCC 13 (correctly does not implement "element_count"):

$ ./array-bounds 2>&1 | grep -v ^'#'
TAP version 13
1..12
ok 1 global.fixed_size_seen_by_bdos
ok 2 global.fixed_size_enforced_by_sanitizer
ok 3 global.unknown_size_unknown_to_bdos
ok 4 global.unknown_size_ignored_by_sanitizer
ok 5 global.alloc_size_seen_by_bdos
ok 6 global.alloc_size_enforced_by_sanitizer
not ok 7 global.element_count_seen_by_bdos
not ok 8 global.element_count_enforced_by_sanitizer
not ok 9 global.alloc_size_with_smaller_element_count_seen_by_bdos
not ok 10 global.alloc_size_with_smaller_element_count_enforced_by_sanitizer
ok 11 global.alloc_size_with_bigger_element_count_seen_by_bdos
ok 12 global.alloc_size_with_bigger_element_count_enforced_by_sanitizer


ToT GCC + this element_count series:

$ ./array-bounds 2>&1 | grep -v ^'#'
TAP version 13
1..12
ok 1 global.fixed_size_seen_by_bdos
ok 2 global.fixed_size_enforced_by_sanitizer
not ok 3 global.unknown_size_unknown_to_bdos
not ok 4 global.unknown_size_ignored_by_sanitizer
ok 5 global.alloc_size_seen_by_bdos
ok 6 global.alloc_size_enforced_by_sanitizer
not ok 7 global.element_count_seen_by_bdos
ok 8 global.element_count_enforced_by_sanitizer
not ok 9 global.alloc_size_with_smaller_element_count_seen_by_bdos
not ok 10 global.alloc_size_with_smaller_element_count_enforced_by_sanitizer
ok 11 global.alloc_size_with_bigger_element_count_seen_by_bdos
ok 12 global.alloc_size_with_bigger_element_count_enforced_by_sanitizer


Test suite is here:
https://github.com/kees/kernel-tools/blob/trunk/fortify/array-bounds.c

-- 
Kees Cook

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [V1][PATCH 2/3] Use the element_count atribute info in builtin object size [PR108896].
  2023-05-25 16:14 ` [V1][PATCH 2/3] Use the element_count atribute info in builtin object size [PR108896] Qing Zhao
@ 2023-05-27 10:20   ` Martin Uecker
  2023-05-30 16:08     ` Qing Zhao
  0 siblings, 1 reply; 52+ messages in thread
From: Martin Uecker @ 2023-05-27 10:20 UTC (permalink / raw)
  To: Qing Zhao, joseph, richard.guenther, jakub, gcc-patches
  Cc: keescook, siddhesh, isanbard


Thank you for working on this!


Here are a couple of comments:

How is the size for an object with FAM defined? 

There are at least three possible choices:

offset(..) + N * sizeof
sizeof(..) + N * sizeof
or the size of a struct with the replacement array.

Or is this not relevant here?


I would personally prefer an attribute which does
not use a string, but uses C expressions, so that
one could write something like this (although I would
limit it initially to the most simple case) 

struct {
  struct bar { int n; }* ptr;
  int buf[] [[element_count(.ptr->n + 3)]];
};

Of course, we could still support this later even
if we use a string now.

Martin




Am Donnerstag, dem 25.05.2023 um 16:14 +0000 schrieb Qing Zhao:
> 2023-05-17 Qing Zhao <qing.zhao@oracle.com>
> 
> gcc/ChangeLog:
> 
> 	PR C/108896
> 	* tree-object-size.cc (addr_object_size): Use the element_count
> 	attribute info.
> 	* tree.cc (component_ref_has_element_count_p): New function.
> 	(component_ref_get_element_count): New function.
> 	* tree.h (component_ref_has_element_count_p): New prototype.
> 	(component_ref_get_element_count): New prototype.
> 
> gcc/testsuite/ChangeLog:
> 
> 	PR C/108896
> 	* gcc.dg/flex-array-element-count-2.c: New test.
> ---
>  .../gcc.dg/flex-array-element-count-2.c       | 56 +++++++++++
>  gcc/tree-object-size.cc                       | 37 ++++++--
>  gcc/tree.cc                                   | 93 +++++++++++++++++++
>  gcc/tree.h                                    | 10 ++
>  4 files changed, 189 insertions(+), 7 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.dg/flex-array-element-count-2.c
> 
> diff --git a/gcc/testsuite/gcc.dg/flex-array-element-count-2.c b/gcc/testsuite/gcc.dg/flex-array-element-count-2.c
> new file mode 100644
> index 00000000000..5a280e8c731
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/flex-array-element-count-2.c
> @@ -0,0 +1,56 @@
> +/* test the attribute element_count and its usage in
> + * __builtin_dynamic_object_size.  */ 
> +/* { dg-do run } */
> +/* { dg-options "-O2" } */
> +
> +#include "builtin-object-size-common.h"
> +
> +#define expect(p, _v) do { \
> +    size_t v = _v; \
> +    if (p == v) \
> +	__builtin_printf ("ok:  %s == %zd\n", #p, p); \
> +    else \
> +	{  \
> +	  __builtin_printf ("WAT: %s == %zd (expected %zd)\n", #p, p, v); \
> +	  FAIL (); \
> +	} \
> +} while (0);
> +
> +struct flex {
> +  int b;
> +  int c[];
> +} *array_flex;
> +
> +struct annotated {
> +  int b;
> +  int c[] __attribute__ ((element_count ("b")));
> +} *array_annotated;
> +
> +void __attribute__((__noinline__)) setup (int normal_count, int attr_count)
> +{
> +  array_flex
> +    = (struct flex *)malloc (sizeof (struct flex)
> +				       + normal_count *  sizeof (int));
> +  array_flex->b = normal_count;
> +
> +  array_annotated
> +    = (struct annotated *)malloc (sizeof (struct annotated)
> +					    + attr_count *  sizeof (int));
> +  array_annotated->b = attr_count;
> +
> +  return;
> +}
> +
> +void __attribute__((__noinline__)) test ()
> +{
> +    expect(__builtin_dynamic_object_size(array_flex->c, 1), -1);
> +    expect(__builtin_dynamic_object_size(array_annotated->c, 1),
> +	   array_annotated->b * sizeof (int));
> +}
> +
> +int main(int argc, char *argv[])
> +{
> +  setup (10,10);   
> +  test ();
> +  DONE ();
> +}
> diff --git a/gcc/tree-object-size.cc b/gcc/tree-object-size.cc
> index 9a936a91983..f9aadd59054 100644
> --- a/gcc/tree-object-size.cc
> +++ b/gcc/tree-object-size.cc
> @@ -585,6 +585,7 @@ addr_object_size (struct object_size_info *osi, const_tree ptr,
>    if (pt_var != TREE_OPERAND (ptr, 0))
>      {
>        tree var;
> +      tree element_count_ref = NULL_TREE;
>  
> 
>        if (object_size_type & OST_SUBOBJECT)
>  	{
> @@ -600,11 +601,12 @@ addr_object_size (struct object_size_info *osi, const_tree ptr,
>  	    var = TREE_OPERAND (var, 0);
>  	  if (var != pt_var && TREE_CODE (var) == ARRAY_REF)
>  	    var = TREE_OPERAND (var, 0);
> -	  if (! TYPE_SIZE_UNIT (TREE_TYPE (var))
> +	  if (! component_ref_has_element_count_p (var)
> +	     && ((! TYPE_SIZE_UNIT (TREE_TYPE (var))
>  	      || ! tree_fits_uhwi_p (TYPE_SIZE_UNIT (TREE_TYPE (var)))
>  	      || (pt_var_size && TREE_CODE (pt_var_size) == INTEGER_CST
>  		  && tree_int_cst_lt (pt_var_size,
> -				      TYPE_SIZE_UNIT (TREE_TYPE (var)))))
> +				      TYPE_SIZE_UNIT (TREE_TYPE (var)))))))
>  	    var = pt_var;
>  	  else if (var != pt_var && TREE_CODE (pt_var) == MEM_REF)
>  	    {
> @@ -612,6 +614,7 @@ addr_object_size (struct object_size_info *osi, const_tree ptr,
>  	      /* For &X->fld, compute object size if fld isn't a flexible array
>  		 member.  */
>  	      bool is_flexible_array_mem_ref = false;
> +
>  	      while (v && v != pt_var)
>  		switch (TREE_CODE (v))
>  		  {
> @@ -639,6 +642,8 @@ addr_object_size (struct object_size_info *osi, const_tree ptr,
>  			break;
>  		      }
>  		    is_flexible_array_mem_ref = array_ref_flexible_size_p (v);
> +		    element_count_ref = component_ref_get_element_count (v);
> +
>  		    while (v != pt_var && TREE_CODE (v) == COMPONENT_REF)
>  		      if (TREE_CODE (TREE_TYPE (TREE_OPERAND (v, 0)))
>  			  != UNION_TYPE
> @@ -652,8 +657,11 @@ addr_object_size (struct object_size_info *osi, const_tree ptr,
>  			   == RECORD_TYPE)
>  		      {
>  			/* compute object size only if v is not a
> -			   flexible array member.  */
> -			if (!is_flexible_array_mem_ref)
> +			   flexible array member or the flexible array member
> +			   has a known element count indicated by the user
> +			   through attribute element_count.  */
> +			if (!is_flexible_array_mem_ref
> +			    || element_count_ref)
>  			  {
>  			    v = NULL_TREE;
>  			    break;
> @@ -686,9 +694,24 @@ addr_object_size (struct object_size_info *osi, const_tree ptr,
>  
> 
>        if (var != pt_var)
>  	{
> -	  var_size = TYPE_SIZE_UNIT (TREE_TYPE (var));
> -	  if (!TREE_CONSTANT (var_size))
> -	    var_size = get_or_create_ssa_default_def (cfun, var_size);
> +	  if (!element_count_ref)
> +	    {
> +	      var_size = TYPE_SIZE_UNIT (TREE_TYPE (var));
> +	      if (!TREE_CONSTANT (var_size))
> +		var_size = get_or_create_ssa_default_def (cfun, var_size);
> +	    }
> +	  else
> +	    {
> +	      gcc_assert (TREE_CODE (var) == COMPONENT_REF
> +			  && TREE_CODE (TREE_TYPE (var)) == ARRAY_TYPE);
> +	      tree element_size = TYPE_SIZE_UNIT (TREE_TYPE (TREE_TYPE (var)));
> +	      var_size
> +		= size_binop (MULT_EXPR,
> +			      fold_convert (sizetype, element_count_ref),
> +			      fold_convert (sizetype, element_size));
> +	      if (!todo)
> +		todo = TODO_update_ssa_only_virtuals;
> +	    }
>  	  if (!var_size)
>  	    return false;
>  	}
> diff --git a/gcc/tree.cc b/gcc/tree.cc
> index 7e6de288886..48753b94f2c 100644
> --- a/gcc/tree.cc
> +++ b/gcc/tree.cc
> @@ -12745,6 +12745,99 @@ array_ref_element_size (tree exp)
>      return SUBSTITUTE_PLACEHOLDER_IN_EXPR (TYPE_SIZE_UNIT (elmt_type), exp);
>  }
>  
> 
> +/*  For a component_ref that has an array type ARRAY_REF, return TRUE when
> +    an element_count attribute attached to the corresponding FIELD_DECL.
> +    return FALSE otherwise.  */
> +bool
> +component_ref_has_element_count_p (tree array_ref)
> +{
> +  if (TREE_CODE (array_ref) != COMPONENT_REF)
> +    return false;
> +
> +  if (TREE_CODE (TREE_TYPE (array_ref)) != ARRAY_TYPE)
> +    return false;
> +
> +  tree struct_object = TREE_OPERAND (array_ref, 0);
> +  tree struct_type = TREE_TYPE (struct_object);
> +
> +  if (!RECORD_OR_UNION_TYPE_P (struct_type))
> +    return false;
> +  tree field_decl = TREE_OPERAND (array_ref, 1);
> +  tree attr_element_count = lookup_attribute ("element_count",
> +					      DECL_ATTRIBUTES (field_decl));
> +
> +  if (!attr_element_count)
> +    return false;
> +  return true;
> +}
> +
> +
> +/* For a component_ref that has an array type ARRAY_REF, get the object that
> +   represents its element_count per the attribute element_count attached to
> +   the corresponding FIELD_DECL.  return NULL_TREE when cannot find such
> +   object.
> +   For example, if:
> +
> +    struct P {
> +      int k;
> +      int x[] __attribute__ ((element_count ("k")));
> +    } *p;
> +
> +    for the following reference:
> +
> +    p->x[b]
> +
> +    the object that represents its element count will be:
> +
> +    p->k
> +
> +    So, when component_ref_get_element_count (p->x[b]) is called, p->k should be
> +    returned.
> +*/
> +
> +tree
> +component_ref_get_element_count (tree array_ref)
> +{
> +  if (! component_ref_has_element_count_p (array_ref))
> +    return NULL_TREE;
> +
> +  tree struct_object = TREE_OPERAND (array_ref, 0);
> +  tree struct_type = TREE_TYPE (struct_object);
> +  tree field_decl = TREE_OPERAND (array_ref, 1);
> +  tree attr_element_count = lookup_attribute ("element_count",
> +					      DECL_ATTRIBUTES (field_decl));
> +  gcc_assert (attr_element_count);
> +
> +  /* If there is an element_count attribute attached to the field,
> +     get the field that maps to the element_count.  */
> +
> +  const char *fieldname
> +    = TREE_STRING_POINTER (TREE_VALUE (TREE_VALUE (attr_element_count)));
> +
> +  tree element_count_field = NULL_TREE;
> +  for (tree field = TYPE_FIELDS (struct_type); field;
> +       field = DECL_CHAIN (field))
> +    if (TREE_CODE (field) == FIELD_DECL
> +	&& DECL_NAME (field) != NULL
> +	&& strcmp (IDENTIFIER_POINTER (DECL_NAME (field)), fieldname) == 0)
> +      {
> +	element_count_field = field;
> +	break;
> +      }
> +
> +  gcc_assert (element_count_field);
> +
> +  /* generate the tree node that represent the element_count of this array
> +     ref.  This is a COMPONENT_REF to the element_count_field of the
> +     containing structure.  */
> +
> +  tree element_count_ref = build3 (COMPONENT_REF,
> +				   TREE_TYPE (element_count_field),
> +				   struct_object, element_count_field,
> +				   NULL_TREE);
> +  return element_count_ref;
> +}
> +
>  /* Return a tree representing the lower bound of the array mentioned in
>     EXP, an ARRAY_REF or an ARRAY_RANGE_REF.  */
>  
> 
> diff --git a/gcc/tree.h b/gcc/tree.h
> index 0b72663e6a1..8f36370a02d 100644
> --- a/gcc/tree.h
> +++ b/gcc/tree.h
> @@ -5582,6 +5582,16 @@ extern tree get_base_address (tree t);
>     of EXP, an ARRAY_REF or an ARRAY_RANGE_REF.  */
>  extern tree array_ref_element_size (tree);
>  
> 
> +/* Give a component_ref that has an array type, return true when an
> +   attribute element_count attached to the corresponding FIELD_DECL.  */
> +extern bool component_ref_has_element_count_p (tree);
> +
> +/* Give a component_ref that has an array type, return the object that
> +   represents its element_count per the attribute element_count attached to
> +   the corresponding FIELD_DECL.  return NULL_TREE when cannot find such
> +   object.  */
> +extern tree component_ref_get_element_count (tree);
> +
>  /* Return a typenode for the "standard" C type with a given name.  */
>  extern tree get_typenode_from_name (const char *);
>  
> 



^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [V1][PATCH 0/3] New attribute "element_count" to annotate bounds for C99 FAM(PR108896)
  2023-05-26 20:40 ` Kees Cook
@ 2023-05-30 15:43   ` Qing Zhao
  2023-07-06 18:56   ` Qing Zhao
  1 sibling, 0 replies; 52+ messages in thread
From: Qing Zhao @ 2023-05-30 15:43 UTC (permalink / raw)
  To: Kees Cook
  Cc: joseph, richard.guenther, jakub, gcc-patches, siddhesh, uecker, isanbard



> On May 26, 2023, at 4:40 PM, Kees Cook <keescook@chromium.org> wrote:
> 
> On Thu, May 25, 2023 at 04:14:47PM +0000, Qing Zhao wrote:
>> GCC will pass the number of elements info from the attached attribute to both 
>> __builtin_dynamic_object_size and bounds sanitizer to check the out-of-bounds
>> or dynamic object size issues during runtime for flexible array members.
>> 
>> This new feature will provide nice protection to flexible array members (which
>> currently are completely ignored by both __builtin_dynamic_object_size and
>> bounds sanitizers).
> 
> Testing went pretty well, though I think I found some bdos issues:
> 
> - some things that bdos can't know the size of, and correctly returned
>  SIZE_MAX in the past, now thinks are 0-sized.

Will check this issue and fix it.

> - while bdos correctly knows the size of an element_count-annotated
>  flexible array, it doesn't know the size of the containing object
>  (i.e. it returns SIZE_MAX).

This is a known issue I found during the implementation, and filed a bug:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109557
for it

And turned out that this was expected behavior. 


> 
> Also, I think I found a precedence issue:
> 
> - if both __alloc_size and 'element_count' are in use, the _smallest_
>  of the two is what I would expect to be enforced by the sanitizer
>  and reported by __bdos. As is, alloc_size appears to be used when
>  it is available, regardless of what 'element_count' shows.

Will check on this and fix it.

> 
> I've updated my test cases to show it more clearly, but here is the
> before/after:
> 
> 
> GCC 13 (correctly does not implement "element_count"):
> 
> $ ./array-bounds 2>&1 | grep -v ^'#'
> TAP version 13
> 1..12
> ok 1 global.fixed_size_seen_by_bdos
> ok 2 global.fixed_size_enforced_by_sanitizer
> ok 3 global.unknown_size_unknown_to_bdos
> ok 4 global.unknown_size_ignored_by_sanitizer
> ok 5 global.alloc_size_seen_by_bdos
> ok 6 global.alloc_size_enforced_by_sanitizer
> not ok 7 global.element_count_seen_by_bdos
> not ok 8 global.element_count_enforced_by_sanitizer
> not ok 9 global.alloc_size_with_smaller_element_count_seen_by_bdos
> not ok 10 global.alloc_size_with_smaller_element_count_enforced_by_sanitizer
> ok 11 global.alloc_size_with_bigger_element_count_seen_by_bdos
> ok 12 global.alloc_size_with_bigger_element_count_enforced_by_sanitizer
> 
> 
> ToT GCC + this element_count series:
> 
> $ ./array-bounds 2>&1 | grep -v ^'#'
> TAP version 13
> 1..12
> ok 1 global.fixed_size_seen_by_bdos
> ok 2 global.fixed_size_enforced_by_sanitizer
> not ok 3 global.unknown_size_unknown_to_bdos
> not ok 4 global.unknown_size_ignored_by_sanitizer
> ok 5 global.alloc_size_seen_by_bdos
> ok 6 global.alloc_size_enforced_by_sanitizer
> not ok 7 global.element_count_seen_by_bdos
> ok 8 global.element_count_enforced_by_sanitizer
> not ok 9 global.alloc_size_with_smaller_element_count_seen_by_bdos
> not ok 10 global.alloc_size_with_smaller_element_count_enforced_by_sanitizer
> ok 11 global.alloc_size_with_bigger_element_count_seen_by_bdos
> ok 12 global.alloc_size_with_bigger_element_count_enforced_by_sanitizer
> 
> 
> Test suite is here:
> https://github.com/kees/kernel-tools/blob/trunk/fortify/array-bounds.c

Thanks a lot for the testing, this is really helpful.

Will study and fix all these issues.

Qing
> 
> -- 
> Kees Cook


^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [V1][PATCH 2/3] Use the element_count atribute info in builtin object size [PR108896].
  2023-05-27 10:20   ` Martin Uecker
@ 2023-05-30 16:08     ` Qing Zhao
  0 siblings, 0 replies; 52+ messages in thread
From: Qing Zhao @ 2023-05-30 16:08 UTC (permalink / raw)
  To: Martin Uecker
  Cc: joseph, Richard Biener, jakub, gcc-patches, keescook, siddhesh, isanbard



> On May 27, 2023, at 6:20 AM, Martin Uecker <uecker@tugraz.at> wrote:
> 
> 
> Thank you for working on this!
> 
> 
> Here are a couple of comments:
> 
> How is the size for an object with FAM defined? 

Right now, with the attribute approach, the sizeof for the object with FAM is not impacted, and kept the same as before without this attribute. 

The information carried from the attribute only is used in _builtin_object_size and bound sanitizer. 

Doing this will help mitigating the existing source code to use this new information easier. 

If later this feature is promoted to C standard, I believe that the sizeof for the object with FAM should be updated too.  And the size information of the
FAM will be embedded into the TYPE. 

> 
> There are at least three possible choices:
> 
> offset(..) + N * sizeof
> sizeof(..) + N * sizeof
> or the size of a struct with the replacement array.
> 
> Or is this not relevant here?
> 
> 
> I would personally prefer an attribute which does
> not use a string,
Using “string” is the simplest implementation to prove the feasibility of this new attribute.
We can definitely use C expressions if we all think it’s necessary, the implementation in the C FE will be more complicate when using
C expression I think.  (When parsing the attribute, the type of the whole structure and other fields are not available yet, so, we might need to 
Delay the parsing of the attribute after the whole structure is done..)

> but uses C expressions, so that
> one could write something like this (although I would
> limit it initially to the most simple case) 
> 
> struct {
>  struct bar { int n; }* ptr;
>  int buf[] [[element_count(.ptr->n + 3)]];
> };
> 
> Of course, we could still support this later even
> if we use a string now.

Looks like that CLANG’s -fbounds-safely implements this attribute by using C expression as the argument and they added a late parse for C.
Not sure whether we should be compatible with CLANG on this, if so, we might need to do the same.

Thanks a lot for your suggestions and comments.

Qing
> 
> Martin
> 
> 
> 
> 
> Am Donnerstag, dem 25.05.2023 um 16:14 +0000 schrieb Qing Zhao:
>> 2023-05-17 Qing Zhao <qing.zhao@oracle.com>
>> 
>> gcc/ChangeLog:
>> 
>> 	PR C/108896
>> 	* tree-object-size.cc (addr_object_size): Use the element_count
>> 	attribute info.
>> 	* tree.cc (component_ref_has_element_count_p): New function.
>> 	(component_ref_get_element_count): New function.
>> 	* tree.h (component_ref_has_element_count_p): New prototype.
>> 	(component_ref_get_element_count): New prototype.
>> 
>> gcc/testsuite/ChangeLog:
>> 
>> 	PR C/108896
>> 	* gcc.dg/flex-array-element-count-2.c: New test.
>> ---
>>  .../gcc.dg/flex-array-element-count-2.c       | 56 +++++++++++
>>  gcc/tree-object-size.cc                       | 37 ++++++--
>>  gcc/tree.cc                                   | 93 +++++++++++++++++++
>>  gcc/tree.h                                    | 10 ++
>>  4 files changed, 189 insertions(+), 7 deletions(-)
>>  create mode 100644 gcc/testsuite/gcc.dg/flex-array-element-count-2.c
>> 
>> diff --git a/gcc/testsuite/gcc.dg/flex-array-element-count-2.c b/gcc/testsuite/gcc.dg/flex-array-element-count-2.c
>> new file mode 100644
>> index 00000000000..5a280e8c731
>> --- /dev/null
>> +++ b/gcc/testsuite/gcc.dg/flex-array-element-count-2.c
>> @@ -0,0 +1,56 @@
>> +/* test the attribute element_count and its usage in
>> + * __builtin_dynamic_object_size.  */ 
>> +/* { dg-do run } */
>> +/* { dg-options "-O2" } */
>> +
>> +#include "builtin-object-size-common.h"
>> +
>> +#define expect(p, _v) do { \
>> +    size_t v = _v; \
>> +    if (p == v) \
>> +	__builtin_printf ("ok:  %s == %zd\n", #p, p); \
>> +    else \
>> +	{  \
>> +	  __builtin_printf ("WAT: %s == %zd (expected %zd)\n", #p, p, v); \
>> +	  FAIL (); \
>> +	} \
>> +} while (0);
>> +
>> +struct flex {
>> +  int b;
>> +  int c[];
>> +} *array_flex;
>> +
>> +struct annotated {
>> +  int b;
>> +  int c[] __attribute__ ((element_count ("b")));
>> +} *array_annotated;
>> +
>> +void __attribute__((__noinline__)) setup (int normal_count, int attr_count)
>> +{
>> +  array_flex
>> +    = (struct flex *)malloc (sizeof (struct flex)
>> +				       + normal_count *  sizeof (int));
>> +  array_flex->b = normal_count;
>> +
>> +  array_annotated
>> +    = (struct annotated *)malloc (sizeof (struct annotated)
>> +					    + attr_count *  sizeof (int));
>> +  array_annotated->b = attr_count;
>> +
>> +  return;
>> +}
>> +
>> +void __attribute__((__noinline__)) test ()
>> +{
>> +    expect(__builtin_dynamic_object_size(array_flex->c, 1), -1);
>> +    expect(__builtin_dynamic_object_size(array_annotated->c, 1),
>> +	   array_annotated->b * sizeof (int));
>> +}
>> +
>> +int main(int argc, char *argv[])
>> +{
>> +  setup (10,10);   
>> +  test ();
>> +  DONE ();
>> +}
>> diff --git a/gcc/tree-object-size.cc b/gcc/tree-object-size.cc
>> index 9a936a91983..f9aadd59054 100644
>> --- a/gcc/tree-object-size.cc
>> +++ b/gcc/tree-object-size.cc
>> @@ -585,6 +585,7 @@ addr_object_size (struct object_size_info *osi, const_tree ptr,
>>    if (pt_var != TREE_OPERAND (ptr, 0))
>>      {
>>        tree var;
>> +      tree element_count_ref = NULL_TREE;
>>  
>> 
>>        if (object_size_type & OST_SUBOBJECT)
>>  	{
>> @@ -600,11 +601,12 @@ addr_object_size (struct object_size_info *osi, const_tree ptr,
>>  	    var = TREE_OPERAND (var, 0);
>>  	  if (var != pt_var && TREE_CODE (var) == ARRAY_REF)
>>  	    var = TREE_OPERAND (var, 0);
>> -	  if (! TYPE_SIZE_UNIT (TREE_TYPE (var))
>> +	  if (! component_ref_has_element_count_p (var)
>> +	     && ((! TYPE_SIZE_UNIT (TREE_TYPE (var))
>>  	      || ! tree_fits_uhwi_p (TYPE_SIZE_UNIT (TREE_TYPE (var)))
>>  	      || (pt_var_size && TREE_CODE (pt_var_size) == INTEGER_CST
>>  		  && tree_int_cst_lt (pt_var_size,
>> -				      TYPE_SIZE_UNIT (TREE_TYPE (var)))))
>> +				      TYPE_SIZE_UNIT (TREE_TYPE (var)))))))
>>  	    var = pt_var;
>>  	  else if (var != pt_var && TREE_CODE (pt_var) == MEM_REF)
>>  	    {
>> @@ -612,6 +614,7 @@ addr_object_size (struct object_size_info *osi, const_tree ptr,
>>  	      /* For &X->fld, compute object size if fld isn't a flexible array
>>  		 member.  */
>>  	      bool is_flexible_array_mem_ref = false;
>> +
>>  	      while (v && v != pt_var)
>>  		switch (TREE_CODE (v))
>>  		  {
>> @@ -639,6 +642,8 @@ addr_object_size (struct object_size_info *osi, const_tree ptr,
>>  			break;
>>  		      }
>>  		    is_flexible_array_mem_ref = array_ref_flexible_size_p (v);
>> +		    element_count_ref = component_ref_get_element_count (v);
>> +
>>  		    while (v != pt_var && TREE_CODE (v) == COMPONENT_REF)
>>  		      if (TREE_CODE (TREE_TYPE (TREE_OPERAND (v, 0)))
>>  			  != UNION_TYPE
>> @@ -652,8 +657,11 @@ addr_object_size (struct object_size_info *osi, const_tree ptr,
>>  			   == RECORD_TYPE)
>>  		      {
>>  			/* compute object size only if v is not a
>> -			   flexible array member.  */
>> -			if (!is_flexible_array_mem_ref)
>> +			   flexible array member or the flexible array member
>> +			   has a known element count indicated by the user
>> +			   through attribute element_count.  */
>> +			if (!is_flexible_array_mem_ref
>> +			    || element_count_ref)
>>  			  {
>>  			    v = NULL_TREE;
>>  			    break;
>> @@ -686,9 +694,24 @@ addr_object_size (struct object_size_info *osi, const_tree ptr,
>>  
>> 
>>        if (var != pt_var)
>>  	{
>> -	  var_size = TYPE_SIZE_UNIT (TREE_TYPE (var));
>> -	  if (!TREE_CONSTANT (var_size))
>> -	    var_size = get_or_create_ssa_default_def (cfun, var_size);
>> +	  if (!element_count_ref)
>> +	    {
>> +	      var_size = TYPE_SIZE_UNIT (TREE_TYPE (var));
>> +	      if (!TREE_CONSTANT (var_size))
>> +		var_size = get_or_create_ssa_default_def (cfun, var_size);
>> +	    }
>> +	  else
>> +	    {
>> +	      gcc_assert (TREE_CODE (var) == COMPONENT_REF
>> +			  && TREE_CODE (TREE_TYPE (var)) == ARRAY_TYPE);
>> +	      tree element_size = TYPE_SIZE_UNIT (TREE_TYPE (TREE_TYPE (var)));
>> +	      var_size
>> +		= size_binop (MULT_EXPR,
>> +			      fold_convert (sizetype, element_count_ref),
>> +			      fold_convert (sizetype, element_size));
>> +	      if (!todo)
>> +		todo = TODO_update_ssa_only_virtuals;
>> +	    }
>>  	  if (!var_size)
>>  	    return false;
>>  	}
>> diff --git a/gcc/tree.cc b/gcc/tree.cc
>> index 7e6de288886..48753b94f2c 100644
>> --- a/gcc/tree.cc
>> +++ b/gcc/tree.cc
>> @@ -12745,6 +12745,99 @@ array_ref_element_size (tree exp)
>>      return SUBSTITUTE_PLACEHOLDER_IN_EXPR (TYPE_SIZE_UNIT (elmt_type), exp);
>>  }
>>  
>> 
>> +/*  For a component_ref that has an array type ARRAY_REF, return TRUE when
>> +    an element_count attribute attached to the corresponding FIELD_DECL.
>> +    return FALSE otherwise.  */
>> +bool
>> +component_ref_has_element_count_p (tree array_ref)
>> +{
>> +  if (TREE_CODE (array_ref) != COMPONENT_REF)
>> +    return false;
>> +
>> +  if (TREE_CODE (TREE_TYPE (array_ref)) != ARRAY_TYPE)
>> +    return false;
>> +
>> +  tree struct_object = TREE_OPERAND (array_ref, 0);
>> +  tree struct_type = TREE_TYPE (struct_object);
>> +
>> +  if (!RECORD_OR_UNION_TYPE_P (struct_type))
>> +    return false;
>> +  tree field_decl = TREE_OPERAND (array_ref, 1);
>> +  tree attr_element_count = lookup_attribute ("element_count",
>> +					      DECL_ATTRIBUTES (field_decl));
>> +
>> +  if (!attr_element_count)
>> +    return false;
>> +  return true;
>> +}
>> +
>> +
>> +/* For a component_ref that has an array type ARRAY_REF, get the object that
>> +   represents its element_count per the attribute element_count attached to
>> +   the corresponding FIELD_DECL.  return NULL_TREE when cannot find such
>> +   object.
>> +   For example, if:
>> +
>> +    struct P {
>> +      int k;
>> +      int x[] __attribute__ ((element_count ("k")));
>> +    } *p;
>> +
>> +    for the following reference:
>> +
>> +    p->x[b]
>> +
>> +    the object that represents its element count will be:
>> +
>> +    p->k
>> +
>> +    So, when component_ref_get_element_count (p->x[b]) is called, p->k should be
>> +    returned.
>> +*/
>> +
>> +tree
>> +component_ref_get_element_count (tree array_ref)
>> +{
>> +  if (! component_ref_has_element_count_p (array_ref))
>> +    return NULL_TREE;
>> +
>> +  tree struct_object = TREE_OPERAND (array_ref, 0);
>> +  tree struct_type = TREE_TYPE (struct_object);
>> +  tree field_decl = TREE_OPERAND (array_ref, 1);
>> +  tree attr_element_count = lookup_attribute ("element_count",
>> +					      DECL_ATTRIBUTES (field_decl));
>> +  gcc_assert (attr_element_count);
>> +
>> +  /* If there is an element_count attribute attached to the field,
>> +     get the field that maps to the element_count.  */
>> +
>> +  const char *fieldname
>> +    = TREE_STRING_POINTER (TREE_VALUE (TREE_VALUE (attr_element_count)));
>> +
>> +  tree element_count_field = NULL_TREE;
>> +  for (tree field = TYPE_FIELDS (struct_type); field;
>> +       field = DECL_CHAIN (field))
>> +    if (TREE_CODE (field) == FIELD_DECL
>> +	&& DECL_NAME (field) != NULL
>> +	&& strcmp (IDENTIFIER_POINTER (DECL_NAME (field)), fieldname) == 0)
>> +      {
>> +	element_count_field = field;
>> +	break;
>> +      }
>> +
>> +  gcc_assert (element_count_field);
>> +
>> +  /* generate the tree node that represent the element_count of this array
>> +     ref.  This is a COMPONENT_REF to the element_count_field of the
>> +     containing structure.  */
>> +
>> +  tree element_count_ref = build3 (COMPONENT_REF,
>> +				   TREE_TYPE (element_count_field),
>> +				   struct_object, element_count_field,
>> +				   NULL_TREE);
>> +  return element_count_ref;
>> +}
>> +
>>  /* Return a tree representing the lower bound of the array mentioned in
>>     EXP, an ARRAY_REF or an ARRAY_RANGE_REF.  */
>>  
>> 
>> diff --git a/gcc/tree.h b/gcc/tree.h
>> index 0b72663e6a1..8f36370a02d 100644
>> --- a/gcc/tree.h
>> +++ b/gcc/tree.h
>> @@ -5582,6 +5582,16 @@ extern tree get_base_address (tree t);
>>     of EXP, an ARRAY_REF or an ARRAY_RANGE_REF.  */
>>  extern tree array_ref_element_size (tree);
>>  
>> 
>> +/* Give a component_ref that has an array type, return true when an
>> +   attribute element_count attached to the corresponding FIELD_DECL.  */
>> +extern bool component_ref_has_element_count_p (tree);
>> +
>> +/* Give a component_ref that has an array type, return the object that
>> +   represents its element_count per the attribute element_count attached to
>> +   the corresponding FIELD_DECL.  return NULL_TREE when cannot find such
>> +   object.  */
>> +extern tree component_ref_get_element_count (tree);
>> +
>>  /* Return a typenode for the "standard" C type with a given name.  */
>>  extern tree get_typenode_from_name (const char *);
>>  
>> 
> 
> 


^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [V1][PATCH 0/3] New attribute "element_count" to annotate bounds for C99 FAM(PR108896)
  2023-05-26 16:12 ` [V1][PATCH 0/3] New attribute "element_count" to annotate bounds for C99 FAM(PR108896) Kees Cook
@ 2023-05-30 21:44   ` Qing Zhao
  0 siblings, 0 replies; 52+ messages in thread
From: Qing Zhao @ 2023-05-30 21:44 UTC (permalink / raw)
  To: Kees Cook
  Cc: joseph, richard.guenther, jakub, gcc-patches, siddhesh, uecker, isanbard



> On May 26, 2023, at 12:12 PM, Kees Cook <keescook@chromium.org> wrote:
> 
> On Thu, May 25, 2023 at 04:14:47PM +0000, Qing Zhao wrote:
>> This patch set introduces a new attribute "element_count" to annotate bounds 
>> for C99 flexible array member.
> 
> Thank you for this work! I'm really excited to start using it in the
> Linux kernel. I'll give this a spin, but I know you've already been
> testing this series against the test cases I created earlier, so I don't
> expect any problems. :)
> 
> One bike-shedding note: with the recent "-fbounds-safety" RFC posted for
> LLVM, we may want to consider renaming "element_count" to "counted_by":
> https://discourse.llvm.org/t/rfc-enforcing-bounds-safety-in-c-fbounds-safety/

Yeah, we can do that.

thanks.

Qing
> 
> Thanks again!
> 
> -Kees
> 
> -- 
> Kees Cook


^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [V1][PATCH 1/3] Provide element_count attribute to flexible array member field (PR108896)
  2023-05-26 18:15       ` Joseph Myers
  2023-05-26 19:09         ` Qing Zhao
@ 2023-06-07 19:59         ` Qing Zhao
  2023-06-07 20:53           ` Joseph Myers
  1 sibling, 1 reply; 52+ messages in thread
From: Qing Zhao @ 2023-06-07 19:59 UTC (permalink / raw)
  To: Joseph Myers
  Cc: richard.guenther, jakub, gcc-patches, keescook, siddhesh, uecker,
	isanbard

Hi, Joseph,

A question here:  can an identifier in C be a wide char string? 

Qing

> On May 26, 2023, at 2:15 PM, Joseph Myers <joseph@codesourcery.com> wrote:
> 
> On Fri, 26 May 2023, Qing Zhao via Gcc-patches wrote:
> 
>>> What if the string is a wide string?  I don't expect that to work (either 
>>> as a matter of interface design, or in the present code), but I think that 
>>> case should have a specific check and error.
>> 
>> Dump question: how to check whether the string is a wide string? -:)
> 
> By examining the element type; the only valid case for the attribute would 
> be an element type of (const) char.  (I think it's reasonable to reject 
> all of char8_t, char16_t, char32_t, wchar_t strings in this context.)
> 
> -- 
> Joseph S. Myers
> joseph@codesourcery.com


^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [V1][PATCH 1/3] Provide element_count attribute to flexible array member field (PR108896)
  2023-06-07 19:59         ` Qing Zhao
@ 2023-06-07 20:53           ` Joseph Myers
  2023-06-07 21:32             ` Qing Zhao
  0 siblings, 1 reply; 52+ messages in thread
From: Joseph Myers @ 2023-06-07 20:53 UTC (permalink / raw)
  To: Qing Zhao
  Cc: richard.guenther, jakub, gcc-patches, keescook, siddhesh, uecker,
	isanbard

On Wed, 7 Jun 2023, Qing Zhao via Gcc-patches wrote:

> Hi, Joseph,
> 
> A question here:  can an identifier in C be a wide char string? 

Identifiers and strings are different kinds of tokens; an identifier can't 
be a string of any kind, wide or narrow.  It just so happens that the 
proposed interface here involves interpreting the contents of a string as 
referring to an identifier (presumably for parsing convenience compared to 
using an identifier directly in an attribute).

-- 
Joseph S. Myers
joseph@codesourcery.com

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [V1][PATCH 1/3] Provide element_count attribute to flexible array member field (PR108896)
  2023-06-07 20:53           ` Joseph Myers
@ 2023-06-07 21:32             ` Qing Zhao
  2023-06-07 22:05               ` Joseph Myers
  0 siblings, 1 reply; 52+ messages in thread
From: Qing Zhao @ 2023-06-07 21:32 UTC (permalink / raw)
  To: Joseph Myers
  Cc: richard.guenther, jakub, gcc-patches, keescook, siddhesh, uecker,
	isanbard



> On Jun 7, 2023, at 4:53 PM, Joseph Myers <joseph@codesourcery.com> wrote:
> 
> On Wed, 7 Jun 2023, Qing Zhao via Gcc-patches wrote:
> 
>> Hi, Joseph,
>> 
>> A question here:  can an identifier in C be a wide char string? 
> 
> Identifiers and strings are different kinds of tokens; an identifier can't 
> be a string of any kind, wide or narrow.  It just so happens that the 
> proposed interface here involves interpreting the contents of a string as 
> referring to an identifier (presumably for parsing convenience compared to 
> using an identifier directly in an attribute).

Are you suggesting to use identifier directly as the argument of the attribute?
I tried this in the beginning, however, the current parser for the attribute argument can not identify that this identifier is a field identifier inside the same structure. 

For example:

int count;
struct trailing_array_7 {
  Int count;
  int array_7[] __attribute ((element_count (count))); 
};

The identifier “count” inside the attribute will refer to the variable “int count” outside of the structure.

We need to introduce new syntax for this and also need to update the parser of the attribute.
Not sure at this moment whether the extra effort is necessary or not?
Any suggestions?

thanks.

Qing

> 
> -- 
> Joseph S. Myers
> joseph@codesourcery.com


^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [V1][PATCH 1/3] Provide element_count attribute to flexible array member field (PR108896)
  2023-06-07 21:32             ` Qing Zhao
@ 2023-06-07 22:05               ` Joseph Myers
  2023-06-08 13:06                 ` Qing Zhao
  2023-06-15 15:09                 ` Qing Zhao
  0 siblings, 2 replies; 52+ messages in thread
From: Joseph Myers @ 2023-06-07 22:05 UTC (permalink / raw)
  To: Qing Zhao
  Cc: richard.guenther, jakub, gcc-patches, keescook, siddhesh, uecker,
	isanbard

[-- Attachment #1: Type: text/plain, Size: 1576 bytes --]

On Wed, 7 Jun 2023, Qing Zhao via Gcc-patches wrote:

> Are you suggesting to use identifier directly as the argument of the attribute?
> I tried this in the beginning, however, the current parser for the attribute argument can not identify that this identifier is a field identifier inside the same structure. 
> 
> For example:
> 
> int count;
> struct trailing_array_7 {
>   Int count;
>   int array_7[] __attribute ((element_count (count))); 
> };
> 
> The identifier “count” inside the attribute will refer to the variable 
> “int count” outside of the structure.

c_parser_attribute_arguments is supposed to allow an identifier as an 
attribute argument - and not look it up (the user of the attribute would 
later need to look it up in the context of the containing structure).  
Callers use attribute_takes_identifier_p to determine which attributes 
take identifiers (versus expressions) as arguments, which would need 
updating to cover the new attribute.

There is a ??? comment about the case where the identifier is declared as 
a type name.  That would simply be one of the cases carried over from the 
old Bison parser, and it would seem reasonable to remove that 
special-casing so that the attribute works even when the identifier is 
declared as a typedef name as an ordinary identifier, since it's fine for 
structure members to have the same name as a typedef name.

Certainly taking an identifier directly seems like cleaner syntax than 
taking a string that then needs reinterpreting as an identifier.

-- 
Joseph S. Myers
joseph@codesourcery.com

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [V1][PATCH 1/3] Provide element_count attribute to flexible array member field (PR108896)
  2023-06-07 22:05               ` Joseph Myers
@ 2023-06-08 13:06                 ` Qing Zhao
  2023-06-15 15:09                 ` Qing Zhao
  1 sibling, 0 replies; 52+ messages in thread
From: Qing Zhao @ 2023-06-08 13:06 UTC (permalink / raw)
  To: Joseph Myers
  Cc: richard.guenther, jakub, gcc-patches, keescook, siddhesh, uecker,
	isanbard



> On Jun 7, 2023, at 6:05 PM, Joseph Myers <joseph@codesourcery.com> wrote:
> 
> On Wed, 7 Jun 2023, Qing Zhao via Gcc-patches wrote:
> 
>> Are you suggesting to use identifier directly as the argument of the attribute?
>> I tried this in the beginning, however, the current parser for the attribute argument can not identify that this identifier is a field identifier inside the same structure. 
>> 
>> For example:
>> 
>> int count;
>> struct trailing_array_7 {
>>  Int count;
>>  int array_7[] __attribute ((element_count (count))); 
>> };
>> 
>> The identifier “count” inside the attribute will refer to the variable 
>> “int count” outside of the structure.
> 
> c_parser_attribute_arguments is supposed to allow an identifier as an 
> attribute argument - and not look it up (the user of the attribute would 
> later need to look it up in the context of the containing structure).  
> Callers use attribute_takes_identifier_p to determine which attributes 
> take identifiers (versus expressions) as arguments, which would need 
> updating to cover the new attribute.
> 
> There is a ??? comment about the case where the identifier is declared as 
> a type name.  That would simply be one of the cases carried over from the 
> old Bison parser, and it would seem reasonable to remove that 
> special-casing so that the attribute works even when the identifier is 
> declared as a typedef name as an ordinary identifier, since it's fine for 
> structure members to have the same name as a typedef name.
> 
> Certainly taking an identifier directly seems like cleaner syntax than 
> taking a string that then needs reinterpreting as an identifier.

Thanks a lot for the helpful info. I will study a little bit here to see how to do this.
Qing
> 
> -- 
> Joseph S. Myers
> joseph@codesourcery.com


^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [V1][PATCH 1/3] Provide element_count attribute to flexible array member field (PR108896)
  2023-06-07 22:05               ` Joseph Myers
  2023-06-08 13:06                 ` Qing Zhao
@ 2023-06-15 15:09                 ` Qing Zhao
  2023-06-15 16:55                   ` Joseph Myers
  1 sibling, 1 reply; 52+ messages in thread
From: Qing Zhao @ 2023-06-15 15:09 UTC (permalink / raw)
  To: Joseph Myers, Qing Zhao via Gcc-patches
  Cc: richard.guenther, jakub, keescook, siddhesh, uecker, isanbard

Hi, Joseph,

I studied c_parser_attribute_arguments and related source code, 
and also studied PLACEHOLDER_EXPR and related source code. 

Right now, I still cannot decide what’s the best user-interface should be for the 
argument of the new attribute “element_count". (Or the new attribute 
name “counted_by” as suggested by Kees). 

There are 3 possible interfaces for the argument of the new attribute “counted_by”:

The first 2 interfaces are the following A and B:

A. the argument of the new attribute “counted_by” is a string as the current patch:

struct trailing_array_A {
  Int count;
  int array_A[] __attribute ((counted_by (“count"))); 
};

B. The argument of the new attribute “counted_by” is an identifier that can be
 accepted by “c_parser_attribute_arguments”:

struct trailing_array_B {
  Int count;
  int array_B[] __attribute ((counted_by (count))); 
};

To implement this interface,  we need to adjust “attribute_takes_identifier_p” to 
accept the new attribute “counted_by” and then interpret this new identifier  “count” as a 
field of the containing structure by looking at the name.  (Otherwise, the identifier “count”
will be treated as an identifier in the current scope, which is not declared yet)

Comparing B with A, I don’t see too much benefit, either from user-interface point of view, 
or from implementation point of view.  

For implementation, both A and B need to search the fields of the containing structure by 
the name of the field “count”.

For user interface, I think that A and B are similar.

In addition to consider the user-interface and implementation, another concern is the possibility
 to extend the argument of this new attribute to an expression in the future, for example:

struct trailing_array_F {
  Int count;
  int array_F[] __attribute ((counted_by (count * 4))); 
};

In the above struct “trailing_array_F”, the argument of the attribute “counted_by” is “count * 4”, which
is an expression.  

If we plan to extend the argument of this new attribute to an expression, then neither A nor B is
good, right?

For this purpose, it might be cleaner to introduce a new syntax similar as the designator as mentioned in
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108896 as following, i.e, the approach C:

C. The argument of the new attribute “counted_by” is a new designator identifier that will be parsed as
The field of the containing structure:

struct trailing_array_C {
  Int count;
  int array_C[] __attribute ((counted_by (.count))); 
};

I think that after the C FE accepts this new designator syntax as the argument of the attribute, then it’s
very easy to extend the argument to an arbitrary expression later. 

For the implementation of this approach, my current thinking is: 

1. Update the routine “c_parser_postfix_expression” (is this the right place? ) to accept the new designator syntax.
2. Use “PLACEHOLDER_EXPR” to represent the containing structure, and build a COMPONENT_REF to hold
    the argument of the attribute in the IR.
3. When using this attribute in middle-end or sanitizer, use SUBSTITUTE_PLACEHOLDER_IN_EXPR(EXP, OBJ)
    to get the size info in IR. 

So, right now, I think that we might need to take the approach C?

What’s your opinion and suggestions?

Thanks a lot for your help.

Qing


> On Jun 7, 2023, at 6:05 PM, Joseph Myers <joseph@codesourcery.com> wrote:
> 
> On Wed, 7 Jun 2023, Qing Zhao via Gcc-patches wrote:
> 
>> Are you suggesting to use identifier directly as the argument of the attribute?
>> I tried this in the beginning, however, the current parser for the attribute argument can not identify that this identifier is a field identifier inside the same structure. 
>> 
>> For example:
>> 
>> int count;
>> struct trailing_array_7 {
>>  Int count;
>>  int array_7[] __attribute ((element_count (count))); 
>> };
>> 
>> The identifier “count” inside the attribute will refer to the variable 
>> “int count” outside of the structure.
> 
> c_parser_attribute_arguments is supposed to allow an identifier as an 
> attribute argument - and not look it up (the user of the attribute would 
> later need to look it up in the context of the containing structure).  
> Callers use attribute_takes_identifier_p to determine which attributes 
> take identifiers (versus expressions) as arguments, which would need 
> updating to cover the new attribute.
> 
> There is a ??? comment about the case where the identifier is declared as 
> a type name.  That would simply be one of the cases carried over from the 
> old Bison parser, and it would seem reasonable to remove that 
> special-casing so that the attribute works even when the identifier is 
> declared as a typedef name as an ordinary identifier, since it's fine for 
> structure members to have the same name as a typedef name.
> 
> Certainly taking an identifier directly seems like cleaner syntax than 
> taking a string that then needs reinterpreting as an identifier.
> 
> -- 
> Joseph S. Myers
> joseph@codesourcery.com


^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [V1][PATCH 1/3] Provide element_count attribute to flexible array member field (PR108896)
  2023-06-15 15:09                 ` Qing Zhao
@ 2023-06-15 16:55                   ` Joseph Myers
  2023-06-15 19:54                     ` Qing Zhao
  2023-06-16  7:21                     ` Martin Uecker
  0 siblings, 2 replies; 52+ messages in thread
From: Joseph Myers @ 2023-06-15 16:55 UTC (permalink / raw)
  To: Qing Zhao
  Cc: Qing Zhao via Gcc-patches, richard.guenther, jakub, keescook,
	siddhesh, uecker, isanbard

[-- Attachment #1: Type: text/plain, Size: 2590 bytes --]

On Thu, 15 Jun 2023, Qing Zhao via Gcc-patches wrote:

> Comparing B with A, I don’t see too much benefit, either from 
> user-interface point of view, or from implementation point of view.
> 
> For implementation, both A and B need to search the fields of the 
> containing structure by the name of the field “count”.
> 
> For user interface, I think that A and B are similar.

But as a language design matter, there are no standard C interfaces that 
interpret a string as an identifier, so doing so does not fit well with 
the language.

> 1. Update the routine “c_parser_postfix_expression” (is this the right 
> place? ) to accept the new designator syntax.

Any design that might work with an expression is the sort of thing that 
would likely involve many iterations on the specification (i.e. proposed 
wording changes to the C standard) for the interpretation of the new kinds 
of expressions, including how to resolve syntactic ambiguities and how 
name lookup works, before it could be considered ready to implement, and 
then a lot more work on the specification based on implementation 
experience.

Note that no expressions can start with the '.' token at present.  As soon 
as you invent a new kind of expression that can start with that token, you 
have syntactic ambiguity.

struct s1 { int c; char a[(struct s2 { int c; char b[.c]; }) {.c=.c}.c]; };

Is ".c=.c" a use of the existing syntax for designated initializers, with 
the first ".c" being a designator and the second being a use of the new 
kind of expression, or is it an assignment expression, where both the LHS 
and the RHS of the assignment use the new kind of expression?  And do 
those .c, when the use the new kind of expression, refer to the inner or 
outer struct definition?

There are obvious advantages to using tokens that don't introduce such an 
ambiguity with designators (i.e., not '.' as the token to start the new 
kind of expression, but something that cannot start a designator), if such 
tokens can be found.  But you still have the name lookup question when 
there are multiple nested structure definitions.  And the question of when 
expressions are considered to be evaluated, if they have side effects such 
as ".c=.c" does.

"Whatever falls out of the implementation" is not a good approach for 
language design here.  If you want a new kind of expressions here, you 
need a careful multi-implementation design phase that produces a proper 
specification and has good reasons for the particular choices made in 
cases of ambiguity.

-- 
Joseph S. Myers
joseph@codesourcery.com

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [V1][PATCH 1/3] Provide element_count attribute to flexible array member field (PR108896)
  2023-06-15 16:55                   ` Joseph Myers
@ 2023-06-15 19:54                     ` Qing Zhao
  2023-06-15 22:48                       ` Joseph Myers
  2023-06-16  7:21                     ` Martin Uecker
  1 sibling, 1 reply; 52+ messages in thread
From: Qing Zhao @ 2023-06-15 19:54 UTC (permalink / raw)
  To: Joseph Myers
  Cc: Qing Zhao via Gcc-patches, richard.guenther, jakub, keescook,
	siddhesh, uecker, isanbard



> On Jun 15, 2023, at 12:55 PM, Joseph Myers <joseph@codesourcery.com> wrote:
> 
> On Thu, 15 Jun 2023, Qing Zhao via Gcc-patches wrote:
> 
>> Comparing B with A, I don’t see too much benefit, either from 
>> user-interface point of view, or from implementation point of view.
>> 
>> For implementation, both A and B need to search the fields of the 
>> containing structure by the name of the field “count”.
>> 
>> For user interface, I think that A and B are similar.
> 
> But as a language design matter, there are no standard C interfaces that 
> interpret a string as an identifier, so doing so does not fit well with 
> the language.

Okay, makes sense.  So I will choose B over A. -:) 
> 
>> 1. Update the routine “c_parser_postfix_expression” (is this the right 
>> place? ) to accept the new designator syntax.
> 
> Any design that might work with an expression is the sort of thing that 
> would likely involve many iterations on the specification (i.e. proposed 
> wording changes to the C standard) for the interpretation of the new kinds 
> of expressions, including how to resolve syntactic ambiguities and how 
> name lookup works, before it could be considered ready to implement, and 
> then a lot more work on the specification based on implementation 
> experience.

Okay, I see the complication to make such new syntax into C standard…

> 
> Note that no expressions can start with the '.' token at present.  As soon 
> as you invent a new kind of expression that can start with that token, you 
> have syntactic ambiguity.
> 
> struct s1 { int c; char a[(struct s2 { int c; char b[.c]; }) {.c=.c}.c]; };
> 
> Is ".c=.c" a use of the existing syntax for designated initializers, with 
> the first ".c" being a designator and the second being a use of the new 
> kind of expression, or is it an assignment expression, where both the LHS 
> and the RHS of the assignment use the new kind of expression?  And do 
> those .c, when the use the new kind of expression, refer to the inner or 
> outer struct definition?

Okay, I see. Yes, this will be really confusing. 

> 
> There are obvious advantages to using tokens that don't introduce such an 
> ambiguity with designators (i.e., not '.' as the token to start the new 
> kind of expression, but something that cannot start a designator), if such 
> tokens can be found.  But you still have the name lookup question when 
> there are multiple nested structure definitions.  And the question of when 
> expressions are considered to be evaluated, if they have side effects such 
> as ".c=.c" does.
> 
> "Whatever falls out of the implementation" is not a good approach for 
> language design here.  If you want a new kind of expressions here, you 
> need a careful multi-implementation design phase that produces a proper 
> specification and has good reasons for the particular choices made in 
> cases of ambiguity.

Thanks a lot for your detailed explanation on the language design concerns. 
For this new attribute, I was convinced that it might not worth the effort to introduce a new syntax at this stage.

Another question,  whether it’s possible to extend such attribute later to accept expression as its argument if we take approach B:

B. The argument of the new attribute “counted_by” is an identifier that can be
accepted by “c_parser_attribute_arguments”:

struct trailing_array_B {
 Int count;
 int array_B[] __attribute ((counted_by (count))); 
};


From my current very limited understanding of the C FE source code, it’s not easy to extend the argument to an expression later for the above.
Is this understanding right?

(The motivation of accepting expression as the argument for the new attribute “counted_by” is 
   from the proposal for LLVM: https://discourse.llvm.org/t/rfc-enforcing-bounds-safety-in-c-fbounds-safety/70854:

	• __counted_by(N) : The pointer points to memory that contains N elements of pointee type. N is an expression of integer type which can be a simple reference to declaration, a constant including calls to constant functions, or an arithmetic expression that does not have side effect. The annotation cannot apply to pointers to incomplete types or types without size such as  void *.
)
 
thanks.
Qing

> 
> -- 
> Joseph S. Myers
> joseph@codesourcery.com


^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [V1][PATCH 1/3] Provide element_count attribute to flexible array member field (PR108896)
  2023-06-15 19:54                     ` Qing Zhao
@ 2023-06-15 22:48                       ` Joseph Myers
  2023-06-16 15:01                         ` Qing Zhao
  0 siblings, 1 reply; 52+ messages in thread
From: Joseph Myers @ 2023-06-15 22:48 UTC (permalink / raw)
  To: Qing Zhao
  Cc: Qing Zhao via Gcc-patches, richard.guenther, jakub, keescook,
	siddhesh, uecker, isanbard

[-- Attachment #1: Type: text/plain, Size: 961 bytes --]

On Thu, 15 Jun 2023, Qing Zhao via Gcc-patches wrote:

> B. The argument of the new attribute “counted_by” is an identifier that can be
> accepted by “c_parser_attribute_arguments”:
> 
> struct trailing_array_B {
>  Int count;
>  int array_B[] __attribute ((counted_by (count))); 
> };
> 
> 
> From my current very limited understanding of the C FE source code, it’s 
> not easy to extend the argument to an expression later for the above. Is 
> this understanding right?

It wouldn't be entirely compatible: if you change to interpreting the 
argument as an expression, then the above would suggest a global variable 
count is used (as opposed to some other syntax for referring to an element 
of the containing structure).

So an attribute that takes an element name might best be a *different* 
attribute from any potential future one taking an expression (with some 
new syntax to refer to an element).

-- 
Joseph S. Myers
joseph@codesourcery.com

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [V1][PATCH 1/3] Provide element_count attribute to flexible array member field (PR108896)
  2023-06-15 16:55                   ` Joseph Myers
  2023-06-15 19:54                     ` Qing Zhao
@ 2023-06-16  7:21                     ` Martin Uecker
  2023-06-16 15:14                       ` Qing Zhao
  2023-06-16 16:21                       ` Joseph Myers
  1 sibling, 2 replies; 52+ messages in thread
From: Martin Uecker @ 2023-06-16  7:21 UTC (permalink / raw)
  To: Joseph Myers, Qing Zhao
  Cc: Qing Zhao via Gcc-patches, richard.guenther, jakub, keescook,
	siddhesh, isanbard

Am Donnerstag, dem 15.06.2023 um 16:55 +0000 schrieb Joseph Myers:
> On Thu, 15 Jun 2023, Qing Zhao via Gcc-patches wrote:
> 
...
> > 1. Update the routine “c_parser_postfix_expression” (is this the right 
> > place? ) to accept the new designator syntax.
> 
> Any design that might work with an expression is the sort of thing that 
> would likely involve many iterations on the specification (i.e. proposed 
> wording changes to the C standard) for the interpretation of the new kinds 
> of expressions, including how to resolve syntactic ambiguities and how 
> name lookup works, before it could be considered ready to implement, and 
> then a lot more work on the specification based on implementation 
> experience.
> 
> Note that no expressions can start with the '.' token at present.  As soon 
> as you invent a new kind of expression that can start with that token, you 
> have syntactic ambiguity.
> 
> struct s1 { int c; char a[(struct s2 { int c; char b[.c]; }) {.c=.c}.c]; };
> 
> Is ".c=.c" a use of the existing syntax for designated initializers, with 
> the first ".c" being a designator and the second being a use of the new 
> kind of expression, or is it an assignment expression, where both the LHS 
> and the RHS of the assignment use the new kind of expression?  And do 
> those .c, when the use the new kind of expression, refer to the inner or 
> outer struct definition?

I would treat this is one integrated feature. Essentially .c is
somthing like this->c for the current struct for designated
initializer *and* size expressions because it is semantically 
so close.    In the initializer I would allow only 
the current use for designated initialization for all names of
member of the currently initialized struct,  so .c = .c would 
be invalid.   It should never refer to the outer struct if there
is a member with the same name in the inner struct, i.e. the
outside member is then hidden. 

So this would be ok:

struct s1 { int d; char a[(struct s2 { int c; char b[.c]; }) {.c=.d}.c]; };

Here the use of .d would be ok because it is not from the struct
currently initialized, but from an outside scope.

Martin





^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [V1][PATCH 1/3] Provide element_count attribute to flexible array member field (PR108896)
  2023-06-15 22:48                       ` Joseph Myers
@ 2023-06-16 15:01                         ` Qing Zhao
  0 siblings, 0 replies; 52+ messages in thread
From: Qing Zhao @ 2023-06-16 15:01 UTC (permalink / raw)
  To: Joseph Myers, Kees Cook
  Cc: Qing Zhao via Gcc-patches, richard.guenther, jakub, siddhesh,
	uecker, isanbard



> On Jun 15, 2023, at 6:48 PM, Joseph Myers <joseph@codesourcery.com> wrote:
> 
> On Thu, 15 Jun 2023, Qing Zhao via Gcc-patches wrote:
> 
>> B. The argument of the new attribute “counted_by” is an identifier that can be
>> accepted by “c_parser_attribute_arguments”:
>> 
>> struct trailing_array_B {
>> Int count;
>> int array_B[] __attribute ((counted_by (count))); 
>> };
>> 
>> 
>> From my current very limited understanding of the C FE source code, it’s 
>> not easy to extend the argument to an expression later for the above. Is 
>> this understanding right?
> 
> It wouldn't be entirely compatible: if you change to interpreting the 
> argument as an expression, then the above would suggest a global variable 
> count is used (as opposed to some other syntax for referring to an element 
> of the containing structure).

Yeah, that’s the reason I tried to introduce the new “.count” syntax for the argument 
of the new attribute in the very beginning in order to avoid such incompatible issue later.  -:)
> 
> So an attribute that takes an element name might best be a *different* 
> attribute from any potential future one taking an expression (with some 
> new syntax to refer to an element).

So, if we add this “counted_by (identifier)” attribute now, and later we need to add another
 new attribute “new_counted_by (expression)”  at that time if needed?

Kees, what’s your opinion on this?

thanks.

Qing
> 
> -- 
> Joseph S. Myers
> joseph@codesourcery.com


^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [V1][PATCH 1/3] Provide element_count attribute to flexible array member field (PR108896)
  2023-06-16  7:21                     ` Martin Uecker
@ 2023-06-16 15:14                       ` Qing Zhao
  2023-06-16 16:21                       ` Joseph Myers
  1 sibling, 0 replies; 52+ messages in thread
From: Qing Zhao @ 2023-06-16 15:14 UTC (permalink / raw)
  To: Martin Uecker, Joseph Myers
  Cc: Qing Zhao via Gcc-patches, richard.guenther, jakub, keescook,
	siddhesh, isanbard



> On Jun 16, 2023, at 3:21 AM, Martin Uecker <uecker@tugraz.at> wrote:
> 
> Am Donnerstag, dem 15.06.2023 um 16:55 +0000 schrieb Joseph Myers:
>> On Thu, 15 Jun 2023, Qing Zhao via Gcc-patches wrote:
>> 
> ...
>>> 1. Update the routine “c_parser_postfix_expression” (is this the right 
>>> place? ) to accept the new designator syntax.
>> 
>> Any design that might work with an expression is the sort of thing that 
>> would likely involve many iterations on the specification (i.e. proposed 
>> wording changes to the C standard) for the interpretation of the new kinds 
>> of expressions, including how to resolve syntactic ambiguities and how 
>> name lookup works, before it could be considered ready to implement, and 
>> then a lot more work on the specification based on implementation 
>> experience.
>> 
>> Note that no expressions can start with the '.' token at present.  As soon 
>> as you invent a new kind of expression that can start with that token, you 
>> have syntactic ambiguity.
>> 
>> struct s1 { int c; char a[(struct s2 { int c; char b[.c]; }) {.c=.c}.c]; };
>> 
>> Is ".c=.c" a use of the existing syntax for designated initializers, with 
>> the first ".c" being a designator and the second being a use of the new 
>> kind of expression, or is it an assignment expression, where both the LHS 
>> and the RHS of the assignment use the new kind of expression?  And do 
>> those .c, when the use the new kind of expression, refer to the inner or 
>> outer struct definition?
> 
> I would treat this is one integrated feature. Essentially .c is
> somthing like this->c for the current struct for designated
> initializer *and* size expressions because it is semantically 
> so close.  

Yes, I think this is reasonable. (Is “this” the immediate containing structure?)

>  In the initializer I would allow only 
> the current use for designated initialization for all names of
> member of the currently initialized struct,  so .c = .c would 
> be invalid.

Given “.c” basically is “this->c”, then .c = .c should be considered as
this->c = this->c, is such self-initialization allowed in C?

>   It should never refer to the outer struct if there
> is a member with the same name in the inner struct, i.e. the
> outside member is then hidden. 

Does the above mean:  if there is NO name conflicting, it could refer to a field of an outer struct?

Why this is allowed? Why just disallow all referring to the field of the outer structure since .c basically is this->c?
> 
> So this would be ok:
> 
> struct s1 { int d; char a[(struct s2 { int c; char b[.c]; }) {.c=.d}.c]; };
> 
> Here the use of .d would be ok because it is not from the struct
> currently initialized, but from an outside scope.

I think that the above .c=.d should report an error, since .d does not exist in the containing structure.

Do I miss anything here?

thanks.

Qing
> 
> Martin
> 
> 
> 
> 


^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [V1][PATCH 1/3] Provide element_count attribute to flexible array member field (PR108896)
  2023-06-16  7:21                     ` Martin Uecker
  2023-06-16 15:14                       ` Qing Zhao
@ 2023-06-16 16:21                       ` Joseph Myers
  2023-06-16 17:07                         ` Martin Uecker
  1 sibling, 1 reply; 52+ messages in thread
From: Joseph Myers @ 2023-06-16 16:21 UTC (permalink / raw)
  To: Martin Uecker
  Cc: Qing Zhao, Qing Zhao via Gcc-patches, richard.guenther, jakub,
	keescook, siddhesh, isanbard

[-- Attachment #1: Type: text/plain, Size: 2699 bytes --]

On Fri, 16 Jun 2023, Martin Uecker via Gcc-patches wrote:

> > Note that no expressions can start with the '.' token at present.  As soon 
> > as you invent a new kind of expression that can start with that token, you 
> > have syntactic ambiguity.
> > 
> > struct s1 { int c; char a[(struct s2 { int c; char b[.c]; }) {.c=.c}.c]; };
> > 
> > Is ".c=.c" a use of the existing syntax for designated initializers, with 
> > the first ".c" being a designator and the second being a use of the new 
> > kind of expression, or is it an assignment expression, where both the LHS 
> > and the RHS of the assignment use the new kind of expression?  And do 
> > those .c, when the use the new kind of expression, refer to the inner or 
> > outer struct definition?
> 
> I would treat this is one integrated feature. Essentially .c is
> somthing like this->c for the current struct for designated
> initializer *and* size expressions because it is semantically 
> so close.    In the initializer I would allow only 
> the current use for designated initialization for all names of
> member of the currently initialized struct,  so .c = .c would 
> be invalid.   It should never refer to the outer struct if there

I'm not clear on what the intended disambiguation rule here is, when "." 
is seen in initializer list context - does this rule depend on whether the 
following identifier is a member of the struct being initialized, so 
".c=.c" would be OK above if the initialized struct didn't have a member 
called c but the outer struct definition did?  That seems like a rather 
messy rule.  And does "would allow only" apply other than in the ambiguous 
context?  That seems to be implied by ".c=.c" being invalid above, because 
to make it invalid you need to disallow the new construct being used for 
the second .c, not just make the first .c interpreted as a designator.

Again, this sort of thing needs a detailed written specification, with 
multiple iterations discussed among different implementations.  The above 
paragraph doesn't make clear to me any of: the disambiguation rules; what 
is allowed in what context; how name lookup works (consider tricky cases 
such as a reference to an identifier declared *later* in the same struct, 
possibly in the context of C2x tag compatibility where a previous 
definition of the struct is visible); when these expressions get 
evaluated; what the underlying principles are behind those choices.

Using a token (existing or new) other than '.' - one that doesn't 
introduce ambiguity in any context where expressions can be used - would 
help significantly, although some of the issues would still apply.

-- 
Joseph S. Myers
joseph@codesourcery.com

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [V1][PATCH 1/3] Provide element_count attribute to flexible array member field (PR108896)
  2023-06-16 16:21                       ` Joseph Myers
@ 2023-06-16 17:07                         ` Martin Uecker
  2023-06-16 20:20                           ` Qing Zhao
  0 siblings, 1 reply; 52+ messages in thread
From: Martin Uecker @ 2023-06-16 17:07 UTC (permalink / raw)
  To: Joseph Myers
  Cc: Qing Zhao, Qing Zhao via Gcc-patches, richard.guenther, jakub,
	keescook, siddhesh, isanbard

Am Freitag, dem 16.06.2023 um 16:21 +0000 schrieb Joseph Myers:
> On Fri, 16 Jun 2023, Martin Uecker via Gcc-patches wrote:
> 
> > > Note that no expressions can start with the '.' token at present.  As soon 
> > > as you invent a new kind of expression that can start with that token, you 
> > > have syntactic ambiguity.
> > > 
> > > struct s1 { int c; char a[(struct s2 { int c; char b[.c]; }) {.c=.c}.c]; };
> > > 
> > > Is ".c=.c" a use of the existing syntax for designated initializers, with 
> > > the first ".c" being a designator and the second being a use of the new 
> > > kind of expression, or is it an assignment expression, where both the LHS 
> > > and the RHS of the assignment use the new kind of expression?  And do 
> > > those .c, when the use the new kind of expression, refer to the inner or 
> > > outer struct definition?
> > 
> > I would treat this is one integrated feature. Essentially .c is
> > somthing like this->c for the current struct for designated
> > initializer *and* size expressions because it is semantically 
> > so close.    In the initializer I would allow only 
> > the current use for designated initialization for all names of
> > member of the currently initialized struct,  so .c = .c would 
> > be invalid.   It should never refer to the outer struct if there
> 
> I'm not clear on what the intended disambiguation rule here is, when "." 
> is seen in initializer list context - does this rule depend on whether the 
> following identifier is a member of the struct being initialized, so 
> ".c=.c" would be OK above if the initialized struct didn't have a member 
> called c but the outer struct definition did? 

When initializers are parsed it is already clear what
the names of the members of the inner struct are, so
one can differentiate between designated initializers 
and potential other uses in an expression. 

So the main rule is: if you parse .something in a context
where a designator is allowed and "something" is a member
of the current struct, then it is a designator.

So for 

struct foo { int c; int buf[(struct { int d; }){ .d = .c }]; };

one knows during parsing that the .d is a designator
and that .c is not. For

struct foo { int c; int buf[(struct { int d; }){ .c = .c }]; };

one knows that both uses of .c are not.

Whether these different use cases should be allowed or not
is a different question, but my point is that there does
not seem to be a problem directly identifying the uses 
as a designator as usual. To me, this seems to imply that
it is safe to use the same syntax.

>  That seems like a rather 
> messy rule.  And does "would allow only" apply other than in the ambiguous 
> context?  That seems to be implied by ".c=.c" being invalid above, because 
> to make it invalid you need to disallow the new construct being used for 
> the second .c, not just make the first .c interpreted as a designator.

Yes. 
> 
> Again, this sort of thing needs a detailed written specification, with 
> multiple iterations discussed among different implementations. 

Oh, I agree with this.

>  The above 
> paragraph doesn't make clear to me any of: the disambiguation rules; what 
> is allowed in what context; how name lookup works (consider tricky cases 
> such as a reference to an identifier declared *later* in the same struct, 
> possibly in the context of C2x tag compatibility where a previous 
> definition of the struct is visible); when these expressions get 
> evaluated; what the underlying principles are behind those choices.

I also agree that all this needs careful consideration and written
rules.  My point is mereley that there does not seem to be a
fundamental issue differentiating the new feature from 
designators during parsing, so there may not be a risk using 
the same syntax.

> Using a token (existing or new) other than '.' - one that doesn't 
> introduce ambiguity in any context where expressions can be used - would 
> help significantly, although some of the issues would still apply.

The cost of using a new symbol is that one has two different
syntax for something which is semantically equivalent, i.e.
a notion to refer to a member of the current struct.

Martin

> 



^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [V1][PATCH 1/3] Provide element_count attribute to flexible array member field (PR108896)
  2023-06-16 17:07                         ` Martin Uecker
@ 2023-06-16 20:20                           ` Qing Zhao
  2023-06-16 21:35                             ` Joseph Myers
  0 siblings, 1 reply; 52+ messages in thread
From: Qing Zhao @ 2023-06-16 20:20 UTC (permalink / raw)
  To: Martin Uecker, Joseph Myers
  Cc: Qing Zhao via Gcc-patches, richard.guenther, jakub, keescook,
	siddhesh, isanbard



> On Jun 16, 2023, at 1:07 PM, Martin Uecker <uecker@tugraz.at> wrote:
> 
> Am Freitag, dem 16.06.2023 um 16:21 +0000 schrieb Joseph Myers:
>> On Fri, 16 Jun 2023, Martin Uecker via Gcc-patches wrote:
>> 
>>>> Note that no expressions can start with the '.' token at present.  As soon 
>>>> as you invent a new kind of expression that can start with that token, you 
>>>> have syntactic ambiguity.
>>>> 
>>>> struct s1 { int c; char a[(struct s2 { int c; char b[.c]; }) {.c=.c}.c]; };
>>>> 
>>>> Is ".c=.c" a use of the existing syntax for designated initializers, with 
>>>> the first ".c" being a designator and the second being a use of the new 
>>>> kind of expression, or is it an assignment expression, where both the LHS 
>>>> and the RHS of the assignment use the new kind of expression?  And do 
>>>> those .c, when the use the new kind of expression, refer to the inner or 
>>>> outer struct definition?
>>> 
>>> I would treat this is one integrated feature. Essentially .c is
>>> somthing like this->c for the current struct for designated
>>> initializer *and* size expressions because it is semantically 
>>> so close.    In the initializer I would allow only 
>>> the current use for designated initialization for all names of
>>> member of the currently initialized struct,  so .c = .c would 
>>> be invalid.   It should never refer to the outer struct if there
>> 
>> I'm not clear on what the intended disambiguation rule here is, when "." 
>> is seen in initializer list context - does this rule depend on whether the 
>> following identifier is a member of the struct being initialized, so 
>> ".c=.c" would be OK above if the initialized struct didn't have a member 
>> called c but the outer struct definition did? 
> 
> When initializers are parsed it is already clear what
> the names of the members of the inner struct are, so
> one can differentiate between designated initializers 
> and potential other uses in an expression. 
> 
> So the main rule is: if you parse .something in a context
> where a designator is allowed and "something" is a member
> of the current struct, then it is a designator.

So, Limiting the .something ONLY to the CURRENT structure/union might be the simple and clean rule.

And I guess that this is also the rule for the current designator initializer syntax in C99?

> 
> So for 
> 
> struct foo { int c; int buf[(struct { int d; }){ .d = .c }]; };
> 
> one knows during parsing that the .d is a designator
> and that .c is not.

Therefore, the above should be invalid based on this rule since .c is not a member in the current structure.


> For
> 
> struct foo { int c; int buf[(struct { int d; }){ .c = .c }]; };
> 
> one knows that both uses of .c are not.

And this also is invalid since .c is not to a member in the current structure. 

> 
> Whether these different use cases should be allowed or not
> is a different question, but my point is that there does
> not seem to be a problem directly identifying the uses 
> as a designator as usual. To me, this seems to imply that
> it is safe to use the same syntax.
> 
>> That seems like a rather 
>> messy rule.  And does "would allow only" apply other than in the ambiguous 
>> context?  That seems to be implied by ".c=.c" being invalid above, because 
>> to make it invalid you need to disallow the new construct being used for 
>> the second .c, not just make the first .c interpreted as a designator.
> 
> Yes. 
>> 
>> Again, this sort of thing needs a detailed written specification, with 
>> multiple iterations discussed among different implementations. 
> 
> Oh, I agree with this.
> 
>> The above 
>> paragraph doesn't make clear to me any of: the disambiguation rules; what 
>> is allowed in what context; how name lookup works (consider tricky cases 
>> such as a reference to an identifier declared *later* in the same struct, 
>> possibly in the context of C2x tag compatibility where a previous 
>> definition of the struct is visible); when these expressions get 
>> evaluated; what the underlying principles are behind those choices.
> 
> I also agree that all this needs careful consideration and written
> rules.  My point is mereley that there does not seem to be a
> fundamental issue differentiating the new feature from 
> designators during parsing, so there may not be a risk using 
> the same syntax.

Yes, I agree on this. 
Extending the existing designated initializer syntax, .member, for the purpose of the argument of the new attribute seems very natural. 
If we can use this syntax in the argument of this new attribute, 
1. it will be easy to extend the argument of this attribute to an expression. 
2. It will also easy to use this syntax later if we accept the following

struct foo {
    ...
    unsigned int count;
    ...
    int data[.count];
};


thanks.

Qing
> 
>> Using a token (existing or new) other than '.' - one that doesn't 
>> introduce ambiguity in any context where expressions can be used - would 
>> help significantly, although some of the issues would still apply.
> 
> The cost of using a new symbol is that one has two different
> syntax for something which is semantically equivalent, i.e.
> a notion to refer to a member of the current struct.
> 
> Martin


^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [V1][PATCH 1/3] Provide element_count attribute to flexible array member field (PR108896)
  2023-06-16 20:20                           ` Qing Zhao
@ 2023-06-16 21:35                             ` Joseph Myers
  2023-06-20 19:40                               ` Qing Zhao
  0 siblings, 1 reply; 52+ messages in thread
From: Joseph Myers @ 2023-06-16 21:35 UTC (permalink / raw)
  To: Qing Zhao
  Cc: Martin Uecker, Qing Zhao via Gcc-patches, richard.guenther,
	jakub, keescook, siddhesh, isanbard

On Fri, 16 Jun 2023, Qing Zhao via Gcc-patches wrote:

> > So for 
> > 
> > struct foo { int c; int buf[(struct { int d; }){ .d = .c }]; };
> > 
> > one knows during parsing that the .d is a designator
> > and that .c is not.
> 
> Therefore, the above should be invalid based on this rule since .c is 
> not a member in the current structure.

What do you mean by "current structure"?  I think two different concepts 
are being conflated: the structure *being initialized* (what the C 
standard calls the "current object" for a brace-enclosed initializer 
list), and the structure *being defined*.  The former is what's relevant 
for designators.  The latter is what's relevant for the suggested new 
syntax.  And .c *is* a member of the structure being defined in this 
example.

Those two structure types are always different, except for corner cases 
with C2x tag compatibility (where an object of structure type might be 
initialized in the middle of a redefinition of that type).

-- 
Joseph S. Myers
joseph@codesourcery.com

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [V1][PATCH 1/3] Provide element_count attribute to flexible array member field (PR108896)
  2023-06-16 21:35                             ` Joseph Myers
@ 2023-06-20 19:40                               ` Qing Zhao
  2023-06-27 15:44                                 ` Qing Zhao
  0 siblings, 1 reply; 52+ messages in thread
From: Qing Zhao @ 2023-06-20 19:40 UTC (permalink / raw)
  To: Joseph Myers
  Cc: Martin Uecker, Qing Zhao via Gcc-patches, richard.guenther,
	jakub, keescook, siddhesh, isanbard



> On Jun 16, 2023, at 5:35 PM, Joseph Myers <joseph@codesourcery.com> wrote:
> 
> On Fri, 16 Jun 2023, Qing Zhao via Gcc-patches wrote:
> 
>>> So for 
>>> 
>>> struct foo { int c; int buf[(struct { int d; }){ .d = .c }]; };
>>> 
>>> one knows during parsing that the .d is a designator
>>> and that .c is not.
>> 
>> Therefore, the above should be invalid based on this rule since .c is 
>> not a member in the current structure.
> 
> What do you mean by "current structure"?  I think two different concepts 
> are being conflated: the structure *being initialized* (what the C 
> standard calls the "current object" for a brace-enclosed initializer 
> list),

I think the concept of “current structure” should be stick to this. 

> and the structure *being defined*.
Not this.

(Forgive me about my poor English -:)).

Then it will be cleaner? 

What’s your opinion?


>  The former is what's relevant 
> for designators.  The latter is what's relevant for the suggested new 
> syntax.  And .c *is* a member of the structure being defined in this 
> example.
> 
> Those two structure types are always different, except for corner cases 
> with C2x tag compatibility (where an object of structure type might be 
> initialized in the middle of a redefinition of that type).

Can you give an example on this?  Thanks.

Qing
> 
> -- 
> Joseph S. Myers
> joseph@codesourcery.com


^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [V1][PATCH 1/3] Provide element_count attribute to flexible array member field (PR108896)
  2023-06-20 19:40                               ` Qing Zhao
@ 2023-06-27 15:44                                 ` Qing Zhao
  0 siblings, 0 replies; 52+ messages in thread
From: Qing Zhao @ 2023-06-27 15:44 UTC (permalink / raw)
  To: Joseph Myers, Martin Uecker, Kees Cook
  Cc: Qing Zhao via Gcc-patches, richard.guenther, jakub, keescook,
	siddhesh, isanbard

Hi,

Based on the discussion so far and further consideration, the following is my plan for this new attribute:

1.  The syntax of the new attribute will be:

__attribute__((counted_by (count_field_id)));

In the above, count_field_id is the identifier for the field that carries the number 
of elements info in the same structure of the FAM. 

For example:

struct object {
..
size_t count:  /* carries the number of elements info for the FAM flex.  */
int flex[] __attribute__((counted_by (count)));
};

2.  Later, if the argument of the this attribute need to be extended to an expression, we might need to 
extend the C FE to accept ".count”  in the future. 

Let me know if you have further comments and suggestions.

thanks.

Qing

> On Jun 20, 2023, at 3:40 PM, Qing Zhao via Gcc-patches <gcc-patches@gcc.gnu.org> wrote:
> 
> 
> 
>> On Jun 16, 2023, at 5:35 PM, Joseph Myers <joseph@codesourcery.com> wrote:
>> 
>> On Fri, 16 Jun 2023, Qing Zhao via Gcc-patches wrote:
>> 
>>>> So for 
>>>> 
>>>> struct foo { int c; int buf[(struct { int d; }){ .d = .c }]; };
>>>> 
>>>> one knows during parsing that the .d is a designator
>>>> and that .c is not.
>>> 
>>> Therefore, the above should be invalid based on this rule since .c is 
>>> not a member in the current structure.
>> 
>> What do you mean by "current structure"?  I think two different concepts 
>> are being conflated: the structure *being initialized* (what the C 
>> standard calls the "current object" for a brace-enclosed initializer 
>> list),
> 
> I think the concept of “current structure” should be stick to this. 
> 
>> and the structure *being defined*.
> Not this.
> 
> (Forgive me about my poor English -:)).
> 
> Then it will be cleaner? 
> 
> What’s your opinion?
> 
> 
>> The former is what's relevant 
>> for designators.  The latter is what's relevant for the suggested new 
>> syntax.  And .c *is* a member of the structure being defined in this 
>> example.
>> 
>> Those two structure types are always different, except for corner cases 
>> with C2x tag compatibility (where an object of structure type might be 
>> initialized in the middle of a redefinition of that type).
> 
> Can you give an example on this?  Thanks.
> 
> Qing
>> 
>> -- 
>> Joseph S. Myers
>> joseph@codesourcery.com


^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [V1][PATCH 0/3] New attribute "element_count" to annotate bounds for C99 FAM(PR108896)
  2023-05-26 20:40 ` Kees Cook
  2023-05-30 15:43   ` Qing Zhao
@ 2023-07-06 18:56   ` Qing Zhao
  2023-07-06 21:10     ` Martin Uecker
  2023-07-13 20:31     ` Kees Cook
  1 sibling, 2 replies; 52+ messages in thread
From: Qing Zhao @ 2023-07-06 18:56 UTC (permalink / raw)
  To: Kees Cook
  Cc: joseph, richard.guenther, jakub, gcc-patches, siddhesh, uecker, isanbard

Hi, Kees,

I have updated my V1 patch with the following changes:
A. changed the name to "counted_by"
B. changed the argument from a string to an identifier
C. updated the documentation and testing cases accordingly.

And then used this new gcc to test https://github.com/kees/kernel-tools/blob/trunk/fortify/array-bounds.c (with the following change)
[opc@qinzhao-ol8u3-x86 Kees]$ !1091
diff array-bounds.c array-bounds.c.org
32c32
< # define __counted_by(member)	__attribute__((counted_by (member)))
---
> # define __counted_by(member)	__attribute__((__element_count__(#member)))
34c34
< # define __counted_by(member)   __attribute__((counted_by (member)))
---
> # define __counted_by(member)	/* __attribute__((__element_count__(#member))) */

Then I got the following result:
[opc@qinzhao-ol8u3-x86 Kees]$ ./array-bounds 2>&1 | grep -v ^'#'
TAP version 13
1..12
ok 1 global.fixed_size_seen_by_bdos
ok 2 global.fixed_size_enforced_by_sanitizer
not ok 3 global.unknown_size_unknown_to_bdos
not ok 4 global.unknown_size_ignored_by_sanitizer
ok 5 global.alloc_size_seen_by_bdos
ok 6 global.alloc_size_enforced_by_sanitizer
not ok 7 global.element_count_seen_by_bdos
ok 8 global.element_count_enforced_by_sanitizer
not ok 9 global.alloc_size_with_smaller_element_count_seen_by_bdos
not ok 10 global.alloc_size_with_smaller_element_count_enforced_by_sanitizer
ok 11 global.alloc_size_with_bigger_element_count_seen_by_bdos
ok 12 global.alloc_size_with_bigger_element_count_enforced_by_sanitizer

The same as your previous results. Then I took a look at all the failed testing: 3, 4, 7, 9, and 10. And studied the reasons for all of them.

 in a summary, there are two major issues:
1.  The reason for the failed testing 7 is the same issue as I observed in https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109557
Which is not a bug, it’s an expected behavior. 

2. The common issue for  the failed testing 3, 4, 9, 10 is:

for the following annotated structure: 

====
struct annotated {
        unsigned long flags;
        size_t foo;
        int array[] __attribute__((counted_by (foo)));
};


struct annotated *p;
int index = 16;

p = malloc(sizeof(*p) + index * sizeof(*p->array));  // allocated real size 

p->foo = index + 2;  // p->foo was set by a different value than the real size of p->array as in test 9 and 10
or
p->foo was not set to any value as in test 3 and 4

====

i.e, the value of p->foo is NOT synced with the number of elements allocated for the array p->array.  

I think that this should be considered as an user error, and the documentation of the attribute should include
this requirement.  (In the LLVM’s RFC, such requirement was included in the programing model: 
https://discourse.llvm.org/t/rfc-enforcing-bounds-safety-in-c-fbounds-safety/70854#maintaining-correctness-of-bounds-annotations-18)

We can add a new warning option -Wcounted-by to report such user error if needed.

What’s your opinion on this?

thanks.

Qing


> On May 26, 2023, at 4:40 PM, Kees Cook <keescook@chromium.org> wrote:
> 
> On Thu, May 25, 2023 at 04:14:47PM +0000, Qing Zhao wrote:
>> GCC will pass the number of elements info from the attached attribute to both 
>> __builtin_dynamic_object_size and bounds sanitizer to check the out-of-bounds
>> or dynamic object size issues during runtime for flexible array members.
>> 
>> This new feature will provide nice protection to flexible array members (which
>> currently are completely ignored by both __builtin_dynamic_object_size and
>> bounds sanitizers).
> 
> Testing went pretty well, though I think I found some bdos issues:
> 
> - some things that bdos can't know the size of, and correctly returned
>  SIZE_MAX in the past, now thinks are 0-sized.
> - while bdos correctly knows the size of an element_count-annotated
>  flexible array, it doesn't know the size of the containing object
>  (i.e. it returns SIZE_MAX).
> 
> Also, I think I found a precedence issue:
> 
> - if both __alloc_size and 'element_count' are in use, the _smallest_
>  of the two is what I would expect to be enforced by the sanitizer
>  and reported by __bdos. As is, alloc_size appears to be used when
>  it is available, regardless of what 'element_count' shows.
> 
> I've updated my test cases to show it more clearly, but here is the
> before/after:
> 
> 
> GCC 13 (correctly does not implement "element_count"):
> 
> $ ./array-bounds 2>&1 | grep -v ^'#'
> TAP version 13
> 1..12
> ok 1 global.fixed_size_seen_by_bdos
> ok 2 global.fixed_size_enforced_by_sanitizer
> ok 3 global.unknown_size_unknown_to_bdos
> ok 4 global.unknown_size_ignored_by_sanitizer
> ok 5 global.alloc_size_seen_by_bdos
> ok 6 global.alloc_size_enforced_by_sanitizer
> not ok 7 global.element_count_seen_by_bdos
> not ok 8 global.element_count_enforced_by_sanitizer
> not ok 9 global.alloc_size_with_smaller_element_count_seen_by_bdos
> not ok 10 global.alloc_size_with_smaller_element_count_enforced_by_sanitizer
> ok 11 global.alloc_size_with_bigger_element_count_seen_by_bdos
> ok 12 global.alloc_size_with_bigger_element_count_enforced_by_sanitizer
> 
> 
> ToT GCC + this element_count series:
> 
> $ ./array-bounds 2>&1 | grep -v ^'#'
> TAP version 13
> 1..12
> ok 1 global.fixed_size_seen_by_bdos
> ok 2 global.fixed_size_enforced_by_sanitizer
> not ok 3 global.unknown_size_unknown_to_bdos
> not ok 4 global.unknown_size_ignored_by_sanitizer
> ok 5 global.alloc_size_seen_by_bdos
> ok 6 global.alloc_size_enforced_by_sanitizer
> not ok 7 global.element_count_seen_by_bdos
> ok 8 global.element_count_enforced_by_sanitizer
> not ok 9 global.alloc_size_with_smaller_element_count_seen_by_bdos
> not ok 10 global.alloc_size_with_smaller_element_count_enforced_by_sanitizer
> ok 11 global.alloc_size_with_bigger_element_count_seen_by_bdos
> ok 12 global.alloc_size_with_bigger_element_count_enforced_by_sanitizer
> 
> 
> Test suite is here:
> https://github.com/kees/kernel-tools/blob/trunk/fortify/array-bounds.c
> 
> -- 
> Kees Cook


^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [V1][PATCH 0/3] New attribute "element_count" to annotate bounds for C99 FAM(PR108896)
  2023-07-06 18:56   ` Qing Zhao
@ 2023-07-06 21:10     ` Martin Uecker
  2023-07-07 15:47       ` Qing Zhao
  2023-07-13 20:31     ` Kees Cook
  1 sibling, 1 reply; 52+ messages in thread
From: Martin Uecker @ 2023-07-06 21:10 UTC (permalink / raw)
  To: Qing Zhao, Kees Cook
  Cc: joseph, richard.guenther, jakub, gcc-patches, siddhesh, isanbard

Am Donnerstag, dem 06.07.2023 um 18:56 +0000 schrieb Qing Zhao:
> Hi, Kees,
> 
> I have updated my V1 patch with the following changes:
> A. changed the name to "counted_by"
> B. changed the argument from a string to an identifier
> C. updated the documentation and testing cases accordingly.
> 
> And then used this new gcc to test https://github.com/kees/kernel-tools/blob/trunk/fortify/array-bounds.c (with the following change)
> [opc@qinzhao-ol8u3-x86 Kees]$ !1091
> diff array-bounds.c array-bounds.c.org
> 32c32
> < # define __counted_by(member)	__attribute__((counted_by (member)))
> ---
> > # define __counted_by(member)	__attribute__((__element_count__(#member)))
> 34c34
> < # define __counted_by(member)   __attribute__((counted_by (member)))
> ---
> > # define __counted_by(member)	/* __attribute__((__element_count__(#member))) */
> 
> Then I got the following result:
> [opc@qinzhao-ol8u3-x86 Kees]$ ./array-bounds 2>&1 | grep -v ^'#'
> TAP version 13
> 1..12
> ok 1 global.fixed_size_seen_by_bdos
> ok 2 global.fixed_size_enforced_by_sanitizer
> not ok 3 global.unknown_size_unknown_to_bdos
> not ok 4 global.unknown_size_ignored_by_sanitizer
> ok 5 global.alloc_size_seen_by_bdos
> ok 6 global.alloc_size_enforced_by_sanitizer
> not ok 7 global.element_count_seen_by_bdos
> ok 8 global.element_count_enforced_by_sanitizer
> not ok 9 global.alloc_size_with_smaller_element_count_seen_by_bdos
> not ok 10 global.alloc_size_with_smaller_element_count_enforced_by_sanitizer
> ok 11 global.alloc_size_with_bigger_element_count_seen_by_bdos
> ok 12 global.alloc_size_with_bigger_element_count_enforced_by_sanitizer
> 
> The same as your previous results. Then I took a look at all the failed testing: 3, 4, 7, 9, and 10. And studied the reasons for all of them.
> 
>  in a summary, there are two major issues:
> 1.  The reason for the failed testing 7 is the same issue as I observed in https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109557
> Which is not a bug, it’s an expected behavior. 
> 
> 2. The common issue for  the failed testing 3, 4, 9, 10 is:
> 
> for the following annotated structure: 
> 
> ====
> struct annotated {
>         unsigned long flags;
>         size_t foo;
>         int array[] __attribute__((counted_by (foo)));
> };
> 
> 
> struct annotated *p;
> int index = 16;
> 
> p = malloc(sizeof(*p) + index * sizeof(*p->array));  // allocated real size 
> 
> p->foo = index + 2;  // p->foo was set by a different value than the real size of p->array as in test 9 and 10
> or
> p->foo was not set to any value as in test 3 and 4
> 
> ====
> 
> i.e, the value of p->foo is NOT synced with the number of elements allocated for the array p->array.  
> 
> I think that this should be considered as an user error, and the documentation of the attribute should include
> this requirement.  (In the LLVM’s RFC, such requirement was included in the programing model: 
> https://discourse.llvm.org/t/rfc-enforcing-bounds-safety-in-c-fbounds-safety/70854#maintaining-correctness-of-bounds-annotations-18)
> 
> We can add a new warning option -Wcounted-by to report such user error if needed.
> 
> What’s your opinion on this?


Additionally, we could also have a sanitizer that
checks this at run-time.


Personally, I am still not very happy that in the
following example the two 'n's refer to different
entities:

void f(int n)
{
    struct foo {
        int n;   
        int (*p[])[n] [[counted_by(n)]];
    };
}

But I guess it will be difficult to convince everybody
that it would be wise to use a new syntax for
disambiguation:

void f(int n)
{
    struct foo {
        int n;   
        int (*p[])[n] [[counted_by(.n)]];
    };
}

Martin


> 
> thanks.
> 
> Qing
> 
> 
> > On May 26, 2023, at 4:40 PM, Kees Cook <keescook@chromium.org> wrote:
> > 
> > On Thu, May 25, 2023 at 04:14:47PM +0000, Qing Zhao wrote:
> > > GCC will pass the number of elements info from the attached attribute to both 
> > > __builtin_dynamic_object_size and bounds sanitizer to check the out-of-bounds
> > > or dynamic object size issues during runtime for flexible array members.
> > > 
> > > This new feature will provide nice protection to flexible array members (which
> > > currently are completely ignored by both __builtin_dynamic_object_size and
> > > bounds sanitizers).
> > 
> > Testing went pretty well, though I think I found some bdos issues:
> > 
> > - some things that bdos can't know the size of, and correctly returned
> >  SIZE_MAX in the past, now thinks are 0-sized.
> > - while bdos correctly knows the size of an element_count-annotated
> >  flexible array, it doesn't know the size of the containing object
> >  (i.e. it returns SIZE_MAX).
> > 
> > Also, I think I found a precedence issue:
> > 
> > - if both __alloc_size and 'element_count' are in use, the _smallest_
> >  of the two is what I would expect to be enforced by the sanitizer
> >  and reported by __bdos. As is, alloc_size appears to be used when
> >  it is available, regardless of what 'element_count' shows.
> > 
> > I've updated my test cases to show it more clearly, but here is the
> > before/after:
> > 
> > 
> > GCC 13 (correctly does not implement "element_count"):
> > 
> > $ ./array-bounds 2>&1 | grep -v ^'#'
> > TAP version 13
> > 1..12
> > ok 1 global.fixed_size_seen_by_bdos
> > ok 2 global.fixed_size_enforced_by_sanitizer
> > ok 3 global.unknown_size_unknown_to_bdos
> > ok 4 global.unknown_size_ignored_by_sanitizer
> > ok 5 global.alloc_size_seen_by_bdos
> > ok 6 global.alloc_size_enforced_by_sanitizer
> > not ok 7 global.element_count_seen_by_bdos
> > not ok 8 global.element_count_enforced_by_sanitizer
> > not ok 9 global.alloc_size_with_smaller_element_count_seen_by_bdos
> > not ok 10 global.alloc_size_with_smaller_element_count_enforced_by_sanitizer
> > ok 11 global.alloc_size_with_bigger_element_count_seen_by_bdos
> > ok 12 global.alloc_size_with_bigger_element_count_enforced_by_sanitizer
> > 
> > 
> > ToT GCC + this element_count series:
> > 
> > $ ./array-bounds 2>&1 | grep -v ^'#'
> > TAP version 13
> > 1..12
> > ok 1 global.fixed_size_seen_by_bdos
> > ok 2 global.fixed_size_enforced_by_sanitizer
> > not ok 3 global.unknown_size_unknown_to_bdos
> > not ok 4 global.unknown_size_ignored_by_sanitizer
> > ok 5 global.alloc_size_seen_by_bdos
> > ok 6 global.alloc_size_enforced_by_sanitizer
> > not ok 7 global.element_count_seen_by_bdos
> > ok 8 global.element_count_enforced_by_sanitizer
> > not ok 9 global.alloc_size_with_smaller_element_count_seen_by_bdos
> > not ok 10 global.alloc_size_with_smaller_element_count_enforced_by_sanitizer
> > ok 11 global.alloc_size_with_bigger_element_count_seen_by_bdos
> > ok 12 global.alloc_size_with_bigger_element_count_enforced_by_sanitizer
> > 
> > 
> > Test suite is here:
> > https://github.com/kees/kernel-tools/blob/trunk/fortify/array-bounds.c
> > 
> > -- 
> > Kees Cook
> 



^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [V1][PATCH 0/3] New attribute "element_count" to annotate bounds for C99 FAM(PR108896)
  2023-07-06 21:10     ` Martin Uecker
@ 2023-07-07 15:47       ` Qing Zhao
  2023-07-07 20:21         ` Qing Zhao
  0 siblings, 1 reply; 52+ messages in thread
From: Qing Zhao @ 2023-07-07 15:47 UTC (permalink / raw)
  To: Martin Uecker
  Cc: Kees Cook, joseph, richard.guenther, jakub, gcc-patches,
	siddhesh, isanbard



> On Jul 6, 2023, at 5:10 PM, Martin Uecker <uecker@tugraz.at> wrote:
> 
> Am Donnerstag, dem 06.07.2023 um 18:56 +0000 schrieb Qing Zhao:
>> Hi, Kees,
>> 
>> I have updated my V1 patch with the following changes:
>> A. changed the name to "counted_by"
>> B. changed the argument from a string to an identifier
>> C. updated the documentation and testing cases accordingly.
>> 
>> And then used this new gcc to test https://github.com/kees/kernel-tools/blob/trunk/fortify/array-bounds.c (with the following change)
>> [opc@qinzhao-ol8u3-x86 Kees]$ !1091
>> diff array-bounds.c array-bounds.c.org
>> 32c32
>> < # define __counted_by(member)	__attribute__((counted_by (member)))
>> ---
>>> # define __counted_by(member)	__attribute__((__element_count__(#member)))
>> 34c34
>> < # define __counted_by(member)   __attribute__((counted_by (member)))
>> ---
>>> # define __counted_by(member)	/* __attribute__((__element_count__(#member))) */
>> 
>> Then I got the following result:
>> [opc@qinzhao-ol8u3-x86 Kees]$ ./array-bounds 2>&1 | grep -v ^'#'
>> TAP version 13
>> 1..12
>> ok 1 global.fixed_size_seen_by_bdos
>> ok 2 global.fixed_size_enforced_by_sanitizer
>> not ok 3 global.unknown_size_unknown_to_bdos
>> not ok 4 global.unknown_size_ignored_by_sanitizer
>> ok 5 global.alloc_size_seen_by_bdos
>> ok 6 global.alloc_size_enforced_by_sanitizer
>> not ok 7 global.element_count_seen_by_bdos
>> ok 8 global.element_count_enforced_by_sanitizer
>> not ok 9 global.alloc_size_with_smaller_element_count_seen_by_bdos
>> not ok 10 global.alloc_size_with_smaller_element_count_enforced_by_sanitizer
>> ok 11 global.alloc_size_with_bigger_element_count_seen_by_bdos
>> ok 12 global.alloc_size_with_bigger_element_count_enforced_by_sanitizer
>> 
>> The same as your previous results. Then I took a look at all the failed testing: 3, 4, 7, 9, and 10. And studied the reasons for all of them.
>> 
>>  in a summary, there are two major issues:
>> 1.  The reason for the failed testing 7 is the same issue as I observed in https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109557
>> Which is not a bug, it’s an expected behavior. 
>> 
>> 2. The common issue for  the failed testing 3, 4, 9, 10 is:
>> 
>> for the following annotated structure: 
>> 
>> ====
>> struct annotated {
>>         unsigned long flags;
>>         size_t foo;
>>         int array[] __attribute__((counted_by (foo)));
>> };
>> 
>> 
>> struct annotated *p;
>> int index = 16;
>> 
>> p = malloc(sizeof(*p) + index * sizeof(*p->array));  // allocated real size 
>> 
>> p->foo = index + 2;  // p->foo was set by a different value than the real size of p->array as in test 9 and 10
>> or
>> p->foo was not set to any value as in test 3 and 4
>> 
>> ====
>> 
>> i.e, the value of p->foo is NOT synced with the number of elements allocated for the array p->array.  
>> 
>> I think that this should be considered as an user error, and the documentation of the attribute should include
>> this requirement.  (In the LLVM’s RFC, such requirement was included in the programing model: 
>> https://discourse.llvm.org/t/rfc-enforcing-bounds-safety-in-c-fbounds-safety/70854#maintaining-correctness-of-bounds-annotations-18)
>> 
>> We can add a new warning option -Wcounted-by to report such user error if needed.
>> 
>> What’s your opinion on this?
> 
> 
> Additionally, we could also have a sanitizer that
> checks this at run-time.

Yes, that’s also a nice feature to have. 
I think that the main point here is to catch such user errors during compilation time or run time. 

I will add one or two separate patches for these compilation warning and sanitizer feature.


> 
> Personally, I am still not very happy that in the
> following example the two 'n's refer to different
> entities:
> 
> void f(int n)
> {
>    struct foo {
>        int n;   
>        int (*p[])[n] [[counted_by(n)]];
>    };
> }
> 
Me either )-:


> But I guess it will be difficult to convince everybody
> that it would be wise to use a new syntax for
> disambiguation:
> 
> void f(int n)
> {
>    struct foo {
>        int n;   
>        int (*p[])[n] [[counted_by(.n)]];
>    };
> }
> 

I guess that it’s quite hard to convince everyone that the new syntax is the best solution at this moment. 
And it might not worth the effort at this time.

We can do the new syntax later if necessary.

thanks.

Qing

> Martin
> 
> 
>> 
>> thanks.
>> 
>> Qing
>> 
>> 
>>> On May 26, 2023, at 4:40 PM, Kees Cook <keescook@chromium.org> wrote:
>>> 
>>> On Thu, May 25, 2023 at 04:14:47PM +0000, Qing Zhao wrote:
>>>> GCC will pass the number of elements info from the attached attribute to both 
>>>> __builtin_dynamic_object_size and bounds sanitizer to check the out-of-bounds
>>>> or dynamic object size issues during runtime for flexible array members.
>>>> 
>>>> This new feature will provide nice protection to flexible array members (which
>>>> currently are completely ignored by both __builtin_dynamic_object_size and
>>>> bounds sanitizers).
>>> 
>>> Testing went pretty well, though I think I found some bdos issues:
>>> 
>>> - some things that bdos can't know the size of, and correctly returned
>>>  SIZE_MAX in the past, now thinks are 0-sized.
>>> - while bdos correctly knows the size of an element_count-annotated
>>>  flexible array, it doesn't know the size of the containing object
>>>  (i.e. it returns SIZE_MAX).
>>> 
>>> Also, I think I found a precedence issue:
>>> 
>>> - if both __alloc_size and 'element_count' are in use, the _smallest_
>>>  of the two is what I would expect to be enforced by the sanitizer
>>>  and reported by __bdos. As is, alloc_size appears to be used when
>>>  it is available, regardless of what 'element_count' shows.
>>> 
>>> I've updated my test cases to show it more clearly, but here is the
>>> before/after:
>>> 
>>> 
>>> GCC 13 (correctly does not implement "element_count"):
>>> 
>>> $ ./array-bounds 2>&1 | grep -v ^'#'
>>> TAP version 13
>>> 1..12
>>> ok 1 global.fixed_size_seen_by_bdos
>>> ok 2 global.fixed_size_enforced_by_sanitizer
>>> ok 3 global.unknown_size_unknown_to_bdos
>>> ok 4 global.unknown_size_ignored_by_sanitizer
>>> ok 5 global.alloc_size_seen_by_bdos
>>> ok 6 global.alloc_size_enforced_by_sanitizer
>>> not ok 7 global.element_count_seen_by_bdos
>>> not ok 8 global.element_count_enforced_by_sanitizer
>>> not ok 9 global.alloc_size_with_smaller_element_count_seen_by_bdos
>>> not ok 10 global.alloc_size_with_smaller_element_count_enforced_by_sanitizer
>>> ok 11 global.alloc_size_with_bigger_element_count_seen_by_bdos
>>> ok 12 global.alloc_size_with_bigger_element_count_enforced_by_sanitizer
>>> 
>>> 
>>> ToT GCC + this element_count series:
>>> 
>>> $ ./array-bounds 2>&1 | grep -v ^'#'
>>> TAP version 13
>>> 1..12
>>> ok 1 global.fixed_size_seen_by_bdos
>>> ok 2 global.fixed_size_enforced_by_sanitizer
>>> not ok 3 global.unknown_size_unknown_to_bdos
>>> not ok 4 global.unknown_size_ignored_by_sanitizer
>>> ok 5 global.alloc_size_seen_by_bdos
>>> ok 6 global.alloc_size_enforced_by_sanitizer
>>> not ok 7 global.element_count_seen_by_bdos
>>> ok 8 global.element_count_enforced_by_sanitizer
>>> not ok 9 global.alloc_size_with_smaller_element_count_seen_by_bdos
>>> not ok 10 global.alloc_size_with_smaller_element_count_enforced_by_sanitizer
>>> ok 11 global.alloc_size_with_bigger_element_count_seen_by_bdos
>>> ok 12 global.alloc_size_with_bigger_element_count_enforced_by_sanitizer
>>> 
>>> 
>>> Test suite is here:
>>> https://github.com/kees/kernel-tools/blob/trunk/fortify/array-bounds.c
>>> 
>>> -- 
>>> Kees Cook
>> 
> 
> 


^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [V1][PATCH 0/3] New attribute "element_count" to annotate bounds for C99 FAM(PR108896)
  2023-07-07 15:47       ` Qing Zhao
@ 2023-07-07 20:21         ` Qing Zhao
  0 siblings, 0 replies; 52+ messages in thread
From: Qing Zhao @ 2023-07-07 20:21 UTC (permalink / raw)
  To: Martin Uecker, Kees Cook, Joseph Myers
  Cc: richard.guenther, jakub, gcc-patches, siddhesh, isanbard

The following is the updated documentation on this new attribute, please let me know any suggestion and comment:

======

'counted_by (COUNT)'
     The 'counted_by' attribute may be attached to the flexible array
     member of a structure.  It indicates that the number of the
     elements of the array is given by the field named "COUNT" in the
     same structure as the flexible array member.  GCC uses this
     information to improve the results of
     '__builtin_dynamic_object_size' and array bound sanitizer.

     For instance, the following declaration:

          struct P {
            size_t count;
            int array[] __attribute__ ((counted_by (count)));
          } *p;

     specifies that the 'array' is a flexible array member whose number
     of elements is given by the field 'count' in the same structure.

     The field that represents the number of the elements should have an
     integer type.  An explicit 'counted_by' annotation defines a
     relationship between two objects, 'p->array' and 'p->count', that
     'p->array' has 'p->count' number of elements available.  This
     relationship must hold even after any of these related objects are
     updated.  It's the user's responsibility to make sure this
     relationship to be kept all the time.  Otherwise the results of the
     '__builtin_dynamic_object_size' and array bound sanitizer might be
     incorrect.

     For instance, the following 2nd update to the field 'count' of the
     above structure will permit out-of-bounds access to the array
     'sbuf>array':

          struct P *sbuf;
          void alloc_buf (size_t nelems)
          {
            sbuf = (int *) malloc (sizeof (struct P) + sizeof (int) * nelems);
            sbuf->count = nelems;
          }
          void use_buf (int index)
          {
            sbuf->count++;
            /* Now the value of sbuf->count and the number
               of elements of sbuf->array is out of sync.  */
            sbuf->array[index] = 0;
            /* then the out-of-bound access to this array
               might not be detected.  */
          }

     The users can use the warning option '-Wcounted-by-attribute' to
     detect such user errors during compilation time, or the sanitizer
     option '-fsanitize=counted-by-attribute' to detect such user errors
     during runtime.

=====

Qing

> On Jul 7, 2023, at 11:47 AM, Qing Zhao via Gcc-patches <gcc-patches@gcc.gnu.org> wrote:
> 
> 
> 
>> On Jul 6, 2023, at 5:10 PM, Martin Uecker <uecker@tugraz.at> wrote:
>> 
>> Am Donnerstag, dem 06.07.2023 um 18:56 +0000 schrieb Qing Zhao:
>>> Hi, Kees,
>>> 
>>> I have updated my V1 patch with the following changes:
>>> A. changed the name to "counted_by"
>>> B. changed the argument from a string to an identifier
>>> C. updated the documentation and testing cases accordingly.
>>> 
>>> And then used this new gcc to test https://github.com/kees/kernel-tools/blob/trunk/fortify/array-bounds.c (with the following change)
>>> [opc@qinzhao-ol8u3-x86 Kees]$ !1091
>>> diff array-bounds.c array-bounds.c.org
>>> 32c32
>>> < # define __counted_by(member)	__attribute__((counted_by (member)))
>>> ---
>>>> # define __counted_by(member)	__attribute__((__element_count__(#member)))
>>> 34c34
>>> < # define __counted_by(member)   __attribute__((counted_by (member)))
>>> ---
>>>> # define __counted_by(member)	/* __attribute__((__element_count__(#member))) */
>>> 
>>> Then I got the following result:
>>> [opc@qinzhao-ol8u3-x86 Kees]$ ./array-bounds 2>&1 | grep -v ^'#'
>>> TAP version 13
>>> 1..12
>>> ok 1 global.fixed_size_seen_by_bdos
>>> ok 2 global.fixed_size_enforced_by_sanitizer
>>> not ok 3 global.unknown_size_unknown_to_bdos
>>> not ok 4 global.unknown_size_ignored_by_sanitizer
>>> ok 5 global.alloc_size_seen_by_bdos
>>> ok 6 global.alloc_size_enforced_by_sanitizer
>>> not ok 7 global.element_count_seen_by_bdos
>>> ok 8 global.element_count_enforced_by_sanitizer
>>> not ok 9 global.alloc_size_with_smaller_element_count_seen_by_bdos
>>> not ok 10 global.alloc_size_with_smaller_element_count_enforced_by_sanitizer
>>> ok 11 global.alloc_size_with_bigger_element_count_seen_by_bdos
>>> ok 12 global.alloc_size_with_bigger_element_count_enforced_by_sanitizer
>>> 
>>> The same as your previous results. Then I took a look at all the failed testing: 3, 4, 7, 9, and 10. And studied the reasons for all of them.
>>> 
>>> in a summary, there are two major issues:
>>> 1.  The reason for the failed testing 7 is the same issue as I observed in https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109557
>>> Which is not a bug, it’s an expected behavior. 
>>> 
>>> 2. The common issue for  the failed testing 3, 4, 9, 10 is:
>>> 
>>> for the following annotated structure: 
>>> 
>>> ====
>>> struct annotated {
>>>        unsigned long flags;
>>>        size_t foo;
>>>        int array[] __attribute__((counted_by (foo)));
>>> };
>>> 
>>> 
>>> struct annotated *p;
>>> int index = 16;
>>> 
>>> p = malloc(sizeof(*p) + index * sizeof(*p->array));  // allocated real size 
>>> 
>>> p->foo = index + 2;  // p->foo was set by a different value than the real size of p->array as in test 9 and 10
>>> or
>>> p->foo was not set to any value as in test 3 and 4
>>> 
>>> ====
>>> 
>>> i.e, the value of p->foo is NOT synced with the number of elements allocated for the array p->array.  
>>> 
>>> I think that this should be considered as an user error, and the documentation of the attribute should include
>>> this requirement.  (In the LLVM’s RFC, such requirement was included in the programing model: 
>>> https://discourse.llvm.org/t/rfc-enforcing-bounds-safety-in-c-fbounds-safety/70854#maintaining-correctness-of-bounds-annotations-18)
>>> 
>>> We can add a new warning option -Wcounted-by to report such user error if needed.
>>> 
>>> What’s your opinion on this?
>> 
>> 
>> Additionally, we could also have a sanitizer that
>> checks this at run-time.
> 
> Yes, that’s also a nice feature to have. 
> I think that the main point here is to catch such user errors during compilation time or run time. 
> 
> I will add one or two separate patches for these compilation warning and sanitizer feature.
> 
> 
>> 
>> Personally, I am still not very happy that in the
>> following example the two 'n's refer to different
>> entities:
>> 
>> void f(int n)
>> {
>>   struct foo {
>>       int n;   
>>       int (*p[])[n] [[counted_by(n)]];
>>   };
>> }
>> 
> Me either )-:
> 
> 
>> But I guess it will be difficult to convince everybody
>> that it would be wise to use a new syntax for
>> disambiguation:
>> 
>> void f(int n)
>> {
>>   struct foo {
>>       int n;   
>>       int (*p[])[n] [[counted_by(.n)]];
>>   };
>> }
>> 
> 
> I guess that it’s quite hard to convince everyone that the new syntax is the best solution at this moment. 
> And it might not worth the effort at this time.
> 
> We can do the new syntax later if necessary.
> 
> thanks.
> 
> Qing
> 
>> Martin
>> 
>> 
>>> 
>>> thanks.
>>> 
>>> Qing
>>> 
>>> 
>>>> On May 26, 2023, at 4:40 PM, Kees Cook <keescook@chromium.org> wrote:
>>>> 
>>>> On Thu, May 25, 2023 at 04:14:47PM +0000, Qing Zhao wrote:
>>>>> GCC will pass the number of elements info from the attached attribute to both 
>>>>> __builtin_dynamic_object_size and bounds sanitizer to check the out-of-bounds
>>>>> or dynamic object size issues during runtime for flexible array members.
>>>>> 
>>>>> This new feature will provide nice protection to flexible array members (which
>>>>> currently are completely ignored by both __builtin_dynamic_object_size and
>>>>> bounds sanitizers).
>>>> 
>>>> Testing went pretty well, though I think I found some bdos issues:
>>>> 
>>>> - some things that bdos can't know the size of, and correctly returned
>>>> SIZE_MAX in the past, now thinks are 0-sized.
>>>> - while bdos correctly knows the size of an element_count-annotated
>>>> flexible array, it doesn't know the size of the containing object
>>>> (i.e. it returns SIZE_MAX).
>>>> 
>>>> Also, I think I found a precedence issue:
>>>> 
>>>> - if both __alloc_size and 'element_count' are in use, the _smallest_
>>>> of the two is what I would expect to be enforced by the sanitizer
>>>> and reported by __bdos. As is, alloc_size appears to be used when
>>>> it is available, regardless of what 'element_count' shows.
>>>> 
>>>> I've updated my test cases to show it more clearly, but here is the
>>>> before/after:
>>>> 
>>>> 
>>>> GCC 13 (correctly does not implement "element_count"):
>>>> 
>>>> $ ./array-bounds 2>&1 | grep -v ^'#'
>>>> TAP version 13
>>>> 1..12
>>>> ok 1 global.fixed_size_seen_by_bdos
>>>> ok 2 global.fixed_size_enforced_by_sanitizer
>>>> ok 3 global.unknown_size_unknown_to_bdos
>>>> ok 4 global.unknown_size_ignored_by_sanitizer
>>>> ok 5 global.alloc_size_seen_by_bdos
>>>> ok 6 global.alloc_size_enforced_by_sanitizer
>>>> not ok 7 global.element_count_seen_by_bdos
>>>> not ok 8 global.element_count_enforced_by_sanitizer
>>>> not ok 9 global.alloc_size_with_smaller_element_count_seen_by_bdos
>>>> not ok 10 global.alloc_size_with_smaller_element_count_enforced_by_sanitizer
>>>> ok 11 global.alloc_size_with_bigger_element_count_seen_by_bdos
>>>> ok 12 global.alloc_size_with_bigger_element_count_enforced_by_sanitizer
>>>> 
>>>> 
>>>> ToT GCC + this element_count series:
>>>> 
>>>> $ ./array-bounds 2>&1 | grep -v ^'#'
>>>> TAP version 13
>>>> 1..12
>>>> ok 1 global.fixed_size_seen_by_bdos
>>>> ok 2 global.fixed_size_enforced_by_sanitizer
>>>> not ok 3 global.unknown_size_unknown_to_bdos
>>>> not ok 4 global.unknown_size_ignored_by_sanitizer
>>>> ok 5 global.alloc_size_seen_by_bdos
>>>> ok 6 global.alloc_size_enforced_by_sanitizer
>>>> not ok 7 global.element_count_seen_by_bdos
>>>> ok 8 global.element_count_enforced_by_sanitizer
>>>> not ok 9 global.alloc_size_with_smaller_element_count_seen_by_bdos
>>>> not ok 10 global.alloc_size_with_smaller_element_count_enforced_by_sanitizer
>>>> ok 11 global.alloc_size_with_bigger_element_count_seen_by_bdos
>>>> ok 12 global.alloc_size_with_bigger_element_count_enforced_by_sanitizer
>>>> 
>>>> 
>>>> Test suite is here:
>>>> https://github.com/kees/kernel-tools/blob/trunk/fortify/array-bounds.c
>>>> 
>>>> -- 
>>>> Kees Cook


^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [V1][PATCH 0/3] New attribute "element_count" to annotate bounds for C99 FAM(PR108896)
  2023-07-06 18:56   ` Qing Zhao
  2023-07-06 21:10     ` Martin Uecker
@ 2023-07-13 20:31     ` Kees Cook
  2023-07-17 21:17       ` Qing Zhao
  1 sibling, 1 reply; 52+ messages in thread
From: Kees Cook @ 2023-07-13 20:31 UTC (permalink / raw)
  To: Qing Zhao
  Cc: joseph, richard.guenther, jakub, gcc-patches, siddhesh, uecker, isanbard

On Thu, Jul 06, 2023 at 06:56:21PM +0000, Qing Zhao wrote:
> Hi, Kees,
> 
> I have updated my V1 patch with the following changes:
> A. changed the name to "counted_by"
> B. changed the argument from a string to an identifier
> C. updated the documentation and testing cases accordingly.

Sounds great!

> 
> And then used this new gcc to test https://github.com/kees/kernel-tools/blob/trunk/fortify/array-bounds.c (with the following change)
> [opc@qinzhao-ol8u3-x86 Kees]$ !1091
> diff array-bounds.c array-bounds.c.org
> 32c32
> < # define __counted_by(member)	__attribute__((counted_by (member)))
> ---
> > # define __counted_by(member)	__attribute__((__element_count__(#member)))
> 34c34
> < # define __counted_by(member)   __attribute__((counted_by (member)))
> ---
> > # define __counted_by(member)	/* __attribute__((__element_count__(#member))) */
> 
> Then I got the following result:
> [opc@qinzhao-ol8u3-x86 Kees]$ ./array-bounds 2>&1 | grep -v ^'#'
> TAP version 13
> 1..12
> ok 1 global.fixed_size_seen_by_bdos
> ok 2 global.fixed_size_enforced_by_sanitizer
> not ok 3 global.unknown_size_unknown_to_bdos
> not ok 4 global.unknown_size_ignored_by_sanitizer
> ok 5 global.alloc_size_seen_by_bdos
> ok 6 global.alloc_size_enforced_by_sanitizer
> not ok 7 global.element_count_seen_by_bdos
> ok 8 global.element_count_enforced_by_sanitizer
> not ok 9 global.alloc_size_with_smaller_element_count_seen_by_bdos
> not ok 10 global.alloc_size_with_smaller_element_count_enforced_by_sanitizer
> ok 11 global.alloc_size_with_bigger_element_count_seen_by_bdos
> ok 12 global.alloc_size_with_bigger_element_count_enforced_by_sanitizer
> 
> The same as your previous results. Then I took a look at all the failed testing: 3, 4, 7, 9, and 10. And studied the reasons for all of them.
> 
>  in a summary, there are two major issues:
> 1.  The reason for the failed testing 7 is the same issue as I observed in https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109557
> Which is not a bug, it’s an expected behavior. 

In the bug, the problem is that "p" isn't known to be allocated, if I'm
reading that correctly? I'm not sure this is a reasonable behavior, but
let me get into the specific test, which looks like this:

TEST(counted_by_seen_by_bdos)
{
        struct annotated *p;
        int index = MAX_INDEX + unconst;

        p = alloc_annotated(index);

        REPORT_SIZE(p->array);
/* 1 */ EXPECT_EQ(sizeof(*p), offsetof(typeof(*p), array));
        /* Check array size alone. */
/* 2 */ EXPECT_EQ(__builtin_object_size(p->array, 1), SIZE_MAX);
/* 3 */ EXPECT_EQ(__builtin_dynamic_object_size(p->array, 1), p->foo * sizeof(*p->array));
        /* Check check entire object size. */
/* 4 */ EXPECT_EQ(__builtin_object_size(p, 1), SIZE_MAX);
/* 5 */ EXPECT_EQ(__builtin_dynamic_object_size(p, 1), sizeof(*p) + p->foo * sizeof(*p->array));
}

Test 1 trivially passes -- this is just a sanity check.

Test 2 should be SIZE_MAX because __bos only knows compile-time sizes.

Test 3 should pass: counted_by knows the allocation size and can be
queried as part of the "p" object.

Test 4 should be SIZE_MAX because __bos only knows compile-time sizes.

Test 5 should pass as well, since, again, p can be examined. Passing p
to __bdos implies it is allocated and the __counted_by annotation can be
examined.

If "return p->array[index];" would be caught by the sanitizer, then
it follows that __builtin_dynamic_object_size(p, 1) must also know the
size. i.e. both must examine "p" and its trailing flexible array and
__counted_by annotation.

> 
> 2. The common issue for  the failed testing 3, 4, 9, 10 is:
> 
> for the following annotated structure: 
> 
> ====
> struct annotated {
>         unsigned long flags;
>         size_t foo;
>         int array[] __attribute__((counted_by (foo)));
> };
> 
> 
> struct annotated *p;
> int index = 16;
> 
> p = malloc(sizeof(*p) + index * sizeof(*p->array));  // allocated real size 
> 
> p->foo = index + 2;  // p->foo was set by a different value than the real size of p->array as in test 9 and 10

Right, tests 9 and 10 are checking that the _smallest_ possible value of
the array is used. (There are two sources of information: the allocation
size and the size calculated by counted_by. The smaller of the two
should be used when both are available.)

> or
> p->foo was not set to any value as in test 3 and 4

For tests 3 and 4, yes, this was my mistake. I have fixed up the tests
now. Bill noticed the same issue. Sorry for the think-o!

> ====
> 
> i.e, the value of p->foo is NOT synced with the number of elements allocated for the array p->array.  
> 
> I think that this should be considered as an user error, and the documentation of the attribute should include
> this requirement.  (In the LLVM’s RFC, such requirement was included in the programing model: 
> https://discourse.llvm.org/t/rfc-enforcing-bounds-safety-in-c-fbounds-safety/70854#maintaining-correctness-of-bounds-annotations-18)
> 
> We can add a new warning option -Wcounted-by to report such user error if needed.
> 
> What’s your opinion on this?

I think it is correct to allow mismatch between allocation and
counted_by as long as only the least of the two is used. This may be
desirable in a few situations. One example would be a large allocation
that is slowly filled up by the program. I.e. the counted_by member is
slowly increased during runtime (but not beyond the true allocation size).

Of course allocation size is only available in limited situations, so
the loss of that info is fine: we have counted_by for everything else.

-- 
Kees Cook

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [V1][PATCH 0/3] New attribute "element_count" to annotate bounds for C99 FAM(PR108896)
  2023-07-13 20:31     ` Kees Cook
@ 2023-07-17 21:17       ` Qing Zhao
  2023-07-17 23:40         ` Kees Cook
  0 siblings, 1 reply; 52+ messages in thread
From: Qing Zhao @ 2023-07-17 21:17 UTC (permalink / raw)
  To: Kees Cook
  Cc: joseph, richard.guenther, jakub, gcc-patches, siddhesh, uecker, isanbard



> On Jul 13, 2023, at 4:31 PM, Kees Cook <keescook@chromium.org> wrote:
> 
> In the bug, the problem is that "p" isn't known to be allocated, if I'm
> reading that correctly?

I think that the major point in PR109557 (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109557):

for the following pointer p.3_1, 

p.3_1 = p;
_2 = __builtin_object_size (p.3_1, 0);

Question: why the size of p.3_1 cannot use the TYPE_SIZE of the pointee of p when the TYPE_SIZE can be determined at compile time?

Answer:  From just knowing the type of the pointee of p, the compiler cannot determine the size of the object.  

Therefore the bug has been closed. 

In your following testing 5:

> I'm not sure this is a reasonable behavior, but
> let me get into the specific test, which looks like this:
> 
> TEST(counted_by_seen_by_bdos)
> {
>        struct annotated *p;
>        int index = MAX_INDEX + unconst;
> 
>        p = alloc_annotated(index);
> 
>        REPORT_SIZE(p->array);
> /* 1 */ EXPECT_EQ(sizeof(*p), offsetof(typeof(*p), array));
>        /* Check array size alone. */
> /* 2 */ EXPECT_EQ(__builtin_object_size(p->array, 1), SIZE_MAX);
> /* 3 */ EXPECT_EQ(__builtin_dynamic_object_size(p->array, 1), p->foo * sizeof(*p->array));
>        /* Check check entire object size. */
> /* 4 */ EXPECT_EQ(__builtin_object_size(p, 1), SIZE_MAX);
> /* 5 */ EXPECT_EQ(__builtin_dynamic_object_size(p, 1), sizeof(*p) + p->foo * sizeof(*p->array));
> }
> 
> Test 5 should pass as well, since, again, p can be examined. Passing p
> to __bdos implies it is allocated and the __counted_by annotation can be
> examined.

Since the call to the routine “alloc_annotated" cannot be inlined, GCC does not see any allocation calls for the pointer p.
At the same time, due to the same reason as PR109986, GCC cannot determine the size of the object by just knowing the TYPE_SIZE
of the pointee of p.  

So, this is exactly the same issue as PR109557.  It’s an existing behavior per the current __buitlin_object_size algorithm.
I am still not very sure whether the situation in PR109557 can be improved or not, but either way, it’s a separate issue.

Please see the new testing case I added for PR109557, you will see that the above case 5 is a similar case as the new testing case in PR109557.

> 
> If "return p->array[index];" would be caught by the sanitizer, then
> it follows that __builtin_dynamic_object_size(p, 1) must also know the
> size. i.e. both must examine "p" and its trailing flexible array and
> __counted_by annotation.
> 
>> 
>> 2. The common issue for  the failed testing 3, 4, 9, 10 is:
>> 
>> for the following annotated structure: 
>> 
>> ====
>> struct annotated {
>>        unsigned long flags;
>>        size_t foo;
>>        int array[] __attribute__((counted_by (foo)));
>> };
>> 
>> 
>> struct annotated *p;
>> int index = 16;
>> 
>> p = malloc(sizeof(*p) + index * sizeof(*p->array));  // allocated real size 
>> 
>> p->foo = index + 2;  // p->foo was set by a different value than the real size of p->array as in test 9 and 10
> 
> Right, tests 9 and 10 are checking that the _smallest_ possible value of
> the array is used. (There are two sources of information: the allocation
> size and the size calculated by counted_by. The smaller of the two
> should be used when both are available.)

The counted_by attribute is used to annotate a Flexible array member on how many elements it will have.
However, if this information can not accurately reflect the real number of elements for the array allocated, 
What’s the purpose of such information? 

>> or
>> p->foo was not set to any value as in test 3 and 4
> 
> For tests 3 and 4, yes, this was my mistake. I have fixed up the tests
> now. Bill noticed the same issue. Sorry for the think-o!
> 
>> ====
>> 
>> i.e, the value of p->foo is NOT synced with the number of elements allocated for the array p->array.  
>> 
>> I think that this should be considered as an user error, and the documentation of the attribute should include
>> this requirement.  (In the LLVM’s RFC, such requirement was included in the programing model: 
>> https://discourse.llvm.org/t/rfc-enforcing-bounds-safety-in-c-fbounds-safety/70854#maintaining-correctness-of-bounds-annotations-18)
>> 
>> We can add a new warning option -Wcounted-by to report such user error if needed.
>> 
>> What’s your opinion on this?
> 
> I think it is correct to allow mismatch between allocation and
> counted_by as long as only the least of the two is used.

What’s your mean by “only the least of the two is used”?

> This may be
> desirable in a few situations. One example would be a large allocation
> that is slowly filled up by the program.

So, for such situation, whenever the allocation is filled up, the field that hold the “counted_by” attribute should be increased at the same time,
Then, the “counted_by” value always sync with the real allocation. 
> I.e. the counted_by member is
> slowly increased during runtime (but not beyond the true allocation size).

Then there should be source code to increase the “counted_by” field whenever the allocated space increased too. 
> 
> Of course allocation size is only available in limited situations, so
> the loss of that info is fine: we have counted_by for everything else.

The point is: allocation size should synced with the value of “counted_by”. LLVM’s RFC also have the similar requirement:
https://discourse.llvm.org/t/rfc-enforcing-bounds-safety-in-c-fbounds-safety/70854#maintaining-correctness-of-bounds-annotations-18

Qing
> 
> -- 
> Kees Cook


^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [V1][PATCH 0/3] New attribute "element_count" to annotate bounds for C99 FAM(PR108896)
  2023-07-17 21:17       ` Qing Zhao
@ 2023-07-17 23:40         ` Kees Cook
  2023-07-18 15:37           ` Qing Zhao
                             ` (3 more replies)
  0 siblings, 4 replies; 52+ messages in thread
From: Kees Cook @ 2023-07-17 23:40 UTC (permalink / raw)
  To: Qing Zhao
  Cc: joseph, richard.guenther, jakub, gcc-patches, siddhesh, uecker, isanbard

On Mon, Jul 17, 2023 at 09:17:48PM +0000, Qing Zhao wrote:
> 
> > On Jul 13, 2023, at 4:31 PM, Kees Cook <keescook@chromium.org> wrote:
> > 
> > In the bug, the problem is that "p" isn't known to be allocated, if I'm
> > reading that correctly?
> 
> I think that the major point in PR109557 (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109557):
> 
> for the following pointer p.3_1, 
> 
> p.3_1 = p;
> _2 = __builtin_object_size (p.3_1, 0);
> 
> Question: why the size of p.3_1 cannot use the TYPE_SIZE of the pointee of p when the TYPE_SIZE can be determined at compile time?
> 
> Answer:  From just knowing the type of the pointee of p, the compiler cannot determine the size of the object.  

Why is that? "p" points to "struct P", which has a fixed size. There
must be an assumption somewhere that a pointer is allocated, otherwise
__bos would almost never work?

> Therefore the bug has been closed. 
> 
> In your following testing 5:
> 
> > I'm not sure this is a reasonable behavior, but
> > let me get into the specific test, which looks like this:
> > 
> > TEST(counted_by_seen_by_bdos)
> > {
> >        struct annotated *p;
> >        int index = MAX_INDEX + unconst;
> > 
> >        p = alloc_annotated(index);
> > 
> >        REPORT_SIZE(p->array);
> > /* 1 */ EXPECT_EQ(sizeof(*p), offsetof(typeof(*p), array));
> >        /* Check array size alone. */
> > /* 2 */ EXPECT_EQ(__builtin_object_size(p->array, 1), SIZE_MAX);
> > /* 3 */ EXPECT_EQ(__builtin_dynamic_object_size(p->array, 1), p->foo * sizeof(*p->array));
> >        /* Check check entire object size. */
> > /* 4 */ EXPECT_EQ(__builtin_object_size(p, 1), SIZE_MAX);
> > /* 5 */ EXPECT_EQ(__builtin_dynamic_object_size(p, 1), sizeof(*p) + p->foo * sizeof(*p->array));
> > }
> > 
> > Test 5 should pass as well, since, again, p can be examined. Passing p
> > to __bdos implies it is allocated and the __counted_by annotation can be
> > examined.
> 
> Since the call to the routine “alloc_annotated" cannot be inlined, GCC does not see any allocation calls for the pointer p.

Correct.

> At the same time, due to the same reason as PR109986, GCC cannot determine the size of the object by just knowing the TYPE_SIZE
> of the pointee of p.  

So the difference between test 3 and test 5 is that "p" is explicitly
dereferenced to find "array", and therefore an assumption is established
that "p" is allocated?

> So, this is exactly the same issue as PR109557.  It’s an existing behavior per the current __buitlin_object_size algorithm.
> I am still not very sure whether the situation in PR109557 can be improved or not, but either way, it’s a separate issue.

Okay, so the issue is one of object allocation visibility (or
assumptions there in)?

> Please see the new testing case I added for PR109557, you will see that the above case 5 is a similar case as the new testing case in PR109557.

I will ponder this a bit more to see if I can come up with a useful
test case to replace the part from "test 5" above.

> > 
> > If "return p->array[index];" would be caught by the sanitizer, then
> > it follows that __builtin_dynamic_object_size(p, 1) must also know the
> > size. i.e. both must examine "p" and its trailing flexible array and
> > __counted_by annotation.
> > 
> >> 
> >> 2. The common issue for  the failed testing 3, 4, 9, 10 is:
> >> 
> >> for the following annotated structure: 
> >> 
> >> ====
> >> struct annotated {
> >>        unsigned long flags;
> >>        size_t foo;
> >>        int array[] __attribute__((counted_by (foo)));
> >> };
> >> 
> >> 
> >> struct annotated *p;
> >> int index = 16;
> >> 
> >> p = malloc(sizeof(*p) + index * sizeof(*p->array));  // allocated real size 
> >> 
> >> p->foo = index + 2;  // p->foo was set by a different value than the real size of p->array as in test 9 and 10
> > 
> > Right, tests 9 and 10 are checking that the _smallest_ possible value of
> > the array is used. (There are two sources of information: the allocation
> > size and the size calculated by counted_by. The smaller of the two
> > should be used when both are available.)
> 
> The counted_by attribute is used to annotate a Flexible array member on how many elements it will have.
> However, if this information can not accurately reflect the real number of elements for the array allocated, 
> What’s the purpose of such information? 

For example, imagine code that allocates space for 100 elements since
the common case is that the number of elements will grow over time.
Elements are added as it goes. For example:

struct grows {
	int alloc_count;
	int valid_count;
	struct element item[] __counted_by(valid_count);
} *p;

void something(void)
{
	p = malloc(sizeof(*p) + sizeof(*p->item) * 100);
	p->alloc_count = 100;
	p->valid_count = 0;

	/* this loop doesn't check that we don't go over 100. */
	while (items_to_copy) {
		struct element *item_ptr = get_next_item();
		/* __counted_by stays in sync: */
		p->valid_count++;
		p->item[p->valid_count - 1] = *item_ptr;
	}
}

We would want to catch cases there p->item[] is accessed with an index
that is >= p->valid_count, even though the allocation is (currently)
larger.

However, if we ever reached valid_count >= alloc_count, we need to trap
too, since we can still "see" the true allocation size.

Now, the __alloc_size hint is visible in very few places, so if there is
a strong reason to do so, I can live with saying that __counted_by takes
full precedence over __alloc_size. It seems it should be possible to
compare when both are present, but I can live with __counted_by being
the universal truth. :)

> 
> >> or
> >> p->foo was not set to any value as in test 3 and 4
> > 
> > For tests 3 and 4, yes, this was my mistake. I have fixed up the tests
> > now. Bill noticed the same issue. Sorry for the think-o!
> > 
> >> ====
> >> 
> >> i.e, the value of p->foo is NOT synced with the number of elements allocated for the array p->array.  
> >> 
> >> I think that this should be considered as an user error, and the documentation of the attribute should include
> >> this requirement.  (In the LLVM’s RFC, such requirement was included in the programing model: 
> >> https://discourse.llvm.org/t/rfc-enforcing-bounds-safety-in-c-fbounds-safety/70854#maintaining-correctness-of-bounds-annotations-18)
> >> 
> >> We can add a new warning option -Wcounted-by to report such user error if needed.
> >> 
> >> What’s your opinion on this?
> > 
> > I think it is correct to allow mismatch between allocation and
> > counted_by as long as only the least of the two is used.
> 
> What’s your mean by “only the least of the two is used”?

I was just restating the above: if size information is available via
both __alloc_size and __counted_by, the smaller of the two should be
used.

> 
> > This may be
> > desirable in a few situations. One example would be a large allocation
> > that is slowly filled up by the program.
> 
> So, for such situation, whenever the allocation is filled up, the field that hold the “counted_by” attribute should be increased at the same time,
> Then, the “counted_by” value always sync with the real allocation. 
> > I.e. the counted_by member is
> > slowly increased during runtime (but not beyond the true allocation size).
> 
> Then there should be source code to increase the “counted_by” field whenever the allocated space increased too. 
> > 
> > Of course allocation size is only available in limited situations, so
> > the loss of that info is fine: we have counted_by for everything else.
> 
> The point is: allocation size should synced with the value of “counted_by”. LLVM’s RFC also have the similar requirement:
> https://discourse.llvm.org/t/rfc-enforcing-bounds-safety-in-c-fbounds-safety/70854#maintaining-correctness-of-bounds-annotations-18

Right, I'm saying it would be nice if __alloc_size was checked as well,
in the sense that if it is available, it knows without question what the
size of the allocation is. If __alloc_size and __counted_by conflict,
the smaller of the two should be the truth.

But, as I said, if there is some need to explicitly ignore __alloc_size
when __counted_by is present, I can live with it; we just need to
document it.

If the RFC and you agree that the __counted_by variable can only ever be
(re)assigned after the flex array has been (re)allocated, then I guess
we'll see how it goes. :) I think most places in the kernel using
__counted_by will be fine, but I suspect we may have cases where we need
to update it like in the loop I described above. If that's true, we can
revisit the requirement then. :)

-Kees

-- 
Kees Cook

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [V1][PATCH 0/3] New attribute "element_count" to annotate bounds for C99 FAM(PR108896)
  2023-07-17 23:40         ` Kees Cook
@ 2023-07-18 15:37           ` Qing Zhao
  2023-07-18 16:03             ` Martin Uecker
  2023-07-18 18:53             ` Qing Zhao
  2023-07-19  8:41           ` Martin Uecker
                             ` (2 subsequent siblings)
  3 siblings, 2 replies; 52+ messages in thread
From: Qing Zhao @ 2023-07-18 15:37 UTC (permalink / raw)
  To: Kees Cook, Siddhesh Poyarekar, Martin Uecker, Jakub Jelinek
  Cc: joseph, richard.guenther, gcc-patches, isanbard



> On Jul 17, 2023, at 7:40 PM, Kees Cook <keescook@chromium.org> wrote:
> 
> On Mon, Jul 17, 2023 at 09:17:48PM +0000, Qing Zhao wrote:
>> 
>>> On Jul 13, 2023, at 4:31 PM, Kees Cook <keescook@chromium.org> wrote:
>>> 
>>> In the bug, the problem is that "p" isn't known to be allocated, if I'm
>>> reading that correctly?
>> 
>> I think that the major point in PR109557 (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109557):
>> 
>> for the following pointer p.3_1, 
>> 
>> p.3_1 = p;
>> _2 = __builtin_object_size (p.3_1, 0);
>> 
>> Question: why the size of p.3_1 cannot use the TYPE_SIZE of the pointee of p when the TYPE_SIZE can be determined at compile time?
>> 
>> Answer:  From just knowing the type of the pointee of p, the compiler cannot determine the size of the object.  
> 
> Why is that? "p" points to "struct P", which has a fixed size. There
> must be an assumption somewhere that a pointer is allocated, otherwise
> __bos would almost never work?

My understanding from the comments in PR109557 was: 

In general the pointer could point to the first object of an array that has more elements, or to an object of a different type. 
Without seeing the real allocation to the pointer, the compiler cannot assume that the pointer point to an object that has
the exact same type as its declaration. 

Sid and Martin, is the above understand correctly?

Honestly, I am still not 100% clear on this yet.

Jakub, Sid and Martin, could  you please explain a little bit more here, especially for the following small example?

[opc@qinzhao-ol8u3-x86 109557]$ cat t.c
#include <stdlib.h>
#include <assert.h>
struct P {
  int k;
  int x[10]; 
} *p;

void store(int a, int b) 
{
  p = (struct P *)malloc (sizeof (struct P));
  p->k = a;
  p->x[b] = 0;
  assert (__builtin_dynamic_object_size (p->x, 1) == sizeof (int[10]));
  assert (__builtin_dynamic_object_size (p, 1) == sizeof (struct P));
  return;
}

int main()
{
  store(7, 7);
  assert (__builtin_dynamic_object_size (p->x, 1) == sizeof (int[10]));
  assert (__builtin_dynamic_object_size (p, 1) == sizeof (struct P));
  free (p);
}
[opc@qinzhao-ol8u3-x86 109557]$ sh t
/home/opc/Install/latest/bin/gcc -O -fdump-tree-all t.c
a.out: t.c:21: main: Assertion `__builtin_dynamic_object_size (p->x, 1) == sizeof (int[10])' failed.
t: line 19: 859944 Aborted                 (core dumped) ./a.out


In the above, among the 4 assertions, only the last one failed.

Why GCC can use the TYPE_SIZE of the pointee of the pointer “p->x” as the size of the object, 
but cannot use the TYPE_SIZE of the pointee of the pointer “p” as the size of the object? 


> 
>> Therefore the bug has been closed. 
>> 
>> In your following testing 5:
>> 
>>> I'm not sure this is a reasonable behavior, but
>>> let me get into the specific test, which looks like this:
>>> 
>>> TEST(counted_by_seen_by_bdos)
>>> {
>>>       struct annotated *p;
>>>       int index = MAX_INDEX + unconst;
>>> 
>>>       p = alloc_annotated(index);
>>> 
>>>       REPORT_SIZE(p->array);
>>> /* 1 */ EXPECT_EQ(sizeof(*p), offsetof(typeof(*p), array));
>>>       /* Check array size alone. */
>>> /* 2 */ EXPECT_EQ(__builtin_object_size(p->array, 1), SIZE_MAX);
>>> /* 3 */ EXPECT_EQ(__builtin_dynamic_object_size(p->array, 1), p->foo * sizeof(*p->array));
>>>       /* Check check entire object size. */
>>> /* 4 */ EXPECT_EQ(__builtin_object_size(p, 1), SIZE_MAX);
>>> /* 5 */ EXPECT_EQ(__builtin_dynamic_object_size(p, 1), sizeof(*p) + p->foo * sizeof(*p->array));
>>> }
>>> 
>>> Test 5 should pass as well, since, again, p can be examined. Passing p
>>> to __bdos implies it is allocated and the __counted_by annotation can be
>>> examined.
>> 
>> Since the call to the routine “alloc_annotated" cannot be inlined, GCC does not see any allocation calls for the pointer p.
> 
> Correct.
> 
>> At the same time, due to the same reason as PR109986, GCC cannot determine the size of the object by just knowing the TYPE_SIZE
>> of the pointee of p.  
> 
> So the difference between test 3 and test 5 is that "p" is explicitly
> dereferenced to find "array", and therefore an assumption is established
> that "p" is allocated?

Yes, this might be the assumption that GCC made  -:)
> 
>> So, this is exactly the same issue as PR109557.  It’s an existing behavior per the current __buitlin_object_size algorithm.
>> I am still not very sure whether the situation in PR109557 can be improved or not, but either way, it’s a separate issue.
> 
> Okay, so the issue is one of object allocation visibility (or
> assumptions there in)?

Might be, let’s see what Sid or Martin will say on this.
> 
>> Please see the new testing case I added for PR109557, you will see that the above case 5 is a similar case as the new testing case in PR109557.
> 
> I will ponder this a bit more to see if I can come up with a useful
> test case to replace the part from "test 5" above.
> 
>>> 
>>> If "return p->array[index];" would be caught by the sanitizer, then
>>> it follows that __builtin_dynamic_object_size(p, 1) must also know the
>>> size. i.e. both must examine "p" and its trailing flexible array and
>>> __counted_by annotation.
>>> 
>>>> 
>>>> 2. The common issue for  the failed testing 3, 4, 9, 10 is:
>>>> 
>>>> for the following annotated structure: 
>>>> 
>>>> ====
>>>> struct annotated {
>>>>       unsigned long flags;
>>>>       size_t foo;
>>>>       int array[] __attribute__((counted_by (foo)));
>>>> };
>>>> 
>>>> 
>>>> struct annotated *p;
>>>> int index = 16;
>>>> 
>>>> p = malloc(sizeof(*p) + index * sizeof(*p->array));  // allocated real size 
>>>> 
>>>> p->foo = index + 2;  // p->foo was set by a different value than the real size of p->array as in test 9 and 10
>>> 
>>> Right, tests 9 and 10 are checking that the _smallest_ possible value of
>>> the array is used. (There are two sources of information: the allocation
>>> size and the size calculated by counted_by. The smaller of the two
>>> should be used when both are available.)
>> 
>> The counted_by attribute is used to annotate a Flexible array member on how many elements it will have.
>> However, if this information can not accurately reflect the real number of elements for the array allocated, 
>> What’s the purpose of such information? 
> 
> For example, imagine code that allocates space for 100 elements since
> the common case is that the number of elements will grow over time.
> Elements are added as it goes. For example:
> 
> struct grows {
> 	int alloc_count;
> 	int valid_count;
> 	struct element item[] __counted_by(valid_count);
> } *p;
> 
> void something(void)
> {
> 	p = malloc(sizeof(*p) + sizeof(*p->item) * 100);
> 	p->alloc_count = 100;
> 	p->valid_count = 0;
> 
> 	/* this loop doesn't check that we don't go over 100. */
> 	while (items_to_copy) {
> 		struct element *item_ptr = get_next_item();
> 		/* __counted_by stays in sync: */
> 		p->valid_count++;
> 		p->item[p->valid_count - 1] = *item_ptr;
> 	}
> }
> 
> We would want to catch cases there p->item[] is accessed with an index
> that is >= p->valid_count, even though the allocation is (currently)
> larger.
> 
> However, if we ever reached valid_count >= alloc_count, we need to trap
> too, since we can still "see" the true allocation size.
> 
> Now, the __alloc_size hint is visible in very few places, so if there is
> a strong reason to do so, I can live with saying that __counted_by takes
> full precedence over __alloc_size. It seems it should be possible to
> compare when both are present, but I can live with __counted_by being
> the universal truth. :)
> 

Thanks for the example and the explanation. Understood now.

LLVM’s RFC requires the following relationship: (https://discourse.llvm.org/t/rfc-enforcing-bounds-safety-in-c-fbounds-safety/70854#maintaining-correctness-of-bounds-annotations-18)

Buffer’s real allocated size >= counted_by value

Should be true all the time. 

I think that this is a reasonable requirement. 

(Otherwise, if counted_by > buffer’s real allocated size, overflow might happen)

Is the above enough to cover your use cases?

If so, I will study a little bit more to see how to implement this.
>> 
>>>> or
>>>> p->foo was not set to any value as in test 3 and 4
>>> 
>>> For tests 3 and 4, yes, this was my mistake. I have fixed up the tests
>>> now. Bill noticed the same issue. Sorry for the think-o!
>>> 
>>>> ====
>>>> 
>>>> i.e, the value of p->foo is NOT synced with the number of elements allocated for the array p->array.  
>>>> 
>>>> I think that this should be considered as an user error, and the documentation of the attribute should include
>>>> this requirement.  (In the LLVM’s RFC, such requirement was included in the programing model: 
>>>> https://discourse.llvm.org/t/rfc-enforcing-bounds-safety-in-c-fbounds-safety/70854#maintaining-correctness-of-bounds-annotations-18)
>>>> 
>>>> We can add a new warning option -Wcounted-by to report such user error if needed.
>>>> 
>>>> What’s your opinion on this?
>>> 
>>> I think it is correct to allow mismatch between allocation and
>>> counted_by as long as only the least of the two is used.
>> 
>> What’s your mean by “only the least of the two is used”?
> 
> I was just restating the above: if size information is available via
> both __alloc_size and __counted_by, the smaller of the two should be
> used.

If we are consistent with LLVM’s RFC, __alloc_size >= __counted_by all the time.
Then we can always use the counted_by info for the array size. 
> 
>> 
>>> This may be
>>> desirable in a few situations. One example would be a large allocation
>>> that is slowly filled up by the program.
>> 
>> So, for such situation, whenever the allocation is filled up, the field that hold the “counted_by” attribute should be increased at the same time,
>> Then, the “counted_by” value always sync with the real allocation. 
>>> I.e. the counted_by member is
>>> slowly increased during runtime (but not beyond the true allocation size).
>> 
>> Then there should be source code to increase the “counted_by” field whenever the allocated space increased too. 
>>> 
>>> Of course allocation size is only available in limited situations, so
>>> the loss of that info is fine: we have counted_by for everything else.
>> 
>> The point is: allocation size should synced with the value of “counted_by”. LLVM’s RFC also have the similar requirement:
>> https://discourse.llvm.org/t/rfc-enforcing-bounds-safety-in-c-fbounds-safety/70854#maintaining-correctness-of-bounds-annotations-18
> 
> Right, I'm saying it would be nice if __alloc_size was checked as well,
> in the sense that if it is available, it knows without question what the
> size of the allocation is. If __alloc_size and __counted_by conflict,
> the smaller of the two should be the truth.

> 
> But, as I said, if there is some need to explicitly ignore __alloc_size
> when __counted_by is present, I can live with it; we just need to
> document it.
> 
> If the RFC and you agree that the __counted_by variable can only ever be
> (re)assigned after the flex array has been (re)allocated, then I guess
> we'll see how it goes. :) I think most places in the kernel using
> __counted_by will be fine, but I suspect we may have cases where we need
> to update it like in the loop I described above. If that's true, we can
> revisit the requirement then. :)

I guess if we can always keep  the relationship:   __alloc_size >= __counted_by all the time. Should be fine.

Please check whether this is enough for kernel, I will check whether this is doable For GCC.

thanks.


Qing
> 
> -Kees
> 
> -- 
> Kees Cook


^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [V1][PATCH 0/3] New attribute "element_count" to annotate bounds for C99 FAM(PR108896)
  2023-07-18 15:37           ` Qing Zhao
@ 2023-07-18 16:03             ` Martin Uecker
  2023-07-18 16:25               ` Qing Zhao
  2023-07-18 18:53             ` Qing Zhao
  1 sibling, 1 reply; 52+ messages in thread
From: Martin Uecker @ 2023-07-18 16:03 UTC (permalink / raw)
  To: Qing Zhao, Kees Cook, Siddhesh Poyarekar, Jakub Jelinek
  Cc: joseph, richard.guenther, gcc-patches, isanbard

Am Dienstag, dem 18.07.2023 um 15:37 +0000 schrieb Qing Zhao:
> 
> 
> > On Jul 17, 2023, at 7:40 PM, Kees Cook <keescook@chromium.org>
> > wrote:
> > 
> > On Mon, Jul 17, 2023 at 09:17:48PM +0000, Qing Zhao wrote:
> > > 
> > > > On Jul 13, 2023, at 4:31 PM, Kees Cook <keescook@chromium.org>
> > > > wrote:
> > > > 
> > > > In the bug, the problem is that "p" isn't known to be
> > > > allocated, if I'm
> > > > reading that correctly?
> > > 
> > > I think that the major point in PR109557
> > > (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109557):
> > > 
> > > for the following pointer p.3_1, 
> > > 
> > > p.3_1 = p;
> > > _2 = __builtin_object_size (p.3_1, 0);
> > > 
> > > Question: why the size of p.3_1 cannot use the TYPE_SIZE of the
> > > pointee of p when the TYPE_SIZE can be determined at compile
> > > time?
> > > 
> > > Answer:  From just knowing the type of the pointee of p, the
> > > compiler cannot determine the size of the object.  
> > 
> > Why is that? "p" points to "struct P", which has a fixed size.
> > There
> > must be an assumption somewhere that a pointer is allocated,
> > otherwise
> > __bos would almost never work?
> 
> My understanding from the comments in PR109557 was: 
> 
> In general the pointer could point to the first object of an array
> that has more elements, or to an object of a different type. 
> Without seeing the real allocation to the pointer, the compiler
> cannot assume that the pointer point to an object that has
> the exact same type as its declaration. 
> 
> Sid and Martin, is the above understand correctly?

Yes. 

In the example, it *could* work because the compiler
could inline 'store' or otherwise use its knowledge about
the function definition to conclude that 'p' points to
an object of size 'sizeof (struct P)'. But this is fragile
because it relies on optimization and will not work across
TUs.

> Honestly, I am still not 100% clear on this yet.

> Jakub, Sid and Martin, could  you please explain a little bit more
> here, especially for the following small example?
> 
> [opc@qinzhao-ol8u3-x86 109557]$ cat t.c
> #include <stdlib.h>
> #include <assert.h>
> struct P {
>   int k;
>   int x[10]; 
> } *p;
> 
> void store(int a, int b) 
> {
>   p = (struct P *)malloc (sizeof (struct P));
>   p->k = a;
>   p->x[b] = 0;
>   assert (__builtin_dynamic_object_size (p->x, 1) == sizeof
> (int[10]));
>   assert (__builtin_dynamic_object_size (p, 1) == sizeof (struct P));
>   return;
> }
> 
> int main()
> {
>   store(7, 7);
>   assert (__builtin_dynamic_object_size (p->x, 1) == sizeof
> (int[10]));
>   assert (__builtin_dynamic_object_size (p, 1) == sizeof (struct P));
>   free (p);
> }
> [opc@qinzhao-ol8u3-x86 109557]$ sh t
> /home/opc/Install/latest/bin/gcc -O -fdump-tree-all t.c
> a.out: t.c:21: main: Assertion `__builtin_dynamic_object_size (p->x,
> 1) == sizeof (int[10])' failed.
> t: line 19: 859944 Aborted                 (core dumped) ./a.out
> 
> 
> In the above, among the 4 assertions, only the last one failed.

I don't know why this is the case. 

> 
> Why GCC can use the TYPE_SIZE of the pointee of the pointer “p->x” as
> the size of the object, 

I do not think it can do this in general. Is this how it 
is implemented? Thus would then seem incorrect to me.  

> but cannot use the TYPE_SIZE of the pointee of the pointer “p” as the
> size of the object? 

In general, the type of a pointer does not say anything about the
object it points to, because you could cast the pointer to a different
type, pass it around, and then cast it back before use. 

Only observed allocations or observed accesses provide information
about the type / existence of an object at the corresponding address.

The only other way in C which ensures that a pointer actually points
to an object of the correct type is 'static':

void foo(struct P *p[static 1]);



Martin


> 
> > 
> > > Therefore the bug has been closed. 
> > > 
> > > In your following testing 5:
> > > 
> > > > I'm not sure this is a reasonable behavior, but
> > > > let me get into the specific test, which looks like this:
> > > > 
> > > > TEST(counted_by_seen_by_bdos)
> > > > {
> > > >       struct annotated *p;
> > > >       int index = MAX_INDEX + unconst;
> > > > 
> > > >       p = alloc_annotated(index);
> > > > 
> > > >       REPORT_SIZE(p->array);
> > > > /* 1 */ EXPECT_EQ(sizeof(*p), offsetof(typeof(*p), array));
> > > >       /* Check array size alone. */
> > > > /* 2 */ EXPECT_EQ(__builtin_object_size(p->array, 1),
> > > > SIZE_MAX);
> > > > /* 3 */ EXPECT_EQ(__builtin_dynamic_object_size(p->array, 1),
> > > > p->foo * sizeof(*p->array));
> > > >       /* Check check entire object size. */
> > > > /* 4 */ EXPECT_EQ(__builtin_object_size(p, 1), SIZE_MAX);
> > > > /* 5 */ EXPECT_EQ(__builtin_dynamic_object_size(p, 1),
> > > > sizeof(*p) + p->foo * sizeof(*p->array));
> > > > }
> > > > 
> > > > Test 5 should pass as well, since, again, p can be examined.
> > > > Passing p
> > > > to __bdos implies it is allocated and the __counted_by
> > > > annotation can be
> > > > examined.
> > > 
> > > Since the call to the routine “alloc_annotated" cannot be
> > > inlined, GCC does not see any allocation calls for the pointer p.
> > 
> > Correct.
> > 
> > > At the same time, due to the same reason as PR109986, GCC cannot
> > > determine the size of the object by just knowing the TYPE_SIZE
> > > of the pointee of p.  
> > 
> > So the difference between test 3 and test 5 is that "p" is
> > explicitly
> > dereferenced to find "array", and therefore an assumption is
> > established
> > that "p" is allocated?
> 
> Yes, this might be the assumption that GCC made  -:)
> > 
> > > So, this is exactly the same issue as PR109557.  It’s an existing
> > > behavior per the current __buitlin_object_size algorithm.
> > > I am still not very sure whether the situation in PR109557 can be
> > > improved or not, but either way, it’s a separate issue.
> > 
> > Okay, so the issue is one of object allocation visibility (or
> > assumptions there in)?
> 
> Might be, let’s see what Sid or Martin will say on this.
> > 
> > > Please see the new testing case I added for PR109557, you will
> > > see that the above case 5 is a similar case as the new testing
> > > case in PR109557.
> > 
> > I will ponder this a bit more to see if I can come up with a useful
> > test case to replace the part from "test 5" above.
> > 
> > > > 
> > > > If "return p->array[index];" would be caught by the sanitizer,
> > > > then
> > > > it follows that __builtin_dynamic_object_size(p, 1) must also
> > > > know the
> > > > size. i.e. both must examine "p" and its trailing flexible
> > > > array and
> > > > __counted_by annotation.
> > > > 
> > > > > 
> > > > > 2. The common issue for  the failed testing 3, 4, 9, 10 is:
> > > > > 
> > > > > for the following annotated structure: 
> > > > > 
> > > > > ====
> > > > > struct annotated {
> > > > >       unsigned long flags;
> > > > >       size_t foo;
> > > > >       int array[] __attribute__((counted_by (foo)));
> > > > > };
> > > > > 
> > > > > 
> > > > > struct annotated *p;
> > > > > int index = 16;
> > > > > 
> > > > > p = malloc(sizeof(*p) + index * sizeof(*p->array));  //
> > > > > allocated real size 
> > > > > 
> > > > > p->foo = index + 2;  // p->foo was set by a different value
> > > > > than the real size of p->array as in test 9 and 10
> > > > 
> > > > Right, tests 9 and 10 are checking that the _smallest_ possible
> > > > value of
> > > > the array is used. (There are two sources of information: the
> > > > allocation
> > > > size and the size calculated by counted_by. The smaller of the
> > > > two
> > > > should be used when both are available.)
> > > 
> > > The counted_by attribute is used to annotate a Flexible array
> > > member on how many elements it will have.
> > > However, if this information can not accurately reflect the real
> > > number of elements for the array allocated, 
> > > What’s the purpose of such information? 
> > 
> > For example, imagine code that allocates space for 100 elements
> > since
> > the common case is that the number of elements will grow over time.
> > Elements are added as it goes. For example:
> > 
> > struct grows {
> >         int alloc_count;
> >         int valid_count;
> >         struct element item[] __counted_by(valid_count);
> > } *p;
> > 
> > void something(void)
> > {
> >         p = malloc(sizeof(*p) + sizeof(*p->item) * 100);
> >         p->alloc_count = 100;
> >         p->valid_count = 0;
> > 
> >         /* this loop doesn't check that we don't go over 100. */
> >         while (items_to_copy) {
> >                 struct element *item_ptr = get_next_item();
> >                 /* __counted_by stays in sync: */
> >                 p->valid_count++;
> >                 p->item[p->valid_count - 1] = *item_ptr;
> >         }
> > }
> > 
> > We would want to catch cases there p->item[] is accessed with an
> > index
> > that is >= p->valid_count, even though the allocation is
> > (currently)
> > larger.
> > 
> > However, if we ever reached valid_count >= alloc_count, we need to
> > trap
> > too, since we can still "see" the true allocation size.
> > 
> > Now, the __alloc_size hint is visible in very few places, so if
> > there is
> > a strong reason to do so, I can live with saying that __counted_by
> > takes
> > full precedence over __alloc_size. It seems it should be possible
> > to
> > compare when both are present, but I can live with __counted_by
> > being
> > the universal truth. :)
> > 
> 
> Thanks for the example and the explanation. Understood now.
> 
> LLVM’s RFC requires the following relationship:
> (https://discourse.llvm.org/t/rfc-enforcing-bounds-safety-in-c-fbound
> s-safety/70854#maintaining-correctness-of-bounds-annotations-18)
> 
> Buffer’s real allocated size >= counted_by value
> 
> Should be true all the time. 
> 
> I think that this is a reasonable requirement. 
> 
> (Otherwise, if counted_by > buffer’s real allocated size, overflow
> might happen)
> 
> Is the above enough to cover your use cases?
> 
> If so, I will study a little bit more to see how to implement this.
> > > 
> > > > > or
> > > > > p->foo was not set to any value as in test 3 and 4
> > > > 
> > > > For tests 3 and 4, yes, this was my mistake. I have fixed up
> > > > the tests
> > > > now. Bill noticed the same issue. Sorry for the think-o!
> > > > 
> > > > > ====
> > > > > 
> > > > > i.e, the value of p->foo is NOT synced with the number of
> > > > > elements allocated for the array p->array.  
> > > > > 
> > > > > I think that this should be considered as an user error, and
> > > > > the documentation of the attribute should include
> > > > > this requirement.  (In the LLVM’s RFC, such requirement was
> > > > > included in the programing model: 
> > > > > https://discourse.llvm.org/t/rfc-enforcing-bounds-safety-in-c-fbounds-safety/70854#maintaining-correctness-of-bounds-annotations-18
> > > > > )
> > > > > 
> > > > > We can add a new warning option -Wcounted-by to report such
> > > > > user error if needed.
> > > > > 
> > > > > What’s your opinion on this?
> > > > 
> > > > I think it is correct to allow mismatch between allocation and
> > > > counted_by as long as only the least of the two is used.
> > > 
> > > What’s your mean by “only the least of the two is used”?
> > 
> > I was just restating the above: if size information is available
> > via
> > both __alloc_size and __counted_by, the smaller of the two should
> > be
> > used.
> 
> If we are consistent with LLVM’s RFC, __alloc_size >= __counted_by
> all the time.
> Then we can always use the counted_by info for the array size. 
> > 
> > > 
> > > > This may be
> > > > desirable in a few situations. One example would be a large
> > > > allocation
> > > > that is slowly filled up by the program.
> > > 
> > > So, for such situation, whenever the allocation is filled up, the
> > > field that hold the “counted_by” attribute should be increased at
> > > the same time,
> > > Then, the “counted_by” value always sync with the real
> > > allocation. 
> > > > I.e. the counted_by member is
> > > > slowly increased during runtime (but not beyond the true
> > > > allocation size).
> > > 
> > > Then there should be source code to increase the “counted_by”
> > > field whenever the allocated space increased too. 
> > > > 
> > > > Of course allocation size is only available in limited
> > > > situations, so
> > > > the loss of that info is fine: we have counted_by for
> > > > everything else.
> > > 
> > > The point is: allocation size should synced with the value of
> > > “counted_by”. LLVM’s RFC also have the similar requirement:
> > > https://discourse.llvm.org/t/rfc-enforcing-bounds-safety-in-c-fbounds-safety/70854#maintaining-correctness-of-bounds-annotations-18
> > 
> > Right, I'm saying it would be nice if __alloc_size was checked as
> > well,
> > in the sense that if it is available, it knows without question
> > what the
> > size of the allocation is. If __alloc_size and __counted_by
> > conflict,
> > the smaller of the two should be the truth.
> 
> > 
> > But, as I said, if there is some need to explicitly ignore
> > __alloc_size
> > when __counted_by is present, I can live with it; we just need to
> > document it.
> > 
> > If the RFC and you agree that the __counted_by variable can only
> > ever be
> > (re)assigned after the flex array has been (re)allocated, then I
> > guess
> > we'll see how it goes. :) I think most places in the kernel using
> > __counted_by will be fine, but I suspect we may have cases where we
> > need
> > to update it like in the loop I described above. If that's true, we
> > can
> > revisit the requirement then. :)
> 
> I guess if we can always keep  the relationship:   __alloc_size >=
> __counted_by all the time. Should be fine.
> 
> Please check whether this is enough for kernel, I will check whether
> this is doable For GCC.
> 
> thanks.
> 
> 
> Qing
> > 
> > -Kees
> > 
> > -- 
> > Kees Cook
> 

-- 
Univ.-Prof. Dr. rer. nat. Martin Uecker
Graz University of Technology
Institute of Biomedical Imaging



^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [V1][PATCH 0/3] New attribute "element_count" to annotate bounds for C99 FAM(PR108896)
  2023-07-18 16:03             ` Martin Uecker
@ 2023-07-18 16:25               ` Qing Zhao
  2023-07-18 16:50                 ` Martin Uecker
  0 siblings, 1 reply; 52+ messages in thread
From: Qing Zhao @ 2023-07-18 16:25 UTC (permalink / raw)
  To: Martin Uecker
  Cc: Kees Cook, Siddhesh Poyarekar, Jakub Jelinek, joseph,
	richard.guenther, gcc-patches, isanbard



> On Jul 18, 2023, at 12:03 PM, Martin Uecker <uecker@tugraz.at> wrote:
> 
> Am Dienstag, dem 18.07.2023 um 15:37 +0000 schrieb Qing Zhao:
>> 
>> 
>>> On Jul 17, 2023, at 7:40 PM, Kees Cook <keescook@chromium.org>
>>> wrote:
>>> 
>>> On Mon, Jul 17, 2023 at 09:17:48PM +0000, Qing Zhao wrote:
>>>> 
>>>>> On Jul 13, 2023, at 4:31 PM, Kees Cook <keescook@chromium.org>
>>>>> wrote:
>>>>> 
>>>>> In the bug, the problem is that "p" isn't known to be
>>>>> allocated, if I'm
>>>>> reading that correctly?
>>>> 
>>>> I think that the major point in PR109557
>>>> (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109557):
>>>> 
>>>> for the following pointer p.3_1, 
>>>> 
>>>> p.3_1 = p;
>>>> _2 = __builtin_object_size (p.3_1, 0);
>>>> 
>>>> Question: why the size of p.3_1 cannot use the TYPE_SIZE of the
>>>> pointee of p when the TYPE_SIZE can be determined at compile
>>>> time?
>>>> 
>>>> Answer:  From just knowing the type of the pointee of p, the
>>>> compiler cannot determine the size of the object.  
>>> 
>>> Why is that? "p" points to "struct P", which has a fixed size.
>>> There
>>> must be an assumption somewhere that a pointer is allocated,
>>> otherwise
>>> __bos would almost never work?
>> 
>> My understanding from the comments in PR109557 was: 
>> 
>> In general the pointer could point to the first object of an array
>> that has more elements, or to an object of a different type. 
>> Without seeing the real allocation to the pointer, the compiler
>> cannot assume that the pointer point to an object that has
>> the exact same type as its declaration. 
>> 
>> Sid and Martin, is the above understand correctly?
> 
> Yes. 
> 
> In the example, it *could* work because the compiler
> could inline 'store' or otherwise use its knowledge about
> the function definition to conclude that 'p' points to
> an object of size 'sizeof (struct P)'. But this is fragile
> because it relies on optimization and will not work across
> TUs.
> 
>> Honestly, I am still not 100% clear on this yet.
> 
>> Jakub, Sid and Martin, could  you please explain a little bit more
>> here, especially for the following small example?
>> 
>> [opc@qinzhao-ol8u3-x86 109557]$ cat t.c
>> #include <stdlib.h>
>> #include <assert.h>
>> struct P {
>>   int k;
>>   int x[10]; 
>> } *p;
>> 
>> void store(int a, int b) 
>> {
>>   p = (struct P *)malloc (sizeof (struct P));
>>   p->k = a;
>>   p->x[b] = 0;
>>   assert (__builtin_dynamic_object_size (p->x, 1) == sizeof
>> (int[10]));
>>   assert (__builtin_dynamic_object_size (p, 1) == sizeof (struct P));
>>   return;
>> }
>> 
>> int main()
>> {
>>   store(7, 7);
>>   assert (__builtin_dynamic_object_size (p->x, 1) == sizeof
>> (int[10]));
>>   assert (__builtin_dynamic_object_size (p, 1) == sizeof (struct P));
>>   free (p);
>> }
>> [opc@qinzhao-ol8u3-x86 109557]$ sh t
>> /home/opc/Install/latest/bin/gcc -O -fdump-tree-all t.c
>> a.out: t.c:21: main: Assertion `__builtin_dynamic_object_size (p->x,
>> 1) == sizeof (int[10])' failed.
>> t: line 19: 859944 Aborted                 (core dumped) ./a.out
>> 
>> 
>> In the above, among the 4 assertions, only the last one failed.
> 
> I don't know why this is the case. 
> 
>> 
>> Why GCC can use the TYPE_SIZE of the pointee of the pointer “p->x” as
>> the size of the object, 
> 
> I do not think it can do this in general. Is this how it 
> is implemented?

No. -:)

 I guess that the implementation of this should base on your following case,  “observed accesses”:
Although I am not 100% sure on the definition of “observed accesses”.

p->x  is an access to the field of the object, so compiler can assume that the object exist and have
the type associate with this access?

On the other hand, “p” is just a plain pointer, no observed access.



> Thus would then seem incorrect to me.  
> 
>> but cannot use the TYPE_SIZE of the pointee of the pointer “p” as the
>> size of the object? 
> 
> In general, the type of a pointer does not say anything about the
> object it points to, because you could cast the pointer to a different
> type, pass it around, and then cast it back before use. 

Okay, I see.
> 
> Only observed allocations or observed accesses provide information
> about the type / existence of an object at the corresponding address.

What will be included in “observed accesses”?

> 
> The only other way in C which ensures that a pointer actually points
> to an object of the correct type is 'static':
> 
> void foo(struct P *p[static 1]);

Thanks for the info.

Qing
> 
> 
> 
> Martin
> 
> 
>> 
>>> 
>>>> Therefore the bug has been closed. 
>>>> 
>>>> In your following testing 5:
>>>> 
>>>>> I'm not sure this is a reasonable behavior, but
>>>>> let me get into the specific test, which looks like this:
>>>>> 
>>>>> TEST(counted_by_seen_by_bdos)
>>>>> {
>>>>>       struct annotated *p;
>>>>>       int index = MAX_INDEX + unconst;
>>>>> 
>>>>>       p = alloc_annotated(index);
>>>>> 
>>>>>       REPORT_SIZE(p->array);
>>>>> /* 1 */ EXPECT_EQ(sizeof(*p), offsetof(typeof(*p), array));
>>>>>       /* Check array size alone. */
>>>>> /* 2 */ EXPECT_EQ(__builtin_object_size(p->array, 1),
>>>>> SIZE_MAX);
>>>>> /* 3 */ EXPECT_EQ(__builtin_dynamic_object_size(p->array, 1),
>>>>> p->foo * sizeof(*p->array));
>>>>>       /* Check check entire object size. */
>>>>> /* 4 */ EXPECT_EQ(__builtin_object_size(p, 1), SIZE_MAX);
>>>>> /* 5 */ EXPECT_EQ(__builtin_dynamic_object_size(p, 1),
>>>>> sizeof(*p) + p->foo * sizeof(*p->array));
>>>>> }
>>>>> 
>>>>> Test 5 should pass as well, since, again, p can be examined.
>>>>> Passing p
>>>>> to __bdos implies it is allocated and the __counted_by
>>>>> annotation can be
>>>>> examined.
>>>> 
>>>> Since the call to the routine “alloc_annotated" cannot be
>>>> inlined, GCC does not see any allocation calls for the pointer p.
>>> 
>>> Correct.
>>> 
>>>> At the same time, due to the same reason as PR109986, GCC cannot
>>>> determine the size of the object by just knowing the TYPE_SIZE
>>>> of the pointee of p.  
>>> 
>>> So the difference between test 3 and test 5 is that "p" is
>>> explicitly
>>> dereferenced to find "array", and therefore an assumption is
>>> established
>>> that "p" is allocated?
>> 
>> Yes, this might be the assumption that GCC made  -:)
>>> 
>>>> So, this is exactly the same issue as PR109557.  It’s an existing
>>>> behavior per the current __buitlin_object_size algorithm.
>>>> I am still not very sure whether the situation in PR109557 can be
>>>> improved or not, but either way, it’s a separate issue.
>>> 
>>> Okay, so the issue is one of object allocation visibility (or
>>> assumptions there in)?
>> 
>> Might be, let’s see what Sid or Martin will say on this.
>>> 
>>>> Please see the new testing case I added for PR109557, you will
>>>> see that the above case 5 is a similar case as the new testing
>>>> case in PR109557.
>>> 
>>> I will ponder this a bit more to see if I can come up with a useful
>>> test case to replace the part from "test 5" above.
>>> 
>>>>> 
>>>>> If "return p->array[index];" would be caught by the sanitizer,
>>>>> then
>>>>> it follows that __builtin_dynamic_object_size(p, 1) must also
>>>>> know the
>>>>> size. i.e. both must examine "p" and its trailing flexible
>>>>> array and
>>>>> __counted_by annotation.
>>>>> 
>>>>>> 
>>>>>> 2. The common issue for  the failed testing 3, 4, 9, 10 is:
>>>>>> 
>>>>>> for the following annotated structure: 
>>>>>> 
>>>>>> ====
>>>>>> struct annotated {
>>>>>>       unsigned long flags;
>>>>>>       size_t foo;
>>>>>>       int array[] __attribute__((counted_by (foo)));
>>>>>> };
>>>>>> 
>>>>>> 
>>>>>> struct annotated *p;
>>>>>> int index = 16;
>>>>>> 
>>>>>> p = malloc(sizeof(*p) + index * sizeof(*p->array));  //
>>>>>> allocated real size 
>>>>>> 
>>>>>> p->foo = index + 2;  // p->foo was set by a different value
>>>>>> than the real size of p->array as in test 9 and 10
>>>>> 
>>>>> Right, tests 9 and 10 are checking that the _smallest_ possible
>>>>> value of
>>>>> the array is used. (There are two sources of information: the
>>>>> allocation
>>>>> size and the size calculated by counted_by. The smaller of the
>>>>> two
>>>>> should be used when both are available.)
>>>> 
>>>> The counted_by attribute is used to annotate a Flexible array
>>>> member on how many elements it will have.
>>>> However, if this information can not accurately reflect the real
>>>> number of elements for the array allocated, 
>>>> What’s the purpose of such information? 
>>> 
>>> For example, imagine code that allocates space for 100 elements
>>> since
>>> the common case is that the number of elements will grow over time.
>>> Elements are added as it goes. For example:
>>> 
>>> struct grows {
>>>         int alloc_count;
>>>         int valid_count;
>>>         struct element item[] __counted_by(valid_count);
>>> } *p;
>>> 
>>> void something(void)
>>> {
>>>         p = malloc(sizeof(*p) + sizeof(*p->item) * 100);
>>>         p->alloc_count = 100;
>>>         p->valid_count = 0;
>>> 
>>>         /* this loop doesn't check that we don't go over 100. */
>>>         while (items_to_copy) {
>>>                 struct element *item_ptr = get_next_item();
>>>                 /* __counted_by stays in sync: */
>>>                 p->valid_count++;
>>>                 p->item[p->valid_count - 1] = *item_ptr;
>>>         }
>>> }
>>> 
>>> We would want to catch cases there p->item[] is accessed with an
>>> index
>>> that is >= p->valid_count, even though the allocation is
>>> (currently)
>>> larger.
>>> 
>>> However, if we ever reached valid_count >= alloc_count, we need to
>>> trap
>>> too, since we can still "see" the true allocation size.
>>> 
>>> Now, the __alloc_size hint is visible in very few places, so if
>>> there is
>>> a strong reason to do so, I can live with saying that __counted_by
>>> takes
>>> full precedence over __alloc_size. It seems it should be possible
>>> to
>>> compare when both are present, but I can live with __counted_by
>>> being
>>> the universal truth. :)
>>> 
>> 
>> Thanks for the example and the explanation. Understood now.
>> 
>> LLVM’s RFC requires the following relationship:
>> (https://discourse.llvm.org/t/rfc-enforcing-bounds-safety-in-c-fbound
>> s-safety/70854#maintaining-correctness-of-bounds-annotations-18)
>> 
>> Buffer’s real allocated size >= counted_by value
>> 
>> Should be true all the time. 
>> 
>> I think that this is a reasonable requirement. 
>> 
>> (Otherwise, if counted_by > buffer’s real allocated size, overflow
>> might happen)
>> 
>> Is the above enough to cover your use cases?
>> 
>> If so, I will study a little bit more to see how to implement this.
>>>> 
>>>>>> or
>>>>>> p->foo was not set to any value as in test 3 and 4
>>>>> 
>>>>> For tests 3 and 4, yes, this was my mistake. I have fixed up
>>>>> the tests
>>>>> now. Bill noticed the same issue. Sorry for the think-o!
>>>>> 
>>>>>> ====
>>>>>> 
>>>>>> i.e, the value of p->foo is NOT synced with the number of
>>>>>> elements allocated for the array p->array.  
>>>>>> 
>>>>>> I think that this should be considered as an user error, and
>>>>>> the documentation of the attribute should include
>>>>>> this requirement.  (In the LLVM’s RFC, such requirement was
>>>>>> included in the programing model: 
>>>>>> https://discourse.llvm.org/t/rfc-enforcing-bounds-safety-in-c-fbounds-safety/70854#maintaining-correctness-of-bounds-annotations-18
>>>>>> )
>>>>>> 
>>>>>> We can add a new warning option -Wcounted-by to report such
>>>>>> user error if needed.
>>>>>> 
>>>>>> What’s your opinion on this?
>>>>> 
>>>>> I think it is correct to allow mismatch between allocation and
>>>>> counted_by as long as only the least of the two is used.
>>>> 
>>>> What’s your mean by “only the least of the two is used”?
>>> 
>>> I was just restating the above: if size information is available
>>> via
>>> both __alloc_size and __counted_by, the smaller of the two should
>>> be
>>> used.
>> 
>> If we are consistent with LLVM’s RFC, __alloc_size >= __counted_by
>> all the time.
>> Then we can always use the counted_by info for the array size. 
>>> 
>>>> 
>>>>> This may be
>>>>> desirable in a few situations. One example would be a large
>>>>> allocation
>>>>> that is slowly filled up by the program.
>>>> 
>>>> So, for such situation, whenever the allocation is filled up, the
>>>> field that hold the “counted_by” attribute should be increased at
>>>> the same time,
>>>> Then, the “counted_by” value always sync with the real
>>>> allocation. 
>>>>> I.e. the counted_by member is
>>>>> slowly increased during runtime (but not beyond the true
>>>>> allocation size).
>>>> 
>>>> Then there should be source code to increase the “counted_by”
>>>> field whenever the allocated space increased too. 
>>>>> 
>>>>> Of course allocation size is only available in limited
>>>>> situations, so
>>>>> the loss of that info is fine: we have counted_by for
>>>>> everything else.
>>>> 
>>>> The point is: allocation size should synced with the value of
>>>> “counted_by”. LLVM’s RFC also have the similar requirement:
>>>> https://discourse.llvm.org/t/rfc-enforcing-bounds-safety-in-c-fbounds-safety/70854#maintaining-correctness-of-bounds-annotations-18
>>> 
>>> Right, I'm saying it would be nice if __alloc_size was checked as
>>> well,
>>> in the sense that if it is available, it knows without question
>>> what the
>>> size of the allocation is. If __alloc_size and __counted_by
>>> conflict,
>>> the smaller of the two should be the truth.
>> 
>>> 
>>> But, as I said, if there is some need to explicitly ignore
>>> __alloc_size
>>> when __counted_by is present, I can live with it; we just need to
>>> document it.
>>> 
>>> If the RFC and you agree that the __counted_by variable can only
>>> ever be
>>> (re)assigned after the flex array has been (re)allocated, then I
>>> guess
>>> we'll see how it goes. :) I think most places in the kernel using
>>> __counted_by will be fine, but I suspect we may have cases where we
>>> need
>>> to update it like in the loop I described above. If that's true, we
>>> can
>>> revisit the requirement then. :)
>> 
>> I guess if we can always keep  the relationship:   __alloc_size >=
>> __counted_by all the time. Should be fine.
>> 
>> Please check whether this is enough for kernel, I will check whether
>> this is doable For GCC.
>> 
>> thanks.
>> 
>> 
>> Qing
>>> 
>>> -Kees
>>> 
>>> -- 
>>> Kees Cook
>> 
> 
> -- 
> Univ.-Prof. Dr. rer. nat. Martin Uecker
> Graz University of Technology
> Institute of Biomedical Imaging


^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [V1][PATCH 0/3] New attribute "element_count" to annotate bounds for C99 FAM(PR108896)
  2023-07-18 16:25               ` Qing Zhao
@ 2023-07-18 16:50                 ` Martin Uecker
  0 siblings, 0 replies; 52+ messages in thread
From: Martin Uecker @ 2023-07-18 16:50 UTC (permalink / raw)
  To: Qing Zhao
  Cc: Kees Cook, Siddhesh Poyarekar, Jakub Jelinek, joseph,
	richard.guenther, gcc-patches, isanbard

Am Dienstag, dem 18.07.2023 um 16:25 +0000 schrieb Qing Zhao:
> 
> 
> > On Jul 18, 2023, at 12:03 PM, Martin Uecker <uecker@tugraz.at>
> > wrote:
> > 
> > Am Dienstag, dem 18.07.2023 um 15:37 +0000 schrieb Qing Zhao:
> > > 
> > > 
> > > > On Jul 17, 2023, at 7:40 PM, Kees Cook <keescook@chromium.org>
> > > > wrote:
> > > > 
> > > > On Mon, Jul 17, 2023 at 09:17:48PM +0000, Qing Zhao wrote:
> > > > > 
> > > > > > On Jul 13, 2023, at 4:31 PM, Kees Cook
> > > > > > <keescook@chromium.org>
> > > > > > wrote:
> > > > > > 
> > > > > > In the bug, the problem is that "p" isn't known to be
> > > > > > allocated, if I'm
> > > > > > reading that correctly?
> > > > > 
> > > > > I think that the major point in PR109557
> > > > > (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109557):
> > > > > 
> > > > > for the following pointer p.3_1, 
> > > > > 
> > > > > p.3_1 = p;
> > > > > _2 = __builtin_object_size (p.3_1, 0);
> > > > > 
> > > > > Question: why the size of p.3_1 cannot use the TYPE_SIZE of
> > > > > the
> > > > > pointee of p when the TYPE_SIZE can be determined at compile
> > > > > time?
> > > > > 
> > > > > Answer:  From just knowing the type of the pointee of p, the
> > > > > compiler cannot determine the size of the object.  
> > > > 
> > > > Why is that? "p" points to "struct P", which has a fixed size.
> > > > There
> > > > must be an assumption somewhere that a pointer is allocated,
> > > > otherwise
> > > > __bos would almost never work?
> > > 
> > > My understanding from the comments in PR109557 was: 
> > > 
> > > In general the pointer could point to the first object of an
> > > array
> > > that has more elements, or to an object of a different type. 
> > > Without seeing the real allocation to the pointer, the compiler
> > > cannot assume that the pointer point to an object that has
> > > the exact same type as its declaration. 
> > > 
> > > Sid and Martin, is the above understand correctly?
> > 
> > Yes. 
> > 
> > In the example, it *could* work because the compiler
> > could inline 'store' or otherwise use its knowledge about
> > the function definition to conclude that 'p' points to
> > an object of size 'sizeof (struct P)'. But this is fragile
> > because it relies on optimization and will not work across
> > TUs.
> > 
> > > Honestly, I am still not 100% clear on this yet.
> > 
> > > Jakub, Sid and Martin, could  you please explain a little bit
> > > more
> > > here, especially for the following small example?
> > > 
> > > [opc@qinzhao-ol8u3-x86 109557]$ cat t.c
> > > #include <stdlib.h>
> > > #include <assert.h>
> > > struct P {
> > >   int k;
> > >   int x[10]; 
> > > } *p;
> > > 
> > > void store(int a, int b) 
> > > {
> > >   p = (struct P *)malloc (sizeof (struct P));
> > >   p->k = a;
> > >   p->x[b] = 0;
> > >   assert (__builtin_dynamic_object_size (p->x, 1) == sizeof
> > > (int[10]));
> > >   assert (__builtin_dynamic_object_size (p, 1) == sizeof (struct
> > > P));
> > >   return;
> > > }
> > > 
> > > int main()
> > > {
> > >   store(7, 7);
> > >   assert (__builtin_dynamic_object_size (p->x, 1) == sizeof
> > > (int[10]));
> > >   assert (__builtin_dynamic_object_size (p, 1) == sizeof (struct
> > > P));
> > >   free (p);
> > > }
> > > [opc@qinzhao-ol8u3-x86 109557]$ sh t
> > > /home/opc/Install/latest/bin/gcc -O -fdump-tree-all t.c
> > > a.out: t.c:21: main: Assertion `__builtin_dynamic_object_size (p-
> > > >x,
> > > 1) == sizeof (int[10])' failed.
> > > t: line 19: 859944 Aborted                 (core dumped) ./a.out
> > > 
> > > 
> > > In the above, among the 4 assertions, only the last one failed.
> > 
> > I don't know why this is the case. 
> > 
> > > 
> > > Why GCC can use the TYPE_SIZE of the pointee of the pointer “p-
> > > >x” as
> > > the size of the object, 
> > 
> > I do not think it can do this in general. Is this how it 
> > is implemented?
> 
> No. -:)
> 
>  I guess that the implementation of this should base on your
> following case,  “observed accesses”:
> Although I am not 100% sure on the definition of “observed accesses”.
> 
> p->x  is an access to the field of the object, so compiler can assume
> that the object exist and have
> the type associate with this access?
> 

Only if the access happens at run-time, but the argument to BDOS is
not evaluated, so there is no access. At least, this is my guess based 
on C's semantics. 


> On the other hand, “p” is just a plain pointer, no observed access.
>
> > Thus would then seem incorrect to me.  
> > 
> > > but cannot use the TYPE_SIZE of the pointee of the pointer “p” as
> > > the
> > > size of the object? 
> > 
> > In general, the type of a pointer does not say anything about the
> > object it points to, because you could cast the pointer to a
> > different
> > type, pass it around, and then cast it back before use. 
> 
> Okay, I see.
> > 
> > Only observed allocations or observed accesses provide information
> > about the type / existence of an object at the corresponding
> > address.
> 
> What will be included in “observed accesses”?

Any read or write access can be used to determine that there must
be an object of the corresponding type at this location. (A write
access could possibly create an object.)  This still does not tell
one anything about the size of the storage region, because it could
be larger than one object of this type. 

Martin

> > 
> > The only other way in C which ensures that a pointer actually
> > points
> > to an object of the correct type is 'static':
> > 
> > void foo(struct P *p[static 1]);
> 
> Thanks for the info.
> 
> Qing
> > 
> > 
> > 
> > Martin
> > 
> > 
> > > 
> > > > 
> > > > > Therefore the bug has been closed. 
> > > > > 
> > > > > In your following testing 5:
> > > > > 
> > > > > > I'm not sure this is a reasonable behavior, but
> > > > > > let me get into the specific test, which looks like this:
> > > > > > 
> > > > > > TEST(counted_by_seen_by_bdos)
> > > > > > {
> > > > > >       struct annotated *p;
> > > > > >       int index = MAX_INDEX + unconst;
> > > > > > 
> > > > > >       p = alloc_annotated(index);
> > > > > > 
> > > > > >       REPORT_SIZE(p->array);
> > > > > > /* 1 */ EXPECT_EQ(sizeof(*p), offsetof(typeof(*p), array));
> > > > > >       /* Check array size alone. */
> > > > > > /* 2 */ EXPECT_EQ(__builtin_object_size(p->array, 1),
> > > > > > SIZE_MAX);
> > > > > > /* 3 */ EXPECT_EQ(__builtin_dynamic_object_size(p->array,
> > > > > > 1),
> > > > > > p->foo * sizeof(*p->array));
> > > > > >       /* Check check entire object size. */
> > > > > > /* 4 */ EXPECT_EQ(__builtin_object_size(p, 1), SIZE_MAX);
> > > > > > /* 5 */ EXPECT_EQ(__builtin_dynamic_object_size(p, 1),
> > > > > > sizeof(*p) + p->foo * sizeof(*p->array));
> > > > > > }
> > > > > > 
> > > > > > Test 5 should pass as well, since, again, p can be
> > > > > > examined.
> > > > > > Passing p
> > > > > > to __bdos implies it is allocated and the __counted_by
> > > > > > annotation can be
> > > > > > examined.
> > > > > 
> > > > > Since the call to the routine “alloc_annotated" cannot be
> > > > > inlined, GCC does not see any allocation calls for the
> > > > > pointer p.
> > > > 
> > > > Correct.
> > > > 
> > > > > At the same time, due to the same reason as PR109986, GCC
> > > > > cannot
> > > > > determine the size of the object by just knowing the
> > > > > TYPE_SIZE
> > > > > of the pointee of p.  
> > > > 
> > > > So the difference between test 3 and test 5 is that "p" is
> > > > explicitly
> > > > dereferenced to find "array", and therefore an assumption is
> > > > established
> > > > that "p" is allocated?
> > > 
> > > Yes, this might be the assumption that GCC made  -:)
> > > > 
> > > > > So, this is exactly the same issue as PR109557.  It’s an
> > > > > existing
> > > > > behavior per the current __buitlin_object_size algorithm.
> > > > > I am still not very sure whether the situation in PR109557
> > > > > can be
> > > > > improved or not, but either way, it’s a separate issue.
> > > > 
> > > > Okay, so the issue is one of object allocation visibility (or
> > > > assumptions there in)?
> > > 
> > > Might be, let’s see what Sid or Martin will say on this.
> > > > 
> > > > > Please see the new testing case I added for PR109557, you
> > > > > will
> > > > > see that the above case 5 is a similar case as the new
> > > > > testing
> > > > > case in PR109557.
> > > > 
> > > > I will ponder this a bit more to see if I can come up with a
> > > > useful
> > > > test case to replace the part from "test 5" above.
> > > > 
> > > > > > 
> > > > > > If "return p->array[index];" would be caught by the
> > > > > > sanitizer,
> > > > > > then
> > > > > > it follows that __builtin_dynamic_object_size(p, 1) must
> > > > > > also
> > > > > > know the
> > > > > > size. i.e. both must examine "p" and its trailing flexible
> > > > > > array and
> > > > > > __counted_by annotation.
> > > > > > 
> > > > > > > 
> > > > > > > 2. The common issue for  the failed testing 3, 4, 9, 10
> > > > > > > is:
> > > > > > > 
> > > > > > > for the following annotated structure: 
> > > > > > > 
> > > > > > > ====
> > > > > > > struct annotated {
> > > > > > >       unsigned long flags;
> > > > > > >       size_t foo;
> > > > > > >       int array[] __attribute__((counted_by (foo)));
> > > > > > > };
> > > > > > > 
> > > > > > > 
> > > > > > > struct annotated *p;
> > > > > > > int index = 16;
> > > > > > > 
> > > > > > > p = malloc(sizeof(*p) + index * sizeof(*p->array));  //
> > > > > > > allocated real size 
> > > > > > > 
> > > > > > > p->foo = index + 2;  // p->foo was set by a different
> > > > > > > value
> > > > > > > than the real size of p->array as in test 9 and 10
> > > > > > 
> > > > > > Right, tests 9 and 10 are checking that the _smallest_
> > > > > > possible
> > > > > > value of
> > > > > > the array is used. (There are two sources of information:
> > > > > > the
> > > > > > allocation
> > > > > > size and the size calculated by counted_by. The smaller of
> > > > > > the
> > > > > > two
> > > > > > should be used when both are available.)
> > > > > 
> > > > > The counted_by attribute is used to annotate a Flexible array
> > > > > member on how many elements it will have.
> > > > > However, if this information can not accurately reflect the
> > > > > real
> > > > > number of elements for the array allocated, 
> > > > > What’s the purpose of such information? 
> > > > 
> > > > For example, imagine code that allocates space for 100 elements
> > > > since
> > > > the common case is that the number of elements will grow over
> > > > time.
> > > > Elements are added as it goes. For example:
> > > > 
> > > > struct grows {
> > > >         int alloc_count;
> > > >         int valid_count;
> > > >         struct element item[] __counted_by(valid_count);
> > > > } *p;
> > > > 
> > > > void something(void)
> > > > {
> > > >         p = malloc(sizeof(*p) + sizeof(*p->item) * 100);
> > > >         p->alloc_count = 100;
> > > >         p->valid_count = 0;
> > > > 
> > > >         /* this loop doesn't check that we don't go over 100.
> > > > */
> > > >         while (items_to_copy) {
> > > >                 struct element *item_ptr = get_next_item();
> > > >                 /* __counted_by stays in sync: */
> > > >                 p->valid_count++;
> > > >                 p->item[p->valid_count - 1] = *item_ptr;
> > > >         }
> > > > }
> > > > 
> > > > We would want to catch cases there p->item[] is accessed with
> > > > an
> > > > index
> > > > that is >= p->valid_count, even though the allocation is
> > > > (currently)
> > > > larger.
> > > > 
> > > > However, if we ever reached valid_count >= alloc_count, we need
> > > > to
> > > > trap
> > > > too, since we can still "see" the true allocation size.
> > > > 
> > > > Now, the __alloc_size hint is visible in very few places, so if
> > > > there is
> > > > a strong reason to do so, I can live with saying that
> > > > __counted_by
> > > > takes
> > > > full precedence over __alloc_size. It seems it should be
> > > > possible
> > > > to
> > > > compare when both are present, but I can live with __counted_by
> > > > being
> > > > the universal truth. :)
> > > > 
> > > 
> > > Thanks for the example and the explanation. Understood now.
> > > 
> > > LLVM’s RFC requires the following relationship:
> > > (
> > > https://discourse.llvm.org/t/rfc-enforcing-bounds-safety-in-c-fbou
> > > nd
> > > s-safety/70854#maintaining-correctness-of-bounds-annotations-18)
> > > 
> > > Buffer’s real allocated size >= counted_by value
> > > 
> > > Should be true all the time. 
> > > 
> > > I think that this is a reasonable requirement. 
> > > 
> > > (Otherwise, if counted_by > buffer’s real allocated size,
> > > overflow
> > > might happen)
> > > 
> > > Is the above enough to cover your use cases?
> > > 
> > > If so, I will study a little bit more to see how to implement
> > > this.
> > > > > 
> > > > > > > or
> > > > > > > p->foo was not set to any value as in test 3 and 4
> > > > > > 
> > > > > > For tests 3 and 4, yes, this was my mistake. I have fixed
> > > > > > up
> > > > > > the tests
> > > > > > now. Bill noticed the same issue. Sorry for the think-o!
> > > > > > 
> > > > > > > ====
> > > > > > > 
> > > > > > > i.e, the value of p->foo is NOT synced with the number of
> > > > > > > elements allocated for the array p->array.  
> > > > > > > 
> > > > > > > I think that this should be considered as an user error,
> > > > > > > and
> > > > > > > the documentation of the attribute should include
> > > > > > > this requirement.  (In the LLVM’s RFC, such requirement
> > > > > > > was
> > > > > > > included in the programing model: 
> > > > > > > https://discourse.llvm.org/t/rfc-enforcing-bounds-safety-in-c-fbounds-safety/70854#maintaining-correctness-of-bounds-annotations-18
> > > > > > > )
> > > > > > > 
> > > > > > > We can add a new warning option -Wcounted-by to report
> > > > > > > such
> > > > > > > user error if needed.
> > > > > > > 
> > > > > > > What’s your opinion on this?
> > > > > > 
> > > > > > I think it is correct to allow mismatch between allocation
> > > > > > and
> > > > > > counted_by as long as only the least of the two is used.
> > > > > 
> > > > > What’s your mean by “only the least of the two is used”?
> > > > 
> > > > I was just restating the above: if size information is
> > > > available
> > > > via
> > > > both __alloc_size and __counted_by, the smaller of the two
> > > > should
> > > > be
> > > > used.
> > > 
> > > If we are consistent with LLVM’s RFC, __alloc_size >=
> > > __counted_by
> > > all the time.
> > > Then we can always use the counted_by info for the array size. 
> > > > 
> > > > > 
> > > > > > This may be
> > > > > > desirable in a few situations. One example would be a large
> > > > > > allocation
> > > > > > that is slowly filled up by the program.
> > > > > 
> > > > > So, for such situation, whenever the allocation is filled up,
> > > > > the
> > > > > field that hold the “counted_by” attribute should be
> > > > > increased at
> > > > > the same time,
> > > > > Then, the “counted_by” value always sync with the real
> > > > > allocation. 
> > > > > > I.e. the counted_by member is
> > > > > > slowly increased during runtime (but not beyond the true
> > > > > > allocation size).
> > > > > 
> > > > > Then there should be source code to increase the “counted_by”
> > > > > field whenever the allocated space increased too. 
> > > > > > 
> > > > > > Of course allocation size is only available in limited
> > > > > > situations, so
> > > > > > the loss of that info is fine: we have counted_by for
> > > > > > everything else.
> > > > > 
> > > > > The point is: allocation size should synced with the value of
> > > > > “counted_by”. LLVM’s RFC also have the similar requirement:
> > > > > https://discourse.llvm.org/t/rfc-enforcing-bounds-safety-in-c-fbounds-safety/70854#maintaining-correctness-of-bounds-annotations-18
> > > > 
> > > > Right, I'm saying it would be nice if __alloc_size was checked
> > > > as
> > > > well,
> > > > in the sense that if it is available, it knows without question
> > > > what the
> > > > size of the allocation is. If __alloc_size and __counted_by
> > > > conflict,
> > > > the smaller of the two should be the truth.
> > > 
> > > > 
> > > > But, as I said, if there is some need to explicitly ignore
> > > > __alloc_size
> > > > when __counted_by is present, I can live with it; we just need
> > > > to
> > > > document it.
> > > > 
> > > > If the RFC and you agree that the __counted_by variable can
> > > > only
> > > > ever be
> > > > (re)assigned after the flex array has been (re)allocated, then
> > > > I
> > > > guess
> > > > we'll see how it goes. :) I think most places in the kernel
> > > > using
> > > > __counted_by will be fine, but I suspect we may have cases
> > > > where we
> > > > need
> > > > to update it like in the loop I described above. If that's
> > > > true, we
> > > > can
> > > > revisit the requirement then. :)
> > > 
> > > I guess if we can always keep  the relationship:   __alloc_size
> > > >=
> > > __counted_by all the time. Should be fine.
> > > 
> > > Please check whether this is enough for kernel, I will check
> > > whether
> > > this is doable For GCC.
> > > 
> > > thanks.
> > > 
> > > 
> > > Qing
> > > > 
> > > > -Kees
> > > > 
> > > > -- 
> > > > Kees Cook
> > > 
> > 
> > -- 
> > Univ.-Prof. Dr. rer. nat. Martin Uecker
> > Graz University of Technology
> > Institute of Biomedical Imaging
> 

-- 
Univ.-Prof. Dr. rer. nat. Martin Uecker
Graz University of Technology
Institute of Biomedical Imaging



^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [V1][PATCH 0/3] New attribute "element_count" to annotate bounds for C99 FAM(PR108896)
  2023-07-18 15:37           ` Qing Zhao
  2023-07-18 16:03             ` Martin Uecker
@ 2023-07-18 18:53             ` Qing Zhao
  1 sibling, 0 replies; 52+ messages in thread
From: Qing Zhao @ 2023-07-18 18:53 UTC (permalink / raw)
  To: Martin Uecker
  Cc: Kees Cook, Siddhesh Poyarekar, Jakub Jelinek, joseph,
	richard.guenther, gcc-patches, isanbard



> On Jul 18, 2023, at 11:37 AM, Qing Zhao via Gcc-patches <gcc-patches@gcc.gnu.org> wrote:
> 
> 
> 
>> On Jul 17, 2023, at 7:40 PM, Kees Cook <keescook@chromium.org> wrote:
>> 
>> On Mon, Jul 17, 2023 at 09:17:48PM +0000, Qing Zhao wrote:
>>> 
>>>> On Jul 13, 2023, at 4:31 PM, Kees Cook <keescook@chromium.org> wrote:
>>>> 
>>>> In the bug, the problem is that "p" isn't known to be allocated, if I'm
>>>> reading that correctly?
>>> 
>>> I think that the major point in PR109557 (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109557):
>>> 
>>> for the following pointer p.3_1, 
>>> 
>>> p.3_1 = p;
>>> _2 = __builtin_object_size (p.3_1, 0);
>>> 
>>> Question: why the size of p.3_1 cannot use the TYPE_SIZE of the pointee of p when the TYPE_SIZE can be determined at compile time?
>>> 
>>> Answer:  From just knowing the type of the pointee of p, the compiler cannot determine the size of the object.  
>> 
>> Why is that? "p" points to "struct P", which has a fixed size. There
>> must be an assumption somewhere that a pointer is allocated, otherwise
>> __bos would almost never work?
> 
> My understanding from the comments in PR109557 was: 
> 
> In general the pointer could point to the first object of an array that has more elements, or to an object of a different type. 
> Without seeing the real allocation to the pointer, the compiler cannot assume that the pointer point to an object that has
> the exact same type as its declaration. 
> 
> Sid and Martin, is the above understand correctly?
> 
> Honestly, I am still not 100% clear on this yet.
> 
> Jakub, Sid and Martin, could  you please explain a little bit more here, especially for the following small example?
> 
> [opc@qinzhao-ol8u3-x86 109557]$ cat t.c
> #include <stdlib.h>
> #include <assert.h>
> struct P {
>  int k;
>  int x[10]; 
> } *p;
> 
> void store(int a, int b) 
> {
>  p = (struct P *)malloc (sizeof (struct P));
>  p->k = a;
>  p->x[b] = 0;
>  assert (__builtin_dynamic_object_size (p->x, 1) == sizeof (int[10]));
>  assert (__builtin_dynamic_object_size (p, 1) == sizeof (struct P));
>  return;
> }
> 
> int main()
> {
>  store(7, 7);
>  assert (__builtin_dynamic_object_size (p->x, 1) == sizeof (int[10]));
>  assert (__builtin_dynamic_object_size (p, 1) == sizeof (struct P));
>  free (p);
> }
> [opc@qinzhao-ol8u3-x86 109557]$ sh t
> /home/opc/Install/latest/bin/gcc -O -fdump-tree-all t.c
> a.out: t.c:21: main: Assertion `__builtin_dynamic_object_size (p->x, 1) == sizeof (int[10])' failed.
> t: line 19: 859944 Aborted                 (core dumped) ./a.out
> 
> 

A correction to the above compilation option:
/home/opc/Install/latest-d/bin/gcc -O -fstrict-flex-arrays=3 t.c
a.out: t.c:22: main: Assertion `__builtin_dynamic_object_size (p, 1) == sizeof (struct P)' failed.
t: line 19: 918833 Aborted                 (core dumped) ./a.out

(All others keep the same).

Sorry for the mistake.

Qing

> In the above, among the 4 assertions, only the last one failed.
> 
> Why GCC can use the TYPE_SIZE of the pointee of the pointer “p->x” as the size of the object, 
> but cannot use the TYPE_SIZE of the pointee of the pointer “p” as the size of the object? 
> 
> 
>> 
>>> Therefore the bug has been closed. 
>>> 
>>> In your following testing 5:
>>> 
>>>> I'm not sure this is a reasonable behavior, but
>>>> let me get into the specific test, which looks like this:
>>>> 
>>>> TEST(counted_by_seen_by_bdos)
>>>> {
>>>>      struct annotated *p;
>>>>      int index = MAX_INDEX + unconst;
>>>> 
>>>>      p = alloc_annotated(index);
>>>> 
>>>>      REPORT_SIZE(p->array);
>>>> /* 1 */ EXPECT_EQ(sizeof(*p), offsetof(typeof(*p), array));
>>>>      /* Check array size alone. */
>>>> /* 2 */ EXPECT_EQ(__builtin_object_size(p->array, 1), SIZE_MAX);
>>>> /* 3 */ EXPECT_EQ(__builtin_dynamic_object_size(p->array, 1), p->foo * sizeof(*p->array));
>>>>      /* Check check entire object size. */
>>>> /* 4 */ EXPECT_EQ(__builtin_object_size(p, 1), SIZE_MAX);
>>>> /* 5 */ EXPECT_EQ(__builtin_dynamic_object_size(p, 1), sizeof(*p) + p->foo * sizeof(*p->array));
>>>> }
>>>> 
>>>> Test 5 should pass as well, since, again, p can be examined. Passing p
>>>> to __bdos implies it is allocated and the __counted_by annotation can be
>>>> examined.
>>> 
>>> Since the call to the routine “alloc_annotated" cannot be inlined, GCC does not see any allocation calls for the pointer p.
>> 
>> Correct.
>> 
>>> At the same time, due to the same reason as PR109986, GCC cannot determine the size of the object by just knowing the TYPE_SIZE
>>> of the pointee of p.  
>> 
>> So the difference between test 3 and test 5 is that "p" is explicitly
>> dereferenced to find "array", and therefore an assumption is established
>> that "p" is allocated?
> 
> Yes, this might be the assumption that GCC made  -:)
>> 
>>> So, this is exactly the same issue as PR109557.  It’s an existing behavior per the current __buitlin_object_size algorithm.
>>> I am still not very sure whether the situation in PR109557 can be improved or not, but either way, it’s a separate issue.
>> 
>> Okay, so the issue is one of object allocation visibility (or
>> assumptions there in)?
> 
> Might be, let’s see what Sid or Martin will say on this.
>> 
>>> Please see the new testing case I added for PR109557, you will see that the above case 5 is a similar case as the new testing case in PR109557.
>> 
>> I will ponder this a bit more to see if I can come up with a useful
>> test case to replace the part from "test 5" above.
>> 
>>>> 
>>>> If "return p->array[index];" would be caught by the sanitizer, then
>>>> it follows that __builtin_dynamic_object_size(p, 1) must also know the
>>>> size. i.e. both must examine "p" and its trailing flexible array and
>>>> __counted_by annotation.
>>>> 
>>>>> 
>>>>> 2. The common issue for  the failed testing 3, 4, 9, 10 is:
>>>>> 
>>>>> for the following annotated structure: 
>>>>> 
>>>>> ====
>>>>> struct annotated {
>>>>>      unsigned long flags;
>>>>>      size_t foo;
>>>>>      int array[] __attribute__((counted_by (foo)));
>>>>> };
>>>>> 
>>>>> 
>>>>> struct annotated *p;
>>>>> int index = 16;
>>>>> 
>>>>> p = malloc(sizeof(*p) + index * sizeof(*p->array));  // allocated real size 
>>>>> 
>>>>> p->foo = index + 2;  // p->foo was set by a different value than the real size of p->array as in test 9 and 10
>>>> 
>>>> Right, tests 9 and 10 are checking that the _smallest_ possible value of
>>>> the array is used. (There are two sources of information: the allocation
>>>> size and the size calculated by counted_by. The smaller of the two
>>>> should be used when both are available.)
>>> 
>>> The counted_by attribute is used to annotate a Flexible array member on how many elements it will have.
>>> However, if this information can not accurately reflect the real number of elements for the array allocated, 
>>> What’s the purpose of such information? 
>> 
>> For example, imagine code that allocates space for 100 elements since
>> the common case is that the number of elements will grow over time.
>> Elements are added as it goes. For example:
>> 
>> struct grows {
>> 	int alloc_count;
>> 	int valid_count;
>> 	struct element item[] __counted_by(valid_count);
>> } *p;
>> 
>> void something(void)
>> {
>> 	p = malloc(sizeof(*p) + sizeof(*p->item) * 100);
>> 	p->alloc_count = 100;
>> 	p->valid_count = 0;
>> 
>> 	/* this loop doesn't check that we don't go over 100. */
>> 	while (items_to_copy) {
>> 		struct element *item_ptr = get_next_item();
>> 		/* __counted_by stays in sync: */
>> 		p->valid_count++;
>> 		p->item[p->valid_count - 1] = *item_ptr;
>> 	}
>> }
>> 
>> We would want to catch cases there p->item[] is accessed with an index
>> that is >= p->valid_count, even though the allocation is (currently)
>> larger.
>> 
>> However, if we ever reached valid_count >= alloc_count, we need to trap
>> too, since we can still "see" the true allocation size.
>> 
>> Now, the __alloc_size hint is visible in very few places, so if there is
>> a strong reason to do so, I can live with saying that __counted_by takes
>> full precedence over __alloc_size. It seems it should be possible to
>> compare when both are present, but I can live with __counted_by being
>> the universal truth. :)
>> 
> 
> Thanks for the example and the explanation. Understood now.
> 
> LLVM’s RFC requires the following relationship: (https://discourse.llvm.org/t/rfc-enforcing-bounds-safety-in-c-fbounds-safety/70854#maintaining-correctness-of-bounds-annotations-18)
> 
> Buffer’s real allocated size >= counted_by value
> 
> Should be true all the time. 
> 
> I think that this is a reasonable requirement. 
> 
> (Otherwise, if counted_by > buffer’s real allocated size, overflow might happen)
> 
> Is the above enough to cover your use cases?
> 
> If so, I will study a little bit more to see how to implement this.
>>> 
>>>>> or
>>>>> p->foo was not set to any value as in test 3 and 4
>>>> 
>>>> For tests 3 and 4, yes, this was my mistake. I have fixed up the tests
>>>> now. Bill noticed the same issue. Sorry for the think-o!
>>>> 
>>>>> ====
>>>>> 
>>>>> i.e, the value of p->foo is NOT synced with the number of elements allocated for the array p->array.  
>>>>> 
>>>>> I think that this should be considered as an user error, and the documentation of the attribute should include
>>>>> this requirement.  (In the LLVM’s RFC, such requirement was included in the programing model: 
>>>>> https://discourse.llvm.org/t/rfc-enforcing-bounds-safety-in-c-fbounds-safety/70854#maintaining-correctness-of-bounds-annotations-18)
>>>>> 
>>>>> We can add a new warning option -Wcounted-by to report such user error if needed.
>>>>> 
>>>>> What’s your opinion on this?
>>>> 
>>>> I think it is correct to allow mismatch between allocation and
>>>> counted_by as long as only the least of the two is used.
>>> 
>>> What’s your mean by “only the least of the two is used”?
>> 
>> I was just restating the above: if size information is available via
>> both __alloc_size and __counted_by, the smaller of the two should be
>> used.
> 
> If we are consistent with LLVM’s RFC, __alloc_size >= __counted_by all the time.
> Then we can always use the counted_by info for the array size. 
>> 
>>> 
>>>> This may be
>>>> desirable in a few situations. One example would be a large allocation
>>>> that is slowly filled up by the program.
>>> 
>>> So, for such situation, whenever the allocation is filled up, the field that hold the “counted_by” attribute should be increased at the same time,
>>> Then, the “counted_by” value always sync with the real allocation. 
>>>> I.e. the counted_by member is
>>>> slowly increased during runtime (but not beyond the true allocation size).
>>> 
>>> Then there should be source code to increase the “counted_by” field whenever the allocated space increased too. 
>>>> 
>>>> Of course allocation size is only available in limited situations, so
>>>> the loss of that info is fine: we have counted_by for everything else.
>>> 
>>> The point is: allocation size should synced with the value of “counted_by”. LLVM’s RFC also have the similar requirement:
>>> https://discourse.llvm.org/t/rfc-enforcing-bounds-safety-in-c-fbounds-safety/70854#maintaining-correctness-of-bounds-annotations-18
>> 
>> Right, I'm saying it would be nice if __alloc_size was checked as well,
>> in the sense that if it is available, it knows without question what the
>> size of the allocation is. If __alloc_size and __counted_by conflict,
>> the smaller of the two should be the truth.
> 
>> 
>> But, as I said, if there is some need to explicitly ignore __alloc_size
>> when __counted_by is present, I can live with it; we just need to
>> document it.
>> 
>> If the RFC and you agree that the __counted_by variable can only ever be
>> (re)assigned after the flex array has been (re)allocated, then I guess
>> we'll see how it goes. :) I think most places in the kernel using
>> __counted_by will be fine, but I suspect we may have cases where we need
>> to update it like in the loop I described above. If that's true, we can
>> revisit the requirement then. :)
> 
> I guess if we can always keep  the relationship:   __alloc_size >= __counted_by all the time. Should be fine.
> 
> Please check whether this is enough for kernel, I will check whether this is doable For GCC.
> 
> thanks.
> 
> 
> Qing
>> 
>> -Kees
>> 
>> -- 
>> Kees Cook


^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [V1][PATCH 0/3] New attribute "element_count" to annotate bounds for C99 FAM(PR108896)
  2023-07-17 23:40         ` Kees Cook
  2023-07-18 15:37           ` Qing Zhao
@ 2023-07-19  8:41           ` Martin Uecker
  2023-07-19 16:16           ` Qing Zhao
  2023-07-19 18:52           ` Qing Zhao
  3 siblings, 0 replies; 52+ messages in thread
From: Martin Uecker @ 2023-07-19  8:41 UTC (permalink / raw)
  To: Kees Cook, Qing Zhao
  Cc: joseph, richard.guenther, jakub, gcc-patches, siddhesh, isanbard

Am Montag, dem 17.07.2023 um 16:40 -0700 schrieb Kees Cook:
> On Mon, Jul 17, 2023 at 09:17:48PM +0000, Qing Zhao wrote:
> > 
> > > On Jul 13, 2023, at 4:31 PM, Kees Cook <keescook@chromium.org>
> > > wrote:
> > > 
> > > In the bug, the problem is that "p" isn't known to be allocated,
> > > if I'm
> > > reading that correctly?
> > 
> > I think that the major point in PR109557
> > (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109557):
> > 
> > for the following pointer p.3_1, 
> > 
> > p.3_1 = p;
> > _2 = __builtin_object_size (p.3_1, 0);
> > 
> > Question: why the size of p.3_1 cannot use the TYPE_SIZE of the
> > pointee of p when the TYPE_SIZE can be determined at compile time?
> > 
> > Answer:  From just knowing the type of the pointee of p, the
> > compiler cannot determine the size of the object.  
> 
> Why is that? "p" points to "struct P", which has a fixed size. There
> must be an assumption somewhere that a pointer is allocated,
> otherwise
> __bos would almost never work?

It often does not work, because it relies on the optimizer
propagating the information instead of the type system.

This is why it would be better to have proper *bounds* checks,
and not just object size checks. It is not quite clear to me
how BOS and bounds checking is supposed to work together,
but FAMs should be bounds checked. 

...

> 
> > 
> > > This may be
> > > desirable in a few situations. One example would be a large
> > > allocation
> > > that is slowly filled up by the program.
> > 
> > So, for such situation, whenever the allocation is filled up, the
> > field that hold the “counted_by” attribute should be increased at
> > the same time,
> > Then, the “counted_by” value always sync with the real allocation. 
> > > I.e. the counted_by member is
> > > slowly increased during runtime (but not beyond the true
> > > allocation size).
> > 
> > Then there should be source code to increase the “counted_by” field
> > whenever the allocated space increased too. 
> > > 
> > > Of course allocation size is only available in limited
> > > situations, so
> > > the loss of that info is fine: we have counted_by for everything
> > > else.
> > 
> > The point is: allocation size should synced with the value of
> > “counted_by”. LLVM’s RFC also have the similar requirement:
> > https://discourse.llvm.org/t/rfc-enforcing-bounds-safety-in-c-fbounds-safety/70854#maintaining-correctness-of-bounds-annotations-18
> 
> Right, I'm saying it would be nice if __alloc_size was checked as
> well,
> in the sense that if it is available, it knows without question what
> the
> size of the allocation is. If __alloc_size and __counted_by conflict,
> the smaller of the two should be the truth.
> 
> But, as I said, if there is some need to explicitly ignore
> __alloc_size
> when __counted_by is present, I can live with it; we just need to
> document it.
> 
> If the RFC and you agree that the __counted_by variable can only ever
> be
> (re)assigned after the flex array has been (re)allocated, then I
> guess
> we'll see how it goes. :) I think most places in the kernel using
> __counted_by will be fine, but I suspect we may have cases where we
> need
> to update it like in the loop I described above. If that's true, we
> can
> revisit the requirement then. :)

It should be the other way round: You should first set
'count' and then reassign the pointer, because you can then
often check the pointer assignment (reading 'count').  The
other way round this works only sometimes, i.e. if both
assignments are close together and the optimizer can see this.



Martin





^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [V1][PATCH 0/3] New attribute "element_count" to annotate bounds for C99 FAM(PR108896)
  2023-07-17 23:40         ` Kees Cook
  2023-07-18 15:37           ` Qing Zhao
  2023-07-19  8:41           ` Martin Uecker
@ 2023-07-19 16:16           ` Qing Zhao
  2023-07-19 18:52           ` Qing Zhao
  3 siblings, 0 replies; 52+ messages in thread
From: Qing Zhao @ 2023-07-19 16:16 UTC (permalink / raw)
  To: Kees Cook, Martin Uecker
  Cc: joseph, richard.guenther, jakub, gcc-patches, siddhesh, isanbard

More thoughts on the following example Kees provided: 

> On Jul 17, 2023, at 7:40 PM, Kees Cook <keescook@chromium.org> wrote:
>> 
>> The counted_by attribute is used to annotate a Flexible array member on how many elements it will have.
>> However, if this information can not accurately reflect the real number of elements for the array allocated, 
>> What’s the purpose of such information? 
> 
> For example, imagine code that allocates space for 100 elements since
> the common case is that the number of elements will grow over time.
> Elements are added as it goes. For example:
> 
> struct grows {
> 	int alloc_count;
> 	int valid_count;
> 	struct element item[] __counted_by(valid_count);
> } *p;
> 
> void something(void)
> {
> 	p = malloc(sizeof(*p) + sizeof(*p->item) * 100);
> 	p->alloc_count = 100;
> 	p->valid_count = 0;
> 
> 	/* this loop doesn't check that we don't go over 100. */
> 	while (items_to_copy) {
> 		struct element *item_ptr = get_next_item();
> 		/* __counted_by stays in sync: */
> 		p->valid_count++;
> 		p->item[p->valid_count - 1] = *item_ptr;
> 	}
> }
> 
> We would want to catch cases there p->item[] is accessed with an index
> that is >= p->valid_count, even though the allocation is (currently)
> larger.
> 
> However, if we ever reached valid_count >= alloc_count, we need to trap
> too, since we can still "see" the true allocation size.
> 
> Now, the __alloc_size hint is visible in very few places, so if there is
> a strong reason to do so, I can live with saying that __counted_by takes
> full precedence over __alloc_size. It seems it should be possible to
> compare when both are present, but I can live with __counted_by being
> the universal truth. :)

In the above use case (not sure how popular such user case is?), the major questions are:

for one object with flexible array member, 

1. Shall we allow the situation when  the allocated size for the object 
and the number of element for the contained FAM are mismatched?

If the answer to 1 is YES (to support such user cases), then

2.  If there is a mismatch between these two, should the number of element impact the allocated
size for the object? (__builtin_object_size())

From the doc of object size checking: (https://gcc.gnu.org/onlinedocs/gcc/Object-Size-Checking.html)

=====
Built-in Function: size_t __builtin_object_size (const void * ptr, int type)
is a built-in construct that returns a constant number of bytes from ptr to the end of the object ptr pointer points to (if known at compile time). To determine the sizes of dynamically allocated objects the function relies on the allocation functions called to obtain the storage to be declared with the alloc_size attribute (see Common Function Attributes). __builtin_object_size never evaluates its arguments for side effects. If there are any side effects in them, it returns (size_t) -1 for type 0 or 1 and (size_t) 0 for type 2 or 3. If there are multiple objects ptr can point to and all of them are known at compile time, the returned number is the maximum of remaining byte counts in those objects if type & 2 is 0 and minimum if nonzero. If it is not possible to determine which objects ptr points to at compile time, __builtin_object_size should return (size_t) -1 for type 0 or 1 and (size_t) 0 for type 2 or 3.

=====

Based on the current documentation for __bos, I think that the answer should be NO, i.e, we should not use the counted_by info to change the REAL allocated size for the object. 


3. Then, As pointed out also by Martin, only the bounds check (including  -Warray-bounds or -fsanitizer=bounds) should be impacted by the counted_by information, since these checks are based on the TYPE system, and “counted_by” info should be treated as a complement to the TYPE system. 

Let me know your opinions.

Qing

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [V1][PATCH 0/3] New attribute "element_count" to annotate bounds for C99 FAM(PR108896)
  2023-07-17 23:40         ` Kees Cook
                             ` (2 preceding siblings ...)
  2023-07-19 16:16           ` Qing Zhao
@ 2023-07-19 18:52           ` Qing Zhao
  2023-07-31 20:14             ` Qing Zhao
  3 siblings, 1 reply; 52+ messages in thread
From: Qing Zhao @ 2023-07-19 18:52 UTC (permalink / raw)
  To: Kees Cook
  Cc: joseph, richard.guenther, jakub, gcc-patches, siddhesh, uecker, isanbard


>> 
>> The point is: allocation size should synced with the value of “counted_by”. LLVM’s RFC also have the similar requirement:
>> https://discourse.llvm.org/t/rfc-enforcing-bounds-safety-in-c-fbounds-safety/70854#maintaining-correctness-of-bounds-annotations-18
> 
> Right, I'm saying it would be nice if __alloc_size was checked as well,
> in the sense that if it is available, it knows without question what the
> size of the allocation is. If __alloc_size and __counted_by conflict,
> the smaller of the two should be the truth.

I don’t think that  “if __alloc_size and __counted_by conflict, the smaller of the two should be the truth” will work correctly.

When __alloc_size is larger than the value of __counted_by, it’s okay. 
But when the value of __counted_by is larger than the __alloc_size, the array bound check or object size sanitizer might not work correctly.


Please see the following example:

struct grows {
	int alloc_count;
	int valid_count;
	int  item[] __counted_by(valid_count);
} *p;

void __attribute__((__noinline__)) something (int n)
{
	p = malloc(sizeof(*p) + sizeof(*p->item) * 100);
	p->alloc_count = 100;
	p->valid_count = 102;
	p->item[n] = 10;		// both _alloc_size and the value of __counted_by are available in this routine, the smaller one is , 100;
                    
}

void __attribute__((__noinline__))  something_2 (int n)
{
   p->item[n] = 10;   // only the value of  __counted_by is available in this routine, which is 102;  
}

Int main
{
   Something (101);
   Something_2 (101);
}


For the above example, the out-of-bound array access in routine “something” should be able to be caught by the compiler.
However, the out-of-bound array access in the routine “something_2” will NOT be able to be caught by the compiler.

Since in the routine “something_2” , the compiler don’t know the alloc_size, the only available info is the counted_by value
 through the attribute.  But this value is bigger than the REAL size of the array. Therefore the compiler cannot detect the 
out-of-bound array access in the routine something_2


Based on the above observation, I think we should add the following requirement: 

The value of “counted_by” should be equal or SMALLER than the real alloc_size for the flexible array member. 

This is the same requirement as the LLVM RFC. 
https://discourse.llvm.org/t/rfc-enforcing-bounds-safety-in-c-fbounds-safety/70854#maintaining-correctness-of-bounds-annotations-18

"the compiler inserts additional checks to ensure the new buf has at least as many elements as the new count indicates.”
LLVM has additional requirement in addition to this, we might need to consider those requirement too. 

Qing

> But, as I said, if there is some need to explicitly ignore __alloc_size
> when __counted_by is present, I can live with it; we just need to
> document it.
> 
> If the RFC and you agree that the __counted_by variable can only ever be
> (re)assigned after the flex array has been (re)allocated, then I guess
> we'll see how it goes. :) I think most places in the kernel using
> __counted_by will be fine, but I suspect we may have cases where we need
> to update it like in the loop I described above. If that's true, we can
> revisit the requirement then. :)
> 
> -Kees
> 
> -- 
> Kees Cook


^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [V1][PATCH 0/3] New attribute "element_count" to annotate bounds for C99 FAM(PR108896)
  2023-07-19 18:52           ` Qing Zhao
@ 2023-07-31 20:14             ` Qing Zhao
  2023-08-01 22:45               ` Kees Cook
  0 siblings, 1 reply; 52+ messages in thread
From: Qing Zhao @ 2023-07-31 20:14 UTC (permalink / raw)
  To: Kees Cook, Siddhesh Poyarekar
  Cc: joseph, richard.guenther, jakub, gcc-patches, uecker, isanbard

Hi,

After some detailed study and consideration on how to use the new attribute “counted_by”
 in __builtin_dynamic_object_size, I came up with the following example with detailed explanation 
on the expected behavior from GCC on using this new attribute. 

Please take a look on this example and  the explanation embedded, and let me know if you have further
Comments or suggestions.

Thanks a lot.

Qing

===========================================
#include <stdint.h>
#include <malloc.h>
#include <string.h>
#include <stdio.h>

struct annotated {
        size_t foo;
        int array[] __attribute__((counted_by (foo)));
};

#define expect(p, _v) do { \
    size_t v = _v; \
    if (p == v) \
        __builtin_printf ("ok:  %s == %zd\n", #p, p); \
    else \
        {  \
          __builtin_printf ("WAT: %s == %zd (expected %zd)\n", #p, p, v); \
        } \
} while (0);

#define noinline __attribute__((__noinline__))
#define SIZE_BUMP 2 

/* In general, Due to type casting, the type for the pointee of a pointer
   does not say anything about the object it points to,
   So, __builtin_object_size can not directly use the type of the pointee
   to decide the size of the object the pointer points to.

   there are only two reliable ways:
   A. observed allocations  (call to the allocation functions in the routine)
   B. observed accesses     (read or write access to the location of the 
                             pointer points to)

   that provide information about the type/existence of an object at
   the corresponding address.

   for A, we use the "alloc_size" attribute for the corresponding allocation
   functions to determine the object size;

   For B, we use the SIZE info of the TYPE attached to the corresponding access.
   (We treat counted_by attribute as a complement to the SIZE info of the TYPE
    for FMA) 

   The only other way in C which ensures that a pointer actually points
   to an object of the correct type is 'static':

   void foo(struct P *p[static 1]);   

   See https://gcc.gnu.org/pipermail/gcc-patches/2023-July/624814.html
   for more details.  */

/* in the following function, malloc allocated more space than the value
   of counted_by attribute.  Then what's the correct behavior we expect
   the __builtin_dynamic_object_size should have for each of the cases?  */ 

static struct annotated * noinline alloc_buf (int index)
{
  struct annotated *p;
  p = malloc(sizeof (*p) + (index + SIZE_BUMP) * sizeof (int));
  p->foo = index;

  /*when checking the observed access p->array, we have info on both
    observered allocation and observed access, 
    A. from observed allocation: (index + SIZE_BUMP) * sizeof (int)
    B. from observed access: p->foo * sizeof (int)

    in the above, p->foo = index.
   */
   
  /* for MAXIMUM sub-object size: chose the smaller of A and B.
   * Please see https://gcc.gnu.org/pipermail/gcc-patches/2023-July/625891.html
   * for details on why.  */
  expect(__builtin_dynamic_object_size(p->array, 1), (p->foo) * sizeof(int));
  expect(__builtin_dynamic_object_size(p->array, 0), sizeof (*p) + (p->foo) * sizeof(int));

  /* for MINIMUM sub-object size: chose the smaller of A and B too.  */
  expect(__builtin_dynamic_object_size(p->array, 3), p->foo * sizeof(int));
  expect(__builtin_dynamic_object_size(p->array, 2), sizeof (*p) + p->foo * sizeof(int));

  /*when checking the pointer p, we only have info on the observed allocation.
    So, the object size info can only been obtained from the call to malloc.
    for both MAXIMUM and MINIMUM: A = (index + SIZE_BUMP) * sizeof (int)  */ 
  expect(__builtin_dynamic_object_size(p, 1), sizeof (*p) + (p->foo + SIZE_BUMP) * sizeof(int));
  expect(__builtin_dynamic_object_size(p, 0), sizeof (*p) + (p->foo + SIZE_BUMP) * sizeof(int));
  expect(__builtin_dynamic_object_size(p, 3), sizeof (*p) + (p->foo + SIZE_BUMP) * sizeof(int));
  expect(__builtin_dynamic_object_size(p, 2), sizeof (*p) + (p->foo + SIZE_BUMP) * sizeof(int));
  return p;
}


int main ()
{
  struct annotated *p; 
  p = alloc_buf (10);
  /*when checking the observed access p->array, we only have info on the
    observed access, i.e, the TYPE_SIZE info from the access. We don't have
    info on the whole object.  */

  /*For MAXIMUM size, We know the SIZE info of the TYPE from the access to the
    sub-object p->array.
    but don't know the whole object the pointer p points to.  */ 
  expect(__builtin_dynamic_object_size(p->array, 1), p->foo * sizeof(int));
  expect(__builtin_dynamic_object_size(p->array, 0), -1);

  /*for MINIMUM size, We know the TYPE_SIZE from the access to the sub-oject
    p->array.
    but don't know the whole object the pointer p points to.  */
  expect(__builtin_dynamic_object_size(p->array, 3), p->foo * sizeof(int));
  expect(__builtin_dynamic_object_size(p->array, 2), 0);

  /*when checking the pointer p, we have no observed allocation nor observed access.
    therefore, we cannot determine the size info here.  */
  expect(__builtin_dynamic_object_size(p, 1), -1);
  expect(__builtin_dynamic_object_size(p, 0), -1);
  expect(__builtin_dynamic_object_size(p, 3), 0);
  expect(__builtin_dynamic_object_size(p, 2), 0);

  return 0;
}





> On Jul 19, 2023, at 2:52 PM, Qing Zhao via Gcc-patches <gcc-patches@gcc.gnu.org> wrote:
> 
> 
>>> 
>>> The point is: allocation size should synced with the value of “counted_by”. LLVM’s RFC also have the similar requirement:
>>> https://discourse.llvm.org/t/rfc-enforcing-bounds-safety-in-c-fbounds-safety/70854#maintaining-correctness-of-bounds-annotations-18
>> 
>> Right, I'm saying it would be nice if __alloc_size was checked as well,
>> in the sense that if it is available, it knows without question what the
>> size of the allocation is. If __alloc_size and __counted_by conflict,
>> the smaller of the two should be the truth.
> 
> I don’t think that  “if __alloc_size and __counted_by conflict, the smaller of the two should be the truth” will work correctly.
> 
> When __alloc_size is larger than the value of __counted_by, it’s okay. 
> But when the value of __counted_by is larger than the __alloc_size, the array bound check or object size sanitizer might not work correctly.
> 
> 
> Please see the following example:
> 
> struct grows {
> 	int alloc_count;
> 	int valid_count;
> 	int  item[] __counted_by(valid_count);
> } *p;
> 
> void __attribute__((__noinline__)) something (int n)
> {
> 	p = malloc(sizeof(*p) + sizeof(*p->item) * 100);
> 	p->alloc_count = 100;
> 	p->valid_count = 102;
> 	p->item[n] = 10;		// both _alloc_size and the value of __counted_by are available in this routine, the smaller one is , 100;
> 
> }
> 
> void __attribute__((__noinline__))  something_2 (int n)
> {
>   p->item[n] = 10;   // only the value of  __counted_by is available in this routine, which is 102;  
> }
> 
> Int main
> {
>   Something (101);
>   Something_2 (101);
> }
> 
> 
> For the above example, the out-of-bound array access in routine “something” should be able to be caught by the compiler.
> However, the out-of-bound array access in the routine “something_2” will NOT be able to be caught by the compiler.
> 
> Since in the routine “something_2” , the compiler don’t know the alloc_size, the only available info is the counted_by value
> through the attribute.  But this value is bigger than the REAL size of the array. Therefore the compiler cannot detect the 
> out-of-bound array access in the routine something_2
> 
> 
> Based on the above observation, I think we should add the following requirement: 
> 
> The value of “counted_by” should be equal or SMALLER than the real alloc_size for the flexible array member. 
> 
> This is the same requirement as the LLVM RFC. 
> https://discourse.llvm.org/t/rfc-enforcing-bounds-safety-in-c-fbounds-safety/70854#maintaining-correctness-of-bounds-annotations-18
> 
> "the compiler inserts additional checks to ensure the new buf has at least as many elements as the new count indicates.”
> LLVM has additional requirement in addition to this, we might need to consider those requirement too. 
> 
> Qing
> 
>> But, as I said, if there is some need to explicitly ignore __alloc_size
>> when __counted_by is present, I can live with it; we just need to
>> document it.
>> 
>> If the RFC and you agree that the __counted_by variable can only ever be
>> (re)assigned after the flex array has been (re)allocated, then I guess
>> we'll see how it goes. :) I think most places in the kernel using
>> __counted_by will be fine, but I suspect we may have cases where we need
>> to update it like in the loop I described above. If that's true, we can
>> revisit the requirement then. :)
>> 
>> -Kees
>> 
>> -- 
>> Kees Cook
> 


^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [V1][PATCH 0/3] New attribute "element_count" to annotate bounds for C99 FAM(PR108896)
  2023-07-31 20:14             ` Qing Zhao
@ 2023-08-01 22:45               ` Kees Cook
  2023-08-02  6:25                 ` Martin Uecker
  2023-08-02 15:09                 ` Qing Zhao
  0 siblings, 2 replies; 52+ messages in thread
From: Kees Cook @ 2023-08-01 22:45 UTC (permalink / raw)
  To: Qing Zhao
  Cc: Siddhesh Poyarekar, joseph, richard.guenther, jakub, gcc-patches,
	uecker, isanbard

On Mon, Jul 31, 2023 at 08:14:42PM +0000, Qing Zhao wrote:
> /* In general, Due to type casting, the type for the pointee of a pointer
>    does not say anything about the object it points to,
>    So, __builtin_object_size can not directly use the type of the pointee
>    to decide the size of the object the pointer points to.
> 
>    there are only two reliable ways:
>    A. observed allocations  (call to the allocation functions in the routine)
>    B. observed accesses     (read or write access to the location of the 
>                              pointer points to)
> 
>    that provide information about the type/existence of an object at
>    the corresponding address.
> 
>    for A, we use the "alloc_size" attribute for the corresponding allocation
>    functions to determine the object size;
> 
>    For B, we use the SIZE info of the TYPE attached to the corresponding access.
>    (We treat counted_by attribute as a complement to the SIZE info of the TYPE
>     for FMA) 
> 
>    The only other way in C which ensures that a pointer actually points
>    to an object of the correct type is 'static':
> 
>    void foo(struct P *p[static 1]);   
> 
>    See https://gcc.gnu.org/pipermail/gcc-patches/2023-July/624814.html
>    for more details.  */

This is a great explanation; thank you!

In the future I might want to have a new builtin that will allow
a program to query a pointer when neither A nor B have happened. But
for the first version of the __counted_by infrastructure, the above
limitations seen fine.

For example, maybe __builtin_counted_size(p) (which returns sizeof(*p) +
sizeof(*p->flex_array_member) * p->counted_by_member). Though since
there might be multiple flex array members, maybe this can't work. :)

-Kees

-- 
Kees Cook

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [V1][PATCH 0/3] New attribute "element_count" to annotate bounds for C99 FAM(PR108896)
  2023-08-01 22:45               ` Kees Cook
@ 2023-08-02  6:25                 ` Martin Uecker
  2023-08-02 15:02                   ` Qing Zhao
  2023-08-02 15:09                 ` Qing Zhao
  1 sibling, 1 reply; 52+ messages in thread
From: Martin Uecker @ 2023-08-02  6:25 UTC (permalink / raw)
  To: Kees Cook, Qing Zhao
  Cc: Siddhesh Poyarekar, joseph, richard.guenther, jakub, gcc-patches,
	isanbard

Am Dienstag, dem 01.08.2023 um 15:45 -0700 schrieb Kees Cook:
> On Mon, Jul 31, 2023 at 08:14:42PM +0000, Qing Zhao wrote:
> > /* In general, Due to type casting, the type for the pointee of a pointer
> >    does not say anything about the object it points to,
> >    So, __builtin_object_size can not directly use the type of the pointee
> >    to decide the size of the object the pointer points to.
> > 
> >    there are only two reliable ways:
> >    A. observed allocations  (call to the allocation functions in the routine)
> >    B. observed accesses     (read or write access to the location of the 
> >                              pointer points to)
> > 
> >    that provide information about the type/existence of an object at
> >    the corresponding address.
> > 
> >    for A, we use the "alloc_size" attribute for the corresponding allocation
> >    functions to determine the object size;
> > 
> >    For B, we use the SIZE info of the TYPE attached to the corresponding access.
> >    (We treat counted_by attribute as a complement to the SIZE info of the TYPE
> >     for FMA) 
> > 
> >    The only other way in C which ensures that a pointer actually points
> >    to an object of the correct type is 'static':
> > 
> >    void foo(struct P *p[static 1]);   
> > 
> >    See https://gcc.gnu.org/pipermail/gcc-patches/2023-July/624814.html
> >    for more details.  */
> 
> This is a great explanation; thank you!
> 
> In the future I might want to have a new builtin that will allow
> a program to query a pointer when neither A nor B have happened. But
> for the first version of the __counted_by infrastructure, the above
> limitations seen fine.
> 
> For example, maybe __builtin_counted_size(p) (which returns sizeof(*p) +
> sizeof(*p->flex_array_member) * p->counted_by_member). Though since
> there might be multiple flex array members, maybe this can't work. :)

We had a _Lengthof proposal for arrays (instead of sizeof/sizeof)
and thought about how to extend this to structs with FAM. The
problem is that it can not rely on an attribute.

With GCC's VLA in structs you could do 

struct foo { int n; char buf[n_init]; } *p = malloc(sizeof *p);
p->n_init = n;

and get sizeof and bounds checking with UBSan
https://godbolt.org/z/d4nneqs3P

(but also compiler bugs and other issues)


Also see my experimental container library, where you can do:

vec_decl(int);
vec(int)* v = vec_alloc(int);

vec_push(&v, 1);
vec_push(&v, 3);

auto p = &vec_array(v);
(*p)[1] = 1; // bounds check

Here, "vec_array()" would give you a regular C array view
of the vector contant and with correct dynamic size, so you
can apply "sizeof" and  have bounds checking with UBSan and
it just works (with clang / GCC without changes). 
https://github.com/uecker/noplate



Martin








^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [V1][PATCH 0/3] New attribute "element_count" to annotate bounds for C99 FAM(PR108896)
  2023-08-02  6:25                 ` Martin Uecker
@ 2023-08-02 15:02                   ` Qing Zhao
  0 siblings, 0 replies; 52+ messages in thread
From: Qing Zhao @ 2023-08-02 15:02 UTC (permalink / raw)
  To: Martin Uecker
  Cc: Kees Cook, Siddhesh Poyarekar, joseph, richard.guenther, jakub,
	gcc-patches, isanbard



> On Aug 2, 2023, at 2:25 AM, Martin Uecker <uecker@tugraz.at> wrote:
> 
> Am Dienstag, dem 01.08.2023 um 15:45 -0700 schrieb Kees Cook:
>> On Mon, Jul 31, 2023 at 08:14:42PM +0000, Qing Zhao wrote:
>>> /* In general, Due to type casting, the type for the pointee of a pointer
>>>   does not say anything about the object it points to,
>>>   So, __builtin_object_size can not directly use the type of the pointee
>>>   to decide the size of the object the pointer points to.
>>> 
>>>   there are only two reliable ways:
>>>   A. observed allocations  (call to the allocation functions in the routine)
>>>   B. observed accesses     (read or write access to the location of the 
>>>                             pointer points to)
>>> 
>>>   that provide information about the type/existence of an object at
>>>   the corresponding address.
>>> 
>>>   for A, we use the "alloc_size" attribute for the corresponding allocation
>>>   functions to determine the object size;
>>> 
>>>   For B, we use the SIZE info of the TYPE attached to the corresponding access.
>>>   (We treat counted_by attribute as a complement to the SIZE info of the TYPE
>>>    for FMA) 
>>> 
>>>   The only other way in C which ensures that a pointer actually points
>>>   to an object of the correct type is 'static':
>>> 
>>>   void foo(struct P *p[static 1]);   
>>> 
>>>   See https://gcc.gnu.org/pipermail/gcc-patches/2023-July/624814.html
>>>   for more details.  */
>> 
>> This is a great explanation; thank you!
>> 
>> In the future I might want to have a new builtin that will allow
>> a program to query a pointer when neither A nor B have happened. But
>> for the first version of the __counted_by infrastructure, the above
>> limitations seen fine.
>> 
>> For example, maybe __builtin_counted_size(p) (which returns sizeof(*p) +
>> sizeof(*p->flex_array_member) * p->counted_by_member). Though since
>> there might be multiple flex array members, maybe this can't work. :)
> 
> We had a _Lengthof proposal for arrays (instead of sizeof/sizeof)
> and thought about how to extend this to structs with FAM. The
> problem is that it can not rely on an attribute.
> 
> With GCC's VLA in structs you could do 
> 
> struct foo { int n; char buf[n_init]; } *p = malloc(sizeof *p);
> p->n_init = n;
> 
> and get sizeof and bounds checking with UBSan
> https://godbolt.org/z/d4nneqs3P
> 
> (but also compiler bugs and other issues)

This works great!

If later the bounds information for the FAM can be integrated into TYPE system just like the VLA, 
That will be ideal, then we don’t need to hack the compiler here and there to handle the FMA specially.
> 
> 
> Also see my experimental container library, where you can do:
> 
> vec_decl(int);
> vec(int)* v = vec_alloc(int);
> 
> vec_push(&v, 1);
> vec_push(&v, 3);
> 
> auto p = &vec_array(v);
> (*p)[1] = 1; // bounds check
> 
> Here, "vec_array()" would give you a regular C array view
> of the vector contant and with correct dynamic size, so you
> can apply "sizeof" and  have bounds checking with UBSan and
> it just works (with clang / GCC without changes). 
> https://github.com/uecker/noplate

Yes, the idea of providing a type-safe library for C also looks promising. 
thanks

Qing
> 
> 
> 
> Martin


^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [V1][PATCH 0/3] New attribute "element_count" to annotate bounds for C99 FAM(PR108896)
  2023-08-01 22:45               ` Kees Cook
  2023-08-02  6:25                 ` Martin Uecker
@ 2023-08-02 15:09                 ` Qing Zhao
  1 sibling, 0 replies; 52+ messages in thread
From: Qing Zhao @ 2023-08-02 15:09 UTC (permalink / raw)
  To: Kees Cook
  Cc: Siddhesh Poyarekar, joseph, richard.guenther, jakub, gcc-patches,
	uecker, isanbard



> On Aug 1, 2023, at 6:45 PM, Kees Cook <keescook@chromium.org> wrote:
> 
> On Mon, Jul 31, 2023 at 08:14:42PM +0000, Qing Zhao wrote:
>> /* In general, Due to type casting, the type for the pointee of a pointer
>>   does not say anything about the object it points to,
>>   So, __builtin_object_size can not directly use the type of the pointee
>>   to decide the size of the object the pointer points to.
>> 
>>   there are only two reliable ways:
>>   A. observed allocations  (call to the allocation functions in the routine)
>>   B. observed accesses     (read or write access to the location of the 
>>                             pointer points to)
>> 
>>   that provide information about the type/existence of an object at
>>   the corresponding address.
>> 
>>   for A, we use the "alloc_size" attribute for the corresponding allocation
>>   functions to determine the object size;
>> 
>>   For B, we use the SIZE info of the TYPE attached to the corresponding access.
>>   (We treat counted_by attribute as a complement to the SIZE info of the TYPE
>>    for FMA) 
>> 
>>   The only other way in C which ensures that a pointer actually points
>>   to an object of the correct type is 'static':
>> 
>>   void foo(struct P *p[static 1]);   
>> 
>>   See https://gcc.gnu.org/pipermail/gcc-patches/2023-July/624814.html
>>   for more details.  */
> 
> This is a great explanation; thank you!
> 
> In the future I might want to have a new builtin that will allow
> a program to query a pointer when neither A nor B have happened. But
> for the first version of the __counted_by infrastructure, the above
> limitations seen fine.
> 
> For example, maybe __builtin_counted_size(p) (which returns sizeof(*p) +
> sizeof(*p->flex_array_member) * p->counted_by_member). Though since
> there might be multiple flex array members, maybe this can't work. :)

What do you mean by “there might be multiple flex array members”?

Do you mean the following example:

struct annotated {
        size_t foo;
        int array[] __attribute__((counted_by (foo)));
};

static struct annotated * noinline alloc_buf (int index)
{
  struct annotated *p;
  p = malloc(sizeof (*p) + (index) * sizeof (int));
  p->foo = index;
  return p;
}

Int main ()
{
  struct annotated *p1, *p2;
  p1 = alloc_buf (10);
  p2 = alloc_buf (20);

  __builtin_counted_size(p1)???
  __builtin_counted_size(p2)???
}

Or something else?

Qing
> 
> -Kees
> 
> -- 
> Kees Cook


^ permalink raw reply	[flat|nested] 52+ messages in thread

end of thread, other threads:[~2023-08-02 15:09 UTC | newest]

Thread overview: 52+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-05-25 16:14 [V1][PATCH 0/3] New attribute "element_count" to annotate bounds for C99 FAM(PR108896) Qing Zhao
2023-05-25 16:14 ` [V1][PATCH 1/3] Provide element_count attribute to flexible array member field (PR108896) Qing Zhao
2023-05-25 21:02   ` Joseph Myers
2023-05-26 13:32     ` Qing Zhao
2023-05-26 18:15       ` Joseph Myers
2023-05-26 19:09         ` Qing Zhao
2023-06-07 19:59         ` Qing Zhao
2023-06-07 20:53           ` Joseph Myers
2023-06-07 21:32             ` Qing Zhao
2023-06-07 22:05               ` Joseph Myers
2023-06-08 13:06                 ` Qing Zhao
2023-06-15 15:09                 ` Qing Zhao
2023-06-15 16:55                   ` Joseph Myers
2023-06-15 19:54                     ` Qing Zhao
2023-06-15 22:48                       ` Joseph Myers
2023-06-16 15:01                         ` Qing Zhao
2023-06-16  7:21                     ` Martin Uecker
2023-06-16 15:14                       ` Qing Zhao
2023-06-16 16:21                       ` Joseph Myers
2023-06-16 17:07                         ` Martin Uecker
2023-06-16 20:20                           ` Qing Zhao
2023-06-16 21:35                             ` Joseph Myers
2023-06-20 19:40                               ` Qing Zhao
2023-06-27 15:44                                 ` Qing Zhao
2023-05-25 16:14 ` [V1][PATCH 2/3] Use the element_count atribute info in builtin object size [PR108896] Qing Zhao
2023-05-27 10:20   ` Martin Uecker
2023-05-30 16:08     ` Qing Zhao
2023-05-25 16:14 ` [V1][PATCH 3/3] Use the element_count attribute information in bound sanitizer[PR108896] Qing Zhao
2023-05-26 16:12 ` [V1][PATCH 0/3] New attribute "element_count" to annotate bounds for C99 FAM(PR108896) Kees Cook
2023-05-30 21:44   ` Qing Zhao
2023-05-26 20:40 ` Kees Cook
2023-05-30 15:43   ` Qing Zhao
2023-07-06 18:56   ` Qing Zhao
2023-07-06 21:10     ` Martin Uecker
2023-07-07 15:47       ` Qing Zhao
2023-07-07 20:21         ` Qing Zhao
2023-07-13 20:31     ` Kees Cook
2023-07-17 21:17       ` Qing Zhao
2023-07-17 23:40         ` Kees Cook
2023-07-18 15:37           ` Qing Zhao
2023-07-18 16:03             ` Martin Uecker
2023-07-18 16:25               ` Qing Zhao
2023-07-18 16:50                 ` Martin Uecker
2023-07-18 18:53             ` Qing Zhao
2023-07-19  8:41           ` Martin Uecker
2023-07-19 16:16           ` Qing Zhao
2023-07-19 18:52           ` Qing Zhao
2023-07-31 20:14             ` Qing Zhao
2023-08-01 22:45               ` Kees Cook
2023-08-02  6:25                 ` Martin Uecker
2023-08-02 15:02                   ` Qing Zhao
2023-08-02 15:09                 ` Qing Zhao

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).