public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* [PATCH] ubsan: Honor -fstrict-flex-arrays= in -fsanitize=bounds [PR108894]
@ 2023-02-28  8:26 Jakub Jelinek
  2023-02-28  9:02 ` Richard Biener
  2023-02-28 16:13 ` Qing Zhao
  0 siblings, 2 replies; 9+ messages in thread
From: Jakub Jelinek @ 2023-02-28  8:26 UTC (permalink / raw)
  To: Richard Biener; +Cc: gcc-patches

Hi!

While this isn't really a regression, the -fstrict-flex-arrays*
option is new in GCC 13 and so I think we should make -fsanitize=bounds
play with it well from the beginning.

The current behavior is that -fsanitize=bounds considers all trailing
arrays as flexible member-like arrays and both -fsanitize=bounds and
-fsanitize=bounds-strict because of a bug don't even instrument
[0] arrays at all, not as trailing nor when followed by other members.

I think -fstrict-flex-arrays* options can be considered as language
mode changing options, by default flexible member-like arrays are
handled like flexible arrays, but that option can change the set of
the arrays which are treated like that.  So, -fsanitize=bounds should
change with that on what is considered acceptable and what isn't.
While -fsanitize=bounds-strict should reject them all always to
continue previous behavior.

The following patch implements that.  To support [0] array instrumentation,
I had to change the meaning of the bounds argument to .UBSAN_BOUNDS,
previously it was the TYPE_MAX_VALUE of the domain unless ignore_off_by_one
(used for taking address of the array element rather than accessing it;
in that case 1 is added to the bound argument) and the later lowered checks
were if (index > bound) report_failure ().
The problem with that is that for [0] arrays where (at least for C++)
the max value is all ones, for accesses that condition will be never true;
for addresses of elements it was working (in C++) correctly before.
This patch changes it to add 1 + ignore_off_by_one, so -1 becomes 0 or
1 for &array_ref and changing the lowering to be if (index >= bound)
report_failure ().  Furthermore, as C represents the [0] arrays with
NULL TYPE_MAX_VALUE, I treated those like the C++ ones.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2023-02-28  Jakub Jelinek  <jakub@redhat.com>

	PR sanitizer/108894
gcc/
	* ubsan.cc (ubsan_expand_bounds_ifn): Emit index >= bound
	comparison rather than index > bound.
	* gimple-fold.cc (gimple_fold_call): Use tree_int_cst_lt
	rather than tree_int_cst_le for IFN_UBSAN_BOUND comparison.
	* doc/invoke.texi (-fsanitize=bounds): Document that whether
	flexible array member-like arrays are instrumented or not depends
	on -fstrict-flex-arrays* options of strict_flex_array attributes.
	(-fsanitize=bounds-strict): Document that flexible array members
	are not instrumented.
gcc/c-family/
	* c-common.h (c_strict_flex_array_level_of): Declare.
	* c-common.cc (c_strict_flex_array_level_of): New function,
	moved and renamed from c-decl.cc's strict_flex_array_level_of.
	* c-ubsan.cc (ubsan_instrument_bounds): Fix comment typo.  For
	C check c_strict_flex_array_level_of whether a trailing array
	should be treated as flexible member like.  Handle C [0] arrays.
	Add 1 + index_off_by_one rather than index_off_by_one to bounds
	and use tree_int_cst_lt rather than tree_int_cst_le for idx vs.
	bounds comparison.
gcc/c/
	* c-decl.cc (strict_flex_array_level_of): Move to c-common.cc
	and rename to c_strict_flex_array_level_of.
	(is_flexible_array_member_p): Adjust caller.
gcc/testsuite/
	* gcc.dg/ubsan/bounds-4.c: New test.
	* gcc.dg/ubsan/bounds-4a.c: New test.
	* gcc.dg/ubsan/bounds-4b.c: New test.
	* gcc.dg/ubsan/bounds-4c.c: New test.
	* gcc.dg/ubsan/bounds-4d.c: New test.
	* g++.dg/ubsan/bounds-1.C: New test.

--- gcc/ubsan.cc.jj	2023-02-06 09:05:49.001628881 +0100
+++ gcc/ubsan.cc	2023-02-27 16:46:13.452122423 +0100
@@ -721,7 +721,7 @@ ubsan_expand_bounds_ifn (gimple_stmt_ite
 
   gimple_stmt_iterator gsi_orig = *gsi;
 
-  /* Create condition "if (index > bound)".  */
+  /* Create condition "if (index >= bound)".  */
   basic_block then_bb, fallthru_bb;
   gimple_stmt_iterator cond_insert_point
     = create_cond_insert_point (gsi, false, false, true,
@@ -730,7 +730,7 @@ ubsan_expand_bounds_ifn (gimple_stmt_ite
   index = force_gimple_operand_gsi (&cond_insert_point, index,
 				    true, NULL_TREE,
 				    false, GSI_NEW_STMT);
-  gimple *g = gimple_build_cond (GT_EXPR, index, bound, NULL_TREE, NULL_TREE);
+  gimple *g = gimple_build_cond (GE_EXPR, index, bound, NULL_TREE, NULL_TREE);
   gimple_set_location (g, loc);
   gsi_insert_after (&cond_insert_point, g, GSI_NEW_STMT);
 
--- gcc/gimple-fold.cc.jj	2023-02-08 11:57:29.726582589 +0100
+++ gcc/gimple-fold.cc	2023-02-27 16:45:00.895160752 +0100
@@ -5624,7 +5624,7 @@ gimple_fold_call (gimple_stmt_iterator *
 	      {
 		index = fold_convert (TREE_TYPE (bound), index);
 		if (TREE_CODE (index) == INTEGER_CST
-		    && tree_int_cst_le (index, bound))
+		    && tree_int_cst_lt (index, bound))
 		  {
 		    replace_call_with_value (gsi, NULL_TREE);
 		    return true;
--- gcc/doc/invoke.texi.jj	2023-02-24 09:58:39.888509771 +0100
+++ gcc/doc/invoke.texi	2023-02-28 08:32:27.313775696 +0100
@@ -16831,14 +16831,17 @@ a++;
 @item -fsanitize=bounds
 This option enables instrumentation of array bounds.  Various out of bounds
 accesses are detected.  Flexible array members, flexible array member-like
-arrays, and initializers of variables with static storage are not instrumented.
+arrays, and initializers of variables with static storage are not
+instrumented, with the exception of flexible array member-like arrays
+for which @code{-fstrict-flex-arrays} or @code{-fstrict-flex-arrays=}
+options or @code{strict_flex_array} attributes say they shouldn't be treated
+like flexible array member-like arrays.
 
 @opindex fsanitize=bounds-strict
 @item -fsanitize=bounds-strict
 This option enables strict instrumentation of array bounds.  Most out of bounds
-accesses are detected, including flexible array members and flexible array
-member-like arrays.  Initializers of variables with static storage are not
-instrumented.
+accesses are detected, including flexible array member-like arrays.
+Initializers of variables with static storage are not instrumented.
 
 @opindex fsanitize=alignment
 @item -fsanitize=alignment
--- gcc/c-family/c-common.h.jj	2023-02-17 12:45:07.524646367 +0100
+++ gcc/c-family/c-common.h	2023-02-27 15:11:16.737259968 +0100
@@ -907,6 +907,7 @@ extern tree fold_for_warn (tree);
 extern tree c_common_get_narrower (tree, int *);
 extern bool get_attribute_operand (tree, unsigned HOST_WIDE_INT *);
 extern void c_common_finalize_early_debug (void);
+extern unsigned int c_strict_flex_array_level_of (tree);
 extern bool c_option_is_from_cpp_diagnostics (int);
 
 /* Used by convert_and_check; in front ends.  */
--- gcc/c-family/c-common.cc.jj	2023-01-19 09:58:50.875229150 +0100
+++ gcc/c-family/c-common.cc	2023-02-27 15:10:56.232547326 +0100
@@ -9501,4 +9501,33 @@ c_common_finalize_early_debug (void)
       (*debug_hooks->early_global_decl) (cnode->decl);
 }
 
+/* Get the LEVEL of the strict_flex_array for the ARRAY_FIELD based on the
+   values of attribute strict_flex_array and the flag_strict_flex_arrays.  */
+unsigned int
+c_strict_flex_array_level_of (tree array_field)
+{
+  gcc_assert (TREE_CODE (array_field) == FIELD_DECL);
+  unsigned int strict_flex_array_level = flag_strict_flex_arrays;
+
+  tree attr_strict_flex_array
+    = lookup_attribute ("strict_flex_array", DECL_ATTRIBUTES (array_field));
+  /* If there is a strict_flex_array attribute attached to the field,
+     override the flag_strict_flex_arrays.  */
+  if (attr_strict_flex_array)
+    {
+      /* Get the value of the level first from the attribute.  */
+      unsigned HOST_WIDE_INT attr_strict_flex_array_level = 0;
+      gcc_assert (TREE_VALUE (attr_strict_flex_array) != NULL_TREE);
+      attr_strict_flex_array = TREE_VALUE (attr_strict_flex_array);
+      gcc_assert (TREE_VALUE (attr_strict_flex_array) != NULL_TREE);
+      attr_strict_flex_array = TREE_VALUE (attr_strict_flex_array);
+      gcc_assert (tree_fits_uhwi_p (attr_strict_flex_array));
+      attr_strict_flex_array_level = tree_to_uhwi (attr_strict_flex_array);
+
+      /* The attribute has higher priority than flag_struct_flex_array.  */
+      strict_flex_array_level = attr_strict_flex_array_level;
+    }
+  return strict_flex_array_level;
+}
+
 #include "gt-c-family-c-common.h"
--- gcc/c-family/c-ubsan.cc.jj	2023-01-16 11:52:15.899736776 +0100
+++ gcc/c-family/c-ubsan.cc	2023-02-28 08:15:57.579124095 +0100
@@ -354,7 +354,7 @@ ubsan_instrument_return (location_t loc)
    that gets expanded in the sanopt pass, and make an array dimension
    of it.  ARRAY is the array, *INDEX is an index to the array.
    Return NULL_TREE if no instrumentation is emitted.
-   IGNORE_OFF_BY_ONE is true if the ARRAY_REF is inside a ADDR_EXPR.  */
+   IGNORE_OFF_BY_ONE is true if the ARRAY_REF is inside an ADDR_EXPR.  */
 
 tree
 ubsan_instrument_bounds (location_t loc, tree array, tree *index,
@@ -363,13 +363,25 @@ ubsan_instrument_bounds (location_t loc,
   tree type = TREE_TYPE (array);
   tree domain = TYPE_DOMAIN (type);
 
-  if (domain == NULL_TREE || TYPE_MAX_VALUE (domain) == NULL_TREE)
+  if (domain == NULL_TREE)
     return NULL_TREE;
 
   tree bound = TYPE_MAX_VALUE (domain);
-  if (ignore_off_by_one)
-    bound = fold_build2 (PLUS_EXPR, TREE_TYPE (bound), bound,
-			 build_int_cst (TREE_TYPE (bound), 1));
+  if (!bound)
+    {
+      /* Handle C [0] arrays, which have TYPE_MAX_VALUE NULL, like
+	 C++ [0] arrays which have TYPE_MIN_VALUE 0 TYPE_MAX_VALUE -1.  */
+      if (!c_dialect_cxx ()
+	  && COMPLETE_TYPE_P (type)
+	  && integer_zerop (TYPE_SIZE (type)))
+	bound = build_int_cst (TREE_TYPE (TYPE_MIN_VALUE (domain)), -1);
+      else
+	return NULL_TREE;
+    }
+
+  bound = fold_build2 (PLUS_EXPR, TREE_TYPE (bound), bound,
+		       build_int_cst (TREE_TYPE (bound),
+		       1 + ignore_off_by_one));
 
   /* Detect flexible array members and suchlike, unless
      -fsanitize=bounds-strict.  */
@@ -392,6 +404,45 @@ ubsan_instrument_bounds (location_t loc,
 	  if (next)
 	    /* Not a last element.  Instrument it.  */
 	    break;
+	  if (TREE_CODE (TREE_TYPE (TREE_OPERAND (cref, 1))) == ARRAY_TYPE
+	      && !c_dialect_cxx ())
+	    {
+	      unsigned l
+		= c_strict_flex_array_level_of (TREE_OPERAND (cref, 1));
+	      tree type2 = TREE_TYPE (TREE_OPERAND (cref, 1));
+	      if (TYPE_DOMAIN (type2) != NULL_TREE)
+		{
+		  tree max = TYPE_MAX_VALUE (TYPE_DOMAIN (type2));
+		  if (max == NULL_TREE)
+		    {
+		      /* C [0] */
+		      if (COMPLETE_TYPE_P (type2)
+			  && integer_zerop (TYPE_SIZE (type2))
+			  && l == 3)
+			next = TREE_OPERAND (cref, 1);
+		    }
+		  else if (TREE_CODE (max) == INTEGER_CST)
+		    {
+		      if (c_dialect_cxx ()
+			  && integer_all_onesp (max))
+			{
+			  /* C++ [0] */
+			  if (l == 3)
+			    next = TREE_OPERAND (cref, 1);
+			}
+		      else if (integer_zerop (max))
+			{
+			  /* C/C++ [1] */
+			  if (l >= 2)
+			    next = TREE_OPERAND (cref, 1);
+			}
+		      else if (l >= 1)
+			next = TREE_OPERAND (cref, 1);
+		    }
+		}
+	      if (next)
+		break;
+	    }
 	  /* Ok, this is the last field of the structure/union.  But the
 	     aggregate containing the field must be the last field too,
 	     recursively.  */
@@ -413,7 +464,7 @@ ubsan_instrument_bounds (location_t loc,
   if (idx
       && TREE_CODE (bound) == INTEGER_CST
       && tree_int_cst_sgn (idx) >= 0
-      && tree_int_cst_le (idx, bound))
+      && tree_int_cst_lt (idx, bound))
     return NULL_TREE;
 
   *index = save_expr (*index);
--- gcc/c/c-decl.cc.jj	2023-02-18 12:38:30.000000000 +0100
+++ gcc/c/c-decl.cc	2023-02-27 15:09:57.143375989 +0100
@@ -9050,35 +9050,6 @@ finish_incomplete_vars (tree incomplete_
     }
 }
 
-/* Get the LEVEL of the strict_flex_array for the ARRAY_FIELD based on the
-   values of attribute strict_flex_array and the flag_strict_flex_arrays.  */
-static unsigned int
-strict_flex_array_level_of (tree array_field)
-{
-  gcc_assert (TREE_CODE (array_field) == FIELD_DECL);
-  unsigned int strict_flex_array_level = flag_strict_flex_arrays;
-
-  tree attr_strict_flex_array
-    = lookup_attribute ("strict_flex_array", DECL_ATTRIBUTES (array_field));
-  /* If there is a strict_flex_array attribute attached to the field,
-     override the flag_strict_flex_arrays.  */
-  if (attr_strict_flex_array)
-    {
-      /* Get the value of the level first from the attribute.  */
-      unsigned HOST_WIDE_INT attr_strict_flex_array_level = 0;
-      gcc_assert (TREE_VALUE (attr_strict_flex_array) != NULL_TREE);
-      attr_strict_flex_array = TREE_VALUE (attr_strict_flex_array);
-      gcc_assert (TREE_VALUE (attr_strict_flex_array) != NULL_TREE);
-      attr_strict_flex_array = TREE_VALUE (attr_strict_flex_array);
-      gcc_assert (tree_fits_uhwi_p (attr_strict_flex_array));
-      attr_strict_flex_array_level = tree_to_uhwi (attr_strict_flex_array);
-
-      /* The attribute has higher priority than flag_struct_flex_array.  */
-      strict_flex_array_level = attr_strict_flex_array_level;
-    }
-  return strict_flex_array_level;
-}
-
 /* Determine whether the FIELD_DECL X is a flexible array member according to
    the following info:
   A. whether the FIELD_DECL X is the last field of the DECL_CONTEXT;
@@ -9105,7 +9076,7 @@ is_flexible_array_member_p (bool is_last
   bool is_one_element_array = one_element_array_type_p (TREE_TYPE (x));
   bool is_flexible_array = flexible_array_member_type_p (TREE_TYPE (x));
 
-  unsigned int strict_flex_array_level = strict_flex_array_level_of (x);
+  unsigned int strict_flex_array_level = c_strict_flex_array_level_of (x);
 
   switch (strict_flex_array_level)
     {
--- gcc/testsuite/gcc.dg/ubsan/bounds-4.c.jj	2023-02-27 17:08:46.193766320 +0100
+++ gcc/testsuite/gcc.dg/ubsan/bounds-4.c	2023-02-27 17:21:40.489668882 +0100
@@ -0,0 +1,79 @@
+/* PR sanitizer/108894 */
+/* { dg-do run } */
+/* { dg-options "-fsanitize=bounds -fsanitize-recover=bounds" } */
+/* { dg-output "index 15 out of bounds for type 'int \\\[15\\\]'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]*index 0 out of bounds for type 'int \\\[\\\*\\\]'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]*index 16 out of bounds for type 'int \\\[15\\\]'" } */
+
+struct A { int a; int b[]; };
+struct B { int a; int b[0]; };
+struct C { int a; int b[1]; };
+struct D { int a; int b[2]; };
+struct E { int a; int b[42]; };
+struct F { int a; int b[0]; int c[2]; };
+struct G { int a; int b[15]; int c[2]; };
+
+__attribute__((noipa)) int
+foo (struct A *a)
+{
+  return a->b[14];
+}
+
+__attribute__((noipa)) int
+bar (struct B *a)
+{
+  return a->b[0];
+}
+
+__attribute__((noipa)) int
+baz (struct C *a)
+{
+  return a->b[1];
+}
+
+__attribute__((noipa)) int
+qux (struct D *a)
+{
+  return a->b[2];
+}
+
+__attribute__((noipa)) int
+corge (struct E *a)
+{
+  return a->b[14];
+}
+
+__attribute__((noipa)) int
+freddy (struct F *a)
+{
+  return a->b[0];
+}
+
+__attribute__((noipa)) int
+garply (struct G *a)
+{
+  return a->b[15];
+}
+
+__attribute__((noipa)) int
+waldo (struct G *a)
+{
+  return a->b[16];
+}
+
+int
+main ()
+{
+  union { struct A a; struct B b; struct C c;
+	  struct D d; struct E e; struct F f; } u;
+  struct G g;
+  u.e.a = 42;
+  __builtin_memset (u.e.b, 0, sizeof (u.e.b));
+  __builtin_memset (&g, 0, sizeof (g));
+  int r = garply (&g);
+  r += foo (&u.a) + bar (&u.b) + baz (&u.c);
+  r += qux (&u.d) + corge (&u.e) + freddy (&u.f);
+  r += waldo (&g);
+  if (r != 0)
+    __builtin_abort ();
+}
--- gcc/testsuite/gcc.dg/ubsan/bounds-4a.c.jj	2023-02-27 17:08:52.797671833 +0100
+++ gcc/testsuite/gcc.dg/ubsan/bounds-4a.c	2023-02-27 17:25:08.463687314 +0100
@@ -0,0 +1,8 @@
+/* PR sanitizer/108894 */
+/* { dg-do run } */
+/* { dg-options "-fsanitize=bounds -fsanitize-recover=bounds -fstrict-flex-arrays=0" } */
+/* { dg-output "index 15 out of bounds for type 'int \\\[15\\\]'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]*index 0 out of bounds for type 'int \\\[\\\*\\\]'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]*index 16 out of bounds for type 'int \\\[15\\\]'" } */
+
+#include "bounds-4.c"
--- gcc/testsuite/gcc.dg/ubsan/bounds-4b.c.jj	2023-02-27 17:08:52.797671833 +0100
+++ gcc/testsuite/gcc.dg/ubsan/bounds-4b.c	2023-02-27 17:29:57.042545420 +0100
@@ -0,0 +1,9 @@
+/* PR sanitizer/108894 */
+/* { dg-do run } */
+/* { dg-options "-fsanitize=bounds -fsanitize-recover=bounds -fstrict-flex-arrays=1" } */
+/* { dg-output "index 15 out of bounds for type 'int \\\[15\\\]'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]*index 2 out of bounds for type 'int \\\[2\\\]'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]*index 0 out of bounds for type 'int \\\[\\\*\\\]'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]*index 16 out of bounds for type 'int \\\[15\\\]'" } */
+
+#include "bounds-4.c"
--- gcc/testsuite/gcc.dg/ubsan/bounds-4c.c.jj	2023-02-27 17:08:52.797671833 +0100
+++ gcc/testsuite/gcc.dg/ubsan/bounds-4c.c	2023-02-27 17:34:29.094634537 +0100
@@ -0,0 +1,10 @@
+/* PR sanitizer/108894 */
+/* { dg-do run } */
+/* { dg-options "-fsanitize=bounds -fsanitize-recover=bounds -fstrict-flex-arrays=2" } */
+/* { dg-output "index 15 out of bounds for type 'int \\\[15\\\]'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]*index 1 out of bounds for type 'int \\\[1\\\]'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]*index 2 out of bounds for type 'int \\\[2\\\]'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]*index 0 out of bounds for type 'int \\\[\\\*\\\]'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]*index 16 out of bounds for type 'int \\\[15\\\]'" } */
+
+#include "bounds-4.c"
--- gcc/testsuite/gcc.dg/ubsan/bounds-4d.c.jj	2023-02-27 17:08:52.797671833 +0100
+++ gcc/testsuite/gcc.dg/ubsan/bounds-4d.c	2023-02-27 17:35:46.556520992 +0100
@@ -0,0 +1,11 @@
+/* PR sanitizer/108894 */
+/* { dg-do run } */
+/* { dg-options "-fsanitize=bounds -fsanitize-recover=bounds -fstrict-flex-arrays=3" } */
+/* { dg-output "index 15 out of bounds for type 'int \\\[15\\\]'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]*index 0 out of bounds for type 'int \\\[\\\*\\\]'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]*index 1 out of bounds for type 'int \\\[1\\\]'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]*index 2 out of bounds for type 'int \\\[2\\\]'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]*index 0 out of bounds for type 'int \\\[\\\*\\\]'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]*index 16 out of bounds for type 'int \\\[15\\\]'" } */
+
+#include "bounds-4.c"
--- gcc/testsuite/g++.dg/ubsan/bounds-1.C.jj	2023-02-27 17:37:03.076420982 +0100
+++ gcc/testsuite/g++.dg/ubsan/bounds-1.C	2023-02-27 17:38:23.717261726 +0100
@@ -0,0 +1,8 @@
+// PR sanitizer/108894
+// { dg-do run }
+// { dg-options "-fsanitize=bounds -fsanitize-recover=bounds" }
+// { dg-output "index 15 out of bounds for type 'int \\\[15\\\]'\[^\n\r]*(\n|\r\n|\r)" }
+// { dg-output "\[^\n\r]*index 0 out of bounds for type 'int \\\[\[0-9x]*\\\]'\[^\n\r]*(\n|\r\n|\r)" }
+// { dg-output "\[^\n\r]*index 16 out of bounds for type 'int \\\[15\\\]'" }
+
+#include "../../gcc.dg/ubsan/bounds-4.c"

	Jakub


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] ubsan: Honor -fstrict-flex-arrays= in -fsanitize=bounds [PR108894]
  2023-02-28  8:26 [PATCH] ubsan: Honor -fstrict-flex-arrays= in -fsanitize=bounds [PR108894] Jakub Jelinek
@ 2023-02-28  9:02 ` Richard Biener
  2023-02-28  9:11   ` Jakub Jelinek
  2023-02-28 16:13 ` Qing Zhao
  1 sibling, 1 reply; 9+ messages in thread
From: Richard Biener @ 2023-02-28  9:02 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: gcc-patches

On Tue, 28 Feb 2023, Jakub Jelinek wrote:

> Hi!
> 
> While this isn't really a regression, the -fstrict-flex-arrays*
> option is new in GCC 13 and so I think we should make -fsanitize=bounds
> play with it well from the beginning.
> 
> The current behavior is that -fsanitize=bounds considers all trailing
> arrays as flexible member-like arrays and both -fsanitize=bounds and
> -fsanitize=bounds-strict because of a bug don't even instrument
> [0] arrays at all, not as trailing nor when followed by other members.
> 
> I think -fstrict-flex-arrays* options can be considered as language
> mode changing options, by default flexible member-like arrays are
> handled like flexible arrays, but that option can change the set of
> the arrays which are treated like that.  So, -fsanitize=bounds should
> change with that on what is considered acceptable and what isn't.
> While -fsanitize=bounds-strict should reject them all always to
> continue previous behavior.
> 
> The following patch implements that.  To support [0] array instrumentation,
> I had to change the meaning of the bounds argument to .UBSAN_BOUNDS,
> previously it was the TYPE_MAX_VALUE of the domain unless ignore_off_by_one
> (used for taking address of the array element rather than accessing it;
> in that case 1 is added to the bound argument) and the later lowered checks
> were if (index > bound) report_failure ().
> The problem with that is that for [0] arrays where (at least for C++)
> the max value is all ones, for accesses that condition will be never true;
> for addresses of elements it was working (in C++) correctly before.
> This patch changes it to add 1 + ignore_off_by_one, so -1 becomes 0 or
> 1 for &array_ref and changing the lowering to be if (index >= bound)
> report_failure ().  Furthermore, as C represents the [0] arrays with
> NULL TYPE_MAX_VALUE, I treated those like the C++ ones.
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

LGTM.  Btw, what does -fsanitize=bounds do for C++ code which lacks
flexible arrays?  Does it treat all trailing arrays as fixed?

Thanks,
Richard.

> 2023-02-28  Jakub Jelinek  <jakub@redhat.com>
> 
> 	PR sanitizer/108894
> gcc/
> 	* ubsan.cc (ubsan_expand_bounds_ifn): Emit index >= bound
> 	comparison rather than index > bound.
> 	* gimple-fold.cc (gimple_fold_call): Use tree_int_cst_lt
> 	rather than tree_int_cst_le for IFN_UBSAN_BOUND comparison.
> 	* doc/invoke.texi (-fsanitize=bounds): Document that whether
> 	flexible array member-like arrays are instrumented or not depends
> 	on -fstrict-flex-arrays* options of strict_flex_array attributes.
> 	(-fsanitize=bounds-strict): Document that flexible array members
> 	are not instrumented.
> gcc/c-family/
> 	* c-common.h (c_strict_flex_array_level_of): Declare.
> 	* c-common.cc (c_strict_flex_array_level_of): New function,
> 	moved and renamed from c-decl.cc's strict_flex_array_level_of.
> 	* c-ubsan.cc (ubsan_instrument_bounds): Fix comment typo.  For
> 	C check c_strict_flex_array_level_of whether a trailing array
> 	should be treated as flexible member like.  Handle C [0] arrays.
> 	Add 1 + index_off_by_one rather than index_off_by_one to bounds
> 	and use tree_int_cst_lt rather than tree_int_cst_le for idx vs.
> 	bounds comparison.
> gcc/c/
> 	* c-decl.cc (strict_flex_array_level_of): Move to c-common.cc
> 	and rename to c_strict_flex_array_level_of.
> 	(is_flexible_array_member_p): Adjust caller.
> gcc/testsuite/
> 	* gcc.dg/ubsan/bounds-4.c: New test.
> 	* gcc.dg/ubsan/bounds-4a.c: New test.
> 	* gcc.dg/ubsan/bounds-4b.c: New test.
> 	* gcc.dg/ubsan/bounds-4c.c: New test.
> 	* gcc.dg/ubsan/bounds-4d.c: New test.
> 	* g++.dg/ubsan/bounds-1.C: New test.
> 
> --- gcc/ubsan.cc.jj	2023-02-06 09:05:49.001628881 +0100
> +++ gcc/ubsan.cc	2023-02-27 16:46:13.452122423 +0100
> @@ -721,7 +721,7 @@ ubsan_expand_bounds_ifn (gimple_stmt_ite
>  
>    gimple_stmt_iterator gsi_orig = *gsi;
>  
> -  /* Create condition "if (index > bound)".  */
> +  /* Create condition "if (index >= bound)".  */
>    basic_block then_bb, fallthru_bb;
>    gimple_stmt_iterator cond_insert_point
>      = create_cond_insert_point (gsi, false, false, true,
> @@ -730,7 +730,7 @@ ubsan_expand_bounds_ifn (gimple_stmt_ite
>    index = force_gimple_operand_gsi (&cond_insert_point, index,
>  				    true, NULL_TREE,
>  				    false, GSI_NEW_STMT);
> -  gimple *g = gimple_build_cond (GT_EXPR, index, bound, NULL_TREE, NULL_TREE);
> +  gimple *g = gimple_build_cond (GE_EXPR, index, bound, NULL_TREE, NULL_TREE);
>    gimple_set_location (g, loc);
>    gsi_insert_after (&cond_insert_point, g, GSI_NEW_STMT);
>  
> --- gcc/gimple-fold.cc.jj	2023-02-08 11:57:29.726582589 +0100
> +++ gcc/gimple-fold.cc	2023-02-27 16:45:00.895160752 +0100
> @@ -5624,7 +5624,7 @@ gimple_fold_call (gimple_stmt_iterator *
>  	      {
>  		index = fold_convert (TREE_TYPE (bound), index);
>  		if (TREE_CODE (index) == INTEGER_CST
> -		    && tree_int_cst_le (index, bound))
> +		    && tree_int_cst_lt (index, bound))
>  		  {
>  		    replace_call_with_value (gsi, NULL_TREE);
>  		    return true;
> --- gcc/doc/invoke.texi.jj	2023-02-24 09:58:39.888509771 +0100
> +++ gcc/doc/invoke.texi	2023-02-28 08:32:27.313775696 +0100
> @@ -16831,14 +16831,17 @@ a++;
>  @item -fsanitize=bounds
>  This option enables instrumentation of array bounds.  Various out of bounds
>  accesses are detected.  Flexible array members, flexible array member-like
> -arrays, and initializers of variables with static storage are not instrumented.
> +arrays, and initializers of variables with static storage are not
> +instrumented, with the exception of flexible array member-like arrays
> +for which @code{-fstrict-flex-arrays} or @code{-fstrict-flex-arrays=}
> +options or @code{strict_flex_array} attributes say they shouldn't be treated
> +like flexible array member-like arrays.
>  
>  @opindex fsanitize=bounds-strict
>  @item -fsanitize=bounds-strict
>  This option enables strict instrumentation of array bounds.  Most out of bounds
> -accesses are detected, including flexible array members and flexible array
> -member-like arrays.  Initializers of variables with static storage are not
> -instrumented.
> +accesses are detected, including flexible array member-like arrays.
> +Initializers of variables with static storage are not instrumented.
>  
>  @opindex fsanitize=alignment
>  @item -fsanitize=alignment
> --- gcc/c-family/c-common.h.jj	2023-02-17 12:45:07.524646367 +0100
> +++ gcc/c-family/c-common.h	2023-02-27 15:11:16.737259968 +0100
> @@ -907,6 +907,7 @@ extern tree fold_for_warn (tree);
>  extern tree c_common_get_narrower (tree, int *);
>  extern bool get_attribute_operand (tree, unsigned HOST_WIDE_INT *);
>  extern void c_common_finalize_early_debug (void);
> +extern unsigned int c_strict_flex_array_level_of (tree);
>  extern bool c_option_is_from_cpp_diagnostics (int);
>  
>  /* Used by convert_and_check; in front ends.  */
> --- gcc/c-family/c-common.cc.jj	2023-01-19 09:58:50.875229150 +0100
> +++ gcc/c-family/c-common.cc	2023-02-27 15:10:56.232547326 +0100
> @@ -9501,4 +9501,33 @@ c_common_finalize_early_debug (void)
>        (*debug_hooks->early_global_decl) (cnode->decl);
>  }
>  
> +/* Get the LEVEL of the strict_flex_array for the ARRAY_FIELD based on the
> +   values of attribute strict_flex_array and the flag_strict_flex_arrays.  */
> +unsigned int
> +c_strict_flex_array_level_of (tree array_field)
> +{
> +  gcc_assert (TREE_CODE (array_field) == FIELD_DECL);
> +  unsigned int strict_flex_array_level = flag_strict_flex_arrays;
> +
> +  tree attr_strict_flex_array
> +    = lookup_attribute ("strict_flex_array", DECL_ATTRIBUTES (array_field));
> +  /* If there is a strict_flex_array attribute attached to the field,
> +     override the flag_strict_flex_arrays.  */
> +  if (attr_strict_flex_array)
> +    {
> +      /* Get the value of the level first from the attribute.  */
> +      unsigned HOST_WIDE_INT attr_strict_flex_array_level = 0;
> +      gcc_assert (TREE_VALUE (attr_strict_flex_array) != NULL_TREE);
> +      attr_strict_flex_array = TREE_VALUE (attr_strict_flex_array);
> +      gcc_assert (TREE_VALUE (attr_strict_flex_array) != NULL_TREE);
> +      attr_strict_flex_array = TREE_VALUE (attr_strict_flex_array);
> +      gcc_assert (tree_fits_uhwi_p (attr_strict_flex_array));
> +      attr_strict_flex_array_level = tree_to_uhwi (attr_strict_flex_array);
> +
> +      /* The attribute has higher priority than flag_struct_flex_array.  */
> +      strict_flex_array_level = attr_strict_flex_array_level;
> +    }
> +  return strict_flex_array_level;
> +}
> +
>  #include "gt-c-family-c-common.h"
> --- gcc/c-family/c-ubsan.cc.jj	2023-01-16 11:52:15.899736776 +0100
> +++ gcc/c-family/c-ubsan.cc	2023-02-28 08:15:57.579124095 +0100
> @@ -354,7 +354,7 @@ ubsan_instrument_return (location_t loc)
>     that gets expanded in the sanopt pass, and make an array dimension
>     of it.  ARRAY is the array, *INDEX is an index to the array.
>     Return NULL_TREE if no instrumentation is emitted.
> -   IGNORE_OFF_BY_ONE is true if the ARRAY_REF is inside a ADDR_EXPR.  */
> +   IGNORE_OFF_BY_ONE is true if the ARRAY_REF is inside an ADDR_EXPR.  */
>  
>  tree
>  ubsan_instrument_bounds (location_t loc, tree array, tree *index,
> @@ -363,13 +363,25 @@ ubsan_instrument_bounds (location_t loc,
>    tree type = TREE_TYPE (array);
>    tree domain = TYPE_DOMAIN (type);
>  
> -  if (domain == NULL_TREE || TYPE_MAX_VALUE (domain) == NULL_TREE)
> +  if (domain == NULL_TREE)
>      return NULL_TREE;
>  
>    tree bound = TYPE_MAX_VALUE (domain);
> -  if (ignore_off_by_one)
> -    bound = fold_build2 (PLUS_EXPR, TREE_TYPE (bound), bound,
> -			 build_int_cst (TREE_TYPE (bound), 1));
> +  if (!bound)
> +    {
> +      /* Handle C [0] arrays, which have TYPE_MAX_VALUE NULL, like
> +	 C++ [0] arrays which have TYPE_MIN_VALUE 0 TYPE_MAX_VALUE -1.  */
> +      if (!c_dialect_cxx ()
> +	  && COMPLETE_TYPE_P (type)
> +	  && integer_zerop (TYPE_SIZE (type)))
> +	bound = build_int_cst (TREE_TYPE (TYPE_MIN_VALUE (domain)), -1);
> +      else
> +	return NULL_TREE;
> +    }
> +
> +  bound = fold_build2 (PLUS_EXPR, TREE_TYPE (bound), bound,
> +		       build_int_cst (TREE_TYPE (bound),
> +		       1 + ignore_off_by_one));
>  
>    /* Detect flexible array members and suchlike, unless
>       -fsanitize=bounds-strict.  */
> @@ -392,6 +404,45 @@ ubsan_instrument_bounds (location_t loc,
>  	  if (next)
>  	    /* Not a last element.  Instrument it.  */
>  	    break;
> +	  if (TREE_CODE (TREE_TYPE (TREE_OPERAND (cref, 1))) == ARRAY_TYPE
> +	      && !c_dialect_cxx ())
> +	    {
> +	      unsigned l
> +		= c_strict_flex_array_level_of (TREE_OPERAND (cref, 1));
> +	      tree type2 = TREE_TYPE (TREE_OPERAND (cref, 1));
> +	      if (TYPE_DOMAIN (type2) != NULL_TREE)
> +		{
> +		  tree max = TYPE_MAX_VALUE (TYPE_DOMAIN (type2));
> +		  if (max == NULL_TREE)
> +		    {
> +		      /* C [0] */
> +		      if (COMPLETE_TYPE_P (type2)
> +			  && integer_zerop (TYPE_SIZE (type2))
> +			  && l == 3)
> +			next = TREE_OPERAND (cref, 1);
> +		    }
> +		  else if (TREE_CODE (max) == INTEGER_CST)
> +		    {
> +		      if (c_dialect_cxx ()
> +			  && integer_all_onesp (max))
> +			{
> +			  /* C++ [0] */
> +			  if (l == 3)
> +			    next = TREE_OPERAND (cref, 1);
> +			}
> +		      else if (integer_zerop (max))
> +			{
> +			  /* C/C++ [1] */
> +			  if (l >= 2)
> +			    next = TREE_OPERAND (cref, 1);
> +			}
> +		      else if (l >= 1)
> +			next = TREE_OPERAND (cref, 1);
> +		    }
> +		}
> +	      if (next)
> +		break;
> +	    }
>  	  /* Ok, this is the last field of the structure/union.  But the
>  	     aggregate containing the field must be the last field too,
>  	     recursively.  */
> @@ -413,7 +464,7 @@ ubsan_instrument_bounds (location_t loc,
>    if (idx
>        && TREE_CODE (bound) == INTEGER_CST
>        && tree_int_cst_sgn (idx) >= 0
> -      && tree_int_cst_le (idx, bound))
> +      && tree_int_cst_lt (idx, bound))
>      return NULL_TREE;
>  
>    *index = save_expr (*index);
> --- gcc/c/c-decl.cc.jj	2023-02-18 12:38:30.000000000 +0100
> +++ gcc/c/c-decl.cc	2023-02-27 15:09:57.143375989 +0100
> @@ -9050,35 +9050,6 @@ finish_incomplete_vars (tree incomplete_
>      }
>  }
>  
> -/* Get the LEVEL of the strict_flex_array for the ARRAY_FIELD based on the
> -   values of attribute strict_flex_array and the flag_strict_flex_arrays.  */
> -static unsigned int
> -strict_flex_array_level_of (tree array_field)
> -{
> -  gcc_assert (TREE_CODE (array_field) == FIELD_DECL);
> -  unsigned int strict_flex_array_level = flag_strict_flex_arrays;
> -
> -  tree attr_strict_flex_array
> -    = lookup_attribute ("strict_flex_array", DECL_ATTRIBUTES (array_field));
> -  /* If there is a strict_flex_array attribute attached to the field,
> -     override the flag_strict_flex_arrays.  */
> -  if (attr_strict_flex_array)
> -    {
> -      /* Get the value of the level first from the attribute.  */
> -      unsigned HOST_WIDE_INT attr_strict_flex_array_level = 0;
> -      gcc_assert (TREE_VALUE (attr_strict_flex_array) != NULL_TREE);
> -      attr_strict_flex_array = TREE_VALUE (attr_strict_flex_array);
> -      gcc_assert (TREE_VALUE (attr_strict_flex_array) != NULL_TREE);
> -      attr_strict_flex_array = TREE_VALUE (attr_strict_flex_array);
> -      gcc_assert (tree_fits_uhwi_p (attr_strict_flex_array));
> -      attr_strict_flex_array_level = tree_to_uhwi (attr_strict_flex_array);
> -
> -      /* The attribute has higher priority than flag_struct_flex_array.  */
> -      strict_flex_array_level = attr_strict_flex_array_level;
> -    }
> -  return strict_flex_array_level;
> -}
> -
>  /* Determine whether the FIELD_DECL X is a flexible array member according to
>     the following info:
>    A. whether the FIELD_DECL X is the last field of the DECL_CONTEXT;
> @@ -9105,7 +9076,7 @@ is_flexible_array_member_p (bool is_last
>    bool is_one_element_array = one_element_array_type_p (TREE_TYPE (x));
>    bool is_flexible_array = flexible_array_member_type_p (TREE_TYPE (x));
>  
> -  unsigned int strict_flex_array_level = strict_flex_array_level_of (x);
> +  unsigned int strict_flex_array_level = c_strict_flex_array_level_of (x);
>  
>    switch (strict_flex_array_level)
>      {
> --- gcc/testsuite/gcc.dg/ubsan/bounds-4.c.jj	2023-02-27 17:08:46.193766320 +0100
> +++ gcc/testsuite/gcc.dg/ubsan/bounds-4.c	2023-02-27 17:21:40.489668882 +0100
> @@ -0,0 +1,79 @@
> +/* PR sanitizer/108894 */
> +/* { dg-do run } */
> +/* { dg-options "-fsanitize=bounds -fsanitize-recover=bounds" } */
> +/* { dg-output "index 15 out of bounds for type 'int \\\[15\\\]'\[^\n\r]*(\n|\r\n|\r)" } */
> +/* { dg-output "\[^\n\r]*index 0 out of bounds for type 'int \\\[\\\*\\\]'\[^\n\r]*(\n|\r\n|\r)" } */
> +/* { dg-output "\[^\n\r]*index 16 out of bounds for type 'int \\\[15\\\]'" } */
> +
> +struct A { int a; int b[]; };
> +struct B { int a; int b[0]; };
> +struct C { int a; int b[1]; };
> +struct D { int a; int b[2]; };
> +struct E { int a; int b[42]; };
> +struct F { int a; int b[0]; int c[2]; };
> +struct G { int a; int b[15]; int c[2]; };
> +
> +__attribute__((noipa)) int
> +foo (struct A *a)
> +{
> +  return a->b[14];
> +}
> +
> +__attribute__((noipa)) int
> +bar (struct B *a)
> +{
> +  return a->b[0];
> +}
> +
> +__attribute__((noipa)) int
> +baz (struct C *a)
> +{
> +  return a->b[1];
> +}
> +
> +__attribute__((noipa)) int
> +qux (struct D *a)
> +{
> +  return a->b[2];
> +}
> +
> +__attribute__((noipa)) int
> +corge (struct E *a)
> +{
> +  return a->b[14];
> +}
> +
> +__attribute__((noipa)) int
> +freddy (struct F *a)
> +{
> +  return a->b[0];
> +}
> +
> +__attribute__((noipa)) int
> +garply (struct G *a)
> +{
> +  return a->b[15];
> +}
> +
> +__attribute__((noipa)) int
> +waldo (struct G *a)
> +{
> +  return a->b[16];
> +}
> +
> +int
> +main ()
> +{
> +  union { struct A a; struct B b; struct C c;
> +	  struct D d; struct E e; struct F f; } u;
> +  struct G g;
> +  u.e.a = 42;
> +  __builtin_memset (u.e.b, 0, sizeof (u.e.b));
> +  __builtin_memset (&g, 0, sizeof (g));
> +  int r = garply (&g);
> +  r += foo (&u.a) + bar (&u.b) + baz (&u.c);
> +  r += qux (&u.d) + corge (&u.e) + freddy (&u.f);
> +  r += waldo (&g);
> +  if (r != 0)
> +    __builtin_abort ();
> +}
> --- gcc/testsuite/gcc.dg/ubsan/bounds-4a.c.jj	2023-02-27 17:08:52.797671833 +0100
> +++ gcc/testsuite/gcc.dg/ubsan/bounds-4a.c	2023-02-27 17:25:08.463687314 +0100
> @@ -0,0 +1,8 @@
> +/* PR sanitizer/108894 */
> +/* { dg-do run } */
> +/* { dg-options "-fsanitize=bounds -fsanitize-recover=bounds -fstrict-flex-arrays=0" } */
> +/* { dg-output "index 15 out of bounds for type 'int \\\[15\\\]'\[^\n\r]*(\n|\r\n|\r)" } */
> +/* { dg-output "\[^\n\r]*index 0 out of bounds for type 'int \\\[\\\*\\\]'\[^\n\r]*(\n|\r\n|\r)" } */
> +/* { dg-output "\[^\n\r]*index 16 out of bounds for type 'int \\\[15\\\]'" } */
> +
> +#include "bounds-4.c"
> --- gcc/testsuite/gcc.dg/ubsan/bounds-4b.c.jj	2023-02-27 17:08:52.797671833 +0100
> +++ gcc/testsuite/gcc.dg/ubsan/bounds-4b.c	2023-02-27 17:29:57.042545420 +0100
> @@ -0,0 +1,9 @@
> +/* PR sanitizer/108894 */
> +/* { dg-do run } */
> +/* { dg-options "-fsanitize=bounds -fsanitize-recover=bounds -fstrict-flex-arrays=1" } */
> +/* { dg-output "index 15 out of bounds for type 'int \\\[15\\\]'\[^\n\r]*(\n|\r\n|\r)" } */
> +/* { dg-output "\[^\n\r]*index 2 out of bounds for type 'int \\\[2\\\]'\[^\n\r]*(\n|\r\n|\r)" } */
> +/* { dg-output "\[^\n\r]*index 0 out of bounds for type 'int \\\[\\\*\\\]'\[^\n\r]*(\n|\r\n|\r)" } */
> +/* { dg-output "\[^\n\r]*index 16 out of bounds for type 'int \\\[15\\\]'" } */
> +
> +#include "bounds-4.c"
> --- gcc/testsuite/gcc.dg/ubsan/bounds-4c.c.jj	2023-02-27 17:08:52.797671833 +0100
> +++ gcc/testsuite/gcc.dg/ubsan/bounds-4c.c	2023-02-27 17:34:29.094634537 +0100
> @@ -0,0 +1,10 @@
> +/* PR sanitizer/108894 */
> +/* { dg-do run } */
> +/* { dg-options "-fsanitize=bounds -fsanitize-recover=bounds -fstrict-flex-arrays=2" } */
> +/* { dg-output "index 15 out of bounds for type 'int \\\[15\\\]'\[^\n\r]*(\n|\r\n|\r)" } */
> +/* { dg-output "\[^\n\r]*index 1 out of bounds for type 'int \\\[1\\\]'\[^\n\r]*(\n|\r\n|\r)" } */
> +/* { dg-output "\[^\n\r]*index 2 out of bounds for type 'int \\\[2\\\]'\[^\n\r]*(\n|\r\n|\r)" } */
> +/* { dg-output "\[^\n\r]*index 0 out of bounds for type 'int \\\[\\\*\\\]'\[^\n\r]*(\n|\r\n|\r)" } */
> +/* { dg-output "\[^\n\r]*index 16 out of bounds for type 'int \\\[15\\\]'" } */
> +
> +#include "bounds-4.c"
> --- gcc/testsuite/gcc.dg/ubsan/bounds-4d.c.jj	2023-02-27 17:08:52.797671833 +0100
> +++ gcc/testsuite/gcc.dg/ubsan/bounds-4d.c	2023-02-27 17:35:46.556520992 +0100
> @@ -0,0 +1,11 @@
> +/* PR sanitizer/108894 */
> +/* { dg-do run } */
> +/* { dg-options "-fsanitize=bounds -fsanitize-recover=bounds -fstrict-flex-arrays=3" } */
> +/* { dg-output "index 15 out of bounds for type 'int \\\[15\\\]'\[^\n\r]*(\n|\r\n|\r)" } */
> +/* { dg-output "\[^\n\r]*index 0 out of bounds for type 'int \\\[\\\*\\\]'\[^\n\r]*(\n|\r\n|\r)" } */
> +/* { dg-output "\[^\n\r]*index 1 out of bounds for type 'int \\\[1\\\]'\[^\n\r]*(\n|\r\n|\r)" } */
> +/* { dg-output "\[^\n\r]*index 2 out of bounds for type 'int \\\[2\\\]'\[^\n\r]*(\n|\r\n|\r)" } */
> +/* { dg-output "\[^\n\r]*index 0 out of bounds for type 'int \\\[\\\*\\\]'\[^\n\r]*(\n|\r\n|\r)" } */
> +/* { dg-output "\[^\n\r]*index 16 out of bounds for type 'int \\\[15\\\]'" } */
> +
> +#include "bounds-4.c"
> --- gcc/testsuite/g++.dg/ubsan/bounds-1.C.jj	2023-02-27 17:37:03.076420982 +0100
> +++ gcc/testsuite/g++.dg/ubsan/bounds-1.C	2023-02-27 17:38:23.717261726 +0100
> @@ -0,0 +1,8 @@
> +// PR sanitizer/108894
> +// { dg-do run }
> +// { dg-options "-fsanitize=bounds -fsanitize-recover=bounds" }
> +// { dg-output "index 15 out of bounds for type 'int \\\[15\\\]'\[^\n\r]*(\n|\r\n|\r)" }
> +// { dg-output "\[^\n\r]*index 0 out of bounds for type 'int \\\[\[0-9x]*\\\]'\[^\n\r]*(\n|\r\n|\r)" }
> +// { dg-output "\[^\n\r]*index 16 out of bounds for type 'int \\\[15\\\]'" }
> +
> +#include "../../gcc.dg/ubsan/bounds-4.c"
> 
> 	Jakub
> 
> 

-- 
Richard Biener <rguenther@suse.de>
SUSE Software Solutions Germany GmbH, Frankenstrasse 146, 90461 Nuernberg,
Germany; GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman;
HRB 36809 (AG Nuernberg)

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] ubsan: Honor -fstrict-flex-arrays= in -fsanitize=bounds [PR108894]
  2023-02-28  9:02 ` Richard Biener
@ 2023-02-28  9:11   ` Jakub Jelinek
  0 siblings, 0 replies; 9+ messages in thread
From: Jakub Jelinek @ 2023-02-28  9:11 UTC (permalink / raw)
  To: Richard Biener; +Cc: gcc-patches

On Tue, Feb 28, 2023 at 09:02:47AM +0000, Richard Biener wrote:
> > While this isn't really a regression, the -fstrict-flex-arrays*
> > option is new in GCC 13 and so I think we should make -fsanitize=bounds
> > play with it well from the beginning.
> > 
> > The current behavior is that -fsanitize=bounds considers all trailing
> > arrays as flexible member-like arrays and both -fsanitize=bounds and
> > -fsanitize=bounds-strict because of a bug don't even instrument
> > [0] arrays at all, not as trailing nor when followed by other members.
> > 
> > I think -fstrict-flex-arrays* options can be considered as language
> > mode changing options, by default flexible member-like arrays are
> > handled like flexible arrays, but that option can change the set of
> > the arrays which are treated like that.  So, -fsanitize=bounds should
> > change with that on what is considered acceptable and what isn't.
> > While -fsanitize=bounds-strict should reject them all always to
> > continue previous behavior.
> > 
> > The following patch implements that.  To support [0] array instrumentation,
> > I had to change the meaning of the bounds argument to .UBSAN_BOUNDS,
> > previously it was the TYPE_MAX_VALUE of the domain unless ignore_off_by_one
> > (used for taking address of the array element rather than accessing it;
> > in that case 1 is added to the bound argument) and the later lowered checks
> > were if (index > bound) report_failure ().
> > The problem with that is that for [0] arrays where (at least for C++)
> > the max value is all ones, for accesses that condition will be never true;
> > for addresses of elements it was working (in C++) correctly before.
> > This patch changes it to add 1 + ignore_off_by_one, so -1 becomes 0 or
> > 1 for &array_ref and changing the lowering to be if (index >= bound)
> > report_failure ().  Furthermore, as C represents the [0] arrays with
> > NULL TYPE_MAX_VALUE, I treated those like the C++ ones.
> 
> LGTM.  Btw, what does -fsanitize=bounds do for C++ code which lacks
> flexible arrays?  Does it treat all trailing arrays as fixed?

As -fstrict-flex-arrays* options and strict_flex_array attribute are
basically ignored right now for C++, I've kept the previous behavior
for C++ (except for fixing handling of [0] arrays), which is that it like
the rest of the compiler treats all trailing arrays as flexible member like
(ok, there is a loop looking for nested references, so
struct S { int a; struct T { int b; int c[1]; } d; int e; };
that the e member results in c not being treated flexible member-like).
If/when -fstrict-flex-arrays* support is added for C++, we'll need to
drop one of the !c_dialect_cxx () guards there even in c-ubsan.cc.

	Jakub


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] ubsan: Honor -fstrict-flex-arrays= in -fsanitize=bounds [PR108894]
  2023-02-28  8:26 [PATCH] ubsan: Honor -fstrict-flex-arrays= in -fsanitize=bounds [PR108894] Jakub Jelinek
  2023-02-28  9:02 ` Richard Biener
@ 2023-02-28 16:13 ` Qing Zhao
  2023-02-28 17:49   ` Jakub Jelinek
  1 sibling, 1 reply; 9+ messages in thread
From: Qing Zhao @ 2023-02-28 16:13 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: Richard Biener, gcc-patches

Hi, Jakub,

Thanks a lot for fixing this issue.

I have several questions in below:

> On Feb 28, 2023, at 3:26 AM, Jakub Jelinek via Gcc-patches <gcc-patches@gcc.gnu.org> wrote:
> I think -fstrict-flex-arrays* options can be considered as language
> mode changing options, by default flexible member-like arrays are
> handled like flexible arrays, but that option can change the set of
> the arrays which are treated like that.  So, -fsanitize=bounds should
> change with that on what is considered acceptable and what isn't.
> While -fsanitize=bounds-strict should reject them all always to
> continue previous behavior.


As my understanding, without your current patch, the current -fsanitize=bounds-strict behaves like -fstrict-flex-arrays=2, i.e:
it treats:
   [], [0] as flexible array members;
but
   [1], [4] as regular arrays

Then with your current patch, [0] will NOT be treated as flexible array members by default anymore, so, the -fsanitize=bounds-strict will
treats:
   [] as flexible array members;
but
   [0], [1], [4] as regular arrays
The same behavior as -fstrict-flex-arrays=3.

Therefore, -fsanitize=bounds-strict already implies -fstrict-flex-arrays=3. 

Is the above understanding correctly?

> 
> The following patch implements that.  To support [0] array instrumentation,
> I had to change the meaning of the bounds argument to .UBSAN_BOUNDS,
> previously it was the TYPE_MAX_VALUE of the domain unless ignore_off_by_one
> (used for taking address of the array element rather than accessing it;
> in that case 1 is added to the bound argument) and the later lowered checks
> were if (index > bound) report_failure ().
> The problem with that is that for [0] arrays where (at least for C++)
> the max value is all ones, for accesses that condition will be never true;
> for addresses of elements it was working (in C++) correctly before.
> This patch changes it to add 1 + ignore_off_by_one, so -1 becomes 0 or
> 1 for &array_ref and changing the lowering to be if (index >= bound)
> report_failure ().  Furthermore, as C represents the [0] arrays with
> NULL TYPE_MAX_VALUE, I treated those like the C++ ones.

For [0] arrays, why C++ and C represent with different IR? 

thanks.

Qing
> 


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] ubsan: Honor -fstrict-flex-arrays= in -fsanitize=bounds [PR108894]
  2023-02-28 16:13 ` Qing Zhao
@ 2023-02-28 17:49   ` Jakub Jelinek
  2023-02-28 19:19     ` Qing Zhao
  0 siblings, 1 reply; 9+ messages in thread
From: Jakub Jelinek @ 2023-02-28 17:49 UTC (permalink / raw)
  To: Qing Zhao; +Cc: Richard Biener, gcc-patches

On Tue, Feb 28, 2023 at 04:13:28PM +0000, Qing Zhao wrote:
> > On Feb 28, 2023, at 3:26 AM, Jakub Jelinek via Gcc-patches <gcc-patches@gcc.gnu.org> wrote:
> > I think -fstrict-flex-arrays* options can be considered as language
> > mode changing options, by default flexible member-like arrays are
> > handled like flexible arrays, but that option can change the set of
> > the arrays which are treated like that.  So, -fsanitize=bounds should
> > change with that on what is considered acceptable and what isn't.
> > While -fsanitize=bounds-strict should reject them all always to
> > continue previous behavior.
> 
> 
> As my understanding, without your current patch, the current -fsanitize=bounds-strict behaves like -fstrict-flex-arrays=2, i.e:
> it treats:
>    [], [0] as flexible array members;
> but
>    [1], [4] as regular arrays

Yes, but not because it would be an intention, but because of a bug
it actually never instrumented [0] arrays.  Well, it would complain about
struct S { int a; int b[0]; int c; } s;
... &s.b[1] ...
for C++, but not for C.

> Then with your current patch, [0] will NOT be treated as flexible array members by default anymore, so, the -fsanitize=bounds-strict will
> treats:
>    [] as flexible array members;
> but
>    [0], [1], [4] as regular arrays
> The same behavior as -fstrict-flex-arrays=3.
> 
> Therefore, -fsanitize=bounds-strict already implies -fstrict-flex-arrays=3. 

No.  -fsanitize=bounds-strict doesn't imply anything for
flag_strict_flex_arrays, it for the bounds sanitization decisions
behaves as if -fstrict-flex-arrays=3.

> For [0] arrays, why C++ and C represent with different IR? 

I think it is a historic difference that could take quite a big amount of
work to get rid of (and the question is what is better), and even after that
work there would be still big chances of regressions.

	Jakub


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] ubsan: Honor -fstrict-flex-arrays= in -fsanitize=bounds [PR108894]
  2023-02-28 17:49   ` Jakub Jelinek
@ 2023-02-28 19:19     ` Qing Zhao
  2023-02-28 21:59       ` Jakub Jelinek
  0 siblings, 1 reply; 9+ messages in thread
From: Qing Zhao @ 2023-02-28 19:19 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: Richard Biener, gcc-patches



> On Feb 28, 2023, at 12:49 PM, Jakub Jelinek <jakub@redhat.com> wrote:
> 
> On Tue, Feb 28, 2023 at 04:13:28PM +0000, Qing Zhao wrote:
>>> On Feb 28, 2023, at 3:26 AM, Jakub Jelinek via Gcc-patches <gcc-patches@gcc.gnu.org> wrote:
>>> I think -fstrict-flex-arrays* options can be considered as language
>>> mode changing options, by default flexible member-like arrays are
>>> handled like flexible arrays, but that option can change the set of
>>> the arrays which are treated like that.  So, -fsanitize=bounds should
>>> change with that on what is considered acceptable and what isn't.
>>> While -fsanitize=bounds-strict should reject them all always to
>>> continue previous behavior.
>> 
>> 
>> As my understanding, without your current patch, the current -fsanitize=bounds-strict behaves like -fstrict-flex-arrays=2, i.e:
>> it treats:
>>   [], [0] as flexible array members;
>> but
>>   [1], [4] as regular arrays
> 
> Yes, but not because it would be an intention, but because of a bug
> it actually never instrumented [0] arrays.  
Understood.  
So, your patch fixed this bug, and then [0] arrays are instrumented by default with this patch.

> Well, it would complain about
> struct S { int a; int b[0]; int c; } s;
> ... &s.b[1] ...
> for C++, but not for C.

A little confused here: [0] arrays were instrumented by default for C++ if it’s not a trailing array, but not for C?

> 
>> Then with your current patch, [0] will NOT be treated as flexible array members by default anymore, so, the -fsanitize=bounds-strict will
>> treats:
>>   [] as flexible array members;
>> but
>>   [0], [1], [4] as regular arrays
>> The same behavior as -fstrict-flex-arrays=3.
>> 
>> Therefore, -fsanitize=bounds-strict already implies -fstrict-flex-arrays=3. 
> 
> No.  -fsanitize=bounds-strict doesn't imply anything for
> flag_strict_flex_arrays, it for the bounds sanitization decisions
> behaves as if -fstrict-flex-arrays=3.

Yes, that was what I meant. -:)
> 
>> For [0] arrays, why C++ and C represent with different IR? 
> 
> I think it is a historic difference that could take quite a big amount of
> work to get rid of (and the question is what is better), and even after that
> work there would be still big chances of regressions.

Okay, I see… (this is really a confusion situation…) but anyway…

Thanks. 

Qing
> 
> 	Jakub
> 


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] ubsan: Honor -fstrict-flex-arrays= in -fsanitize=bounds [PR108894]
  2023-02-28 19:19     ` Qing Zhao
@ 2023-02-28 21:59       ` Jakub Jelinek
  2023-03-01  9:58         ` [committed] ubsan: Add another testcase for [0] array in the middle of struct [PR108894] Jakub Jelinek
  2023-03-01 16:30         ` [PATCH] ubsan: Honor -fstrict-flex-arrays= in -fsanitize=bounds [PR108894] Qing Zhao
  0 siblings, 2 replies; 9+ messages in thread
From: Jakub Jelinek @ 2023-02-28 21:59 UTC (permalink / raw)
  To: Qing Zhao; +Cc: Richard Biener, gcc-patches

On Tue, Feb 28, 2023 at 07:19:40PM +0000, Qing Zhao wrote:
> Understood.  
> So, your patch fixed this bug, and then [0] arrays are instrumented by default with this patch.
> 
> > Well, it would complain about
> > struct S { int a; int b[0]; int c; } s;
> > ... &s.b[1] ...
> > for C++, but not for C.
> 
> A little confused here: [0] arrays were instrumented by default for C++ if it’s not a trailing array, but not for C?

Given say
struct S { int a; int b[0]; int c; } s;

int
main ()
{
  int *volatile p = &s.b[0];
  p = &s.b[1];
  int volatile q = s.b[0];
}
both -fsanitize=bounds and -fsanitize=bounds-strict behaved the same way,
in C nothing was reported, in C++ the p = &s.b[1]; statement.
The reasons for s.b[0] not being reported in C++ was that for
!ignore_off_by_one, bounds was ~(size_t)0, and so index > ~(size_t)0
is always false.  While with the committed patch it is
index >= (~(size_t)0)+1 and so always true.  And in C additionally, we
punted early because TYPE_MAX_VALUE (domain) was NULL.

	Jakub


^ permalink raw reply	[flat|nested] 9+ messages in thread

* [committed] ubsan: Add another testcase for [0] array in the middle of struct [PR108894]
  2023-02-28 21:59       ` Jakub Jelinek
@ 2023-03-01  9:58         ` Jakub Jelinek
  2023-03-01 16:30         ` [PATCH] ubsan: Honor -fstrict-flex-arrays= in -fsanitize=bounds [PR108894] Qing Zhao
  1 sibling, 0 replies; 9+ messages in thread
From: Jakub Jelinek @ 2023-03-01  9:58 UTC (permalink / raw)
  To: Qing Zhao, Richard Biener, gcc-patches

Hi!

On Tue, Feb 28, 2023 at 10:59:03PM +0100, Jakub Jelinek via Gcc-patches wrote:
> On Tue, Feb 28, 2023 at 07:19:40PM +0000, Qing Zhao wrote:
> > Understood.  
> > So, your patch fixed this bug, and then [0] arrays are instrumented by default with this patch.
> > 
> > > Well, it would complain about
> > > struct S { int a; int b[0]; int c; } s;
> > > ... &s.b[1] ...
> > > for C++, but not for C.
> > 
> > A little confused here: [0] arrays were instrumented by default for C++ if it’s not a trailing array, but not for C?
> 
> Given say
> struct S { int a; int b[0]; int c; } s;
> 
> int
> main ()
> {
>   int *volatile p = &s.b[0];
>   p = &s.b[1];
>   int volatile q = s.b[0];
> }

And, when I wrote such a testcase, I thought it would be worth it to have it
in the testsuite too.
Tested on x86_64-linux -m32/-m64, committed to trunk as obvious:

2023-03-01  Jakub Jelinek  <jakub@redhat.com>

	PR sanitizer/108894
	* c-c++-common/ubsan/bounds-16.c: New test.

--- gcc/testsuite/c-c++-common/ubsan/bounds-16.c.jj	2023-03-01 10:35:29.751959193 +0100
+++ gcc/testsuite/c-c++-common/ubsan/bounds-16.c	2023-03-01 10:38:09.087645556 +0100
@@ -0,0 +1,15 @@
+/* PR sanitizer/108894 */
+/* { dg-do run } */
+/* { dg-options "-fsanitize=bounds -fsanitize-recover=bounds" } */
+/* { dg-output "index 1 out of bounds for type 'int \\\[\[*0-9x]*\\\]'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]*index 0 out of bounds for type 'int \\\[\[*0-9x]*\\\]'" } */
+
+struct S { int a; int b[0]; int c; } s;
+
+int
+main ()
+{
+  int *volatile p = &s.b[0];
+  p = &s.b[1];
+  int volatile q = s.b[0];
+}


	Jakub


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] ubsan: Honor -fstrict-flex-arrays= in -fsanitize=bounds [PR108894]
  2023-02-28 21:59       ` Jakub Jelinek
  2023-03-01  9:58         ` [committed] ubsan: Add another testcase for [0] array in the middle of struct [PR108894] Jakub Jelinek
@ 2023-03-01 16:30         ` Qing Zhao
  1 sibling, 0 replies; 9+ messages in thread
From: Qing Zhao @ 2023-03-01 16:30 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: gcc-patches



> On Feb 28, 2023, at 4:59 PM, Jakub Jelinek <jakub@redhat.com> wrote:
> 
> On Tue, Feb 28, 2023 at 07:19:40PM +0000, Qing Zhao wrote:
>> Understood.  
>> So, your patch fixed this bug, and then [0] arrays are instrumented by default with this patch.
>> 
>>> Well, it would complain about
>>> struct S { int a; int b[0]; int c; } s;
>>> ... &s.b[1] ...
>>> for C++, but not for C.
>> 
>> A little confused here: [0] arrays were instrumented by default for C++ if it’s not a trailing array, but not for C?
> 
> Given say
> struct S { int a; int b[0]; int c; } s;
> 
> int
> main ()
> {
>  int *volatile p = &s.b[0];
>  p = &s.b[1];
>  int volatile q = s.b[0];
> }
> both -fsanitize=bounds and -fsanitize=bounds-strict behaved the same way,
> in C nothing was reported, in C++ the p = &s.b[1]; statement.
> The reasons for s.b[0] not being reported in C++ was that for
> !ignore_off_by_one, bounds was ~(size_t)0, and so index > ~(size_t)0
> is always false.  While with the committed patch it is
> index >= (~(size_t)0)+1 and so always true.  And in C additionally, we
> punted early because TYPE_MAX_VALUE (domain) was NULL.

Thanks for the explanation.

With your patch, both C and C++ will report for the middle [0] arrays. That’s nice.

Qing
> 
> 	Jakub
> 


^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2023-03-01 16:30 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-02-28  8:26 [PATCH] ubsan: Honor -fstrict-flex-arrays= in -fsanitize=bounds [PR108894] Jakub Jelinek
2023-02-28  9:02 ` Richard Biener
2023-02-28  9:11   ` Jakub Jelinek
2023-02-28 16:13 ` Qing Zhao
2023-02-28 17:49   ` Jakub Jelinek
2023-02-28 19:19     ` Qing Zhao
2023-02-28 21:59       ` Jakub Jelinek
2023-03-01  9:58         ` [committed] ubsan: Add another testcase for [0] array in the middle of struct [PR108894] Jakub Jelinek
2023-03-01 16:30         ` [PATCH] ubsan: Honor -fstrict-flex-arrays= in -fsanitize=bounds [PR108894] Qing Zhao

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).