public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* [PATCH] enhance -Warray-bounds to detect out-of-bounds offsets (PR 82455)
@ 2017-10-29 16:15 Martin Sebor
  2017-10-30 11:55 ` Richard Biener
  2017-10-30 22:16 ` Jeff Law
  0 siblings, 2 replies; 22+ messages in thread
From: Martin Sebor @ 2017-10-29 16:15 UTC (permalink / raw)
  To: Gcc Patch List, Jeff Law, Richard Biener

[-- Attachment #1: Type: text/plain, Size: 1471 bytes --]

In my work on -Wrestrict, to issue meaningful warnings, I found
it important to detect both out of bounds array indices as well
as offsets in calls to restrict-qualified functions like strcpy.
GCC already detects some of these cases but my tests for
the enhanced warning exposed a few gaps.

The attached patch enhances -Warray-bounds to detect more instances
out-of-bounds indices and offsets to member arrays and non-array
members.  For example, it detects the out-of-bounds offset in the
call to strcpy below.

The patch is meant to be applied on top posted here but not yet
committed:
   https://gcc.gnu.org/ml/gcc-patches/2017-10/msg01304.html

Richard, since this also touches tree-vrp.c I look for your comments.

Jeff, this is the enhancement you were interested in when we spoke
last week.

Thanks
Martin

$ cat a.c && gcc -O2 -S -Wall a.c
   struct A { char a[4]; void (*pf)(void); };

   void f (struct A *p)
   {
     p->a[5] = 'x';            // existing -Warray-bounds

     strcpy (p->a + 6, "y");   // enhanced -Warray-bounds
   }

   a.c: In function ‘f’:
   a.c:7:3: warning: offset 6 is out of bounds of ‘char[4]’ [-Warray-bounds]
    strcpy (p->a + 6, "y");
    ^~~~~~~~~~~~~~~~~~~~~~
   a.c:1:17: note: member declared here
    struct A { char a[4]; void (*pf)(void); };
                    ^
   a.c:5:7: warning: array subscript 5 is above array bounds of 
‘char[4]’ [-Warray-bounds]
      p->a[5] = 'x';
      ~~~~^~~

[-- Attachment #2: gcc-82455.diff --]
[-- Type: text/x-patch, Size: 28853 bytes --]

PR tree-optimization/82455 - missing -Warray-bounds on strcpy offset in an out-of-bounds range

gcc/ChangeLog:

	PR tree-optimization/82455
	* tree-ssa-forwprop.c (maybe_warn_offset_out_of_bounds): New function.
	(forward_propagate_addr_expr_1): Call it.
	* tree-vrp.c (check_array_ref): Return bool.  Handle flexible array
	members, string literals, and inner indices.
	(search_for_addr_array): Return bool.  Add an argument.  Add detail
	to diagnostics.
	(check_array_bounds): New function.
	* tree-vrp.h (check_array_bounds): Declare it.
	* wide-int.h (sign_mask): Assert a precondition.

gcc/testsuite/ChangeLog:

	PR tree-optimization/82455
	* gcc.dg/Warray-bounds-23.c: New test.
	* gcc.dg/Warray-bounds-24.c: New test.

diff --git a/gcc/tree-ssa-forwprop.c b/gcc/tree-ssa-forwprop.c
index 5569b98..364a3a2 100644
--- a/gcc/tree-ssa-forwprop.c
+++ b/gcc/tree-ssa-forwprop.c
@@ -46,6 +46,8 @@ along with GCC; see the file COPYING3.  If not see
 #include "tree-cfgcleanup.h"
 #include "cfganal.h"
 #include "optabs-tree.h"
+#include "diagnostic-core.h"
+#include "tree-vrp.h"
 
 /* This pass propagates the RHS of assignment statements into use
    sites of the LHS of the assignment.  It's basically a specialized
@@ -641,6 +643,210 @@ tidy_after_forward_propagate_addr (gimple *stmt)
      recompute_tree_invariant_for_addr_expr (gimple_assign_rhs1 (stmt));
 }
 
+/* Check to see if the expression (char*)PTR + OFF is valid (i.e., it
+   doesn't overflow and the result is within the bounds of the object
+   pointed to by PTR.  OFF is an offset in bytes.  If it isn't valid,
+   issue a -Warray-bounds warning.  */
+
+static bool
+maybe_warn_offset_out_of_bounds (location_t loc, tree ptr, tree off)
+{
+  offset_int cstoff = 0;
+
+  bool ignore_off_by_1 = false;
+
+  if (TREE_CODE (ptr) == SSA_NAME)
+    {
+      gimple *stmt = SSA_NAME_DEF_STMT (ptr);
+      if (is_gimple_assign (stmt))
+	{
+	  tree_code code = gimple_assign_rhs_code (stmt);
+	  if (code == ADDR_EXPR)
+	    {
+	      ptr = gimple_assign_rhs1 (stmt);
+	      ignore_off_by_1 = true;
+	    }
+	  else if (code == POINTER_PLUS_EXPR)
+	    {
+	      ptr = gimple_assign_rhs1 (stmt);
+	      tree offset = gimple_assign_rhs2 (stmt);
+	      if (maybe_warn_offset_out_of_bounds (loc, ptr, offset))
+		return true;
+
+	      /* To do: handle ranges.  */
+	      if (TREE_CODE (offset) != INTEGER_CST)
+		return false;
+
+	      /* Convert the offset from sizetype to ptrdiff_t.  */
+	      offset = fold_convert (ptrdiff_type_node, offset);
+	      cstoff = wi::to_offset (offset);
+	    }
+	}
+    }
+
+  if (TREE_CODE (ptr) != ADDR_EXPR
+      || TREE_CODE (TREE_OPERAND (ptr, 0)) != ARRAY_REF)
+    ignore_off_by_1 = false;
+
+  if (check_array_bounds (ptr, ignore_off_by_1))
+    return true;
+
+  bool addr_expr = false;
+
+  if (TREE_CODE (ptr) == ADDR_EXPR)
+    {
+      ptr = TREE_OPERAND (ptr, 0);
+      addr_expr = true;
+    }
+
+  /* The type of the array to whichj the offset applies and the type
+     of the array element.  For offsets to non-arrays (i.e., pointers)
+     the array is considered to have a single element of ELTYPE.   */
+  tree atype = TREE_TYPE (ptr);
+  tree eltype = atype;
+
+  if (TREE_CODE (ptr) == ARRAY_REF)
+    {
+      tree idx = TREE_OPERAND (ptr, 1);
+
+      /* To do: handle ranges.  */
+      if (TREE_CODE (idx) != INTEGER_CST)
+	return false;
+
+      ptr = TREE_OPERAND (ptr, 0);
+      eltype = TREE_TYPE (TREE_TYPE (ptr));
+      tree eltsize = TYPE_SIZE_UNIT (eltype);
+
+      idx = fold_convert (ptrdiff_type_node, idx);
+
+      if (TREE_CODE (eltsize) == INTEGER_CST)
+	cstoff += wi::to_offset (idx) * wi::to_offset (eltsize);
+      else
+	cstoff += wi::to_offset (idx);
+
+      atype = TREE_TYPE (ptr);
+      if (TREE_CODE (atype) == POINTER_TYPE)
+	atype = TREE_TYPE (atype);
+    }
+  else if (TREE_CODE (ptr) == MEM_REF)
+    addr_expr = false;
+
+  bool zero_low_bound;
+  tree ubound = NULL_TREE;
+
+  if (TREE_CODE (atype) == ARRAY_TYPE)
+    {
+      eltype = TREE_TYPE (atype);
+      zero_low_bound = true;
+
+      if (warn_array_bounds > 1
+	  || !array_at_struct_end_p (ptr))
+	{
+	  /* Determine the upper bound of the array and the element type
+	     if it's known.  Otherwise, use the maximum object size to
+	     compute the most conservative upper bound.  */
+	  if (tree dom = TYPE_DOMAIN (atype))
+	    ubound = TYPE_MAX_VALUE (dom);
+	}
+    }
+  else if (addr_expr)
+    {
+      /* Treat a singleton (non-array) variable as an array of a single
+	 element.  */
+      eltype = atype;
+      atype = build_array_type_nelts (eltype, 1);
+      ubound = size_zero_node;
+      zero_low_bound = true;
+    }
+  else
+    {
+      eltype = atype;
+      zero_low_bound = false;
+    }
+
+  offset_int elsize;
+  if (TREE_CODE (TYPE_SIZE_UNIT (eltype)) == INTEGER_CST)
+    elsize = wi::to_offset (TYPE_SIZE_UNIT (eltype));
+  else
+    elsize = 1;
+
+  /* Compute the maximum and minimum valid byte offsets. */
+  offset_int maxoff = wi::to_offset (TYPE_MAX_VALUE (ptrdiff_type_node));
+  offset_int minoff = wi::to_offset (TYPE_MIN_VALUE (ptrdiff_type_node));
+
+  if (TREE_CODE (off) == SSA_NAME)
+    {
+      wide_int min, max;
+      value_range_type rng = get_range_info (off, &min, &max);
+      if (rng == VR_VARYING)
+	return false;
+
+      if (rng == VR_RANGE)
+	{
+	  if (wi::lts_p (min, wi::zero (min.get_precision ())))
+	    min = max;
+	}
+      else
+	min = max + 1;
+
+      off = wide_int_to_tree (TREE_TYPE (off), min);
+    }
+
+  /* Convert the unsigned byte offset into ptrdiff_t.  */
+  off = fold_convert (ptrdiff_type_node, off);
+  offset_int wioff = wi::to_offset (off) + cstoff;
+
+  /* Determine the byte offset of the (member) reference and subtract
+     it from the upper bound on the offset.  The lower bound stays
+     the same because its value doesn't contribute to the result.  */
+  HOST_WIDE_INT byteoff;
+  tree base = get_addr_base_and_unit_offset (ptr, &byteoff);
+  if (base)
+    maxoff -= byteoff;
+
+  bool warned = false;
+
+  if (wi::lts_p (maxoff, wioff) || wi::lts_p (wioff, minoff))
+    {
+      warned = warning_at (loc, OPT_Warray_bounds,
+			   "pointer overflow computing offset %lli from %qT",
+			   (long long) wioff.to_shwi (), atype);
+    }
+  else
+    {
+      if (ubound && TREE_CODE (ubound) == INTEGER_CST)
+	maxoff = (wi::to_offset (ubound) + 1) * elsize;
+
+      if (zero_low_bound)
+	minoff = 0;
+
+      if (wi::lts_p (maxoff, wioff) || wi::lts_p (wioff, minoff))
+	warned = warning_at (loc, OPT_Warray_bounds,
+			     "offset %lli is out of bounds of %qT",
+			     (long long) wioff.to_shwi (), atype);
+    }
+
+  if (warned)
+    {
+      if (base && TREE_CODE (base) == SSA_NAME)
+	{
+	  gimple *stmt = SSA_NAME_DEF_STMT (ptr);
+	  if (is_gimple_assign (stmt)
+	      && gimple_assign_rhs_code (stmt) == ADDR_EXPR)
+	    ptr = TREE_OPERAND (gimple_assign_rhs1 (stmt), 0);
+	}
+
+      if (TREE_CODE (ptr) == COMPONENT_REF)
+	{
+	  ptr = TREE_OPERAND (ptr, 1);
+	  inform (DECL_SOURCE_LOCATION (ptr), "member declared here");
+	}
+      return true;
+    }
+
+  return false;
+}
+
 /* NAME is a SSA_NAME representing DEF_RHS which is of the form
    ADDR_EXPR <whatever>.
 
@@ -717,10 +923,13 @@ forward_propagate_addr_expr_1 (tree name, tree def_rhs,
          to make sure we can build a valid constant offsetted address
 	 for further propagation.  Simply rely on fold building that
 	 and check after the fact.  */
+      tree off = gimple_assign_rhs2 (use_stmt);
+      maybe_warn_offset_out_of_bounds (gimple_location (use_stmt),
+				       rhs, off);
+
       new_def_rhs = fold_build2 (MEM_REF, TREE_TYPE (TREE_TYPE (rhs)),
 				 def_rhs,
-				 fold_convert (ptr_type_node,
-					       gimple_assign_rhs2 (use_stmt)));
+				 fold_convert (ptr_type_node, off));
       if (TREE_CODE (new_def_rhs) == MEM_REF
 	  && !is_gimple_mem_ref_addr (TREE_OPERAND (new_def_rhs, 0)))
 	return false;
diff --git a/gcc/tree-vrp.c b/gcc/tree-vrp.c
index 3e93685..8097ea5 100644
--- a/gcc/tree-vrp.c
+++ b/gcc/tree-vrp.c
@@ -6664,7 +6664,7 @@ insert_range_assertions (void)
    non-overlapping with valid range.
    IGNORE_OFF_BY_ONE is true if the ARRAY_REF is inside a ADDR_EXPR.  */
 
-static void
+static bool
 check_array_ref (location_t location, tree ref, bool ignore_off_by_one)
 {
   value_range *vr = NULL;
@@ -6672,7 +6672,7 @@ check_array_ref (location_t location, tree ref, bool ignore_off_by_one)
   tree low_bound, up_bound, up_bound_p1;
 
   if (TREE_NO_WARNING (ref))
-    return;
+    return false;
 
   low_sub = up_sub = TREE_OPERAND (ref, 1);
   up_bound = array_ref_up_bound (ref);
@@ -6691,7 +6691,7 @@ check_array_ref (location_t location, tree ref, bool ignore_off_by_one)
 
       /* FIXME: Handle VLAs.  */
       if (TREE_CODE (eltsize) != INTEGER_CST)
-	return;
+	return false;
 
       tree maxbound = TYPE_MAX_VALUE (ptrdiff_type_node);
 
@@ -6702,11 +6702,12 @@ check_array_ref (location_t location, tree ref, bool ignore_off_by_one)
       if (code == COMPONENT_REF)
 	{
 	  HOST_WIDE_INT off;
-	  if (tree base = get_addr_base_and_unit_offset (arg, &off))
+	  if (get_addr_base_and_unit_offset (arg, &off))
 	    {
-	      tree size = TYPE_SIZE_UNIT (TREE_TYPE (base));
-	      if (TREE_CODE (size) == INTEGER_CST)
-		up_bound_p1 = int_const_binop (MINUS_EXPR, up_bound_p1, size);
+	      off /= tree_to_uhwi (eltsize);
+	      up_bound_p1 = int_const_binop (MINUS_EXPR, up_bound_p1,
+					     build_int_cst (ptrdiff_type_node,
+							    off));
 	    }
 	}
 
@@ -6719,21 +6720,28 @@ check_array_ref (location_t location, tree ref, bool ignore_off_by_one)
 
   low_bound = array_ref_low_bound (ref);
 
+  if (location == UNKNOWN_LOCATION
+      && EXPR_HAS_LOCATION (ref))
+    location = EXPR_LOCATION (ref);
+
+  bool warned = false;
   tree artype = TREE_TYPE (TREE_OPERAND (ref, 0));
 
   /* Empty array.  */
   if (tree_int_cst_equal (low_bound, up_bound_p1))
     {
-      warning_at (location, OPT_Warray_bounds,
-		  "array subscript %E is above array bounds of %qT",
-		  low_bound, artype);
-      TREE_NO_WARNING (ref) = 1;
+      warned = warning_at (location, OPT_Warray_bounds,
+			   "array subscript %E is above array bounds of %qT",
+			   low_bound, artype);
+      if (warned)
+	TREE_NO_WARNING (ref) = 1;
+      return warned;
     }
 
   if (TREE_CODE (low_sub) == SSA_NAME)
     {
       vr = get_value_range (low_sub);
-      if (vr->type == VR_RANGE || vr->type == VR_ANTI_RANGE)
+      if (vr && (vr->type == VR_RANGE || vr->type == VR_ANTI_RANGE))
         {
           low_sub = vr->type == VR_RANGE ? vr->max : vr->min;
           up_sub = vr->type == VR_RANGE ? vr->min : vr->max;
@@ -6748,13 +6756,11 @@ check_array_ref (location_t location, tree ref, bool ignore_off_by_one)
 	      : tree_int_cst_le (up_bound, up_sub))
           && TREE_CODE (low_sub) == INTEGER_CST
           && tree_int_cst_le (low_sub, low_bound))
-        {
-          warning_at (location, OPT_Warray_bounds,
-		      "array subscript [%E, %E] is outside array bounds of %qT",
-		      low_sub, up_sub, artype);
-          TREE_NO_WARNING (ref) = 1;
-        }
-    }
+	warned = warning_at (location, OPT_Warray_bounds,
+			     "array subscript [%E, %E] is outside array "
+			     "bounds of %qT",
+			     low_sub, up_sub, artype);
+	}
   else if (TREE_CODE (up_sub) == INTEGER_CST
 	   && (ignore_off_by_one
 	       ? !tree_int_cst_le (up_sub, up_bound_p1)
@@ -6766,10 +6772,9 @@ check_array_ref (location_t location, tree ref, bool ignore_off_by_one)
 	  dump_generic_expr (MSG_NOTE, TDF_SLIM, ref);
 	  fprintf (dump_file, "\n");
 	}
-      warning_at (location, OPT_Warray_bounds,
-		  "array subscript %E is above array bounds of %qT",
-		  up_sub, artype);
-      TREE_NO_WARNING (ref) = 1;
+      warned = warning_at (location, OPT_Warray_bounds,
+			   "array subscript %E is above array bounds of %qT",
+			   up_sub, artype);
     }
   else if (TREE_CODE (low_sub) == INTEGER_CST
            && tree_int_cst_lt (low_sub, low_bound))
@@ -6780,24 +6785,31 @@ check_array_ref (location_t location, tree ref, bool ignore_off_by_one)
 	  dump_generic_expr (MSG_NOTE, TDF_SLIM, ref);
 	  fprintf (dump_file, "\n");
 	}
-      warning_at (location, OPT_Warray_bounds,
-		  "array subscript %E is below array bounds of %qT",
-		  low_sub, artype);
-      TREE_NO_WARNING (ref) = 1;
+      warned = warning_at (location, OPT_Warray_bounds,
+			   "array subscript %E is below array bounds of %qT",
+			   low_sub, artype);
     }
+
+  if (warned)
+    TREE_NO_WARNING (ref) = 1;
+
+  return warned;
 }
 
-/* Searches if the expr T, located at LOCATION computes
-   address of an ARRAY_REF, and call check_array_ref on it.  */
+/* Search if the expr T, located at LOCATION, compute the address
+   of an ARRAY_REF, and call check_array_ref on it.
+   IGNORE_OFF_BY_ONE is true if the ARRAY_REF is inside a ADDR_EXPR.  */
 
-static void
-search_for_addr_array (tree t, location_t location)
+static bool
+search_for_addr_array (tree t, location_t location, bool ignore_off_by_one)
 {
+  bool warned = false;
+
   /* Check each ARRAY_REFs in the reference chain. */
   do
     {
       if (TREE_CODE (t) == ARRAY_REF)
-	check_array_ref (location, t, true /*ignore_off_by_one*/);
+	warned |= check_array_ref (location, t, ignore_off_by_one);
 
       t = TREE_OPERAND (t, 0);
     }
@@ -6813,7 +6825,7 @@ search_for_addr_array (tree t, location_t location)
       if (TREE_CODE (TREE_TYPE (tem)) != ARRAY_TYPE
 	  || TREE_CODE (TREE_TYPE (TREE_TYPE (tem))) == ARRAY_TYPE
 	  || !TYPE_DOMAIN (TREE_TYPE (tem)))
-	return;
+	return warned;
 
       low_bound = TYPE_MIN_VALUE (TYPE_DOMAIN (TREE_TYPE (tem)));
       up_bound = TYPE_MAX_VALUE (TYPE_DOMAIN (TREE_TYPE (tem)));
@@ -6824,7 +6836,7 @@ search_for_addr_array (tree t, location_t location)
 	  || TREE_CODE (up_bound) != INTEGER_CST
 	  || !el_sz
 	  || TREE_CODE (el_sz) != INTEGER_CST)
-	return;
+	return warned;
 
       idx = mem_ref_offset (t);
       idx = wi::sdiv_trunc (idx, wi::to_offset (el_sz));
@@ -6836,10 +6848,10 @@ search_for_addr_array (tree t, location_t location)
 	      dump_generic_expr (MSG_NOTE, TDF_SLIM, t);
 	      fprintf (dump_file, "\n");
 	    }
-	  warning_at (location, OPT_Warray_bounds,
-		      "array subscript %wi is below array bounds of %qT",
-		      idx.to_shwi (), TREE_TYPE (tem));
-	  TREE_NO_WARNING (t) = 1;
+	  warned |= warning_at (location, OPT_Warray_bounds,
+				"array subscript %wi is below array bounds "
+				"of %qT",
+				idx.to_shwi (), TREE_TYPE (tem));
 	}
       else if (idx > (wi::to_offset (up_bound)
 		      - wi::to_offset (low_bound) + 1))
@@ -6850,12 +6862,17 @@ search_for_addr_array (tree t, location_t location)
 	      dump_generic_expr (MSG_NOTE, TDF_SLIM, t);
 	      fprintf (dump_file, "\n");
 	    }
-	  warning_at (location, OPT_Warray_bounds,
-		      "array subscript %wu is above array bounds of %qT",
-		      idx.to_uhwi (), TREE_TYPE (tem));
-	  TREE_NO_WARNING (t) = 1;
+	  warned = warning_at (location, OPT_Warray_bounds,
+			       "array subscript %wu is above array bounds "
+			       "of %qT",
+			       idx.to_uhwi (), TREE_TYPE (tem));
 	}
     }
+
+  if (warned)
+    TREE_NO_WARNING (t) = 1;
+
+  return warned;
 }
 
 /* walk_tree() callback that checks if *TP is
@@ -6874,7 +6891,10 @@ check_array_bounds (tree *tp, int *walk_subtree, void *data)
   if (EXPR_HAS_LOCATION (t))
     location = EXPR_LOCATION (t);
   else
-    location = gimple_location (wi->stmt);
+    {
+      location_t *locp = (location_t *) wi->info;
+      location = *locp;
+    }
 
   *walk_subtree = TRUE;
 
@@ -6883,13 +6903,31 @@ check_array_bounds (tree *tp, int *walk_subtree, void *data)
 
   else if (TREE_CODE (t) == ADDR_EXPR)
     {
-      search_for_addr_array (t, location);
+      search_for_addr_array (t, location, true /*ignore_off_by_one*/);
       *walk_subtree = FALSE;
     }
 
   return NULL_TREE;
 }
 
+/* Check array indices in EXPR against the bounds of arrays they refer
+   to and issue warnings for any that are outside those bounds.  When
+   IGNORE_OFF_BY_ONE is true, ignore indices that refer to one past
+   the last element of an array (used in &A[N] when N equals A's upper
+   bound.  Return true if any warnings have been issued.  */
+
+bool
+check_array_bounds (tree expr, bool ignore_off_by_one)
+{
+  if (TREE_CODE (expr) == ARRAY_REF)
+    return check_array_ref (UNKNOWN_LOCATION, expr, ignore_off_by_one);
+
+  if (TREE_CODE (expr) == ADDR_EXPR)
+    return search_for_addr_array (expr, UNKNOWN_LOCATION, ignore_off_by_one);
+
+  return false;
+}
+
 /* Walk over all statements of all reachable BBs and call check_array_bounds
    on them.  */
 
@@ -6921,6 +6959,9 @@ check_all_array_refs (void)
 
 	  memset (&wi, 0, sizeof (wi));
 
+	  location_t loc = gimple_location (stmt);
+	  wi.info = &loc;
+
 	  walk_gimple_op (gsi_stmt (si),
 			  check_array_bounds,
 			  &wi);
diff --git a/gcc/tree-vrp.h b/gcc/tree-vrp.h
index f84403a..6587b57 100644
--- a/gcc/tree-vrp.h
+++ b/gcc/tree-vrp.h
@@ -59,5 +59,6 @@ extern void extract_range_from_unary_expr (value_range *vr,
 					   tree type,
 					   value_range *vr0_,
 					   tree op0_type);
+extern bool check_array_bounds (tree, bool);
 
 #endif /* GCC_TREE_VRP_H */
diff --git a/gcc/wide-int.h b/gcc/wide-int.h
index e17b016..7f431db 100644
--- a/gcc/wide-int.h
+++ b/gcc/wide-int.h
@@ -822,6 +822,8 @@ inline HOST_WIDE_INT
 generic_wide_int <storage>::sign_mask () const
 {
   unsigned int len = this->get_len ();
+  gcc_assert (len != 0);
+
   unsigned HOST_WIDE_INT high = this->get_val ()[len - 1];
   if (!is_sign_extended)
     {
diff --git a/gcc/testsuite/gcc.dg/Warray-bounds-23.c b/gcc/testsuite/gcc.dg/Warray-bounds-23.c
new file mode 100644
index 0000000..85071b0
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/Warray-bounds-23.c
@@ -0,0 +1,16 @@
+/* PR tree-optimization/82588 - missing -Warray-bounds on an excessively
+   large index
+   { dg-do compile }
+   { dg-options "-O2 -Warray-bounds -ftrack-macro-expansion=0" }  */
+
+#define SIZE_MAX  __SIZE_MAX__
+#define SSIZE_MAX __PTRDIFF_MAX__
+#define SSIZE_MIN (-SSIZE_MAX - 1)
+
+
+int g (int i)
+{
+  int (*p)[2] = (int (*)[2])&i;
+
+  return (*p)[2];
+}
diff --git a/gcc/testsuite/gcc.dg/Warray-bounds-24.c b/gcc/testsuite/gcc.dg/Warray-bounds-24.c
new file mode 100644
index 0000000..3ab06ee
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/Warray-bounds-24.c
@@ -0,0 +1,246 @@
+/* PR tree-optimization/82581 -	missing -Warray-bounds on writing past
+   the end of a member array
+   { dg-do compile }
+   { dg-options "-O2 -Warray-bounds -ftrack-macro-expansion=0" } */
+
+#define SMAX  __PTRDIFF_MAX__
+#define SMIN  (-SMAX -1)
+#define UMAX  __SIZE_MAX__
+
+#define STATIC_ASSERT(expr) ((void)sizeof (char[1 - 2 * !!(expr)]))
+
+typedef __PTRDIFF_TYPE__ ptrdiff_t;
+typedef __SIZE_TYPE__    size_t;
+
+struct A { char a3[3], b4[4]; };
+struct Ax { char a4[4]; char ax[]; };
+
+void sink (void*);
+#define T(expr)   sink (expr)
+
+
+void test_char_memarray_1 (struct A *a, struct Ax *ax)
+{
+  T (a->a3 + SMIN);           /* { dg-warning "offset -\[0-9\]+ is out of bounds of .char\\\[3]." } */
+  T (a->a3 - SMAX);           /* { dg-warning "offset -\[0-9\]+ is out of bounds" } */
+  T (a->a3 - 1);              /* { dg-warning "offset -1 is out of bounds" } */
+  T (a->a3 + 0);
+  T (a->a3 + 1);
+  T (a->a3 + 2);
+  T (a->a3 + 3);
+  T (a->a3 + 4);              /* { dg-warning "offset 4 is out of bounds" } */
+  T (a->a3 + SMAX);           /* { dg-warning "offset \[0-9\]+ is out of bounds" } */
+  T (a->a3 + SMAX + (size_t)1);   /* { dg-warning "offset -\[0-9\]+ is out of bounds" } */
+
+  T (a->b4 + SMIN);           /* { dg-warning "offset -\[0-9\]+ is out of bounds of .char\\\[4]." } */
+  T (a->b4 - SMAX);           /* { dg-warning "offset -\[0-9\]+ is out of bounds" } */
+  T (a->b4 - 1);              /* { dg-warning "offset -1 is out of bounds" } */
+  T (a->b4 + 0);
+  T (a->b4 + 1);
+  T (a->b4 + 2);
+  T (a->b4 + 3);
+  T (a->b4 + 4);
+  /* This is considered okay for an array at the end of a struct.  */
+  T (a->b4 + 5);
+  T (a->b4 + 99);
+  T (a->b4 + SMAX - 4);
+  T (a->b4 + SMAX);               /* { dg-warning "pointer overflow" } */
+  T (a->b4 + SMAX + (size_t)1);   /* { dg-warning "offset -\[0-9\]+ is out of bounds " } */
+
+  T (ax->a4 + SMIN);          /* { dg-warning "offset -\[0-9\]+ is out of bounds of .char\\\[4]." } */
+  T (ax->a4 - SMAX);          /* { dg-warning "offset -\[0-9\]+ is out of bounds" } */
+  T (ax->a4 - 1);             /* { dg-warning "offset -1 is out of bounds" } */
+  T (ax->a4 + 0);
+  T (ax->a4 + 1);
+  T (ax->a4 + 2);
+  T (ax->a4 + 3);
+  T (ax->a4 + 4);
+  T (ax->a4 + 5);             /* { dg-warning "offset 5 is out of bounds" } */
+  T (ax->a4 + SMAX);          /* { dg-warning "offset \[0-9\]+ is out of bounds" } */
+
+  T (ax->ax + SMIN);          /* { dg-warning "offset -\[0-9\]+ is out of bounds." } */
+  T (ax->ax - SMAX);          /* { dg-warning "offset -\[0-9\]+ is out of bounds" } */
+  T (ax->ax - 1);             /* { dg-warning "offset -1 is out of bounds" } */
+  T (ax->ax + 0);
+  T (ax->ax + 1);
+  T (ax->ax + 2);
+  T (ax->ax + 3);
+  T (ax->ax + 4);
+  T (ax->ax + 99);
+  T (ax->ax + SMAX - 4);
+  T (ax->ax + SMAX - 3);      /* { dg-warning "pointer overflow computing offset \[0-9\]+ from .char\\\[]." } */
+  T (ax->ax + SMAX);          /* { dg-warning "pointer overflow computing offset \[0-9\]+ from .char\\\[]." } */
+ }
+
+struct B { struct A a[2]; };
+struct Bx { struct A a[2]; struct A ax[]; };
+
+void test_memarray_2 (struct B *b, struct Bx *bx)
+{
+  T (b[0].a + 0);
+  T (b[0].a + 1);
+  T (b[0].a + 2);
+  /* The following are okay for an array at the end of a struct.  */
+  T (b[0].a + 3);
+  T (b[0].a + 99);
+  T (b[0].a + SMAX / sizeof b[0].a[0]);
+  /* The offset computation in following overflows too early to detect
+     it (with the current implementation) but because the offset gets
+     negative it is detected and diagnosed.  */
+  T (b[0].a + SMAX / sizeof b[0].a[0] + 1);   /* { dg-warning "offset -?\[0-9\]+ is out of bounds of .struct A\\\[2]." } */
+  T (b[0].a + SMAX / sizeof b[0].a[0] + 2);   /* { dg-warning "offset -?\[0-9\]+ is out of bounds of .struct A\\\[2]." } */
+  T (b[0].a + SMAX / sizeof b[0].a[0] + 3);   /* { dg-warning "offset -?\[0-9\]+ is out of bounds of .struct A\\\[2]." } */
+
+  T (b[0].a[0].b4 + 0);
+  T (b[0].a[0].b4 + 1);
+  T (b[0].a[0].b4 + 2);
+  T (b[0].a[0].b4 + 3);
+  T (b[0].a[0].b4 + 5);   /* { dg-warning "offset 5 is out of bounds of .char\\\[4]." } */
+
+  T (b[0].a[1].b4 + 0);
+  T (b[0].a[1].b4 + 1);
+  T (b[0].a[1].b4 + 2);
+  T (b[0].a[1].b4 + 3);
+  T (b[0].a[1].b4 + 5);   /* { dg-warning "offset 5 is out of bounds of .char\\\[4]." } */
+  T (b[0].a[1].b4 + 6);   /* { dg-warning "offset 6 is out of bounds" } */
+  T (b[0].a[1].b4 + 7);   /* { dg-warning "offset 7 is out of bounds" } */
+  T (b[0].a[1].b4 + 8);   /* { dg-warning "offset 8 is out of bounds" } */
+  T (b[0].a[1].b4 + 9);   /* { dg-warning "offset 9 is out of bounds" } */
+  T (b[0].a[1].b4 + 10);  /* { dg-warning "offset 10 is out of bounds" } */
+  T (b[0].a[1].b4 + 11);  /* { dg-warning "offset 11 is out of bounds" } */
+  T (b[0].a[1].b4 + 12);  /* { dg-warning "offset 12 is out of bounds" } */
+  T (b[0].a[1].b4 + 13);  /* { dg-warning "offset 13 is out of bounds" } */
+
+  T (b[1].a[0].b4 + 0);
+  T (b[1].a[0].b4 + 1);
+  T (b[1].a[0].b4 + 2);
+  T (b[1].a[0].b4 + 3);
+  T (b[1].a[0].b4 + 5);   /* { dg-warning "offset 5 is out of bounds" } */
+
+  T (b[1].a[1].b4 + 0);
+  T (b[1].a[1].b4 + 1);
+  T (b[1].a[1].b4 + 2);
+  T (b[1].a[1].b4 + 3);
+  T (b[1].a[1].b4 + 5);   /* { dg-warning "offset 5 is out of bounds" } */
+
+  STATIC_ASSERT (sizeof bx[0].a == 7);
+  T (bx[0].a + 0);
+  T (bx[0].a + 1);
+  T (bx[0].a + 2);
+  T (bx[0].a + 3);        /* { dg-warning "offset 21 is out of bounds of .struct A\\\[2]." } */
+  T (bx[0].a + 99);       /* { dg-warning "offset 693 is out of bounds of .struct A\\\[2]." } */
+  T (bx[0].a + SMAX);     /* { dg-warning "offset -?\[0-9\]+ is out of bounds of .struct A\\\[2]." } */
+
+  T (bx[0].a[0].b4);
+  T (bx[0].a[0].b4 + 1);
+  T (bx[0].a[0].b4 + 2);
+  T (bx[0].a[0].b4 + 3);
+  T (bx[0].a[0].b4 + 99);   /* { dg-warning "offset 99 is out of bounds of .char\\\[4]." } */
+  T (bx[0].a[0].b4 + SMAX); /* { dg-warning "pointer overflow" } */
+
+  T (bx[0].a[1].b4);
+  T (bx[0].a[1].b4 + 1);
+  T (bx[0].a[1].b4 + 2);
+  T (bx[0].a[1].b4 + 3);
+  T (bx[0].a[1].b4 + 99);   /* { dg-warning "offset 99 is out of bounds of .char\\\[4]." } */
+  T (bx[0].a[1].b4 + SMAX); /* { dg-warning "pointer overflow computing offset -?\[0-9\]+ from .char\\\[4]." } */
+
+  T (bx[0].a[-1].b4);        /* { dg-warning "array subscript -1 is below array bounds" } */
+  T (bx[0].a[2].b4);
+  T (bx[0].a[2].b4 + 1);    /* { dg-warning "array subscript 2 is above array bounds" } */
+  T (bx[0].a[2].b4 + 2);    /* { dg-warning "array subscript 2 is above array bounds" } */
+  T (bx[0].a[2].b4 + 3);    /* { dg-warning "array subscript 2 is above array bounds" } */
+  T (bx[0].a[2].b4 + 99);   /* { dg-warning "array subscript 2 is above array bounds" } */
+  T (bx[0].a[2].b4 + SMAX); /* { dg-warning "array subscript 2 is above array bounds" } */
+
+  T (bx[0].ax[0].b4 + 0);
+  T (bx[0].ax[0].b4 + 1);
+  T (bx[0].ax[0].b4 + 2);
+  T (bx[0].ax[0].b4 + 3);
+  T (bx[0].ax[0].b4 + 5);   /* { dg-warning "offset 5 is out of bounds of .char\\\[4]." } */
+
+  T (bx[-99].ax[0].b4 + 0);
+  T (bx[ -1].ax[0].b4 + 0);
+  T (bx[  0].ax[0].b4 + 1);
+  T (bx[  1].ax[0].b4 + 2);
+  T (bx[  2].ax[0].b4 + 3);
+  T (bx[ 99].ax[0].b4 + 5);   /* { dg-warning "offset 5 is out of bounds of .char\\\[4]." } */
+
+  T (bx[-99].ax[-9].b4 + 0);  /* { dg-warning "array subscript -9 is below array bounds" } */
+  T (bx[-99].ax[-1].b4 + 0);  /* { dg-warning "array subscript -1 is below array bounds" } */
+  T (bx[-99].ax[11].b4 + 0);
+  T (bx[ -1].ax[22].b4 + 0);
+  T (bx[  0].ax[33].b4 + 1);
+  T (bx[  1].ax[44].b4 + 2);
+  T (bx[  2].ax[55].b4 + 3);
+  T (bx[ 99].ax[99].b4 + 5);   /* { dg-warning "offset 5 is out of bounds of .char\\\[4]." } */
+}
+
+struct Cx { struct B b2_3[2][3]; struct B bx_3[][3]; };
+
+void test_memarray_3 (struct Cx *cx)
+{
+  STATIC_ASSERT (sizeof cx->b2_3 == sizeof (struct B[3]));
+  STATIC_ASSERT (sizeof cx->b2_3 == 42);
+
+  T (cx->b2_3 - 1);             /* { dg-warning "offset -42 is out of bounds of .struct B\\\[2]\\\[3]." } */
+  T (cx->b2_3 + 0);
+  T (cx->b2_3 + 1);
+  T (cx->b2_3 + 2);
+  T (cx->b2_3 + 3);             /* { dg-warning "offset 126 is out of bounds of .struct B\\\[2]\\\[3]." } */
+}
+
+struct Memarray
+{
+  struct __attribute__ ((packed))
+  S2 { char a, b; }
+    s, *ps, a2[2], ax[];
+};
+
+void test_int_memarray (struct Memarray *p)
+{
+  T (&p->s);
+  T (&p->s + 0);
+  T (&p->s + 1);
+  T (&p->s - 1);                /* { dg-warning "offset -2 is out of bounds of .struct S2\\\[1]." } */
+  T (&p->s + 2);                /* { dg-warning "offset 4 is out of bounds of .struct S2\\\[1]." } */
+
+  T (&(&p->s)[0]);
+  T (&(&p->s)[1]);
+  T (&(&p->s)[-1]);             /* { dg-warning "offset -2 is out of bounds of .struct S2\\\[1]." } */
+  T (&(&p->s)[2]);              /* { dg-warning "offset 4 is out of bounds of .struct S2\\\[1]." } */
+
+  T (p->ps - 99);
+  T (p->ps -  1);
+  T (p->ps);
+  T (p->ps +  0);
+  T (p->ps +  1);
+  T (p->ps + 99);
+
+  T (&p->ps - 1);                 /* { dg-warning "offset -\[48\] is out of bounds of .struct S2\*." } */
+  T (&p->ps);
+  T (&p->ps + 1);
+  T (&p->ps + 2);                 /* { dg-warning "offset \(8|16\) is out of bounds of .struct S2\*." } */
+
+  T (&p->a2);
+  T (&p->a2 + 0);
+  T (&p->a2 + 1);
+  T (&p->a2 - 1);                /* { dg-warning "offset -4 is out of bounds of .struct S2\\\[2]." } */
+  T (&p->a2 + 2);                /* { dg-warning "offset 8 is out of bounds of .struct S2\\\[2]." } */
+  T (&p->a2 + 3);                /* { dg-warning "offset 12 is out of bounds of .struct S2\\\[2]." } */
+
+  T (&p->a2[0] + 0);
+  T (&p->a2[0] + 1);
+  T (&p->a2[0] + 2);
+  T (&p->a2[0] + 3);             /* { dg-warning "offset 6 is out of bounds of .struct S2\\\[2]." } */
+
+  T (&p->a2[1] + 0);
+  T (&p->a2[1] + 1);
+  T (&p->a2[1] + 2);             /* { dg-warning "offset 6 is out of bounds of .struct S2\\\[2]." } */
+  T (&p->a2[1] + 3);             /* { dg-warning "offset 8 is out of bounds of .struct S2\\\[2]." } */
+
+  T (&p->a2[2] + 0);
+  T (&p->a2[2] + 1);             /* { dg-warning "offset 6 is out of bounds of .struct S2\\\[2]." } */
+  T (&p->a2[2] + 2);             /* { dg-warning "offset 8 is out of bounds of .struct S2\\\[2]." } */
+  T (&p->a2[2] + 3);             /* { dg-warning "offset 10 is out of bounds of .struct S2\\\[2]." } */
+}

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH] enhance -Warray-bounds to detect out-of-bounds offsets (PR 82455)
  2017-10-29 16:15 [PATCH] enhance -Warray-bounds to detect out-of-bounds offsets (PR 82455) Martin Sebor
@ 2017-10-30 11:55 ` Richard Biener
  2017-10-30 15:21   ` Martin Sebor
  2017-10-30 22:16 ` Jeff Law
  1 sibling, 1 reply; 22+ messages in thread
From: Richard Biener @ 2017-10-30 11:55 UTC (permalink / raw)
  To: Martin Sebor; +Cc: Gcc Patch List, Jeff Law

[-- Attachment #1: Type: text/plain, Size: 2648 bytes --]

On Sun, 29 Oct 2017, Martin Sebor wrote:

> In my work on -Wrestrict, to issue meaningful warnings, I found
> it important to detect both out of bounds array indices as well
> as offsets in calls to restrict-qualified functions like strcpy.
> GCC already detects some of these cases but my tests for
> the enhanced warning exposed a few gaps.
> 
> The attached patch enhances -Warray-bounds to detect more instances
> out-of-bounds indices and offsets to member arrays and non-array
> members.  For example, it detects the out-of-bounds offset in the
> call to strcpy below.
> 
> The patch is meant to be applied on top posted here but not yet
> committed:
>   https://gcc.gnu.org/ml/gcc-patches/2017-10/msg01304.html
> 
> Richard, since this also touches tree-vrp.c I look for your comments.

You fail to tell what you are changing and why - I have to reverse
engineer this from the patch which a) isn't easy in this case, b) feels
like a waste of time.  Esp. since the patch does many things.

My first question is why do you add a warning from forwprop?  It
_feels_ like you're trying to warn about arbitrary out-of-bound
addresses at the point they are folded to MEM_REFs.  And it looks
like you're warning about pointer arithmetic like &p->a + 6.
That doesn't look correct to me.  Pointer arithmetic in GIMPLE
is not restricted to operate within fields that are appearantly
accessed here - the only restriction is with respect to the
whole underlying pointed-to-object.

By doing the warning from forwprop you'll run into all such cases
introduced by GCC itself during quite late optimization passes.

You're trying to re-do __builtin_object_size even when that wasn't
used.

So it looks like you're on the wrong track.  Yes,

 strcpy (p->a + 6, "y");

_may_ be "invalid" C (I'm not even sure about that!) but it
is certainly not invalid GIMPLE.

Richard.

> Jeff, this is the enhancement you were interested in when we spoke
> last week.
> 
> Thanks
> Martin
> 
> $ cat a.c && gcc -O2 -S -Wall a.c
>   struct A { char a[4]; void (*pf)(void); };
> 
>   void f (struct A *p)
>   {
>     p->a[5] = 'x';            // existing -Warray-bounds
> 
>     strcpy (p->a + 6, "y");   // enhanced -Warray-bounds
>   }
> 
>   a.c: In function ‘f’:
>   a.c:7:3: warning: offset 6 is out of bounds of ‘char[4]’ [-Warray-bounds]
>    strcpy (p->a + 6, "y");
>    ^~~~~~~~~~~~~~~~~~~~~~
>   a.c:1:17: note: member declared here
>    struct A { char a[4]; void (*pf)(void); };
>                    ^
>   a.c:5:7: warning: array subscript 5 is above array bounds of ‘char[4]’
> [-Warray-bounds]
>      p->a[5] = 'x';
>      ~~~~^~~

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH] enhance -Warray-bounds to detect out-of-bounds offsets (PR 82455)
  2017-10-30 11:55 ` Richard Biener
@ 2017-10-30 15:21   ` Martin Sebor
  2017-10-30 19:59     ` Richard Biener
  2017-10-30 22:16     ` Jeff Law
  0 siblings, 2 replies; 22+ messages in thread
From: Martin Sebor @ 2017-10-30 15:21 UTC (permalink / raw)
  To: Richard Biener; +Cc: Gcc Patch List, Jeff Law

On 10/30/2017 05:45 AM, Richard Biener wrote:
> On Sun, 29 Oct 2017, Martin Sebor wrote:
> 
>> In my work on -Wrestrict, to issue meaningful warnings, I found
>> it important to detect both out of bounds array indices as well
>> as offsets in calls to restrict-qualified functions like strcpy.
>> GCC already detects some of these cases but my tests for
>> the enhanced warning exposed a few gaps.
>>
>> The attached patch enhances -Warray-bounds to detect more instances
>> out-of-bounds indices and offsets to member arrays and non-array
>> members.  For example, it detects the out-of-bounds offset in the
>> call to strcpy below.
>>
>> The patch is meant to be applied on top posted here but not yet
>> committed:
>>    https://gcc.gnu.org/ml/gcc-patches/2017-10/msg01304.html
>>
>> Richard, since this also touches tree-vrp.c I look for your comments.
> 
> You fail to tell what you are changing and why - I have to reverse
> engineer this from the patch which a) isn't easy in this case, b) feels
> like a waste of time.  Esp. since the patch does many things.
> 
> My first question is why do you add a warning from forwprop?  It
> _feels_ like you're trying to warn about arbitrary out-of-bound
> addresses at the point they are folded to MEM_REFs.  And it looks
> like you're warning about pointer arithmetic like &p->a + 6.
> That doesn't look correct to me.  Pointer arithmetic in GIMPLE
> is not restricted to operate within fields that are appearantly
> accessed here - the only restriction is with respect to the
> whole underlying pointed-to-object.
> 
> By doing the warning from forwprop you'll run into all such cases
> introduced by GCC itself during quite late optimization passes.

I haven't run into any such cases.  What would a more appropriate
place to detect out-of-bounds offsets?  I'm having a hard time
distinguishing what is appropriate and what isn't.  For instance,
if it's okay to detect some out of bounds offsets/indices in vrp
why is it wrong to do a better job of it in forwprop?

> 
> You're trying to re-do __builtin_object_size even when that wasn't
> used.

That's not the quite my intent, although it is close.

> 
> So it looks like you're on the wrong track.  Yes,
> 
>   strcpy (p->a + 6, "y");
> 
> _may_ be "invalid" C (I'm not even sure about that!) but it
> is certainly not invalid GIMPLE.

Adding (or subtracting) an integer to/from a pointer to an array
is defined in both C and C++ only if the result points to an element
of the array or just past the last element of the array.  Otherwise
it's undefined. (A non-array object is considered an array of one
for this purpose.)

Martin

> 
> Richard.
> 
>> Jeff, this is the enhancement you were interested in when we spoke
>> last week.
>>
>> Thanks
>> Martin
>>
>> $ cat a.c && gcc -O2 -S -Wall a.c
>>    struct A { char a[4]; void (*pf)(void); };
>>
>>    void f (struct A *p)
>>    {
>>      p->a[5] = 'x';            // existing -Warray-bounds
>>
>>      strcpy (p->a + 6, "y");   // enhanced -Warray-bounds
>>    }
>>
>>    a.c: In function ‘f’:
>>    a.c:7:3: warning: offset 6 is out of bounds of ‘char[4]’ [-Warray-bounds]
>>     strcpy (p->a + 6, "y");
>>     ^~~~~~~~~~~~~~~~~~~~~~
>>    a.c:1:17: note: member declared here
>>     struct A { char a[4]; void (*pf)(void); };
>>                     ^
>>    a.c:5:7: warning: array subscript 5 is above array bounds of ‘char[4]’
>> [-Warray-bounds]
>>       p->a[5] = 'x';
>>       ~~~~^~~

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH] enhance -Warray-bounds to detect out-of-bounds offsets (PR 82455)
  2017-10-30 15:21   ` Martin Sebor
@ 2017-10-30 19:59     ` Richard Biener
  2017-10-30 20:40       ` Martin Sebor
  2017-10-30 22:16     ` Jeff Law
  1 sibling, 1 reply; 22+ messages in thread
From: Richard Biener @ 2017-10-30 19:59 UTC (permalink / raw)
  To: Martin Sebor; +Cc: Gcc Patch List, Jeff Law

On October 30, 2017 4:19:25 PM GMT+01:00, Martin Sebor <msebor@gmail.com> wrote:
>On 10/30/2017 05:45 AM, Richard Biener wrote:
>> On Sun, 29 Oct 2017, Martin Sebor wrote:
>> 
>>> In my work on -Wrestrict, to issue meaningful warnings, I found
>>> it important to detect both out of bounds array indices as well
>>> as offsets in calls to restrict-qualified functions like strcpy.
>>> GCC already detects some of these cases but my tests for
>>> the enhanced warning exposed a few gaps.
>>>
>>> The attached patch enhances -Warray-bounds to detect more instances
>>> out-of-bounds indices and offsets to member arrays and non-array
>>> members.  For example, it detects the out-of-bounds offset in the
>>> call to strcpy below.
>>>
>>> The patch is meant to be applied on top posted here but not yet
>>> committed:
>>>    https://gcc.gnu.org/ml/gcc-patches/2017-10/msg01304.html
>>>
>>> Richard, since this also touches tree-vrp.c I look for your
>comments.
>> 
>> You fail to tell what you are changing and why - I have to reverse
>> engineer this from the patch which a) isn't easy in this case, b)
>feels
>> like a waste of time.  Esp. since the patch does many things.
>> 
>> My first question is why do you add a warning from forwprop?  It
>> _feels_ like you're trying to warn about arbitrary out-of-bound
>> addresses at the point they are folded to MEM_REFs.  And it looks
>> like you're warning about pointer arithmetic like &p->a + 6.
>> That doesn't look correct to me.  Pointer arithmetic in GIMPLE
>> is not restricted to operate within fields that are appearantly
>> accessed here - the only restriction is with respect to the
>> whole underlying pointed-to-object.
>> 
>> By doing the warning from forwprop you'll run into all such cases
>> introduced by GCC itself during quite late optimization passes.
>
>I haven't run into any such cases.  What would a more appropriate
>place to detect out-of-bounds offsets?  I'm having a hard time
>distinguishing what is appropriate and what isn't.  For instance,
>if it's okay to detect some out of bounds offsets/indices in vrp
>why is it wrong to do a better job of it in forwprop?
>
>> 
>> You're trying to re-do __builtin_object_size even when that wasn't
>> used.
>
>That's not the quite my intent, although it is close.
>
>> 
>> So it looks like you're on the wrong track.  Yes,
>> 
>>   strcpy (p->a + 6, "y");
>> 
>> _may_ be "invalid" C (I'm not even sure about that!) but it
>> is certainly not invalid GIMPLE.
>
>Adding (or subtracting) an integer to/from a pointer to an array
>is defined in both C and C++ only if the result points to an element
>of the array or just past the last element of the array.  Otherwise
>it's undefined. (A non-array object is considered an array of one
>for this purpose.)

On GIMPLE this is indistinguishable from (short *) (p->a) + 3.

GIMPLE is neither C nor C++. 

Richard. 

>
>Martin
>
>> 
>> Richard.
>> 
>>> Jeff, this is the enhancement you were interested in when we spoke
>>> last week.
>>>
>>> Thanks
>>> Martin
>>>
>>> $ cat a.c && gcc -O2 -S -Wall a.c
>>>    struct A { char a[4]; void (*pf)(void); };
>>>
>>>    void f (struct A *p)
>>>    {
>>>      p->a[5] = 'x';            // existing -Warray-bounds
>>>
>>>      strcpy (p->a + 6, "y");   // enhanced -Warray-bounds
>>>    }
>>>
>>>    a.c: In function ‘f’:
>>>    a.c:7:3: warning: offset 6 is out of bounds of ‘char[4]’
>[-Warray-bounds]
>>>     strcpy (p->a + 6, "y");
>>>     ^~~~~~~~~~~~~~~~~~~~~~
>>>    a.c:1:17: note: member declared here
>>>     struct A { char a[4]; void (*pf)(void); };
>>>                     ^
>>>    a.c:5:7: warning: array subscript 5 is above array bounds of
>‘char[4]’
>>> [-Warray-bounds]
>>>       p->a[5] = 'x';
>>>       ~~~~^~~

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH] enhance -Warray-bounds to detect out-of-bounds offsets (PR 82455)
  2017-10-30 19:59     ` Richard Biener
@ 2017-10-30 20:40       ` Martin Sebor
  2017-10-30 21:23         ` Richard Biener
  0 siblings, 1 reply; 22+ messages in thread
From: Martin Sebor @ 2017-10-30 20:40 UTC (permalink / raw)
  To: Richard Biener; +Cc: Gcc Patch List, Jeff Law

On 10/30/2017 01:53 PM, Richard Biener wrote:
> On October 30, 2017 4:19:25 PM GMT+01:00, Martin Sebor <msebor@gmail.com> wrote:
>> On 10/30/2017 05:45 AM, Richard Biener wrote:
>>> On Sun, 29 Oct 2017, Martin Sebor wrote:
>>>
>>>> In my work on -Wrestrict, to issue meaningful warnings, I found
>>>> it important to detect both out of bounds array indices as well
>>>> as offsets in calls to restrict-qualified functions like strcpy.
>>>> GCC already detects some of these cases but my tests for
>>>> the enhanced warning exposed a few gaps.
>>>>
>>>> The attached patch enhances -Warray-bounds to detect more instances
>>>> out-of-bounds indices and offsets to member arrays and non-array
>>>> members.  For example, it detects the out-of-bounds offset in the
>>>> call to strcpy below.
>>>>
>>>> The patch is meant to be applied on top posted here but not yet
>>>> committed:
>>>>     https://gcc.gnu.org/ml/gcc-patches/2017-10/msg01304.html
>>>>
>>>> Richard, since this also touches tree-vrp.c I look for your
>> comments.
>>>
>>> You fail to tell what you are changing and why - I have to reverse
>>> engineer this from the patch which a) isn't easy in this case, b)
>> feels
>>> like a waste of time.  Esp. since the patch does many things.
>>>
>>> My first question is why do you add a warning from forwprop?  It
>>> _feels_ like you're trying to warn about arbitrary out-of-bound
>>> addresses at the point they are folded to MEM_REFs.  And it looks
>>> like you're warning about pointer arithmetic like &p->a + 6.
>>> That doesn't look correct to me.  Pointer arithmetic in GIMPLE
>>> is not restricted to operate within fields that are appearantly
>>> accessed here - the only restriction is with respect to the
>>> whole underlying pointed-to-object.
>>>
>>> By doing the warning from forwprop you'll run into all such cases
>>> introduced by GCC itself during quite late optimization passes.
>>
>> I haven't run into any such cases.  What would a more appropriate
>> place to detect out-of-bounds offsets?  I'm having a hard time
>> distinguishing what is appropriate and what isn't.  For instance,
>> if it's okay to detect some out of bounds offsets/indices in vrp
>> why is it wrong to do a better job of it in forwprop?
>>
>>>
>>> You're trying to re-do __builtin_object_size even when that wasn't
>>> used.
>>
>> That's not the quite my intent, although it is close.
>>
>>>
>>> So it looks like you're on the wrong track.  Yes,
>>>
>>>    strcpy (p->a + 6, "y");
>>>
>>> _may_ be "invalid" C (I'm not even sure about that!) but it
>>> is certainly not invalid GIMPLE.
>>
>> Adding (or subtracting) an integer to/from a pointer to an array
>> is defined in both C and C++ only if the result points to an element
>> of the array or just past the last element of the array.  Otherwise
>> it's undefined. (A non-array object is considered an array of one
>> for this purpose.)
> 
> On GIMPLE this is indistinguishable from (short *) (p->a) + 3.

Sure, they're both the same:

   pa_3 = &p_2(D)->a;
   _1 = pa_3 + 6;

and that's fine because the implementation of the warning sees and
uses the byte offset from the beginning of a, so I don't understand
the problem you are pointing out.  Can you clarify what you mean?

Martin

> 
> GIMPLE is neither C nor C++.
> 
> Richard.
> 
>>
>> Martin
>>
>>>
>>> Richard.
>>>
>>>> Jeff, this is the enhancement you were interested in when we spoke
>>>> last week.
>>>>
>>>> Thanks
>>>> Martin
>>>>
>>>> $ cat a.c && gcc -O2 -S -Wall a.c
>>>>     struct A { char a[4]; void (*pf)(void); };
>>>>
>>>>     void f (struct A *p)
>>>>     {
>>>>       p->a[5] = 'x';            // existing -Warray-bounds
>>>>
>>>>       strcpy (p->a + 6, "y");   // enhanced -Warray-bounds
>>>>     }
>>>>
>>>>     a.c: In function ‘f’:
>>>>     a.c:7:3: warning: offset 6 is out of bounds of ‘char[4]’
>> [-Warray-bounds]
>>>>      strcpy (p->a + 6, "y");
>>>>      ^~~~~~~~~~~~~~~~~~~~~~
>>>>     a.c:1:17: note: member declared here
>>>>      struct A { char a[4]; void (*pf)(void); };
>>>>                      ^
>>>>     a.c:5:7: warning: array subscript 5 is above array bounds of
>> ‘char[4]’
>>>> [-Warray-bounds]
>>>>        p->a[5] = 'x';
>>>>        ~~~~^~~
> 

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH] enhance -Warray-bounds to detect out-of-bounds offsets (PR 82455)
  2017-10-30 20:40       ` Martin Sebor
@ 2017-10-30 21:23         ` Richard Biener
  2017-10-30 21:49           ` Martin Sebor
  0 siblings, 1 reply; 22+ messages in thread
From: Richard Biener @ 2017-10-30 21:23 UTC (permalink / raw)
  To: Martin Sebor; +Cc: Gcc Patch List, Jeff Law

On October 30, 2017 9:13:04 PM GMT+01:00, Martin Sebor <msebor@gmail.com> wrote:
>On 10/30/2017 01:53 PM, Richard Biener wrote:
>> On October 30, 2017 4:19:25 PM GMT+01:00, Martin Sebor
><msebor@gmail.com> wrote:
>>> On 10/30/2017 05:45 AM, Richard Biener wrote:
>>>> On Sun, 29 Oct 2017, Martin Sebor wrote:
>>>>
>>>>> In my work on -Wrestrict, to issue meaningful warnings, I found
>>>>> it important to detect both out of bounds array indices as well
>>>>> as offsets in calls to restrict-qualified functions like strcpy.
>>>>> GCC already detects some of these cases but my tests for
>>>>> the enhanced warning exposed a few gaps.
>>>>>
>>>>> The attached patch enhances -Warray-bounds to detect more
>instances
>>>>> out-of-bounds indices and offsets to member arrays and non-array
>>>>> members.  For example, it detects the out-of-bounds offset in the
>>>>> call to strcpy below.
>>>>>
>>>>> The patch is meant to be applied on top posted here but not yet
>>>>> committed:
>>>>>     https://gcc.gnu.org/ml/gcc-patches/2017-10/msg01304.html
>>>>>
>>>>> Richard, since this also touches tree-vrp.c I look for your
>>> comments.
>>>>
>>>> You fail to tell what you are changing and why - I have to reverse
>>>> engineer this from the patch which a) isn't easy in this case, b)
>>> feels
>>>> like a waste of time.  Esp. since the patch does many things.
>>>>
>>>> My first question is why do you add a warning from forwprop?  It
>>>> _feels_ like you're trying to warn about arbitrary out-of-bound
>>>> addresses at the point they are folded to MEM_REFs.  And it looks
>>>> like you're warning about pointer arithmetic like &p->a + 6.
>>>> That doesn't look correct to me.  Pointer arithmetic in GIMPLE
>>>> is not restricted to operate within fields that are appearantly
>>>> accessed here - the only restriction is with respect to the
>>>> whole underlying pointed-to-object.
>>>>
>>>> By doing the warning from forwprop you'll run into all such cases
>>>> introduced by GCC itself during quite late optimization passes.
>>>
>>> I haven't run into any such cases.  What would a more appropriate
>>> place to detect out-of-bounds offsets?  I'm having a hard time
>>> distinguishing what is appropriate and what isn't.  For instance,
>>> if it's okay to detect some out of bounds offsets/indices in vrp
>>> why is it wrong to do a better job of it in forwprop?
>>>
>>>>
>>>> You're trying to re-do __builtin_object_size even when that wasn't
>>>> used.
>>>
>>> That's not the quite my intent, although it is close.
>>>
>>>>
>>>> So it looks like you're on the wrong track.  Yes,
>>>>
>>>>    strcpy (p->a + 6, "y");
>>>>
>>>> _may_ be "invalid" C (I'm not even sure about that!) but it
>>>> is certainly not invalid GIMPLE.
>>>
>>> Adding (or subtracting) an integer to/from a pointer to an array
>>> is defined in both C and C++ only if the result points to an element
>>> of the array or just past the last element of the array.  Otherwise
>>> it's undefined. (A non-array object is considered an array of one
>>> for this purpose.)
>> 
>> On GIMPLE this is indistinguishable from (short *) (p->a) + 3.
>
>Sure, they're both the same:
>
>   pa_3 = &p_2(D)->a;
>   _1 = pa_3 + 6;
>
>and that's fine because the implementation of the warning sees and
>uses the byte offset from the beginning of a, so I don't understand
>the problem you are pointing out.  Can you clarify what you mean?

It does not access the array but the underlying object. On GIMPLE it is just an address calculation without constraints. 

Richard. 

>Martin
>
>> 
>> GIMPLE is neither C nor C++.
>> 
>> Richard.
>> 
>>>
>>> Martin
>>>
>>>>
>>>> Richard.
>>>>
>>>>> Jeff, this is the enhancement you were interested in when we spoke
>>>>> last week.
>>>>>
>>>>> Thanks
>>>>> Martin
>>>>>
>>>>> $ cat a.c && gcc -O2 -S -Wall a.c
>>>>>     struct A { char a[4]; void (*pf)(void); };
>>>>>
>>>>>     void f (struct A *p)
>>>>>     {
>>>>>       p->a[5] = 'x';            // existing -Warray-bounds
>>>>>
>>>>>       strcpy (p->a + 6, "y");   // enhanced -Warray-bounds
>>>>>     }
>>>>>
>>>>>     a.c: In function ‘f’:
>>>>>     a.c:7:3: warning: offset 6 is out of bounds of ‘char[4]’
>>> [-Warray-bounds]
>>>>>      strcpy (p->a + 6, "y");
>>>>>      ^~~~~~~~~~~~~~~~~~~~~~
>>>>>     a.c:1:17: note: member declared here
>>>>>      struct A { char a[4]; void (*pf)(void); };
>>>>>                      ^
>>>>>     a.c:5:7: warning: array subscript 5 is above array bounds of
>>> ‘char[4]’
>>>>> [-Warray-bounds]
>>>>>        p->a[5] = 'x';
>>>>>        ~~~~^~~
>> 

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH] enhance -Warray-bounds to detect out-of-bounds offsets (PR 82455)
  2017-10-30 21:23         ` Richard Biener
@ 2017-10-30 21:49           ` Martin Sebor
  2017-11-02 11:48             ` Richard Biener
  0 siblings, 1 reply; 22+ messages in thread
From: Martin Sebor @ 2017-10-30 21:49 UTC (permalink / raw)
  To: Richard Biener; +Cc: Gcc Patch List, Jeff Law

On 10/30/2017 02:56 PM, Richard Biener wrote:
> On October 30, 2017 9:13:04 PM GMT+01:00, Martin Sebor <msebor@gmail.com> wrote:
>> On 10/30/2017 01:53 PM, Richard Biener wrote:
>>> On October 30, 2017 4:19:25 PM GMT+01:00, Martin Sebor
>> <msebor@gmail.com> wrote:
>>>> On 10/30/2017 05:45 AM, Richard Biener wrote:
>>>>> On Sun, 29 Oct 2017, Martin Sebor wrote:
>>>>>
>>>>>> In my work on -Wrestrict, to issue meaningful warnings, I found
>>>>>> it important to detect both out of bounds array indices as well
>>>>>> as offsets in calls to restrict-qualified functions like strcpy.
>>>>>> GCC already detects some of these cases but my tests for
>>>>>> the enhanced warning exposed a few gaps.
>>>>>>
>>>>>> The attached patch enhances -Warray-bounds to detect more
>> instances
>>>>>> out-of-bounds indices and offsets to member arrays and non-array
>>>>>> members.  For example, it detects the out-of-bounds offset in the
>>>>>> call to strcpy below.
>>>>>>
>>>>>> The patch is meant to be applied on top posted here but not yet
>>>>>> committed:
>>>>>>      https://gcc.gnu.org/ml/gcc-patches/2017-10/msg01304.html
>>>>>>
>>>>>> Richard, since this also touches tree-vrp.c I look for your
>>>> comments.
>>>>>
>>>>> You fail to tell what you are changing and why - I have to reverse
>>>>> engineer this from the patch which a) isn't easy in this case, b)
>>>> feels
>>>>> like a waste of time.  Esp. since the patch does many things.
>>>>>
>>>>> My first question is why do you add a warning from forwprop?  It
>>>>> _feels_ like you're trying to warn about arbitrary out-of-bound
>>>>> addresses at the point they are folded to MEM_REFs.  And it looks
>>>>> like you're warning about pointer arithmetic like &p->a + 6.
>>>>> That doesn't look correct to me.  Pointer arithmetic in GIMPLE
>>>>> is not restricted to operate within fields that are appearantly
>>>>> accessed here - the only restriction is with respect to the
>>>>> whole underlying pointed-to-object.
>>>>>
>>>>> By doing the warning from forwprop you'll run into all such cases
>>>>> introduced by GCC itself during quite late optimization passes.
>>>>
>>>> I haven't run into any such cases.  What would a more appropriate
>>>> place to detect out-of-bounds offsets?  I'm having a hard time
>>>> distinguishing what is appropriate and what isn't.  For instance,
>>>> if it's okay to detect some out of bounds offsets/indices in vrp
>>>> why is it wrong to do a better job of it in forwprop?
>>>>
>>>>>
>>>>> You're trying to re-do __builtin_object_size even when that wasn't
>>>>> used.
>>>>
>>>> That's not the quite my intent, although it is close.
>>>>
>>>>>
>>>>> So it looks like you're on the wrong track.  Yes,
>>>>>
>>>>>     strcpy (p->a + 6, "y");
>>>>>
>>>>> _may_ be "invalid" C (I'm not even sure about that!) but it
>>>>> is certainly not invalid GIMPLE.
>>>>
>>>> Adding (or subtracting) an integer to/from a pointer to an array
>>>> is defined in both C and C++ only if the result points to an element
>>>> of the array or just past the last element of the array.  Otherwise
>>>> it's undefined. (A non-array object is considered an array of one
>>>> for this purpose.)
>>>
>>> On GIMPLE this is indistinguishable from (short *) (p->a) + 3.
>>
>> Sure, they're both the same:
>>
>>    pa_3 = &p_2(D)->a;
>>    _1 = pa_3 + 6;
>>
>> and that's fine because the implementation of the warning sees and
>> uses the byte offset from the beginning of a, so I don't understand
>> the problem you are pointing out.  Can you clarify what you mean?
> 
> It does not access the array but the underlying object. On GIMPLE it is just an address calculation without constraints.

But the computation starts with the subobject and so is only
valid within the bounds of the subobject.  Or are you saying
that GCC emits such GIMPLE expressions for valid code?  If so,
can you give an example of such code?

Or, if that's not it, what exactly is your concern with this
enhancement?  If it's that it's implemented in forwprop, what
would be a better place, e.g., earlier in the optimization
phase?  If it's something something else, I'd appreciate it
if you could explain what.

Martin

> 
> Richard.
> 
>> Martin
>>
>>>
>>> GIMPLE is neither C nor C++.
>>>
>>> Richard.
>>>
>>>>
>>>> Martin
>>>>
>>>>>
>>>>> Richard.
>>>>>
>>>>>> Jeff, this is the enhancement you were interested in when we spoke
>>>>>> last week.
>>>>>>
>>>>>> Thanks
>>>>>> Martin
>>>>>>
>>>>>> $ cat a.c && gcc -O2 -S -Wall a.c
>>>>>>      struct A { char a[4]; void (*pf)(void); };
>>>>>>
>>>>>>      void f (struct A *p)
>>>>>>      {
>>>>>>        p->a[5] = 'x';            // existing -Warray-bounds
>>>>>>
>>>>>>        strcpy (p->a + 6, "y");   // enhanced -Warray-bounds
>>>>>>      }
>>>>>>
>>>>>>      a.c: In function ‘f’:
>>>>>>      a.c:7:3: warning: offset 6 is out of bounds of ‘char[4]’
>>>> [-Warray-bounds]
>>>>>>       strcpy (p->a + 6, "y");
>>>>>>       ^~~~~~~~~~~~~~~~~~~~~~
>>>>>>      a.c:1:17: note: member declared here
>>>>>>       struct A { char a[4]; void (*pf)(void); };
>>>>>>                       ^
>>>>>>      a.c:5:7: warning: array subscript 5 is above array bounds of
>>>> ‘char[4]’
>>>>>> [-Warray-bounds]
>>>>>>         p->a[5] = 'x';
>>>>>>         ~~~~^~~
>>>
> 

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH] enhance -Warray-bounds to detect out-of-bounds offsets (PR 82455)
  2017-10-29 16:15 [PATCH] enhance -Warray-bounds to detect out-of-bounds offsets (PR 82455) Martin Sebor
  2017-10-30 11:55 ` Richard Biener
@ 2017-10-30 22:16 ` Jeff Law
  1 sibling, 0 replies; 22+ messages in thread
From: Jeff Law @ 2017-10-30 22:16 UTC (permalink / raw)
  To: Martin Sebor, Gcc Patch List, Richard Biener

On 10/29/2017 10:01 AM, Martin Sebor wrote:
> In my work on -Wrestrict, to issue meaningful warnings, I found
> it important to detect both out of bounds array indices as well
> as offsets in calls to restrict-qualified functions like strcpy.
> GCC already detects some of these cases but my tests for
> the enhanced warning exposed a few gaps.
> 
> The attached patch enhances -Warray-bounds to detect more instances
> out-of-bounds indices and offsets to member arrays and non-array
> members.  For example, it detects the out-of-bounds offset in the
> call to strcpy below.
> 
> The patch is meant to be applied on top posted here but not yet
> committed:
>   https://gcc.gnu.org/ml/gcc-patches/2017-10/msg01304.html
> 
> Richard, since this also touches tree-vrp.c I look for your comments.
> 
> Jeff, this is the enhancement you were interested in when we spoke
> last week.
> 
> Thanks
> Martin
> 
> $ cat a.c && gcc -O2 -S -Wall a.c
>   struct A { char a[4]; void (*pf)(void); };
> 
>   void f (struct A *p)
>   {
>     p->a[5] = 'x';            // existing -Warray-bounds
> 
>     strcpy (p->a + 6, "y");   // enhanced -Warray-bounds
>   }
> 
>   a.c: In function ‘f’:
>   a.c:7:3: warning: offset 6 is out of bounds of ‘char[4]’ [-Warray-bounds]
>    strcpy (p->a + 6, "y");
>    ^~~~~~~~~~~~~~~~~~~~~~
>   a.c:1:17: note: member declared here
>    struct A { char a[4]; void (*pf)(void); };
>                    ^
>   a.c:5:7: warning: array subscript 5 is above array bounds of
> ‘char[4]’ [-Warray-bounds]
>      p->a[5] = 'x';
>      ~~~~^~~
> 
> gcc-82455.diff
> 
> 
> PR tree-optimization/82455 - missing -Warray-bounds on strcpy offset in an out-of-bounds range
> 
> gcc/ChangeLog:
> 
> 	PR tree-optimization/82455
> 	* tree-ssa-forwprop.c (maybe_warn_offset_out_of_bounds): New function.
> 	(forward_propagate_addr_expr_1): Call it.
> 	* tree-vrp.c (check_array_ref): Return bool.  Handle flexible array
> 	members, string literals, and inner indices.
> 	(search_for_addr_array): Return bool.  Add an argument.  Add detail
> 	to diagnostics.
> 	(check_array_bounds): New function.
> 	* tree-vrp.h (check_array_bounds): Declare it.
> 	* wide-int.h (sign_mask): Assert a precondition.So one of the things I see here is exposing the array bounds checking
bits from VRP to other passes.  I'm not inherently against that -- I'd
like to be able to do that in other passes for various reasons.

It also seems like you're improving that code.


So I'd probably start with pulling the improvements into their own
patch.  That shouldn't be terribly controversial.


I'd have to look more closely, but let's make sure we have a query API
that does not warn, but merely returns a boolean.  I think we'll have a
desire for that kind of query.

I wonder if all this would be better implemented in the context of
_b_o_s.  That may also make it easy to use a domwalk to gather the
ranges given potentially better ranges than we're getting out of VRP.

I would also request that we dig deep into the routines we use here.
I'm currently trying to pull apart the various dependencies within VRP
and wouldn't want to make that worse.  Even if it's just a proof of
concept, pull the routines into a different file to see what the "touch"
points are.  I'm rather dismayed at how intertwined various bits within
VRP are.




> +/* Check to see if the expression (char*)PTR + OFF is valid (i.e., it
> +   doesn't overflow and the result is within the bounds of the object
> +   pointed to by PTR.  OFF is an offset in bytes.  If it isn't valid,
> +   issue a -Warray-bounds warning.  */
> +
> +static bool
> +maybe_warn_offset_out_of_bounds (location_t loc, tree ptr, tree off)
> +{
> +  offset_int cstoff = 0;
> +
> +  bool ignore_off_by_1 = false;
> +
> +  if (TREE_CODE (ptr) == SSA_NAME)
> +    {
> +      gimple *stmt = SSA_NAME_DEF_STMT (ptr);
> +      if (is_gimple_assign (stmt))
> +	{
> +	  tree_code code = gimple_assign_rhs_code (stmt);
> +	  if (code == ADDR_EXPR)
> +	    {
> +	      ptr = gimple_assign_rhs1 (stmt);
> +	      ignore_off_by_1 = true;
> +	    }
> +	  else if (code == POINTER_PLUS_EXPR)
> +	    {
> +	      ptr = gimple_assign_rhs1 (stmt);
> +	      tree offset = gimple_assign_rhs2 (stmt);
> +	      if (maybe_warn_offset_out_of_bounds (loc, ptr, offset))
> +		return true;
> +
> +	      /* To do: handle ranges.  */
> +	      if (TREE_CODE (offset) != INTEGER_CST)
> +		return false;
Even if we don't handle full ranges, handling ranges that have collapsed
to singletons ought to be trivial.  Similarly handling a range that
lives entirely out of bounds shouldn't be too hard either.


> +
> +  if (check_array_bounds (ptr, ignore_off_by_1))
> +    return true;
> +
> +  bool addr_expr = false;
> +
> +  if (TREE_CODE (ptr) == ADDR_EXPR)
> +    {
> +      ptr = TREE_OPERAND (ptr, 0);
> +      addr_expr = true;
> +    }
> +
> +  /* The type of the array to whichj the offset applies and the type
nit.  s/whichj/which/

I'm getting really tired..  More later...

Jeff

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH] enhance -Warray-bounds to detect out-of-bounds offsets (PR 82455)
  2017-10-30 15:21   ` Martin Sebor
  2017-10-30 19:59     ` Richard Biener
@ 2017-10-30 22:16     ` Jeff Law
  2017-10-30 23:30       ` Martin Sebor
  1 sibling, 1 reply; 22+ messages in thread
From: Jeff Law @ 2017-10-30 22:16 UTC (permalink / raw)
  To: Martin Sebor, Richard Biener; +Cc: Gcc Patch List

On 10/30/2017 09:19 AM, Martin Sebor wrote:
> On 10/30/2017 05:45 AM, Richard Biener wrote:
>> On Sun, 29 Oct 2017, Martin Sebor wrote:
>>
>>> In my work on -Wrestrict, to issue meaningful warnings, I found
>>> it important to detect both out of bounds array indices as well
>>> as offsets in calls to restrict-qualified functions like strcpy.
>>> GCC already detects some of these cases but my tests for
>>> the enhanced warning exposed a few gaps.
>>>
>>> The attached patch enhances -Warray-bounds to detect more instances
>>> out-of-bounds indices and offsets to member arrays and non-array
>>> members.  For example, it detects the out-of-bounds offset in the
>>> call to strcpy below.
>>>
>>> The patch is meant to be applied on top posted here but not yet
>>> committed:
>>>    https://gcc.gnu.org/ml/gcc-patches/2017-10/msg01304.html
>>>
>>> Richard, since this also touches tree-vrp.c I look for your comments.
>>
>> You fail to tell what you are changing and why - I have to reverse
>> engineer this from the patch which a) isn't easy in this case, b) feels
>> like a waste of time.  Esp. since the patch does many things.
>>
>> My first question is why do you add a warning from forwprop?  It
>> _feels_ like you're trying to warn about arbitrary out-of-bound
>> addresses at the point they are folded to MEM_REFs.  And it looks
>> like you're warning about pointer arithmetic like &p->a + 6.
>> That doesn't look correct to me.  Pointer arithmetic in GIMPLE
>> is not restricted to operate within fields that are appearantly
>> accessed here - the only restriction is with respect to the
>> whole underlying pointed-to-object.
>>
>> By doing the warning from forwprop you'll run into all such cases
>> introduced by GCC itself during quite late optimization passes.
> 
> I haven't run into any such cases.  What would a more appropriate
> place to detect out-of-bounds offsets?  I'm having a hard time
> distinguishing what is appropriate and what isn't.  For instance,
> if it's okay to detect some out of bounds offsets/indices in vrp
> why is it wrong to do a better job of it in forwpropI think part of the problem is there isn't a well defined place to do
this kind of warning.  I suspect it's currently in VRP simply because
that is where we had range information in the past.  It's still the
location with the most accurate range information.

In a world where we have an embedded context sensitive range analysis
engine, we should *really* look at pulling the out of bounds array
warnings out of any optimization pass an have a distinct pass to deal
with them.

I guess in the immediate term the question I would ask Martin is what is
it about forwprop that makes it interesting?  Is it because of the
lowering issues we touched on last week?  If so I wonder if we could
recreate an array form from a MEM_REF for the purposes of optimization.
Or if we could just handle MEM_REFs better within the existing warning.


> 
>>
>> You're trying to re-do __builtin_object_size even when that wasn't
>> used.
> 
> That's not the quite my intent, although it is close.
Wouldn't we be better off improving _b_o_s?

> 
>>
>> So it looks like you're on the wrong track.  Yes,
>>
>>   strcpy (p->a + 6, "y");
>>
>> _may_ be "invalid" C (I'm not even sure about that!) but it
>> is certainly not invalid GIMPLE.
> 
> Adding (or subtracting) an integer to/from a pointer to an array
> is defined in both C and C++ only if the result points to an element
> of the array or just past the last element of the array.  Otherwise
> it's undefined. (A non-array object is considered an array of one
> for this purpose.)
I think Richi's argument is that gimple allows things that are not
necessarily allowed by the C/C++ standard.  For example we support
virtual origins from Ada, which internally would look something like
invalid C code

OTOH, we currently have code in tree-vrp.c which warns if we compute the
address of an out of bounds array index which is very C/C++ centric.

jeff

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH] enhance -Warray-bounds to detect out-of-bounds offsets (PR 82455)
  2017-10-30 22:16     ` Jeff Law
@ 2017-10-30 23:30       ` Martin Sebor
  2017-10-31  4:32         ` Jeff Law
  0 siblings, 1 reply; 22+ messages in thread
From: Martin Sebor @ 2017-10-30 23:30 UTC (permalink / raw)
  To: Jeff Law, Richard Biener; +Cc: Gcc Patch List

On 10/30/2017 03:48 PM, Jeff Law wrote:
> On 10/30/2017 09:19 AM, Martin Sebor wrote:
>> On 10/30/2017 05:45 AM, Richard Biener wrote:
>>> On Sun, 29 Oct 2017, Martin Sebor wrote:
>>>
>>>> In my work on -Wrestrict, to issue meaningful warnings, I found
>>>> it important to detect both out of bounds array indices as well
>>>> as offsets in calls to restrict-qualified functions like strcpy.
>>>> GCC already detects some of these cases but my tests for
>>>> the enhanced warning exposed a few gaps.
>>>>
>>>> The attached patch enhances -Warray-bounds to detect more instances
>>>> out-of-bounds indices and offsets to member arrays and non-array
>>>> members.  For example, it detects the out-of-bounds offset in the
>>>> call to strcpy below.
>>>>
>>>> The patch is meant to be applied on top posted here but not yet
>>>> committed:
>>>>     https://gcc.gnu.org/ml/gcc-patches/2017-10/msg01304.html
>>>>
>>>> Richard, since this also touches tree-vrp.c I look for your comments.
>>>
>>> You fail to tell what you are changing and why - I have to reverse
>>> engineer this from the patch which a) isn't easy in this case, b) feels
>>> like a waste of time.  Esp. since the patch does many things.
>>>
>>> My first question is why do you add a warning from forwprop?  It
>>> _feels_ like you're trying to warn about arbitrary out-of-bound
>>> addresses at the point they are folded to MEM_REFs.  And it looks
>>> like you're warning about pointer arithmetic like &p->a + 6.
>>> That doesn't look correct to me.  Pointer arithmetic in GIMPLE
>>> is not restricted to operate within fields that are appearantly
>>> accessed here - the only restriction is with respect to the
>>> whole underlying pointed-to-object.
>>>
>>> By doing the warning from forwprop you'll run into all such cases
>>> introduced by GCC itself during quite late optimization passes.
>>
>> I haven't run into any such cases.  What would a more appropriate
>> place to detect out-of-bounds offsets?  I'm having a hard time
>> distinguishing what is appropriate and what isn't.  For instance,
>> if it's okay to detect some out of bounds offsets/indices in vrp
>> why is it wrong to do a better job of it in forwpropI think part of the problem is there isn't a well defined place to do
> this kind of warning.  I suspect it's currently in VRP simply because
> that is where we had range information in the past.  It's still the
> location with the most accurate range information.
> 
> In a world where we have an embedded context sensitive range analysis
> engine, we should *really* look at pulling the out of bounds array
> warnings out of any optimization pass an have a distinct pass to deal
> with them.
> 
> I guess in the immediate term the question I would ask Martin is what is
> it about forwprop that makes it interesting?  Is it because of the
> lowering issues we touched on last week?  If so I wonder if we could
> recreate an array form from a MEM_REF for the purposes of optimization.
> Or if we could just handle MEM_REFs better within the existing warning.

I put it in forwprop only because that was the last stage where
there's still enough context before the POINTER_PLUS_EXPR is
folded into MEM_REF to tell an offset from the beginning of
a subobject from the one from the beginning of the bigger object
of which the subobject is a member.  I certainly don't mind moving
it somewhere else more appropriate if this isn't ideal, just as
long it doesn't cripple the detection (e.g., as long as we still
have range information).

>>> You're trying to re-do __builtin_object_size even when that wasn't
>>> used.
>>
>> That's not the quite my intent, although it is close.
> Wouldn't we be better off improving _b_o_s?
> 
>>
>>>
>>> So it looks like you're on the wrong track.  Yes,
>>>
>>>    strcpy (p->a + 6, "y");
>>>
>>> _may_ be "invalid" C (I'm not even sure about that!) but it
>>> is certainly not invalid GIMPLE.
>>
>> Adding (or subtracting) an integer to/from a pointer to an array
>> is defined in both C and C++ only if the result points to an element
>> of the array or just past the last element of the array.  Otherwise
>> it's undefined. (A non-array object is considered an array of one
>> for this purpose.)
> I think Richi's argument is that gimple allows things that are not
> necessarily allowed by the C/C++ standard.  For example we support
> virtual origins from Ada, which internally would look something like
> invalid C code
> 
> OTOH, we currently have code in tree-vrp.c which warns if we compute the
> address of an out of bounds array index which is very C/C++ centric.

I of course don't want to break anything.  I didn't see any fallout
in my testing and I normally test all the front ends, including Ada,
but let me check to make sure I tested it this time (I had made some
temporary changes to my build script and may have disabled it.)  Let
me double check it after I get back from my trip.

Martin

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH] enhance -Warray-bounds to detect out-of-bounds offsets (PR 82455)
  2017-10-30 23:30       ` Martin Sebor
@ 2017-10-31  4:32         ` Jeff Law
  2017-11-01 22:21           ` Martin Sebor
  2017-11-02 11:27           ` Richard Biener
  0 siblings, 2 replies; 22+ messages in thread
From: Jeff Law @ 2017-10-31  4:32 UTC (permalink / raw)
  To: Martin Sebor, Richard Biener; +Cc: Gcc Patch List

On 10/30/2017 05:29 PM, Martin Sebor wrote:
> On 10/30/2017 03:48 PM, Jeff Law wrote:
>> On 10/30/2017 09:19 AM, Martin Sebor wrote:
>>> On 10/30/2017 05:45 AM, Richard Biener wrote:
>>>> On Sun, 29 Oct 2017, Martin Sebor wrote:
>>>>
>>>>> In my work on -Wrestrict, to issue meaningful warnings, I found
>>>>> it important to detect both out of bounds array indices as well
>>>>> as offsets in calls to restrict-qualified functions like strcpy.
>>>>> GCC already detects some of these cases but my tests for
>>>>> the enhanced warning exposed a few gaps.
>>>>>
>>>>> The attached patch enhances -Warray-bounds to detect more instances
>>>>> out-of-bounds indices and offsets to member arrays and non-array
>>>>> members.  For example, it detects the out-of-bounds offset in the
>>>>> call to strcpy below.
>>>>>
>>>>> The patch is meant to be applied on top posted here but not yet
>>>>> committed:
>>>>>     https://gcc.gnu.org/ml/gcc-patches/2017-10/msg01304.html
>>>>>
>>>>> Richard, since this also touches tree-vrp.c I look for your comments.
>>>>
>>>> You fail to tell what you are changing and why - I have to reverse
>>>> engineer this from the patch which a) isn't easy in this case, b) feels
>>>> like a waste of time.  Esp. since the patch does many things.
>>>>
>>>> My first question is why do you add a warning from forwprop?  It
>>>> _feels_ like you're trying to warn about arbitrary out-of-bound
>>>> addresses at the point they are folded to MEM_REFs.  And it looks
>>>> like you're warning about pointer arithmetic like &p->a + 6.
>>>> That doesn't look correct to me.  Pointer arithmetic in GIMPLE
>>>> is not restricted to operate within fields that are appearantly
>>>> accessed here - the only restriction is with respect to the
>>>> whole underlying pointed-to-object.
>>>>
>>>> By doing the warning from forwprop you'll run into all such cases
>>>> introduced by GCC itself during quite late optimization passes.
>>>
>>> I haven't run into any such cases.  What would a more appropriate
>>> place to detect out-of-bounds offsets?  I'm having a hard time
>>> distinguishing what is appropriate and what isn't.  For instance,
>>> if it's okay to detect some out of bounds offsets/indices in vrp
>>> why is it wrong to do a better job of it in forwpropI think part of
>>> the problem is there isn't a well defined place to do
>> this kind of warning.  I suspect it's currently in VRP simply because
>> that is where we had range information in the past.  It's still the
>> location with the most accurate range information.
>>
>> In a world where we have an embedded context sensitive range analysis
>> engine, we should *really* look at pulling the out of bounds array
>> warnings out of any optimization pass an have a distinct pass to deal
>> with them.
>>
>> I guess in the immediate term the question I would ask Martin is what is
>> it about forwprop that makes it interesting?  Is it because of the
>> lowering issues we touched on last week?  If so I wonder if we could
>> recreate an array form from a MEM_REF for the purposes of optimization.
>> Or if we could just handle MEM_REFs better within the existing warning.
> 
> I put it in forwprop only because that was the last stage where
> there's still enough context before the POINTER_PLUS_EXPR is
> folded into MEM_REF to tell an offset from the beginning of
> a subobject from the one from the beginning of the bigger object
> of which the subobject is a member.  I certainly don't mind moving
> it somewhere else more appropriate if this isn't ideal, just as
> long it doesn't cripple the detection (e.g., as long as we still
> have range information).
Understood.


[ ... ]

> 
> I of course don't want to break anything.  I didn't see any fallout
> in my testing and I normally test all the front ends, including Ada,
> but let me check to make sure I tested it this time (I had made some
> temporary changes to my build script and may have disabled it.)  Let
> me double check it after I get back from my trip.
No worries.  Hopefully by the time you're back I'll have something
publishable on the ripping apart tree-vrp front and we can prototype the
effectiveness of doing this kind of stuff outside tree-vrp.c

We should also revisit Aldy's work from last year which started the
whole effort around fixing how we deal with out out of bounds index
testing.   He had a version which ran outside tree-vrp.c but used the
same basic structure and queried range data for the index.  I've got a
copy here we can poke at.

jeff

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH] enhance -Warray-bounds to detect out-of-bounds offsets (PR 82455)
  2017-10-31  4:32         ` Jeff Law
@ 2017-11-01 22:21           ` Martin Sebor
  2017-11-02 11:27           ` Richard Biener
  1 sibling, 0 replies; 22+ messages in thread
From: Martin Sebor @ 2017-11-01 22:21 UTC (permalink / raw)
  To: Jeff Law, Richard Biener; +Cc: Gcc Patch List

>> I of course don't want to break anything.  I didn't see any fallout
>> in my testing and I normally test all the front ends, including Ada,
>> but let me check to make sure I tested it this time (I had made some
>> temporary changes to my build script and may have disabled it.)  Let
>> me double check it after I get back from my trip.
> No worries.  Hopefully by the time you're back I'll have something
> publishable on the ripping apart tree-vrp front and we can prototype the
> effectiveness of doing this kind of stuff outside tree-vrp.c

I re-did my build and re-ran all the tests and found no regressions
in the Ada tests.

> We should also revisit Aldy's work from last year which started the
> whole effort around fixing how we deal with out out of bounds index
> testing.   He had a version which ran outside tree-vrp.c but used the
> same basic structure and queried range data for the index.  I've got a
> copy here we can poke at.

Sure, I'd be interested in looking at it when I'm done with this
work.

Martin

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH] enhance -Warray-bounds to detect out-of-bounds offsets (PR 82455)
  2017-10-31  4:32         ` Jeff Law
  2017-11-01 22:21           ` Martin Sebor
@ 2017-11-02 11:27           ` Richard Biener
  1 sibling, 0 replies; 22+ messages in thread
From: Richard Biener @ 2017-11-02 11:27 UTC (permalink / raw)
  To: Jeff Law; +Cc: Martin Sebor, Gcc Patch List

[-- Attachment #1: Type: text/plain, Size: 5460 bytes --]

On Mon, 30 Oct 2017, Jeff Law wrote:

> On 10/30/2017 05:29 PM, Martin Sebor wrote:
> > On 10/30/2017 03:48 PM, Jeff Law wrote:
> >> On 10/30/2017 09:19 AM, Martin Sebor wrote:
> >>> On 10/30/2017 05:45 AM, Richard Biener wrote:
> >>>> On Sun, 29 Oct 2017, Martin Sebor wrote:
> >>>>
> >>>>> In my work on -Wrestrict, to issue meaningful warnings, I found
> >>>>> it important to detect both out of bounds array indices as well
> >>>>> as offsets in calls to restrict-qualified functions like strcpy.
> >>>>> GCC already detects some of these cases but my tests for
> >>>>> the enhanced warning exposed a few gaps.
> >>>>>
> >>>>> The attached patch enhances -Warray-bounds to detect more instances
> >>>>> out-of-bounds indices and offsets to member arrays and non-array
> >>>>> members.  For example, it detects the out-of-bounds offset in the
> >>>>> call to strcpy below.
> >>>>>
> >>>>> The patch is meant to be applied on top posted here but not yet
> >>>>> committed:
> >>>>>     https://gcc.gnu.org/ml/gcc-patches/2017-10/msg01304.html
> >>>>>
> >>>>> Richard, since this also touches tree-vrp.c I look for your comments.
> >>>>
> >>>> You fail to tell what you are changing and why - I have to reverse
> >>>> engineer this from the patch which a) isn't easy in this case, b) feels
> >>>> like a waste of time.  Esp. since the patch does many things.
> >>>>
> >>>> My first question is why do you add a warning from forwprop?  It
> >>>> _feels_ like you're trying to warn about arbitrary out-of-bound
> >>>> addresses at the point they are folded to MEM_REFs.  And it looks
> >>>> like you're warning about pointer arithmetic like &p->a + 6.
> >>>> That doesn't look correct to me.  Pointer arithmetic in GIMPLE
> >>>> is not restricted to operate within fields that are appearantly
> >>>> accessed here - the only restriction is with respect to the
> >>>> whole underlying pointed-to-object.
> >>>>
> >>>> By doing the warning from forwprop you'll run into all such cases
> >>>> introduced by GCC itself during quite late optimization passes.
> >>>
> >>> I haven't run into any such cases.  What would a more appropriate
> >>> place to detect out-of-bounds offsets?  I'm having a hard time
> >>> distinguishing what is appropriate and what isn't.  For instance,
> >>> if it's okay to detect some out of bounds offsets/indices in vrp
> >>> why is it wrong to do a better job of it in forwpropI think part of
> >>> the problem is there isn't a well defined place to do
> >> this kind of warning.  I suspect it's currently in VRP simply because
> >> that is where we had range information in the past.  It's still the
> >> location with the most accurate range information.
> >>
> >> In a world where we have an embedded context sensitive range analysis
> >> engine, we should *really* look at pulling the out of bounds array
> >> warnings out of any optimization pass an have a distinct pass to deal
> >> with them.
> >>
> >> I guess in the immediate term the question I would ask Martin is what is
> >> it about forwprop that makes it interesting?  Is it because of the
> >> lowering issues we touched on last week?  If so I wonder if we could
> >> recreate an array form from a MEM_REF for the purposes of optimization.
> >> Or if we could just handle MEM_REFs better within the existing warning.
> > 
> > I put it in forwprop only because that was the last stage where
> > there's still enough context before the POINTER_PLUS_EXPR is
> > folded into MEM_REF to tell an offset from the beginning of
> > a subobject from the one from the beginning of the bigger object
> > of which the subobject is a member.  I certainly don't mind moving
> > it somewhere else more appropriate if this isn't ideal, just as
> > long it doesn't cripple the detection (e.g., as long as we still
> > have range information).
> Understood.

Well, it's a long-standing issue with how we do these kind of
warnings, likewise for _b_o_s which also "can't stand" component-refs
to be folded into the MEM_REF offset.

I've said in the past that _b_o_s relying on component-refs to stay
and for them to be constrained the same way they are in C is bogus.
We've added an early _b_o_s pass to mitigate that "issue" somewhat.

Now you're trying to "solve" the same issue as _b_o_s -- in the
end it looks like the warning could well reside in that pass
rather than in forwprop.

Richard.

> 
> [ ... ]
> 
> > 
> > I of course don't want to break anything.  I didn't see any fallout
> > in my testing and I normally test all the front ends, including Ada,
> > but let me check to make sure I tested it this time (I had made some
> > temporary changes to my build script and may have disabled it.)  Let
> > me double check it after I get back from my trip.
> No worries.  Hopefully by the time you're back I'll have something
> publishable on the ripping apart tree-vrp front and we can prototype the
> effectiveness of doing this kind of stuff outside tree-vrp.c
> 
> We should also revisit Aldy's work from last year which started the
> whole effort around fixing how we deal with out out of bounds index
> testing.   He had a version which ran outside tree-vrp.c but used the
> same basic structure and queried range data for the index.  I've got a
> copy here we can poke at.
> 
> jeff
> 
> 

-- 
Richard Biener <rguenther@suse.de>
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nuernberg)

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH] enhance -Warray-bounds to detect out-of-bounds offsets (PR 82455)
  2017-10-30 21:49           ` Martin Sebor
@ 2017-11-02 11:48             ` Richard Biener
  2017-11-10  1:12               ` Jeff Law
  0 siblings, 1 reply; 22+ messages in thread
From: Richard Biener @ 2017-11-02 11:48 UTC (permalink / raw)
  To: Martin Sebor; +Cc: Gcc Patch List, Jeff Law

On Mon, 30 Oct 2017, Martin Sebor wrote:

> On 10/30/2017 02:56 PM, Richard Biener wrote:
> > On October 30, 2017 9:13:04 PM GMT+01:00, Martin Sebor <msebor@gmail.com>
> > wrote:
> > > On 10/30/2017 01:53 PM, Richard Biener wrote:
> > > > On October 30, 2017 4:19:25 PM GMT+01:00, Martin Sebor
> > > <msebor@gmail.com> wrote:
> > > > > On 10/30/2017 05:45 AM, Richard Biener wrote:
> > > > > > On Sun, 29 Oct 2017, Martin Sebor wrote:
> > > > > > 
> > > > > > > In my work on -Wrestrict, to issue meaningful warnings, I found
> > > > > > > it important to detect both out of bounds array indices as well
> > > > > > > as offsets in calls to restrict-qualified functions like strcpy.
> > > > > > > GCC already detects some of these cases but my tests for
> > > > > > > the enhanced warning exposed a few gaps.
> > > > > > > 
> > > > > > > The attached patch enhances -Warray-bounds to detect more
> > > instances
> > > > > > > out-of-bounds indices and offsets to member arrays and non-array
> > > > > > > members.  For example, it detects the out-of-bounds offset in the
> > > > > > > call to strcpy below.
> > > > > > > 
> > > > > > > The patch is meant to be applied on top posted here but not yet
> > > > > > > committed:
> > > > > > >      https://gcc.gnu.org/ml/gcc-patches/2017-10/msg01304.html
> > > > > > > 
> > > > > > > Richard, since this also touches tree-vrp.c I look for your
> > > > > comments.
> > > > > > 
> > > > > > You fail to tell what you are changing and why - I have to reverse
> > > > > > engineer this from the patch which a) isn't easy in this case, b)
> > > > > feels
> > > > > > like a waste of time.  Esp. since the patch does many things.
> > > > > > 
> > > > > > My first question is why do you add a warning from forwprop?  It
> > > > > > _feels_ like you're trying to warn about arbitrary out-of-bound
> > > > > > addresses at the point they are folded to MEM_REFs.  And it looks
> > > > > > like you're warning about pointer arithmetic like &p->a + 6.
> > > > > > That doesn't look correct to me.  Pointer arithmetic in GIMPLE
> > > > > > is not restricted to operate within fields that are appearantly
> > > > > > accessed here - the only restriction is with respect to the
> > > > > > whole underlying pointed-to-object.
> > > > > > 
> > > > > > By doing the warning from forwprop you'll run into all such cases
> > > > > > introduced by GCC itself during quite late optimization passes.
> > > > > 
> > > > > I haven't run into any such cases.  What would a more appropriate
> > > > > place to detect out-of-bounds offsets?  I'm having a hard time
> > > > > distinguishing what is appropriate and what isn't.  For instance,
> > > > > if it's okay to detect some out of bounds offsets/indices in vrp
> > > > > why is it wrong to do a better job of it in forwprop?
> > > > > 
> > > > > > 
> > > > > > You're trying to re-do __builtin_object_size even when that wasn't
> > > > > > used.
> > > > > 
> > > > > That's not the quite my intent, although it is close.
> > > > > 
> > > > > > 
> > > > > > So it looks like you're on the wrong track.  Yes,
> > > > > > 
> > > > > >     strcpy (p->a + 6, "y");
> > > > > > 
> > > > > > _may_ be "invalid" C (I'm not even sure about that!) but it
> > > > > > is certainly not invalid GIMPLE.
> > > > > 
> > > > > Adding (or subtracting) an integer to/from a pointer to an array
> > > > > is defined in both C and C++ only if the result points to an element
> > > > > of the array or just past the last element of the array.  Otherwise
> > > > > it's undefined. (A non-array object is considered an array of one
> > > > > for this purpose.)
> > > > 
> > > > On GIMPLE this is indistinguishable from (short *) (p->a) + 3.
> > > 
> > > Sure, they're both the same:
> > > 
> > >    pa_3 = &p_2(D)->a;
> > >    _1 = pa_3 + 6;
> > > 
> > > and that's fine because the implementation of the warning sees and
> > > uses the byte offset from the beginning of a, so I don't understand
> > > the problem you are pointing out.  Can you clarify what you mean?
> > 
> > It does not access the array but the underlying object. On GIMPLE it is just
> > an address calculation without constraints.
> 
> But the computation starts with the subobject and so is only
> valid within the bounds of the subobject.  Or are you saying
> that GCC emits such GIMPLE expressions for valid code?  If so,
> can you give an example of such code?

There were elaborate transforms of ptr + CST to ptr->a.b.c[3] in the
past.  We have ripped out _most_ of them because of bad interaction
with dependence analysis and _b_o_s warnings.

But for example PRE might still end up propagating

 _1 = &ptr->a.b.c;
 _2 = (*_1)[i_3];

as

 _2 = ptr->a.b.c[i_3];

But it's not so much GCC building up GIMPLE expressions that would
cause false positives but "valid" C code and "invalid" C code
represented exactly the same in GCC.  Let's say

struct S {
    short s[4];
    int i;
};

char foo (struct S *p)
{
  *((char *)p->s + 8);
}

for example which I think is perfectly valid but ends up as

foo (struct S * p)
{
  short int[4] * _1;
  char * _2;

  <bb 2> [0.00%] [count: INV]:
  _1 = &p_3(D)->s;
  _2 = _1 + 8;
  return;

in GIMPLE.  You might say well, I should have used (char *)p,
not (char *)p->s, but I believe such code is more common than
you think (and we have the different _FORITIFY_SOURCE levels
for a reason!  You are forcing the most strict interpretation
on all address computations!)

> Or, if that's not it, what exactly is your concern with this
> enhancement?  If it's that it's implemented in forwprop, what
> would be a better place, e.g., earlier in the optimization
> phase?  If it's something something else, I'd appreciate it
> if you could explain what.

For one implementing this in forwprop looks like a move in the
wrong direction.  I'd like to have separate warning passes or
at most amend warnings from optimization passes, not add new ones.

Richard.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH] enhance -Warray-bounds to detect out-of-bounds offsets (PR 82455)
  2017-11-02 11:48             ` Richard Biener
@ 2017-11-10  1:12               ` Jeff Law
  2017-11-10  8:25                 ` Richard Biener
  0 siblings, 1 reply; 22+ messages in thread
From: Jeff Law @ 2017-11-10  1:12 UTC (permalink / raw)
  To: Richard Biener, Martin Sebor; +Cc: Gcc Patch List

On 11/02/2017 05:48 AM, Richard Biener wrote:

> 
> There were elaborate transforms of ptr + CST to ptr->a.b.c[3] in the
> past.  We have ripped out _most_ of them because of bad interaction
> with dependence analysis and _b_o_s warnings.
> 
> But for example PRE might still end up propagating
> 
>  _1 = &ptr->a.b.c;
>  _2 = (*_1)[i_3];
> 
> as
> 
>  _2 = ptr->a.b.c[i_3];
> 
> But it's not so much GCC building up GIMPLE expressions that would
> cause false positives but "valid" C code and "invalid" C code
> represented exactly the same in GCC.  Let's say
I think this is a key point.

> 
>> Or, if that's not it, what exactly is your concern with this
>> enhancement?  If it's that it's implemented in forwprop, what
>> would be a better place, e.g., earlier in the optimization
>> phase?  If it's something something else, I'd appreciate it
>> if you could explain what.
> 
> For one implementing this in forwprop looks like a move in the
> wrong direction.  I'd like to have separate warning passes or
> at most amend warnings from optimization passes, not add new ones.
I tend to agree.  That's one of the reasons why I pushed Aldy away from
doing this kind of stuff within VRP.

What I envision is a pass which does a dominator walk through the
blocks.  It gathers context sensitive range information as it does the walk.

As we encounter array references, we try to check them against the
current range information.  We could also try to warn about certain
pointer computations, though we have to be more careful with those.

Though I certainly still worry that the false positive cases which led
Aldy, Andrew and myself to look at path sensitive ranges arent' resolved
and will limit the utility of doing more array range checking.

Jeff

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH] enhance -Warray-bounds to detect out-of-bounds offsets (PR 82455)
  2017-11-10  1:12               ` Jeff Law
@ 2017-11-10  8:25                 ` Richard Biener
  2017-11-14  0:04                   ` Jeff Law
  2017-11-14  5:22                   ` Martin Sebor
  0 siblings, 2 replies; 22+ messages in thread
From: Richard Biener @ 2017-11-10  8:25 UTC (permalink / raw)
  To: Jeff Law; +Cc: Martin Sebor, Gcc Patch List

On Thu, 9 Nov 2017, Jeff Law wrote:

> On 11/02/2017 05:48 AM, Richard Biener wrote:
> 
> > 
> > There were elaborate transforms of ptr + CST to ptr->a.b.c[3] in the
> > past.  We have ripped out _most_ of them because of bad interaction
> > with dependence analysis and _b_o_s warnings.
> > 
> > But for example PRE might still end up propagating
> > 
> >  _1 = &ptr->a.b.c;
> >  _2 = (*_1)[i_3];
> > 
> > as
> > 
> >  _2 = ptr->a.b.c[i_3];
> > 
> > But it's not so much GCC building up GIMPLE expressions that would
> > cause false positives but "valid" C code and "invalid" C code
> > represented exactly the same in GCC.  Let's say
> I think this is a key point.

It's the usual issue with an optimizing compiler vs. a static analyzer.
We try to get rid of the little semantic details of the input languages
that in the end do not matter for code-generation but that makes
using those semantic details hard (sometimes the little details
are useful, like signed overflow being undefined).

For GIMPLE it's also often the case that we didn't really thoroughly
specify the semantics of the IL - like is an aggregate copy a
block copy (that's how we expand it to RTL) or a memberwise copy?
SRA treats it like the latter in some cases but memcpy folding
turns memcpy into aggregate assignments ... (now think about padding).

It's not that GCC doesn't have its set of existing issues with
respect to interpreting GIMPLE semantic as it seems fit in one way
here and in another way there.  I'm just always nervous when adding
new "interpretations" where I know the non-existing formal definition
of GIMPLE leaves things unspecified.

For example we _do_ use array bounds and array accesses (but _not_
and for now _nowhere_ if they appear in address computations!)
to derive niter information.  At the same time, because of this
exploitation, we try very very hard to never (ok, PRE above as a
couter-example) create an actual array access when dereferencing
a pointer that is constructed by taking the address of an array-ref.
That's why Martin added the warning to forwprop because that pass,
when forwarding such addresses, gets rid of the array-ref.

> >> Or, if that's not it, what exactly is your concern with this
> >> enhancement?  If it's that it's implemented in forwprop, what
> >> would be a better place, e.g., earlier in the optimization
> >> phase?  If it's something something else, I'd appreciate it
> >> if you could explain what.
> > 
> > For one implementing this in forwprop looks like a move in the
> > wrong direction.  I'd like to have separate warning passes or
> > at most amend warnings from optimization passes, not add new ones.
> I tend to agree.  That's one of the reasons why I pushed Aldy away from
> doing this kind of stuff within VRP.
> 
> What I envision is a pass which does a dominator walk through the
> blocks.  It gathers context sensitive range information as it does the walk.
> 
> As we encounter array references, we try to check them against the
> current range information.  We could also try to warn about certain
> pointer computations, though we have to be more careful with those.
> 
> Though I certainly still worry that the false positive cases which led
> Aldy, Andrew and myself to look at path sensitive ranges arent' resolved
> and will limit the utility of doing more array range checking.

I fear while this might be a little bit cleaner you'd still have to
do this very very early in the optimization pipeline (see all the
hard time we had with __builtin_object_size) and thus you won't catch
very many cases unless you start doing an IPA pass and handle propagating
through memory.  Which is when you arrived at a full-blown static
analyzer.

Richard.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH] enhance -Warray-bounds to detect out-of-bounds offsets (PR 82455)
  2017-11-10  8:25                 ` Richard Biener
@ 2017-11-14  0:04                   ` Jeff Law
  2017-11-14  9:11                     ` Richard Biener
  2017-11-14  5:22                   ` Martin Sebor
  1 sibling, 1 reply; 22+ messages in thread
From: Jeff Law @ 2017-11-14  0:04 UTC (permalink / raw)
  To: Richard Biener; +Cc: Martin Sebor, Gcc Patch List

On 11/10/2017 01:00 AM, Richard Biener wrote:
> 
> It's the usual issue with an optimizing compiler vs. a static analyzer.
> We try to get rid of the little semantic details of the input languages
> that in the end do not matter for code-generation but that makes
> using those semantic details hard (sometimes the little details
> are useful, like signed overflow being undefined).
> 
> For GIMPLE it's also often the case that we didn't really thoroughly
> specify the semantics of the IL - like is an aggregate copy a
> block copy (that's how we expand it to RTL) or a memberwise copy?
> SRA treats it like the latter in some cases but memcpy folding
> turns memcpy into aggregate assignments ... (now think about padding).
Understood far too well.  In fact, I was looking at the aggregate copy
stuff not terribly long ago and concluded that depending on either
particular behavior was undesirable. Something (glibc IIRC) was
depending on the padding being copied because they were actually shoving
live data into the pad.  Ugh.

> 
> It's not that GCC doesn't have its set of existing issues with
> respect to interpreting GIMPLE semantic as it seems fit in one way
> here and in another way there.  I'm just always nervous when adding
> new "interpretations" where I know the non-existing formal definition
> of GIMPLE leaves things unspecified.
Right.  No disagreement from me.  We have these issues and address
representation is just one of a class of things which we can represent
in gimple that don't "properly" map back to the source language.  And
that's probably inherent in the lowering from the source language to
something like GIMPLE.


> 
> For example we _do_ use array bounds and array accesses (but _not_
> and for now _nowhere_ if they appear in address computations!)
> to derive niter information.  At the same time, because of this
> exploitation, we try very very hard to never (ok, PRE above as a
> couter-example) create an actual array access when dereferencing
> a pointer that is constructed by taking the address of an array-ref.
> That's why Martin added the warning to forwprop because that pass,
> when forwarding such addresses, gets rid of the array-ref.
Right.  IIRC there's some BZs around this issue that come up
release-to-release related to how we've changed this code over the last
few years.


> 
>>>> Or, if that's not it, what exactly is your concern with this
>>>> enhancement?  If it's that it's implemented in forwprop, what
>>>> would be a better place, e.g., earlier in the optimization
>>>> phase?  If it's something something else, I'd appreciate it
>>>> if you could explain what.
>>>
>>> For one implementing this in forwprop looks like a move in the
>>> wrong direction.  I'd like to have separate warning passes or
>>> at most amend warnings from optimization passes, not add new ones.
>> I tend to agree.  That's one of the reasons why I pushed Aldy away from
>> doing this kind of stuff within VRP.
>>
>> What I envision is a pass which does a dominator walk through the
>> blocks.  It gathers context sensitive range information as it does the walk.
>>
>> As we encounter array references, we try to check them against the
>> current range information.  We could also try to warn about certain
>> pointer computations, though we have to be more careful with those.
>>
>> Though I certainly still worry that the false positive cases which led
>> Aldy, Andrew and myself to look at path sensitive ranges arent' resolved
>> and will limit the utility of doing more array range checking.
> 
> I fear while this might be a little bit cleaner you'd still have to
> do this very very early in the optimization pipeline (see all the
> hard time we had with __builtin_object_size) and thus you won't catch
> very many cases unless you start doing an IPA pass and handle propagating
> through memory.  Which is when you arrived at a full-blown static
> analyzer.
Could be.  We won't know until we give it a whirl.  FWIW, we saw far
more problems due to the lack of path sensitivity than anything.
Nothing I'm suggesting in this thread addresses that problem.

Doing a really good job for warnings with path sensitivity shares a lot
of properties with jump threading.  Specifically that you need to
propagate range knowledge along a path, often past join points in the
CFG (ie, you're propagating beyond the dominance frontier).

Once you do a good job there, I'd strongly suspect that IPA issues would
then dominate.

Jeff

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH] enhance -Warray-bounds to detect out-of-bounds offsets (PR 82455)
  2017-11-10  8:25                 ` Richard Biener
  2017-11-14  0:04                   ` Jeff Law
@ 2017-11-14  5:22                   ` Martin Sebor
  2017-11-14  9:13                     ` Richard Biener
  2017-11-15  1:54                     ` Jeff Law
  1 sibling, 2 replies; 22+ messages in thread
From: Martin Sebor @ 2017-11-14  5:22 UTC (permalink / raw)
  To: Richard Biener, Jeff Law; +Cc: Gcc Patch List

On 11/10/2017 01:00 AM, Richard Biener wrote:
> On Thu, 9 Nov 2017, Jeff Law wrote:
>
>> On 11/02/2017 05:48 AM, Richard Biener wrote:
>>
>>>
>>> There were elaborate transforms of ptr + CST to ptr->a.b.c[3] in the
>>> past.  We have ripped out _most_ of them because of bad interaction
>>> with dependence analysis and _b_o_s warnings.
>>>
>>> But for example PRE might still end up propagating
>>>
>>>  _1 = &ptr->a.b.c;
>>>  _2 = (*_1)[i_3];
>>>
>>> as
>>>
>>>  _2 = ptr->a.b.c[i_3];
>>>
>>> But it's not so much GCC building up GIMPLE expressions that would
>>> cause false positives but "valid" C code and "invalid" C code
>>> represented exactly the same in GCC.  Let's say
>> I think this is a key point.
>
> It's the usual issue with an optimizing compiler vs. a static analyzer.
> We try to get rid of the little semantic details of the input languages
> that in the end do not matter for code-generation but that makes
> using those semantic details hard (sometimes the little details
> are useful, like signed overflow being undefined).
>
> For GIMPLE it's also often the case that we didn't really thoroughly
> specify the semantics of the IL - like is an aggregate copy a
> block copy (that's how we expand it to RTL) or a memberwise copy?
> SRA treats it like the latter in some cases but memcpy folding
> turns memcpy into aggregate assignments ... (now think about padding).
>
> It's not that GCC doesn't have its set of existing issues with
> respect to interpreting GIMPLE semantic as it seems fit in one way
> here and in another way there.  I'm just always nervous when adding
> new "interpretations" where I know the non-existing formal definition
> of GIMPLE leaves things unspecified.
>
> For example we _do_ use array bounds and array accesses (but _not_
> and for now _nowhere_ if they appear in address computations!)
> to derive niter information.  At the same time, because of this
> exploitation, we try very very hard to never (ok, PRE above as a
> couter-example) create an actual array access when dereferencing
> a pointer that is constructed by taking the address of an array-ref.
> That's why Martin added the warning to forwprop because that pass,
> when forwarding such addresses, gets rid of the array-ref.
>
>>>> Or, if that's not it, what exactly is your concern with this
>>>> enhancement?  If it's that it's implemented in forwprop, what
>>>> would be a better place, e.g., earlier in the optimization
>>>> phase?  If it's something something else, I'd appreciate it
>>>> if you could explain what.
>>>
>>> For one implementing this in forwprop looks like a move in the
>>> wrong direction.  I'd like to have separate warning passes or
>>> at most amend warnings from optimization passes, not add new ones.
>> I tend to agree.  That's one of the reasons why I pushed Aldy away from
>> doing this kind of stuff within VRP.
>>
>> What I envision is a pass which does a dominator walk through the
>> blocks.  It gathers context sensitive range information as it does the walk.
>>
>> As we encounter array references, we try to check them against the
>> current range information.  We could also try to warn about certain
>> pointer computations, though we have to be more careful with those.
>>
>> Though I certainly still worry that the false positive cases which led
>> Aldy, Andrew and myself to look at path sensitive ranges arent' resolved
>> and will limit the utility of doing more array range checking.
>
> I fear while this might be a little bit cleaner you'd still have to
> do this very very early in the optimization pipeline (see all the
> hard time we had with __builtin_object_size) and thus you won't catch
> very many cases unless you start doing an IPA pass and handle propagating
> through memory.  Which is when you arrived at a full-blown static
> analyzer.

I have a different concern with the general idea of moving these
kinds of warnings into passes of their own.  It would unavoidably
result in duplicating some code from the optimization passes (at
a minimum, the GIMPLE traversal, but likely more than that).  It
would also require pretty tight coupling between the warning pass
and the optimization passes as the latter can defeat the work of
the former (as happens in this case).  But most significantly,
the separation isn't feasible in cases where the optimization
pass computes and maintains non-trivial internal state (like,
say, tree-object-size or tree-ssa-strlen).

I think of these kinds of "basic" warnings as error checking and
handling.  It makes no more sense to me to have a separate pass
check for input errors than it would to have one to check the
validity of other kinds of input within GCC (such as trees of
the wrong kind being passed from one function to another due to
a GCC programmer's error).  They go hand in hand.

To be sure, there are more sophisticated kinds of checkers that
depend on an in-depth analysis of the program that's outside of
what most existing optimization passes do.  The sprintf warning
comes to mind as an example.  But even those can be (and the
sprintf pass has been) used to do both: detect and diagnose
errors _and_ optimize code.  And even these passes (or especially
they) do or would benefit from data computed by other optimization
passes to deliver better results (whether it be avoiding false
positives or improving the code they generate).

That said, it would make perfect sense to me to have the warning
code live in separate files from the optimization code, and keep
the separation of concerns and responsibilities crisp and well
defined by their APIs.

Martin

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH] enhance -Warray-bounds to detect out-of-bounds offsets (PR 82455)
  2017-11-14  0:04                   ` Jeff Law
@ 2017-11-14  9:11                     ` Richard Biener
  2017-11-15  1:52                       ` Jeff Law
  0 siblings, 1 reply; 22+ messages in thread
From: Richard Biener @ 2017-11-14  9:11 UTC (permalink / raw)
  To: Jeff Law; +Cc: Martin Sebor, Gcc Patch List

On Mon, 13 Nov 2017, Jeff Law wrote:

> On 11/10/2017 01:00 AM, Richard Biener wrote:
> > 
> > It's the usual issue with an optimizing compiler vs. a static analyzer.
> > We try to get rid of the little semantic details of the input languages
> > that in the end do not matter for code-generation but that makes
> > using those semantic details hard (sometimes the little details
> > are useful, like signed overflow being undefined).
> > 
> > For GIMPLE it's also often the case that we didn't really thoroughly
> > specify the semantics of the IL - like is an aggregate copy a
> > block copy (that's how we expand it to RTL) or a memberwise copy?
> > SRA treats it like the latter in some cases but memcpy folding
> > turns memcpy into aggregate assignments ... (now think about padding).
> Understood far too well.  In fact, I was looking at the aggregate copy
> stuff not terribly long ago and concluded that depending on either
> particular behavior was undesirable. Something (glibc IIRC) was
> depending on the padding being copied because they were actually shoving
> live data into the pad.  Ugh.
> 
> > 
> > It's not that GCC doesn't have its set of existing issues with
> > respect to interpreting GIMPLE semantic as it seems fit in one way
> > here and in another way there.  I'm just always nervous when adding
> > new "interpretations" where I know the non-existing formal definition
> > of GIMPLE leaves things unspecified.
> Right.  No disagreement from me.  We have these issues and address
> representation is just one of a class of things which we can represent
> in gimple that don't "properly" map back to the source language.  And
> that's probably inherent in the lowering from the source language to
> something like GIMPLE.
> 
> 
> > 
> > For example we _do_ use array bounds and array accesses (but _not_
> > and for now _nowhere_ if they appear in address computations!)
> > to derive niter information.  At the same time, because of this
> > exploitation, we try very very hard to never (ok, PRE above as a
> > couter-example) create an actual array access when dereferencing
> > a pointer that is constructed by taking the address of an array-ref.
> > That's why Martin added the warning to forwprop because that pass,
> > when forwarding such addresses, gets rid of the array-ref.
> Right.  IIRC there's some BZs around this issue that come up
> release-to-release related to how we've changed this code over the last
> few years.
> 
> 
> > 
> >>>> Or, if that's not it, what exactly is your concern with this
> >>>> enhancement?  If it's that it's implemented in forwprop, what
> >>>> would be a better place, e.g., earlier in the optimization
> >>>> phase?  If it's something something else, I'd appreciate it
> >>>> if you could explain what.
> >>>
> >>> For one implementing this in forwprop looks like a move in the
> >>> wrong direction.  I'd like to have separate warning passes or
> >>> at most amend warnings from optimization passes, not add new ones.
> >> I tend to agree.  That's one of the reasons why I pushed Aldy away from
> >> doing this kind of stuff within VRP.
> >>
> >> What I envision is a pass which does a dominator walk through the
> >> blocks.  It gathers context sensitive range information as it does the walk.
> >>
> >> As we encounter array references, we try to check them against the
> >> current range information.  We could also try to warn about certain
> >> pointer computations, though we have to be more careful with those.
> >>
> >> Though I certainly still worry that the false positive cases which led
> >> Aldy, Andrew and myself to look at path sensitive ranges arent' resolved
> >> and will limit the utility of doing more array range checking.
> > 
> > I fear while this might be a little bit cleaner you'd still have to
> > do this very very early in the optimization pipeline (see all the
> > hard time we had with __builtin_object_size) and thus you won't catch
> > very many cases unless you start doing an IPA pass and handle propagating
> > through memory.  Which is when you arrived at a full-blown static
> > analyzer.
> Could be.  We won't know until we give it a whirl.  FWIW, we saw far
> more problems due to the lack of path sensitivity than anything.
> Nothing I'm suggesting in this thread addresses that problem.
> 
> Doing a really good job for warnings with path sensitivity shares a lot
> of properties with jump threading.  Specifically that you need to
> propagate range knowledge along a path, often past join points in the
> CFG (ie, you're propagating beyond the dominance frontier).

Right.

> Once you do a good job there, I'd strongly suspect that IPA issues would
> then dominate.

I suspect once you're dealing with C++ code you run into the issue
that even early inlining exposes code with forwprop run on it
before running forwprop again on the inlined-into body.

So the IPA issues start very early.  Of course if you are doing
path-sensitive processing then processing call/return "edges" as if
they were CFG edges shouldn't be too hard.

Then the only issue remaining is that there are very many
paths in a program compared to # edges or # blocks which means
you'll quickly run into compile-time issues.

Static analyzers are hard ;)  But I appreciate somebody finally
trying that route.  Ideally we'd do the static analysis in parallel
to the compilation given we'd need an "early" LTO phase before
early inlining.  Thus do a LTO out then in parallel do WPA
static analysis with diagnostics.

Richard.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH] enhance -Warray-bounds to detect out-of-bounds offsets (PR 82455)
  2017-11-14  5:22                   ` Martin Sebor
@ 2017-11-14  9:13                     ` Richard Biener
  2017-11-15  1:54                     ` Jeff Law
  1 sibling, 0 replies; 22+ messages in thread
From: Richard Biener @ 2017-11-14  9:13 UTC (permalink / raw)
  To: Martin Sebor; +Cc: Jeff Law, Gcc Patch List

On Mon, 13 Nov 2017, Martin Sebor wrote:

> On 11/10/2017 01:00 AM, Richard Biener wrote:
> > On Thu, 9 Nov 2017, Jeff Law wrote:
> > 
> > > On 11/02/2017 05:48 AM, Richard Biener wrote:
> > > 
> > > > 
> > > > There were elaborate transforms of ptr + CST to ptr->a.b.c[3] in the
> > > > past.  We have ripped out _most_ of them because of bad interaction
> > > > with dependence analysis and _b_o_s warnings.
> > > > 
> > > > But for example PRE might still end up propagating
> > > > 
> > > >  _1 = &ptr->a.b.c;
> > > >  _2 = (*_1)[i_3];
> > > > 
> > > > as
> > > > 
> > > >  _2 = ptr->a.b.c[i_3];
> > > > 
> > > > But it's not so much GCC building up GIMPLE expressions that would
> > > > cause false positives but "valid" C code and "invalid" C code
> > > > represented exactly the same in GCC.  Let's say
> > > I think this is a key point.
> > 
> > It's the usual issue with an optimizing compiler vs. a static analyzer.
> > We try to get rid of the little semantic details of the input languages
> > that in the end do not matter for code-generation but that makes
> > using those semantic details hard (sometimes the little details
> > are useful, like signed overflow being undefined).
> > 
> > For GIMPLE it's also often the case that we didn't really thoroughly
> > specify the semantics of the IL - like is an aggregate copy a
> > block copy (that's how we expand it to RTL) or a memberwise copy?
> > SRA treats it like the latter in some cases but memcpy folding
> > turns memcpy into aggregate assignments ... (now think about padding).
> > 
> > It's not that GCC doesn't have its set of existing issues with
> > respect to interpreting GIMPLE semantic as it seems fit in one way
> > here and in another way there.  I'm just always nervous when adding
> > new "interpretations" where I know the non-existing formal definition
> > of GIMPLE leaves things unspecified.
> > 
> > For example we _do_ use array bounds and array accesses (but _not_
> > and for now _nowhere_ if they appear in address computations!)
> > to derive niter information.  At the same time, because of this
> > exploitation, we try very very hard to never (ok, PRE above as a
> > couter-example) create an actual array access when dereferencing
> > a pointer that is constructed by taking the address of an array-ref.
> > That's why Martin added the warning to forwprop because that pass,
> > when forwarding such addresses, gets rid of the array-ref.
> > 
> > > > > Or, if that's not it, what exactly is your concern with this
> > > > > enhancement?  If it's that it's implemented in forwprop, what
> > > > > would be a better place, e.g., earlier in the optimization
> > > > > phase?  If it's something something else, I'd appreciate it
> > > > > if you could explain what.
> > > > 
> > > > For one implementing this in forwprop looks like a move in the
> > > > wrong direction.  I'd like to have separate warning passes or
> > > > at most amend warnings from optimization passes, not add new ones.
> > > I tend to agree.  That's one of the reasons why I pushed Aldy away from
> > > doing this kind of stuff within VRP.
> > > 
> > > What I envision is a pass which does a dominator walk through the
> > > blocks.  It gathers context sensitive range information as it does the
> > > walk.
> > > 
> > > As we encounter array references, we try to check them against the
> > > current range information.  We could also try to warn about certain
> > > pointer computations, though we have to be more careful with those.
> > > 
> > > Though I certainly still worry that the false positive cases which led
> > > Aldy, Andrew and myself to look at path sensitive ranges arent' resolved
> > > and will limit the utility of doing more array range checking.
> > 
> > I fear while this might be a little bit cleaner you'd still have to
> > do this very very early in the optimization pipeline (see all the
> > hard time we had with __builtin_object_size) and thus you won't catch
> > very many cases unless you start doing an IPA pass and handle propagating
> > through memory.  Which is when you arrived at a full-blown static
> > analyzer.
> 
> I have a different concern with the general idea of moving these
> kinds of warnings into passes of their own.  It would unavoidably
> result in duplicating some code from the optimization passes (at
> a minimum, the GIMPLE traversal, but likely more than that).  It
> would also require pretty tight coupling between the warning pass
> and the optimization passes as the latter can defeat the work of
> the former (as happens in this case).  But most significantly,
> the separation isn't feasible in cases where the optimization
> pass computes and maintains non-trivial internal state (like,
> say, tree-object-size or tree-ssa-strlen).

Well, as I said elsewhere the goal is to modularize the "optimizations"
enough so you can run their analysis phase together from a
"warning pass" and get access to their internal state.  Like what
Jeff is doing with VRP.  I'm currently working on some region-based
value-numbering where the iteration scheme is supposed to allow
pluggin in, say, a value-range lattice.  It's currently not
allowing path separation (simulating jump-threading) at the moment
but you have to start somewhere ...

> I think of these kinds of "basic" warnings as error checking and
> handling.  It makes no more sense to me to have a separate pass
> check for input errors than it would to have one to check the
> validity of other kinds of input within GCC (such as trees of
> the wrong kind being passed from one function to another due to
> a GCC programmer's error).  They go hand in hand.
> 
> To be sure, there are more sophisticated kinds of checkers that
> depend on an in-depth analysis of the program that's outside of
> what most existing optimization passes do.  The sprintf warning
> comes to mind as an example.  But even those can be (and the
> sprintf pass has been) used to do both: detect and diagnose
> errors _and_ optimize code.  And even these passes (or especially
> they) do or would benefit from data computed by other optimization
> passes to deliver better results (whether it be avoiding false
> positives or improving the code they generate).

Sure - no disagreement here.  But then those warnings have to
operate within the limitations of the GIMPLE IL which was designed
for an optimizing compiler with a variety of input source languages.
That is, you simply can't implement every diagnostic you'd like
to have when coming from a source level perspective.

It seems to be you're somewhat pushing the boundary here ;)

Richard.

> That said, it would make perfect sense to me to have the warning
> code live in separate files from the optimization code, and keep
> the separation of concerns and responsibilities crisp and well
> defined by their APIs.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH] enhance -Warray-bounds to detect out-of-bounds offsets (PR 82455)
  2017-11-14  9:11                     ` Richard Biener
@ 2017-11-15  1:52                       ` Jeff Law
  0 siblings, 0 replies; 22+ messages in thread
From: Jeff Law @ 2017-11-15  1:52 UTC (permalink / raw)
  To: Richard Biener; +Cc: Martin Sebor, Gcc Patch List

On 11/14/2017 01:42 AM, Richard Biener wrote:

> 
> I suspect once you're dealing with C++ code you run into the issue
> that even early inlining exposes code with forwprop run on it
> before running forwprop again on the inlined-into body.
> 
> So the IPA issues start very early.  Of course if you are doing
> path-sensitive processing then processing call/return "edges" as if
> they were CFG edges shouldn't be too hard.
> 
> Then the only issue remaining is that there are very many
> paths in a program compared to # edges or # blocks which means
> you'll quickly run into compile-time issues.
Yup.  There's a reason why static analysis is usually not on by default.
   It's bloody expensive.

> 
> Static analyzers are hard ;)  But I appreciate somebody finally
> trying that route.  Ideally we'd do the static analysis in parallel
> to the compilation given we'd need an "early" LTO phase before
> early inlining.  Thus do a LTO out then in parallel do WPA
> static analysis with diagnostics.
> 
No doubt.  I wouldn't even say we're going that route, at least not in
any real sense.  The walkers would probably be structured differently
than what we're doing and have to hook in much earlier.  You have to
build both the callgraph walker as well as the CFG walker and the
ability to pass data back and forth between them.  You'd also have to
finish all the deferred folding stuff so that what you analyze is closer
to the actual source.

We're just building on the existing infrastructure we have to give
better function scoped warnings.

jeff

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH] enhance -Warray-bounds to detect out-of-bounds offsets (PR 82455)
  2017-11-14  5:22                   ` Martin Sebor
  2017-11-14  9:13                     ` Richard Biener
@ 2017-11-15  1:54                     ` Jeff Law
  1 sibling, 0 replies; 22+ messages in thread
From: Jeff Law @ 2017-11-15  1:54 UTC (permalink / raw)
  To: Martin Sebor, Richard Biener; +Cc: Gcc Patch List

On 11/13/2017 06:21 PM, Martin Sebor wrote:
> I have a different concern with the general idea of moving these
> kinds of warnings into passes of their own.  It would unavoidably
> result in duplicating some code from the optimization passes (at
> a minimum, the GIMPLE traversal, but likely more than that).  It
> would also require pretty tight coupling between the warning pass
> and the optimization passes as the latter can defeat the work of
> the former (as happens in this case).  But most significantly,
> the separation isn't feasible in cases where the optimization
> pass computes and maintains non-trivial internal state (like,
> say, tree-object-size or tree-ssa-strlen).
Something like object-size, string lengths, value data, etc should be
independent analysis modules.  You then build an optimizer on top of the
module and/or a warning pass on top of the module.  The optimizer and
warning pass might in fact share analysis and just do different things
with the underlying data produced.

It's not perfect and we're not likely to ever be perfect, but it's a
hell of a lot better than what we're doing now.   Yes there'll be phase
ordering problems, problems with other passes hiding or exposing
questionable code, possibly duplication between the optimization client
and warning client, etc, etc.  We have to look at each problem and
decide how best to proceed.

I do think that the frameworks we build should have mechanisms to allow
transformation of code that is clearly wrong and exhibits undefined
behavior.

> 
> That said, it would make perfect sense to me to have the warning
> code live in separate files from the optimization code, and keep
> the separation of concerns and responsibilities crisp and well
> defined by their APIs.
Where that's feasible it seems reasonable to me.  I'm obviously biased
based on recent experiences within VRP which has grown over time into a
horrid mess my forays into tree-ssa-uninit.c which has a damn cool
little predicate analysis engine just waiting for someone extract and
re-use.

Jeff

^ permalink raw reply	[flat|nested] 22+ messages in thread

end of thread, other threads:[~2017-11-15  1:06 UTC | newest]

Thread overview: 22+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-10-29 16:15 [PATCH] enhance -Warray-bounds to detect out-of-bounds offsets (PR 82455) Martin Sebor
2017-10-30 11:55 ` Richard Biener
2017-10-30 15:21   ` Martin Sebor
2017-10-30 19:59     ` Richard Biener
2017-10-30 20:40       ` Martin Sebor
2017-10-30 21:23         ` Richard Biener
2017-10-30 21:49           ` Martin Sebor
2017-11-02 11:48             ` Richard Biener
2017-11-10  1:12               ` Jeff Law
2017-11-10  8:25                 ` Richard Biener
2017-11-14  0:04                   ` Jeff Law
2017-11-14  9:11                     ` Richard Biener
2017-11-15  1:52                       ` Jeff Law
2017-11-14  5:22                   ` Martin Sebor
2017-11-14  9:13                     ` Richard Biener
2017-11-15  1:54                     ` Jeff Law
2017-10-30 22:16     ` Jeff Law
2017-10-30 23:30       ` Martin Sebor
2017-10-31  4:32         ` Jeff Law
2017-11-01 22:21           ` Martin Sebor
2017-11-02 11:27           ` Richard Biener
2017-10-30 22:16 ` Jeff Law

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).