public inbox for gcc-cvs@sourceware.org
help / color / mirror / Atom feed
* [gcc r13-2546] analyzer: support for symbolic values in the out-of-bounds checker [PR106625]
@ 2022-09-08 17:47 Tim Lange
  0 siblings, 0 replies; only message in thread
From: Tim Lange @ 2022-09-08 17:47 UTC (permalink / raw)
  To: gcc-cvs

https://gcc.gnu.org/g:7a6564c9b277d9f93582605758d57457de696deb

commit r13-2546-g7a6564c9b277d9f93582605758d57457de696deb
Author: Tim Lange <mail@tim-lange.me>
Date:   Wed Sep 7 17:41:28 2022 +0200

    analyzer: support for symbolic values in the out-of-bounds checker [PR106625]
    
    This patch adds support for reasoning about the inequality of two symbolic
    values in the special case specifically suited for reasoning about
    out-of-bounds past the end of the buffer. With this patch, the analyzer
    catches off-by-one errors and more even when the offset and capacity is
    symbolic.
    
    Regrtested on Linux x86_64 and tested on coreutils, curl, httpd and
    openssh as usual.
    
    2022-09-07  Tim Lange  <mail@tim-lange.me>
    
    gcc/analyzer/ChangeLog:
    
            PR analyzer/106625
            * analyzer.h (region_offset): Eliminate m_is_symbolic member.
            * region-model-impl-calls.cc (region_model::impl_call_realloc):
            Refine implementation to be more precise.
            * region-model.cc (class symbolic_past_the_end):
            Abstract diagnostic class to complain about accesses past the end
            with symbolic values.
            (class symbolic_buffer_overflow):
            Concrete diagnostic class to complain about buffer overflows with
            symbolic values.
            (class symbolic_buffer_overread):
            Concrete diagnostic class to complain about buffer overreads with
            symbolic values.
            (region_model::check_symbolic_bounds): New function.
            (maybe_get_integer_cst_tree): New helper function.
            (region_model::check_region_bounds):
            Add call to check_symbolic_bounds if offset is not concrete.
            (region_model::eval_condition_without_cm):
            Add support for EQ_EXPR and GT_EXPR with binaryop_svalues.
            (is_positive_svalue): New hleper function.
            (region_model::symbolic_greater_than):
            New function to handle GT_EXPR comparisons with symbolic values.
            (region_model::structural_equality): New function to compare
            whether two svalues are structured the same, i.e. evaluate to
            the same value.
            (test_struct): Reflect changes to region::calc_offset.
            (test_var): Likewise.
            (test_array_2): Likewise and add selftest with symbolic i.
            * region-model.h (class region_model): Add check_symbolic_bounds,
            symbolic_greater_than and structural_equality.
            * region.cc (region::get_offset):
            Reflect changes to region::calc_offset.
            (region::calc_offset):
            Compute the symbolic offset if the offset is not concrete.
            (region::get_relative_symbolic_offset): New function to return the
            symbolic offset in bytes relative to its parent.
            (field_region::get_relative_symbolic_offset): Likewise.
            (element_region::get_relative_symbolic_offset): Likewise.
            (offset_region::get_relative_symbolic_offset): Likewise.
            (bit_range_region::get_relative_symbolic_offset): Likewise.
            * region.h: Add get_relative_symbolic_offset.
            * store.cc (binding_key::make):
            Reflect changes to region::calc_offset.
            (binding_map::apply_ctor_val_to_range): Likewise.
            (binding_map::apply_ctor_pair_to_child_region): Likewise.
            (binding_cluster::bind_compound_sval): Likewise.
            (binding_cluster::get_any_binding): Likewise.
            (binding_cluster::maybe_get_compound_binding): Likewise.
    
    gcc/ChangeLog:
    
            PR analyzer/106625
            * doc/invoke.texi:
            State that the checker also reasons about symbolic values.
    
    gcc/testsuite/ChangeLog:
    
            PR analyzer/106625
            * gcc.dg/analyzer/data-model-1.c: Change expected result.
            * gcc.dg/analyzer/out-of-bounds-5.c: New test.
            * gcc.dg/analyzer/out-of-bounds-realloc-grow.c: New test.
            * gcc.dg/analyzer/symbolic-gt-1.c: New test.

Diff:
---
 gcc/analyzer/analyzer.h                            |  23 +-
 gcc/analyzer/region-model-impl-calls.cc            |  39 +-
 gcc/analyzer/region-model.cc                       | 469 +++++++++++++++++++--
 gcc/analyzer/region-model.h                        |   9 +
 gcc/analyzer/region.cc                             | 131 +++++-
 gcc/analyzer/region.h                              |  17 +-
 gcc/analyzer/store.cc                              |  18 +-
 gcc/doc/invoke.texi                                |   8 +-
 gcc/testsuite/gcc.dg/analyzer/data-model-1.c       |   3 +-
 gcc/testsuite/gcc.dg/analyzer/out-of-bounds-5.c    | 156 +++++++
 .../gcc.dg/analyzer/out-of-bounds-realloc-grow.c   |  87 ++++
 gcc/testsuite/gcc.dg/analyzer/symbolic-gt-1.c      |  76 ++++
 12 files changed, 941 insertions(+), 95 deletions(-)

diff --git a/gcc/analyzer/analyzer.h b/gcc/analyzer/analyzer.h
index dcefc13a546..e4dd6d6339d 100644
--- a/gcc/analyzer/analyzer.h
+++ b/gcc/analyzer/analyzer.h
@@ -172,16 +172,17 @@ public:
   static region_offset make_concrete (const region *base_region,
 				      bit_offset_t offset)
   {
-    return region_offset (base_region, offset, false);
+    return region_offset (base_region, offset, NULL);
   }
-  static region_offset make_symbolic (const region *base_region)
+  static region_offset make_symbolic (const region *base_region,
+				      const svalue *sym_offset)
   {
-    return region_offset (base_region, 0, true);
+    return region_offset (base_region, 0, sym_offset);
   }
 
   const region *get_base_region () const { return m_base_region; }
 
-  bool symbolic_p () const { return m_is_symbolic; }
+  bool symbolic_p () const { return m_sym_offset != NULL; }
 
   bit_offset_t get_bit_offset () const
   {
@@ -189,22 +190,28 @@ public:
     return m_offset;
   }
 
+  const svalue *get_symbolic_byte_offset () const
+  {
+    gcc_assert (symbolic_p ());
+    return m_sym_offset;
+  }
+
   bool operator== (const region_offset &other) const
   {
     return (m_base_region == other.m_base_region
 	    && m_offset == other.m_offset
-	    && m_is_symbolic == other.m_is_symbolic);
+	    && m_sym_offset == other.m_sym_offset);
   }
 
 private:
   region_offset (const region *base_region, bit_offset_t offset,
-		 bool is_symbolic)
-  : m_base_region (base_region), m_offset (offset), m_is_symbolic (is_symbolic)
+		 const svalue *sym_offset)
+  : m_base_region (base_region), m_offset (offset), m_sym_offset (sym_offset)
   {}
 
   const region *m_base_region;
   bit_offset_t m_offset;
-  bool m_is_symbolic;
+  const svalue *m_sym_offset;
 };
 
 extern location_t get_stmt_location (const gimple *stmt, function *fun);
diff --git a/gcc/analyzer/region-model-impl-calls.cc b/gcc/analyzer/region-model-impl-calls.cc
index 3790eaf2d79..617491be306 100644
--- a/gcc/analyzer/region-model-impl-calls.cc
+++ b/gcc/analyzer/region-model-impl-calls.cc
@@ -850,7 +850,7 @@ region_model::impl_call_realloc (const call_details &cd)
 	  if (old_size_sval)
 	    {
 	      const svalue *copied_size_sval
-		= get_copied_size (old_size_sval, new_size_sval);
+		= get_copied_size (model, old_size_sval, new_size_sval);
 	      const region *copied_old_reg
 		= model->m_mgr->get_sized_region (freed_reg, NULL,
 						  copied_size_sval);
@@ -896,35 +896,22 @@ region_model::impl_call_realloc (const call_details &cd)
 
   private:
     /* Return the lesser of OLD_SIZE_SVAL and NEW_SIZE_SVAL.
-       If either one is symbolic, the symbolic svalue is returned.  */
-    const svalue *get_copied_size (const svalue *old_size_sval,
+       If unknown, OLD_SIZE_SVAL is returned.  */
+    const svalue *get_copied_size (region_model *model,
+				   const svalue *old_size_sval,
 				   const svalue *new_size_sval) const
     {
-      tree old_size_cst = old_size_sval->maybe_get_constant ();
-      tree new_size_cst = new_size_sval->maybe_get_constant ();
-
-      if (old_size_cst && new_size_cst)
-	{
-	  /* Both are constants and comparable.  */
-	  tree cmp = fold_binary (LT_EXPR, boolean_type_node,
-				  old_size_cst, new_size_cst);
-
-	  if (cmp == boolean_true_node)
-	    return old_size_sval;
-	  else
-	    return new_size_sval;
-	}
-      else if (new_size_cst)
+      tristate res
+	= model->eval_condition (old_size_sval, GT_EXPR, new_size_sval);
+      switch (res.get_value ())
 	{
-	  /* OLD_SIZE_SVAL is symbolic, so return that.  */
-	  return old_size_sval;
-	}
-      else
-	{
-	  /* NEW_SIZE_SVAL is symbolic or both are symbolic.
-	     Return NEW_SIZE_SVAL, because implementations of realloc
-	     probably only moves the buffer if the new size is larger.  */
+	case tristate::TS_TRUE:
 	  return new_size_sval;
+	case tristate::TS_FALSE:
+	case tristate::TS_UNKNOWN:
+	  return old_size_sval;
+	default:
+	  gcc_unreachable ();
 	}
     }
   };
diff --git a/gcc/analyzer/region-model.cc b/gcc/analyzer/region-model.cc
index e84087abc1f..d321e5b6479 100644
--- a/gcc/analyzer/region-model.cc
+++ b/gcc/analyzer/region-model.cc
@@ -1268,7 +1268,7 @@ region_model::on_stmt_pre (const gimple *stmt,
     }
 }
 
-/* Abstract base class for all out-of-bounds warnings.  */
+/* Abstract base class for all out-of-bounds warnings with concrete values.  */
 
 class out_of_bounds : public pending_diagnostic_subclass<out_of_bounds>
 {
@@ -1591,49 +1591,285 @@ public:
   }
 };
 
+/* Abstract class to complain about out-of-bounds read/writes where
+   the values are symbolic.  */
+
+class symbolic_past_the_end
+  : public pending_diagnostic_subclass<symbolic_past_the_end>
+{
+public:
+  symbolic_past_the_end (const region *reg, tree diag_arg, tree offset,
+			 tree num_bytes, tree capacity)
+  : m_reg (reg), m_diag_arg (diag_arg), m_offset (offset),
+    m_num_bytes (num_bytes), m_capacity (capacity)
+  {}
+
+  const char *get_kind () const final override
+  {
+    return "symbolic_past_the_end";
+  }
+
+  bool operator== (const symbolic_past_the_end &other) const
+  {
+    return m_reg == other.m_reg
+	   && pending_diagnostic::same_tree_p (m_diag_arg, other.m_diag_arg)
+	   && pending_diagnostic::same_tree_p (m_offset, other.m_offset)
+	   && pending_diagnostic::same_tree_p (m_num_bytes, other.m_num_bytes)
+	   && pending_diagnostic::same_tree_p (m_capacity, other.m_capacity);
+  }
+
+  int get_controlling_option () const final override
+  {
+    return OPT_Wanalyzer_out_of_bounds;
+  }
+
+  void mark_interesting_stuff (interesting_t *interest) final override
+  {
+    interest->add_region_creation (m_reg);
+  }
+
+  label_text
+  describe_region_creation_event (const evdesc::region_creation &ev) final
+  override
+  {
+    if (m_capacity)
+      return ev.formatted_print ("capacity is %qE bytes", m_capacity);
+
+    return label_text ();
+  }
+
+  label_text
+  describe_final_event (const evdesc::final_event &ev) final override
+  {
+    const char *byte_str;
+    if (pending_diagnostic::same_tree_p (m_num_bytes, integer_one_node))
+      byte_str = "byte";
+    else
+      byte_str = "bytes";
+
+    if (m_offset)
+      {
+	if (m_num_bytes && TREE_CODE (m_num_bytes) == INTEGER_CST)
+	  {
+	    if (m_diag_arg)
+	      return ev.formatted_print ("%s of %E %s at offset %qE"
+					 " exceeds %qE", m_dir_str,
+					 m_num_bytes, byte_str,
+					 m_offset, m_diag_arg);
+	    else
+	      return ev.formatted_print ("%s of %E %s at offset %qE"
+					 " exceeds the buffer", m_dir_str,
+					 m_num_bytes, byte_str, m_offset);
+	  }
+	else if (m_num_bytes)
+	  {
+	    if (m_diag_arg)
+	      return ev.formatted_print ("%s of %qE %s at offset %qE"
+					 " exceeds %qE", m_dir_str,
+					 m_num_bytes, byte_str,
+					 m_offset, m_diag_arg);
+	    else
+	      return ev.formatted_print ("%s of %qE %s at offset %qE"
+					 " exceeds the buffer", m_dir_str,
+					 m_num_bytes, byte_str, m_offset);
+	  }
+	else
+	  {
+	    if (m_diag_arg)
+	      return ev.formatted_print ("%s at offset %qE exceeds %qE",
+					 m_dir_str, m_offset, m_diag_arg);
+	    else
+	      return ev.formatted_print ("%s at offset %qE exceeds the"
+					 " buffer", m_dir_str, m_offset);
+	  }
+      }
+    if (m_diag_arg)
+      return ev.formatted_print ("out-of-bounds %s on %qE",
+				 m_dir_str, m_diag_arg);
+    return ev.formatted_print ("out-of-bounds %s", m_dir_str);
+  }
+
+protected:
+  const region *m_reg;
+  tree m_diag_arg;
+  tree m_offset;
+  tree m_num_bytes;
+  tree m_capacity;
+  const char *m_dir_str;
+};
+
+/* Concrete subclass to complain about overflows with symbolic values.  */
+
+class symbolic_buffer_overflow : public symbolic_past_the_end
+{
+public:
+  symbolic_buffer_overflow (const region *reg, tree diag_arg, tree offset,
+			    tree num_bytes, tree capacity)
+  : symbolic_past_the_end (reg, diag_arg, offset, num_bytes, capacity)
+  {
+    m_dir_str = "write";
+  }
+
+  bool emit (rich_location *rich_loc) final override
+  {
+    diagnostic_metadata m;
+    switch (m_reg->get_memory_space ())
+      {
+      default:
+	m.add_cwe (787);
+	return warning_meta (rich_loc, m, get_controlling_option (),
+			     "buffer overflow");
+      case MEMSPACE_STACK:
+	m.add_cwe (121);
+	return warning_meta (rich_loc, m, get_controlling_option (),
+			     "stack-based buffer overflow");
+      case MEMSPACE_HEAP:
+	m.add_cwe (122);
+	return warning_meta (rich_loc, m, get_controlling_option (),
+			     "heap-based buffer overflow");
+      }
+  }
+};
+
+/* Concrete subclass to complain about overreads with symbolic values.  */
+
+class symbolic_buffer_overread : public symbolic_past_the_end
+{
+public:
+  symbolic_buffer_overread (const region *reg, tree diag_arg, tree offset,
+			    tree num_bytes, tree capacity)
+  : symbolic_past_the_end (reg, diag_arg, offset, num_bytes, capacity)
+  {
+    m_dir_str = "read";
+  }
+
+  bool emit (rich_location *rich_loc) final override
+  {
+    diagnostic_metadata m;
+    m.add_cwe (126);
+    return warning_meta (rich_loc, m, get_controlling_option (),
+			 "buffer overread");
+  }
+};
+
+/* Check whether an access is past the end of the BASE_REG.  */
+
+void region_model::check_symbolic_bounds (const region *base_reg,
+					  const svalue *sym_byte_offset,
+					  const svalue *num_bytes_sval,
+					  const svalue *capacity,
+					  enum access_direction dir,
+					  region_model_context *ctxt) const
+{
+  gcc_assert (ctxt);
+
+  const svalue *next_byte
+    = m_mgr->get_or_create_binop (num_bytes_sval->get_type (), PLUS_EXPR,
+				  sym_byte_offset, num_bytes_sval);
+
+  if (eval_condition_without_cm (next_byte, GT_EXPR, capacity).is_true ())
+    {
+      tree diag_arg = get_representative_tree (base_reg);
+      tree offset_tree = get_representative_tree (sym_byte_offset);
+      tree num_bytes_tree = get_representative_tree (num_bytes_sval);
+      tree capacity_tree = get_representative_tree (capacity);
+      switch (dir)
+	{
+	default:
+	  gcc_unreachable ();
+	  break;
+	case DIR_READ:
+	  ctxt->warn (new symbolic_buffer_overread (base_reg, diag_arg,
+						    offset_tree,
+						    num_bytes_tree,
+						    capacity_tree));
+	  break;
+	case DIR_WRITE:
+	  ctxt->warn (new symbolic_buffer_overflow (base_reg, diag_arg,
+						    offset_tree,
+						    num_bytes_tree,
+						    capacity_tree));
+	  break;
+	}
+    }
+}
+
+static tree
+maybe_get_integer_cst_tree (const svalue *sval)
+{
+  tree cst_tree = sval->maybe_get_constant ();
+  if (cst_tree && TREE_CODE (cst_tree) == INTEGER_CST)
+    return cst_tree;
+
+  return NULL_TREE;
+}
+
 /* May complain when the access on REG is out-of-bounds.  */
 
-void region_model::check_region_bounds (const region *reg,
-					enum access_direction dir,
-					region_model_context *ctxt) const
+void
+region_model::check_region_bounds (const region *reg,
+				   enum access_direction dir,
+				   region_model_context *ctxt) const
 {
   gcc_assert (ctxt);
 
-  region_offset reg_offset = reg->get_offset ();
+  /* Get the offset.  */
+  region_offset reg_offset = reg->get_offset (m_mgr);
   const region *base_reg = reg_offset.get_base_region ();
 
-  /* Bail out on symbolic offsets or symbolic regions.
+  /* Bail out on symbolic regions.
      (e.g. because the analyzer did not see previous offsets on the latter,
      it might think that a negative access is before the buffer).  */
-  if (reg_offset.symbolic_p () || base_reg->symbolic_p ())
+  if (base_reg->symbolic_p ())
     return;
-  byte_offset_t offset_unsigned
-    = reg_offset.get_bit_offset () >> LOG2_BITS_PER_UNIT;
+
+  /* Find out how many bytes were accessed.  */
+  const svalue *num_bytes_sval = reg->get_byte_size_sval (m_mgr);
+  tree num_bytes_tree = maybe_get_integer_cst_tree (num_bytes_sval);
+
+  /* Get the capacity of the buffer.  */
+  const svalue *capacity = get_capacity (base_reg);
+  tree cst_capacity_tree = maybe_get_integer_cst_tree (capacity);
+
   /* The constant offset from a pointer is represented internally as a sizetype
      but should be interpreted as a signed value here.  The statement below
-     converts the offset to a signed integer with the same precision the
-     sizetype has on the target system.
+     converts the offset from bits to bytes and then to a signed integer with
+     the same precision the sizetype has on the target system.
 
      For example, this is needed for out-of-bounds-3.c test1 to pass when
      compiled with a 64-bit gcc build targeting 32-bit systems.  */
-  byte_offset_t offset
-    = offset_unsigned.to_shwi (TYPE_PRECISION (size_type_node));
-
-  /* Find out how many bytes were accessed.  */
-  const svalue *num_bytes_sval = reg->get_byte_size_sval (m_mgr);
-  tree num_bytes_tree = num_bytes_sval->maybe_get_constant ();
-  if (!num_bytes_tree || TREE_CODE (num_bytes_tree) != INTEGER_CST)
-    /* If we do not know how many bytes were read/written,
-       assume that at least one byte was read/written.  */
-    num_bytes_tree = integer_one_node;
+  byte_offset_t offset;
+  if (!reg_offset.symbolic_p ())
+    offset = wi::sext (reg_offset.get_bit_offset () >> LOG2_BITS_PER_UNIT,
+		       TYPE_PRECISION (size_type_node));
+
+  /* If either the offset or the number of bytes accessed are symbolic,
+     we have to reason about symbolic values.  */
+  if (reg_offset.symbolic_p () || !num_bytes_tree)
+    {
+      const svalue* byte_offset_sval;
+      if (!reg_offset.symbolic_p ())
+	{
+	  tree offset_tree = wide_int_to_tree (integer_type_node, offset);
+	  byte_offset_sval
+	    = m_mgr->get_or_create_constant_svalue (offset_tree);
+	}
+      else
+	byte_offset_sval = reg_offset.get_symbolic_byte_offset ();
+      check_symbolic_bounds (base_reg, byte_offset_sval, num_bytes_sval,
+			     capacity, dir, ctxt);
+      return;
+    }
 
+  /* Otherwise continue to check with concrete values.  */
   byte_range out (0, 0);
   /* NUM_BYTES_TREE should always be interpreted as unsigned.  */
-  byte_range read_bytes (offset, wi::to_offset (num_bytes_tree).to_uhwi ());
+  byte_offset_t num_bytes_unsigned = wi::to_offset (num_bytes_tree);
+  byte_range read_bytes (offset, num_bytes_unsigned);
   /* If read_bytes has a subset < 0, we do have an underflow.  */
   if (read_bytes.falls_short_of_p (0, &out))
     {
-      tree diag_arg = get_representative_tree (reg->get_base_region ());
+      tree diag_arg = get_representative_tree (base_reg);
       switch (dir)
 	{
 	default:
@@ -1648,9 +1884,10 @@ void region_model::check_region_bounds (const region *reg,
 	}
     }
 
-  const svalue *capacity = get_capacity (base_reg);
-  tree cst_capacity_tree = capacity->maybe_get_constant ();
-  if (!cst_capacity_tree || TREE_CODE (cst_capacity_tree) != INTEGER_CST)
+  /* For accesses past the end, we do need a concrete capacity.  No need to
+     do a symbolic check here because the inequality check does not reason
+     whether constants are greater than symbolic values.  */
+  if (!cst_capacity_tree)
     return;
 
   byte_range buffer (0, wi::to_offset (cst_capacity_tree));
@@ -1659,7 +1896,7 @@ void region_model::check_region_bounds (const region *reg,
     {
       tree byte_bound = wide_int_to_tree (size_type_node,
 					  buffer.get_next_byte_offset ());
-      tree diag_arg = get_representative_tree (reg->get_base_region ());
+      tree diag_arg = get_representative_tree (base_reg);
 
       switch (dir)
 	{
@@ -3907,6 +4144,49 @@ region_model::eval_condition_without_cm (const svalue *lhs,
 	  return res;
       }
 
+  /* Handle comparisons between two svalues with more than one operand.  */
+	if (const binop_svalue *binop = lhs->dyn_cast_binop_svalue ())
+    {
+      switch (op)
+	{
+	default:
+	  break;
+	case EQ_EXPR:
+	  {
+	    /* TODO: binops can be equal even if they are not structurally
+		     equal in case of commutative operators.  */
+	    tristate res = structural_equality (lhs, rhs);
+	    if (res.is_true ())
+	      return res;
+	  }
+	  break;
+	case LE_EXPR:
+	  {
+	    tristate res = structural_equality (lhs, rhs);
+	    if (res.is_true ())
+	      return res;
+	  }
+	  break;
+	case GE_EXPR:
+	  {
+	    tristate res = structural_equality (lhs, rhs);
+	    if (res.is_true ())
+	      return res;
+	    res = symbolic_greater_than (binop, rhs);
+	    if (res.is_true ())
+	      return res;
+	  }
+	  break;
+	case GT_EXPR:
+	  {
+	    tristate res = symbolic_greater_than (binop, rhs);
+	    if (res.is_true ())
+	      return res;
+	  }
+	  break;
+	}
+    }
+
   return tristate::TS_UNKNOWN;
 }
 
@@ -3928,6 +4208,123 @@ region_model::compare_initial_and_pointer (const initial_svalue *init,
   return tristate::TS_UNKNOWN;
 }
 
+/* Return true if SVAL is definitely positive.  */
+
+static bool
+is_positive_svalue (const svalue *sval)
+{
+  if (tree cst = sval->maybe_get_constant ())
+    return !zerop (cst) && get_range_pos_neg (cst) == 1;
+  tree type = sval->get_type ();
+  if (!type)
+    return false;
+  /* Consider a binary operation size_t + int.  The analyzer wraps the int in
+     an unaryop_svalue, converting it to a size_t, but in the dynamic execution
+     the result is smaller than the first operand.  Thus, we have to look if
+     the argument of the unaryop_svalue is also positive.  */
+  if (const unaryop_svalue *un_op = dyn_cast <const unaryop_svalue *> (sval))
+    return CONVERT_EXPR_CODE_P (un_op->get_op ()) && TYPE_UNSIGNED (type)
+	   && is_positive_svalue (un_op->get_arg ());
+  return TYPE_UNSIGNED (type);
+}
+
+/* Return true if A is definitely larger than B.
+
+   Limitation: does not account for integer overflows and does not try to
+	       return false, so it can not be used negated.  */
+
+tristate
+region_model::symbolic_greater_than (const binop_svalue *bin_a,
+				     const svalue *b) const
+{
+  if (bin_a->get_op () == PLUS_EXPR || bin_a->get_op () == MULT_EXPR)
+    {
+      /* Eliminate the right-hand side of both svalues.  */
+      if (const binop_svalue *bin_b = dyn_cast <const binop_svalue *> (b))
+	if (bin_a->get_op () == bin_b->get_op ()
+	    && eval_condition_without_cm (bin_a->get_arg1 (),
+					  GT_EXPR,
+					  bin_b->get_arg1 ()).is_true ()
+	    && eval_condition_without_cm (bin_a->get_arg0 (),
+					  GE_EXPR,
+					  bin_b->get_arg0 ()).is_true ())
+	  return tristate (tristate::TS_TRUE);
+
+      /* Otherwise, try to remove a positive offset or factor from BIN_A.  */
+      if (is_positive_svalue (bin_a->get_arg1 ())
+	  && eval_condition_without_cm (bin_a->get_arg0 (),
+					GE_EXPR, b).is_true ())
+	  return tristate (tristate::TS_TRUE);
+    }
+  return tristate::unknown ();
+}
+
+/* Return true if A and B are equal structurally.
+
+   Structural equality means that A and B are equal if the svalues A and B have
+   the same nodes at the same positions in the tree and the leafs are equal.
+   Equality for conjured_svalues and initial_svalues is determined by comparing
+   the pointers while constants are compared by value.  That behavior is useful
+   to check for binaryop_svlaues that evaluate to the same concrete value but
+   might use one operand with a different type but the same constant value.
+
+   For example,
+     binop_svalue (mult_expr,
+       initial_svalue (‘size_t’, decl_region (..., 'some_var')),
+       constant_svalue (‘size_t’, 4))
+   and
+     binop_svalue (mult_expr,
+       initial_svalue (‘size_t’, decl_region (..., 'some_var'),
+       constant_svalue (‘sizetype’, 4))
+   are structurally equal.  A concrete C code example, where this occurs, can
+   be found in test7 of out-of-bounds-5.c.  */
+
+tristate
+region_model::structural_equality (const svalue *a, const svalue *b) const
+{
+  /* If A and B are referentially equal, they are also structurally equal.  */
+  if (a == b)
+    return tristate (tristate::TS_TRUE);
+
+  switch (a->get_kind ())
+    {
+    default:
+      return tristate::unknown ();
+    /* SK_CONJURED and SK_INITIAL are already handled
+       by the referential equality above.  */
+    case SK_CONSTANT:
+      {
+	tree a_cst = a->maybe_get_constant ();
+	tree b_cst = b->maybe_get_constant ();
+	if (a_cst && b_cst)
+	  return tristate (tree_int_cst_equal (a_cst, b_cst));
+      }
+      return tristate (tristate::TS_FALSE);
+    case SK_UNARYOP:
+      {
+	const unaryop_svalue *un_a = as_a <const unaryop_svalue *> (a);
+	if (const unaryop_svalue *un_b = dyn_cast <const unaryop_svalue *> (b))
+	  return tristate (pending_diagnostic::same_tree_p (un_a->get_type (),
+							    un_b->get_type ())
+			   && un_a->get_op () == un_b->get_op ()
+			   && structural_equality (un_a->get_arg (),
+						   un_b->get_arg ()));
+      }
+      return tristate (tristate::TS_FALSE);
+    case SK_BINOP:
+      {
+	const binop_svalue *bin_a = as_a <const binop_svalue *> (a);
+	if (const binop_svalue *bin_b = dyn_cast <const binop_svalue *> (b))
+	  return tristate (bin_a->get_op () == bin_b->get_op ()
+			   && structural_equality (bin_a->get_arg0 (),
+						   bin_b->get_arg0 ())
+			   && structural_equality (bin_a->get_arg1 (),
+						   bin_b->get_arg1 ()));
+      }
+      return tristate (tristate::TS_FALSE);
+    }
+}
+
 /* Handle various constraints of the form:
      LHS: ((bool)INNER_LHS INNER_OP INNER_RHS))
      OP : == or !=
@@ -5637,7 +6034,7 @@ test_struct ()
   /* Verify get_offset for "c.x".  */
   {
     const region *c_x_reg = model.get_lvalue (c_x, NULL);
-    region_offset offset = c_x_reg->get_offset ();
+    region_offset offset = c_x_reg->get_offset (&mgr);
     ASSERT_EQ (offset.get_base_region (), model.get_lvalue (c, NULL));
     ASSERT_EQ (offset.get_bit_offset (), 0);
   }
@@ -5645,7 +6042,7 @@ test_struct ()
   /* Verify get_offset for "c.y".  */
   {
     const region *c_y_reg = model.get_lvalue (c_y, NULL);
-    region_offset offset = c_y_reg->get_offset ();
+    region_offset offset = c_y_reg->get_offset (&mgr);
     ASSERT_EQ (offset.get_base_region (), model.get_lvalue (c, NULL));
     ASSERT_EQ (offset.get_bit_offset (), INT_TYPE_SIZE);
   }
@@ -7140,7 +7537,7 @@ test_var ()
 
   /* Verify get_offset for "i".  */
   {
-    region_offset offset = i_reg->get_offset ();
+    region_offset offset = i_reg->get_offset (&mgr);
     ASSERT_EQ (offset.get_base_region (), i_reg);
     ASSERT_EQ (offset.get_bit_offset (), 0);
   }
@@ -7189,7 +7586,7 @@ test_array_2 ()
   /* Verify get_offset for "arr[0]".  */
   {
     const region *arr_0_reg = model.get_lvalue (arr_0, NULL);
-    region_offset offset = arr_0_reg->get_offset ();
+    region_offset offset = arr_0_reg->get_offset (&mgr);
     ASSERT_EQ (offset.get_base_region (), model.get_lvalue (arr, NULL));
     ASSERT_EQ (offset.get_bit_offset (), 0);
   }
@@ -7197,11 +7594,19 @@ test_array_2 ()
   /* Verify get_offset for "arr[1]".  */
   {
     const region *arr_1_reg = model.get_lvalue (arr_1, NULL);
-    region_offset offset = arr_1_reg->get_offset ();
+    region_offset offset = arr_1_reg->get_offset (&mgr);
     ASSERT_EQ (offset.get_base_region (), model.get_lvalue (arr, NULL));
     ASSERT_EQ (offset.get_bit_offset (), INT_TYPE_SIZE);
   }
 
+  /* Verify get_offset for "arr[i]".  */
+  {
+    const region *arr_i_reg = model.get_lvalue (arr_i, NULL);
+    region_offset offset = arr_i_reg->get_offset (&mgr);
+    ASSERT_EQ (offset.get_base_region (), model.get_lvalue (arr, NULL));
+    ASSERT_EQ (offset.get_symbolic_byte_offset ()->get_kind (), SK_BINOP);
+  }
+
   /* "arr[i] = i;" - this should remove the earlier bindings.  */
   model.set_value (arr_i, i, NULL);
   ASSERT_EQ (model.get_rvalue (arr_i, NULL), model.get_rvalue (i, NULL));
diff --git a/gcc/analyzer/region-model.h b/gcc/analyzer/region-model.h
index a1f2165e145..977cccf0ef7 100644
--- a/gcc/analyzer/region-model.h
+++ b/gcc/analyzer/region-model.h
@@ -717,6 +717,9 @@ class region_model
 				      const svalue *rhs) const;
   tristate compare_initial_and_pointer (const initial_svalue *init,
 					const region_svalue *ptr) const;
+  tristate symbolic_greater_than (const binop_svalue *a,
+				  const svalue *b) const;
+  tristate structural_equality (const svalue *a, const svalue *b) const;
   tristate eval_condition (tree lhs,
 			   enum tree_code op,
 			   tree rhs,
@@ -874,6 +877,12 @@ class region_model
 			      region_model_context *ctxt) const;
   void check_region_size (const region *lhs_reg, const svalue *rhs_sval,
 			  region_model_context *ctxt) const;
+  void check_symbolic_bounds (const region *base_reg,
+			      const svalue *sym_byte_offset,
+			      const svalue *num_bytes_sval,
+			      const svalue *capacity,
+			      enum access_direction dir,
+			      region_model_context *ctxt) const;
   void check_region_bounds (const region *reg, enum access_direction dir,
 			    region_model_context *ctxt) const;
 
diff --git a/gcc/analyzer/region.cc b/gcc/analyzer/region.cc
index 9c8279b130d..09646bfc98a 100644
--- a/gcc/analyzer/region.cc
+++ b/gcc/analyzer/region.cc
@@ -290,10 +290,10 @@ region::maybe_get_decl () const
    first call and caching it internally).  */
 
 region_offset
-region::get_offset () const
+region::get_offset (region_model_manager *mgr) const
 {
   if(!m_cached_offset)
-    m_cached_offset = new region_offset (calc_offset ());
+    m_cached_offset = new region_offset (calc_offset (mgr));
   return *m_cached_offset;
 }
 
@@ -491,10 +491,11 @@ region::get_subregions_for_binding (region_model_manager *mgr,
    or a symbolic offset.  */
 
 region_offset
-region::calc_offset () const
+region::calc_offset (region_model_manager *mgr) const
 {
   const region *iter_region = this;
   bit_offset_t accum_bit_offset = 0;
+  const svalue *accum_byte_sval = NULL;
 
   while (iter_region)
     {
@@ -504,16 +505,36 @@ region::calc_offset () const
 	case RK_ELEMENT:
 	case RK_OFFSET:
 	case RK_BIT_RANGE:
-	  {
-	    bit_offset_t rel_bit_offset;
-	    if (!iter_region->get_relative_concrete_offset (&rel_bit_offset))
-	      return region_offset::make_symbolic
-		(iter_region->get_parent_region ());
-	    accum_bit_offset += rel_bit_offset;
-	    iter_region = iter_region->get_parent_region ();
-	  }
+	  if (accum_byte_sval)
+	    {
+	      const svalue *sval
+		= iter_region->get_relative_symbolic_offset (mgr);
+	      accum_byte_sval
+		= mgr->get_or_create_binop (sval->get_type (), PLUS_EXPR,
+					    accum_byte_sval, sval);
+	      iter_region = iter_region->get_parent_region ();
+	    }
+	  else
+	    {
+	      bit_offset_t rel_bit_offset;
+	      if (iter_region->get_relative_concrete_offset (&rel_bit_offset))
+		{
+		  accum_bit_offset += rel_bit_offset;
+		  iter_region = iter_region->get_parent_region ();
+		}
+	      else
+		{
+		  /* If the iter_region is not concrete anymore, convert the
+		     accumulated bits to a svalue in bytes and revisit the
+		     iter_region collecting the symbolic value.  */
+		  byte_offset_t byte_offset = accum_bit_offset / BITS_PER_UNIT;
+		  tree offset_tree = wide_int_to_tree (integer_type_node,
+						       byte_offset);
+		  accum_byte_sval
+		    = mgr->get_or_create_constant_svalue (offset_tree);
+		}
+	    }
 	  continue;
-
 	case RK_SIZED:
 	  iter_region = iter_region->get_parent_region ();
 	  continue;
@@ -527,10 +548,18 @@ region::calc_offset () const
 	  continue;
 
 	default:
-	  return region_offset::make_concrete (iter_region, accum_bit_offset);
+	  return accum_byte_sval
+		  ? region_offset::make_symbolic (iter_region,
+						  accum_byte_sval)
+		  : region_offset::make_concrete (iter_region,
+						  accum_bit_offset);
 	}
     }
-  return region_offset::make_concrete (iter_region, accum_bit_offset);
+
+  return accum_byte_sval ? region_offset::make_symbolic (iter_region,
+							 accum_byte_sval)
+			 : region_offset::make_concrete (iter_region,
+							 accum_bit_offset);
 }
 
 /* Base implementation of region::get_relative_concrete_offset vfunc.  */
@@ -541,6 +570,14 @@ region::get_relative_concrete_offset (bit_offset_t *) const
   return false;
 }
 
+/* Base implementation of region::get_relative_symbolic_offset vfunc.  */
+
+const svalue *
+region::get_relative_symbolic_offset (region_model_manager *mgr) const
+{
+  return mgr->get_or_create_unknown_svalue (integer_type_node);
+}
+
 /* Attempt to get the position and size of this region expressed as a
    concrete range of bytes relative to its parent.
    If successful, return true and write to *OUT.
@@ -1316,6 +1353,25 @@ field_region::get_relative_concrete_offset (bit_offset_t *out) const
   return true;
 }
 
+
+/* Implementation of region::get_relative_symbolic_offset vfunc
+   for field_region.
+   If known, the returned svalue is equal to the offset converted to bytes and
+   rounded off.  */
+
+const svalue *
+field_region::get_relative_symbolic_offset (region_model_manager *mgr) const
+{
+  bit_offset_t out;
+  if (get_relative_concrete_offset (&out))
+    {
+      tree cst_tree
+	= wide_int_to_tree (integer_type_node, out / BITS_PER_UNIT);
+      return mgr->get_or_create_constant_svalue (cst_tree);
+    }
+  return mgr->get_or_create_unknown_svalue (integer_type_node);
+}
+
 /* class element_region : public region.  */
 
 /* Implementation of region::accept vfunc for element_region.  */
@@ -1382,6 +1438,29 @@ element_region::get_relative_concrete_offset (bit_offset_t *out) const
   return false;
 }
 
+/* Implementation of region::get_relative_symbolic_offset vfunc
+   for element_region.  */
+
+const svalue *
+element_region::get_relative_symbolic_offset (region_model_manager *mgr) const
+{
+  tree elem_type = get_type ();
+
+  /* First, use int_size_in_bytes, to reject the case where we
+     have an incomplete type, or a non-constant value.  */
+  HOST_WIDE_INT hwi_byte_size = int_size_in_bytes (elem_type);
+  if (hwi_byte_size > 0)
+	  {
+      tree byte_size_tree = wide_int_to_tree (integer_type_node,
+					      hwi_byte_size);
+      const svalue *byte_size_sval
+	= mgr->get_or_create_constant_svalue (byte_size_tree);
+      return mgr->get_or_create_binop (integer_type_node, MULT_EXPR,
+				       m_index, byte_size_sval);
+    }
+  return mgr->get_or_create_unknown_svalue (integer_type_node);
+}
+
 /* class offset_region : public region.  */
 
 /* Implementation of region::accept vfunc for offset_region.  */
@@ -1438,6 +1517,16 @@ offset_region::get_relative_concrete_offset (bit_offset_t *out) const
   return false;
 }
 
+/* Implementation of region::get_relative_symbolic_offset vfunc
+   for offset_region.  */
+
+const svalue *
+offset_region::get_relative_symbolic_offset (region_model_manager *mgr
+					      ATTRIBUTE_UNUSED) const
+{
+  return get_byte_offset ();
+}
+
 /* Implementation of region::get_byte_size_sval vfunc for offset_region.  */
 
 const svalue *
@@ -1683,6 +1772,20 @@ bit_range_region::get_relative_concrete_offset (bit_offset_t *out) const
   return true;
 }
 
+/* Implementation of region::get_relative_symbolic_offset vfunc for
+   bit_range_region.
+   The returned svalue is equal to the offset converted to bytes and
+   rounded off.  */
+
+const svalue *
+bit_range_region::get_relative_symbolic_offset (region_model_manager *mgr)
+  const
+{
+  byte_offset_t start_byte = m_bits.get_start_bit_offset () / BITS_PER_UNIT;
+  tree start_bit_tree = wide_int_to_tree (integer_type_node, start_byte);
+  return mgr->get_or_create_constant_svalue (start_bit_tree);
+}
+
 /* class var_arg_region : public region.  */
 
 void
diff --git a/gcc/analyzer/region.h b/gcc/analyzer/region.h
index 34ce1fa1714..6315fac62e5 100644
--- a/gcc/analyzer/region.h
+++ b/gcc/analyzer/region.h
@@ -175,7 +175,7 @@ public:
 
   bool involves_p (const svalue *sval) const;
 
-  region_offset get_offset () const;
+  region_offset get_offset (region_model_manager *mgr) const;
 
   /* Attempt to get the size of this region as a concrete number of bytes.
      If successful, return true and write the size to *OUT.
@@ -196,6 +196,11 @@ public:
      Otherwise return false.  */
   virtual bool get_relative_concrete_offset (bit_offset_t *out) const;
 
+  /* Get the offset in bytes of this region relative to its parent as a svalue.
+     Might return an unknown_svalue.  */
+  virtual const svalue *
+  get_relative_symbolic_offset (region_model_manager *mgr) const;
+
   /* Attempt to get the position and size of this region expressed as a
      concrete range of bytes relative to its parent.
      If successful, return true and write to *OUT.
@@ -226,7 +231,7 @@ public:
   region (complexity c, unsigned id, const region *parent, tree type);
 
  private:
-  region_offset calc_offset () const;
+  region_offset calc_offset (region_model_manager *mgr) const;
 
   complexity m_complexity;
   unsigned m_id; // purely for deterministic sorting at this stage, for dumps
@@ -751,6 +756,8 @@ public:
   tree get_field () const { return m_field; }
 
   bool get_relative_concrete_offset (bit_offset_t *out) const final override;
+  const svalue *get_relative_symbolic_offset (region_model_manager *mgr)
+    const final override;
 
 private:
   tree m_field;
@@ -835,6 +842,8 @@ public:
 
   virtual bool
   get_relative_concrete_offset (bit_offset_t *out) const final override;
+  const svalue *get_relative_symbolic_offset (region_model_manager *mgr)
+    const final override;
 
 private:
   const svalue *m_index;
@@ -919,6 +928,8 @@ public:
   const svalue *get_byte_offset () const { return m_byte_offset; }
 
   bool get_relative_concrete_offset (bit_offset_t *out) const final override;
+  const svalue *get_relative_symbolic_offset (region_model_manager *mgr)
+    const final override;
   const svalue * get_byte_size_sval (region_model_manager *mgr)
     const final override;
 
@@ -1245,6 +1256,8 @@ public:
   bool get_bit_size (bit_size_t *out) const final override;
   const svalue *get_byte_size_sval (region_model_manager *mgr) const final override;
   bool get_relative_concrete_offset (bit_offset_t *out) const final override;
+  const svalue *get_relative_symbolic_offset (region_model_manager *mgr)
+    const final override;
 
 private:
   bit_range m_bits;
diff --git a/gcc/analyzer/store.cc b/gcc/analyzer/store.cc
index 848c5e1aaff..ec5232cb055 100644
--- a/gcc/analyzer/store.cc
+++ b/gcc/analyzer/store.cc
@@ -123,7 +123,7 @@ uncertainty_t::dump (bool simple) const
 const binding_key *
 binding_key::make (store_manager *mgr, const region *r)
 {
-  region_offset offset = r->get_offset ();
+  region_offset offset = r->get_offset (mgr->get_svalue_manager ());
   if (offset.symbolic_p ())
     return mgr->get_symbolic_binding (r);
   else
@@ -897,7 +897,7 @@ binding_map::apply_ctor_val_to_range (const region *parent_reg,
     = get_subregion_within_ctor (parent_reg, min_index, mgr);
   const region *max_element
     = get_subregion_within_ctor (parent_reg, max_index, mgr);
-  region_offset min_offset = min_element->get_offset ();
+  region_offset min_offset = min_element->get_offset (mgr);
   if (min_offset.symbolic_p ())
     return false;
   bit_offset_t start_bit_offset = min_offset.get_bit_offset ();
@@ -955,11 +955,11 @@ binding_map::apply_ctor_pair_to_child_region (const region *parent_reg,
 	  gcc_assert (sval_byte_size != -1);
 	  bit_size_t sval_bit_size = sval_byte_size * BITS_PER_UNIT;
 	  /* Get offset of child relative to base region.  */
-	  region_offset child_base_offset = child_reg->get_offset ();
+	  region_offset child_base_offset = child_reg->get_offset (mgr);
 	  if (child_base_offset.symbolic_p ())
 	    return false;
 	  /* Convert to an offset relative to the parent region.  */
-	  region_offset parent_base_offset = parent_reg->get_offset ();
+	  region_offset parent_base_offset = parent_reg->get_offset (mgr);
 	  gcc_assert (!parent_base_offset.symbolic_p ());
 	  bit_offset_t child_parent_offset
 	    = (child_base_offset.get_bit_offset ()
@@ -1365,7 +1365,8 @@ binding_cluster::bind_compound_sval (store_manager *mgr,
 				     const region *reg,
 				     const compound_svalue *compound_sval)
 {
-  region_offset reg_offset = reg->get_offset ();
+  region_offset reg_offset
+    = reg->get_offset (mgr->get_svalue_manager ());
   if (reg_offset.symbolic_p ())
     {
       m_touched = true;
@@ -1614,7 +1615,7 @@ binding_cluster::get_any_binding (store_manager *mgr,
   /* Alternatively, if this is a symbolic read and the cluster has any bindings,
      then we don't know if we're reading those values or not, so the result
      is also "UNKNOWN".  */
-  if (reg->get_offset ().symbolic_p ()
+  if (reg->get_offset (mgr->get_svalue_manager ()).symbolic_p ()
       && m_map.elements () > 0)
     {
       region_model_manager *rmm_mgr = mgr->get_svalue_manager ();
@@ -1643,10 +1644,11 @@ const svalue *
 binding_cluster::maybe_get_compound_binding (store_manager *mgr,
 					     const region *reg) const
 {
-  region_offset cluster_offset = m_base_region->get_offset ();
+  region_offset cluster_offset
+    = m_base_region->get_offset (mgr->get_svalue_manager ());
   if (cluster_offset.symbolic_p ())
     return NULL;
-  region_offset reg_offset = reg->get_offset ();
+  region_offset reg_offset = reg->get_offset (mgr->get_svalue_manager ());
   if (reg_offset.symbolic_p ())
     return NULL;
 
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index cc631dfa261..5c066219a7d 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -10051,9 +10051,11 @@ This warning requires @option{-fanalyzer} to enable it; use
 @option{-Wno-analyzer-out-of-bounds} to disable it.
 
 This diagnostic warns for path through the code in which a buffer is
-definitely read or written out-of-bounds.  The diagnostic only applies
-for cases where the analyzer is able to determine a constant offset and
-for accesses past the end of a buffer, also a constant capacity.
+definitely read or written out-of-bounds.  The diagnostic applies for
+cases where the analyzer is able to determine a constant offset and for
+accesses past the end of a buffer, also a constant capacity.  Further,
+the diagnostic does limited checking for accesses past the end when the
+offset as well as the capacity is symbolic.
 
 See @uref{https://cwe.mitre.org/data/definitions/119.html, CWE-119: Improper Restriction of Operations within the Bounds of a Memory Buffer}.
 
diff --git a/gcc/testsuite/gcc.dg/analyzer/data-model-1.c b/gcc/testsuite/gcc.dg/analyzer/data-model-1.c
index 43181911795..d8930d189bb 100644
--- a/gcc/testsuite/gcc.dg/analyzer/data-model-1.c
+++ b/gcc/testsuite/gcc.dg/analyzer/data-model-1.c
@@ -624,8 +624,7 @@ void test_29a (struct coord p[])
   __analyzer_eval (q[-2].y == 107025); /* { dg-warning "TRUE" } */
 
   q -= 2;
-  __analyzer_eval (q == &p[7]); /* { dg-warning "UNKNOWN" } */
-  // TODO: make this be TRUE
+  __analyzer_eval (q == &p[7]); /* { dg-warning "TRUE" } */
 
   __analyzer_eval (q->x == 107024); /* { dg-warning "TRUE" } */
   __analyzer_eval (q->y == 107025); /* { dg-warning "TRUE" } */
diff --git a/gcc/testsuite/gcc.dg/analyzer/out-of-bounds-5.c b/gcc/testsuite/gcc.dg/analyzer/out-of-bounds-5.c
new file mode 100644
index 00000000000..7dc0bc5bf18
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/analyzer/out-of-bounds-5.c
@@ -0,0 +1,156 @@
+/* { dg-additional-options "-Wno-unused-but-set-variable" } */
+
+#include <string.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <alloca.h>
+#include <stdint.h>
+
+/* Tests with symbolic values.  */
+
+void test1 (size_t size)
+{
+  char *buf = __builtin_malloc (size);
+  if (!buf) return;
+
+  buf[size] = '\0'; /* { dg-warning "overflow" } */
+  free (buf);
+}
+
+void test2 (size_t size)
+{
+  char *buf = __builtin_malloc (size);
+  if (!buf) return;
+
+  buf[size + 1] = '\0'; /* { dg-warning "overflow" } */
+  free (buf);
+}
+
+void test3 (size_t size, size_t op)
+{
+  char *buf = __builtin_malloc (size);
+  if (!buf) return;
+
+  buf[size + op] = '\0'; /* { dg-warning "overflow" } */
+  free (buf);
+}
+
+void test4 (size_t size, unsigned short s)
+{
+  char *buf = __builtin_alloca (size);
+  buf[size + s] = '\0'; /* { dg-warning "overflow" } */
+}
+
+void test5 (size_t size)
+{
+  int32_t *buf = __builtin_alloca (4 * size);
+  buf[size] = 42; /* { dg-warning "overflow" } */
+}
+
+void test6 (size_t size)
+{
+  int32_t *buf = __builtin_alloca (4 * size);
+  memset (buf, 0, 4 * size);
+  int32_t last = *(buf + 4 * size); /* { dg-warning "overread" } */
+}
+
+void test7 (size_t size)
+{
+  int32_t *buf = __builtin_alloca (4 * size + 3); /* { dg-warning "allocated buffer size is not a multiple of the pointee's size" } */
+  buf[size] = 42; /* { dg-warning "overflow" } */
+}
+
+/* Test where the offset itself is not out-of-bounds
+   but multiple bytes are read.  */
+
+void test8 (size_t size, size_t offset)
+{
+  char src[size];
+  char dst[size];
+  memcpy (dst, src, size + offset); /* { dg-line test8 } */
+  /* { dg-warning "overread" "warning" { target *-*-* } test8 } */
+  /* { dg-warning "overflow" "warning" { target *-*-* } test8 } */
+}
+
+void test9 (size_t size, size_t offset)
+{
+  int32_t src[size];
+  int32_t dst[size];
+  memcpy (dst, src, 4 * size + 1); /* { dg-line test9 } */
+  /* { dg-warning "overread" "warning" { target *-*-* } test9 } */
+  /* { dg-warning "overflow" "warning" { target *-*-* } test9 } */
+}
+
+/* Test for no false-positives.  */
+
+void test10 (size_t size)
+{
+  int32_t buf[4 * size];
+  /* 4 * size is smaller than 4 * 4 * size.  */
+  buf[size] = 42;
+}
+
+void test11 (size_t size)
+{
+  int32_t *buf = __builtin_alloca (4 * size + 5); /* { dg-warning "allocated buffer size is not a multiple of the pointee's size" } */
+  buf[size] = 42;
+}
+
+void test12 (size_t size, size_t offset)
+{
+  int buf[size];
+  buf[offset] = 42;
+}
+
+void test13 (size_t size, int offset)
+{
+  int buf[size];
+  /* We don't know whether offset is positive or not.  */
+  buf[size + offset] = 42;
+}
+
+void test14 (size_t size, size_t offset, size_t offset2)
+{
+  int buf[size];
+  /* We don't know whether offset > offset2.  */
+  buf[size + offset - offset2] = 42;
+}
+
+void test15 (size_t a, size_t b)
+{
+  int buf[a * b];
+  /* We can't reason about a*b < a+b either.  */
+  buf[a + b] = 42;
+}
+
+/* Misc.  */
+
+char *test98 (const char *x, const char *y)
+{
+  size_t len_x = __builtin_strlen (x);
+  size_t len_y = __builtin_strlen (y);
+  size_t sz = len_x + len_y + 1;
+  char *result = __builtin_malloc (sz);
+  if (!result)
+    return NULL;
+  __builtin_memcpy (result, x, len_x);
+  __builtin_memcpy (result + len_x, y, len_y);
+  result[len_x + len_y] = '\0';
+  return result;
+}
+
+char *test99 (const char *x, const char *y)
+{
+  size_t len_x = __builtin_strlen (x);
+  size_t len_y = __builtin_strlen (y);
+  /* BUG (root cause): forgot to add 1 for terminator.  */
+  size_t sz = len_x + len_y;
+  char *result = __builtin_malloc (sz);
+  if (!result)
+    return NULL;
+  __builtin_memcpy (result, x, len_x);
+  __builtin_memcpy (result + len_x, y, len_y);
+  /* BUG (symptom): off-by-one out-of-bounds write to heap.  */
+  result[len_x + len_y] = '\0'; /* { dg-warning "overflow" } */
+  return result;
+}
diff --git a/gcc/testsuite/gcc.dg/analyzer/out-of-bounds-realloc-grow.c b/gcc/testsuite/gcc.dg/analyzer/out-of-bounds-realloc-grow.c
new file mode 100644
index 00000000000..707611e08a6
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/analyzer/out-of-bounds-realloc-grow.c
@@ -0,0 +1,87 @@
+/* { dg-additional-options "-Wno-analyzer-too-complex" } */
+
+/* Reduced from gnulib/read-file.c.
+
+   Tests that there is no false-positive on
+   realloc when the buffer is growing.  */
+
+#include <stdlib.h>
+
+/* Indicate that the file is treated as binary.  */
+#define RF_BINARY 0x1
+
+#include <stdio.h>
+#include <stdint.h>
+#include <string.h>
+#include <errno.h>
+
+char *
+fread_file (FILE *stream, int flags, size_t *length)
+{
+  char *buf = NULL;
+  size_t alloc = BUFSIZ;
+
+  if (!(buf = malloc (alloc)))
+    return NULL; /* errno is ENOMEM.  */
+
+  {
+    size_t size = 0; /* number of bytes read so far */
+    int save_errno;
+
+    for (;;)
+      {
+        /* This reads 1 more than the size of a regular file
+           so that we get eof immediately.  */
+        size_t requested = alloc - size;
+        size_t count = fread (buf + size, 1, requested, stream);
+        size += count;
+
+        {
+          char *new_buf;
+
+          if (alloc < PTRDIFF_MAX - alloc / 2)
+            alloc = alloc + alloc / 2;
+          else
+            alloc = PTRDIFF_MAX;
+
+           if (!(new_buf = realloc (buf, alloc)))
+            {
+              save_errno = errno;
+              break;
+            }
+
+          buf = new_buf;
+        }
+      }
+
+    free (buf);
+    errno = save_errno;
+    return NULL;
+  }
+}
+
+/* Open and read the contents of FILENAME, and return a newly
+   allocated string with the content, and set *LENGTH to the length of
+   the string.  The string is zero-terminated, but the terminating
+   zero byte is not counted in *LENGTH.  On errors, *LENGTH is
+   undefined, errno preserves the values set by system functions (if
+   any), and NULL is returned.
+   If the RF_BINARY flag is set in FLAGS, the file is opened in binary
+   mode.  If the RF_SENSITIVE flag is set in FLAGS, the memory buffer
+   internally allocated will be cleared upon failure.  */
+char *
+read_file (const char *filename, int flags, size_t *length)
+{
+  const char *mode = (flags & RF_BINARY) ? "rbe" : "re";
+  FILE *stream = fopen (filename, mode);
+  char *out;
+
+  if (!stream)
+    return NULL;
+
+  out = fread_file (stream, flags, length);
+
+  fclose (stream);
+
+  return out;
+}
diff --git a/gcc/testsuite/gcc.dg/analyzer/symbolic-gt-1.c b/gcc/testsuite/gcc.dg/analyzer/symbolic-gt-1.c
new file mode 100644
index 00000000000..140abce6961
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/analyzer/symbolic-gt-1.c
@@ -0,0 +1,76 @@
+#include <string.h>
+#include "analyzer-decls.h"
+
+/* Test GT_EXPR comparison of symbolic values.  */
+
+void test1 (size_t size)
+{
+  size_t a = 4 * size + 1;
+  size_t b = 4 * size;
+  __analyzer_eval (a > b); /* { dg-warning "TRUE" } */
+}
+
+void test2 (size_t size, size_t offset)
+{
+  size_t a = size + offset;
+  size_t b = size;
+  __analyzer_eval (a > b); /* { dg-warning "TRUE" } */
+}
+
+void test3 (size_t size, size_t offset)
+{
+  size_t a = size * offset;
+  size_t b = size;
+  __analyzer_eval (a > b); /* { dg-warning "TRUE" } */
+}
+
+void test4 (size_t size)
+{
+  size_t op = -1;
+  size_t a = size + op;
+  size_t b = size;
+  __analyzer_eval (a > b); /* { dg-warning "UNKNOWN" } */
+}
+
+void test5 (size_t size)
+{
+  size_t a = size - 1;
+  size_t b = size;
+  __analyzer_eval (a > b); /* { dg-warning "UNKNOWN" } */
+}
+
+void test6 (size_t size, int offset)
+{
+  /* OFFSET is a symbolic integer, thus could be negative.  */
+  size_t a = size + offset;
+  size_t b = size;
+  __analyzer_eval (a > b); /* { dg-warning "UNKNOWN" } */
+}
+
+void test7 (size_t size, size_t mul)
+{
+  size_t a = mul * size + 1;
+  size_t b = mul * size;
+  __analyzer_eval (a > b); /* { dg-warning "TRUE" } */
+}
+
+void test8 (size_t size)
+{
+  size_t a = size - 5;
+  size_t b = size - 1;
+  __analyzer_eval (a > b); /* { dg-warning "UNKNOWN" } */
+}
+
+void test9 (size_t size)
+{
+  size_t a = size + 1;
+  size_t b = size + 2;
+  __analyzer_eval (a > b); /* { dg-warning "UNKNOWN" } */
+}
+
+void test10 (size_t size)
+{
+  size_t a = size + 2;
+  size_t b = size + 1;
+  __analyzer_eval (a > b); /* { dg-warning "TRUE" } */
+}

^ permalink raw reply	[flat|nested] only message in thread

only message in thread, other threads:[~2022-09-08 17:47 UTC | newest]

Thread overview: (only message) (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-09-08 17:47 [gcc r13-2546] analyzer: support for symbolic values in the out-of-bounds checker [PR106625] Tim Lange

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).