public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* [committed] analyzer: fix taint handling of switch statements [PR106321]
@ 2022-07-19 14:02 David Malcolm
  0 siblings, 0 replies; only message in thread
From: David Malcolm @ 2022-07-19 14:02 UTC (permalink / raw)
  To: gcc-patches

PR analyzer/106321 reports false positives from
-Wanalyzer-tainted-array-index on switch statements, seen e.g.
in the Linux kernel in drivers/vfio/pci/vfio_pci_core.c, where
vfio_pci_core_ioctl has:

    |  744 |                 switch (info.index) {
    |      |                 ~~~~~~  ~~~~~~~~~~
    |      |                 |           |
    |      |                 |           (8) ...to here
    |      |                 (9) following ‘case 0 ... 5:’ branch...
    |......
    |  751 |                 case VFIO_PCI_BAR0_REGION_INDEX ... VFIO_PCI_BAR5_REGION_INDEX:
    |      |                 ~~~~
    |      |                 |
    |      |                 (10) ...to here

and then a false complaint about "use of attacker-controlled value
‘info.index’ in array lookup without upper-bounds checking", where
info.index has clearly had its bounds checked by the switch/case.

It turns out that when I rewrote switch handling for the analyzer in
r12-3101-g8ca7fa84a3af35, I removed notifications to state machines
about the constraints on cases.

This patch fixes that oversight by adding a new on_bounded_ranges vfunc
for region_model_context, called on switch statement edges, which calls
a new state_machine vfunc.  It implements it for the "taint" state
machine, so that it updates the "has bounds" flags at out-edges for
switch statements, based on whether the bounds from the edge appear to
actually constrain the switch index.

Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu.
Pushed to trunk as r13-1748-g2c044ff123ee57.

gcc/analyzer/ChangeLog:
	PR analyzer/106321
	* constraint-manager.h (bounded_ranges::get_count): New.
	(bounded_ranges::get_range): New.
	* engine.cc (impl_region_model_context::on_bounded_ranges): New.
	* exploded-graph.h (impl_region_model_context::on_bounded_ranges):
	New decl.
	* region-model.cc (region_model::apply_constraints_for_gswitch):
	Potentially call ctxt->on_bounded_ranges.
	* region-model.h (region_model_context::on_bounded_ranges): New
	vfunc.
	(noop_region_model_context::on_bounded_ranges): New.
	(region_model_context_decorator::on_bounded_ranges): New.
	* sm-taint.cc: Include "analyzer/constraint-manager.h".
	(taint_state_machine::on_bounded_ranges): New.
	* sm.h (state_machine::on_bounded_ranges): New.

gcc/testsuite/ChangeLog:
	PR analyzer/106321
	* gcc.dg/analyzer/torture/taint-read-index-2.c: Add test coverage
	for switch statements.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>
---
 gcc/analyzer/constraint-manager.h             |  3 +
 gcc/analyzer/engine.cc                        | 26 ++++++
 gcc/analyzer/exploded-graph.h                 |  3 +
 gcc/analyzer/region-model.cc                  |  2 +
 gcc/analyzer/region-model.h                   | 17 ++++
 gcc/analyzer/sm-taint.cc                      | 58 +++++++++++++
 gcc/analyzer/sm.h                             |  9 ++
 .../analyzer/torture/taint-read-index-2.c     | 85 +++++++++++++++++++
 8 files changed, 203 insertions(+)

diff --git a/gcc/analyzer/constraint-manager.h b/gcc/analyzer/constraint-manager.h
index f67c7647497..1271f18f1d1 100644
--- a/gcc/analyzer/constraint-manager.h
+++ b/gcc/analyzer/constraint-manager.h
@@ -138,6 +138,9 @@ public:
 
   static int cmp (const bounded_ranges *a, const bounded_ranges *b);
 
+  unsigned get_count () const { return m_ranges.length (); }
+  const bounded_range &get_range (unsigned idx) const { return m_ranges[idx]; }
+
 private:
   void canonicalize ();
   void validate () const;
diff --git a/gcc/analyzer/engine.cc b/gcc/analyzer/engine.cc
index 4f7f9d03900..85b7c5e1227 100644
--- a/gcc/analyzer/engine.cc
+++ b/gcc/analyzer/engine.cc
@@ -916,6 +916,32 @@ impl_region_model_context::on_condition (const svalue *lhs,
     }
 }
 
+/* Implementation of region_model_context::on_bounded_ranges vfunc.
+   Notify all state machines about the ranges, which could lead to
+   state transitions.  */
+
+void
+impl_region_model_context::on_bounded_ranges (const svalue &sval,
+					      const bounded_ranges &ranges)
+{
+  int sm_idx;
+  sm_state_map *smap;
+  FOR_EACH_VEC_ELT (m_new_state->m_checker_states, sm_idx, smap)
+    {
+      const state_machine &sm = m_ext_state.get_sm (sm_idx);
+      impl_sm_context sm_ctxt (*m_eg, sm_idx, sm, m_enode_for_diag,
+			       m_old_state, m_new_state,
+			       m_old_state->m_checker_states[sm_idx],
+			       m_new_state->m_checker_states[sm_idx],
+			       m_path_ctxt);
+      sm.on_bounded_ranges (&sm_ctxt,
+			    (m_enode_for_diag
+			     ? m_enode_for_diag->get_supernode ()
+			     : NULL),
+			    m_stmt, sval, ranges);
+    }
+}
+
 /* Implementation of region_model_context::on_phi vfunc.
    Notify all state machines about the phi, which could lead to
    state transitions.  */
diff --git a/gcc/analyzer/exploded-graph.h b/gcc/analyzer/exploded-graph.h
index 0613f558b8b..f9575688fdd 100644
--- a/gcc/analyzer/exploded-graph.h
+++ b/gcc/analyzer/exploded-graph.h
@@ -65,6 +65,9 @@ class impl_region_model_context : public region_model_context
 		     enum tree_code op,
 		     const svalue *rhs) final override;
 
+  void on_bounded_ranges (const svalue &sval,
+			  const bounded_ranges &ranges) final override;
+
   void on_unknown_change (const svalue *sval, bool is_mutable) final override;
 
   void on_phi (const gphi *phi, tree rhs) final override;
diff --git a/gcc/analyzer/region-model.cc b/gcc/analyzer/region-model.cc
index 8b7b4e1f697..5bb7112a383 100644
--- a/gcc/analyzer/region-model.cc
+++ b/gcc/analyzer/region-model.cc
@@ -4228,6 +4228,8 @@ region_model::apply_constraints_for_gswitch (const switch_cfg_superedge &edge,
   bool sat = m_constraints->add_bounded_ranges (index_sval, all_cases_ranges);
   if (!sat && out)
     *out = new rejected_ranges_constraint (*this, index, all_cases_ranges);
+  if (sat && ctxt && !all_cases_ranges->empty_p ())
+    ctxt->on_bounded_ranges (*index_sval, *all_cases_ranges);
   return sat;
 }
 
diff --git a/gcc/analyzer/region-model.h b/gcc/analyzer/region-model.h
index 6dda43f5658..42f8abeb043 100644
--- a/gcc/analyzer/region-model.h
+++ b/gcc/analyzer/region-model.h
@@ -931,6 +931,13 @@ class region_model_context
 			     enum tree_code op,
 			     const svalue *rhs) = 0;
 
+  /* Hook for clients to be notified when the condition that
+     SVAL is within RANGES is added to the region model.
+     Similar to on_condition, but for use when handling switch statements.
+     RANGES is non-empty.  */
+  virtual void on_bounded_ranges (const svalue &sval,
+				  const bounded_ranges &ranges) = 0;
+
   /* Hooks for clients to be notified when an unknown change happens
      to SVAL (in response to a call to an unknown function).  */
   virtual void on_unknown_change (const svalue *sval, bool is_mutable) = 0;
@@ -991,6 +998,10 @@ public:
 		     const svalue *rhs ATTRIBUTE_UNUSED) override
   {
   }
+  void on_bounded_ranges (const svalue &,
+			  const bounded_ranges &) override
+  {
+  }
   void on_unknown_change (const svalue *sval ATTRIBUTE_UNUSED,
 			  bool is_mutable ATTRIBUTE_UNUSED) override
   {
@@ -1087,6 +1098,12 @@ class region_model_context_decorator : public region_model_context
     m_inner->on_condition (lhs, op, rhs);
   }
 
+  void on_bounded_ranges (const svalue &sval,
+			  const bounded_ranges &ranges) override
+  {
+    m_inner->on_bounded_ranges (sval, ranges);
+  }
+
   void on_unknown_change (const svalue *sval, bool is_mutable) override
   {
     m_inner->on_unknown_change (sval, is_mutable);
diff --git a/gcc/analyzer/sm-taint.cc b/gcc/analyzer/sm-taint.cc
index 2de9284624e..9cb78886c9f 100644
--- a/gcc/analyzer/sm-taint.cc
+++ b/gcc/analyzer/sm-taint.cc
@@ -51,6 +51,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "analyzer/sm.h"
 #include "analyzer/program-state.h"
 #include "analyzer/pending-diagnostic.h"
+#include "analyzer/constraint-manager.h"
 
 #if ENABLE_ANALYZER
 
@@ -97,6 +98,11 @@ public:
 		     const svalue *lhs,
 		     enum tree_code op,
 		     const svalue *rhs) const final override;
+  void on_bounded_ranges (sm_context *sm_ctxt,
+			  const supernode *node,
+			  const gimple *stmt,
+			  const svalue &sval,
+			  const bounded_ranges &ranges) const final override;
 
   bool can_purge_p (state_t s) const final override;
 
@@ -901,6 +907,58 @@ taint_state_machine::on_condition (sm_context *sm_ctxt,
     }
 }
 
+/* Implementation of state_machine::on_bounded_ranges vfunc for
+   taint_state_machine, for handling switch statement cases.
+   Potentially transition state 'tainted' to 'has_ub' or 'has_lb',
+   and states 'has_ub' and 'has_lb' to 'stop'.  */
+
+void
+taint_state_machine::on_bounded_ranges (sm_context *sm_ctxt,
+					const supernode *,
+					const gimple *stmt,
+					const svalue &sval,
+					const bounded_ranges &ranges) const
+{
+  gcc_assert (!ranges.empty_p ());
+  gcc_assert (ranges.get_count () > 0);
+
+  /* We have one or more ranges; this could be a "default:", or one or
+     more single or range cases.
+
+     Look at the overall endpoints to see if the ranges impose any lower
+     bounds or upper bounds beyond those of the underlying numeric type.  */
+
+  tree lowest_bound = ranges.get_range (0).m_lower;
+  tree highest_bound = ranges.get_range (ranges.get_count () - 1).m_upper;
+  gcc_assert (lowest_bound);
+  gcc_assert (highest_bound);
+
+  bool ranges_have_lb
+    = (lowest_bound != TYPE_MIN_VALUE (TREE_TYPE (lowest_bound)));
+  bool ranges_have_ub
+    = (highest_bound != TYPE_MAX_VALUE (TREE_TYPE (highest_bound)));
+
+  if (!ranges_have_lb && !ranges_have_ub)
+    return;
+
+  /* We have new bounds from the ranges; combine them with any
+     existing bounds on SVAL.  */
+  state_t old_state = sm_ctxt->get_state (stmt, &sval);
+  if (old_state == m_tainted)
+    {
+      if (ranges_have_lb && ranges_have_ub)
+	sm_ctxt->set_next_state (stmt, &sval, m_stop);
+      else if (ranges_have_lb)
+	sm_ctxt->set_next_state (stmt, &sval, m_has_lb);
+      else if (ranges_have_ub)
+	sm_ctxt->set_next_state (stmt, &sval, m_has_ub);
+    }
+  else if (old_state == m_has_ub && ranges_have_lb)
+    sm_ctxt->set_next_state (stmt, &sval, m_stop);
+  else if (old_state == m_has_lb && ranges_have_ub)
+    sm_ctxt->set_next_state (stmt, &sval, m_stop);
+}
+
 bool
 taint_state_machine::can_purge_p (state_t s ATTRIBUTE_UNUSED) const
 {
diff --git a/gcc/analyzer/sm.h b/gcc/analyzer/sm.h
index 353a6db53b0..87ab11cc962 100644
--- a/gcc/analyzer/sm.h
+++ b/gcc/analyzer/sm.h
@@ -108,6 +108,15 @@ public:
   {
   }
 
+  virtual void
+  on_bounded_ranges (sm_context *sm_ctxt ATTRIBUTE_UNUSED,
+		     const supernode *node ATTRIBUTE_UNUSED,
+		     const gimple *stmt ATTRIBUTE_UNUSED,
+		     const svalue &sval ATTRIBUTE_UNUSED,
+		     const bounded_ranges &ranges ATTRIBUTE_UNUSED) const
+  {
+  }
+
   /* Return true if it safe to discard the given state (to help
      when simplifying state objects).
      States that need leak detection should return false.  */
diff --git a/gcc/testsuite/gcc.dg/analyzer/torture/taint-read-index-2.c b/gcc/testsuite/gcc.dg/analyzer/torture/taint-read-index-2.c
index 6a4ebdbba16..b3dc177cb14 100644
--- a/gcc/testsuite/gcc.dg/analyzer/torture/taint-read-index-2.c
+++ b/gcc/testsuite/gcc.dg/analyzer/torture/taint-read-index-2.c
@@ -54,3 +54,88 @@ test_4 (unsigned uarg)
 {
   return called_by_test_4 (uarg);
 }
+
+int  __attribute__((tainted_args))
+test_5 (int idx)
+{
+  switch (idx)
+    {
+    default:
+      return 0;
+    case 5 ... 20:
+      return arr[idx]; /* { dg-bogus "bounds checking" } */
+      /* 20 is still an out-of-bounds error (off-by-one)
+	 but we don't check for that, just that bounds have been imposed.  */
+
+    /* Extra cases to avoid optimizing the switch away.  */
+    case 22:
+      return 22;
+    case 23:
+      return -17;
+    }
+}
+
+int  __attribute__((tainted_args))
+test_6 (int idx)
+{
+  switch (idx)
+    {
+    default:
+      return arr[idx]; /* { dg-warning "without bounds checking" } */
+
+    case 2:
+      return arr[idx]; /* { dg-bogus "bounds checking" } */
+
+    case 6 ... 19:
+      return arr[idx]; /* { dg-bogus "bounds checking" } */
+
+    case 22:
+      return 22;
+    case 23:
+      return -17;
+    }
+}
+
+int  __attribute__((tainted_args))
+test_7 (int idx)
+{
+  switch (idx)
+    {
+    default:
+      return arr[idx]; /* { dg-warning "without bounds checking" } */
+
+    case 2 ... 4:
+    case 7 ... 9:
+      return arr[idx]; /* { dg-bogus "bounds checking" } */
+
+    case 12 ... 19:
+      return arr[idx]; /* { dg-bogus "bounds checking" } */
+
+    case 22:
+      return 22;
+    case 23:
+      return -17;
+    }
+}
+
+int  __attribute__((tainted_args))
+test_8 (unsigned idx)
+{
+  switch (idx)
+    {
+    default:
+      return arr[idx]; /* { dg-warning "without upper-bounds checking" } */
+
+    case 2 ... 4:
+    case 7 ... 9:
+      return arr[idx]; /* { dg-bogus "bounds checking" } */
+
+    case 12 ... 19:
+      return arr[idx]; /* { dg-bogus "bounds checking" } */
+
+    case 22:
+      return 22;
+    case 23:
+      return -17;
+    }
+}
-- 
2.26.3


^ permalink raw reply	[flat|nested] only message in thread

only message in thread, other threads:[~2022-07-19 14:02 UTC | newest]

Thread overview: (only message) (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-07-19 14:02 [committed] analyzer: fix taint handling of switch statements [PR106321] David Malcolm

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).