From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by sourceware.org (Postfix) with ESMTPS id 9C6E8385741C for ; Tue, 19 Jul 2022 14:02:59 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 9C6E8385741C Received: from mimecast-mx02.redhat.com (mx3-rdu2.redhat.com [66.187.233.73]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-646-XqIO-7GUOWiFv7_M7OAePQ-1; Tue, 19 Jul 2022 10:02:34 -0400 X-MC-Unique: XqIO-7GUOWiFv7_M7OAePQ-1 Received: from smtp.corp.redhat.com (int-mx10.intmail.prod.int.rdu2.redhat.com [10.11.54.10]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id C0134382C977 for ; Tue, 19 Jul 2022 14:02:29 +0000 (UTC) Received: from t14s.localdomain.com (unknown [10.2.16.236]) by smtp.corp.redhat.com (Postfix) with ESMTP id 8DE11403163; Tue, 19 Jul 2022 14:02:29 +0000 (UTC) From: David Malcolm To: gcc-patches@gcc.gnu.org Subject: [committed] analyzer: fix taint handling of switch statements [PR106321] Date: Tue, 19 Jul 2022 10:02:28 -0400 Message-Id: <20220719140228.1336945-1-dmalcolm@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.85 on 10.11.54.10 X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-Spam-Status: No, score=-12.4 required=5.0 tests=BAYES_00, DKIMWL_WL_HIGH, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_NONE, TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 19 Jul 2022 14:03:08 -0000 PR analyzer/106321 reports false positives from -Wanalyzer-tainted-array-index on switch statements, seen e.g. in the Linux kernel in drivers/vfio/pci/vfio_pci_core.c, where vfio_pci_core_ioctl has: | 744 | switch (info.index) { | | ~~~~~~ ~~~~~~~~~~ | | | | | | | (8) ...to here | | (9) following ‘case 0 ... 5:’ branch... |...... | 751 | case VFIO_PCI_BAR0_REGION_INDEX ... VFIO_PCI_BAR5_REGION_INDEX: | | ~~~~ | | | | | (10) ...to here and then a false complaint about "use of attacker-controlled value ‘info.index’ in array lookup without upper-bounds checking", where info.index has clearly had its bounds checked by the switch/case. It turns out that when I rewrote switch handling for the analyzer in r12-3101-g8ca7fa84a3af35, I removed notifications to state machines about the constraints on cases. This patch fixes that oversight by adding a new on_bounded_ranges vfunc for region_model_context, called on switch statement edges, which calls a new state_machine vfunc. It implements it for the "taint" state machine, so that it updates the "has bounds" flags at out-edges for switch statements, based on whether the bounds from the edge appear to actually constrain the switch index. Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu. Pushed to trunk as r13-1748-g2c044ff123ee57. gcc/analyzer/ChangeLog: PR analyzer/106321 * constraint-manager.h (bounded_ranges::get_count): New. (bounded_ranges::get_range): New. * engine.cc (impl_region_model_context::on_bounded_ranges): New. * exploded-graph.h (impl_region_model_context::on_bounded_ranges): New decl. * region-model.cc (region_model::apply_constraints_for_gswitch): Potentially call ctxt->on_bounded_ranges. * region-model.h (region_model_context::on_bounded_ranges): New vfunc. (noop_region_model_context::on_bounded_ranges): New. (region_model_context_decorator::on_bounded_ranges): New. * sm-taint.cc: Include "analyzer/constraint-manager.h". (taint_state_machine::on_bounded_ranges): New. * sm.h (state_machine::on_bounded_ranges): New. gcc/testsuite/ChangeLog: PR analyzer/106321 * gcc.dg/analyzer/torture/taint-read-index-2.c: Add test coverage for switch statements. Signed-off-by: David Malcolm --- gcc/analyzer/constraint-manager.h | 3 + gcc/analyzer/engine.cc | 26 ++++++ gcc/analyzer/exploded-graph.h | 3 + gcc/analyzer/region-model.cc | 2 + gcc/analyzer/region-model.h | 17 ++++ gcc/analyzer/sm-taint.cc | 58 +++++++++++++ gcc/analyzer/sm.h | 9 ++ .../analyzer/torture/taint-read-index-2.c | 85 +++++++++++++++++++ 8 files changed, 203 insertions(+) diff --git a/gcc/analyzer/constraint-manager.h b/gcc/analyzer/constraint-manager.h index f67c7647497..1271f18f1d1 100644 --- a/gcc/analyzer/constraint-manager.h +++ b/gcc/analyzer/constraint-manager.h @@ -138,6 +138,9 @@ public: static int cmp (const bounded_ranges *a, const bounded_ranges *b); + unsigned get_count () const { return m_ranges.length (); } + const bounded_range &get_range (unsigned idx) const { return m_ranges[idx]; } + private: void canonicalize (); void validate () const; diff --git a/gcc/analyzer/engine.cc b/gcc/analyzer/engine.cc index 4f7f9d03900..85b7c5e1227 100644 --- a/gcc/analyzer/engine.cc +++ b/gcc/analyzer/engine.cc @@ -916,6 +916,32 @@ impl_region_model_context::on_condition (const svalue *lhs, } } +/* Implementation of region_model_context::on_bounded_ranges vfunc. + Notify all state machines about the ranges, which could lead to + state transitions. */ + +void +impl_region_model_context::on_bounded_ranges (const svalue &sval, + const bounded_ranges &ranges) +{ + int sm_idx; + sm_state_map *smap; + FOR_EACH_VEC_ELT (m_new_state->m_checker_states, sm_idx, smap) + { + const state_machine &sm = m_ext_state.get_sm (sm_idx); + impl_sm_context sm_ctxt (*m_eg, sm_idx, sm, m_enode_for_diag, + m_old_state, m_new_state, + m_old_state->m_checker_states[sm_idx], + m_new_state->m_checker_states[sm_idx], + m_path_ctxt); + sm.on_bounded_ranges (&sm_ctxt, + (m_enode_for_diag + ? m_enode_for_diag->get_supernode () + : NULL), + m_stmt, sval, ranges); + } +} + /* Implementation of region_model_context::on_phi vfunc. Notify all state machines about the phi, which could lead to state transitions. */ diff --git a/gcc/analyzer/exploded-graph.h b/gcc/analyzer/exploded-graph.h index 0613f558b8b..f9575688fdd 100644 --- a/gcc/analyzer/exploded-graph.h +++ b/gcc/analyzer/exploded-graph.h @@ -65,6 +65,9 @@ class impl_region_model_context : public region_model_context enum tree_code op, const svalue *rhs) final override; + void on_bounded_ranges (const svalue &sval, + const bounded_ranges &ranges) final override; + void on_unknown_change (const svalue *sval, bool is_mutable) final override; void on_phi (const gphi *phi, tree rhs) final override; diff --git a/gcc/analyzer/region-model.cc b/gcc/analyzer/region-model.cc index 8b7b4e1f697..5bb7112a383 100644 --- a/gcc/analyzer/region-model.cc +++ b/gcc/analyzer/region-model.cc @@ -4228,6 +4228,8 @@ region_model::apply_constraints_for_gswitch (const switch_cfg_superedge &edge, bool sat = m_constraints->add_bounded_ranges (index_sval, all_cases_ranges); if (!sat && out) *out = new rejected_ranges_constraint (*this, index, all_cases_ranges); + if (sat && ctxt && !all_cases_ranges->empty_p ()) + ctxt->on_bounded_ranges (*index_sval, *all_cases_ranges); return sat; } diff --git a/gcc/analyzer/region-model.h b/gcc/analyzer/region-model.h index 6dda43f5658..42f8abeb043 100644 --- a/gcc/analyzer/region-model.h +++ b/gcc/analyzer/region-model.h @@ -931,6 +931,13 @@ class region_model_context enum tree_code op, const svalue *rhs) = 0; + /* Hook for clients to be notified when the condition that + SVAL is within RANGES is added to the region model. + Similar to on_condition, but for use when handling switch statements. + RANGES is non-empty. */ + virtual void on_bounded_ranges (const svalue &sval, + const bounded_ranges &ranges) = 0; + /* Hooks for clients to be notified when an unknown change happens to SVAL (in response to a call to an unknown function). */ virtual void on_unknown_change (const svalue *sval, bool is_mutable) = 0; @@ -991,6 +998,10 @@ public: const svalue *rhs ATTRIBUTE_UNUSED) override { } + void on_bounded_ranges (const svalue &, + const bounded_ranges &) override + { + } void on_unknown_change (const svalue *sval ATTRIBUTE_UNUSED, bool is_mutable ATTRIBUTE_UNUSED) override { @@ -1087,6 +1098,12 @@ class region_model_context_decorator : public region_model_context m_inner->on_condition (lhs, op, rhs); } + void on_bounded_ranges (const svalue &sval, + const bounded_ranges &ranges) override + { + m_inner->on_bounded_ranges (sval, ranges); + } + void on_unknown_change (const svalue *sval, bool is_mutable) override { m_inner->on_unknown_change (sval, is_mutable); diff --git a/gcc/analyzer/sm-taint.cc b/gcc/analyzer/sm-taint.cc index 2de9284624e..9cb78886c9f 100644 --- a/gcc/analyzer/sm-taint.cc +++ b/gcc/analyzer/sm-taint.cc @@ -51,6 +51,7 @@ along with GCC; see the file COPYING3. If not see #include "analyzer/sm.h" #include "analyzer/program-state.h" #include "analyzer/pending-diagnostic.h" +#include "analyzer/constraint-manager.h" #if ENABLE_ANALYZER @@ -97,6 +98,11 @@ public: const svalue *lhs, enum tree_code op, const svalue *rhs) const final override; + void on_bounded_ranges (sm_context *sm_ctxt, + const supernode *node, + const gimple *stmt, + const svalue &sval, + const bounded_ranges &ranges) const final override; bool can_purge_p (state_t s) const final override; @@ -901,6 +907,58 @@ taint_state_machine::on_condition (sm_context *sm_ctxt, } } +/* Implementation of state_machine::on_bounded_ranges vfunc for + taint_state_machine, for handling switch statement cases. + Potentially transition state 'tainted' to 'has_ub' or 'has_lb', + and states 'has_ub' and 'has_lb' to 'stop'. */ + +void +taint_state_machine::on_bounded_ranges (sm_context *sm_ctxt, + const supernode *, + const gimple *stmt, + const svalue &sval, + const bounded_ranges &ranges) const +{ + gcc_assert (!ranges.empty_p ()); + gcc_assert (ranges.get_count () > 0); + + /* We have one or more ranges; this could be a "default:", or one or + more single or range cases. + + Look at the overall endpoints to see if the ranges impose any lower + bounds or upper bounds beyond those of the underlying numeric type. */ + + tree lowest_bound = ranges.get_range (0).m_lower; + tree highest_bound = ranges.get_range (ranges.get_count () - 1).m_upper; + gcc_assert (lowest_bound); + gcc_assert (highest_bound); + + bool ranges_have_lb + = (lowest_bound != TYPE_MIN_VALUE (TREE_TYPE (lowest_bound))); + bool ranges_have_ub + = (highest_bound != TYPE_MAX_VALUE (TREE_TYPE (highest_bound))); + + if (!ranges_have_lb && !ranges_have_ub) + return; + + /* We have new bounds from the ranges; combine them with any + existing bounds on SVAL. */ + state_t old_state = sm_ctxt->get_state (stmt, &sval); + if (old_state == m_tainted) + { + if (ranges_have_lb && ranges_have_ub) + sm_ctxt->set_next_state (stmt, &sval, m_stop); + else if (ranges_have_lb) + sm_ctxt->set_next_state (stmt, &sval, m_has_lb); + else if (ranges_have_ub) + sm_ctxt->set_next_state (stmt, &sval, m_has_ub); + } + else if (old_state == m_has_ub && ranges_have_lb) + sm_ctxt->set_next_state (stmt, &sval, m_stop); + else if (old_state == m_has_lb && ranges_have_ub) + sm_ctxt->set_next_state (stmt, &sval, m_stop); +} + bool taint_state_machine::can_purge_p (state_t s ATTRIBUTE_UNUSED) const { diff --git a/gcc/analyzer/sm.h b/gcc/analyzer/sm.h index 353a6db53b0..87ab11cc962 100644 --- a/gcc/analyzer/sm.h +++ b/gcc/analyzer/sm.h @@ -108,6 +108,15 @@ public: { } + virtual void + on_bounded_ranges (sm_context *sm_ctxt ATTRIBUTE_UNUSED, + const supernode *node ATTRIBUTE_UNUSED, + const gimple *stmt ATTRIBUTE_UNUSED, + const svalue &sval ATTRIBUTE_UNUSED, + const bounded_ranges &ranges ATTRIBUTE_UNUSED) const + { + } + /* Return true if it safe to discard the given state (to help when simplifying state objects). States that need leak detection should return false. */ diff --git a/gcc/testsuite/gcc.dg/analyzer/torture/taint-read-index-2.c b/gcc/testsuite/gcc.dg/analyzer/torture/taint-read-index-2.c index 6a4ebdbba16..b3dc177cb14 100644 --- a/gcc/testsuite/gcc.dg/analyzer/torture/taint-read-index-2.c +++ b/gcc/testsuite/gcc.dg/analyzer/torture/taint-read-index-2.c @@ -54,3 +54,88 @@ test_4 (unsigned uarg) { return called_by_test_4 (uarg); } + +int __attribute__((tainted_args)) +test_5 (int idx) +{ + switch (idx) + { + default: + return 0; + case 5 ... 20: + return arr[idx]; /* { dg-bogus "bounds checking" } */ + /* 20 is still an out-of-bounds error (off-by-one) + but we don't check for that, just that bounds have been imposed. */ + + /* Extra cases to avoid optimizing the switch away. */ + case 22: + return 22; + case 23: + return -17; + } +} + +int __attribute__((tainted_args)) +test_6 (int idx) +{ + switch (idx) + { + default: + return arr[idx]; /* { dg-warning "without bounds checking" } */ + + case 2: + return arr[idx]; /* { dg-bogus "bounds checking" } */ + + case 6 ... 19: + return arr[idx]; /* { dg-bogus "bounds checking" } */ + + case 22: + return 22; + case 23: + return -17; + } +} + +int __attribute__((tainted_args)) +test_7 (int idx) +{ + switch (idx) + { + default: + return arr[idx]; /* { dg-warning "without bounds checking" } */ + + case 2 ... 4: + case 7 ... 9: + return arr[idx]; /* { dg-bogus "bounds checking" } */ + + case 12 ... 19: + return arr[idx]; /* { dg-bogus "bounds checking" } */ + + case 22: + return 22; + case 23: + return -17; + } +} + +int __attribute__((tainted_args)) +test_8 (unsigned idx) +{ + switch (idx) + { + default: + return arr[idx]; /* { dg-warning "without upper-bounds checking" } */ + + case 2 ... 4: + case 7 ... 9: + return arr[idx]; /* { dg-bogus "bounds checking" } */ + + case 12 ... 19: + return arr[idx]; /* { dg-bogus "bounds checking" } */ + + case 22: + return 22; + case 23: + return -17; + } +} -- 2.26.3