public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* [pushed 1/6] analyzer: convert note_adding_context to annotating_context
@ 2023-08-22  1:21 David Malcolm
  2023-08-22  1:21 ` [pushed 2/6] analyzer: add ability for context to add events to a saved_diagnostic David Malcolm
                   ` (4 more replies)
  0 siblings, 5 replies; 6+ messages in thread
From: David Malcolm @ 2023-08-22  1:21 UTC (permalink / raw)
  To: gcc-patches; +Cc: David Malcolm

This is enabling work towards the context being able to inject
events into diagnostic paths, rather than just notes after the
warning.

Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu.
Pushed to trunk as r14-3371-ge40a935db29cfd.


gcc/analyzer/ChangeLog:
	* region-model.cc
	(class check_external_function_for_access_attr::annotating_ctxt):
	Convert to an annotating_context.
	* region-model.h (class note_adding_context): Rename to...
	(class annotating_context): ...this, updating the "warn" method.
	(note_adding_context::make_note): Replace with...
	(annotating_context::add_annotations): ...this.
---
 gcc/analyzer/region-model.cc | 12 ++++++------
 gcc/analyzer/region-model.h  | 14 +++++++-------
 2 files changed, 13 insertions(+), 13 deletions(-)

diff --git a/gcc/analyzer/region-model.cc b/gcc/analyzer/region-model.cc
index 494a9cdf149e..5c165ff127f8 100644
--- a/gcc/analyzer/region-model.cc
+++ b/gcc/analyzer/region-model.cc
@@ -1641,23 +1641,23 @@ check_external_function_for_access_attr (const gcall *call,
       if (access->mode == access_write_only
 	  || access->mode == access_read_write)
 	{
-	  /* Subclass of decorated_region_model_context that
+	  /* Subclass of annotating_context that
 	     adds a note about the attr access to any saved diagnostics.  */
-	  class annotating_ctxt : public note_adding_context
+	  class annotating_ctxt : public annotating_context
 	  {
 	  public:
 	    annotating_ctxt (tree callee_fndecl,
 			     const attr_access &access,
 			     region_model_context *ctxt)
-	    : note_adding_context (ctxt),
+	    : annotating_context (ctxt),
 	      m_callee_fndecl (callee_fndecl),
 	      m_access (access)
 	    {
 	    }
-	    std::unique_ptr<pending_note> make_note () final override
+	    void add_annotations () final override
 	    {
-	      return make_unique<reason_attr_access>
-		(m_callee_fndecl, m_access);
+	      add_note (make_unique<reason_attr_access>
+			(m_callee_fndecl, m_access));
 	    }
 	  private:
 	    tree m_callee_fndecl;
diff --git a/gcc/analyzer/region-model.h b/gcc/analyzer/region-model.h
index 4f09f2e585ac..88772655bc5b 100644
--- a/gcc/analyzer/region-model.h
+++ b/gcc/analyzer/region-model.h
@@ -922,28 +922,28 @@ protected:
   region_model_context *m_inner;
 };
 
-/* Subclass of region_model_context_decorator that adds a note
-   when saving diagnostics.  */
+/* Subclass of region_model_context_decorator with a hook for adding
+   notes/events when saving diagnostics.  */
 
-class note_adding_context : public region_model_context_decorator
+class annotating_context : public region_model_context_decorator
 {
 public:
   bool warn (std::unique_ptr<pending_diagnostic> d) override
   {
     if (m_inner->warn (std::move (d)))
       {
-	add_note (make_note ());
+	add_annotations ();
 	return true;
       }
     else
       return false;
   }
 
-  /* Hook to make the new note.  */
-  virtual std::unique_ptr<pending_note> make_note () = 0;
+  /* Hook to add new event(s)/note(s)  */
+  virtual void add_annotations () = 0;
 
 protected:
-  note_adding_context (region_model_context *inner)
+  annotating_context (region_model_context *inner)
   : region_model_context_decorator (inner)
   {
   }
-- 
2.26.3


^ permalink raw reply	[flat|nested] 6+ messages in thread

* [pushed 2/6] analyzer: add ability for context to add events to a saved_diagnostic
  2023-08-22  1:21 [pushed 1/6] analyzer: convert note_adding_context to annotating_context David Malcolm
@ 2023-08-22  1:21 ` David Malcolm
  2023-08-22  1:21 ` [pushed 3/6] analyzer: handle NULL inner context in region_model_context_decorator David Malcolm
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 6+ messages in thread
From: David Malcolm @ 2023-08-22  1:21 UTC (permalink / raw)
  To: gcc-patches; +Cc: David Malcolm

Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu.
Pushed to trunk as r14-3372-g2503dd59b588d3.

gcc/analyzer/ChangeLog:
	* diagnostic-manager.cc (saved_diagnostic::add_event): New.
	(saved_diagnostic::add_any_saved_events): New.
	(diagnostic_manager::add_event): New.
	(dedupe_winners::emit_best): New.
	(diagnostic_manager::emit_saved_diagnostic): Make "sd" param
	non-const.  Call saved_diagnostic::add_any_saved_events.
	* diagnostic-manager.h (saved_diagnostic::add_event): New decl.
	(saved_diagnostic::add_any_saved_events): New decl.
	(saved_diagnostic::m_saved_events): New field.
	(diagnostic_manager::add_event): New decl.
	(diagnostic_manager::emit_saved_diagnostic): Make "sd" param
	non-const.
	* engine.cc (impl_region_model_context::add_event): New.
	* exploded-graph.h (impl_region_model_context::add_event): New decl.
	* region-model.cc
	(noop_region_model_context::add_event): New.
	(region_model_context_decorator::add_event): New.
	* region-model.h (region_model_context::add_event): New vfunc.
	(noop_region_model_context::add_event): New decl.
	(region_model_context_decorator::add_event): New decl.
---
 gcc/analyzer/diagnostic-manager.cc | 45 ++++++++++++++++++++++++++++--
 gcc/analyzer/diagnostic-manager.h  | 12 +++++++-
 gcc/analyzer/engine.cc             |  8 ++++++
 gcc/analyzer/exploded-graph.h      |  1 +
 gcc/analyzer/region-model.cc       | 13 +++++++++
 gcc/analyzer/region-model.h        |  6 ++++
 6 files changed, 82 insertions(+), 3 deletions(-)

diff --git a/gcc/analyzer/diagnostic-manager.cc b/gcc/analyzer/diagnostic-manager.cc
index 62f78f35dc08..10fea486b8c8 100644
--- a/gcc/analyzer/diagnostic-manager.cc
+++ b/gcc/analyzer/diagnostic-manager.cc
@@ -721,6 +721,15 @@ saved_diagnostic::add_note (std::unique_ptr<pending_note> pn)
   m_notes.safe_push (pn.release ());
 }
 
+/* Add EVENT to this diagnostic.  */
+
+void
+saved_diagnostic::add_event (std::unique_ptr<checker_event> event)
+{
+  gcc_assert (event);
+  m_saved_events.safe_push (event.release ());
+}
+
 /* Return a new json::object of the form
    {"sm": optional str,
     "enode": int,
@@ -890,6 +899,19 @@ saved_diagnostic::supercedes_p (const saved_diagnostic &other) const
   return m_d->supercedes_p (*other.m_d);
 }
 
+/* Move any saved checker_events from this saved_diagnostic to
+   the end of DST_PATH.  */
+
+void
+saved_diagnostic::add_any_saved_events (checker_path &dst_path)
+{
+  for (auto &event : m_saved_events)
+    {
+      dst_path.add_event (std::unique_ptr<checker_event> (event));
+      event = nullptr;
+    }
+}
+
 /* Emit any pending notes owned by this diagnostic.  */
 
 void
@@ -1057,6 +1079,20 @@ diagnostic_manager::add_note (std::unique_ptr<pending_note> pn)
   sd->add_note (std::move (pn));
 }
 
+/* Add EVENT to the most recent saved_diagnostic.  */
+
+void
+diagnostic_manager::add_event (std::unique_ptr<checker_event> event)
+{
+  LOG_FUNC (get_logger ());
+  gcc_assert (event);
+
+  /* Get most recent saved_diagnostic.  */
+  gcc_assert (m_saved_diagnostics.length () > 0);
+  saved_diagnostic *sd = m_saved_diagnostics[m_saved_diagnostics.length () - 1];
+  sd->add_event (std::move (event));
+}
+
 /* Return a new json::object of the form
    {"diagnostics"  : [obj for saved_diagnostic]}.  */
 
@@ -1308,7 +1344,7 @@ public:
       {
 	saved_diagnostic **slot = m_map.get (key);
 	gcc_assert (*slot);
-	const saved_diagnostic *sd = *slot;
+	saved_diagnostic *sd = *slot;
 	dm->emit_saved_diagnostic (eg, *sd);
       }
   }
@@ -1370,7 +1406,7 @@ diagnostic_manager::emit_saved_diagnostics (const exploded_graph &eg)
 
 void
 diagnostic_manager::emit_saved_diagnostic (const exploded_graph &eg,
-					   const saved_diagnostic &sd)
+					   saved_diagnostic &sd)
 {
   LOG_SCOPE (get_logger ());
   log ("sd[%i]: %qs at SN: %i",
@@ -1395,6 +1431,11 @@ diagnostic_manager::emit_saved_diagnostic (const exploded_graph &eg,
   /* Now prune it to just cover the most pertinent events.  */
   prune_path (&emission_path, sd.m_sm, sd.m_sval, sd.m_state);
 
+  /* Add any saved events to the path, giving contextual information
+     about what the analyzer was simulating as the diagnostic was
+     generated.  These don't get pruned, as they are probably pertinent.  */
+  sd.add_any_saved_events (emission_path);
+
   /* Add a final event to the path, covering the diagnostic itself.
      We use the final enode from the epath, which might be different from
      the sd.m_enode, as the dedupe code doesn't care about enodes, just
diff --git a/gcc/analyzer/diagnostic-manager.h b/gcc/analyzer/diagnostic-manager.h
index d3022b888dd5..413ab0c90b14 100644
--- a/gcc/analyzer/diagnostic-manager.h
+++ b/gcc/analyzer/diagnostic-manager.h
@@ -42,6 +42,7 @@ public:
   bool operator== (const saved_diagnostic &other) const;
 
   void add_note (std::unique_ptr<pending_note> pn);
+  void add_event (std::unique_ptr<checker_event> event);
 
   json::object *to_json () const;
 
@@ -64,6 +65,8 @@ public:
 
   bool supercedes_p (const saved_diagnostic &other) const;
 
+  void add_any_saved_events (checker_path &dst_path);
+
   void emit_any_notes () const;
 
   //private:
@@ -87,6 +90,12 @@ private:
 
   auto_vec<const saved_diagnostic *> m_duplicates;
   auto_delete_vec <pending_note> m_notes;
+
+  /* Optionally: additional context-dependent events to be emitted
+     immediately before the warning_event, giving more details of what
+     operation was being simulated when a diagnostic was saved
+     e.g. "looking for null terminator in param 2 of 'foo'".  */
+  auto_delete_vec <checker_event> m_saved_events;
 };
 
 class path_builder;
@@ -124,11 +133,12 @@ public:
 		       std::unique_ptr<pending_diagnostic> d);
 
   void add_note (std::unique_ptr<pending_note> pn);
+  void add_event (std::unique_ptr<checker_event> event);
 
   void emit_saved_diagnostics (const exploded_graph &eg);
 
   void emit_saved_diagnostic (const exploded_graph &eg,
-			      const saved_diagnostic &sd);
+			      saved_diagnostic &sd);
 
   unsigned get_num_diagnostics () const
   {
diff --git a/gcc/analyzer/engine.cc b/gcc/analyzer/engine.cc
index 61685f43fba0..3700154eec2c 100644
--- a/gcc/analyzer/engine.cc
+++ b/gcc/analyzer/engine.cc
@@ -149,6 +149,14 @@ impl_region_model_context::add_note (std::unique_ptr<pending_note> pn)
     m_eg->get_diagnostic_manager ().add_note (std::move (pn));
 }
 
+void
+impl_region_model_context::add_event (std::unique_ptr<checker_event> event)
+{
+  LOG_FUNC (get_logger ());
+  if (m_eg)
+    m_eg->get_diagnostic_manager ().add_event (std::move (event));
+}
+
 void
 impl_region_model_context::on_svalue_leak (const svalue *sval)
 
diff --git a/gcc/analyzer/exploded-graph.h b/gcc/analyzer/exploded-graph.h
index 4a4ef9d12b48..5a7ab645bfee 100644
--- a/gcc/analyzer/exploded-graph.h
+++ b/gcc/analyzer/exploded-graph.h
@@ -58,6 +58,7 @@ class impl_region_model_context : public region_model_context
 
   bool warn (std::unique_ptr<pending_diagnostic> d) final override;
   void add_note (std::unique_ptr<pending_note> pn) final override;
+  void add_event (std::unique_ptr<checker_event> event) final override;
   void on_svalue_leak (const svalue *) override;
   void on_liveness_change (const svalue_set &live_svalues,
 			   const region_model *model) final override;
diff --git a/gcc/analyzer/region-model.cc b/gcc/analyzer/region-model.cc
index 5c165ff127f8..fa30193943d2 100644
--- a/gcc/analyzer/region-model.cc
+++ b/gcc/analyzer/region-model.cc
@@ -5855,6 +5855,11 @@ noop_region_model_context::add_note (std::unique_ptr<pending_note>)
 {
 }
 
+void
+noop_region_model_context::add_event (std::unique_ptr<checker_event>)
+{
+}
+
 void
 noop_region_model_context::bifurcate (std::unique_ptr<custom_edge_info>)
 {
@@ -5865,6 +5870,14 @@ noop_region_model_context::terminate_path ()
 {
 }
 
+/* class region_model_context_decorator : public region_model_context.  */
+
+void
+region_model_context_decorator::add_event (std::unique_ptr<checker_event> event)
+{
+  m_inner->add_event (std::move (event));
+}
+
 /* struct model_merger.  */
 
 /* Dump a multiline representation of this merger to PP.  */
diff --git a/gcc/analyzer/region-model.h b/gcc/analyzer/region-model.h
index 88772655bc5b..cdfce0727cf7 100644
--- a/gcc/analyzer/region-model.h
+++ b/gcc/analyzer/region-model.h
@@ -627,6 +627,10 @@ class region_model_context
      pending diagnostic.  */
   virtual void add_note (std::unique_ptr<pending_note> pn) = 0;
 
+  /* Hook for clients to add an event to the last previously stored
+     pending diagnostic.  */
+  virtual void add_event (std::unique_ptr<checker_event> event) = 0;
+
   /* Hook for clients to be notified when an SVAL that was reachable
      in a previous state is no longer live, so that clients can emit warnings
      about leaks.  */
@@ -733,6 +737,7 @@ class noop_region_model_context : public region_model_context
 public:
   bool warn (std::unique_ptr<pending_diagnostic>) override { return false; }
   void add_note (std::unique_ptr<pending_note>) override;
+  void add_event (std::unique_ptr<checker_event>) override;
   void on_svalue_leak (const svalue *) override {}
   void on_liveness_change (const svalue_set &,
 			   const region_model *) override {}
@@ -815,6 +820,7 @@ class region_model_context_decorator : public region_model_context
   {
     m_inner->add_note (std::move (pn));
   }
+  void add_event (std::unique_ptr<checker_event> event) override;
 
   void on_svalue_leak (const svalue *sval) override
   {
-- 
2.26.3


^ permalink raw reply	[flat|nested] 6+ messages in thread

* [pushed 3/6] analyzer: handle NULL inner context in region_model_context_decorator
  2023-08-22  1:21 [pushed 1/6] analyzer: convert note_adding_context to annotating_context David Malcolm
  2023-08-22  1:21 ` [pushed 2/6] analyzer: add ability for context to add events to a saved_diagnostic David Malcolm
@ 2023-08-22  1:21 ` David Malcolm
  2023-08-22  1:21 ` [pushed 4/6] analyzer: replace -Wanalyzer-unterminated-string with scan_for_null_terminator [PR105899] David Malcolm
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 6+ messages in thread
From: David Malcolm @ 2023-08-22  1:21 UTC (permalink / raw)
  To: gcc-patches; +Cc: David Malcolm

Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu.
Pushed to trunk as r14-3373-g1e7b0a5d7a45dc.

gcc/analyzer/ChangeLog:
	* region-model.cc (region_model_context_decorator::add_event):
	Handle m_inner being NULL.
	* region-model.h (class region_model_context_decorator): Likewise.
	(annotating_context::warn): Likewise.
---
 gcc/analyzer/region-model.cc |  3 +-
 gcc/analyzer/region-model.h  | 86 ++++++++++++++++++++++++------------
 2 files changed, 60 insertions(+), 29 deletions(-)

diff --git a/gcc/analyzer/region-model.cc b/gcc/analyzer/region-model.cc
index fa30193943d2..ed93fb89f933 100644
--- a/gcc/analyzer/region-model.cc
+++ b/gcc/analyzer/region-model.cc
@@ -5875,7 +5875,8 @@ noop_region_model_context::terminate_path ()
 void
 region_model_context_decorator::add_event (std::unique_ptr<checker_event> event)
 {
-  m_inner->add_event (std::move (event));
+  if (m_inner)
+    m_inner->add_event (std::move (event));
 }
 
 /* struct model_merger.  */
diff --git a/gcc/analyzer/region-model.h b/gcc/analyzer/region-model.h
index cdfce0727cf7..a01399c8e85a 100644
--- a/gcc/analyzer/region-model.h
+++ b/gcc/analyzer/region-model.h
@@ -813,93 +813,118 @@ class region_model_context_decorator : public region_model_context
  public:
   bool warn (std::unique_ptr<pending_diagnostic> d) override
   {
-    return m_inner->warn (std::move (d));
+    if (m_inner)
+      return m_inner->warn (std::move (d));
+    else
+      return false;
   }
 
   void add_note (std::unique_ptr<pending_note> pn) override
   {
-    m_inner->add_note (std::move (pn));
+    if (m_inner)
+      m_inner->add_note (std::move (pn));
   }
   void add_event (std::unique_ptr<checker_event> event) override;
 
   void on_svalue_leak (const svalue *sval) override
   {
-    m_inner->on_svalue_leak (sval);
+    if (m_inner)
+      m_inner->on_svalue_leak (sval);
   }
 
   void on_liveness_change (const svalue_set &live_svalues,
 			   const region_model *model) override
   {
-    m_inner->on_liveness_change (live_svalues, model);
+    if (m_inner)
+      m_inner->on_liveness_change (live_svalues, model);
   }
 
   logger *get_logger () override
   {
-    return m_inner->get_logger ();
+    if (m_inner)
+      return m_inner->get_logger ();
+    else
+      return nullptr;
   }
 
   void on_condition (const svalue *lhs,
 		     enum tree_code op,
 		     const svalue *rhs) override
   {
-    m_inner->on_condition (lhs, op, rhs);
+    if (m_inner)
+      m_inner->on_condition (lhs, op, rhs);
   }
 
   void on_bounded_ranges (const svalue &sval,
 			  const bounded_ranges &ranges) override
   {
-    m_inner->on_bounded_ranges (sval, ranges);
+    if (m_inner)
+      m_inner->on_bounded_ranges (sval, ranges);
   }
 
   void on_pop_frame (const frame_region *frame_reg) override
   {
-    m_inner->on_pop_frame (frame_reg);
+    if (m_inner)
+      m_inner->on_pop_frame (frame_reg);
   }
 
   void on_unknown_change (const svalue *sval, bool is_mutable) override
   {
-    m_inner->on_unknown_change (sval, is_mutable);
+    if (m_inner)
+      m_inner->on_unknown_change (sval, is_mutable);
   }
 
   void on_phi (const gphi *phi, tree rhs) override
   {
-    m_inner->on_phi (phi, rhs);
+    if (m_inner)
+      m_inner->on_phi (phi, rhs);
   }
 
   void on_unexpected_tree_code (tree t,
 				const dump_location_t &loc) override
   {
-    m_inner->on_unexpected_tree_code (t, loc);
+    if (m_inner)
+      m_inner->on_unexpected_tree_code (t, loc);
   }
 
   void on_escaped_function (tree fndecl) override
   {
-    m_inner->on_escaped_function (fndecl);
+    if (m_inner)
+      m_inner->on_escaped_function (fndecl);
   }
 
   uncertainty_t *get_uncertainty () override
   {
-    return m_inner->get_uncertainty ();
+    if (m_inner)
+      return m_inner->get_uncertainty ();
+    else
+      return nullptr;
   }
 
   void purge_state_involving (const svalue *sval) override
   {
-    m_inner->purge_state_involving (sval);
+    if (m_inner)
+      m_inner->purge_state_involving (sval);
   }
 
   void bifurcate (std::unique_ptr<custom_edge_info> info) override
   {
-    m_inner->bifurcate (std::move (info));
+    if (m_inner)
+      m_inner->bifurcate (std::move (info));
   }
 
   void terminate_path () override
   {
-    m_inner->terminate_path ();
+    if (m_inner)
+      m_inner->terminate_path ();
   }
 
   const extrinsic_state *get_ext_state () const override
   {
-    return m_inner->get_ext_state ();
+    if (m_inner)
+      return m_inner->get_ext_state ();
+    else
+      return nullptr;
   }
 
   bool get_state_map_by_name (const char *name,
@@ -909,20 +934,25 @@ class region_model_context_decorator : public region_model_context
 			      std::unique_ptr<sm_context> *out_sm_context)
     override
   {
-    return m_inner->get_state_map_by_name (name, out_smap, out_sm, out_sm_idx,
-					   out_sm_context);
+    if (m_inner)
+      return m_inner->get_state_map_by_name (name, out_smap, out_sm, out_sm_idx,
+					     out_sm_context);
+    else
+      return false;
   }
 
   const gimple *get_stmt () const override
   {
-    return m_inner->get_stmt ();
+    if (m_inner)
+      return m_inner->get_stmt ();
+    else
+      return nullptr;
   }
 
 protected:
   region_model_context_decorator (region_model_context *inner)
   : m_inner (inner)
   {
-    gcc_assert (m_inner);
   }
 
   region_model_context *m_inner;
@@ -936,13 +966,13 @@ class annotating_context : public region_model_context_decorator
 public:
   bool warn (std::unique_ptr<pending_diagnostic> d) override
   {
-    if (m_inner->warn (std::move (d)))
-      {
-	add_annotations ();
-	return true;
-      }
-    else
-      return false;
+    if (m_inner)
+      if (m_inner->warn (std::move (d)))
+	{
+	  add_annotations ();
+	  return true;
+	}
+    return false;
   }
 
   /* Hook to add new event(s)/note(s)  */
-- 
2.26.3


^ permalink raw reply	[flat|nested] 6+ messages in thread

* [pushed 4/6] analyzer: replace -Wanalyzer-unterminated-string with scan_for_null_terminator [PR105899]
  2023-08-22  1:21 [pushed 1/6] analyzer: convert note_adding_context to annotating_context David Malcolm
  2023-08-22  1:21 ` [pushed 2/6] analyzer: add ability for context to add events to a saved_diagnostic David Malcolm
  2023-08-22  1:21 ` [pushed 3/6] analyzer: handle NULL inner context in region_model_context_decorator David Malcolm
@ 2023-08-22  1:21 ` David Malcolm
  2023-08-22  1:21 ` [pushed 5/6] analyzer: add kf_fopen David Malcolm
  2023-08-22  1:21 ` [pushed 6/6] analyzer: check format strings for null termination [PR105899] David Malcolm
  4 siblings, 0 replies; 6+ messages in thread
From: David Malcolm @ 2023-08-22  1:21 UTC (permalink / raw)
  To: gcc-patches; +Cc: David Malcolm

In r14-3169-g325f9e88802daa I added check_for_null_terminated_string_arg
to -fanalyzer, calling it in various places, with a sole check for
unterminated string constants, adding -Wanalyzer-unterminated-string for
this case.

This patch adds region_model::scan_for_null_terminator, which simulates
scanning memory for a zero byte, complaining about uninitiliazed bytes
and out-of-range accesses seen before any zero byte is seen.

This more flexible approach catches the issues we saw before with
-Wanalyzer-unterminated-string, and also catches uninitialized runs
of bytes, and I believe will be a better way to build checking of C
string operations in the analyzer.

Given that the patch makes -Wanalyzer-unterminated-string redundant
and that this option was only in trunk for 10 days and has no known
users, the patch simply removes the option without a compatibility
fallback.

The patch uses custom events and notes to provide context on where
the issues are coming from.  For example, given:

null-terminated-strings-1.c: In function ‘test_partially_initialized’:
null-terminated-strings-1.c:71:3: warning: use of uninitialized value ‘buf[1]’ [CWE-457] [-Wanalyzer-use-of-uninitialized-value]
   71 |   __analyzer_get_strlen (buf);
      |   ^~~~~~~~~~~~~~~~~~~~~~~~~~~
  ‘test_partially_initialized’: events 1-3
    |
    |   69 |   char buf[16];
    |      |        ^~~
    |      |        |
    |      |        (1) region created on stack here
    |   70 |   buf[0] = 'a';
    |   71 |   __analyzer_get_strlen (buf);
    |      |   ~~~~~~~~~~~~~~~~~~~~~~~~~~~
    |      |   |
    |      |   (2) while looking for null terminator for argument 1 (‘&buf’) of ‘__analyzer_get_strlen’...
    |      |   (3) use of uninitialized value ‘buf[1]’ here
    |
analyzer-decls.h:59:22: note: argument 1 of ‘__analyzer_get_strlen’ must be a pointer to a null-terminated string
   59 | extern __SIZE_TYPE__ __analyzer_get_strlen (const char *ptr);
      |                      ^~~~~~~~~~~~~~~~~~~~~

Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu.
Pushed to trunk as r14-3374-gfe97f09a0caeff.

gcc/analyzer/ChangeLog:
	PR analyzer/105899
	* analyzer.opt (Wanalyzer-unterminated-string): Delete.
	* call-details.cc
	(call_details::check_for_null_terminated_string_arg): Convert
	return type from void to const svalue *.  Add param "out_sval".
	* call-details.h
	(call_details::check_for_null_terminated_string_arg): Likewise.
	* kf-analyzer.cc (kf_analyzer_get_strlen::impl_call_pre): Wire up
	to result of check_for_null_terminated_string_arg.
	* region-model.cc (get_strlen): Delete.
	(class unterminated_string_arg): Delete.
	(struct fragment): New.
	(class iterable_cluster): New.
	(region_model::get_store_bytes): New.
	(get_tree_for_byte_offset): New.
	(region_model::scan_for_null_terminator): New.
	(region_model::check_for_null_terminated_string_arg): Convert
	return type from void to const svalue *.  Add param "out_sval".
	Reimplement in terms of scan_for_null_terminator, dropping the
	special-case for -Wanalyzer-unterminated-string.
	* region-model.h (region_model::get_store_bytes): New decl.
	(region_model::scan_for_null_terminator): New decl.
	(region_model::check_for_null_terminated_string_arg): Convert
	return type from void to const svalue *.  Add param "out_sval".
	* store.cc (concrete_binding::get_byte_range): New.
	* store.h (concrete_binding::get_byte_range): New decl.
	(store_manager::get_concrete_binding): New overload.

gcc/ChangeLog:
	PR analyzer/105899
	* doc/invoke.texi: Remove -Wanalyzer-unterminated-string.

gcc/testsuite/ChangeLog:
	PR analyzer/105899
	* gcc.dg/analyzer/error-1.c: Update expected results to reflect
	reimplementation of unterminated string detection.  Add test
	coverage for uninitialized buffers.
	* gcc.dg/analyzer/null-terminated-strings-1.c: Likewise.
	* gcc.dg/analyzer/putenv-1.c: Likewise.
	* gcc.dg/analyzer/strchr-1.c: Likewise.
	* gcc.dg/analyzer/strcpy-1.c: Likewise.
	* gcc.dg/analyzer/strdup-1.c: Likewise.
---
 gcc/analyzer/analyzer.opt                     |   4 -
 gcc/analyzer/call-details.cc                  |   8 +-
 gcc/analyzer/call-details.h                   |   4 +-
 gcc/analyzer/kf-analyzer.cc                   |  15 +-
 gcc/analyzer/region-model.cc                  | 521 +++++++++++++++---
 gcc/analyzer/region-model.h                   |  13 +-
 gcc/analyzer/store.cc                         |   9 +
 gcc/analyzer/store.h                          |   7 +
 gcc/doc/invoke.texi                           |  13 -
 gcc/testsuite/gcc.dg/analyzer/error-1.c       |  20 +-
 .../analyzer/null-terminated-strings-1.c      | 128 ++++-
 gcc/testsuite/gcc.dg/analyzer/putenv-1.c      |  13 +-
 gcc/testsuite/gcc.dg/analyzer/strchr-1.c      |  10 +-
 gcc/testsuite/gcc.dg/analyzer/strcpy-1.c      |  10 +-
 gcc/testsuite/gcc.dg/analyzer/strdup-1.c      |  10 +-
 15 files changed, 662 insertions(+), 123 deletions(-)

diff --git a/gcc/analyzer/analyzer.opt b/gcc/analyzer/analyzer.opt
index 6658ac520f14..7917473d1223 100644
--- a/gcc/analyzer/analyzer.opt
+++ b/gcc/analyzer/analyzer.opt
@@ -214,10 +214,6 @@ Wanalyzer-tainted-size
 Common Var(warn_analyzer_tainted_size) Init(1) Warning
 Warn about code paths in which an unsanitized value is used as a size.
 
-Wanalyzer-unterminated-string
-Common Var(warn_analyzer_unterminated_string) Init(1) Warning
-Warn about code paths which attempt to find the length of an unterminated string.
-
 Wanalyzer-use-after-free
 Common Var(warn_analyzer_use_after_free) Init(1) Warning
 Warn about code paths in which a freed value is used.
diff --git a/gcc/analyzer/call-details.cc b/gcc/analyzer/call-details.cc
index fa86f55177a4..e497fc58e028 100644
--- a/gcc/analyzer/call-details.cc
+++ b/gcc/analyzer/call-details.cc
@@ -376,11 +376,13 @@ call_details::lookup_function_attribute (const char *attr_name) const
   return lookup_attribute (attr_name, TYPE_ATTRIBUTES (allocfntype));
 }
 
-void
-call_details::check_for_null_terminated_string_arg (unsigned arg_idx) const
+const svalue *
+call_details::
+check_for_null_terminated_string_arg (unsigned arg_idx,
+				      const svalue **out_sval) const
 {
   region_model *model = get_model ();
-  model->check_for_null_terminated_string_arg (*this, arg_idx);
+  return model->check_for_null_terminated_string_arg (*this, arg_idx, out_sval);
 }
 
 } // namespace ana
diff --git a/gcc/analyzer/call-details.h b/gcc/analyzer/call-details.h
index 0622cab7856a..86f0e68072bd 100644
--- a/gcc/analyzer/call-details.h
+++ b/gcc/analyzer/call-details.h
@@ -71,7 +71,9 @@ public:
 
   tree lookup_function_attribute (const char *attr_name) const;
 
-  void check_for_null_terminated_string_arg (unsigned arg_idx) const;
+  const svalue *
+  check_for_null_terminated_string_arg (unsigned arg_idx,
+					const svalue **out_sval = nullptr) const;
 
 private:
   const gcall *m_call;
diff --git a/gcc/analyzer/kf-analyzer.cc b/gcc/analyzer/kf-analyzer.cc
index 1a0c94089aca..c767ebcb6615 100644
--- a/gcc/analyzer/kf-analyzer.cc
+++ b/gcc/analyzer/kf-analyzer.cc
@@ -369,8 +369,19 @@ public:
   }
   void impl_call_pre (const call_details &cd) const final override
   {
-    cd.check_for_null_terminated_string_arg (0);
-    cd.set_any_lhs_with_defaults ();
+    if (const svalue *bytes_read = cd.check_for_null_terminated_string_arg (0))
+      {
+	region_model_manager *mgr = cd.get_manager ();
+	/* strlen is (bytes_read - 1).  */
+	const svalue *strlen_sval
+	  = mgr->get_or_create_binop (size_type_node,
+				      MINUS_EXPR,
+				      bytes_read,
+				      mgr->get_or_create_int_cst (size_type_node, 1));
+	cd.maybe_set_lhs (strlen_sval);
+      }
+    else
+      cd.set_any_lhs_with_defaults ();
   }
 };
 
diff --git a/gcc/analyzer/region-model.cc b/gcc/analyzer/region-model.cc
index ed93fb89f933..0fce18896fbc 100644
--- a/gcc/analyzer/region-model.cc
+++ b/gcc/analyzer/region-model.cc
@@ -3175,26 +3175,6 @@ region_model::set_value (tree lhs, tree rhs, region_model_context *ctxt)
   set_value (lhs_reg, rhs_sval, ctxt);
 }
 
-/* Look for the first 0 byte within STRING_CST.
-   If there is one, write its index to *OUT and return true.
-   Otherwise, return false.  */
-
-static bool
-get_strlen (tree string_cst, int *out)
-{
-  gcc_assert (TREE_CODE (string_cst) == STRING_CST);
-
-  if (const void *p = memchr (TREE_STRING_POINTER (string_cst),
-			      0,
-			      TREE_STRING_LENGTH (string_cst)))
-    {
-      *out = (const char *)p - TREE_STRING_POINTER (string_cst);
-      return true;
-    }
-  else
-    return false;
-}
-
 /* A bundle of information about a problematic argument at a callsite
    for use by pending_diagnostic subclasses for reporting and
    for deduplication.  */
@@ -3236,106 +3216,477 @@ inform_about_expected_null_terminated_string_arg (const call_arg_details &ad)
 	  ad.m_arg_idx + 1, ad.m_called_fndecl);
 }
 
-/* A subclass of pending_diagnostic for complaining about uses
-   of unterminated strings (thus accessing beyond the bounds
-   of a buffer).  */
+/* A binding of a specific svalue at a concrete byte range.  */
 
-class unterminated_string_arg
-: public pending_diagnostic_subclass<unterminated_string_arg>
+struct fragment
 {
-public:
-  unterminated_string_arg (const call_arg_details arg_details)
-  : m_arg_details (arg_details)
+  fragment ()
+  : m_byte_range (0, 0), m_sval (nullptr)
   {
-    gcc_assert (m_arg_details.m_called_fndecl);
   }
 
-  const char *get_kind () const final override
+  fragment (const byte_range &bytes, const svalue *sval)
+    : m_byte_range (bytes), m_sval (sval)
   {
-    return "unterminated_string_arg";
   }
 
-  bool operator== (const unterminated_string_arg &other) const
+  static int cmp_ptrs (const void *p1, const void *p2)
   {
-    return m_arg_details == other.m_arg_details;
+    const fragment *f1 = (const fragment *)p1;
+    const fragment *f2 = (const fragment *)p2;
+    return byte_range::cmp (f1->m_byte_range, f2->m_byte_range);
   }
 
-  int get_controlling_option () const final override
+  /* Determine if there is a zero terminator somewhere in the
+     bytes of this fragment, starting at START_READ_OFFSET (which
+     is absolute to the start of the cluster as a whole), and stopping
+     at the end of this fragment.
+
+     Return a tristate:
+     - true if there definitely is a zero byte, writing to *OUT_BYTES_READ
+     the number of bytes from that would be read, including the zero byte.
+     - false if there definitely isn't a zero byte
+     - unknown if we don't know.  */
+  tristate has_null_terminator (byte_offset_t start_read_offset,
+				byte_offset_t *out_bytes_read) const
   {
-    return OPT_Wanalyzer_unterminated_string;
+    byte_offset_t rel_start_read_offset
+      = start_read_offset - m_byte_range.get_start_byte_offset ();
+    gcc_assert (rel_start_read_offset >= 0);
+    byte_offset_t available_bytes
+      = (m_byte_range.get_next_byte_offset () - start_read_offset);
+    gcc_assert (available_bytes >= 0);
+
+    if (rel_start_read_offset > INT_MAX)
+      return tristate::TS_UNKNOWN;
+    HOST_WIDE_INT rel_start_read_offset_hwi = rel_start_read_offset.slow ();
+
+    if (available_bytes > INT_MAX)
+      return tristate::TS_UNKNOWN;
+    HOST_WIDE_INT available_bytes_hwi = available_bytes.slow ();
+
+    switch (m_sval->get_kind ())
+      {
+      case SK_CONSTANT:
+	{
+	  tree cst
+	    = as_a <const constant_svalue *> (m_sval)->get_constant ();
+	  switch (TREE_CODE (cst))
+	    {
+	    case STRING_CST:
+	      {
+		/* Look for the first 0 byte within STRING_CST
+		   from START_READ_OFFSET onwards.  */
+		const HOST_WIDE_INT num_bytes_to_search
+		  = std::min<HOST_WIDE_INT> ((TREE_STRING_LENGTH (cst)
+					      - rel_start_read_offset_hwi),
+					     available_bytes_hwi);
+		const char *start = (TREE_STRING_POINTER (cst)
+				     + rel_start_read_offset_hwi);
+		if (num_bytes_to_search >= 0)
+		  if (const void *p = memchr (start, 0,
+					      num_bytes_to_search))
+		    {
+		      *out_bytes_read = (const char *)p - start + 1;
+		      return tristate (true);
+		    }
+
+		*out_bytes_read = available_bytes;
+		return tristate (false);
+	      }
+	      break;
+	    case INTEGER_CST:
+	      if (rel_start_read_offset_hwi == 0
+		  && integer_onep (TYPE_SIZE_UNIT (TREE_TYPE (cst))))
+		{
+		  /* Model accesses to the initial byte of a 1-byte
+		     INTEGER_CST.  */
+		  if (zerop (cst))
+		    {
+		      *out_bytes_read = 1;
+		      return tristate (true);
+		    }
+		  else
+		    {
+		      *out_bytes_read = available_bytes;
+		      return tristate (false);
+		    }
+		}
+	      /* Treat any other access to an INTEGER_CST as unknown.  */
+	      return tristate::TS_UNKNOWN;
+
+	    default:
+	      gcc_unreachable ();
+	      break;
+	    }
+	}
+	break;
+      default:
+	// TODO: it may be possible to handle other cases here.
+	return tristate::TS_UNKNOWN;
+      }
   }
 
-  bool emit (rich_location *rich_loc, logger *) final override
+  byte_range m_byte_range;
+  const svalue *m_sval;
+};
+
+/* A frozen copy of a single base region's binding_cluster within a store,
+   optimized for traversal of the concrete parts in byte order.
+   This only captures concrete bindings, and is an implementation detail
+   of region_model::scan_for_null_terminator.  */
+
+class iterable_cluster
+{
+public:
+  iterable_cluster (const binding_cluster *cluster)
   {
-    auto_diagnostic_group d;
-    bool warned;
-    if (m_arg_details.m_arg_expr)
-      warned = warning_at (rich_loc, get_controlling_option (),
-			   "passing pointer to unterminated string %qE"
-			   " as argument %i of %qE",
-			   m_arg_details.m_arg_expr,
-			   m_arg_details.m_arg_idx + 1,
-			   m_arg_details.m_called_fndecl);
-    else
-      warned = warning_at (rich_loc, get_controlling_option (),
-			   "passing pointer to unterminated string"
-			   " as argument %i of %qE",
-			   m_arg_details.m_arg_idx + 1,
-			   m_arg_details.m_called_fndecl);
-    if (warned)
-      inform_about_expected_null_terminated_string_arg (m_arg_details);
-    return warned;
+    if (!cluster)
+      return;
+    for (auto iter : *cluster)
+      {
+	const binding_key *key = iter.first;
+	const svalue *sval = iter.second;
+
+	if (const concrete_binding *concrete_key
+	    = key->dyn_cast_concrete_binding ())
+	  {
+	    byte_range fragment_bytes (0, 0);
+	    if (concrete_key->get_byte_range (&fragment_bytes))
+	      m_fragments.safe_push (fragment (fragment_bytes, sval));
+	  }
+      }
+    m_fragments.qsort (fragment::cmp_ptrs);
   }
 
-  label_text describe_final_event (const evdesc::final_event &ev) final override
+  bool
+  get_fragment_for_byte (byte_offset_t byte, fragment *out_frag) const
   {
-    return ev.formatted_print
-      ("passing pointer to unterminated buffer as argument %i of %qE"
-       " would lead to read past the end of the buffer",
-       m_arg_details.m_arg_idx + 1,
-       m_arg_details.m_called_fndecl);
+    /* TODO: binary search rather than linear.  */
+    unsigned iter_idx;
+    for (iter_idx = 0; iter_idx < m_fragments.length (); iter_idx++)
+      {
+	if (m_fragments[iter_idx].m_byte_range.contains_p (byte))
+	{
+	  *out_frag = m_fragments[iter_idx];
+	  return true;
+	}
+      }
+    return false;
   }
 
 private:
-  const call_arg_details m_arg_details;
+  auto_vec<fragment> m_fragments;
 };
 
+/* Simulate reading the bytes at BYTES from BASE_REG.
+   Complain to CTXT about any issues with the read e.g. out-of-bounds.  */
+
+const svalue *
+region_model::get_store_bytes (const region *base_reg,
+			       const byte_range &bytes,
+			       region_model_context *ctxt) const
+{
+  const svalue *index_sval
+    = m_mgr->get_or_create_int_cst (size_type_node,
+				    bytes.get_start_byte_offset ());
+  const region *offset_reg = m_mgr->get_offset_region (base_reg,
+						       NULL_TREE,
+						       index_sval);
+  const svalue *byte_size_sval
+    = m_mgr->get_or_create_int_cst (size_type_node, bytes.m_size_in_bytes);
+  const region *read_reg = m_mgr->get_sized_region (offset_reg,
+						    NULL_TREE,
+						    byte_size_sval);
+
+  /* Simulate reading those bytes from the store.  */
+  const svalue *sval = get_store_value (read_reg, ctxt);
+  return sval;
+}
+
+static tree
+get_tree_for_byte_offset (tree ptr_expr, byte_offset_t byte_offset)
+{
+  gcc_assert (ptr_expr);
+  return fold_build2 (MEM_REF,
+		      char_type_node,
+		      ptr_expr, wide_int_to_tree (size_type_node, byte_offset));
+}
+
+/* Simulate a series of reads of REG until we find a 0 byte
+   (equivalent to calling strlen).
+
+   Complain to CTXT and return NULL if:
+   - the buffer pointed to isn't null-terminated
+   - the buffer pointed to has any uninitalized bytes before any 0-terminator
+   - any of the reads aren't within the bounds of the underlying base region
+
+   Otherwise, return a svalue for the number of bytes read (strlen + 1),
+   and, if OUT_SVAL is non-NULL, write to *OUT_SVAL with an svalue
+   representing the content of REG up to and including the terminator.
+
+   Algorithm
+   =========
+
+   Get offset for first byte to read.
+   Find the binding (if any) that contains it.
+   Find the size in bits of that binding.
+   Round to the nearest byte (which way???)
+     Or maybe give up if we have a partial binding there.
+   Get the svalue from the binding.
+   Determine the strlen (if any) of that svalue.
+     Does it have a 0-terminator within it?
+      If so, we have a partial read up to and including that terminator
+       Read those bytes from the store; add to the result in the correct place.
+       Finish
+      If not, we have a full read of that svalue
+       Read those bytes from the store; add to the result in the correct place.
+       Update read/write offsets
+       Continue
+      If unknown:
+       Result is unknown
+       Finish
+*/
+
+const svalue *
+region_model::scan_for_null_terminator (const region *reg,
+					tree expr,
+					const svalue **out_sval,
+					region_model_context *ctxt) const
+{
+  store_manager *store_mgr = m_mgr->get_store_manager ();
+
+  region_offset offset = reg->get_offset (m_mgr);
+  if (offset.symbolic_p ())
+    {
+      if (out_sval)
+	*out_sval = m_mgr->get_or_create_unknown_svalue (NULL_TREE);
+      return m_mgr->get_or_create_unknown_svalue (size_type_node);
+    }
+  byte_offset_t src_byte_offset;
+  if (!offset.get_concrete_byte_offset (&src_byte_offset))
+    {
+      if (out_sval)
+	*out_sval = m_mgr->get_or_create_unknown_svalue (NULL_TREE);
+      return m_mgr->get_or_create_unknown_svalue (size_type_node);
+    }
+  const byte_offset_t initial_src_byte_offset = src_byte_offset;
+  byte_offset_t dst_byte_offset = 0;
+
+  const region *base_reg = reg->get_base_region ();
+
+  if (const string_region *str_reg = base_reg->dyn_cast_string_region ())
+    {
+      tree string_cst = str_reg->get_string_cst ();
+      if (const void *p = memchr (TREE_STRING_POINTER (string_cst),
+				  0,
+				  TREE_STRING_LENGTH (string_cst)))
+	{
+	  size_t num_bytes_read
+	    = (const char *)p - TREE_STRING_POINTER (string_cst) + 1;
+	  /* Simulate the read.  */
+	  byte_range bytes_to_read (0, num_bytes_read);
+	  const svalue *sval = get_store_bytes (reg, bytes_to_read, ctxt);
+	  if (out_sval)
+	    *out_sval = sval;
+	  return m_mgr->get_or_create_int_cst (size_type_node,
+					       num_bytes_read);
+	}
+    }
+
+  const binding_cluster *cluster = m_store.get_cluster (base_reg);
+  iterable_cluster c (cluster);
+  binding_map result;
+
+  while (1)
+    {
+      fragment f;
+      if (c.get_fragment_for_byte (src_byte_offset, &f))
+	{
+	  byte_offset_t fragment_bytes_read;
+	  tristate is_terminated
+	    = f.has_null_terminator (src_byte_offset, &fragment_bytes_read);
+	  if (is_terminated.is_unknown ())
+	    {
+	      if (out_sval)
+		*out_sval = m_mgr->get_or_create_unknown_svalue (NULL_TREE);
+	      return m_mgr->get_or_create_unknown_svalue (size_type_node);
+	    }
+
+	  /* Simulate reading those bytes from the store.  */
+	  byte_range bytes_to_read (src_byte_offset, fragment_bytes_read);
+	  const svalue *sval = get_store_bytes (base_reg, bytes_to_read, ctxt);
+	  check_for_poison (sval, expr, nullptr, ctxt);
+
+	  if (out_sval)
+	    {
+	      byte_range bytes_to_write (dst_byte_offset, fragment_bytes_read);
+	      const binding_key *key
+		= store_mgr->get_concrete_binding (bytes_to_write);
+	      result.put (key, sval);
+	    }
+
+	  src_byte_offset += fragment_bytes_read;
+	  dst_byte_offset += fragment_bytes_read;
+
+	  if (is_terminated.is_true ())
+	    {
+	      if (out_sval)
+		*out_sval = m_mgr->get_or_create_compound_svalue (NULL_TREE,
+								  result);
+	      return m_mgr->get_or_create_int_cst (size_type_node,
+						   dst_byte_offset);
+	    }
+	}
+      else
+	break;
+    }
+
+  /* No binding for this base_region, or no binding at src_byte_offset
+     (or a symbolic binding).  */
+
+  /* TODO: the various special-cases seen in
+     region_model::get_store_value.  */
+
+  /* Simulate reading from this byte, then give up.  */
+  byte_range bytes_to_read (src_byte_offset, 1);
+  const svalue *sval = get_store_bytes (base_reg, bytes_to_read, ctxt);
+  tree byte_expr
+    = get_tree_for_byte_offset (expr,
+				src_byte_offset - initial_src_byte_offset);
+  check_for_poison (sval, byte_expr, nullptr, ctxt);
+  if (base_reg->can_have_initial_svalue_p ())
+    {
+      if (out_sval)
+	*out_sval = m_mgr->get_or_create_unknown_svalue (NULL_TREE);
+      return m_mgr->get_or_create_unknown_svalue (size_type_node);
+    }
+  else
+    return nullptr;
+}
+
 /* Check that argument ARG_IDX (0-based) to the call described by CD
    is a pointer to a valid null-terminated string.
 
-   Complain if the buffer pointed to isn't null-terminated.
+   Simulate scanning through the buffer, reading until we find a 0 byte
+   (equivalent to calling strlen).
 
-   TODO: we should also complain if:
-   - the pointer is NULL (or could be)
-   - the buffer pointed to is uninitalized before any 0-terminator
-   - the 0-terminator is within the bounds of the underlying base region
+   Complain and return NULL if:
+   - the buffer pointed to isn't null-terminated
+   - the buffer pointed to has any uninitalized bytes before any 0-terminator
+   - any of the reads aren't within the bounds of the underlying base region
 
-   We're checking that the called function could validly iterate through
-   the buffer reading it until it finds a 0 byte (such as by calling
-   strlen, or equivalent code).  */
+   Otherwise, return a svalue for the number of bytes read (strlen + 1),
+   and, if OUT_SVAL is non-NULL, write to *OUT_SVAL with an svalue
+   representing the content of the buffer up to and including the terminator.
 
-void
+   TODO: we should also complain if:
+   - the pointer is NULL (or could be).  */
+
+const svalue *
 region_model::check_for_null_terminated_string_arg (const call_details &cd,
-						    unsigned arg_idx)
+						    unsigned arg_idx,
+						    const svalue **out_sval)
 {
-  region_model_context *ctxt = cd.get_ctxt ();
+  class null_terminator_check_event : public custom_event
+  {
+  public:
+    null_terminator_check_event (const event_loc_info &loc_info,
+				 const call_arg_details &arg_details)
+    : custom_event (loc_info),
+      m_arg_details (arg_details)
+    {
+    }
+
+    label_text get_desc (bool can_colorize) const final override
+    {
+      if (m_arg_details.m_arg_expr)
+	return make_label_text (can_colorize,
+				"while looking for null terminator"
+				" for argument %i (%qE) of %qD...",
+				m_arg_details.m_arg_idx + 1,
+				m_arg_details.m_arg_expr,
+				m_arg_details.m_called_fndecl);
+      else
+	return make_label_text (can_colorize,
+				"while looking for null terminator"
+				" for argument %i of %qD...",
+				m_arg_details.m_arg_idx + 1,
+				m_arg_details.m_called_fndecl);
+    }
+
+  private:
+    const call_arg_details m_arg_details;
+  };
+
+  class null_terminator_check_decl_note
+    : public pending_note_subclass<null_terminator_check_decl_note>
+  {
+  public:
+    null_terminator_check_decl_note (const call_arg_details &arg_details)
+    : m_arg_details (arg_details)
+    {
+    }
+
+    const char *get_kind () const final override
+    {
+      return "null_terminator_check_decl_note";
+    }
+
+    void emit () const final override
+    {
+      inform_about_expected_null_terminated_string_arg (m_arg_details);
+    }
+
+    bool operator== (const null_terminator_check_decl_note &other) const
+    {
+      return m_arg_details == other.m_arg_details;
+    }
+
+  private:
+    const call_arg_details m_arg_details;
+  };
+
+  /* Subclass of decorated_region_model_context that
+     adds the above event and note to any saved diagnostics.  */
+  class annotating_ctxt : public annotating_context
+  {
+  public:
+    annotating_ctxt (const call_details &cd,
+		     unsigned arg_idx)
+    : annotating_context (cd.get_ctxt ()),
+      m_cd (cd),
+      m_arg_idx (arg_idx)
+    {
+    }
+    void add_annotations () final override
+    {
+      call_arg_details arg_details (m_cd, m_arg_idx);
+      event_loc_info loc_info (m_cd.get_location (),
+			       m_cd.get_model ()->get_current_function ()->decl,
+			       m_cd.get_model ()->get_stack_depth ());
+
+      add_event (make_unique<null_terminator_check_event> (loc_info,
+							   arg_details));
+      add_note (make_unique <null_terminator_check_decl_note> (arg_details));
+    }
+  private:
+    const call_details &m_cd;
+    unsigned m_arg_idx;
+  };
+
+  /* Use this ctxt below so that any diagnostics that get added
+     get annotated.  */
+  annotating_ctxt my_ctxt (cd, arg_idx);
 
   const svalue *arg_sval = cd.get_arg_svalue (arg_idx);
   const region *buf_reg
-    = deref_rvalue (arg_sval, cd.get_arg_tree (arg_idx), ctxt);
-
-  const svalue *contents_sval = get_store_value (buf_reg, ctxt);
+    = deref_rvalue (arg_sval, cd.get_arg_tree (arg_idx), &my_ctxt);
 
-  if (tree cst = contents_sval->maybe_get_constant ())
-    if (TREE_CODE (cst) == STRING_CST)
-      {
-	int cst_strlen;
-	if (!get_strlen (cst, &cst_strlen))
-	  {
-	    call_arg_details arg_details (cd, arg_idx);
-	    ctxt->warn (make_unique<unterminated_string_arg> (arg_details));
-	  }
-      }
+  return scan_for_null_terminator (buf_reg,
+				   cd.get_arg_tree (arg_idx),
+				   out_sval,
+				   &my_ctxt);
 }
 
 /* Remove all bindings overlapping REG within the store.  */
diff --git a/gcc/analyzer/region-model.h b/gcc/analyzer/region-model.h
index a01399c8e85a..63a67b35350b 100644
--- a/gcc/analyzer/region-model.h
+++ b/gcc/analyzer/region-model.h
@@ -451,6 +451,13 @@ class region_model
 
   const svalue *get_store_value (const region *reg,
 				 region_model_context *ctxt) const;
+  const svalue *get_store_bytes (const region *base_reg,
+				 const byte_range &bytes,
+				 region_model_context *ctxt) const;
+  const svalue *scan_for_null_terminator (const region *reg,
+					  tree expr,
+					  const svalue **out_sval,
+					  region_model_context *ctxt) const;
 
   bool region_exists_p (const region *reg) const;
 
@@ -502,8 +509,10 @@ class region_model
 			       const svalue *sval_hint,
 			       region_model_context *ctxt) const;
 
-  void check_for_null_terminated_string_arg (const call_details &cd,
-					     unsigned idx);
+  const svalue *
+  check_for_null_terminated_string_arg (const call_details &cd,
+					unsigned idx,
+					const svalue **out_sval = nullptr);
 
 private:
   const region *get_lvalue_1 (path_var pv, region_model_context *ctxt) const;
diff --git a/gcc/analyzer/store.cc b/gcc/analyzer/store.cc
index c7bc4b40f87c..aeea69311378 100644
--- a/gcc/analyzer/store.cc
+++ b/gcc/analyzer/store.cc
@@ -538,6 +538,15 @@ concrete_binding::overlaps_p (const concrete_binding &other) const
   return false;
 }
 
+/* If this is expressible as a concrete byte range, return true
+   and write it to *OUT.  Otherwise return false.  */
+
+bool
+concrete_binding::get_byte_range (byte_range *out) const
+{
+  return m_bit_range.as_byte_range (out);
+}
+
 /* Comparator for use by vec<const concrete_binding *>::qsort.  */
 
 int
diff --git a/gcc/analyzer/store.h b/gcc/analyzer/store.h
index af6cc7ed03c7..cf10fa3b0108 100644
--- a/gcc/analyzer/store.h
+++ b/gcc/analyzer/store.h
@@ -399,6 +399,7 @@ public:
   { return this; }
 
   const bit_range &get_bit_range () const { return m_bit_range; }
+  bool get_byte_range (byte_range *out) const;
 
   bit_offset_t get_start_bit_offset () const
   {
@@ -855,6 +856,12 @@ public:
     return get_concrete_binding (bits.get_start_bit_offset (),
 				 bits.m_size_in_bits);
   }
+  const concrete_binding *
+  get_concrete_binding (const byte_range &bytes)
+  {
+    bit_range bits = bytes.as_bit_range ();
+    return get_concrete_binding (bits);
+  }
   const symbolic_binding *
   get_symbolic_binding (const region *region);
 
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 01aa9efebce5..ef3f40989860 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -488,7 +488,6 @@ Objective-C and Objective-C++ Dialects}.
 -Wno-analyzer-tainted-size
 -Wanalyzer-too-complex
 -Wno-analyzer-unsafe-call-within-signal-handler
--Wno-analyzer-unterminated-string
 -Wno-analyzer-use-after-free
 -Wno-analyzer-use-of-pointer-in-stale-stack-frame
 -Wno-analyzer-use-of-uninitialized-value
@@ -10328,7 +10327,6 @@ Enabling this option effectively enables the following warnings:
 -Wanalyzer-shift-count-overflow
 -Wanalyzer-stale-setjmp-buffer
 -Wanalyzer-unsafe-call-within-signal-handler
--Wanalyzer-unterminated-string
 -Wanalyzer-use-after-free
 -Wanalyzer-use-of-pointer-in-stale-stack-frame
 -Wanalyzer-use-of-uninitialized-value
@@ -10918,17 +10916,6 @@ called from a signal handler.
 
 See @uref{https://cwe.mitre.org/data/definitions/479.html, CWE-479: Signal Handler Use of a Non-reentrant Function}.
 
-@opindex Wanalyzer-unterminated-string
-@opindex Wno-analyzer-unterminated-string
-@item -Wno-analyzer-unterminated-string
-This warning requires @option{-fanalyzer}, which enables it; use
-@option{-Wno-analyzer-unterminated-string} to disable it.
-
-This diagnostic warns about code paths which attempt to find the length
-of an unterminated string.  For example, passing a pointer to an unterminated
-buffer to @code{strlen} would lead to accesses beyond the end of the buffer
-whilst attempting to find the terminating zero character.
-
 @opindex Wanalyzer-use-after-free
 @opindex Wno-analyzer-use-after-free
 @item -Wno-analyzer-use-after-free
diff --git a/gcc/testsuite/gcc.dg/analyzer/error-1.c b/gcc/testsuite/gcc.dg/analyzer/error-1.c
index 491d615e2cb1..794a9ae7b42d 100644
--- a/gcc/testsuite/gcc.dg/analyzer/error-1.c
+++ b/gcc/testsuite/gcc.dg/analyzer/error-1.c
@@ -68,11 +68,27 @@ void test_6 (int st, const char *str)
 char *test_error_unterminated (int st)
 {
   char fmt[3] = "abc";
-  error (st, errno, fmt); /* { dg-warning "passing pointer to unterminated string '&fmt' as argument 3 of 'error'" } */
+  error (st, errno, fmt); /* { dg-warning "stack-based buffer over-read" } */
+  /* { dg-message "while looking for null terminator for argument 3 \\('&fmt'\\) of 'error'..." "event" { target *-*-* } .-1 } */
 }
 
 char *test_error_at_line_unterminated (int st, int errno)
 {
   char fmt[3] = "abc";
-  error_at_line (st, errno, __FILE__, __LINE__, fmt); /* { dg-warning "passing pointer to unterminated string '&fmt' as argument 5 of 'error_at_line'" } */
+  error_at_line (st, errno, __FILE__, __LINE__, fmt); /* { dg-warning "stack-based buffer over-read" } */
+  /* { dg-message "while looking for null terminator for argument 5 \\('&fmt'\\) of 'error_at_line'..." "event" { target *-*-* } .-1 } */
+}
+
+char *test_error_uninitialized (int st, int errno)
+{
+  char fmt[16];
+  error (st, errno, fmt); /* { dg-warning "use of uninitialized value 'fmt\\\[0\\\]'" } */
+  /* { dg-message "while looking for null terminator for argument 3 \\('&fmt'\\) of 'error'..." "event" { target *-*-* } .-1 } */
+}
+
+char *test_error_at_line_uninitialized (int st, int errno)
+{
+  char fmt[16];
+  error_at_line (st, errno, __FILE__, __LINE__, fmt); /* { dg-warning "use of uninitialized value 'fmt\\\[0\\\]'" } */
+  /* { dg-message "while looking for null terminator for argument 5 \\('&fmt'\\) of 'error_at_line'..." "event" { target *-*-* } .-1 } */
 }
diff --git a/gcc/testsuite/gcc.dg/analyzer/null-terminated-strings-1.c b/gcc/testsuite/gcc.dg/analyzer/null-terminated-strings-1.c
index 337987068237..1db82a76d3b3 100644
--- a/gcc/testsuite/gcc.dg/analyzer/null-terminated-strings-1.c
+++ b/gcc/testsuite/gcc.dg/analyzer/null-terminated-strings-1.c
@@ -5,19 +5,47 @@ typedef __SIZE_TYPE__ size_t;
 
 void test_terminated (void)
 {
-  __analyzer_get_strlen ("abc"); /* { dg-bogus "" } */
+  __analyzer_eval (__analyzer_get_strlen ("abc") == 3); /* { dg-warning "TRUE" } */
 }
 
 void test_unterminated (void)
 {
   char buf[3] = "abc";
-  __analyzer_get_strlen (buf); /* { dg-warning "passing pointer to unterminated string '&buf' as argument 1 of '__analyzer_get_strlen'" } */
+  __analyzer_get_strlen (buf); /* { dg-warning "stack-based buffer over-read" } */
+  /* { dg-message "out-of-bounds read at byte 3 but 'buf' ends at byte 3" "bad read event" { target *-*-* } .-1 } */
+  /* { dg-message "while looking for null terminator for argument 1 \\('&buf'\\) of '__analyzer_get_strlen'..." "null terminator event" { target *-*-* } .-2 } */
 }
 
-void test_embedded_nul (void)
+void test_embedded_nuls (void)
 {
-  char buf[3] = "a\0c";
-  __analyzer_get_strlen (buf); /* { dg-bogus "" } */
+  /*             0123 456 78.  */
+  char buf[9] = "abc\0pq\0xy"; /* unterminated.  */
+  __analyzer_eval (__analyzer_get_strlen (buf) == 3); /* { dg-warning "TRUE" } */
+  __analyzer_eval (__analyzer_get_strlen (buf + 1) == 2); /* { dg-warning "TRUE" } */
+  __analyzer_eval (__analyzer_get_strlen (buf + 2) == 1); /* { dg-warning "TRUE" } */
+  __analyzer_eval (__analyzer_get_strlen (buf + 3) == 0); /* { dg-warning "TRUE" } */
+  __analyzer_eval (__analyzer_get_strlen (buf + 4) == 2); /* { dg-warning "TRUE" } */
+  __analyzer_eval (__analyzer_get_strlen (buf + 5) == 1); /* { dg-warning "TRUE" } */
+  __analyzer_eval (__analyzer_get_strlen (buf + 6) == 0); /* { dg-warning "TRUE" } */
+  __analyzer_get_strlen (buf + 7); /* { dg-warning "stack-based buffer over-read" } */
+  /* { dg-message "while looking for null terminator for argument 1 \\('<unknown>'\\) of '__analyzer_get_strlen'..." "event" { target *-*-* } .-1 } */
+  // TODO: fix the "<unknown>" here?
+}
+
+void test_before_start_of_buffer (void)
+{
+  const char *buf = "abc";
+  __analyzer_get_strlen (buf - 1); /* { dg-warning "buffer under-read" } */
+  /* { dg-message "while looking for null terminator for argument 1 \\('<unknown>'\\) of '__analyzer_get_strlen'..." "event" { target *-*-* } .-1 } */
+  // TODO: fix the "<unknown>" here?
+}
+
+void test_after_end_of_buffer (void)
+{
+  const char *buf = "abc";
+  __analyzer_get_strlen (buf + 4); /* { dg-warning "buffer over-read" } */
+  /* { dg-message "while looking for null terminator for argument 1 \\('<unknown>'\\) of '__analyzer_get_strlen'..." "event" { target *-*-* } .-1 } */  
+  // TODO: fix the "<unknown>" here?
 }
 
 void test_fully_initialized_but_unterminated (void)
@@ -26,5 +54,93 @@ void test_fully_initialized_but_unterminated (void)
   buf[0] = 'a';
   buf[1] = 'b';
   buf[2] = 'c';
-  __analyzer_get_strlen (buf); /* { dg-warning "passing pointer to unterminated string '&buf' as argument 1 of '__analyzer_get_strlen'" "" { xfail *-*-* } } */
+  __analyzer_get_strlen (buf); /* { dg-warning "stack-based buffer over-read" } */
+  /* { dg-message "while looking for null terminator for argument 1 \\('&buf'\\) of '__analyzer_get_strlen'..." "event" { target *-*-* } .-1 } */
+}
+
+void test_uninitialized (void)
+{
+  char buf[16];
+  __analyzer_get_strlen (buf); /* { dg-warning "use of uninitialized value 'buf\\\[0\\\]'" } */
+  /* { dg-message "while looking for null terminator for argument 1 \\('&buf'\\) of '__analyzer_get_strlen'..." "event" { target *-*-* } .-1 } */
+}
+
+void test_partially_initialized (void)
+{
+  char buf[16];
+  buf[0] = 'a';
+  __analyzer_get_strlen (buf); /* { dg-warning "use of uninitialized value 'buf\\\[1\\\]'" } */
+  /* { dg-message "while looking for null terminator for argument 1 \\('&buf'\\) of '__analyzer_get_strlen'..." "event" { target *-*-* } .-1 } */
+}
+
+char *test_dynamic_1 (void)
+{
+  const char *kvstr = "NAME=value";
+  size_t len = __builtin_strlen (kvstr);
+  char *ptr = __builtin_malloc (len + 1);
+  if (!ptr)
+    return NULL;
+  __builtin_memcpy (ptr, kvstr, len);
+  ptr[len] = '\0';
+  __analyzer_eval (__analyzer_get_strlen (ptr) == 10); /* { dg-warning "UNKNOWN" } */
+  // TODO: should be TRUE
+  return ptr;
+}
+
+char *test_dynamic_2 (void)
+{
+  const char *kvstr = "NAME=value";
+  size_t len = __builtin_strlen (kvstr);
+  char *ptr = __builtin_malloc (len + 1);
+  if (!ptr)
+    return NULL;
+  __builtin_memcpy (ptr, kvstr, len);
+  /* Missing termination.  */
+  __analyzer_get_strlen (ptr); /* { dg-warning "use of uninitialized value '&buf'" "" { xfail *-*-* } } */
+  // TODO (xfail)
+  return ptr;
+}
+
+char *test_dynamic_3 (const char *src)
+{
+  size_t len = __builtin_strlen (src);
+  char *ptr = __builtin_malloc (len + 1);
+  if (!ptr)
+    return NULL;
+  __builtin_memcpy (ptr, src, len);
+  ptr[len] = '\0';
+  __analyzer_eval (__analyzer_get_strlen (ptr) == len); /* { dg-warning "UNKNOWN" } */
+  // TODO: should get TRUE for this
+  return ptr;
+}
+
+char *test_dynamic_4 (const char *src)
+{
+  size_t len = __builtin_strlen (src);
+  char *ptr = __builtin_malloc (len + 1);
+  if (!ptr)
+    return NULL;
+  __builtin_memcpy (ptr, src, len);
+  /* Missing termination.  */
+  __analyzer_get_strlen (ptr); /* { dg-warning "use of uninitialized value 'buf\\\[len\\\]'" "" { xfail *-*-* } } */
+  // TODO (xfail)
+  return ptr;
+}
+
+void test_symbolic_ptr (const char *ptr)
+{
+  __analyzer_describe (0, __analyzer_get_strlen (ptr)); /* { dg-warning "UNKNOWN" } */
+}
+
+void test_symbolic_offset (size_t idx)
+{
+  __analyzer_describe (0, __analyzer_get_strlen ("abc" + idx)); /* { dg-warning "UNKNOWN" } */
+}
+
+void test_casts (void)
+{
+  int i = 42;
+  const char *p = (const char *)&i;
+  __analyzer_eval (__analyzer_get_strlen (p) == 0); /* { dg-warning "UNKNOWN" } */  
+  __analyzer_eval (__analyzer_get_strlen (p + 1) == 0); /* { dg-warning "UNKNOWN" } */  
 }
diff --git a/gcc/testsuite/gcc.dg/analyzer/putenv-1.c b/gcc/testsuite/gcc.dg/analyzer/putenv-1.c
index 5fa20334c0ab..5c4e08c68dff 100644
--- a/gcc/testsuite/gcc.dg/analyzer/putenv-1.c
+++ b/gcc/testsuite/gcc.dg/analyzer/putenv-1.c
@@ -112,6 +112,15 @@ void test_outer (void)
 void test_unterminated (void)
 {
   char buf[3] = "abc";
-  putenv (buf); /* { dg-warning "passing pointer to unterminated string" } */
-  /* { dg-warning "'putenv' on a pointer to automatic variable 'buf'" "POS34-C" { target *-*-* } .-1 } */
+  putenv (buf); /* { dg-warning "stack-based buffer over-read" } */
+  /* { dg-message "while looking for null terminator for argument 1 \\('&buf'\\) of 'putenv'..." "event" { target *-*-* } .-1 } */
+  /* { dg-warning "'putenv' on a pointer to automatic variable 'buf'" "POS34-C" { target *-*-* } .-2 } */
+}
+
+void test_uninitialized (void)
+{
+  char buf[16];
+  putenv (buf); /* { dg-warning "use of uninitialized value 'buf\\\[0\\\]'" } */
+  /* { dg-message "while looking for null terminator for argument 1 \\('&buf'\\) of 'putenv'..." "event" { target *-*-* } .-1 } */
+  /* { dg-warning "'putenv' on a pointer to automatic variable 'buf'" "POS34-C" { target *-*-* } .-2 } */
 }
diff --git a/gcc/testsuite/gcc.dg/analyzer/strchr-1.c b/gcc/testsuite/gcc.dg/analyzer/strchr-1.c
index 2fb6c76797e8..08c429d8f909 100644
--- a/gcc/testsuite/gcc.dg/analyzer/strchr-1.c
+++ b/gcc/testsuite/gcc.dg/analyzer/strchr-1.c
@@ -29,5 +29,13 @@ void test_3 (const char *s, int c)
 void test_unterminated (int c)
 {
   char buf[3] = "abc";
-  strchr (buf, c); /* { dg-warning "passing pointer to unterminated string '&buf' as argument 1 of 'strchr'" } */
+  strchr (buf, c); /* { dg-warning "stack-based buffer over-read" } */
+  /* { dg-message "while looking for null terminator for argument 1 \\('&buf'\\) of 'strchr'..." "event" { target *-*-* } .-1 } */
+}
+
+void test_uninitialized (int c)
+{
+  char buf[16];
+  strchr (buf, c); /* { dg-warning "use of uninitialized value 'buf\\\[0\\\]'" } */
+  /* { dg-message "while looking for null terminator for argument 1 \\('&buf'\\) of 'strchr'..." "event" { target *-*-* } .-1 } */
 }
diff --git a/gcc/testsuite/gcc.dg/analyzer/strcpy-1.c b/gcc/testsuite/gcc.dg/analyzer/strcpy-1.c
index f23dd69bfb69..d21e77175119 100644
--- a/gcc/testsuite/gcc.dg/analyzer/strcpy-1.c
+++ b/gcc/testsuite/gcc.dg/analyzer/strcpy-1.c
@@ -20,5 +20,13 @@ test_1a (char *dst, char *src)
 char *test_unterminated (char *dst)
 {
   char buf[3] = "abc";
-  return strcpy (dst, buf); /* { dg-warning "passing pointer to unterminated string '&buf' as argument 2 of 'strcpy'" } */
+  return strcpy (dst, buf); /* { dg-warning "stack-based buffer over-read" } */
+  /* { dg-message "while looking for null terminator for argument 2 \\('&buf'\\) of 'strcpy'..." "event" { target *-*-* } .-1 } */
+}
+
+char *test_uninitialized (char *dst)
+{
+  char buf[16];
+  return strcpy (dst, buf); /* { dg-warning "use of uninitialized value 'buf\\\[0\\\]'" } */
+  /* { dg-message "while looking for null terminator for argument 2 \\('&buf'\\) of 'strcpy'..." "event" { target *-*-* } .-1 } */
 }
diff --git a/gcc/testsuite/gcc.dg/analyzer/strdup-1.c b/gcc/testsuite/gcc.dg/analyzer/strdup-1.c
index 682bfb901768..f6c176f174eb 100644
--- a/gcc/testsuite/gcc.dg/analyzer/strdup-1.c
+++ b/gcc/testsuite/gcc.dg/analyzer/strdup-1.c
@@ -42,5 +42,13 @@ void test_6 (const char *s)
 char *test_unterminated (void)
 {
   char buf[3] = "abc";
-  return strdup (buf); /* { dg-warning "passing pointer to unterminated string '&buf' as argument 1 of 'strdup'" } */
+  return strdup (buf); /* { dg-warning "stack-based buffer over-read" } */
+  /* { dg-message "while looking for null terminator for argument 1 \\('&buf'\\) of 'strdup'..." "event" { target *-*-* } .-1 } */
+}
+
+char *test_uninitialized (void)
+{
+  char buf[16];
+  return strdup (buf); /* { dg-warning "use of uninitialized value 'buf\\\[0\\\]'" } */
+  /* { dg-message "while looking for null terminator for argument 1 \\('&buf'\\) of 'strdup'..." "event" { target *-*-* } .-1 } */
 }
-- 
2.26.3


^ permalink raw reply	[flat|nested] 6+ messages in thread

* [pushed 5/6] analyzer: add kf_fopen
  2023-08-22  1:21 [pushed 1/6] analyzer: convert note_adding_context to annotating_context David Malcolm
                   ` (2 preceding siblings ...)
  2023-08-22  1:21 ` [pushed 4/6] analyzer: replace -Wanalyzer-unterminated-string with scan_for_null_terminator [PR105899] David Malcolm
@ 2023-08-22  1:21 ` David Malcolm
  2023-08-22  1:21 ` [pushed 6/6] analyzer: check format strings for null termination [PR105899] David Malcolm
  4 siblings, 0 replies; 6+ messages in thread
From: David Malcolm @ 2023-08-22  1:21 UTC (permalink / raw)
  To: gcc-patches; +Cc: David Malcolm

Add checking to -fanalyzer that both params of calls to "fopen" are
valid null-terminated strings.

Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu.
Pushed to trunk as r14-3375-g4325c82736d9e8.

gcc/analyzer/ChangeLog:
	* kf.cc (class kf_fopen): New.
	(register_known_functions): Register it.

gcc/testsuite/ChangeLog:
	* gcc.dg/analyzer/fopen-1.c: New test.
---
 gcc/analyzer/kf.cc                      | 28 +++++++++++
 gcc/testsuite/gcc.dg/analyzer/fopen-1.c | 66 +++++++++++++++++++++++++
 2 files changed, 94 insertions(+)
 create mode 100644 gcc/testsuite/gcc.dg/analyzer/fopen-1.c

diff --git a/gcc/analyzer/kf.cc b/gcc/analyzer/kf.cc
index 6b2db8613768..1601cf15c685 100644
--- a/gcc/analyzer/kf.cc
+++ b/gcc/analyzer/kf.cc
@@ -420,6 +420,33 @@ kf_error::impl_call_pre (const call_details &cd) const
   model->check_for_null_terminated_string_arg (cd, fmt_arg_idx);
 }
 
+/* Handler for fopen.
+     FILE *fopen (const char *filename, const char *mode);
+   See e.g. https://en.cppreference.com/w/c/io/fopen
+   https://www.man7.org/linux/man-pages/man3/fopen.3.html
+   https://learn.microsoft.com/en-us/cpp/c-runtime-library/reference/fopen-wfopen?view=msvc-170  */
+
+class kf_fopen : public known_function
+{
+public:
+  bool matches_call_types_p (const call_details &cd) const final override
+  {
+    return (cd.num_args () == 2
+	    && cd.arg_is_pointer_p (0)
+	    && cd.arg_is_pointer_p (1));
+  }
+
+  void impl_call_pre (const call_details &cd) const final override
+  {
+    cd.check_for_null_terminated_string_arg (0);
+    cd.check_for_null_terminated_string_arg (1);
+    cd.set_any_lhs_with_defaults ();
+
+    /* fopen's mode param is effectively a mini-DSL, but there are various
+       non-standard extensions, so we don't bother to check it.  */
+  }
+};
+
 /* Handler for "free", after sm-handling.
 
    If the ptr points to an underlying heap region, delete the region,
@@ -1422,6 +1449,7 @@ register_known_functions (known_function_manager &kfm)
 
   /* Known POSIX functions, and some non-standard extensions.  */
   {
+    kfm.add ("fopen", make_unique<kf_fopen> ());
     kfm.add ("putenv", make_unique<kf_putenv> ());
 
     register_known_fd_functions (kfm);
diff --git a/gcc/testsuite/gcc.dg/analyzer/fopen-1.c b/gcc/testsuite/gcc.dg/analyzer/fopen-1.c
new file mode 100644
index 000000000000..e5b00e93b6da
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/analyzer/fopen-1.c
@@ -0,0 +1,66 @@
+typedef struct FILE FILE;
+FILE *fopen (const char *pathname, const char *mode);
+#define NULL ((void *)0)
+
+FILE *
+test_passthrough (const char *pathname, const char *mode)
+{
+  return fopen (pathname, mode);
+}
+
+FILE *
+test_null_pathname (const char *pathname, const char *mode)
+{
+  return fopen (NULL, mode);
+}
+
+FILE *
+test_null_mode (const char *pathname)
+{
+  return fopen (pathname, NULL);
+}
+
+FILE *
+test_simple_r (void)
+{
+  return fopen ("foo.txt", "r");
+}
+
+FILE *
+test_swapped_args (void)
+{
+  return fopen ("r", "foo.txt"); /* TODO: would be nice to detect this.  */
+}
+
+FILE *
+test_unterminated_pathname (const char *mode)
+{
+  char buf[3] = "abc";
+  return fopen (buf, mode); /* { dg-warning "stack-based buffer over-read" } */
+  /* { dg-message "while looking for null terminator for argument 1 \\('&buf'\\) of 'fopen'..." "event" { target *-*-* } .-1 } */
+}
+
+FILE *
+test_unterminated_mode (const char *filename)
+{
+  char buf[3] = "abc";
+  return fopen (filename, buf);  /* { dg-warning "stack-based buffer over-read" } */
+  /* { dg-message "while looking for null terminator for argument 2 \\('&buf'\\) of 'fopen'..." "event" { target *-*-* } .-1 } */
+}
+
+FILE *
+test_uninitialized_pathname (const char *mode)
+{
+  char buf[10];
+  return fopen (buf, mode); /* { dg-warning "use of uninitialized value 'buf\\\[0\\\]'" } */  
+  /* { dg-message "while looking for null terminator for argument 1 \\('&buf'\\) of 'fopen'..." "event" { target *-*-* } .-1 } */
+}
+
+FILE *
+test_uninitialized_mode (const char *filename)
+{
+  char buf[10];
+  return fopen (filename, buf); /* { dg-warning "use of uninitialized value 'buf\\\[0\\\]'" } */  
+  /* { dg-message "while looking for null terminator for argument 2 \\('&buf'\\) of 'fopen'..." "event" { target *-*-* } .-1 } */
+}
+
-- 
2.26.3


^ permalink raw reply	[flat|nested] 6+ messages in thread

* [pushed 6/6] analyzer: check format strings for null termination [PR105899]
  2023-08-22  1:21 [pushed 1/6] analyzer: convert note_adding_context to annotating_context David Malcolm
                   ` (3 preceding siblings ...)
  2023-08-22  1:21 ` [pushed 5/6] analyzer: add kf_fopen David Malcolm
@ 2023-08-22  1:21 ` David Malcolm
  4 siblings, 0 replies; 6+ messages in thread
From: David Malcolm @ 2023-08-22  1:21 UTC (permalink / raw)
  To: gcc-patches; +Cc: David Malcolm

This patch extends -fanalyzer to check the format strings of calls
to functions marked with '__attribute__ ((format...))'.

The only checking done in this patch is to check that the format string
is a valid null-terminated string; this patch doesn't attempt to check
the content of the format string.

Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu.
Pushed to trunk as r14-3376-g3b691e0190c6e7.

gcc/analyzer/ChangeLog:
	PR analyzer/105899
	* call-details.cc (call_details::call_details): New ctor.
	* call-details.h (call_details::call_details): New ctor decl.
	(struct call_arg_details): Move here from region-model.cc.
	* region-model.cc (region_model::check_call_format_attr): New.
	(region_model::check_call_args): Call it.
	(struct call_arg_details): Move it to call-details.h.
	* region-model.h (region_model::check_call_format_attr): New decl.

gcc/testsuite/ChangeLog:
	PR analyzer/105899
	* gcc.dg/analyzer/attr-format-1.c: New test.
	* gcc.dg/analyzer/sprintf-1.c: Update expected results for
	now-passing tests.
---
 gcc/analyzer/call-details.cc                  |  10 ++
 gcc/analyzer/call-details.h                   |  30 +++++
 gcc/analyzer/region-model.cc                  | 125 +++++++++++++-----
 gcc/analyzer/region-model.h                   |   2 +
 gcc/testsuite/gcc.dg/analyzer/attr-format-1.c |  31 +++++
 gcc/testsuite/gcc.dg/analyzer/sprintf-1.c     |   6 +-
 6 files changed, 172 insertions(+), 32 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/analyzer/attr-format-1.c

diff --git a/gcc/analyzer/call-details.cc b/gcc/analyzer/call-details.cc
index e497fc58e028..8f5b28ce6c26 100644
--- a/gcc/analyzer/call-details.cc
+++ b/gcc/analyzer/call-details.cc
@@ -58,6 +58,16 @@ call_details::call_details (const gcall *call, region_model *model,
     }
 }
 
+/* call_details's ctor: copy CD, but override the context,
+   using CTXT instead.  */
+
+call_details::call_details (const call_details &cd,
+			    region_model_context *ctxt)
+{
+  *this = cd;
+  m_ctxt = ctxt;
+}
+
 /* Get the manager from m_model.  */
 
 region_model_manager *
diff --git a/gcc/analyzer/call-details.h b/gcc/analyzer/call-details.h
index 86f0e68072bd..58b5ccd2acde 100644
--- a/gcc/analyzer/call-details.h
+++ b/gcc/analyzer/call-details.h
@@ -30,6 +30,7 @@ class call_details
 public:
   call_details (const gcall *call, region_model *model,
 		region_model_context *ctxt);
+  call_details (const call_details &cd, region_model_context *ctxt);
 
   region_model *get_model () const { return m_model; }
   region_model_manager *get_manager () const;
@@ -83,6 +84,35 @@ private:
   const region *m_lhs_region;
 };
 
+/* A bundle of information about a problematic argument at a callsite
+   for use by pending_diagnostic subclasses for reporting and
+   for deduplication.  */
+
+struct call_arg_details
+{
+public:
+  call_arg_details (const call_details &cd, unsigned arg_idx)
+  : m_call (cd.get_call_stmt ()),
+    m_called_fndecl (cd.get_fndecl_for_call ()),
+    m_arg_idx (arg_idx),
+    m_arg_expr (cd.get_arg_tree (arg_idx))
+  {
+  }
+
+  bool operator== (const call_arg_details &other) const
+  {
+    return (m_call == other.m_call
+	    && m_called_fndecl == other.m_called_fndecl
+	    && m_arg_idx == other.m_arg_idx
+	    && pending_diagnostic::same_tree_p (m_arg_expr, other.m_arg_expr));
+  }
+
+  const gcall *m_call;
+  tree m_called_fndecl;
+  unsigned m_arg_idx; // 0-based
+  tree m_arg_expr;
+};
+
 } // namespace ana
 
 #endif /* GCC_ANALYZER_CALL_DETAILS_H */
diff --git a/gcc/analyzer/region-model.cc b/gcc/analyzer/region-model.cc
index 0fce18896fbc..99817aee3a93 100644
--- a/gcc/analyzer/region-model.cc
+++ b/gcc/analyzer/region-model.cc
@@ -1271,14 +1271,108 @@ region_model::on_stmt_pre (const gimple *stmt,
     }
 }
 
+/* Given a call CD with function attribute FORMAT_ATTR, check that the
+   format arg to the call is a valid null-terminated string.  */
+
+void
+region_model::check_call_format_attr (const call_details &cd,
+				      tree format_attr) const
+{
+  /* We assume that FORMAT_ATTR has already been validated.  */
+
+  /* arg0 of the attribute should be kind of format strings
+     that this function expects (e.g. "printf").  */
+  const tree arg0_tree_list = TREE_VALUE (format_attr);
+  if (!arg0_tree_list)
+    return;
+
+  /* arg1 of the attribute should be the 1-based parameter index
+     to treat as the format string.  */
+  const tree arg1_tree_list = TREE_CHAIN (arg0_tree_list);
+  if (!arg1_tree_list)
+    return;
+  const tree arg1_value = TREE_VALUE (arg1_tree_list);
+  if (!arg1_value)
+    return;
+
+  unsigned format_arg_idx = TREE_INT_CST_LOW (arg1_value) - 1;
+  if (cd.num_args () <= format_arg_idx)
+    return;
+
+  /* Subclass of annotating_context that
+     adds a note about the format attr to any saved diagnostics.  */
+  class annotating_ctxt : public annotating_context
+  {
+  public:
+    annotating_ctxt (const call_details &cd,
+		     unsigned fmt_param_idx)
+    : annotating_context (cd.get_ctxt ()),
+      m_cd (cd),
+      m_fmt_param_idx (fmt_param_idx)
+    {
+    }
+    void add_annotations () final override
+    {
+      class reason_format_attr
+	: public pending_note_subclass<reason_format_attr>
+      {
+      public:
+	reason_format_attr (const call_arg_details &arg_details)
+	  : m_arg_details (arg_details)
+	{
+	}
+
+	const char *get_kind () const final override
+	{
+	  return "reason_format_attr";
+	}
+
+	void emit () const final override
+	{
+	  inform (DECL_SOURCE_LOCATION (m_arg_details.m_called_fndecl),
+		  "parameter %i of %qD marked as a format string"
+		  " via %qs attribute",
+		  m_arg_details.m_arg_idx + 1, m_arg_details.m_called_fndecl,
+		  "format");
+	}
+
+	bool operator== (const reason_format_attr &other) const
+	{
+	  return m_arg_details == other.m_arg_details;
+	}
+
+      private:
+	call_arg_details m_arg_details;
+      };
+
+      call_arg_details arg_details (m_cd, m_fmt_param_idx);
+      add_note (make_unique<reason_format_attr> (arg_details));
+    }
+  private:
+    const call_details &m_cd;
+    unsigned m_fmt_param_idx;
+  };
+
+  annotating_ctxt my_ctxt (cd, format_arg_idx);
+  call_details my_cd (cd, &my_ctxt);
+  my_cd.check_for_null_terminated_string_arg (format_arg_idx);
+}
+
 /* Ensure that all arguments at the call described by CD are checked
-   for poisoned values, by calling get_rvalue on each argument.  */
+   for poisoned values, by calling get_rvalue on each argument.
+
+   Check that calls to functions with "format" attribute have valid
+   null-terminated strings for their format argument.  */
 
 void
 region_model::check_call_args (const call_details &cd) const
 {
   for (unsigned arg_idx = 0; arg_idx < cd.num_args (); arg_idx++)
     cd.get_arg_svalue (arg_idx);
+
+  /* Handle attribute "format".  */
+  if (tree format_attr = cd.lookup_function_attribute ("format"))
+    check_call_format_attr (cd, format_attr);
 }
 
 /* Update this model for an outcome of a call that returns a specific
@@ -3175,35 +3269,6 @@ region_model::set_value (tree lhs, tree rhs, region_model_context *ctxt)
   set_value (lhs_reg, rhs_sval, ctxt);
 }
 
-/* A bundle of information about a problematic argument at a callsite
-   for use by pending_diagnostic subclasses for reporting and
-   for deduplication.  */
-
-struct call_arg_details
-{
-public:
-  call_arg_details (const call_details &cd, unsigned arg_idx)
-  : m_call (cd.get_call_stmt ()),
-    m_called_fndecl (cd.get_fndecl_for_call ()),
-    m_arg_idx (arg_idx),
-    m_arg_expr (cd.get_arg_tree (arg_idx))
-  {
-  }
-
-  bool operator== (const call_arg_details &other) const
-  {
-    return (m_call == other.m_call
-	    && m_called_fndecl == other.m_called_fndecl
-	    && m_arg_idx == other.m_arg_idx
-	    && pending_diagnostic::same_tree_p (m_arg_expr, other.m_arg_expr));
-  }
-
-  const gcall *m_call;
-  tree m_called_fndecl;
-  unsigned m_arg_idx; // 0-based
-  tree m_arg_expr;
-};
-
 /* Issue a note specifying that a particular function parameter is expected
    to be a valid null-terminated string.  */
 
diff --git a/gcc/analyzer/region-model.h b/gcc/analyzer/region-model.h
index 63a67b35350b..3979bf124783 100644
--- a/gcc/analyzer/region-model.h
+++ b/gcc/analyzer/region-model.h
@@ -597,6 +597,8 @@ private:
 			    region_model_context *ctxt) const;
 
   void check_call_args (const call_details &cd) const;
+  void check_call_format_attr (const call_details &cd,
+			       tree format_attr) const;
   void check_external_function_for_access_attr (const gcall *call,
 						tree callee_fndecl,
 						region_model_context *ctxt) const;
diff --git a/gcc/testsuite/gcc.dg/analyzer/attr-format-1.c b/gcc/testsuite/gcc.dg/analyzer/attr-format-1.c
new file mode 100644
index 000000000000..c7fa705585d7
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/analyzer/attr-format-1.c
@@ -0,0 +1,31 @@
+extern int
+my_printf (void *my_object, const char *my_format, ...)
+  __attribute__ ((format (printf, 2, 3)));
+/* { dg-message "parameter 2 of 'my_printf' marked as a format string via 'format' attribute" "attr note" { target *-*-* } .-2 } */
+/* { dg-message "argument 2 of 'my_printf' must be a pointer to a null-terminated string" "arg note" { target *-*-* } .-3 } */
+
+int test_empty (void *my_object, const char *msg)
+{
+  return my_printf (my_object, "");
+}
+
+int test_percent_s (void *my_object, const char *msg)
+{
+  return my_printf (my_object, "%s\n", msg);
+}
+
+int
+test_unterminated_format (void *my_object)
+{
+  char fmt[3] = "abc";
+  return my_printf (my_object, fmt); /* { dg-warning "stack-based buffer over-read" } */
+  /* { dg-message "while looking for null terminator for argument 2 \\('&fmt'\\) of 'my_printf'..." "event" { target *-*-* } .-1 } */
+}
+
+int
+test_uninitialized_format (void *my_object)
+{
+  char fmt[10];
+  return my_printf (my_object, fmt); /* { dg-warning "use of uninitialized value 'fmt\\\[0\\\]'" } */  
+  /* { dg-message "while looking for null terminator for argument 2 \\('&fmt'\\) of 'my_printf'..." "event" { target *-*-* } .-1 } */
+}
diff --git a/gcc/testsuite/gcc.dg/analyzer/sprintf-1.c b/gcc/testsuite/gcc.dg/analyzer/sprintf-1.c
index c79525d912f1..f8dc806d6192 100644
--- a/gcc/testsuite/gcc.dg/analyzer/sprintf-1.c
+++ b/gcc/testsuite/gcc.dg/analyzer/sprintf-1.c
@@ -53,12 +53,14 @@ int
 test_uninit_fmt_buf (char *dst)
 {
   const char fmt[10];
-  return sprintf (dst, fmt); // TODO (PR analyzer/105899): complain about "fmt" not being initialized
+  return sprintf (dst, fmt); /* { dg-warning "use of uninitialized value 'fmt\\\[0\\\]'" } */
+  /* { dg-message "while looking for null terminator for argument 2 \\('&fmt'\\) of 'sprintf'..." "event" { target *-*-* } .-1 } */
 }
 
 int
 test_fmt_not_terminated (char *dst)
 {
   const char fmt[3] = "foo";
-  return sprintf (dst, fmt); // TODO (PR analyzer/105899): complain about "fmt" not being terminated
+  return sprintf (dst, fmt); /* { dg-warning "stack-based buffer over-read" } */
+  /* { dg-message "while looking for null terminator for argument 2 \\('&fmt'\\) of 'sprintf'..." "event" { target *-*-* } .-1 } */
 }
-- 
2.26.3


^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2023-08-22  1:21 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-08-22  1:21 [pushed 1/6] analyzer: convert note_adding_context to annotating_context David Malcolm
2023-08-22  1:21 ` [pushed 2/6] analyzer: add ability for context to add events to a saved_diagnostic David Malcolm
2023-08-22  1:21 ` [pushed 3/6] analyzer: handle NULL inner context in region_model_context_decorator David Malcolm
2023-08-22  1:21 ` [pushed 4/6] analyzer: replace -Wanalyzer-unterminated-string with scan_for_null_terminator [PR105899] David Malcolm
2023-08-22  1:21 ` [pushed 5/6] analyzer: add kf_fopen David Malcolm
2023-08-22  1:21 ` [pushed 6/6] analyzer: check format strings for null termination [PR105899] David Malcolm

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).