public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
From: Martin Sebor <msebor@gmail.com>
To: David Malcolm <dmalcolm@redhat.com>, gcc-patches@gcc.gnu.org
Subject: Re: [committed] analyzer: fix missing -Wanalyzer-write-to-const [PR102695]
Date: Wed, 17 Nov 2021 09:16:01 -0700	[thread overview]
Message-ID: <0a85a50e-2180-3236-cc31-f6ede8546e66@gmail.com> (raw)
In-Reply-To: <20211117020518.2417407-1-dmalcolm@redhat.com>

On 11/16/21 7:05 PM, David Malcolm via Gcc-patches wrote:
> This patch fixes -Wanalyzer-write-to-const so that it will complain
> about attempts to write to functions, to labels.
> It also "teaches" the analyzer about strchr, in that strchr can either
> return a pointer into the input area (and thus -Wanalyzer-write-to-const
> can now complain about writes into a string literal seen this way),
> or return NULL (and thus the analyzer can complain about NULL
> dereferences if the result is used without a check).

Fow what it's worth, I used strchr in the test case as an example.
There are a few other built-ins like it, including index, rindex,
memchr, strrchr, and strstr (just going through the switch
statements in my code).

At least some of these built-ins have an attribute "fn spec" that
describes some of their properties (like what argument they read
from; see builtin_fnspec in builtins.c).  But it doesn't look
like attr_fnspec has a way of encoding a function that returns
a pointer argument plus some offset.  That seems like a useful
enhancement both for our work and also for optimizers.  It would
let us avoid having to hardcode these properties in duplicate
case and switch statements in multiple places.

Martin

> 
> Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu.
> Pushed to trunk as r12-5330-g111fd515f2894d7cddf62f80c69765c43ae18577.
> 
> gcc/analyzer/ChangeLog:
> 	PR analyzer/102695
> 	* region-model-impl-calls.cc (region_model::impl_call_strchr): New.
> 	* region-model-manager.cc
> 	(region_model_manager::maybe_fold_unaryop): Simplify cast to
> 	pointer type of an existing pointer to a region.
> 	* region-model.cc (region_model::on_call_pre): Handle
> 	BUILT_IN_STRCHR and "strchr".
> 	(write_to_const_diagnostic::emit): Add auto_diagnostic_group.  Add
> 	alternate wordings for functions and labels.
> 	(write_to_const_diagnostic::describe_final_event): Add alternate
> 	wordings for functions and labels.
> 	(region_model::check_for_writable_region): Handle RK_FUNCTION and
> 	RK_LABEL.
> 	* region-model.h (region_model::impl_call_strchr): New decl.
> 
> gcc/testsuite/ChangeLog:
> 	PR analyzer/102695
> 	* gcc.dg/analyzer/pr102695.c: New test.
> 	* gcc.dg/analyzer/strchr-1.c: New test.
> 
> Signed-off-by: David Malcolm <dmalcolm@redhat.com>
> ---
>   gcc/analyzer/region-model-impl-calls.cc  | 69 ++++++++++++++++++++++++
>   gcc/analyzer/region-model-manager.cc     |  7 +++
>   gcc/analyzer/region-model.cc             | 52 ++++++++++++++++--
>   gcc/analyzer/region-model.h              |  1 +
>   gcc/testsuite/gcc.dg/analyzer/pr102695.c | 44 +++++++++++++++
>   gcc/testsuite/gcc.dg/analyzer/strchr-1.c | 26 +++++++++
>   6 files changed, 196 insertions(+), 3 deletions(-)
>   create mode 100644 gcc/testsuite/gcc.dg/analyzer/pr102695.c
>   create mode 100644 gcc/testsuite/gcc.dg/analyzer/strchr-1.c
> 
> diff --git a/gcc/analyzer/region-model-impl-calls.cc b/gcc/analyzer/region-model-impl-calls.cc
> index 90d4cf9c2db..ae50e69542e 100644
> --- a/gcc/analyzer/region-model-impl-calls.cc
> +++ b/gcc/analyzer/region-model-impl-calls.cc
> @@ -678,6 +678,75 @@ region_model::impl_call_realloc (const call_details &cd)
>       }
>   }
>   
> +/* Handle the on_call_pre part of "strchr" and "__builtin_strchr".  */
> +
> +void
> +region_model::impl_call_strchr (const call_details &cd)
> +{
> +  class strchr_call_info : public call_info
> +  {
> +  public:
> +    strchr_call_info (const call_details &cd, bool found)
> +    : call_info (cd), m_found (found)
> +    {
> +    }
> +
> +    label_text get_desc (bool can_colorize) const FINAL OVERRIDE
> +    {
> +      if (m_found)
> +	return make_label_text (can_colorize,
> +				"when %qE returns non-NULL",
> +				get_fndecl ());
> +      else
> +	return make_label_text (can_colorize,
> +				"when %qE returns NULL",
> +				get_fndecl ());
> +    }
> +
> +    bool update_model (region_model *model,
> +		       const exploded_edge *,
> +		       region_model_context *ctxt) const FINAL OVERRIDE
> +    {
> +      const call_details cd (get_call_details (model, ctxt));
> +      if (tree lhs_type = cd.get_lhs_type ())
> +	{
> +	  region_model_manager *mgr = model->get_manager ();
> +	  const svalue *result;
> +	  if (m_found)
> +	    {
> +	      const svalue *str_sval = cd.get_arg_svalue (0);
> +	      const region *str_reg
> +		= model->deref_rvalue (str_sval, cd.get_arg_tree (0),
> +				       cd.get_ctxt ());
> +	      /* We want str_sval + OFFSET for some unknown OFFSET.
> +		 Use a conjured_svalue to represent the offset,
> +		 using the str_reg as the id of the conjured_svalue.  */
> +	      const svalue *offset
> +		= mgr->get_or_create_conjured_svalue (size_type_node,
> +						      cd.get_call_stmt (),
> +						      str_reg);
> +	      result = mgr->get_or_create_binop (lhs_type, POINTER_PLUS_EXPR,
> +						 str_sval, offset);
> +	    }
> +	  else
> +	    result = mgr->get_or_create_int_cst (lhs_type, 0);
> +	  cd.maybe_set_lhs (result);
> +	}
> +      return true;
> +    }
> +  private:
> +    bool m_found;
> +  };
> +
> +  /* Bifurcate state, creating a "not found" out-edge.  */
> +  if (cd.get_ctxt ())
> +    cd.get_ctxt ()->bifurcate (new strchr_call_info (cd, false));
> +
> +  /* The "unbifurcated" state is the "found" case.  */
> +  strchr_call_info found (cd, true);
> +  found.update_model (this, NULL, cd.get_ctxt ());
> +}
> +
>   /* Handle the on_call_pre part of "strcpy" and "__builtin_strcpy_chk".  */
>   
>   void
> diff --git a/gcc/analyzer/region-model-manager.cc b/gcc/analyzer/region-model-manager.cc
> index 1cdec1bd230..fdf32122dea 100644
> --- a/gcc/analyzer/region-model-manager.cc
> +++ b/gcc/analyzer/region-model-manager.cc
> @@ -380,6 +380,13 @@ region_model_manager::maybe_fold_unaryop (tree type, enum tree_code op,
>   		    == boolean_true_node))
>   	      return maybe_fold_unaryop (type, op, innermost_arg);
>   	  }
> +	/* Avoid creating symbolic regions for pointer casts by
> +	   simplifying (T*)(&REGION) to ((T*)&REGION).  */
> +	if (const region_svalue *region_sval = arg->dyn_cast_region_svalue ())
> +	  if (POINTER_TYPE_P (type)
> +	      && region_sval->get_type ()
> +	      && POINTER_TYPE_P (region_sval->get_type ()))
> +	    return get_ptr_svalue (type, region_sval->get_pointee ());
>         }
>         break;
>       case TRUTH_NOT_EXPR:
> diff --git a/gcc/analyzer/region-model.cc b/gcc/analyzer/region-model.cc
> index 416a5ac7249..bbb15abdfe2 100644
> --- a/gcc/analyzer/region-model.cc
> +++ b/gcc/analyzer/region-model.cc
> @@ -1133,6 +1133,9 @@ region_model::on_call_pre (const gcall *call, region_model_context *ctxt,
>   	    break;
>   	  case BUILT_IN_REALLOC:
>   	    return false;
> +	  case BUILT_IN_STRCHR:
> +	    impl_call_strchr (cd);
> +	    return false;
>   	  case BUILT_IN_STRCPY:
>   	  case BUILT_IN_STRCPY_CHK:
>   	    impl_call_strcpy (cd);
> @@ -1225,6 +1228,12 @@ region_model::on_call_pre (const gcall *call, region_model_context *ctxt,
>   	  impl_call_memset (cd);
>   	  return false;
>   	}
> +      else if (is_named_call_p (callee_fndecl, "strchr", call, 2)
> +	       && POINTER_TYPE_P (cd.get_arg_type (0)))
> +	{
> +	  impl_call_strchr (cd);
> +	  return false;
> +	}
>         else if (is_named_call_p (callee_fndecl, "strlen", call, 1)
>   	       && POINTER_TYPE_P (cd.get_arg_type (0)))
>   	{
> @@ -2161,8 +2170,23 @@ public:
>   
>     bool emit (rich_location *rich_loc) FINAL OVERRIDE
>     {
> -    bool warned = warning_at (rich_loc, OPT_Wanalyzer_write_to_const,
> -			      "write to %<const%> object %qE", m_decl);
> +    auto_diagnostic_group d;
> +    bool warned;
> +    switch (m_reg->get_kind ())
> +      {
> +      default:
> +	warned = warning_at (rich_loc, OPT_Wanalyzer_write_to_const,
> +			     "write to %<const%> object %qE", m_decl);
> +	break;
> +      case RK_FUNCTION:
> +	warned = warning_at (rich_loc, OPT_Wanalyzer_write_to_const,
> +			     "write to function %qE", m_decl);
> +	break;
> +      case RK_LABEL:
> +	warned = warning_at (rich_loc, OPT_Wanalyzer_write_to_const,
> +			     "write to label %qE", m_decl);
> +	break;
> +      }
>       if (warned)
>         inform (DECL_SOURCE_LOCATION (m_decl), "declared here");
>       return warned;
> @@ -2170,7 +2194,15 @@ public:
>   
>     label_text describe_final_event (const evdesc::final_event &ev) FINAL OVERRIDE
>     {
> -    return ev.formatted_print ("write to %<const%> object %qE here", m_decl);
> +    switch (m_reg->get_kind ())
> +      {
> +      default:
> +	return ev.formatted_print ("write to %<const%> object %qE here", m_decl);
> +      case RK_FUNCTION:
> +	return ev.formatted_print ("write to function %qE here", m_decl);
> +      case RK_LABEL:
> +	return ev.formatted_print ("write to label %qE here", m_decl);
> +      }
>     }
>   
>   private:
> @@ -2231,6 +2263,20 @@ region_model::check_for_writable_region (const region* dest_reg,
>       {
>       default:
>         break;
> +    case RK_FUNCTION:
> +      {
> +	const function_region *func_reg = as_a <const function_region *> (base_reg);
> +	tree fndecl = func_reg->get_fndecl ();
> +	ctxt->warn (new write_to_const_diagnostic (func_reg, fndecl));
> +      }
> +      break;
> +    case RK_LABEL:
> +      {
> +	const label_region *label_reg = as_a <const label_region *> (base_reg);
> +	tree label = label_reg->get_label ();
> +	ctxt->warn (new write_to_const_diagnostic (label_reg, label));
> +      }
> +      break;
>       case RK_DECL:
>         {
>   	const decl_region *decl_reg = as_a <const decl_region *> (base_reg);
> diff --git a/gcc/analyzer/region-model.h b/gcc/analyzer/region-model.h
> index 13e8109aa51..543401167ae 100644
> --- a/gcc/analyzer/region-model.h
> +++ b/gcc/analyzer/region-model.h
> @@ -586,6 +586,7 @@ class region_model
>     void impl_call_memcpy (const call_details &cd);
>     void impl_call_memset (const call_details &cd);
>     void impl_call_realloc (const call_details &cd);
> +  void impl_call_strchr (const call_details &cd);
>     void impl_call_strcpy (const call_details &cd);
>     void impl_call_strlen (const call_details &cd);
>     void impl_call_operator_new (const call_details &cd);
> diff --git a/gcc/testsuite/gcc.dg/analyzer/pr102695.c b/gcc/testsuite/gcc.dg/analyzer/pr102695.c
> new file mode 100644
> index 00000000000..2ca988254fe
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/analyzer/pr102695.c
> @@ -0,0 +1,44 @@
> +extern void* malloc (__SIZE_TYPE__);
> +
> +const char* write_strchr_literal (int x)
> +{
> +  char *p = __builtin_strchr ("123", x);
> +  *p = 0; /* { dg-warning "dereference of NULL 'p'" "null deref" } */
> +  /* { dg-warning "write to string literal" "string literal" { target *-*-* } .-1 } */
> +  return p;
> +}
> +
> +const char* write_strchr_const_array (int x)
> +{
> +  static const char a[] = "123";
> +  char *p = __builtin_strchr (a, x);
> +  *p = 0; /* { dg-warning "dereference of NULL 'p'" "null deref" } */
> +  /* { dg-warning "write to 'const' object 'a'" "write to const" { target *-*-* } .-1 } */
> +  return a;
> +}
> +
> +char* write_function (void)
> +{
> +  char *p = (char*)malloc /* forgot arguments */;
> +  p[1] = 'a'; /* { dg-warning "write to function 'malloc'" } */
> +  __builtin_strcpy (p, "123");  /* { dg-warning "write to function 'malloc'" } */
> +  return p;
> +}
> +
> +char* write_label (void)
> +{
> +  char *p = (char*)&&L;
> +  *p = 0; /* { dg-warning "write to label 'L'" } */
> +L:
> +  return p;
> +}
> +
> +struct A { const int i; };
> +
> +extern /* not const */ struct A a;
> +
> +void write_const_member (void)
> +{
> +  char *p = (char*)&a.i;
> +  *p = 0;   // missing warning
> +}
> diff --git a/gcc/testsuite/gcc.dg/analyzer/strchr-1.c b/gcc/testsuite/gcc.dg/analyzer/strchr-1.c
> new file mode 100644
> index 00000000000..dfe1bc9ea1a
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/analyzer/strchr-1.c
> @@ -0,0 +1,26 @@
> +#include <string.h>
> +#include "analyzer-decls.h"
> +
> +const char* test_literal (int x)
> +{
> +  char *p = __builtin_strchr ("123", x);
> +  if (p)
> +    {
> +      __analyzer_eval (*p == x); /* { dg-message "UNKNOWN" } */
> +      /* TODO: this ought to be TRUE, but it's unclear that it's
> +	 worth stashing this constraint.  */
> +    }
> +  return p;
> +}
> +
> +void test_2 (const char *s, int c)
> +{
> +  char *p = __builtin_strchr (s, c); /* { dg-message "when '__builtin_strchr' returns NULL"} */
> +  *p = 'A'; /* { dg-warning "dereference of NULL 'p'" "null deref" } */
> +}
> +
> +void test_3 (const char *s, int c)
> +{
> +  char *p = strchr (s, c); /* { dg-message "when 'strchr' returns NULL"} */
> +  *p = 'A'; /* { dg-warning "dereference of NULL 'p'" "null deref" } */
> +}
> 


      reply	other threads:[~2021-11-17 16:16 UTC|newest]

Thread overview: 2+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-11-17  2:05 David Malcolm
2021-11-17 16:16 ` Martin Sebor [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=0a85a50e-2180-3236-cc31-f6ede8546e66@gmail.com \
    --to=msebor@gmail.com \
    --cc=dmalcolm@redhat.com \
    --cc=gcc-patches@gcc.gnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).