From: Martin Sebor <msebor@gmail.com>
To: David Malcolm <dmalcolm@redhat.com>, gcc-patches@gcc.gnu.org
Subject: Re: [committed] analyzer: fix missing -Wanalyzer-write-to-const [PR102695]
Date: Wed, 17 Nov 2021 09:16:01 -0700 [thread overview]
Message-ID: <0a85a50e-2180-3236-cc31-f6ede8546e66@gmail.com> (raw)
In-Reply-To: <20211117020518.2417407-1-dmalcolm@redhat.com>
On 11/16/21 7:05 PM, David Malcolm via Gcc-patches wrote:
> This patch fixes -Wanalyzer-write-to-const so that it will complain
> about attempts to write to functions, to labels.
> It also "teaches" the analyzer about strchr, in that strchr can either
> return a pointer into the input area (and thus -Wanalyzer-write-to-const
> can now complain about writes into a string literal seen this way),
> or return NULL (and thus the analyzer can complain about NULL
> dereferences if the result is used without a check).
Fow what it's worth, I used strchr in the test case as an example.
There are a few other built-ins like it, including index, rindex,
memchr, strrchr, and strstr (just going through the switch
statements in my code).
At least some of these built-ins have an attribute "fn spec" that
describes some of their properties (like what argument they read
from; see builtin_fnspec in builtins.c). But it doesn't look
like attr_fnspec has a way of encoding a function that returns
a pointer argument plus some offset. That seems like a useful
enhancement both for our work and also for optimizers. It would
let us avoid having to hardcode these properties in duplicate
case and switch statements in multiple places.
Martin
>
> Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu.
> Pushed to trunk as r12-5330-g111fd515f2894d7cddf62f80c69765c43ae18577.
>
> gcc/analyzer/ChangeLog:
> PR analyzer/102695
> * region-model-impl-calls.cc (region_model::impl_call_strchr): New.
> * region-model-manager.cc
> (region_model_manager::maybe_fold_unaryop): Simplify cast to
> pointer type of an existing pointer to a region.
> * region-model.cc (region_model::on_call_pre): Handle
> BUILT_IN_STRCHR and "strchr".
> (write_to_const_diagnostic::emit): Add auto_diagnostic_group. Add
> alternate wordings for functions and labels.
> (write_to_const_diagnostic::describe_final_event): Add alternate
> wordings for functions and labels.
> (region_model::check_for_writable_region): Handle RK_FUNCTION and
> RK_LABEL.
> * region-model.h (region_model::impl_call_strchr): New decl.
>
> gcc/testsuite/ChangeLog:
> PR analyzer/102695
> * gcc.dg/analyzer/pr102695.c: New test.
> * gcc.dg/analyzer/strchr-1.c: New test.
>
> Signed-off-by: David Malcolm <dmalcolm@redhat.com>
> ---
> gcc/analyzer/region-model-impl-calls.cc | 69 ++++++++++++++++++++++++
> gcc/analyzer/region-model-manager.cc | 7 +++
> gcc/analyzer/region-model.cc | 52 ++++++++++++++++--
> gcc/analyzer/region-model.h | 1 +
> gcc/testsuite/gcc.dg/analyzer/pr102695.c | 44 +++++++++++++++
> gcc/testsuite/gcc.dg/analyzer/strchr-1.c | 26 +++++++++
> 6 files changed, 196 insertions(+), 3 deletions(-)
> create mode 100644 gcc/testsuite/gcc.dg/analyzer/pr102695.c
> create mode 100644 gcc/testsuite/gcc.dg/analyzer/strchr-1.c
>
> diff --git a/gcc/analyzer/region-model-impl-calls.cc b/gcc/analyzer/region-model-impl-calls.cc
> index 90d4cf9c2db..ae50e69542e 100644
> --- a/gcc/analyzer/region-model-impl-calls.cc
> +++ b/gcc/analyzer/region-model-impl-calls.cc
> @@ -678,6 +678,75 @@ region_model::impl_call_realloc (const call_details &cd)
> }
> }
>
> +/* Handle the on_call_pre part of "strchr" and "__builtin_strchr". */
> +
> +void
> +region_model::impl_call_strchr (const call_details &cd)
> +{
> + class strchr_call_info : public call_info
> + {
> + public:
> + strchr_call_info (const call_details &cd, bool found)
> + : call_info (cd), m_found (found)
> + {
> + }
> +
> + label_text get_desc (bool can_colorize) const FINAL OVERRIDE
> + {
> + if (m_found)
> + return make_label_text (can_colorize,
> + "when %qE returns non-NULL",
> + get_fndecl ());
> + else
> + return make_label_text (can_colorize,
> + "when %qE returns NULL",
> + get_fndecl ());
> + }
> +
> + bool update_model (region_model *model,
> + const exploded_edge *,
> + region_model_context *ctxt) const FINAL OVERRIDE
> + {
> + const call_details cd (get_call_details (model, ctxt));
> + if (tree lhs_type = cd.get_lhs_type ())
> + {
> + region_model_manager *mgr = model->get_manager ();
> + const svalue *result;
> + if (m_found)
> + {
> + const svalue *str_sval = cd.get_arg_svalue (0);
> + const region *str_reg
> + = model->deref_rvalue (str_sval, cd.get_arg_tree (0),
> + cd.get_ctxt ());
> + /* We want str_sval + OFFSET for some unknown OFFSET.
> + Use a conjured_svalue to represent the offset,
> + using the str_reg as the id of the conjured_svalue. */
> + const svalue *offset
> + = mgr->get_or_create_conjured_svalue (size_type_node,
> + cd.get_call_stmt (),
> + str_reg);
> + result = mgr->get_or_create_binop (lhs_type, POINTER_PLUS_EXPR,
> + str_sval, offset);
> + }
> + else
> + result = mgr->get_or_create_int_cst (lhs_type, 0);
> + cd.maybe_set_lhs (result);
> + }
> + return true;
> + }
> + private:
> + bool m_found;
> + };
> +
> + /* Bifurcate state, creating a "not found" out-edge. */
> + if (cd.get_ctxt ())
> + cd.get_ctxt ()->bifurcate (new strchr_call_info (cd, false));
> +
> + /* The "unbifurcated" state is the "found" case. */
> + strchr_call_info found (cd, true);
> + found.update_model (this, NULL, cd.get_ctxt ());
> +}
> +
> /* Handle the on_call_pre part of "strcpy" and "__builtin_strcpy_chk". */
>
> void
> diff --git a/gcc/analyzer/region-model-manager.cc b/gcc/analyzer/region-model-manager.cc
> index 1cdec1bd230..fdf32122dea 100644
> --- a/gcc/analyzer/region-model-manager.cc
> +++ b/gcc/analyzer/region-model-manager.cc
> @@ -380,6 +380,13 @@ region_model_manager::maybe_fold_unaryop (tree type, enum tree_code op,
> == boolean_true_node))
> return maybe_fold_unaryop (type, op, innermost_arg);
> }
> + /* Avoid creating symbolic regions for pointer casts by
> + simplifying (T*)(®ION) to ((T*)®ION). */
> + if (const region_svalue *region_sval = arg->dyn_cast_region_svalue ())
> + if (POINTER_TYPE_P (type)
> + && region_sval->get_type ()
> + && POINTER_TYPE_P (region_sval->get_type ()))
> + return get_ptr_svalue (type, region_sval->get_pointee ());
> }
> break;
> case TRUTH_NOT_EXPR:
> diff --git a/gcc/analyzer/region-model.cc b/gcc/analyzer/region-model.cc
> index 416a5ac7249..bbb15abdfe2 100644
> --- a/gcc/analyzer/region-model.cc
> +++ b/gcc/analyzer/region-model.cc
> @@ -1133,6 +1133,9 @@ region_model::on_call_pre (const gcall *call, region_model_context *ctxt,
> break;
> case BUILT_IN_REALLOC:
> return false;
> + case BUILT_IN_STRCHR:
> + impl_call_strchr (cd);
> + return false;
> case BUILT_IN_STRCPY:
> case BUILT_IN_STRCPY_CHK:
> impl_call_strcpy (cd);
> @@ -1225,6 +1228,12 @@ region_model::on_call_pre (const gcall *call, region_model_context *ctxt,
> impl_call_memset (cd);
> return false;
> }
> + else if (is_named_call_p (callee_fndecl, "strchr", call, 2)
> + && POINTER_TYPE_P (cd.get_arg_type (0)))
> + {
> + impl_call_strchr (cd);
> + return false;
> + }
> else if (is_named_call_p (callee_fndecl, "strlen", call, 1)
> && POINTER_TYPE_P (cd.get_arg_type (0)))
> {
> @@ -2161,8 +2170,23 @@ public:
>
> bool emit (rich_location *rich_loc) FINAL OVERRIDE
> {
> - bool warned = warning_at (rich_loc, OPT_Wanalyzer_write_to_const,
> - "write to %<const%> object %qE", m_decl);
> + auto_diagnostic_group d;
> + bool warned;
> + switch (m_reg->get_kind ())
> + {
> + default:
> + warned = warning_at (rich_loc, OPT_Wanalyzer_write_to_const,
> + "write to %<const%> object %qE", m_decl);
> + break;
> + case RK_FUNCTION:
> + warned = warning_at (rich_loc, OPT_Wanalyzer_write_to_const,
> + "write to function %qE", m_decl);
> + break;
> + case RK_LABEL:
> + warned = warning_at (rich_loc, OPT_Wanalyzer_write_to_const,
> + "write to label %qE", m_decl);
> + break;
> + }
> if (warned)
> inform (DECL_SOURCE_LOCATION (m_decl), "declared here");
> return warned;
> @@ -2170,7 +2194,15 @@ public:
>
> label_text describe_final_event (const evdesc::final_event &ev) FINAL OVERRIDE
> {
> - return ev.formatted_print ("write to %<const%> object %qE here", m_decl);
> + switch (m_reg->get_kind ())
> + {
> + default:
> + return ev.formatted_print ("write to %<const%> object %qE here", m_decl);
> + case RK_FUNCTION:
> + return ev.formatted_print ("write to function %qE here", m_decl);
> + case RK_LABEL:
> + return ev.formatted_print ("write to label %qE here", m_decl);
> + }
> }
>
> private:
> @@ -2231,6 +2263,20 @@ region_model::check_for_writable_region (const region* dest_reg,
> {
> default:
> break;
> + case RK_FUNCTION:
> + {
> + const function_region *func_reg = as_a <const function_region *> (base_reg);
> + tree fndecl = func_reg->get_fndecl ();
> + ctxt->warn (new write_to_const_diagnostic (func_reg, fndecl));
> + }
> + break;
> + case RK_LABEL:
> + {
> + const label_region *label_reg = as_a <const label_region *> (base_reg);
> + tree label = label_reg->get_label ();
> + ctxt->warn (new write_to_const_diagnostic (label_reg, label));
> + }
> + break;
> case RK_DECL:
> {
> const decl_region *decl_reg = as_a <const decl_region *> (base_reg);
> diff --git a/gcc/analyzer/region-model.h b/gcc/analyzer/region-model.h
> index 13e8109aa51..543401167ae 100644
> --- a/gcc/analyzer/region-model.h
> +++ b/gcc/analyzer/region-model.h
> @@ -586,6 +586,7 @@ class region_model
> void impl_call_memcpy (const call_details &cd);
> void impl_call_memset (const call_details &cd);
> void impl_call_realloc (const call_details &cd);
> + void impl_call_strchr (const call_details &cd);
> void impl_call_strcpy (const call_details &cd);
> void impl_call_strlen (const call_details &cd);
> void impl_call_operator_new (const call_details &cd);
> diff --git a/gcc/testsuite/gcc.dg/analyzer/pr102695.c b/gcc/testsuite/gcc.dg/analyzer/pr102695.c
> new file mode 100644
> index 00000000000..2ca988254fe
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/analyzer/pr102695.c
> @@ -0,0 +1,44 @@
> +extern void* malloc (__SIZE_TYPE__);
> +
> +const char* write_strchr_literal (int x)
> +{
> + char *p = __builtin_strchr ("123", x);
> + *p = 0; /* { dg-warning "dereference of NULL 'p'" "null deref" } */
> + /* { dg-warning "write to string literal" "string literal" { target *-*-* } .-1 } */
> + return p;
> +}
> +
> +const char* write_strchr_const_array (int x)
> +{
> + static const char a[] = "123";
> + char *p = __builtin_strchr (a, x);
> + *p = 0; /* { dg-warning "dereference of NULL 'p'" "null deref" } */
> + /* { dg-warning "write to 'const' object 'a'" "write to const" { target *-*-* } .-1 } */
> + return a;
> +}
> +
> +char* write_function (void)
> +{
> + char *p = (char*)malloc /* forgot arguments */;
> + p[1] = 'a'; /* { dg-warning "write to function 'malloc'" } */
> + __builtin_strcpy (p, "123"); /* { dg-warning "write to function 'malloc'" } */
> + return p;
> +}
> +
> +char* write_label (void)
> +{
> + char *p = (char*)&&L;
> + *p = 0; /* { dg-warning "write to label 'L'" } */
> +L:
> + return p;
> +}
> +
> +struct A { const int i; };
> +
> +extern /* not const */ struct A a;
> +
> +void write_const_member (void)
> +{
> + char *p = (char*)&a.i;
> + *p = 0; // missing warning
> +}
> diff --git a/gcc/testsuite/gcc.dg/analyzer/strchr-1.c b/gcc/testsuite/gcc.dg/analyzer/strchr-1.c
> new file mode 100644
> index 00000000000..dfe1bc9ea1a
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/analyzer/strchr-1.c
> @@ -0,0 +1,26 @@
> +#include <string.h>
> +#include "analyzer-decls.h"
> +
> +const char* test_literal (int x)
> +{
> + char *p = __builtin_strchr ("123", x);
> + if (p)
> + {
> + __analyzer_eval (*p == x); /* { dg-message "UNKNOWN" } */
> + /* TODO: this ought to be TRUE, but it's unclear that it's
> + worth stashing this constraint. */
> + }
> + return p;
> +}
> +
> +void test_2 (const char *s, int c)
> +{
> + char *p = __builtin_strchr (s, c); /* { dg-message "when '__builtin_strchr' returns NULL"} */
> + *p = 'A'; /* { dg-warning "dereference of NULL 'p'" "null deref" } */
> +}
> +
> +void test_3 (const char *s, int c)
> +{
> + char *p = strchr (s, c); /* { dg-message "when 'strchr' returns NULL"} */
> + *p = 'A'; /* { dg-warning "dereference of NULL 'p'" "null deref" } */
> +}
>
prev parent reply other threads:[~2021-11-17 16:16 UTC|newest]
Thread overview: 2+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-11-17 2:05 David Malcolm
2021-11-17 16:16 ` Martin Sebor [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=0a85a50e-2180-3236-cc31-f6ede8546e66@gmail.com \
--to=msebor@gmail.com \
--cc=dmalcolm@redhat.com \
--cc=gcc-patches@gcc.gnu.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).