From: David Malcolm <dmalcolm@redhat.com>
To: gcc-patches@gcc.gnu.org, Joseph Myers <joseph@codesourcery.com>
Cc: Marek Polacek <polacek@redhat.com>
Subject: PING: Re: [PATCH] c, analyzer: support named constants in analyzer [PR106302]
Date: Mon, 07 Nov 2022 22:06:37 -0500 [thread overview]
Message-ID: <acc9d326129e38eb01ff8b766a832525d61e3a38.camel@redhat.com> (raw)
In-Reply-To: <20221031190742.2116564-1-dmalcolm@redhat.com>
I'd like to "ping" the C frontend parts of this patch:
https://gcc.gnu.org/pipermail/gcc-patches/2022-October/604739.html
FWIW I've just posted the socket patch that I referred to, here:
https://gcc.gnu.org/pipermail/gcc-patches/2022-November/605281.html
which depends on this patch.
Thanks
Dave
On Mon, 2022-10-31 at 15:07 -0400, David Malcolm wrote:
> The analyzer's file-descriptor state machine tracks the access mode
> of
> opened files, so that it can emit -Wanalyzer-fd-access-mode-mismatch.
>
> To do this, its symbolic execution needs to "know" the values of the
> constants "O_RDONLY", "O_WRONLY", and "O_ACCMODE". Currently
> analyzer/sm-fd.cc simply uses these values directly from the build-
> time
> header files, but these are the values on the host, not those from
> the
> target, which could be different (PR analyzer/106302).
>
> In an earlier discussion of this issue:
> https://gcc.gnu.org/pipermail/gcc/2022-June/238954.html
> we talked about adding a target hook for this.
>
> However, I've also been experimenting with extending the fd state
> machine to track sockets (PR analyzer/106140). For this, it's useful
> to
> "know" the values of the constants "SOCK_STREAM" and "SOCK_DGRAM".
> Unfortunately, these seem to have many arbitrary differences from
> target
> to target.
>
> For example: Linux/glibc general has SOCK_STREAM == 1, SOCK_DGRAM ==
> 2,
> as does AIX, but annoyingly, e.g. Linux on MIPS has them the other
> way
> around.
>
> It seems to me that as the analyzer grows more ambitious modeling of
> the
> behavior of APIs (perhaps via plugins) it's more likely that the
> analyzer will need to know the values of named constants, which might
> not even exist on the host.
>
> For example, at LPC it was suggested to me that -fanalyzer could
> check
> rules about memory management inside the Linux kernel (probably via a
> plugin), but doing so involves a bunch of GFP_* flags (see PR
> 107472).
>
> So rather than trying to capture all this knowledge in a target hook,
> this patch attempts to get at named constant values from the user's
> source code.
>
> The patch adds an interface for frontends to call into the analyzer
> as
> the translation unit finishes. The analyzer can then call back into
> the
> frontend to ask about the values of the named constants it cares
> about
> whilst the frontend's data structures are still around.
>
> The patch implements this for the C frontend, which looks up the
> names
> by looking for named CONST_DECLs (which handles enum values).
> Failing
> that, it attempts to look up the values of macros but only the
> simplest
> cases are supported (a non-traditional macro with a single CPP_NUMBER
> token). It does this by building a buffer containing the macro
> definition and rerunning a lexer on it.
>
> The analyzer gracefully handles the cases where named values aren't
> found (such as anything more complicated than described above).
>
> The patch ports the analyzer to use this mechanism for "O_RDONLY",
> "O_WRONLY", and "O_ACCMODE". I have successfully tested my socket
> patch
> to also use this for "SOCK_STREAM" and "SOCK_DGRAM", so the technique
> seems to work.
>
> Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu.
>
> Are the C frontend parts OK for trunk?
>
> Thanks
> Dave
>
> gcc/ChangeLog:
> PR analyzer/106302
> * Makefile.in (ANALYZER_OBJS): Add analyzer/analyzer-
> language.o.
> (GTFILES): Add analyzer/analyzer-language.cc.
>
> gcc/analyzer/ChangeLog:
> PR analyzer/106302
> * analyzer-language.cc: New file.
> * analyzer-language.h: New file.
> * analyzer.h (get_stashed_constant_by_name): New decl.
> (log_stashed_constants): New decl.
> * engine.cc (impl_run_checkers): Call log_stashed_constants.
> * region-model-impl-calls.cc
> (region_model::impl_call_analyzer_dump_named_constant): New.
> * region-model.cc (region_model::on_stmt_pre): Handle
> __analyzer_dump_named_constant.
> * region-model.h
> (region_model::impl_call_analyzer_dump_named_constant): New
> decl.
> * sm-fd.cc (fd_state_machine::m_O_ACCMODE): New.
> (fd_state_machine::m_O_RDONLY): New.
> (fd_state_machine::m_O_WRONLY): New.
> (fd_state_machine::fd_state_machine): Initialize the new
> fields.
> (fd_state_machine::get_access_mode_from_flag): Use the new
> fields,
> rather than using the host values.
>
> gcc/c/ChangeLog:
> PR analyzer/106302
> * c-parser.cc: Include "analyzer/analyzer-language.h" and
> "toplev.h".
> (class ana::c_translation_unit): New.
> (c_parser_translation_unit): Call
> ana::on_finish_translation_unit.
>
> gcc/ChangeLog:
> * doc/analyzer.texi: Document __analyzer_dump_named_constant.
>
> gcc/testsuite/ChangeLog:
> * gcc.dg/analyzer/analyzer-decls.h
> (__analyzer_dump_named_constant): New decl.
> * gcc.dg/analyzer/fd-4.c (void): Likewise.
> (O_ACCMODE): Define.
> * gcc.dg/analyzer/fd-access-mode-enum.c: New test, based on .
> * gcc.dg/analyzer/fd-5.c: ...this. Rename to...
> * gcc.dg/analyzer/fd-access-mode-macros.c: ...this.
> (O_ACCMODE): Define.
> * gcc.dg/analyzer/fd-access-mode-target-headers.c: New test,
> also
> based on fd-5.c.
> (test_sm_fd_constants): New.
> * gcc.dg/analyzer/fd-dup-1.c (O_ACCMODE): Define.
> * gcc.dg/analyzer/named-constants-via-enum.c: New test.
> * gcc.dg/analyzer/named-constants-via-macros-2.c: New test.
> * gcc.dg/analyzer/named-constants-via-macros.c: New test.
>
> Signed-off-by: David Malcolm <dmalcolm@redhat.com>
> ---
> gcc/Makefile.in | 2 +
> gcc/analyzer/analyzer-language.cc | 110
> ++++++++++++++++++
> gcc/analyzer/analyzer-language.h | 48 ++++++++
> gcc/analyzer/analyzer.h | 3 +
> gcc/analyzer/engine.cc | 1 +
> gcc/analyzer/region-model-impl-calls.cc | 28 +++++
> gcc/analyzer/region-model.cc | 4 +
> gcc/analyzer/region-model.h | 2 +
> gcc/analyzer/sm-fd.cc | 30 +++--
> gcc/c/c-parser.cc | 91 +++++++++++++++
> gcc/doc/analyzer.texi | 17 +++
> .../gcc.dg/analyzer/analyzer-decls.h | 3 +
> gcc/testsuite/gcc.dg/analyzer/fd-4.c | 1 +
> .../gcc.dg/analyzer/fd-access-mode-enum.c | 60 ++++++++++
> .../{fd-5.c => fd-access-mode-macros.c} | 1 +
> .../analyzer/fd-access-mode-target-headers.c | 56 +++++++++
> gcc/testsuite/gcc.dg/analyzer/fd-dup-1.c | 1 +
> .../analyzer/named-constants-via-enum.c | 20 ++++
> .../analyzer/named-constants-via-macros-2.c | 15 +++
> .../analyzer/named-constants-via-macros.c | 19 +++
> 20 files changed, 502 insertions(+), 10 deletions(-)
> create mode 100644 gcc/analyzer/analyzer-language.cc
> create mode 100644 gcc/analyzer/analyzer-language.h
> create mode 100644 gcc/testsuite/gcc.dg/analyzer/fd-access-mode-
> enum.c
> rename gcc/testsuite/gcc.dg/analyzer/{fd-5.c => fd-access-mode-
> macros.c} (98%)
> create mode 100644 gcc/testsuite/gcc.dg/analyzer/fd-access-mode-
> target-headers.c
> create mode 100644 gcc/testsuite/gcc.dg/analyzer/named-constants-
> via-enum.c
> create mode 100644 gcc/testsuite/gcc.dg/analyzer/named-constants-
> via-macros-2.c
> create mode 100644 gcc/testsuite/gcc.dg/analyzer/named-constants-
> via-macros.c
>
> diff --git a/gcc/Makefile.in b/gcc/Makefile.in
> index f672e6ea549..4539934cb03 100644
> --- a/gcc/Makefile.in
> +++ b/gcc/Makefile.in
> @@ -1249,6 +1249,7 @@ C_COMMON_OBJS = c-family/c-common.o c-family/c-
> cppbuiltin.o c-family/c-dump.o \
> ANALYZER_OBJS = \
> analyzer/analysis-plan.o \
> analyzer/analyzer.o \
> + analyzer/analyzer-language.o \
> analyzer/analyzer-logging.o \
> analyzer/analyzer-pass.o \
> analyzer/analyzer-selftests.o \
> @@ -2741,6 +2742,7 @@ GTFILES = $(CPPLIB_H) $(srcdir)/input.h
> $(srcdir)/coretypes.h \
> $(srcdir)/internal-fn.h \
> $(srcdir)/calls.cc \
> $(srcdir)/omp-general.h \
> + $(srcdir)/analyzer/analyzer-language.cc \
> @all_gtfiles@
>
> # Compute the list of GT header files from the corresponding C
> sources,
> diff --git a/gcc/analyzer/analyzer-language.cc
> b/gcc/analyzer/analyzer-language.cc
> new file mode 100644
> index 00000000000..ba4352b729a
> --- /dev/null
> +++ b/gcc/analyzer/analyzer-language.cc
> @@ -0,0 +1,110 @@
> +/* Interface between analyzer and frontends.
> + Copyright (C) 2022 Free Software Foundation, Inc.
> + Contributed by David Malcolm <dmalcolm@redhat.com>.
> +
> +This file is part of GCC.
> +
> +GCC is free software; you can redistribute it and/or modify it
> +under the terms of the GNU General Public License as published by
> +the Free Software Foundation; either version 3, or (at your option)
> +any later version.
> +
> +GCC is distributed in the hope that it will be useful, but
> +WITHOUT ANY WARRANTY; without even the implied warranty of
> +MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
> +General Public License for more details.
> +
> +You should have received a copy of the GNU General Public License
> +along with GCC; see the file COPYING3. If not see
> +<http://www.gnu.org/licenses/>. */
> +
> +#include "config.h"
> +#define INCLUDE_MEMORY
> +#include "system.h"
> +#include "coretypes.h"
> +#include "tree.h"
> +#include "stringpool.h"
> +#include "analyzer/analyzer.h"
> +#include "analyzer/analyzer-language.h"
> +#include "analyzer/analyzer-logging.h"
> +
> +/* Map from identifier to INTEGER_CST. */
> +static GTY (()) hash_map <tree, tree> *analyzer_stashed_constants;
> +
> +#if ENABLE_ANALYZER
> +
> +namespace ana {
> +
> +/* Call into TU to try to find a value for NAME.
> + If found, stash its value within analyzer_stashed_constants. */
> +
> +static void
> +maybe_stash_named_constant (const translation_unit &tu, const char
> *name)
> +{
> + if (!analyzer_stashed_constants)
> + analyzer_stashed_constants = hash_map<tree, tree>::create_ggc
> ();
> +
> + tree id = get_identifier (name);
> + if (tree t = tu.lookup_constant_by_id (id))
> + {
> + gcc_assert (TREE_CODE (t) == INTEGER_CST);
> + analyzer_stashed_constants->put (id, t);
> + }
> +}
> +
> +/* Hook for frontend to call into analyzer when TU finishes.
> + This exists so that the analyzer can stash named constant values
> from
> + header files (e.g. macros and enums) for later use when modeling
> the
> + behaviors of APIs.
> +
> + By doing it this way, the analyzer can use the precise values for
> those
> + constants from the user's headers, rather than attempting to
> model them
> + as properties of the target. */
> +
> +void
> +on_finish_translation_unit (const translation_unit &tu)
> +{
> + /* Bail if the analyzer isn't enabled. */
> + if (!flag_analyzer)
> + return;
> +
> + /* Stash named constants for use by sm-fd.cc */
> + maybe_stash_named_constant (tu, "O_ACCMODE");
> + maybe_stash_named_constant (tu, "O_RDONLY");
> + maybe_stash_named_constant (tu, "O_WRONLY");
> +}
> +
> +/* Lookup NAME in the named constants stashed when the frontend TU
> finished.
> + Return either an INTEGER_CST, or NULL_TREE. */
> +
> +tree
> +get_stashed_constant_by_name (const char *name)
> +{
> + if (!analyzer_stashed_constants)
> + return NULL_TREE;
> + tree id = get_identifier (name);
> + if (tree *slot = analyzer_stashed_constants->get (id))
> + {
> + gcc_assert (TREE_CODE (*slot) == INTEGER_CST);
> + return *slot;
> + }
> + return NULL_TREE;
> +}
> +
> +/* Log all stashed named constants to LOGGER. */
> +
> +void
> +log_stashed_constants (logger *logger)
> +{
> + gcc_assert (logger);
> + LOG_SCOPE (logger);
> + if (analyzer_stashed_constants)
> + for (auto iter : *analyzer_stashed_constants)
> + logger->log ("%qE: %qE", iter.first, iter.second);
> +}
> +
> +} // namespace ana
> +
> +#endif /* #if ENABLE_ANALYZER */
> +
> +#include "gt-analyzer-language.h"
> diff --git a/gcc/analyzer/analyzer-language.h
> b/gcc/analyzer/analyzer-language.h
> new file mode 100644
> index 00000000000..33c4dd60623
> --- /dev/null
> +++ b/gcc/analyzer/analyzer-language.h
> @@ -0,0 +1,48 @@
> +/* Interface between analyzer and frontends.
> + Copyright (C) 2022 Free Software Foundation, Inc.
> + Contributed by David Malcolm <dmalcolm@redhat.com>.
> +
> +This file is part of GCC.
> +
> +GCC is free software; you can redistribute it and/or modify it
> +under the terms of the GNU General Public License as published by
> +the Free Software Foundation; either version 3, or (at your option)
> +any later version.
> +
> +GCC is distributed in the hope that it will be useful, but
> +WITHOUT ANY WARRANTY; without even the implied warranty of
> +MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
> +General Public License for more details.
> +
> +You should have received a copy of the GNU General Public License
> +along with GCC; see the file COPYING3. If not see
> +<http://www.gnu.org/licenses/>. */
> +
> +#ifndef GCC_ANALYZER_LANGUAGE_H
> +#define GCC_ANALYZER_LANGUAGE_H
> +
> +#if ENABLE_ANALYZER
> +
> +namespace ana {
> +
> +/* Abstract base class for representing a specific TU
> + to the analyzer. */
> +
> +class translation_unit
> +{
> + public:
> + /* Attempt to look up an value for identifier ID (e.g. in the
> headers that
> + have been seen). If it is defined and an integer (e.g. either
> as a
> + macro or enum), return the INTEGER_CST value, otherwise return
> NULL. */
> + virtual tree lookup_constant_by_id (tree id) const = 0;
> +};
> +
> +/* Analyzer hook for frontends to call at the end of the TU. */
> +
> +void on_finish_translation_unit (const translation_unit &tu);
> +
> +} // namespace ana
> +
> +#endif /* #if ENABLE_ANALYZER */
> +
> +#endif /* GCC_ANALYZER_LANGUAGE_H */
> diff --git a/gcc/analyzer/analyzer.h b/gcc/analyzer/analyzer.h
> index a2d79e4a59f..e0c2ef77405 100644
> --- a/gcc/analyzer/analyzer.h
> +++ b/gcc/analyzer/analyzer.h
> @@ -312,6 +312,9 @@ public:
> virtual bool terminate_path_p () const = 0;
> };
>
> +extern tree get_stashed_constant_by_name (const char *name);
> +extern void log_stashed_constants (logger *logger);
> +
> } // namespace ana
>
> extern bool is_special_named_call_p (const gcall *call, const char
> *funcname,
> diff --git a/gcc/analyzer/engine.cc b/gcc/analyzer/engine.cc
> index 52978dd0d37..40ee14c1eae 100644
> --- a/gcc/analyzer/engine.cc
> +++ b/gcc/analyzer/engine.cc
> @@ -6001,6 +6001,7 @@ impl_run_checkers (logger *logger)
> logger->log ("BITS_BIG_ENDIAN: %i", BITS_BIG_ENDIAN ? 1 : 0);
> logger->log ("BYTES_BIG_ENDIAN: %i", BYTES_BIG_ENDIAN ? 1 :
> 0);
> logger->log ("WORDS_BIG_ENDIAN: %i", WORDS_BIG_ENDIAN ? 1 :
> 0);
> + log_stashed_constants (logger);
> }
>
> /* If using LTO, ensure that the cgraph nodes have function
> bodies. */
> diff --git a/gcc/analyzer/region-model-impl-calls.cc
> b/gcc/analyzer/region-model-impl-calls.cc
> index 52c4205cbeb..3e06f9eacd2 100644
> --- a/gcc/analyzer/region-model-impl-calls.cc
> +++ b/gcc/analyzer/region-model-impl-calls.cc
> @@ -350,6 +350,34 @@ region_model::impl_call_analyzer_dump_escaped
> (const gcall *call)
> pp_formatted_text (&pp));
> }
>
> +/* Handle a call to "__analyzer_dump_named_constant".
> +
> + Look up the given name, and emit a warning describing the
> + state of the corresponding stashed value.
> +
> + This is for use when debugging, and for DejaGnu tests. */
> +
> +void
> +region_model::
> +impl_call_analyzer_dump_named_constant (const gcall *call,
> + region_model_context *ctxt)
> +{
> + call_details cd (call, this, ctxt);
> + const char *name = cd.get_arg_string_literal (0);
> + if (!name)
> + {
> + error_at (call->location, "cannot determine name");
> + return;
> + }
> + tree value = get_stashed_constant_by_name (name);
> + if (value)
> + warning_at (call->location, 0, "named constant %qs has value
> %qE",
> + name, value);
> + else
> + warning_at (call->location, 0, "named constant %qs has unknown
> value",
> + name);
> +}
> +
> /* Handle a call to "__analyzer_eval" by evaluating the input
> and dumping as a dummy warning, so that test cases can use
> dg-warning to validate the result (and so unexpected warnings
> will
> diff --git a/gcc/analyzer/region-model.cc b/gcc/analyzer/region-
> model.cc
> index 7c44fc9e253..d33be9c9905 100644
> --- a/gcc/analyzer/region-model.cc
> +++ b/gcc/analyzer/region-model.cc
> @@ -1224,6 +1224,10 @@ region_model::on_stmt_pre (const gimple *stmt,
> impl_call_analyzer_dump_capacity (call, ctxt);
> else if (is_special_named_call_p (call,
> "__analyzer_dump_escaped", 0))
> impl_call_analyzer_dump_escaped (call);
> + else if (is_special_named_call_p (call,
> +
> "__analyzer_dump_named_constant",
> + 1))
> + impl_call_analyzer_dump_named_constant (call, ctxt);
> else if (is_special_named_call_p (call,
> "__analyzer_dump_path", 0))
> {
> /* Handle the builtin "__analyzer_dump_path" by queuing a
> diff --git a/gcc/analyzer/region-model.h b/gcc/analyzer/region-
> model.h
> index 19e8043daa4..16429703cef 100644
> --- a/gcc/analyzer/region-model.h
> +++ b/gcc/analyzer/region-model.h
> @@ -343,6 +343,8 @@ class region_model
> void impl_call_analyzer_dump_capacity (const gcall *call,
> region_model_context *ctxt);
> void impl_call_analyzer_dump_escaped (const gcall *call);
> + void impl_call_analyzer_dump_named_constant (const gcall *call,
> + region_model_context
> *ctxt);
> void impl_call_analyzer_eval (const gcall *call,
> region_model_context *ctxt);
> void impl_call_analyzer_get_unknown_ptr (const call_details &cd);
> diff --git a/gcc/analyzer/sm-fd.cc b/gcc/analyzer/sm-fd.cc
> index ae846cd6ec8..f3a5582c4b8 100644
> --- a/gcc/analyzer/sm-fd.cc
> +++ b/gcc/analyzer/sm-fd.cc
> @@ -157,6 +157,11 @@ public:
> /* State for a file descriptor that we do not want to track
> anymore . */
> state_t m_stop;
>
> + /* Stashed constant values from the frontend. These could be
> NULL. */
> + tree m_O_ACCMODE;
> + tree m_O_RDONLY;
> + tree m_O_WRONLY;
> +
> private:
> void on_open (sm_context *sm_ctxt, const supernode *node, const
> gimple *stmt,
> const gcall *call) const;
> @@ -684,7 +689,10 @@ fd_state_machine::fd_state_machine (logger
> *logger)
> m_valid_write_only (add_state ("fd-valid-write-only")),
> m_invalid (add_state ("fd-invalid")),
> m_closed (add_state ("fd-closed")),
> - m_stop (add_state ("fd-stop"))
> + m_stop (add_state ("fd-stop")),
> + m_O_ACCMODE (get_stashed_constant_by_name ("O_ACCMODE")),
> + m_O_RDONLY (get_stashed_constant_by_name ("O_RDONLY")),
> + m_O_WRONLY (get_stashed_constant_by_name ("O_WRONLY"))
> {
> }
>
> @@ -707,16 +715,18 @@ fd_state_machine::is_valid_fd_p (state_t s)
> const
> enum access_mode
> fd_state_machine::get_access_mode_from_flag (int flag) const
> {
> - /* FIXME: this code assumes the access modes on the host and
> - target are the same, which in practice might not be the case.
> */
> -
> - if ((flag & O_ACCMODE) == O_RDONLY)
> - {
> - return READ_ONLY;
> - }
> - else if ((flag & O_ACCMODE) == O_WRONLY)
> + if (m_O_ACCMODE && TREE_CODE (m_O_ACCMODE) == INTEGER_CST)
> {
> - return WRITE_ONLY;
> + const unsigned HOST_WIDE_INT mask_val = TREE_INT_CST_LOW
> (m_O_ACCMODE);
> + const unsigned HOST_WIDE_INT masked_flag = flag & mask_val;
> +
> + if (m_O_RDONLY && TREE_CODE (m_O_RDONLY) == INTEGER_CST)
> + if (masked_flag == TREE_INT_CST_LOW (m_O_RDONLY))
> + return READ_ONLY;
> +
> + if (m_O_WRONLY && TREE_CODE (m_O_WRONLY) == INTEGER_CST)
> + if (masked_flag == TREE_INT_CST_LOW (m_O_WRONLY))
> + return WRITE_ONLY;
> }
> return READ_WRITE;
> }
> diff --git a/gcc/c/c-parser.cc b/gcc/c/c-parser.cc
> index ca533c9c667..9e67afffa13 100644
> --- a/gcc/c/c-parser.cc
> +++ b/gcc/c/c-parser.cc
> @@ -72,6 +72,8 @@ along with GCC; see the file COPYING3. If not see
> #include "memmodel.h"
> #include "c-family/known-headers.h"
> #include "bitmap.h"
> +#include "analyzer/analyzer-language.h"
> +#include "toplev.h"
>
> /* We need to walk over decls with incomplete struct/union/enum
> types
> after parsing the whole translation unit.
> @@ -1662,6 +1664,87 @@ static bool
> c_parser_objc_diagnose_bad_element_prefix
> (c_parser *, struct c_declspecs *);
> static location_t c_parser_parse_rtl_body (c_parser *, char *);
>
> +#if ENABLE_ANALYZER
> +
> +namespace ana {
> +
> +/* Concrete implementation of ana::translation_unit for the C
> frontend. */
> +
> +class c_translation_unit : public translation_unit
> +{
> +public:
> + /* Implementation of translation_unit::lookup_constant_by_id for
> use by the
> + analyzer to look up named constants in the user's source code.
> */
> + tree lookup_constant_by_id (tree id) const final override
> + {
> + /* Consider decls. */
> + if (tree decl = lookup_name (id))
> + if (TREE_CODE (decl) == CONST_DECL)
> + if (tree value = DECL_INITIAL (decl))
> + if (TREE_CODE (value) == INTEGER_CST)
> + return value;
> +
> + /* Consider macros. */
> + cpp_hashnode *hashnode = C_CPP_HASHNODE (id);
> + if (cpp_macro_p (hashnode))
> + if (tree value = consider_macro (hashnode->value.macro))
> + return value;
> +
> + return NULL_TREE;
> + }
> +
> +private:
> + /* Attempt to get an INTEGER_CST from MACRO.
> + Only handle the simplest cases: where MACRO's definition is a
> single
> + token containing a number, by lexing the number again.
> + This will handle e.g.
> + #define NAME 42
> + and other bases but not negative numbers, parentheses or e.g.
> + #define NAME 1 << 7
> + as doing so would require a parser. */
> + tree consider_macro (cpp_macro *macro) const
> + {
> + if (macro->paramc > 0)
> + return NULL_TREE;
> + if (macro->kind == cmk_traditional)
> + return NULL_TREE;
> + if (macro->count != 1)
> + return NULL_TREE;
> + const cpp_token &tok = macro->exp.tokens[0];
> + if (tok.type != CPP_NUMBER)
> + return NULL_TREE;
> +
> + cpp_reader *old_parse_in = parse_in;
> + parse_in = cpp_create_reader (c_dialect_cxx () ? CLK_GNUCXX:
> CLK_GNUC89,
> + ident_hash, line_table);
> +
> + pretty_printer pp;
> + pp_string (&pp, (const char *)tok.val.str.text);
> + pp_newline (&pp);
> + cpp_push_buffer (parse_in,
> + (const unsigned char *)pp_formatted_text (&pp),
> + strlen (pp_formatted_text (&pp)),
> + 0);
> +
> + tree value;
> + location_t loc;
> + unsigned char cpp_flags;
> + c_lex_with_flags (&value, &loc, &cpp_flags, 0);
> +
> + cpp_destroy (parse_in);
> + parse_in = old_parse_in;
> +
> + if (value && TREE_CODE (value) == INTEGER_CST)
> + return value;
> +
> + return NULL_TREE;
> + }
> +};
> +
> +} // namespace ana
> +
> +#endif /* #if ENABLE_ANALYZER */
> +
> /* Parse a translation unit (C90 6.7, C99 6.9, C11 6.9).
>
> translation-unit:
> @@ -1722,6 +1805,14 @@ c_parser_translation_unit (c_parser *parser)
> "#pragma omp begin assumes", "#pragma omp end
> assumes");
> current_omp_begin_assumes = 0;
> }
> +
> +#if ENABLE_ANALYZER
> + if (flag_analyzer)
> + {
> + ana::c_translation_unit tu;
> + ana::on_finish_translation_unit (tu);
> + }
> +#endif
> }
>
> /* Parse an external declaration (C90 6.7, C99 6.9, C11 6.9).
> diff --git a/gcc/doc/analyzer.texi b/gcc/doc/analyzer.texi
> index ec49f951435..d61b55cec5a 100644
> --- a/gcc/doc/analyzer.texi
> +++ b/gcc/doc/analyzer.texi
> @@ -524,6 +524,23 @@ With a non-zero argument
>
> it will also dump all of the states within the ``processed'' nodes.
>
> +The builtin @code{__analyzer_dump_named_constant} will emit a
> warning
> +during analysis describing what is known about the value of a given
> +named constant, for parts of the analyzer that interact with target
> +headers.
> +
> +For example:
> +
> +@smallexample
> +__analyzer_dump_named_constant ("O_RDONLY");
> +@end smallexample
> +
> +might emit the warning:
> +
> +@smallexample
> +warning: named constant 'O_RDONLY' has value '1'
> +@end smallexample
> +
> @smallexample
> __analyzer_dump_region_model ();
> @end smallexample
> diff --git a/gcc/testsuite/gcc.dg/analyzer/analyzer-decls.h
> b/gcc/testsuite/gcc.dg/analyzer/analyzer-decls.h
> index 4478d740b58..d9a32ed9370 100644
> --- a/gcc/testsuite/gcc.dg/analyzer/analyzer-decls.h
> +++ b/gcc/testsuite/gcc.dg/analyzer/analyzer-decls.h
> @@ -31,6 +31,9 @@ extern void __analyzer_dump_escaped (void);
> will also dump all of the states within those nodes. */
> extern void __analyzer_dump_exploded_nodes (int);
>
> +/* Emit a warning describing what is known about the value of NAME.
> */
> +extern void __analyzer_dump_named_constant (const char *name);
> +
> /* Emit a placeholder "note" diagnostic with a path to this call
> site,
> if the analyzer finds a feasible path to it. */
> extern void __analyzer_dump_path (void);
> diff --git a/gcc/testsuite/gcc.dg/analyzer/fd-4.c
> b/gcc/testsuite/gcc.dg/analyzer/fd-4.c
> index 842a26b4364..994bad84342 100644
> --- a/gcc/testsuite/gcc.dg/analyzer/fd-4.c
> +++ b/gcc/testsuite/gcc.dg/analyzer/fd-4.c
> @@ -8,6 +8,7 @@ void close(int fd);
> int write (int fd, void *buf, int nbytes);
> int read (int fd, void *buf, int nbytes);
>
> +#define O_ACCMODE 0xf
> #define O_RDONLY 0
> #define O_WRONLY 1
> #define O_RDWR 2
> diff --git a/gcc/testsuite/gcc.dg/analyzer/fd-access-mode-enum.c
> b/gcc/testsuite/gcc.dg/analyzer/fd-access-mode-enum.c
> new file mode 100644
> index 00000000000..5226569c437
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/analyzer/fd-access-mode-enum.c
> @@ -0,0 +1,60 @@
> +int open(const char *, int mode);
> +void close(int fd);
> +int write (int fd, void *buf, int nbytes);
> +int read (int fd, void *buf, int nbytes);
> +
> +/* Example of these flags as an enum, and with
> + non-standard values for them. */
> +
> +enum {
> + O_RDONLY = 0x10,
> + O_WRONLY = 0x20,
> + O_RDWR = 0x40,
> +
> + O_ACCMODE = 0xf0
> +};
> +
> +void f (int fd) __attribute__((fd_arg(1))); /* { dg-message
> "argument 1 of 'f' must be an open file descriptor, due to
> '__attribute__\\(\\(fd_arg\\(1\\)\\)\\)'" } */
> +
> +void
> +test_1 (const char *path)
> +{
> + int fd = open (path, O_RDWR);
> + close(fd);
> + f(fd); /* { dg-warning "'f' on closed file descriptor 'fd'" } */
> + /* { dg-message "\\(3\\) 'f' on closed file descriptor 'fd';
> 'close' was at \\(2\\)" "" { target *-*-* } .-1 } */
> +}
> +
> +void g (int fd) __attribute__((fd_arg_read(1))); /* { dg-message
> "argument 1 of 'g' must be a readable file descriptor, due to
> '__attribute__\\(\\(fd_arg_read\\(1\\)\\)\\)'" } */
> +
> +void
> +test_2 (const char *path)
> +{
> + int fd = open (path, O_WRONLY);
> + if (fd != -1)
> + {
> + g (fd); /* { dg-warning "'g' on write-only file descriptor 'fd'"
> } */
> + }
> + close (fd);
> +}
> +
> +void h (int fd) __attribute__((fd_arg_write(1))); /* { dg-message
> "argument 1 of 'h' must be a writable file descriptor, due to
> '__attribute__\\(\\(fd_arg_write\\(1\\)\\)\\)'" } */
> +void
> +test_3 (const char *path)
> +{
> + int fd = open (path, O_RDONLY);
> + if (fd != -1)
> + {
> + h (fd); /* { dg-warning "'h' on read-only file descriptor 'fd'"
> } */
> + }
> + close(fd);
> +}
> +
> +void ff (int fd) __attribute__((fd_arg(1))); /* { dg-message
> "argument 1 of 'ff' must be an open file descriptor, due to
> '__attribute__\\(\\(fd_arg\\(1\\)\\)\\)'" } */
> +
> +void test_4 (const char *path)
> +{
> + int fd = open (path, O_RDWR);
> + ff (fd); /* { dg-warning "'ff' on possibly invalid file descriptor
> 'fd'" } */
> + close(fd);
> +}
> diff --git a/gcc/testsuite/gcc.dg/analyzer/fd-5.c
> b/gcc/testsuite/gcc.dg/analyzer/fd-access-mode-macros.c
> similarity index 98%
> rename from gcc/testsuite/gcc.dg/analyzer/fd-5.c
> rename to gcc/testsuite/gcc.dg/analyzer/fd-access-mode-macros.c
> index c18b2adcbe5..f9a6931a5db 100644
> --- a/gcc/testsuite/gcc.dg/analyzer/fd-5.c
> +++ b/gcc/testsuite/gcc.dg/analyzer/fd-access-mode-macros.c
> @@ -6,6 +6,7 @@ int read (int fd, void *buf, int nbytes);
> #define O_RDONLY 0
> #define O_WRONLY 1
> #define O_RDWR 2
> +#define O_ACCMODE 0x3
>
> void f (int fd) __attribute__((fd_arg(1))); /* { dg-message
> "argument 1 of 'f' must be an open file descriptor, due to
> '__attribute__\\(\\(fd_arg\\(1\\)\\)\\)'" } */
>
> diff --git a/gcc/testsuite/gcc.dg/analyzer/fd-access-mode-target-
> headers.c b/gcc/testsuite/gcc.dg/analyzer/fd-access-mode-target-
> headers.c
> new file mode 100644
> index 00000000000..b76eb667d50
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/analyzer/fd-access-mode-target-headers.c
> @@ -0,0 +1,56 @@
> +#include <sys/stat.h>
> +#include <fcntl.h>
> +#include <unistd.h>
> +#include "analyzer-decls.h"
> +
> +void f (int fd) __attribute__((fd_arg(1))); /* { dg-message
> "argument 1 of 'f' must be an open file descriptor, due to
> '__attribute__\\(\\(fd_arg\\(1\\)\\)\\)'" } */
> +
> +void
> +test_1 (const char *path)
> +{
> + int fd = open (path, O_RDWR);
> + close(fd);
> + f(fd); /* { dg-warning "'f' on closed file descriptor 'fd'" } */
> + /* { dg-message "\\(3\\) 'f' on closed file descriptor 'fd';
> 'close' was at \\(2\\)" "" { target *-*-* } .-1 } */
> +}
> +
> +void g (int fd) __attribute__((fd_arg_read(1))); /* { dg-message
> "argument 1 of 'g' must be a readable file descriptor, due to
> '__attribute__\\(\\(fd_arg_read\\(1\\)\\)\\)'" } */
> +
> +void
> +test_2 (const char *path)
> +{
> + int fd = open (path, O_WRONLY);
> + if (fd != -1)
> + {
> + g (fd); /* { dg-warning "'g' on write-only file descriptor 'fd'"
> } */
> + }
> + close (fd);
> +}
> +
> +void h (int fd) __attribute__((fd_arg_write(1))); /* { dg-message
> "argument 1 of 'h' must be a writable file descriptor, due to
> '__attribute__\\(\\(fd_arg_write\\(1\\)\\)\\)'" } */
> +void
> +test_3 (const char *path)
> +{
> + int fd = open (path, O_RDONLY);
> + if (fd != -1)
> + {
> + h (fd); /* { dg-warning "'h' on read-only file descriptor 'fd'"
> } */
> + }
> + close(fd);
> +}
> +
> +void ff (int fd) __attribute__((fd_arg(1))); /* { dg-message
> "argument 1 of 'ff' must be an open file descriptor, due to
> '__attribute__\\(\\(fd_arg\\(1\\)\\)\\)'" } */
> +
> +void test_4 (const char *path)
> +{
> + int fd = open (path, O_RDWR);
> + ff (fd); /* { dg-warning "'ff' on possibly invalid file descriptor
> 'fd'" } */
> + close(fd);
> +}
> +
> +void test_sm_fd_constants (void)
> +{
> + __analyzer_dump_named_constant ("O_ACCMODE"); /* { dg-warning
> "named constant 'O_ACCMODE' has value '\[0-9\]+'" } */
> + __analyzer_dump_named_constant ("O_RDONLY"); /* { dg-warning
> "named constant 'O_RDONLY' has value '\[0-9\]+'" } */
> + __analyzer_dump_named_constant ("O_WRONLY"); /* { dg-warning
> "named constant 'O_WRONLY' has value '\[0-9\]+'" } */
> +}
> diff --git a/gcc/testsuite/gcc.dg/analyzer/fd-dup-1.c
> b/gcc/testsuite/gcc.dg/analyzer/fd-dup-1.c
> index b4f43e7f0ef..bb58e9d9d5e 100644
> --- a/gcc/testsuite/gcc.dg/analyzer/fd-dup-1.c
> +++ b/gcc/testsuite/gcc.dg/analyzer/fd-dup-1.c
> @@ -8,6 +8,7 @@ int read (int fd, void *buf, int nbytes);
> #define O_RDONLY 0
> #define O_WRONLY 1
> #define O_RDWR 2
> +#define O_ACCMODE 3
>
> void test_1 (const char *path)
> {
> diff --git a/gcc/testsuite/gcc.dg/analyzer/named-constants-via-enum.c
> b/gcc/testsuite/gcc.dg/analyzer/named-constants-via-enum.c
> new file mode 100644
> index 00000000000..e6b77b8dd18
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/analyzer/named-constants-via-enum.c
> @@ -0,0 +1,20 @@
> +#include "analyzer-decls.h"
> +
> +/* Various constants used by the fd state machine. */
> +enum {
> + O_ACCMODE = 42,
> + O_RDONLY = 0x1,
> + O_WRONLY = 010
> +};
> +
> +void test_sm_fd_constants (void)
> +{
> + __analyzer_dump_named_constant ("O_ACCMODE"); /* { dg-warning
> "named constant 'O_ACCMODE' has value '42'" } */
> + __analyzer_dump_named_constant ("O_RDONLY"); /* { dg-warning
> "named constant 'O_RDONLY' has value '1'" } */
> + __analyzer_dump_named_constant ("O_WRONLY"); /* { dg-warning
> "named constant 'O_WRONLY' has value '8'" } */
> +}
> +
> +void test_unknown (void)
> +{
> + __analyzer_dump_named_constant ("UNKNOWN"); /* { dg-warning "named
> constant 'UNKNOWN' has unknown value" } */
> +}
> diff --git a/gcc/testsuite/gcc.dg/analyzer/named-constants-via-
> macros-2.c b/gcc/testsuite/gcc.dg/analyzer/named-constants-via-
> macros-2.c
> new file mode 100644
> index 00000000000..9c019e7c5ef
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/analyzer/named-constants-via-macros-2.c
> @@ -0,0 +1,15 @@
> +#include "analyzer-decls.h"
> +
> +/* Various constants used by the fd state machine, as macros
> + that can't be handled. */
> +
> +#define O_ACCMODE (
> +#define O_RDONLY "foo"
> +#define O_WRONLY int
> +
> +void test_sm_fd_constants (void)
> +{
> + __analyzer_dump_named_constant ("O_ACCMODE"); /* { dg-warning
> "named constant 'O_ACCMODE' has unknown value" } */
> + __analyzer_dump_named_constant ("O_RDONLY"); /* { dg-warning
> "named constant 'O_RDONLY' has unknown value" } */
> + __analyzer_dump_named_constant ("O_WRONLY"); /* { dg-warning
> "named constant 'O_WRONLY' has unknown value" } */
> +}
> diff --git a/gcc/testsuite/gcc.dg/analyzer/named-constants-via-
> macros.c b/gcc/testsuite/gcc.dg/analyzer/named-constants-via-macros.c
> new file mode 100644
> index 00000000000..2022f98e5b6
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/analyzer/named-constants-via-macros.c
> @@ -0,0 +1,19 @@
> +#include "analyzer-decls.h"
> +
> +/* Various constants used by the fd state machine. */
> +
> +#define O_ACCMODE 42
> +#define O_RDONLY 0x1
> +#define O_WRONLY 010
> +
> +void test_sm_fd_constants (void)
> +{
> + __analyzer_dump_named_constant ("O_ACCMODE"); /* { dg-warning
> "named constant 'O_ACCMODE' has value '42'" } */
> + __analyzer_dump_named_constant ("O_RDONLY"); /* { dg-warning
> "named constant 'O_RDONLY' has value '1'" } */
> + __analyzer_dump_named_constant ("O_WRONLY"); /* { dg-warning
> "named constant 'O_WRONLY' has value '8'" } */
> +}
> +
> +void test_unknown (void)
> +{
> + __analyzer_dump_named_constant ("UNKNOWN"); /* { dg-warning "named
> constant 'UNKNOWN' has unknown value" } */
> +}
next prev parent reply other threads:[~2022-11-08 3:06 UTC|newest]
Thread overview: 12+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-10-31 19:07 David Malcolm
2022-11-08 3:02 ` [PATCH] analyzer: add warnings relating to sockets [PR106140] David Malcolm
2022-11-12 3:27 ` [PATCH v2] " David Malcolm
2022-11-15 19:02 ` David Malcolm
2022-11-08 3:06 ` David Malcolm [this message]
2022-11-12 3:23 ` [PATCH v2] c, analyzer: support named constants in analyzer [PR106302] David Malcolm
2022-11-12 3:32 ` David Malcolm
2022-11-14 20:42 ` Marek Polacek
2022-11-15 18:35 ` [PATCH v3] " David Malcolm
2022-11-15 18:39 ` Marek Polacek
2022-11-17 3:05 ` [PATCH] c: fix ICE with -fanalyzer and -Wunused-macros [PR107711] David Malcolm
2022-11-17 15:30 ` Marek Polacek
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=acc9d326129e38eb01ff8b766a832525d61e3a38.camel@redhat.com \
--to=dmalcolm@redhat.com \
--cc=gcc-patches@gcc.gnu.org \
--cc=joseph@codesourcery.com \
--cc=polacek@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).