public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
From: David Malcolm <dmalcolm@redhat.com>
To: gcc-patches@gcc.gnu.org, Joseph Myers <joseph@codesourcery.com>
Cc: Marek Polacek <polacek@redhat.com>
Subject: Re: [PATCH v2] c, analyzer: support named constants in analyzer [PR106302]
Date: Fri, 11 Nov 2022 22:32:07 -0500	[thread overview]
Message-ID: <8bd04399163661fde948dc052f21ee567b7b9f46.camel@redhat.com> (raw)
In-Reply-To: <20221112032310.2723361-1-dmalcolm@redhat.com>

On Fri, 2022-11-11 at 22:23 -0500, David Malcolm wrote:
> Changes since v1: ported the doc changes from texinfo to sphinx
> 
> Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu.
> 
> Are the C frontend parts OK for trunk?  (I can self-approve the
> analyzer parts)

...and FWIW, the followup patch that uses this to add checking of
sockets to -fanalyzer is here:

 [PATCH v2] analyzer: add warnings relating to sockets [PR106140]
    https://gcc.gnu.org/pipermail/gcc-patches/2022-November/605836.html

Thanks
Dave

> 
> Thanks
> Dave
> 
> 
> The analyzer's file-descriptor state machine tracks the access mode
> of
> opened files, so that it can emit -Wanalyzer-fd-access-mode-mismatch.
> 
> To do this, its symbolic execution needs to "know" the values of the
> constants "O_RDONLY", "O_WRONLY", and "O_ACCMODE".  Currently
> analyzer/sm-fd.cc simply uses these values directly from the build-
> time
> header files, but these are the values on the host, not those from
> the
> target, which could be different (PR analyzer/106302).
> 
> In an earlier discussion of this issue:
>   https://gcc.gnu.org/pipermail/gcc/2022-June/238954.html
> we talked about adding a target hook for this.
> 
> However, I've also been experimenting with extending the fd state
> machine to track sockets (PR analyzer/106140).  For this, it's useful
> to
> "know" the values of the constants "SOCK_STREAM" and "SOCK_DGRAM".
> Unfortunately, these seem to have many arbitrary differences from
> target
> to target.
> 
> For example: Linux/glibc general has SOCK_STREAM == 1, SOCK_DGRAM ==
> 2,
> as does AIX, but annoyingly, e.g. Linux on MIPS has them the other
> way
> around.
> 
> It seems to me that as the analyzer grows more ambitious modeling of
> the
> behavior of APIs (perhaps via plugins) it's more likely that the
> analyzer will need to know the values of named constants, which might
> not even exist on the host.
> 
> For example, at LPC it was suggested to me that -fanalyzer could
> check
> rules about memory management inside the Linux kernel (probably via a
> plugin), but doing so involves a bunch of GFP_* flags (see PR
> 107472).
> 
> So rather than trying to capture all this knowledge in a target hook,
> this patch attempts to get at named constant values from the user's
> source code.
> 
> The patch adds an interface for frontends to call into the analyzer
> as
> the translation unit finishes.  The analyzer can then call back into
> the
> frontend to ask about the values of the named constants it cares
> about
> whilst the frontend's data structures are still around.
> 
> The patch implements this for the C frontend, which looks up the
> names
> by looking for named CONST_DECLs (which handles enum values). 
> Failing
> that, it attempts to look up the values of macros but only the
> simplest
> cases are supported (a non-traditional macro with a single CPP_NUMBER
> token).  It does this by building a buffer containing the macro
> definition and rerunning a lexer on it.
> 
> The analyzer gracefully handles the cases where named values aren't
> found (such as anything more complicated than described above).
> 
> The patch ports the analyzer to use this mechanism for "O_RDONLY",
> "O_WRONLY", and "O_ACCMODE".  I have successfully tested my socket
> patch
> to also use this for "SOCK_STREAM" and "SOCK_DGRAM", so the technique
> seems to work.
> 
> gcc/ChangeLog:
>         PR analyzer/106302
>         * Makefile.in (ANALYZER_OBJS): Add analyzer/analyzer-
> language.o.
>         (GTFILES): Add analyzer/analyzer-language.cc.
>         * doc/gccint/debugging-the-analyzer.rst: Document
>         __analyzer_dump_named_constant.
> 
> gcc/analyzer/ChangeLog:
>         PR analyzer/106302
>         * analyzer-language.cc: New file.
>         * analyzer-language.h: New file.
>         * analyzer.h (get_stashed_constant_by_name): New decl.
>         (log_stashed_constants): New decl.
>         * engine.cc (impl_run_checkers): Call log_stashed_constants.
>         * region-model-impl-calls.cc
>         (region_model::impl_call_analyzer_dump_named_constant): New.
>         * region-model.cc (region_model::on_stmt_pre): Handle
>         __analyzer_dump_named_constant.
>         * region-model.h
>         (region_model::impl_call_analyzer_dump_named_constant): New
> decl.
>         * sm-fd.cc (fd_state_machine::m_O_ACCMODE): New.
>         (fd_state_machine::m_O_RDONLY): New.
>         (fd_state_machine::m_O_WRONLY): New.
>         (fd_state_machine::fd_state_machine): Initialize the new
> fields.
>         (fd_state_machine::get_access_mode_from_flag): Use the new
> fields,
>         rather than using the host values.
> 
> gcc/c/ChangeLog:
>         PR analyzer/106302
>         * c-parser.cc: Include "analyzer/analyzer-language.h" and
> "toplev.h".
>         (class ana::c_translation_unit): New.
>         (c_parser_translation_unit): Call
> ana::on_finish_translation_unit.
> 
> gcc/testsuite/ChangeLog:
>         * gcc.dg/analyzer/analyzer-decls.h
>         (__analyzer_dump_named_constant): New decl.
>         * gcc.dg/analyzer/fd-4.c (void): Likewise.
>         (O_ACCMODE): Define.
>         * gcc.dg/analyzer/fd-access-mode-enum.c: New test, based on .
>         * gcc.dg/analyzer/fd-5.c: ...this.  Rename to...
>         * gcc.dg/analyzer/fd-access-mode-macros.c: ...this.
>         (O_ACCMODE): Define.
>         * gcc.dg/analyzer/fd-access-mode-target-headers.c: New test,
> also
>         based on fd-5.c.
>         (test_sm_fd_constants): New.
>         * gcc.dg/analyzer/fd-dup-1.c (O_ACCMODE): Define.
>         * gcc.dg/analyzer/named-constants-via-enum.c: New test.
>         * gcc.dg/analyzer/named-constants-via-macros-2.c: New test.
>         * gcc.dg/analyzer/named-constants-via-macros.c: New test.
> 
> Signed-off-by: David Malcolm <dmalcolm@redhat.com>
> ---
>  gcc/Makefile.in                               |   2 +
>  gcc/analyzer/analyzer-language.cc             | 110
> ++++++++++++++++++
>  gcc/analyzer/analyzer-language.h              |  48 ++++++++
>  gcc/analyzer/analyzer.h                       |   3 +
>  gcc/analyzer/engine.cc                        |   1 +
>  gcc/analyzer/region-model-impl-calls.cc       |  28 +++++
>  gcc/analyzer/region-model.cc                  |   4 +
>  gcc/analyzer/region-model.h                   |   2 +
>  gcc/analyzer/sm-fd.cc                         |  30 +++--
>  gcc/c/c-parser.cc                             |  91 +++++++++++++++
>  gcc/doc/gccint/debugging-the-analyzer.rst     |  17 +++
>  .../gcc.dg/analyzer/analyzer-decls.h          |   3 +
>  gcc/testsuite/gcc.dg/analyzer/fd-4.c          |   1 +
>  .../gcc.dg/analyzer/fd-access-mode-enum.c     |  60 ++++++++++
>  .../{fd-5.c => fd-access-mode-macros.c}       |   1 +
>  .../analyzer/fd-access-mode-target-headers.c  |  56 +++++++++
>  gcc/testsuite/gcc.dg/analyzer/fd-dup-1.c      |   1 +
>  .../analyzer/named-constants-via-enum.c       |  20 ++++
>  .../analyzer/named-constants-via-macros-2.c   |  15 +++
>  .../analyzer/named-constants-via-macros.c     |  19 +++
>  20 files changed, 502 insertions(+), 10 deletions(-)
>  create mode 100644 gcc/analyzer/analyzer-language.cc
>  create mode 100644 gcc/analyzer/analyzer-language.h
>  create mode 100644 gcc/testsuite/gcc.dg/analyzer/fd-access-mode-
> enum.c
>  rename gcc/testsuite/gcc.dg/analyzer/{fd-5.c => fd-access-mode-
> macros.c} (98%)
>  create mode 100644 gcc/testsuite/gcc.dg/analyzer/fd-access-mode-
> target-headers.c
>  create mode 100644 gcc/testsuite/gcc.dg/analyzer/named-constants-
> via-enum.c
>  create mode 100644 gcc/testsuite/gcc.dg/analyzer/named-constants-
> via-macros-2.c
>  create mode 100644 gcc/testsuite/gcc.dg/analyzer/named-constants-
> via-macros.c
> 
> diff --git a/gcc/Makefile.in b/gcc/Makefile.in
> index 246a85a1677..684caedc2df 100644
> --- a/gcc/Makefile.in
> +++ b/gcc/Makefile.in
> @@ -1236,6 +1236,7 @@ C_COMMON_OBJS = c-family/c-common.o c-family/c-
> cppbuiltin.o c-family/c-dump.o \
>  ANALYZER_OBJS = \
>         analyzer/analysis-plan.o \
>         analyzer/analyzer.o \
> +       analyzer/analyzer-language.o \
>         analyzer/analyzer-logging.o \
>         analyzer/analyzer-pass.o \
>         analyzer/analyzer-selftests.o \
> @@ -2709,6 +2710,7 @@ GTFILES = $(CPPLIB_H) $(srcdir)/input.h
> $(srcdir)/coretypes.h \
>    $(srcdir)/internal-fn.h \
>    $(srcdir)/calls.cc \
>    $(srcdir)/omp-general.h \
> +  $(srcdir)/analyzer/analyzer-language.cc \
>    @all_gtfiles@
>  
>  # Compute the list of GT header files from the corresponding C
> sources,
> diff --git a/gcc/analyzer/analyzer-language.cc
> b/gcc/analyzer/analyzer-language.cc
> new file mode 100644
> index 00000000000..ba4352b729a
> --- /dev/null
> +++ b/gcc/analyzer/analyzer-language.cc
> @@ -0,0 +1,110 @@
> +/* Interface between analyzer and frontends.
> +   Copyright (C) 2022 Free Software Foundation, Inc.
> +   Contributed by David Malcolm <dmalcolm@redhat.com>.
> +
> +This file is part of GCC.
> +
> +GCC is free software; you can redistribute it and/or modify it
> +under the terms of the GNU General Public License as published by
> +the Free Software Foundation; either version 3, or (at your option)
> +any later version.
> +
> +GCC is distributed in the hope that it will be useful, but
> +WITHOUT ANY WARRANTY; without even the implied warranty of
> +MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> +General Public License for more details.
> +
> +You should have received a copy of the GNU General Public License
> +along with GCC; see the file COPYING3.  If not see
> +<http://www.gnu.org/licenses/>.  */
> +
> +#include "config.h"
> +#define INCLUDE_MEMORY
> +#include "system.h"
> +#include "coretypes.h"
> +#include "tree.h"
> +#include "stringpool.h"
> +#include "analyzer/analyzer.h"
> +#include "analyzer/analyzer-language.h"
> +#include "analyzer/analyzer-logging.h"
> +
> +/* Map from identifier to INTEGER_CST.  */
> +static GTY (()) hash_map <tree, tree> *analyzer_stashed_constants;
> +
> +#if ENABLE_ANALYZER
> +
> +namespace ana {
> +
> +/* Call into TU to try to find a value for NAME.
> +   If found, stash its value within analyzer_stashed_constants.  */
> +
> +static void
> +maybe_stash_named_constant (const translation_unit &tu, const char
> *name)
> +{
> +  if (!analyzer_stashed_constants)
> +    analyzer_stashed_constants = hash_map<tree, tree>::create_ggc
> ();
> +
> +  tree id = get_identifier (name);
> +  if (tree t = tu.lookup_constant_by_id (id))
> +    {
> +      gcc_assert (TREE_CODE (t) == INTEGER_CST);
> +      analyzer_stashed_constants->put (id, t);
> +    }
> +}
> +
> +/* Hook for frontend to call into analyzer when TU finishes.
> +   This exists so that the analyzer can stash named constant values
> from
> +   header files (e.g. macros and enums) for later use when modeling
> the
> +   behaviors of APIs.
> +
> +   By doing it this way, the analyzer can use the precise values for
> those
> +   constants from the user's headers, rather than attempting to
> model them
> +   as properties of the target.  */
> +
> +void
> +on_finish_translation_unit (const translation_unit &tu)
> +{
> +  /* Bail if the analyzer isn't enabled.  */
> +  if (!flag_analyzer)
> +    return;
> +
> +  /* Stash named constants for use by sm-fd.cc  */
> +  maybe_stash_named_constant (tu, "O_ACCMODE");
> +  maybe_stash_named_constant (tu, "O_RDONLY");
> +  maybe_stash_named_constant (tu, "O_WRONLY");
> +}
> +
> +/* Lookup NAME in the named constants stashed when the frontend TU
> finished.
> +   Return either an INTEGER_CST, or NULL_TREE.  */
> +
> +tree
> +get_stashed_constant_by_name (const char *name)
> +{
> +  if (!analyzer_stashed_constants)
> +    return NULL_TREE;
> +  tree id = get_identifier (name);
> +  if (tree *slot = analyzer_stashed_constants->get (id))
> +    {
> +      gcc_assert (TREE_CODE (*slot) == INTEGER_CST);
> +      return *slot;
> +    }
> +  return NULL_TREE;
> +}
> +
> +/* Log all stashed named constants to LOGGER.  */
> +
> +void
> +log_stashed_constants (logger *logger)
> +{
> +  gcc_assert (logger);
> +  LOG_SCOPE (logger);
> +  if (analyzer_stashed_constants)
> +    for (auto iter : *analyzer_stashed_constants)
> +      logger->log ("%qE: %qE", iter.first, iter.second);
> +}
> +
> +} // namespace ana
> +
> +#endif /* #if ENABLE_ANALYZER */
> +
> +#include "gt-analyzer-language.h"
> diff --git a/gcc/analyzer/analyzer-language.h
> b/gcc/analyzer/analyzer-language.h
> new file mode 100644
> index 00000000000..33c4dd60623
> --- /dev/null
> +++ b/gcc/analyzer/analyzer-language.h
> @@ -0,0 +1,48 @@
> +/* Interface between analyzer and frontends.
> +   Copyright (C) 2022 Free Software Foundation, Inc.
> +   Contributed by David Malcolm <dmalcolm@redhat.com>.
> +
> +This file is part of GCC.
> +
> +GCC is free software; you can redistribute it and/or modify it
> +under the terms of the GNU General Public License as published by
> +the Free Software Foundation; either version 3, or (at your option)
> +any later version.
> +
> +GCC is distributed in the hope that it will be useful, but
> +WITHOUT ANY WARRANTY; without even the implied warranty of
> +MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> +General Public License for more details.
> +
> +You should have received a copy of the GNU General Public License
> +along with GCC; see the file COPYING3.  If not see
> +<http://www.gnu.org/licenses/>.  */
> +
> +#ifndef GCC_ANALYZER_LANGUAGE_H
> +#define GCC_ANALYZER_LANGUAGE_H
> +
> +#if ENABLE_ANALYZER
> +
> +namespace ana {
> +
> +/* Abstract base class for representing a specific TU
> +   to the analyzer.  */
> +
> +class translation_unit
> +{
> + public:
> +  /* Attempt to look up an  value for identifier ID (e.g. in the
> headers that
> +     have been seen).  If it is defined and an integer (e.g. either
> as a
> +     macro or enum), return the INTEGER_CST value, otherwise return
> NULL.  */
> +  virtual tree lookup_constant_by_id (tree id) const = 0;
> +};
> +
> +/* Analyzer hook for frontends to call at the end of the TU.  */
> +
> +void on_finish_translation_unit (const translation_unit &tu);
> +
> +} // namespace ana
> +
> +#endif /* #if ENABLE_ANALYZER */
> +
> +#endif /* GCC_ANALYZER_LANGUAGE_H */
> diff --git a/gcc/analyzer/analyzer.h b/gcc/analyzer/analyzer.h
> index c0041c35d1a..9cf8d98fabe 100644
> --- a/gcc/analyzer/analyzer.h
> +++ b/gcc/analyzer/analyzer.h
> @@ -311,6 +311,9 @@ public:
>    virtual bool terminate_path_p () const = 0;
>  };
>  
> +extern tree get_stashed_constant_by_name (const char *name);
> +extern void log_stashed_constants (logger *logger);
> +
>  } // namespace ana
>  
>  extern bool is_special_named_call_p (const gcall *call, const char
> *funcname,
> diff --git a/gcc/analyzer/engine.cc b/gcc/analyzer/engine.cc
> index d0595ef0d07..891be7c5c90 100644
> --- a/gcc/analyzer/engine.cc
> +++ b/gcc/analyzer/engine.cc
> @@ -6010,6 +6010,7 @@ impl_run_checkers (logger *logger)
>        logger->log ("BITS_BIG_ENDIAN: %i", BITS_BIG_ENDIAN ? 1 : 0);
>        logger->log ("BYTES_BIG_ENDIAN: %i", BYTES_BIG_ENDIAN ? 1 :
> 0);
>        logger->log ("WORDS_BIG_ENDIAN: %i", WORDS_BIG_ENDIAN ? 1 :
> 0);
> +      log_stashed_constants (logger);
>      }
>  
>    /* If using LTO, ensure that the cgraph nodes have function
> bodies.  */
> diff --git a/gcc/analyzer/region-model-impl-calls.cc
> b/gcc/analyzer/region-model-impl-calls.cc
> index 9ef31f6ab05..a7134ed90bb 100644
> --- a/gcc/analyzer/region-model-impl-calls.cc
> +++ b/gcc/analyzer/region-model-impl-calls.cc
> @@ -352,6 +352,34 @@ region_model::impl_call_analyzer_dump_escaped
> (const gcall *call)
>               pp_formatted_text (&pp));
>  }
>  
> +/* Handle a call to "__analyzer_dump_named_constant".
> +
> +   Look up the given name, and emit a warning describing the
> +   state of the corresponding stashed value.
> +
> +   This is for use when debugging, and for DejaGnu tests.  */
> +
> +void
> +region_model::
> +impl_call_analyzer_dump_named_constant (const gcall *call,
> +                                       region_model_context *ctxt)
> +{
> +  call_details cd (call, this, ctxt);
> +  const char *name = cd.get_arg_string_literal (0);
> +  if (!name)
> +    {
> +      error_at (call->location, "cannot determine name");
> +      return;
> +    }
> +  tree value = get_stashed_constant_by_name (name);
> +  if (value)
> +    warning_at (call->location, 0, "named constant %qs has value
> %qE",
> +               name, value);
> +  else
> +    warning_at (call->location, 0, "named constant %qs has unknown
> value",
> +               name);
> +}
> +
>  /* Handle a call to "__analyzer_eval" by evaluating the input
>     and dumping as a dummy warning, so that test cases can use
>     dg-warning to validate the result (and so unexpected warnings
> will
> diff --git a/gcc/analyzer/region-model.cc b/gcc/analyzer/region-
> model.cc
> index b91434d7db4..5bae3cf5cd4 100644
> --- a/gcc/analyzer/region-model.cc
> +++ b/gcc/analyzer/region-model.cc
> @@ -1229,6 +1229,10 @@ region_model::on_stmt_pre (const gimple *stmt,
>           impl_call_analyzer_dump_capacity (call, ctxt);
>         else if (is_special_named_call_p (call,
> "__analyzer_dump_escaped", 0))
>           impl_call_analyzer_dump_escaped (call);
> +       else if (is_special_named_call_p (call,
> +                                        
> "__analyzer_dump_named_constant",
> +                                         1))
> +         impl_call_analyzer_dump_named_constant (call, ctxt);
>         else if (is_special_named_call_p (call,
> "__analyzer_dump_path", 0))
>           {
>             /* Handle the builtin "__analyzer_dump_path" by queuing a
> diff --git a/gcc/analyzer/region-model.h b/gcc/analyzer/region-
> model.h
> index 50790596726..bd81e6b6b9d 100644
> --- a/gcc/analyzer/region-model.h
> +++ b/gcc/analyzer/region-model.h
> @@ -344,6 +344,8 @@ class region_model
>    void impl_call_analyzer_dump_capacity (const gcall *call,
>                                          region_model_context *ctxt);
>    void impl_call_analyzer_dump_escaped (const gcall *call);
> +  void impl_call_analyzer_dump_named_constant (const gcall *call,
> +                                              region_model_context
> *ctxt);
>    void impl_call_analyzer_eval (const gcall *call,
>                                 region_model_context *ctxt);
>    void impl_call_analyzer_get_unknown_ptr (const call_details &cd);
> diff --git a/gcc/analyzer/sm-fd.cc b/gcc/analyzer/sm-fd.cc
> index da0e92b5113..370115d56bf 100644
> --- a/gcc/analyzer/sm-fd.cc
> +++ b/gcc/analyzer/sm-fd.cc
> @@ -159,6 +159,11 @@ public:
>    /* State for a file descriptor that we do not want to track
> anymore . */
>    state_t m_stop;
>  
> +  /* Stashed constant values from the frontend.  These could be
> NULL.  */
> +  tree m_O_ACCMODE;
> +  tree m_O_RDONLY;
> +  tree m_O_WRONLY;
> +
>  private:
>    void on_open (sm_context *sm_ctxt, const supernode *node, const
> gimple *stmt,
>                 const gcall *call) const;
> @@ -686,7 +691,10 @@ fd_state_machine::fd_state_machine (logger
> *logger)
>        m_valid_write_only (add_state ("fd-valid-write-only")),
>        m_invalid (add_state ("fd-invalid")),
>        m_closed (add_state ("fd-closed")),
> -      m_stop (add_state ("fd-stop"))
> +      m_stop (add_state ("fd-stop")),
> +      m_O_ACCMODE (get_stashed_constant_by_name ("O_ACCMODE")),
> +      m_O_RDONLY (get_stashed_constant_by_name ("O_RDONLY")),
> +      m_O_WRONLY (get_stashed_constant_by_name ("O_WRONLY"))
>  {
>  }
>  
> @@ -709,16 +717,18 @@ fd_state_machine::is_valid_fd_p (state_t s)
> const
>  enum access_mode
>  fd_state_machine::get_access_mode_from_flag (int flag) const
>  {
> -  /* FIXME: this code assumes the access modes on the host and
> -     target are the same, which in practice might not be the case. 
> */
> -
> -  if ((flag & O_ACCMODE) == O_RDONLY)
> -    {
> -      return READ_ONLY;
> -    }
> -  else if ((flag & O_ACCMODE) == O_WRONLY)
> +  if (m_O_ACCMODE && TREE_CODE (m_O_ACCMODE) == INTEGER_CST)
>      {
> -      return WRITE_ONLY;
> +      const unsigned HOST_WIDE_INT mask_val = TREE_INT_CST_LOW
> (m_O_ACCMODE);
> +      const unsigned HOST_WIDE_INT masked_flag = flag & mask_val;
> +
> +      if (m_O_RDONLY && TREE_CODE (m_O_RDONLY) == INTEGER_CST)
> +       if (masked_flag == TREE_INT_CST_LOW (m_O_RDONLY))
> +         return READ_ONLY;
> +
> +      if (m_O_WRONLY && TREE_CODE (m_O_WRONLY) == INTEGER_CST)
> +       if (masked_flag == TREE_INT_CST_LOW (m_O_WRONLY))
> +         return WRITE_ONLY;
>      }
>    return READ_WRITE;
>  }
> diff --git a/gcc/c/c-parser.cc b/gcc/c/c-parser.cc
> index d70697b1d63..efe19fbe70b 100644
> --- a/gcc/c/c-parser.cc
> +++ b/gcc/c/c-parser.cc
> @@ -72,6 +72,8 @@ along with GCC; see the file COPYING3.  If not see
>  #include "memmodel.h"
>  #include "c-family/known-headers.h"
>  #include "bitmap.h"
> +#include "analyzer/analyzer-language.h"
> +#include "toplev.h"
>  
>  /* We need to walk over decls with incomplete struct/union/enum
> types
>     after parsing the whole translation unit.
> @@ -1662,6 +1664,87 @@ static bool
> c_parser_objc_diagnose_bad_element_prefix
>    (c_parser *, struct c_declspecs *);
>  static location_t c_parser_parse_rtl_body (c_parser *, char *);
>  
> +#if ENABLE_ANALYZER
> +
> +namespace ana {
> +
> +/* Concrete implementation of ana::translation_unit for the C
> frontend.  */
> +
> +class c_translation_unit : public translation_unit
> +{
> +public:
> +  /* Implementation of translation_unit::lookup_constant_by_id for
> use by the
> +     analyzer to look up named constants in the user's source code. 
> */
> +  tree lookup_constant_by_id (tree id) const final override
> +  {
> +    /* Consider decls.  */
> +    if (tree decl = lookup_name (id))
> +      if (TREE_CODE (decl) == CONST_DECL)
> +       if (tree value = DECL_INITIAL (decl))
> +         if (TREE_CODE (value) == INTEGER_CST)
> +           return value;
> +
> +    /* Consider macros.  */
> +    cpp_hashnode *hashnode = C_CPP_HASHNODE (id);
> +    if (cpp_macro_p (hashnode))
> +      if (tree value = consider_macro (hashnode->value.macro))
> +       return value;
> +
> +    return NULL_TREE;
> +  }
> +
> +private:
> +  /* Attempt to get an INTEGER_CST from MACRO.
> +     Only handle the simplest cases: where MACRO's definition is a
> single
> +     token containing a number, by lexing the number again.
> +     This will handle e.g.
> +       #define NAME 42
> +     and other bases but not negative numbers, parentheses or e.g.
> +       #define NAME 1 << 7
> +     as doing so would require a parser.  */
> +  tree consider_macro (cpp_macro *macro) const
> +  {
> +    if (macro->paramc > 0)
> +      return NULL_TREE;
> +    if (macro->kind == cmk_traditional)
> +      return NULL_TREE;
> +    if (macro->count != 1)
> +      return NULL_TREE;
> +    const cpp_token &tok = macro->exp.tokens[0];
> +    if (tok.type != CPP_NUMBER)
> +      return NULL_TREE;
> +
> +    cpp_reader *old_parse_in = parse_in;
> +    parse_in = cpp_create_reader (c_dialect_cxx () ? CLK_GNUCXX:
> CLK_GNUC89,
> +                                 ident_hash, line_table);
> +
> +    pretty_printer pp;
> +    pp_string (&pp, (const char *)tok.val.str.text);
> +    pp_newline (&pp);
> +    cpp_push_buffer (parse_in,
> +                    (const unsigned char *)pp_formatted_text (&pp),
> +                    strlen (pp_formatted_text (&pp)),
> +                    0);
> +
> +    tree value;
> +    location_t loc;
> +    unsigned char cpp_flags;
> +    c_lex_with_flags (&value, &loc, &cpp_flags, 0);
> +
> +    cpp_destroy (parse_in);
> +    parse_in = old_parse_in;
> +
> +    if (value && TREE_CODE (value) == INTEGER_CST)
> +      return value;
> +
> +    return NULL_TREE;
> +  }
> +};
> +
> +} // namespace ana
> +
> +#endif /* #if ENABLE_ANALYZER */
> +
>  /* Parse a translation unit (C90 6.7, C99 6.9, C11 6.9).
>  
>     translation-unit:
> @@ -1722,6 +1805,14 @@ c_parser_translation_unit (c_parser *parser)
>                "#pragma omp begin assumes", "#pragma omp end
> assumes");
>        current_omp_begin_assumes = 0;
>      }
> +
> +#if ENABLE_ANALYZER
> +  if (flag_analyzer)
> +    {
> +      ana::c_translation_unit tu;
> +      ana::on_finish_translation_unit (tu);
> +    }
> +#endif
>  }
>  
>  /* Parse an external declaration (C90 6.7, C99 6.9, C11 6.9).
> diff --git a/gcc/doc/gccint/debugging-the-analyzer.rst
> b/gcc/doc/gccint/debugging-the-analyzer.rst
> index 4a09b39e166..099acf295b9 100644
> --- a/gcc/doc/gccint/debugging-the-analyzer.rst
> +++ b/gcc/doc/gccint/debugging-the-analyzer.rst
> @@ -90,6 +90,23 @@ With a non-zero argument
>  
>  it will also dump all of the states within the 'processed' nodes.
>  
> +The builtin ``__analyzer_dump_named_constant`` will emit a warning
> +during analysis describing what is known about the value of a given
> +named constant, for parts of the analyzer that interact with target
> +headers.
> +
> +For example:
> +
> +.. code-block:: c
> +
> +   __analyzer_dump_named_constant ("O_RDONLY");
> +
> +might emit the warning:
> +
> +.. code-block::
> +
> +   warning: named constant 'O_RDONLY' has value '1'
> +
>  .. code-block:: c++
>  
>       __analyzer_dump_region_model ();
> diff --git a/gcc/testsuite/gcc.dg/analyzer/analyzer-decls.h
> b/gcc/testsuite/gcc.dg/analyzer/analyzer-decls.h
> index 4478d740b58..d9a32ed9370 100644
> --- a/gcc/testsuite/gcc.dg/analyzer/analyzer-decls.h
> +++ b/gcc/testsuite/gcc.dg/analyzer/analyzer-decls.h
> @@ -31,6 +31,9 @@ extern void __analyzer_dump_escaped (void);
>     will also dump all of the states within those nodes.  */
>  extern void __analyzer_dump_exploded_nodes (int);
>  
> +/* Emit a warning describing what is known about the value of NAME. 
> */
> +extern void __analyzer_dump_named_constant (const char *name);
> +
>  /* Emit a placeholder "note" diagnostic with a path to this call
> site,
>     if the analyzer finds a feasible path to it.  */
>  extern void __analyzer_dump_path (void);
> diff --git a/gcc/testsuite/gcc.dg/analyzer/fd-4.c
> b/gcc/testsuite/gcc.dg/analyzer/fd-4.c
> index 842a26b4364..994bad84342 100644
> --- a/gcc/testsuite/gcc.dg/analyzer/fd-4.c
> +++ b/gcc/testsuite/gcc.dg/analyzer/fd-4.c
> @@ -8,6 +8,7 @@ void close(int fd);
>  int write (int fd, void *buf, int nbytes);
>  int read (int fd, void *buf, int nbytes);
>  
> +#define O_ACCMODE 0xf
>  #define O_RDONLY 0
>  #define O_WRONLY 1
>  #define O_RDWR 2
> diff --git a/gcc/testsuite/gcc.dg/analyzer/fd-access-mode-enum.c
> b/gcc/testsuite/gcc.dg/analyzer/fd-access-mode-enum.c
> new file mode 100644
> index 00000000000..5226569c437
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/analyzer/fd-access-mode-enum.c
> @@ -0,0 +1,60 @@
> +int open(const char *, int mode);
> +void close(int fd);
> +int write (int fd, void *buf, int nbytes);
> +int read (int fd, void *buf, int nbytes);
> +
> +/* Example of these flags as an enum, and with
> +   non-standard values for them.  */
> +
> +enum {
> +      O_RDONLY  = 0x10,
> +      O_WRONLY  = 0x20,
> +      O_RDWR    = 0x40,
> +
> +      O_ACCMODE = 0xf0
> +};
> +
> +void f (int fd) __attribute__((fd_arg(1))); /* { dg-message
> "argument 1 of 'f' must be an open file descriptor, due to
> '__attribute__\\(\\(fd_arg\\(1\\)\\)\\)'" } */
> +
> +void
> +test_1 (const char *path)
> +{
> +    int fd = open (path, O_RDWR);
> +    close(fd);
> +    f(fd); /* { dg-warning "'f' on closed file descriptor 'fd'" } */
> +      /* { dg-message "\\(3\\) 'f' on closed file descriptor 'fd';
> 'close' was at \\(2\\)" "" { target *-*-* } .-1 } */
> +}
> +
> +void g (int fd) __attribute__((fd_arg_read(1))); /* { dg-message
> "argument 1 of 'g' must be a readable file descriptor, due to
> '__attribute__\\(\\(fd_arg_read\\(1\\)\\)\\)'" } */
> +
> +void
> +test_2 (const char *path)
> +{
> +  int fd = open (path, O_WRONLY);
> +  if (fd != -1)
> +  {
> +    g (fd); /* { dg-warning "'g' on write-only file descriptor 'fd'"
> } */
> +  }
> +  close (fd);
> +}
> +
> +void h (int fd) __attribute__((fd_arg_write(1))); /* { dg-message
> "argument 1 of 'h' must be a writable file descriptor, due to
> '__attribute__\\(\\(fd_arg_write\\(1\\)\\)\\)'" } */
> +void
> +test_3 (const char *path)
> +{
> +  int fd = open (path, O_RDONLY);
> +  if (fd != -1)
> +  {
> +    h (fd); /* { dg-warning "'h' on read-only file descriptor 'fd'"
> } */
> +  }
> +  close(fd);
> +}
> +
> +void ff (int fd) __attribute__((fd_arg(1))); /* { dg-message
> "argument 1 of 'ff' must be an open file descriptor, due to
> '__attribute__\\(\\(fd_arg\\(1\\)\\)\\)'" } */
> +
> +void test_4 (const char *path)
> +{
> +  int fd = open (path, O_RDWR);
> +  ff (fd); /* { dg-warning "'ff' on possibly invalid file descriptor
> 'fd'" } */
> +  close(fd);
> +}
> diff --git a/gcc/testsuite/gcc.dg/analyzer/fd-5.c
> b/gcc/testsuite/gcc.dg/analyzer/fd-access-mode-macros.c
> similarity index 98%
> rename from gcc/testsuite/gcc.dg/analyzer/fd-5.c
> rename to gcc/testsuite/gcc.dg/analyzer/fd-access-mode-macros.c
> index c18b2adcbe5..f9a6931a5db 100644
> --- a/gcc/testsuite/gcc.dg/analyzer/fd-5.c
> +++ b/gcc/testsuite/gcc.dg/analyzer/fd-access-mode-macros.c
> @@ -6,6 +6,7 @@ int read (int fd, void *buf, int nbytes);
>  #define O_RDONLY 0
>  #define O_WRONLY 1
>  #define O_RDWR 2
> +#define O_ACCMODE 0x3
>  
>  void f (int fd) __attribute__((fd_arg(1))); /* { dg-message
> "argument 1 of 'f' must be an open file descriptor, due to
> '__attribute__\\(\\(fd_arg\\(1\\)\\)\\)'" } */
>  
> diff --git a/gcc/testsuite/gcc.dg/analyzer/fd-access-mode-target-
> headers.c b/gcc/testsuite/gcc.dg/analyzer/fd-access-mode-target-
> headers.c
> new file mode 100644
> index 00000000000..b76eb667d50
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/analyzer/fd-access-mode-target-headers.c
> @@ -0,0 +1,56 @@
> +#include <sys/stat.h>
> +#include <fcntl.h>
> +#include <unistd.h>
> +#include "analyzer-decls.h"
> +
> +void f (int fd) __attribute__((fd_arg(1))); /* { dg-message
> "argument 1 of 'f' must be an open file descriptor, due to
> '__attribute__\\(\\(fd_arg\\(1\\)\\)\\)'" } */
> +
> +void
> +test_1 (const char *path)
> +{
> +    int fd = open (path, O_RDWR);
> +    close(fd);
> +    f(fd); /* { dg-warning "'f' on closed file descriptor 'fd'" } */
> +      /* { dg-message "\\(3\\) 'f' on closed file descriptor 'fd';
> 'close' was at \\(2\\)" "" { target *-*-* } .-1 } */
> +}
> +
> +void g (int fd) __attribute__((fd_arg_read(1))); /* { dg-message
> "argument 1 of 'g' must be a readable file descriptor, due to
> '__attribute__\\(\\(fd_arg_read\\(1\\)\\)\\)'" } */
> +
> +void
> +test_2 (const char *path)
> +{
> +  int fd = open (path, O_WRONLY);
> +  if (fd != -1)
> +  {
> +    g (fd); /* { dg-warning "'g' on write-only file descriptor 'fd'"
> } */
> +  }
> +  close (fd);
> +}
> +
> +void h (int fd) __attribute__((fd_arg_write(1))); /* { dg-message
> "argument 1 of 'h' must be a writable file descriptor, due to
> '__attribute__\\(\\(fd_arg_write\\(1\\)\\)\\)'" } */
> +void
> +test_3 (const char *path)
> +{
> +  int fd = open (path, O_RDONLY);
> +  if (fd != -1)
> +  {
> +    h (fd); /* { dg-warning "'h' on read-only file descriptor 'fd'"
> } */
> +  }
> +  close(fd);
> +}
> +
> +void ff (int fd) __attribute__((fd_arg(1))); /* { dg-message
> "argument 1 of 'ff' must be an open file descriptor, due to
> '__attribute__\\(\\(fd_arg\\(1\\)\\)\\)'" } */
> +
> +void test_4 (const char *path)
> +{
> +  int fd = open (path, O_RDWR);
> +  ff (fd); /* { dg-warning "'ff' on possibly invalid file descriptor
> 'fd'" } */
> +  close(fd);
> +}
> +
> +void test_sm_fd_constants (void)
> +{
> +  __analyzer_dump_named_constant ("O_ACCMODE"); /* { dg-warning
> "named constant 'O_ACCMODE' has value '\[0-9\]+'" } */
> +  __analyzer_dump_named_constant ("O_RDONLY"); /* { dg-warning
> "named constant 'O_RDONLY' has value '\[0-9\]+'" } */
> +  __analyzer_dump_named_constant ("O_WRONLY"); /* { dg-warning
> "named constant 'O_WRONLY' has value '\[0-9\]+'" } */
> +}
> diff --git a/gcc/testsuite/gcc.dg/analyzer/fd-dup-1.c
> b/gcc/testsuite/gcc.dg/analyzer/fd-dup-1.c
> index b4f43e7f0ef..bb58e9d9d5e 100644
> --- a/gcc/testsuite/gcc.dg/analyzer/fd-dup-1.c
> +++ b/gcc/testsuite/gcc.dg/analyzer/fd-dup-1.c
> @@ -8,6 +8,7 @@ int read (int fd, void *buf, int nbytes);
>  #define O_RDONLY 0
>  #define O_WRONLY 1
>  #define O_RDWR 2
> +#define O_ACCMODE 3
>  
>  void test_1 (const char *path)
>  {
> diff --git a/gcc/testsuite/gcc.dg/analyzer/named-constants-via-enum.c
> b/gcc/testsuite/gcc.dg/analyzer/named-constants-via-enum.c
> new file mode 100644
> index 00000000000..e6b77b8dd18
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/analyzer/named-constants-via-enum.c
> @@ -0,0 +1,20 @@
> +#include "analyzer-decls.h"
> +
> +/* Various constants used by the fd state machine.  */
> +enum {
> +      O_ACCMODE = 42,
> +      O_RDONLY  = 0x1,
> +      O_WRONLY  = 010
> +};
> +
> +void test_sm_fd_constants (void)
> +{
> +  __analyzer_dump_named_constant ("O_ACCMODE"); /* { dg-warning
> "named constant 'O_ACCMODE' has value '42'" } */
> +  __analyzer_dump_named_constant ("O_RDONLY"); /* { dg-warning
> "named constant 'O_RDONLY' has value '1'" } */
> +  __analyzer_dump_named_constant ("O_WRONLY"); /* { dg-warning
> "named constant 'O_WRONLY' has value '8'" } */
> +}
> +
> +void test_unknown (void)
> +{
> +  __analyzer_dump_named_constant ("UNKNOWN"); /* { dg-warning "named
> constant 'UNKNOWN' has unknown value" } */
> +}
> diff --git a/gcc/testsuite/gcc.dg/analyzer/named-constants-via-
> macros-2.c b/gcc/testsuite/gcc.dg/analyzer/named-constants-via-
> macros-2.c
> new file mode 100644
> index 00000000000..9c019e7c5ef
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/analyzer/named-constants-via-macros-2.c
> @@ -0,0 +1,15 @@
> +#include "analyzer-decls.h"
> +
> +/* Various constants used by the fd state machine, as macros
> +   that can't be handled.  */
> +
> +#define O_ACCMODE (
> +#define O_RDONLY  "foo"
> +#define O_WRONLY  int
> +
> +void test_sm_fd_constants (void)
> +{
> +  __analyzer_dump_named_constant ("O_ACCMODE"); /* { dg-warning
> "named constant 'O_ACCMODE' has unknown value" } */
> +  __analyzer_dump_named_constant ("O_RDONLY"); /* { dg-warning
> "named constant 'O_RDONLY' has unknown value" } */
> +  __analyzer_dump_named_constant ("O_WRONLY"); /* { dg-warning
> "named constant 'O_WRONLY' has unknown value" } */
> +}
> diff --git a/gcc/testsuite/gcc.dg/analyzer/named-constants-via-
> macros.c b/gcc/testsuite/gcc.dg/analyzer/named-constants-via-macros.c
> new file mode 100644
> index 00000000000..2022f98e5b6
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/analyzer/named-constants-via-macros.c
> @@ -0,0 +1,19 @@
> +#include "analyzer-decls.h"
> +
> +/* Various constants used by the fd state machine.  */
> +
> +#define O_ACCMODE 42
> +#define O_RDONLY  0x1
> +#define O_WRONLY  010
> +
> +void test_sm_fd_constants (void)
> +{
> +  __analyzer_dump_named_constant ("O_ACCMODE"); /* { dg-warning
> "named constant 'O_ACCMODE' has value '42'" } */
> +  __analyzer_dump_named_constant ("O_RDONLY"); /* { dg-warning
> "named constant 'O_RDONLY' has value '1'" } */
> +  __analyzer_dump_named_constant ("O_WRONLY"); /* { dg-warning
> "named constant 'O_WRONLY' has value '8'" } */
> +}
> +
> +void test_unknown (void)
> +{
> +  __analyzer_dump_named_constant ("UNKNOWN"); /* { dg-warning "named
> constant 'UNKNOWN' has unknown value" } */
> +}


  reply	other threads:[~2022-11-12  3:32 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-10-31 19:07 [PATCH] " David Malcolm
2022-11-08  3:02 ` [PATCH] analyzer: add warnings relating to sockets [PR106140] David Malcolm
2022-11-12  3:27   ` [PATCH v2] " David Malcolm
2022-11-15 19:02     ` David Malcolm
2022-11-08  3:06 ` PING: Re: [PATCH] c, analyzer: support named constants in analyzer [PR106302] David Malcolm
2022-11-12  3:23   ` [PATCH v2] " David Malcolm
2022-11-12  3:32     ` David Malcolm [this message]
2022-11-14 20:42     ` Marek Polacek
2022-11-15 18:35       ` [PATCH v3] " David Malcolm
2022-11-15 18:39         ` Marek Polacek
2022-11-17  3:05           ` [PATCH] c: fix ICE with -fanalyzer and -Wunused-macros [PR107711] David Malcolm
2022-11-17 15:30             ` Marek Polacek

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=8bd04399163661fde948dc052f21ee567b7b9f46.camel@redhat.com \
    --to=dmalcolm@redhat.com \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=joseph@codesourcery.com \
    --cc=polacek@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).