From: Marek Polacek <polacek@redhat.com>
To: David Malcolm <dmalcolm@redhat.com>
Cc: gcc-patches@gcc.gnu.org, Joseph Myers <joseph@codesourcery.com>
Subject: Re: [PATCH v2] c, analyzer: support named constants in analyzer [PR106302]
Date: Mon, 14 Nov 2022 15:42:17 -0500 [thread overview]
Message-ID: <Y3KoKQLnBQKmIt8N@redhat.com> (raw)
In-Reply-To: <20221112032310.2723361-1-dmalcolm@redhat.com>
On Fri, Nov 11, 2022 at 10:23:10PM -0500, David Malcolm wrote:
> Changes since v1: ported the doc changes from texinfo to sphinx
>
> Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu.
>
> Are the C frontend parts OK for trunk? (I can self-approve the
> analyzer parts)
Sorry for the delay.
> The patch adds an interface for frontends to call into the analyzer as
> the translation unit finishes. The analyzer can then call back into the
> frontend to ask about the values of the named constants it cares about
> whilst the frontend's data structures are still around.
>
> The patch implements this for the C frontend, which looks up the names
> by looking for named CONST_DECLs (which handles enum values). Failing
> that, it attempts to look up the values of macros but only the simplest
> cases are supported (a non-traditional macro with a single CPP_NUMBER
> token). It does this by building a buffer containing the macro
> definition and rerunning a lexer on it.
>
> The analyzer gracefully handles the cases where named values aren't
> found (such as anything more complicated than described above).
>
> The patch ports the analyzer to use this mechanism for "O_RDONLY",
> "O_WRONLY", and "O_ACCMODE". I have successfully tested my socket patch
> to also use this for "SOCK_STREAM" and "SOCK_DGRAM", so the technique
> seems to work.
So this works well for code like
enum __socket_type {
SOCK_STREAM = 1,
#define SOCK_STREAM SOCK_STREAM
};
?
> diff --git a/gcc/c/c-parser.cc b/gcc/c/c-parser.cc
> index d70697b1d63..efe19fbe70b 100644
> --- a/gcc/c/c-parser.cc
> +++ b/gcc/c/c-parser.cc
> @@ -72,6 +72,8 @@ along with GCC; see the file COPYING3. If not see
> #include "memmodel.h"
> #include "c-family/known-headers.h"
> #include "bitmap.h"
> +#include "analyzer/analyzer-language.h"
> +#include "toplev.h"
>
> /* We need to walk over decls with incomplete struct/union/enum types
> after parsing the whole translation unit.
> @@ -1662,6 +1664,87 @@ static bool c_parser_objc_diagnose_bad_element_prefix
> (c_parser *, struct c_declspecs *);
> static location_t c_parser_parse_rtl_body (c_parser *, char *);
>
> +#if ENABLE_ANALYZER
> +
> +namespace ana {
> +
> +/* Concrete implementation of ana::translation_unit for the C frontend. */
> +
> +class c_translation_unit : public translation_unit
> +{
> +public:
> + /* Implementation of translation_unit::lookup_constant_by_id for use by the
> + analyzer to look up named constants in the user's source code. */
> + tree lookup_constant_by_id (tree id) const final override
> + {
> + /* Consider decls. */
> + if (tree decl = lookup_name (id))
> + if (TREE_CODE (decl) == CONST_DECL)
> + if (tree value = DECL_INITIAL (decl))
> + if (TREE_CODE (value) == INTEGER_CST)
> + return value;
> +
> + /* Consider macros. */
> + cpp_hashnode *hashnode = C_CPP_HASHNODE (id);
> + if (cpp_macro_p (hashnode))
> + if (tree value = consider_macro (hashnode->value.macro))
> + return value;
> +
> + return NULL_TREE;
> + }
> +
> +private:
> + /* Attempt to get an INTEGER_CST from MACRO.
> + Only handle the simplest cases: where MACRO's definition is a single
> + token containing a number, by lexing the number again.
> + This will handle e.g.
> + #define NAME 42
> + and other bases but not negative numbers, parentheses or e.g.
> + #define NAME 1 << 7
> + as doing so would require a parser. */
> + tree consider_macro (cpp_macro *macro) const
> + {
> + if (macro->paramc > 0)
> + return NULL_TREE;
> + if (macro->kind == cmk_traditional)
Do you really want to handle cmk_assert? I'd say you want
if (macro->kind != cmk_macro)
> + return NULL_TREE;
> + if (macro->count != 1)
> + return NULL_TREE;
> + const cpp_token &tok = macro->exp.tokens[0];
> + if (tok.type != CPP_NUMBER)
> + return NULL_TREE;
> +
> + cpp_reader *old_parse_in = parse_in;
> + parse_in = cpp_create_reader (c_dialect_cxx () ? CLK_GNUCXX: CLK_GNUC89,
> + ident_hash, line_table);
Why not always CLK_GNUC89 since we're in the C FE?
> +
> + pretty_printer pp;
> + pp_string (&pp, (const char *)tok.val.str.text);
A space after ')'.
> + pp_newline (&pp);
> + cpp_push_buffer (parse_in,
> + (const unsigned char *)pp_formatted_text (&pp),
Likewise.
> + strlen (pp_formatted_text (&pp)),
> + 0);
> +
> + tree value;
> + location_t loc;
> + unsigned char cpp_flags;
> + c_lex_with_flags (&value, &loc, &cpp_flags, 0);
> +
> + cpp_destroy (parse_in);
> + parse_in = old_parse_in;
> +
> + if (value && TREE_CODE (value) == INTEGER_CST)
> + return value;
> +
> + return NULL_TREE;
> + }
> +};
> +
> +} // namespace ana
> +
> +#endif /* #if ENABLE_ANALYZER */
> +
> /* Parse a translation unit (C90 6.7, C99 6.9, C11 6.9).
>
> translation-unit:
> @@ -1722,6 +1805,14 @@ c_parser_translation_unit (c_parser *parser)
> "#pragma omp begin assumes", "#pragma omp end assumes");
> current_omp_begin_assumes = 0;
> }
> +
> +#if ENABLE_ANALYZER
> + if (flag_analyzer)
> + {
> + ana::c_translation_unit tu;
> + ana::on_finish_translation_unit (tu);
> + }
> +#endif
> }
Marek
next prev parent reply other threads:[~2022-11-14 20:42 UTC|newest]
Thread overview: 12+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-10-31 19:07 [PATCH] " David Malcolm
2022-11-08 3:02 ` [PATCH] analyzer: add warnings relating to sockets [PR106140] David Malcolm
2022-11-12 3:27 ` [PATCH v2] " David Malcolm
2022-11-15 19:02 ` David Malcolm
2022-11-08 3:06 ` PING: Re: [PATCH] c, analyzer: support named constants in analyzer [PR106302] David Malcolm
2022-11-12 3:23 ` [PATCH v2] " David Malcolm
2022-11-12 3:32 ` David Malcolm
2022-11-14 20:42 ` Marek Polacek [this message]
2022-11-15 18:35 ` [PATCH v3] " David Malcolm
2022-11-15 18:39 ` Marek Polacek
2022-11-17 3:05 ` [PATCH] c: fix ICE with -fanalyzer and -Wunused-macros [PR107711] David Malcolm
2022-11-17 15:30 ` Marek Polacek
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=Y3KoKQLnBQKmIt8N@redhat.com \
--to=polacek@redhat.com \
--cc=dmalcolm@redhat.com \
--cc=gcc-patches@gcc.gnu.org \
--cc=joseph@codesourcery.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).