Date: Mon, 14 Nov 2022 15:42:17 -0500
From: Marek Polacek
To: David Malcolm
Cc: gcc-patches@gcc.gnu.org, Joseph Myers
Subject: Re: [PATCH v2] c, analyzer: support named constants in analyzer [PR106302]
References: <20221112032310.2723361-1-dmalcolm@redhat.com>
In-Reply-To: <20221112032310.2723361-1-dmalcolm@redhat.com>

On Fri, Nov 11, 2022 at 10:23:10PM -0500, David Malcolm wrote:
>
> Changes since v1: ported the doc changes from texinfo to sphinx
>
> Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu.
>
> Are the C frontend parts OK for trunk? (I can self-approve the
> analyzer parts)

Sorry for the delay.

> The patch adds an interface for frontends to call into the analyzer as
> the translation unit finishes. The analyzer can then call back into the
> frontend to ask about the values of the named constants it cares about
> whilst the frontend's data structures are still around.
>
> The patch implements this for the C frontend, which looks up the names
> by looking for named CONST_DECLs (which handles enum values). Failing
> that, it attempts to look up the values of macros but only the simplest
> cases are supported (a non-traditional macro with a single CPP_NUMBER
> token). It does this by building a buffer containing the macro
> definition and rerunning a lexer on it.
>
> The analyzer gracefully handles the cases where named values aren't
> found (such as anything more complicated than described above).
>
> The patch ports the analyzer to use this mechanism for "O_RDONLY",
> "O_WRONLY", and "O_ACCMODE". I have successfully tested my socket patch
> to also use this for "SOCK_STREAM" and "SOCK_DGRAM", so the technique
> seems to work.

So this works well for code like

enum __socket_type
{
  SOCK_STREAM = 1,
#define SOCK_STREAM SOCK_STREAM
};

?

> diff --git a/gcc/c/c-parser.cc b/gcc/c/c-parser.cc
> index d70697b1d63..efe19fbe70b 100644
> --- a/gcc/c/c-parser.cc
> +++ b/gcc/c/c-parser.cc
> @@ -72,6 +72,8 @@ along with GCC; see the file COPYING3. If not see
>  #include "memmodel.h"
>  #include "c-family/known-headers.h"
>  #include "bitmap.h"
> +#include "analyzer/analyzer-language.h"
> +#include "toplev.h"
>
>  /* We need to walk over decls with incomplete struct/union/enum types
>     after parsing the whole translation unit.
> @@ -1662,6 +1664,87 @@ static bool c_parser_objc_diagnose_bad_element_prefix
>    (c_parser *, struct c_declspecs *);
>  static location_t c_parser_parse_rtl_body (c_parser *, char *);
>
> +#if ENABLE_ANALYZER
> +
> +namespace ana {
> +
> +/* Concrete implementation of ana::translation_unit for the C frontend. */
> +
> +class c_translation_unit : public translation_unit
> +{
> +public:
> +  /* Implementation of translation_unit::lookup_constant_by_id for use by the
> +     analyzer to look up named constants in the user's source code. */
> +  tree lookup_constant_by_id (tree id) const final override
> +  {
> +    /* Consider decls. */
> +    if (tree decl = lookup_name (id))
> +      if (TREE_CODE (decl) == CONST_DECL)
> +        if (tree value = DECL_INITIAL (decl))
> +          if (TREE_CODE (value) == INTEGER_CST)
> +            return value;
> +
> +    /* Consider macros. */
> +    cpp_hashnode *hashnode = C_CPP_HASHNODE (id);
> +    if (cpp_macro_p (hashnode))
> +      if (tree value = consider_macro (hashnode->value.macro))
> +        return value;
> +
> +    return NULL_TREE;
> +  }
> +
> +private:
> +  /* Attempt to get an INTEGER_CST from MACRO.
> +     Only handle the simplest cases: where MACRO's definition is a single
> +     token containing a number, by lexing the number again.
> +     This will handle e.g.
> +       #define NAME 42
> +     and other bases but not negative numbers, parentheses or e.g.
> +       #define NAME 1 << 7
> +     as doing so would require a parser. */
> +  tree consider_macro (cpp_macro *macro) const
> +  {
> +    if (macro->paramc > 0)
> +      return NULL_TREE;
> +    if (macro->kind == cmk_traditional)

Do you really want to handle cmk_assert? I'd say you want

  if (macro->kind != cmk_macro)

> +      return NULL_TREE;
> +    if (macro->count != 1)
> +      return NULL_TREE;
> +    const cpp_token &tok = macro->exp.tokens[0];
> +    if (tok.type != CPP_NUMBER)
> +      return NULL_TREE;
> +
> +    cpp_reader *old_parse_in = parse_in;
> +    parse_in = cpp_create_reader (c_dialect_cxx () ? CLK_GNUCXX: CLK_GNUC89,
> +                                  ident_hash, line_table);

Why not always CLK_GNUC89 since we're in the C FE?

> +
> +    pretty_printer pp;
> +    pp_string (&pp, (const char *)tok.val.str.text);

A space after ')'.

> +    pp_newline (&pp);
> +    cpp_push_buffer (parse_in,
> +                     (const unsigned char *)pp_formatted_text (&pp),

Likewise.

> +                     strlen (pp_formatted_text (&pp)),
> +                     0);
> +
> +    tree value;
> +    location_t loc;
> +    unsigned char cpp_flags;
> +    c_lex_with_flags (&value, &loc, &cpp_flags, 0);
> +
> +    cpp_destroy (parse_in);
> +    parse_in = old_parse_in;
> +
> +    if (value && TREE_CODE (value) == INTEGER_CST)
> +      return value;
> +
> +    return NULL_TREE;
> +  }
> +};
> +
> +} // namespace ana
> +
> +#endif /* #if ENABLE_ANALYZER */
> +
>  /* Parse a translation unit (C90 6.7, C99 6.9, C11 6.9).
>
>     translation-unit:
> @@ -1722,6 +1805,14 @@ c_parser_translation_unit (c_parser *parser)
>                             "#pragma omp begin assumes", "#pragma omp end assumes");
>        current_omp_begin_assumes = 0;
>      }
> +
> +#if ENABLE_ANALYZER
> +  if (flag_analyzer)
> +    {
> +      ana::c_translation_unit tu;
> +      ana::on_finish_translation_unit (tu);
> +    }
> +#endif
>  }

Marek
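
For reference, the lookup order in the quoted patch (decls first, then only the simplest macros) can be modelled as a small standalone C++ sketch. This is not GCC code. Every type and name below (toy_translation_unit, toy_macro, token, object_like, make_example, run_example) is a hypothetical stand-in, std::strtol replaces the real lexer, and the macro values are illustrative only. The macro filter uses the "kind != macro" form suggested in the review above. Under those assumptions, the enum-plus-self-referential-#define case resolves through the decl path, so the macro path is never consulted for it.

```cpp
// Toy model only: none of these types are GCC's.  They are hypothetical
// stand-ins for tree, cpp_macro and cpp_token.
#include <cassert>
#include <cstdlib>
#include <map>
#include <optional>
#include <string>
#include <utility>
#include <vector>

enum class macro_kind { macro, assert_directive, traditional };
enum class token_type { number, name, other };

struct token
{
  token_type type;
  std::string text;
};

struct toy_macro
{
  macro_kind kind = macro_kind::macro;
  unsigned paramc = 0;         // 0 means object-like
  std::vector<token> tokens;   // replacement list
};

struct toy_translation_unit
{
  std::map<std::string, long> enum_constants;  // models CONST_DECLs
  std::map<std::string, toy_macro> macros;     // models the macro table

  // Accept only an object-like, ordinary macro made of one number token.
  std::optional<long> consider_macro (const toy_macro &m) const
  {
    if (m.paramc > 0)
      return std::nullopt;
    if (m.kind != macro_kind::macro)   // form suggested in the review
      return std::nullopt;
    if (m.tokens.size () != 1)
      return std::nullopt;
    const token &tok = m.tokens[0];
    if (tok.type != token_type::number)
      return std::nullopt;
    // Assumption: strtol with base 0 (decimal, 0x hex, leading-0 octal)
    // stands in for re-lexing; suffixes such as "u" are rejected here.
    char *end = nullptr;
    long value = std::strtol (tok.text.c_str (), &end, 0);
    if (end == tok.text.c_str () || *end != '\0')
      return std::nullopt;
    return value;
  }

  // Decls are consulted first; macros only if no decl matches.
  std::optional<long> lookup_constant_by_id (const std::string &id) const
  {
    auto decl = enum_constants.find (id);
    if (decl != enum_constants.end ())
      return decl->second;
    auto macro = macros.find (id);
    if (macro != macros.end ())
      return consider_macro (macro->second);
    return std::nullopt;
  }
};

// Build an object-like macro from its replacement tokens.
toy_macro object_like (std::vector<token> toks)
{
  toy_macro m;
  m.tokens = std::move (toks);
  return m;
}

toy_translation_unit make_example ()
{
  toy_translation_unit tu;
  // enum __socket_type { SOCK_STREAM = 1, ... } plus the self-referential
  // #define SOCK_STREAM SOCK_STREAM from the question in the review.
  tu.enum_constants["SOCK_STREAM"] = 1;
  tu.macros["SOCK_STREAM"] = object_like ({{token_type::name, "SOCK_STREAM"}});
  tu.macros["O_ACCMODE"] = object_like ({{token_type::number, "0003"}});
  tu.macros["SHIFTED"] = object_like ({{token_type::number, "1"},
                                       {token_type::other, "<<"},
                                       {token_type::number, "7"}});
  toy_macro asserted = object_like ({{token_type::number, "42"}});
  asserted.kind = macro_kind::assert_directive;
  tu.macros["ASSERTED"] = asserted;
  return tu;
}

void run_example ()
{
  toy_translation_unit tu = make_example ();
  assert (*tu.lookup_constant_by_id ("SOCK_STREAM") == 1);   // via the enum
  assert (*tu.lookup_constant_by_id ("O_ACCMODE") == 3);     // octal literal
  assert (!tu.lookup_constant_by_id ("SHIFTED"));            // needs a parser
  assert (!tu.lookup_constant_by_id ("ASSERTED"));           // not a plain macro
}
```

Calling run_example () from a main function exercises the four cases. The sketch says nothing about how the real lookup_name or c_lex_with_flags behave; it only mirrors the control flow visible in the quoted hunk.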