From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by sourceware.org (Postfix) with ESMTPS id 6B8B73857BBF for ; Fri, 15 Jul 2022 16:58:01 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 6B8B73857BBF Received: from mail-qt1-f198.google.com (mail-qt1-f198.google.com [209.85.160.198]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-636-LuIM3E8dPdSeD2plQD5nbA-1; Fri, 15 Jul 2022 12:58:00 -0400 X-MC-Unique: LuIM3E8dPdSeD2plQD5nbA-1 Received: by mail-qt1-f198.google.com with SMTP id t9-20020ac85309000000b0031ee055ad11so718191qtn.12 for ; Fri, 15 Jul 2022 09:58:00 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:message-id:subject:from:to:cc:date:in-reply-to :references:user-agent:mime-version:content-transfer-encoding; bh=FBeCyJhLOwxiZUSh3lwpFlvt8DuDEFgDPePUqyw1JKk=; b=T7kp/LBiKhRPZ5mW25fDzed5zLrTTmED0zos0OxNgzhCiyv/kbAKiKliy7w/WX3GmG AzC8I1z1DwK0SuOXXKoSbVAZWWJCAGOQNtE4o2nremsFNxraw94r3+EkDvfFVUz3kkJw ewecjLve4UJ4aT4yieF/u4vdMwPUuIC6tRnFQ6jaDiOKaVAv7qyAp/IqtCSSwd4UzCMx iCGh4oI0ifqNMJM7zl+7QEHRy8kLxqxUE+8b+w9LmOUIyXx7Xivvnv+PcF0yo7cRsqyc o8BqdUjmzM0wox99r5nb3+gbOXPueCvRzDIU1+5TX0Nhzmhg9WwUTF3GTm5gtFoPH5lc NQHQ== X-Gm-Message-State: AJIora96LalDy8Byo7cg9x7Db6B6NLPweL5gN2UnEDQ3UdgtCWzVte5+ VJdMFaunvi/rwVO2P4tCPtOnzhSyWH5WPEfSCAUpk2Neo6dYP6up1vvWMLxkiowNCMxiB+L/7v9 BXLH82/Q2vXCwgcn/Gw== X-Received: by 2002:ae9:de82:0:b0:6b5:90dc:a7df with SMTP id s124-20020ae9de82000000b006b590dca7dfmr10308600qkf.615.1657904279347; Fri, 15 Jul 2022 09:57:59 -0700 (PDT) X-Google-Smtp-Source: AGRyM1tG8dKJ14AbK0igPbXeDuZZk3RKmFbAl+hQxLr/ulQZdmIpXquyAYxuBxGk2D23rND61dN9iQ== X-Received: by 2002:ae9:de82:0:b0:6b5:90dc:a7df with SMTP id s124-20020ae9de82000000b006b590dca7dfmr10308470qkf.615.1657904276563; Fri, 15 Jul 2022 09:57:56 -0700 (PDT) Received: from t14s.localdomain (c-73-69-212-193.hsd1.nh.comcast.net. [73.69.212.193]) by smtp.gmail.com with ESMTPSA id c1-20020ac84e01000000b0031ea1ad6c5asm3702509qtw.75.2022.07.15.09.57.55 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 15 Jul 2022 09:57:55 -0700 (PDT) Message-ID: <77393168bbd011cdc3a708d96f97f8f421155a3d.camel@redhat.com> Subject: Re: [PATCH] Adding three new function attributes for static analysis of file descriptors From: David Malcolm To: mirimnan017@gmail.com, gcc-patches@gcc.gnu.org Cc: Immad Mir Date: Fri, 15 Jul 2022 12:57:54 -0400 In-Reply-To: References: User-Agent: Evolution 3.38.4 (3.38.4-1.fc33) MIME-Version: 1.0 X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: 7bit X-Spam-Status: No, score=-12.2 required=5.0 tests=BAYES_00, DKIMWL_WL_HIGH, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_NONE, TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 15 Jul 2022 16:58:04 -0000 On Fri, 2022-07-15 at 21:08 +0530, Immad Mir wrote: Thanks for the patch. Various review comments: The patch is missing a ChangeLog. > --- > gcc/analyzer/sm-fd.cc | 257 ++++++++++++++++++++++++--- > gcc/c-family/c-attribs.cc | 115 ++++++++++++ > gcc/doc/extend.texi | 19 ++ > gcc/testsuite/gcc.dg/analyzer/fd-5.c | 53 ++++++ > gcc/testsuite/gcc.dg/analyzer/fd-6.c | 14 ++ > 5 files changed, 431 insertions(+), 27 deletions(-) > create mode 100644 gcc/testsuite/gcc.dg/analyzer/fd-5.c > create mode 100644 gcc/testsuite/gcc.dg/analyzer/fd-6.c > > diff --git a/gcc/analyzer/sm-fd.cc b/gcc/analyzer/sm-fd.cc > index 8e4300b06e2..20018dfd12b 100644 > --- a/gcc/analyzer/sm-fd.cc > +++ b/gcc/analyzer/sm-fd.cc [...snip...] > +enum fd_access_direction > +{ > + DIR_READ_WRITE, > + DIR_READ, > + DIR_WRITE > +}; Don't we already have DIR_READ and DIR_WRITE in enum access_direction in analyzer.h? I wonder why this isn't an error due to the naming collision? The new enum refers to a *set* of valid access directions, so maybe rename the new enum to "enum access_directions" (note the plural), and rename the elements from DIR_ to DIRS_. Please carefully check all usage of DIR_* in sm-fd.cc. Maybe instead we should convert the values in enum access_direction from indexes into bitfield masks. [...snip...] > @@ -317,19 +335,27 @@ public: > bool > emit (rich_location *rich_loc) final override > { > + bool warned; > switch (m_fd_dir) > { > case DIR_READ: > - return warning_at (rich_loc, get_controlling_option (), > + warned = warning_at (rich_loc, get_controlling_option (), > "%qE on % file descriptor %qE", > m_callee_fndecl, m_arg); > + break; > case DIR_WRITE: > - return warning_at (rich_loc, get_controlling_option (), > + warned = warning_at (rich_loc, get_controlling_option (), > "%qE on % file descriptor %qE", > m_callee_fndecl, m_arg); > + break; ...i.e. which DIR_READ/DIR_WRITE values are these cases using? > default: > gcc_unreachable (); > } > + if (warned && m_attr) > + { > + m_sm.inform_filedescriptor_attribute (m_callee_fndecl, m_arg_idx, m_fd_dir); > + } Redundant braces here ^^^ > + return warned; > } > > bool > @@ -359,8 +385,10 @@ public: > } > > private: > - enum access_direction m_fd_dir; > + enum fd_access_direction m_fd_dir; > const tree m_callee_fndecl; > + bool m_attr; > + int m_arg_idx; > }; Most (all?) of the concrete subclasses seem to be adding the two fields m_attr and m_arg_idx, so can you add them to class fd_diagnostic, or introduce a common subclass, rather than repeating them in each subclass? I suspect the code would be simpler if inform_filedescriptor_attribute were a method of fd_diagnostic, rather than of the state_machine: you could move the (&& m_attr) part of the conditional into there, rather than having every emit vfunc do the same test, and having to pass the same fields to it each time (except perhaps the access dir?). [...snip...] > @@ -466,25 +510,29 @@ public: > describe_final_event (const evdesc::final_event &ev) final override > { > if (m_first_close_event.known_p ()) > - return ev.formatted_print ( > - "%qE on closed file descriptor %qE; %qs was at %@", m_callee_fndecl, > - m_arg, "close", &m_first_close_event); > - else > - return ev.formatted_print ("%qE on closed file descriptor %qE", > - m_callee_fndecl, m_arg); > + return ev.formatted_print ( > + "%qE on closed file descriptor %qE; %qs was at %@", m_callee_fndecl, > + m_arg, "close", &m_first_close_event); > + else > + return ev.formatted_print ("%qE on closed file descriptor %qE", > + m_callee_fndecl, m_arg); What changed here? Is this just whitespace changes? [...snip...] > fd_state_machine::fd_state_machine (logger *logger) > : state_machine ("file-descriptor", logger), > m_constant_fd (add_state ("fd-constant")), > @@ -647,11 +733,126 @@ fd_state_machine::on_stmt (sm_context *sm_ctxt, const supernode *node, > on_read (sm_ctxt, node, stmt, call, callee_fndecl); > return true; > } // "read" > + > + > + { > + // Handle __attribute__((fd_arg)) > + bitmap argmap = get_fd_attrs ("fd_arg", callee_fndecl); > + if (argmap) > + check_for_fd_attrs (sm_ctxt, node, stmt, call, callee_fndecl, argmap, DIR_READ_WRITE); > + > + // Handle __attribute__((fd_arg_read)) > + bitmap read_argmap = get_fd_attrs ("fd_arg_read", callee_fndecl); > + if(read_argmap) > + check_for_fd_attrs (sm_ctxt, node, stmt, call, callee_fndecl, read_argmap, DIR_READ); > + > + // Handle __attribute__((fd_arg_write)) > + bitmap write_argmap = get_fd_attrs ("fd_arg_write", callee_fndecl); > + if (write_argmap) > + check_for_fd_attrs (sm_ctxt, node, stmt, call, callee_fndecl, write_argmap, DIR_WRITE); I think there are three memory leaks here, anytime the bitmap is non- null you need a BITMAP_FREE (foo_argmap); on it. Rather than repeating that logic three times, it's probably much cleaner to combine get_fd_attrs and check_for_fd_attrs into a single function that: - builds the bitmap - bails out early for the common case where there's no attribute - use the bitmap - frees it, or, better, uses auto_bitmap rather than bitmap so that the cleanup happens implicitly Then the above logic could read: handle_any_fd_attrs (callee_fndecl, sm_ctxt, node, stmt, call, "fd_arg", DIR_READ_WRITE); handle_any_fd_attrs (callee_fndecl, sm_ctxt, node, stmt, call, "fd_arg_read", DIR_READ); handle_any_fd_attrs (callee_fndecl, sm_ctxt, node, stmt, call, "fd_arg_write", DIR_WRITE); or similar. > + > + return true; > + } > + > } > > return false; > } [...snip...] > +bitmap > +fd_state_machine::get_fd_attrs (const char *attr_name, tree callee_fndecl) const > +{ > + bitmap argmap = NULL; > + tree attrs = TYPE_ATTRIBUTES (TREE_TYPE (callee_fndecl)); > + attrs = lookup_attribute (attr_name, attrs); > + if (!attrs) > + return argmap; > + > + if (!TREE_VALUE (attrs)) > + return argmap; > + > + argmap = BITMAP_ALLOC (NULL); > + > + for (tree idx = TREE_VALUE (attrs); idx; idx = TREE_CHAIN (idx)) > + { > + unsigned int val = TREE_INT_CST_LOW (TREE_VALUE (idx)) - 1; We need to be sure that the attribute-validation logic in c-attribs.cc rejects argument numbers that aren't >= 1. I think the above will crash if the user supplies an argument number that's 0, negative, or not an integer. I *think* positional_argument checks for all this and will reject such attributes, but let's have coverage for each of these in the test suite. > + bitmap_set_bit (argmap, val); > + } > + return argmap; > +} > + > + > void > fd_state_machine::on_open (sm_context *sm_ctxt, const supernode *node, > const gimple *stmt, const gcall *call) const > @@ -729,7 +930,7 @@ void > fd_state_machine::check_for_open_fd ( > sm_context *sm_ctxt, const supernode *node, const gimple *stmt, > const gcall *call, const tree callee_fndecl, > - enum access_direction callee_fndecl_dir) const > + enum fd_access_direction callee_fndecl_dir) const > { > tree arg = gimple_call_arg (call, 0); > tree diag_arg = sm_ctxt->get_diagnostic_tree (arg); > @@ -738,7 +939,7 @@ fd_state_machine::check_for_open_fd ( > if (is_closed_fd_p (state)) > { > sm_ctxt->warn (node, stmt, arg, > - new fd_use_after_close (*this, diag_arg, callee_fndecl)); > + new fd_use_after_close (*this, diag_arg, callee_fndecl, false, -1)); I'd prefer overloaded constructors for these fd_diagnostic subclasses, one for the for the hardcoded case, another for the attribute case, so that the -1 for the "this is meaningless" attribute argno is only in the fd_diagnostic constructor. [...snip...] > diff --git a/gcc/c-family/c-attribs.cc b/gcc/c-family/c-attribs.cc [...snip...] > +/* Handle the "fd_arg" attribute */ > + > +static tree > +handle_fd_arg_attribute (tree *node, tree name, tree args, > + int ARG_UNUSED (flags), bool *no_add_attrs) > +{ Am I right in thinking that all three of the handle_fd_arg*_attribute callback functions are identical? If so, let's just have a single callback shared by all three entries in the array. > + tree type = *node; > + if (!args) > + { > + if (!prototype_p (type)) > + { > + error ("%qE attribute without arguments on a non-prototype", name); > + *no_add_attrs = true; > + } > + return NULL_TREE; > + } > + > + for (int i = 1; args; ++i) > + { > + tree pos = TREE_VALUE (args); > + tree next = TREE_CHAIN (args); > + if (tree val = positional_argument (type, name, pos, INTEGER_TYPE, > + next || i > 1 ? i : 0)) > + TREE_VALUE (args) = val; > + else > + { > + *no_add_attrs = true; > + break; > + } > + args = next; > + } Looks like you've copied and pasted the above loop from handle_alloc_size_attribute, which can take multiple args. The new attributes only take a single argument, so I think handle_alloc_align_attribute is a better model. > + return NULL_TREE; > +} [...snip...] > diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi > index dfbe33ac652..eba86e9f7ef 100644 > --- a/gcc/doc/extend.texi > +++ b/gcc/doc/extend.texi > @@ -3007,6 +3007,25 @@ produced by @command{gold}. > For other linkers that cannot generate resolution file, > explicit @code{externally_visible} attributes are still necessary. > > +@item fd_arg (@Var{N}, @Var{...}) The @var directives should have lower-case "v". You should add a @cindex line below the @item for each of these; see the existing function attributes for how to do that. > + > +The @Var{fd_arg} attribute may be applied to a function that takes an open file @Var should be @code here. > +descriptor at referenced argument @Var{N}. It indicates that the passed file > +descriptor must not have been closed. Please spell out which analyzer warnings are affected by this attribute. > + > +@item fd_arg_read (@Var{N}, @Var{...}) Aha: you're supporting 1 or more multiple arguments, rather than just 1? I don't think we need this, almost every API I can think of takes a single FD (or is a special-case, like dup2). So for simplicity, let's just support a single argument per attribute; if there's a rare case where an API takes two fd_arg_read, the user can simply supply the attribute twice. > + > +The @Var{fd_arg_read} attribute may be applied to function that takes an open file > +descriptor at referenced argument @Var{N} that it might read from. It > +indicates that the passed file descriptor must not have been closed or > +not opened as write-only. It's probably simplest to document this as: "This attribute is identical to @code{fd_arg}, but with the additional requirement that..." or somesuch. > + > +@item fd_arg_write (@Var{N}, @Var{...}) > + > +The @Var{fd_arg_write} attribute may be applied to a function that takes an open > +file descriptor at referenced argument @Var{N} that it might write to. It indicates > +that the passed file descriptor must not have been closed or not opened as read-only. Likewise. [...snip...] > diff --git a/gcc/testsuite/gcc.dg/analyzer/fd-6.c b/gcc/testsuite/gcc.dg/analyzer/fd-6.c > new file mode 100644 > index 00000000000..7efb9e69b58 > --- /dev/null > +++ b/gcc/testsuite/gcc.dg/analyzer/fd-6.c > @@ -0,0 +1,14 @@ > + > +int not_a_fn __attribute__ ((fd_arg(1))); /* { dg-warning "'fd_arg' attribute only applies to function types" } */ > + > +void f (char *p) __attribute__ ((fd_arg(1))); /* { dg-warning "'fd_arg' attribute argument value '1' refers to parameter type 'char \\\*'" } */ > + > + > +int not_a_fn_b __attribute__ ((fd_arg_read(1))); /* { dg-warning "'fd_arg_read' attribute only applies to function types" } */ > + > +void g (char *p) __attribute__ ((fd_arg_read(1))); /* { dg-warning "'fd_arg_read' attribute argument value '1' refers to parameter type 'char \\\*'" } */ > + > + > +int not_a_fn_c __attribute__ ((fd_arg_write(1))); /* { dg-warning "'fd_arg_write' attribute only applies to function types" } */ > + > +void f (char *p) __attribute__ ((fd_arg_write(1))); /* { dg-warning "'fd_arg_write' attribute argument value '1' refers to parameter type 'char \\\*'" } */ > \ No newline at end of file The attribute-parsing code isn't analyzer-specific, so move this test file to c-c++-common/attr-fd.c Hope the above makes sense; thanks again for the patch Dave