public inbox for gcc@gcc.gnu.org
 help / color / mirror / Atom feed
From: Eric Feng <ef2648@columbia.edu>
To: David Malcolm <dmalcolm@redhat.com>
Cc: gcc@gcc.gnu.org
Subject: Re: Update on CPython Extension Module -fanalyzer plugin development
Date: Wed, 16 Aug 2023 21:47:41 -0400	[thread overview]
Message-ID: <CANGHATWtt60xdShpk=U44Kh_ex8fF4tY=hpSzAb0sbz65GPL8w@mail.gmail.com> (raw)
In-Reply-To: <755975d612a449027a1f605d950161ef9b62a72f.camel@redhat.com>

Hi Dave,

Thanks for the feedback!


On Wed, Aug 16, 2023 at 5:29 PM David Malcolm <dmalcolm@redhat.com> wrote:
>
> On Wed, 2023-08-16 at 15:17 -0400, Eric Feng via Gcc wrote:
> > Hi everyone,
>
> [fixing typo in my email address]
>
> Hi Eric, thanks for the update, and the WIP patch.
>
> >
> > After pushing the code that supports various known function classes last week,
> > I've turned my attention back to the core reference count checking
> > functionality. This functionality used to reside in region_model, which
> > wasn't ideal. To address this, I've introduced a hook to register callbacks
> > to pop_frame. Specifically, this allows the code that checks the reference
> > count and emits diagnostics to be housed within the plugin, rather than the
> > core analyzer.
> >
> > As of now, the parameters of pop_frame_callback are tailored specifically to
> > our needs. If the use of callbacks at the end of pop_frame becomes more
> > prevalent, we can revisit the setup to potentially make it more general.
> >
> > Moreover, the core reference count checking logic was previously somewhat
> > bloated, contained in one extensive function. I've since refactored it,
> > breaking it down into several helper functions to simplify and reduce
> > complexity. There are still some aspects that need refinement, especially
> > since the plugin has seen changes since I last worked on this logic. However,
> > I believe that there aren't any significant problems.
>
> Suggestion: introduce some more decls into analyzer-decls.h and
> known_functions for them into the plugin so that you can run/test/debug
> the helper functions independently (similar to the existing ones in kf-
> analyzer.cc).
>
> e.g.
>   extern void __analyzer_cpython_dump_real_refcounts (void);
>   extern void __analyzer_cpython_dump_ob_refcnt (void);
>
> >
Thanks for the suggestion. This will be even more helpful now that we
have split the logic into helper functions. I will look into these
when I come back to the "is there a refcount problem" side of the
equation.
> > Currently, I've started working a custom stmt_finder similar to leak_stmt_finder
> > to address the issue of m_stmt and m_stmt_finder being NULL at the time of
> > region_model::pop_frame. This approach was discussed as a viable solution in
> > a previous email, and I'll keep everyone posted on my progress. Afterwards, I
> > will go back to address the refinements necessary mentioned above.
>
> You might want to experiment with splitting out
> (a) "is there a refcount problem" from
> (b) "emit a refcount problem".
>
> For example, you could hardcode (a) to true, so we always complain with
> (b) on every heap-allocated object, just to debug the stmt_finder
> workaround.
>
>
> [...snip...]
>
> BTW, you don't need to bother to write ChangeLog entries if you're just
> sending a work-in-progress for me.
>
> > diff --git a/gcc/testsuite/gcc.dg/plugin/analyzer_cpython_plugin.c b/gcc/testsuite/gcc.dg/plugin/analyzer_cpython_plugin.c
> > index 7cd72e8a886..918bb5a5587 100644
> > --- a/gcc/testsuite/gcc.dg/plugin/analyzer_cpython_plugin.c
> > +++ b/gcc/testsuite/gcc.dg/plugin/analyzer_cpython_plugin.c
>
> [...]
>
> > +/* For PyListObjects: processes the ob_item field within the current region and
> > + * increments the reference count if conditions are met. */
> > +void
> > +process_ob_item_region (const region_model *model, region_model_manager *mgr,
> > +                       region_model_context *ctxt, const region *curr_region,
> > +                       const svalue *pylist_type_ptr, const region *base_reg,
> > +                       int &actual_refcnt)
>
> You seem to be special-casing PyListObject here; why?  That seems like
> it's not going to be scalable tothe general case.
>
> Am I right in thinking the intent of this code is to count the actual
> number of pointers in memory that point to a particular region?
>
> Doesn't the ob_item buffer show up in the store as another cluster?
> Can't you just look at the bindings in the clusters and tally up the
> pointers for each heap_allocated_region? (accumulating a result map
> from region to int of the actual reference counts).
Yes, you're correct, I hadn't thought of that ... This simplifies a
lot of things. Thanks for the great suggestion!
>
> Or am I missing something?
>
> What does
>   model->debug ();
> show in your examples?
>
>
> > +{
> > +  tree ob_item_field_tree = get_field_by_name (pylistobj_record, "ob_item");
> > +  const region *ob_item_field_reg
> > +      = mgr->get_field_region (curr_region, ob_item_field_tree);
> > +  const svalue *ob_item_ptr = model->get_store_value (ob_item_field_reg, ctxt);
> > +
> > +  if (const auto &cast_ob_item_reg = ob_item_ptr->dyn_cast_region_svalue ())
> > +    {
> > +      const region *ob_item_reg = cast_ob_item_reg->get_pointee ();
> > +      const svalue *allocated_bytes = model->get_dynamic_extents (ob_item_reg);
> > +      const region *ob_item_sized = mgr->get_sized_region (
> > +         ob_item_reg, pyobj_ptr_ptr, allocated_bytes);
> > +      const svalue *buffer_contents_sval
> > +         = model->get_store_value (ob_item_sized, ctxt);
> > +
> > +      if (const auto &buffer_contents
> > +         = buffer_contents_sval->dyn_cast_compound_svalue ())
> > +       {
> > +         for (const auto &buffer_content : buffer_contents->get_map ())
> > +           {
> > +                   const auto &content_value = buffer_content.second;
> > +                   if (const auto &content_region
> > +                       = content_value->dyn_cast_region_svalue ())
> > +                           if (content_region->get_pointee () == base_reg)
> > +                                   actual_refcnt++;
> > +           }
> > +       }
> > +    }
> > +}
> > +
> > +/* Counts the actual references from all clusters in the model's store. */
> > +int
> > +count_actual_references (const region_model *model, region_model_manager *mgr,
> > +                        region_model_context *ctxt, const region *base_reg,
> > +                        const svalue *pylist_type_ptr, tree ob_type_field)
> > +{
> > +  int actual_refcnt = 0;
> > +  for (const auto &other_cluster : *model->get_store ())
> > +    {
> > +      for (const auto &binding : other_cluster.second->get_map ())
> > +       {
> > +         const auto &sval = binding.second;
> > +         const auto &curr_region = sval->maybe_get_region ();
> > +
> > +         if (!curr_region || curr_region->get_kind () != RK_HEAP_ALLOCATED)
> > +           continue;
> > +
> > +         increment_count_if_base_matches (curr_region, base_reg,
> > +                                           actual_refcnt);
> > +
> > +         const region *ob_type_region
> > +             = mgr->get_field_region (curr_region, ob_type_field);
> > +         const svalue *stored_sval
> > +             = model->get_store_value (ob_type_region, ctxt);
> > +         const auto &remove_cast = stored_sval->dyn_cast_unaryop_svalue ();
> > +
> > +         if (!remove_cast)
> > +           continue;
> > +
> > +         const svalue *type = remove_cast->get_arg ();
> > +         if (type == pylist_type_ptr)
> > +           process_ob_item_region (model, mgr, ctxt, curr_region,
> > +                                   pylist_type_ptr, base_reg, actual_refcnt);
> > +       }
> > +    }
> > +  return actual_refcnt;
> > +}
> >
>
> Hope the above is constructive.
> Dave
>

Best,
Eric

  reply	other threads:[~2023-08-17  1:47 UTC|newest]

Thread overview: 50+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-07-25  4:49 Update and Questions " Eric Feng
2023-07-25 14:41 ` David Malcolm
2023-07-27 22:13   ` Eric Feng
2023-07-27 22:35     ` David Malcolm
2023-07-30 17:52       ` Eric Feng
2023-07-30 23:44         ` David Malcolm
2023-08-01 13:57           ` Eric Feng
2023-08-01 17:06             ` David Malcolm
2023-08-04 15:02               ` Eric Feng
2023-08-04 15:39                 ` David Malcolm
2023-08-04 20:48                   ` Eric Feng
2023-08-04 22:42                     ` David Malcolm
2023-08-04 22:46                       ` David Malcolm
2023-08-07 18:31                         ` Eric Feng
2023-08-07 23:16                           ` David Malcolm
2023-08-08 16:51                             ` [PATCH] WIP for dg-require-python-h [PR107646] Eric Feng
2023-08-08 18:08                               ` David Malcolm
2023-08-08 18:51                               ` David Malcolm
2023-08-09 19:22                                 ` [PATCH v2] analyzer: More features for CPython analyzer plugin [PR107646] Eric Feng
2023-08-09 21:36                                   ` David Malcolm
2023-08-11 17:47                                     ` [COMMITTED] " Eric Feng
2023-08-11 20:23                                       ` Eric Feng
2023-08-16 19:17                                         ` Update on CPython Extension Module -fanalyzer plugin development Eric Feng
2023-08-16 21:28                                           ` David Malcolm
2023-08-17  1:47                                             ` Eric Feng [this message]
2023-08-21 14:05                                               ` Eric Feng
2023-08-21 15:04                                                 ` David Malcolm
2023-08-23 21:15                                                   ` Eric Feng
2023-08-23 23:16                                                     ` David Malcolm
2023-08-24 14:45                                                       ` Eric Feng
2023-08-25 12:50                                                         ` Eric Feng
2023-08-25 19:50                                                           ` David Malcolm
2023-08-29  4:31                                                             ` [PATCH] analyzer: implement reference count checking for CPython plugin [PR107646] Eric Feng
2023-08-29  4:35                                                               ` Eric Feng
2023-08-29 17:28                                                                 ` Eric Feng
2023-08-29 21:14                                                                   ` David Malcolm
2023-08-30 22:15                                                                     ` Eric Feng
2023-08-31 17:01                                                                       ` David Malcolm
2023-08-31 19:09                                                                         ` Eric Feng
2023-08-31 20:19                                                                           ` David Malcolm
2023-09-01  1:25                                                                             ` Eric Feng
2023-09-01 11:57                                                                               ` David Malcolm
2023-09-05  2:13                                                                                 ` [PATCH] analyzer: implement symbolic value support for CPython plugin's refcnt checker [PR107646] Eric Feng
2023-09-07 17:28                                                                                   ` David Malcolm
2023-09-11  2:12                                                                                     ` Eric Feng
2023-09-11 19:00                                                                                       ` David Malcolm
2023-08-29 21:08                                                               ` [PATCH] analyzer: implement reference count checking for CPython plugin [PR107646] David Malcolm
2023-09-01  2:49                                                               ` Hans-Peter Nilsson
2023-09-01 14:51                                                                 ` David Malcolm
2023-09-01 21:07                                                                   ` Eric Feng

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CANGHATWtt60xdShpk=U44Kh_ex8fF4tY=hpSzAb0sbz65GPL8w@mail.gmail.com' \
    --to=ef2648@columbia.edu \
    --cc=dmalcolm@redhat.com \
    --cc=gcc@gcc.gnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).