From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by sourceware.org (Postfix) with ESMTPS id DFFA13839DE1 for ; Fri, 25 Aug 2023 19:50:14 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org DFFA13839DE1 Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=redhat.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=redhat.com DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1692993014; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=I1hPx0oXKiV0rCUryBMx4HWqK90PCbmKTl01yQhSU6w=; b=fOHV9JJt4zwN66VHxE4nIeCBXGShaRNvwgfdz/s/o4I3mhfnfighe/14QpuUT540oGt+pI j+dOB6yddDoGdG/4TrmvdZQ3P9Jcy0uqVaMraryrLkhPkce/xnlRr8bLXFigjqjpC1/XkR 8hKDZ6xTAxuAIQqdBQbD4KxNRfv9B2M= Received: from mail-qv1-f71.google.com (mail-qv1-f71.google.com [209.85.219.71]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-413-BdVDDbdHMeiYcvMcywT_Fw-1; Fri, 25 Aug 2023 15:50:13 -0400 X-MC-Unique: BdVDDbdHMeiYcvMcywT_Fw-1 Received: by mail-qv1-f71.google.com with SMTP id 6a1803df08f44-6496d9acb98so15351486d6.0 for ; Fri, 25 Aug 2023 12:50:13 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1692993012; x=1693597812; h=mime-version:user-agent:content-transfer-encoding:references :in-reply-to:date:cc:to:from:subject:message-id:x-gm-message-state :from:to:cc:subject:date:message-id:reply-to; bh=I1hPx0oXKiV0rCUryBMx4HWqK90PCbmKTl01yQhSU6w=; b=OJjIHqhcnQfHyYXsHbqlkkRhgHAqnJjS+rBkJC8KtP2jJzVb7KEew0fYFKociQegke +7QAUwcyytIrHsaooQ3CoycPs7wqkjrjHpcMajOD1qONCeBC6Lk/hoyt5WjasU7HRwbd iue9COkfvxp5bfzR9f2rHj0nhquRQNu3lAd10JJMjejRNGNZq/dF+cFW99gDQ6eehhW1 rRMYs7jhFqkWC1Hro8ykY8jGwcIv19ZVs/UY/eWgiAGXkhnr5DSTY15600gEFb7M6QRi e+ffx271nKRyJbVCczjP7eDF+OSxGtuG8TdEdhZ80ANQTod1YNvQGGQS3o+jTn8oqyql 8wfA== X-Gm-Message-State: AOJu0YyNGNVvy3b6606bibqFax/MH7EIn6prIUTC+M03bExrXzYZ1Lca zGvpvyivspi1hdDH2n6YAkzvyjacivE1TtIG0Q78jR8ujUW1AtsNuS58PZbTfkifm5/x+DMv5mG kY+r2AF8S3ruGPJA= X-Received: by 2002:a05:622a:1308:b0:40d:6ce7:7692 with SMTP id v8-20020a05622a130800b0040d6ce77692mr24152750qtk.52.1692993012039; Fri, 25 Aug 2023 12:50:12 -0700 (PDT) X-Google-Smtp-Source: AGHT+IEwEoCWpd9VhLcuxrfqzvJloJ2XxNmV0/XULi+/ZXLjfhBE/Gg4+r8p4njW0K6HEbMFWs51aQ== X-Received: by 2002:a05:622a:1308:b0:40d:6ce7:7692 with SMTP id v8-20020a05622a130800b0040d6ce77692mr24152733qtk.52.1692993011571; Fri, 25 Aug 2023 12:50:11 -0700 (PDT) Received: from t14s.localdomain (c-76-28-97-5.hsd1.ma.comcast.net. [76.28.97.5]) by smtp.gmail.com with ESMTPSA id ew9-20020a05622a514900b004109928c607sm722470qtb.43.2023.08.25.12.50.10 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 25 Aug 2023 12:50:11 -0700 (PDT) Message-ID: Subject: Re: Update on CPython Extension Module -fanalyzer plugin development From: David Malcolm To: Eric Feng Cc: gcc@gcc.gnu.org Date: Fri, 25 Aug 2023 15:50:09 -0400 In-Reply-To: <20230825125056.99826-1-ef2648@columbia.edu> References: <20230825125056.99826-1-ef2648@columbia.edu> User-Agent: Evolution 3.44.4 (3.44.4-2.fc36) MIME-Version: 1.0 X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Spam-Status: No, score=-9.9 required=5.0 tests=BAYES_00,BODY_8BITS,DKIMWL_WL_HIGH,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,GIT_PATCH_0,RCVD_IN_DNSWL_NONE,RCVD_IN_MSPIKE_H4,RCVD_IN_MSPIKE_WL,SPF_HELO_NONE,SPF_NONE,TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org List-Id: On Fri, 2023-08-25 at 08:50 -0400, Eric Feng wrote: > Hi Dave, >=20 > Please find an updated WIP patch on reference count checking below. Some > parts aren't properly formatted yet; I apologize for that. >=20 > Since the last WIP patch, the major updates include: > - Updated certain areas of the core analyzer to support custom stmt_finde= r. > - A significant revamp of the reference count checking logic. > - __analyzer_cpython_dump_refcounts: This dumps reference count related i= nformation. Thanks for the patch. Given the scope, this is close to being ready for trunk; various comments inline below... >=20 > Here's an updated look at the current WIP diagnostic we issue: > rc3.c:25:10: warning: Expected = to have reference count: =E2=80=981=E2=80=99 but ob_refcnt field is: =E2= =80=982=E2=80=99 > =C2=A0=C2=A0 25 |=C2=A0=C2=A0 return list; > =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 |=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0 ^~~~ > =C2=A0 =E2=80=98create_py_object=E2=80=99: events 1-4 > =C2=A0=C2=A0=C2=A0 | > =C2=A0=C2=A0=C2=A0 |=C2=A0=C2=A0=C2=A0 4 |=C2=A0=C2=A0 PyObject* item =3D= PyLong_FromLong(3); > =C2=A0=C2=A0=C2=A0 |=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 |=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0 ^~~~~~~~~~~~~~~~~~ > =C2=A0=C2=A0=C2=A0 |=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 |=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0 | > =C2=A0=C2=A0=C2=A0 |=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 |=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0 (1) when =E2=80=98PyLong_FromLong=E2=80=99 succeeds > =C2=A0=C2=A0=C2=A0 |=C2=A0=C2=A0=C2=A0 5 |=C2=A0=C2=A0 PyObject* list =3D= PyList_New(1); > =C2=A0=C2=A0=C2=A0 |=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 |=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0 ~~~~~~~~~~~~~ > =C2=A0=C2=A0=C2=A0 |=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 |=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0 | > =C2=A0=C2=A0=C2=A0 |=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 |=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0 (2) when =E2=80=98PyList_New=E2=80=99 succeeds > =C2=A0=C2=A0=C2=A0 |...... > =C2=A0=C2=A0=C2=A0 |=C2=A0=C2=A0 16 |=C2=A0=C2=A0 PyList_Append(list, ite= m); > =C2=A0=C2=A0=C2=A0 |=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 |=C2=A0=C2=A0 ~~~~~~~~= ~~~~~~~~~~~~~~~~~ > =C2=A0=C2=A0=C2=A0 |=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 |=C2=A0=C2=A0 | > =C2=A0=C2=A0=C2=A0 |=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 |=C2=A0=C2=A0 (3) when= =E2=80=98PyList_Append=E2=80=99 succeeds, moving buffer > =C2=A0=C2=A0=C2=A0 |...... > =C2=A0=C2=A0=C2=A0 |=C2=A0=C2=A0 25 |=C2=A0=C2=A0 return list; > =C2=A0=C2=A0=C2=A0 |=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 |=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 ~~~~ > =C2=A0=C2=A0=C2=A0 |=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 |=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 | > =C2=A0=C2=A0=C2=A0 |=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 |=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 (4) here > =C2=A0=C2=A0=C2=A0 | >=20 Looks good. As mentioned in our chat, ideally there would be some kind of information to the user as to what the reference actually are. Perhaps it could be a series of notes like this: note: there is 1 pointer to "*item" note: (1) '*item' is pointed to by '((PyListObject *)list)->ob_item[0]' Or, to be even more fancy, we could add events to the execution path whenever the number of pointers or ob_refcnt changes, giving something like: (1) when =E2=80=98PyLong_FromLong=E2=80=99 succeeds (2) '(*item).ob_refcnt' is now 1 (3) when =E2=80=98PyList_New=E2=80=99 succeeds (4) '*item' is now pointed to by '((PyListObject *)list)->ob_item[0]'; pointer count is 2, ob_refcnt is 1 (5) when 'create_py_object' returns, '(*item).ob_refcnt' is 1 ...or somesuch (I'm brainstorming here, and I expect that the above might not be implementable, due to local variables and SSA names confusing things). Plus there's the question of whether the return value "owns" a reference.=20 But all of that can be deferred to follow-up work. > The reference count checking logic for v1 should be almost complete. > Currently, it supports situations where the returned object is newly crea= ted. > It doesn't yet support the other case (i.e., incremented by 1 from > what it was previously, if not newly created).=20 >=20 > This week, I've focused primarily on the reference count checking logic. = I plan > to shift my focus to refining the diagnostic next. As seen in the placeho= lder > diagnostic message above, I believe we should at least inform the user ab= out the > variable name associated with the region that has an unexpected reference= count > issue (in this case, 'item'). Initially, I suspect the issue might be tha= t: >=20 > tree reg_tree =3D model->get_representative_tree (curr_region); >=20 > returns NULL since curr_region is heap allocated and thus the path_var re= turned would be: >=20 > path_var (NULL_TREE, 0); >=20 > Which means that: >=20 > refcnt_stmt_finder finder (*eg, reg_tree); >=20 > always receives a NULL_TREE, causing it to always default to the > not_found case. A workaround might be necessary, but I haven't delved too= deeply into this yet, > so my suspicion could be off. >=20 You could try looking for a representative tree for the pointer, rather than the region. I ran into similar issues with leak detection: we report a leak when nothing is pointing at the region, but if nothing is pointing at the region, then there's nothing to label/describe the region with. A workaround I've used is to look at the old model immediately before the state change, rather than the new one. So for example, use the region_model * for immediately before popping the frame (if it's available; you might need to make a copy?), and do something like: tree expr =3D old_model->get_representative_tree (curr_region); > Additionally, I think it would be helpful to show users what the > ob_refcnt looks like in each event as well. >=20 You could try make a custom_event subclass for such ob_refcnt messages, but adding them could be fiddly. Perhaps diagnostic_manager::add_events_for_eedge could support some kind of plugin hook whenever we add a statement_event, which you could use to add your extra events whenever the ob_refcnt of the object of interest differs between the old and the new state? (though currently we only emit a statement_event for the first stmt in an enode, and that function is already quite messy, alas) But as I said before, that's definitely something to postpone to a followup. [...snip...] >=20 > @@ -620,8 +639,15 @@ class region_model_context > =C2=A0{ > =C2=A0 public: > =C2=A0=C2=A0 /* Hook for clients to store pending diagnostics. > -=C2=A0=C2=A0=C2=A0=C2=A0 Return true if the diagnostic was stored, or fa= lse if it was deleted.=C2=A0 */ > -=C2=A0 virtual bool warn (std::unique_ptr d) =3D 0; > +=C2=A0=C2=A0=C2=A0=C2=A0 Return true if the diagnostic was stored, or fa= lse if it was deleted. > +=C2=A0=C2=A0=C2=A0=C2=A0 Optionally provide a custom stmt_finder.=C2=A0 = */ > +=C2=A0=C2=A0=C2=A0 virtual bool warn(std::unique_ptr= d) { > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 return warn(std::move(d), nul= lptr); > +=C2=A0=C2=A0=C2=A0 } > +=C2=A0=C2=A0=C2=A0=20 > +=C2=A0=C2=A0=C2=A0 virtual bool warn(std::unique_ptr= d, const stmt_finder *custom_finder) { > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 return false; > +=C2=A0=C2=A0=C2=A0 } I found the pair of vfuncs here confusing. Please can we have a single pure virtual "warn" vfunc, potentially with an optional arg, so: virtual bool warn (std::unique_ptr d,=C2=A0 const stmt_finder *custom_finder =3D nullptr) =3D 0; and add the new argument to the various implementations that override "warn". [...snip...] > =C2=A0/* A subclass of region_model_context for determining if operations= fail > @@ -912,6 +941,11 @@ class region_model_context_decorator : public region= _model_context > =C2=A0=C2=A0=C2=A0=C2=A0 return m_inner->get_stmt (); > =C2=A0=C2=A0 } > =C2=A0 > +=C2=A0 const exploded_graph *get_eg () const override > +=C2=A0 { > +=C2=A0=C2=A0=C2=A0 return m_inner->get_eg (); I recently changed region_model_context_decorator so that m_inner can be null (see 1e7b0a5d7a45dc5932992d36e1375582d61269e4), so please can you change this to: const exploded_graph *get_eg () const override { if (m_inner) return m_inner->get_eg (); else return nullptr; } [...snip...] >=20 > diff --git a/gcc/testsuite/gcc.dg/plugin/analyzer_cpython_plugin.c b/gcc/= testsuite/gcc.dg/plugin/analyzer_cpython_plugin.c > index 7cd72e8a886..a3274ced4a8 100644 > --- a/gcc/testsuite/gcc.dg/plugin/analyzer_cpython_plugin.c > +++ b/gcc/testsuite/gcc.dg/plugin/analyzer_cpython_plugin.c [...snip...] > +class refcnt_mismatch : public pending_diagnostic_subclass > +{ [...snip...] > + > +=C2=A0 bool > +=C2=A0 emit (rich_location *rich_loc, logger *) final override > +=C2=A0 { > +=C2=A0=C2=A0=C2=A0 diagnostic_metadata m; > +=C2=A0=C2=A0=C2=A0 bool warned; > +=C2=A0=C2=A0=C2=A0 // just assuming constants for now > +=C2=A0=C2=A0=C2=A0 auto actual_refcnt > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=3D m_actual_refcnt->dyn_cast_= constant_svalue ()->get_constant (); > +=C2=A0=C2=A0=C2=A0 auto ob_refcnt =3D m_ob_refcnt->dyn_cast_constant_sva= lue ()->get_constant (); > +=C2=A0=C2=A0=C2=A0 warned =3D warning_meta ( > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0rich_loc, m, get_controlling_o= ption (), > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0"Expected to have " > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0"reference count: %qE but ob_r= efcnt field is: %qE", > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0actual_refcnt, ob_refcnt); This is OK for now. Diagnostic messages should start with a lower-case letter. Ideally we'd handle both symbolic and constant values. We'd quote the values if they're symbolic e.g. reference count: 2 but not quote them for constants, e.g. reference count: 'x + 1' However doing so tends to lead to a combinatorial explosion here in the possible messages, so maybe this could be split up as: warning: reference count mismatch for PyObject * 'item' note: 2 pointers found in memory to '*item'... note: ...but '(*item).ob_refcnt' is 1 (where here both values are concrete) But again, there's no need to fix this in this version of the code. [...snip...] > +/* Compare ob_refcnt field vs the actual reference count of a region */ > +void > +check_refcnt (const region_model *model, region_model_context *ctxt, > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= const hash_map +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0 int>::iterator::reference_pair region_refcnt) This function can be made "static" (as indeed can almost all of the functions in the plugin, I expect, apart from "plugin_init"); these aren't APIs being exposed to an external consumer. > +{ > +=C2=A0 region_model_manager *mgr =3D model->get_manager (); > +=C2=A0 const auto &curr_region =3D region_refcnt.first; > +=C2=A0 const auto &actual_refcnt =3D region_refcnt.second; > +=C2=A0 const svalue *ob_refcnt_sval =3D retrieve_ob_refcnt_sval (curr_re= gion, model, ctxt); > +=C2=A0 const svalue *actual_refcnt_sval =3D mgr->get_or_create_int_cst ( > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 ob_refcnt_sval->get_type (), actual_refcn= t); > + > +=C2=A0 if (ob_refcnt_sval !=3D actual_refcnt_sval) > +=C2=A0 { > +=C2=A0=C2=A0=C2=A0 // todo: fix this (always null) > +=C2=A0=C2=A0=C2=A0 tree reg_tree =3D model->get_representative_tree (cur= r_region); > + > +=C2=A0=C2=A0=C2=A0 const auto &eg =3D ctxt->get_eg (); > +=C2=A0=C2=A0=C2=A0 refcnt_stmt_finder finder (*eg, reg_tree); > +=C2=A0=C2=A0=C2=A0 auto pd =3D make_unique (curr_region= , ob_refcnt_sval, > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 actual_refcnt_sval, reg_tree); > +=C2=A0=C2=A0=C2=A0 if (pd && eg) > +=C2=A0=C2=A0=C2=A0 ctxt->warn (std::move (pd), &finder); Note that IIRC the region_model code gets rerun with a NULL region_model_context during feasibility checking, so we need to either capture the invariant that ctxt is non-null here (with a gcc_assert), and/or bail out early somewhere in all this checking for the null ctxt case. > +=C2=A0 } > +} > + > +void > +check_refcnts (const region_model *model, const svalue *retval, > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 region_mode= l_context *ctxt, > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 hash_map ®ion_to_refcnt) > +{ > +=C2=A0 for (const auto ®ion_refcnt : region_to_refcnt) > +=C2=A0 { > +=C2=A0=C2=A0=C2=A0 check_refcnt(model, ctxt, region_refcnt); > +=C2=A0 } > +} > + > +/* Validates the reference count of all Python objects. */ > +void > +pyobj_refcnt_checker (const region_model *model, const svalue *retval, > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 region_model_context *ctxt) > +{ > +=C2=A0 if (!ctxt) > +=C2=A0 return; Aha: I see you have an early bailout here for the null ctxt, good. > + > +=C2=A0 auto region_to_refcnt =3D hash_map (); > +=C2=A0 auto seen_regions =3D hash_set (); > + > +=C2=A0 count_expected_pyobj_references (model, region_to_refcnt, retval,= seen_regions); > +=C2=A0 check_refcnts (model, retval, ctxt, region_to_refcnt); > +} > + [...snip...] > + > +void > +dump_refcnt_info (const hash_map ®ion_to_refcnt, > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0 const region_model *model, region_model_context *c= txt) > +{ > +=C2=A0 region_model_manager *mgr =3D model->get_manager (); > +=C2=A0 pretty_printer pp; > +=C2=A0 pp_format_decoder (&pp) =3D default_tree_printer; > +=C2=A0 pp_show_color (&pp) =3D pp_show_color (global_dc->printer); > +=C2=A0 pp.buffer->stream =3D stderr; > + > +=C2=A0 for (const auto ®ion_refcnt : region_to_refcnt) > +=C2=A0 { FWIW, the iteration order within a hash_map will vary from run to run (e.g. precise pointer values are affected by address space layout randomization), and so the ordering within the output to stderr will vary from run to run. It's OK for now, but it might be worth first building a vec, sorting it in some well-defined way, and then iterating over that for the printing; see e.g. b0702ac5588333e27d7ec43d21d704521f7a05c6. > +=C2=A0=C2=A0=C2=A0 auto region =3D region_refcnt.first; > +=C2=A0=C2=A0=C2=A0 auto actual_refcnt =3D region_refcnt.second; > +=C2=A0=C2=A0=C2=A0 const svalue *ob_refcnt_sval > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=3D retrieve_ob_refcnt_sval (r= egion, model, ctxt); > +=C2=A0=C2=A0=C2=A0 const svalue *actual_refcnt_sval =3D mgr->get_or_crea= te_int_cst ( > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0ob_refcnt_sval->get_type (), a= ctual_refcnt); > + > +=C2=A0=C2=A0=C2=A0 region->dump_to_pp (&pp, true); > +=C2=A0=C2=A0=C2=A0 pp_string (&pp, " =E2=80=94 ob_refcnt: "); > +=C2=A0=C2=A0=C2=A0 ob_refcnt_sval->dump_to_pp (&pp, true); > +=C2=A0=C2=A0=C2=A0 pp_string (&pp, " actual refcnt: "); > +=C2=A0=C2=A0=C2=A0 actual_refcnt_sval->dump_to_pp (&pp, true); > +=C2=A0=C2=A0=C2=A0 pp_newline (&pp); > +=C2=A0 } > +=C2=A0 pp_string (&pp, "~~~~~~~~\n"); > +=C2=A0 pp_flush (&pp); > +} > + >=20 [...snip...] Does the testcase pass or fail with this code? Do you get a new warning for it? How about some trivial new testcases e.g.: void test_dropping_PyLong (long val) { PyObject* p =3D PyLong_FromLong(val); /* Do nothing with 'p'; should be reported as a leak. */ } PyObject *test_valid_PyLong (long val) { PyObject* p =3D PyLong_FromLong(val); return p; // shouldn't report here } PyObject *test_stray_incref_PyLong (long val) { PyObject* p =3D PyLong_FromLong(val); Py_INCREF (p); // bogus return p; // should report that ob_refcnt is 2 when it should be 1 } ...or somesuch. The patch is almost ready for trunk as-is, but: - needs a ChangeLog - please fix the dual "warn" vfuncs thing I mentioned above - check it compiles! - make sure the testsuite doesn't emit any new FAILs (e.g. by adding a dg-warning with a suitable short regex, e.g. /* { dg-warning "reference count" ) */ Hope this is constructive Dave