From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by sourceware.org (Postfix) with ESMTPS id 178303858D33 for ; Wed, 9 Aug 2023 21:36:38 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 178303858D33 Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=redhat.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=redhat.com DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1691616997; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=yNYYIW3mfux+Q9cAk1fkUTyM5rl7/178EpdsTBmMsOg=; b=NPrBZ/am7wpbgVcWVvvJUY52C2bmkjDbc5AMu+/b3H4cqud5OcB/zSIfKLRICrs+2eY0ND otLh750+NWLqSEXyYAso/dSPKjiWwAS3t3L4smjbUTnwcu1EL0zEdSeP70xq4K3EsTTdkf JnYNQUJ4fBsCzLjMv0e2gbHPDnDd/v0= Received: from mail-ot1-f72.google.com (mail-ot1-f72.google.com [209.85.210.72]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-212-5DV_VTl8OxC6b-LA2adBaw-1; Wed, 09 Aug 2023 17:36:35 -0400 X-MC-Unique: 5DV_VTl8OxC6b-LA2adBaw-1 Received: by mail-ot1-f72.google.com with SMTP id 46e09a7af769-6bc56f23c65so267016a34.2 for ; Wed, 09 Aug 2023 14:36:35 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1691616995; x=1692221795; h=mime-version:user-agent:content-transfer-encoding:references :in-reply-to:date:cc:to:from:subject:message-id:x-gm-message-state :from:to:cc:subject:date:message-id:reply-to; bh=yNYYIW3mfux+Q9cAk1fkUTyM5rl7/178EpdsTBmMsOg=; b=C93WGn6eGNBOb49vt5M7Yczko8QRIpf1TVKAD6+7ahvkBBHB5WyabiNzxAS9Y3mwFX UQtIEtR7PLhisW0GO1AULUyrvcGqkQU9by7rO5W0syd7Az+oDqGGmvTvhLLP7PBvpWGP eynVvaC+tRKE4sssTImv8CSaDpEvhmJeqqHoHrGmuUBO+71HGFewdfpXEg99fnZp5Pqt KKlFky9Z2S4RpoAHb/+fdnil6WNLBDmUowL+AgIIc9QR0WhfZUwHLD8vJuHRavonk4fB 8pnyt9DjBe9ADlYhwhiI+p/1NMeT/0+8gn4oEVKiGUKsefKsBxQWDP5TwnySIMUOTWCy 1S5g== X-Gm-Message-State: AOJu0YxzsxXgPqg2WiDMx/LH5lst1JASmp4mBtQLNLMKdLqbyREUZ598 8r8nfWuUXgYxcB/0tw8m9L77WY0RLHPYFQ8VP/PGLCZkJjIGo9L8sAq/NdwsSVUcUX+vKUVcqO1 0aCHuhWo= X-Received: by 2002:a05:6870:478a:b0:1bf:7b3:5116 with SMTP id c10-20020a056870478a00b001bf07b35116mr546912oaq.47.1691616994940; Wed, 09 Aug 2023 14:36:34 -0700 (PDT) X-Google-Smtp-Source: AGHT+IGpVZV6mst7uAnwG1eouFvhgb9VPh21/qlWDBhSraFpOtJO0dZUw1U6hDBPcRzb/A4JicRDDQ== X-Received: by 2002:a05:6870:478a:b0:1bf:7b3:5116 with SMTP id c10-20020a056870478a00b001bf07b35116mr546894oaq.47.1691616994500; Wed, 09 Aug 2023 14:36:34 -0700 (PDT) Received: from t14s.localdomain (c-76-28-97-5.hsd1.ma.comcast.net. [76.28.97.5]) by smtp.gmail.com with ESMTPSA id j17-20020a0cf311000000b0063007ccaf42sm2905133qvl.57.2023.08.09.14.36.33 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 09 Aug 2023 14:36:33 -0700 (PDT) Message-ID: <48d2a18cd58bad40757ede4a8f6e46eeeff1133d.camel@redhat.com> Subject: Re: [PATCH v2] analyzer: More features for CPython analyzer plugin [PR107646] From: David Malcolm To: Eric Feng Cc: gcc@gcc.gnu.org, gcc-patches@gcc.gnu.org Date: Wed, 09 Aug 2023 17:36:32 -0400 In-Reply-To: <97dcec4bcbb4d0b6eaacfd9990f05d7cfc0b35a9.1691608483.git.ef2648@columbia.edu> References: <132e5dd3440e22003db92244e9d17916fe33f0c7.camel@redhat.com> <97dcec4bcbb4d0b6eaacfd9990f05d7cfc0b35a9.1691608483.git.ef2648@columbia.edu> User-Agent: Evolution 3.44.4 (3.44.4-2.fc36) MIME-Version: 1.0 X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Spam-Status: No, score=-9.8 required=5.0 tests=BAYES_00,BODY_8BITS,DKIMWL_WL_HIGH,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,GIT_PATCH_0,RCVD_IN_DNSWL_NONE,RCVD_IN_MSPIKE_H4,RCVD_IN_MSPIKE_WL,SPF_HELO_NONE,SPF_NONE,TXREP autolearn=unavailable autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org List-Id: On Wed, 2023-08-09 at 15:22 -0400, Eric Feng wrote: > Thank you for your help in getting dg-require-python-h working! I can > confirm that the FAILs are related to differences between the -- > cflags > affecting the gimple seen by the analyzer. For this reason, I have > changed it to --includes for now.=C2=A0 Sounds good. Eventually we'll probably want to support --cflags, but given that every distribution probably has its own set of flags, it's a recipe for an unpleasantly large test matrix, so just using --includes is a good compromise. > To be sure, I tested on Python 3.8 as > well and it works as expected. I have also addressed the following > comments on the WIP patch as you described. >=20 > -- Update Changelog entry to list new functions being simulated. > -- Update region_model::get_or_create_region_for_heap_alloc leading > comment. > -- Change register_alloc to update_state_machine. > -- Change move_ptr_sval_non_null to transition_ptr_sval_non_null. > -- Static helper functions for: > =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0-- Initializing ob_refcnt= field. > =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0-- Setting ob_type field. > =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0-- Getting ob_base field. > =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0-- Initializing heap allo= cated region for PyObjects. > =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0-- Incrementing a field b= y one. > -- Change arg_is_long_p to arg_is_integral_p. > -- Extract common failure scenario for reusability. >=20 > The initial WIP patch using=20 >=20 > /* { dg-options "-fanalyzer -I/usr/include/python3.9" }. */ >=20 > have been bootstrapped and regtested on aarch64-unknown-linux-gnu. > Since > we did not change any core logic in the revision and the only changes > within the analyzer core are changing variable names, is it OK for > trunk. In the mean time, the revised patch is currently going through > bootstrap and regtest process. Thanks for the updated patch. Unfortunately I just pushed a largish analyzer patch (r14-3114- g73da34a538ddc2) which may well conflict with your patch, so please rebase to beyond that. =C2=A0 Sorry about this. In particular note that there's no longer a default assignment to the LHS at a call-site in region_model::on_call_pre; known_function subclasses are now responsible for assigning to the LHS of the callsite. But I suspect that all the known_function subclasses in the cpython plugin already do that. Some nits inline below... [...snip...] > Some concessions were made to > simplify the analysis process when comparing kf_PyList_Append with > the > real implementation. In particular, PyList_Append performs some > optimization internally to try and avoid calls to realloc if > possible. For simplicity, we assume that realloc is called every > time. > Also, we grow the size by just 1 (to ensure enough space for adding a > new element) rather than abide by the heuristics that the actual > implementation > follows. Might be worth capturing these notes as comments in the source (for class kf_PyList_Append), rather than just within the commit message. [...snip...] >=20 > diff --git a/gcc/analyzer/region-model.cc b/gcc/analyzer/region- > model.cc > index e92b3f7b074..c338f045d92 100644 > --- a/gcc/analyzer/region-model.cc > +++ b/gcc/analyzer/region-model.cc > @@ -5127,11 +5127,16 @@ region_model::check_dynamic_size_for_floats > (const svalue *size_in_bytes, > =C2=A0=C2=A0=C2=A0 Use CTXT to complain about tainted sizes. > =C2=A0 > =C2=A0=C2=A0=C2=A0 Reuse an existing heap_allocated_region if it's not be= ing > referenced by > -=C2=A0=C2=A0 this region_model; otherwise create a new one.=C2=A0 */ > +=C2=A0=C2=A0 this region_model; otherwise create a new one. > + > +=C2=A0=C2=A0 Optionally (update_state_machine) transitions the pointer > pointing to the > +=C2=A0=C2=A0 heap_allocated_region from start to assumed non-null.=C2=A0= */ > =C2=A0 > =C2=A0const region * > =C2=A0region_model::get_or_create_region_for_heap_alloc (const svalue > *size_in_bytes, > -=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 > region_model_context *ctxt) > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 region_model_context *ctxt, > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 bool update_state_machine, > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 const call_details *cd) > =C2=A0{ > =C2=A0=C2=A0 /* Determine which regions are referenced in this region_mod= el, so > that > =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 we can reuse an existing heap_allocated_re= gion if it's not in > use on > @@ -5153,6 +5158,17 @@ > region_model::get_or_create_region_for_heap_alloc (const svalue > *size_in_bytes, > =C2=A0=C2=A0 if (size_in_bytes) > =C2=A0=C2=A0=C2=A0=C2=A0 if (compat_types_p (size_in_bytes->get_type (), = size_type_node)) > =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 set_dynamic_extents (reg, size_in_by= tes, ctxt); > + > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0if (update_state_machine && cd= ) > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0{ > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0const sva= lue *ptr_sval =3D nullptr; > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0if (cd->g= et_lhs_type ()) > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 ptr_sval =3D m_mgr->get_ptr_svalue = (cd->get_lhs_type (), reg); > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0else > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 ptr_sval =3D m_mgr->get_ptr_svalue = (NULL_TREE, reg); > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0transitio= n_ptr_sval_non_null (ctxt, > ptr_sval); This if/else is redundant: the "else" is only reached if cd- >get_lhs_type () is null, in which case you pass in NULL_TREE, so it works the same either way. Or am I missing something? Also, it looks like something weird's happening with indentation in this hunk. > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0} > + > =C2=A0=C2=A0 return reg; > =C2=A0} > =C2=A0 > diff --git a/gcc/analyzer/region-model.h b/gcc/analyzer/region- > model.h > index 0cf38714c96..16c80a238bc 100644 > --- a/gcc/analyzer/region-model.h > +++ b/gcc/analyzer/region-model.h > @@ -387,9 +387,9 @@ class region_model > =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 region_model_c= ontext *ctxt, > =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 rejected_const= raint **out); > =C2=A0 > -=C2=A0 const region * > -=C2=A0 get_or_create_region_for_heap_alloc (const svalue *size_in_bytes, > -=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= region_model_context *ctxt); > +=C2=A0 const region *get_or_create_region_for_heap_alloc ( > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 const svalue *size_in_bytes, region_model= _context *ctxt, > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 bool update_state_machine =3D false, cons= t call_details *cd =3D > nullptr); Nit: non-standard indentation here. > =C2=A0=C2=A0 const region *create_region_for_alloca (const svalue > *size_in_bytes, > =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0 region_model_context > *ctxt); > =C2=A0=C2=A0 void get_referenced_base_regions (auto_bitmap &out_ids) cons= t; > @@ -476,6 +476,10 @@ class region_model > =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0 const svalue *old_ptr_sval, > =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0 const svalue *new_ptr_sval); > =C2=A0 > +=C2=A0 /* Implemented in sm-malloc.cc.=C2=A0 */ > +=C2=A0 void transition_ptr_sval_non_null (region_model_context *ctxt, > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 const svalue *new_ptr_sval); Nit: non-standard indentation here. > + > =C2=A0=C2=A0 /* Implemented in sm-taint.cc.=C2=A0 */ > =C2=A0=C2=A0 void mark_as_tainted (const svalue *sval, > =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0reg= ion_model_context *ctxt); > diff --git a/gcc/analyzer/sm-malloc.cc b/gcc/analyzer/sm-malloc.cc > index a8c63eb1ce8..bb8d83e4605 100644 > --- a/gcc/analyzer/sm-malloc.cc > +++ b/gcc/analyzer/sm-malloc.cc > @@ -434,6 +434,10 @@ public: > =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0 const svalue *new_ptr_sval, > =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0 const extrinsic_state &ext_state) const; > =C2=A0 > +=C2=A0 void transition_ptr_sval_non_null (region_model *model, > sm_state_map *smap, > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 const svalue *new_ptr_sval, > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 const extrinsic_state &ext_state) c= onst; Nit: non-standard indentation here. > + > =C2=A0=C2=A0 standard_deallocator_set m_free; > =C2=A0=C2=A0 standard_deallocator_set m_scalar_delete; > =C2=A0=C2=A0 standard_deallocator_set m_vector_delete; > @@ -2504,6 +2508,16 @@ on_realloc_with_move (region_model *model, > =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 NULL, ext_state); > =C2=A0} > =C2=A0 > +/*=C2=A0 Hook for get_or_create_region_for_heap_alloc for the case when > we want > +=C2=A0=C2=A0 ptr_sval to mark a newly created region as assumed non null= on > malloc SM.=C2=A0 */ > +void > +malloc_state_machine::transition_ptr_sval_non_null ( > +=C2=A0=C2=A0=C2=A0 region_model *model, sm_state_map *smap, const svalue > *new_ptr_sval, > +=C2=A0=C2=A0=C2=A0 const extrinsic_state &ext_state) const Nit: non-standard indentation here. > +{ > +=C2=A0 smap->set_state (model, new_ptr_sval, m_free.m_nonnull, NULL, > ext_state); > +} >=20 [...] > diff --git a/gcc/testsuite/gcc.dg/plugin/analyzer_cpython_plugin.c > b/gcc/testsuite/gcc.dg/plugin/analyzer_cpython_plugin.c > index 9ecc42d4465..42c8aff101e 100644 > --- a/gcc/testsuite/gcc.dg/plugin/analyzer_cpython_plugin.c > +++ b/gcc/testsuite/gcc.dg/plugin/analyzer_cpython_plugin.c [...] > @@ -76,6 +78,703 @@ get_field_by_name (tree type, const char *name) > =C2=A0=C2=A0 return NULL_TREE; > =C2=A0} > =C2=A0 > +static const svalue * > +get_sizeof_pyobjptr (region_model_manager *mgr) > +{ > +=C2=A0 tree size_tree =3D TYPE_SIZE_UNIT (pyobj_ptr_tree); > +=C2=A0 const svalue *sizeof_sval =3D mgr->get_or_create_constant_svalue > (size_tree); > +=C2=A0 return sizeof_sval; > +} > + > +static bool > +arg_is_integral_p(const call_details &cd, unsigned idx) > +{ > +=C2=A0 return INTEGRAL_TYPE_P(cd.get_arg_type(idx)); > +} We already have a call_details::arg_is_pointer_p, so perhaps this should be a call_details::arg_is_integral_p? > + > +static void > +init_ob_refcnt_field (region_model_manager *mgr, region_model > *model, > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 const region *ob_bas= e_region, tree > pyobj_record, > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 const call_details &= cd) > +{ > +=C2=A0 tree ob_refcnt_tree =3D get_field_by_name (pyobj_record, > "ob_refcnt"); > +=C2=A0 const region *ob_refcnt_region > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 =3D mgr->get_field_region (ob_base_region= , ob_refcnt_tree); > +=C2=A0 const svalue *refcnt_one_sval > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 =3D mgr->get_or_create_int_cst (size_type= _node, 1); > +=C2=A0 model->set_value (ob_refcnt_region, refcnt_one_sval, cd.get_ctxt > ()); > +} Please add a leading comment to this function, something like: /* Update MODEL to set OB_BASE_REGION's ob_refcnt to 1. */ > + > +static void > +set_ob_type_field (region_model_manager *mgr, region_model *model, > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 const region *ob_base_region, tree pyo= bj_record, > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 tree pytype_var_decl_ptr, const call_d= etails &cd) > +{ > +=C2=A0 const region *pylist_type_region > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 =3D mgr->get_region_for_global (pytype_va= r_decl_ptr); > +=C2=A0 tree pytype_var_decl_ptr_type > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 =3D build_pointer_type (TREE_TYPE (pytype= _var_decl_ptr)); > +=C2=A0 const svalue *pylist_type_ptr_sval > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 =3D mgr->get_ptr_svalue (pytype_var_decl_= ptr_type, > pylist_type_region); > +=C2=A0 tree ob_type_field =3D get_field_by_name (pyobj_record, "ob_type"= ); > +=C2=A0 const region *ob_type_region > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 =3D mgr->get_field_region (ob_base_region= , ob_type_field); > +=C2=A0 model->set_value (ob_type_region, pylist_type_ptr_sval, > cd.get_ctxt ()); > +} Likewise, this needs a leading comment, something like: /* Update MODEL to set OB_BASE_REGION's ob_type to point to PYTYPE_VAR_DECL_PTR. */ > + > +static const region * > +get_ob_base_region (region_model_manager *mgr, region_model *model, > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 const region *new_object_region, tree > object_record, > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 const svalue *pyobj_svalue, const call= _details > &cd) > +{ > +=C2=A0 tree ob_base_tree =3D get_field_by_name (object_record, "ob_base"= ); > +=C2=A0 const region *ob_base_region > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 =3D mgr->get_field_region (new_object_reg= ion, ob_base_tree); > +=C2=A0 model->set_value (ob_base_region, pyobj_svalue, cd.get_ctxt ()); > +=C2=A0 return ob_base_region; > +} Likewise, needs a leading comment. It isn't clear to me what the intent of this function is. I see it used from kf_PyLong_FromLong::impl_call_post's outcome handler, where it seems to be used to set the ob_base_region to an unknown value. > + > +static const region * > +init_pyobject_region (region_model_manager *mgr, region_model > *model, > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 const svalue *object= _svalue, const > call_details &cd) > +{ > +=C2=A0 /* TODO: switch to actual tp_basic_size */ > +=C2=A0 const svalue *tp_basicsize_sval =3D mgr- > >get_or_create_unknown_svalue (NULL); > +=C2=A0 const region *pyobject_region =3D model- > >get_or_create_region_for_heap_alloc ( > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 tp_basicsize_sval, cd.get_ctxt (), true, = &cd); > +=C2=A0 model->set_value (pyobject_region, object_svalue, cd.get_ctxt ())= ; > +=C2=A0 return pyobject_region; > +} Likewise needs a leading comment, and the exact intent isn't quite clear to me. I believe that everywhere you're calling it, you're passing in an unknown svalue for "object_svalue". [...] > +class kf_PyList_Append : public known_function > +{ > +public: > +=C2=A0 bool > +=C2=A0 matches_call_types_p (const call_details &cd) const final overrid= e > +=C2=A0 { > +=C2=A0=C2=A0=C2=A0 return (cd.num_args () =3D=3D 2); // TODO: more check= s here Probably:=C2=A0 && cd.arg_is_pointer_p (0) && cd.arg_is_pointer_p (1) > +=C2=A0 } > +=C2=A0 void impl_call_pre (const call_details &cd) const final override; > +=C2=A0 void impl_call_post (const call_details &cd) const final override= ; > +}; > + [...snip kf_PyList_Append implementation...] I confess that my eyes started glazing over at the kf_PyList_Append implementation. Given that this is just within the test plugin, I'll defer to you on this. > +class kf_PyList_New : public known_function > +{ > +public: > +=C2=A0 bool > +=C2=A0 matches_call_types_p (const call_details &cd) const final overrid= e > +=C2=A0 { > +=C2=A0=C2=A0=C2=A0 return (cd.num_args () =3D=3D 1 && arg_is_integral_p = (cd, 0)); > +=C2=A0 } > +=C2=A0 void impl_call_post (const call_details &cd) const final override= ; > +}; > + > +void > +kf_PyList_New::impl_call_post (const call_details &cd) const > +{ > +=C2=A0 class success : public call_info > +=C2=A0 { > +=C2=A0 public: > +=C2=A0=C2=A0=C2=A0 success (const call_details &cd) : call_info (cd) {} > + > +=C2=A0=C2=A0=C2=A0 label_text > +=C2=A0=C2=A0=C2=A0 get_desc (bool can_colorize) const final override > +=C2=A0=C2=A0=C2=A0 { > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 return make_label_text (can_colorize, "wh= en %qE succeeds", > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0 get_fndecl ()); > +=C2=A0=C2=A0=C2=A0 } > + > +=C2=A0=C2=A0=C2=A0 bool > +=C2=A0=C2=A0=C2=A0 update_model (region_model *model, const exploded_edg= e *, > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 region_model_context *ctxt) const final over= ride > +=C2=A0=C2=A0=C2=A0 { > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 const call_details cd (get_call_details (= model, ctxt)); > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 region_model_manager *mgr =3D cd.get_mana= ger (); > + > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 const svalue *pyobj_svalue > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 =3D mgr->get_or_c= reate_unknown_svalue (pyobj_record); > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 const svalue *varobj_svalue > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 =3D mgr->get_or_c= reate_unknown_svalue (varobj_record); > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 const svalue *pylist_svalue > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 =3D mgr->get_or_c= reate_unknown_svalue (pylistobj_record); > + > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 const svalue *size_sval =3D cd.get_arg_sv= alue (0); > + > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 const region *pylist_region > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 =3D init_pyobject= _region (mgr, model, pylist_svalue, cd); > + > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 /* > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 typedef struct > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 { > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 PyObject_VAR_HEAD > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 PyObject **ob_item; > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 Py_ssize_t allocated; > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 } PyListObject; > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 */ > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 tree varobj_field =3D get_field_by_name (= pylistobj_record, > "ob_base"); > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 const region *varobj_region > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 =3D mgr->get_fiel= d_region (pylist_region, varobj_field); > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 model->set_value (varobj_region, varobj_s= value, cd.get_ctxt > ()); > + > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 tree ob_item_field =3D get_field_by_name = (pylistobj_record, > "ob_item"); > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 const region *ob_item_region > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 =3D mgr->get_fiel= d_region (pylist_region, ob_item_field); > + > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 const svalue *zero_sval =3D mgr->get_or_c= reate_int_cst > (size_type_node, 0); > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 const svalue *casted_size_sval > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 =3D mgr->get_or_c= reate_cast (size_type_node, size_sval); > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 const svalue *size_cond_sval =3D mgr->get= _or_create_binop ( > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 size_type_node, L= E_EXPR, casted_size_sval, zero_sval); > + > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 // if size <=3D 0, ob_item =3D NULL > + > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 if (tree_int_cst_equal (size_cond_sval->m= aybe_get_constant (), > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0 integer_one_node)) > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 { I think the way I'd do this is to bifurcate on the <=3D 0 versus > 0 case, and add the constraints to the model, as this ought to better handle non-constant values for the size. But this is good enough for now. > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 const svalue *nul= l_sval > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0 =3D mgr->get_or_create_null_ptr (pyobj_ptr_ptr); > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 model->set_value = (ob_item_region, null_sval, cd.get_ctxt > ()); > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 } > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 else // calloc > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 { > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 const svalue *siz= eof_sval =3D mgr->get_or_create_cast ( > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0 size_sval->get_type (), get_sizeof_pyobjptr (mgr)); > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 const svalue *pro= d_sval =3D mgr->get_or_create_binop ( > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0 size_type_node, MULT_EXPR, sizeof_sval, size_sval); > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 const region *ob_= item_sized_region > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0 =3D model->get_or_create_region_for_heap_alloc > (prod_sval, > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 > cd.get_ctxt ()); > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 model->zero_fill_= region (ob_item_sized_region); > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 const svalue *ob_= item_ptr_sval > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0 =3D mgr->get_ptr_svalue (pyobj_ptr_ptr, > ob_item_sized_region); > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 const svalue *ob_= item_unmergeable > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0 =3D mgr->get_or_create_unmergeable (ob_item_ptr_sval); > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 model->set_value = (ob_item_region, ob_item_unmergeable, > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0 cd.get_ctxt ()); > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 } > + > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 /* > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 typedef struct { > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 PyObject ob_base; > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 Py_ssize_t ob_size; // Number of items in= variable part > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 } PyVarObject; > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 */ > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 const region *ob_base_region =3D get_ob_b= ase_region ( > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 mgr, model, varob= j_region, varobj_record, pyobj_svalue, > cd); > + > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 tree ob_size_tree =3D get_field_by_name (= varobj_record, > "ob_size"); > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 const region *ob_size_region > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 =3D mgr->get_fiel= d_region (varobj_region, ob_size_tree); > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 model->set_value (ob_size_region, size_sv= al, cd.get_ctxt ()); > + > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 /* > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 typedef struct _object { > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 _PyObject_HEAD_EX= TRA > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 Py_ssize_t ob_ref= cnt; > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 PyTypeObject *ob_= type; > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 } PyObject; > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 */ > + > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 // Initialize ob_refcnt field to 1. > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 init_ob_refcnt_field(mgr, model, ob_base_= region, pyobj_record, > cd); > + > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 // Get pointer svalue for PyList_Type the= n assign it to > ob_type field. > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 set_ob_type_field(mgr, model, ob_base_reg= ion, pyobj_record, > pylisttype_vardecl, cd); > + > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 if (cd.get_lhs_type ()) > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 { > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 const svalue *ptr= _sval > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0 =3D mgr->get_ptr_svalue (cd.get_lhs_type (), > pylist_region); > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 cd.maybe_set_lhs = (ptr_sval); > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 } > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 return true; > +=C2=A0=C2=A0=C2=A0 } > +=C2=A0 }; > + > +=C2=A0 if (cd.get_ctxt ()) > +=C2=A0=C2=A0=C2=A0 { > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 cd.get_ctxt ()->bifurcate (make_unique (cd)); > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 cd.get_ctxt ()->bifurcate (make_unique (cd)); > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 cd.get_ctxt ()->terminate_path (); > +=C2=A0=C2=A0=C2=A0 } > +} > + > +class kf_PyLong_FromLong : public known_function > +{ > +public: > +=C2=A0 bool > +=C2=A0 matches_call_types_p (const call_details &cd) const final overrid= e > +=C2=A0 { > +=C2=A0=C2=A0=C2=A0 return (cd.num_args () =3D=3D 1 && arg_is_integral_p = (cd, 0)); > +=C2=A0 } > +=C2=A0 void impl_call_post (const call_details &cd) const final override= ; > +}; > + > +void > +kf_PyLong_FromLong::impl_call_post (const call_details &cd) const > +{ > +=C2=A0 class success : public call_info > +=C2=A0 { > +=C2=A0 public: > +=C2=A0=C2=A0=C2=A0 success (const call_details &cd) : call_info (cd) {} > + > +=C2=A0=C2=A0=C2=A0 label_text > +=C2=A0=C2=A0=C2=A0 get_desc (bool can_colorize) const final override > +=C2=A0=C2=A0=C2=A0 { > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 return make_label_text (can_colorize, "wh= en %qE succeeds", > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0 get_fndecl ()); > +=C2=A0=C2=A0=C2=A0 } If you subclass from success_call_info then you get an equivalent get_desc implementation "for free". > + > +=C2=A0=C2=A0=C2=A0 bool > +=C2=A0=C2=A0=C2=A0 update_model (region_model *model, const exploded_edg= e *, > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 region_model_context *ctxt) const final over= ride > +=C2=A0=C2=A0=C2=A0 { Could add a comment here that we *don't* attempt to simulate the special-casing that CPython does for values -5 to 256 (see https://docs.python.org/3/c-api/long.html#c.PyLong_FromLong ). > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 const call_details cd (get_call_details (= model, ctxt)); > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 region_model_manager *mgr =3D cd.get_mana= ger (); > + > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 const svalue *pyobj_svalue > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 =3D mgr->get_or_c= reate_unknown_svalue (pyobj_record); > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 const svalue *pylongobj_sval > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 =3D mgr->get_or_c= reate_unknown_svalue (pylongobj_record); > + > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 const region *pylong_region > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 =3D init_pyobject= _region (mgr, model, pylongobj_sval, cd); > + > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 // Create a region for the base PyObject = within the > PyLongObject. > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 const region *ob_base_region =3D get_ob_b= ase_region ( > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 mgr, model, pylon= g_region, pylongobj_record, pyobj_svalue, > cd); > + > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 // Initialize ob_refcnt field to 1. > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 init_ob_refcnt_field(mgr, model, ob_base_= region, pyobj_record, > cd); > + > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 // Get pointer svalue for PyLong_Type the= n assign it to > ob_type field. > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 set_ob_type_field(mgr, model, ob_base_reg= ion, pyobj_record, > pylongtype_vardecl, cd); > + > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 // Set the PyLongObject value. > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 tree ob_digit_field =3D get_field_by_name= (pylongobj_record, > "ob_digit"); > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 const region *ob_digit_region > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 =3D mgr->get_fiel= d_region (pylong_region, ob_digit_field); > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 const svalue *ob_digit_sval =3D cd.get_ar= g_svalue (0); > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 model->set_value (ob_digit_region, ob_dig= it_sval, cd.get_ctxt > ()); > + > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 if (cd.get_lhs_type ()) > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 { > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 const svalue *ptr= _sval > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0 =3D mgr->get_ptr_svalue (cd.get_lhs_type (), > pylong_region); > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 cd.maybe_set_lhs = (ptr_sval); > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 } > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 return true; > +=C2=A0=C2=A0=C2=A0 } > +=C2=A0 }; > + > +=C2=A0 if (cd.get_ctxt ()) > +=C2=A0=C2=A0=C2=A0 { > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 cd.get_ctxt ()->bifurcate (make_unique (cd)); > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 cd.get_ctxt ()->bifurcate (make_unique (cd)); > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 cd.get_ctxt ()->terminate_path (); > +=C2=A0=C2=A0=C2=A0 } > +} [...] > diff --git a/gcc/testsuite/gcc.dg/plugin/cpython-plugin-test-2.c > b/gcc/testsuite/gcc.dg/plugin/cpython-plugin-test-2.c > new file mode 100644 > index 00000000000..19b5c17428a > --- /dev/null > +++ b/gcc/testsuite/gcc.dg/plugin/cpython-plugin-test-2.c > @@ -0,0 +1,78 @@ > +/* { dg-do compile } */ > +/* { dg-require-effective-target analyzer } */ > +/* { dg-options "-fanalyzer" } */ > +/* { dg-require-python-h "" } */ > + > + > +#define PY_SSIZE_T_CLEAN > +#include > +#include "../analyzer/analyzer-decls.h" > + > +PyObject * > +test_PyList_New (Py_ssize_t len) > +{ > +=C2=A0 PyObject *obj =3D PyList_New (len); > +=C2=A0 if (obj) > +=C2=A0=C2=A0=C2=A0 { > +=C2=A0=C2=A0=C2=A0=C2=A0 __analyzer_eval (obj->ob_refcnt =3D=3D 1); /* {= dg-warning "TRUE" } > */ > +=C2=A0=C2=A0=C2=A0=C2=A0 __analyzer_eval (PyList_CheckExact (obj)); /* {= dg-warning > "TRUE" } */ > +=C2=A0=C2=A0=C2=A0 } > +=C2=A0 else > +=C2=A0=C2=A0=C2=A0 __analyzer_dump_path (); /* { dg-message "path" } */ > +=C2=A0 return obj; > +} There's lots of scope for extra test coverage here. For example, for all these test_FOO_New cases, consider a variant with "void" return and no "return obj;" at the end. The analyzer ought to report a leak when we fall off the end of these functions. Similarly, it's good to try both: - symbolic values for arguments (like you have here) - constant values (for example, what happens with NULL for pointer params?) FWIW I like to organize test coverage for specific known functions into test cases named after the function, so perhaps we could have: cpython-plugin-test-PyList_Append.c cpython-plugin-test-PyList_New.c cpython-plugin-test-PyLong_New.c but that's up to you. All this can be left to a follow-up, though. > + > +PyObject * > +test_PyLong_New (long n) > +{ > +=C2=A0 PyObject *obj =3D PyLong_FromLong (n); > +=C2=A0 if (obj) > +=C2=A0=C2=A0=C2=A0 { > +=C2=A0=C2=A0=C2=A0=C2=A0 __analyzer_eval (obj->ob_refcnt =3D=3D 1); /* {= dg-warning "TRUE" } > */ > +=C2=A0=C2=A0=C2=A0=C2=A0 __analyzer_eval (PyLong_CheckExact (obj)); /* {= dg-warning > "TRUE" } */ > +=C2=A0=C2=A0=C2=A0 } > +=C2=A0 else > +=C2=A0=C2=A0=C2=A0 __analyzer_dump_path (); /* { dg-message "path" } */ > +=C2=A0 return obj; > +} > + > +PyObject * > +test_PyListAppend (long n) > +{ > +=C2=A0 PyObject *item =3D PyLong_FromLong (n); > +=C2=A0 PyObject *list =3D PyList_New (0); > +=C2=A0 PyList_Append(list, item); > +=C2=A0 return list; /* { dg-warning "leak of 'item'" } */ > +} > + > +PyObject * > +test_PyListAppend_2 (long n) > +{ > +=C2=A0 PyObject *item =3D PyLong_FromLong (n); > +=C2=A0 if (!item) > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0return NULL; > + > +=C2=A0 __analyzer_eval (item->ob_refcnt =3D=3D 1); /* { dg-warning "TRUE= " } > */ > +=C2=A0 PyObject *list =3D PyList_New (n); > +=C2=A0 if (!list) > +=C2=A0 { > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0Py_DECREF(item); > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0return NULL; > +=C2=A0 } > + > +=C2=A0 __analyzer_eval (list->ob_refcnt =3D=3D 1); /* { dg-warning "TRUE= " } > */ > + > +=C2=A0 if (PyList_Append (list, item) < 0) > +=C2=A0=C2=A0=C2=A0 __analyzer_eval (item->ob_refcnt =3D=3D 1); /* { dg-w= arning "TRUE" } > */ > +=C2=A0 else > +=C2=A0=C2=A0=C2=A0 __analyzer_eval (item->ob_refcnt =3D=3D 2); /* { dg-w= arning "TRUE" } > */ > +=C2=A0 return list; /* { dg-warning "leak of 'item'" } */ > +} > + > + > +PyObject * > +test_PyListAppend_3 (PyObject *item, PyObject *list) > +{ > +=C2=A0 PyList_Append (list, item); > +=C2=A0 return list; > +} > \ No newline at end of file [...] Overall, I think that assuming the rebase is trivial then with the nits fixed, this is good for trunk. As noted above there are some issues with the known_function implementations in the plugin, but that's a minor detail that doesn't impact anyone else, so let's not perfect be the enemy of the good. Hope the above makes sense; thanks again for the patch. Dave