From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by sourceware.org (Postfix) with ESMTPS id DF2243858D37 for ; Thu, 27 Jul 2023 22:35:31 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org DF2243858D37 Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=redhat.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=redhat.com DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1690497331; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=Ndsf61UPT09O+X3Q6yKp6R6uCa8ENi/vPdqwQnLhFfQ=; b=LEFz6s2mDWgwU3tUQsSeG+in8Wlj3vIqRXUfkN+RpqJ8wPc+siedG/5d4CU8XWhRXpL5gS /3O9jbaGcVzB1NEo93Y7M1hrwYnFoqFYz0A0oFbxP8YooaWwfMMicuzufp/PZTqyBpjjhJ Z7Q0sCyH7M4robLXx4aPQ/QLlzI9YBc= Received: from mail-qv1-f69.google.com (mail-qv1-f69.google.com [209.85.219.69]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-684-Zq9lKAKLOayFbVnfz9RdyA-1; Thu, 27 Jul 2023 18:35:30 -0400 X-MC-Unique: Zq9lKAKLOayFbVnfz9RdyA-1 Received: by mail-qv1-f69.google.com with SMTP id 6a1803df08f44-63cf0a96ab1so16505436d6.3 for ; Thu, 27 Jul 2023 15:35:30 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1690497329; x=1691102129; h=mime-version:user-agent:content-transfer-encoding:references :in-reply-to:date:cc:to:from:subject:message-id:x-gm-message-state :from:to:cc:subject:date:message-id:reply-to; bh=Ndsf61UPT09O+X3Q6yKp6R6uCa8ENi/vPdqwQnLhFfQ=; b=Oj9NTYkVK8C+wqSlST4FbegL9zEzISHX3r2S/hIsHIh6AjpjcNuGkhvX6J2GIelk4k Wiz+dzph0FQdPCVumWN7AlazI7kaDlnFQplgYjwjFzW/B2pR/brf5YX4LHvGYYNNY+yf HSvSDYxD6yqGZ33D3MhpHomslTqzogXyfg7oM4jA50jPnwwoS9qI6R+cQ0Bey8dRrUup EIMTcjIphULxabryd7ObOcqvzkijEtSZ6muBfV3Lfn2/mLCowkzkj9FRyzX6UPVuuJb1 aaPzECiOttguTiUgBKDAZqCNnw7ByYBFY86WZCHvU0KEr4WYTQDm5s3SE4NIuoUVKwrH dqwA== X-Gm-Message-State: ABy/qLb3mxBPQ1xbra1ssRUcjVCSu2ScCjr6NUxhD8i7yD9nG0ipkvEi uGVlGNEGyJexqikzW+tea7J2zT+fKOqvD+qqcvhPmGXPwJW4Y3Fa74N7SB6BzsIbu8kNv1WxHNG lJNSMbh1J/YQh244= X-Received: by 2002:a0c:ef0a:0:b0:63d:419c:5916 with SMTP id t10-20020a0cef0a000000b0063d419c5916mr798325qvr.35.1690497329309; Thu, 27 Jul 2023 15:35:29 -0700 (PDT) X-Google-Smtp-Source: APBJJlHkMPeLTPdof6OmyRx39oeHVzrb5OdTFvTjXNl5LIpz8lwKJnu99AYyc33zV9Ofp6TAJCZWUw== X-Received: by 2002:a0c:ef0a:0:b0:63d:419c:5916 with SMTP id t10-20020a0cef0a000000b0063d419c5916mr798320qvr.35.1690497329034; Thu, 27 Jul 2023 15:35:29 -0700 (PDT) Received: from t14s.localdomain (c-76-28-97-5.hsd1.ma.comcast.net. [76.28.97.5]) by smtp.gmail.com with ESMTPSA id i7-20020a0cab47000000b0063c71b62239sm759651qvb.42.2023.07.27.15.35.27 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 27 Jul 2023 15:35:28 -0700 (PDT) Message-ID: <969057b59e5cf472b73e8e1dedcc4a46630b31a0.camel@redhat.com> Subject: Re: Update and Questions on CPython Extension Module -fanalyzer plugin development From: David Malcolm To: Eric Feng Cc: gcc@gcc.gnu.org Date: Thu, 27 Jul 2023 18:35:27 -0400 In-Reply-To: References: User-Agent: Evolution 3.44.4 (3.44.4-2.fc36) MIME-Version: 1.0 X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Spam-Status: No, score=-3.8 required=5.0 tests=BAYES_00,BODY_8BITS,DKIMWL_WL_HIGH,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_NONE,RCVD_IN_MSPIKE_H4,RCVD_IN_MSPIKE_WL,SPF_HELO_NONE,SPF_NONE,TXREP,T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org List-Id: On Thu, 2023-07-27 at 18:13 -0400, Eric Feng wrote: > Hi Dave, >=20 > Thanks for the comments! >=20 > [...] > > Do you have any DejaGnu tests for this functionality?=C2=A0 For example= , > > given PyList_New > > =C2=A0 https://docs.python.org/3/c-api/list.html#c.PyList_New > > there could be a test like: > >=20 > > /* { dg-require-effective-target python_h } */ > >=20 > > #define PY_SSIZE_T_CLEAN > > #include > > #include "analyzer-decls.h" > >=20 > > PyObject * > > test_PyList_New (Py_ssize_t len) > > { > > =C2=A0 PyObject *obj =3D PyList_New (len); > > =C2=A0 if (obj) > > =C2=A0=C2=A0=C2=A0 { > > =C2=A0=C2=A0=C2=A0=C2=A0 __analyzer_eval (obj->ob_refcnt =3D=3D 1); /* = { dg-warning "TRUE" > > } */ > > =C2=A0=C2=A0=C2=A0=C2=A0 __analyzer_eval (PyList_Check (obj)); /* { dg-= warning "TRUE" } > > */ > > =C2=A0=C2=A0=C2=A0=C2=A0 __analyzer_eval (PyList_CheckExact (obj)); /* = { dg-warning > > "TRUE" } */ > > =C2=A0=C2=A0=C2=A0 } > > =C2=A0 else > > =C2=A0=C2=A0=C2=A0 __analyzer_dump_path (); /* { dg-warning "path" } */ > > =C2=A0 return obj; > > } > >=20 > > ...or similar, to verify that we simulate that the call can both > > succeed and fail, and to verify properties of the store along the > > "success" path.=C2=A0 Caveat: I didn't look at exactly what properties > > you're simulating, so the above tests might need adjusting. > >=20 >=20 > I am currently in the process of developing more tests. Specific to > the test you provided as an example, we are passing all cases except > for PyList_Check. PyList_Check does not pass because I have not yet > added support for the various definitions of tp_flags. As noted in our chat earlier, I don't think we can easily make these work. Looking at CPython's implementation: PyList_Type's initializer here: https://github.com/python/cpython/blob/main/Objects/listobject.c#L3101 initializes tp_flags with the flags, but: (a) we don't see that code when compiling a user's extension module (b) even if we did, PyList_Type is non-const, so the analyzer has to assume that tp_flags could have been written to since it was initialized In theory we could specialcase such lookups, so that, say, a plugin could register assumptions into the analyzer about the value of bits within (PyList_Type.tp_flags). However, this seems like a future feature. > I also > encountered a minor hiccup where PyList_CheckExact appeared to give > "UNKNOWN" rather than "TRUE", but this has since been fixed. The > problem was caused by accidentally using the tree representation of > struct PyList_Type as opposed to struct PyList_Type * when creating a > pointer sval to the region for Pylist_Type. Ah, good. >=20 > [...] > >=20 > > > Let's consider the following example which lacks error checking: > > >=20 > > > PyObject* foo() { > > > =C2=A0=C2=A0=C2=A0 PyObject item =3D PyLong_FromLong(10); > > > =C2=A0=C2=A0=C2=A0 PyObject list =3D PyList_New(5); > > > =C2=A0=C2=A0=C2=A0 return list; > > > } > > >=20 > > > The states for when PyLong_FromLong fails and when > > > PyLong_FromLong > > > succeeds are merged before the call to PyObject* list =3D > > > PyList_New(5). > >=20 > > Ideally we would emit a leak warning about the "success" case of > > PyLong_FromLong here.=C2=A0 I think you're running into the problem of > > the > > "store" part of the program_state being separate from the "malloc" > > state machine part of program_state - I'm guessing that you're > > creating > > a heap_allocated_region for the new python object, but the "malloc" > > state machine isn't transitioning the pointer from "start" to > > "assumed- > > non-null".=C2=A0 Such state machine states inhibit state-merging, and s= o > > this might solve your state-merging problem. > >=20 > > I think we need a way to call > > malloc_state_machine::on_allocator_call > > from outside of sm-malloc.cc.=C2=A0 See > > region_model::on_realloc_with_move > > for an example of how to do something similar. > >=20 >=20 > Thank you for the suggestion =E2=80=94 this worked great and has solved t= he > issue! Excellent! Thanks for the update Dave