public inbox for gcc@gcc.gnu.org
From: Eric Feng <ef2648@columbia.edu>
To: David Malcolm <dmalcolm@redhat.com>
Cc: gcc@gcc.gnu.org
Subject: Re: [GSoC] Interest and initial proposal for project on reimplementing cpychecker as -fanalyzer plugin
Date: Mon, 3 Apr 2023 10:29:55 -0400
Message-ID: <CANGHATWM6Vtzo7WSogDc99AgZo5kWg+UMQGTiwozpvNfXzWdyg@mail.gmail.com>
In-Reply-To: <5ac8582d2f76f48133d4b933574d775863a347bf.camel@redhat.com>

Thanks for bringing this to my attention, Dave! I’m happy to
collaborate on this project with Steven. I will reply in more detail
in the other thread.

Best,
Eric


On Sun, Apr 2, 2023 at 7:28 PM David Malcolm <dmalcolm@redhat.com> wrote:
>
> On Sat, 2023-04-01 at 19:49 -0400, Eric Feng wrote:
> > > For the task above, I think it's almost all there, it's "just" a
> > > case of implementing the special-case knowledge about the CPython
> > > API, mostly via known_function subclasses.
> >
> > Sounds good.
> >
> >
> > > In cpychecker I added some custom function attributes:
> > >
> > > https://gcc-python-plugin.readthedocs.io/en/latest/cpychecker.html
> > > which were:
> > >   __attribute__((cpychecker_returns_borrowed_ref))
> > >   __attribute__((cpychecker_steals_reference_to_arg(n)))
> > >
> > [...]
> > >
> > > But exactly what these macros would look like would be a decision
> > > for the CPython community (hence do it via PEP, based on a sample
> > > implementation).
> >
> > Ok, I see what you mean now. Thanks for clarifying!
> >
> >
> > > Yeah, this sounds like a big project.  Fortunately there are a lot
> > > of possible subtasks in this one, and the project has benefits to
> > > GCC and to CPython even if you only get a subset of the ideas done
> > > in the time available (refcount checking being probably the
> > > highest-value subtask).
> >
> > Sounds good.
> >
> > I refactored the project description and timeline sections of the
> > proposal according to our conversation. Notably, I moved format
> > string checking to task #2 in the timeline since its subtasks are
> > particularly beneficial. I also suggest in the timeline section to
> > reach out to the CPython community via PEP about the specifics of new
> > attributes in week 9/10 since I think we should have a somewhat
> > mature prototype by that point. Let me know if you think it should be
> > done earlier/later. Please find the changed sections below (I omitted
> > unchanged sections for brevity).
> > _______
> >
> > Describe the project and clearly define its goals:
> > One pertinent use case of the gcc-python-plugin was as a static
> > analysis tool for CPython extension modules. The main goal of the
> > plugin was to help programmers writing extensions identify common
> > coding errors. The gcc-python-plugin has bitrotted over the years
> > and, in particular, cpychecker stopped working some GCC releases ago.
> > Broadly, the goal of this project is to port the functionality of
> > cpychecker to a -fanalyzer plugin.
> >
> > Below is a brief description of the functionality of the static
> > analysis tool that I will work on porting to a -fanalyzer plugin.
> > The structure of the objectives is based on the
> > gcc-python-plugin documentation:
> >
> > Reference count checking: <Unchanged from original proposal>
> >
> > Format string checking: Some CPython APIs such as PyArg_ParseTuple,
> > PyArg_ParseTupleAndKeywords, etc., take format strings as arguments.
> > This check involves verifying that the format strings taken in by
> > these APIs are correct with respect to the number and types of
> > arguments passed in. In particular, I will work on integrating the
> > analyzer with -Wformat
> > (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107017) and adding
> > plugin support for -Wformat
> > (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100121). We should
> > then be able to specify our own archetype which reflects the format
> > string syntax for the relevant CPython APIs and take advantage of the
> > integrated analyzer to check them.
> >
> > Associating PyTypeObject instances with compile-time-types:
> > <Unchanged from original proposal>
> >
> > Error-handling checking (including errors in exception handling):
> > Common errors such as dereferencing a NULL value are already checked
> > by the analyzer. I will extend this functionality by implementing
> > special-case knowledge about the CPython API.
> >
> > Verification of PyMethodDef tables: <Unchanged from original
> > proposal>
> >
> > Provide an expected timeline:
> > Please find a rough estimate of the weekly progress in relation to
> > the features described below. Tasks that I expect to take longer than
> > one week are broken down in more detail. In addition to what’s
> > described, each task also involves adding test coverage pertaining to
> > its specific feature to a regression test suite.
> >
> > Week 1 - 7: Reference counting checking
> >     Week 1: Set up the overall infrastructure of the plugin and begin
> >     building core functionality
> >     Week 1 - 6: Core reference counting functionality
> >     Week 7: Refine prototype
> > Week 8 - 10.5: Format string checking (including associating
> > PyTypeObject instances with compile-time-types)
> >     Week 8 - ~9: RFE: support printf-style formatted functions in
> >     -fanalyzer
> >     Week ~9 - 10.5: RFE: plugin support for -Wformat via
> >     __attribute__((format()))
> >     Additionally, begin conversing with CPython community via PEP
> >     about the exact form of new attributes on CPython headers which
> >     may be helpful for both humans and the static analyzer. Present
> >     ideas based on work done so far.
> > Week 10.5 - 12: Error-handling checking, errors in exception
> > handling, and verification of PyMethodDef tables
> >
>
> Sounds great.
>
> Note that the deadline for submitting proposals to the official GSoC
> website is April 4 - 18:00 UTC (i.e. this coming Tuesday) and that
> Google are very strict about that deadline; see:
> https://developers.google.com/open-source/gsoc/timeline
>
> Please include the biographical detail on yourself in the proposal that
> you posted on the list, and if you can, link to C++ code you've
> written.
>
>
> I don't know if you saw the emails from Sun Steven, but they're also
> interested in this project, perhaps as a collaboration with you.  Given
> that the project is large and could be chopped up into several
> components that might be a possibility - but don't feel like you need
> to do that yourself in your proposal; as noted in the email I just
> sent, we don't know how many slots we'll get from the GSoC program.
>
> Good luck
> Dave
>
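
For illustration only: the format string check described in the quoted
proposal compares the format units passed to PyArg_ParseTuple and friends
against the number and types of the remaining arguments. Below is a
minimal sketch of that idea as a toy Python model. It is not code from
cpychecker or GCC; the names EXPECTED_TYPES, expected_arg_types and
check_call are hypothetical, C types are modeled as plain strings, and
only a few one-character format units are covered.

```python
# Toy model of the format-string check: compare the format units in a
# PyArg_ParseTuple-style string against the C types of the varargs.
# Only a handful of one-character units are modeled; this is an
# illustrative assumption, not the cpychecker or GCC implementation.

EXPECTED_TYPES = {
    "i": "int *",
    "l": "long *",
    "f": "float *",
    "d": "double *",
    "s": "const char **",
    "O": "PyObject **",
}


def expected_arg_types(fmt):
    """Return the C pointer types implied by a format string."""
    types = []
    for ch in fmt:
        if ch in ":;":      # function name / error message follows
            break
        if ch == "|":       # marks the start of optional arguments
            continue
        if ch not in EXPECTED_TYPES:
            raise ValueError(f"unmodeled format unit: {ch!r}")
        types.append(EXPECTED_TYPES[ch])
    return types


def check_call(fmt, actual_types):
    """Return a list of human-readable mismatch descriptions."""
    expected = expected_arg_types(fmt)
    problems = []
    if len(expected) != len(actual_types):
        problems.append(
            f"format {fmt!r} expects {len(expected)} argument(s), "
            f"got {len(actual_types)}"
        )
    # Compare types pairwise for the arguments that are present.
    for index, (want, got) in enumerate(zip(expected, actual_types), start=1):
        if want != got:
            problems.append(f"argument {index}: expected {want}, got {got}")
    return problems
```

An empty list from check_call("id", ["int *", "double *"]) means the call
is consistent, whereas check_call("s", ["int *"]) reports a type mismatch
for the first argument. A real checker would work on the compiler's own
type representation rather than on strings.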
