public inbox for gdb-patches@sourceware.org
 help / color / mirror / Atom feed
From: Andrew Burgess <andrew.burgess@embecosm.com>
To: Simon Marchi <simon.marchi@polymtl.ca>
Cc: gdb-patches@sourceware.org
Subject: Re: [PATCH] gdb/python: fix memory leak in python inferior code
Date: Mon, 27 Sep 2021 14:40:43 +0100	[thread overview]
Message-ID: <20210927134043.GG1900093@embecosm.com> (raw)
In-Reply-To: <20210908092634.GU2581@embecosm.com>

Ping!

Thanks,
Andrew

* Andrew Burgess <andrew.burgess@embecosm.com> [2021-09-08 10:26:34 +0100]:

> * Simon Marchi <simon.marchi@polymtl.ca> [2021-09-07 11:10:20 -0400]:
> 
> > 
> > 
> > On 2021-09-05 4:53 a.m., Andrew Burgess wrote:
> > > When a user creates a gdb.Inferior object for the first time a new
> > > Python object is created.  This object is then cached within GDB's
> > > inferior object using the registry mechanism (see
> > > inferior_to_inferior_object in py-inferior.c, specifically the calls
> > > to inferior_data and set_inferior_data).
> > > 
> > > This reference to the Python object, held within the real inferior
> > > object, counts as a reference to the Python object, and so holds the
> > > reference count of the Python object above zero.
> > > 
> > > At the same time, the Python object maintains a pointer back to GDB's
> > > real inferior object.
> > > 
> > > When GDB's inferior object is deleted (say the inferior exits) then
> > > py_free_inferior is called (thanks to the registry system), this
> > > function looks up the Python object and sets the inferior pointer to
> > > nullptr and finally deletes reduces the reference count on the Python
> > > object.
> > > 
> > > If at this point the user still holds a reference to the Python object
> > > then nothing happens.  The gdb.Inferior Python object is now in the
> > > non-valid state (see infpy_is_valid in py-inferior.c), but otherwise,
> > > everything is fine.
> > > 
> > > However, if there are no further references to the gdb.Inferior
> > > object, or, once the user has given up all their references to the
> > > gdb.Inferior object, then infpy_dealloc is called.
> > > 
> > > This function currently checks to see if the inferior pointer within
> > > the gdb.Inferior object is nullptr or not.  If the pointer is nullptr
> > > then infpy_dealloc immediately returns.
> > > 
> > > Only when the inferior point in the gdb.Inferior is not nullptr do
> > > we (a) set the gdb.Inferior reference inside GDB's inferior to
> > > nullptr, and (b) call the underlying Python tp_free function.
> > > 
> > > There are a number things wrong here:
> > > 
> > >   1. The gdb.Inferior reference within GDB's inferior object hold a
> > >   reference count, thus, setting this reference to nullptr without
> > >   first decrementing the reference count would leak a reference.  This
> > >   could be solved by having the inferior hold a gdbpy_ref object,
> > >   however...
> > > 
> > >   2. As GDB's inferior holds a reference then infpy_dealloc will never
> > >   be called until GDB's inferior object is deleted, calls
> > >   py_free_inferior, and so gives up the reference.  At this point
> > >   there is no longer a need to call set_inferior_data to set the field
> > >   back to NULL, that field must have been cleared in order to get the
> > >   reference count to zero, which means...
> > > 
> > >   3. If we know that py_free_inferior must be called before
> > >   infpy_dealloc, then we know that the inferior pointer in
> > >   gdb.Inferior will always be nullptr when infpy_dealloc is called,
> > >   this means that the call to the underlying tp_free function will
> > >   always be skipped.  Skipping this call at all always seems like a
> > >   bad idea, but it turns out we will always skip it.
> > > 
> > > Given all of the above, I assert that the gdb.Inferior field will
> > > always be nullptr when infpy_dealloc is called.  That's what this
> > > patch does.
> > > 
> > > I assume that failing to call tp_free will mean that Python leaks some
> > > memory associated with the gdb.Inferior object, but I don't know any
> > > way that I could show that in a test.  As such, I've got no new tests
> > > for this commit, but if anyone has any ideas I'd be happy to add
> > > something.
> > 
> > I tried to read this a few times, and I'm stuck on not understanding how
> > there isn't a reference cycle here.  If the Python object holds a
> > reference on the inferior, and the inferior holds a reference on the
> > Python object (through the registry), how can that work?  Or is it a
> > weak reference in one of the directions?
> 
> There's no "reference" from the Python object to the inferior in the
> same way there's a "reference" from the inferior to the Python object.
> 
> That is the inferior (via the registry) holds a reference to the
> Python object.  This means that the Python object can't be deleted
> until the reference in the inferior is released (that reference holds
> the refcount above zero).
> 
> However, the Python object only holds a pointer back to the inferior,
> even though inferiors are a child of refcounted_object, the pointer
> within the Python object isn't handled as a refcounted_object, just a
> raw pointer.
> 
> So, the inferior's refcount can hit zero with no consideration for the
> pointer held within the Python object, but that's OK, when the
> inferior is deleted we clean up the registry.  This calls into the
> Python code, which sets the inferior pointer to nullptr, and
> decrements the refcount on the Python object.
> 
> When the user gives up their last gdb.Inferior reference the Python
> object can then be free'd.
> 
> Hope that makes things a little clearer,
> 
> Thanks,
> Andrew

  reply	other threads:[~2021-09-27 13:40 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-09-05  8:53 Andrew Burgess
2021-09-07 15:10 ` Simon Marchi
2021-09-08  9:26   ` Andrew Burgess
2021-09-27 13:40     ` Andrew Burgess [this message]
2021-10-05 12:02 ` [PATCHv2] " Andrew Burgess
2021-10-05 13:17   ` Simon Marchi

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20210927134043.GG1900093@embecosm.com \
    --to=andrew.burgess@embecosm.com \
    --cc=gdb-patches@sourceware.org \
    --cc=simon.marchi@polymtl.ca \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).