public inbox for gdb-patches@sourceware.org
 help / color / mirror / Atom feed
* [PATCH] gdb/python: fix memory leak in python inferior code
@ 2021-09-05  8:53 Andrew Burgess
  2021-09-07 15:10 ` Simon Marchi
  2021-10-05 12:02 ` [PATCHv2] " Andrew Burgess
  0 siblings, 2 replies; 6+ messages in thread
From: Andrew Burgess @ 2021-09-05  8:53 UTC (permalink / raw)
  To: gdb-patches

When a user creates a gdb.Inferior object for the first time a new
Python object is created.  This object is then cached within GDB's
inferior object using the registry mechanism (see
inferior_to_inferior_object in py-inferior.c, specifically the calls
to inferior_data and set_inferior_data).

This reference to the Python object, held within the real inferior
object, counts as a reference to the Python object, and so holds the
reference count of the Python object above zero.

At the same time, the Python object maintains a pointer back to GDB's
real inferior object.

When GDB's inferior object is deleted (say the inferior exits) then
py_free_inferior is called (thanks to the registry system), this
function looks up the Python object and sets the inferior pointer to
nullptr and finally deletes reduces the reference count on the Python
object.

If at this point the user still holds a reference to the Python object
then nothing happens.  The gdb.Inferior Python object is now in the
non-valid state (see infpy_is_valid in py-inferior.c), but otherwise,
everything is fine.

However, if there are no further references to the gdb.Inferior
object, or, once the user has given up all their references to the
gdb.Inferior object, then infpy_dealloc is called.

This function currently checks to see if the inferior pointer within
the gdb.Inferior object is nullptr or not.  If the pointer is nullptr
then infpy_dealloc immediately returns.

Only when the inferior point in the gdb.Inferior is not nullptr do
we (a) set the gdb.Inferior reference inside GDB's inferior to
nullptr, and (b) call the underlying Python tp_free function.

There are a number things wrong here:

  1. The gdb.Inferior reference within GDB's inferior object hold a
  reference count, thus, setting this reference to nullptr without
  first decrementing the reference count would leak a reference.  This
  could be solved by having the inferior hold a gdbpy_ref object,
  however...

  2. As GDB's inferior holds a reference then infpy_dealloc will never
  be called until GDB's inferior object is deleted, calls
  py_free_inferior, and so gives up the reference.  At this point
  there is no longer a need to call set_inferior_data to set the field
  back to NULL, that field must have been cleared in order to get the
  reference count to zero, which means...

  3. If we know that py_free_inferior must be called before
  infpy_dealloc, then we know that the inferior pointer in
  gdb.Inferior will always be nullptr when infpy_dealloc is called,
  this means that the call to the underlying tp_free function will
  always be skipped.  Skipping this call at all always seems like a
  bad idea, but it turns out we will always skip it.

Given all of the above, I assert that the gdb.Inferior field will
always be nullptr when infpy_dealloc is called.  That's what this
patch does.

I assume that failing to call tp_free will mean that Python leaks some
memory associated with the gdb.Inferior object, but I don't know any
way that I could show that in a test.  As such, I've got no new tests
for this commit, but if anyone has any ideas I'd be happy to add
something.
---
 gdb/python/py-inferior.c | 16 ++++++++++++----
 1 file changed, 12 insertions(+), 4 deletions(-)

diff --git a/gdb/python/py-inferior.c b/gdb/python/py-inferior.c
index cfbc2f6574f..c36f275b23e 100644
--- a/gdb/python/py-inferior.c
+++ b/gdb/python/py-inferior.c
@@ -864,12 +864,20 @@ static void
 infpy_dealloc (PyObject *obj)
 {
   inferior_object *inf_obj = (inferior_object *) obj;
-  struct inferior *inf = inf_obj->inferior;
 
-  if (! inf)
-    return;
+  /* The inferior itself holds a reference to this Python object, which
+     will keep the reference count of this object above zero until GDB
+     deletes the inferior and py_free_inferior is called.
+
+     Once py_free_inferior has been called then the link between this
+     Python object and the inferior is set to nullptr, and then the
+     reference count on this Python object is decremented.
+
+     The result of all this is that the link between this Python object and
+     the inferior should always have been set to nullptr before this
+     function is called.  */
+  gdb_assert (inf_obj->inferior == nullptr);
 
-  set_inferior_data (inf, infpy_inf_data_key, NULL);
   Py_TYPE (obj)->tp_free (obj);
 }
 
-- 
2.25.4


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] gdb/python: fix memory leak in python inferior code
  2021-09-05  8:53 [PATCH] gdb/python: fix memory leak in python inferior code Andrew Burgess
@ 2021-09-07 15:10 ` Simon Marchi
  2021-09-08  9:26   ` Andrew Burgess
  2021-10-05 12:02 ` [PATCHv2] " Andrew Burgess
  1 sibling, 1 reply; 6+ messages in thread
From: Simon Marchi @ 2021-09-07 15:10 UTC (permalink / raw)
  To: Andrew Burgess, gdb-patches



On 2021-09-05 4:53 a.m., Andrew Burgess wrote:
> When a user creates a gdb.Inferior object for the first time a new
> Python object is created.  This object is then cached within GDB's
> inferior object using the registry mechanism (see
> inferior_to_inferior_object in py-inferior.c, specifically the calls
> to inferior_data and set_inferior_data).
> 
> This reference to the Python object, held within the real inferior
> object, counts as a reference to the Python object, and so holds the
> reference count of the Python object above zero.
> 
> At the same time, the Python object maintains a pointer back to GDB's
> real inferior object.
> 
> When GDB's inferior object is deleted (say the inferior exits) then
> py_free_inferior is called (thanks to the registry system), this
> function looks up the Python object and sets the inferior pointer to
> nullptr and finally deletes reduces the reference count on the Python
> object.
> 
> If at this point the user still holds a reference to the Python object
> then nothing happens.  The gdb.Inferior Python object is now in the
> non-valid state (see infpy_is_valid in py-inferior.c), but otherwise,
> everything is fine.
> 
> However, if there are no further references to the gdb.Inferior
> object, or, once the user has given up all their references to the
> gdb.Inferior object, then infpy_dealloc is called.
> 
> This function currently checks to see if the inferior pointer within
> the gdb.Inferior object is nullptr or not.  If the pointer is nullptr
> then infpy_dealloc immediately returns.
> 
> Only when the inferior point in the gdb.Inferior is not nullptr do
> we (a) set the gdb.Inferior reference inside GDB's inferior to
> nullptr, and (b) call the underlying Python tp_free function.
> 
> There are a number things wrong here:
> 
>   1. The gdb.Inferior reference within GDB's inferior object hold a
>   reference count, thus, setting this reference to nullptr without
>   first decrementing the reference count would leak a reference.  This
>   could be solved by having the inferior hold a gdbpy_ref object,
>   however...
> 
>   2. As GDB's inferior holds a reference then infpy_dealloc will never
>   be called until GDB's inferior object is deleted, calls
>   py_free_inferior, and so gives up the reference.  At this point
>   there is no longer a need to call set_inferior_data to set the field
>   back to NULL, that field must have been cleared in order to get the
>   reference count to zero, which means...
> 
>   3. If we know that py_free_inferior must be called before
>   infpy_dealloc, then we know that the inferior pointer in
>   gdb.Inferior will always be nullptr when infpy_dealloc is called,
>   this means that the call to the underlying tp_free function will
>   always be skipped.  Skipping this call at all always seems like a
>   bad idea, but it turns out we will always skip it.
> 
> Given all of the above, I assert that the gdb.Inferior field will
> always be nullptr when infpy_dealloc is called.  That's what this
> patch does.
> 
> I assume that failing to call tp_free will mean that Python leaks some
> memory associated with the gdb.Inferior object, but I don't know any
> way that I could show that in a test.  As such, I've got no new tests
> for this commit, but if anyone has any ideas I'd be happy to add
> something.

I tried to read this a few times, and I'm stuck on not understanding how
there isn't a reference cycle here.  If the Python object holds a
reference on the inferior, and the inferior holds a reference on the
Python object (through the registry), how can that work?  Or is it a
weak reference in one of the directions?

Simon

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] gdb/python: fix memory leak in python inferior code
  2021-09-07 15:10 ` Simon Marchi
@ 2021-09-08  9:26   ` Andrew Burgess
  2021-09-27 13:40     ` Andrew Burgess
  0 siblings, 1 reply; 6+ messages in thread
From: Andrew Burgess @ 2021-09-08  9:26 UTC (permalink / raw)
  To: Simon Marchi; +Cc: gdb-patches

* Simon Marchi <simon.marchi@polymtl.ca> [2021-09-07 11:10:20 -0400]:

> 
> 
> On 2021-09-05 4:53 a.m., Andrew Burgess wrote:
> > When a user creates a gdb.Inferior object for the first time a new
> > Python object is created.  This object is then cached within GDB's
> > inferior object using the registry mechanism (see
> > inferior_to_inferior_object in py-inferior.c, specifically the calls
> > to inferior_data and set_inferior_data).
> > 
> > This reference to the Python object, held within the real inferior
> > object, counts as a reference to the Python object, and so holds the
> > reference count of the Python object above zero.
> > 
> > At the same time, the Python object maintains a pointer back to GDB's
> > real inferior object.
> > 
> > When GDB's inferior object is deleted (say the inferior exits) then
> > py_free_inferior is called (thanks to the registry system), this
> > function looks up the Python object and sets the inferior pointer to
> > nullptr and finally deletes reduces the reference count on the Python
> > object.
> > 
> > If at this point the user still holds a reference to the Python object
> > then nothing happens.  The gdb.Inferior Python object is now in the
> > non-valid state (see infpy_is_valid in py-inferior.c), but otherwise,
> > everything is fine.
> > 
> > However, if there are no further references to the gdb.Inferior
> > object, or, once the user has given up all their references to the
> > gdb.Inferior object, then infpy_dealloc is called.
> > 
> > This function currently checks to see if the inferior pointer within
> > the gdb.Inferior object is nullptr or not.  If the pointer is nullptr
> > then infpy_dealloc immediately returns.
> > 
> > Only when the inferior point in the gdb.Inferior is not nullptr do
> > we (a) set the gdb.Inferior reference inside GDB's inferior to
> > nullptr, and (b) call the underlying Python tp_free function.
> > 
> > There are a number things wrong here:
> > 
> >   1. The gdb.Inferior reference within GDB's inferior object hold a
> >   reference count, thus, setting this reference to nullptr without
> >   first decrementing the reference count would leak a reference.  This
> >   could be solved by having the inferior hold a gdbpy_ref object,
> >   however...
> > 
> >   2. As GDB's inferior holds a reference then infpy_dealloc will never
> >   be called until GDB's inferior object is deleted, calls
> >   py_free_inferior, and so gives up the reference.  At this point
> >   there is no longer a need to call set_inferior_data to set the field
> >   back to NULL, that field must have been cleared in order to get the
> >   reference count to zero, which means...
> > 
> >   3. If we know that py_free_inferior must be called before
> >   infpy_dealloc, then we know that the inferior pointer in
> >   gdb.Inferior will always be nullptr when infpy_dealloc is called,
> >   this means that the call to the underlying tp_free function will
> >   always be skipped.  Skipping this call at all always seems like a
> >   bad idea, but it turns out we will always skip it.
> > 
> > Given all of the above, I assert that the gdb.Inferior field will
> > always be nullptr when infpy_dealloc is called.  That's what this
> > patch does.
> > 
> > I assume that failing to call tp_free will mean that Python leaks some
> > memory associated with the gdb.Inferior object, but I don't know any
> > way that I could show that in a test.  As such, I've got no new tests
> > for this commit, but if anyone has any ideas I'd be happy to add
> > something.
> 
> I tried to read this a few times, and I'm stuck on not understanding how
> there isn't a reference cycle here.  If the Python object holds a
> reference on the inferior, and the inferior holds a reference on the
> Python object (through the registry), how can that work?  Or is it a
> weak reference in one of the directions?

There's no "reference" from the Python object to the inferior in the
same way there's a "reference" from the inferior to the Python object.

That is the inferior (via the registry) holds a reference to the
Python object.  This means that the Python object can't be deleted
until the reference in the inferior is released (that reference holds
the refcount above zero).

However, the Python object only holds a pointer back to the inferior,
even though inferiors are a child of refcounted_object, the pointer
within the Python object isn't handled as a refcounted_object, just a
raw pointer.

So, the inferior's refcount can hit zero with no consideration for the
pointer held within the Python object, but that's OK, when the
inferior is deleted we clean up the registry.  This calls into the
Python code, which sets the inferior pointer to nullptr, and
decrements the refcount on the Python object.

When the user gives up their last gdb.Inferior reference the Python
object can then be free'd.

Hope that makes things a little clearer,

Thanks,
Andrew

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] gdb/python: fix memory leak in python inferior code
  2021-09-08  9:26   ` Andrew Burgess
@ 2021-09-27 13:40     ` Andrew Burgess
  0 siblings, 0 replies; 6+ messages in thread
From: Andrew Burgess @ 2021-09-27 13:40 UTC (permalink / raw)
  To: Simon Marchi; +Cc: gdb-patches

Ping!

Thanks,
Andrew

* Andrew Burgess <andrew.burgess@embecosm.com> [2021-09-08 10:26:34 +0100]:

> * Simon Marchi <simon.marchi@polymtl.ca> [2021-09-07 11:10:20 -0400]:
> 
> > 
> > 
> > On 2021-09-05 4:53 a.m., Andrew Burgess wrote:
> > > When a user creates a gdb.Inferior object for the first time a new
> > > Python object is created.  This object is then cached within GDB's
> > > inferior object using the registry mechanism (see
> > > inferior_to_inferior_object in py-inferior.c, specifically the calls
> > > to inferior_data and set_inferior_data).
> > > 
> > > This reference to the Python object, held within the real inferior
> > > object, counts as a reference to the Python object, and so holds the
> > > reference count of the Python object above zero.
> > > 
> > > At the same time, the Python object maintains a pointer back to GDB's
> > > real inferior object.
> > > 
> > > When GDB's inferior object is deleted (say the inferior exits) then
> > > py_free_inferior is called (thanks to the registry system), this
> > > function looks up the Python object and sets the inferior pointer to
> > > nullptr and finally deletes reduces the reference count on the Python
> > > object.
> > > 
> > > If at this point the user still holds a reference to the Python object
> > > then nothing happens.  The gdb.Inferior Python object is now in the
> > > non-valid state (see infpy_is_valid in py-inferior.c), but otherwise,
> > > everything is fine.
> > > 
> > > However, if there are no further references to the gdb.Inferior
> > > object, or, once the user has given up all their references to the
> > > gdb.Inferior object, then infpy_dealloc is called.
> > > 
> > > This function currently checks to see if the inferior pointer within
> > > the gdb.Inferior object is nullptr or not.  If the pointer is nullptr
> > > then infpy_dealloc immediately returns.
> > > 
> > > Only when the inferior point in the gdb.Inferior is not nullptr do
> > > we (a) set the gdb.Inferior reference inside GDB's inferior to
> > > nullptr, and (b) call the underlying Python tp_free function.
> > > 
> > > There are a number things wrong here:
> > > 
> > >   1. The gdb.Inferior reference within GDB's inferior object hold a
> > >   reference count, thus, setting this reference to nullptr without
> > >   first decrementing the reference count would leak a reference.  This
> > >   could be solved by having the inferior hold a gdbpy_ref object,
> > >   however...
> > > 
> > >   2. As GDB's inferior holds a reference then infpy_dealloc will never
> > >   be called until GDB's inferior object is deleted, calls
> > >   py_free_inferior, and so gives up the reference.  At this point
> > >   there is no longer a need to call set_inferior_data to set the field
> > >   back to NULL, that field must have been cleared in order to get the
> > >   reference count to zero, which means...
> > > 
> > >   3. If we know that py_free_inferior must be called before
> > >   infpy_dealloc, then we know that the inferior pointer in
> > >   gdb.Inferior will always be nullptr when infpy_dealloc is called,
> > >   this means that the call to the underlying tp_free function will
> > >   always be skipped.  Skipping this call at all always seems like a
> > >   bad idea, but it turns out we will always skip it.
> > > 
> > > Given all of the above, I assert that the gdb.Inferior field will
> > > always be nullptr when infpy_dealloc is called.  That's what this
> > > patch does.
> > > 
> > > I assume that failing to call tp_free will mean that Python leaks some
> > > memory associated with the gdb.Inferior object, but I don't know any
> > > way that I could show that in a test.  As such, I've got no new tests
> > > for this commit, but if anyone has any ideas I'd be happy to add
> > > something.
> > 
> > I tried to read this a few times, and I'm stuck on not understanding how
> > there isn't a reference cycle here.  If the Python object holds a
> > reference on the inferior, and the inferior holds a reference on the
> > Python object (through the registry), how can that work?  Or is it a
> > weak reference in one of the directions?
> 
> There's no "reference" from the Python object to the inferior in the
> same way there's a "reference" from the inferior to the Python object.
> 
> That is the inferior (via the registry) holds a reference to the
> Python object.  This means that the Python object can't be deleted
> until the reference in the inferior is released (that reference holds
> the refcount above zero).
> 
> However, the Python object only holds a pointer back to the inferior,
> even though inferiors are a child of refcounted_object, the pointer
> within the Python object isn't handled as a refcounted_object, just a
> raw pointer.
> 
> So, the inferior's refcount can hit zero with no consideration for the
> pointer held within the Python object, but that's OK, when the
> inferior is deleted we clean up the registry.  This calls into the
> Python code, which sets the inferior pointer to nullptr, and
> decrements the refcount on the Python object.
> 
> When the user gives up their last gdb.Inferior reference the Python
> object can then be free'd.
> 
> Hope that makes things a little clearer,
> 
> Thanks,
> Andrew

^ permalink raw reply	[flat|nested] 6+ messages in thread

* [PATCHv2] gdb/python: fix memory leak in python inferior code
  2021-09-05  8:53 [PATCH] gdb/python: fix memory leak in python inferior code Andrew Burgess
  2021-09-07 15:10 ` Simon Marchi
@ 2021-10-05 12:02 ` Andrew Burgess
  2021-10-05 13:17   ` Simon Marchi
  1 sibling, 1 reply; 6+ messages in thread
From: Andrew Burgess @ 2021-10-05 12:02 UTC (permalink / raw)
  To: gdb-patches

Compared to v1, I have:

 1. Rewritten the commit message to, I hope, make it clearer what is
    going on,

 2. Written a test using Python's tracemalloc module so show this
    bug exists,

 3. The actual code change in GDB is identical to v1.

Thanks,
Andrew

---

When a user creates a gdb.Inferior object for the first time a new
Python object is created.  This object is then cached within GDB's
inferior object using the registry mechanism (see
inferior_to_inferior_object in py-inferior.c, specifically the calls
to inferior_data and set_inferior_data).

The Python Reference to the gdb.Inferior object held within the real
inferior object ensures that the reference count on the Python
gdb.Inferior object never reaches zero while the GDB inferior object
continues to exist.

At the same time, the gdb.Inferior object maintains a C++ pointer back
to GDB's real inferior object.  We therefore end up with a system that
looks like this:

                   Python Reference
                         |
                         |
    .----------.         |          .--------------.
    |          |------------------->|              |
    | inferior |                    | gdb.Inferior |
    |          |<-------------------|              |
    '----------'         |          '--------------'
                         |
                         |
                    C++ Pointer

When GDB's inferior object is deleted (say the inferior exits) then
py_free_inferior is called (thanks to the registry system), this
function looks up the Python gdb.Inferior object and sets the C++
pointer to nullptr and finally reduces the reference count on the
Python gdb.Inferior object.

If at this point the user still holds a reference to the Python
gdb.Inferior object then nothing happens.  However, the gdb.Inferior
object is now in the non-valid state (see infpy_is_valid in
py-inferior.c), but otherwise, everything is fine.

However, if there are no further references to the Python gdb.Inferior
object, or, once the user has given up all their references to the
gdb.Inferior object, then infpy_dealloc is called.

This function currently checks to see if the inferior pointer within
the gdb.Inferior object is nullptr or not.  If the pointer is nullptr
then infpy_dealloc immediately returns.

Only when the inferior point in the gdb.Inferior is not nullptr do
we (a) set the gdb.Inferior reference inside GDB's inferior to
nullptr, and (b) call the underlying Python tp_free function.

There are a number things wrong here:

  1.  The Python gdb.Inferior reference within GDB's inferior object
  holds a reference count, thus, setting this reference to nullptr
  without first decrementing the reference count would leak a
  reference, however...

  2. As GDB's inferior holds a reference then infpy_dealloc will never
  be called until GDB's inferior object is deleted.  Deleting a GDB
  inferior ohject calls py_free_inferior, and so gives up the
  reference.  At this point there is no longer a need to call
  set_inferior_data to set the field back to NULL, that field must
  have been cleared in order to get the reference count to zero, which
  means...

  3. If we know that py_free_inferior must be called before
  infpy_dealloc, then we know that the inferior pointer in
  gdb.Inferior will always be nullptr when infpy_dealloc is called,
  this means that the call to the underlying tp_free function will
  always be skipped.  Skipping this call will cause Python to leak the
  memory associated with the gdb.Inferior object, which is what we
  currently always do.

Given all of the above, I assert that the C++ pointer within
gdb.Inferior will always be nullptr when infpy_dealloc is called.
That's what this patch does.

I wrote a test for this issue making use of Pythons tracemalloc
module, which allows us to spot this memory leak.
---
 gdb/python/py-inferior.c                      |  16 ++-
 gdb/testsuite/gdb.python/py-inferior-leak.c   |  22 ++++
 gdb/testsuite/gdb.python/py-inferior-leak.exp |  33 ++++++
 gdb/testsuite/gdb.python/py-inferior-leak.py  | 109 ++++++++++++++++++
 4 files changed, 176 insertions(+), 4 deletions(-)
 create mode 100644 gdb/testsuite/gdb.python/py-inferior-leak.c
 create mode 100644 gdb/testsuite/gdb.python/py-inferior-leak.exp
 create mode 100644 gdb/testsuite/gdb.python/py-inferior-leak.py

diff --git a/gdb/python/py-inferior.c b/gdb/python/py-inferior.c
index 0659c28ea9c..c8de41dd009 100644
--- a/gdb/python/py-inferior.c
+++ b/gdb/python/py-inferior.c
@@ -864,12 +864,20 @@ static void
 infpy_dealloc (PyObject *obj)
 {
   inferior_object *inf_obj = (inferior_object *) obj;
-  struct inferior *inf = inf_obj->inferior;
 
-  if (! inf)
-    return;
+  /* The inferior itself holds a reference to this Python object, which
+     will keep the reference count of this object above zero until GDB
+     deletes the inferior and py_free_inferior is called.
+
+     Once py_free_inferior has been called then the link between this
+     Python object and the inferior is set to nullptr, and then the
+     reference count on this Python object is decremented.
+
+     The result of all this is that the link between this Python object and
+     the inferior should always have been set to nullptr before this
+     function is called.  */
+  gdb_assert (inf_obj->inferior == nullptr);
 
-  set_inferior_data (inf, infpy_inf_data_key, NULL);
   Py_TYPE (obj)->tp_free (obj);
 }
 
diff --git a/gdb/testsuite/gdb.python/py-inferior-leak.c b/gdb/testsuite/gdb.python/py-inferior-leak.c
new file mode 100644
index 00000000000..bfe52c018d4
--- /dev/null
+++ b/gdb/testsuite/gdb.python/py-inferior-leak.c
@@ -0,0 +1,22 @@
+/* This testcase is part of GDB, the GNU debugger.
+
+   Copyright 2021 Free Software Foundation, Inc.
+
+   This program is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program.  If not, see <http://www.gnu.org/licenses/>.  */
+
+int
+main (void)
+{
+  return 0;
+}
diff --git a/gdb/testsuite/gdb.python/py-inferior-leak.exp b/gdb/testsuite/gdb.python/py-inferior-leak.exp
new file mode 100644
index 00000000000..259a35e90bb
--- /dev/null
+++ b/gdb/testsuite/gdb.python/py-inferior-leak.exp
@@ -0,0 +1,33 @@
+# Copyright (C) 2021 Free Software Foundation, Inc.
+
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.
+
+# This file is part of the GDB testsuite.  It checks for memory leaks
+# associated with allocating and deallocation gdb.Inferior objects.
+
+load_lib gdb-python.exp
+
+standard_testfile
+
+clean_restart
+
+# Skip all tests if Python scripting is not enabled.
+if { [skip_python_tests] } { continue }
+
+set pyfile [gdb_remote_download host ${srcdir}/${subdir}/${testfile}.py]
+
+# Source the Python script, this runs the test (which is written
+# completely in Python), and either prints PASS, or thows an
+# exception.
+gdb_test "source ${pyfile}" "PASS" "source python script"
diff --git a/gdb/testsuite/gdb.python/py-inferior-leak.py b/gdb/testsuite/gdb.python/py-inferior-leak.py
new file mode 100644
index 00000000000..914fb3ecc08
--- /dev/null
+++ b/gdb/testsuite/gdb.python/py-inferior-leak.py
@@ -0,0 +1,109 @@
+# Copyright (C) 2021 Free Software Foundation, Inc.
+
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.
+
+import tracemalloc
+import gdb
+import re
+
+# A global variable in which we store a reference to the gdb.Inferior
+# object sent to us in the new_inferior event.
+inf = None
+
+# Register the new_inferior event handler.
+def new_inferior_handler(event):
+    global inf
+    inf = event.inferior
+
+
+gdb.events.new_inferior.connect(new_inferior_handler)
+
+# A global filters list, we only care about memory allocations
+# originating from this script.
+filters = [tracemalloc.Filter(True, "*py-inferior-leak.py")]
+
+# Add a new inferior, and return the number of the new inferior.
+def add_inferior():
+    output = gdb.execute("add-inferior", False, True)
+    m = re.search(r"Added inferior (\d+)", output)
+    if m:
+        num = int(m.group(1))
+    else:
+        raise RuntimeError("no match")
+    return num
+
+
+# Run the test.  When CLEAR is True we clear the global INF variable
+# before comparing the before and after memory allocation traces.
+# When CLEAR is False we leave INF set to reference the gdb.Inferior
+# object, thus preventing the gdb.Inferior from being deallocated.
+def test(clear):
+    global filters, inf
+
+    # Start tracing, and take a snapshot of the current allocations.
+    tracemalloc.start()
+    snapshot1 = tracemalloc.take_snapshot()
+
+    # Create an inferior, this triggers the new_inferior event, which
+    # in turn holds a reference to the new gdb.Inferior object in the
+    # global INF variable.
+    num = add_inferior()
+    gdb.execute("remove-inferiors %s" % num)
+
+    # Possibly clear the global INF variable.
+    if clear:
+        inf = None
+
+    # Now grab a second snapshot of memory allocations, and stop
+    # tracing memory allocations.
+    snapshot2 = tracemalloc.take_snapshot()
+    tracemalloc.stop()
+
+    # Filter the snapshots; we only care about allocations originating
+    # from this file.
+    snapshot1 = snapshot1.filter_traces(filters)
+    snapshot2 = snapshot2.filter_traces(filters)
+
+    # Compare the snapshots, this leaves only things that were
+    # allocated, but not deallocated since the first snapshot.
+    stats = snapshot2.compare_to(snapshot1, "traceback")
+
+    # Total up all the deallocated things.
+    total = 0
+    for stat in stats:
+        total += stat.size_diff
+    return total
+
+
+# The first time we run this some global state will be allocated which
+# shows up as memory that is allocated, but not released.  So, run the
+# test once and discard the result.
+test(True)
+
+# Now run the test twice, the first time we clear our global reference
+# to the gdb.Inferior object, which should allow Python to deallocate
+# the object.  The second time we hold onto the global reference,
+# preventing Python from performing the deallocation.
+bytes_with_clear = test(True)
+bytes_without_clear = test(False)
+
+# The bug that used to exist in GDB was that even when we released the
+# global reference the gdb.Inferior object would not be deallocated.
+if bytes_with_clear > 0:
+    raise gdb.GdbError("memory leak when gdb.Inferior should be released")
+if bytes_without_clear == 0:
+    raise gdb.GdbError("gdb.Inferior object is no longer allocated")
+
+# Print a PASS message that the test script can see.
+print("PASS")
-- 
2.25.4


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCHv2] gdb/python: fix memory leak in python inferior code
  2021-10-05 12:02 ` [PATCHv2] " Andrew Burgess
@ 2021-10-05 13:17   ` Simon Marchi
  0 siblings, 0 replies; 6+ messages in thread
From: Simon Marchi @ 2021-10-05 13:17 UTC (permalink / raw)
  To: Andrew Burgess, gdb-patches



On 2021-10-05 08:02, Andrew Burgess wrote:
> Compared to v1, I have:
> 
>  1. Rewritten the commit message to, I hope, make it clearer what is
>     going on,
> 
>  2. Written a test using Python's tracemalloc module so show this
>     bug exists,
> 
>  3. The actual code change in GDB is identical to v1.
> 
> Thanks,
> Andrew
> 
> ---
> 
> When a user creates a gdb.Inferior object for the first time a new
> Python object is created.  This object is then cached within GDB's
> inferior object using the registry mechanism (see
> inferior_to_inferior_object in py-inferior.c, specifically the calls
> to inferior_data and set_inferior_data).
> 
> The Python Reference to the gdb.Inferior object held within the real
> inferior object ensures that the reference count on the Python
> gdb.Inferior object never reaches zero while the GDB inferior object
> continues to exist.
> 
> At the same time, the gdb.Inferior object maintains a C++ pointer back
> to GDB's real inferior object.  We therefore end up with a system that
> looks like this:
> 
>                    Python Reference
>                          |
>                          |
>     .----------.         |          .--------------.
>     |          |------------------->|              |
>     | inferior |                    | gdb.Inferior |
>     |          |<-------------------|              |
>     '----------'         |          '--------------'
>                          |
>                          |
>                     C++ Pointer
> 
> When GDB's inferior object is deleted (say the inferior exits) then
> py_free_inferior is called (thanks to the registry system), this
> function looks up the Python gdb.Inferior object and sets the C++
> pointer to nullptr and finally reduces the reference count on the
> Python gdb.Inferior object.
> 
> If at this point the user still holds a reference to the Python
> gdb.Inferior object then nothing happens.  However, the gdb.Inferior
> object is now in the non-valid state (see infpy_is_valid in
> py-inferior.c), but otherwise, everything is fine.
> 
> However, if there are no further references to the Python gdb.Inferior
> object, or, once the user has given up all their references to the
> gdb.Inferior object, then infpy_dealloc is called.
> 
> This function currently checks to see if the inferior pointer within
> the gdb.Inferior object is nullptr or not.  If the pointer is nullptr
> then infpy_dealloc immediately returns.
> 
> Only when the inferior point in the gdb.Inferior is not nullptr do
> we (a) set the gdb.Inferior reference inside GDB's inferior to
> nullptr, and (b) call the underlying Python tp_free function.
> 
> There are a number things wrong here:
> 
>   1.  The Python gdb.Inferior reference within GDB's inferior object
>   holds a reference count, thus, setting this reference to nullptr
>   without first decrementing the reference count would leak a
>   reference, however...
> 
>   2. As GDB's inferior holds a reference then infpy_dealloc will never
>   be called until GDB's inferior object is deleted.  Deleting a GDB
>   inferior ohject calls py_free_inferior, and so gives up the
>   reference.  At this point there is no longer a need to call
>   set_inferior_data to set the field back to NULL, that field must
>   have been cleared in order to get the reference count to zero, which
>   means...
> 
>   3. If we know that py_free_inferior must be called before
>   infpy_dealloc, then we know that the inferior pointer in
>   gdb.Inferior will always be nullptr when infpy_dealloc is called,
>   this means that the call to the underlying tp_free function will
>   always be skipped.  Skipping this call will cause Python to leak the
>   memory associated with the gdb.Inferior object, which is what we
>   currently always do.
> 
> Given all of the above, I assert that the C++ pointer within
> gdb.Inferior will always be nullptr when infpy_dealloc is called.
> That's what this patch does.
> 
> I wrote a test for this issue making use of Pythons tracemalloc
> module, which allows us to spot this memory leak.

Well, this is still a bit confusing, not because of your explanation,
because it is actually a bit complicatied.  But from what I understand,
that makes sense.

The patch LGTM, I just found one typo:

> diff --git a/gdb/testsuite/gdb.python/py-inferior-leak.exp b/gdb/testsuite/gdb.python/py-inferior-leak.exp
> new file mode 100644
> index 00000000000..259a35e90bb
> --- /dev/null
> +++ b/gdb/testsuite/gdb.python/py-inferior-leak.exp
> @@ -0,0 +1,33 @@
> +# Copyright (C) 2021 Free Software Foundation, Inc.
> +
> +# This program is free software; you can redistribute it and/or modify
> +# it under the terms of the GNU General Public License as published by
> +# the Free Software Foundation; either version 3 of the License, or
> +# (at your option) any later version.
> +#
> +# This program is distributed in the hope that it will be useful,
> +# but WITHOUT ANY WARRANTY; without even the implied warranty of
> +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> +# GNU General Public License for more details.
> +#
> +# You should have received a copy of the GNU General Public License
> +# along with this program.  If not, see <http://www.gnu.org/licenses/>.
> +
> +# This file is part of the GDB testsuite.  It checks for memory leaks
> +# associated with allocating and deallocation gdb.Inferior objects.
> +
> +load_lib gdb-python.exp
> +
> +standard_testfile
> +
> +clean_restart
> +
> +# Skip all tests if Python scripting is not enabled.
> +if { [skip_python_tests] } { continue }
> +
> +set pyfile [gdb_remote_download host ${srcdir}/${subdir}/${testfile}.py]
> +
> +# Source the Python script, this runs the test (which is written
> +# completely in Python), and either prints PASS, or thows an

thows -> throws

Simon

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2021-10-05 13:17 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-09-05  8:53 [PATCH] gdb/python: fix memory leak in python inferior code Andrew Burgess
2021-09-07 15:10 ` Simon Marchi
2021-09-08  9:26   ` Andrew Burgess
2021-09-27 13:40     ` Andrew Burgess
2021-10-05 12:02 ` [PATCHv2] " Andrew Burgess
2021-10-05 13:17   ` Simon Marchi

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).