Subject: Inferior python debugging: working prototype
From: David Malcolm <dmalcolm@redhat.com>
To: Project Archer <archer@sourceware.org>
Date: Thu, 04 Feb 2010 23:57:00 -0000

(This is a followup to this December posting:
http://sourceware.org/ml/archer/2009-q4/msg00129.html "Pretty-printing
backtraces when "python" is the inferior process")

I've now got working gdb python code for prettyprinting the various
types seen when the inferior process is linked against libpython, for
example when debugging the python binary itself (or presumably a build
of gdb, though I haven't tried that).

I am still seeing the issue described in that post: some variables of
interest are visible from within the regular (gdb) interface, but not
visible from python's gdb.selected_frame().read_var().  Specifically,
this affects the PyFrameObject *f within PyEval_EvalFrameEx, which
appears to be an issue with an inlined frame of C code.

Having said that, I noticed that gdb's own backtrace output does contain
this data, so by writing a prettyprinter rather than my own backtrace
implementation I'm able to hook in to the gdb.Value and sidestep this
issue.
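
To illustrate the approach (this is a minimal sketch, not the actual
libpython.py): a lookup function is appended to gdb.pretty_printers and
returns a printer object for any gdb.Value whose type is "PyObject *".
The names PyObjectPtrPrinter and lookup_pyobject are hypothetical, and
the gdb import is guarded only so that the file can also be loaded
outside gdb.

```python
try:
    import gdb  # only available inside gdb's embedded Python
except ImportError:
    gdb = None


class PyObjectPtrPrinter(object):
    """Hypothetical printer: shows the Python-level type name and address."""

    def __init__(self, gdbval):
        self.gdbval = gdbval

    def to_string(self):
        # Follow (PyObject*)->ob_type->tp_name, a char* naming the type
        ob_type = self.gdbval.dereference()['ob_type']
        tp_name = ob_type.dereference()['tp_name'].string()
        return '<%s at remote 0x%x>' % (tp_name, int(self.gdbval))


def lookup_pyobject(gdbval):
    """Return a printer for PyObject* values, or None for anything else."""
    # unqualified() strips const/volatile qualifiers from the type
    type_ = gdbval.type.unqualified()
    if str(type_) == 'PyObject *':
        return PyObjectPtrPrinter(gdbval)
    return None


if gdb is not None:
    # gdb calls each registered function with the value it is about to print
    gdb.pretty_printers.append(lookup_pyobject)
```

Because gdb hands the printer the same gdb.Value that it shows in the
backtrace, this route does not depend on read_var() finding the variable.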

Current status:
  - I've written a libpython.py which can be seen in this 1-file git
repository:
http://fedorapeople.org/gitweb?p=dmalcolm/public_git/libpython.git;a=summary
  - the code has a prettyprinter for (PyObject*) and for
(PyFrameObject*)
  - the code has to be imported manually
  - It works, and generates large amounts of useful debugging
information, showing (nested) lists, tuples, ints, strings, unicode,
old-style classes etc, and showing file/line/locals/globals information
at the python level; it seems somewhat robust in the face of corrupt
data in the inferior process.
  - See:
https://fedoraproject.org/wiki/Features/EasierPythonDebugging#User_Experience
for a series of text dumps comparing before/after backtraces of a
segfault within /usr/bin/python.
  - It mostly works with python3 as well.
  - All my testing has been by hand, using Fedora 12's build of gdb
(gdb-7.0.1-26.fc12.i686)

Questions:
  - I want the prettyprinter hooks to be used by default on Python
backtraces in Fedora 13, so that (for example) automated tools that
capture backtraces contain this rich debugging information. What's the
best way of wiring this up so that the module is imported?  Is something
like this happening for the GLib/GTK hooks, or for the STL?
  - should this be distributed as part of gdb or part of python?  (it
handles both python2 and python3, mostly, at any rate)
  - I'd like to be able to test this automatically.  Is it reasonable to
assume that a gdb configured with --with-python can be tested by having
it debug that same instance of python?  (I'd also like to automate
testing of python3 support, which would be a different runtime.)
  - to what extent is a pretty-printer expected to return in a sane
amount of time and use sane amounts of RAM?  For example, if my
prettyprinter tries to print a PyListObject, but the length of the list
(the "ob_size" field) has become 0xdeadbeef rather than, say, 3,
building a proxy list to represent it within the gdb process is probably
going to make gdb run out of heap.  Are there any standards around this?
(e.g. some defined limit to how much it's worth scraping)
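
One possible way to guard against the 0xdeadbeef case in the last
question is to cap how many elements are scraped.  The following is a
rough sketch only: proxy_list, read_item and the MAX_ITEMS value of 200
are all hypothetical, and the cap is not a limit defined by gdb.

```python
MAX_ITEMS = 200  # hypothetical cap; not a gdb-defined limit


def proxy_list(ob_size, read_item, max_items=MAX_ITEMS):
    """Build a bounded Python-side proxy for a list in the inferior.

    ob_size   -- length as read from the inferior (may be garbage)
    read_item -- callable taking an index and returning that element
    Returns (items, truncated).
    """
    if ob_size < 0:
        # A negative length can only come from corrupt data
        return [], True
    count = min(ob_size, max_items)
    items = []
    for i in range(count):
        try:
            items.append(read_item(i))
        except Exception:
            # Reading the inferior failed; keep what we have and stop
            return items, True
    # Report truncation if the claimed length exceeded the cap
    return items, ob_size > max_items
```

With a corrupt length, proxy_list(0xdeadbeef, read_item) reads at most
max_items elements and flags the result as truncated, so memory use in
the gdb process stays bounded regardless of what the inferior claims.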

Hope this is useful
Dave