public inbox for gdb@sourceware.org
* Re: RFC GDB Linux Awareness analysis
       [not found] <20150603142858.GA19370@griffinp-ThinkPad-X1-Carbon-2nd>
@ 2015-08-20 18:22 ` Andreas Arnez
  2015-09-30 13:27   ` Peter Griffin
  0 siblings, 1 reply; 8+ messages in thread
From: Andreas Arnez @ 2015-08-20 18:22 UTC (permalink / raw)
  To: Peter Griffin; +Cc: gdb, lee.jones

Hi,

Sorry for the late reply.  Yao Qi pointed me to this RFC; I've just
missed it before.  For those who have missed it as well, the original
posting was here:

  https://sourceware.org/ml/gdb-patches/2015-06/msg00040.html

Independently of that proposal I have presented some thoughts about
improving GDB's Linux kernel debugging support on the GNU Tools
Cauldron.  Surprisingly or not, some of the goals look very similar.
The slides are here:

  https://gcc.gnu.org/wiki/cauldron2015?action=AttachFile&do=view&target=Andreas+Arnez_+Debugging+Linux+kernel+dumps+with+GDB.pdf

In that talk I was focusing more on kernel dump debugging rather than
live debugging, but I also tried to emphasize the fact that the feature
sets required for both (should) have a large overlap.  Only a few
features are specific to one or the other, e.g. live debugging requires
a notification method for refreshing the task list and the modules list.

From the response to the talk I got the impression that there is indeed
widespread interest in having improved kernel debug support in GDB.  So
I believe posting the patches to get more feedback would be worthwhile.

On Wed, Jun 03 2015, Peter Griffin wrote:

> Hi GDB community,
>
> Overview
> ========
>
> The purpose of this email is to describe a useful feature which has been developed
> by STMicroelectronics inside of GDB for debugging Linux kernels. I will cover at
> a high level some of the implementation details and what I see as the advantages
> / disadvantages of the implementation. I will also cover some other alternative 
> approaches that I'm aware of.
>
> The purpose is to facilitate discussion with the GDB experts on this
> mailing list as to what the "correct" way to implement this functionality would
> be.

I'm copying gdb@sourceware.org instead of gdb-patches, since I think
it's better suited for now -- until we are actually discussing a patch.

> The end goal is to have an upstream implementation of this functionality.
>
> Introduction
> ============
>
> STMicroelectronics has a patchset on top of vanilla GDB which adds much better
> Linux kernel awareness into GDB. They have called this GDB extension LKD (Linux
> Kernel Debugger). This GDB extension is primarily used in conjunction with ST's
> JTAG debugger for debugging ARM / SH4 SoCs, and is implemented as an internal
> GDB extension, written in C.
>
> The LKD extension is nicely abstracted from the underlying JTAG interface,
> and I have used it to debug an ARMv7 kernel running in QEMU via gdbremote.
>
> ST would like to contribute these patches back to GDB, as we think it could
> be useful not only for ARM Linux debugging, but also other CPU/OS combinations
> in the future.
>
> LKD can be broadly split into the following parts: -
>
> 1) Linux task awareness
> 2) Loadable module support
> 3) Various Linux helper commands
>
> The next section looks at each of the three parts, and any other implementations
> I'm aware of that currently exist.
>
>
> 1) Task Awareness
> =================
>
> 1.1) LKD Linux Task awareness
> =============================
>
> When using mainline GDB to debug a Linux kernel via JTAG, GDB will typically
> only show the actual hardware threads.
>
> The LKD task awareness extension (lkd-process.c) adds the ability for GDB to
> parse some kernel data structures, so it can build a thread list of all kernel
> threads in the running Linux kernel.
>
> When halting the processor (via a breakpoint, ctrl+c etc) the thread list is
> re-enumerated, so new tasks are visible to GDB and tasks which have exited
> are removed from the thread list.
>
> To achieve this GDB has to know about various Linux kernel data structures,
> and also various fields within these structures (mainly task_struct,
> thread_info, mm_struct). Code has been implemented to parse these structs,
> and also do virtual to physical address translations and cache handling.
> Various frame unwinders are also implemented to stop backtracing on various
> exceptions, and entry points to the kernel (these would be a useful addition
> regardless of which task awareness approach is taken).
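
As a side note for anyone less familiar with this part: the structure
layout information mentioned above comes straight from the vmlinux DWARF
info, so field offsets and sizes are available to any GDB extension,
whether written in C or Python.  A minimal illustration (untested, and
assuming vmlinux symbols are loaded):

  import gdb

  task = gdb.lookup_type("struct task_struct")
  print(task.sizeof)                # total size of task_struct
  print(task["comm"].bitpos // 8)   # byte offset of the 'comm' field
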
>
> Advantages
>  - Adds Linux kernel task awareness to GDB
>  - Supports symbolic debugging of user processes
>  - Contextual information (structs / field offsets) is readily available inside of GDB
>  - Has been used and well tested inside ST for some time
>
> Disadvantages
>  - Being implemented in C within GDB creates a dependency between GDB and the Linux kernel
>  - Mainly tested on 3.10 and 2.6.30 with ARM and SH4 kernels; being upstreamed would
>    expose it to many differing kernel versions and architectures.
>
>  - I can't see any other "OS awareness" support currently in the GDB code base

Actually, it seems that there's a bit of special OS kernel debug
support, e.g. in bsd-kvm.c.

But I'd rather view this as providing support for the Linux kernel
*runtime* and compare it to components like linux-thread-db.c for
user-space threads.  GDB has several of those already.  They often leave
the internals of accessing the thread library's data structures to a
"thread debug library" outside GDB.  We may or may not want to do
something similar for the kernel, depending on the complexity of the
data structures involved and on the likelihood for them to change.
Maybe the "thread debug library" could actually be a Python script
provided by the kernel developers, as outlined in 1.3 below.  Or maybe
it's just easier to put all the required logic into GDB and perform
appropriate version checks where necessary.

> 1.2) OpenOCD / GDB Linux task awareness
> =======================================
>
> Whilst looking for any prior art in this area I found that OpenOCD already
> implements some basic Linux task awareness. See here: -
>  http://openocd.org/doc/doxygen/html/linux_8c_source.html
>
> I've got this working to the extent that I can connect via JTAG to an ARMv7 U8500
> Snowball board and enumerate a thread list in GDB. It required some debugging
> & hacking to get this far.
>
> This implementation passes the thread list to GDB via the gdbremote protocol
> and as such changes required in GDB are minimal.
>
> Advantages
>  - OpenOCD already supports many target types (ARM / MIPS), and has support
> for virt to phys translations / cache handling etc.
>
>  - OpenOCD also implements task awareness for other RTOS’s (ThreadX / FreeRTOS / eCos)
>
>  - Using gdbremote means GDB changes are (so far) minimal.
>
>  - OpenOCD / Linux Kernel dependency already exists
>
>
> Disadvantages
>  - Creates a dependency between Linux kernel data structures and OpenOCD.
>
>  - I believe finding field offsets within structs is currently not possible via
>    gdbremote protocol.
>
>    Currently OpenOCD generates these offsets at compile time which is ugly and needs
>    fixing. See here http://openocd.org/doc-release/doxygen/linux__header_8h_source.html.
>
>    Being able to find field offsets would in my opinion be a useful addition to the
>    gdbremote protocol which would allow OpenOCD task awareness to work much better
>    at runtime.
>    
>  - Doesn't support debugging user processes. I think this would still require some
>    GDB changes, and also gdbremote protocol changes to get working correctly.
>
>  - Needs to be made more generic
>
>
> 1.3) Python Task awareness
> ==========================
>
> Jan Kiszka from Siemens has implemented some basic Linux kernel task awareness using
> the GDB Python interface.
>
> See here https://lwn.net/Articles/631167/ and here
> https://github.com/torvalds/linux/blob/master/Documentation/gdb-kernel-debugging.txt
>
> This support is currently limited to the following commands: -
>
> lx_current -- Return current task
> lx_per_cpu -- Return per-cpu variable
> lx_task_by_pid -- Find Linux task by PID and return the task_struct variable
> lx_thread_info -- Calculate Linux thread_info from task variable
>
> However this could be extended to build up a thread list. The GDB Python interface
> would I believe need to be extended to allow a thread list to be passed back to GDB
> via Python.
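
To make this a bit more concrete: enumerating the kernel's tasks from
Python is already possible today with the standard gdb module (it is
essentially what the in-tree scripts under scripts/gdb/ do); what is
missing is an API to hand the result back to GDB as a thread list.  A
rough, untested sketch, assuming vmlinux symbols are loaded and walking
only the thread-group leaders:

  import gdb

  task_type = gdb.lookup_type("struct task_struct")
  long_type = gdb.lookup_type("unsigned long")

  def container_of(ptr, gdbtype, member):
      # Same idea as the kernel's container_of(): step back from a
      # member pointer to the enclosing structure.
      offset = gdbtype[member].bitpos // 8
      return (ptr.cast(long_type) - offset).cast(gdbtype.pointer())

  def tasks():
      # Walk the circular list anchored at init_task.tasks.
      init = gdb.parse_and_eval("init_task").address
      t = init
      while True:
          yield t
          t = container_of(t["tasks"]["next"], task_type, "tasks")
          if t == init:
              break

  # Example use: print the PID and command name of every task.
  #   for t in tasks():
  #       print(int(t["pid"]), t["comm"].string())
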
>
> Advantages
>  - Code parsing Linux kernel structures lives in the kernel source tree
>  - Contextual information is easily available
>  - Works with all OCD implementations not just OpenOCD 
>
> Disadvantages
>  - Doesn't exist yet
>  - Python / GDB interface would need to be extended to support threads
>
> Questions
> ========
>
> 1) What does the GDB community think about having Linux OS task awareness in the GDB
>    codebase?

For my part, I don't see a problem with that.  We already have code in GDB
for supporting various runtime environments.  This would just be another
one.  Currently, even if an external thread debug library is involved,
the code that uses it is still specially crafted for a particular
runtime and does not work for another.

> 2) Is there a preferred way to implement this (C extension / gdbremote / Python
>    / something else)?

Since many existing JTAG debuggers as well as Qemu will not offer a
Linux task list, I think such logic belongs on the GDB client side.
It's also on the client side where you have the DWARF info from vmlinux
that enables dealing with varying field offsets in structures.

Whether the logic should be written in C or as a Python (or Guile)
extension, I don't have a strong opinion.

> 3) Any other implementations / advantages / disadvantages that I'm not currently
>    aware of?
>
> 2) Loadable module support
> ==========================
>
> 2.1) LKD Loadable module support
> ================================
>
> lkd-modules.c adds better symbolic debugging support for Linux kernel modules to GDB using the
> GDB shared libraries infrastructure (solib).
>
> It has hooks to enable the debugging of module __init sections to reflect that these pages
> get discarded post-load. It also implements the same section layout algorithm as
> the kernel to speed up symbol resolution, only inspecting the target’s memory if there is a
> mismatch (inspecting target memory can be slow).
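
For comparison, the Python route for modules essentially boils down to
walking the kernel's module list and issuing add-symbol-file for each
entry, which is roughly what the in-tree lx-symbols helper does.  A
simplified, untested sketch (field names differ between kernel versions,
and a full implementation would also pass the addresses of the other
sections):

  import gdb

  module_type = gdb.lookup_type("struct module")
  long_type = gdb.lookup_type("unsigned long")

  def for_each_module():
      # Walk the kernel's global 'modules' list, using the same
      # container_of() trick as for the task list.
      offset = module_type["list"].bitpos // 8
      head = gdb.parse_and_eval("modules")
      entry = head["next"]
      while entry != head.address:
          yield (entry.cast(long_type) - offset).cast(module_type.pointer())
          entry = entry["next"]

  def load_module_symbols(ko_path_for):
      # ko_path_for is a hypothetical callback that maps a module name
      # to the matching .ko file on the host.
      for mod in for_each_module():
          base = int(mod["module_core"])   # module text base on 3.x kernels
          gdb.execute("add-symbol-file %s 0x%x"
                      % (ko_path_for(mod["name"].string()), base))
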
>
> I think this part could be upstreamed separately from the task awareness support, although
> I’ve not tried separating it yet from the other LKD patches.

Yes, I think that separate patch sets would be helpful.

> Advantages
>  - Allows full symbolic debugging of Linux loadable modules including init sections
>
>  - Would be independent of underlying communication mechanism (JTAG / gdbremote etc)
>
> Disadvantages
>  - Some dependency between GDB and Linux kernel is still present
>
> Questions:
>
> Is the GDB community happy to have Linux-specific solib functionality in the GDB code base?

For my part, same as with thread support: GDB already supports various
shared library implementations and "knows" about their internals, so
this would just be another one.

> 2.2) Python Loadable module support
> ===================================
>
> I've not managed to look at this too much, but some basic support exists; see here
> https://github.com/torvalds/linux/blob/master/Documentation/gdb-kernel-debugging.txt
>
> Having read the LKD modules implementation I don’t think this can be anywhere near
> as functionally complete.
>
>
> 3) Linux Helper commands
> ========================
>
> 3.1) LKD helper commands
> ========================
>
> LKD implements various Linux helper commands inside GDB such as: -
>  - dmesg - dump dmesg log buffer from kernel
>  - process_info - prints various info about current process
>  - pmap - prints memory map of current process
>  - vm_translate - translates virtual to physical address
>  - proc_interrupts - prints interrupt statistics
>  - proc_iomem - prints I/O mem map
>  - proc_cmdline - prints the contents of /proc/cmdline
>  - proc_version - prints the contents of /proc/version
>  - proc_mounts - print the contents of /proc/mounts
>  - proc_meminfo - print the contents of /proc/meminfo.
>
> Advantages
>  - Can be used by all GDB based debug solutions
>  - Some precedent of OS related commands in the GDB code base 
>    https://sourceware.org/gdb/current/onlinedocs/gdb/OS-Information.html#OS-Information
>
> Disadvantages
>  - Creates a GDB dependency with the kernel
>
> 3.2) Python helper commands
> ===========================
>
> Jan’s python scripts also implement some of the same LKD commands such as 'dmesg'.
>
> See here https://lwn.net/Articles/631167/ and 
> https://github.com/torvalds/linux/blob/master/Documentation/gdb-kernel-debugging.txt
>
> Advantages
>  - Code that is parsing Linux kernel structures lives in the kernel source tree
>  - Can be used by all GDB based debug solutions
>
>
> Questions
>  - Does the GDB community mind Linux-specific custom commands being added to the GDB code base?

I'm not sure about that one.  We also need to consider the capability
for the user to add more such commands, e.g. specific to certain device
drivers or certain debug scenarios.  In my view both cases should
ideally be covered with the same underlying mechanism.

> My current opinion is that helper commands which can be migrated from C code into Python
> should be, and then merged into the kernel source tree (and retired from the LKD patchset).

That's my current view as well.  Or maybe we could consider using EPPIC
for those.  (See my Cauldron slides.)
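
To give an idea of the scale involved, something like proc_version
shrinks to a handful of lines of Python -- an untested sketch, assuming
vmlinux symbols are loaded so that linux_banner resolves (/proc/version
is generated from that string):

  import gdb

  class ProcVersion(gdb.Command):
      """Print the kernel version banner, i.e. the contents of /proc/version."""

      def __init__(self):
          super(ProcVersion, self).__init__("proc_version", gdb.COMMAND_DATA)

      def invoke(self, arg, from_tty):
          gdb.write(gdb.parse_and_eval("linux_banner").string())

  ProcVersion()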

> If you got here, thanks for reading this far! Like I said at the beginning, the purpose of
> this email is to stimulate some discussion on what you folks consider the 'correct' way to
> implement this OS awareness functionality.
>
> All feedback is welcome.
>
> Kind regards,
>
> Peter.


* Re: RFC GDB Linux Awareness analysis
  2015-08-20 18:22 ` RFC GDB Linux Awareness analysis Andreas Arnez
@ 2015-09-30 13:27   ` Peter Griffin
  2015-09-30 16:41     ` Duane Ellis
                       ` (2 more replies)
  0 siblings, 3 replies; 8+ messages in thread
From: Peter Griffin @ 2015-09-30 13:27 UTC (permalink / raw)
  To: Andreas Arnez; +Cc: gdb, lee.jones

Hi Andreas,

On Thu, 20 Aug 2015, Andreas Arnez wrote:

Many thanks for your feedback.
> 
> Sorry for the late reply.  Yao Qi pointed me to this RFC; I've just
> missed it before.  For those who have missed it as well, the original
> posting was here:
> 
>   https://sourceware.org/ml/gdb-patches/2015-06/msg00040.html
> 
> Independently of that proposal I have presented some thoughts about
> improving GDB's Linux kernel debugging support on the GNU Tools
> Cauldron.  Surprisingly or not, some of the goals look very similar.
> The slides are here:
> 
>
>https://gcc.gnu.org/wiki/cauldron2015?action=AttachFile&do=view&target=Andreas+Arnez_+Debugging+Linux+kernel+dumps+with+GDB.pdf

Thanks for the link to your slides, they are very interesting.  From your slides
it looks like LKD implements several of the exact features you were proposing.

For example, 'info threads' in LKD reports running and sleeping threads,
and the loadable module support reuses the solib infrastructure inside GDB.

> 
> In that talk I was focusing more on kernel dump debugging rather than
> live debugging, but I also tried to emphasize the fact that the feature
> sets required for both (should) have a large overlap.  Only a few
> features are specific to one or the other, e.g. live debugging requires
> a notification method for refreshing the task list and the modules list.
> 
> From the response to the talk I got the impression that there is indeed
> widespread interest in having improved kernel debug support in GDB.

OK, that is good news.  One reason for posting the RFC to the mailing
list was to gauge how much support the GDB community has for such a
feature, so it is good to know that, so far at least, it seems positive.


> So I believe posting the patches to get more feedback would be worthwhile.

OK, so the existing LKD code is based on quite an old GDB version (7.6), and the
whole LKD patchset is currently quite large (15k LoC).  So my plan is to reduce
this as much as possible by removing the parts of LKD that could have (or already
have) Python implementations, e.g. dmesg.

Essentially the aim would be to reduce the LKD patchset as much as possible to the
"core features", which I consider to be task awareness and loadable module support.
At that point we could look at what might need adding to the GDB Python API to
migrate even more parts into Python.

Posting the whole patchset now would involve porting it to the latest GDB version,
and before doing that I'd like to get a better idea of which parts are likely to
need rewriting, so that I only port the necessary parts.

> 
> On Wed, Jun 03 2015, Peter Griffin wrote:
> 
> > Hi GDB community,
> >
> > Overview
> > ========
> >
> > The purpose of this email is to describe a useful feature which has been developed
> > by STMicroelectronics inside of GDB for debugging Linux kernels. I will cover at
> > a high level some of the implementation details and what I see as the advantages
> > / disadvantages of the implementation. I will also cover some other alternative 
> > approaches that I'm aware of.
> >
> > The purpose is to facilitate discussion with the GDB experts on this
> > mailing list as to what the "correct" way to implement this functionality would
> > be.
> 
> I'm copying gdb@sourceware.org instead of gdb-patches, since I think
> it's better suited for now -- until we are actually discussing a patch.

Thanks :)

> 
> > The end goal is to have an upstream implementation of this functionality.
> >
> > Introduction
> > ============
> >
> > STMicroelectronics has a patchset on top of vanilla GDB which adds much better
> > Linux kernel awareness into GDB. They have called this GDB extension LKD (Linux
> > Kernel Debugger). This GDB extension is primarily used in conjunction with ST's
> > JTAG debugger for debugging ARM / SH4 SoCs, and is implemented as an internal
> > GDB extension, written in C.
> >
> > The LKD extension is nicely abstracted from the underlying JTAG interface,
> > and I have used it to debug an ARMv7 kernel running in QEMU via gdbremote.
> >
> > ST would like to contribute these patches back to GDB, as we think it could
> > be useful not only for ARM Linux debugging, but also other CPU/OS combinations
> > in the future.
> >
> > LKD can be broadly split into the following parts: -
> >
> > 1) Linux task awareness
> > 2) Loadable module support
> > 3) Various Linux helper commands
> >
> > The next section looks at each of the three parts, and any other implementations
> > I'm aware of that currently exist.
> >
> >
> > 1) Task Awareness
> > =================
> >
> > 1.1) LKD Linux Task awareness
> > =============================
> >
> > When using mainline GDB to debug a Linux kernel via JTAG, GDB will typically
> > only show the actual hardware threads.
> >
> > The LKD task awareness extension (lkd-process.c) adds the ability for GDB to
> > parse some kernel data structures, so it can build a thread list of all kernel
> > threads in the running Linux kernel.
> >
> > When halting the processor (via a breakpoint, ctrl+c etc) the thread list is
> > re-enumerated, so new tasks are visible to GDB and tasks which have exited
> > are removed from the thread list.
> >
> > To achieve this GDB has to know about various Linux kernel data structures,
> > and also various fields within these structures (mainly task_struct,
> > thread_info, mm_struct). Code has been implemented to parse these structs,
> > and also do virtual to physical address translations and cache handling.
> > Various frame unwinders are also implemented to stop backtracing on various
> > exceptions, and entry points to the kernel (these would be a useful addition
> > regardless of which task awareness approach is taken).
> >
> > Advantages
> >  - Adds Linux kernel task awareness to GDB
> >  - Supports symbolic debugging of user processes
> >  - Contextual information (structs / field offsets) is readily available inside of GDB
> >  - Has been used and well tested inside ST for some time
> >
> > Disadvantages
> >  - Being implemented in C within GDB creates a dependency between GDB and the Linux kernel
> >  - Mainly tested on 3.10 and 2.6.30 with ARM and SH4 kernels; being upstreamed would
> >    expose it to many differing kernel versions and architectures.
> >
> >  - I can't see any other "OS awareness" support currently in the GDB code base
> 
> Actually, it seems that there's a bit of special OS kernel debug
> support, e.g. in bsd-kvm.c.

Thanks for the pointer, I will take a look in there.  It's useful to know where
all the OS-specific bits already in GDB are.
> 
> But I'd rather view this as providing support for the Linux kernel
> *runtime* and compare it to components like linux-thread-db.c for
> user-space threads.  GDB has several of those already.  They often leave
> the internals of accessing the thread library's data structures to a
> "thread debug library" outside GDB.  We may or may not want to do
> something similar for the kernel, depending on the complexity of the
> data structures involved and on the likelihood for them to change.

Interesting... I wasn't aware of the use of an external library; I will take a
look at that too.  Looking at the gdb-kdump implementation that someone kindly
CC'ed me in on, it looks like that is doing something similar, at least for the
virt-to-phys translation (which is something the LKD patches currently do in
GDB).

> Maybe the "thread debug library" could actually be a Python script
> provided by the kernel developers, as outlined in 1.3 below.  Or maybe
> it's just easier to put all the required logic into GDB and perform
> appropriate version checks where necessary.

It is certainly easier from an implementation PoV, as we have access to all the
required objects inside GDB in C, which I don't think is currently the case with
the Python interface.  Also, of course, we have two existing out-of-tree
implementations (LKD and gdb-kdump).  Having taken a brief look at the gdb-kdump
code, it is also parsing the kernel structures in GDB C code.

> 
> > 1.2) OpenOCD / GDB Linux task awareness
> > =======================================
> >
> > Whilst looking for any prior art in this area I found that OpenOCD already
> > implements some basic Linux task awareness. See here: -
> >  http://openocd.org/doc/doxygen/html/linux_8c_source.html
> >
> > I've got this working to the extent that I can connect via JTAG to an ARMv7 U8500
> > Snowball board and enumerate a thread list in GDB. It required some debugging
> > & hacking to get this far.
> >
> > This implementation passes the thread list to GDB via the gdbremote protocol
> > and as such changes required in GDB are minimal.
> >
> > Advantages
> >  - OpenOCD already supports many target types (ARM / MIPS), and has support
> > for virt to phys translations / cache handling etc.
> >
> >  - OpenOCD also implements task awareness for other RTOS’s (ThreadX / FreeRTOS / eCos)
> >
> >  - Using gdbremote means GDB changes are (so far) minimal.
> >
> >  - OpenOCD / Linux Kernel dependency already exists
> >
> >
> > Disadvantages
> >  - Creates a dependency between Linux kernel data structures and OpenOCD.
> >
> >  - I believe finding field offsets within structs is currently not possible via
> >    gdbremote protocol.
> >
> >    Currently OpenOCD generates these offsets at compile time which is ugly and needs
> >    fixing. See here http://openocd.org/doc-release/doxygen/linux__header_8h_source.html.
> >
> >    Being able to find field offsets would in my opinion be a useful addition to the
> >    gdbremote protocol which would allow OpenOCD task awareness to work much better
> >    at runtime.
> >    
> >  - Doesn't support debugging user processes. I think this would still require some
> >    GDB changes, and also gdbremote protocol changes to get working correctly.
> >
> >  - Needs to be made more generic
> >
> >
> > 1.3) Python Task awareness
> > ==========================
> >
> > Jan Kiszka from Siemens has implemented some basic Linux kernel task awareness using
> > the GDB Python interface.
> >
> > See here https://lwn.net/Articles/631167/ and here
> > https://github.com/torvalds/linux/blob/master/Documentation/gdb-kernel-debugging.txt
> >
> > This support is currently limited to the following commands: -
> >
> > lx_current -- Return current task
> > lx_per_cpu -- Return per-cpu variable
> > lx_task_by_pid -- Find Linux task by PID and return the task_struct variable
> > lx_thread_info -- Calculate Linux thread_info from task variable
> >
> > However this could be extended to build up a thread list. The GDB Python interface
> > would I believe need to be extended to allow a thread list to be passed back to GDB
> > via Python.
> >
> > Advantages
> >  - Code parsing Linux kernel structures lives in the kernel source tree
> >  - Contextual information is easily available
> >  - Works with all OCD implementations not just OpenOCD 
> >
> > Disadvantages
> >  - Doesn't exist yet
> >  - Python / GDB interface would need to be extended to support threads
> >
> > Questions
> > ========
> >
> > 1) What does the GDB community think about having Linux OS task awareness in the GDB
> >    codebase?
> 
> For my part, I don't see a problem with that.  We already have code in GDB
> for supporting various runtime environments.  This would just be another
> one.  Currently, even if an external thread debug library is involved,
> the code that uses it is still specially crafted for a particular
> runtime and does not work for another.

OK, good to know that a C implementation inside GDB isn't necessarily a no-go
and that there are other implementations already merged.  This route would
enable the most re-use of the existing LKD implementation.

Does anyone have experience / thoughts on how often the existing threading
implementations that are already inside GDB break? Presumably these are also
tightly coupled to parsing out of tree data structures (or the libraries which
GDB relies on to do this).

I'm trying to get a feel for what the current maintenance burden is like having
these implementations in C.

> 
> > 2) Is there a preferred way to implement this (C extension / gdbremote / Python
> >    / something else)?
> 
> Since many existing JTAG debuggers as well as Qemu will not offer a
> Linux task list, I think such logic belongs on the GDB client side.
> It's also on the client side where you have the DWARF info from vmlinux
> that enables dealing with varying field offsets in structures.
> 
> Whether the logic should be written in C or as a Python (or Guile)
> extension, I don't have a strong opinion.

Ok.

> 
> > 3) Any other implementations / advantages / disadvantages that I'm not currently
> >    aware of?
> >
> > 2) Loadable module support
> > ==========================
> >
> > 2.1) LKD Loadable module support
> > ================================
> >
> > lkd-modules.c adds better symbolic debugging support for Linux kernel modules to GDB using the
> > GDB shared libraries infrastructure (solib).
> >
> > It has hooks to enable the debugging of module __init sections to reflect that these pages
> > get discarded post-load. It also implements the same section layout algorithm as
> > the kernel to speed up symbol resolution, only inspecting the target’s memory if there is a
> > mismatch (inspecting target memory can be slow).
> >
> > I think this part could be upstreamed separately from the task awareness support, although
> > I’ve not tried separating it yet from the other LKD patches.
> 
> Yes, I think that separate patch sets would be helpful.
> 
> > Advantages
> >  - Allows full symbolic debugging of Linux loadable modules including init sections
> >
> >  - Would be independent of underlying communication mechanism (JTAG / gdbremote etc)
> >
> > Disadvantages
> >  - Some dependency between GDB and Linux kernel is still present
> >
> > Questions:
> >
> > Is the GDB community happy to have Linux-specific solib functionality in the GDB code base?
> 
> For my part, same as with thread support: GDB already supports various
> shared library implementations and "knows" about their internals, so
> this would just be another one.

Ok good to know. I will take a read of the existing thread support already in GDB.
> 
> > 2.2) Python Loadable module support
> > ===================================
> >
> > I've not managed to look at this too much, but some basic support exists; see here
> > https://github.com/torvalds/linux/blob/master/Documentation/gdb-kernel-debugging.txt
> >
> > Having read the LKD modules implementation I don’t think this can be anywhere near
> > as functionally complete.
> >
> >
> > 3) Linux Helper commands
> > ========================
> >
> > 3.1) LKD helper commands
> > ========================
> >
> > LKD implements various Linux helper commands inside GDB such as: -
> >  - dmesg - dump dmesg log buffer from kernel
> >  - process_info - prints various info about current process
> >  - pmap - prints memory map of current process
> >  - vm_translate - translates virtual to physical address
> >  - proc_interrupts - prints interrupt statistics
> >  - proc_iomem - prints I/O mem map
> >  - proc_cmdline - prints the contents of /proc/cmdline
> >  - proc_version - prints the contents of /proc/version
> >  - proc_mounts - print the contents of /proc/mounts
> >  - proc_meminfo - print the contents of /proc/meminfo.
> >
> > Advantages
> >  - Can be used by all GDB based debug solutions
> >  - Some precedent of OS related commands in the GDB code base 
> >    https://sourceware.org/gdb/current/onlinedocs/gdb/OS-Information.html#OS-Information
> >
> > Disadvantages
> >  - Creates a GDB dependency with the kernel
> >
> > 3.2) Python helper commands
> > ===========================
> >
> > Jan’s python scripts also implement some of the same LKD commands such as 'dmesg'.
> >
> > See here https://lwn.net/Articles/631167/ and 
> > https://github.com/torvalds/linux/blob/master/Documentation/gdb-kernel-debugging.txt
> >
> > Advantages
> >  - Code that is parsing Linux kernel structures lives in the kernel source tree
> >  - Can be used by all GDB based debug solutions
> >
> >
> > Questions
> >  - Does the GDB community mind Linux-specific custom commands being added to the GDB code base?
> 
> I'm not sure about that one.  We also need to consider the capability
> for the user to add more such commands, e.g. specific to certain device
> drivers or certain debug scenarios.  In my view both cases should
> ideally be covered with the same underlying mechanism.
> 
> > My current opinion is that helper commands which can be migrated from C code into Python
> > should be, and then merged into the kernel source tree (and retired from the LKD patchset).
> 
> That's my current view as well.  Or maybe we could consider using EPPIC
> for those.  (See my Cauldron slides.)

Ok, I'm glad we're aligned on that. Once again many thanks for your feedback.

regards,

Peter.


* Re: RFC GDB Linux Awareness analysis
  2015-09-30 13:27   ` Peter Griffin
@ 2015-09-30 16:41     ` Duane Ellis
  2015-10-05 18:32       ` Doug Evans
  2015-10-01  9:25     ` Yao Qi
  2015-10-02 10:56     ` Andreas Arnez
  2 siblings, 1 reply; 8+ messages in thread
From: Duane Ellis @ 2015-09-30 16:41 UTC (permalink / raw)
  To: Peter Griffin; +Cc: Andreas Arnez, gdb, lee.jones

I would offer the following for GDB Kernel Dump analysis - there is *a lot* more that is needed.

1)	It is not uncommon to have a raw ram dump of the running system captured by some other means
	This RAMDUMP can be loaded into some type of CPU or MEMORY simulator 
	By raw RAMDUMP - assume the target has a 1GIG memory, the raw ram dump file is exactly 1 gig
	And - may contain the contexts of other non-linux memory sections

	Other non-linux components examples include:
		GPU state,  power management processor state, DSP state
		Or perhaps a video processing subsystem (ie: AMP  core 0 = linux, core 1= dedicated video)

2) These other things need to be examined “in context” 
	That context might be a different endian order
	A different instruction set
	Or different structure packing rules
	Or perhaps they are encoded logs that need to be converted to readable ascii text

3) GDB - as generally stated in the Cauldron slide deck - is an “application debugger”, not a bare metal debugger
	I cannot agree with this more - and it is a fundamental limitation for GDB
	And it is the source of much of the attitude about what gets done with GDB

	Crashdump is exactly a bare metal debug situation
	But you can also think about live target debug.
	Example:  Step through user space into the kernel, and perhaps into a hypervisor call
	And debug each of these situations within their separate (and different) contexts.

4) In the bare metal world - GDB has a really big (fundamental) problem - GDB thinks an address is an integer
    These problems exist during *LIVE* debug of a target, and during postmortem debug analysis of a crash dump
     LLDB has the same issue.

	It is not - an address in bare metal consists of 3 components, I would call these “memory access attributes":
	Component (A) the integer like index into the memory region
	Component (B) The route or memory region identifier
	Component (C) Attributes specific to the memory region

   Some examples include:
	ARM Trust Zone - secure vs non-secure access (dump logs in trust zone)
	Dump context of a non-active thread in that thread's virtual memory configuration
	Some SOCs include alternate access means to memory
	For example an ARM SoC has the CPU memory aperture (view) of system memory
	And the ARM SoC may have the DAP’s view of memory access.
	Access to system registers of some type (i.e.: ARM  mrc/mcr/mrs/msr, other cpus have other forms)
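
	In other words, every access really carries a tuple of information.
	Purely illustrative (not a proposal for GDB's internal representation),
	in Python it would be something shaped like:

	    from collections import namedtuple

	    # The three components above: the plain offset, the route (CPU view,
	    # DAP view, secure/non-secure, ...), and region-specific attributes.
	    AttributedAddr = namedtuple("AttributedAddr", ["offset", "route", "attrs"])

	    addr = AttributedAddr(offset=0x80001000, route="dap", attrs={"secure": False})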

5) In the crash dump case, the memory emulation system needs to be told *where* the active MMU table is
	The memory simulation needs a means to set the MMU translation table base register(s)

	In the crash dump case, GDB will issue a “read memory request” to examine a data structure.
	The memory simulation needs to perform MMU page table walks

6)  Lots of the above needs to be scripted (i.e.: Python is a great solution here, but is not always present)
	And - these scripts could be provided by the Linux Kernel build process
	Specifically: The kernel build process should produce an architecture/build-specific data file with structure definition
	These scripts that I talk about, could read the ‘build-specific’ data file 
	(more about this later)

7) A good example of scripting is during postmortem debug 
	GDB cannot call (execute) a helper function within the target because the target is not “live”
	Thus, many of these things have to be written on the host side

	Writing this in C is painful…  Python offers some better solution and increased flexibility

8) We are talking here about a command line, ASCII text interface for GDB
	There is another slew of implications when you add a GUI window 
	For example - how do you specify a memory access for a memory (variable) dump (or watch) window?
	
	There are other interesting windows - things that display CPU configuration registers
	(ie: MMU enable/disable, cache control,  the list goes on)
	
	I’m debugging a kernel - so these things are relevant to the debug session
	And thus the debugger should provide a means to access these items

9)  The GDB expression parser (ie: address parser) needs to support casting to a memory address with attributes
	For example:  I have a “phys_pointer”  I want to cast this to a C data structure
	But - some variables are accessed via “the current [default] memory configuration”
	But other variables (the one I just cast) need to use a different memory access configuration

So what is the way out of this:

1)	GDB needs to become more “bare metal friendly”  or at least “bare metal aware"

2) 	GDB fundamentally needs the ability to specify the 3 missing elements to an address.
	This needs to become pervasive throughout GDB
	This is not a simple change - but a means needs to be created

Example:
	The commercial debugger from Lauterbach uses what they call a “memory class”
	In that debugger the feature is pervasive  - every target SYMBOL can have an access attribute
	Memory display windows have attributes
	Items in a dialog box (i.e.: CPU register view window) can have attributes

3)	Some attitudes need to change about where things belong
	Some argue that feature (X) belongs in the gdb server
	And others say (Y) belongs in the GUI side of things

	I agree - architecturally some of these things belong in those other places
	But these do not address the scripting problem.

	Here’s an example:
		My target performs image or video processing

	With GDB - we are talking about 3 separate processes.
	The GUI, GDB itself, and the ‘gdb-remote’ or simulation process

	I want to point at a physical memory location (the image buffer)
	That memory buffer could be video bit planes, or RGBA data
	Or maybe it is some software defined radio data stream

	Accessing this might require starting with a data structure, an element within that structure (data pointer) 
	I can now access that memory - I have the data in a Python array of some type

	Can I use the graphics library in Python to display my image or wave form?
	Can I use Python graphics to create a task switch timeline graph?
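
	For the simple cases the pieces exist today.  A rough, untested sketch
	(it assumes GDB's Python can import numpy and matplotlib, and that the
	buffer is raw 8-bit grayscale):

	    import gdb
	    import numpy as np
	    import matplotlib.pyplot as plt

	    def show_frame(addr, width, height):
	        # Pull the raw pixels straight out of target memory ...
	        buf = gdb.selected_inferior().read_memory(addr, width * height)
	        # ... and hand them to the ordinary Python plotting stack.
	        img = np.frombuffer(buf, dtype=np.uint8).reshape(height, width)
	        plt.imshow(img, cmap="gray")
	        plt.show()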

Here’s an example (finger print)
	http://www.analog.com/library/analogdialogue/archives/42-07/fingerprint.html

	This is not limited to BARE METAL - what about applications that manipulate images?

	How can I write *ONE* script that controls all three execution processes:
		The GUI - eclipse or DDD or something else
		GDB - the core, what we talk about here
		GDB-REMOTE - which might be a JTAG thing, or a SIMULATION thing

	Attitudes that keep these separate make scripting these solutions impossibly hard

4) GDB does not currently expose enough via the scripting interface
	As I stated above - attitudes need to change about GDB
	LLDB suffers from this also - i.e.: It needs to work when Python is not present…. Grrr...
	
	In order to solve these bare-metal problems somebody needs to write code to make it happen.
	In effect, these would be “bare metal plugin” features

	You could think of “image processing” or “DSP aware” features as another plugin.

	Python offers that plugin solution :-)

5) The GDB server thing (for jtag/bare metal) needs to change
	But that is a discussion for a different day and a different email chain


-Duane



* Re: RFC GDB Linux Awareness analysis
  2015-09-30 13:27   ` Peter Griffin
  2015-09-30 16:41     ` Duane Ellis
@ 2015-10-01  9:25     ` Yao Qi
  2015-10-02 10:56     ` Andreas Arnez
  2 siblings, 0 replies; 8+ messages in thread
From: Yao Qi @ 2015-10-01  9:25 UTC (permalink / raw)
  To: Peter Griffin; +Cc: Andreas Arnez, gdb, lee.jones

Peter Griffin <peter.griffin@linaro.org> writes:

> Does anyone have experience / thoughts on how often the existing threading
> implementations that are already inside GDB break? Presumably these are also
> tightly coupled to parsing out of tree data structures (or the libraries which
> GDB relies on to do this).

It shouldn't break in any case.  There are some conventions in these
thread libraries to make them debuggable.  In glibc NPTL,
__nptl_create_event should be called after a thread is created.  GDB
relies on this, and sets a breakpoint on it to monitor thread
creation.  In the AIX thread lib, GDB relies on the symbol name returned
by pthdb_session_pthreaded, and sets a breakpoint on it similarly.

We need the Linux kernel to have such a convention with GDB, that is, a
function that is called after a thread is created, and whose name
shouldn't change across different versions of the kernel.
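
From the Python side such a convention could be consumed with an
internal breakpoint whose stop handler just refreshes GDB's view.
Roughly (untested; refresh_task_list is a hypothetical hook that would
re-read the kernel's task list, and the breakpoint location is a
placeholder for whatever function the convention names):

  import gdb

  class TaskCreateBreakpoint(gdb.Breakpoint):
      def __init__(self, spec):
          super(TaskCreateBreakpoint, self).__init__(spec, internal=True)

      def stop(self):
          refresh_task_list()   # hypothetical: re-enumerate kernel tasks
          return False          # do not stop; let the target continue

  TaskCreateBreakpoint("some_task_creation_hook")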

-- 
Yao (齐尧)


* Re: RFC GDB Linux Awareness analysis
  2015-09-30 13:27   ` Peter Griffin
  2015-09-30 16:41     ` Duane Ellis
  2015-10-01  9:25     ` Yao Qi
@ 2015-10-02 10:56     ` Andreas Arnez
  2 siblings, 0 replies; 8+ messages in thread
From: Andreas Arnez @ 2015-10-02 10:56 UTC (permalink / raw)
  To: Peter Griffin; +Cc: gdb, lee.jones

On Wed, Sep 30 2015, Peter Griffin wrote:

> Hi Andreas,
>
> On Thu, 20 Aug 2015, Andreas Arnez wrote:
>
> [...]
>> So I believe posting the patches to get more feedback would be
>> worthwhile.
>
> OK, so the existing LKD code is based on quite an old GDB version
> (7.6), and the whole LKD patchset is currently quite large (15k LoC).
> So my plan is to reduce this as much as possible by removing the parts
> of LKD that could have (or already have) Python implementations,
> e.g. dmesg.
>
> Essentially the aim would be to reduce the LKD patchset as much as
> possible to the "core features", which I consider to be task awareness
> and loadable module support.  At that point we could look at what might
> need adding to the GDB Python API to migrate even more parts into
> Python.
>
> Posting the whole patchset now would involve porting it to the latest
> GDB version, and before doing that I'd like to get a better idea of
> which parts are likely to need rewriting, so that I only port the
> necessary parts.

I suggest starting with a small patch set that can be reviewed easily --
and changed, if necessary -- such as loadable module support, or
whatever else you deem best suited for laying the foundations for this
project.  And then see how it goes.

>[...]
>
> Does anyone have experience / thoughts on how often the existing threading
> implementations that are already inside GDB break? Presumably these are also
> tightly coupled to parsing out of tree data structures (or the libraries which
> GDB relies on to do this).
>
> I'm trying to get a feel for what the current maintenance burden is like having
> these implementations in C.

A user-space runtime usually has some sort of "debug interface" which
gives certain guarantees about reliable breakpoint targets and data
structure layouts.  Such a convention allows the GDB support to stay
pretty stable.

Ideally the Linux kernel runtime provides a similar interface.  For
instance, see the description of do_init_module() in module.c:

  /*
   * This is where the real work happens.
   *
   * Keep it uninlined to provide a reliable breakpoint target, e.g. for the gdb
   * helper command 'lx-symbols'.
   */

One thing we haven't addressed so far is testing.  Usually a new GDB
feature should come with a regression test case, such that it's less
likely to break after unrelated changes.

Maybe we can put various vmlinux- and associated crash dump binaries
into the test suite, along with a test case that loads/analyzes those.
However, such binaries can grow fairly large (many Giga- or even
Terabytes), so I'm not sure whether this is a viable option.  Maybe
remove everything from the binaries that is irrelevant to the test case?

Other ideas for testing?

--
Andreas


* Re: RFC GDB Linux Awareness analysis
  2015-09-30 16:41     ` Duane Ellis
@ 2015-10-05 18:32       ` Doug Evans
  0 siblings, 0 replies; 8+ messages in thread
From: Doug Evans @ 2015-10-05 18:32 UTC (permalink / raw)
  To: Duane Ellis; +Cc: Peter Griffin, Andreas Arnez, gdb, Lee Jones

On Wed, Sep 30, 2015 at 9:41 AM, Duane Ellis <duane@duaneellis.com> wrote:
> I would offer the following for GDB Kernel Dump analysis - there is *a lot* more that is needed.
> ...
> 3) GDB - as generally stated in the Cauldron slide deck - is an “application debugger”, not a bare metal debugger
>         I cannot agree with this more - and it is a fundamental limitation for GDB

The extent to which gdb is not a bare metal debugger is IMO mostly
driven by patches.
Application level debugging gets more attention.
But if reasonable patches came our way I would certainly approve them.
[The devil is, again, in the details of course.
One of the harder parts of hacking on gdb is that you can't break, or
make harder, any of the myriad of other supported use cases, some of
which are so old it's a crime that we're still having to put time into
them (IMO).]

> 4) In the bare metal world - GDB has a really big (fundamental) problem - GDB thinks an address is an integer

This has been an off and on discussion in gdb since at least as far
back as Cygnus days. :-)
No disagreement here that we have the wrong kind of abstraction for an address.

> 6)  Lots of the above needs to be scripted (i.e.: Python is a great solution here, but is not always present)
>         And - these scripts could be provided by the Linux Kernel build process
>         Specifically: The kernel build process should produce an architecture/build-specific data file with structure definition
>         These scripts that I talk about, could read the ‘build-specific’ data file
>         (more about this later)

I'd like to see a world where we actually open up gdb innards more,
instead of providing
an API on top of a closed system. The scripting possibilities would
increase by at least
an order of magnitude.

> 7) A good example of scripting is during postmortem debug
>         GDB cannot call (execute) a helper function within the target because the target is not “live”

Depends on what the helper function does of course.
E.g., it's possible to resurrect a corefile (assuming it hasn't been
truncated, etc.)
enough to run a pretty-printer contained in the app (as opposed to in python).

> 4) GDB does not currently expose enough via the scripting interface
>         As I stated above - attitudes need to change about GDB

No disagreement here.


* Re: RFC GDB Linux Awareness analysis
  2015-10-05 18:54 duane
@ 2015-10-05 19:41 ` Doug Evans
  0 siblings, 0 replies; 8+ messages in thread
From: Doug Evans @ 2015-10-05 19:41 UTC (permalink / raw)
  To: duane; +Cc: Peter Griffin, Andreas Arnez, gdb, Lee Jones

On Mon, Oct 5, 2015 at 11:54 AM,  <duane@duaneellis.com> wrote:
> duane> 7) A good example of scripting is during postmortem debug
> duane> GDB cannot call (execute) a helper function within the
> duane> target because the target is not “live”
>
> doug> Depends on what the helper function does of course.
> doug> E.g., it's possible to resurrect a corefile (assuming it hasn't been
> doug> truncated, etc.) enough to run a pretty-printer contained in the
> doug> app (as opposed to in python).
>
> Yes, as you said "it depends on what it does"

And it depends on the debugging environment.

> I would say almost categorically you *cannot* use the target pretty
> printer.

Whatever. It was just an example.

> Every target is different, every embedded system is different.
>
> Some of these pretty print problems also occur when live debugging.
>
> This whole area is one giant rat hole of problems to make work
> universally.

I think you'll find you're preaching to the choir here.


* RE: RFC GDB Linux Awareness analysis
@ 2015-10-05 18:54 duane
  2015-10-05 19:41 ` Doug Evans
  0 siblings, 1 reply; 8+ messages in thread
From: duane @ 2015-10-05 18:54 UTC (permalink / raw)
  To: Doug Evans; +Cc: Peter Griffin, Andreas Arnez, gdb, Lee Jones

duane> 7) A good example of scripting is during postmortem debug
duane> GDB cannot call (execute) a helper function within the 
duane> target because the target is not “live”

doug> Depends on what the helper function does of course.
doug> E.g., it's possible to resurrect a corefile (assuming it hasn't been
doug> truncated, etc.) enough to run a pretty-printer contained in the 
doug> app (as opposed to in python).

Yes, as you said "it depends on what it does"

I would say almost categorically you *cannot* use the target pretty
printer.

Bare metal is often a synonym for "cross compiling". That means
host != target, and loading a complete image into a real target is 
not always viable, or possible. (ie: You may need to construct
MMU page tables, or initialize memory decoders, or DDR to make memory
work.)

The advantage is: The Target CPU might be able to execute opcodes :-)

The easier method is loading into a 'memory only' emulation and not
supporting execution. Yes, you could use an opcode simulator... (ie: the
armulator) but that is not viable for all CPU types.

Using your "pretty print" solution as the example - another problem is
the run time library support. I'll bet it would be very helpful to use
the target's version of "printf()" in some way (maybe sprintf())

The problem is that the underlying printf() routines in the target
C library in some cases [often!] use malloc/free to manage buffers

And thus, often make use of some OS feature like a semaphore or mutex.
Thus, while calling this special function, interrupts get turned back on
and who knows what else gets changed.

Every target is different, every embedded system is different.

Some of these pretty print problems also occur when live debugging.

This whole area is one giant rat hole of problems to make work
universally.

-Duane.



