* Re: RFC GDB Linux Awareness analysis [not found] <20150603142858.GA19370@griffinp-ThinkPad-X1-Carbon-2nd> @ 2015-08-20 18:22 ` Andreas Arnez 2015-09-30 13:27 ` Peter Griffin 0 siblings, 1 reply; 8+ messages in thread From: Andreas Arnez @ 2015-08-20 18:22 UTC (permalink / raw) To: Peter Griffin; +Cc: gdb, lee.jones Hi, Sorry for the late reply. Yao Qi pointed me to this RFC; I've just missed it before. For those who have missed it as well, the original posting was here: https://sourceware.org/ml/gdb-patches/2015-06/msg00040.html Independently of that proposal I have presented some thoughts about improving GDB's Linux kernel debugging support on the GNU Tools Cauldron. Surprisingly or not, some of the goals look very similar. The slides are here: https://gcc.gnu.org/wiki/cauldron2015?action=AttachFile&do=view&target=Andreas+Arnez_+Debugging+Linux+kernel+dumps+with+GDB.pdf In that talk I was focusing more on kernel dump debugging rather than live debugging, but I also tried to emphasize the fact that the feature sets required for both (should) have a large overlap. Only a few features are specific to one or the other, e.g. live debugging requires a notification method for refreshing the task list and the modules list. From the resonance in the talk I got the impression that there is indeed wide-spread interest in having improved kernel debug support in GDB. So I believe posting the patches to get more feedback would be worth-while. On Wed, Jun 03 2015, Peter Griffin wrote: > Hi GDB community, > > Overview > ======== > > The purpose of this email is to describe a useful feature which has been developed > by STMicroelectronics inside of GDB for debugging Linux kernels. I will cover at > a high level some of the implementation details and what I see as the advantages > / disadvantages of the implementation. I will also cover some other alternative > approaches that I'm aware of. 
> > The purpose is to facilitate discussion with the GDB experts on this > mailing list as to what the "correct" way to implement this functionality would > be. I'm copying gdb@sourceware.org instead of gdb-patches, since I think it's better suited for now -- until we are actually discussing a patch. > The end goal is to have an upstream implementation of this functionality. > > Introduction > ============ > > STMicroelectronics has a patchset on top of vanilla GDB which adds much better > Linux kernel awareness into GDB. They have called this GDB extension LKD (Linux > Kernel Debugger). This GDB extension is primarily used in conjunction with ST's > JTAG debugger for debugging ARM / SH4 SoCs, and is implemented as an internal > GDB extension, written in C. > > The LKD extension is nicely abstracted from the underlying JTAG interface, > and I have used it to debug a ARMv7 kernel running in QEMU via gdbremote. > > ST would like to contribute these patches back to GDB, as we think it could > be useful not only for ARM Linux debugging, but also other CPU/OS combinations > in the future. > > LKD can be broadly split into the following parts: - > > 1) Linux task awareness > 2) Loadable module support > 3) Various Linux helper commands > > The next section looks at each of the three parts, and any other implementations > I'm aware of that currently exist. > > > 1) Task Awareness > ================= > > 1.1) LKD Linux Task awareness > ============================= > > When using mainline GDB for debugging a Linux kernel via JTAG GDB will typically > only show the actual hardware threads. > > The LKD task awareness extension (lkd-process.c) adds the ability for GDB to > parse some kernel data structures, so it can build a thread list of all kernel > threads in the running Linux kernel. 
> > When halting the processor (via a breakpoint, ctrl+c etc) the thread list is > re-enumerated, so new tasks are visible to GDB and tasks which have exited > are removed from the thread list. > > To achieve this GDB has to know about various Linux kernel data structures, > and also various fields within these structures (mainly task_struct, > thread_info, mm_struct). Code has been implemented to parse these structs, > and also do virtual to physical address translations and cache handling. > Various frame unwinders are also implemented to stop backtracing on various > exceptions, and entry points to the kernel (these would be a useful addition > regardless of which task awareness approach is taken). > > Advantages > - Adds Linux kernel task awareness to GDB > - Supports symbolic debugging of user processes > - Contextual information (structs / field offsets) are readily available inside of GDB > - Has been used and well tested inside ST for some time > > Disadvantages > - Being implemented in C within GDB creates a dependency between GDB and the Linux kernel > - Mainly tested on 3.10 and 2.6.30 with ARM and SH4 kernels, being upstreamed would > expose it to many differing kernel versions and architectures. > > - I can't see any other "OS awareness" support currently in the GDB code base Actually, it seems that there's a bit of special OS kernel debug support, e.g. in bsd-kvm.c. But I'd rather view this as providing support for the Linux kernel *runtime* and compare it to components like linux-thread-db.c for user-space threads. GDB has several of those already. They often leave the internals of accessing the thread library's data structures to a "thread debug library" outside GDB. We may or may not want to do something similar for the kernel, depending on the complexity of the data structures involved and on the likelihood for them to change. 
Maybe the "thread debug library" could actually be a Python script provided by the kernel developers, as outlined in 1.3 below. Or maybe it's just easier to put all the required logic into GDB and perform appropriate version checks where necessary. > 1.2) OpenOCD / GDB Linux task awareness > ======================================= > > Whilst looking for any prior art in this area I found that OpenOCD already > implements some basic Linux task awareness. See here: - > http://openocd.org/doc/doxygen/html/linux_8c_source.html > > I've this working to the extent where I can connect via JTAG to a ARMv7 U8500 > Snowball board and enumerate a thread list in GDB. It required some debugging > & hacking to get this far. > > This implementation passes the thread list to GDB via the gdbremote protocol > and as such changes required in GDB are minimal. > > Advantages > - OpenOCD already supports many target types (ARM / MIPS), and has support > for virt to phys translations / cache handling etc. > > - OpenOCD also implements task awareness for other RTOS’s (ThreadX / FreeRTOS / eCos) > > - Using gdbremote means GDB changes are (so far) minimal. > > - OpenOCD / Linux Kernel dependency already exists > > > Disadvantages > - Creates a dependency between Linux kernel data structures and OpenOCD. > > - I believe finding field offsets within structs is currently not possible via > gdbremote protocol. > > Currently OpenOCD generates these offsets at compile time which is ugly and needs > fixing. See here http://openocd.org/doc-release/doxygen/linux__header_8h_source.html. > > Being able to find field offsets would in my opinion be a useful addition to the > gdbremote protocol which would allow OpenOCD task awareness to work much better > at runtime. > > - Doesn't support debugging user processes. I think this would still require some > GDB changes, and also gdbremote protocol changes to get working correctly. 
> > - Needs to be made more generic > > > 1.3) Python Task awareness > ========================== > > Jan Kiszka from Siemens has implemented some basic Linux kernel task awareness using > the GDB Python interface. > > See here https://lwn.net/Articles/631167/ and here > https://github.com/torvalds/linux/blob/master/Documentation/gdb-kernel-debugging.txt > > This support is currently limited to the following commands: - > > lx_current -- Return current task > lx_per_cpu -- Return per-cpu variable > lx_task_by_pid -- Find Linux task by PID and return the task_struct variable > lx_thread_info -- Calculate Linux thread_info from task variable > > However this could be extended to build up a thread list. The GDB Python interface > would I believe need to be extended to allow a thread list to be passed back to GDB > via Python. > > Advantages > - Code parsing Linux kernel structures lives in the kernel source tree > - Contextual information is easily available > - Works with all OCD implementations not just OpenOCD > > Disadvantages > - Doesn't exist yet > - Python / GDB interface would need to be extended to support threads > > Questions > ======== > > 1) What do GDB community think about having Linux OS task awareness in the GDB > codebase? For my part, I don't see problem with that. We already have code in GDB for supporting various runtime environments. This would just be another one. Currently, even if an external thread debug library is involved, the code that uses it is still specially crafted for a particular runtime and does not work for another. > 2) Is there a preferred way to implement this (C extension / gdbremote / Python > / something else)? Since many existing JTAG debuggers as well as Qemu will not offer a Linux task list, I think such logic belongs on the GDB client side. It's also on the client side where you have the DWARF info from vmlinux that enables dealing with varying field offsets in structures. 
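To make the point about client-side field offsets concrete, here is a minimal sketch of a task-list walk driven purely by an offset table. It is a simulation under stated assumptions, not LKD or GDB code: the OFFSETS values are made up and only stand in for what would be read from the vmlinux DWARF info, FakeMemory stands in for target memory access over JTAG or gdbremote, and a 64-bit little-endian target is assumed.

```python
import struct

# Hypothetical field offsets, standing in for values a debugger would take
# from vmlinux DWARF info. Real offsets vary per kernel version and config.
OFFSETS = {
    "task_struct.tasks": 16,   # embedded struct list_head (next, prev)
    "task_struct.pid": 32,     # 32-bit pid
    "task_struct.comm": 36,    # 16-byte NUL-padded name
}
TASK_SIZE = 52                 # size of the fake task_struct used here
PTR = struct.Struct("<Q")      # assumption: 64-bit little-endian target


class FakeMemory:
    """Stand-in for target memory access (JTAG, gdbremote or a dump)."""

    def __init__(self, base, size):
        self.base = base
        self.data = bytearray(size)

    def read(self, addr, length):
        start = addr - self.base
        return bytes(self.data[start:start + length])

    def write(self, addr, payload):
        start = addr - self.base
        self.data[start:start + len(payload)] = payload


def read_ptr(mem, addr):
    return PTR.unpack(mem.read(addr, PTR.size))[0]


def walk_tasks(mem, init_task_addr, offsets):
    """Yield (pid, comm) for each entry on the circular 'tasks' list."""
    list_off = offsets["task_struct.tasks"]
    head = init_task_addr + list_off
    node = head
    while True:
        task = node - list_off  # container_of(node, struct task_struct, tasks)
        raw_pid = mem.read(task + offsets["task_struct.pid"], 4)
        raw_comm = mem.read(task + offsets["task_struct.comm"], 16)
        yield struct.unpack("<i", raw_pid)[0], raw_comm.split(b"\0")[0].decode()
        node = read_ptr(mem, node)  # follow list_head.next
        if node == head:
            break


def build_fake_tasks(mem, tasks):
    """Lay out fake task_structs back to back and link them in a ring."""
    addrs = [mem.base + i * TASK_SIZE for i in range(len(tasks))]
    list_off = OFFSETS["task_struct.tasks"]
    for i, (pid, comm) in enumerate(tasks):
        nxt = addrs[(i + 1) % len(addrs)] + list_off
        prv = addrs[(i - 1) % len(addrs)] + list_off
        mem.write(addrs[i] + list_off, PTR.pack(nxt) + PTR.pack(prv))
        mem.write(addrs[i] + OFFSETS["task_struct.pid"], struct.pack("<i", pid))
        mem.write(addrs[i] + OFFSETS["task_struct.comm"],
                  comm.encode().ljust(16, b"\0"))
    return addrs[0]


mem = FakeMemory(base=0xffff000000000000, size=4096)
init_task = build_fake_tasks(mem, [(0, "swapper"), (1, "init"), (2, "kthreadd")])
task_list = list(walk_tasks(mem, init_task, OFFSETS))
print(task_list)
```

The walker itself never hard-codes a structure layout; swapping the OFFSETS table is all that is needed to follow a differently configured kernel, which is the property that having the debug info on the client side provides.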
Whether the logic should be written in C or as a Python (or Guile) extension, I don't have a strong opinion. > 3) Any other implementations / advantages / disadvantages that I'm not currently > aware of? > > 2) Loadable module support > ========================== > > 2.1) LKD Loadable module support > ================================ > > lkd-modules.c adds better Linux kernel module symbolic symbol support to GDB using the > GDB shared libraries infrastructure (solib). > > It has hooks to enable the debugging of module __init sections to reflect that these pages > get discarded post-load. It also implements the same section layout algorithm as > the kernel to speed up symbol resolution, only inspecting the target’s memory if there is a > mismatch (inspecting target memory can be slow). > > I think this part could be upstreamed separately to the task awareness support, although > I’ve not tried separating it yet from the other LKD patches. Yes, I think that separate patch sets would be helpful. > Advantages > - Allows full symbolic debugging of Linux loadable modules including init sections > > - Would be independent of underlying communication mechanism (JTAG / gdbremote etc) > > Disadvantages > - Some dependency between GDB and Linux kernel is still present > > Questions: > > Are GDB community happy to have a Linux specific solib functionality in the GDB code base? For my part, same as with thread support: GDB already supports various shared library implementations and "knows" about their internals, so this would just be another one. > 2.2) Python Loadable module support > =================================== > > I've not managed to look to much at this, but some basic support exists, see here > https://github.com/torvalds/linux/blob/master/Documentation/gdb-kernel-debugging.txt > > Having read the LKD modules implementation I don’t think this can be anywhere near > as functionally complete. 
> > > 3) Linux Helper commands > ======================== > > 3.1) LKD helper commands > ======================== > > LKD implements various Linux helper commands inside GDB such as: - > - dmesg - dump dmesg log buffer from kernel > - process_info - prints various info about current process > - pmap - prints memory map of current process > - vm_translate - translates virtual to physical address > - proc_interrupts - prints interrupt statistics > - proc_iomem - prints I/O mem map > - proc_cmdline - prints the contents of /proc/cmdline > - proc_version - prints the contents of /proc/version > - proc_mounts - print the contents of /proc/mounts > - proc_meminfo - print the contents of /proc/meminfo. > > Advantages > - Can be used by all GDB based debug solutions > - Some precedent of OS related commands in the GDB code base > https://sourceware.org/gdb/current/onlinedocs/gdb/OS-Information.html#OS-Information > > Disadvantages > - Creates a GDB dependency with the kernel > > Python helper commands > ====================== > > Jan’s python scripts also implement some of the same LKD commands such as 'dmesg'. > > See here https://lwn.net/Articles/631167/ and > https://github.com/torvalds/linux/blob/master/Documentation/gdb-kernel-debugging.txt > > Advantages > - Code that is parsing Linux kernel structures lives in the kernel source tree > - Can be used by all GDB based debug solutions > > > Questions > - Do GDB community mind Linux specific custom commands being added to GDB code base? I'm not sure about that one. We also need to consider the capability for the user to add more such commands, e.g. specific to certain device drivers or certain debug scenarios. In my view both cases should ideally be covered with the same underlying mechanism. > My current opinion is that helper commands which can be, should be migrated from C code > into Python, and merged into the kernel source tree (and then retired from the LKD patchset). That's my current view as well. 
Or maybe we could consider using EPPIC for those. (See my Cauldron slides.) > If you got here, thanks for reading this far! Like I said at the beginning, the purpose of > this email is to stimulate some discussion on what you folks consider the 'correct' way to > implement this OS awareness functionality is. > > All feedback is welcome. > > Kind regards, > > Peter. ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: RFC GDB Linux Awareness analysis 2015-08-20 18:22 ` RFC GDB Linux Awareness analysis Andreas Arnez @ 2015-09-30 13:27 ` Peter Griffin 2015-09-30 16:41 ` Duane Ellis ` (2 more replies) 0 siblings, 3 replies; 8+ messages in thread From: Peter Griffin @ 2015-09-30 13:27 UTC (permalink / raw) To: Andreas Arnez; +Cc: gdb, lee.jones Hi Andreas, On Thu, 20 Aug 2015, Andreas Arnez wrote: Many thanks for your feedback. > > Sorry for the late reply. Yao Qi pointed me to this RFC; I've just > missed it before. For those who have missed it as well, the original > posting was here: > > https://sourceware.org/ml/gdb-patches/2015-06/msg00040.html > > Independently of that proposal I have presented some thoughts about > improving GDB's Linux kernel debugging support on the GNU Tools > Cauldron. Surprisingly or not, some of the goals look very similar. > The slides are here: > > >https://gcc.gnu.org/wiki/cauldron2015?action=AttachFile&do=view&target=Andreas+Arnez_+Debugging+Linux+kernel+dumps+with+GDB.pdf Thanks for the link to your slides, that is very interesting. Judging from your slides, LKD already implements some of the features you were proposing. For example, info threads in LKD reports running and sleeping threads, and the loadable module support reuses the solib infrastructure inside GDB. > > In that talk I was focusing more on kernel dump debugging rather than > live debugging, but I also tried to emphasize the fact that the feature > sets required for both (should) have a large overlap. Only a few > features are specific to one or the other, e.g. live debugging requires > a notification method for refreshing the task list and the modules list. > > From the resonance in the talk I got the impression that there is indeed > wide-spread interest in having improved kernel debug support in GDB. OK, that is good news. One reason for posting the RFC to the mailing list was to try to gauge how much support there is in the GDB community for having such a feature.
So it is good to know that, at least so far, it seems positive. > So I believe posting the patches to get more feedback would be worth-while. OK, so the existing LKD code is on quite an old GDB version (7.6). Also, the whole LKD patchset is currently quite large (15k LoC). So my plan was to try to reduce this as much as possible by removing parts of LKD that can have (or already have) Python implementations, e.g. dmesg. Essentially the aim would be to reduce the LKD patchset as much as possible to the "core features", which I consider to be task awareness and loadable module support. At this point we could look at what might need adding to the GDB Python API to migrate even more parts into Python. Posting the whole patchset currently would involve porting to the latest GDB version, and before doing that I'd like to get a better idea of which parts are likely to need re-writing, or at least port only the necessary parts. > > On Wed, Jun 03 2015, Peter Griffin wrote: > > > Hi GDB community, > > > > Overview > > ======== > > > > The purpose of this email is to describe a useful feature which has been developed > > by STMicroelectronics inside of GDB for debugging Linux kernels. I will cover at > > a high level some of the implementation details and what I see as the advantages > > / disadvantages of the implementation. I will also cover some other alternative > > approaches that I'm aware of. > > > > The purpose is to facilitate discussion with the GDB experts on this > > mailing list as to what the "correct" way to implement this functionality would > > be. > > I'm copying gdb@sourceware.org instead of gdb-patches, since I think > it's better suited for now -- until we are actually discussing a patch. Thanks :) > > > The end goal is to have an upstream implementation of this functionality. > > > > Introduction > > ============ > > > > STMicroelectronics has a patchset on top of vanilla GDB which adds much better > > Linux kernel awareness into GDB.
They have called this GDB extension LKD (Linux > > Kernel Debugger). This GDB extension is primarily used in conjunction with ST's > > JTAG debugger for debugging ARM / SH4 SoCs, and is implemented as an internal > > GDB extension, written in C. > > > > The LKD extension is nicely abstracted from the underlying JTAG interface, > > and I have used it to debug a ARMv7 kernel running in QEMU via gdbremote. > > > > ST would like to contribute these patches back to GDB, as we think it could > > be useful not only for ARM Linux debugging, but also other CPU/OS combinations > > in the future. > > > > LKD can be broadly split into the following parts: - > > > > 1) Linux task awareness > > 2) Loadable module support > > 3) Various Linux helper commands > > > > The next section looks at each of the three parts, and any other implementations > > I'm aware of that currently exist. > > > > > > 1) Task Awareness > > ================= > > > > 1.1) LKD Linux Task awareness > > ============================= > > > > When using mainline GDB for debugging a Linux kernel via JTAG GDB will typically > > only show the actual hardware threads. > > > > The LKD task awareness extension (lkd-process.c) adds the ability for GDB to > > parse some kernel data structures, so it can build a thread list of all kernel > > threads in the running Linux kernel. > > > > When halting the processor (via a breakpoint, ctrl+c etc) the thread list is > > re-enumerated, so new tasks are visible to GDB and tasks which have exited > > are removed from the thread list. > > > > To achieve this GDB has to know about various Linux kernel data structures, > > and also various fields within these structures (mainly task_struct, > > thread_info, mm_struct). Code has been implemented to parse these structs, > > and also do virtual to physical address translations and cache handling. 
> > Various frame unwinders are also implemented to stop backtracing on various > > exceptions, and entry points to the kernel (these would be a useful addition > > regardless of which task awareness approach is taken). > > > > Advantages > > - Adds Linux kernel task awareness to GDB > > - Supports symbolic debugging of user processes > > - Contextual information (structs / field offsets) are readily available inside of GDB > > - Has been used and well tested inside ST for some time > > > > Disadvantages > > - Being implemented in C within GDB creates a dependency between GDB and the Linux kernel > > - Mainly tested on 3.10 and 2.6.30 with ARM and SH4 kernels, being upstreamed would > > expose it to many differing kernel versions and architectures. > > > > - I can't see any other "OS awareness" support currently in the GDB code base > > Actually, it seems that there's a bit of special OS kernel debug > support, e.g. in bsd-kvm.c. Thanks for the pointer, I will take a look in there. It's useful to know where all the OS specific bits in GDB are already. > > But I'd rather view this as providing support for the Linux kernel > *runtime* and compare it to components like linux-thread-db.c for > user-space threads. GDB has several of those already. They often leave > the internals of accessing the thread library's data structures to a > "thread debug library" outside GDB. We may or may not want to do > something similar for the kernel, depending on the complexity of the > data structures involved and on the likelihood for them to change. Interesting. I wasn't aware of the use of an external library; I will take a look there too. Looking at the gdb-kdump implementation (someone kindly CC'ed me on that discussion), it looks like it is doing something similar, at least for the virt to phys translation (which is something the LKD patches currently do in GDB).
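For readers unfamiliar with what a virt to phys translation involves on the debugger side, here is a minimal sketch of a page table walk over target memory. It is deliberately simplified and hypothetical: a two-level table with 4 KiB pages, 10 index bits per level and a single VALID bit, which is not the ARM descriptor format nor what LKD or gdb-kdump actually implement. PhysMemory stands in for physical memory reads over JTAG or from a dump.

```python
PAGE_SHIFT = 12                       # assumption: 4 KiB pages
LEVEL_BITS = 10                       # assumption: 10 index bits per level
PAGE_MASK = (1 << PAGE_SHIFT) - 1
INDEX_MASK = (1 << LEVEL_BITS) - 1
VALID = 0x1                           # hypothetical "entry present" flag
ENTRY_SIZE = 4                        # assumption: 32-bit table entries


class PhysMemory:
    """Sparse word-addressed physical memory, standing in for the target."""

    def __init__(self):
        self.words = {}

    def read_word(self, paddr):
        return self.words.get(paddr, 0)

    def write_word(self, paddr, value):
        self.words[paddr] = value


def virt_to_phys(mem, table_base, vaddr):
    """Walk a two-level table; return the physical address or None."""
    l1_index = (vaddr >> (PAGE_SHIFT + LEVEL_BITS)) & INDEX_MASK
    l2_index = (vaddr >> PAGE_SHIFT) & INDEX_MASK

    l1_entry = mem.read_word(table_base + ENTRY_SIZE * l1_index)
    if not l1_entry & VALID:
        return None                   # no second-level table for this range
    l2_base = l1_entry & ~PAGE_MASK

    l2_entry = mem.read_word(l2_base + ENTRY_SIZE * l2_index)
    if not l2_entry & VALID:
        return None                   # page not mapped
    return (l2_entry & ~PAGE_MASK) | (vaddr & PAGE_MASK)


# Map one page: virtual 0xC0001000 -> physical 0x80005000.
mem = PhysMemory()
TABLE_BASE, L2_TABLE = 0x1000, 0x2000
mem.write_word(TABLE_BASE + ENTRY_SIZE * (0xC0001000 >> 22), L2_TABLE | VALID)
mem.write_word(L2_TABLE + ENTRY_SIZE * 1, 0x80005000 | VALID)

mapped = virt_to_phys(mem, TABLE_BASE, 0xC0001234)
unmapped = virt_to_phys(mem, TABLE_BASE, 0xC0002000)
print(hex(mapped), unmapped)
```

Every step of the walk is itself a physical memory read, which is why this kind of translation is expensive over a slow debug link and why it matters whether it lives in GDB, in the debug probe software, or in an external library.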
> Maybe the "thread debug library" could actually be a Python script > provided by the kernel developers, as outlined in 1.3 below. Or maybe > it's just easier to put all the required logic into GDB and perform > appropriate version checks where necessary. It is certainly easier from an implementation PoV, as we have access to all the required objects inside GDB in C, which I don't think is currently the case with the Python interface. Also, of course, we have 2 existing out of tree implementations (LKD and gdb-kdump). Having taken a brief look at the gdb-kdump code, that is also parsing the kernel structures in GDB C code. > > > 1.2) OpenOCD / GDB Linux task awareness > > ======================================= > > > > Whilst looking for any prior art in this area I found that OpenOCD already > > implements some basic Linux task awareness. See here: - > > http://openocd.org/doc/doxygen/html/linux_8c_source.html > > > > I've this working to the extent where I can connect via JTAG to a ARMv7 U8500 > > Snowball board and enumerate a thread list in GDB. It required some debugging > > & hacking to get this far. > > > > This implementation passes the thread list to GDB via the gdbremote protocol > > and as such changes required in GDB are minimal. > > > > Advantages > > - OpenOCD already supports many target types (ARM / MIPS), and has support > > for virt to phys translations / cache handling etc. > > > > - OpenOCD also implements task awareness for other RTOS’s (ThreadX / FreeRTOS / eCos) > > > > - Using gdbremote means GDB changes are (so far) minimal. > > > > - OpenOCD / Linux Kernel dependency already exists > > > > > > Disadvantages > > - Creates a dependency between Linux kernel data structures and OpenOCD. > > > > - I believe finding field offsets within structs is currently not possible via > > gdbremote protocol. > > > > Currently OpenOCD generates these offsets at compile time which is ugly and needs > > fixing.
See here http://openocd.org/doc-release/doxygen/linux__header_8h_source.html. > > > > Being able to find field offsets would in my opinion be a useful addition to the > > gdbremote protocol which would allow OpenOCD task awareness to work much better > > at runtime. > > > > - Doesn't support debugging user processes. I think this would still require some > > GDB changes, and also gdbremote protocol changes to get working correctly. > > > > - Needs to be made more generic > > > > > > 1.3) Python Task awareness > > ========================== > > > > Jan Kiszka from Siemens has implemented some basic Linux kernel task awareness using > > the GDB Python interface. > > > > See here https://lwn.net/Articles/631167/ and here > > https://github.com/torvalds/linux/blob/master/Documentation/gdb-kernel-debugging.txt > > > > This support is currently limited to the following commands: - > > > > lx_current -- Return current task > > lx_per_cpu -- Return per-cpu variable > > lx_task_by_pid -- Find Linux task by PID and return the task_struct variable > > lx_thread_info -- Calculate Linux thread_info from task variable > > > > However this could be extended to build up a thread list. The GDB Python interface > > would I believe need to be extended to allow a thread list to be passed back to GDB > > via Python. > > > > Advantages > > - Code parsing Linux kernel structures lives in the kernel source tree > > - Contextual information is easily available > > - Works with all OCD implementations not just OpenOCD > > > > Disadvantages > > - Doesn't exist yet > > - Python / GDB interface would need to be extended to support threads > > > > Questions > > ======== > > > > 1) What do GDB community think about having Linux OS task awareness in the GDB > > codebase? > > For my part, I don't see problem with that. We already have code in GDB > for supporting various runtime environments. This would just be another > one. 
Currently, even if an external thread debug library is involved, > the code that uses it is still specially crafted for a particular > runtime and does not work for another. OK, good to know that a C implementation inside GDB isn't necessarily a no-go and that there are other implementations already merged. This route would enable the greatest amount of reuse of the existing LKD implementation. Does anyone have experience / thoughts on how often the existing threading implementations that are already inside GDB break? Presumably these are also tightly coupled to parsing out of tree data structures (or to the libraries which GDB relies on to do this). I'm trying to get a feel for what the current maintenance burden of having these implementations in C is like. > > > 2) Is there a preferred way to implement this (C extension / gdbremote / Python > > / something else)? > > Since many existing JTAG debuggers as well as Qemu will not offer a > Linux task list, I think such logic belongs on the GDB client side. > It's also on the client side where you have the DWARF info from vmlinux > that enables dealing with varying field offsets in structures. > > Whether the logic should be written in C or as a Python (or Guile) > extension, I don't have a strong opinion. Ok. > > > 3) Any other implementations / advantages / disadvantages that I'm not currently > > aware of? > > > > 2) Loadable module support > > ========================== > > > > 2.1) LKD Loadable module support > > ================================ > > > > lkd-modules.c adds better Linux kernel module symbolic symbol support to GDB using the > > GDB shared libraries infrastructure (solib). > > > > It has hooks to enable the debugging of module __init sections to reflect that these pages > > get discarded post-load. It also implements the same section layout algorithm as > > the kernel to speed up symbol resolution, only inspecting the target’s memory if there is a > > mismatch (inspecting target memory can be slow).
> > > > I think this part could be upstreamed separately to the task awareness support, although > > I’ve not tried separating it yet from the other LKD patches. > > Yes, I think that separate patch sets would be helpful. > > > Advantages > > - Allows full symbolic debugging of Linux loadable modules including init sections > > > > - Would be independent of underlying communication mechanism (JTAG / gdbremote etc) > > > > Disadvantages > > - Some dependency between GDB and Linux kernel is still present > > > > Questions: > > > > Are GDB community happy to have a Linux specific solib functionality in the GDB code base? > > For my part, same as with thread support: GDB already supports various > shared library implementations and "knows" about their internals, so > this would just be another one. Ok good to know. I will take a read of the existing thread support already in GDB. > > > 2.2) Python Loadable module support > > =================================== > > > > I've not managed to look to much at this, but some basic support exists, see here > > https://github.com/torvalds/linux/blob/master/Documentation/gdb-kernel-debugging.txt > > > > Having read the LKD modules implementation I don’t think this can be anywhere near > > as functionally complete.
> > > > > > 3) Linux Helper commands > > ======================== > > > > 3.1) LKD helper commands > > ======================== > > > > LKD implements various Linux helper commands inside GDB such as: - > > - dmesg - dump dmesg log buffer from kernel > > - process_info - prints various info about current process > > - pmap - prints memory map of current process > > - vm_translate - translates virtual to physical address > > - proc_interrupts - prints interrupt statistics > > - proc_iomem - prints I/O mem map > > - proc_cmdline - prints the contents of /proc/cmdline > > - proc_version - prints the contents of /proc/version > > - proc_mounts - print the contents of /proc/mounts > > - proc_meminfo - print the contents of /proc/meminfo. > > > > Advantages > > - Can be used by all GDB based debug solutions > > - Some precedent of OS related commands in the GDB code base > > https://sourceware.org/gdb/current/onlinedocs/gdb/OS-Information.html#OS-Information > > > > Disadvantages > > - Creates a GDB dependency with the kernel > > > > Python helper commands > > ====================== > > > > Jan’s python scripts also implement some of the same LKD commands such as 'dmesg'. > > > > See here https://lwn.net/Articles/631167/ and > > https://github.com/torvalds/linux/blob/master/Documentation/gdb-kernel-debugging.txt > > > > Advantages > > - Code that is parsing Linux kernel structures lives in the kernel source tree > > - Can be used by all GDB based debug solutions > > > > > > Questions > > - Do GDB community mind Linux specific custom commands being added to GDB code base? > > I'm not sure about that one. We also need to consider the capability > for the user to add more such commands, e.g. specific to certain device > drivers or certain debug scenarios. In my view both cases should > ideally be covered with the same underlying mechanism.
> > > My current opinion is that helper commands which can be, should be migrated from C code > > into Python, and merged into the kernel source tree (and then retired from the LKD patchset). > > That's my current view as well. Or maybe we could consider using EPPIC > for those. (See my Cauldron slides.) Ok, I'm glad we're aligned on that. Once again many thanks for your feedback. regards, Peter. ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: RFC GDB Linux Awareness analysis 2015-09-30 13:27 ` Peter Griffin @ 2015-09-30 16:41 ` Duane Ellis 2015-10-05 18:32 ` Doug Evans 2015-10-01 9:25 ` Yao Qi 2015-10-02 10:56 ` Andreas Arnez 2 siblings, 1 reply; 8+ messages in thread From: Duane Ellis @ 2015-09-30 16:41 UTC (permalink / raw) To: Peter Griffin; +Cc: Andreas Arnez, gdb, lee.jones I would offer the following for GDB Kernel Dump analysis - there is *a*lot* more that is needed. 1) It is not uncommon to have a raw RAM dump of the running system captured by some other means. This RAMDUMP can be loaded into some type of CPU or MEMORY simulator. By raw RAMDUMP - assume the target has a 1 gig memory, the raw RAM dump file is exactly 1 gig. And - it may contain the contexts of other non-Linux memory sections. Examples of other non-Linux components include: GPU state, power management processor state, DSP state. Or perhaps a video processing subsystem (i.e.: AMP core 0 = Linux, core 1 = dedicated video). 2) These other things need to be examined “in context”. That context might be a different endian order, a different instruction set, or different structure packing rules. Or perhaps they are encoded logs that need to be converted to readable ASCII text. 3) GDB - generally, as stated in the Cauldron slide deck, is an “application debugger”; it is not a bare metal debugger. I cannot agree with this more - and it is a fundamental limitation for GDB. And it is the source of much of the attitude about what gets done with GDB. Crashdump is exactly a bare metal debug situation. But you can also think about live target debug. Example: Step through user space into the kernel, and perhaps into a hypervisor call, and debug each of these situations within their separate (and different) contexts. 4) In the bare metal world - GDB has a really big (fundamental) problem - GDB thinks an address is an integer. These problems exist during *LIVE* debug of a target, and during postmortem debug analysis of a crash dump. LLDB has the same issue.
   It is not - an address in bare metal consists of 3 components; I would call these “memory access attributes”:
      Component (A) The integer-like index into the memory region
      Component (B) The route or memory region identifier
      Component (C) Attributes specific to the memory region
   Some examples include:
      ARM TrustZone - secure vs. non-secure access (dump logs in TrustZone)
      Dump context of a non-active thread in that thread’s virtual memory configuration
      Some SoCs include alternate access means to memory
         For example, an ARM SoC has the CPU memory aperture (view) of system memory,
         and the ARM SoC may also have the DAP’s view of memory access.
      Access to system registers of some type (i.e.: ARM mrc/mcr/mrs/msr; other CPUs have other forms)

5) In the crash dump case, the memory emulation system needs to be told *where* the active MMU table is.
   The memory simulation needs a means to set the MMU translation table base register(s).
   In the crash dump case, GDB will issue a “read memory request” to examine a data structure.
   The memory simulation needs to perform MMU page table walks.

6) Lots of the above needs to be scripted (i.e.: Python is a great solution here, but is not always present).
   And - these scripts could be provided by the Linux kernel build process.
   Specifically: the kernel build process should produce an architecture/build-specific data file with structure definitions.
   These scripts that I talk about could read the ‘build-specific’ data file.
   (more about this later)

7) A good example of scripting is during postmortem debug.
   GDB cannot call (execute) a helper function within the target because the target is not “live”.
   Thus, many of these things have to be written on the host side.
   Writing this in C is painful… Python offers a better solution and increased flexibility.

8) We are talking here about a command line, ASCII text interface for GDB.
   There is another slew of implications when you add a GUI window.
   For example - how do you specify a memory access for a memory (variable) dump (or watch) window?
   There are other interesting windows - things that display CPU configuration registers
   (i.e.: MMU enable/disable, cache control, the list goes on).
   I’m debugging a kernel - so these things are relevant to the debug session.
   And thus the debugger should provide a means to access these items.

9) The GDB expression parser (i.e.: address parser) needs to support casting to a memory address with attributes.
   For example: I have a “phys_pointer” that I want to cast to a C data structure.
   But - some variables are accessed via “the current [default] memory configuration”.
   But other variables (the one I just cast) need to use a different memory access configuration.

So what is the way out of this:

1) GDB needs to become more “bare metal friendly” or at least “bare metal aware”.

2) GDB fundamentally needs the ability to specify the 3 missing elements to an address.
   This needs to become pervasive throughout GDB.
   This is not a simple change - but a means needs to be created.
   Example: The commercial debugger from Lauterbach uses what they call a “memory class”.
   In that debugger the feature is pervasive - every target SYMBOL can have an access attribute.
   Memory display windows have attributes.
   Items in a dialog box (i.e.: CPU register view window) can have attributes.

3) Some attitudes need to change about where things belong.
   Some argue that feature (X) belongs in the gdb server,
   and others say (Y) belongs in the GUI side of things.
   I agree - architecturally some of these things belong in those other places.
   But these do not address the scripting problem.

   Here’s an example: My target performs image or video processing.
   With GDB - we are talking about 3 separate processes:
   the GUI, GDB itself, and the ‘gdb-remote’ or simulation process.
   I want to point at a physical memory location (the image buffer).
   That memory buffer could be video bit planes, or RGBA data.
   Or maybe it is some software defined radio data stream.
   Accessing this might require starting with a data structure, then an element within that structure (data pointer).
   I can now access that memory - I have the data in a Python array of some type.
   Can I use the graphics library in Python to display my image or waveform?
   Can I use Python graphics to create a task switch timeline graph?
   Here’s an example (fingerprint):
   http://www.analog.com/library/analogdialogue/archives/42-07/fingerprint.html
   This is not limited to BARE METAL - what about applications that manipulate images?
   How can I write *ONE* script that controls all three execution processes:
      The GUI - Eclipse or DDD or something else
      GDB - the core, what we talk about here
      GDB-REMOTE - which might be a JTAG thing, or a SIMULATION thing
   Attitudes that keep these separate make scripting these solutions impossibly hard.

4) GDB does not currently expose enough via the scripting interface.
   As I stated above - attitudes need to change about GDB.
   LLDB suffers from this also - i.e.: It needs to work when Python is not present…. Grrr...
   In order to solve these bare-metal problems somebody needs to write code to make it happen.
   In effect, these would be “bare metal plugin” features.
   You could think of “image processing” or “DSP aware” features as another plugin.
   Python offers that plugin solution :-)

5) The GDB server thing (for JTAG/bare metal) needs to change.
   But that is a discussion for a different day and a different email chain.

-Duane
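
[Editorial sketch] The three-component address from point 4 and the page table walk from point 5 can be illustrated with a small, self-contained Python model. This is only a sketch under stated assumptions, not anything GDB or LKD provides: the names `Space`, `QualifiedAddress` and `RamDump` are hypothetical, the page table is a toy single-level mapping rather than a real ARM translation table, and the `secure` attribute is carried along but not interpreted.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict

PAGE_SIZE = 4096  # assumed page size for this toy model


class Space(Enum):
    """Hypothetical route / memory region identifiers (component B)."""
    CPU_VIRTUAL = "cpu-virtual"    # translated through the page table
    CPU_PHYSICAL = "cpu-physical"  # used as a raw offset into the dump


@dataclass(frozen=True)
class QualifiedAddress:
    """An address that is more than an integer (hypothetical type)."""
    offset: int           # component A: integer-like index
    space: Space          # component B: route or region identifier
    secure: bool = False  # component C: a region-specific attribute (unused here)


class RamDump:
    """Toy 'memory only' emulation over a raw RAM dump; nothing is executed."""

    def __init__(self, image: bytes, page_table: Dict[int, int]):
        self.image = image
        # Toy single-level table: virtual page number -> physical page number.
        self.page_table = page_table

    def translate(self, vaddr: int) -> int:
        """Walk the toy page table for one virtual address."""
        vpn, off = divmod(vaddr, PAGE_SIZE)
        if vpn not in self.page_table:
            raise LookupError(f"no mapping for virtual address {vaddr:#x}")
        return self.page_table[vpn] * PAGE_SIZE + off

    def read(self, addr: QualifiedAddress, length: int) -> bytes:
        """Read bytes, translating page by page for virtual addresses."""
        out = bytearray()
        cur, remaining = addr.offset, length
        while remaining > 0:
            # Never cross a page boundary within a single chunk.
            chunk = min(remaining, PAGE_SIZE - cur % PAGE_SIZE)
            phys = self.translate(cur) if addr.space is Space.CPU_VIRTUAL else cur
            if phys + chunk > len(self.image):
                raise IndexError(f"physical address {phys:#x} is outside the dump")
            out += self.image[phys:phys + chunk]
            cur += chunk
            remaining -= chunk
        return bytes(out)


# Usage: a 3-page dump in which virtual page 0x10 maps to physical page 2.
image = bytearray(3 * PAGE_SIZE)
image[0:4] = b"boot"
image[2 * PAGE_SIZE:2 * PAGE_SIZE + 4] = b"task"
dump = RamDump(bytes(image), page_table={0x10: 2, 0x11: 0})

virt = dump.read(QualifiedAddress(0x10 * PAGE_SIZE, Space.CPU_VIRTUAL), 4)
phys = dump.read(QualifiedAddress(0, Space.CPU_PHYSICAL), 4)
print(virt, phys)
```

The point of the sketch is that the same integer offset can name different bytes depending on the route, which is why a bare integer is not enough to describe a bare metal address.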
* Re: RFC GDB Linux Awareness analysis
From: Doug Evans @ 2015-10-05 18:32 UTC
To: Duane Ellis; +Cc: Peter Griffin, Andreas Arnez, gdb, Lee Jones

On Wed, Sep 30, 2015 at 9:41 AM, Duane Ellis <duane@duaneellis.com> wrote:
> I would offer the following for GDB Kernel Dump analysis - there is *a*lot* more that is needed.
> ...
> 3) GDB - generally as stated in the Cauldron slide deck is an “application debugger” it is not a bare metal debugger
>    I cannot agree with this more - and it is a fundamental limitation for GDB

The extent to which gdb is not a bare metal debugger is IMO mostly driven by patches.
Application level debugging gets more attention.
But if reasonable patches came our way I would certainly approve them.
[The devil is, again, in the details of course.
One of the harder parts of hacking on gdb is that you can't break,
or make harder, any of the myriad other supported use cases,
some of which are so old it's a crime that we're still having to put
time into them (IMO).]

> 4) In the bare metal world - GDB has a really big (fundamental) problem - GDB thinks an address is an integer

This has been an off and on discussion in gdb since at least as far back as Cygnus days. :-)
No disagreement here that we have the wrong kind of abstraction for an address.

> 6) Lots of the above needs to be scripted (i.e.: Python is a great solution here, but is not always present)
>    And - these scripts could be provided by the Linux Kernel build process
>    Specifically: The kernel build process should produce an architecture/build-specific data file with structure definition
>    These scripts that I talk about, could read the ‘build-specific’ data file
>    (more about this later)

I'd like to see a world where we actually open up gdb innards more,
instead of providing an API on top of a closed system.
The scripting possibilities would increase by at least an order of magnitude.

> 7) A good example of scripting is during postmortem debug
>    GDB cannot call (execute) a helper function within the target because the target is not “live”

Depends on what the helper function does of course.
E.g., it's possible to resurrect a corefile (assuming it hasn't been
truncated, etc.) enough to run a pretty-printer contained in the
app (as opposed to in python).

> 4) GDB does not currently expose enough via the scripting interface
>    As I stated above - attitudes need to change about GDB

No disagreement here.
* Re: RFC GDB Linux Awareness analysis
From: Yao Qi @ 2015-10-01 9:25 UTC
To: Peter Griffin; +Cc: Andreas Arnez, gdb, lee.jones

Peter Griffin <peter.griffin@linaro.org> writes:

> Does anyone have experience / thoughts on how often the existing threading
> implementations that are already inside GDB break? Presumably these are also
> tightly coupled to parsing out of tree data structures (or the libraries which
> GDB relies on to do this).

It shouldn't break in any case. There are some conventions in these
thread libraries to make them debuggable. In glibc NPTL,
__nptl_create_event should be called after a thread is created. GDB
relies on this, and sets a breakpoint on it to monitor thread
creation. In the AIX thread library, GDB relies on the symbol name returned
by pthdb_session_pthreaded, and sets a breakpoint on it similarly.

We need the Linux kernel to have such a convention with GDB, that is, a
function that is called after a thread is created, and whose name
doesn't change across different kernel versions.

--
Yao (齐尧)
* Re: RFC GDB Linux Awareness analysis
From: Andreas Arnez @ 2015-10-02 10:56 UTC
To: Peter Griffin; +Cc: gdb, lee.jones

On Wed, Sep 30 2015, Peter Griffin wrote:

> Hi Andreas,
>
> On Thu, 20 Aug 2015, Andreas Arnez wrote:
>
> [...]
>> So I believe posting the patches to get more feedback would be
>> worth-while.
>
> Ok, so the existing LKD code is on quite an old GDB version
> (7.6). Also the whole LKD patchset is currently quite large (15k
> LoC). So my plan was to try and reduce this as much as possible by
> removing parts of LKD that can have (or already have) python
> implementations e.g. dmesg.
>
> Essentially the aim would be to try and reduce the LKD patchset as
> much as possible to the "core features" which I consider to be task
> awareness and loadable module support. At this point we could look at
> what might need adding to the GDB python API to migrate even more
> parts into python.
>
> Posting the whole patchset currently would involve porting to the
> latest GDB version and before doing that I'd like to get a better idea
> of what parts are likely to need re-writing, or at least port only the
> necessary parts.

I suggest starting with a small patch set that can be reviewed easily --
and changed, if necessary. Such as loadable modules support, or
whatever else you deem best suited for laying the foundations for this
project. And then see how it goes.

>[...]
>
> Does anyone have experience / thoughts on how often the existing threading
> implementations that are already inside GDB break? Presumably these are also
> tightly coupled to parsing out of tree data structures (or the libraries which
> GDB relies on to do this).
>
> I'm trying to get a feel for what the current maintenance burden is like having
> these implementations in C.

A user-space runtime usually has some sort of "debug interface" which
gives certain guarantees about reliable breakpoint targets and data
structure layouts. Such a convention allows the GDB support to stay
pretty stable. Ideally the Linux kernel runtime provides a similar
interface. For instance, see the description of do_init_module() in
module.c:

  /*
   * This is where the real work happens.
   *
   * Keep it uninlined to provide a reliable breakpoint target, e.g. for the gdb
   * helper command 'lx-symbols'.
   */

One thing we haven't addressed so far is testing. Usually a new GDB
feature should come with a regression test case, such that it's less
likely to break after unrelated changes. Maybe we can put various
vmlinux- and associated crash dump binaries into the test suite, along
with a test case that loads/analyzes those. However, such binaries can
grow fairly large (many Giga- or even Terabytes), so I'm not sure
whether this is a viable option. Maybe remove everything from the
binaries that is irrelevant to the test case? Other ideas for testing?

--
Andreas
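
[Editorial sketch] One possible reading of "remove everything from the binaries that is irrelevant to the test case" is to record which pages of a dump an analysis actually reads, and keep only those pages in the test suite. The following is a minimal Python illustration under that assumption. The names `RecordingDump`, `trim` and `SparseDump` are hypothetical, the dump is treated as a flat byte image rather than a real kdump/ELF core file, and nothing here reflects how GDB's test suite works today.

```python
from typing import Dict, Set

PAGE_SIZE = 4096  # assumed page granularity for trimming


class RecordingDump:
    """Wraps a full dump image and records which pages an analysis touches."""

    def __init__(self, image: bytes):
        self.image = image
        self.touched: Set[int] = set()

    def read(self, offset: int, length: int) -> bytes:
        if length > 0:
            first = offset // PAGE_SIZE
            last = (offset + length - 1) // PAGE_SIZE
            self.touched.update(range(first, last + 1))
        return self.image[offset:offset + length]


def trim(image: bytes, touched: Set[int]) -> Dict[int, bytes]:
    """Keep only the touched pages, keyed by page number."""
    return {p: image[p * PAGE_SIZE:(p + 1) * PAGE_SIZE] for p in sorted(touched)}


class SparseDump:
    """Serves reads from the trimmed pages; untouched pages are simply absent."""

    def __init__(self, pages: Dict[int, bytes]):
        self.pages = pages

    def read(self, offset: int, length: int) -> bytes:
        out = bytearray()
        while length > 0:
            page, off = divmod(offset, PAGE_SIZE)
            chunk = min(length, PAGE_SIZE - off)
            if page not in self.pages:
                raise LookupError(f"page {page} was trimmed from the dump")
            out += self.pages[page][off:off + chunk]
            offset += chunk
            length -= chunk
        return bytes(out)


# Usage: an 8-page image where the "analysis" reads one structure in page 5.
full = bytearray(8 * PAGE_SIZE)
full[5 * PAGE_SIZE + 100:5 * PAGE_SIZE + 109] = b"init_task"

recorder = RecordingDump(bytes(full))
expected = recorder.read(5 * PAGE_SIZE + 100, 9)   # run the analysis once

pages = trim(recorder.image, recorder.touched)      # keep only what it needed
sparse = SparseDump(pages)
replayed = sparse.read(5 * PAGE_SIZE + 100, 9)      # same analysis, small dump
```

A regression test built this way would only exercise the access pattern that was recorded, so any change to what the analysis reads would require regenerating the trimmed dump.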
* RE: RFC GDB Linux Awareness analysis
From: duane @ 2015-10-05 18:54 UTC
To: Doug Evans; +Cc: Peter Griffin, Andreas Arnez, gdb, Lee Jones

duane> 7) A good example of scripting is during postmortem debug
duane>    GDB cannot call (execute) a helper function within the
duane>    target because the target is not “live”

doug> Depends on what the helper function does of course.
doug> E.g., it's possible to resurrect a corefile (assuming it hasn't been
doug> truncated, etc.) enough to run a pretty-printer contained in the
doug> app (as opposed to in python).

Yes, as you said, "it depends on what it does".

I would say almost categorically you *cannot* use the target pretty
printer.

Bare metal is often a synonym for "cross compiling". That means host !=
target, and loading a complete image into a real target is not always viable,
or possible. (i.e.: You may need to construct MMU page tables, or initialize
memory decoders, or DDR to make memory work.) The advantage is: the target CPU
might be able to execute opcodes :-)

The easier method is loading into a 'memory only' emulation that does not
support execution. Yes, you could use an opcode simulator... (i.e.: the
armulator) but that is not viable for all CPU types.

Using your "pretty print" solution as the example - another problem is
the run time library support. I'll bet it would be very helpful to use
the target's version of "printf()" in some way (maybe sprintf()).

The problem is that the underlying printf() routines in the target C library
in some cases [often!] use malloc/free to manage buffers,
and thus often make use of some OS feature like a semaphore or mutex.

Thus, while calling this special function, interrupts get turned back on
and who knows what else gets changed.

Every target is different, every embedded system is different.

Some of these pretty print problems also occur when live debugging.

This whole area is one giant rat hole of problems to make work
universally.

-Duane.
* Re: RFC GDB Linux Awareness analysis
From: Doug Evans @ 2015-10-05 19:41 UTC
To: duane; +Cc: Peter Griffin, Andreas Arnez, gdb, Lee Jones

On Mon, Oct 5, 2015 at 11:54 AM, <duane@duaneellis.com> wrote:
> duane> 7) A good example of scripting is during postmortem debug
> duane>    GDB cannot call (execute) a helper function within the
> duane>    target because the target is not “live”
>
> doug> Depends on what the helper function does of course.
> doug> E.g., it's possible to resurrect a corefile (assuming it hasn't been
> doug> truncated, etc.) enough to run a pretty-printer contained in the
> doug> app (as opposed to in python).
>
> Yes, as you said "it depends on what it does"

And it depends on the debugging environment.

> I would say almost categorically you *cannot* use the target pretty
> printer.

Whatever. It was just an example.

> Every target is different, every embedded system is different.
>
> Some of these pretty print problems also occur when live debugging.
>
> This whole area is one giant rat hole of problems to make work
> universally.

I think you'll find you're preaching to the choir here.