* GDB with PCIe device @ 2020-12-26 6:48 Rajinikanth Pandurangan 2021-01-08 15:17 ` Simon Marchi 0 siblings, 1 reply; 6+ messages in thread From: Rajinikanth Pandurangan @ 2020-12-26 6:48 UTC (permalink / raw) To: gdb Hello, As per my understanding, gdb makes ptrace system calls, which in turn use the kernel's architecture-specific implementation (updating debug registers, reading context memory, ...) to set breakpoints, and so on. But when running gdb against PCIe devices such as a GPU or FPGA, how are the hardware-specific actions performed? Should device drivers provide a ptrace-equivalent kernel implementation? Could any of the gdb gurus shed some light on the debug software stack used when debugging software that runs on one of the mentioned PCIe devices? Thanks in advance, ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: GDB with PCIe device 2020-12-26 6:48 GDB with PCIe device Rajinikanth Pandurangan @ 2021-01-08 15:17 ` Simon Marchi 2021-01-11 9:31 ` Aktemur, Tankut Baris 0 siblings, 1 reply; 6+ messages in thread From: Simon Marchi @ 2021-01-08 15:17 UTC (permalink / raw) To: Rajinikanth Pandurangan, gdb On 2020-12-26 1:48 a.m., Rajinikanth Pandurangan via Gdb wrote: > Hello, > > As per my understanding, gdb calls ptrace system calls which intern uses > kernel implementation of architecture specific action (updating debug > registers,reading context memory...) to set breakpoints, and so on. > > But in case of running gdb with PCIe devices such as gpu or fpga, how does > the hardware specific actions are being done? > > Should device drivers provide ptrace equivalent kernel implementation? > > Could any of the gdb gurus shed some light on debug software stacks in > debugging software that runs on one of the mentioned pcie devices? > > Thanks in advance, > One such gdb port that is in development is ROCm-GDB, by AMD: https://github.com/ROCm-Developer-Tools/ROCgdb It uses a helper library to debug the GPU threads: https://github.com/ROCm-Developer-Tools/ROCdbgapi I don't want to get too much into how this library works, because I'm sure I'll say something wrong / misleading. You can look at the code. But I'm pretty sure the GPU isn't debugged through ptrace. The library communicates with the kernel driver somehow, however. So, the GPU devices can use whatever debug interface, as long as a corresponding target exists in GDB to communicate with it. Today, one GDB can communicate with multiple debugging targets, but only with one target per inferior. So you can be debugging a local program while debugging another remote program. In the GPU / coprocessor programming world, the model is often that you run a program on the host, which spawns some threads on the GPU / coprocessor. 
From the point of view of the user, the threads on the host and the threads on the GPU / coprocessor belong to the same program, so would ideally appear in the same inferior. ROCm-GDB does this, but it's still done in a slightly hackish way, where the target that talks to the GPU is installed in the "arch" stratum (this is GDB internal stuff) of the inferior's target stack and hijacks the calls to the native Linux target. The better long term / general solution is probably to make GDB able to connect to multiple debug targets for a single inferior. Simon ^ permalink raw reply [flat|nested] 6+ messages in thread
* RE: GDB with PCIe device 2021-01-08 15:17 ` Simon Marchi @ 2021-01-11 9:31 ` Aktemur, Tankut Baris 2021-01-21 8:08 ` Rajinikanth Pandurangan 0 siblings, 1 reply; 6+ messages in thread From: Aktemur, Tankut Baris @ 2021-01-11 9:31 UTC (permalink / raw) To: Simon Marchi, Rajinikanth Pandurangan Cc: gdb, Metzger, Markus T, Saiapova, Natalia, Strasuns, Mihails On Friday, January 8, 2021 4:18 PM, Simon Marchi wrote: > On 2020-12-26 1:48 a.m., Rajinikanth Pandurangan via Gdb wrote: > > Hello, > > > > As per my understanding, gdb calls ptrace system calls which intern uses > > kernel implementation of architecture specific action (updating debug > > registers,reading context memory...) to set breakpoints, and so on. > > > > But in case of running gdb with PCIe devices such as gpu or fpga, how does > > the hardware specific actions are being done? > > > > Should device drivers provide ptrace equivalent kernel implementation? > > > > Could any of the gdb gurus shed some light on debug software stacks in > > debugging software that runs on one of the mentioned pcie devices? > > > > Thanks in advance, > > > > One such gdb port that is in development is ROCm-GDB, by AMD: > > https://github.com/ROCm-Developer-Tools/ROCgdb > > It uses a helper library to debug the GPU threads: > > https://github.com/ROCm-Developer-Tools/ROCdbgapi > > I don't want to get too much into how this library works, because I'm > sure I'll say something wrong / misleading. You can look at the code. > But I'm pretty sure the GPU isn't debugged through ptrace. > The library communicates with the kernel driver somehow, however. > > So, the GPU devices can use whatever debug interface, as long as a > corresponding target exist in GDB to communicate with it. > > Today, one GDB can communicate with multiple debugging target, but only > with one target per inferior. So you can be debugging a local program > while debugging another remote program. We (Intel) use this approach. 
The host program that runs on the CPU is represented as an inferior with the native target, and the kernel that runs on the GPU is represented as another inferior with a remote target. The remote target is connected to an instance of gdbserver that uses a GPU-specific debug interface, which is not ptrace. A high-level presentation is available at https://dl.acm.org/doi/abs/10.1145/3388333.3388646 in case you want more information. Regards -Baris > > In the GPU / coprocessor programming world, the model is often that you > run a program on the host, which spawns some threads on the GPU / > coprocessor. From the point of view of the user, the threads on the host > and the threads on the GPU / coprocessor belong to the same program, so > would ideally appear in the same inferior. ROCm-GDB does this, but it's > still done in a slightly hackish way, where the target that talks to the > GPU is installed in the "arch" stratum (this is GDB internal stuff) of > the inferior's target stack and hijacks the calls to the native Linux > target. > > The better long term / general solution is probably to make GDB able to > connect to multiple debug targets for a single inferior. > > Simon Intel Deutschland GmbH Registered Address: Am Campeon 10-12, 85579 Neubiberg, Germany Tel: +49 89 99 8853-0, www.intel.de Managing Directors: Christin Eisenschmid, Gary Kershaw Chairperson of the Supervisory Board: Nicole Lau Registered Office: Munich Commercial Register: Amtsgericht Muenchen HRB 186928 ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: GDB with PCIe device 2021-01-11 9:31 ` Aktemur, Tankut Baris @ 2021-01-21 8:08 ` Rajinikanth Pandurangan 2021-01-27 16:00 ` Metzger, Markus T 0 siblings, 1 reply; 6+ messages in thread From: Rajinikanth Pandurangan @ 2021-01-21 8:08 UTC (permalink / raw) To: Aktemur, Tankut Baris Cc: Simon Marchi, gdb, Metzger, Markus T, Saiapova, Natalia, Strasuns, Mihails Thanks Simon and Aktemur for the details and pointers. What is the role of gdbserver here? I thought it's needed only when the gdb client and target machines are connected via serial/Ethernet. Do we need gdbserver when we debug GPU kernels that are just over PCIe? Thanks in advance! On Mon, Jan 11, 2021 at 1:31 AM Aktemur, Tankut Baris < tankut.baris.aktemur@intel.com> wrote: > On Friday, January 8, 2021 4:18 PM, Simon Marchi wrote: > > On 2020-12-26 1:48 a.m., Rajinikanth Pandurangan via Gdb wrote: > > > Hello, > > > > > > As per my understanding, gdb calls ptrace system calls which intern > uses > > > kernel implementation of architecture specific action (updating debug > > > registers,reading context memory...) to set breakpoints, and so on. > > > > > > But in case of running gdb with PCIe devices such as gpu or fpga, how > does > > > the hardware specific actions are being done? > > > > > > Should device drivers provide ptrace equivalent kernel implementation? > > > > > > Could any of the gdb gurus shed some light on debug software stacks in > > > debugging software that runs on one of the mentioned pcie devices? > > > > > > Thanks in advance, > > > > > > > One such gdb port that is in development is ROCm-GDB, by AMD: > > > > https://github.com/ROCm-Developer-Tools/ROCgdb > > > > It uses a helper library to debug the GPU threads: > > > > https://github.com/ROCm-Developer-Tools/ROCdbgapi > > > > I don't want to get too much into how this library works, because I'm > > sure I'll say something wrong / misleading. You can look at the code. > > But I'm pretty sure the GPU isn't debugged through ptrace. 
> > The library communicates with the kernel driver somehow, however. > > > > So, the GPU devices can use whatever debug interface, as long as a > > corresponding target exist in GDB to communicate with it. > > > > Today, one GDB can communicate with multiple debugging target, but only > > with one target per inferior. So you can be debugging a local program > > while debugging another remote program. > > We (Intel) use this approach. The host program that runs on the CPU is > represented as an inferior with the native target, and the kernel that > runs on the GPU is represented as another inferior with a remote target. > The remote target is connected to an instance of gdbserver that uses a > GPU-specific debug interface, which is not ptrace. > > A high-level presentation is available at > https://dl.acm.org/doi/abs/10.1145/3388333.3388646 > in case you want more information. > > Regards > -Baris > > > > > In the GPU / coprocessor programming world, the model is often that you > > run a program on the host, which spawns some threads on the GPU / > > coprocessor. From the point of view of the user, the threads on the host > > and the threads on the GPU / coprocessor belong to the same program, so > > would ideally appear in the same inferior. ROCm-GDB does this, but it's > > still done in a slightly hackish way, where the target that talks to the > > GPU is installed in the "arch" stratum (this is GDB internal stuff) of > > the inferior's target stack and hijacks the calls to the native Linux > > target. > > > > The better long term / general solution is probably to make GDB able to > > connect to multiple debug targets for a single inferior. 
> > > > Simon > > > > Intel Deutschland GmbH > Registered Address: Am Campeon 10-12, 85579 Neubiberg, Germany > Tel: +49 89 99 8853-0, www.intel.de > Managing Directors: Christin Eisenschmid, Gary Kershaw > Chairperson of the Supervisory Board: Nicole Lau > Registered Office: Munich > Commercial Register: Amtsgericht Muenchen HRB 186928 > ^ permalink raw reply [flat|nested] 6+ messages in thread
* RE: GDB with PCIe device 2021-01-21 8:08 ` Rajinikanth Pandurangan @ 2021-01-27 16:00 ` Metzger, Markus T 2021-02-01 9:27 ` Aktemur, Tankut Baris 0 siblings, 1 reply; 6+ messages in thread From: Metzger, Markus T @ 2021-01-27 16:00 UTC (permalink / raw) To: Rajinikanth Pandurangan Cc: Simon Marchi, gdb, Saiapova, Natalia, Strasuns, Mihails, Aktemur, Tankut Baris Hello Pandurangan, We have separate target stacks for the CPU and the GPU. To add that second target stack, we needed to add another connection. Regards, Markus. From: Rajinikanth Pandurangan <rajinikanth.p@gmail.com> Sent: Donnerstag, 21. Januar 2021 09:08 To: Aktemur, Tankut Baris <tankut.baris.aktemur@intel.com> Cc: Simon Marchi <simon.marchi@polymtl.ca>; gdb@sourceware.org; Metzger, Markus T <markus.t.metzger@intel.com>; Saiapova, Natalia <natalia.saiapova@intel.com>; Strasuns, Mihails <mihails.strasuns@intel.com> Subject: Re: GDB with PCIe device Thanks Simon and Aktemur for the details and pointers. What is the role of gdbserver here? I thought its needed only when gdb client and target machines are connected via serial/ethernet. Do we need gdbserver when we debug GPU kernels that are just over pcie? Thanks in advance! On Mon, Jan 11, 2021 at 1:31 AM Aktemur, Tankut Baris <tankut.baris.aktemur@intel.com<mailto:tankut.baris.aktemur@intel.com>> wrote: On Friday, January 8, 2021 4:18 PM, Simon Marchi wrote: > On 2020-12-26 1:48 a.m., Rajinikanth Pandurangan via Gdb wrote: > > Hello, > > > > As per my understanding, gdb calls ptrace system calls which intern uses > > kernel implementation of architecture specific action (updating debug > > registers,reading context memory...) to set breakpoints, and so on. > > > > But in case of running gdb with PCIe devices such as gpu or fpga, how does > > the hardware specific actions are being done? > > > > Should device drivers provide ptrace equivalent kernel implementation? 
> > > > Could any of the gdb gurus shed some light on debug software stacks in > > debugging software that runs on one of the mentioned pcie devices? > > > > Thanks in advance, > > > > One such gdb port that is in development is ROCm-GDB, by AMD: > > https://github.com/ROCm-Developer-Tools/ROCgdb > > It uses a helper library to debug the GPU threads: > > https://github.com/ROCm-Developer-Tools/ROCdbgapi > > I don't want to get too much into how this library works, because I'm > sure I'll say something wrong / misleading. You can look at the code. > But I'm pretty sure the GPU isn't debugged through ptrace. > The library communicates with the kernel driver somehow, however. > > So, the GPU devices can use whatever debug interface, as long as a > corresponding target exist in GDB to communicate with it. > > Today, one GDB can communicate with multiple debugging target, but only > with one target per inferior. So you can be debugging a local program > while debugging another remote program. We (Intel) use this approach. The host program that runs on the CPU is represented as an inferior with the native target, and the kernel that runs on the GPU is represented as another inferior with a remote target. The remote target is connected to an instance of gdbserver that uses a GPU-specific debug interface, which is not ptrace. A high-level presentation is available at https://dl.acm.org/doi/abs/10.1145/3388333.3388646 in case you want more information. Regards -Baris > > In the GPU / coprocessor programming world, the model is often that you > run a program on the host, which spawns some threads on the GPU / > coprocessor. From the point of view of the user, the threads on the host > and the threads on the GPU / coprocessor belong to the same program, so > would ideally appear in the same inferior. 
ROCm-GDB does this, but it's > still done in a slightly hackish way, where the target that talks to the > GPU is installed in the "arch" stratum (this is GDB internal stuff) of > the inferior's target stack and hijacks the calls to the native Linux > target. > > The better long term / general solution is probably to make GDB able to > connect to multiple debug targets for a single inferior. > > Simon Intel Deutschland GmbH Registered Address: Am Campeon 10-12, 85579 Neubiberg, Germany Tel: +49 89 99 8853-0, www.intel.de<http://www.intel.de> Managing Directors: Christin Eisenschmid, Gary Kershaw Chairperson of the Supervisory Board: Nicole Lau Registered Office: Munich Commercial Register: Amtsgericht Muenchen HRB 186928 Intel Deutschland GmbH Registered Address: Am Campeon 10-12, 85579 Neubiberg, Germany Tel: +49 89 99 8853-0, www.intel.de Managing Directors: Christin Eisenschmid, Gary Kershaw Chairperson of the Supervisory Board: Nicole Lau Registered Office: Munich Commercial Register: Amtsgericht Muenchen HRB 186928 ^ permalink raw reply [flat|nested] 6+ messages in thread
* RE: GDB with PCIe device 2021-01-27 16:00 ` Metzger, Markus T @ 2021-02-01 9:27 ` Aktemur, Tankut Baris 0 siblings, 0 replies; 6+ messages in thread From: Aktemur, Tankut Baris @ 2021-02-01 9:27 UTC (permalink / raw) To: Metzger, Markus T, Rajinikanth Pandurangan Cc: Simon Marchi, gdb, Saiapova, Natalia, Strasuns, Mihails > What is the role of gdbserver here? I thought its needed only when gdb client and target machines are connected via serial/ethernet. Do we need gdbserver when we debug GPU kernels that are just over pcie? Hi, A gdbserver is not required. One may also define a GPU-aware native target. In that case, the inferior for the host computation (i.e. CPU) would be sitting on top of a Linux native target as usual, whereas the inferior for the kernel (i.e. GPU) would have the GPU-aware native target underneath instead of the remote target. Note that both scenarios rely on the multi-target feature of GDB, if debugging both the host computation and the kernel is desired. We preferred the remote target approach because it also gives the option of debugging a kernel running on a remote machine. Regards -Baris From: Metzger, Markus T <markus.t.metzger@intel.com> Sent: Wednesday, January 27, 2021 5:01 PM Hello Pandurangan, We have separate target stacks for the CPU and the GPU. To add that second target stack, we needed to add another connection. Regards, Markus. From: Rajinikanth Pandurangan <mailto:rajinikanth.p@gmail.com> Sent: Donnerstag, 21. Januar 2021 09:08 To: Aktemur, Tankut Baris <mailto:tankut.baris.aktemur@intel.com> Cc: Simon Marchi <mailto:simon.marchi@polymtl.ca>; mailto:gdb@sourceware.org; Metzger, Markus T <mailto:markus.t.metzger@intel.com>; Saiapova, Natalia <mailto:natalia.saiapova@intel.com>; Strasuns, Mihails <mailto:mihails.strasuns@intel.com> Subject: Re: GDB with PCIe device Thanks Simon and Aktemur for the details and pointers. What is the role of gdbserver here? 
I thought its needed only when gdb client and target machines are connected via serial/ethernet. Do we need gdbserver when we debug GPU kernels that are just over pcie? Thanks in advance! On Mon, Jan 11, 2021 at 1:31 AM Aktemur, Tankut Baris <mailto:tankut.baris.aktemur@intel.com> wrote: On Friday, January 8, 2021 4:18 PM, Simon Marchi wrote: > On 2020-12-26 1:48 a.m., Rajinikanth Pandurangan via Gdb wrote: > > Hello, > > > > As per my understanding, gdb calls ptrace system calls which intern uses > > kernel implementation of architecture specific action (updating debug > > registers,reading context memory...) to set breakpoints, and so on. > > > > But in case of running gdb with PCIe devices such as gpu or fpga, how does > > the hardware specific actions are being done? > > > > Should device drivers provide ptrace equivalent kernel implementation? > > > > Could any of the gdb gurus shed some light on debug software stacks in > > debugging software that runs on one of the mentioned pcie devices? > > > > Thanks in advance, > > > > One such gdb port that is in development is ROCm-GDB, by AMD: > > https://github.com/ROCm-Developer-Tools/ROCgdb > > It uses a helper library to debug the GPU threads: > > https://github.com/ROCm-Developer-Tools/ROCdbgapi > > I don't want to get too much into how this library works, because I'm > sure I'll say something wrong / misleading. You can look at the code. > But I'm pretty sure the GPU isn't debugged through ptrace. > The library communicates with the kernel driver somehow, however. > > So, the GPU devices can use whatever debug interface, as long as a > corresponding target exist in GDB to communicate with it. > > Today, one GDB can communicate with multiple debugging target, but only > with one target per inferior. So you can be debugging a local program > while debugging another remote program. We (Intel) use this approach. 
The host program that runs on the CPU is represented as an inferior with the native target, and the kernel that runs on the GPU is represented as another inferior with a remote target. The remote target is connected to an instance of gdbserver that uses a GPU-specific debug interface, which is not ptrace. A high-level presentation is available at https://dl.acm.org/doi/abs/10.1145/3388333.3388646 in case you want more information. Regards -Baris > > In the GPU / coprocessor programming world, the model is often that you > run a program on the host, which spawns some threads on the GPU / > coprocessor. From the point of view of the user, the threads on the host > and the threads on the GPU / coprocessor belong to the same program, so > would ideally appear in the same inferior. ROCm-GDB does this, but it's > still done in a slightly hackish way, where the target that talks to the > GPU is installed in the "arch" stratum (this is GDB internal stuff) of > the inferior's target stack and hijacks the calls to the native Linux > target. > > The better long term / general solution is probably to make GDB able to > connect to multiple debug targets for a single inferior. > > Simon Intel Deutschland GmbH Registered Address: Am Campeon 10-12, 85579 Neubiberg, Germany Tel: +49 89 99 8853-0, www.intel.de Managing Directors: Christin Eisenschmid, Gary Kershaw Chairperson of the Supervisory Board: Nicole Lau Registered Office: Munich Commercial Register: Amtsgericht Muenchen HRB 186928 ^ permalink raw reply [flat|nested] 6+ messages in thread
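The multi-target setup Baris and Markus describe can be reproduced from the GDB command line roughly as follows. This is a sketch: the program name, port number, and the GPU-aware gdbserver are hypothetical, and the exact launch commands depend on the vendor's tooling (GDB 10 or later is needed for multi-target support).

```
$ gdb ./host_app
(gdb) start                          # inferior 1: native Linux target (ptrace)
(gdb) add-inferior -no-connection    # inferior 2: no target yet
(gdb) inferior 2
(gdb) target remote :1234            # connect to the GPU-aware gdbserver
(gdb) info inferiors                 # one native, one remote connection
(gdb) info connections               # shows both targets side by side
```

With both inferiors set up, a single GDB session can step host threads and device threads, switching between them with the usual inferior/thread commands.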
end of thread, other threads:[~2021-02-01 9:27 UTC | newest] Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2020-12-26 6:48 GDB with PCIe device Rajinikanth Pandurangan 2021-01-08 15:17 ` Simon Marchi 2021-01-11 9:31 ` Aktemur, Tankut Baris 2021-01-21 8:08 ` Rajinikanth Pandurangan 2021-01-27 16:00 ` Metzger, Markus T 2021-02-01 9:27 ` Aktemur, Tankut Baris