* GDB with PCIe device @ 2020-12-26 6:48 Rajinikanth Pandurangan 2021-01-08 15:17 ` Simon Marchi 0 siblings, 1 reply; 6+ messages in thread From: Rajinikanth Pandurangan @ 2020-12-26 6:48 UTC (permalink / raw) To: gdb Hello, As per my understanding, gdb makes ptrace system calls, which in turn use the kernel's architecture-specific implementation (updating debug registers, reading context memory, ...) to set breakpoints, and so on. But when running gdb against PCIe devices such as a GPU or FPGA, how are the hardware-specific actions performed? Should device drivers provide a ptrace-equivalent kernel implementation? Could any of the gdb gurus shed some light on the debug software stack used when debugging software that runs on one of the mentioned PCIe devices? Thanks in advance, ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: GDB with PCIe device 2020-12-26 6:48 GDB with PCIe device Rajinikanth Pandurangan @ 2021-01-08 15:17 ` Simon Marchi 2021-01-11 9:31 ` Aktemur, Tankut Baris 0 siblings, 1 reply; 6+ messages in thread From: Simon Marchi @ 2021-01-08 15:17 UTC (permalink / raw) To: Rajinikanth Pandurangan, gdb On 2020-12-26 1:48 a.m., Rajinikanth Pandurangan via Gdb wrote: > Hello, > > As per my understanding, gdb calls ptrace system calls which intern uses > kernel implementation of architecture specific action (updating debug > registers,reading context memory...) to set breakpoints, and so on. > > But in case of running gdb with PCIe devices such as gpu or fpga, how does > the hardware specific actions are being done? > > Should device drivers provide ptrace equivalent kernel implementation? > > Could any of the gdb gurus shed some light on debug software stacks in > debugging software that runs on one of the mentioned pcie devices? > > Thanks in advance, > One such gdb port that is in development is ROCm-GDB, by AMD: https://github.com/ROCm-Developer-Tools/ROCgdb It uses a helper library to debug the GPU threads: https://github.com/ROCm-Developer-Tools/ROCdbgapi I don't want to get too much into how this library works, because I'm sure I'll say something wrong / misleading. You can look at the code. But I'm pretty sure the GPU isn't debugged through ptrace. The library communicates with the kernel driver somehow, however. So, the GPU devices can use whatever debug interface, as long as a corresponding target exists in GDB to communicate with it. Today, one GDB can communicate with multiple debugging targets, but only with one target per inferior. So you can be debugging a local program while debugging another remote program. In the GPU / coprocessor programming world, the model is often that you run a program on the host, which spawns some threads on the GPU / coprocessor. 
From the point of view of the user, the threads on the host and the threads on the GPU / coprocessor belong to the same program, so would ideally appear in the same inferior. ROCm-GDB does this, but it's still done in a slightly hackish way, where the target that talks to the GPU is installed in the "arch" stratum (this is GDB internal stuff) of the inferior's target stack and hijacks the calls to the native Linux target. The better long term / general solution is probably to make GDB able to connect to multiple debug targets for a single inferior. Simon ^ permalink raw reply [flat|nested] 6+ messages in thread
* RE: GDB with PCIe device 2021-01-08 15:17 ` Simon Marchi @ 2021-01-11 9:31 ` Aktemur, Tankut Baris 2021-01-21 8:08 ` Rajinikanth Pandurangan 0 siblings, 1 reply; 6+ messages in thread From: Aktemur, Tankut Baris @ 2021-01-11 9:31 UTC (permalink / raw) To: Simon Marchi, Rajinikanth Pandurangan Cc: gdb, Metzger, Markus T, Saiapova, Natalia, Strasuns, Mihails On Friday, January 8, 2021 4:18 PM, Simon Marchi wrote: > On 2020-12-26 1:48 a.m., Rajinikanth Pandurangan via Gdb wrote: > > Hello, > > > > As per my understanding, gdb calls ptrace system calls which intern uses > > kernel implementation of architecture specific action (updating debug > > registers,reading context memory...) to set breakpoints, and so on. > > > > But in case of running gdb with PCIe devices such as gpu or fpga, how does > > the hardware specific actions are being done? > > > > Should device drivers provide ptrace equivalent kernel implementation? > > > > Could any of the gdb gurus shed some light on debug software stacks in > > debugging software that runs on one of the mentioned pcie devices? > > > > Thanks in advance, > > > > One such gdb port that is in development is ROCm-GDB, by AMD: > > https://github.com/ROCm-Developer-Tools/ROCgdb > > It uses a helper library to debug the GPU threads: > > https://github.com/ROCm-Developer-Tools/ROCdbgapi > > I don't want to get too much into how this library works, because I'm > sure I'll say something wrong / misleading. You can look at the code. > But I'm pretty sure the GPU isn't debugged through ptrace. > The library communicates with the kernel driver somehow, however. > > So, the GPU devices can use whatever debug interface, as long as a > corresponding target exist in GDB to communicate with it. > > Today, one GDB can communicate with multiple debugging target, but only > with one target per inferior. So you can be debugging a local program > while debugging another remote program. We (Intel) use this approach. 
The host program that runs on the CPU is represented as an inferior with the native target, and the kernel that runs on the GPU is represented as another inferior with a remote target. The remote target is connected to an instance of gdbserver that uses a GPU-specific debug interface, which is not ptrace. A high-level presentation is available at https://dl.acm.org/doi/abs/10.1145/3388333.3388646 in case you want more information. Regards -Baris > > In the GPU / coprocessor programming world, the model is often that you > run a program on the host, which spawns some threads on the GPU / > coprocessor. From the point of view of the user, the threads on the host > and the threads on the GPU / coprocessor belong to the same program, so > would ideally appear in the same inferior. ROCm-GDB does this, but it's > still done in a slightly hackish way, where the target that talks to the > GPU is installed in the "arch" stratum (this is GDB internal stuff) of > the inferior's target stack and hijacks the calls to the native Linux > target. > > The better long term / general solution is probably to make GDB able to > connect to multiple debug targets for a single inferior. > > Simon Intel Deutschland GmbH Registered Address: Am Campeon 10-12, 85579 Neubiberg, Germany Tel: +49 89 99 8853-0, www.intel.de Managing Directors: Christin Eisenschmid, Gary Kershaw Chairperson of the Supervisory Board: Nicole Lau Registered Office: Munich Commercial Register: Amtsgericht Muenchen HRB 186928 ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: GDB with PCIe device 2021-01-11 9:31 ` Aktemur, Tankut Baris @ 2021-01-21 8:08 ` Rajinikanth Pandurangan 2021-01-27 16:00 ` Metzger, Markus T 0 siblings, 1 reply; 6+ messages in thread From: Rajinikanth Pandurangan @ 2021-01-21 8:08 UTC (permalink / raw) To: Aktemur, Tankut Baris Cc: Simon Marchi, gdb, Metzger, Markus T, Saiapova, Natalia, Strasuns, Mihails Thanks Simon and Aktemur for the details and pointers. What is the role of gdbserver here? I thought it's needed only when the gdb client and target machines are connected via serial/Ethernet. Do we need gdbserver when we debug GPU kernels that are just over PCIe? Thanks in advance! On Mon, Jan 11, 2021 at 1:31 AM Aktemur, Tankut Baris < tankut.baris.aktemur@intel.com> wrote: > On Friday, January 8, 2021 4:18 PM, Simon Marchi wrote: > > On 2020-12-26 1:48 a.m., Rajinikanth Pandurangan via Gdb wrote: > > > Hello, > > > > > > As per my understanding, gdb calls ptrace system calls which intern > uses > > > kernel implementation of architecture specific action (updating debug > > > registers,reading context memory...) to set breakpoints, and so on. > > > > > > But in case of running gdb with PCIe devices such as gpu or fpga, how > does > > > the hardware specific actions are being done? > > > > > > Should device drivers provide ptrace equivalent kernel implementation? > > > > > > Could any of the gdb gurus shed some light on debug software stacks in > > > debugging software that runs on one of the mentioned pcie devices? > > > > > > Thanks in advance, > > > > > > > One such gdb port that is in development is ROCm-GDB, by AMD: > > > > https://github.com/ROCm-Developer-Tools/ROCgdb > > > > It uses a helper library to debug the GPU threads: > > > > https://github.com/ROCm-Developer-Tools/ROCdbgapi > > > > I don't want to get too much into how this library works, because I'm > > sure I'll say something wrong / misleading. You can look at the code. > > But I'm pretty sure the GPU isn't debugged through ptrace. 
> > The library communicates with the kernel driver somehow, however. > > > > So, the GPU devices can use whatever debug interface, as long as a > > corresponding target exist in GDB to communicate with it. > > > > Today, one GDB can communicate with multiple debugging target, but only > > with one target per inferior. So you can be debugging a local program > > while debugging another remote program. > > We (Intel) use this approach. The host program that runs on the CPU is > represented as an inferior with the native target, and the kernel that > runs on the GPU is represented as another inferior with a remote target. > The remote target is connected to an instance of gdbserver that uses a > GPU-specific debug interface, which is not ptrace. > > A high-level presentation is available at > https://dl.acm.org/doi/abs/10.1145/3388333.3388646 > in case you want more information. > > Regards > -Baris > > > > > In the GPU / coprocessor programming world, the model is often that you > > run a program on the host, which spawns some threads on the GPU / > > coprocessor. From the point of view of the user, the threads on the host > > and the threads on the GPU / coprocessor belong to the same program, so > > would ideally appear in the same inferior. ROCm-GDB does this, but it's > > still done in a slightly hackish way, where the target that talks to the > > GPU is installed in the "arch" stratum (this is GDB internal stuff) of > > the inferior's target stack and hijacks the calls to the native Linux > > target. > > > > The better long term / general solution is probably to make GDB able to > > connect to multiple debug targets for a single inferior. 
> > > > Simon > > > > Intel Deutschland GmbH > Registered Address: Am Campeon 10-12, 85579 Neubiberg, Germany > Tel: +49 89 99 8853-0, www.intel.de > Managing Directors: Christin Eisenschmid, Gary Kershaw > Chairperson of the Supervisory Board: Nicole Lau > Registered Office: Munich > Commercial Register: Amtsgericht Muenchen HRB 186928 > ^ permalink raw reply [flat|nested] 6+ messages in thread
* RE: GDB with PCIe device 2021-01-21 8:08 ` Rajinikanth Pandurangan @ 2021-01-27 16:00 ` Metzger, Markus T 2021-02-01 9:27 ` Aktemur, Tankut Baris 0 siblings, 1 reply; 6+ messages in thread From: Metzger, Markus T @ 2021-01-27 16:00 UTC (permalink / raw) To: Rajinikanth Pandurangan Cc: Simon Marchi, gdb, Saiapova, Natalia, Strasuns, Mihails, Aktemur, Tankut Baris Hello Pandurangan, We have separate target stacks for the CPU and the GPU. To add that second target stack, we needed to add another connection. Regards, Markus. From: Rajinikanth Pandurangan <rajinikanth.p@gmail.com> Sent: Donnerstag, 21. Januar 2021 09:08 To: Aktemur, Tankut Baris <tankut.baris.aktemur@intel.com> Cc: Simon Marchi <simon.marchi@polymtl.ca>; gdb@sourceware.org; Metzger, Markus T <markus.t.metzger@intel.com>; Saiapova, Natalia <natalia.saiapova@intel.com>; Strasuns, Mihails <mihails.strasuns@intel.com> Subject: Re: GDB with PCIe device Thanks Simon and Aktemur for the details and pointers. What is the role of gdbserver here? I thought its needed only when gdb client and target machines are connected via serial/ethernet. Do we need gdbserver when we debug GPU kernels that are just over pcie? Thanks in advance! On Mon, Jan 11, 2021 at 1:31 AM Aktemur, Tankut Baris <tankut.baris.aktemur@intel.com<mailto:tankut.baris.aktemur@intel.com>> wrote: On Friday, January 8, 2021 4:18 PM, Simon Marchi wrote: > On 2020-12-26 1:48 a.m., Rajinikanth Pandurangan via Gdb wrote: > > Hello, > > > > As per my understanding, gdb calls ptrace system calls which intern uses > > kernel implementation of architecture specific action (updating debug > > registers,reading context memory...) to set breakpoints, and so on. > > > > But in case of running gdb with PCIe devices such as gpu or fpga, how does > > the hardware specific actions are being done? > > > > Should device drivers provide ptrace equivalent kernel implementation? 
> > > > Could any of the gdb gurus shed some light on debug software stacks in > > debugging software that runs on one of the mentioned pcie devices? > > > > Thanks in advance, > > > > One such gdb port that is in development is ROCm-GDB, by AMD: > > https://github.com/ROCm-Developer-Tools/ROCgdb > > It uses a helper library to debug the GPU threads: > > https://github.com/ROCm-Developer-Tools/ROCdbgapi > > I don't want to get too much into how this library works, because I'm > sure I'll say something wrong / misleading. You can look at the code. > But I'm pretty sure the GPU isn't debugged through ptrace. > The library communicates with the kernel driver somehow, however. > > So, the GPU devices can use whatever debug interface, as long as a > corresponding target exist in GDB to communicate with it. > > Today, one GDB can communicate with multiple debugging target, but only > with one target per inferior. So you can be debugging a local program > while debugging another remote program. We (Intel) use this approach. The host program that runs on the CPU is represented as an inferior with the native target, and the kernel that runs on the GPU is represented as another inferior with a remote target. The remote target is connected to an instance of gdbserver that uses a GPU-specific debug interface, which is not ptrace. A high-level presentation is available at https://dl.acm.org/doi/abs/10.1145/3388333.3388646 in case you want more information. Regards -Baris > > In the GPU / coprocessor programming world, the model is often that you > run a program on the host, which spawns some threads on the GPU / > coprocessor. From the point of view of the user, the threads on the host > and the threads on the GPU / coprocessor belong to the same program, so > would ideally appear in the same inferior. 
ROCm-GDB does this, but it's > still done in a slightly hackish way, where the target that talks to the > GPU is installed in the "arch" stratum (this is GDB internal stuff) of > the inferior's target stack and hijacks the calls to the native Linux > target. > > The better long term / general solution is probably to make GDB able to > connect to multiple debug targets for a single inferior. > > Simon Intel Deutschland GmbH Registered Address: Am Campeon 10-12, 85579 Neubiberg, Germany Tel: +49 89 99 8853-0, www.intel.de<http://www.intel.de> Managing Directors: Christin Eisenschmid, Gary Kershaw Chairperson of the Supervisory Board: Nicole Lau Registered Office: Munich Commercial Register: Amtsgericht Muenchen HRB 186928 Intel Deutschland GmbH Registered Address: Am Campeon 10-12, 85579 Neubiberg, Germany Tel: +49 89 99 8853-0, www.intel.de Managing Directors: Christin Eisenschmid, Gary Kershaw Chairperson of the Supervisory Board: Nicole Lau Registered Office: Munich Commercial Register: Amtsgericht Muenchen HRB 186928 ^ permalink raw reply [flat|nested] 6+ messages in thread
* RE: GDB with PCIe device 2021-01-27 16:00 ` Metzger, Markus T @ 2021-02-01 9:27 ` Aktemur, Tankut Baris 0 siblings, 0 replies; 6+ messages in thread From: Aktemur, Tankut Baris @ 2021-02-01 9:27 UTC (permalink / raw) To: Metzger, Markus T, Rajinikanth Pandurangan Cc: Simon Marchi, gdb, Saiapova, Natalia, Strasuns, Mihails > What is the role of gdbserver here? I thought its needed only when gdb client and target machines are connected via serial/ethernet. Do we need gdbserver when we debug GPU kernels that are just over pcie? Hi, A gdbserver is not required. One may also define a GPU-aware native target. In that case, the inferior for the host computation (i.e. CPU) would be sitting on top of a Linux native target as usual, whereas the inferior for the kernel (i.e. GPU) would have the GPU-aware native target underneath instead of the remote target. Note that both scenarios rely on the multi-target feature of GDB, if debugging both the host computation and the kernel is desired. We preferred the remote target approach because it also gives the option of debugging a kernel running on a remote machine. Regards -Baris From: Metzger, Markus T <markus.t.metzger@intel.com> Sent: Wednesday, January 27, 2021 5:01 PM Hello Pandurangan, We have separate target stacks for the CPU and the GPU. To add that second target stack, we needed to add another connection. Regards, Markus. From: Rajinikanth Pandurangan <mailto:rajinikanth.p@gmail.com> Sent: Donnerstag, 21. Januar 2021 09:08 To: Aktemur, Tankut Baris <mailto:tankut.baris.aktemur@intel.com> Cc: Simon Marchi <mailto:simon.marchi@polymtl.ca>; mailto:gdb@sourceware.org; Metzger, Markus T <mailto:markus.t.metzger@intel.com>; Saiapova, Natalia <mailto:natalia.saiapova@intel.com>; Strasuns, Mihails <mailto:mihails.strasuns@intel.com> Subject: Re: GDB with PCIe device Thanks Simon and Aktemur for the details and pointers. What is the role of gdbserver here? 
I thought its needed only when gdb client and target machines are connected via serial/ethernet. Do we need gdbserver when we debug GPU kernels that are just over pcie? Thanks in advance! On Mon, Jan 11, 2021 at 1:31 AM Aktemur, Tankut Baris <mailto:tankut.baris.aktemur@intel.com> wrote: On Friday, January 8, 2021 4:18 PM, Simon Marchi wrote: > On 2020-12-26 1:48 a.m., Rajinikanth Pandurangan via Gdb wrote: > > Hello, > > > > As per my understanding, gdb calls ptrace system calls which intern uses > > kernel implementation of architecture specific action (updating debug > > registers,reading context memory...) to set breakpoints, and so on. > > > > But in case of running gdb with PCIe devices such as gpu or fpga, how does > > the hardware specific actions are being done? > > > > Should device drivers provide ptrace equivalent kernel implementation? > > > > Could any of the gdb gurus shed some light on debug software stacks in > > debugging software that runs on one of the mentioned pcie devices? > > > > Thanks in advance, > > > > One such gdb port that is in development is ROCm-GDB, by AMD: > > https://github.com/ROCm-Developer-Tools/ROCgdb > > It uses a helper library to debug the GPU threads: > > https://github.com/ROCm-Developer-Tools/ROCdbgapi > > I don't want to get too much into how this library works, because I'm > sure I'll say something wrong / misleading. You can look at the code. > But I'm pretty sure the GPU isn't debugged through ptrace. > The library communicates with the kernel driver somehow, however. > > So, the GPU devices can use whatever debug interface, as long as a > corresponding target exist in GDB to communicate with it. > > Today, one GDB can communicate with multiple debugging target, but only > with one target per inferior. So you can be debugging a local program > while debugging another remote program. We (Intel) use this approach. 
The host program that runs on the CPU is represented as an inferior with the native target, and the kernel that runs on the GPU is represented as another inferior with a remote target. The remote target is connected to an instance of gdbserver that uses a GPU-specific debug interface, which is not ptrace. A high-level presentation is available at https://dl.acm.org/doi/abs/10.1145/3388333.3388646 in case you want more information. Regards -Baris > > In the GPU / coprocessor programming world, the model is often that you > run a program on the host, which spawns some threads on the GPU / > coprocessor. From the point of view of the user, the threads on the host > and the threads on the GPU / coprocessor belong to the same program, so > would ideally appear in the same inferior. ROCm-GDB does this, but it's > still done in a slightly hackish way, where the target that talks to the > GPU is installed in the "arch" stratum (this is GDB internal stuff) of > the inferior's target stack and hijacks the calls to the native Linux > target. > > The better long term / general solution is probably to make GDB able to > connect to multiple debug targets for a single inferior. > > Simon Intel Deutschland GmbH Registered Address: Am Campeon 10-12, 85579 Neubiberg, Germany Tel: +49 89 99 8853-0, www.intel.de Managing Directors: Christin Eisenschmid, Gary Kershaw Chairperson of the Supervisory Board: Nicole Lau Registered Office: Munich Commercial Register: Amtsgericht Muenchen HRB 186928 ^ permalink raw reply [flat|nested] 6+ messages in thread
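The multi-target setup Baris and Markus describe can be reproduced from the GDB command line roughly as follows. This is a sketch: the program name, port number, and the GPU-aware gdbserver are hypothetical, and the exact launch commands depend on the vendor's tooling (GDB 10 or later is needed for multi-target support).

```
$ gdb ./host_app
(gdb) start                          # inferior 1: native Linux target (ptrace)
(gdb) add-inferior -no-connection    # inferior 2: no target yet
(gdb) inferior 2
(gdb) target remote :1234            # connect to the GPU-aware gdbserver
(gdb) info inferiors                 # one native, one remote connection
(gdb) info connections               # shows both targets side by side
```

With both inferiors set up, a single GDB session can step host threads and device threads, switching between them with the usual inferior/thread commands.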
end of thread, other threads:[~2021-02-01 9:27 UTC | newest] Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2020-12-26 6:48 GDB with PCIe device Rajinikanth Pandurangan 2021-01-08 15:17 ` Simon Marchi 2021-01-11 9:31 ` Aktemur, Tankut Baris 2021-01-21 8:08 ` Rajinikanth Pandurangan 2021-01-27 16:00 ` Metzger, Markus T 2021-02-01 9:27 ` Aktemur, Tankut Baris