public inbox for gdb-patches@sourceware.org
* [PATCH] gdb/solib-rocm: Detect SO for unsupported AMDGPU device
@ 2023-08-23 15:57 Lancelot Six
  2023-08-23 16:13 ` Pedro Alves
  0 siblings, 1 reply; 3+ messages in thread
From: Lancelot Six @ 2023-08-23 15:57 UTC
  To: gdb-patches; +Cc: lsix, Lancelot SIX, Pedro Alves

From: Lancelot SIX <lancelot.six@amd.com>

It is possible to debug a process which uses unsupported AMDGPU devices.
In such a scenario, we can still use librocm-dbgapi.so to attach to the
process and complete the runtime activation sequence.

However, when listing shared objects loaded on the AMDGPU devices, we
might list SOs loaded on the unsupported devices.  If such an SO is
seen, one of two things can happen.

First, if the arch of this device is unknown to BFD,
'gdbarch_find_by_info (gdbarch_info info)' will return the gdbarch
matching default_bfd_arch.  As a result,
rocm_solib_relocate_section_addresses will delegate the relocation
operation to svr4_so_ops.relocate_section_addresses, but this makes no
sense: this code object was not loaded by the system loader.

The second case is if BFD knows the micro-architecture of the device,
but dbgapi does not support it.  In such a case, gdbarch_info_fill will
successfully identify an amdgcn architecture (bfd_arch_amdgcn).  From
there, gdbarch_find_by_info calls amdgpu_gdbarch_init, which fails to
query arch-specific details from dbgapi and therefore fails to
initialize the gdbarch object.  As a result, gdbarch_find_by_info
returns nullptr, which later triggers "gdb_assert (gdbarch != nullptr)"
assertion failures.
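
The two failure modes described above can be summarised in a toy model.
This is only a sketch: lookup_outcome and model_gdbarch_lookup are
hypothetical names that do not exist in GDB, and the two booleans stand
in for what BFD and amd-dbgapi would actually report for the device.

```cpp
#include <cassert>

// Hypothetical model of what gdbarch_find_by_info yields for a code
// object, depending on which library knows the device architecture.
enum class lookup_outcome
{
  default_gdbarch,  // BFD does not know the arch: the gdbarch matching
                    // default_bfd_arch is returned.
  null_gdbarch,     // BFD knows it but dbgapi does not: gdbarch
                    // initialization fails and nullptr is returned.
  amdgpu_gdbarch    // Both know it: a usable AMDGPU gdbarch.
};

lookup_outcome
model_gdbarch_lookup (bool bfd_knows, bool dbgapi_knows)
{
  if (!bfd_knows)
    return lookup_outcome::default_gdbarch;
  if (!dbgapi_knows)
    return lookup_outcome::null_gdbarch;
  return lookup_outcome::amdgpu_gdbarch;
}
```

Only the last outcome is safe for the rest of solib-rocm.c to work with.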

This patch adds a check in rocm_solib_bfd_open to ensure that the
architecture associated with the code object to open is fully
supported by both BFD and amd-dbgapi, and errors out otherwise.
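
As a rough illustration of the check, the selection of the error message
can be sketched outside of GDB as follows.  The arch_support struct and
check_arch_support function are hypothetical stand-ins: in the patch
itself the two lookups are done with bfd_lookup_arch and
amd_dbgapi_get_architecture, and failures are reported with error ().

```cpp
#include <cassert>
#include <cstdio>
#include <optional>
#include <string>

// Hypothetical summary of the two lookups: each name is set only if
// the corresponding library knows the architecture.
struct arch_support
{
  std::optional<std::string> bfd_name;
  std::optional<std::string> dbgapi_name;
};

// Return the message the check would report, or std::nullopt when the
// code object is supported by both libraries and can be opened.
std::optional<std::string>
check_arch_support (const std::string &filename, unsigned int gfx_arch,
                    const arch_support &s)
{
  if (s.bfd_name.has_value () && s.dbgapi_name.has_value ())
    return std::nullopt;

  const std::string prefix = "'" + filename + "': AMDGCN architecture ";

  if (!s.bfd_name.has_value () && !s.dbgapi_name.has_value ())
    {
      // Neither library can name the arch, so print the raw value.
      char buf[16];
      std::snprintf (buf, sizeof buf, "%#02x", gfx_arch);
      return prefix + buf + " is not supported.";
    }

  if (!s.dbgapi_name.has_value ())
    return prefix + *s.bfd_name + " not supported by amd-dbgapi.";

  // Only BFD is missing the arch; use the name provided by dbgapi.
  return prefix + *s.dbgapi_name + " not supported.";
}
```

The raw architecture value is only printed when neither library can
provide a human-readable name; otherwise the name comes from whichever
library does know the architecture.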

Change-Id: Ica97ab7cba45e4944b77d3080c54c1038aaeda54
Acked-by: Pedro Alves <pedro@palves.net>
---
 gdb/solib-rocm.c | 50 ++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 50 insertions(+)

diff --git a/gdb/solib-rocm.c b/gdb/solib-rocm.c
index 882920a3711..56c210e9fa5 100644
--- a/gdb/solib-rocm.c
+++ b/gdb/solib-rocm.c
@@ -663,6 +663,56 @@ rocm_solib_bfd_open (const char *pathname)
     error (_("`%s': ELF file HSA OS ABI version is not supported (%d)."),
 	   bfd_get_filename (abfd.get ()), osabiversion);
 
+  /* For GDB to be able to use this solib, the exact AMDGPU processor type
+     must be supported by both BFD and the amd-dbgapi library.  */
+  const unsigned char gfx_arch
+    = elf_elfheader (abfd)->e_flags & EF_AMDGPU_MACH;
+  const bfd_arch_info_type *bfd_arch_info
+    = bfd_lookup_arch (bfd_arch_amdgcn, gfx_arch);
+
+  amd_dbgapi_architecture_id_t architecture_id;
+  amd_dbgapi_status_t dbgapi_query_arch
+    = amd_dbgapi_get_architecture (gfx_arch, &architecture_id);
+
+  if (dbgapi_query_arch != AMD_DBGAPI_STATUS_SUCCESS
+      || bfd_arch_info == nullptr)
+    {
+      if (dbgapi_query_arch != AMD_DBGAPI_STATUS_SUCCESS
+	  && bfd_arch_info == nullptr)
+	{
+	  /* Neither of the libraries knows about this arch, so we cannot
+	     provide a human readable name for it.  */
+	  error (_("'%s': AMDGCN architecture %#02x is not supported."),
+		 bfd_get_filename (abfd.get ()), gfx_arch);
+	}
+      else if (dbgapi_query_arch != AMD_DBGAPI_STATUS_SUCCESS)
+	{
+	  gdb_assert (bfd_arch_info != nullptr);
+	  error (_("'%s': AMDGCN architecture %s not supported by "
+		   "amd-dbgapi."),
+		 bfd_get_filename (abfd.get ()),
+		 bfd_arch_info->printable_name);
+	}
+      else
+	{
+	  gdb_assert (dbgapi_query_arch == AMD_DBGAPI_STATUS_SUCCESS);
+	  char *arch_name;
+	  if (amd_dbgapi_architecture_get_info
+	      (architecture_id, AMD_DBGAPI_ARCHITECTURE_INFO_NAME,
+	       sizeof (arch_name), &arch_name) != AMD_DBGAPI_STATUS_SUCCESS)
+	    error ("amd_dbgapi_architecture_get_info call failed for arch "
+		   "%#02x.", gfx_arch);
+	  gdb::unique_xmalloc_ptr<char> arch_name_cleaner (arch_name);
+
+	  error (_("'%s': AMDGCN architecture %s not supported."),
+		 bfd_get_filename (abfd.get ()),
+		 arch_name);
+	}
+    }
+
+  gdb_assert (gdbarch_from_bfd (abfd.get ()) != nullptr);
+  gdb_assert (is_amdgpu_arch (gdbarch_from_bfd (abfd.get ())));
+
   return abfd;
 }
 

base-commit: c99853f48cd9132c5a745ad7452d1b0d856f32b8
-- 
2.34.1



* Re: [PATCH] gdb/solib-rocm: Detect SO for unsupported AMDGPU device
  2023-08-23 15:57 [PATCH] gdb/solib-rocm: Detect SO for unsupported AMDGPU device Lancelot Six
@ 2023-08-23 16:13 ` Pedro Alves
  2023-08-24 19:36   ` Six, Lancelot
  0 siblings, 1 reply; 3+ messages in thread
From: Pedro Alves @ 2023-08-23 16:13 UTC
  To: Lancelot Six, gdb-patches; +Cc: lsix

On 23/08/23 16:57, Lancelot Six wrote:

> Acked-by: Pedro Alves <pedro@palves.net>

Sorry, I had meant:

  Approved-By: Pedro Alves <pedro@palves.net>

Feel free to swap with that and merge it.



* RE: [PATCH] gdb/solib-rocm: Detect SO for unsupported AMDGPU device
  2023-08-23 16:13 ` Pedro Alves
@ 2023-08-24 19:36   ` Six, Lancelot
  0 siblings, 0 replies; 3+ messages in thread
From: Six, Lancelot @ 2023-08-24 19:36 UTC
  To: Pedro Alves, gdb-patches; +Cc: lsix

>
>   Approved-By: Pedro Alves <pedro@palves.net>
>
> Feel free to swap with that and merge it.

Thanks.

I have just pushed this patch.

Best,
Lancelot.

