public inbox for elfutils@sourceware.org
 help / color / mirror / Atom feed
From: Mark Wielaard <mark@klomp.org>
To: Omar Sandoval <osandov@osandov.com>
Cc: elfutils-devel@sourceware.org
Subject: Re: [PATCH v2 2/4] libdw: Try .dwp file in __libdw_find_split_unit()
Date: Fri, 16 Feb 2024 00:17:06 +0100	[thread overview]
Message-ID: <20240215231706.GA10474@gnu.wildebeest.org> (raw)
In-Reply-To: <0e3b1e4de076d669f4b261f0a410cbf770837f4f.1701854319.git.osandov@fb.com>

Hi Omar,

On Wed, Dec 06, 2023 at 01:22:17AM -0800, Omar Sandoval wrote:
> Try opening the file in the location suggested by the standard (the
> skeleton file name + ".dwp") and looking up the unit in the package
> index.  The rest is similar to .dwo files, with slightly different
> cleanup since a single Dwarf handle is shared.

This is indeed an heuristic for finding the dwp file that should work.

I do wonder if this is how distros will do it for separate .debug
files. I can imagine combining separate debuginfo files with dwz
and/or placing the .dwo and index sections into the actual .debug file
itself. But we can play with that later.

The code seems to integrate really nicely with single file split .dwo.

> 
> 	* libdw/libdw_find_split_unit.c (try_dwp_file): New function.
> 	(__libdw_find_split_unit): Call try_dwp_file.
> 	* libdw/libdwP.h (Dwarf): Add dwp_dwarf and dwp_fd.
> 	(__libdw_dwp_findcu_id): New declaration.
> 	(__libdw_link_skel_split): Handle .debug_addr for dwp.
> 	* libdw/libdw_begin_elf.c (dwarf_begin_elf): Initialize
> 	result->dwp_fd.
> 	* libdw/dwarf_end.c (dwarf_end): Free dwarf->dwp_dwarf and close
> 	dwarf->dwp_fd.
> 	(cu_free): Don't free split dbg if it is dwp_dwarf.
> 
> Signed-off-by: Omar Sandoval <osandov@fb.com>
> ---
>  libdw/dwarf_begin_elf.c           |  1 +
>  libdw/dwarf_cu_dwp_section_info.c | 19 ++++++++
>  libdw/dwarf_end.c                 | 10 ++++-
>  libdw/libdwP.h                    | 23 ++++++++--
>  libdw/libdw_find_split_unit.c     | 75 ++++++++++++++++++++++++++++---
>  5 files changed, 119 insertions(+), 9 deletions(-)
> 
> diff --git a/libdw/dwarf_begin_elf.c b/libdw/dwarf_begin_elf.c
> index 323a91d0..ca2b7e2a 100644
> --- a/libdw/dwarf_begin_elf.c
> +++ b/libdw/dwarf_begin_elf.c
> @@ -567,6 +567,7 @@ dwarf_begin_elf (Elf *elf, Dwarf_Cmd cmd, Elf_Scn *scngrp)
>  
>    result->elf = elf;
>    result->alt_fd = -1;
> +  result->dwp_fd = -1;
>  
>    /* Initialize the memory handling.  Initial blocks are allocated on first
>       actual allocation.  */

OK.

> diff --git a/libdw/dwarf_cu_dwp_section_info.c b/libdw/dwarf_cu_dwp_section_info.c
> index 4a4eac8c..298f36f9 100644
> --- a/libdw/dwarf_cu_dwp_section_info.c
> +++ b/libdw/dwarf_cu_dwp_section_info.c
> @@ -340,6 +340,25 @@ __libdw_dwp_find_unit (Dwarf *dbg, bool debug_types, Dwarf_Off off,
>  				   abbrev_offsetp, NULL);
>  }
>  
> +Dwarf_CU *
> +internal_function
> +__libdw_dwp_findcu_id (Dwarf *dbg, uint64_t unit_id8)
> +{
> +  Dwarf_Package_Index *index = __libdw_package_index (dbg, false);
> +  uint32_t unit_row;
> +  Dwarf_Off offset;
> +  Dwarf_CU *cu;
> +  if (__libdw_dwp_unit_row (index, unit_id8, &unit_row) == 0
> +      && __libdw_dwp_section_info (index, unit_row, DW_SECT_INFO, &offset,
> +				   NULL) == 0
> +      && (cu = __libdw_findcu (dbg, offset, false)) != NULL
> +      && cu->unit_type == DW_UT_split_compile
> +      && cu->unit_id8 == unit_id8)
> +    return cu;
> +  else
> +    return NULL;
> +}
> +

Called from try_dwp_file to find the to split compile unit through the
index. When found try_dwp_file will put this cu into the split_tree.
OK.

>  int
>  dwarf_cu_dwp_section_info (Dwarf_CU *cu, unsigned int section,
>  			   Dwarf_Off *offsetp, Dwarf_Off *sizep)
> diff --git a/libdw/dwarf_end.c b/libdw/dwarf_end.c
> index b7f817d9..78224ddb 100644
> --- a/libdw/dwarf_end.c
> +++ b/libdw/dwarf_end.c
> @@ -66,7 +66,9 @@ cu_free (void *arg)
>  	  /* The fake_addr_cu might be shared, only release one.  */
>  	  if (p->dbg->fake_addr_cu == p->split->dbg->fake_addr_cu)
>  	    p->split->dbg->fake_addr_cu = NULL;
> -	  INTUSE(dwarf_end) (p->split->dbg);
> +	  /* There is only one DWP file. We free it later.  */
> +	  if (p->split->dbg != p->dbg->dwp_dwarf)
> +	    INTUSE(dwarf_end) (p->split->dbg);
>  	}
>      }
>  }
> @@ -147,6 +149,12 @@ dwarf_end (Dwarf *dwarf)
>  	  close (dwarf->alt_fd);
>  	}
>  
> +      if (dwarf->dwp_fd != -1)
> +	{
> +	  INTUSE(dwarf_end) (dwarf->dwp_dwarf);
> +	  close (dwarf->dwp_fd);
> +	}
> +
>        /* The cached path and dir we found the Dwarf ELF file in.  */
>        free (dwarf->elfpath);
>        free (dwarf->debugdir);

OK.

> diff --git a/libdw/libdwP.h b/libdw/libdwP.h
> index 7f8d69b5..54445886 100644
> --- a/libdw/libdwP.h
> +++ b/libdw/libdwP.h
> @@ -180,6 +180,9 @@ struct Dwarf
>    /* dwz alternate DWARF file.  */
>    Dwarf *alt_dwarf;
>  
> +  /* DWARF package file.  */
> +  Dwarf *dwp_dwarf;
> +
>    /* The section data.  */
>    Elf_Data *sectiondata[IDX_last];
>  
> @@ -197,6 +200,9 @@ struct Dwarf
>       close this file descriptor.  */
>    int alt_fd;
>  
> +  /* File descriptor of DWARF package file.  */
> +  int dwp_fd;
> +
>    /* Information for traversing the .debug_pubnames section.  This is
>       an array and separately allocated with malloc.  */
>    struct pubnames_s

OK.

> @@ -716,6 +722,10 @@ extern int __libdw_dwp_find_unit (Dwarf *dbg, bool debug_types, Dwarf_Off off,
>  				  Dwarf_Off *abbrev_offsetp)
>       __nonnull_attribute__ (1, 7, 8) internal_function;
>  
> +/* Find the compilation unit in a DWARF package file with the given id.  */
> +extern Dwarf_CU *__libdw_dwp_findcu_id (Dwarf *dbg, uint64_t unit_id8)
> +     __nonnull_attribute__ (1) internal_function;
> +
>  /* Get abbreviation with given code.  */
>  extern Dwarf_Abbrev *__libdw_findabbrev (struct Dwarf_CU *cu,
>  					 unsigned int code)

OK.

> @@ -1367,12 +1377,19 @@ __libdw_link_skel_split (Dwarf_CU *skel, Dwarf_CU *split)
>  
>    /* Get .debug_addr and addr_base greedy.
>       We also need it for the fake addr cu.
> -     There is only one per split debug.  */
> +     This needs to be done for each split unit (one per .dwo file, or multiple
> +     per .dwp file).  */
>    Dwarf *dbg = skel->dbg;
>    Dwarf *sdbg = split->dbg;
> -  if (sdbg->sectiondata[IDX_debug_addr] == NULL
> -      && dbg->sectiondata[IDX_debug_addr] != NULL)
> +  if (dbg->sectiondata[IDX_debug_addr] != NULL
> +      /* If this split file hasn't been linked yet...  */
> +      && (sdbg->sectiondata[IDX_debug_addr] == NULL
> +	  /* ... or it was linked to the same skeleton file for another
> +	     unit...  */
> +	  || (sdbg->sectiondata[IDX_debug_addr]
> +	      == dbg->sectiondata[IDX_debug_addr])))
>      {
> +      /* ... then link the address information for this file and unit.  */
>        sdbg->sectiondata[IDX_debug_addr]
>  	= dbg->sectiondata[IDX_debug_addr];
>        split->addr_base = __libdw_cu_addr_base (skel);

OK. This can now happen when the offset into the .debug_addr.dwo is
different.

> diff --git a/libdw/libdw_find_split_unit.c b/libdw/libdw_find_split_unit.c
> index 533f8072..e1e78951 100644
> --- a/libdw/libdw_find_split_unit.c
> +++ b/libdw/libdw_find_split_unit.c
> @@ -85,6 +85,67 @@ try_split_file (Dwarf_CU *cu, const char *dwo_path)
>      }
>  }
>  
> +static void
> +try_dwp_file (Dwarf_CU *cu)
> +{
> +  if (cu->dbg->dwp_dwarf == NULL)
> +    {
> +      if (cu->dbg->elfpath != NULL)
> +	{
> +	  /* The DWARF 5 standard says "the package file is typically placed in
> +	     the same directory as the application, and is given the same name
> +	     with a '.dwp' extension".  */
> +	  size_t elfpath_len = strlen (cu->dbg->elfpath);
> +	  char *dwp_path = malloc (elfpath_len + 5);
> +	  if (dwp_path == NULL)
> +	    {
> +	      __libdw_seterrno (DWARF_E_NOMEM);
> +	      return;
> +	    }
> +	  memcpy (dwp_path, cu->dbg->elfpath, elfpath_len);
> +	  strcpy (dwp_path + elfpath_len, ".dwp");
> +	  int dwp_fd = open (dwp_path, O_RDONLY);
> +	  free (dwp_path);
> +	  if (dwp_fd != -1)
> +	    {
> +	      Dwarf *dwp_dwarf = dwarf_begin (dwp_fd, DWARF_C_READ);
> +	      /* There's no way to know whether we got the correct file until
> +		 we look up the unit, but it should at least be a dwp file.  */
> +	      if (dwp_dwarf != NULL
> +		  && (dwp_dwarf->sectiondata[IDX_debug_cu_index] != NULL
> +		      || dwp_dwarf->sectiondata[IDX_debug_tu_index] != NULL))
> +		{
> +		  cu->dbg->dwp_dwarf = dwp_dwarf;
> +		  cu->dbg->dwp_fd = dwp_fd;
> +		}
> +	      else
> +		close (dwp_fd);
> +	    }
> +	}
> +      if (cu->dbg->dwp_dwarf == NULL)
> +	cu->dbg->dwp_dwarf = (Dwarf *) -1;
> +    }

OK, makes sure this is only tried once for each Dwarf. So if the .dwp
file isn't found for one skeleton CU, it will not be tried for any
other.

> +
> +  if (cu->dbg->dwp_dwarf != (Dwarf *) -1)
> +    {
> +      Dwarf_CU *split = __libdw_dwp_findcu_id (cu->dbg->dwp_dwarf,
> +					       cu->unit_id8);
> +      if (split != NULL)
> +	{
> +	  if (tsearch (split->dbg, &cu->dbg->split_tree,
> +		       __libdw_finddbg_cb) == NULL)
> +	    {
> +	      /* Something went wrong.  Don't link.  */
> +	      __libdw_seterrno (DWARF_E_NOMEM);
> +	      return;
> +	    }
> +
> +	  /* Link skeleton and split compile units.  */
> +	  __libdw_link_skel_split (cu, split);
> +	}
> +    }
> +}

OK, works as if it was a separate .dwo.

> +
>  Dwarf_CU *
>  internal_function
>  __libdw_find_split_unit (Dwarf_CU *cu)
> @@ -98,14 +159,18 @@ __libdw_find_split_unit (Dwarf_CU *cu)
>       same id as the skeleton.  */
>    if (cu->unit_type == DW_UT_skeleton)
>      {
> +      /* First, try the dwp file.  */
> +      try_dwp_file (cu);
> +
>        Dwarf_Die cudie = CUDIE (cu);
>        Dwarf_Attribute dwo_name;
> -      /* It is fine if dwo_dir doesn't exists, but then dwo_name needs
> -	 to be an absolute path.  */
> -      if (dwarf_attr (&cudie, DW_AT_dwo_name, &dwo_name) != NULL
> -	  || dwarf_attr (&cudie, DW_AT_GNU_dwo_name, &dwo_name) != NULL)
> +      /* Try a dwo file.  It is fine if dwo_dir doesn't exist, but then
> +	 dwo_name needs to be an absolute path.  */
> +      if (cu->split == (Dwarf_CU *) -1
> +	  && (dwarf_attr (&cudie, DW_AT_dwo_name, &dwo_name) != NULL
> +	      || dwarf_attr (&cudie, DW_AT_GNU_dwo_name, &dwo_name) != NULL))
>  	{
> -	  /* First try the dwo file name in the same directory
> +	  /* Try the dwo file name in the same directory
>  	     as we found the skeleton file.  */
>  	  const char *dwo_file = dwarf_formstring (&dwo_name);
>  	  const char *debugdir = cu->dbg->debugdir;

Nicely integrated.

Pushed,

Mark

  reply	other threads:[~2024-02-15 23:17 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-12-06  9:22 [PATCH v2 0/4] elfutils: DWARF package (.dwp) file support Omar Sandoval
2023-12-06  9:22 ` [PATCH v2 1/4] libdw: Parse DWARF package file index sections Omar Sandoval
2024-02-15 17:40   ` Mark Wielaard
2023-12-06  9:22 ` [PATCH v2 2/4] libdw: Try .dwp file in __libdw_find_split_unit() Omar Sandoval
2024-02-15 23:17   ` Mark Wielaard [this message]
2023-12-06  9:22 ` [PATCH v2 3/4] libdw: Apply DWARF package file section offsets where appropriate Omar Sandoval
2024-02-16 15:00   ` Mark Wielaard
2024-02-23  0:53     ` Omar Sandoval
2024-02-23  1:03       ` Omar Sandoval
2024-02-24 14:00         ` Mark Wielaard
2024-02-26 19:31           ` Omar Sandoval
2024-02-24 13:30       ` Mark Wielaard
2023-12-06  9:22 ` [PATCH v2 4/4] libdw: Handle overflowed DW_SECT_INFO offsets in DWARF package file indexes Omar Sandoval
2024-01-04  0:53 ` [PATCH v2 0/4] elfutils: DWARF package (.dwp) file support Omar Sandoval

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20240215231706.GA10474@gnu.wildebeest.org \
    --to=mark@klomp.org \
    --cc=elfutils-devel@sourceware.org \
    --cc=osandov@osandov.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).