From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <SRS0=rv5E=GO=klomp.org=mark@sourceware.org>
Received: from gnu.wildebeest.org (gnu.wildebeest.org [45.83.234.184])
	by sourceware.org (Postfix) with ESMTPS id 64BAB3858D32
	for <elfutils-devel@sourceware.org>; Wed,  1 Nov 2023 14:04:03 +0000 (GMT)
DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 64BAB3858D32
Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=klomp.org
Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=klomp.org
ARC-Filter: OpenARC Filter v1.0.0 sourceware.org 64BAB3858D32
Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=45.83.234.184
ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1698847453; cv=none;
	b=rVAzko5eXNEvlmAydUCClnQuT+d7oJn3Ew1E0o8dF04u0ir0iWL0ySkavyirWSZ1BdpdyXYRHrfbRyiMswjayZY2mn5U5CiqB9Al//OwBHZBJ5EJ2a9u2+2oC0LWSxO/WmrJbWtqh7TaFywLy8CfPtx7C7pLzYHoM0owr77smAk=
ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key;
	t=1698847453; c=relaxed/simple;
	bh=oMVfxOks/8YFsp+nM96DEI3vK6Uuuk3GDK9j4R0VcoU=;
	h=Message-ID:Subject:From:To:Date:MIME-Version; b=IjJlVtGGrm+1tOjYw8XkMM3MWwy1kFsejJScgoKdbTBFb0/IbyEQj56k3nCVxGMqYtlYjX7HLHshk2kQCQ+9bNvJDxcwrzkVgfMibg9kTSGx4nr+kQD6GDw7ZvM0mXfWNuMm1FLbon1v8ZUOkcEY1Jw8Uf4Aix3QUtfT09XJgpQ=
ARC-Authentication-Results: i=1; server2.sourceware.org
Received: from r6.localdomain (82-217-174-174.cable.dynamic.v4.ziggo.nl [82.217.174.174])
	(using TLSv1.2 with cipher ADH-AES256-GCM-SHA384 (256/256 bits))
	(No client certificate requested)
	by gnu.wildebeest.org (Postfix) with ESMTPSA id E4C5D302FDC3;
	Wed,  1 Nov 2023 15:03:57 +0100 (CET)
Received: by r6.localdomain (Postfix, from userid 1000)
	id A56223401F4; Wed,  1 Nov 2023 15:03:57 +0100 (CET)
Message-ID: <051e05da49fc84a0100f1865f0fca1d501cbeb49.camel@klomp.org>
Subject: Re: [PATCH 07/14] libdw: Recognize .debug_[ct]u_index sections in
 dwarf_elf_begin
From: Mark Wielaard <mark@klomp.org>
To: Omar Sandoval <osandov@osandov.com>, elfutils-devel@sourceware.org
Date: Wed, 01 Nov 2023 15:03:57 +0100
In-Reply-To: <5837a252931d4a85079d29d08f8b6890ae070700.1695837512.git.osandov@fb.com>
References: <cover.1695837512.git.osandov@fb.com>
	 <5837a252931d4a85079d29d08f8b6890ae070700.1695837512.git.osandov@fb.com>
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
User-Agent: Evolution 3.48.4 (3.48.4-1.fc38) 
MIME-Version: 1.0
X-Spam-Status: No, score=-3033.3 required=5.0 tests=BAYES_00,GIT_PATCH_0,JMQ_SPF_NEUTRAL,KAM_DMARC_STATUS,RCVD_IN_BARRACUDACENTRAL,SPF_HELO_NONE,SPF_PASS,TXREP,T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6
X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org
List-Id: <elfutils-devel.sourceware.org>

Hi Omar,

On Wed, 2023-09-27 at 11:20 -0700, Omar Sandoval wrote:
> From: Omar Sandoval <osandov@fb.com>
>=20
> DWARF package (.dwp) files have a .debug_cu_index section and,
> optionally, a .debug_tu_index section.  Add them to the list of DWARF
> sections.
>=20
> Unfortunately, it's not that simple: the other debug sections in a dwp
> file have names ending with .dwo, which confuses the checks introduced
> by commit 5b21e70216b8 ("libdw: dwarf_elf_begin should use either plain,
> dwo or lto DWARF sections.").  So, we also have to special case
> .debug_cu_index and .debug_tu_index in scn_dwarf_type and check_section
> to treat them as TYPE_DWO sections.

This seems to work, but I wonder if we should have a specific TYPE_DWP?
It doesn't really matter now, because the enum dwarf_type is only used
internally. But I was hoping to extend the dwarf_begin interface with a
flag so that you can open a DWARF as a specific type. For example there
are single file split DWARF files. Which contain both "plain" and
".dwo" sections. Currently you can only open them as "plain", but there
are actually two "views" of such files.

https://sourceware.org/bugzilla/show_bug.cgi?id=3D28573

Do you think there are reasons to open a file as either TYPE_DWO or
TYPE_DWP? Or doesn't that not make sense?

> Signed-off-by: Omar Sandoval <osandov@fb.com>
> ---
>  libdw/ChangeLog         |  8 +++++++
>  libdw/dwarf_begin_elf.c | 53 +++++++++++++++++++++--------------------
>  libdw/libdwP.h          |  2 ++
>  3 files changed, 37 insertions(+), 26 deletions(-)
>=20
> diff --git a/libdw/ChangeLog b/libdw/ChangeLog
> index be1e40bc..52327688 100644
> --- a/libdw/ChangeLog
> +++ b/libdw/ChangeLog
> @@ -20,6 +20,14 @@
>  	instead of dbg parameter, which is now unused.
>  	* libdwP.h (Dwarf_Macro_Op_Table): Replace is_64bit with address_size
>  	and offset_size.  Add dbg.
> +	Add IDX_debug_cu_index and IDX_debug_tu_index.
> +	* dwarf_begin_elf.c (dwarf_scnnames): Add IDX_debug_cu_index and
> +	IDX_debug_tu_index.
> +	(scn_to_string_section_idx): Ditto.
> +	(scn_dwarf_type): Check for .debug_cu_index, .debug_tu_index,
> +	.zdebug_cu_index, and .zdebug_tu_index.
> +	(check_section): Change .dwo suffix matching to account for
> +	.debug_cu_index and .debug_tu_index.
> =20
>  2023-02-22  Mark Wielaard  <mark@klomp.org>
> =20
> diff --git a/libdw/dwarf_begin_elf.c b/libdw/dwarf_begin_elf.c
> index 92d76d24..7936d343 100644
> --- a/libdw/dwarf_begin_elf.c
> +++ b/libdw/dwarf_begin_elf.c
> @@ -67,6 +67,8 @@ static const char dwarf_scnnames[IDX_last][19] =3D
>    [IDX_debug_macro] =3D ".debug_macro",
>    [IDX_debug_ranges] =3D ".debug_ranges",
>    [IDX_debug_rnglists] =3D ".debug_rnglists",
> +  [IDX_debug_cu_index] =3D ".debug_cu_index",
> +  [IDX_debug_tu_index] =3D ".debug_tu_index",
>    [IDX_gnu_debugaltlink] =3D ".gnu_debugaltlink"
>  };
>  #define ndwarf_scnnames (sizeof (dwarf_scnnames) / sizeof (dwarf_scnname=
s[0]))
> @@ -92,6 +94,8 @@ static const enum string_section_index scn_to_string_se=
ction_idx[IDX_last] =3D
>    [IDX_debug_macro] =3D STR_SCN_IDX_last,
>    [IDX_debug_ranges] =3D STR_SCN_IDX_last,
>    [IDX_debug_rnglists] =3D STR_SCN_IDX_last,
> +  [IDX_debug_cu_index] =3D STR_SCN_IDX_last,
> +  [IDX_debug_tu_index] =3D STR_SCN_IDX_last,
>    [IDX_gnu_debugaltlink] =3D STR_SCN_IDX_last
>  };

OK

> @@ -109,6 +113,11 @@ scn_dwarf_type (Dwarf *result, size_t shstrndx, Elf_=
Scn *scn)
>      {
>        if (startswith (scnname, ".gnu.debuglto_.debug"))
>  	return TYPE_GNU_LTO;
> +      else if (strcmp (scnname, ".debug_cu_index") =3D=3D 0
> +	       || strcmp (scnname, ".debug_tu_index") =3D=3D 0
> +	       || strcmp (scnname, ".zdebug_cu_index") =3D=3D 0
> +	       || strcmp (scnname, ".zdebug_tu_index") =3D=3D 0)
> +	return TYPE_DWO;
>        else if (startswith (scnname, ".debug_") || startswith (scnname, "=
.zdebug_"))
>  	{
>  	  size_t len =3D strlen (scnname);

OK

> @@ -173,42 +182,34 @@ check_section (Dwarf *result, size_t shstrndx, Elf_=
Scn *scn, bool inscngrp)
>    bool gnu_compressed =3D false;
>    for (cnt =3D 0; cnt < ndwarf_scnnames; ++cnt)
>      {
> +      /* .debug_cu_index and .debug_tu_index don't have a .dwo suffix,
> +	 but they are for DWO.  */
> +      if (result->type !=3D TYPE_DWO
> +	  && (cnt =3D=3D IDX_debug_cu_index || cnt =3D=3D IDX_debug_tu_index))
> +	continue;
> +      bool need_dot_dwo =3D
> +	(result->type =3D=3D TYPE_DWO
> +	 && cnt !=3D IDX_debug_cu_index
> +	 && cnt !=3D IDX_debug_tu_index);
>        size_t dbglen =3D strlen (dwarf_scnnames[cnt]);
>        size_t scnlen =3D strlen (scnname);
>        if (strncmp (scnname, dwarf_scnnames[cnt], dbglen) =3D=3D 0
> -	  && (dbglen =3D=3D scnlen
> -	      || (scnlen =3D=3D dbglen + 4
> +	  && ((!need_dot_dwo && dbglen =3D=3D scnlen)
> +	      || (need_dot_dwo
> +		  && scnlen =3D=3D dbglen + 4
>  		  && strstr (scnname, ".dwo") =3D=3D scnname + dbglen)))
> -	{
> -	  if (dbglen =3D=3D scnlen)
> -	    {
> -	      if (result->type =3D=3D TYPE_PLAIN)
> -		break;
> -	    }
> -	  else if (result->type =3D=3D TYPE_DWO)
> -	    break;
> -	}
> +	break;
>        else if (scnname[0] =3D=3D '.' && scnname[1] =3D=3D 'z'
>  	       && (strncmp (&scnname[2], &dwarf_scnnames[cnt][1],
>  			    dbglen - 1) =3D=3D 0
> -		   && (scnlen =3D=3D dbglen + 1
> -		       || (scnlen =3D=3D dbglen + 5
> +		   && ((!need_dot_dwo && scnlen =3D=3D dbglen + 1)
> +		       || (need_dot_dwo
> +			   && scnlen =3D=3D dbglen + 5
>  			   && strstr (scnname,
>  				      ".dwo") =3D=3D scnname + dbglen + 1))))
>  	{
> -	  if (scnlen =3D=3D dbglen + 1)
> -	    {
> -	      if (result->type =3D=3D TYPE_PLAIN)
> -		{
> -		  gnu_compressed =3D true;
> -		  break;
> -		}
> -	    }
> -	  else if (result->type <=3D TYPE_DWO)
> -	    {
> -	      gnu_compressed =3D true;
> -	      break;
> -	    }
> +	  gnu_compressed =3D true;
> +	  break;
>  	}
>        else if (scnlen > 14 /* .gnu.debuglto_ prefix. */
>  	       && startswith (scnname, ".gnu.debuglto_")

OK. It took me a bit to realize how this works. But the idea is that we
do two scans over the section names in global_read the first one sets
the result->type and then in the second scan this function is called to
check the found section is part of the type. That we also set
gnu_compressed in this loop makes thing extra funny. But you did
simplify that check. Thanks.

> diff --git a/libdw/libdwP.h b/libdw/libdwP.h
> index 77959b3b..0c44d40c 100644
> --- a/libdw/libdwP.h
> +++ b/libdw/libdwP.h
> @@ -82,6 +82,8 @@ enum
>      IDX_debug_macro,
>      IDX_debug_ranges,
>      IDX_debug_rnglists,
> +    IDX_debug_cu_index,
> +    IDX_debug_tu_index,
>      IDX_gnu_debugaltlink,
>      IDX_last
>    };

OK.