public inbox for gdb-patches@sourceware.org
 help / color / mirror / Atom feed
From: Simon Marchi <simark@simark.ca>
To: Bruno Larsen <blarsen@redhat.com>, gdb-patches@sourceware.org
Subject: Re: [PATCH v2] gdb: Fix issue with Clang CLI macros
Date: Mon, 17 Oct 2022 08:32:03 -0400	[thread overview]
Message-ID: <42c22074-89cb-f23b-eea1-faeea22026d7@simark.ca> (raw)
In-Reply-To: <20221017102627.2540273-1-blarsen@redhat.com>


Hi Bruno,

Thanks for fixing this.

On 2022-10-17 06:26, Bruno Larsen via Gdb-patches wrote:
> Clang up to the current version adds macros that were defined in the

Might as well say what is the current version.

> command line or by "other means", according to the Dwarf specification,
> after the last DW_MACRO_end_file, instead of before the first
> DW_MACRO_start_file, as the specification dictates.  When GDB reads the
> macros after the last file is closed, the macros never end up "in scope"
> and so we can't print them.  This has been submitted as a bug to Clang
> developers, and PR macros/29034 was opened for GDB to keep track of
> this.

Can you add the link to the Clang bug?

> 
> Seeing as there is no expected date for it to be fixed, a workaround was
> added for all current versions of Clang.  The workaround detects when

Nit commit message style comment, when talking about the work done in
the current patch, use the present tense:

    Seeing as there is no expected date for it to be fixed, add a workaround
    for all current versions of Clang.

> the main file would be closed and if the producer is clang, and turns
> that operation into a noop, so we keep a reference to the current_file
> as those macros are read.
> 
> A test case was added to confirm the functionality.
> 
> Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=29034
> ---
>  gdb/dwarf2/cu.h                               |   1 +
>  gdb/dwarf2/macro.c                            |  18 +++-
>  gdb/dwarf2/macro.h                            |   2 +-
>  gdb/dwarf2/read.c                             |  15 ++-
>  gdb/dwarf2/read.h                             |   2 +
>  gdb/producer.c                                |  26 +++++
>  gdb/producer.h                                |   4 +
>  gdb/testsuite/gdb.dwarf2/clang_cli_macros.exp | 101 ++++++++++++++++++
>  8 files changed, 163 insertions(+), 6 deletions(-)
>  create mode 100644 gdb/testsuite/gdb.dwarf2/clang_cli_macros.exp
> 
> diff --git a/gdb/dwarf2/cu.h b/gdb/dwarf2/cu.h
> index 23cb3d21b2e..51638e71b19 100644
> --- a/gdb/dwarf2/cu.h
> +++ b/gdb/dwarf2/cu.h
> @@ -264,6 +264,7 @@ struct dwarf2_cu
>    bool producer_is_icc : 1;
>    bool producer_is_icc_lt_14 : 1;
>    bool producer_is_codewarrior : 1;
> +  bool producer_is_clang : 1;
>  
>    /* When true, the file that we're processing is known to have
>       debugging info for C++ namespaces.  GCC 3.3.x did not produce
> diff --git a/gdb/dwarf2/macro.c b/gdb/dwarf2/macro.c
> index 38c0fdfec73..c753dc300f2 100644
> --- a/gdb/dwarf2/macro.c
> +++ b/gdb/dwarf2/macro.c
> @@ -445,7 +445,7 @@ dwarf_decode_macro_bytes (dwarf2_per_objfile *per_objfile,
>  			  struct dwarf2_section_info *str_section,
>  			  struct dwarf2_section_info *str_offsets_section,
>  			  gdb::optional<ULONGEST> str_offsets_base,
> -			  htab_t include_hash)
> +			  htab_t include_hash, struct dwarf2_cu *cu)
>  {
>    struct objfile *objfile = per_objfile->objfile;
>    enum dwarf_macro_record_type macinfo_type;
> @@ -672,6 +672,16 @@ dwarf_decode_macro_bytes (dwarf2_per_objfile *per_objfile,
>  	  if (! current_file)
>  	    complaint (_("macro debug info has an unmatched "
>  			 "`close_file' directive"));
> +	  else if (current_file->included_by == nullptr
> +		   && producer_is_clang (cu))
> +	    {
> +	      /* Clang, until the current version, misplaces macro definitions,

Precision: macros defined on the command-line, not all macros.

> +		 putting them after the last DW_MACRO_end_file instead of
> +		 before the first DW_MACRO_start_file.  */
> +	      /* FIXME: Since at the time of writing there is no clang version
> +		 with this bug fixed, we check for any clang producer.  This
> +		 should be changed to producer_is_clang_lt_XX when possible. */

The second comment is relevant, but I would remove the FIXME, as it's
not a known deficiency in GDB.  Also, I think you can merge the two
comments into a single one.

> @@ -13343,6 +13352,10 @@ check_producer (struct dwarf2_cu *cu)
>      }
>    else if (startswith (cu->producer, "CodeWarrior S12/L-ISA"))
>      cu->producer_is_codewarrior = true;
> +  else if (producer_is_clang (cu->producer, &major, &minor))
> +    {
> +      cu->producer_is_clang = true;
> +    }

Remove braces.

>    else
>      {
>        /* For other non-GCC compilers, expect their behavior is DWARF version
> @@ -23363,7 +23376,7 @@ dwarf_decode_macros (struct dwarf2_cu *cu, unsigned int offset,
>  
>    dwarf_decode_macros (per_objfile, builder, section, lh,
>  		       offset_size, offset, str_section, str_offsets_section,
> -		       str_offsets_base, section_is_gnu);
> +		       str_offsets_base, section_is_gnu, cu);
>  }
>  
>  /* Return the .debug_loc section to use for CU.
> diff --git a/gdb/dwarf2/read.h b/gdb/dwarf2/read.h
> index 5f01fbc1025..a47e6bf5144 100644
> --- a/gdb/dwarf2/read.h
> +++ b/gdb/dwarf2/read.h
> @@ -761,4 +761,6 @@ extern void dwarf2_get_section_info (struct objfile *,
>  				     asection **, const gdb_byte **,
>  				     bfd_size_type *);
>  
> +extern bool producer_is_clang (struct dwarf2_cu *cu);
> +
>  #endif /* DWARF2READ_H */
> diff --git a/gdb/producer.c b/gdb/producer.c
> index ef1dd93afbc..a9fb7dd8245 100644
> --- a/gdb/producer.c
> +++ b/gdb/producer.c
> @@ -127,6 +127,32 @@ producer_is_llvm (const char *producer)
>  				 || startswith (producer, " F90 Flang ")));
>  }
>  
> +/* see producer.h */

/* See producer.h.  */

> +
> +bool
> +producer_is_clang (const char* producer, int *major, int* minor)

Space before * (twice).

> +{
> +  const char *cs;

Declare where initialized.

> +
> +  if (producer != nullptr && startswith (producer, "clang "))
> +    {
> +      int maj, min;
> +      if (major == nullptr)
> +	major = &maj;
> +      if (minor == nullptr)
> +	minor = &min;
> +
> +      /* The full producer sting will look something like

sting -> string

> +	 "clang version XX.X.X ..."
> +	 So we can safely ignore all characters before the first digit.  */
> +      cs = producer + strlen("clang version ");

Space before strlen.

Above, you check that the string starts with "clang ", but here you
assume it contains "clang version ".  If you crafted a binary with
"clang " as a producer string, it would make an out-of-bounds access.
I'd suggest changing the startswith call to check for "clang version "
too.  Then, if you crafted a binary with "clang version " as the
producer, I think the sscanf below wouldn't make an out-of-bounds
access, it would just fail (as it would try to parse an empty string).

> +
> +      if (sscanf (cs, "%d.%d", major, minor) == 2)
> +	  return true

One indentation level too much.

;
> +    }
> +  return false;
> +}
> +
>  #if defined GDB_SELF_TEST
>  namespace selftests {
>  namespace producer {
> diff --git a/gdb/producer.h b/gdb/producer.h
> index f7c19368bc6..b75cfae6569 100644
> --- a/gdb/producer.h
> +++ b/gdb/producer.h
> @@ -41,4 +41,8 @@ extern bool producer_is_icc (const char *producer, int *major, int *minor);
>     false otherwise.*/
>  extern bool producer_is_llvm (const char *producer);
>  
> +/* Returns true if the given PRODUCER string is clang, false otherwise.
> +   Sets MAJOR and MINOR accordingly, if not NULL.  */
> +extern bool producer_is_clang (const char *producer, int *major, int *minor);
> +
>  #endif
> diff --git a/gdb/testsuite/gdb.dwarf2/clang_cli_macros.exp b/gdb/testsuite/gdb.dwarf2/clang_cli_macros.exp
> new file mode 100644
> index 00000000000..f056d8ad832
> --- /dev/null
> +++ b/gdb/testsuite/gdb.dwarf2/clang_cli_macros.exp

We use more kebab case than snake case in test file names.

> @@ -0,0 +1,101 @@
> +# This testcase is part of GDB, the GNU debugger.
> +
> +# Copyright 2022 Free Software Foundation, Inc.
> +
> +# This program is free software; you can redistribute it and/or modify
> +# it under the terms of the GNU General Public License as published by
> +# the Free Software Foundation; either version 3 of the License, or
> +# (at your option) any later version.
> +#
> +# This program is distributed in the hope that it will be useful,
> +# but WITHOUT ANY WARRANTY; without even the implied warranty of
> +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> +# GNU General Public License for more details.
> +#
> +# You should have received a copy of the GNU General Public License
> +# along with this program.  If not, see <http://www.gnu.org/licenses/>.
> +
> +# Test that GDB is able to print macros defined in the command line invocaion
> +# of clang. It has to be explicitly tested because of:

Two spaces after period.

> +# https://github.com/llvm/llvm-project/issues/54506
> +
> +load_lib dwarf.exp
> +
> +if {![dwarf2_support]} {
> +    return 0
> +}
> +if {![test_compiler_info gcc-*-*]} {
> +    untested "dwarf assembler needs GCC"
> +}
> +
> +standard_testfile .S
> +
> +set srcfile macro-source-path.c

copy-pasto.  Actually, can you use:

  standard_testfile .c .S

and then use $srcfile for the .c, and $srcfile2 (which you can reassign
to $asm_file for clarity) for the .S?

> +set asm_file [standard_output_file clang_cli_macros.S]

It's preferable not to hard-code source file names, so the test keeps
working if renamed.

> +
> +lassign [function_range main $srcdir/$subdir/$srcfile] \
> +    main_start main_len
> +
> +Dwarf::assemble $asm_file {
> +    global srcfile

I prefer not using the global keyword, but use the $::srcfile form
instead.  It avoids stale "global" definitions, and it shows at the
point of use that the variable is a global one.

> +    declare_labels L cu_macros
> +
> +    cu {} {
> +	DW_TAG_compile_unit {
> +	    {DW_AT_producer "clang version 15.0.0"}
> +	    {DW_AT_language @DW_LANG_C11}
> +	    {DW_AT_name $srcfile}
> +	    {DW_AT_macros $cu_macros DW_FORM_sec_offset}
> +	    {DW_AT_stmt_list $L DW_FORM_sec_offset}
> +	} {
> +	    declare_labels int_type
> +
> +	    int_type: DW_TAG_base_type {
> +	        {DW_AT_byte_size 4 DW_FORM_sdata}
> +	        {DW_AT_encoding  @DW_ATE_signed}
> +	        {DW_AT_name int}
> +	    }
> +	    DW_TAG_subprogram {
> +	        {MACRO_AT_func {main}}
> +	        {type :$int_type}
> +	    }
> +	}
> +    }
> +    lines {version 2} L {
> +	file_name $srcfile 1
> +	program {
> +	    DW_LNE_set_address $::main_start
> +	    line 10
> +	    DW_LNS_copy
> +	    DW_LNE_set_address "$::main_start + $::main_len"
> +	    DW_LNE_end_sequence
> +	}
> +    }
> +
> +    # Define the .debug_macro section.
> +    macro {
> +	cu_macros: unit {
> +	    "debug-line-offset-label" $L
> +        } {
> +	    # Clang has this bug where it puts the macros defined on
> +	    # the command-line after the main file portion (see
> +	    # PR 29034).

I would put that information in the intro comment for the test, where
you link to the llvm bug tracker.

> +	    start_file 0 1
> +		# A macro defined at line 1 of the main file.
> +		define 1 "TWO 2"
> +	    end_file
> +	    define 0 "ONE 1"
> +	}
> +    }
> +}
> +
> +if {[prepare_for_testing "failed to prepare" $testfile [list $srcfile $asm_file] {nodebug}]} {
> +    return
> +}
> +
> +if {![runto_main]} {
> +    return
> +}
> +
> +gdb_test "print TWO" "= 2" "printing simple macro"
> +gdb_test "print ONE" "= 1" "printing defined from CLI"

printing -> print

Thanks,

Simon

  reply	other threads:[~2022-10-17 12:32 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-10-17 10:26 Bruno Larsen
2022-10-17 12:32 ` Simon Marchi [this message]
2022-10-17 14:54   ` Bruno Larsen
2022-10-17 15:28     ` Simon Marchi
2022-10-17 13:22 ` Simon Marchi
2022-10-17 13:35   ` Bruno Larsen

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=42c22074-89cb-f23b-eea1-faeea22026d7@simark.ca \
    --to=simark@simark.ca \
    --cc=blarsen@redhat.com \
    --cc=gdb-patches@sourceware.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).