Hi, In PR27588 a test-case was reported where dwz did not optimize due to illegal DWARF produced by nasm: ... $ dwz libxul.so -o libxul.so.z dwz: libxul.so: loclistptr attribute, yet no .debug_loc section ... Add an option --devel-skip-producer <producer> such that we can filter out CUs coming from a particular producer, such that we have instead: ... $ readelf -wi libxul.so | grep DW_AT_producer \ | sed 's/.*: //' | sort | uniq -c 176 clang LLVM (rustc version 1.50.0) 2 GNU AS 2.36.1 2 GNU C11 11.0.1 20210315 (experimental) <SNIP> 16 NASM 2.15.05 76 yasm 1.3.0 $ dwz libxul.so -o libxul.so.z --devel-skip-producer NASM $ readelf -wi libxul.so.z | grep DW_AT_producer \ | sed 's/.*: //' | sort | uniq -c 176 clang LLVM (rustc version 1.50.0) 2 GNU AS 2.36.1 2 GNU C11 11.0.1 20210315 (experimental) <SNIP> 76 yasm 1.3.0 ... Any comments? Thanks, - Tom Add --devel-skip-producer 2021-03-16 Tom de Vries <tdevries@suse.de> * dwz.c (add_skip_producer, skip_producer): New function. (read_debug_info): Skip CUs fi DW_AT_producer in skip_producers list. (dwz_options, usage): Add --devel-skip-producer entries. (parse_args): Handle skip_producer_parsed. --- dwz.c | 57 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++- 1 file changed, 56 insertions(+), 1 deletion(-) diff --git a/dwz.c b/dwz.c index 54b4bda..06e73bc 100644 --- a/dwz.c +++ b/dwz.c @@ -289,6 +289,8 @@ static enum odr_mode odr_mode = ODR_LINK; static int odr_mode_parsed = 0; static bool odr_active_p = false; +static int skip_producer_parsed = 0; + /* Struct to gather statistics. */ struct stats { @@ -6648,6 +6650,49 @@ read_lang (unsigned char *ptr, enum dwarf_form form, return ptr; } +static const char **skip_producers; +static size_t skip_producers_size; +static size_t nr_skip_producers; + +static void +add_skip_producer (const char *producer) +{ + size_t alloc_size; + if (skip_producers == NULL) + { + skip_producers_size = 10; + alloc_size = skip_producers_size * sizeof (const char *); + skip_producers = malloc (alloc_size); + } + else if (nr_skip_producers == skip_producers_size) + { + skip_producers_size += 10; + alloc_size = skip_producers_size * sizeof (const char *); + skip_producers = realloc (skip_producers, alloc_size); + } + + skip_producers[nr_skip_producers] = producer; + nr_skip_producers++; +} + +static bool +skip_producer (const char *producer) +{ + size_t i; + + if (producer == NULL) + return false; + + for (i = 0; i < nr_skip_producers; ++i) + { + const char *skip = skip_producers[i]; + if (strncmp (skip, producer, strlen (skip)) == 0) + return true; + } + + return false; +} + /* First phase of the DWARF compression. Parse .debug_info section (for kind == DEBUG_INFO) or .debug_types section (for kind == DEBUG_TYPES) for each CU in it construct internal representation for the CU @@ -7320,6 +7365,11 @@ read_debug_info (DSO *dso, int kind, unsigned int *die_count) } cu->cu_comp_dir = get_AT_string (cu->cu_die, DW_AT_comp_dir); + if (skip_producer (get_AT_string (cu->cu_die, DW_AT_producer))) + { + cu->cu_die->die_remove = 1; + continue; + } enum dwarf_form form; debug_line_off = get_AT_int (cu->cu_die, DW_AT_stmt_list, &present, &form); @@ -16374,6 +16424,8 @@ static struct option dwz_options[] = no_argument, &checksum_cycle_opt, 1 }, { "devel-no-checksum-cycle-opt", no_argument, &checksum_cycle_opt, 0 }, + { "devel-skip-producer", + required_argument, &skip_producer_parsed, 1}, #endif { "odr", no_argument, &odr, 1 }, { "no-odr", no_argument, &odr, 0 }, @@ -16630,7 +16682,8 @@ usage (const char *progname, int failing) " --devel-die-count-method\n" " --devel-deduplication-mode={none,intra-cu,inter-cu}\n" " --devel-uni-lang / --devel-no-uni-lang\n" - " --devel-gen-cu / --devel-no-gen-cu\n")); + " --devel-gen-cu / --devel-no-gen-cu\n" + " --devel-skip-producer <producer>\n")); #endif exit (failing); @@ -16726,6 +16779,8 @@ parse_args (int argc, char *argv[], bool *hardlink, const char **outfile) error (1, 0, "invalid argument --odr-mode %s", optarg); } + if (skip_producer_parsed) + add_skip_producer (optarg); break; case 'o':
Hi Tom,
On Tue, Mar 16, 2021 at 03:19:20PM +0100, Tom de Vries wrote:
> In PR27588 a test-case was reported where dwz did not optimize due
> to illegal DWARF produced by nasm:
> ...
> $ dwz libxul.so -o libxul.so.z
> dwz: libxul.so: loclistptr attribute, yet no .debug_loc section
> ...
>
> Add an option --devel-skip-producer <producer> such that we can filter out
> CUs coming from a particular producer, such that we have instead:
> ...
> $ readelf -wi libxul.so | grep DW_AT_producer \
> | sed 's/.*: //' | sort | uniq -c
> 176 clang LLVM (rustc version 1.50.0)
> 2 GNU AS 2.36.1
> 2 GNU C11 11.0.1 20210315 (experimental) <SNIP>
> 16 NASM 2.15.05
> 76 yasm 1.3.0
> $ dwz libxul.so -o libxul.so.z --devel-skip-producer NASM
> $ readelf -wi libxul.so.z | grep DW_AT_producer \
> | sed 's/.*: //' | sort | uniq -c
> 176 clang LLVM (rustc version 1.50.0)
> 2 GNU AS 2.36.1
> 2 GNU C11 11.0.1 20210315 (experimental) <SNIP>
> 76 yasm 1.3.0
> ...
>
> Any comments?
So this skips and then removes the whole CU (because we set die_remove
on the cu_die)?
It is useful as --devel option, but I would be against it as non-devel
option.
Cheers,
Mark
On 3/16/21 10:27 PM, Mark Wielaard wrote: > Hi Tom, > > On Tue, Mar 16, 2021 at 03:19:20PM +0100, Tom de Vries wrote: >> In PR27588 a test-case was reported where dwz did not optimize due >> to illegal DWARF produced by nasm: >> ... >> $ dwz libxul.so -o libxul.so.z >> dwz: libxul.so: loclistptr attribute, yet no .debug_loc section >> ... >> >> Add an option --devel-skip-producer <producer> such that we can filter out >> CUs coming from a particular producer, such that we have instead: >> ... >> $ readelf -wi libxul.so | grep DW_AT_producer \ >> | sed 's/.*: //' | sort | uniq -c >> 176 clang LLVM (rustc version 1.50.0) >> 2 GNU AS 2.36.1 >> 2 GNU C11 11.0.1 20210315 (experimental) <SNIP> >> 16 NASM 2.15.05 >> 76 yasm 1.3.0 >> $ dwz libxul.so -o libxul.so.z --devel-skip-producer NASM >> $ readelf -wi libxul.so.z | grep DW_AT_producer \ >> | sed 's/.*: //' | sort | uniq -c >> 176 clang LLVM (rustc version 1.50.0) >> 2 GNU AS 2.36.1 >> 2 GNU C11 11.0.1 20210315 (experimental) <SNIP> >> 76 yasm 1.3.0 >> ... >> >> Any comments? > > So this skips and then removes the whole CU (because we set die_remove > on the cu_die)? > Yes. > It is useful as --devel option, but I would be against it as non-devel > option. Could you explain in more detail why you would be against this as a non-devel option? F.i., I'm curious, is it an abstract objection, or do you foresee concrete problems with the approach? Thanks, - Tom
Hi Tom,
On Wed, 2021-03-17 at 11:46 +0100, Tom de Vries wrote:
> On 3/16/21 10:27 PM, Mark Wielaard wrote:
> > It is useful as --devel option, but I would be against it as non-
> > devel
> > option.
>
> Could you explain in more detail why you would be against this as a
> non-devel option? F.i., I'm curious, is it an abstract objection, or
> do
> you foresee concrete problems with the approach?
Of course I would never call my own objections abstract :)
But it partly is. I think it is questionable if we have to rely on the
producer string to do the right thing. But specifically just dropping a
whole CU DIE tree seems the wrong thing to do (what about the ranges,
loclists, stmt_lists left behind for example).
Cheers,
Mark
On 3/17/21 12:14 PM, Mark Wielaard wrote: > Hi Tom, > > On Wed, 2021-03-17 at 11:46 +0100, Tom de Vries wrote: >> On 3/16/21 10:27 PM, Mark Wielaard wrote: >>> It is useful as --devel option, but I would be against it as non- >>> devel >>> option. >> >> Could you explain in more detail why you would be against this as a >> non-devel option? F.i., I'm curious, is it an abstract objection, or >> do >> you foresee concrete problems with the approach? > > Of course I would never call my own objections abstract :) > But it partly is. I think it is questionable if we have to rely on the > producer string to do the right thing. I'd say dropping a CU is not the right thing, it's a workaround, and relying on the producer string for a workaround sounds acceptable to me. > But specifically just dropping a > whole CU DIE tree seems the wrong thing to do (what about the ranges, > loclists, stmt_lists left behind for example). Yeah, the feature may not be complete yet. F.i. if there's a dwarf expression in a loclist referring to a DIE in the dropped CU, then that'll blow up probably. But for dwarf generated by assemblers, which is not that complicated, this seems good enough. Thanks, - Tom