public inbox for dwz@sourceware.org
 help / color / mirror / Atom feed
From: Tom de Vries <tdevries@suse.de>
To: dwz@sourceware.org, jakub@redhat.com, mark@klomp.org
Subject: [PATCH] Fix DW_AT_decl_file for odr
Date: Mon, 22 Feb 2021 16:11:36 +0100	[thread overview]
Message-ID: <c82bab24-5006-0f42-b3ba-4ac284ec4614@suse.de> (raw)
In-Reply-To: <20210222082439.GA14387@delia>

[-- Attachment #1: Type: text/plain, Size: 241 bytes --]

[ was: Re: [FTR] Fix DW_AT_decl_file for odr ]

On 2/22/21 9:24 AM, Tom de Vries wrote:
> Hi,
> 
> [ For now, this is a for-the-record posting.  I'm working on a different fix
> for this. ]
> 

And this is the proposed fix.

Thanks,
- Tom



[-- Attachment #2: 0001-Fix-DW_AT_decl_file-for-odr.patch --]
[-- Type: text/x-patch, Size: 8380 bytes --]

Fix DW_AT_decl_file for odr

Consider odr-struct.  It has two structs aaa (from different CUs), each with
members of type bbb and ccc, but in one case bbb is a decl, in the other case
ccc is a decl.

When doing odr, we end up with one struct aaa, and no decls:
...
$ dwz --odr odr-struct
$ readelf -wi odr-struct \
    | egrep -A2 "DW_TAG_structure" \
    | egrep "DW_TAG|DW_AT_name|DW_AT_decl"
 <1><19>: Abbrev Number: 25 (DW_TAG_structure_type)
    <1a>   DW_AT_name        : ccc
 <1><2f>: Abbrev Number: 25 (DW_TAG_structure_type)
    <30>   DW_AT_name        : aaa
 <1><4b>: Abbrev Number: 25 (DW_TAG_structure_type)
    <4c>   DW_AT_name        : bbb
...

Now consider using the same file for multifile optimization, combined with
odr:
...
$ cp odr-struct 1; cp 1 2; dwz -m 3 1 2 --odr
...
The desired outcome is that the structs aaa are unified (same as above) in
both 1 and 2, and then moved to multifile 3.

The multifile looks good:
...
$ readelf -wi 3 \
    | egrep -A2 "DW_TAG_structure" \
    | egrep "DW_TAG|DW_AT_name|DW_AT_decl"
 <1><14>: Abbrev Number: 15 (DW_TAG_structure_type)
    <15>   DW_AT_name        : ccc
 <1><2a>: Abbrev Number: 15 (DW_TAG_structure_type)
    <2b>   DW_AT_name        : aaa
 <1><46>: Abbrev Number: 15 (DW_TAG_structure_type)
    <47>   DW_AT_name        : bbb
...
but some struct types are left in 1:
...
$ readelf -wi 1 \
    | egrep -A2 "DW_TAG_structure" \
    | egrep "DW_TAG|DW_AT_name|DW_AT_decl"
 <1><1e>: Abbrev Number: 12 (DW_TAG_structure_type)
    <1f>   DW_AT_name        : aaa
 <1><3d>: Abbrev Number: 12 (DW_TAG_structure_type)
    <3e>   DW_AT_name        : bbb
...

The problem is that the DW_AT_decl_file is different for struct bbb in 1 and
3, and that causes the struct types to linger in 1.

The problem can already be shown without multifile mode using:
....
$ dwz odr-struct --odr
$ llvm-dwarfdump odr-struct \
    | grep -A3 struct \
    | egrep -v "^--|DW_AT_byte_size" \
    | sed 's%/.*/%%'
0x00000019:   DW_TAG_structure_type
                DW_AT_name      ("ccc")
                DW_AT_decl_file ("odr.cc")
0x0000002f:   DW_TAG_structure_type
                DW_AT_name      ("aaa")
                DW_AT_decl_file ("odr.h")
0x0000004b:   DW_TAG_structure_type
                DW_AT_name      ("bbb")
                DW_AT_decl_file ("odr.cc")
...
The DW_AT_decl_file for struct bbb should be odr-2.cc.

The problem is caused by by odr: odr allows defs and decls to be part of the
same duplicate chain, which breaks the invariant that DIEs in the duplicate
chain are isomorph.  During write_die, the first DIE in the chain is written
out as the representative copy, which means having a decl as the first in the
chain is counterproductive.  We have reorder_dups to fix this problem, which
detects if a duplicate chain starts with a decl and then moves the first def
before it.  However, this breaks another variant: that for each partition, all
representative DIEs are from the same CU.  Consequently, the file table of the
partition may not match with the DW_AT_decl_file number.

In other words, the reordered duplicate chain is in the wrong partition.

Fix this by:
- ignoring ODR_DECL DIEs at the start of a duplicate chain when partitioning
- moving the reorder_dups call to the start of partial unit creating, such
  that we get the correct refcu.

This also breaks the invariant checked in create_import_tree that
partition_dups doesn't generate two seperate partial units with the same set
of referrer CUs, so we allow this for odr.

2021-02-19  Tom de Vries  <tdevries@suse.de>

	PR dwz/27438
	* dwz.c (partition_cmp): Ignore ODR_DECL dies at the
	start of a duplicate chain.
	(partition_dups_1): Same.  Move call to reorder_dups earlier.
	(create_import_tree): Allow partial units with same set of referrers
	for odr.
	* testsuite/dwz.tests/odr-struct-multifile.sh: New test.

---
 dwz.c                                       | 44 ++++++++++++++++++++----
 testsuite/dwz.tests/odr-struct-multifile.sh | 53 +++++++++++++++++++++++++++++
 2 files changed, 91 insertions(+), 6 deletions(-)

diff --git a/dwz.c b/dwz.c
index 076f39c..59ead79 100644
--- a/dwz.c
+++ b/dwz.c
@@ -7399,8 +7399,20 @@ partition_cmp (const void *p, const void *q)
   dw_die_ref die2 = *(dw_die_ref *) q;
   dw_die_ref ref1, ref2;
   dw_cu_ref last_cu1 = NULL, last_cu2 = NULL;
-  for (ref1 = die1, ref2 = die2;;
-       ref1 = ref1->die_nextdup, ref2 = ref2->die_nextdup)
+  ref1 = die1;
+  ref2 = die2;
+  if (odr_active_p && odr_mode != ODR_BASIC)
+    {
+      while (ref1 && die_odr_state (die_cu (ref1), ref1) == ODR_DECL)
+	ref1 = ref1->die_nextdup;
+      if (ref1 == NULL)
+	ref1 = die1;
+      while (ref2 && die_odr_state (die_cu (ref2), ref2) == ODR_DECL)
+	ref2 = ref2->die_nextdup;
+      if (ref2 == NULL)
+	ref2 = die2;
+    }
+  for (;; ref1 = ref1->die_nextdup, ref2 = ref2->die_nextdup)
     {
       dw_cu_ref ref1cu = NULL;
       dw_cu_ref ref2cu = NULL;
@@ -7927,8 +7939,20 @@ partition_dups_1 (dw_die_ref *arr, size_t vec_size,
 	  dw_die_ref ref1, ref2;
 	  dw_cu_ref last_cu1 = NULL, last_cu2 = NULL;
 	  size_t this_cnt = 0;
-	  for (ref1 = arr[i], ref2 = arr[j];;
-	       ref1 = ref1->die_nextdup, ref2 = ref2->die_nextdup)
+	  ref1 = arr[i];
+	  ref2 = arr[j];
+	  if (odr_active_p && odr_mode != ODR_BASIC)
+	    {
+	      while (ref1 && die_odr_state (die_cu (ref1), ref1) == ODR_DECL)
+		ref1 = ref1->die_nextdup;
+	      if (ref1 == NULL)
+		ref1 = arr[i];
+	      while (ref2 && die_odr_state (die_cu (ref2), ref2) == ODR_DECL)
+		ref2 = ref2->die_nextdup;
+	      if (ref2 == NULL)
+		ref2 = arr[j];
+	    }
+	  for (;; ref1 = ref1->die_nextdup, ref2 = ref2->die_nextdup)
 	    {
 	      dw_cu_ref ref1cu = NULL;
 	      dw_cu_ref ref2cu = NULL;
@@ -8090,6 +8114,13 @@ partition_dups_1 (dw_die_ref *arr, size_t vec_size,
 		 && (ignore_size || orig_size > new_size));
       if (force)
 	{
+	  if (odr_active_p && odr_mode != ODR_BASIC)
+	    for (k = i; k < j; k++)
+	      {
+		if (second_phase && !arr[k]->die_ref_seen)
+		  continue;
+		arr[k] = reorder_dups (arr[k]);
+	      }
 	  dw_die_ref die, *diep;
 	  dw_cu_ref refcu = die_cu (arr[i]);
 	  dw_cu_ref partial_cu = pool_alloc (dw_cu, sizeof (struct dw_cu));
@@ -8133,8 +8164,6 @@ partition_dups_1 (dw_die_ref *arr, size_t vec_size,
 	      dw_die_ref child;
 	      if (second_phase && !arr[k]->die_ref_seen)
 		continue;
-	      if (odr_active_p && odr_mode != ODR_BASIC)
-		arr[k] = reorder_dups (arr[k]);
 	      if (dump_pus_p)
 		dump_die (arr[k]);
 	      child = copy_die_tree (die, arr[k]);
@@ -9399,6 +9428,9 @@ create_import_tree (void)
 		     IPU2 node into IPU, by removing all incoming edges
 		     of IPU2 and moving over all outgoing edges of IPU2
 		     to IPU.  */
+		  if (odr_active_p && odr_mode != ODR_BASIC
+		      && ipu2->idx < npus + ncus)
+		    continue;
 		  assert (ipu2->idx >= npus + ncus);
 		  size_inc = 0;
 		  if (edge_cost)
diff --git a/testsuite/dwz.tests/odr-struct-multifile.sh b/testsuite/dwz.tests/odr-struct-multifile.sh
new file mode 100644
index 0000000..cc462c9
--- /dev/null
+++ b/testsuite/dwz.tests/odr-struct-multifile.sh
@@ -0,0 +1,53 @@
+if ! $execs/dwz-for-test --odr -v 2>/dev/null; then
+    exit 77
+fi
+
+cp $execs/odr-struct 1
+cp 1 2
+
+for name in aaa bbb ccc; do
+    cnt=$(readelf -wi 1 | grep -c "DW_AT_name.*:.*$name" || true)
+    [ $cnt -eq 2 ]
+done
+
+for name in member_one member_two member_three member_four; do
+    cnt=$(readelf -wi 1 | grep -c "DW_AT_name.*:.*$name" || true)
+    case $name in
+	member_one|member_two)
+	    [ $cnt -eq 2 ]
+	    ;;
+	member_three|member_four)
+	    [ $cnt -eq 1 ]
+	    ;;
+	esac
+done
+
+decl_cnt=$(readelf -wi 1 | grep -c "DW_AT_declaration" || true)
+
+$execs/dwz-for-test --odr 1 2 -m 3
+
+verify-dwarf.sh 1
+verify-dwarf.sh 3
+
+for name in aaa bbb ccc; do
+    cnt=$(readelf -wi 3 | grep -c "DW_AT_name.*:.*$name" || true)
+    [ $cnt -eq 1 ]
+done
+
+for name in member_one member_two member_three member_four; do
+    cnt=$(readelf -wi 3 | grep -c "DW_AT_name.*:.*$name" || true)
+    [ $cnt -eq 1 ]
+done
+
+
+for name in aaa bbb ccc; do
+    cnt=$(readelf -wi 1 | grep -c "DW_AT_name.*:.*$name" || true)
+    [ $cnt -eq 0 ]
+done
+
+for name in member_one member_two member_three member_four; do
+    cnt=$(readelf -wi 1 | grep -c "DW_AT_name.*:.*$name" || true)
+    [ $cnt -eq 0 ]
+done
+
+rm -f 1 2 3

      reply	other threads:[~2021-02-22 15:11 UTC|newest]

Thread overview: 2+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-02-22  8:24 [FTR] " Tom de Vries
2021-02-22 15:11 ` Tom de Vries [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=c82bab24-5006-0f42-b3ba-4ac284ec4614@suse.de \
    --to=tdevries@suse.de \
    --cc=dwz@sourceware.org \
    --cc=jakub@redhat.com \
    --cc=mark@klomp.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).