public inbox for gdb-patches@sourceware.org
From: Andrew Burgess <andrew.burgess@embecosm.com>
To: Tom de Vries <tdevries@suse.de>
Cc: gdb-patches@sourceware.org
Subject: Re: [PATCH][gdb/symtab] Fix line-table end-of-sequence sorting
Date: Sat, 6 Jun 2020 07:51:52 +0100
Message-ID: <20200606065152.GF3522@embecosm.com>
In-Reply-To: <18b1ee90-2ece-a5b4-787b-2507b081da81@suse.de>

* Tom de Vries <tdevries@suse.de> [2020-06-06 01:44:42 +0200]:

> [ was: Re: [PATCH 2/3] gdb: Don't reorder line table entries too much
> when sorting. ]
> 
> On 05-06-2020 18:00, Tom de Vries wrote:
> > On 05-06-2020 16:49, Tom de Vries wrote:
> >> On 23-12-2019 02:51, Andrew Burgess wrote:
> >>> I had to make a small adjustment in find_pc_sect_line in order to
> >>> correctly find the previous line in the line table.  In some line
> >>> tables I was seeing an actual line entry and an end of sequence marker
> >>> at the same address, before this commit these would reorder to move
> >>> the end of sequence marker before the line entry (end of sequence has
> >>> line number 0).  Now the end of sequence marker remains in its correct
> >>> location, and in order to find a previous line we should step backward
> >>> over any end of sequence markers.
> >>>
> >>> As an example, the binary:
> >>>   gdb/testsuite/outputs/gdb.dwarf2/dw2-ranges-func/dw2-ranges-func-lo-cold
> >>>
> >>> Has this line table before the patch:
> >>>
> >>>   INDEX    LINE ADDRESS
> >>>   0          48 0x0000000000400487
> >>>   1         END 0x000000000040048e
> >>>   2          52 0x000000000040048e
> >>>   3          54 0x0000000000400492
> >>>   4          56 0x0000000000400497
> >>>   5         END 0x000000000040049a
> >>>   6          62 0x000000000040049a
> >>>   7         END 0x00000000004004a1
> >>>   8          66 0x00000000004004a1
> >>>   9          68 0x00000000004004a5
> >>>   10         70 0x00000000004004aa
> >>>   11         72 0x00000000004004b9
> >>>   12        END 0x00000000004004bc
> >>>   13         76 0x00000000004004bc
> >>>   14         78 0x00000000004004c0
> >>>   15         80 0x00000000004004c5
> >>>   16        END 0x00000000004004cc
> >>>
> >>> And after this patch:
> >>>
> >>>   INDEX    LINE ADDRESS
> >>>   0          48 0x0000000000400487
> >>>   1          52 0x000000000040048e
> >>>   2         END 0x000000000040048e
> >>>   3          54 0x0000000000400492
> >>>   4          56 0x0000000000400497
> >>>   5         END 0x000000000040049a
> >>>   6          62 0x000000000040049a
> >>>   7          66 0x00000000004004a1
> >>>   8         END 0x00000000004004a1
> >>>   9          68 0x00000000004004a5
> >>>   10         70 0x00000000004004aa
> >>>   11         72 0x00000000004004b9
> >>>   12        END 0x00000000004004bc
> >>>   13         76 0x00000000004004bc
> >>>   14         78 0x00000000004004c0
> >>>   15         80 0x00000000004004c5
> >>>   16        END 0x00000000004004cc
> >>>
> >>> When calling find_pc_sect_line with the address 0x000000000040048e, in
> >>> both cases we find entry #3, we then try to find the previous entry,
> >>> which originally found this entry '2         52 0x000000000040048e',
> >>> after the patch it finds '2         END 0x000000000040048e', which
> >>> causes the lookup to fail.
> >>>
> >>> By skipping the END marker after this patch we get back to the correct
> >>> entry, which is now #1: '1          52 0x000000000040048e', and
> >>> everything works again.
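
To make the lookup described above concrete, here is a minimal standalone
sketch.  It is not the gdb code: `entry` is a hypothetical stand-in for
linetable_entry, `find_prev_line_index` is an invented name, and the real
find_pc_sect_line does considerably more.  Note that this sketch steps
over every END marker it meets, so for an address that lies in a gap
after a sequence it would still report that sequence's last line.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for gdb's linetable_entry.
struct entry
{
  int line;            // 0 marks an end-of-sequence (END) entry
  std::uint64_t pc;
};

// Return the index of the last non-END entry whose address is <= PC,
// or -1 if there is none.  TABLE must be sorted by address.
std::ptrdiff_t
find_prev_line_index (const std::vector<entry> &table, std::uint64_t pc)
{
  // First entry whose address is strictly greater than PC.
  auto item = std::upper_bound (table.begin (), table.end (), pc,
				[] (std::uint64_t addr, const entry &e)
				{ return addr < e.pc; });

  // Step backward, skipping any END markers on the way.
  while (item != table.begin ())
    {
      --item;
      if (item->line != 0)
	return item - table.begin ();
    }

  return -1;
}
```

With the first rows of the "after this patch" table, a lookup of
0x40048e lands on entry #3, steps over the END at #2, and returns #1.
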
> >>
> >> I am starting to suspect that you have been working around an incorrect line
> >> table.
> >>
> >> Consider this bit:
> >> ...
> >>    0          48 0x0000000000400487
> >>    1          52 0x000000000040048e
> >>    2         END 0x000000000040048e
> >> ...
> >>
> >> The end marker marks the address one past the end of the sequence.
> >> Therefore, it makes no sense to have an entry in the sequence with the
> >> same address as the end marker.
> >>
> >> [ dwarf doc:
> >>
> >> end_sequence:
> >>
> >> A boolean indicating that the current address is that of the first byte
> >> after the end of a sequence of target machine instructions. end_sequence
> >> terminates a sequence of lines; therefore other information in the same
> >> row is not meaningful.
> >>
> >> DW_LNE_end_sequence:
> >>
> >> The DW_LNE_end_sequence opcode takes no operands. It sets the
> >> end_sequence register of the state machine to “true” and appends a row
> >> to the matrix using the current values of the state-machine registers.
> >> Then it resets the registers to the initial values specified above (see
> >> Section 6.2.2). Every line number program sequence must end with a
> >> DW_LNE_end_sequence instruction which creates a row whose address is
> >> that of the byte after the last target machine instruction of the sequence.
> >>
> >> ]
> >>
> >> The incorrect entry is generated by this dwarf assembler sequence:
> >> ...
> >>                 {DW_LNS_copy}
> >>                 {DW_LNE_end_sequence}
> >> ...
> >>
> >> I think we should probably fix the dwarf assembly test-cases.
> >>
> >> If we want to handle this in gdb, the thing that seems most logical to
> >> me is to ignore entries of this kind.
> > 
> > Hmm, that seems to be done already, in buildsym_compunit::record_line.
> > 
> > Anyway, I was looking at the line table for
> > gdb.dwarf2/dw2-ranges-base.exp, and got a line table with subsequent end
> > markers:
> > ...
> > INDEX  LINE   ADDRESS            IS-STMT
> > 0      31     0x00000000004004a7 Y
> > 1      21     0x00000000004004ae Y
> > 2      END    0x00000000004004ae Y
> > 3      11     0x00000000004004ba Y
> > 4      END    0x00000000004004ba Y
> > 5      END    0x00000000004004c6 Y
> > ...
> > 
> > By using this patch:
> > ...
> > diff --git a/gdb/buildsym.c b/gdb/buildsym.c
> > index 33bf6523e9..76f0b54ff6 100644
> > --- a/gdb/buildsym.c
> > +++ b/gdb/buildsym.c
> > @@ -943,6 +943,10 @@ buildsym_compunit::end_symtab_with_blockvector (struct block *static_block,
> >             = [] (const linetable_entry &ln1,
> >                   const linetable_entry &ln2) -> bool
> >               {
> > +               if (ln1.pc == ln2.pc
> > +                   && ((ln1.line == 0) != (ln2.line == 0)))
> > +                 return ln1.line == 0 ? true : false;

I will take a look at this patch properly as soon as I can, but I just
spotted a pet peeve of mine.  Please just write:

  return ln1.line == 0;
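
For reference, a standalone sketch of the comparator with that change
applied.  The `entry` struct is a hypothetical stand-in for
linetable_entry so that the snippet compiles outside gdb, and the lambda
is written as a named function purely for illustration:

```cpp
#include <cstdint>

// Hypothetical stand-in for gdb's linetable_entry; only the two fields
// the comparator reads are modelled.
struct entry
{
  int line;            // 0 marks an end-of-sequence entry
  std::uint64_t pc;
};

// Same logic as the lambda in the patch, minus the "? true : false".
bool
lte_is_less_than (const entry &ln1, const entry &ln2)
{
  // Same address and exactly one of the two is an END marker:
  // the END marker sorts first.
  if (ln1.pc == ln2.pc && ((ln1.line == 0) != (ln2.line == 0)))
    return ln1.line == 0;

  return ln1.pc < ln2.pc;
}
```

The comparison already yields a bool, so the ternary adds nothing.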

Thanks,
Andrew


> > +
> >                 return (ln1.pc < ln2.pc);
> >               };
> > 
> > ...
> > I get the expected:
> > ...
> > INDEX  LINE   ADDRESS            IS-STMT
> > 0      31     0x00000000004004a7 Y
> > 1      END    0x00000000004004ae Y
> > 2      21     0x00000000004004ae Y
> > 3      END    0x00000000004004ba Y
> > 4      11     0x00000000004004ba Y
> > 5      END    0x00000000004004c6 Y
> > ...
> 
> Any comments on patch below?
> 
> Thanks,
> - Tom
> 

> [gdb/symtab] Fix line-table end-of-sequence sorting
> 
> Consider test-case gdb.dwarf2/dw2-ranges-base.exp.  It has a line-table for
> dw2-ranges-base.c like this:
> ...
>  Line Number Statements:
>   [0x0000014e]  Extended opcode 2: set Address to 0x4004ba
>   [0x00000159]  Advance Line by 10 to 11
>   [0x0000015b]  Copy
>   [0x0000015c]  Advance PC by 12 to 0x4004c6
>   [0x0000015e]  Advance Line by 19 to 30
>   [0x00000160]  Copy
>   [0x00000161]  Extended opcode 1: End of Sequence
> 
>   [0x00000164]  Extended opcode 2: set Address to 0x4004ae
>   [0x0000016f]  Advance Line by 20 to 21
>   [0x00000171]  Copy
>   [0x00000172]  Advance PC by 12 to 0x4004ba
>   [0x00000174]  Advance Line by 29 to 50
>   [0x00000176]  Copy
>   [0x00000177]  Extended opcode 1: End of Sequence
> 
>   [0x0000017a]  Extended opcode 2: set Address to 0x4004a7
>   [0x00000185]  Advance Line by 30 to 31
>   [0x00000187]  Copy
>   [0x00000188]  Advance PC by 7 to 0x4004ae
>   [0x0000018a]  Advance Line by 39 to 70
>   [0x0000018c]  Copy
>   [0x0000018d]  Extended opcode 1: End of Sequence
> ...
> 
> The Copy followed by End-of-Sequence is as specified in the dwarf assembly,
> but incorrect.  For instance, consider:
> ...
>   [0x0000015c]  Advance PC by 12 to 0x4004c6
>   [0x0000015e]  Advance Line by 19 to 30
>   [0x00000160]  Copy
>   [0x00000161]  Extended opcode 1: End of Sequence
> ...
> 
> Both the Copy and the End-of-Sequence append a row to the matrix using the
> same address: 0x4004c6.  The Copy declares a target instruction at that
> address.  The End-of-Sequence declares that the sequence ends before that
> address.  It's a contradiction that the target instruction is both part of the
> sequence (according to Copy) and not part of the sequence (according to
> End-of-Sequence).
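
The contradiction can be reproduced with a toy version of the line
number state machine.  This is only a sketch under simplifying
assumptions: it models just the five operations that appear in the dump
above, treats "Advance PC" as a plain byte offset, and every name in it
(`op`, `insn`, `row`, `run_line_program`, `first_sequence`) is invented
for illustration.

```cpp
#include <cstdint>
#include <vector>

enum class op { set_address, advance_pc, advance_line, copy, end_sequence };

struct insn
{
  op code;
  std::int64_t operand;   // unused for copy and end_sequence
};

struct row
{
  std::uint64_t address;
  int line;
  bool end_sequence;
};

// Run PROGRAM and return the rows it appends to the matrix.
std::vector<row>
run_line_program (const std::vector<insn> &program)
{
  std::uint64_t address = 0;   // initial register values
  int line = 1;
  std::vector<row> matrix;

  for (const insn &i : program)
    switch (i.code)
      {
      case op::set_address:
	address = static_cast<std::uint64_t> (i.operand);
	break;
      case op::advance_pc:
	address += static_cast<std::uint64_t> (i.operand);
	break;
      case op::advance_line:
	line += static_cast<int> (i.operand);
	break;
      case op::copy:
	// Appends a row using the current registers.
	matrix.push_back ({address, line, false});
	break;
      case op::end_sequence:
	// Also appends a row at the current address, then resets.
	matrix.push_back ({address, line, true});
	address = 0;
	line = 1;
	break;
      }

  return matrix;
}

// The first sequence of the dump above.
std::vector<insn>
first_sequence ()
{
  return { {op::set_address, 0x4004ba}, {op::advance_line, 10},
	   {op::copy, 0}, {op::advance_pc, 12}, {op::advance_line, 19},
	   {op::copy, 0}, {op::end_sequence, 0} };
}
```

Running first_sequence () through run_line_program gives three rows, and
the last two (the second Copy and the End-of-Sequence) share the address
0x4004c6.
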
> 
> The offending Copy is skipped though by buildsym_compunit::record_line for
> unrelated reasons.  So, if we disable the sorting in
> buildsym_compunit::end_symtab_with_blockvector, we have:
> ...
> INDEX  LINE   ADDRESS            IS-STMT
> 0      11     0x00000000004004ba Y
> 1      END    0x00000000004004c6 Y
> 2      21     0x00000000004004ae Y
> 3      END    0x00000000004004ba Y
> 4      31     0x00000000004004a7 Y
> 5      END    0x00000000004004ae Y
> ...
> but if we re-enable the sorting, we have:
> ...
> INDEX  LINE   ADDRESS            IS-STMT
> 0      31     0x00000000004004a7 Y
> 1      21     0x00000000004004ae Y
> 2      END    0x00000000004004ae Y
> 3      11     0x00000000004004ba Y
> 4      END    0x00000000004004ba Y
> 5      END    0x00000000004004c6 Y
> ...
> which has both:
> - the contradictory order for the same-address pairs 1/2 and 3/4, as well as
> - a nonsensical pair of ENDs,
> while we'd like:
> ...
> INDEX  LINE   ADDRESS            IS-STMT
> 0      31     0x00000000004004a7 Y
> 1      END    0x00000000004004ae Y
> 2      21     0x00000000004004ae Y
> 3      END    0x00000000004004ba Y
> 4      11     0x00000000004004ba Y
> 5      END    0x00000000004004c6 Y
> ...
> 
> This is a regression since commit 3d92a3e313 "gdb: Don't reorder line table
> entries too much when sorting", that introduced sorting on address while
> keeping entries with the same address in pre-sort order, which leads to
> incorrect results if one of the entries is an End-Of-Sequence.
> 
> Fix this by handling End-Of-Sequence entries in the sorting function.
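
Both orderings can be reproduced outside gdb with a small sketch.  It
assumes the sort is a stable sort (std::stable_sort here), uses a
hypothetical `entry` struct in place of linetable_entry, and the helper
names `sorted_lines` and `dw2_ranges_base_unsorted` are made up for the
example.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for gdb's linetable_entry.
struct entry
{
  int line;            // 0 marks an end-of-sequence (END) entry
  std::uint64_t pc;
};

// Stable-sort TABLE by address and return the resulting line numbers.
// If END_FIRST is true, an END marker sorts before a normal entry at
// the same address; otherwise only the address is compared.
std::vector<int>
sorted_lines (std::vector<entry> table, bool end_first)
{
  auto lte_is_less_than
    = [end_first] (const entry &ln1, const entry &ln2) -> bool
      {
	if (end_first && ln1.pc == ln2.pc
	    && ((ln1.line == 0) != (ln2.line == 0)))
	  return ln1.line == 0;

	return ln1.pc < ln2.pc;
      };

  std::stable_sort (table.begin (), table.end (), lte_is_less_than);

  std::vector<int> lines;
  for (const entry &e : table)
    lines.push_back (e.line);
  return lines;
}

// The unsorted table quoted above, in pre-sort order.
std::vector<entry>
dw2_ranges_base_unsorted ()
{
  return { {11, 0x4004ba}, {0, 0x4004c6}, {21, 0x4004ae},
	   {0, 0x4004ba}, {31, 0x4004a7}, {0, 0x4004ae} };
}
```

With end_first false the lines come out as 31, 21, END, 11, END, END;
with end_first true they come out as 31, END, 21, END, 11, END, which
matches the two sorted tables quoted above (END being line 0).
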
> 
> Tested on x86_64-linux.
> 
> gdb/ChangeLog:
> 
> 2020-06-06  Tom de Vries  <tdevries@suse.de>
> 
> 	* buildsym.c (buildsym_compunit::end_symtab_with_blockvector): Handle
> 	End-Of-Sequence in lte_is_less_than.
> 
> gdb/testsuite/ChangeLog:
> 
> 2020-06-06  Tom de Vries  <tdevries@suse.de>
> 
> 	* gdb.dwarf2/dw2-ranges-base.exp: Test line-table order.
> 
> ---
>  gdb/buildsym.c                               |  4 ++++
>  gdb/testsuite/gdb.dwarf2/dw2-ranges-base.exp | 14 ++++++++++++++
>  2 files changed, 18 insertions(+)
> 
> diff --git a/gdb/buildsym.c b/gdb/buildsym.c
> index 33bf6523e9..76f0b54ff6 100644
> --- a/gdb/buildsym.c
> +++ b/gdb/buildsym.c
> @@ -943,6 +943,10 @@ buildsym_compunit::end_symtab_with_blockvector (struct block *static_block,
>  	    = [] (const linetable_entry &ln1,
>  		  const linetable_entry &ln2) -> bool
>  	      {
> +		if (ln1.pc == ln2.pc
> +		    && ((ln1.line == 0) != (ln2.line == 0)))
> +		  return ln1.line == 0 ? true : false;
> +
>  		return (ln1.pc < ln2.pc);
>  	      };
>  
> diff --git a/gdb/testsuite/gdb.dwarf2/dw2-ranges-base.exp b/gdb/testsuite/gdb.dwarf2/dw2-ranges-base.exp
> index 92f8f6cecb..39281a8857 100644
> --- a/gdb/testsuite/gdb.dwarf2/dw2-ranges-base.exp
> +++ b/gdb/testsuite/gdb.dwarf2/dw2-ranges-base.exp
> @@ -144,12 +144,26 @@ gdb_test "info line frame3" \
>  
>  # Ensure that the line table correctly tracks the end of sequence markers.
>  set end_seq_count 0
> +set prev -1
> +set seq_count 0
>  gdb_test_multiple "maint info line-table gdb.dwarf2/dw2-ranges-base.c" \
>      "count END markers in line table" {
>  	-re "^$decimal\[ \t\]+$decimal\[ \t\]+$hex\(\[ \t\]+Y\)? *\r\n" {
> +	    if { $prev != -1 } {
> +		gdb_assert "$prev == 1" \
> +		    "prev of normal entry at $seq_count is end marker"
> +	    }
> +	    set prev 0
> +	    incr seq_count
>  	    exp_continue
>  	}
>  	-re "^$decimal\[ \t\]+END\[ \t\]+$hex\(\[ \t\]+Y\)? *\r\n" {
> +	    if { $prev != -1 } {
> +		gdb_assert "$prev == 0" \
> +		    "prev of end marker at $seq_count is normal entry"
> +	    }
> +	    set prev 1
> +	    incr seq_count
>  	    incr end_seq_count
>  	    exp_continue
>  	}

