[ was: Re: [PATCH][gdb/symtab] Fix line-table end-of-sequence sorting ]

On 06-06-2020 11:25, Andrew Burgess wrote:
> * Tom de Vries <tdevries@suse.de> [2020-06-06 01:44:42 +0200]:
> 
>> [ was: Re: [PATCH 2/3] gdb: Don't reorder line table entries too much
>> when sorting. ]
>>
>> On 05-06-2020 18:00, Tom de Vries wrote:
>>> On 05-06-2020 16:49, Tom de Vries wrote:
>>>> On 23-12-2019 02:51, Andrew Burgess wrote:
>>>>> I had to make a small adjustment in find_pc_sect_line in order to
>>>>> correctly find the previous line in the line table.  In some line
>>>>> tables I was seeing an actual line entry and an end of sequence marker
>>>>> at the same address, before this commit these would reorder to move
>>>>> the end of sequence marker before the line entry (end of sequence has
>>>>> line number 0).  Now the end of sequence marker remains in its correct
>>>>> location, and in order to find a previous line we should step backward
>>>>> over any end of sequence markers.
>>>>>
>>>>> As an example, the binary:
>>>>>   gdb/testsuite/outputs/gdb.dwarf2/dw2-ranges-func/dw2-ranges-func-lo-cold
>>>>>
>>>>> Has this line table before the patch:
>>>>>
>>>>>   INDEX    LINE ADDRESS
>>>>>   0          48 0x0000000000400487
>>>>>   1         END 0x000000000040048e
>>>>>   2          52 0x000000000040048e
>>>>>   3          54 0x0000000000400492
>>>>>   4          56 0x0000000000400497
>>>>>   5         END 0x000000000040049a
>>>>>   6          62 0x000000000040049a
>>>>>   7         END 0x00000000004004a1
>>>>>   8          66 0x00000000004004a1
>>>>>   9          68 0x00000000004004a5
>>>>>   10         70 0x00000000004004aa
>>>>>   11         72 0x00000000004004b9
>>>>>   12        END 0x00000000004004bc
>>>>>   13         76 0x00000000004004bc
>>>>>   14         78 0x00000000004004c0
>>>>>   15         80 0x00000000004004c5
>>>>>   16        END 0x00000000004004cc
>>>>>
>>>>> And after this patch:
>>>>>
>>>>>   INDEX    LINE ADDRESS
>>>>>   0          48 0x0000000000400487
>>>>>   1          52 0x000000000040048e
>>>>>   2         END 0x000000000040048e
>>>>>   3          54 0x0000000000400492
>>>>>   4          56 0x0000000000400497
>>>>>   5         END 0x000000000040049a
>>>>>   6          62 0x000000000040049a
>>>>>   7          66 0x00000000004004a1
>>>>>   8         END 0x00000000004004a1
>>>>>   9          68 0x00000000004004a5
>>>>>   10         70 0x00000000004004aa
>>>>>   11         72 0x00000000004004b9
>>>>>   12        END 0x00000000004004bc
>>>>>   13         76 0x00000000004004bc
>>>>>   14         78 0x00000000004004c0
>>>>>   15         80 0x00000000004004c5
>>>>>   16        END 0x00000000004004cc
>>>>>
>>>>> When calling find_pc_sect_line with the address 0x000000000040048e, in
>>>>> both cases we find entry #3, we then try to find the previous entry,
>>>>> which originally found this entry '2         52 0x000000000040048e',
>>>>> after the patch it finds '2         END 0x000000000040048e', which
>>>>> cases the lookup to fail.
>>>>>
>>>>> By skipping the END marker after this patch we get back to the correct
>>>>> entry, which is now #1: '1          52 0x000000000040048e', and
>>>>> everything works again.
>>>>
>>>> I start to suspect that you have been working around an incorrect line
>>>> table.
>>>>
>>>> Consider this bit:
>>>> ...
>>>>    0          48 0x0000000000400487
>>>>    1          52 0x000000000040048e
>>>>    2         END 0x000000000040048e
>>>> ...
>>>>
>>>> The end marker marks the address one past the end of the sequence.
>>>> Therefore, it makes no sense to have an entry in the sequence with the
>>>> same address as the end marker.
>>>>
>>>> [ dwarf doc:
>>>>
>>>> end_sequence:
>>>>
>>>> A boolean indicating that the current address is that of the first byte
>>>> after the end of a sequence of target machine instructions. end_sequence
>>>> terminates a sequence of lines; therefore other information in the same
>>>> row is not meaningful.
>>>>
>>>> DW_LNE_end_sequence:
>>>>
>>>> The DW_LNE_end_sequence opcode takes no operands. It sets the
>>>> end_sequence register of the state machine to “true” and appends a row
>>>> to the matrix using the current values of the state-machine registers.
>>>> Then it resets the registers to the initial values specified above (see
>>>> Section 6.2.2). Every line number program sequence must end with a
>>>> DW_LNE_end_sequence instruction which creates a row whose address is
>>>> that of the byte after the last target machine instruction of the sequence.
>>>>
>>>> ]
>>>>
>>>> The incorrect entry is generated by this dwarf assembler sequence:
>>>> ...
>>>>                 {DW_LNS_copy}
>>>>                 {DW_LNE_end_sequence}
>>>> ...
>>>>
>>>> I think we should probably fix the dwarf assembly test-cases.
>>>>
>>>> If we want to handle this in gdb, the thing that seems most logical to
>>>> me is to ignore this kind of entries.
>>>
>>> Hmm, that seems to be done already, in buildsym_compunit::record_line.
>>>
>>> Anyway, I was looking at the line table for
>>> gdb.dwarf2/dw2-ranges-base.exp, and got a line table with subsequent end
>>> markers:
>>> ...
>>> INDEX  LINE   ADDRESS            IS-STMT
>>> 0      31     0x00000000004004a7 Y
>>> 1      21     0x00000000004004ae Y
>>> 2      END    0x00000000004004ae Y
>>> 3      11     0x00000000004004ba Y
>>> 4      END    0x00000000004004ba Y
>>> 5      END    0x00000000004004c6 Y
>>> ...
>>>
>>> By using this patch:
>>> ...
>>> diff --git a/gdb/buildsym.c b/gdb/buildsym.c
>>> index 33bf6523e9..76f0b54ff6 100644
>>> --- a/gdb/buildsym.c
>>> +++ b/gdb/buildsym.c
>>> @@ -943,6 +943,10 @@ buildsym_compunit::end_symtab_with_blockvector
>>> (struct block *static_block,
>>>             = [] (const linetable_entry &ln1,
>>>                   const linetable_entry &ln2) -> bool
>>>               {
>>> +               if (ln1.pc == ln2.pc
>>> +                   && ((ln1.line == 0) != (ln2.line == 0)))
>>> +                 return ln1.line == 0 ? true : false;
>>> +
>>>                 return (ln1.pc < ln2.pc);
>>>               };
>>>
>>> ...
>>> I get the expected:
>>> ...
>>> INDEX  LINE   ADDRESS            IS-STMT
>>> 0      31     0x00000000004004a7 Y
>>> 1      END    0x00000000004004ae Y
>>> 2      21     0x00000000004004ae Y
>>> 3      END    0x00000000004004ba Y
>>> 4      11     0x00000000004004ba Y
>>> 5      END    0x00000000004004c6 Y
>>> ...
>>
>> Any comments on patch below?
> 
> Yes! Thank you for working on this.
> 
> This should go in, however, I'd like to tweak the commit message a bit
> please (see below).
> 
> Also, do you plan to include the revert of find_pc_sect_line from
> 3d92a3e313 in this patch - I think you should.
> 
> Thanks,
> Andrew
> 
>>
>> Thanks,
>> - Tom
>>
> 
>> [gdb/symtab] Fix line-table end-of-sequence sorting
>>
>> Consider test-case gdb.dwarf2/dw2-ranges-base.exp.  It has a line-table for
>> dw2-ranges-base.c like this:
>> ...
>>  Line Number Statements:
>>   [0x0000014e]  Extended opcode 2: set Address to 0x4004ba
>>   [0x00000159]  Advance Line by 10 to 11
>>   [0x0000015b]  Copy
>>   [0x0000015c]  Advance PC by 12 to 0x4004c6
>>   [0x0000015e]  Advance Line by 19 to 30
>>   [0x00000160]  Copy
>>   [0x00000161]  Extended opcode 1: End of Sequence
>>
>>   [0x00000164]  Extended opcode 2: set Address to 0x4004ae
>>   [0x0000016f]  Advance Line by 20 to 21
>>   [0x00000171]  Copy
>>   [0x00000172]  Advance PC by 12 to 0x4004ba
>>   [0x00000174]  Advance Line by 29 to 50
>>   [0x00000176]  Copy
>>   [0x00000177]  Extended opcode 1: End of Sequence
>>
>>   [0x0000017a]  Extended opcode 2: set Address to 0x4004a7
>>   [0x00000185]  Advance Line by 30 to 31
>>   [0x00000187]  Copy
>>   [0x00000188]  Advance PC by 7 to 0x4004ae
>>   [0x0000018a]  Advance Line by 39 to 70
>>   [0x0000018c]  Copy
>>   [0x0000018d]  Extended opcode 1: End of Sequence
>> ...
>>
>> The Copy followed by End-of-Sequence is as specified in the dwarf assembly,
>> but incorrect.  F.i., consider:
>> ...
>>   [0x0000015c]  Advance PC by 12 to 0x4004c6
>>   [0x0000015e]  Advance Line by 19 to 30
>>   [0x00000160]  Copy
>>   [0x00000161]  Extended opcode 1: End of Sequence
>> ...
>>
>> Both the Copy and the End-of-Sequence append a row to the matrix using the
>> same addres: 0x4004c6.  The Copy declares a target instruction at that
>> address.  The End-of-Sequence declares that the sequence ends before that
>> address.  It's a contradiction that the target instruction is both part of the
>> sequence (according to Copy) and not part of the sequence (according to
>> End-of-Sequence).
> 
> I don't want to argue about this specific test, but about the idea of
> an empty line occurring at the same address as an end of sequence
> marker.  This can happen due to compiler optimisation, though it is
> perfectly reasonable to suggest that in this case the compiler should
> be removing the line marker, this doesn't always happen.
> 
> I guess what I'm saying is that I think the case for this being
> obviously wrong is not quite as clear cut as you seem to imply>, but
> also, I don't think this is really relevant to this bug - so maybe we
> could just drop this part?
> 

To me it's obviously wrong, so it would help me if you explain your
interpretation of the dwarf standard on this topic.

I think though I may have indeed conflated two issues in my original
patch, so I've now got one patch that deals with the incorrect dwarf in
the dwarf test-cases (attached below), and I'll submit another dealing
with the regression.  I hope that will make the conversation easier.

Thanks,
- Tom