[ was: Re: [PATCH][gdb/symtab] Fix line-table end-of-sequence sorting ] On 06-06-2020 11:25, Andrew Burgess wrote: > * Tom de Vries [2020-06-06 01:44:42 +0200]: > >> [ was: Re: [PATCH 2/3] gdb: Don't reorder line table entries too much >> when sorting. ] >> >> On 05-06-2020 18:00, Tom de Vries wrote: >>> On 05-06-2020 16:49, Tom de Vries wrote: >>>> On 23-12-2019 02:51, Andrew Burgess wrote: >>>>> I had to make a small adjustment in find_pc_sect_line in order to >>>>> correctly find the previous line in the line table. In some line >>>>> tables I was seeing an actual line entry and an end of sequence marker >>>>> at the same address, before this commit these would reorder to move >>>>> the end of sequence marker before the line entry (end of sequence has >>>>> line number 0). Now the end of sequence marker remains in its correct >>>>> location, and in order to find a previous line we should step backward >>>>> over any end of sequence markers. >>>>> >>>>> As an example, the binary: >>>>> gdb/testsuite/outputs/gdb.dwarf2/dw2-ranges-func/dw2-ranges-func-lo-cold >>>>> >>>>> Has this line table before the patch: >>>>> >>>>> INDEX LINE ADDRESS >>>>> 0 48 0x0000000000400487 >>>>> 1 END 0x000000000040048e >>>>> 2 52 0x000000000040048e >>>>> 3 54 0x0000000000400492 >>>>> 4 56 0x0000000000400497 >>>>> 5 END 0x000000000040049a >>>>> 6 62 0x000000000040049a >>>>> 7 END 0x00000000004004a1 >>>>> 8 66 0x00000000004004a1 >>>>> 9 68 0x00000000004004a5 >>>>> 10 70 0x00000000004004aa >>>>> 11 72 0x00000000004004b9 >>>>> 12 END 0x00000000004004bc >>>>> 13 76 0x00000000004004bc >>>>> 14 78 0x00000000004004c0 >>>>> 15 80 0x00000000004004c5 >>>>> 16 END 0x00000000004004cc >>>>> >>>>> And after this patch: >>>>> >>>>> INDEX LINE ADDRESS >>>>> 0 48 0x0000000000400487 >>>>> 1 52 0x000000000040048e >>>>> 2 END 0x000000000040048e >>>>> 3 54 0x0000000000400492 >>>>> 4 56 0x0000000000400497 >>>>> 5 END 0x000000000040049a >>>>> 6 62 0x000000000040049a >>>>> 7 66 0x00000000004004a1 >>>>> 8 END 0x00000000004004a1 >>>>> 9 68 0x00000000004004a5 >>>>> 10 70 0x00000000004004aa >>>>> 11 72 0x00000000004004b9 >>>>> 12 END 0x00000000004004bc >>>>> 13 76 0x00000000004004bc >>>>> 14 78 0x00000000004004c0 >>>>> 15 80 0x00000000004004c5 >>>>> 16 END 0x00000000004004cc >>>>> >>>>> When calling find_pc_sect_line with the address 0x000000000040048e, in >>>>> both cases we find entry #3, we then try to find the previous entry, >>>>> which originally found this entry '2 52 0x000000000040048e', >>>>> after the patch it finds '2 END 0x000000000040048e', which >>>>> cases the lookup to fail. >>>>> >>>>> By skipping the END marker after this patch we get back to the correct >>>>> entry, which is now #1: '1 52 0x000000000040048e', and >>>>> everything works again. >>>> >>>> I start to suspect that you have been working around an incorrect line >>>> table. >>>> >>>> Consider this bit: >>>> ... >>>> 0 48 0x0000000000400487 >>>> 1 52 0x000000000040048e >>>> 2 END 0x000000000040048e >>>> ... >>>> >>>> The end marker marks the address one past the end of the sequence. >>>> Therefore, it makes no sense to have an entry in the sequence with the >>>> same address as the end marker. >>>> >>>> [ dwarf doc: >>>> >>>> end_sequence: >>>> >>>> A boolean indicating that the current address is that of the first byte >>>> after the end of a sequence of target machine instructions. end_sequence >>>> terminates a sequence of lines; therefore other information in the same >>>> row is not meaningful. >>>> >>>> DW_LNE_end_sequence: >>>> >>>> The DW_LNE_end_sequence opcode takes no operands. It sets the >>>> end_sequence register of the state machine to “true” and appends a row >>>> to the matrix using the current values of the state-machine registers. >>>> Then it resets the registers to the initial values specified above (see >>>> Section 6.2.2). Every line number program sequence must end with a >>>> DW_LNE_end_sequence instruction which creates a row whose address is >>>> that of the byte after the last target machine instruction of the sequence. >>>> >>>> ] >>>> >>>> The incorrect entry is generated by this dwarf assembler sequence: >>>> ... >>>> {DW_LNS_copy} >>>> {DW_LNE_end_sequence} >>>> ... >>>> >>>> I think we should probably fix the dwarf assembly test-cases. >>>> >>>> If we want to handle this in gdb, the thing that seems most logical to >>>> me is to ignore this kind of entries. >>> >>> Hmm, that seems to be done already, in buildsym_compunit::record_line. >>> >>> Anyway, I was looking at the line table for >>> gdb.dwarf2/dw2-ranges-base.exp, and got a line table with subsequent end >>> markers: >>> ... >>> INDEX LINE ADDRESS IS-STMT >>> 0 31 0x00000000004004a7 Y >>> 1 21 0x00000000004004ae Y >>> 2 END 0x00000000004004ae Y >>> 3 11 0x00000000004004ba Y >>> 4 END 0x00000000004004ba Y >>> 5 END 0x00000000004004c6 Y >>> ... >>> >>> By using this patch: >>> ... >>> diff --git a/gdb/buildsym.c b/gdb/buildsym.c >>> index 33bf6523e9..76f0b54ff6 100644 >>> --- a/gdb/buildsym.c >>> +++ b/gdb/buildsym.c >>> @@ -943,6 +943,10 @@ buildsym_compunit::end_symtab_with_blockvector >>> (struct block *static_block, >>> = [] (const linetable_entry &ln1, >>> const linetable_entry &ln2) -> bool >>> { >>> + if (ln1.pc == ln2.pc >>> + && ((ln1.line == 0) != (ln2.line == 0))) >>> + return ln1.line == 0 ? true : false; >>> + >>> return (ln1.pc < ln2.pc); >>> }; >>> >>> ... >>> I get the expected: >>> ... >>> INDEX LINE ADDRESS IS-STMT >>> 0 31 0x00000000004004a7 Y >>> 1 END 0x00000000004004ae Y >>> 2 21 0x00000000004004ae Y >>> 3 END 0x00000000004004ba Y >>> 4 11 0x00000000004004ba Y >>> 5 END 0x00000000004004c6 Y >>> ... >> >> Any comments on patch below? > > Yes! Thank you for working on this. > > This should go in, however, I'd like to tweak the commit message a bit > please (see below). > > Also, do you plan to include the revert of find_pc_sect_line from > 3d92a3e313 in this patch - I think you should. > > Thanks, > Andrew > >> >> Thanks, >> - Tom >> > >> [gdb/symtab] Fix line-table end-of-sequence sorting >> >> Consider test-case gdb.dwarf2/dw2-ranges-base.exp. It has a line-table for >> dw2-ranges-base.c like this: >> ... >> Line Number Statements: >> [0x0000014e] Extended opcode 2: set Address to 0x4004ba >> [0x00000159] Advance Line by 10 to 11 >> [0x0000015b] Copy >> [0x0000015c] Advance PC by 12 to 0x4004c6 >> [0x0000015e] Advance Line by 19 to 30 >> [0x00000160] Copy >> [0x00000161] Extended opcode 1: End of Sequence >> >> [0x00000164] Extended opcode 2: set Address to 0x4004ae >> [0x0000016f] Advance Line by 20 to 21 >> [0x00000171] Copy >> [0x00000172] Advance PC by 12 to 0x4004ba >> [0x00000174] Advance Line by 29 to 50 >> [0x00000176] Copy >> [0x00000177] Extended opcode 1: End of Sequence >> >> [0x0000017a] Extended opcode 2: set Address to 0x4004a7 >> [0x00000185] Advance Line by 30 to 31 >> [0x00000187] Copy >> [0x00000188] Advance PC by 7 to 0x4004ae >> [0x0000018a] Advance Line by 39 to 70 >> [0x0000018c] Copy >> [0x0000018d] Extended opcode 1: End of Sequence >> ... >> >> The Copy followed by End-of-Sequence is as specified in the dwarf assembly, >> but incorrect. F.i., consider: >> ... >> [0x0000015c] Advance PC by 12 to 0x4004c6 >> [0x0000015e] Advance Line by 19 to 30 >> [0x00000160] Copy >> [0x00000161] Extended opcode 1: End of Sequence >> ... >> >> Both the Copy and the End-of-Sequence append a row to the matrix using the >> same addres: 0x4004c6. The Copy declares a target instruction at that >> address. The End-of-Sequence declares that the sequence ends before that >> address. It's a contradiction that the target instruction is both part of the >> sequence (according to Copy) and not part of the sequence (according to >> End-of-Sequence). > > I don't want to argue about this specific test, but about the idea of > an empty line occurring at the same address as an end of sequence > marker. This can happen due to compiler optimisation, though it is > perfectly reasonable to suggest that in this case the compiler should > be removing the line marker, this doesn't always happen. > > I guess what I'm saying is that I think the case for this being > obviously wrong is not quite as clear cut as you seem to imply>, but > also, I don't think this is really relevant to this bug - so maybe we > could just drop this part? > To me it's obviously wrong, so it would help me if you explain your interpretation of the dwarf standard on this topic. I think though I may have indeed conflated two issues in my original patch, so I've now got one patch that deals with the incorrect dwarf in the dwarf test-cases (attached below), and I'll submit another dealing with the regression. I hope that will make the conversation easier. Thanks, - Tom