From: Simon Marchi <simark@simark.ca>
To: psmith@gnu.org, gdb@sourceware.org
Subject: Re: GDB 13.2: breakpoint at wrong line after unrelated change
Date: Sat, 16 Mar 2024 12:33:38 -0400
Message-ID: <14826a17-bf87-469b-a7ff-0273b5c390a4@simark.ca>
In-Reply-To: <ebfc38cb4dad4b691ea231ea5b708e2fb5b8152e.camel@gnu.org>



On 2024-03-15 18:19, Paul Smith wrote:
> On Fri, 2024-03-15 at 17:11 -0400, Paul Smith via Gdb wrote:
>> I can tell you that in the "good" binary case I can see that
>> amd64_tdep.c:amd64_skip_prologue() is invoked which invokes
>> symtab.c:skip_prologue_using_sal() as you suggested.  In fact, these
>> methods are called numerous times.
>>
>> In the "bad" binary case, neither of those methods is called, ever. 
>> I put a gdb_printf() in both functions and in the "good" binary I see
>> probably 20 invocations between starting, setting the breakpoint,
>> running, and exiting: in the "bad" binary zero invocations.  I do see
>> that we definitely invoke set_gdbarch_skip_prologue() with the amd64
>> function pointer in both cases, so it's not that.
> 
> More details, no answers.
> 
> However, the problem is much deeper than some kind of incorrect
> computation of the prologue length.  It appears to be a major
> difference in the structure of the binary itself, which is weird.
> 
> The difference happens in symtab.c:find_function_start_sal_1().  When
> this is called on the "good" binary,
> sal.symtab->compunit()->locations_valid() is 0 so we fall through to
> calling skip_prologue_sal().
> 
> In the "bad" binary, locations_valid() returns 1 instead.  This sends
> us through this code starting at symtab.c:3607:
> 
>   if (funfirstline && sal.symtab != NULL
>       && (sal.symtab->compunit ()->locations_valid ()
>           || sal.symtab->language () == language_asm))
>     {
>       struct gdbarch *gdbarch = sal.symtab->compunit ()->objfile ()->arch ();
> 
>       sal.pc = func_addr;
>       if (gdbarch_skip_entrypoint_p (gdbarch))
>         sal.pc = gdbarch_skip_entrypoint (gdbarch, sal.pc);
>       return sal;
>     }
> 
> thus returning early.  I've checked and gdbarch_skip_entrypoint_p()
> returns false so gdbarch_skip_entrypoint() is not called.
> 
> I've also verified that all other aspects of the above if-statement
> (funfirstline and sal.symtab->language()) are the same (1 and 4)
> between the good and bad calls.  The difference appears to be the
> return code of locations_valid().
> 
> 
> Looking into this it appears to be something set for the entire binary,
> differently between the "good" and "bad" binary.
> 
> In the "good" binary, when we enter read.c:process_full_comp_unit(),
> the has_loclist member of the passed-in dwarf2_cu is false.  Because
> of that, this is not called:
> 
>       if (cu->has_loclist && gcc_4_minor >= 5)
>         cust->set_locations_valid (true);
> 
> and because this is not called, the locations_valid() return above is
> false.
> 
> In the "bad" binary when we enter process_full_comp_unit(), the value
> of has_loclist is true.  Because of this we call
> cust->set_locations_valid(true) above, and this means locations_valid()
> returns true and we take the early-return path instead of calling
> skip_prologue_sal().
> 
> I have to stop here for today but maybe I'll have more time later this
> weekend.  If anyone has hints on how to determine why the settings of
> struct dwarf2_cu are different let me know.

Hi Paul,

I started to look at this problem this week, because I hit a case in my
own C++ program very similar to yours.  I didn't have time to finish my
reply, but my findings were very similar to yours.  When compiled with
gcc 11, the prologue is skipped.  When compiled with gcc 12 and 13, the
prologue is not skipped.  All with -O0.

Here's my analysis (partly redundant with what you said):

First, what I see:

Here, GDB stopped at the very first instruction of the function.  The
arguments are wrong:

    (gdb) info args
    this = 0x3dd736
    msgType1 = ((anonymous namespace)::MsgType::MSG_ITER_INACTIVITY | unknown: 0x5554)
    msgType2 = (unknown: 0x555553f8)

If I step past the prologue, they become correct:

    (gdb) n
    183         const auto specTestName = makeSpecTestName(_mTestName, msgType1, msgType2);
    (gdb) info args
    this = 0x555555a65e40 <(anonymous namespace)::errorTestCases>
    msgType1 = (anonymous namespace)::MsgType::STREAM
    msgType2 = (anonymous namespace)::MsgType::STREAM

When the prologue is skipped in the gcc 11-compiled executable, we reach
the skip_prologue_sal function like this:

    #0  skip_prologue_sal (sal=0x7ffd145c64b0) at /home/smarchi/src/binutils-gdb/gdb/symtab.c:3852
    #1  0x0000561d4fa02155 in find_function_start_sal_1 (func_addr=1910486, section=0x561d5247d818, funfirstline=true)
        at /home/smarchi/src/binutils-gdb/gdb/symtab.c:3716
    #2  0x0000561d4fa0221e in find_function_start_sal (sym=0x561d52ed5900, funfirstline=true)
        at /home/smarchi/src/binutils-gdb/gdb/symtab.c:3744
    #3  0x0000561d4f7bc0e5 in symbol_to_sal (result=0x7ffd145c6570, funfirstline=1, sym=0x561d52ed5900)
        at /home/smarchi/src/binutils-gdb/gdb/linespec.c:4376
    #4  0x0000561d4f7b62f1 in convert_linespec_to_sals (state=0x7ffd145c69a0, ls=0x7ffd145c69f0)
        at /home/smarchi/src/binutils-gdb/gdb/linespec.c:2255
    #5  0x0000561d4f7b73a5 in parse_linespec (parser=0x7ffd145c6970, arg=0x7f002c018ce0 "_runOne",
        match_type=symbol_name_match_type::WILD) at /home/smarchi/src/binutils-gdb/gdb/linespec.c:2640
    #6  0x0000561d4f7b84e4 in location_spec_to_sals (parser=0x7ffd145c6970, locspec=0x561d52464650)
        at /home/smarchi/src/binutils-gdb/gdb/linespec.c:3080
    #7  0x0000561d4f7b890a in decode_line_full (locspec=0x561d52464650, flags=1, search_pspace=0x0, default_symtab=0x0, default_line=0,
        canonical=0x7ffd145c6e00, select_mode=0x0, filter=0x0) at /home/smarchi/src/binutils-gdb/gdb/linespec.c:3157
    #8  0x0000561d4f4989e9 in parse_breakpoint_sals (locspec=0x561d52464650, canonical=0x7ffd145c6e00)
        at /home/smarchi/src/binutils-gdb/gdb/breakpoint.c:8895
    #9  0x0000561d4f4a5077 in create_sals_from_location_spec_default (locspec=0x561d52464650, canonical=0x7ffd145c6e00)
        at /home/smarchi/src/binutils-gdb/gdb/breakpoint.c:13200
    #10 0x0000561d4f499a0f in create_breakpoint (gdbarch=0x561d5244cac0, locspec=0x561d52464650, cond_string=0x0, thread=-1, inferior=-1,
        extra_string=0x0, force_condition=false, parse_extra=1, tempflag=0, type_wanted=bp_breakpoint, ignore_count=0,
        pending_break_support=AUTO_BOOLEAN_AUTO, ops=0x561d501b5100 <code_breakpoint_ops>, from_tty=1, enabled=1, internal=0, flags=0)
        at /home/smarchi/src/binutils-gdb/gdb/breakpoint.c:9230
    #11 0x0000561d4f49a4da in break_command_1 (arg=0x561d521e1d49 "", flag=0, from_tty=1)
        at /home/smarchi/src/binutils-gdb/gdb/breakpoint.c:9415

When the prologue is not skipped, with gcc 12 and 13, skip_prologue_sal
is never called.  Backtracking a bit, I found that in that case
find_function_start_sal_1 returns early due to `sal.symtab->compunit
()->locations_valid ()` being true.  The locations_valid flag is set in
the DWARF reader (process_full_comp_unit function) whenever
dwarf2_cu::has_loclist is true.  That is set in var_decode_location when
processing a symbol whose location (DW_AT_location) is a loclist.
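
To make that chain easier to follow, here is a minimal toy model of it.  It
is only a sketch: ToySymbol, ToyCompUnit, producer_ok and breakpoint_pc are
hypothetical names invented for illustration, not GDB's real types or
functions, and prologue_len stands in for whatever skip_prologue_sal would
compute.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Toy stand-ins only; none of these are GDB's real classes.
enum class LocationKind { Expression, LocationList };

struct ToySymbol {
  LocationKind location_kind;
};

struct ToyCompUnit {
  std::vector<ToySymbol> symbols;
  // Hypothetical flag standing in for the producer check (gcc_4_minor >= 5).
  bool producer_ok = true;
};

// A single location list anywhere in the CU marks the whole CU as having
// valid locations, mirroring the has_loclist -> locations_valid chain.
bool locations_valid(const ToyCompUnit &cu) {
  bool has_loclist = false;
  for (const ToySymbol &sym : cu.symbols)
    if (sym.location_kind == LocationKind::LocationList)
      has_loclist = true;
  return has_loclist && cu.producer_ok;
}

// Address a "break FUNCTION" breakpoint lands on in this model.
std::uint64_t breakpoint_pc(const ToyCompUnit &cu, std::uint64_t func_addr,
                            std::uint64_t prologue_len) {
  if (locations_valid(cu))
    return func_addr;               // early return: prologue is not skipped
  return func_addr + prologue_len;  // stands in for skip_prologue_sal
}
```

In this model, a CU containing only expression locations gets its breakpoint
after the prologue, while adding one symbol with a location list moves the
breakpoint back to the function's first instruction.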

In my gcc 11-generated executable, I don't have a symbol whose location
is a loclist.  In my gcc 12 or 13-generated executable, I do:

    DW_AT_location [DW_FORM_sec_offset]   (0x0000000c:
       [0x000000000024ed64, 0x000000000024ed7b): DW_OP_reg5 RDI
       [0x000000000024ed7b, 0x000000000024ee14): DW_OP_reg3 RBX
       [0x000000000024ee14, 0x000000000024ee15): DW_OP_entry_value(DW_OP_reg5 RDI), DW_OP_stack_value
       [0x000000000024ee15, 0x000000000024ef0b): DW_OP_reg3 RBX)
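
As a reading aid, a location list like this one can be thought of as a table
of half-open PC ranges, each with its own location expression.  The sketch
below looks up the entry covering a given PC.  LocEntry, location_at and
example_loclist are hypothetical names for illustration, and the expressions
are kept as plain strings rather than evaluated.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// One entry of a location list: valid for PCs in [low, high).
struct LocEntry {
  std::uint64_t low;   // inclusive
  std::uint64_t high;  // exclusive
  std::string expr;    // location expression, kept as text in this sketch
};

// Return the entry covering PC, or nullptr if no entry covers it.
const LocEntry *location_at(const std::vector<LocEntry> &list,
                            std::uint64_t pc) {
  for (const LocEntry &e : list)
    if (pc >= e.low && pc < e.high)
      return &e;
  return nullptr;
}

// The ranges from the dump above, transcribed by hand.
const std::vector<LocEntry> example_loclist = {
  {0x24ed64, 0x24ed7b, "DW_OP_reg5 RDI"},
  {0x24ed7b, 0x24ee14, "DW_OP_reg3 RBX"},
  {0x24ee14, 0x24ee15, "DW_OP_entry_value(DW_OP_reg5 RDI), DW_OP_stack_value"},
  {0x24ee15, 0x24ef0b, "DW_OP_reg3 RBX"},
};
```

For example, location_at (example_loclist, 0x24ed64) yields the RDI entry,
while a PC of 0x24ef0b falls outside every range and yields nullptr.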

The reasoning behind this is explained here in process_full_comp_unit:

      /* GCC-4.0 has started to support -fvar-tracking.  GCC-3.x still can
	 produce DW_AT_location with location lists but it can be possibly
	 invalid without -fvar-tracking.  Still up to GCC-4.4.x incl. 4.4.0
	 there were bugs in prologue debug info, fixed later in GCC-4.5
	 by "unwind info for epilogues" patch (which is not directly related).

	 For -gdwarf-4 type units LOCATIONS_VALID indication is fortunately not
	 needed, it would be wrong due to missing DW_AT_producer there.

	 Still one can confuse GDB by using non-standard GCC compilation
	 options - this waits on GCC PR other/32998 (-frecord-gcc-switches).
	 */
      if (cu->has_loclist && gcc_4_minor >= 5)
	cust->set_locations_valid (true);

So, as soon as it sees one loclist in the compilation unit, GDB assumes
that GCC has produced location information that accurately describes
variable values everywhere in that unit, even in prologues.  This
assumption is not true here.  The locations for the two arguments I tried
to print earlier are only valid after the prologue, after the stack has
been set up:


0x00055daa:     DW_TAG_formal_parameter
                  DW_AT_name [DW_FORM_strp]     ("msgType1")
                  DW_AT_location [DW_FORM_exprloc]      (DW_OP_fbreg -1660)

0x00055dba:     DW_TAG_formal_parameter
                  DW_AT_name [DW_FORM_strp]     ("msgType2")
                  DW_AT_location [DW_FORM_exprloc]      (DW_OP_fbreg -1664)
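
To illustrate why a fixed DW_OP_fbreg location can give garbage at the first
instruction, here is a toy model.  It assumes (this is my assumption about
typical -O0 code, not something taken from the dumps above) that the argument
arrives in a register and that the prologue copies it into its stack slot.
ToyFrame, read_fbreg and run_prologue are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>
#include <map>

// Toy frame: not a real unwinder, just enough state for the illustration.
struct ToyFrame {
  std::int64_t frame_base = 0;                  // base that DW_OP_fbreg offsets apply to
  std::uint32_t arg_register = 0;               // incoming argument, still in a register
  std::map<std::int64_t, std::uint32_t> stack;  // address -> 32-bit slot content
};

// DW_OP_fbreg OFFSET: read the slot at frame_base + OFFSET.
// Throws std::out_of_range if the slot was never written in this model.
std::uint32_t read_fbreg(const ToyFrame &frame, std::int64_t offset) {
  return frame.stack.at(frame.frame_base + offset);
}

// Assumed prologue behaviour: spill the register argument to its stack slot.
void run_prologue(ToyFrame &frame, std::int64_t offset) {
  frame.stack[frame.frame_base + offset] = frame.arg_register;
}
```

Reading the slot at offset -1660 before run_prologue returns whatever stale
value was there; reading it afterwards returns the argument.  That matches
the symptom of "info args" being wrong at the first instruction and correct
after one "next".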

Simon
