Message-ID: <14826a17-bf87-469b-a7ff-0273b5c390a4@simark.ca>
Date: Sat, 16 Mar 2024 12:33:38 -0400
Subject: Re: GDB 13.2: breakpoint at wrong line after unrelated change
To: psmith@gnu.org, gdb@sourceware.org
From: Simon Marchi

On 2024-03-15 18:19, Paul Smith wrote:
> On Fri, 2024-03-15 at 17:11 -0400, Paul Smith via Gdb wrote:
>> I can tell you that in the "good" binary case I can see that
>> amd64_tdep.c:amd64_skip_prologue() is invoked which invokes
>> symtab.c:skip_prologue_using_sal() as you suggested.  In fact, these
>> methods are called numerous times.
>>
>> In the "bad" binary case, neither of those methods is called, ever.
>> I put a gdb_printf() in both functions and in the "good" binary I see
>> probably 20 invocations between starting, setting the breakpoint,
>> running, and exiting: in the "bad" binary zero invocations.  I do see
>> that we definitely invoke set_gdbarch_skip_prologue() with the amd64
>> function pointer in both cases, so it's not that.
>
> More details, no answers.
>
> However, the problem is much deeper than some kind of incorrect
> computation of the prologue length.  It appears to be a major
> difference in the structure of the binary itself, which is weird.
>
> The difference happens in symtab.c:find_function_start_sal_1().  When
> this is called on the "good" binary,
> sal.symtab->compunit()->locations_valid() is 0 so we fall through to
> calling skip_prologue_sal().
>
> In the "bad" binary, locations_valid() returns 1 instead.  This sends
> us through this code starting at symtab.c:3607:
>
>   if (funfirstline && sal.symtab != NULL
>       && (sal.symtab->compunit ()->locations_valid ()
>           || sal.symtab->language () == language_asm))
>     {
>       struct gdbarch *gdbarch = sal.symtab->compunit ()->objfile ()->arch ();
>
>       sal.pc = func_addr;
>       if (gdbarch_skip_entrypoint_p (gdbarch))
>         sal.pc = gdbarch_skip_entrypoint (gdbarch, sal.pc);
>       return sal;
>     }
>
> thus returning early.  I've checked and gdb_arch_skip_entrypoint_p()
> returns null so gdbarch_skip_entrypoint() is not called.
>
> I've also verified that all other aspects of the above if-statement
> (funfirstline and sal.symtab->language()) are the same (1 and 4)
> between the good and bad calls.  The difference appears to be the
> return code of locations_valid().
>
>
> Looking into this it appears to be something set for the entire binary,
> differently between the "good" and "bad" binary.
>
> In the "good" binary we enter read.c:process_full_comp_unit() the
> passed-in dwarf2_cu value of has_loclist is false.  Because of that,
> this is not called:
>
>   if (cu->has_loclist && gcc_4_minor >= 5)
>     cust->set_locations_valid (true);
>
> and because this is not called, the locations_valid() return above is
> false.
>
> In the "bad" binary when we enter process_full_comp_unit(), the value
> of has_loclist is true.  Because of this we call
> cust->set_locations_valid(true) above, and this means locations_valid()
> returns true and we follow the alternate path when skip_prologue_sal()
> is called.
>
> I have to stop here for today but maybe I'll have more time later this
> weekend.
> If anyone has hints on how to determine why the settings of
> struct dwarf2_cu is different let me know.

Hi Paul,

I started to look at this problem this week, because I hit a case in my
own C++ program very similar to yours.  I didn't have time to finish my
reply, but my findings were very similar to yours.

When compiled with gcc 11, the prologue is skipped.  When compiled with
gcc 12 and 13, the prologue is not skipped.  All with -O0.

Here's my analysis (partly redundant with what you said):

First, what I see:

Here, GDB stopped at the very first instruction of the function.  The
arguments are wrong:

    (gdb) info args
    this = 0x3dd736
    msgType1 = ((anonymous namespace)::MsgType::MSG_ITER_INACTIVITY | unknown: 0x5554)
    msgType2 = (unknown: 0x555553f8)

If I step past the prologue, they become correct:

    (gdb) n
    183 const auto specTestName = makeSpecTestName(_mTestName, msgType1, msgType2);
    (gdb) info args
    this = 0x555555a65e40 <(anonymous namespace)::errorTestCases>
    msgType1 = (anonymous namespace)::MsgType::STREAM
    msgType2 = (anonymous namespace)::MsgType::STREAM

When the prologue is skipped in the gcc 11-compiled executable, we reach
the skip_prologue_sal function like this:

    #0  skip_prologue_sal (sal=0x7ffd145c64b0)
        at /home/smarchi/src/binutils-gdb/gdb/symtab.c:3852
    #1  0x0000561d4fa02155 in find_function_start_sal_1 (func_addr=1910486, section=0x561d5247d818, funfirstline=true)
        at /home/smarchi/src/binutils-gdb/gdb/symtab.c:3716
    #2  0x0000561d4fa0221e in find_function_start_sal (sym=0x561d52ed5900, funfirstline=true)
        at /home/smarchi/src/binutils-gdb/gdb/symtab.c:3744
    #3  0x0000561d4f7bc0e5 in symbol_to_sal (result=0x7ffd145c6570, funfirstline=1, sym=0x561d52ed5900)
        at /home/smarchi/src/binutils-gdb/gdb/linespec.c:4376
    #4  0x0000561d4f7b62f1 in convert_linespec_to_sals (state=0x7ffd145c69a0, ls=0x7ffd145c69f0)
        at /home/smarchi/src/binutils-gdb/gdb/linespec.c:2255
    #5  0x0000561d4f7b73a5 in parse_linespec (parser=0x7ffd145c6970, arg=0x7f002c018ce0 "_runOne",
        match_type=symbol_name_match_type::WILD)
        at /home/smarchi/src/binutils-gdb/gdb/linespec.c:2640
    #6  0x0000561d4f7b84e4 in location_spec_to_sals (parser=0x7ffd145c6970, locspec=0x561d52464650)
        at /home/smarchi/src/binutils-gdb/gdb/linespec.c:3080
    #7  0x0000561d4f7b890a in decode_line_full (locspec=0x561d52464650, flags=1, search_pspace=0x0,
        default_symtab=0x0, default_line=0, canonical=0x7ffd145c6e00, select_mode=0x0, filter=0x0)
        at /home/smarchi/src/binutils-gdb/gdb/linespec.c:3157
    #8  0x0000561d4f4989e9 in parse_breakpoint_sals (locspec=0x561d52464650, canonical=0x7ffd145c6e00)
        at /home/smarchi/src/binutils-gdb/gdb/breakpoint.c:8895
    #9  0x0000561d4f4a5077 in create_sals_from_location_spec_default (locspec=0x561d52464650, canonical=0x7ffd145c6e00)
        at /home/smarchi/src/binutils-gdb/gdb/breakpoint.c:13200
    #10 0x0000561d4f499a0f in create_breakpoint (gdbarch=0x561d5244cac0, locspec=0x561d52464650, cond_string=0x0,
        thread=-1, inferior=-1, extra_string=0x0, force_condition=false, parse_extra=1, tempflag=0,
        type_wanted=bp_breakpoint, ignore_count=0, pending_break_support=AUTO_BOOLEAN_AUTO,
        ops=0x561d501b5100, from_tty=1, enabled=1, internal=0, flags=0)
        at /home/smarchi/src/binutils-gdb/gdb/breakpoint.c:9230
    #11 0x0000561d4f49a4da in break_command_1 (arg=0x561d521e1d49 "", flag=0, from_tty=1)
        at /home/smarchi/src/binutils-gdb/gdb/breakpoint.c:9415

When the prologue is not skipped, with gcc 12 and 13, skip_prologue_sal
is never called.  Backtracking a bit, I found that in that case
find_function_start_sal_1 returns early due to
`sal.symtab->compunit ()->locations_valid ()` being true.

The locations_valid flag is set in the DWARF reader
(process_full_comp_unit function) whenever dwarf2_cu::has_loclist is
true (and gcc_4_minor >= 5).  has_loclist is set in var_decode_location
when processing a symbol whose location (DW_AT_location) is a loclist.

In my gcc 11-generated executable, I don't have a symbol whose location
is a loclist.
In my gcc 12 or 13-generated executable, I do:

    DW_AT_location [DW_FORM_sec_offset] (0x0000000c:
       [0x000000000024ed64, 0x000000000024ed7b): DW_OP_reg5 RDI
       [0x000000000024ed7b, 0x000000000024ee14): DW_OP_reg3 RBX
       [0x000000000024ee14, 0x000000000024ee15): DW_OP_entry_value(DW_OP_reg5 RDI), DW_OP_stack_value
       [0x000000000024ee15, 0x000000000024ef0b): DW_OP_reg3 RBX)

The reasoning behind this is explained here in process_full_comp_unit:

      /* GCC-4.0 has started to support -fvar-tracking.  GCC-3.x still can
         produce DW_AT_location with location lists but it can be possibly
         invalid without -fvar-tracking.  Still up to GCC-4.4.x incl. 4.4.0
         there were bugs in prologue debug info, fixed later in GCC-4.5
         by "unwind info for epilogues" patch (which is not directly related).

         For -gdwarf-4 type units LOCATIONS_VALID indication is fortunately not
         needed, it would be wrong due to missing DW_AT_producer there.

         Still one can confuse GDB by using non-standard GCC compilation
         options - this waits on GCC PR other/32998 (-frecord-gcc-switches).
         */
      if (cu->has_loclist && gcc_4_minor >= 5)
        cust->set_locations_valid (true);

So, as soon as it sees one loclist in the compilation unit, GDB assumes
that GCC has produced loclists that accurately describe variable values
everywhere, even in prologues.  This assumption is not true here.  The
locations for the two arguments I tried to print earlier are only valid
after the prologue, after the stack has been set up:

    0x00055daa:   DW_TAG_formal_parameter
                    DW_AT_name [DW_FORM_strp] ("msgType1")
                    DW_AT_location [DW_FORM_exprloc] (DW_OP_fbreg -1660)

    0x00055dba:   DW_TAG_formal_parameter
                    DW_AT_name [DW_FORM_strp] ("msgType2")
                    DW_AT_location [DW_FORM_exprloc] (DW_OP_fbreg -1664)

Simon
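
To make the two code paths concrete, here is a minimal sketch of the decision described above.  It is a simplified model, not GDB source: comp_unit_model, finish_comp_unit, breakpoint_pc and self_check are hypothetical names, the prologue length is passed in rather than computed, and how gcc_4_minor is derived from DW_AT_producer is not modeled.

```cpp
// Simplified model of the breakpoint-placement decision discussed in this
// thread.  NOT GDB source: all names below are hypothetical stand-ins.
#include <cassert>
#include <cstdint>

// Stand-in for the per-compilation-unit state involved.
struct comp_unit_model
{
  bool has_loclist = false;      // some DW_AT_location in the CU was a loclist
  int gcc_4_minor = -1;          // assumed input; derivation is not modeled
  bool locations_valid = false;  // models compunit ()->locations_valid ()
};

// Models the check quoted from process_full_comp_unit: one loclist in the
// CU (plus the producer check) marks locations as valid for the whole CU.
inline void
finish_comp_unit (comp_unit_model &cu)
{
  if (cu.has_loclist && cu.gcc_4_minor >= 5)
    cu.locations_valid = true;
}

// Models the early return in find_function_start_sal_1.  When locations are
// considered valid (or the language is asm), the breakpoint stays at the
// function's first instruction; otherwise the prologue is skipped.
inline std::uint64_t
breakpoint_pc (const comp_unit_model &cu, bool funfirstline, bool is_asm,
               std::uint64_t func_addr, std::uint64_t prologue_len)
{
  if (funfirstline && (cu.locations_valid || is_asm))
    return func_addr;                 // early return, no prologue skipping
  if (funfirstline)
    return func_addr + prologue_len;  // stands in for skip_prologue_sal
  return func_addr;
}

// Usage: a CU without loclists versus a CU with at least one loclist.
inline void
self_check ()
{
  comp_unit_model no_loclist;
  no_loclist.gcc_4_minor = 5;
  finish_comp_unit (no_loclist);
  assert (breakpoint_pc (no_loclist, true, false, 0x1000, 0x20) == 0x1020);

  comp_unit_model with_loclist;
  with_loclist.has_loclist = true;
  with_loclist.gcc_4_minor = 5;
  finish_comp_unit (with_loclist);
  assert (breakpoint_pc (with_loclist, true, false, 0x1000, 0x20) == 0x1000);
}
```

In this model, a single loclist anywhere in the compilation unit is enough to move the breakpoint from "after the prologue" to "first instruction", which is the behavior difference between the gcc 11 and gcc 12/13 executables.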