From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from eggs.gnu.org (eggs.gnu.org [IPv6:2001:470:142:3::10]) by sourceware.org (Postfix) with ESMTPS id E74A5385740D for ; Fri, 20 May 2022 06:45:33 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org E74A5385740D Received: from fencepost.gnu.org ([2001:470:142:3::e]:49428) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1nrwOD-0007XV-83; Fri, 20 May 2022 02:45:33 -0400 Received: from [87.69.77.57] (port=3383 helo=home-c4e4a596f7) by fencepost.gnu.org with esmtpsa (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1nrwO8-00017v-9N; Fri, 20 May 2022 02:45:32 -0400 Date: Fri, 20 May 2022 09:45:25 +0300 Message-Id: <834k1kd7ne.fsf@gnu.org> From: Eli Zaretskii To: Pedro Alves Cc: gdb-patches@sourceware.org In-Reply-To: <20220519215552.3254012-2-pedro@palves.net> (message from Pedro Alves on Thu, 19 May 2022 22:55:51 +0100) Subject: Re: [PATCH 1/2] Always show locations for breakpoints & show canonical location spec References: <20220519215552.3254012-1-pedro@palves.net> <20220519215552.3254012-2-pedro@palves.net> X-Spam-Status: No, score=1.6 required=5.0 tests=BAYES_00, DKIMWL_WL_HIGH, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, RCVD_IN_BARRACUDACENTRAL, SPF_HELO_PASS, SPF_PASS, TXREP, T_SCC_BODY_TEXT_LINE autolearn=no autolearn_force=no version=3.4.6 X-Spam-Level: * X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gdb-patches@sourceware.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gdb-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 20 May 2022 06:45:36 -0000 > From: Pedro Alves > Date: Thu, 19 May 2022 22:55:51 +0100 > > Another interesting case where this commit helps is when the user sets > a breakpoint by line number, and the line corresponds to a comment or > empty line, or to code that was optimized away. E.g., again when > debugging GDB: > > (top-gdb) b 27 > Breakpoint 4 at 0x469aa6: file src/gdb/gdb.c, line 28. > > Note I asked for line 27 but got line 28. > > "info breakpoints" currently says: > > (top-gdb) info breakpoints 4 > Num Type Disp Enb Address What > 4 breakpoint keep y 0x0000000000469aa6 in main(int, char**) at src/gdb/gdb.c:28 > > Again here we lost the info about the original location spec, line 27. > While after this commit, we get: > > (top-gdb) info breakpoints 4 > Num Type Disp Enb Address What > 4 breakpoint keep y src/gdb/gdb.c:27 > 4.1 y 0x0000000000469aa6 in main(int, char**) at src/gdb/gdb.c:28 Wow! that is soooo confusing. If we want to have this feature, we've got to put some explanation for the difference in line numbers (which could be much more than one line), as partr of the display itself. Something like "(comment line)" after gdb.c:27, for example. I would even find it confusing to see 4.1 location in this case, since it's clear that only a single location is required. I realize now that every breakpoint will from now on have its N.1 line displayed, which is also a change to the worse, IMO, given the multiple-location breakpoints we are/were used to until now. > +@item break @var{location_spec} > +Set a breakpoint at all the locations the given @var{location_spec} > +resolves to. @var{location_spec} can specify a function name, a line > +number, an address of an instruction, and more. @xref{Specify > +Location}, for a list of all the possible ways to specify a > +@var{location}. The breakpoint will stop your program just before it > +executes any of the instructions at the resolved locations' addresses. > + > +When using source languages that permit overloading of symbols, such > +as C@t{++}, a function name may refer to more than one symbol, and > +thus more than one place to break. @xref{Ambiguous > +Expressions,,Ambiguous Expressions}, for a discussion of that > +situation. > + > +It is possible that a given function is instantiated more than once in > +the program, resulting in a breakpoint with multiple resolved > +locations. This can happen e.g., with inlined functions, or template > +functions. This text was written when we showed N.x locations only when needed, so the text means to explain when a breakpoint will have several locations instead of just one. But with this changeset, we will be displaying multiple locations _always_, so this text is no longer enough. We should augment it with the background for multiple shown locations even in the cases where the underlying breakpoint is inserted only at a single address, such as your example of setting a breakpoint at a location that doesn't have any corresponding code address. > +A breakpoint location specification (@pxref{Specify Location}) may be > +resolved to several locations in your program. E.g., @code{break > +func} will find a location for each function named @code{func} in the > +program. Again, this "e.g." is incomplete and misses the frequent case, where there will be always 2 "locations", possibly identical, shown in the table. > For each code breakpoint and tracepoint (not for > +watchpoints, nor catchpoints), @value{GDBN} prints a header entry, and > +then one entry for each resolved location of the breakpoint. Likewise: "one entry" will be misleading with the new display, because we will always show at least 2. > +Enabled breakpoints and breakpoint locations are marked with @samp{y}. What does "enabled breakpoint location" mean? One cannot enable a location. We should find some different terminology here for the N.x breakpoints, maybe something like "breakpoint" for the N.x and "breakpoint group" for the "parent" breakpoint? You seem to use "breakpoint header" below, but that term is not explained and not used consistently. Nor is it a "header", strictly speaking: it's conceptually an item that represents a group of actual breakpoints. > +For a breakpoint whose location specification hasn't been resolved to > +any location yet, this field will contain @samp{}. Such > +breakpoint won't fire until its location spec is resolved to an actual > +location, such as e.g., when a shared library that has the symbol or ^^^^^^^^ This should be "address", to avoid even more confusion wrt "location". (After reading more of your text, I realize that this is the tip of a very large iceberg, see below.) > +For a breakpoint header row, the original location specification > +passed to the breakpoint command. For a breakpoint location row, > +where the breakpoint location is in the source for your program, as a > +file and line number. This should explain the crucial difference between "the original location specification" and "breakpoint location" used elsewhere in the text. It is entirely not trivial to grasp the importance of that distinction. (Another tip of the iceberg.) > +@value{GDBN} may add new locations to existing breakpoints when new > +symbols are loaded. For example, if you have a breakpoint in a > +C@t{++} template function, and a newly loaded shared library has an > +instantiation of that template, a new location is added to the list of > +locations for the breakpoint. More on this further below. I fail to see why this paragraph is important to have. What it says is trivial, and having it here makes a complex description harder to understand, because it interrupts the description with unimportant details. > +Regulars code breakpoints and tracepoints are displayed in the ^^^^^^^^ Typo. > +breakpoint table using several rows---one header row per breakpoint, > +followed by one row for each of the breakpoint's locations. The > +header row has an empty address column, and the original location > +specification in the what column. The rows for individual locations > +contain the addresses for the locations, and show the functions to > +which those addresses belong. The number column for a location is > +indented by one space to the right for grouping, and is of the form > @var{breakpoint-number}.@var{location-number}. IMO, something like this text should precede the description of the table contents, because it both provides a high-level overview of how each breakpoint is displayed, and defines terminology like "header row", "breakpoint location row", etc. Having it here is "too late". > set a breakpoint in a shared library at the beginning of your > debugging session, when the library is not loaded, and when the > symbols from the library are not available. When you try to set > -breakpoint, @value{GDBN} will ask you if you want to set > -a so called @dfn{pending breakpoint}---breakpoint whose address > -is not yet resolved. > +a breakpoint, @value{GDBN} will ask you for confirmation of whether > +you want to set a so called @dfn{pending breakpoint}---a breakpoint with > +no resolved locations yet. I'm not sure it is a good idea to talk about "resolved locations", here and elsewhere. Why not "resolved addresses", as the original text did? Using "location" here introduces a conceptual ambiguity, whereby a "location" could be in the source code and also in the executable code, and that overloading of concepts makes this complex issue even harder to understand. My suggestion is to use "location" for source-level location specs, and for nothing else. It sounds like you want to use the term "breakpoint location" to refer to those N.x "instances" of the breakpoints, and that is the root cause of the above terminology overloading. If that is the case, let's find a better term for "breakpoint location" (which in itself is a vague and problematic term, since there are actually 2 locations involved: the source-level location and the address in the code). > After the program is run, whenever a new shared library is loaded, > -@value{GDBN} reevaluates all the breakpoints. When a newly loaded > -shared library contains the symbol or line referred to by some > -pending breakpoint, that breakpoint is resolved and becomes an > -ordinary breakpoint. When a library is unloaded, all breakpoints > -that refer to its symbols or source lines become pending again. > - > -This logic works for breakpoints with multiple locations, too. For > -example, if you have a breakpoint in a C@t{++} template function, and > -a newly loaded shared library has an instantiation of that template, > -a new location is added to the list of locations for the breakpoint. > - > -Except for having unresolved address, pending breakpoints do not > +@value{GDBN} reevaluates all the breakpoints' location specs. When a > +newly loaded shared library contains the symbol or line referred to by > +some pending breakpoint, a location for the breakpoint is found and > +the breakpoint becomes an ordinary breakpoint. When a library is > +unloaded, breakpoint locations that refer to its symbols or source > +lines are not deleted -- they are instead displayed with a > +@code{} address. This is so that if the shared library is > +loaded again, the location's enabled/disabled state is preserved. This text has the same terminology difficulties I mentioned above. > @value{GDBN} provides some additional commands for controlling what > -happens when the @samp{break} command cannot resolve breakpoint > -address specification to an address: > +happens when the @samp{break} command cannot resolve its breakpoint > +location specification to any location: The "resolve breakpoint location specification to any location" part uses "location" in two very different senses in the same sentence -- do you see how confusing and hard to understand this is? > + When @value{GDBN} cannot find any > +location for the location specification, it queries you whether a Likewise here. > +This indicates that when @value{GDBN} cannot resolve any location for > +the location specification, it should create a pending breakpoint And here. And elsewhere -- if we don't find a good solution for this problematic terminology, we will bump into such problems and confuse the heck out of our users from here to eternity. > +This indicates that pending breakpoints are not to be created. If > +@value{GDBN} cannot resolve any location for the location > +specification, @value{GDBN} throws an error. I wouldn't use "throws an error" in describing user-facing features. "Displays an error message", perhaps? > +breakpoint; This field will not be present if no address can be ^^^ Typo(s). > +This field is present for ordinary breakpoints and tracepoints. What is an "ordinary" breakpoint, and how is it different from the other kinds? (There are other places in the new text that use this unexplained terminology.) Thanks.