From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from eggs.gnu.org (eggs.gnu.org [IPv6:2001:470:142:3::10]) by sourceware.org (Postfix) with ESMTPS id D43273838002 for ; Tue, 24 May 2022 13:06:19 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org D43273838002 Received: from fencepost.gnu.org ([2001:470:142:3::e]:36434) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1ntUEt-00015Y-07; Tue, 24 May 2022 09:06:19 -0400 Received: from [87.69.77.57] (port=4005 helo=home-c4e4a596f7) by fencepost.gnu.org with esmtpsa (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1ntUEm-000802-G5; Tue, 24 May 2022 09:06:16 -0400 Date: Tue, 24 May 2022 16:06:00 +0300 Message-Id: <83ilpv5bd3.fsf@gnu.org> From: Eli Zaretskii To: Pedro Alves Cc: gdb-patches@sourceware.org In-Reply-To: <625057b2-1691-a472-fa93-0dabacbddd39@palves.net> (message from Pedro Alves on Mon, 23 May 2022 18:04:18 +0100) Subject: Re: [PATCH v2 1/2] Always show locations for breakpoints & show canonical location spec References: <20220519215552.3254012-1-pedro@palves.net> <20220519215552.3254012-2-pedro@palves.net> <834k1kd7ne.fsf@gnu.org> <625057b2-1691-a472-fa93-0dabacbddd39@palves.net> X-Spam-Status: No, score=1.6 required=5.0 tests=BAYES_00, DKIMWL_WL_HIGH, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, RCVD_IN_BARRACUDACENTRAL, SPF_HELO_PASS, SPF_PASS, TXREP, T_SCC_BODY_TEXT_LINE, WEIRD_PORT autolearn=no autolearn_force=no version=3.4.6 X-Spam-Level: * X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gdb-patches@sourceware.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gdb-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 24 May 2022 13:06:22 -0000 > Date: Mon, 23 May 2022 18:04:18 +0100 > Cc: gdb-patches@sourceware.org > From: Pedro Alves > > >> +@item break @var{location_spec} > >> +Set a breakpoint at all the locations the given @var{location_spec} > >> +resolves to. @var{location_spec} can specify a function name, a line > >> +number, an address of an instruction, and more. @xref{Specify > >> +Location}, for a list of all the possible ways to specify a > >> +@var{location}. The breakpoint will stop your program just before it > >> +executes any of the instructions at the resolved locations' addresses. > >> + > >> +When using source languages that permit overloading of symbols, such > >> +as C@t{++}, a function name may refer to more than one symbol, and > >> +thus more than one place to break. @xref{Ambiguous > >> +Expressions,,Ambiguous Expressions}, for a discussion of that > >> +situation. > >> + > >> +It is possible that a given function is instantiated more than once in > >> +the program, resulting in a breakpoint with multiple resolved > >> +locations. This can happen e.g., with inlined functions, or template > >> +functions. > > > > This text was written when we showed N.x locations only when needed, > > so the text means to explain when a breakpoint will have several > > locations instead of just one. But with this changeset, we will be > > displaying multiple locations _always_, so this text is no longer > > enough. We should augment it with the background for multiple shown > > locations even in the cases where the underlying breakpoint is > > inserted only at a single address, such as your example of setting a > > breakpoint at a location that doesn't have any corresponding code > > address. > > I don't understand what you're actually suggesting. This is the > documentation of the "break" command, not "info breakpoints". My point was that "breakpoint with multiple location" will from now on happen in much more frequent situations than function overloading. In particular, it will happen in programs that don't use any OOP techniques and not even written in OO programming languages. IOW, the text above describes a use case that is not a very important one, under the proposed changes, while keeping silence about a much more important use case. > >> +A breakpoint location specification (@pxref{Specify Location}) may be > >> +resolved to several locations in your program. E.g., @code{break > >> +func} will find a location for each function named @code{func} in the > >> +program. > > > > Again, this "e.g." is incomplete and misses the frequent case, where > > there will be always 2 "locations", possibly identical, shown in the > > table. > > I am making a distinction between a "location specification", and > a "resolved location". So here: > > (top-gdb) info breakpoints > Num Type Disp Enb Address What > 1 breakpoint keep y internal_error > 1.1 y 0x0000555555f813be in internal_error(char const*, int, char const*, ...) at /home/pedro/gdb/binutils-gdb/src/gdbsupport/errors.cc:51 > (top-gdb) I understand that this is what you had in mind. What I had in mind is this other example of yours: > 4 breakpoint keep y gdb.c:27 > 4.1 y 0x000055555564107b in main(int, char**) at src/gdb/gdb.c:28 As you see, here we also have "2 locations, possibly identical", without having 2 functions called 'main'. Specifically, the "gdb.c:27" vs "gdb.c:28" parts, which in many cases, but not always, will be identical. I'm saying that this is no less important than multiple functions by the same name, and should be part of the above text. > The "will find a location for each function named func", is talking about more > than info breakpoints. Since this is the part describing the "break" command, > you can already see it here: > > (top-gdb) b main > Breakpoint 3 at 0xed06c: main. (24 locations) > ^^^^^^^^^^^^ > > The user typed "main", and gdb found 24 locations. I.e., the single > location specification "main" was resolved to 24 locations in your program. I'm saying that from the user POV, at least this user, the above two situations are very similar. I realize that internally they are quite different, but the manner in which both breakpoints will be presented by "info breakpoints" has the same traits relevant to the discussion of "locations": there could be more than one "location" for a breakpoint in either case. We should keep in mind that the notion a GDB user has about breakpoints is based on what GDB shows, not on the internal implementation details. So similar displays lead to similar mental models, regardless of the actual implementation. > >> For each code breakpoint and tracepoint (not for > >> +watchpoints, nor catchpoints), @value{GDBN} prints a header entry, and > >> +then one entry for each resolved location of the breakpoint. > > > > Likewise: "one entry" will be misleading with the new display, because > > we will always show at least 2. > > It says "prints a header entry", so that's 1 entry, "and then one entry FOR EACH > resolved location". I'm not sure what you find confusing here. I guess the confusing part is the "header entry". AFAIR, it was never explicitly described under that name before it is used. Also, "header" is too general a word, not necessarily related to breakpoints, so it doesn't explain itself well enough, given its wide use and its higher importance under the new behavior, where there will be _always_ a header entry. "Header" is much more natural for describing the table header, i.e. this part: >> Num Type Disp Enb Address What > I just reused that text before the table so that I could talk about > breakpoint locations in the description of the columns. I just didn't use the > word "row". I've reworded the new text a bit now, using "row" too. Hopefully it reads > clearer that way. I can only say that it confused me, when I compared the text with the "pictures": the examples of the new display. > >> +Enabled breakpoints and breakpoint locations are marked with @samp{y}. > > > > What does "enabled breakpoint location" mean? One cannot enable a > > location. > > Sure we can. We are mis-communicating. "Location" is a frequently-used word that has certain natural meaning in many contexts. That natural meaning doesn't support the notion of "enabling". You use "location" in a very special (and IMO problematic, as I tried to explain later in my comments) meaning, and in that meaning a "location" indeed _can_ be enabled. But when reading this text, that meaning was not yet clear to me, if it ever becomes clear later, and thus this text made me raise a brow. I described the surprise which I felt while reading the text as a feedback in the hope that we will be able to make the manual more clear and less confusing upon first, linear, top-to-bottom reading, and avoid the need for the reader to repeatedly read the same text before its meaning dawns on the reader. > This is documented in the "Set Breaks" node already, here: > > "You cannot delete the individual locations from a breakpoint. However, each location can be individually enabled or > disabled by passing breakpoint-number.location-number as argument to the enable and disable commands." > > Note that, as I explained in the commit log of the patch in question, with current master, if the > breakpoint's enabled state is different from the location's enabled state, then GDB is already > presenting a single-location breakpoint in "multi-location" mode. E.g., with current master > as top gdb, debugging another gdb: This completely misses the point. I'm not accusing you in doing something inconsistent, I'm just saying that reading that text, in its immediate context, caused me to wonder what is all this about. So there's no need to defend what you did by pointing to some other text elsewhere. What I hope is that by pointing out the parts that caused confusion and brow-raising, I will help in making the manual more self-explanatory and unequivocal. If my comments aren't helpful, you can disregard them, but you cannot convince me that what I felt when I've read the text was not confusion. If we want to do something about that confusion, let's discuss ways of changing the text to avoid such confusion. > > This should explain the crucial difference between "the original > > location specification" and "breakpoint location" used elsewhere in > > the text. It is entirely not trivial to grasp the importance of that > > distinction. (Another tip of the iceberg.) > > OK, I can agree with that. Do note the current manual is already > ambiguous here, for it calls both things "location", and that's it. Under the current GDB behavior, this is less important, because multi-location display is not as pervasive as you intend to make it. Thus, in the current manual I could completely ignore this terminology issue if I don't bump frequently enough into overloaded functions. Not so under the new behavior. > So I'm actually proposing to improve things by calling the user-input > text as "location specification", distinct from the actual locations > the specification matches or resolves to. My suggestion is to use "location" (with or without additional qualifiers) for only one of these two, if only because we tend to shorthand that by using just "location". We could use it for what you now call "location specification", which would allow us to use just "location" as it shorthand. With your proposed text, "location" is ambiguous, because it could allude to any of these two. The other meaning of "location" I propose to call "address in the code" or "address" for short -- because that's what it conceptually is. (If you are bothered by the case of unloaded shared library and similar situations, we could say "unresolved address" in those cases -- which are rather marginal, so once explained, they can be ignored in the rest of the text. By contrast, these two "locations" are pervasive: we need to refer to them all over the manual. So it's much more important IMO to make that part of our terminology clear and unequivocal.) Can you tel concretely why you object to changing the "location" terminology along these lines? Even if you think the current terminology is okay, what will we lose by switching the terminology I propose above? > Even GDB's source code uses "location" for both -- struct bp_location > for the breakpoint locations, the "1.1", "1.2" things, stored in each > struct breakpoint in the breakpoint::loc field, as a linked list: I know. But I hope we can avoid terminology ambiguities even though they are present in the code. The reader of the GDB manual will not necessarily read the GDB source at the same time, and is not required to do so if all they want is to learn how to use GDB. > >> +@value{GDBN} may add new locations to existing breakpoints when new > >> +symbols are loaded. For example, if you have a breakpoint in a > >> +C@t{++} template function, and a newly loaded shared library has an > >> +instantiation of that template, a new location is added to the list of > >> +locations for the breakpoint. More on this further below. > > > > I fail to see why this paragraph is important to have. What it says > > is trivial, and having it here makes a complex description harder to > > understand, because it interrupts the description with unimportant > > details. > > Note this is a rewording of preexisting text. Preexisting text is not sacred. It's okay to modify or even delete parts of the old text where they are sub-optimal or aren't contributing to the manual enough to justify their existence. Also, please understand that when presented with a large rewrite of the manual, I normally cannot go back to the old text and refrain from commenting on some parts that were more or less verbatim copied from the old version. Instead, I read the new text as a whole and comment on that, since that is what will the reader see after the changes are installed. > >> +breakpoint table using several rows---one header row per breakpoint, > >> +followed by one row for each of the breakpoint's locations. The > >> +header row has an empty address column, and the original location > >> +specification in the what column. The rows for individual locations > >> +contain the addresses for the locations, and show the functions to > >> +which those addresses belong. The number column for a location is > >> +indented by one space to the right for grouping, and is of the form > >> @var{breakpoint-number}.@var{location-number}. > > > > IMO, something like this text should precede the description of the > > table contents, because it both provides a high-level overview of how > > each breakpoint is displayed, and defines terminology like "header > > row", "breakpoint location row", etc. Having it here is "too late". > > I actually already did that, that's why I added the paragraph saying: > [...] > >> set a breakpoint in a shared library at the beginning of your > >> debugging session, when the library is not loaded, and when the > >> symbols from the library are not available. When you try to set > >> -breakpoint, @value{GDBN} will ask you if you want to set > >> -a so called @dfn{pending breakpoint}---breakpoint whose address > >> -is not yet resolved. > >> +a breakpoint, @value{GDBN} will ask you for confirmation of whether > >> +you want to set a so called @dfn{pending breakpoint}---a breakpoint with > >> +no resolved locations yet. > > > > I'm not sure it is a good idea to talk about "resolved locations", > > here and elsewhere. Why not "resolved addresses", as the original > > text did? Using "location" here introduces a conceptual ambiguity, > > whereby a "location" could be in the source code and also in the > > executable code, and that overloading of concepts makes this complex > > issue even harder to understand. My suggestion is to use "location" > > for source-level location specs, and for nothing else. > > I disagree with that, because that's not how GDB behaves. > [...] I'm disappointed by such a wholesale rejection of these parts of my comments. The terminology issue is very important, at least IMO, and the rest of the review circles around it and its various aspects. If you disagree with it so strongly, almost all the non-trivial parts of my review can be just ignored, because they are based on this difficulty in the terminology. I'm not sure I know how to proceed, given that we have such a basic disagreement on these aspects of the changes for the manual. > >> +This field is present for ordinary breakpoints and tracepoints. > > > > What is an "ordinary" breakpoint, and how is it different from the > > other kinds? (There are other places in the new text that use this > > unexplained terminology.) > > I've just using the terminology already used in this table. See: When I say "not explained" I mean there's no text which says something like "an ordinary breakpoint is ...", or some text that explains what are the "not ordinary" breakpoints, and by doing that explains implicitly what are the "ordinary" ones. If the old text didn't have that, it means that old text was also sub-optimal. (And my reading of these changes to the manual led me to believe that this distinction will be more important once the behavior changes, because you modified the original text even where the old one was not invalidated by the changes in behavior.) Thus the comment. And this part of the patch: > -A location in a multi-location breakpoint is represented as a tuple with the > -following fields: > +Each location in an ordinary breakpoint or tracepoint is represented > +as a tuple with the following fields: led me to believe that "ordinary" somehow is related to "multi-location". > Thus "ordinary" here refers to actual code breakpoints, as opposed to the generic "breakpoint" > term meaning all of code breakpoints, watchpoints, tracepoints, catchpoints. How about saying that explicitly, before we start relying on this terminology? > Below is the updated patch. Only the documentation changed. Given that you rejected most of the important parts of my comments, I'm not sure how to proceed with reviewing this new version. The main part of my review is the suggestion to find better, more intuitive terminology for "breakpoints" vs "header rows" in the "info break" display, and also better terminology for the two types of "locations". If this is left as it was in the original patch, what should I look at in the revised patch? Please understand, like I explained above, that I read the new text after the patch as a whole, and base my review on what it tells me and how it describes the GDB features. It's not just diffs to me, it's a more-or-less complete document, and it must make sense as a whole. Currently, it doesn't, due to those terminology problems. If we keep this terminology, the relevant sections of the manual will keep confusing me. The result is that I will keep mentioning this in my future reviews, and that makes little sense to me.