Date: Thu, 26 May 2022 16:52:45 +0300
Message-Id: <83y1yoo0ya.fsf@gnu.org>
From: Eli Zaretskii
To: Pedro Alves
Cc: gdb-patches@sourceware.org
In-Reply-To: <18ef47c5-43cf-e38a-41c6-506f43ae5af2@palves.net> (message from Pedro Alves on Thu, 26 May 2022 13:26:07 +0100)
Subject: Re: [PATCH v3] gdb/manual: Introduce location specs

> Date: Thu, 26 May 2022 13:26:07 +0100
> Cc: gdb-patches@sourceware.org
> From: Pedro Alves
>
> I like many of the suggestions you made in this
> direction for e.g., the "list" command and others. But not for breakpoints. (now, the following
> few replies to your comments happen to all be in the area, but if you look further
> below, I agree with your suggestions a lot more...)
>
> Because what you suggest above is not equivalent: what really happens is that we do set
> a breakpoint at each location {address, function name, filename, line} in the
> program that matches the spec. Recall my inline functions example in the other thread.
> If we think only in terms of addresses, GDB would behave differently in that inlines example.
> It's not just the address that matters. Like in geography, you can think of locations
> having coordinates (e.g., {x,y,z}), and address is just one of them.

I already asked what a "location" entails in your eyes, in addition to
the program address to which it eventually resolves. I don't think we
will arrive at a full agreement before we see this spelled out and
documented.

And I'm not saying that address is the only thing that matters, I'm
just saying that thinking about addresses is useful when describing
how GDB uses location specs for setting breakpoints or for other
related features.

> Like below for example, you start with a locspec like "main", and the breakpoint is set at
> ..../gdb.c:25.
>
>  (top-gdb) b main
>  (top-gdb) info breakpoints
>  Num Type Disp Enb Address What
>  3 breakpoint keep y
>  3.1 y 0x00000000000ed06c in main(int, char**) at /home/pedro/gdb/binutils-gdb/src/gdb/gdb.c:25
>
> All the info in the "Address" and "What" columns above define the coordinates of the location
> in the program, not just the address. The function name is important. The line number is important.

I'm talking about terminology, not about the aspects that are
important. You seem to be interpreting "address" too literally, and
"location" too generally.

> The set breakpoint is then implemented by placing a breakpoint instruction at the address
> of each of the breakpoint's locations.

I would say "set breakpoint is implemented by arranging for the
program to stop at every address that matches the location
specification". ("Placing a breakpoint instruction" is inaccurate,
because we have hardware-assisted breakpoints.)

> This is what we need to convey. Just talking about addresses is only talking about
> the implementation detail, not what the users see, and not all that matters about each
> location.

I think users of GDB have a clear understanding about the equivalence
between source-level locations and breakpoint addresses. If this is
an implementation detail, then it leaked to the GDB user level long
ago.

> >> +@var{locspec} can specify a function name, a line number, an address
> >> +of an instruction, and more. @xref{Location Specifications}, for the
> >> +various forms of @var{locspec}. The breakpoint will stop your program
> >> +just before it executes any of the code at any of the breakpoint's
> >> +locations.
> >   ^^^^^^^^^
> > "addresses", not "locations".
>
> I think it should be both. "breakpoint's locations' addresses".
> I went with that.

From the English POV, there should be only one "'s", the second one.
We could also make it less awkward (double construct state is
discouraged):

  ...at any of the location addresses of the breakpoint.

> >> +It is possible that a breakpoint corresponds to several locations in
> >> +your program. @xref{Location Specifications}, for examples.
> >
> > I would rephrase:
> >
> >   It is possible that a breakpoint's location spec corresponds to
> >   several places in your program.
>
> IMO, it just adds to confusion. The cindex (just above) is called "multiple locations".
> There's is nothing wrong with saying "locations". We have been saying "location" all
> these years. Only the xref needs to change, which is what I was doing.

The main problem with "location" is that it is too general a notion,
and can mean many similar but different things.
As long as we use it only in one sense, that is somewhat tolerable,
but once we start using it for more than one thing, and related
things at that, it becomes a source of confusion.

> >> @value{GDBN} provides some additional commands for controlling what
> >> -happens when the @samp{break} command cannot resolve breakpoint
> >> -address specification to an address:
> >> +happens when the @samp{break} command cannot find any location that
> >> +matches the location spec (@pxref{Location Specifications}):
> >
> > This should say "...cannot resolve the breakpoint's location spec to
> > an address". IOW, the only problem in the original text was with
> > using "address specification", where we now want to use "location
> > specification" instead.
>
> Yes, but it's not as correct. If "break" didn't find any location {line number,
> function name, etc.) that matches whatever was specified in the location
> spec, then the breakpoint ends up with no breakpoint locations, and in
> that particular case, the breakpoint is called a pending breakpoint.
>
> If GDB manages to create a breakpoint location for the breakpoint later, when
> new symbols are loaded, and _afterwards_ the code at that location goes away (due to
> shared library unload, for example), the breakpoint doesn't go back to being a pending
> breakpoint -- GDB will remember the location where the breakpoint location was set at,
> with only the _address_ of the location being unresolved, not the breakpoint itself.

I understand, but I don't see how this invalidates my comment and the
rewording suggestion. What is described in that text refers to
something done when defining the breakpoint, so what happens
afterwards (and is not described there) cannot affect the clarity of
the text or its interpretation by the reader, who at that point wants
only to understand what happens with these specifications.

> >> -@item clear @var{location}
> >> -Delete any breakpoints set at the specified @var{location}.
> >> -@xref{Specify Location}, for the various forms of @var{location}; the
> >> -most useful ones are listed below:
> >> +@item clear @var{locspec}
> >> +Delete any breakpoints set at the locations that match @var{locspec}.
> >
> > "Delete any breakpoints set at addresses that match the location spec
> > @var{locspec}."
>
> No, that is ambiguous, it kind of suggests that you can only pass
> address location specs here.

It does? It explicitly says "addresses that match", so it doesn't
imply that addresses are passed.

> > "If either @var{first} or @var{last} match more than one source line
> > in the program, the @code{list} command will show the list of
> > ambiguous source lines, and will not print any source lines."
>
> I like the first part about matching lines, but I think "show the list of ambiguous source lines"
> is worse, because it's ambiguous that way -- it ends up with "source lines" used twice to mean different
> things. The first refers to the location in the program, the second refers to the contents
> of source code at the lines. And, GDB prints more location coordinates than lines when ambiguous:
>
>  file: "/home/pedro/gdb/binutils-gdb/src/gdb/gdb.c", line number: 25, symbol: "main(int, char**)"
>  file: "/home/pedro/gdb/binutils-gdb/src/gdb/unittests/basic_string_view/cons/char/1.cc", line number: 61, symbol: "selftests::string_view::cons_1::main()"
>  file: "/home/pedro/gdb/binutils-gdb/src/gdb/unittests/basic_string_view/cons/char/2.cc", line number: 40, symbol: "selftests::string_view::cons_2::main()"
>  ...

You are saying that the above are locations? That's again different
from what we show in "info breakpoints", even under your latest patch.

> >> +A location spec serves as a blueprint, and it may match more than one
> >> +actual location in your program. Examples of this situation are:
> >          ^^^^^^^^
> > "address".
> >
>
> We're defining a location spec here, so that would be an overcorrection.
> There's nothing wrong with referring to "a location in the program". It's even exposed to C++ users in
> the language itself: https://en.cppreference.com/w/cpp/utility/source_location
>
> This should really say that specifications match actual locations. The "spec"
> qualifier in "location spec" makes this unambiguous, and the point is really to
> distinguish the "spec" from the actual "thing".
>
> It is no different from saying:
>
>  "a cake specification serves as a blueprint, and it may match more than one
>  actual cake in the cake shop".
>
> There is nothing ambiguous in this sentence using cakes. And I am saying the
> exact same thing, but for locations.

This analogy doesn't really work. "Cake" is a real concrete object:
you can eat it and report its taste and nutritional values; "cake
specification" (people actually use "recipe") is something entirely
different: it's a text recorded on some media. By contrast,
"location" is not a tangible object, it's an abstraction. So its
difference from "location specification" is more subtle, and thus
harder to grasp, which makes confusion more likely.

The C++ URL you pointed to doesn't talk about "location" (which, as I
said above, is too general, and thus problematic), it talks about
"source location", and clearly documents its attributes. If you are
okay with using "source location" instead, I could go with it,
provided that:

  . we always use these two words, never just "location"
  . we consider "source location" as the result of fully resolving a
    "location specification", and describe it as such
  . we clearly document what a "source location" entails, i.e. what
    its attributes are

> > "You can also inquire (using @code{*@var{addr}} as the form for
> > @var{locspec}) what source line covers a particular address
> > @var{addr}:"
>
> AFAICS, you're suggesting to add "@var{addr}".

Yes.

> I don't think that would be correct without other changes.
> Try reading the sentence without the parenthesis, it wouldn't make sense then:
>
>  "You can also inquire what source line covers a particular address
>  @var{addr}:"
>
> because "addr" is not referred to in the example that follows, it is only referring
> to the addr in the parenthesized part. So I think that if you want to
> add "addr" here, the sentence should be tweaked further.

If we don't reference "addr", the text does not explain clearly enough
what is alluded to as "particular address".

> >> @smallexample
> >> - -exec-until [ @var{location} ]
> >> + -exec-until [ @var{locspec} ]
> >> @end smallexample
> >>
> >> -Executes the inferior until the @var{location} specified in the
> >> -argument is reached. If there is no argument, the inferior executes
> >> -until a source line greater than the current one is reached. The
> >> -reason for stopping in this case will be @samp{location-reached}.
> >> +Executes the inferior until a location that matches @var{locspec} is
> >> +reached.
> >
> > "Executes the inferior until it reaches an address that matches
> > @var{locspec}."
>
> I think that reads worse than before. It was good to say "location" before
> my change, so it should still be good after. Please let's not overcorrect here.

Why is it overcorrection? This is about program execution, so talking
about addresses is very natural.

But if we can agree about the "source location" variant, maybe most or
all of the remaining disagreements will go away.

Thanks.