Re: [PATCH v2 1/2] Always show locations for breakpoints & show canonical location spec

public inbox for gdb-patches@sourceware.org
 help / color / mirror / Atom feed

From: Pedro Alves <pedro@palves.net>
To: Eli Zaretskii <eliz@gnu.org>
Cc: gdb-patches@sourceware.org
Subject: Re: [PATCH v2 1/2] Always show locations for breakpoints & show canonical location spec
Date: Wed, 25 May 2022 20:32:24 +0100	[thread overview]
Message-ID: <4c7a9504-83e0-6c02-fda6-0254ab4eede4@palves.net> (raw)
In-Reply-To: <83ilpv5bd3.fsf@gnu.org>

On 2022-05-24 14:06, Eli Zaretskii wrote:

> 
>>>> +Enabled breakpoints and breakpoint locations are marked with @samp{y}.
>>>
>>> What does "enabled breakpoint location" mean?  One cannot enable a
>>> location.  
>>
>> Sure we can.
> 
> We are mis-communicating.  "Location" is a frequently-used word that
> has certain natural meaning in many contexts.  That natural meaning
> doesn't support the notion of "enabling".  

Well, yeah, but above the text is very clearly talking about "breakpoint
locations", i.e., the locations in the program where the breakpoint is armed.
Enabling/disabling one of these locations is really about enabling/disabling
the arming of a breakpoint at such a location.  There is no need to be
concerned about the general term here, the context doesn't leave
scope for ambiguity.  Even though, the term does make sense here.

> You use "location" in a
> very special (and IMO problematic, as I tried to explain later in my
> comments) meaning, and in that meaning a "location" indeed _can_ be
> enabled.  But when reading this text, that meaning was not yet clear
> to me, if it ever becomes clear later, and thus this text made me
> raise a brow.  I described the surprise which I felt while reading the
> text as a feedback in the hope that we will be able to make the manual
> more clear and less confusing upon first, linear, top-to-bottom
> reading, and avoid the need for the reader to repeatedly read the same
> text before its meaning dawns on the reader.

I think we can get rid of any confusion by naming the input location
spec, the arguments you pass to commands, differently.

I've just sent a separate patch about this, which cleans up the manual in
that direction, and you'll notice that it isn't really about "info breakpoints".
In fact, it barely touches that section.  It's a lot more general.  Please take a
fresh look at that one.  I'll then get back to these "info breakpoints" changes on
top of it, and I think we will have a much better foundation.

Here is the URL to the patch in question:

 [PATCH] gdb/manual: Introduce locspecs
 https://sourceware.org/pipermail/gdb-patches/2022-May/189417.html

> 
>> So I'm actually proposing to improve things by calling the user-input
>> text as "location specification", distinct from the actual locations
>> the specification matches or resolves to.
> 
> My suggestion is to use "location" (with or without additional
> qualifiers) for only one of these two, if only because we tend to
> shorthand that by using just "location".  We could use it for what you
> now call "location specification", which would allow us to use just
> "location" as it shorthand.  With your proposed text, "location" is
> ambiguous, because it could allude to any of these two.  The other
> meaning of "location" I propose to call "address in the code" or
> "address" for short -- because that's what it conceptually is.  (If
> you are bothered by the case of unloaded shared library and similar
> situations, we could say "unresolved address" in those cases -- which
> are rather marginal, so once explained, they can be ignored in the
> rest of the text.  By contrast, these two "locations" are pervasive:
> we need to refer to them all over the manual.  So it's much more
> important IMO to make that part of our terminology clear and
> unequivocal.)

I really think that this is the wrong direction, and working on
the patch that I pointed at above really convinced me so even more.
I believe that that patch will convince you as well.  Please take a look
there.

As another data point, after writing that other patch, i.e., just now,
I went to look what other debuggers call similar things, and here's what I saw:

 - Delve, the Go debugger, calls what I'm suggesting to call
  "Location Specification" + shorthand "locspec", as "Location Specifier".
   See:
    https://github.com/go-delve/delve/blob/master/Documentation/cli/locspec.md

   That's very close -- specification vs specifier.  I could go with "Specifier" too
   instead of "Specification" if people prefer it.

   Note how their "break" command is documented as:

     "break [name] [locspec]"

   https://github.com/go-delve/delve/tree/master/Documentation/cli#break

   That's very close to what I proposed in the patch I just sent, as it uses
   the same "locspec" shorthand.  Nice coincidence.

   They also have wording using "matching", as in 

        "If called with the locspec argument it will delete all the breakpoints matching the locspec. 
         If locspec is omitted all breakpoints are deleted."

   ... which is the wording I used in that patch as well.  I swear I didn't find this until
   after I wrote my patch.  It should give some credence at this being a good candidate
   for a term of art.

 - LLDB calls "locations" to what GDB calls breakpoint locations too:

   See here:
     https://lldb.llvm.org/use/tutorial.html#setting-breakpoints

   They use the "resolve to one or more locations" wording too.  See:

     "Note that setting a breakpoint creates a logical breakpoint, which could resolve to one or more
     locations. For instance, break by selector would set a breakpoint on all the methods that implement
     that selector in the classes in your program. Similarly, a file and line breakpoint might result in multiple
     locations if that file and line were inlined in different places in your code."

   Here, lldb debugging gdb, note how the user interface uses "location" just like GDB.

   Setting a breakpoint:

     (lldb) b main
     Breakpoint 1: 24 locations.
                   ^^^^^^^^^^^^
   Listing breakpoints:

     (lldb) break list
     Current breakpoints:
     1: name = 'main', locations = 24
                       ^^^^^^^^^^^^^^
       1.1: where = gdb`main + 34 at gdb.c:28:10, address = gdb[0x00000000000ed07b], unresolved, hit count = 0 
       1.2: where = gdb`main + 8 at 13650.cc:45:9, address = gdb[0x0000000000869cf3], unresolved, hit count = 0 
       1.3: where = gdb`main + 8 at 1.cc:39:9, address = gdb[0x0000000000869eb3], unresolved, hit count = 0 
       1.4: where = gdb`main + 8 at 1.cc:160:9, address = gdb[0x000000000086a476], unresolved, hit count = 0 
       1.5: where = gdb`main + 8 at 2.cc:158:9, address = gdb[0x000000000086a9d8], unresolved, hit count = 0 
       1.6: where = gdb`main + 8 at 3.cc:158:9, address = gdb[0x000000000086b039], unresolved, hit count = 0 
       1.7: where = gdb`main + 8 at 4.cc:40:9, address = gdb[0x000000000086b0e4], unresolved, hit count = 0 
       1.8: where = gdb`main + 8 at 1.cc:90:9, address = gdb[0x000000000086b6b7], unresolved, hit count = 0 
       1.9: where = gdb`main + 8 at 2.cc:48:9, address = gdb[0x000000000086b910], unresolved, hit count = 0 
       1.10: where = gdb`main + 8 at 3.cc:63:9, address = gdb[0x000000000086bcc4], unresolved, hit count = 0 
       1.11: where = gdb`main + 8 at 1.cc:74:9, address = gdb[0x000000000086c001], unresolved, hit count = 0 
       1.12: where = gdb`main + 8 at 2.cc:366:9, address = gdb[0x000000000086de53], unresolved, hit count = 0 
       1.13: where = gdb`main + 8 at 1.cc:41:9, address = gdb[0x0000000000869de2], unresolved, hit count = 0 
       1.14: where = gdb`main + 8 at 1.cc:127:9, address = gdb[0x0000000000869bb0], unresolved, hit count = 0 
       1.15: where = gdb`main + 8 at 1.cc:58:9, address = gdb[0x00000000008693ed], unresolved, hit count = 0 
       1.16: where = gdb`main + 8 at 1.cc:58:9, address = gdb[0x00000000008692a1], unresolved, hit count = 0 
       1.17: where = gdb`main + 8 at 2.cc:84:9, address = gdb[0x0000000000869135], unresolved, hit count = 0 
       1.18: where = gdb`main + 8 at front_back.cc:38:9, address = gdb[0x0000000000868d09], unresolved, hit count = 0 
       1.19: where = gdb`main + 27 at empty.cc:27:22, address = gdb[0x0000000000868b0a], unresolved, hit count = 0 
       1.20: where = gdb`main + 8 at 1.cc:66:9, address = gdb[0x0000000000868ae3], unresolved, hit count = 0 
       1.21: where = gdb`main + 8 at 3.cc:34:9, address = gdb[0x0000000000868912], unresolved, hit count = 0 
       1.22: where = gdb`main + 8 at 2.cc:41:9, address = gdb[0x00000000008688ac], unresolved, hit count = 0 
       1.23: where = gdb`main + 8 at 1.cc:62:9, address = gdb[0x00000000008687bb], unresolved, hit count = 0 
       1.24: where = gdb`main + 8 at 1.cc:167:9, address = gdb[0x000000000086839c], unresolved, hit count = 0 

     2: name = 'handle_inferior_event', locations = 1

       2.1: where = gdb`::handle_inferior_event(execution_control_state *) + 44 at infrun.c:5335:21, address = gdb[0x00000000004ea21d], unresolved, hit count = 0 

     (lldb)

   Disabling a location:

     (lldb) break disable 1.50
     error: '1.50' is not a currently valid breakpoint/location id.
                                            ^^^^^^^^^^^^^^^^^^^

   Curiously, a single-location breakpoint is shown just like multi-location breakpoints too, there's no
   distinction.  See breakpoint 2 above.  And they show the original breakpoint spec in the "name = ..." field,
   it's not lost for them like it is in gdb's "info breakpoints".

   OOC, here's what they show for a disabled breakpoint with enabled locations (except location 1.3, I disabled that one):

     (lldb) break list
     Current breakpoints:
     1: name = 'main', locations = 24 Options: disabled 
       1.1: where = gdb`main + 34 at gdb.c:28:10, address = gdb[0x00000000000ed07b], unresolved, hit count = 0 
       1.2: where = gdb`main + 8 at 13650.cc:45:9, address = gdb[0x0000000000869cf3], unresolved, hit count = 0 
       1.3: where = gdb`main + 8 at 1.cc:39:9, address = gdb[0x0000000000869eb3], unresolved, hit count = 0  Options: disabled 
       1.4: where = gdb`main + 8 at 1.cc:160:9, address = gdb[0x000000000086a476], unresolved, hit count = 0 
       1.5: where = gdb`main + 8 at 2.cc:158:9, address = gdb[0x000000000086a9d8], unresolved, hit count = 0 
       1.6: where = gdb`main + 8 at 3.cc:158:9, address = gdb[0x000000000086b039], unresolved, hit count = 0 
       ...

> 
> Can you tel concretely why you object to changing the "location"
> terminology along these lines?  Even if you think the current
> terminology is okay, what will we lose by switching the terminology I
> propose above?

You lose the fact that it is the right name for it, and the fact that everyone
understands what it means today, so you force everyone to learn something new
for no good reason.  Also, the list of breakpoint locations in MI is exposed in the
field called ... "locations".  And there is a proposal pending on the list to
expose the breakpoint locations to Python, via a "locations" attribute...

> 
>> Even GDB's source code uses "location" for both -- struct bp_location
>> for the breakpoint locations, the "1.1", "1.2" things, stored in each
>> struct breakpoint in the breakpoint::loc field, as a linked list:
> 
> I know.  But I hope we can avoid terminology ambiguities even though
> they are present in the code.  The reader of the GDB manual will not
> necessarily read the GDB source at the same time, and is not required
> to do so if all they want is to learn how to use GDB.

My point is that the ambiguity already exists, is old, but I plan to
do something about, including fixing the code, and was am telling you
what my plan for the source code was.

> Preexisting text is not sacred.  It's okay to modify or even delete
> parts of the old text where they are sub-optimal or aren't
> contributing to the manual enough to justify their existence.

Certainly.  But it is also the case that you should expect that a contributor
will assume that the terms used in current text will have already been vetted
when the text was originally added, so reusing them or moving text around shouldn't
require rewriting everything.

> 
> Also, please understand that when presented with a large rewrite of
> the manual, I normally cannot go back to the old text and refrain from
> commenting on some parts that were more or less verbatim copied from
> the old version.  Instead, I read the new text as a whole and comment
> on that, since that is what will the reader see after the changes are
> installed.

I somewhat understand that (don't understand the "cannot", you could do
that for terms you find suspicious, if not for the whole text), though I also
believe that can lead to people feeling like they're getting unfair
requirements.

>>> I'm not sure it is a good idea to talk about "resolved locations",
>>> here and elsewhere.  Why not "resolved addresses", as the original
>>> text did?  Using "location" here introduces a conceptual ambiguity,
>>> whereby a "location" could be in the source code and also in the
>>> executable code, and that overloading of concepts makes this complex
>>> issue even harder to understand.  My suggestion is to use "location"
>>> for source-level location specs, and for nothing else.
>>
>> I disagree with that, because that's not how GDB behaves.
>> [...]
> 
> I'm disappointed by such a wholesale rejection of these parts of my
> comments.  The terminology issue is very important, at least IMO, and
> the rest of the review circles around it and its various aspects.  If
> you disagree with it so strongly, almost all the non-trivial parts of
> my review can be just ignored, because they are based on this
> difficulty in the terminology.
> 
> I'm not sure I know how to proceed, given that we have such a basic
> disagreement on these aspects of the changes for the manual.

It should be OK to have disagreements, I hope.  I tried to explain
why I disagreed.  You can disagree back and explain why.  At some point
we should reach somewhere in the middle.  But let me try again.

When you set a breakpoint, you pass the "break" command some arguments, which
can be a function name, a file name, a line number, a stap probe,
an address, and more.  There are three formats you can use to
specify where you want the breakpoints.  Linespecs, explicit locations,
and address locations.  Once you pass this input to GDB, gdb will
find the matching locations, and will tell you how many it found.
For example:

 (top-gdb) b main
 Breakpoint 3 at 0xed06c: main. (24 locations)

It is not correct to think of the locations as just "addresses",
because they are more than that.  For example here are three breakpoints
set at different functions, one of them inlined.  Note that both breakpoints'
locations have the same address, though different function names and line numbers:

(gdb) b func_extern_caller
Breakpoint 1 at 0x5555555552d3: file /home/pedro/gdb/binutils-gdb/src/gdb/testsuite/gdb.opt/inline-break.c, line 208.
(gdb) b func_inline_caller
Note: breakpoint 1 also set at pc 0x5555555552d3.
Breakpoint 2 at 0x5555555552d3: file /home/pedro/gdb/binutils-gdb/src/gdb/testsuite/gdb.opt/inline-break.c, line 199.
(gdb) info breakpoints 
Num     Type           Disp Enb Address            What
1       breakpoint     keep y   0x00005555555552d3 in func_extern_caller at /home/pedro/gdb/binutils-gdb/src/gdb/testsuite/gdb.opt/inline-break.c:208
2       breakpoint     keep y   0x00005555555552d3 in func_inline_caller at /home/pedro/gdb/binutils-gdb/src/gdb/testsuite/gdb.opt/inline-break.c:199
                                ^^^^^^^^^^^^^^^^^^    ^^^^^^^^^^^^^^^^^^                                                                          ^^^
                                    same!                  different...                                                                 different...

So they should be considered different locations.  Running to each of the breakpoints will make GDB
present the stop differently:

(gdb) disable
(gdb) enable 1
(gdb) r
Starting program: /home/pedro/gdb/binutils-gdb/build/gdb/testsuite/outputs/gdb.opt/inline-break/inline-break 

Breakpoint 1, func_extern_caller (x=1) at /home/pedro/gdb/binutils-gdb/src/gdb/testsuite/gdb.opt/inline-break.c:208
208       return func_inline_caller (x);
(gdb) disable 
(gdb) enable 2
(gdb) r
The program being debugged has been started already.
Start it from the beginning? (y or n) y
Starting program: /home/pedro/gdb/binutils-gdb/build/gdb/testsuite/outputs/gdb.opt/inline-break/inline-break 

Breakpoint 2, func_inline_caller (x=1) at /home/pedro/gdb/binutils-gdb/src/gdb/testsuite/gdb.opt/inline-break.c:199
199       return func_inline_callee (x);

Also, the local variables you can access in each of the breakpoints's conditions are
different, the breakpoints's commands will see a different context, etc., as the breakpoints
are set in different functions, even though they have the same addresses.

So calling breakpoint locations "addresses" instead if just incorrect.

> 
>>>> +This field is present for ordinary breakpoints and tracepoints.
>>>
>>> What is an "ordinary" breakpoint, and how is it different from the
>>> other kinds?  (There are other places in the new text that use this
>>> unexplained terminology.)
>>
>> I've just using the terminology already used in this table.  See:
> 
> When I say "not explained" I mean there's no text which says something
> like "an ordinary breakpoint is ...", or some text that explains what
> are the "not ordinary" breakpoints, and by doing that explains
> implicitly what are the "ordinary" ones.  If the old text didn't have
> that, it means that old text was also sub-optimal.  (And my reading of
> these changes to the manual led me to believe that this distinction
> will be more important once the behavior changes, because you modified
> the original text even where the old one was not invalidated by the
> changes in behavior.)  Thus the comment.
> 

OK, but you have to understand that if a contributor just copies a term from
the surrounding text, they'll think that it should be OK to use it
again, otherwise how did the term get into the manual in the first place?
Requiring that contributions that just add more instances of the same clean up
the prior mess isn't fair, because adding one more use of the same term will not
make any difference to the user's understanding, nor to the work necessary to clarify
elsewhere in some intro text what the term means.  That can be done as an
orthogonal change.

> And this part of the patch:
> 
>> -A location in a multi-location breakpoint is represented as a tuple with the
>> -following fields:
>> +Each location in an ordinary breakpoint or tracepoint is represented
>> +as a tuple with the following fields:
> 
> led me to believe that "ordinary" somehow is related to
> "multi-location".
> 
>> Thus "ordinary" here refers to actual code breakpoints, as opposed to the generic "breakpoint"
>> term meaning all of code breakpoints, watchpoints, tracepoints, catchpoints.
> 
> How about saying that explicitly, before we start relying on this
> terminology?

The text I'm changing _already_ relies on that terminology, so "before we start"
doesn't apply.  I agree it would be nice to clarify this, but I don't see why it has
to be in this patch.  I actually think it should be a separate patch.  There are
references to "regular breakpoints" too, to mean the same thing.  And I'm starting to
use "code breakpoints" more myself, as I think it has more meaning (contrast
with "data breakpoint", which is a term many are familiar as alternative to
our "watchpoint").

> 
>> Below is the updated patch.  Only the documentation changed.
> 
> Given that you rejected most of the important parts of my comments,
> I'm not sure how to proceed with reviewing this new version.  The main
> part of my review is the suggestion to find better, more intuitive
> terminology for "breakpoints" vs "header rows" in the "info break"
> display, and also better terminology for the two types of "locations".
> If this is left as it was in the original patch, what should I look at
> in the revised patch?

It wasn't left as it was, the v2 patch is different, as I had explained
with some bullet points where I detailed the difference, which you've skipped.

But nevermind, let's focus on discussing the patch I pointed at at the
top of this email.

The "info breakpoints" improvements changes should be much smaller once
I rebase them on that patch (after it goes in).

> 
> Please understand, like I explained above, that I read the new text
> after the patch as a whole, and base my review on what it tells me and
> how it describes the GDB features.  It's not just diffs to me, it's a
> more-or-less complete document, and it must make sense as a whole.
> Currently, it doesn't, due to those terminology problems.

Please understand that when writing the patch I too look at
the manual as a whole.  I constantly build the html of the manual
when I'm editing it, and re-read the whole section or sections I am changing,
trying to make sure everything fits together.  I sometimes spend hours tweaking
small details here and there until I have something that I think is good.  The
changes I proposed made sense to me from that angle.

In the end, we all want the manual to improve.

> 
> If we keep this terminology, the relevant sections of the manual will
> keep confusing me.  The result is that I will keep mentioning this in
> my future reviews, and that makes little sense to me.

I think the patch I posted to introduce locspecs will fix that.  Please take
a look at it.

next prev parent reply	other threads:[~2022-05-25 19:32 UTC|newest]

Thread overview: 39+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-05-19 21:55 [PATCH 0/2] info breakpoints improvements Pedro Alves
2022-05-19 21:55 ` [PATCH 1/2] Always show locations for breakpoints & show canonical location spec Pedro Alves
2022-05-20  6:45   ` Eli Zaretskii
2022-05-23 17:04     ` [PATCH v2 " Pedro Alves
2022-05-24 13:06       ` Eli Zaretskii
2022-05-25 19:32         ` Pedro Alves [this message]
2022-05-26 12:48           ` Eli Zaretskii
2022-05-26 14:04             ` Pedro Alves
2022-05-26 15:03               ` Eli Zaretskii
2022-05-26 15:10                 ` Pedro Alves
2022-05-26 15:33                   ` Eli Zaretskii
2022-05-26 19:29                 ` Pedro Alves
2022-05-26 19:55                   ` Eli Zaretskii
2022-05-26 20:40                     ` Pedro Alves
2023-04-10 15:07                   ` Andrew Burgess
2022-05-20  7:45   ` [PATCH " Metzger, Markus T
2022-05-23 17:05     ` Lancelot SIX
2022-05-19 21:55 ` [PATCH 2/2] Introduce "info breakpoints -hide-locations" Pedro Alves
2022-05-20  6:48   ` Eli Zaretskii
2022-05-20  5:57 ` [PATCH 0/2] info breakpoints improvements Eli Zaretskii
2022-05-23 17:06   ` Pedro Alves
2022-05-24 13:14     ` Eli Zaretskii
2022-05-24 13:45       ` Pedro Alves
2022-05-24  8:38 ` Luis Machado
2022-05-24 10:02   ` Pedro Alves
2022-05-24 13:20     ` Eli Zaretskii
2022-05-24 13:29       ` Pedro Alves
2022-05-24 13:43         ` Eli Zaretskii
2022-05-24 13:50           ` Pedro Alves
2022-05-24 14:03             ` Eli Zaretskii
2022-05-24 14:09               ` Pedro Alves
2022-05-24 14:25                 ` Eli Zaretskii
2022-05-24 14:33                   ` Pedro Alves
2022-05-24 14:11               ` Andreas Schwab
2022-05-24 14:17                 ` Pedro Alves
2022-05-24 19:49                   ` [PATCH] Show enabled locations with disabled breakpoint parent as "y-" (Re: [PATCH 0/2] info breakpoints improvements) Pedro Alves
2022-05-25 13:57                     ` Eli Zaretskii
2022-05-25 19:19                       ` Pedro Alves
2022-05-24 14:26                 ` [PATCH 0/2] info breakpoints improvements Eli Zaretskii

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4c7a9504-83e0-6c02-fda6-0254ab4eede4@palves.net \
    --to=pedro@palves.net \
    --cc=eliz@gnu.org \
    --cc=gdb-patches@sourceware.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).