Re: [PATCH v7 15/18] [gdb/generic] corefile/bug: Add hook to control the use of target description notes from corefiles

public inbox for gdb-patches@sourceware.org
 help / color / mirror / Atom feed

From: Andrew Burgess <aburgess@redhat.com>
To: Luis Machado <luis.machado@arm.com>, gdb-patches@sourceware.org
Cc: thiago.bauermann@linaro.org
Subject: Re: [PATCH v7 15/18] [gdb/generic] corefile/bug: Add hook to control the use of target description notes from corefiles
Date: Wed, 27 Sep 2023 18:56:33 +0100	[thread overview]
Message-ID: <87fs2zwlda.fsf@redhat.com> (raw)
In-Reply-To: <e9d929bb-42d6-ae13-920c-6a1e8b516a7a@arm.com>

Luis Machado <luis.machado@arm.com> writes:

> Hi Andrew,
>
> On 9/25/23 10:57, Andrew Burgess wrote:
>> Luis Machado <luis.machado@arm.com> writes:
>>> Sure, that makes sense, and I don't disagree with that. But as I mentioned above, it is an
>>> orthogonal problem that might be better suited as its own patch/series. A more complete patch
>>> addressing this particular situation I stumbled upon.
>> 
>> That's fine.  I understand the urgency you have.  For me, the only issue
>> I have here is that the new hook you add is to prevent GDB loading a
>> "broken" NT_GDB_TDESC. I'd also like this patch to include code that
>> prevents GDB from writing out "broken" NT_GDB_TDESC.
>> 
>> I understand we're likely stuck with the "don't load the NT_GDB_TDESC"
>> hook due to broken GDB's already existing in the wild.  That sucks, but
>> it is what it is.
>> 
>> However, having identified some broken behaviour I think we really
>> should fix that at the same time as adding the work around.
>> 
>
> Yes, I see your point and think it's reasonable.
>
>>>> When I originally added this code my problem case was RISC-V, where the
>>>> CSRs (control regs) for embedded targets can be different for each
>>>> target.  The CSRs are just encoded as a blob, so unless you already know
>>>> which regs are present you're stuck.  For real h/w the controller
>>>> (e.g. openocd) provides a tdesc (via target.xml), but for a core file we
>>>> needed a way to reacquire the tdesc.
>>>>
>>>> My assumption at the time that the tdesc we wrote would always be
>>>> correct, but clearly this isn't the case.
>>>
>>> That's true. With the introduction of dynamic target descriptions like sve and sme, this
>>> no longer holds true.
>> 
>> OK. But is it really the case that the assumption is wrong, or is it
>> that the current code (that I added) is just using the wrong tdesc?
>
> For the general case, it is the wrong assumption because you can have threads
> with different tdesc's/gdbarches, and using a single NT_GDB_TDESC doesn't cover
> that part.
>
> Putting aside the threaded case, it can also be incorrect for the single-threaded
> case.
>
> Imagine you start up with a vector length of 256 bits (vg == 4). At this point the
> gdbarch/target description that gets stored in the inferior (the tdesc is still a
> per-inferior entry IMO) says the vector length is 256 bits.
>
> Now imagine we executed a few instructions and now the vector length is 128 bits
> (vg == 2). Though gdb will cope with it just fine during debugging, because
> aarch64/linux implements the thread_architecture hook, if we attempt to output
> a gcore core file, the current code (without your proposed patch) will dump
> the target description/gdbarch cached in the inferior. And that one says we
> have a vector length of 256 bits.
>
> So the tdesc is incorrect when we try to load the gcore file, if we trust the
> NT_GDB_TDESC note. The register data in the notes has the correct information
> though.

I think we agree on the above.

> Your attached patch fixes this because it uses the signalled thread's
> gdbarch/tdesc instead of the cached inferior gdbarch/tdesc.
>
> target_gdbarch can be a bit dangerous unfortunately, but we still have to use
> it in some places.

That's unfortunate.  I guess that might be because in some places we
have an inferior selected but no thread?

It feels like the right solution is to transition GDB to only having a
per-thread gdbarch/tdesc, which would avoid problems like this in the
future.

>
>> 
>> I know you described the idea of updating gcore_elf_make_tdesc_note to
>> take a gdbarch* as orthogonal, but I really think this might be the way
>> to solve this problem.
>> 
>> I've attached a patch at the end of this email that can be applied to
>> head of master BEFORE your patch series.  This patch updates the
>> NT_GDB_TDESC generation to use the gdbarch of the signalled thread.
>> 
>> Now, if I've understood how everything works, I believe that the
>> NT_GDB_TDESC that is emitted _with_ my patch should be good enough -- it
>> should be the correct tdesc for the signalled thread, which, if all
>> threads use the same tdesc, should be good enough.
>> 
>> As a test, you can apply my patch, and then TEMPORARILY update
>> aarch64_use_target_description_from_corefile_notes to always return
>> true.  Using such a GDB you should be able to create a core file, and
>> then load it back in correctly.
>
> Yes, that works correctly from testing on my end. So at a minimum that's
> already an improvement, as changing the vector length for one (or all threads,
> but with all of them still using the same vector length) during execution is a
> useful use case. Like a program setting up for some sve operation, and all of a
> someone wants to dump a core file.
>
>> 
>> As I already mentioned, I acknowledge that we are stuck with the
>> aarch64_use_target_description_from_corefile_notes hook -- there are
>> broken GDBs in the wild.
>
> Though unfortunate, the most common core files are likely coming from crashes
> outside a debugging session (so not gcore).
>
>> 
>> But, if I'm correct, then I think this is the best solution for now,
>> we're emitting a NT_GDB_TDESC that is mostly correct, but AArch64 will
>> continue to ignore it, just as you originally proposed.
>> 
>
> Sorry, it took me a little bit to get some tests running on the simulator. Happy to
> report that I see no regressions in terms of testing with the changes you proposed for
> handling the signalled thread.
>
> Also, I checked the case I described above about a program starting with vg == 4 and then
> changing it to vg == 2 before we output a gcore file.
>
> When I load the corefile back in a patched gdb (and forcing gdb to use the tdesc, by always
> returning true in the aarch64-linux hook), I can see that the tdesc is the correct
> one (vector length indicating vg == 2).
>
> Without your patch, gdb used the inferior's cached gdbarch/tdesc to dump the NT_GDB_TDESC
> note, which means the vector length was vg == 4, when in reality it should've been
> vg == 2, as we had changed it before dumping the core file.

This all sounds positive.

>
>>>
>>>>
>>>> Right now RISC-V doesn't event override gdbarch_core_read_description,
>>>> so if we read the encoded tdesc after checking that gdbarch hook nothing
>>>> would change.
>>>
>>> That was my assessment back in the earlier versions of this series.
>>>
>>>>
>>>> For other architectures, if gdbarch_core_read_description is implemented
>>>> then we'll be back to using that, and that worked fine before.
>>>>
>>>> ... And if, in some future world we do implement
>>>> gdbarch_core_read_description for RISC-V then the solution seems
>>>> obvious, if there's a section containing CSRs, and there's also a
>>>> section containing a tdesc, then the RISC-V
>>>> gdbarch_core_read_description can just return nullptr, safe in the
>>>> knowledge the generic GDB code will load the encoded tdesc.
>>>>
>>>>>
>>>>> Except for some different names and tweaks, your proposed approach works correctly.
>>>>>
>>>>> I tested this and noticed the lack of NT_GDB_TDESC for a AArch64 Linux target
>>>>> with sve/sme support and one thread with a differing gdbarch. I also saw the note for
>>>>> a AArch64 Linux target without sve/sme support, or with sve/sme support but all threads
>>>>> having the exact same gdbarch. So that looks good to me.
>>>>>
>>>>> Would you like this extra code to be included as part of this
>>>>> particular patch?
>>>>
>>>> Yeah I think something like this should be added to this patch, we
>>>> should stop GDB creating incorrect core files I think.
>>>
>>> Got it.
>>>
>>> I'm afraid I'm going to need some clarification on ways forward with regards to this particular
>>> series. Do you still want this (or other) potential changes as part of this series or would you
>>> be ok with a follow-up patch/series to (try to, haven't gone in-depth yet) address this particular
>>> core file/gdbarch/threads situation?
>> 
>> Here's what I propose:to output the cached gdbarch/
>> 
>>   1. Try the patch I've supplied here, if this works for you (see above)
>>   then I think this is the best solution, we merge this, then go with v7
>>   of this patch unmodified,
>
> It did work. Thanks!
>
> But should we keep your suggested changes to check for distinct gdbarches in the threaded case? And
> not output NT_GDB_TDESC if we find multiple distinct gdbarch entries?

I don't think so.  As you've already said, in the core file case we
already don't support per-thread tdesc, even if we use the
aarch64_linux_core_read_description hook - that's only called once, and
will just generate a tdesc based on the main (signalled) thread as far
as I can tell.

Telling GDB to dump per-thread NT_GDB_TDESC is trivial, but I haven't
looked at updating the core file code to read in per-thread tdesc, but I
suspect you'll (eventually) need to solve that problem even when using
the aarch64_linux_core_read_description hook.

I know Simon has mentioned changing GDB to not write out NT_GDB_TDESC,
and I think that would be fine, however, I think we would still want my
proposed patch -- we should always write out the thread tdesc, not the
cached global one, as that is always going to be more accurate.

My proposal then would be: we merge my patch to write NT_GDB_TDESC using
the per-thread tdesc, then you merge your original V7 version of this
patch with no other changes.

Later, if we want, we can add a new hook to avoid writing the
NT_GDB_TDESC at all -- but that can come later, and would be in addition
to my proposed patch, not instead of it.

If you and Simon are happy with this plan then I'll go ahead and push my
patch, this should unblock you, and you'll be able to merge this series.

Let me know what you think,

Thanks,
Andrew

next prev parent reply	other threads:[~2023-09-27 17:56 UTC|newest]

Thread overview: 41+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-09-18 21:26 [PATCH v7 00/18] SME support for AArch64 gdb/gdbserver on Linux Luis Machado
2023-09-18 21:26 ` [PATCH v7 01/18] [gdb/aarch64] Fix register fetch/store order for native AArch64 Linux Luis Machado
2023-09-18 21:26 ` [PATCH v7 02/18] [gdb/aarch64] refactor: Rename SVE-specific files Luis Machado
2023-09-18 21:26 ` [PATCH v7 03/18] [gdb/gdbserver] refactor: Simplify SVE interface to read/write registers Luis Machado
2023-09-18 21:26 ` [PATCH v7 04/18] [gdb/aarch64] sve: Fix return command when using V registers in a SVE-enabled target Luis Machado
2023-09-18 21:26 ` [PATCH v7 05/18] [gdb/aarch64] sme: Enable SME registers and pseudo-registers Luis Machado
2023-10-13 13:06   ` Tom Tromey
2023-10-13 14:44     ` Luis Machado
2023-10-13 14:50       ` Luis Machado
2023-09-18 21:26 ` [PATCH v7 06/18] [gdbserver/generic] Convert tdesc's expedite_regs to a string vector Luis Machado
2023-09-18 21:26 ` [PATCH v7 07/18] [gdbserver/aarch64] refactor: Adjust expedited registers dynamically Luis Machado
2023-09-18 21:26 ` [PATCH v7 08/18] [gdbserver/aarch64] sme: Add support for SME Luis Machado
2023-09-18 21:26 ` [PATCH v7 09/18] [gdb/aarch64] sve: Fix signal frame z/v register restore Luis Machado
2023-09-18 21:26 ` [PATCH v7 10/18] [gdb/aarch64] sme: Signal frame support Luis Machado
2023-09-18 21:26 ` [PATCH v7 11/18] [gdb/aarch64] sme: Fixup sigframe gdbarch when vg/svg changes Luis Machado
2023-09-18 21:26 ` [PATCH v7 12/18] [gdb/aarch64] sme: Support TPIDR2 signal frame context Luis Machado
2023-09-18 21:26 ` [PATCH v7 13/18] [gdb/generic] Get rid of linux-core-thread-data Luis Machado
2023-09-18 21:26 ` [PATCH v7 14/18] [gdb/generic] corefile/bug: Use thread-specific gdbarch when dumping register state to core files Luis Machado
2023-09-18 21:26 ` [PATCH v7 15/18] [gdb/generic] corefile/bug: Add hook to control the use of target description notes from corefiles Luis Machado
2023-09-19 20:49   ` Simon Marchi
2023-09-20  5:49     ` Luis Machado
2023-09-20 14:01       ` Luis Machado
2023-09-20 14:22   ` Andrew Burgess
2023-09-20 15:26     ` Andrew Burgess
2023-09-20 23:35       ` Luis Machado
2023-09-21 10:02         ` Andrew Burgess
2023-09-21 10:44           ` Luis Machado
2023-09-25  9:57             ` Andrew Burgess
2023-09-26 12:39               ` Luis Machado
2023-09-27 17:56                 ` Andrew Burgess [this message]
2023-09-28  8:23                   ` Luis Machado
2023-10-03 17:23                     ` Andrew Burgess
2023-10-04 15:27                       ` Luis Machado
2023-09-25 15:41             ` Simon Marchi
2023-09-27 17:44               ` Andrew Burgess
2023-09-18 21:26 ` [PATCH v7 16/18] [gdb/aarch64] sme: Core file support for Linux Luis Machado
2023-09-18 21:26 ` [PATCH v7 17/18] [gdb/testsuite] sme: Add SVE/SME testcases Luis Machado
2023-09-19 19:12   ` Simon Marchi
2023-09-19 20:02     ` Luis Machado
2023-09-18 21:26 ` [PATCH v7 18/18] [gdb/docs] sme: Document SME registers and features Luis Machado
2023-10-04 15:27 ` [PATCH v7 00/18] SME support for AArch64 gdb/gdbserver on Linux Luis Machado

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87fs2zwlda.fsf@redhat.com \
    --to=aburgess@redhat.com \
    --cc=gdb-patches@sourceware.org \
    --cc=luis.machado@arm.com \
    --cc=thiago.bauermann@linaro.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).