From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by sourceware.org (Postfix) with ESMTPS id 0F26B3841446 for ; Mon, 6 Jun 2022 17:49:01 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 0F26B3841446 Received: from mail-wr1-f71.google.com (mail-wr1-f71.google.com [209.85.221.71]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-393-nF2QKYlcOnWb2QyjlvRwKw-1; Mon, 06 Jun 2022 13:48:59 -0400 X-MC-Unique: nF2QKYlcOnWb2QyjlvRwKw-1 Received: by mail-wr1-f71.google.com with SMTP id h2-20020adfe982000000b002102da95c71so3038704wrm.23 for ; Mon, 06 Jun 2022 10:48:59 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:from:to:subject:in-reply-to:references:date :message-id:mime-version; bh=//O/5+ZwZ9nt018CE8mYRxZ0/r9g1CjyKIhBHGMsiJc=; b=JX2+EHo3hqi3KtCGA54qcwIAyz0opZMvBuF+FRSSlWL2pR4bg0YLPGO2VPMpc4n2JJ olraanaMiKcCKsfn7ggBddTE6/54iBSwk7WOUMfb/tSn+mmbXFaXN8q8UcD4eRQJU4QE PIh3WhB8Y2E28sblv28qB9ZT/tu6e1doq2dQBqGwNjvsDUUJMGpv0ZKDjZ/ikWqubssT u3gq7F8lUAWuFHkcfVkrGCM9ow0SZV7WeFGmQZbm379OdZZ/lrzU3g1sLu+wF9DIr1FW 307zyvcRNGG80qfyk/X1P1Sff/tjParVofBZDWVKssTXsbqkMmAn4RnJkx+QG1YCZcCG P9Gg== X-Gm-Message-State: AOAM533NM68FxctzBvIQ+IyXi7U6ajAiRKq9/5N+kNodObBfprD0frRw yS9QTUVY6Fi2/DB98SwetQGf5+XpC3H9J8afY/Z1LBoLsuPINt/0hHfq33JmeNrgzHTcOpg1oTE kLyu8U7s3mvsEhv/9Qp7IOA== X-Received: by 2002:a05:600c:154d:b0:394:8d64:9166 with SMTP id f13-20020a05600c154d00b003948d649166mr24132948wmg.102.1654537737796; Mon, 06 Jun 2022 10:48:57 -0700 (PDT) X-Google-Smtp-Source: ABdhPJz8jJHAqp3WrVQEyKot22W5/eqgcpv1wutK9gDY4c8JeBqaMZReviXtYNzdMj1plq8NZJFcMQ== X-Received: by 2002:a05:600c:154d:b0:394:8d64:9166 with SMTP id f13-20020a05600c154d00b003948d649166mr24132918wmg.102.1654537737447; Mon, 06 Jun 2022 10:48:57 -0700 (PDT) Received: from localhost (host109-152-215-36.range109-152.btcentralplus.com. [109.152.215.36]) by smtp.gmail.com with ESMTPSA id u12-20020a05600c210c00b003942a244f39sm15407555wml.18.2022.06.06.10.48.56 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 06 Jun 2022 10:48:56 -0700 (PDT) From: Andrew Burgess To: Luis Machado , John Baldwin , gdb-patches@sourceware.org Subject: Re: [PATCH 5/5] gdb: native target invalid architecture detection In-Reply-To: <87ilpdhn73.fsf@redhat.com> References: <71a986a5-2cfa-543e-4034-70f3af7dfecf@FreeBSD.org> <87ee09d4rt.fsf@redhat.com> <09afe250-9573-45e1-993b-a2f911f03630@arm.com> <87ilpdhn73.fsf@redhat.com> Date: Mon, 06 Jun 2022 18:48:55 +0100 Message-ID: <87ee01hed4.fsf@redhat.com> MIME-Version: 1.0 X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Content-Type: text/plain X-Spam-Status: No, score=-4.9 required=5.0 tests=BAYES_00, DKIMWL_WL_HIGH, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, RCVD_IN_BARRACUDACENTRAL, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_NONE, TXREP, T_SCC_BODY_TEXT_LINE autolearn=no autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gdb-patches@sourceware.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gdb-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 06 Jun 2022 17:49:03 -0000 Andrew Burgess writes: > Luis Machado via Gdb-patches writes: > >> On 5/31/22 17:51, Andrew Burgess via Gdb-patches wrote: >>> John Baldwin writes: >>> >>>> On 5/31/22 7:30 AM, Andrew Burgess via Gdb-patches wrote: >>>>> If GDB is asked to start a new inferior, or attach to an existing >>>>> process, using a binary file for an architecture that does not match >>>>> the current native target, then, currently, GDB will assert. Here's >>>>> an example session using current HEAD of master with GDB built for an >>>>> x86-64 GNU/Linux native target, the binary being used is a RISC-V ELF: >>>>> >>>>> $ ./gdb/gdb -q --data-directory ./gdb/data-directory/ >>>>> (gdb) file /tmp/hello.rv32imc.x >>>>> Reading symbols from /tmp/hello.rv32imc.x... >>>>> (gdb) start >>>>> Temporary breakpoint 1 at 0x101b2: file hello.rv32.c, line 23. >>>>> Starting program: /tmp/hello.rv32imc.x >>>>> ../../src/gdb/gdbarch.h:166: internal-error: gdbarch_tdep: Assertion `dynamic_cast (tdep) != nullptr' failed. >>>>> A problem internal to GDB has been detected, >>>>> further debugging may prove unreliable. >>>>> >>>>> The same error is encountered if, instead of starting a new inferior, >>>>> the user tries to attach to an x86-64 process with a RISC-V binary set >>>>> as the current executable. >>>>> >>>>> These errors are not specific to the x86-64/RISC-V pairing I'm using >>>>> here, any attempt to use a binary for one architecture with a native >>>>> target of a different architecture will result in a similar error. >>>>> >>>>> Clearly, attempting to use this cross-architecture combination is a >>>>> user error, but I think GDB should do better than an assert; ideally a >>>>> nice error should be printed. >>>>> >>>>> The problem we run into is that, when the user starts a new inferior, >>>>> or attaches to an inferior, the inferior stops. At this point GDB >>>>> attempts to handle the stop, and this involves reading registers from >>>>> the inferior. >>>>> >>>>> These register reads end up being done through the native target, so >>>>> in the example above, we end up in the amd64_supply_fxsave function. >>>>> However, these functions need a gdbarch. The gdbarch is fetched from >>>>> the register set, which was constructed using the gdbarch from the >>>>> binary currently in use. And so we end up in amd64_supply_fxsave >>>>> using a RISC-V gdbarch. >>>>> >>>>> When we call: >>>>> >>>>> i386_gdbarch_tdep *tdep = gdbarch_tdep (gdbarch); >>>>> >>>>> this will assert as the gdbarch_tdep data within the RISC-V gdbarch is >>>>> of the type riscv_gdbarch_tdep not i386_gdbarch_tdep. >>>>> >>>>> The solution I propose in this commit is to add a new target_ops >>>>> method supports_architecture_p. This method will return true if a >>>>> target can safely be used with a specific architecture, otherwise, the >>>>> method returns false. >>>>> >>>>> I imagine that a result of true from this method doesn't guarantee >>>>> that GDB can start an inferior of a given architecture, it just means >>>>> that GDB will not crash if such an attempt is made. A result of false >>>>> is a hard stop; attempting to use this target with this architecture >>>>> is not supported, and may cause GDB to crash. >>>>> >>>>> This distinction is important I think for things like remote targets, >>>>> and possibly simulator targets. We might imagine that GDB can ask a >>>>> remote (or simulator) to start with a particular executable, and the >>>>> target might still refuse for some reason. But my thinking is that >>>>> these refusals should be well handled (i.e. GDB should give a user >>>>> friendly error), rather than crashing, as is the case with the native >>>>> targets. >>>>> >>>>> For example, if I start gdbserver on an x86-64 machine like this: >>>>> >>>>> gdbserver --multi :54321 >>>>> >>>>> Then make use of this from a GDB session like this: >>>>> >>>>> $ ./gdb/gdb -q --data-directory ./gdb/data-directory/ >>>>> (gdb) file /tmp/hello.rv32imc.x >>>>> Reading symbols from /tmp/hello.rv32imc.x... >>>>> (gdb) target extended-remote :54321 >>>>> Remote debugging using :54321 >>>>> (gdb) run >>>>> Starting program: /tmp/hello.rv32imc.x >>>>> Running the default executable on the remote target failed; try "set remote exec-file"? >>>>> (gdb) >>>>> >>>>> Though the error is not very helpful in diagnosing the problem, we can >>>>> see that GDB has not crashed, but has given the user an error. >>>>> >>>>> And so, the supports_architecture_p method is created to return true >>>>> by default, then I override this in inf_child_target, where I compare >>>>> the architecture in question with the default_bfd_arch. >>>>> >>>>> Finally, I've added calls to supports_architecture_p for the >>>>> run (which covers run, start, starti) and attach commands. >>>>> >>>>> You will notice a lack of tests for this change. I'm not sure of a >>>>> good way that I can build a binary for a different architecture as >>>>> part of a test, but if anyone has any ideas then I'll be happy to add >>>>> a test here. >>>> >>>> Have you considered multi-arch cases such as running i386 binaries on an x86-64 >>>> host or 32-bit arm binaries on an AArch64 host? Will we need to override this >>>> method in certain targets (e.g. x86-linux-nat.c or x86-fbsd-nat.c) to support >>>> such cases? >>> >>> For the x86 examples you gave, I think these all have the bfd_arch_i386 >>> bfd architecture, so should work just fine. >>> >>> But for the aarch64 case, I admit I don't know how this works. A 32-bit >>> ARM binary is going to have bfd_arch_arm, while the AArch64 target will >>> be expecting a gdbarch with bfd_arch_aarch64. But how GDB supports >>> running the former on the latter, I don't know. >>> >>> I notice there's aarch64-linux-nat.c and aarch32-linux-nat.c, I wonder >>> if this has something to do with it... >> >> aarch32 is the 32-bit state of aarch64, but the BFD architecture is >> different. So this won't work out-of-the-box. > > If I understand correctly a aarch64 binary will have bfd_arch_aarch64, > and a aarch32 (armv7?) binary will be bfd_arch_arm. > > What I don't understand, is how this all hangs together when running on > an aarch64 machine, using a native target. > > Unless I'm missing something (please let me know if I am) then there's > just a single native linux target, the_aarch64_linux_nat_target, which > is registered from _initialize_aarch64_linux_nat in > aarch64-linux-nat.c. > > In aarch64_linux_nat_target::fetch_registers, we assume the gdbarch_tdep > is of type aarch64_gdbarch_tdep, so this will only work for > bfd_arch_aarch64 binaries. > > Then there's the_arm_linux_nat_target, from arm-linux-nat.c, which has > arm_linux_nat_target::fetch_registers, which assumed gdbarch_tdep will > be of type arm_gdbarch_tdep, so will only work for bfd_arch_arm targets. > > So, I guess, if we can debug bfd_arch_arm and bfd_arch_aarch64 binaries > on a native aarch64 target, we must somehow be switching the current > native target object between these two, right? But I can't figure out > where that's happening... > >> >>> >>> Maybe someone with more ARM/AArch64 knowledge will chip in to fill in >>> some of the blanks. >> >> When attempting to run a 32-bit binary in 64-bit state, I get... > > Does this mean you're on an aarch64 machine, start GDB with an aarch32 > (armv7?) binary, and then "run" to start the process? > >> >> The target does not support architecture "armv7". > > I'd be really interested to know if you only take the first 4 patches > from this series, can you still run armv7 binaries using an aarch64 GDB? > I'm wonderinging if the assert added in patch #4 will trigger. I managed to get a setup where I'm running on an aarch64 machine, and using GDB to debug an ARMv7 binary. I'm a little confused as to what's going on, so I'm hoping I can describe what I'm seeing, and you can tell me if GDB is supposed to do better than this or not. First, I'm using a modified version of patch #4 of my series (i.e. I've applied patches #1, #2, #3, and #4, but modified #4), in gdbarch.h I've replaced the _function_ gdbarch_tdep with this: template static inline TDepType * gdbarch_tdep (struct gdbarch *gdbarch) { TDepType *tdep = dynamic_cast (gdbarch_tdep_1 (gdbarch)); if (tdep == nullptr) { fprintf (stderr, "XXX: Incorrect cast performed!\n"); return static_cast (gdbarch_tdep_1 (gdbarch)); } return tdep; } Here's my GDB session then, debugging a 32-bit ARMv7 binary; $ file /tmp/ubuntu-16.04-armhf/bin/true /tmp/ubuntu-16.04-armhf/bin/true: ELF 32-bit LSB executable, ARM, EABI5 version 1 (SYSV), dynamically linked (uses shared libs), for GNU/Linux 2.6.32, BuildID[sha1]=c80229c22d4b6b30b71ab1b1b5a1de6b86b6aadf, stripped $ ./gdb/gdb -q --data-directory ./gdb/data-directory/ /tmp/ubuntu-16.04-armhf/bin/true Reading symbols from /tmp/ubuntu-16.04-armhf/bin/true... (No debugging symbols found in /tmp/ubuntu-16.04-armhf/bin/true) (gdb) show architecture The target architecture is set to "auto" (currently "armv7"). (gdb) starti Starting program: /tmp/ubuntu-16.04-armhf/bin/true XXX: Incorrect cast performed! XXX: Incorrect cast performed! warning: Unable to determine the number of hardware watchpoints available. warning: Unable to determine the number of hardware breakpoints available. XXX: Incorrect cast performed! Program stopped. 0x0000000000000006 in ?? () (gdb) show architecture The target architecture is set to "auto" (currently "aarch64"). (gdb) pipe maintenance print xml-tdesc | head -n 20 arm (gdb) info registers x0 0x0 0 x1 0x0 0 x2 0x0 0 x3 0x0 0 x4 0x0 0 x5 0x0 0 x6 0xfffef6a000000000 -291782898221056 x7 0xf77d0b4100000000 -613321600451739648 x8 0x30 48 x9 0x8d03c8 9241544 x10 0x23cad1e8 600494568 x11 0x7fd7d9dce8 549082225896 x12 0x7fd7d9dd00 549082225920 x13 0x7fa10128d0 548162054352 x14 0x23c3d8a0 600037536 x15 0x0 0 x16 0xd32a70 13838960 x17 0xd32aa0 13839008 x18 0x0 0 x19 0x0 0 x20 0x0 0 x21 0x0 0 x22 0x0 0 x23 0x0 0 x24 0x7fd7d9dd50 549082226000 x25 0x412694 4269716 x26 0x7cdf14 8183572 x27 0x23b60ca0 599133344 x28 0x4105e0 4261344 x29 0x0 0 x30 0x7fd7d9dd50 549082226000 sp 0x23c3d8a0 0x23c3d8a0 pc 0x6 0x6 cpsr 0x6 [ EL=1 BTYPE=0 ] Unable to fetch vFP/SIMD registers.: Invalid argument. (gdb) There's a couple of weird things I notice here. First, obviously we're casting the tdep field incorrectly 3 times. This happens because, initially, the GDB inferior has a gdbarch selected based on the ELF, so it will be an ARM gdbarch with an arm_gdbarch_tdep. However, I believe that the native target will be aarch64_linux_nat_target. So, when we first stop, after the target is created, we will call aarch64_linux_nat_target::fetch_registers as part of the stop process, this will perform one of the invalid casts. I haven't (yet) tracked down the others. The next weird thing is that, after starting, the architecture has changed to aarch64. I have no idea why this is the case (I haven't looked yet). My expectation was that after the first stop GDB would ask the native target for a target description, we'd get an ARMv7 target description, and we'd then use that description to update the gdbarch. It appears I was partially correct, as you can see that the xml-tdesc is for ARM, however, the current architect still switches to aarch64, and I have no idea why that is. Finally, in the 'info registes' we appear to be displaying aarch64 register names, rather than the ARMv7 names from the xml-tdesc, which seems weird (again, I haven't looked yet), and then, at the end of the 'info registers' output we appear to try and fetch some registers that the target doens't support. If I try the same experiment using an aarch64 test binary then the architecture starts as aarch64 and remains so after the 'starti' (as you'd expect, and the 'info registers' output still displays the aarch64 names (as you'd expect), but the warning about 'Unable to fetch vfP/SIMD registers' is no longer present - so that issue is something specific relating to running the ARMv7 binary. I think there's a lot here for me to look into, but I'm hoping you might be able to let me know if me experiences above are the "expected" behaviour, or if I'm not configured/built GDB correctly, or maybe I'm just not using GDB correctly in this ARMv7/ARMv8 situation. Thanks, Andrew