From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <aburgess@redhat.com>
Received: from us-smtp-delivery-124.mimecast.com
 (us-smtp-delivery-124.mimecast.com [170.10.129.124])
 by sourceware.org (Postfix) with ESMTPS id 0F26B3841446
 for <gdb-patches@sourceware.org>; Mon,  6 Jun 2022 17:49:01 +0000 (GMT)
DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 0F26B3841446
Received: from mail-wr1-f71.google.com (mail-wr1-f71.google.com
 [209.85.221.71]) by relay.mimecast.com with ESMTP with STARTTLS
 (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id
 us-mta-393-nF2QKYlcOnWb2QyjlvRwKw-1; Mon, 06 Jun 2022 13:48:59 -0400
X-MC-Unique: nF2QKYlcOnWb2QyjlvRwKw-1
Received: by mail-wr1-f71.google.com with SMTP id
 h2-20020adfe982000000b002102da95c71so3038704wrm.23
 for <gdb-patches@sourceware.org>; Mon, 06 Jun 2022 10:48:59 -0700 (PDT)
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=1e100.net; s=20210112;
 h=x-gm-message-state:from:to:subject:in-reply-to:references:date
 :message-id:mime-version;
 bh=//O/5+ZwZ9nt018CE8mYRxZ0/r9g1CjyKIhBHGMsiJc=;
 b=JX2+EHo3hqi3KtCGA54qcwIAyz0opZMvBuF+FRSSlWL2pR4bg0YLPGO2VPMpc4n2JJ
 olraanaMiKcCKsfn7ggBddTE6/54iBSwk7WOUMfb/tSn+mmbXFaXN8q8UcD4eRQJU4QE
 PIh3WhB8Y2E28sblv28qB9ZT/tu6e1doq2dQBqGwNjvsDUUJMGpv0ZKDjZ/ikWqubssT
 u3gq7F8lUAWuFHkcfVkrGCM9ow0SZV7WeFGmQZbm379OdZZ/lrzU3g1sLu+wF9DIr1FW
 307zyvcRNGG80qfyk/X1P1Sff/tjParVofBZDWVKssTXsbqkMmAn4RnJkx+QG1YCZcCG
 P9Gg==
X-Gm-Message-State: AOAM533NM68FxctzBvIQ+IyXi7U6ajAiRKq9/5N+kNodObBfprD0frRw
 yS9QTUVY6Fi2/DB98SwetQGf5+XpC3H9J8afY/Z1LBoLsuPINt/0hHfq33JmeNrgzHTcOpg1oTE
 kLyu8U7s3mvsEhv/9Qp7IOA==
X-Received: by 2002:a05:600c:154d:b0:394:8d64:9166 with SMTP id
 f13-20020a05600c154d00b003948d649166mr24132948wmg.102.1654537737796; 
 Mon, 06 Jun 2022 10:48:57 -0700 (PDT)
X-Google-Smtp-Source: ABdhPJz8jJHAqp3WrVQEyKot22W5/eqgcpv1wutK9gDY4c8JeBqaMZReviXtYNzdMj1plq8NZJFcMQ==
X-Received: by 2002:a05:600c:154d:b0:394:8d64:9166 with SMTP id
 f13-20020a05600c154d00b003948d649166mr24132918wmg.102.1654537737447; 
 Mon, 06 Jun 2022 10:48:57 -0700 (PDT)
Received: from localhost (host109-152-215-36.range109-152.btcentralplus.com.
 [109.152.215.36]) by smtp.gmail.com with ESMTPSA id
 u12-20020a05600c210c00b003942a244f39sm15407555wml.18.2022.06.06.10.48.56
 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256);
 Mon, 06 Jun 2022 10:48:56 -0700 (PDT)
From: Andrew Burgess <aburgess@redhat.com>
To: Luis Machado <luis.machado@arm.com>, John Baldwin <jhb@FreeBSD.org>,
 gdb-patches@sourceware.org
Subject: Re: [PATCH 5/5] gdb: native target invalid architecture detection
In-Reply-To: <87ilpdhn73.fsf@redhat.com>
References: <cover.1654007211.git.aburgess@redhat.com>
 <fccb395397948ad99bc7024b85c93cb7c3d0a1e0.1654007211.git.aburgess@redhat.com>
 <71a986a5-2cfa-543e-4034-70f3af7dfecf@FreeBSD.org>
 <87ee09d4rt.fsf@redhat.com> <09afe250-9573-45e1-993b-a2f911f03630@arm.com>
 <87ilpdhn73.fsf@redhat.com>
Date: Mon, 06 Jun 2022 18:48:55 +0100
Message-ID: <87ee01hed4.fsf@redhat.com>
MIME-Version: 1.0
X-Mimecast-Spam-Score: 0
X-Mimecast-Originator: redhat.com
Content-Type: text/plain
X-Spam-Status: No, score=-4.9 required=5.0 tests=BAYES_00, DKIMWL_WL_HIGH,
 DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF,
 RCVD_IN_BARRACUDACENTRAL, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_NONE, TXREP,
 T_SCC_BODY_TEXT_LINE autolearn=no autolearn_force=no version=3.4.6
X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on
 server2.sourceware.org
X-BeenThere: gdb-patches@sourceware.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Gdb-patches mailing list <gdb-patches.sourceware.org>
List-Unsubscribe: <https://sourceware.org/mailman/options/gdb-patches>,
 <mailto:gdb-patches-request@sourceware.org?subject=unsubscribe>
List-Archive: <https://sourceware.org/pipermail/gdb-patches/>
List-Post: <mailto:gdb-patches@sourceware.org>
List-Help: <mailto:gdb-patches-request@sourceware.org?subject=help>
List-Subscribe: <https://sourceware.org/mailman/listinfo/gdb-patches>,
 <mailto:gdb-patches-request@sourceware.org?subject=subscribe>
X-List-Received-Date: Mon, 06 Jun 2022 17:49:03 -0000

Andrew Burgess <aburgess@redhat.com> writes:

> Luis Machado via Gdb-patches <gdb-patches@sourceware.org> writes:
>
>> On 5/31/22 17:51, Andrew Burgess via Gdb-patches wrote:
>>> John Baldwin <jhb@FreeBSD.org> writes:
>>> 
>>>> On 5/31/22 7:30 AM, Andrew Burgess via Gdb-patches wrote:
>>>>> If GDB is asked to start a new inferior, or attach to an existing
>>>>> process, using a binary file for an architecture that does not match
>>>>> the current native target, then, currently, GDB will assert.  Here's
>>>>> an example session using current HEAD of master with GDB built for an
>>>>> x86-64 GNU/Linux native target, the binary being used is a RISC-V ELF:
>>>>>
>>>>>     $ ./gdb/gdb -q --data-directory ./gdb/data-directory/
>>>>>     (gdb) file /tmp/hello.rv32imc.x
>>>>>     Reading symbols from /tmp/hello.rv32imc.x...
>>>>>     (gdb) start
>>>>>     Temporary breakpoint 1 at 0x101b2: file hello.rv32.c, line 23.
>>>>>     Starting program: /tmp/hello.rv32imc.x
>>>>>     ../../src/gdb/gdbarch.h:166: internal-error: gdbarch_tdep: Assertion `dynamic_cast<TDepType *> (tdep) != nullptr' failed.
>>>>>     A problem internal to GDB has been detected,
>>>>>     further debugging may prove unreliable.
>>>>>
>>>>> The same error is encountered if, instead of starting a new inferior,
>>>>> the user tries to attach to an x86-64 process with a RISC-V binary set
>>>>> as the current executable.
>>>>>
>>>>> These errors are not specific to the x86-64/RISC-V pairing I'm using
>>>>> here, any attempt to use a binary for one architecture with a native
>>>>> target of a different architecture will result in a similar error.
>>>>>
>>>>> Clearly, attempting to use this cross-architecture combination is a
>>>>> user error, but I think GDB should do better than an assert; ideally a
>>>>> nice error should be printed.
>>>>>
>>>>> The problem we run into is that, when the user starts a new inferior,
>>>>> or attaches to an inferior, the inferior stops.  At this point GDB
>>>>> attempts to handle the stop, and this involves reading registers from
>>>>> the inferior.
>>>>>
>>>>> These register reads end up being done through the native target, so
>>>>> in the example above, we end up in the amd64_supply_fxsave function.
>>>>> However, these functions need a gdbarch.  The gdbarch is fetched from
>>>>> the register set, which was constructed using the gdbarch from the
>>>>> binary currently in use.  And so we end up in amd64_supply_fxsave
>>>>> using a RISC-V gdbarch.
>>>>>
>>>>> When we call:
>>>>>
>>>>>     i386_gdbarch_tdep *tdep = gdbarch_tdep<i386_gdbarch_tdep> (gdbarch);
>>>>>
>>>>> this will assert as the gdbarch_tdep data within the RISC-V gdbarch is
>>>>> of the type riscv_gdbarch_tdep not i386_gdbarch_tdep.
>>>>>
>>>>> The solution I propose in this commit is to add a new target_ops
>>>>> method supports_architecture_p.  This method will return true if a
>>>>> target can safely be used with a specific architecture, otherwise, the
>>>>> method returns false.
>>>>>
>>>>> I imagine that a result of true from this method doesn't guarantee
>>>>> that GDB can start an inferior of a given architecture, it just means
>>>>> that GDB will not crash if such an attempt is made.  A result of false
>>>>> is a hard stop; attempting to use this target with this architecture
>>>>> is not supported, and may cause GDB to crash.
>>>>>
>>>>> This distinction is important I think for things like remote targets,
>>>>> and possibly simulator targets.  We might imagine that GDB can ask a
>>>>> remote (or simulator) to start with a particular executable, and the
>>>>> target might still refuse for some reason.  But my thinking is that
>>>>> these refusals should be well handled (i.e. GDB should give a user
>>>>> friendly error), rather than crashing, as is the case with the native
>>>>> targets.
>>>>>
>>>>> For example, if I start gdbserver on an x86-64 machine like this:
>>>>>
>>>>>     gdbserver --multi :54321
>>>>>
>>>>> Then make use of this from a GDB session like this:
>>>>>
>>>>>     $ ./gdb/gdb -q --data-directory ./gdb/data-directory/
>>>>>     (gdb) file /tmp/hello.rv32imc.x
>>>>>     Reading symbols from /tmp/hello.rv32imc.x...
>>>>>     (gdb) target extended-remote :54321
>>>>>     Remote debugging using :54321
>>>>>     (gdb) run
>>>>>     Starting program: /tmp/hello.rv32imc.x
>>>>>     Running the default executable on the remote target failed; try "set remote exec-file"?
>>>>>     (gdb)
>>>>>
>>>>> Though the error is not very helpful in diagnosing the problem, we can
>>>>> see that GDB has not crashed, but has given the user an error.
>>>>>
>>>>> And so, the supports_architecture_p method is created to return true
>>>>> by default, then I override this in inf_child_target, where I compare
>>>>> the architecture in question with the default_bfd_arch.
>>>>>
>>>>> Finally, I've added calls to supports_architecture_p for the
>>>>> run (which covers run, start, starti) and attach commands.
>>>>>
>>>>> You will notice a lack of tests for this change.  I'm not sure of a
>>>>> good way that I can build a binary for a different architecture as
>>>>> part of a test, but if anyone has any ideas then I'll be happy to add
>>>>> a test here.
>>>>
>>>> Have you considered multi-arch cases such as running i386 binaries on an x86-64
>>>> host or 32-bit arm binaries on an AArch64 host?  Will we need to override this
>>>> method in certain targets (e.g. x86-linux-nat.c or x86-fbsd-nat.c) to support
>>>> such cases?
>>> 
>>> For the x86 examples you gave, I think these all have the bfd_arch_i386
>>> bfd architecture, so should work just fine.
>>> 
>>> But for the aarch64 case, I admit I don't know how this works.  A 32-bit
>>> ARM binary is going to have bfd_arch_arm, while the AArch64 target will
>>> be expecting a gdbarch with bfd_arch_aarch64.  But how GDB supports
>>> running the former on the latter, I don't know.
>>> 
>>> I notice there's aarch64-linux-nat.c and aarch32-linux-nat.c, I wonder
>>> if this has something to do with it...
>>
>> aarch32 is the 32-bit state of aarch64, but the BFD architecture is
>> different. So this won't work out-of-the-box.
>
> If I understand correctly a aarch64 binary will have bfd_arch_aarch64,
> and a aarch32 (armv7?) binary will be bfd_arch_arm.
>
> What I don't understand, is how this all hangs together when running on
> an aarch64 machine, using a native target.
>
> Unless I'm missing something (please let me know if I am) then there's
> just a single native linux target, the_aarch64_linux_nat_target, which
> is registered from _initialize_aarch64_linux_nat in
> aarch64-linux-nat.c.
>
> In aarch64_linux_nat_target::fetch_registers, we assume the gdbarch_tdep
> is of type aarch64_gdbarch_tdep, so this will only work for
> bfd_arch_aarch64 binaries.
>
> Then there's the_arm_linux_nat_target, from arm-linux-nat.c, which has
> arm_linux_nat_target::fetch_registers, which assumed gdbarch_tdep will
> be of type arm_gdbarch_tdep, so will only work for bfd_arch_arm targets.
>
> So, I guess, if we can debug bfd_arch_arm and bfd_arch_aarch64 binaries
> on a native aarch64 target, we must somehow be switching the current
> native target object between these two, right?  But I can't figure out
> where that's happening...
>
>>
>>> 
>>> Maybe someone with more ARM/AArch64 knowledge will chip in to fill in
>>> some of the blanks.
>>
>> When attempting to run a 32-bit binary in 64-bit state, I get...
>
> Does this mean you're on an aarch64 machine, start GDB with an aarch32
> (armv7?) binary, and then "run" to start the process?
>
>>
>> The target does not support architecture "armv7".
>
> I'd be really interested to know if you only take the first 4 patches
> from this series, can you still run armv7 binaries using an aarch64 GDB?
> I'm wonderinging if the assert added in patch #4 will trigger.

I managed to get a setup where I'm running on an aarch64 machine, and
using GDB to debug an ARMv7 binary.

I'm a little confused as to what's going on, so I'm hoping I can
describe what I'm seeing, and you can tell me if GDB is supposed to do
better than this or not.

First, I'm using a modified version of patch #4 of my series (i.e. I've
applied patches #1, #2, #3, and #4, but modified #4), in gdbarch.h I've
replaced the _function_ gdbarch_tdep with this:

  template<typename TDepType>
  static inline TDepType *
  gdbarch_tdep (struct gdbarch *gdbarch)
  {
    TDepType *tdep = dynamic_cast<TDepType *> (gdbarch_tdep_1 (gdbarch));
    if (tdep == nullptr)
      {
        fprintf (stderr, "XXX: Incorrect cast performed!\n");
        return static_cast<TDepType *> (gdbarch_tdep_1 (gdbarch));
      }
    return tdep;
  }

Here's my GDB session then, debugging a 32-bit ARMv7 binary;

  $ file /tmp/ubuntu-16.04-armhf/bin/true
  /tmp/ubuntu-16.04-armhf/bin/true: ELF 32-bit LSB  executable, ARM, EABI5 version 1 (SYSV), dynamically linked (uses shared libs), for GNU/Linux 2.6.32, BuildID[sha1]=c80229c22d4b6b30b71ab1b1b5a1de6b86b6aadf, stripped
  $ ./gdb/gdb -q --data-directory ./gdb/data-directory/ /tmp/ubuntu-16.04-armhf/bin/true
  Reading symbols from /tmp/ubuntu-16.04-armhf/bin/true...
  (No debugging symbols found in /tmp/ubuntu-16.04-armhf/bin/true)
  (gdb) show architecture 
  The target architecture is set to "auto" (currently "armv7").
  (gdb) starti
  Starting program: /tmp/ubuntu-16.04-armhf/bin/true 
  XXX: Incorrect cast performed!
  XXX: Incorrect cast performed!
  warning: Unable to determine the number of hardware watchpoints available.
  warning: Unable to determine the number of hardware breakpoints available.
  XXX: Incorrect cast performed!
  
  Program stopped.
  0x0000000000000006 in ?? ()
  (gdb) show architecture 
  The target architecture is set to "auto" (currently "aarch64").
  (gdb) pipe maintenance print xml-tdesc | head -n 20
  <?xml version="1.0"?>
  <!DOCTYPE target SYSTEM "gdb-target.dtd">
  <target>
    <architecture>arm</architecture>
    <feature name="org.gnu.gdb.arm.core">
      <reg name="r0" bitsize="32" type="uint32" regnum="0"/>
      <reg name="r1" bitsize="32" type="uint32" regnum="1"/>
      <reg name="r2" bitsize="32" type="uint32" regnum="2"/>
      <reg name="r3" bitsize="32" type="uint32" regnum="3"/>
      <reg name="r4" bitsize="32" type="uint32" regnum="4"/>
      <reg name="r5" bitsize="32" type="uint32" regnum="5"/>
      <reg name="r6" bitsize="32" type="uint32" regnum="6"/>
      <reg name="r7" bitsize="32" type="uint32" regnum="7"/>
      <reg name="r8" bitsize="32" type="uint32" regnum="8"/>
      <reg name="r9" bitsize="32" type="uint32" regnum="9"/>
      <reg name="r10" bitsize="32" type="uint32" regnum="10"/>
      <reg name="r11" bitsize="32" type="uint32" regnum="11"/>
      <reg name="r12" bitsize="32" type="uint32" regnum="12"/>
      <reg name="sp" bitsize="32" type="data_ptr" regnum="13"/>
      <reg name="lr" bitsize="32" type="int" regnum="14"/>
  (gdb) info registers 
  x0             0x0                 0
  x1             0x0                 0
  x2             0x0                 0
  x3             0x0                 0
  x4             0x0                 0
  x5             0x0                 0
  x6             0xfffef6a000000000  -291782898221056
  x7             0xf77d0b4100000000  -613321600451739648
  x8             0x30                48
  x9             0x8d03c8            9241544
  x10            0x23cad1e8          600494568
  x11            0x7fd7d9dce8        549082225896
  x12            0x7fd7d9dd00        549082225920
  x13            0x7fa10128d0        548162054352
  x14            0x23c3d8a0          600037536
  x15            0x0                 0
  x16            0xd32a70            13838960
  x17            0xd32aa0            13839008
  x18            0x0                 0
  x19            0x0                 0
  x20            0x0                 0
  x21            0x0                 0
  x22            0x0                 0
  x23            0x0                 0
  x24            0x7fd7d9dd50        549082226000
  x25            0x412694            4269716
  x26            0x7cdf14            8183572
  x27            0x23b60ca0          599133344
  x28            0x4105e0            4261344
  x29            0x0                 0
  x30            0x7fd7d9dd50        549082226000
  sp             0x23c3d8a0          0x23c3d8a0
  pc             0x6                 0x6
  cpsr           0x6                 [ EL=1 BTYPE=0 ]
  Unable to fetch vFP/SIMD registers.: Invalid argument.
  (gdb) 

There's a couple of weird things I notice here.  First, obviously we're
casting the tdep field incorrectly 3 times.  This happens because,
initially, the GDB inferior has a gdbarch selected based on the ELF, so
it will be an ARM gdbarch with an arm_gdbarch_tdep.  However, I believe
that the native target will be aarch64_linux_nat_target.

So, when we first stop, after the target is created, we will call
aarch64_linux_nat_target::fetch_registers as part of the stop process,
this will perform one of the invalid casts.  I haven't (yet) tracked
down the others.

The next weird thing is that, after starting, the architecture has
changed to aarch64.  I have no idea why this is the case (I haven't
looked yet).

My expectation was that after the first stop GDB would ask the native
target for a target description, we'd get an ARMv7 target description,
and we'd then use that description to update the gdbarch.  It appears I
was partially correct, as you can see that the xml-tdesc is for ARM,
however, the current architect still switches to aarch64, and I have no
idea why that is.

Finally, in the 'info registes' we appear to be displaying aarch64
register names, rather than the ARMv7 names from the xml-tdesc, which
seems weird (again, I haven't looked yet), and then, at the end of the
'info registers' output we appear to try and fetch some registers that
the target doens't support.

If I try the same experiment using an aarch64 test binary then the
architecture starts as aarch64 and remains so after the 'starti' (as
you'd expect, and the 'info registers' output still displays the aarch64
names (as you'd expect), but the warning about 'Unable to fetch vfP/SIMD
registers' is no longer present - so that issue is something specific
relating to running the ARMv7 binary.

I think there's a lot here for me to look into, but I'm hoping you might
be able to let me know if me experiences above are the "expected"
behaviour, or if I'm not configured/built GDB correctly, or maybe I'm
just not using GDB correctly in this ARMv7/ARMv8 situation.

Thanks,
Andrew