Date: Wed, 6 Sep 2023 18:22:13 +0100
Subject: Re: Question on aarch64 prologue code.
To: Iain Sandoe, Richard Sandiford
Cc: GCC Development
From: "Richard Earnshaw (lists)"

On 06/09/2023 15:03, Iain Sandoe wrote:
> Hi Richard,
>
>> On 6 Sep 2023, at 13:43, Richard Sandiford via Gcc wrote:
>>
>> Iain Sandoe writes:
>
>>> On the Darwin aarch64 port, we have a number of cleanup test fails (pretty much corresponding to the [still open] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=39244). However, let’s assume that bug could be a red herring..
>>>
>>> the underlying reason is missing CFI for the set of the FP which [with Darwin’s LLVM libunwind impl.] breaks the unwind through the function that triggers a signal.
>>
>> Just curious, do you have more details about why that is?  If the unwinder
>> is sophisticated enough to process CFI, it seems odd that it requires the
>> CFA to be defined in terms of the frame pointer.
>
> Let me see if I can answer that below.
>
>
>
>>> <----
>>>
>>> I have currently worked around this by defining a TARGET_FRAME_POINTER_REQUIRED which returns true unless the function is a leaf (if that’s the correct solution, then all is fine).
>>
>> I suppose it depends on why the frame-pointer-based CFA is important
>> for Darwin.  If it's due to a more general requirement for a frame
>> pointer to be used, then yeah, that's probably the right fix.
>
> The Darwin ABI mandates a frame pointer (although it is omitted by clang for leaf functions).
>
>> If it's
>> more a quirk of the unwinder, then we could probably expose whatever
>> that quirk is as a new status bit.  Target-independent code in
>> dwarf2cfi.cc would then need to be aware as well.
>
> (I suspect) it is the interaction between the mandatory FP and the fact that GCC lays out the stack differently from the other Darwin toolchains at present [port Issue #19].
>
> For the system toolchain, 30 and 29 are always placed first, right below the SP (other callee saves are made below that in a specified order and always in pairs - presumably, with an unnecessary spill half the time) - Actually, I had a look at the weekend, but cannot find specific documentation on this particular aspect of the ABI (but, of course, the de facto ABI is what the system toolchain does, regardless of presence/absence of any such doc).
>
> However (speculation) that means that the FP is not saved where the system tools expect it, maybe that is confusing the unwinder absent the fp cfa. Of course, it could also just be an unwinder bug that is never triggered by clang’s codegen.
>
> GCC’s different layout currently defeats compact unwinding on all but leaf frames, so one day I want to fix it ...
> .. however making this change is quite heavy lifting and I think there are higher priorities for the port (so, as long as we have working unwind and no observable fallout, I am deferring that change).
>
> Note that Darwin’s ABI also has a red zone (but we have not yet made any use of this, since there is no existing aarch64 impl. and I’ve not had time to get to it). However, AFAICS that is an optimisation - we can still be correct without it.
>
>>> ----
>>>
>>> However, it does seem odd that the existing code sets up the FP, but never produces any CFA for it.
>>>
>>> So is this a possible bug, or just that I misunderstand the relevant set of circumstances?
>>
>> emit_frame_chain fulfills an ABI requirement that every non-leaf function
>> set up a frame-chain record.
>> When emit_frame_chain && !frame_pointer_needed,
>> we set up the FP for ABI purposes only.  GCC can still access everything
>> relative to the stack pointer, and it can still describe the CFI based
>> purely on the stack pointer.
>
> Thanks, that makes sense
> - I guess libunwind is never used with aarch64 linux, even in a clang/llvm toolchain.
>>
>> glibc-based systems only need the CFA to be based on the frame pointer
>> if the stack pointer moves during the body of the function (usually due
>> to alloca or VLAs).
>
> I’d have to poke more at the unwinder code and do some more debugging - it seems reasonable that it could work for any unwinder that’s based on DWARF (although, if we have completely missing unwind info, then the different stack layout would surely defeat any fallback procedure).
>

This is only a guess, but it sounds to me like the issue might be that
although we create a frame record, we don't use the frame pointer for
accessing stack variables unless SP can't be used (e.g. because the
function calls alloca()).  This tends to be more efficient because
offset addressing for SP is more flexible.

If we wanted to switch to making FP be the canonical frame address
register we'd need to change all the code gen to use FP in addressing as
well (or end up with some really messy translation when emitting debug
information).

R.

> thanks
> Iain
>
>>
>> Thanks,
>> Richard
>
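
For illustration only, here is a rough sketch of the two cases discussed above: a function whose SP stays fixed after the prologue, and one where a VLA moves SP during the body. It is not taken from the thread; the names sum_fixed and sum_dynamic and the size limits are made up, and what the compiler actually emits will depend on version and flags.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Fixed-size frame: SP does not move after the prologue, so the CFA
   can be described relative to SP for the whole body, even when a
   frame record is set up for ABI purposes. */
int sum_fixed(const int *src, size_t n)
{
    int buf[16];
    assert(n <= 16);                      /* hypothetical limit for this sketch */
    memcpy(buf, src, n * sizeof buf[0]);  /* call makes the function non-leaf */
    int total = 0;
    for (size_t i = 0; i < n; i++)
        total += buf[i];
    return total;
}

/* Variable-size frame: the VLA adjusts SP by a run-time amount, which
   is the case (alloca or VLAs) where the thread says the frame pointer
   is needed as the base for the CFA and for stack accesses. */
int sum_dynamic(const int *src, size_t n)
{
    assert(n > 0 && n <= 1024);           /* hypothetical limit; VLA size must be > 0 */
    int buf[n];
    memcpy(buf, src, n * sizeof buf[0]);
    int total = 0;
    for (size_t i = 0; i < n; i++)
        total += buf[i];
    return total;
}
```

Compiling this for an aarch64 target with something like `gcc -O2 -S` and comparing the `.cfi_def_cfa*` directives in the two functions should show the difference, assuming the optimiser keeps both buffers.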