Message-ID: <75572ae9-7c6f-4def-8147-9fbd078e007b@foss.arm.com>
Date: Mon, 27 Nov 2023 18:30:09 +0000
Subject: Re: arm-none-eabi, nested function trampolines and caching
To: David Brown, edd.robbins@gmail.com, gcc-help@gcc.gnu.org
Cc: Ed Robbins
References:
 <36da6a00-6b2f-4628-b7ae-d7190a691c46@hesbynett.no>
From: Richard Earnshaw
In-Reply-To: <36da6a00-6b2f-4628-b7ae-d7190a691c46@hesbynett.no>

On 27/11/2023 17:23, David Brown wrote:
> On 27/11/2023 16:16, Ed Robbins via Gcc-help wrote:
>> Hello,
>> I am using gcc-arm-none-eabi with a cortex M7 device, with caches
>> (data/instruction) enabled. Nested function calls result in a usage
>> fault because there is no clear cache call for this platform.
>>
>
> I am not sure I understand you here.  Are you talking about trying to
> use gcc nested function extensions, implemented by trampolines (small
> function stubs on the stack)?  If so, then the simple answer is - don't
> do that.  It's a /really/ bad idea.  As far as I understand it, these
> are a left-over from the way nested functions were originally
> implemented in other gcc languages (Pascal, Ada, Modula-2), which now
> handle things differently and far more efficiently.  Trampolines were a
> convenient way to implement nested functions some 30 years ago, before
> caches were the norm, before anyone thought about security, before
> processors had prefetching, and before people realised what an
> appallingly bad idea self-modifying code is.
>
> If you want to use nested functions, use a language that supports nested
> functions, such as Ada, or use C++ with lambdas (which are a bit like
> nested functions only much better).
>
>> Is there a way to provide the required functions without rebuilding
>> gcc? I have been looking at the source and, as far as I can tell,
>> there is not.
>
> I can think of at least four ways :
>
> 1. The SDK for your microcontroller, provided by the manufacturer, will
> have headers with cache clear functions.
>
> 2. The ARM CMSIS headers - also available from your manufacturer - have
> intrinsic functions, including cache clear functions.
>
> 3. gcc has a generic "__builtin___clear_cache" function :
>
> 4. gcc supports the "ARM C Language Extensions", which include cache
> control intrinsics:
>

I completely agree with David's comments about nested functions.  Don't
do it!

Cleaning the D-caches from user level on Arm is practically impossible
if there is no "OS" support; flushing the I-cache is equally difficult.
This includes M-profile devices with secure and non-secure code, where
only secure code can execute the cache management operations.  The same
is true for some, if not all, A-profile devices as well.

Looking at the compiler sources, the __clear_cache builtin is only
implemented for Linux, and even there it calls the kernel to do the work.

ACLE does not define a clear cache intrinsic operation (as far as I can
see).  It does provide some of the primitives needed for a cache clear,
such as __dmb() and __isb(), but on their own these are not enough.

CMSIS does appear to provide some primitives (SCB_CleanDCache_by_Addr
and SCB_InvalidateICache_by_Addr), but these will directly invoke the
relevant secure-mode primitives.  If you want them in non-secure mode,
you'll need to export a suitable API from your secure code and then
arrange to use that.  The compiler knows nothing about CMSIS, so this
isn't much help for trampolines, I'm afraid.

Microcontroller SDKs are almost certain to face similar issues, since
the root issue is the same: you can't do this from non-secure mode.

R.

>
>>
>> But there also doesn't look to be a clean way to implement this: It
>> appears that this is done on an operating system basis, and when
>> running bare metal it is not clear where the code would live.
>
> There is no "clean" way to handle the appropriate cache invalidation,
> because there is no clean way to get the addresses you need for
> invalidating the instruction cache.  (Cleanly invalidating the
> instruction cache for other purposes, such as during firmware upgrades,
> is no problem.)
>
>>
>> There are also at least two approaches to solve it, I guess:
>> 1. Somehow indicate on the command line (via target or a dedicated
>> option) to emit the clear cache call for cortex M, and I guess that
>> the function itself should do nothing if both caches are disabled.
>> 2. Define hooks or provide a command line option so that developers
>> can provide an implementation for their platform?
>>
>> Assuming I were to do this the improper way (and just create a build
>> that works only for my particular target): Where should I define
>> CLEAR_INSN_CACHE?
>>
>> I am not sure if there is already a way to do all this that I am just
>> unaware of?
>>
>
> Seriously - don't use nested functions in C.  Even if you get them
> working, it would be painfully inefficient.  You'd have to flush parts
> of the data cache (to make sure the stack data is written out to main
> memory), taking time.  You'd then have to invalidate the relevant parts
> of the instruction cache.  (Even calculating what parts of these caches
> to clear will take time and effort.)  Then everything needs to be read
> into the caches again to actually execute the function.
>
> And what's the point?  So that you can write :
>
>     void foo(...) {
>         int bar(...) {
>             ...
>         }
>         bar();
>     }
>
> instead of
>
>     static int foo_bar(...) {
>         ...
>     }
>     void foo(...) {
>         foo_bar();
>     }
>
> or
>
>     void foo(...) {
>         auto bar = [](...) {
>             ...
>         };
>         bar();
>     }
>
> ?
>
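
The static-function alternative quoted above also covers the one case
that actually needs a trampoline: a nested function that reads or writes
locals of its parent and is passed around as a callback.  A minimal
sketch in plain C, assuming the captured state can be handed over
through an explicit context pointer.  The names sum_ctx, add_to_sum,
for_each and sum_items are hypothetical and exist only for this
illustration; nothing here comes from gcc or CMSIS.

```c
#include <stddef.h>

/* Hypothetical context struct: holds what a nested function would
   otherwise capture from its enclosing function's stack frame. */
struct sum_ctx {
    int total;
};

/* File-scope callback.  Its state arrives through an explicit pointer,
   so no trampoline (and no code written to the stack) is involved. */
static void add_to_sum(int value, void *ctx)
{
    struct sum_ctx *s = ctx;
    s->total += value;
}

/* Hypothetical iteration helper that takes a callback plus its context. */
static void for_each(const int *items, size_t n,
                     void (*fn)(int, void *), void *ctx)
{
    for (size_t i = 0; i < n; i++)
        fn(items[i], ctx);
}

/* Where a nested function would have updated a local of sum_items()
   directly, that local lives in the context struct instead. */
int sum_items(const int *items, size_t n)
{
    struct sum_ctx ctx = { 0 };
    for_each(items, n, add_to_sum, &ctx);
    return ctx.total;
}
```

Calling sum_items() on the array { 1, 2, 3, 4 } returns 10.  Because
the callback is an ordinary function at a fixed address, no instructions
are generated at run time and no cache maintenance is needed, at the
cost of having to pass the context pointer by hand.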