Re: arm-none-eabi, nested function trampolines and caching

public inbox for gcc-help@gcc.gnu.org

From: Matthias Pfaller <leo@marco.de>
To: David Brown <david.brown@hesbynett.no>,
	edd.robbins@gmail.com, gcc-help@gcc.gnu.org
Cc: Ed Robbins <edd.robbins@googlemail.com>
Subject: Re: arm-none-eabi, nested function trampolines and caching
Date: Wed, 29 Nov 2023 13:33:34 +0100
Message-ID: <b6712fa5-4275-5c71-5b95-1c06295ffcc4@marco.de>
In-Reply-To: <f87b0625-2722-3fd2-becf-d81d5c5be690@hesbynett.no>

On 2023-11-29 12:52, David Brown wrote:
> On 29/11/2023 08:50, Matthias Pfaller wrote:
>> On 2023-11-28 19:00, David Brown wrote:
>>  > Can I ask (either or both of you) why you are using nested functions like
>>  > this?  This is possibly the first time I have heard of anyone using them, certainly
>>  > the first time in embedded development. Even when I programmed in Pascal, where
>>  > nested functions are part of the language, I did not use them more than a couple of
>>  > times.
>>  >
>>  > What benefit do you see in nested functions in C, compared to having separate
>>  > functions?  Have you considered moving to C++ and using lambdas, which are more
>>  > flexible, standard, and can be very much more efficient?
>>  >
>>  > This is, of course, straying from the topicality of this mailing list. But I am
>>  > curious, and I doubt if I am the only one who is.
>>
>> - I'm maintaining our token threaded Forth interpreter. In the inner loop there is
>> an absurdly big switch for the primitives. I'm loading rp, sp and tos into local
>> variables. Pushing, popping and memory access are done by nested functions
>> (checking for stack overflows and underflows, managing tos, access violations, ...). Of
>> course that could be done by macros. But when I'm calling C functions from within
>> the switch I'll sometimes pass pointers to the local functions (e.g. for
>> catch/throw exception handling).
>>
>> - When calling list iterators, I'm sometimes passing references to nested functions
>>
>> - When locking is necessary and the function has multiple return points I'm doing 
>> something like:
>>
>> void somefunction(void)
>> {
>>    void f(void)
>>    {
>>       ...
>>    }
>>    lock();
>>    f();
>>    unlock();
>> }
>>
>> I know, in a lot of cases I could just define some outer static function or use 
>> gotos. But to my eye it just looks nicer that way. In most cases there will be no 
>> trampoline necessary anyway. It's not used that often and we could probably get rid
>> of it in most cases by using macros and ({ ... }).
>>
> 
> Thanks for that.
> 
> I can appreciate that local functions can look nicer than macros or goto spaghetti.  
> In simple cases (which is probably the majority for your usage), the local functions 
> will be inlined and will give pretty much exactly the same code as you'd get for 
> macros, outer static functions, or other methods.  But I'd be very unhappy to see 
> trampolines here, as you will need for more complicated cases.  The overheads are not 
> something you'd want to see in the inner loop of an interpreter.
> 
> AFAIUI, the reason the compiler has to generate trampolines here is to make a 
> function that has access to some of the local variables, while being shoe-horned into 
> the appearance of a function with parameters that don't include any extra values or 
> references.  If you were, as an alternative, to switch to C++ and use lambdas instead 
> of nested functions, that all disappears precisely because lambdas do not have to be
> forced to match the function signature - the generated lambda can take extra hidden 
> parameters (and even extra hidden state) as needed.
> 
> Of course it's never easy to change these kinds of things in existing code.  And it 
> is particularly difficult to get solutions that work efficiently on a wide range of 
> compilers or versions.
> 
> David

We are currently using two microcontrollers with cache. The at91sam4e is a
Cortex-M4 device with two kilobytes of unified I/D cache. Because the cache is
unified, it only has to be considered when using DMA.

The atsame7x/atsamv7x series are Cortex-M7 devices with 16k i-cache and 16k d-cache.
Here you do have to worry about i-cache invalidates. Evicting a single i-cache line
(the trampoline code is small) doesn't hurt too much, especially if it happens very
seldom (it's not like every function passes pointers to nested functions...).

Besides that, at 300MHz (or 120MHz for the Cortex-M4) the core is more than fast
enough for our applications. The 384k of RAM on the atsam[ev]7x and the 128k of RAM
on the at91sam4e are a lot more of a hindrance...

I'm aware of the reason for the trampoline code, and so I know that (as you wrote)
in the majority of cases no trampoline is needed (because no outer arguments or
variables are referenced, or because no pointer to the nested function is passed
to other functions).
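
A minimal sketch of the two cases, assuming GCC's nested-function extension
(GNU C, not ISO C). The names for_each, sum_direct and sum_via_pointer are
made up for illustration and are not from our code base:

```c
#include <stddef.h>

/* Hypothetical helper that only knows about plain function pointers. */
void for_each(const int *items, size_t n, void (*fn)(int))
{
    for (size_t i = 0; i < n; i++)
        fn(items[i]);
}

/* The nested function uses the outer variable 'total' but is only called
 * directly. The compiler can hand over the static chain itself, so no
 * trampoline should be required. */
int sum_direct(const int *items, size_t n)
{
    int total = 0;

    void add(int v) { total += v; }

    for (size_t i = 0; i < n; i++)
        add(items[i]);
    return total;
}

/* The nested function uses 'total' AND its address is passed to another
 * function. Here GCC generally has to build a trampoline (by default on
 * the stack) so that 'add' is callable through a plain void (*)(int). */
int sum_via_pointer(const int *items, size_t n)
{
    int total = 0;

    void add(int v) { total += v; }

    for_each(items, n, add);
    return total;
}
```

Both functions return the same sum; only the second one is expected to
involve trampoline code, and the optimizer may remove it if for_each gets
inlined.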

In the cases where the trampoline code is needed I'm willing to take the performance 
hit in exchange for the gain I get.

For example, in the interpreter inner loop I would need to pass along all kinds of
state every time I call an external function that needs access to local interpreter
state. If I just pass a pointer to a callback function (which then accesses the
local state), there is much less opportunity for errors... In most cases
passing the callback is not necessary anyway.
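
For comparison, a rough sketch of what the explicit alternative looks like
in plain ISO C: the state is bundled into a struct and handed to every
callee together with the callback. The struct vm_state, vm_push and
emit_range are hypothetical names, not our actual interpreter code:

```c
#include <stddef.h>

/* Hypothetical interpreter state; fields are illustrative only. */
struct vm_state {
    int stack[16];
    size_t depth;   /* number of cells currently in use */
    int error;      /* set to 1 when a push would overflow */
};

/* Callback type: the context pointer has to be threaded through by hand. */
typedef void (*push_fn)(void *ctx, int value);

/* Push with overflow check; all state is reached via the context. */
void vm_push(void *ctx, int value)
{
    struct vm_state *vm = ctx;

    if (vm->depth >= sizeof vm->stack / sizeof vm->stack[0]) {
        vm->error = 1;
        return;
    }
    vm->stack[vm->depth++] = value;
}

/* Hypothetical external function: pushes 0 .. count-1 via the callback.
 * Every such function needs the extra ctx parameter. */
void emit_range(int count, push_fn push, void *ctx)
{
    for (int i = 0; i < count; i++)
        push(ctx, i);
}
```

This avoids trampolines entirely, but every external function has to carry
the extra context parameter, and the void pointer is not type checked.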

Matthias

