public inbox for gcc-help@gcc.gnu.org
* Why does this unrolled function write to the stack?
@ 2023-02-08 13:29 Gaelan Steele
  2023-02-08 13:49 ` Jonathan Wakely
From: Gaelan Steele @ 2023-02-08 13:29 UTC
  To: gcc-help

Hi all,

In a computer architecture class, we happened across a strange compilation choice by GCC that neither I nor my professor can make much sense of. The source is as follows:

void foo(int *a, const int *__restrict b, const int *__restrict c)
{
  for (int i = 0; i < 16; i++) {
    a[i] = b[i] + c[i];
  }
}

I won't reproduce the full compiled output here, as it's rather long, but when compiled with -O3 -mno-avx -mno-sse, GCC 12.2 for x86-64 (via Compiler Explorer: https://godbolt.org/z/o9e4o7cj4) produces an unrolled loop that appears to write each sum into an array on the stack before copying it into the provided pointer a. This seems hugely inefficient - it's doing quite a few memory accesses - and I can't see why it would be necessary.

Am I missing some reason why this is more efficient than the naive approach (computing each sum into an intermediate register, then writing it directly into a)?
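
To make the comparison concrete, here is a rough C sketch of the two shapes I have in mind. The names foo_direct and foo_via_stack_buffer are hypothetical, and the second function is only my approximation of the data flow the generated assembly appears to follow, not a transcription of it:

```c
#include <string.h>

/* Naive shape: each sum is held in a register and stored straight into a[i]. */
void foo_direct(int *a, const int *__restrict b, const int *__restrict c)
{
  for (int i = 0; i < 16; i++) {
    int sum = b[i] + c[i];
    a[i] = sum;
  }
}

/* Approximation of the observed shape (an assumption, not the actual output):
   every sum is first stored into a local array, which is then copied into a. */
void foo_via_stack_buffer(int *a, const int *__restrict b, const int *__restrict c)
{
  int tmp[16];
  for (int i = 0; i < 16; i++) {
    tmp[i] = b[i] + c[i];
  }
  memcpy(a, tmp, sizeof tmp);
}
```

Both functions should leave the same values in a for any valid call; the second just routes every result through an extra store and load.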

Thanks,
Gaelan



Thread overview: 5+ messages
2023-02-08 13:29 Why does this unrolled function write to the stack? Gaelan Steele
2023-02-08 13:49 ` Jonathan Wakely
2023-02-08 13:53   ` Jonathan Wakely
2023-02-08 15:32     ` David Brown
2023-02-08 18:52       ` Gaelan Steele
