public inbox for gcc-help@gcc.gnu.org
From: Jonathan Wakely <jwakely.gcc@gmail.com>
To: Gaelan Steele <gbs3@st-andrews.ac.uk>
Cc: "gcc-help@gcc.gnu.org" <gcc-help@gcc.gnu.org>
Subject: Re: Why does this unrolled function write to the stack?
Date: Wed, 8 Feb 2023 13:53:44 +0000
Message-ID: <CAH6eHdR5GcWckSWrSrgTbL=T3CFTv5NiciuVA=6j6As6FJiyDg@mail.gmail.com>
In-Reply-To: <CAH6eHdQeS_D_6tM9cgD-yNZnHuRYDzoFtpo=+DZweT4Lm1iHKw@mail.gmail.com>

On Wed, 8 Feb 2023 at 13:49, Jonathan Wakely <jwakely.gcc@gmail.com> wrote:
>
> On Wed, 8 Feb 2023 at 13:31, Gaelan Steele via Gcc-help
> <gcc-help@gcc.gnu.org> wrote:
> >
> > Hi all,
> >
> > In a computer architecture class, we happened across a strange compilation choice by GCC that neither I nor my professor can make much sense of. The source is as follows:
> >
> > void foo(int *a, const int *__restrict b, const int *__restrict c)
> > {
> >   for (int i = 0; i < 16; i++) {
> >     a[i] = b[i] + c[i];
> >   }
> > }
> >
> > I won't reproduce the full compiled output here, as it's rather long, but when compiled with -O3 -mno-avx -mno-sse, GCC 12.2 for x86-64 (via Compiler Explorer: https://godbolt.org/z/o9e4o7cj4) produces an unrolled loop that appears to write each sum into an array on the stack before copying it into the provided pointer a. This seems hugely inefficient - it's doing quite a few memory accesses - and I can't see why it would be necessary.
>
> I don't think it's *necessary*. If you use -Os or -O1 or -O2 you get a
> loop. So it's just an optimization choice at -O3 presumably based on
> cost estimates that say that fully unrolling the loop will make the
> code faster than looping.
>
> >
> > Am I missing some reason why this is more efficient than the naive approach (computing each sum into an intermediate register, then writing it directly into a)?
>
> Benchmarking the function at different optimization levels I get:
>
> Run on (8 X 4500 MHz CPU s)
> CPU Caches:
>  L1 Data 32 KiB (x4)
>  L1 Instruction 32 KiB (x4)
>  L2 Unified 256 KiB (x4)
>  L3 Unified 8192 KiB (x1)
> Load Average: 0.14, 0.22, 0.39
> ***WARNING*** CPU scaling is enabled, the benchmark real time
> measurements may be noisy and will incur extra overhead.
> -----------------------------------------------------
> Benchmark           Time             CPU   Iterations
> -----------------------------------------------------
> O3               1.60 ns         1.60 ns    432901632
> O2               3.56 ns         3.56 ns    197086506
> O1               6.87 ns         6.86 ns    101839250
> Os               8.23 ns         8.22 ns     85273333
>
>
> Using quickbench:
> https://quick-bench.com/q/sSwVvtrkOCp9q-XyKAevthiaNAw

Oops, sorry, those were my original results *without* the -mno-avx
-mno-sse options! But that just shows that vectorization makes the
function fast.

Turning that off I get:

O3               58.3 ns         58.2 ns     11725604
O2               61.7 ns         61.6 ns     10930434
O1               7.37 ns         7.35 ns     95752192
Os               8.57 ns         8.56 ns     79448548

So it does look like GCC is making poor choices here.
