public inbox for gcc-bugs@sourceware.org
* [Bug target/108724] New: [11 regression] Poor codegen when summing two arrays without AVX or SSE
@ 2023-02-08 19:17 gbs at canishe dot com
  2023-02-08 19:30 ` [Bug tree-optimization/108724] " pinskia at gcc dot gnu.org
                   ` (11 more replies)
  0 siblings, 12 replies; 13+ messages in thread
From: gbs at canishe dot com @ 2023-02-08 19:17 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108724

            Bug ID: 108724
           Summary: [11 regression] Poor codegen when summing two arrays
                    without AVX or SSE
           Product: gcc
           Version: 13.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: gbs at canishe dot com
  Target Milestone: ---

This program:

void foo(int *a, const int *__restrict b, const int *__restrict c)
{
  for (int i = 0; i < 16; i++) {
    a[i] = b[i] + c[i];
  }
}


When compiled for x86-64 by GCC 11.1+ with -O3 -mno-avx -mno-sse, it produces:

foo:
        movq    %rdx, %rax
        subq    $8, %rsp
        movl    (%rsi), %edx
        movq    %rsi, %rcx
        addl    (%rax), %edx
        movl    4(%rax), %esi
        movq    $0, (%rsp)
        movl    %edx, (%rsp)
        movq    (%rsp), %rdx
        addl    4(%rcx), %esi
        movq    %rdx, -8(%rsp)
        movl    %esi, -4(%rsp)
        movq    -8(%rsp), %rdx
        movq    %rdx, (%rdi)
        movl    8(%rax), %edx
        addl    8(%rcx), %edx
        movq    $0, -16(%rsp)
        movl    %edx, -16(%rsp)
        movq    -16(%rsp), %rdx
        movl    12(%rcx), %esi
        addl    12(%rax), %esi
        movq    %rdx, -24(%rsp)
        movl    %esi, -20(%rsp)
        movq    -24(%rsp), %rdx
        movq    %rdx, 8(%rdi)
        [snip more of the same]
        movl    48(%rcx), %edx
        movq    $0, -96(%rsp)
        addl    48(%rax), %edx
        movl    %edx, -96(%rsp)
        movq    -96(%rsp), %rdx
        movl    52(%rcx), %esi
        addl    52(%rax), %esi
        movq    %rdx, -104(%rsp)
        movl    %esi, -100(%rsp)
        movq    -104(%rsp), %rdx
        movq    %rdx, 48(%rdi)
        movl    56(%rcx), %edx
        movq    $0, -112(%rsp)
        addl    56(%rax), %edx
        movl    %edx, -112(%rsp)
        movq    -112(%rsp), %rdx
        movl    60(%rcx), %ecx
        addl    60(%rax), %ecx
        movq    %rdx, -120(%rsp)
        movl    %ecx, -116(%rsp)
        movq    -120(%rsp), %rdx
        movq    %rdx, 56(%rdi)
        addq    $8, %rsp
        ret

(Godbolt link: https://godbolt.org/z/qq9dbP8ed)

This is bizarre: it stores intermediate results on the stack instead of keeping
them in registers or writing them directly to *a, and the extra stack traffic is
bound to be slow. (GCC 10.4 and Clang produce more or less what I would expect,
using only the provided arrays and a register.) I haven't done any benchmarking
myself, but Jonathan Wakely's results (on the gcc-help list:
https://gcc.gnu.org/pipermail/gcc-help/2023-February/142181.html) seem to bear
this out.
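
For anyone who wants to measure this locally, here is a minimal timing sketch. It is not the harness used in the gcc-help post. The names check_foo and time_foo_ns and the iteration count are made up for illustration, and clock_gettime assumes a POSIX system. It sticks to integer arithmetic on purpose, since code using float or double generally will not build for x86-64 with -mno-sse.

```c
#define _POSIX_C_SOURCE 199309L
#include <assert.h>
#include <time.h>

#define N 16

/* The function from the report; noinline so the timing loop calls the
   generated body instead of having it folded into the caller. */
__attribute__((noinline))
void foo(int *a, const int *__restrict b, const int *__restrict c)
{
  for (int i = 0; i < N; i++) {
    a[i] = b[i] + c[i];
  }
}

/* Hypothetical helper: returns 1 if foo computes the element-wise sum. */
int check_foo(void)
{
  int a[N], b[N], c[N];
  for (int i = 0; i < N; i++) {
    a[i] = -1;
    b[i] = i;
    c[i] = 100 * i;
  }
  foo(a, b, c);
  for (int i = 0; i < N; i++) {
    if (a[i] != 101 * i)
      return 0;
  }
  return 1;
}

/* Hypothetical helper: calls foo `iters` times and returns the elapsed
   wall-clock time in nanoseconds (integer only, no floating point). */
long long time_foo_ns(long iters)
{
  assert(iters > 0);
  int a[N], b[N], c[N];
  for (int i = 0; i < N; i++) {
    a[i] = 0;
    b[i] = i;
    c[i] = 2 * i;
  }
  struct timespec t0, t1;
  clock_gettime(CLOCK_MONOTONIC, &t0);
  for (long n = 0; n < iters; n++) {
    foo(a, b, c);
    /* Compiler barrier so repeated calls are not treated as dead stores. */
    __asm__ volatile("" : : "r"(a) : "memory");
  }
  clock_gettime(CLOCK_MONOTONIC, &t1);
  return (long long)(t1.tv_sec - t0.tv_sec) * 1000000000LL
         + (t1.tv_nsec - t0.tv_nsec);
}
```

Calling check_foo() and then time_foo_ns(100000000L) from a small main, built once with GCC 10.4 and once with GCC 11.1+ using the same -O3 -mno-avx -mno-sse flags, should give two comparable numbers. The absolute values depend on the machine, so only the ratio between the two builds is meaningful.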

From a bisect, this behavior seems to have been introduced by commit
33c0f246f799b7403171e97f31276a8feddd05c9 (tree-optimization/97626 - handle SCCs
properly in SLP stmt analysis) from Oct 2020, and persists into GCC trunk.
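
Since the bisected commit touches SLP statement analysis, one experiment is to build the same loop with SLP vectorization turned off for a single function and compare the assembly of the two variants. This is only a sketch under that assumption: foo_noslp and compare_variants are hypothetical names, and I have not verified that disabling SLP actually removes the stack traffic. The optimize attribute is GCC-specific and documented as a debugging aid, so it is not meant as a production workaround.

```c
#include <assert.h>
#include <string.h>

/* Baseline: the loop from the report, unchanged. */
void foo(int *a, const int *__restrict b, const int *__restrict c)
{
  for (int i = 0; i < 16; i++) {
    a[i] = b[i] + c[i];
  }
}

/* Same loop, but asking GCC to apply -fno-tree-slp-vectorize to this
   function only (assumption: SLP is what introduces the stack spills). */
__attribute__((optimize("no-tree-slp-vectorize")))
void foo_noslp(int *a, const int *__restrict b, const int *__restrict c)
{
  for (int i = 0; i < 16; i++) {
    a[i] = b[i] + c[i];
  }
}

/* Hypothetical helper: returns 1 if both variants write identical output. */
int compare_variants(void)
{
  int a1[16], a2[16], b[16], c[16];
  for (int i = 0; i < 16; i++) {
    b[i] = 3 * i;
    c[i] = 1000 - i;
  }
  foo(a1, b, c);
  foo_noslp(a2, b, c);
  assert(a1[1] == b[1] + c[1]); /* sanity check on the baseline */
  return memcmp(a1, a2, sizeof a1) == 0;
}
```

Compiling this file with -O3 -mno-avx -mno-sse -S and diffing the bodies of foo and foo_noslp shows whether the extra stores to the stack go away when SLP is disabled. The same check can be made for the whole translation unit by passing -fno-tree-slp-vectorize on the command line.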


end of thread, other threads:[~2023-05-29 10:08 UTC | newest]

Thread overview: 13+ messages
2023-02-08 19:17 [Bug target/108724] New: [11 regression] Poor codegen when summing two arrays without AVX or SSE gbs at canishe dot com
2023-02-08 19:30 ` [Bug tree-optimization/108724] " pinskia at gcc dot gnu.org
2023-02-09  9:37 ` crazylht at gmail dot com
2023-02-09 13:54 ` rguenth at gcc dot gnu.org
2023-02-10 10:00 ` rguenth at gcc dot gnu.org
2023-02-10 10:07 ` [Bug tree-optimization/108724] [11/12/13 Regression] " rguenth at gcc dot gnu.org
2023-02-10 11:22 ` cvs-commit at gcc dot gnu.org
2023-02-10 11:22 ` [Bug tree-optimization/108724] [11/12 " rguenth at gcc dot gnu.org
2023-03-15  9:48 ` cvs-commit at gcc dot gnu.org
2023-05-05  8:34 ` [Bug tree-optimization/108724] [11 " rguenth at gcc dot gnu.org
2023-05-05 12:06 ` [Bug target/108724] " rguenth at gcc dot gnu.org
2023-05-23 12:55 ` rguenth at gcc dot gnu.org
2023-05-29 10:08 ` jakub at gcc dot gnu.org
