public inbox for gcc-bugs@sourceware.org
* [Bug tree-optimization/101481] New: -ftree-loop-distribute-patterns can slow down and increases size of code
@ 2021-07-17  2:59 andres at anarazel dot de
From: andres at anarazel dot de @ 2021-07-17  2:59 UTC
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101481

            Bug ID: 101481
           Summary: -ftree-loop-distribute-patterns can slow down and
                    increases size of code
           Product: gcc
           Version: unknown
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: andres at anarazel dot de
  Target Milestone: ---

Created attachment 51168
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=51168&action=edit
simplified example reproducing problem

Hi,

I found -ftree-loop-distribute-patterns to be far too aggressive in replacing
loops with library calls, leading to increased code size and substantial
slowdowns (12% in the program where I just hit this).

The code size increase and slowdown are caused partly by the function call
itself and partly by the spilling necessary to make that function call. Both
are made worse by the call to memmove() going through the PLT.

A very simplified example (also attached) is this:

typedef struct node
{
    unsigned char chunks[4];
    unsigned char count;
} node;

void
foo(node *a, unsigned char newchunk, unsigned char off)
{
    if (a->count > 3)
        __builtin_unreachable();

    for (int i = a->count - 1; i >= off; i--)
        a->chunks[i + 1] = a->chunks[i];
    a->chunks[off] = newchunk;
}
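
For reference, the loop above shifts the bytes chunks[off..count-1] up by one
position, which is an overlapping copy of count - off bytes. That is the
pattern the pass turns into a memmove() call. Below is a minimal sketch of the
equivalence, assuming the `node` type from the example. The names `foo_loop`,
`foo_memmove` and `same_result` are hypothetical and introduced only for
illustration, and the assert() stands in for the __builtin_unreachable()
precondition.

```c
#include <assert.h>
#include <string.h>

typedef struct node
{
    unsigned char chunks[4];
    unsigned char count;
} node;

/* The loop from the report, with the precondition checked by assert(). */
static void
foo_loop(node *a, unsigned char newchunk, unsigned char off)
{
    assert(a->count <= 3);

    for (int i = a->count - 1; i >= off; i--)
        a->chunks[i + 1] = a->chunks[i];
    a->chunks[off] = newchunk;
}

/* Hand-written equivalent: one overlapping copy of count - off bytes. */
static void
foo_memmove(node *a, unsigned char newchunk, unsigned char off)
{
    assert(a->count <= 3);

    if (a->count > off)
        memmove(&a->chunks[off + 1], &a->chunks[off],
                (size_t)(a->count - off));
    a->chunks[off] = newchunk;
}

/* Returns 1 if both variants agree for every count <= 3 and off <= count. */
static int
same_result(void)
{
    for (unsigned char count = 0; count <= 3; count++)
        for (unsigned char off = 0; off <= count; off++)
        {
            node x = { { 10, 20, 30, 40 }, count };
            node y = x;

            foo_loop(&x, 99, off);
            foo_memmove(&y, 99, off);
            if (memcmp(&x, &y, sizeof x) != 0)
                return 0;
        }
    return 1;
}
```

At most three bytes are ever moved here, so the fixed cost of the call and
the register saves around it is large compared to the copy itself.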

which with `-O2 -fPIC` boils down to:
foo(node*, unsigned char, unsigned char):
        pushq   %r12
        movl    %edx, %r8d
        movl    %esi, %r12d
        pushq   %rbp
        movq    %rdi, %rbp
        pushq   %rbx
        movzbl  4(%rdi), %ecx
        movzbl  %r8b, %ebx
        leal    -1(%rcx), %edx
        cmpl    %ebx, %edx
        jl      .L2
        movl    %ecx, %eax
        movslq  %edx, %rsi
        subl    %ebx, %ecx
        subl    $1, %ecx
        movq    %rsi, %rdx
        subq    %rcx, %rdx
        leaq    1(%rcx), %r8
        leaq    (%rdi,%rdx), %rsi
        movzbl  %al, %edi
        movq    %r8, %rdx
        movq    %rdi, %rax
        subq    %rcx, %rax
        leaq    0(%rbp,%rax), %rdi
        call    memmove@PLT
.L2:
        movb    %r12b, 0(%rbp,%rbx)
        popq    %rbx
        popq    %rbp
        popq    %r12
        ret

Compare that to `-O2 -fPIC -fno-tree-loop-distribute-patterns`:

foo(node*, unsigned char, unsigned char):
        movzbl  4(%rdi), %eax
        movzbl  %dl, %edx
        subl    $1, %eax
        cmpl    %edx, %eax
        jl      .L2
        cltq
.L3:
        movzbl  (%rdi,%rax), %ecx
        movb    %cl, 1(%rdi,%rax)
        subq    $1, %rax
        cmpl    %eax, %edx
        jle     .L3
.L2:
        movb    %sil, (%rdi,%rdx)
        ret

Which I think makes the problem apparent.
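
As a possible per-function workaround (not something proposed in the report
itself), GCC's `optimize` function attribute can pass the same flag to a
single function instead of the whole translation unit. This is only a sketch:
the GCC manual describes the attribute as meant for debugging rather than
production use, its effect on the generated code has not been checked here,
and `foo_no_distribute` is a hypothetical name.

```c
#include <assert.h>

typedef struct node
{
    unsigned char chunks[4];
    unsigned char count;
} node;

/* GCC-specific: ask for -fno-tree-loop-distribute-patterns on this
   function only, so the shift below should stay a plain loop. */
__attribute__((optimize("-fno-tree-loop-distribute-patterns")))
void
foo_no_distribute(node *a, unsigned char newchunk, unsigned char off)
{
    if (a->count > 3)
        __builtin_unreachable();

    for (int i = a->count - 1; i >= off; i--)
        a->chunks[i + 1] = a->chunks[i];
    a->chunks[off] = newchunk;
}
```

Comparing the output of `gcc -O2 -fPIC -S` with and without the attribute
shows whether the memmove@PLT call is still emitted on a given GCC version.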

Regards,

Andres Freund


Thread overview: 7+ messages, newest 2023-05-29 10:05 UTC
2021-07-17  2:59 [Bug tree-optimization/101481] New: -ftree-loop-distribute-patterns can slow down and increases size of code andres at anarazel dot de
2021-07-17  3:36 ` [Bug tree-optimization/101481] [11/12 Regression] " pinskia at gcc dot gnu.org
2021-07-17  3:37 ` pinskia at gcc dot gnu.org
2021-07-19  6:29 ` rguenth at gcc dot gnu.org
2021-07-28  7:07 ` rguenth at gcc dot gnu.org
2022-04-21  7:49 ` rguenth at gcc dot gnu.org
2023-05-29 10:05 ` [Bug tree-optimization/101481] [11/12/13/14 " jakub at gcc dot gnu.org
