From: "andres at anarazel dot de"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug tree-optimization/101481] New: -ftree-loop-distribute-patterns can slow down and increases size of code
Date: Sat, 17 Jul 2021 02:59:16 +0000
X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101481

            Bug ID: 101481
           Summary: -ftree-loop-distribute-patterns can slow down and
                    increases size of code
           Product: gcc
           Version: unknown
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: andres at anarazel dot de
  Target Milestone: ---

Created attachment 51168
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=51168&action=edit
simplified example reproducing problem

Hi,

I found
-ftree-loop-distribute-patterns to be far too aggressive in replacing
code, leading to increased code size and substantial slowdowns (12% in
the program where I just hit this).

The code size increase and the slowdown are caused partly by the function
call itself and partly by the spilling necessary to make that function
call. The PLT call to memmove() makes it worse.

A very simplified example (also attached) is this:

    typedef struct node
    {
        unsigned char chunks[4];
        unsigned char count;
    } node;

    void foo(node *a, unsigned char newchunk, unsigned char off)
    {
        if (a->count > 3)
            __builtin_unreachable();

        for (int i = a->count - 1; i >= off; i--)
            a->chunks[i + 1] = a->chunks[i];

        a->chunks[off] = newchunk;
    }

which with `-O2 -fPIC` boils down to:

    foo(node*, unsigned char, unsigned char):
            pushq   %r12
            movl    %edx, %r8d
            movl    %esi, %r12d
            pushq   %rbp
            movq    %rdi, %rbp
            pushq   %rbx
            movzbl  4(%rdi), %ecx
            movzbl  %r8b, %ebx
            leal    -1(%rcx), %edx
            cmpl    %ebx, %edx
            jl      .L2
            movl    %ecx, %eax
            movslq  %edx, %rsi
            subl    %ebx, %ecx
            subl    $1, %ecx
            movq    %rsi, %rdx
            subq    %rcx, %rdx
            leaq    1(%rcx), %r8
            leaq    (%rdi,%rdx), %rsi
            movzbl  %al, %edi
            movq    %r8, %rdx
            movq    %rdi, %rax
            subq    %rcx, %rax
            leaq    0(%rbp,%rax), %rdi
            call    memmove@PLT
    .L2:
            movb    %r12b, 0(%rbp,%rbx)
            popq    %rbx
            popq    %rbp
            popq    %r12
            ret

Compare that to `-O2 -fPIC -fno-tree-loop-distribute-patterns`:

    foo(node*, unsigned char, unsigned char):
            movzbl  4(%rdi), %eax
            movzbl  %dl, %edx
            subl    $1, %eax
            cmpl    %edx, %eax
            jl      .L2
            cltq
    .L3:
            movzbl  (%rdi,%rax), %ecx
            movb    %cl, 1(%rdi,%rax)
            subq    $1, %rax
            cmpl    %eax, %edx
            jle     .L3
    .L2:
            movb    %sil, (%rdi,%rdx)
            ret

Which I think makes the problem apparent.

Regards,

Andres Freund
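
For readers comparing the two listings, here is a minimal sketch of what the pattern replacement amounts to at the source level. It is not taken from the attachment: the names insert_loop and insert_memmove are hypothetical, and the assert() stands in for the __builtin_unreachable() hint in the report. The point is only that the backward copy loop is semantically a memmove() of at most three bytes, so the library call cannot save much work in return for the call and spill overhead shown above.

```c
#include <assert.h>
#include <string.h>

typedef struct node {
    unsigned char chunks[4];
    unsigned char count;
} node;

/* Hand-written form: shift chunks[off..count-1] up by one, back to front. */
static void insert_loop(node *a, unsigned char newchunk, unsigned char off)
{
    assert(a->count <= 3);  /* same precondition as in the report */
    for (int i = a->count - 1; i >= off; i--)
        a->chunks[i + 1] = a->chunks[i];
    a->chunks[off] = newchunk;
}

/* Equivalent form: the overlapping shift expressed as one memmove(). */
static void insert_memmove(node *a, unsigned char newchunk, unsigned char off)
{
    assert(a->count <= 3);
    if (a->count > off)     /* the loop runs zero times otherwise */
        memmove(&a->chunks[off + 1], &a->chunks[off],
                (size_t)(a->count - off));
    a->chunks[off] = newchunk;
}
```

Both helpers leave the array in the same state. With count limited to 3, the copy length is never more than 3 bytes.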