From: "hubicka at gcc dot gnu.org" <gcc-bugzilla@gcc.gnu.org>
To: gcc-bugs@gcc.gnu.org
Subject: [Bug middle-end/110692] New: epilogues for loop which can be also vectorized with half size can be improved.
Date: Sun, 16 Jul 2023 19:31:27 +0000
Message-ID: <bug-110692-4@http.gcc.gnu.org/bugzilla/>

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110692

            Bug ID: 110692
           Summary: epilogues for loop which can be also vectorized with
                    half size can be improved.
           Product: gcc
           Version: 13.1.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: middle-end
          Assignee: unassigned at gcc dot gnu.org
          Reporter: hubicka at gcc dot gnu.org
  Target Milestone: ---

For:

int a[99];
test()
{
  for (int i = 0 ; i < 99; i++)
    a[i]++;
}

we produce:

   0:   66 0f 6f 0d 00 00 00    movdqa 0x0(%rip),%xmm1        # 8 <test+0x8>
   7:   00
   8:   b8 00 00 00 00          mov    $0x0,%eax
   d:   ba 00 00 00 00          mov    $0x0,%edx
  12:   66 0f 1f 44 00 00       nopw   0x0(%rax,%rax,1)
  18:   66 0f 6f 00             movdqa (%rax),%xmm0
  1c:   48 83 c0 10             add    $0x10,%rax
  20:   66 0f fe c1             paddd  %xmm1,%xmm0
  24:   0f 29 40 f0             movaps %xmm0,-0x10(%rax)
  28:   48 39 c2                cmp    %rax,%rdx
  2b:   75 eb                   jne    18 <test+0x18>
  2d:   f3 0f 7e 05 00 00 00    movq   0x0(%rip),%xmm0        # 35 <test+0x35>
  34:   00
  35:   83 05 00 00 00 00 01    addl   $0x1,0x0(%rip)        # 3c <test+0x3c>
  3c:   f3 0f 7e 0d 00 00 00    movq   0x0(%rip),%xmm1        # 44 <test+0x44>
  43:   00
  44:   66 0f fe c1             paddd  %xmm1,%xmm0
  48:   66 0f d6 05 00 00 00    movq   %xmm0,0x0(%rip)        # 50 <test+0x50>
  4f:   00
  50:   c3                      ret

This is the 4x vectorized loop followed by a 2x vectorized loopless epilogue
and a scalar increment of the remaining element.
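As a rough illustration only (this is not compiler output), the shape of that code can be modelled in scalar C. The function name test_model is hypothetical, and each inner block stands in for one vector operation of the corresponding width:

```c
#include <assert.h>

int a[99];

/* Hypothetical scalar model of the code generated for the constant bound:
   99 = 24 * 4 + 2 + 1.  */
void test_model(void)
{
    int i = 0;

    /* Main loop: 24 iterations, 4 elements each (one 128-bit paddd).  */
    for (; i + 4 <= 99; i += 4)
        for (int k = 0; k < 4; k++)
            a[i + k]++;

    /* Loopless epilogue: 2 elements at once (movq / paddd / movq).  */
    a[i]++;
    a[i + 1]++;
    i += 2;

    /* Last element: a single scalar addl.  */
    a[i]++;
}
```

No trip-count checks are needed here because the bound is a compile-time constant.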
When the bound is unknown we do:

int a[99];
test(int n)
{
  for (int i = 0 ; i < n; i++)
    a[i]++;
}

   0:   85 ff                   test   %edi,%edi
   2:   7e 70                   jle    74 <test+0x74>
   4:   8d 47 ff                lea    -0x1(%rdi),%eax
   7:   83 f8 02                cmp    $0x2,%eax
   a:   76 6d                   jbe    79 <test+0x79>
   c:   89 fa                   mov    %edi,%edx
   e:   66 0f 6f 0d 00 00 00    movdqa 0x0(%rip),%xmm1        # 16 <test+0x16>
  15:   00
  16:   31 c0                   xor    %eax,%eax
  18:   c1 ea 02                shr    $0x2,%edx
  1b:   48 c1 e2 04             shl    $0x4,%rdx
  1f:   66 0f 6f 80 00 00 00    movdqa 0x0(%rax),%xmm0
  26:   00
  27:   48 83 c0 10             add    $0x10,%rax
  2b:   66 0f fe c1             paddd  %xmm1,%xmm0
  2f:   0f 29 80 00 00 00 00    movaps %xmm0,0x0(%rax)
  36:   48 39 d0                cmp    %rdx,%rax
  39:   75 e4                   jne    1f <test+0x1f>
  3b:   89 f8                   mov    %edi,%eax
  3d:   83 e0 fc                and    $0xfffffffc,%eax
  40:   40 f6 c7 03             test   $0x3,%dil
  44:   74 32                   je     78 <test+0x78>
  46:   48 63 d0                movslq %eax,%rdx
  49:   83 04 95 00 00 00 00    addl   $0x1,0x0(,%rdx,4)
  50:   01
  51:   8d 50 01                lea    0x1(%rax),%edx
  54:   39 d7                   cmp    %edx,%edi
  56:   7e 1c                   jle    74 <test+0x74>
  58:   48 63 d2                movslq %edx,%rdx
  5b:   83 c0 02                add    $0x2,%eax
  5e:   83 04 95 00 00 00 00    addl   $0x1,0x0(,%rdx,4)
  65:   01
  66:   39 c7                   cmp    %eax,%edi
  68:   7e 0a                   jle    74 <test+0x74>
  6a:   48 98                   cltq
  6c:   83 04 85 00 00 00 00    addl   $0x1,0x0(,%rax,4)
  73:   01
  74:   c3                      ret
  75:   0f 1f 00                nopl   (%rax)
  78:   c3                      ret
  79:   31 c0                   xor    %eax,%eax
  7b:   eb c9                   jmp    46 <test+0x46>

The profitability threshold is 4. Producing the loopless epilogue just as in
the first case, with additional tests for block size, would work better. The
code looks quite bad for small trip counts since there is an extra jump down
to 79.
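One possible reading of this suggestion, written as a scalar C sketch rather than as anything GCC emits today: after the 4-wide loop, the remaining 0 to 3 elements are handled by at most one 2-wide step and one scalar step, each guarded by a test on the remainder. The name test_sketch and the variable rem are hypothetical.

```c
#include <assert.h>

int a[99];

/* Hypothetical sketch of a loopless epilogue for an unknown bound n.
   Assumes n <= 99, as in the original testcase.  */
void test_sketch(int n)
{
    if (n <= 0)
        return;

    int i = 0;

    /* Stands in for the 4-wide vector loop.  */
    for (; i + 4 <= n; i += 4)
        for (int k = 0; k < 4; k++)
            a[i + k]++;

    int rem = n - i;            /* 0..3 elements left */

    /* Stands in for a single 2-wide vector operation.  */
    if (rem & 2) {
        a[i]++;
        a[i + 1]++;
        i += 2;
    }

    /* Final scalar element, if any.  */
    if (rem & 1)
        a[i]++;
}
```

With this shape a small trip count such as n = 3 skips the main loop and falls straight through the two remainder tests, with no backward jump into a scalar loop.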