public inbox for gcc-bugs@sourceware.org help / color / mirror / Atom feed
* [Bug target/95798] New: Initialization code --- suboptimal @ 2020-06-21 7:16 zero at smallinteger dot com 2020-06-21 16:57 ` [Bug target/95798] " zero at smallinteger dot com ` (11 more replies) 0 siblings, 12 replies; 13+ messages in thread From: zero at smallinteger dot com @ 2020-06-21 7:16 UTC (permalink / raw) To: gcc-bugs https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95798 Bug ID: 95798 Summary: Initialization code --- suboptimal Product: gcc Version: 9.3.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: target Assignee: unassigned at gcc dot gnu.org Reporter: zero at smallinteger dot com Target Milestone: --- Created attachment 48764 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=48764&action=edit sample code This is similar to (but not the same as) bug 87223 for structs. Further, this bug expands on this issue for gcc 10.x. Originally, this was noted in gcc (Ubuntu 9.3.0-10ubuntu2) 9.3.0, compiling with -O3. First, note the initialization code that trivially sets values to zero in an array, mov eax, edi sub rsp, 8080 xor edx, edx and eax, 127 mov QWORD PTR [rsp-120+rax*8], 0 mov QWORD PTR [rsp-112+rax*8], 0 mov QWORD PTR [rsp-104+rax*8], 0 mov QWORD PTR [rsp-96+rax*8], 0 mov QWORD PTR [rsp-88+rax*8], 0 mov QWORD PTR [rsp-80+rax*8], 0 mov QWORD PTR [rsp-72+rax*8], 0 mov QWORD PTR [rsp-64+rax*8], 0 xor eax, eax would be better by first setting a register to zero, then writing the value of the register. Further, note that there is already a zero register available (edx), but it is not used. This is similar to 87223 for structs, and here the issue manifests for arrays. Second, using gcc 10 versions and -O3 at godbolt.org results in this code: mov eax, edi mov edx, edi sub rsp, 8072 and eax, 127 and edx, 127 mov QWORD PTR [rsp-120+rdx*8], 0 lea edx, [rax+1] movsx rdx, edx mov QWORD PTR [rsp-120+rdx*8], 0 lea edx, [rax+2] movsx rdx, edx mov QWORD PTR [rsp-120+rdx*8], 0 lea edx, [rax+3] movsx rdx, edx mov QWORD PTR [rsp-120+rdx*8], 0 lea edx, [rax+4] movsx rdx, edx mov QWORD PTR [rsp-120+rdx*8], 0 lea edx, [rax+5] movsx rdx, edx mov QWORD PTR [rsp-120+rdx*8], 0 lea edx, [rax+6] add eax, 7 movsx rdx, edx cdqe mov QWORD PTR [rsp-120+rdx*8], 0 xor edx, edx mov QWORD PTR [rsp-120+rax*8], 0 xor eax, eax This is much, much more verbose than in gcc 9.3, for no apparent gain. ^ permalink raw reply [flat|nested] 13+ messages in thread
* [Bug target/95798] Initialization code --- suboptimal 2020-06-21 7:16 [Bug target/95798] New: Initialization code --- suboptimal zero at smallinteger dot com @ 2020-06-21 16:57 ` zero at smallinteger dot com 2020-06-22 10:42 ` [Bug target/95798] [10/11 Regression] " jakub at gcc dot gnu.org ` (10 subsequent siblings) 11 siblings, 0 replies; 13+ messages in thread From: zero at smallinteger dot com @ 2020-06-21 16:57 UTC (permalink / raw) To: gcc-bugs https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95798 --- Comment #1 from zero at smallinteger dot com --- (note that changing the array declaration to be initialized does not result in the individual array writes being optimized away, as one might expect at first glance) ^ permalink raw reply [flat|nested] 13+ messages in thread
* [Bug target/95798] [10/11 Regression] Initialization code --- suboptimal 2020-06-21 7:16 [Bug target/95798] New: Initialization code --- suboptimal zero at smallinteger dot com 2020-06-21 16:57 ` [Bug target/95798] " zero at smallinteger dot com @ 2020-06-22 10:42 ` jakub at gcc dot gnu.org 2020-06-22 10:44 ` jakub at gcc dot gnu.org ` (9 subsequent siblings) 11 siblings, 0 replies; 13+ messages in thread From: jakub at gcc dot gnu.org @ 2020-06-22 10:42 UTC (permalink / raw) To: gcc-bugs https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95798 Jakub Jelinek <jakub at gcc dot gnu.org> changed: What |Removed |Added ---------------------------------------------------------------------------- Status|UNCONFIRMED |NEW Summary|Initialization code --- |[10/11 Regression] |suboptimal |Initialization code --- | |suboptimal Last reconfirmed| |2020-06-22 CC| |jakub at gcc dot gnu.org, | |rdapp at gcc dot gnu.org Target Milestone|--- |10.2 Ever confirmed|0 |1 --- Comment #2 from Jakub Jelinek <jakub at gcc dot gnu.org> --- The 9 -> 10 regression started with r10-2806-gdf7d46d925c7baca7bf9961aee900876d8aef225 since which the IL is much larger and the resulting code less efficient. The testcase as written is just weird, it is an expensive check whether the program is called with multiple of 128 arguments >= 1024 arguments (otherwise it invokes UB). Adjusted testcase that is more meaningful: void bar (unsigned long long *, int); void foo (int y, unsigned long long z) { unsigned long long x[1024]; unsigned long long i = y % 127; __builtin_memset (x, -1, sizeof (x)); x[i] = 0; x[i + 1] = 0; x[i + 2] = 0; x[i + 3] = 0; x[i + 4] = 0; x[i + 5] = 0; x[i + 6] = 0; x[i + 7] = 0; bar (x, y); } ^ permalink raw reply [flat|nested] 13+ messages in thread
* [Bug target/95798] [10/11 Regression] Initialization code --- suboptimal 2020-06-21 7:16 [Bug target/95798] New: Initialization code --- suboptimal zero at smallinteger dot com 2020-06-21 16:57 ` [Bug target/95798] " zero at smallinteger dot com 2020-06-22 10:42 ` [Bug target/95798] [10/11 Regression] " jakub at gcc dot gnu.org @ 2020-06-22 10:44 ` jakub at gcc dot gnu.org 2020-06-22 13:05 ` jakub at gcc dot gnu.org ` (8 subsequent siblings) 11 siblings, 0 replies; 13+ messages in thread From: jakub at gcc dot gnu.org @ 2020-06-22 10:44 UTC (permalink / raw) To: gcc-bugs https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95798 --- Comment #3 from Jakub Jelinek <jakub at gcc dot gnu.org> --- Perhaps the change should be guarded on single_use? ^ permalink raw reply [flat|nested] 13+ messages in thread
* [Bug target/95798] [10/11 Regression] Initialization code --- suboptimal 2020-06-21 7:16 [Bug target/95798] New: Initialization code --- suboptimal zero at smallinteger dot com ` (2 preceding siblings ...) 2020-06-22 10:44 ` jakub at gcc dot gnu.org @ 2020-06-22 13:05 ` jakub at gcc dot gnu.org 2020-07-23 6:51 ` rguenth at gcc dot gnu.org ` (7 subsequent siblings) 11 siblings, 0 replies; 13+ messages in thread From: jakub at gcc dot gnu.org @ 2020-06-22 13:05 UTC (permalink / raw) To: gcc-bugs https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95798 --- Comment #4 from Jakub Jelinek <jakub at gcc dot gnu.org> --- Partially related, using the following -O2 -fno-ipa-icf: void foo (int x, int *p) { p[x + 1] = 1; } void bar (int x, int *p) { p[x + 1UL] = 1; } void baz (int x, int *p) { unsigned long l = x; l++; p[l] = 1; } void qux (int x, int *p) { unsigned long l = x + 1; p[l] = 1; } we get the same 3 insn functions for the first 3 cases and 4 insn for the last one. I'm surprised that we treat foo and qux differently, as x + 1 has undefined overflow, so (unsigned long) (x + 1) can be implemented with x + 1UL and when used in address arithmetics it should be beneficial like that (so shall e.g. expansion optimize it, or ivopts, or isel)? ^ permalink raw reply [flat|nested] 13+ messages in thread
* [Bug target/95798] [10/11 Regression] Initialization code --- suboptimal 2020-06-21 7:16 [Bug target/95798] New: Initialization code --- suboptimal zero at smallinteger dot com ` (3 preceding siblings ...) 2020-06-22 13:05 ` jakub at gcc dot gnu.org @ 2020-07-23 6:51 ` rguenth at gcc dot gnu.org 2020-10-12 12:54 ` rguenth at gcc dot gnu.org ` (6 subsequent siblings) 11 siblings, 0 replies; 13+ messages in thread From: rguenth at gcc dot gnu.org @ 2020-07-23 6:51 UTC (permalink / raw) To: gcc-bugs https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95798 Richard Biener <rguenth at gcc dot gnu.org> changed: What |Removed |Added ---------------------------------------------------------------------------- Target Milestone|10.2 |10.3 --- Comment #5 from Richard Biener <rguenth at gcc dot gnu.org> --- GCC 10.2 is released, adjusting target milestone. ^ permalink raw reply [flat|nested] 13+ messages in thread
* [Bug target/95798] [10/11 Regression] Initialization code --- suboptimal 2020-06-21 7:16 [Bug target/95798] New: Initialization code --- suboptimal zero at smallinteger dot com ` (4 preceding siblings ...) 2020-07-23 6:51 ` rguenth at gcc dot gnu.org @ 2020-10-12 12:54 ` rguenth at gcc dot gnu.org 2021-02-24 15:33 ` jakub at gcc dot gnu.org ` (5 subsequent siblings) 11 siblings, 0 replies; 13+ messages in thread From: rguenth at gcc dot gnu.org @ 2020-10-12 12:54 UTC (permalink / raw) To: gcc-bugs https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95798 Richard Biener <rguenth at gcc dot gnu.org> changed: What |Removed |Added ---------------------------------------------------------------------------- Priority|P3 |P2 ^ permalink raw reply [flat|nested] 13+ messages in thread
* [Bug target/95798] [10/11 Regression] Initialization code --- suboptimal 2020-06-21 7:16 [Bug target/95798] New: Initialization code --- suboptimal zero at smallinteger dot com ` (5 preceding siblings ...) 2020-10-12 12:54 ` rguenth at gcc dot gnu.org @ 2021-02-24 15:33 ` jakub at gcc dot gnu.org 2021-02-25 9:26 ` cvs-commit at gcc dot gnu.org ` (4 subsequent siblings) 11 siblings, 0 replies; 13+ messages in thread From: jakub at gcc dot gnu.org @ 2021-02-24 15:33 UTC (permalink / raw) To: gcc-bugs https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95798 Jakub Jelinek <jakub at gcc dot gnu.org> changed: What |Removed |Added ---------------------------------------------------------------------------- Status|NEW |ASSIGNED Assignee|unassigned at gcc dot gnu.org |jakub at gcc dot gnu.org --- Comment #6 from Jakub Jelinek <jakub at gcc dot gnu.org> --- Created attachment 50249 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=50249&action=edit gcc11-pr95798.patch Untested fix. The #c4 qux case above is not fixed by it, but it isn't a regression, so I think we should defer that one for GCC 12 (file a separate PR for that). And, it would work (if done e.g. at expansion time) only when there is signed integer overflow, so not for -fwrapv nor when e.g. the narrower type is unsigned, so I think we need the single_use match.pd case because after those changes, there is no way to undo that. ^ permalink raw reply [flat|nested] 13+ messages in thread
* [Bug target/95798] [10/11 Regression] Initialization code --- suboptimal 2020-06-21 7:16 [Bug target/95798] New: Initialization code --- suboptimal zero at smallinteger dot com ` (6 preceding siblings ...) 2021-02-24 15:33 ` jakub at gcc dot gnu.org @ 2021-02-25 9:26 ` cvs-commit at gcc dot gnu.org 2021-02-25 9:27 ` [Bug target/95798] [10 " jakub at gcc dot gnu.org ` (3 subsequent siblings) 11 siblings, 0 replies; 13+ messages in thread From: cvs-commit at gcc dot gnu.org @ 2021-02-25 9:26 UTC (permalink / raw) To: gcc-bugs https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95798 --- Comment #7 from CVS Commits <cvs-commit at gcc dot gnu.org> --- The master branch has been updated by Jakub Jelinek <jakub@gcc.gnu.org>: https://gcc.gnu.org/g:880682e7b2348d66f4089fa4af102b69eaaefbc2 commit r11-7384-g880682e7b2348d66f4089fa4af102b69eaaefbc2 Author: Jakub Jelinek <jakub@redhat.com> Date: Thu Feb 25 10:22:53 2021 +0100 match.pd: Use :s for (T)(A) + CST -> (T)(A + CST) [PR95798] The r10-2806 change regressed following testcases, instead of doing int -> unsigned long sign-extension once and then add 8, 16, ... 56 to it for each of the memory access, it adds 8, 16, ... 56 in int mode and then sign extends each. So that means: + movq $0, (%rsp,%rax,8) + leal 1(%rdx), %eax + cltq + movq $1, (%rsp,%rax,8) + leal 2(%rdx), %eax + cltq + movq $2, (%rsp,%rax,8) + leal 3(%rdx), %eax + cltq + movq $3, (%rsp,%rax,8) + leal 4(%rdx), %eax + cltq + movq $4, (%rsp,%rax,8) + leal 5(%rdx), %eax + cltq + movq $5, (%rsp,%rax,8) + leal 6(%rdx), %eax + addl $7, %edx + cltq + movslq %edx, %rdx + movq $6, (%rsp,%rax,8) + movq $7, (%rsp,%rdx,8) - movq $0, (%rsp,%rdx,8) - movq $1, 8(%rsp,%rdx,8) - movq $2, 16(%rsp,%rdx,8) - movq $3, 24(%rsp,%rdx,8) - movq $4, 32(%rsp,%rdx,8) - movq $5, 40(%rsp,%rdx,8) - movq $6, 48(%rsp,%rdx,8) - movq $7, 56(%rsp,%rdx,8) GCC 9 -> 10 change or: - movq $0, (%rsp,%rdx,8) - movq $1, 8(%rsp,%rdx,8) - movq $2, 16(%rsp,%rdx,8) - movq $3, 24(%rsp,%rdx,8) - movq $4, 32(%rsp,%rdx,8) - movq $5, 40(%rsp,%rdx,8) - movq $6, 48(%rsp,%rdx,8) - movq $7, 56(%rsp,%rdx,8) + movq $0, (%rsp,%rax,8) + leal 1(%rdx), %eax + movq $1, (%rsp,%rax,8) + leal 2(%rdx), %eax + movq $2, (%rsp,%rax,8) + leal 3(%rdx), %eax + movq $3, (%rsp,%rax,8) + leal 4(%rdx), %eax + movq $4, (%rsp,%rax,8) + leal 5(%rdx), %eax + movq $5, (%rsp,%rax,8) + leal 6(%rdx), %eax + movq $6, (%rsp,%rax,8) + leal 7(%rdx), %eax + movq $7, (%rsp,%rax,8) change on the other test. While for the former case of int there is due to signed integer overflow (unless -fwrapv) the possibility to undo it e.g. during expansion, for the unsigned case information is unfortunately lost. The following patch adds :s to the convert which restores these testcases but keeps the testcases the patch meant to improve as is. 2021-02-25 Jakub Jelinek <jakub@redhat.com> PR target/95798 * match.pd ((T)(A) + CST -> (T)(A + CST)): Add :s to convert. * gcc.target/i386/pr95798-1.c: New test. * gcc.target/i386/pr95798-2.c: New test. ^ permalink raw reply [flat|nested] 13+ messages in thread
* [Bug target/95798] [10 Regression] Initialization code --- suboptimal 2020-06-21 7:16 [Bug target/95798] New: Initialization code --- suboptimal zero at smallinteger dot com ` (7 preceding siblings ...) 2021-02-25 9:26 ` cvs-commit at gcc dot gnu.org @ 2021-02-25 9:27 ` jakub at gcc dot gnu.org 2021-04-08 12:02 ` rguenth at gcc dot gnu.org ` (2 subsequent siblings) 11 siblings, 0 replies; 13+ messages in thread From: jakub at gcc dot gnu.org @ 2021-02-25 9:27 UTC (permalink / raw) To: gcc-bugs https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95798 Jakub Jelinek <jakub at gcc dot gnu.org> changed: What |Removed |Added ---------------------------------------------------------------------------- Summary|[10/11 Regression] |[10 Regression] |Initialization code --- |Initialization code --- |suboptimal |suboptimal --- Comment #8 from Jakub Jelinek <jakub at gcc dot gnu.org> --- Fixed on the trunk so far. ^ permalink raw reply [flat|nested] 13+ messages in thread
* [Bug target/95798] [10 Regression] Initialization code --- suboptimal 2020-06-21 7:16 [Bug target/95798] New: Initialization code --- suboptimal zero at smallinteger dot com ` (8 preceding siblings ...) 2021-02-25 9:27 ` [Bug target/95798] [10 " jakub at gcc dot gnu.org @ 2021-04-08 12:02 ` rguenth at gcc dot gnu.org 2022-06-28 10:41 ` jakub at gcc dot gnu.org 2023-07-07 8:55 ` rguenth at gcc dot gnu.org 11 siblings, 0 replies; 13+ messages in thread From: rguenth at gcc dot gnu.org @ 2021-04-08 12:02 UTC (permalink / raw) To: gcc-bugs https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95798 Richard Biener <rguenth at gcc dot gnu.org> changed: What |Removed |Added ---------------------------------------------------------------------------- Target Milestone|10.3 |10.4 --- Comment #9 from Richard Biener <rguenth at gcc dot gnu.org> --- GCC 10.3 is being released, retargeting bugs to GCC 10.4. ^ permalink raw reply [flat|nested] 13+ messages in thread
* [Bug target/95798] [10 Regression] Initialization code --- suboptimal 2020-06-21 7:16 [Bug target/95798] New: Initialization code --- suboptimal zero at smallinteger dot com ` (9 preceding siblings ...) 2021-04-08 12:02 ` rguenth at gcc dot gnu.org @ 2022-06-28 10:41 ` jakub at gcc dot gnu.org 2023-07-07 8:55 ` rguenth at gcc dot gnu.org 11 siblings, 0 replies; 13+ messages in thread From: jakub at gcc dot gnu.org @ 2022-06-28 10:41 UTC (permalink / raw) To: gcc-bugs https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95798 Jakub Jelinek <jakub at gcc dot gnu.org> changed: What |Removed |Added ---------------------------------------------------------------------------- Target Milestone|10.4 |10.5 --- Comment #10 from Jakub Jelinek <jakub at gcc dot gnu.org> --- GCC 10.4 is being released, retargeting bugs to GCC 10.5. ^ permalink raw reply [flat|nested] 13+ messages in thread
* [Bug target/95798] [10 Regression] Initialization code --- suboptimal 2020-06-21 7:16 [Bug target/95798] New: Initialization code --- suboptimal zero at smallinteger dot com ` (10 preceding siblings ...) 2022-06-28 10:41 ` jakub at gcc dot gnu.org @ 2023-07-07 8:55 ` rguenth at gcc dot gnu.org 11 siblings, 0 replies; 13+ messages in thread From: rguenth at gcc dot gnu.org @ 2023-07-07 8:55 UTC (permalink / raw) To: gcc-bugs https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95798 Richard Biener <rguenth at gcc dot gnu.org> changed: What |Removed |Added ---------------------------------------------------------------------------- Resolution|--- |FIXED Known to work| |11.1.0 Target Milestone|10.5 |11.0 Status|ASSIGNED |RESOLVED Known to fail| |10.5.0 --- Comment #11 from Richard Biener <rguenth at gcc dot gnu.org> --- Fixed for GCC 11. ^ permalink raw reply [flat|nested] 13+ messages in thread
end of thread, other threads:[~2023-07-07 8:55 UTC | newest] Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2020-06-21 7:16 [Bug target/95798] New: Initialization code --- suboptimal zero at smallinteger dot com 2020-06-21 16:57 ` [Bug target/95798] " zero at smallinteger dot com 2020-06-22 10:42 ` [Bug target/95798] [10/11 Regression] " jakub at gcc dot gnu.org 2020-06-22 10:44 ` jakub at gcc dot gnu.org 2020-06-22 13:05 ` jakub at gcc dot gnu.org 2020-07-23 6:51 ` rguenth at gcc dot gnu.org 2020-10-12 12:54 ` rguenth at gcc dot gnu.org 2021-02-24 15:33 ` jakub at gcc dot gnu.org 2021-02-25 9:26 ` cvs-commit at gcc dot gnu.org 2021-02-25 9:27 ` [Bug target/95798] [10 " jakub at gcc dot gnu.org 2021-04-08 12:02 ` rguenth at gcc dot gnu.org 2022-06-28 10:41 ` jakub at gcc dot gnu.org 2023-07-07 8:55 ` rguenth at gcc dot gnu.org
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).