public inbox for gcc-bugs@sourceware.org
* [Bug target/95798] New: Initialization code --- suboptimal
@ 2020-06-21 7:16 zero at smallinteger dot com
2020-06-21 16:57 ` [Bug target/95798] " zero at smallinteger dot com
` (11 more replies)
0 siblings, 12 replies; 13+ messages in thread
From: zero at smallinteger dot com @ 2020-06-21 7:16 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95798
Bug ID: 95798
Summary: Initialization code --- suboptimal
Product: gcc
Version: 9.3.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: zero at smallinteger dot com
Target Milestone: ---
Created attachment 48764
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=48764&action=edit
sample code
This is similar to (but not the same as) bug 87223 for structs. In addition,
this report covers a further problem seen with gcc 10.x. The issue was
originally noticed with gcc (Ubuntu 9.3.0-10ubuntu2) 9.3.0, compiling with -O3.
First, note the initialization code that trivially sets values to zero in an
array,
mov eax, edi
sub rsp, 8080
xor edx, edx
and eax, 127
mov QWORD PTR [rsp-120+rax*8], 0
mov QWORD PTR [rsp-112+rax*8], 0
mov QWORD PTR [rsp-104+rax*8], 0
mov QWORD PTR [rsp-96+rax*8], 0
mov QWORD PTR [rsp-88+rax*8], 0
mov QWORD PTR [rsp-80+rax*8], 0
mov QWORD PTR [rsp-72+rax*8], 0
mov QWORD PTR [rsp-64+rax*8], 0
xor eax, eax
would be better if it first set a register to zero and then stored that
register. Note also that a zeroed register (edx) is already available but is
not used. This is similar to bug 87223 for structs; here the issue manifests
for arrays.
Second, using gcc 10 versions and -O3 at godbolt.org results in this code:
mov eax, edi
mov edx, edi
sub rsp, 8072
and eax, 127
and edx, 127
mov QWORD PTR [rsp-120+rdx*8], 0
lea edx, [rax+1]
movsx rdx, edx
mov QWORD PTR [rsp-120+rdx*8], 0
lea edx, [rax+2]
movsx rdx, edx
mov QWORD PTR [rsp-120+rdx*8], 0
lea edx, [rax+3]
movsx rdx, edx
mov QWORD PTR [rsp-120+rdx*8], 0
lea edx, [rax+4]
movsx rdx, edx
mov QWORD PTR [rsp-120+rdx*8], 0
lea edx, [rax+5]
movsx rdx, edx
mov QWORD PTR [rsp-120+rdx*8], 0
lea edx, [rax+6]
add eax, 7
movsx rdx, edx
cdqe
mov QWORD PTR [rsp-120+rdx*8], 0
xor edx, edx
mov QWORD PTR [rsp-120+rax*8], 0
xor eax, eax
This is much, much more verbose than in gcc 9.3, for no apparent gain.
* [Bug target/95798] Initialization code --- suboptimal
From: zero at smallinteger dot com @ 2020-06-21 16:57 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95798
--- Comment #1 from zero at smallinteger dot com ---
(note that changing the array declaration to be initialized does not result in
the individual array writes being optimized away, as one might expect at first
glance)
* [Bug target/95798] [10/11 Regression] Initialization code --- suboptimal
From: jakub at gcc dot gnu.org @ 2020-06-22 10:42 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95798
Jakub Jelinek <jakub at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|UNCONFIRMED |NEW
Summary|Initialization code --- |[10/11 Regression]
|suboptimal |Initialization code ---
| |suboptimal
Last reconfirmed| |2020-06-22
CC| |jakub at gcc dot gnu.org,
| |rdapp at gcc dot gnu.org
Target Milestone|--- |10.2
Ever confirmed|0 |1
--- Comment #2 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
The 9 -> 10 regression started with
r10-2806-gdf7d46d925c7baca7bf9961aee900876d8aef225
since which the IL is much larger and the resulting code less efficient.
The testcase as written is just weird: it is an expensive check of whether the
program is called with a number of arguments that is a multiple of 128 and
>= 1024 (otherwise it invokes UB).
Adjusted testcase that is more meaningful:
void bar (unsigned long long *, int);

void
foo (int y, unsigned long long z)
{
  unsigned long long x[1024];
  unsigned long long i = y % 127;
  __builtin_memset (x, -1, sizeof (x));
  x[i] = 0;
  x[i + 1] = 0;
  x[i + 2] = 0;
  x[i + 3] = 0;
  x[i + 4] = 0;
  x[i + 5] = 0;
  x[i + 6] = 0;
  x[i + 7] = 0;
  bar (x, y);
}
* [Bug target/95798] [10/11 Regression] Initialization code --- suboptimal
From: jakub at gcc dot gnu.org @ 2020-06-22 10:44 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95798
--- Comment #3 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Perhaps the change should be guarded on single_use?
* [Bug target/95798] [10/11 Regression] Initialization code --- suboptimal
From: jakub at gcc dot gnu.org @ 2020-06-22 13:05 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95798
--- Comment #4 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Partially related: compiling the following with -O2 -fno-ipa-icf:
void
foo (int x, int *p)
{
  p[x + 1] = 1;
}

void
bar (int x, int *p)
{
  p[x + 1UL] = 1;
}

void
baz (int x, int *p)
{
  unsigned long l = x;
  l++;
  p[l] = 1;
}

void
qux (int x, int *p)
{
  unsigned long l = x + 1;
  p[l] = 1;
}
we get the same 3-insn functions for the first 3 cases and a 4-insn function
for the last one. I'm surprised that we treat foo and qux differently: x + 1
has undefined overflow, so (unsigned long) (x + 1) can be implemented as
x + 1UL, which should be beneficial when used in address arithmetic (so should
e.g. expansion optimize it, or ivopts, or isel?).
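The equivalence argued above can be checked at the source level. The following is only a sketch: the function names are hypothetical and not taken from the bug report, and it deliberately excludes x == INT_MAX, where x + 1 overflows and the behavior is undefined.

```c
#include <assert.h>
#include <limits.h>

/* Index as in qux(): add in int, then widen to unsigned long.
   Hypothetical helper; the caller must ensure x + 1 does not overflow.  */
static unsigned long
widen_after_add (int x)
{
  assert (x < INT_MAX);
  return (unsigned long) (x + 1);
}

/* Index as in bar(): widen x first, then add in unsigned long.  */
static unsigned long
widen_before_add (int x)
{
  return x + 1UL;
}

/* Returns nonzero when both forms yield the same index for X.  */
static int
forms_agree (int x)
{
  return widen_after_add (x) == widen_before_add (x);
}
```

Both forms give the same result for every int below INT_MAX, including negative values, because the conversion to unsigned long is modular. That is why a compiler may pick whichever form is cheaper when signed overflow is undefined.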
* [Bug target/95798] [10/11 Regression] Initialization code --- suboptimal
From: rguenth at gcc dot gnu.org @ 2020-07-23 6:51 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95798
Richard Biener <rguenth at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Target Milestone|10.2 |10.3
--- Comment #5 from Richard Biener <rguenth at gcc dot gnu.org> ---
GCC 10.2 is released, adjusting target milestone.
* [Bug target/95798] [10/11 Regression] Initialization code --- suboptimal
From: rguenth at gcc dot gnu.org @ 2020-10-12 12:54 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95798
Richard Biener <rguenth at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Priority|P3 |P2
* [Bug target/95798] [10/11 Regression] Initialization code --- suboptimal
From: jakub at gcc dot gnu.org @ 2021-02-24 15:33 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95798
Jakub Jelinek <jakub at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|NEW |ASSIGNED
Assignee|unassigned at gcc dot gnu.org |jakub at gcc dot gnu.org
--- Comment #6 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Created attachment 50249
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=50249&action=edit
gcc11-pr95798.patch
Untested fix.
The #c4 qux case above is not fixed by it, but it isn't a regression, so I
think we should defer that one to GCC 12 (and file a separate PR for it).
Also, such an optimization would work (if done e.g. at expansion time) only
when signed integer overflow is undefined, so not with -fwrapv nor when e.g.
the narrower type is unsigned, so I think we need the single_use match.pd case
because after those changes there is no way to undo that.
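To illustrate why the unsigned case cannot be undone later, here is a small sketch. The function names are hypothetical, and it assumes unsigned int is narrower than unsigned long long, as on common targets. Unsigned addition wraps by definition, so adding before widening and widening before adding give different results at the wrap point.

```c
#include <assert.h>
#include <limits.h>

/* Add in the narrow unsigned type (wraps at UINT_MAX), then widen.  */
static unsigned long long
narrow_add_then_widen (unsigned int u)
{
  return (unsigned long long) (u + 1u);
}

/* Widen first, then add in the wide type (no wrap at UINT_MAX).  */
static unsigned long long
widen_then_add (unsigned int u)
{
  return (unsigned long long) u + 1u;
}

/* Hypothetical self-check: the two forms agree away from the wrap
   point and differ exactly at u == UINT_MAX.  */
static void
check_unsigned_forms (void)
{
  assert (narrow_add_then_widen (5u) == widen_then_add (5u));
  assert (narrow_add_then_widen (UINT_MAX) == 0ULL);
  assert (widen_then_add (UINT_MAX) == (unsigned long long) UINT_MAX + 1ULL);
}
```

Since the two forms are observably different for one input, rewriting one into the other is not valid without extra range information.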
* [Bug target/95798] [10/11 Regression] Initialization code --- suboptimal
From: cvs-commit at gcc dot gnu.org @ 2021-02-25 9:26 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95798
--- Comment #7 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Jakub Jelinek <jakub@gcc.gnu.org>:
https://gcc.gnu.org/g:880682e7b2348d66f4089fa4af102b69eaaefbc2
commit r11-7384-g880682e7b2348d66f4089fa4af102b69eaaefbc2
Author: Jakub Jelinek <jakub@redhat.com>
Date: Thu Feb 25 10:22:53 2021 +0100
match.pd: Use :s for (T)(A) + CST -> (T)(A + CST) [PR95798]
The r10-2806 change regressed the following testcases: instead of doing the
int -> unsigned long sign-extension once and then adding 8, 16, ... 56 to it
for each of the memory accesses, it adds 8, 16, ... 56 in int mode and then
sign-extends each. So that means:
+ movq $0, (%rsp,%rax,8)
+ leal 1(%rdx), %eax
+ cltq
+ movq $1, (%rsp,%rax,8)
+ leal 2(%rdx), %eax
+ cltq
+ movq $2, (%rsp,%rax,8)
+ leal 3(%rdx), %eax
+ cltq
+ movq $3, (%rsp,%rax,8)
+ leal 4(%rdx), %eax
+ cltq
+ movq $4, (%rsp,%rax,8)
+ leal 5(%rdx), %eax
+ cltq
+ movq $5, (%rsp,%rax,8)
+ leal 6(%rdx), %eax
+ addl $7, %edx
+ cltq
+ movslq %edx, %rdx
+ movq $6, (%rsp,%rax,8)
+ movq $7, (%rsp,%rdx,8)
- movq $0, (%rsp,%rdx,8)
- movq $1, 8(%rsp,%rdx,8)
- movq $2, 16(%rsp,%rdx,8)
- movq $3, 24(%rsp,%rdx,8)
- movq $4, 32(%rsp,%rdx,8)
- movq $5, 40(%rsp,%rdx,8)
- movq $6, 48(%rsp,%rdx,8)
- movq $7, 56(%rsp,%rdx,8)
GCC 9 -> 10 change or:
- movq $0, (%rsp,%rdx,8)
- movq $1, 8(%rsp,%rdx,8)
- movq $2, 16(%rsp,%rdx,8)
- movq $3, 24(%rsp,%rdx,8)
- movq $4, 32(%rsp,%rdx,8)
- movq $5, 40(%rsp,%rdx,8)
- movq $6, 48(%rsp,%rdx,8)
- movq $7, 56(%rsp,%rdx,8)
+ movq $0, (%rsp,%rax,8)
+ leal 1(%rdx), %eax
+ movq $1, (%rsp,%rax,8)
+ leal 2(%rdx), %eax
+ movq $2, (%rsp,%rax,8)
+ leal 3(%rdx), %eax
+ movq $3, (%rsp,%rax,8)
+ leal 4(%rdx), %eax
+ movq $4, (%rsp,%rax,8)
+ leal 5(%rdx), %eax
+ movq $5, (%rsp,%rax,8)
+ leal 6(%rdx), %eax
+ movq $6, (%rsp,%rax,8)
+ leal 7(%rdx), %eax
+ movq $7, (%rsp,%rax,8)
change on the other test. For the former case of int, the fact that signed
integer overflow is undefined (unless -fwrapv) leaves the possibility of
undoing it, e.g. during expansion, but for the unsigned case the information
is unfortunately lost.
The following patch adds :s to the convert which restores these
testcases but keeps the testcases the patch meant to improve as is.
2021-02-25 Jakub Jelinek <jakub@redhat.com>
PR target/95798
* match.pd ((T)(A) + CST -> (T)(A + CST)): Add :s to convert.
* gcc.target/i386/pr95798-1.c: New test.
* gcc.target/i386/pr95798-2.c: New test.
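The two instruction sequences in the commit message correspond roughly to two ways of forming the eight indices. The sketch below writes both out explicitly. The function names and the 64-element buffer are assumptions made for illustration, not the contents of the new tests, and an optimizing compiler may generate identical code for both.

```c
#include <assert.h>
#include <string.h>

enum { NSTORES = 8, BUFLEN = 64 };

/* Widen the int index once, then use constant offsets from it
   (the shape of the shorter "movq $k, 8*k(%rsp,%rdx,8)" sequence).  */
static void
store_widen_once (unsigned long long *x, int y)
{
  unsigned long base = (unsigned long) y;
  for (int k = 0; k < NSTORES; k++)
    x[base + (unsigned long) k] = (unsigned long long) k;
}

/* Add in int, then widen each sum separately
   (the shape of the longer "leal; cltq; movq" sequence).  */
static void
store_widen_each (unsigned long long *x, int y)
{
  for (int k = 0; k < NSTORES; k++)
    {
      int idx = y + k;   /* must not overflow */
      x[(unsigned long) idx] = (unsigned long long) k;
    }
}

/* Hypothetical check: both variants write the same elements.
   Y must satisfy 0 <= Y <= BUFLEN - NSTORES.  */
static int
stores_match (int y)
{
  unsigned long long a[BUFLEN], b[BUFLEN];
  assert (y >= 0 && y <= BUFLEN - NSTORES);
  memset (a, -1, sizeof a);
  memset (b, -1, sizeof b);
  store_widen_once (a, y);
  store_widen_each (b, y);
  return memcmp (a, b, sizeof a) == 0;
}
```

For in-range indices the two variants store the same values to the same elements. The difference is only in how many widening operations the address computation needs.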
* [Bug target/95798] [10 Regression] Initialization code --- suboptimal
From: jakub at gcc dot gnu.org @ 2021-02-25 9:27 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95798
Jakub Jelinek <jakub at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Summary|[10/11 Regression] |[10 Regression]
|Initialization code --- |Initialization code ---
|suboptimal |suboptimal
--- Comment #8 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Fixed on the trunk so far.
* [Bug target/95798] [10 Regression] Initialization code --- suboptimal
From: rguenth at gcc dot gnu.org @ 2021-04-08 12:02 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95798
Richard Biener <rguenth at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Target Milestone|10.3 |10.4
--- Comment #9 from Richard Biener <rguenth at gcc dot gnu.org> ---
GCC 10.3 is being released, retargeting bugs to GCC 10.4.
* [Bug target/95798] [10 Regression] Initialization code --- suboptimal
From: jakub at gcc dot gnu.org @ 2022-06-28 10:41 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95798
Jakub Jelinek <jakub at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Target Milestone|10.4 |10.5
--- Comment #10 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
GCC 10.4 is being released, retargeting bugs to GCC 10.5.
* [Bug target/95798] [10 Regression] Initialization code --- suboptimal
From: rguenth at gcc dot gnu.org @ 2023-07-07 8:55 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95798
Richard Biener <rguenth at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Resolution|--- |FIXED
Known to work| |11.1.0
Target Milestone|10.5 |11.0
Status|ASSIGNED |RESOLVED
Known to fail| |10.5.0
--- Comment #11 from Richard Biener <rguenth at gcc dot gnu.org> ---
Fixed for GCC 11.
Thread overview: 13+ messages
2020-06-21 7:16 [Bug target/95798] New: Initialization code --- suboptimal zero at smallinteger dot com
2020-06-21 16:57 ` [Bug target/95798] " zero at smallinteger dot com
2020-06-22 10:42 ` [Bug target/95798] [10/11 Regression] " jakub at gcc dot gnu.org
2020-06-22 10:44 ` jakub at gcc dot gnu.org
2020-06-22 13:05 ` jakub at gcc dot gnu.org
2020-07-23 6:51 ` rguenth at gcc dot gnu.org
2020-10-12 12:54 ` rguenth at gcc dot gnu.org
2021-02-24 15:33 ` jakub at gcc dot gnu.org
2021-02-25 9:26 ` cvs-commit at gcc dot gnu.org
2021-02-25 9:27 ` [Bug target/95798] [10 " jakub at gcc dot gnu.org
2021-04-08 12:02 ` rguenth at gcc dot gnu.org
2022-06-28 10:41 ` jakub at gcc dot gnu.org
2023-07-07 8:55 ` rguenth at gcc dot gnu.org