public inbox for gcc-bugs@sourceware.org
* [Bug tree-optimization/109667] New: [12/13/14 Regression] Unnecessary temporary storage used for 32-byte struct
@ 2023-04-28 11:46 chfast at gmail dot com
2023-04-28 12:27 ` [Bug tree-optimization/109667] " rguenth at gcc dot gnu.org
` (3 more replies)
0 siblings, 4 replies; 5+ messages in thread
From: chfast at gmail dot com @ 2023-04-28 11:46 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109667
Bug ID: 109667
Summary: [12/13/14 Regression] Unnecessary temporary storage
used for 32-byte struct
Product: gcc
Version: 12.3.1
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: tree-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: chfast at gmail dot com
Target Milestone: ---
Reduced reproducer:
struct i256 {
    long v[4];
};
void assign(struct i256 *v, long z) {
    struct i256 r = {};
    for (int i = 0; i < 1; ++i)
        r.v[i] = z;
    *v = r;
}
https://godbolt.org/z/avM74o3r6
The compiler allocates temporary storage on the stack for `r`:
assign:
        pxor    xmm0, xmm0
        mov     QWORD PTR [rsp-40], rsi
        movups  XMMWORD PTR [rsp-32], xmm0
        movdqa  xmm1, XMMWORD PTR [rsp-40]
        mov     QWORD PTR [rsp-16], 0
        movdqa  xmm2, XMMWORD PTR [rsp-24]
        movups  XMMWORD PTR [rdi], xmm1
        movups  XMMWORD PTR [rdi+16], xmm2
        ret
This is a regression since GCC 12. GCC 11 compiles the same code cleanly to:
assign:
        mov     QWORD PTR [rdi], rsi
        mov     QWORD PTR [rdi+8], 0
        mov     QWORD PTR [rdi+16], 0
        mov     QWORD PTR [rdi+24], 0
        ret
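For comparison, here is a source-level variant that stores each element straight into `*v`, mirroring the four stores GCC 11 emits above. This is only a sketch: the names `assign_direct` and `check_variants_agree` are hypothetical, `{0}` replaces `{}` for pre-C23 portability, and whether GCC 12 and later avoid the stack temporary for this form is an assumption to confirm by inspecting the generated assembly.

```c
#include <assert.h>
#include <string.h>

struct i256 {
    long v[4];
};

/* Reproducer from the report; {0} is used instead of {} so that it
 * also compiles as pre-C23 standard C. */
void assign(struct i256 *v, long z) {
    struct i256 r = {0};
    for (int i = 0; i < 1; ++i)
        r.v[i] = z;
    *v = r;
}

/* Hypothetical variant: no local aggregate, one store per element. */
void assign_direct(struct i256 *v, long z) {
    v->v[0] = z;
    v->v[1] = 0;
    v->v[2] = 0;
    v->v[3] = 0;
}

/* Aborts via assert() if the two functions leave different bytes in
 * the destination. The destinations are pre-filled with different
 * garbage so that a missed store would be detected. */
void check_variants_agree(long z) {
    struct i256 a, b;
    memset(&a, 0xAA, sizeof a);
    memset(&b, 0x55, sizeof b);
    assign(&a, z);
    assign_direct(&b, z);
    assert(memcmp(&a, &b, sizeof a) == 0);
}
```

Calling `check_variants_agree` with a few values of `z` confirms only that the two functions are semantically equivalent; it says nothing about code quality.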
* [Bug tree-optimization/109667] [12/13/14 Regression] Unnecessary temporary storage used for 32-byte struct
From: rguenth at gcc dot gnu.org @ 2023-04-28 12:27 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109667
Richard Biener <rguenth at gcc dot gnu.org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|---                         |12.3
                 CC|                            |jamborm at gcc dot gnu.org
     Ever confirmed|0                           |1
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2023-04-28
           Priority|P3                          |P2
--- Comment #1 from Richard Biener <rguenth at gcc dot gnu.org> ---
Confirmed. We end up with
void assign (struct i256 * v, long int z)
{
  struct i256 r;
  <bb 2> [local count: 536870913]:
  MEM <char[24]> [(struct i256 *)&r + 8B] = {};
  r.v[0] = z_8(D);
  *v_5(D) = r;
  r ={v} {CLOBBER(eol)};
  return;
}
and I think the issue is that DSE trims the zero-initialization of r down to
its last 24 bytes. Compiling with -fno-tree-dse fixes it, because without DSE
the untrimmed form is left for SRA to handle:
r = {};
r.v[0] = z_8(D);
*v_5(D) = r;
r ={v} {CLOBBER(eol)};
can be scalarized by SRA, while
MEM <char[24]> [(struct i256 *)&r + 8B] = {};
r.v[0] = z_8(D);
*v_5(D) = r;
r ={v} {CLOBBER(eol)};
fails. The SRA dump for it shows only a partial replacement:
Candidate (2756): r
Created a replacement for r offset: 0, size: 64: r$v$0D.2766
...
MEM <char[24]> [(struct i256 *)&r + 8B] = {};
r$v$0_13 = z_8(D);
r.v[0] = r$v$0_13;
*v_5(D) = r;
That replacement is pointless, because r is still copied to *v as a whole
aggregate. Compare the dump for the untrimmed form:
Candidate (2756): r
Will attempt to totally scalarize r (UID: 2756):
Created a replacement for r offset: 0, size: 64: r$v$0D.2766
Created a replacement for r offset: 64, size: 64: r$v$1D.2767
Created a replacement for r offset: 128, size: 64: r$v$2D.2768
Created a replacement for r offset: 192, size: 64: r$v$3D.2769
...
r$v$0_13 = 0;
r$v$1_2 = 0;
r$v$2_1 = 0;
r$v$3_11 = 0;
r$v$0_10 = z_8(D);
v_5(D)->v[0] = r$v$0_10;
v_5(D)->v[1] = r$v$1_2;
v_5(D)->v[2] = r$v$2_1;
v_5(D)->v[3] = r$v$3_11;
Possibly SRA is confused by the char[24] type of the trimmed store. It is
going to be difficult to do better, though.
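The totally scalarized dump above can be read as the following hand-written C, offered as a sketch rather than as what GCC generates. The locals `r_v0` to `r_v3` are hypothetical stand-ins for the SRA replacements `r$v$0` to `r$v$3`, and `assign_scalarized` and `check_scalarized` are names introduced here for illustration.

```c
#include <assert.h>

struct i256 {
    long v[4];
};

/* Source-level analogue of total scalarization: the aggregate r is
 * gone, and each of its four 64-bit elements lives in its own scalar. */
void assign_scalarized(struct i256 *v, long z) {
    long r_v0 = 0;  /* r$v$0 = 0 */
    long r_v1 = 0;  /* r$v$1 = 0 */
    long r_v2 = 0;  /* r$v$2 = 0 */
    long r_v3 = 0;  /* r$v$3 = 0 */

    r_v0 = z;       /* r$v$0 = z; the zero store above is now dead */

    /* The aggregate copy *v = r becomes four element-wise stores. */
    v->v[0] = r_v0;
    v->v[1] = r_v1;
    v->v[2] = r_v2;
    v->v[3] = r_v3;
}

/* Aborts via assert() unless the result is {z, 0, 0, 0}. */
void check_scalarized(long z) {
    struct i256 out = {{1, 2, 3, 4}};
    assign_scalarized(&out, z);
    assert(out.v[0] == z);
    assert(out.v[1] == 0 && out.v[2] == 0 && out.v[3] == 0);
}
```

Once every element is a scalar, ordinary scalar optimizations can remove the dead zero store to the first element and forward the constants, which is consistent with the four plain stores seen in the GCC 11 output.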
* [Bug tree-optimization/109667] [12/13/14 Regression] Unnecessary temporary storage used for 32-byte struct
From: jakub at gcc dot gnu.org @ 2023-05-02 9:45 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109667
Jakub Jelinek <jakub at gcc dot gnu.org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jakub at gcc dot gnu.org
--- Comment #2 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
This started with r12-155-gd8e1f1d24179690fd9c0.
* [Bug tree-optimization/109667] [12/13/14 Regression] Unnecessary temporary storage used for 32-byte struct
From: jakub at gcc dot gnu.org @ 2023-05-02 9:56 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109667
--- Comment #3 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
So there are three options. Either SRA should be tweaked so that it can deal
with DSE trimming of initializations (I think that is the best way forward;
after all, the user could have done the trimming manually too:
struct i256 {
    long v[4];
};
void assign(struct i256 *v, long z) {
    struct i256 r;
    __builtin_memset (&r.v[1], 0, sizeof (long) * 3);
    for (int i = 0; i < 1; ++i)
        r.v[i] = z;
    *v = r;
}
which regressed with r7-2588-gdf7ec09f1209a33b35), or we should do something
at the RTL level, as apparently worked before the r7-2588 change, or we should
disable trimming in the DSE pass before SRA.
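To make the point about manual trimming concrete, the sketch below puts the full zero-initialization next to the hand-trimmed `memset` form and checks that they produce the same result. The function names are hypothetical, `memset` from `<string.h>` stands in for `__builtin_memset`, and `{0}` replaces `{}` for pre-C23 portability.

```c
#include <assert.h>
#include <string.h>

struct i256 {
    long v[4];
};

/* Full zero-initialization, as in the original reproducer. */
void assign_full_init(struct i256 *v, long z) {
    struct i256 r = {0};
    for (int i = 0; i < 1; ++i)
        r.v[i] = z;
    *v = r;
}

/* Hand-trimmed initialization: only v[1]..v[3] are zeroed, because
 * v[0] is overwritten by the loop anyway. This is the shape DSE
 * produces when it trims the head of the initializing store. */
void assign_trimmed_init(struct i256 *v, long z) {
    struct i256 r;
    memset(&r.v[1], 0, sizeof(long) * 3);
    for (int i = 0; i < 1; ++i)
        r.v[i] = z;
    *v = r;
}

/* Aborts via assert() if the two forms disagree for the given z. */
void check_init_forms_agree(long z) {
    struct i256 a, b;
    memset(&a, 0xAA, sizeof a);
    memset(&b, 0x55, sizeof b);
    assign_full_init(&a, z);
    assign_trimmed_init(&b, z);
    assert(memcmp(&a, &b, sizeof a) == 0);
}
```

Both functions compute the same value, so any difference in the generated code comes from how later passes cope with the partial initialization, not from the semantics of the source.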
* [Bug tree-optimization/109667] [12/13/14 Regression] Unnecessary temporary storage used for 32-byte struct
From: rguenth at gcc dot gnu.org @ 2023-05-08 12:27 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109667
Richard Biener <rguenth at gcc dot gnu.org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|12.3                        |12.4
--- Comment #4 from Richard Biener <rguenth at gcc dot gnu.org> ---
GCC 12.3 is being released, retargeting bugs to GCC 12.4.