public inbox for gcc-bugs@sourceware.org
* [Bug target/105034] New: [10/11/12 regression]Suboptimal codegen for min/max with -Os
From: crazylht at gmail dot com @ 2022-03-23 6:50 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105034
Bug ID: 105034
Summary: [10/11/12 regression]Suboptimal codegen for min/max
with -Os
Product: gcc
Version: 12.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: crazylht at gmail dot com
Target Milestone: ---
#define max(a,b) (((a) > (b))? (a) : (b))
#define min(a,b) (((a) < (b))? (a) : (b))

int foo(int x)
{
  return max(x,0);
}

int bar(int x)
{
  return min(x,0);
}

unsigned int baz(unsigned int x)
{
  return min(x,1);
}
gcc10/11/12 -Os -msse4.1

foo(int):
        movd    xmm0, edi
        xorps   xmm1, xmm1
        pmaxsd  xmm0, xmm1
        movd    eax, xmm0
        ret
bar(int):
        movd    xmm0, edi
        xorps   xmm1, xmm1
        pminsd  xmm0, xmm1
        movd    eax, xmm0
        ret
baz(unsigned int):
        xor     eax, eax
        test    edi, edi
        setne   al
        ret

gcc9.4 -Os -msse4.1

foo(int):
        test    edi, edi
        mov     eax, 0
        cmovns  eax, edi
        ret
bar(int):
        test    edi, edi
        mov     eax, 0
        cmovle  eax, edi
        ret
baz(unsigned int):
        xor     eax, eax
        test    edi, edi
        setne   al
        ret

Os size
   text    data     bss     dec     hex filename
    178       0       0     178      b2 Os.o

O2 size
   text    data     bss     dec     hex filename
    176       0       0     176      b0 O2.o
https://godbolt.org/z/1sYxdTcKz
* [Bug target/105034] [10/11/12 regression]Suboptimal codegen for min/max with -Os
From: rguenth at gcc dot gnu.org @ 2022-03-23 11:01 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105034
Richard Biener <rguenth at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Last reconfirmed| |2022-03-23
Status|UNCONFIRMED |NEW
Ever confirmed|0 |1
Target Milestone|--- |10.4
Priority|P3 |P2
--- Comment #1 from Richard Biener <rguenth at gcc dot gnu.org> ---
With -mavx it gets worse since vpxor is one byte larger than xorps. Not sure
if STV is tuned for -Os very well.
* [Bug target/105034] [10/11/12 regression]Suboptimal codegen for min/max with -Os
From: wwwhhhyyy333 at gmail dot com @ 2022-03-28 2:49 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105034
--- Comment #2 from Hongyu Wang <wwwhhhyyy333 at gmail dot com> ---
With -O2, STV does not perform this transformation:
Computing gain for chain #1...
Instruction gain 8 for 7: {r84:SI=smax(r85:SI,0);clobber flags:CC;}
REG_DEAD r85:SI
REG_UNUSED flags:CC
Instruction conversion gain: 8
Registers conversion cost: 12
Total gain: -4
This is because the sse->integer register move cost is 6 in the generic cost table.
But for -Os the cost is 3, so the conversion is considered profitable:
Computing gain for chain #1...
Instruction gain 8 for 7: {r84:SI=smax(r85:SI,0);clobber flags:CC;}
REG_DEAD r85:SI
REG_UNUSED flags:CC
Instruction conversion gain: 8
Registers conversion cost: 6
Total gain: 2
FWIW, the solution would be either to adjust the ix86_size cost or to block
optimize_size in the stv gate.
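
The arithmetic in the two dumps above can be reproduced with a small sketch. This is not GCC code: total_gain is a hypothetical helper, and the factor of two register moves per chain is an assumption inferred from the dumps (12 = 2 * 6 and 6 = 2 * 3), not something stated in this thread.

```c
/* Hypothetical helper, not part of GCC: mirrors the "Total gain" line of
   the STV dump, i.e. instruction conversion gain minus the cost of moving
   values between integer and SSE registers.  */
static int total_gain(int insn_gain, int reg_moves, int move_cost)
{
    return insn_gain - reg_moves * move_cost;
}

/* Assumption: two register moves per chain, inferred from the dumps.  */
static int gain_generic(void) { return total_gain(8, 2, 6); } /* -O2 dump */
static int gain_size(void)    { return total_gain(8, 2, 3); } /* -Os dump */
```

With these inputs gain_generic() is -4, so the chain is rejected, while gain_size() is 2, so the chain is converted, matching the dumps.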
* [Bug target/105034] [10/11/12 regression]Suboptimal codegen for min/max with -Os
From: roger at nextmovesoftware dot com @ 2022-04-14 8:38 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105034
Roger Sayle <roger at nextmovesoftware dot com> changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |roger at nextmovesoftware dot com
--- Comment #3 from Roger Sayle <roger at nextmovesoftware dot com> ---
Hi Hongtao,
Note that -mstv is a net win on the code size benchmark CSiBE, so gating the
entire pass on optimize_size is not an ideal solution. Instead, the gain
function needs to choose which chains to transform based on
optimize_size-aware costs.
* [Bug target/105034] [10/11/12 regression]Suboptimal codegen for min/max with -Os
From: rguenth at gcc dot gnu.org @ 2022-04-14 9:00 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105034
--- Comment #4 from Richard Biener <rguenth at gcc dot gnu.org> ---
Example that we don't transform but could:
typedef int v4si __attribute__((vector_size(16)));

#define min(a,b) ((a)<(b)?(a):(b))

v4si foo (v4si a, v4si b)
{
  a[0] = min (a[0], b[0]);
  return a;
}

Here the scalar code is

        movd    %xmm0, %edx
        movd    %xmm1, %eax
        cmpl    %edx, %eax
        cmovg   %edx, %eax
        pinsrd  $0, %eax, %xmm0

where we could use something like

        movq    %xmm0, %xmm2
        minpd   %xmm2, %xmm1
        <some pack/unpack/palign or whatever>

A testcase variant could return the scalar minimum. For both cases it's
likely a win even for -Os.
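
A minimal sketch of the variant that returns the scalar minimum might look as follows. The name foo_scalar is hypothetical and the function is only an illustration of the idea; it was not posted as a testcase in this thread.

```c
typedef int v4si __attribute__((vector_size(16)));

#define min(a,b) ((a)<(b)?(a):(b))

/* Hypothetical variant: return the minimum of the two low lanes as a
   scalar instead of inserting it back into the vector.  */
int foo_scalar (v4si a, v4si b)
{
  return min (a[0], b[0]);
}
```

Both inputs already live in SSE registers here, so only the result would need to move to an integer register.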
* [Bug target/105034] [10/11/12 regression]Suboptimal codegen for min/max with -Os
From: roger at nextmovesoftware dot com @ 2022-04-14 9:20 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105034
--- Comment #5 from Roger Sayle <roger at nextmovesoftware dot com> ---
The latest CSiBE results on x86_64-pc-linux-gnu: With -Os the total size is
3696263, and with -Os -mno-stv the total size is 3696887, i.e. 624 bytes
larger. The worst regression from -mno-stv is
teem-1.6.0-src/src/nrrd/parseNrrd, which is 402 bytes larger, and the best
improvement from -mno-stv is linux-2.4.23-pre3-testplatform/net/ipv4/route,
which is 134 bytes smaller. So I think this is a fine-tuning problem.
cmp/cmov is much shorter than a pmax or a pmin, so SImode MAX/MIN should have
negative gain with -Os. Likewise for const0_rtx.
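
To make the size difference concrete, here is a rough tally of the two foo(int) sequences from the original report, excluding the final ret. The byte lengths are hand-counted assuming the usual x86-64 encodings with no extra prefixes; they are not taken from this thread, and the helper names are hypothetical.

```c
#include <stddef.h>

/* Hand-counted encoded lengths in bytes (an assumption, not thread data).  */
static const int scalar_seq[] = {
    2, /* test   edi, edi   */
    5, /* mov    eax, 0     */
    3  /* cmovns eax, edi   */
};

static const int sse_seq[] = {
    4, /* movd   xmm0, edi  */
    3, /* xorps  xmm1, xmm1 */
    5, /* pmaxsd xmm0, xmm1 */
    4  /* movd   eax, xmm0  */
};

/* Sum the lengths of one instruction sequence.  */
static int seq_bytes(const int *len, size_t n)
{
    int total = 0;
    for (size_t i = 0; i < n; i++)
        total += len[i];
    return total;
}

static int scalar_bytes(void)
{
    return seq_bytes(scalar_seq, sizeof scalar_seq / sizeof scalar_seq[0]);
}

static int sse_bytes(void)
{
    return seq_bytes(sse_seq, sizeof sse_seq / sizeof sse_seq[0]);
}
```

Under these assumptions the scalar sequence is 10 bytes and the SSE sequence is 16 bytes, so a size-aware gain for this pattern should come out negative.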
* [Bug target/105034] [10/11/12/13 regression]Suboptimal codegen for min/max with -Os
From: jakub at gcc dot gnu.org @ 2022-06-28 10:48 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105034
Jakub Jelinek <jakub at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Target Milestone|10.4 |10.5
--- Comment #6 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
GCC 10.4 is being released, retargeting bugs to GCC 10.5.
* [Bug target/105034] [11/12/13/14 regression]Suboptimal codegen for min/max with -Os
From: rguenth at gcc dot gnu.org @ 2023-07-07 10:42 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105034
Richard Biener <rguenth at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Target Milestone|10.5 |11.5
--- Comment #7 from Richard Biener <rguenth at gcc dot gnu.org> ---
GCC 10 branch is being closed.
* [Bug target/105034] [11/12/13/14 regression]Suboptimal codegen for min/max with -Os
From: crazylht at gmail dot com @ 2023-11-01 3:54 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105034
Hongtao.liu <crazylht at gmail dot com> changed:
What |Removed |Added
----------------------------------------------------------------------------
Resolution|--- |FIXED
Status|NEW |RESOLVED
--- Comment #8 from Hongtao.liu <crazylht at gmail dot com> ---
Looks like it's fixed in latest trunk.