public inbox for gcc-bugs@sourceware.org
* [Bug target/98289] New: [x86] Suboptimal optimization of stack usage when function call to cold function is not needed
From: gabravier at gmail dot com @ 2020-12-15 14:46 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98289
Bug ID: 98289
Summary: [x86] Suboptimal optimization of stack usage when
function call to cold function is not needed
Product: gcc
Version: 11.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: gabravier at gmail dot com
Target Milestone: ---
void f(bool cond)
{
    if (cond)
        __builtin_abort();
}
On x86 with current trunk and -O3, this results in:
f(bool):
        sub     rsp, 8
        test    dil, dil
        jne     .L3
        add     rsp, 8
        ret
f(bool) [clone .cold]:
.L3:
        call    abort
This seems like a regression over GCC 7.5, which outputs:
f(bool):
        test    dil, dil
        jne     .L7
        rep ret
.L7:
        sub     rsp, 8
        call    abort
LLVM produces output similar to that of GCC 7.5. Emitting the code that sets
up the call only on the path that actually performs it seems quicker in the
case where the call doesn't occur.
* [Bug target/98289] [8/9/10/11 Regression] [x86] Suboptimal optimization of stack usage when function call does not occur
From: jakub at gcc dot gnu.org @ 2020-12-15 17:24 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98289
Jakub Jelinek <jakub at gcc dot gnu.org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
           Priority|P3                          |P2
   Target Milestone|---                         |8.5
             Status|UNCONFIRMED                 |NEW
     Ever confirmed|0                           |1
   Last reconfirmed|                            |2020-12-15
            Summary|[x86] Suboptimal            |[8/9/10/11 Regression]
                   |optimization of stack usage |[x86] Suboptimal
                   |when function call does not |optimization of stack usage
                   |occur                       |when function call does not
                   |                            |occur
                 CC|                            |hubicka at gcc dot gnu.org,
                   |                            |jakub at gcc dot gnu.org
--- Comment #1 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Started with r8-1272-g227b76c3b4ef63b1226f4e584dbdf42c9e56ff9f
* [Bug target/98289] [8/9/10/11 Regression] [x86] Suboptimal optimization of stack usage when function call does not occur
From: jakub at gcc dot gnu.org @ 2020-12-15 18:03 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98289
--- Comment #2 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
That particular change is of course correct, but it means that
shrink-wrapping, at least on this testcase, doesn't work with
-freorder-blocks-and-partition, which is on by default.
Compiling it with -O2 -fno-reorder-blocks-and-partition (or -O3 plus that
option) makes it shrink-wrapped again.
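
For anyone wanting to reproduce the comparison described above, the following is a minimal sketch rather than an official reproducer. It assumes the testcase is saved under the hypothetical file name pr98289.cc and that Intel-syntax assembly (as in the listings in the report) is requested with -masm=intel. The helper hot_path_returns is not part of the report; it is added here only to exercise the path on which abort is not called.

```cpp
// pr98289.cc (hypothetical file name): testcase from the report plus a helper.
//
// Compare the assembly generated for f with and without hot/cold partitioning:
//   g++ -O2 -S -masm=intel pr98289.cc -o partitioned.s
//   g++ -O2 -fno-reorder-blocks-and-partition -S -masm=intel pr98289.cc -o unpartitioned.s
#include <cassert>

// Testcase from the report: the call to abort sits on the cold path only.
void f(bool cond)
{
  if (cond)
    __builtin_abort();
}

// Hypothetical helper (an assumption, not from the report): takes only the
// hot path, so it returns normally without ever reaching abort.
bool hot_path_returns()
{
  f(false);
  return true;
}
```

In the shrink-wrapped output, the stack adjustment should appear only next to the call to abort, as in the GCC 7.5 listing in the original report.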
* [Bug target/98289] [8/9/10/11 Regression] [x86] Suboptimal optimization of stack usage when function call does not occur
From: jakub at gcc dot gnu.org @ 2020-12-15 18:23 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98289
--- Comment #3 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
--- gcc/shrink-wrap.c.jj 2020-07-28 15:39:09.983756571 +0200
+++ gcc/shrink-wrap.c 2020-12-15 19:15:00.213861334 +0100
@@ -494,7 +494,7 @@ can_get_prologue (basic_block pro, HARD_
   edge e;
   edge_iterator ei;
   FOR_EACH_EDGE (e, ei, pro->preds)
-    if (e->flags & (EDGE_COMPLEX | EDGE_CROSSING)
+    if (e->flags & EDGE_COMPLEX
         && !dominated_by_p (CDI_DOMINATORS, e->src, pro))
       return false;
fixes it for me. I'm not sure I understand why EDGE_CROSSING has been listed
there: if pro is in the cold partition and has an EDGE_CROSSING edge leading
to it, why can't the prologue be added to the cold partition and the jump
redirected to it?
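
The effect of the hunk above can be illustrated with a small standalone model. This is only a sketch under stated assumptions: the Toy* types, the flag values and the src_dominated field are hypothetical stand-ins, not GCC's actual data structures, and the model covers only the loop shown in the hunk, not the rest of can_get_prologue.

```cpp
#include <cassert>
#include <vector>

// Toy flag values (assumption); GCC's real EDGE_* constants are defined
// elsewhere and have different values.
enum ToyEdgeFlags : unsigned {
  TOY_EDGE_COMPLEX  = 1u << 0,
  TOY_EDGE_CROSSING = 1u << 1
};

struct ToyEdge {
  unsigned flags;      // bitmask of ToyEdgeFlags
  bool src_dominated;  // stands in for dominated_by_p (CDI_DOMINATORS, e->src, pro)
};

// Models the loop from the hunk: a block is rejected for prologue placement
// if any incoming edge carries one of the `blocking` flags and its source is
// not dominated by the candidate block.
bool toy_can_get_prologue(const std::vector<ToyEdge>& preds, unsigned blocking)
{
  for (const ToyEdge& e : preds)
    if ((e.flags & blocking) && !e.src_dominated)
      return false;
  return true;
}
```

With a single incoming crossing edge from a non-dominated block, passing TOY_EDGE_COMPLEX | TOY_EDGE_CROSSING as the blocking mask (the old condition) rejects the block, while passing only TOY_EDGE_COMPLEX (the patched condition) accepts it.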
* [Bug rtl-optimization/98289] [8/9/10/11 Regression] [x86] Suboptimal optimization of stack usage when function call does not occur
From: cvs-commit at gcc dot gnu.org @ 2020-12-17 12:30 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98289
--- Comment #4 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Jakub Jelinek <jakub@gcc.gnu.org>:
https://gcc.gnu.org/g:62cb9680e592057a49de66eac34da679338932f9
commit r11-6222-g62cb9680e592057a49de66eac34da679338932f9
Author: Jakub Jelinek <jakub@redhat.com>
Date: Thu Dec 17 13:28:48 2020 +0100
shrink-wrap: Don't punt on incoming EDGE_CROSSING [PR98289]
As mentioned in the PR, shrink-wrapping disqualifies basic blocks that have
an incoming EDGE_CROSSING edge from prologue placement.
I don't see why that is necessary, those edges seem to be redirected
just fine, both on x86_64 and powerpc64. In the former case, they
are usually conditional jumps that patch_jump_insn can handle just fine,
after all, they were previously crossing and will be crossing after
the redirection too, just to a different label. And in the powerpc64
case, it is a simple_jump instead that again seems to be handled by
patch_jump_insn just fine.
Sure, redirecting an edge that was previously not crossing to be crossing or
vice versa can fail, but that is not what shrink-wrapping needs.
I also tested GCC 8 with this patch and don't see ICEs there either
(though, of course, I'm not suggesting we should backport this to release
branches).
The old ICEs could have been fixed by the PR87475 fix or some other one
years ago.
2020-12-17 Jakub Jelinek <jakub@redhat.com>
PR rtl-optimization/98289
* shrink-wrap.c (can_get_prologue): Don't punt on EDGE_CROSSING
incoming edges.
* gcc.target/i386/pr98289.c: New test.
* gcc.dg/torture/pr98289.c: New test.
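
The argument in the commit message, that an edge which was crossing stays crossing when it is redirected to a prologue placed in the same partition, can be sketched with a toy model. Everything here is an assumption made for illustration: the types and function names are hypothetical, and the model simply refuses any redirection that would change the crossing status, whereas the commit message only says such a redirection can fail.

```cpp
#include <cassert>

// Toy model (assumption): each block lives in exactly one partition.
enum class Partition { Hot, Cold };

struct ToyBlock {
  Partition part;
};

struct ToyCfgEdge {
  const ToyBlock* src;
  const ToyBlock* dest;
};

// An edge is "crossing" when its endpoints are in different partitions.
bool is_crossing(const ToyCfgEdge& e)
{
  return e.src->part != e.dest->part;
}

// Redirects e to new_dest only if the crossing status stays the same.
// Returns false and leaves e untouched otherwise.
bool redirect_preserving_crossing(ToyCfgEdge& e, const ToyBlock* new_dest)
{
  const bool before = is_crossing(e);
  const ToyCfgEdge candidate{e.src, new_dest};
  if (is_crossing(candidate) != before)
    return false;
  e = candidate;
  return true;
}
```

In this model, a jump from a hot block to a cold block can be retargeted at a new cold block holding the prologue, since it is crossing both before and after; only a retarget that would move the destination into the hot partition is refused.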
* [Bug rtl-optimization/98289] [8/9/10 Regression] [x86] Suboptimal optimization of stack usage when function call does not occur
From: jakub at gcc dot gnu.org @ 2020-12-29 13:35 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98289
Jakub Jelinek <jakub at gcc dot gnu.org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
            Summary|[8/9/10/11 Regression]      |[8/9/10 Regression] [x86]
                   |[x86] Suboptimal            |Suboptimal optimization of
                   |optimization of stack usage |stack usage when function
                   |when function call does not |call does not occur
                   |occur                       |
   Target Milestone|8.5                         |11.0
         Resolution|---                         |FIXED
--- Comment #5 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Fixed on the trunk, no plans to backport.