public inbox for gcc-bugs@sourceware.org
* [Bug target/98289] New: [x86] Suboptimal optimization of stack usage when function call to cold function is not needed
@ 2020-12-15 14:46 gabravier at gmail dot com
  2020-12-15 17:24 ` [Bug target/98289] [8/9/10/11 Regression] [x86] Suboptimal optimization of stack usage when function call does not occur jakub at gcc dot gnu.org
                   ` (4 more replies)
  0 siblings, 5 replies; 6+ messages in thread
From: gabravier at gmail dot com @ 2020-12-15 14:46 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98289

            Bug ID: 98289
           Summary: [x86] Suboptimal optimization of stack usage when
                    function call to cold function is not needed
           Product: gcc
           Version: 11.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: gabravier at gmail dot com
  Target Milestone: ---

void f(bool cond)
{
    if (cond)
        __builtin_abort();
}

On x86 with current trunk and -O3, this results in :

f(bool):
  sub rsp, 8
  test dil, dil
  jne .L3
  add rsp, 8
  ret
f(bool) [clone .cold]:
.L3:
  call abort

This seems like a regression over GCC 7.5, which outputs :

f(bool):
  test dil, dil
  jne .L7
  rep ret
.L7:
  sub rsp, 8
  call abort

LLVM produces similar output to GCC 7.5. Emitting the call setup (here, the
stack adjustment) only on the path that actually makes the call seems quicker
in the case where the call doesn't occur.


* [Bug target/98289] [8/9/10/11 Regression] [x86] Suboptimal optimization of stack usage when function call does not occur
From: jakub at gcc dot gnu.org @ 2020-12-15 17:24 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98289

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Priority|P3                          |P2
   Target Milestone|---                         |8.5
             Status|UNCONFIRMED                 |NEW
     Ever confirmed|0                           |1
   Last reconfirmed|                            |2020-12-15
            Summary|[x86] Suboptimal            |[8/9/10/11 Regression]
                   |optimization of stack usage |[x86] Suboptimal
                   |when function call does not |optimization of stack usage
                   |occur                       |when function call does not
                   |                            |occur
                 CC|                            |hubicka at gcc dot gnu.org,
                   |                            |jakub at gcc dot gnu.org

--- Comment #1 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Started with r8-1272-g227b76c3b4ef63b1226f4e584dbdf42c9e56ff9f


* [Bug target/98289] [8/9/10/11 Regression] [x86] Suboptimal optimization of stack usage when function call does not occur
From: jakub at gcc dot gnu.org @ 2020-12-15 18:03 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98289

--- Comment #2 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
That particular change is of course correct; what it means is that
shrink-wrapping, at least on this testcase, doesn't work with
-freorder-blocks-and-partition, which is on by default.
Compiling it with -O2 -fno-reorder-blocks-and-partition (or -O3 plus that
option) makes the function shrink-wrapped again.
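
A minimal sketch for comparing the two configurations by hand. The function
names (report_fatal, checked_increment) and the output file names are
hypothetical and not from the PR; only the two option sets come from the
comment above. Whether this variant shows the same difference as the original
testcase depends on the compiler version, so treat it as a starting point
rather than a confirmed reproducer.

```cpp
// Commands to compare the generated assembly (file names are placeholders):
//   g++ -O2 -S repro.cpp -o partitioned.s
//   g++ -O2 -fno-reorder-blocks-and-partition -S repro.cpp -o unpartitioned.s
#include <cstdlib>

// Hypothetical stand-in for any cold callee that never returns.
__attribute__((cold, noinline, noreturn)) void report_fatal()
{
    std::abort();
}

// Same shape as the testcase: the call is needed only when cond is true,
// so a shrink-wrapped version would adjust the stack only on that path.
// noinline keeps the function standalone so its prologue can be inspected.
__attribute__((noinline)) int checked_increment(int value, bool cond)
{
    if (cond)
        report_fatal();
    return value + 1;
}
```

In the assembly, the thing to look for is whether the stack adjustment sits
before the conditional jump (not shrink-wrapped) or only on the path leading
to the call (shrink-wrapped).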


* [Bug target/98289] [8/9/10/11 Regression] [x86] Suboptimal optimization of stack usage when function call does not occur
From: jakub at gcc dot gnu.org @ 2020-12-15 18:23 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98289

--- Comment #3 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
--- gcc/shrink-wrap.c.jj        2020-07-28 15:39:09.983756571 +0200
+++ gcc/shrink-wrap.c   2020-12-15 19:15:00.213861334 +0100
@@ -494,7 +494,7 @@ can_get_prologue (basic_block pro, HARD_
   edge e;
   edge_iterator ei;
   FOR_EACH_EDGE (e, ei, pro->preds)
-    if (e->flags & (EDGE_COMPLEX | EDGE_CROSSING)
+    if (e->flags & EDGE_COMPLEX
        && !dominated_by_p (CDI_DOMINATORS, e->src, pro))
       return false;

fixes it for me.  Not sure I understand why EDGE_CROSSING has been listed
there: if pro is in the cold partition and has an EDGE_CROSSING edge leading
to it, then why can't the prologue be added to the cold partition and the
jump redirected to it?
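
The effect of the one-line change can be sketched with a toy model. This is
not GCC's code: ToyEdge, the flag values, old_check and new_check are all
hypothetical, and only the edge loop visible in the diff is modeled (the
dominance query is reduced to a precomputed boolean per edge).

```cpp
#include <vector>

// Toy stand-ins for GCC's edge flags; the bit values are illustrative only.
enum EdgeFlag : unsigned {
    kEdgeComplex  = 1u << 0,
    kEdgeCrossing = 1u << 1,
};

struct ToyEdge {
    unsigned flags;             // combination of EdgeFlag bits
    bool src_dominated_by_pro;  // models dominated_by_p (CDI_DOMINATORS, e->src, pro)
};

// Models the loop over pro->preds: the candidate block is rejected when some
// incoming edge carries a blocking flag and its source is not dominated by
// the candidate block.
bool incoming_edges_allow_prologue(const std::vector<ToyEdge>& preds,
                                   unsigned blocking_flags)
{
    for (const ToyEdge& e : preds)
        if ((e.flags & blocking_flags) && !e.src_dominated_by_pro)
            return false;
    return true;
}

// Before the patch: both complex and crossing edges block the candidate.
bool old_check(const std::vector<ToyEdge>& preds)
{
    return incoming_edges_allow_prologue(preds, kEdgeComplex | kEdgeCrossing);
}

// After the patch: only complex edges block the candidate.
bool new_check(const std::vector<ToyEdge>& preds)
{
    return incoming_edges_allow_prologue(preds, kEdgeComplex);
}
```

In this model, a cold block reached only through a crossing edge from the hot
partition is rejected by old_check and accepted by new_check, which matches
the situation in the testcase where the block calling abort sits in the cold
partition.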


* [Bug rtl-optimization/98289] [8/9/10/11 Regression] [x86] Suboptimal optimization of stack usage when function call does not occur
From: cvs-commit at gcc dot gnu.org @ 2020-12-17 12:30 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98289

--- Comment #4 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Jakub Jelinek <jakub@gcc.gnu.org>:

https://gcc.gnu.org/g:62cb9680e592057a49de66eac34da679338932f9

commit r11-6222-g62cb9680e592057a49de66eac34da679338932f9
Author: Jakub Jelinek <jakub@redhat.com>
Date:   Thu Dec 17 13:28:48 2020 +0100

    shrink-wrap: Don't punt on incoming EDGE_CROSSING [PR98289]

    As mentioned in the PR, shrink-wrapping disqualifies for prologue
    placement basic blocks that have EDGE_CROSSING incoming edge.
    I don't see why that is necessary, those edges seem to be redirected
    just fine, both on x86_64 and powerpc64.  In the former case, they
    are usually conditional jumps that patch_jump_insn can handle just fine,
    after all, they were previously crossing and will be crossing after
    the redirection too, just to a different label.  And in the powerpc64
    case, it is a simple_jump instead that again seems to be handled by
    patch_jump_insn just fine.
    Sure, redirecting an edge that was previously not crossing to be crossing
    or vice versa can fail, but that is not what shrink-wrapping needs.
    Also tested in GCC 8 with this patch and don't see ICEs there either
    (though, of course, I'm not suggesting we should backport this to release
    branches).
    The old ICEs could have been fixed by PR87475 fix or some other one
    years ago.

    2020-12-17  Jakub Jelinek  <jakub@redhat.com>

            PR rtl-optimization/98289
            * shrink-wrap.c (can_get_prologue): Don't punt on EDGE_CROSSING
            incoming edges.

            * gcc.target/i386/pr98289.c: New test.
            * gcc.dg/torture/pr98289.c: New test.
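
The argument in the commit message, that an edge which was crossing stays
crossing when it is redirected to a prologue placed in the same partition, can
be illustrated with a toy model. Everything here (Partition, PartBlock,
PartEdge and the helper functions) is hypothetical. In particular, the model
derives "crossing" from the partitions of the endpoints, which is a
simplifying assumption and not how GCC stores the EDGE_CROSSING flag.

```cpp
// Toy model of hot/cold partitioning; not GCC's data structures.
enum class Partition { Hot, Cold };

struct PartBlock {
    Partition partition;
};

struct PartEdge {
    const PartBlock* src;
    const PartBlock* dest;
};

// In this model an edge is crossing exactly when its endpoints lie in
// different partitions.
bool is_crossing(const PartEdge& e)
{
    return e.src->partition != e.dest->partition;
}

// Redirects the edge to new_dest and reports whether its crossing status
// is the same as before the redirection.
bool redirect_preserves_crossing(PartEdge& e, const PartBlock* new_dest)
{
    const bool before = is_crossing(e);
    e.dest = new_dest;
    return is_crossing(e) == before;
}
```

Redirecting a hot-to-cold edge to a new block that is also in the cold
partition keeps it crossing, which is the only kind of redirection the commit
message says shrink-wrapping needs. Redirecting it to a hot block would change
its status, which is the case the message describes as one that can fail.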


* [Bug rtl-optimization/98289] [8/9/10 Regression] [x86] Suboptimal optimization of stack usage when function call does not occur
From: jakub at gcc dot gnu.org @ 2020-12-29 13:35 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98289

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
            Summary|[8/9/10/11 Regression]      |[8/9/10 Regression] [x86]
                   |[x86] Suboptimal            |Suboptimal optimization of
                   |optimization of stack usage |stack usage when function
                   |when function call does not |call does not occur
                   |occur                       |
   Target Milestone|8.5                         |11.0
         Resolution|---                         |FIXED

--- Comment #5 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Fixed on the trunk, no plans to backport.
