public inbox for gcc-bugs@sourceware.org
* [Bug tree-optimization/108352] New: Dead Code Elimination Regression at -O2 (trunk vs. 12.2.0)
@ 2023-01-10 12:21 yann at ywg dot ch
  2023-01-10 14:42 ` [Bug tree-optimization/108352] Dead Code Elimination Regression at -O2 since r13-1960-gd86d81a449c036 marxin at gcc dot gnu.org
                   ` (4 more replies)
  0 siblings, 5 replies; 6+ messages in thread
From: yann at ywg dot ch @ 2023-01-10 12:21 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108352

            Bug ID: 108352
           Summary: Dead Code Elimination Regression at -O2 (trunk vs.
                    12.2.0)
           Product: gcc
           Version: 13.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: yann at ywg dot ch
  Target Milestone: ---

Created attachment 54227
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=54227&action=edit
Code as file

cat case.c #30
long a;
int b;
void bar64_(void);
void foo();
int main() {
  char c = 0;
  unsigned d = 10;
  int e = 2;
  for (; d; d--) {
    bar64_();
    b = d;
    e && (c = (e = 0) != 4) > 1;
  }
  if (c < 1)
    foo();
  a = b;
}

`gcc-cb93c5f8008b95743b741d6f1842f9be50c6985c (trunk) -O2` cannot eliminate
the call to `foo`, but `gcc-releases/gcc-12.2.0 -O2` can.
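
To see why the call to `foo` is dead, the loop can be replayed by hand. The sketch below is only an illustration: `struct loop_result` and `replay_case_loop` are hypothetical names that are not part of the report, and the external calls are left out. Since `c`, `d` and `e` are locals whose address never escapes, `bar64_` cannot change them.

```c
#include <stdbool.h>

/* Result of replaying the loop from case.c (hypothetical helper type).  */
struct loop_result {
  int final_b;     /* value of b after the loop; main() copies it into a */
  bool calls_foo;  /* whether the `c < 1` guard in front of foo() holds */
};

/* Replays the loop of case.c without the calls to bar64_().  */
struct loop_result replay_case_loop(void)
{
  char c = 0;
  unsigned d = 10;
  int e = 2;
  int b = 0;

  for (; d; d--) {
    b = (int) d;
    /* Same effect as `e && (c = (e = 0) != 4) > 1;`: only the first
       iteration sees e != 0, and it stores (0 != 4) == 1 into c.  */
    if (e)
      c = (e = 0) != 4;
  }

  struct loop_result r = { b, c < 1 };
  return r;
}
```

The replay ends with `c == 1`, so the `c < 1` guard is false, and with `b == 1`, which matches the `movq $1, a(%rip)` in the 12.2.0 output below.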

`gcc-cb93c5f8008b95743b741d6f1842f9be50c6985c (trunk) -O2 -S -o /dev/stdout
case.c`
--------- OUTPUT ---------
main:
.LFB0:
        .cfi_startproc
        pushq   %r13
        .cfi_def_cfa_offset 16
        .cfi_offset 13, -16
        movl    $1, %r13d
        pushq   %r12
        .cfi_def_cfa_offset 24
        .cfi_offset 12, -24
        movl    $2, %r12d
        pushq   %rbp
        .cfi_def_cfa_offset 32
        .cfi_offset 6, -32
        xorl    %ebp, %ebp
        pushq   %rbx
        .cfi_def_cfa_offset 40
        .cfi_offset 3, -40
        movl    $10, %ebx
        subq    $8, %rsp
        .cfi_def_cfa_offset 48
        .p2align 4,,10
        .p2align 3
.L3:
        call    bar64_
        testl   %r12d, %r12d
        movl    %ebx, b(%rip)
        cmovne  %r13d, %ebp
        xorl    %r12d, %r12d
        subl    $1, %ebx
        jne     .L3
        movl    $1, %eax
        testb   %bpl, %bpl
        je      .L10
.L4:
        movq    %rax, a(%rip)
        addq    $8, %rsp
        .cfi_remember_state
        .cfi_def_cfa_offset 40
        xorl    %eax, %eax
        popq    %rbx
        .cfi_def_cfa_offset 32
        popq    %rbp
        .cfi_def_cfa_offset 24
        popq    %r12
        .cfi_def_cfa_offset 16
        popq    %r13
        .cfi_def_cfa_offset 8
        ret
.L10:
        .cfi_restore_state
        xorl    %eax, %eax
        call    foo
        movslq  b(%rip), %rax
        jmp     .L4
---------- END OUTPUT ---------


`gcc-releases/gcc-12.2.0 -O2 -S -o /dev/stdout case.c`
--------- OUTPUT ---------
main:
.LFB0:
        .cfi_startproc
        pushq   %rbx
        .cfi_def_cfa_offset 16
        .cfi_offset 3, -16
        movl    $10, %ebx
        call    bar64_
        movl    $10, %eax
        jmp     .L3
        .p2align 4,,10
        .p2align 3
.L6:
        call    bar64_
        movl    %ebx, %eax
.L3:
        movl    %eax, b(%rip)
        subl    $1, %ebx
        jne     .L6
        movq    $1, a(%rip)
        xorl    %eax, %eax
        popq    %rbx
        .cfi_def_cfa_offset 8
        ret
---------- END OUTPUT ---------


Bisects to:
https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=d86d81a449c03641e079f23a2b3e1b2279a162fe


* [Bug tree-optimization/108352] Dead Code Elimination Regression at -O2 since r13-1960-gd86d81a449c036
From: marxin at gcc dot gnu.org @ 2023-01-10 14:42 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108352

Martin Liška <marxin at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
                 CC|                            |marxin at gcc dot gnu.org,
                   |                            |rguenth at gcc dot gnu.org
   Last reconfirmed|                            |2023-01-10
            Summary|Dead Code Elimination       |Dead Code Elimination
                   |Regression at -O2 (trunk    |Regression at -O2 since
                   |vs. 12.2.0)                 |r13-1960-gd86d81a449c036
     Ever confirmed|0                           |1
   Target Milestone|---                         |13.0


* [Bug tree-optimization/108352] [13 Regression] Dead Code Elimination Regression at -O2 since r13-1960-gd86d81a449c036
From: rguenth at gcc dot gnu.org @ 2023-01-10 16:26 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108352

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Assignee|unassigned at gcc dot gnu.org      |rguenth at gcc dot gnu.org
             Status|NEW                         |ASSIGNED

--- Comment #1 from Richard Biener <rguenth at gcc dot gnu.org> ---
I will have a look.


* [Bug tree-optimization/108352] [13 Regression] Dead Code Elimination Regression at -O2 since r13-1960-gd86d81a449c036
From: rguenth at gcc dot gnu.org @ 2023-01-11 11:14 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108352

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |law at gcc dot gnu.org

--- Comment #2 from Richard Biener <rguenth at gcc dot gnu.org> ---
Checking profitability of path (backwards):  bb:3 (6 insns) bb:9 (0 insns)
(latch) bb:5
  Control statement insns: 2
  Overall: 4 insns
  [4] Registering jump thread: (5, 9) incoming edge;  (9, 3) normal (back) (3,
4) nocopy;
path: 5->9->3->4 SUCCESS

but

Checking profitability of path (backwards):  bb:3 (6 insns) bb:9 (latch)
  Control statement insns: 2
  Overall: 4 insns
  FAIL: Would create irreducible loop without threading multiway branch.
path: 9->3->xx REJECTED

After the patch we no longer consider the first path, which just adds an
unrelated jump to the path.  The rejection comes from the

  /* We avoid creating irreducible inner loops unless we thread through
     a multiway branch, in which case we have deemed it worth losing
     other loop optimizations later.

     We also consider it worth creating an irreducible inner loop if
     the number of copied statement is low relative to the length of
     the path -- in that case there's little the traditional loop
     optimizer would have done anyway, so an irreducible loop is not
     so bad.  */
  if (!threaded_multiway_branch
      && creates_irreducible_loop
      && *creates_irreducible_loop
      && (n_insns * (unsigned) param_fsm_scale_path_stmts
          > (m_path.length () *
             (unsigned) param_fsm_scale_path_blocks)))

    {
      if (dump_file && (dump_flags & TDF_DETAILS))
        fprintf (dump_file,
                 "  FAIL: Would create irreducible loop without threading "
                 "multiway branch.\n");
      return false;

heuristic, which for 9 -> 3 evaluates to 4 * 2 > 2 * 3 (true, so the path is
rejected), while for 5 -> 9 -> 3 we get 4 * 2 > 3 * 3 (false, so the path is
accepted).

It's also worth noting that neither of the two threads creates an irreducible
loop in the end for this particular case, since e is also constant on entry
and thus the jump is resolved and the extra loop entry is removed (but
that's out of scope of the threader's analysis here).

It IMHO still makes no sense to reject the shorter path over the longer one
so the above "heuristic" makes absolutely no sense to me.  Raising
--param fsm-scale-path-blocks to 4 "fixes" the testcase on trunk.
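
The arithmetic above, and the effect of raising the parameter, can be checked with a small stand-alone sketch. `old_heuristic_rejects` is a hypothetical helper, not a GCC function; it only mirrors the size comparison of the quoted condition, and the scale values 2 and 3 are the ones used in the numbers above.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the size comparison of the old heuristic.
   Returns true when a thread that would create an irreducible loop
   (and does not go through a multiway branch) would be rejected.  */
bool old_heuristic_rejects(unsigned n_insns, unsigned path_blocks,
                           unsigned scale_stmts, unsigned scale_blocks)
{
  return n_insns * scale_stmts > path_blocks * scale_blocks;
}
```

With 4 insns, `old_heuristic_rejects(4, 2, 2, 3)` is true for the two-block path 9 -> 3, while `old_heuristic_rejects(4, 3, 2, 3)` is false for the three-block path 5 -> 9 -> 3. Raising the block scale to 4 gives `old_heuristic_rejects(4, 2, 2, 4)`, which is false as well because 8 > 8 does not hold.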

The heuristic was added in r6-6600-g2b572b3c213b51 by Jeff in an attempt
to address a coremark regression (PR68398).  I guess Jeff remembers nothing
about this.

Note this is not about adding inner irreducible loops but about making the
loop itself irreducible.  The length of the path itself also says nothing
about the length of a path through the irreducible loop ...

Reverting the heuristic will reject all non-multi-way branch irreducible
loop creation.  We have another heuristic that rejects threading through
the latch early:

  /* Threading through an empty latch would cause code to be added to
     the latch.  This could alter the loop form sufficiently to cause
     loop optimizations to fail.  Disable these threads until after
     loop optimizations have run.  */
  if ((threaded_through_latch
       || (taken_edge && taken_edge->dest == loop->latch))
      && !(cfun->curr_properties & PROP_loop_opts_done)
      && empty_block_p (loop->latch))

so we could reject irreducible loops before loop opts (w/o just covering
the empty latch case) and otherwise generally allow it even for
non-multi-way branches.

That said, I fear I'm going to replace one bogus heuristic with another ;)

I'm still going to test replacing the heuristic with the following
(which allows removing the fsm-scale-path-blocks param).

  /* We avoid creating irreducible inner loops unless we thread through
     a multiway branch, in which case we have deemed it worth losing
     other loop optimizations later.

     We also consider it worth creating an irreducible inner loop after
     loop optimizations if the number of copied statement is low.  */
  if (!m_threaded_multiway_branch
      && *creates_irreducible_loop
      && (!(cfun->curr_properties & PROP_loop_opts_done)
          || (m_n_insns * param_fsm_scale_path_stmts
              >= param_max_jump_thread_duplication_stmts)))
    {
      if (dump_file && (dump_flags & TDF_DETAILS))
        fprintf (dump_file,
                 "  FAIL: Would create irreducible loop early without "
                 "threading multiway branch.\n");
      /* We compute creates_irreducible_loop only late.  */
      return false; 
    }
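
As a rough sketch of how the proposed condition behaves, the comparison can be isolated into a plain predicate. `new_heuristic_rejects` is a hypothetical helper, not GCC code; it assumes the thread would create an irreducible loop and does not go through a multiway branch, and the limit passed as `max_dup_stmts` stands in for `param_max_jump_thread_duplication_stmts`.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the proposed check.  Returns true when the
   thread would be rejected: always before loop optimizations are done,
   and afterwards only if the scaled statement count reaches the limit.  */
bool new_heuristic_rejects(bool loop_opts_done, unsigned n_insns,
                           unsigned scale_stmts, unsigned max_dup_stmts)
{
  return !loop_opts_done || n_insns * scale_stmts >= max_dup_stmts;
}
```

Taking the 4 insns and the scale of 2 from the dump above, and assuming a limit of 15 purely for illustration, `new_heuristic_rejects(false, 4, 2, 15)` is true and `new_heuristic_rejects(true, 4, 2, 15)` is false. The path length no longer enters the decision.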


* [Bug tree-optimization/108352] [13 Regression] Dead Code Elimination Regression at -O2 since r13-1960-gd86d81a449c036
From: cvs-commit at gcc dot gnu.org @ 2023-01-11 11:59 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108352

--- Comment #3 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Richard Biener <rguenth@gcc.gnu.org>:

https://gcc.gnu.org/g:7c9f20fcfdc2d8453df88ceb7e693debfcd678c0

commit r13-5103-g7c9f20fcfdc2d8453df88ceb7e693debfcd678c0
Author: Richard Biener <rguenther@suse.de>
Date:   Wed Jan 11 12:07:16 2023 +0100

    tree-optimization/108352 - FSM threads creating irreducible loops

    The following relaxes a heuristic that prevents creating irreducible
    loops from FSM threads not covering multi-way branches.  Instead of
    allowing threads that adhere to

          && (n_insns * (unsigned) param_fsm_scale_path_stmts
              > (m_path.length () *
                 (unsigned) param_fsm_scale_path_blocks))

    with reasoning "We also consider it worth creating an irreducible inner
    loop if the number of copied statement is low relative to the length of
    the path -- in that case there's little the traditional loop optimizer
    would have done anyway, so an irreducible loop is not so bad." that I
    cannot make much sense of, the following patch changes that to only
    allow those after loop optimization and when they are (scaled) short:

          && (!(cfun->curr_properties & PROP_loop_opts_done)
              || (m_n_insns * param_fsm_scale_path_stmts
                  >= param_max_jump_thread_duplication_stmts)))

    This allows us to get rid of --param fsm-scale-path-blocks which
    previous to the bisected revision allowed an enlarged path covering
    the original allowance (but we do not consider that enlarged path
    now because enlarging it doesn't add any information).

            PR tree-optimization/108352
            * tree-ssa-threadbackward.cc
            (back_threader_profitability::profitable_path_p): Adjust
            heuristic that allows non-multi-way branch threads creating
            irreducible loops.
            * doc/invoke.texi (--param fsm-scale-path-blocks): Remove.
            (--param fsm-scale-path-stmts): Adjust.
            * params.opt (--param=fsm-scale-path-blocks=): Remove.
            (-param=fsm-scale-path-stmts=): Adjust description.

            * gcc.dg/tree-ssa/ssa-thread-21.c: New testcase.
            * gcc.dg/tree-ssa/vrp46.c: Remove --param fsm-scale-path-blocks=1.


* [Bug tree-optimization/108352] [13 Regression] Dead Code Elimination Regression at -O2 since r13-1960-gd86d81a449c036
From: rguenth at gcc dot gnu.org @ 2023-01-11 12:07 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108352

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |FIXED
             Status|ASSIGNED                    |RESOLVED

--- Comment #4 from Richard Biener <rguenth at gcc dot gnu.org> ---
Fixed.
