public inbox for gcc-bugs@sourceware.org
* [Bug tree-optimization/108352] New: Dead Code Elimination Regression at -O2 (trunk vs. 12.2.0)
From: yann at ywg dot ch @ 2023-01-10 12:21 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108352
Bug ID: 108352
Summary: Dead Code Elimination Regression at -O2 (trunk vs. 12.2.0)
Product: gcc
Version: 13.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: tree-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: yann at ywg dot ch
Target Milestone: ---
Created attachment 54227
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=54227&action=edit
Code as file
cat case.c #30
long a;
int b;
void bar64_(void);
void foo();
int main() {
char c = 0;
unsigned d = 10;
int e = 2;
for (; d; d--) {
bar64_();
b = d;
e && (c = (e = 0) != 4) > 1;
}
if (c < 1)
foo();
a = b;
}
`gcc-cb93c5f8008b95743b741d6f1842f9be50c6985c (trunk) -O2` cannot eliminate
the call to `foo`, but `gcc-releases/gcc-12.2.0 -O2` can.
`gcc-cb93c5f8008b95743b741d6f1842f9be50c6985c (trunk) -O2 -S -o /dev/stdout case.c`
--------- OUTPUT ---------
main:
.LFB0:
.cfi_startproc
pushq %r13
.cfi_def_cfa_offset 16
.cfi_offset 13, -16
movl $1, %r13d
pushq %r12
.cfi_def_cfa_offset 24
.cfi_offset 12, -24
movl $2, %r12d
pushq %rbp
.cfi_def_cfa_offset 32
.cfi_offset 6, -32
xorl %ebp, %ebp
pushq %rbx
.cfi_def_cfa_offset 40
.cfi_offset 3, -40
movl $10, %ebx
subq $8, %rsp
.cfi_def_cfa_offset 48
.p2align 4,,10
.p2align 3
.L3:
call bar64_
testl %r12d, %r12d
movl %ebx, b(%rip)
cmovne %r13d, %ebp
xorl %r12d, %r12d
subl $1, %ebx
jne .L3
movl $1, %eax
testb %bpl, %bpl
je .L10
.L4:
movq %rax, a(%rip)
addq $8, %rsp
.cfi_remember_state
.cfi_def_cfa_offset 40
xorl %eax, %eax
popq %rbx
.cfi_def_cfa_offset 32
popq %rbp
.cfi_def_cfa_offset 24
popq %r12
.cfi_def_cfa_offset 16
popq %r13
.cfi_def_cfa_offset 8
ret
.L10:
.cfi_restore_state
xorl %eax, %eax
call foo
movslq b(%rip), %rax
jmp .L4
---------- END OUTPUT ---------
`gcc-releases/gcc-12.2.0 -O2 -S -o /dev/stdout case.c`
--------- OUTPUT ---------
main:
.LFB0:
.cfi_startproc
pushq %rbx
.cfi_def_cfa_offset 16
.cfi_offset 3, -16
movl $10, %ebx
call bar64_
movl $10, %eax
jmp .L3
.p2align 4,,10
.p2align 3
.L6:
call bar64_
movl %ebx, %eax
.L3:
movl %eax, b(%rip)
subl $1, %ebx
jne .L6
movq $1, a(%rip)
xorl %eax, %eax
popq %rbx
.cfi_def_cfa_offset 8
ret
---------- END OUTPUT ---------
Bisects to:
https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=d86d81a449c03641e079f23a2b3e1b2279a162fe
* [Bug tree-optimization/108352] Dead Code Elimination Regression at -O2 since r13-1960-gd86d81a449c036
From: marxin at gcc dot gnu.org @ 2023-01-10 14:42 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108352
Martin Liška <marxin at gcc dot gnu.org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
                 CC|                            |marxin at gcc dot gnu.org,
                   |                            |rguenth at gcc dot gnu.org
   Last reconfirmed|                            |2023-01-10
            Summary|Dead Code Elimination       |Dead Code Elimination
                   |Regression at -O2 (trunk    |Regression at -O2 since
                   |vs. 12.2.0)                 |r13-1960-gd86d81a449c036
     Ever confirmed|0                           |1
   Target Milestone|---                         |13.0
* [Bug tree-optimization/108352] [13 Regression] Dead Code Elimination Regression at -O2 since r13-1960-gd86d81a449c036
From: rguenth at gcc dot gnu.org @ 2023-01-10 16:26 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108352
Richard Biener <rguenth at gcc dot gnu.org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
           Assignee|unassigned at gcc dot gnu.org |rguenth at gcc dot gnu.org
             Status|NEW                         |ASSIGNED
--- Comment #1 from Richard Biener <rguenth at gcc dot gnu.org> ---
I will have a look.
* [Bug tree-optimization/108352] [13 Regression] Dead Code Elimination Regression at -O2 since r13-1960-gd86d81a449c036
From: rguenth at gcc dot gnu.org @ 2023-01-11 11:14 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108352
Richard Biener <rguenth at gcc dot gnu.org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |law at gcc dot gnu.org
--- Comment #2 from Richard Biener <rguenth at gcc dot gnu.org> ---
Checking profitability of path (backwards): bb:3 (6 insns) bb:9 (0 insns) (latch) bb:5
Control statement insns: 2
Overall: 4 insns
[4] Registering jump thread: (5, 9) incoming edge; (9, 3) normal (back) (3, 4) nocopy;
path: 5->9->3->4 SUCCESS
but
Checking profitability of path (backwards): bb:3 (6 insns) bb:9 (latch)
Control statement insns: 2
Overall: 4 insns
FAIL: Would create irreducible loop without threading multiway branch.
path: 9->3->xx REJECTED
After the patch we no longer consider the first path, which just adds an
unrelated jump to the path. That's the
/* We avoid creating irreducible inner loops unless we thread through
a multiway branch, in which case we have deemed it worth losing
other loop optimizations later.
We also consider it worth creating an irreducible inner loop if
the number of copied statement is low relative to the length of
the path -- in that case there's little the traditional loop
optimizer would have done anyway, so an irreducible loop is not
so bad. */
if (!threaded_multiway_branch
&& creates_irreducible_loop
&& *creates_irreducible_loop
&& (n_insns * (unsigned) param_fsm_scale_path_stmts
> (m_path.length () *
(unsigned) param_fsm_scale_path_blocks)))
{
if (dump_file && (dump_flags & TDF_DETAILS))
fprintf (dump_file,
" FAIL: Would create irreducible loop without threading "
"multiway branch.\n");
return false;
}
heuristic, which for 9 -> 3 evaluates to 4 * 2 > 2 * 3 (true, so the thread
is rejected) but for 5 -> 9 -> 3 evaluates to 4 * 2 > 3 * 3 (false, so the
thread is allowed).
It's also worth noting that neither of the two threads creates an irreducible
loop in the end for this particular case, since e is also constant on entry
and thus the jump is resolved and the extra loop entry is removed (but
that's out of scope of the threader's analysis here).
IMHO it still makes no sense to reject the shorter path while accepting the
longer one, so I cannot make sense of the above "heuristic". Raising
--param fsm-scale-path-blocks to 4 "fixes" the testcase on trunk.
The heuristic was added in r6-6600-g2b572b3c213b51 by Jeff in an attempt
to address a coremark regression (PR68398). I guess Jeff remembers nothing
about this.
Note this is not about adding inner irreducible loops but about making the
loop itself irreducible. The length of the path itself also says nothing
about the length of a path through the irreducible loop ...
Reverting the heuristic will reject all non-multi-way branch irreducible
loop creation. We have another heuristic that rejects threading through
the latch early:
/* Threading through an empty latch would cause code to be added to
the latch. This could alter the loop form sufficiently to cause
loop optimizations to fail. Disable these threads until after
loop optimizations have run. */
if ((threaded_through_latch
|| (taken_edge && taken_edge->dest == loop->latch))
&& !(cfun->curr_properties & PROP_loop_opts_done)
&& empty_block_p (loop->latch))
so we could reject irreducible loops before loop opts (w/o just covering
the empty latch case) and otherwise generally allow it even for
non-multi-way branches.
That said, I fear I'm going to replace one bogus heuristic with another ;)
I'm still going to test replacing the heuristic with the following
(which allows removing the fsm-scale-path-blocks param).
/* We avoid creating irreducible inner loops unless we thread through
a multiway branch, in which case we have deemed it worth losing
other loop optimizations later.
We also consider it worth creating an irreducible inner loop after
loop optimizations if the number of copied statement is low. */
if (!m_threaded_multiway_branch
&& *creates_irreducible_loop
&& (!(cfun->curr_properties & PROP_loop_opts_done)
|| (m_n_insns * param_fsm_scale_path_stmts
>= param_max_jump_thread_duplication_stmts)))
{
if (dump_file && (dump_flags & TDF_DETAILS))
fprintf (dump_file,
" FAIL: Would create irreducible loop early without "
"threading multiway branch.\n");
/* We compute creates_irreducible_loop only late. */
return false;
}
* [Bug tree-optimization/108352] [13 Regression] Dead Code Elimination Regression at -O2 since r13-1960-gd86d81a449c036
From: cvs-commit at gcc dot gnu.org @ 2023-01-11 11:59 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108352
--- Comment #3 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Richard Biener <rguenth@gcc.gnu.org>:
https://gcc.gnu.org/g:7c9f20fcfdc2d8453df88ceb7e693debfcd678c0
commit r13-5103-g7c9f20fcfdc2d8453df88ceb7e693debfcd678c0
Author: Richard Biener <rguenther@suse.de>
Date: Wed Jan 11 12:07:16 2023 +0100
tree-optimization/108352 - FSM threads creating irreducible loops
The following relaxes a heuristic that prevents creating irreducible
loops from FSM threads not covering multi-way branches. Instead of
allowing threads that adhere to
&& (n_insns * (unsigned) param_fsm_scale_path_stmts
> (m_path.length () *
(unsigned) param_fsm_scale_path_blocks))
with reasoning "We also consider it worth creating an irreducible inner
loop if the number of copied statement is low relative to the length of
the path -- in that case there's little the traditional loop optimizer
would have done anyway, so an irreducible loop is not so bad." that I
cannot make much sense of, the following patch changes that to only
allow those after loop optimization and when they are (scaled) short:
&& (!(cfun->curr_properties & PROP_loop_opts_done)
|| (m_n_insns * param_fsm_scale_path_stmts
>= param_max_jump_thread_duplication_stmts)))
This allows us to get rid of --param fsm-scale-path-blocks which
previous to the bisected revision allowed an enlarged path covering
the original allowance (but we do not consider that enlarged path
now because enlarging it doesn't add any information).
PR tree-optimization/108352
* tree-ssa-threadbackward.cc
(back_threader_profitability::profitable_path_p): Adjust
heuristic that allows non-multi-way branch threads creating
irreducible loops.
* doc/invoke.texi (--param fsm-scale-path-blocks): Remove.
(--param fsm-scale-path-stmts): Adjust.
* params.opt (--param=fsm-scale-path-blocks=): Remove.
(-param=fsm-scale-path-stmts=): Adjust description.
* gcc.dg/tree-ssa/ssa-thread-21.c: New testcase.
* gcc.dg/tree-ssa/vrp46.c: Remove --param fsm-scale-path-blocks=1.
* [Bug tree-optimization/108352] [13 Regression] Dead Code Elimination Regression at -O2 since r13-1960-gd86d81a449c036
From: rguenth at gcc dot gnu.org @ 2023-01-11 12:07 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108352
Richard Biener <rguenth at gcc dot gnu.org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |FIXED
             Status|ASSIGNED                    |RESOLVED
--- Comment #4 from Richard Biener <rguenth at gcc dot gnu.org> ---
Fixed.