public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed
* [Bug testsuite/103270] New: [12 regression] gcc.dg/vect/pr96698.c inner loop turned from hot to cold after r12-4526
@ 2021-11-16 7:41 luoxhu at gcc dot gnu.org
2021-11-16 8:19 ` [Bug testsuite/103270] " rguenth at gcc dot gnu.org
` (6 more replies)
0 siblings, 7 replies; 8+ messages in thread
From: luoxhu at gcc dot gnu.org @ 2021-11-16 7:41 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103270
Bug ID: 103270
Summary: [12 regression] gcc.dg/vect/pr96698.c inner loop
turned from hot to cold after r12-4526
Product: gcc
Version: 12.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: testsuite
Assignee: unassigned at gcc dot gnu.org
Reporter: luoxhu at gcc dot gnu.org
Target Milestone: ---
For the testcase gcc.dg/vect/pr96698.c, the inner loop was hot (preheader count
< loop count), but it is NOT now after r12-4526, bb 3's profile count 354334801
is only 1/2 of the preheader bb 5's profile count 719407024.
But I guess it should be fixed in tree-ssa-loop-ch.c when copy_headers, there
are profile count update there, this case should be handled specially when the
single exit loop only has two bbs and the old header is new exit->src, no need
to scale down the old header profile count to preserve the hotness of the loop.
pr96698.c.138t.lim2:
void test (int a, int * i)
{
int i__lsm.5;
int c;
int b;
int _22;
int _23;
int _24; <bb 2> [local count: 88915474]:
if (a_12(D) <= 4)
goto <bb 7>; [89.00%]
else
goto <bb 6>; [11.00%]
<bb 7> [local count: 79134772]:
i__lsm.5_11 = *i_16(D);
goto <bb 5>; [100.00%]
<bb 9> [local count: 116930484]:
<bb 3> [local count: 354334801]:
# b_3 = PHI <b_15(9), 0(5)>
# c_17 = PHI <b_3(9), 0(5)>
# i__lsm.5_20 = PHI <i__lsm.5_4(9), i__lsm.5_1(5)>
_22 = i__lsm.5_20;
_23 = a_2 & c_17;
_24 = _22 + _23;
i__lsm.5_4 = _24;
b_15 = b_3 + -1;
if (b_15 != -11)
goto <bb 9>; [33.00%]
else
goto <bb 4>; [67.00%]
<bb 4> [local count: 719407024]:
# i__lsm.5_7 = PHI <i__lsm.5_4(3)>
a_14 = a_2 + 1;
if (a_14 <= 4)
goto <bb 8>; [89.00%]
else
goto <bb 10>; [11.00%]
<bb 8> [local count: 640272252]:
<bb 5> [local count: 719407024]:
# a_2 = PHI <a_14(8), a_12(D)(7)>
# i__lsm.5_1 = PHI <i__lsm.5_7(8), i__lsm.5_11(7)>
goto <bb 3>; [100.00%]
<bb 10> [local count: 79134772]:
# i__lsm.5_5 = PHI <i__lsm.5_7(4)>
*i_16(D) = i__lsm.5_5;
<bb 6> [local count: 88915474]:
return;
}
^ permalink raw reply [flat|nested] 8+ messages in thread
* [Bug testsuite/103270] [12 regression] gcc.dg/vect/pr96698.c inner loop turned from hot to cold after r12-4526
2021-11-16 7:41 [Bug testsuite/103270] New: [12 regression] gcc.dg/vect/pr96698.c inner loop turned from hot to cold after r12-4526 luoxhu at gcc dot gnu.org
@ 2021-11-16 8:19 ` rguenth at gcc dot gnu.org
2021-11-17 1:54 ` luoxhu at gcc dot gnu.org
` (5 subsequent siblings)
6 siblings, 0 replies; 8+ messages in thread
From: rguenth at gcc dot gnu.org @ 2021-11-16 8:19 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103270
Richard Biener <rguenth at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |hubicka at gcc dot gnu.org
Last reconfirmed| |2021-11-16
Target Milestone|--- |12.0
Keywords| |missed-optimization
Status|UNCONFIRMED |NEW
Ever confirmed|0 |1
--- Comment #1 from Richard Biener <rguenth at gcc dot gnu.org> ---
So you say this is a problem with loop header copying, that would mean the
issue is really latent and general, no? Header copying uses
gimple_duplicate_sese_region and has no own profile updating. I guess its
profile updating code isn't designed to cope with copying a region with
"side"-entries (we are ignoring the backedge here). Not sure if we can
somehow generally handle those (maybe we can learn from tracer or threader
here).
Honza?
^ permalink raw reply [flat|nested] 8+ messages in thread
* [Bug testsuite/103270] [12 regression] gcc.dg/vect/pr96698.c inner loop turned from hot to cold after r12-4526
2021-11-16 7:41 [Bug testsuite/103270] New: [12 regression] gcc.dg/vect/pr96698.c inner loop turned from hot to cold after r12-4526 luoxhu at gcc dot gnu.org
2021-11-16 8:19 ` [Bug testsuite/103270] " rguenth at gcc dot gnu.org
@ 2021-11-17 1:54 ` luoxhu at gcc dot gnu.org
2021-11-23 5:13 ` luoxhu at gcc dot gnu.org
` (4 subsequent siblings)
6 siblings, 0 replies; 8+ messages in thread
From: luoxhu at gcc dot gnu.org @ 2021-11-17 1:54 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103270
--- Comment #2 from luoxhu at gcc dot gnu.org ---
(In reply to Richard Biener from comment #1)
> So you say this is a problem with loop header copying, that would mean the
> issue is really latent and general, no? Header copying uses
> gimple_duplicate_sese_region and has no own profile updating. I guess its
> profile updating code isn't designed to cope with copying a region with
> "side"-entries (we are ignoring the backedge here). Not sure if we can
> somehow generally handle those (maybe we can learn from tracer or threader
> here).
>
> Honza?
Yes, it seems to be a general issue in gimple_duplicate_sese_region, the inner
loop cfg was:
8
|
3<--
| \ |
5 4
And it is modified by ch_base::copy_headers->gimple_duplicate_sese_region to(
entry edge is 8->3, exit edge is 3->4):
8
|
12
|
4<--
| |
3---
|
5
bb 12 is copied block from bb 3 as new preheader, bb 3 is rotated to be new
exit of the loop, bb 3 and bb 12 are adjusted count to "total_count -
entry_count" (354334800) and "entry_count"(719407024), at last bb 3 and bb 4
will be merged to one block by gimple_merge_blocks later by TODO_cleanup_cfg
with much smaller
count than preheader.
gimple_duplicate_sese_region:
if (total_count.initialized_p () && entry_count.initialized_p ())
{
scale_bbs_frequencies_profile_count (region, n_region,
total_count - entry_count,
total_count);
scale_bbs_frequencies_profile_count (region_copy, n_region, entry_count,
total_count);
}
Obviously, region of bb 3's profile count shouldn't be decreased from
"total_count" to "total_count - entry_count", it executes at every execution of
the loop. Simply adjust it back to total_count and region_copy to entry_count
will cause some other cases fail. And at the moment edge 3->4 is still not a
backedge now?
^ permalink raw reply [flat|nested] 8+ messages in thread
* [Bug testsuite/103270] [12 regression] gcc.dg/vect/pr96698.c inner loop turned from hot to cold after r12-4526
2021-11-16 7:41 [Bug testsuite/103270] New: [12 regression] gcc.dg/vect/pr96698.c inner loop turned from hot to cold after r12-4526 luoxhu at gcc dot gnu.org
2021-11-16 8:19 ` [Bug testsuite/103270] " rguenth at gcc dot gnu.org
2021-11-17 1:54 ` luoxhu at gcc dot gnu.org
@ 2021-11-23 5:13 ` luoxhu at gcc dot gnu.org
2021-11-23 5:13 ` luoxhu at gcc dot gnu.org
` (3 subsequent siblings)
6 siblings, 0 replies; 8+ messages in thread
From: luoxhu at gcc dot gnu.org @ 2021-11-23 5:13 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103270
--- Comment #3 from luoxhu at gcc dot gnu.org ---
The profile count is correct but something wrong with edge probability, and it
turns out that r12-4526 exposes a long-existing issue in
profile_estimate:predict_extra_loop_exits, when searching extra exit edges for
inner loop, it goes out and find a edge belongs to *outer loop*, setting that
edge with predict value 33%, then predict_loops won't reset that edge for outer
loop.
I drafted a patch to ignore EDGE_DFS_BACK edges when iterating in
predict_extra_loop_exits, then inner loop becomes hot again.
diff base/pr103270.c.047t.profile_estimate
patched/pr103270.c.047t.profile_estimate -U15
Predictions for bb 5
1 edges in bb 5 predicted to even probabilities
Predictions for bb 6
- first match heuristics: 33.00%
- combined heuristics: 33.00%
+ first match heuristics: 91.67%
+ combined heuristics: 91.67%
opcode values nonequal (on trees) heuristics of edge 6->11 (ignored): 66.00%
- extra loop exit heuristics of edge 6->11: 33.00%
+ loop iterations heuristics of edge 6->7: 8.33%
Predictions for bb 11
1 edges in bb 11 predicted to even probabilities
Predictions for bb 7
1 edges in bb 7 predicted to even probabilitie
…
- <bb 2> [local count: 88915474]:
+ <bb 2> [local count: 6029625]:
goto <bb 8>; [100.00%]
- <bb 3> [local count: 354334800]:
+ <bb 3> [local count: 536870913]:
_1 = *i_19(D);
_2 = a_4 & c_6;
_3 = _1 + _2;
*i_19(D) = _3;
- <bb 4> [local count: 708669601]:
+ <bb 4> [local count: 1073741824]:
# c_6 = PHI <c_7(11), b_5(3)>
# d_8 = PHI <0(11), 1(3)>
if (d_8 == 0)
goto <bb 3>; [50.00%]
else
goto <bb 5>; [50.00%]
- <bb 5> [local count: 354334800]:
+ <bb 5> [local count: 536870913]:
# c_21 = PHI <c_6(4)>
b_18 = b_5 + -1;
- <bb 6> [local count: 1073741824]:
+ <bb 6> [local count: 585656064]:
# b_5 = PHI <0(10), b_18(5)>
# c_7 = PHI <0(10), c_21(5)>
if (b_5 != -11)
- goto <bb 11>; [33.00%]
+ goto <bb 11>; [91.67%]
else
- goto <bb 7>; [67.00%]
+ goto <bb 7>; [8.33%]
- <bb 11> [local count: 354334800]:
+ <bb 11> [local count: 536870913]:
goto <bb 4>; [100.00%]
- <bb 7> [local count: 719407024]:
+ <bb 7> [local count: 48785151]:
a_16 = a_4 + 1;
- <bb 8> [local count: 808322498]:
+ <bb 8> [local count: 54814777]:
# a_4 = PHI <a_12(D)(2), a_16(7)>
if (a_4 <= 4)
goto <bb 10>; [89.00%]
else
goto <bb 9>; [11.00%]
- <bb 10> [local count: 719407024]:
+ <bb 10> [local count: 48785151]:
goto <bb 6>; [100.00%]
- <bb 9> [local count: 88915474]:
+ <bb 9> [local count: 6029625]:
return;
^ permalink raw reply [flat|nested] 8+ messages in thread
* [Bug testsuite/103270] [12 regression] gcc.dg/vect/pr96698.c inner loop turned from hot to cold after r12-4526
2021-11-16 7:41 [Bug testsuite/103270] New: [12 regression] gcc.dg/vect/pr96698.c inner loop turned from hot to cold after r12-4526 luoxhu at gcc dot gnu.org
` (2 preceding siblings ...)
2021-11-23 5:13 ` luoxhu at gcc dot gnu.org
@ 2021-11-23 5:13 ` luoxhu at gcc dot gnu.org
2021-11-23 5:31 ` luoxhu at gcc dot gnu.org
` (2 subsequent siblings)
6 siblings, 0 replies; 8+ messages in thread
From: luoxhu at gcc dot gnu.org @ 2021-11-23 5:13 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103270
--- Comment #4 from luoxhu at gcc dot gnu.org ---
Created attachment 51851
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=51851&action=edit
Fix incorrect loop exit edge probability
^ permalink raw reply [flat|nested] 8+ messages in thread
* [Bug testsuite/103270] [12 regression] gcc.dg/vect/pr96698.c inner loop turned from hot to cold after r12-4526
2021-11-16 7:41 [Bug testsuite/103270] New: [12 regression] gcc.dg/vect/pr96698.c inner loop turned from hot to cold after r12-4526 luoxhu at gcc dot gnu.org
` (3 preceding siblings ...)
2021-11-23 5:13 ` luoxhu at gcc dot gnu.org
@ 2021-11-23 5:31 ` luoxhu at gcc dot gnu.org
2021-12-21 3:50 ` cvs-commit at gcc dot gnu.org
2021-12-22 2:49 ` luoxhu at gcc dot gnu.org
6 siblings, 0 replies; 8+ messages in thread
From: luoxhu at gcc dot gnu.org @ 2021-11-23 5:31 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103270
--- Comment #5 from luoxhu at gcc dot gnu.org ---
;; Loop 0
;; header 0, latch 1
;; depth 0, outer -1
;; nodes: 0 1 2 3 4 5 6 11 7 8 10 9
;;
;; Loop 1
;; header 8, latch 7
;; depth 1, outer 0
;; nodes: 8 7 6 10 5 4 11 3
;;
;; Loop 2
;; header 6, latch 5
;; depth 2, outer 1
;; nodes: 6 5 4 11 3
;;
;; Loop 3
;; header 4, latch 3
;; depth 3, outer 2
;; nodes: 4 3
;; 2 succs { 8 }
;; 3 succs { 4 }
;; 4 succs { 3 5 }
;; 5 succs { 6 }
;; 6 succs { 11 7 }
;; 11 succs { 4 }
;; 7 succs { 8 }
;; 8 succs { 10 9 }
;; 10 succs { 6 }
;; 9 succs { 1 }
The CFG is:
2
|
8<----
| \ |
10 9 |
| |
6----7
6<----
| |
11 |
| |
4<- |
| \| |
5 3 |
| |
------
When iterating loop 3 in predict_extra_loop_exits, exit edge is 4->5, it finds
edge 3->4 for statement "if (d_8 == 0)", and set all e->src->preds with
"predict_paths_leading_to_edge (e1, PRED_LOOP_EXTRA_EXIT, NOT_TAKEN);".
(gdb) pbb 3
;; basic block 3, loop depth 3
;; pred: 4
_1 = *i_19(D);
_2 = a_4 & c_6;
_3 = _1 + _2;
*i_19(D) = _3;
;; succ: 4
(gdb) pbb 4
;; basic block 4, loop depth 3
;; pred: 11
;; 3
# c_6 = PHI <c_7(11), b_5(3)>
# d_8 = PHI <0(11), 1(3)>
if (d_8 == 0)
goto <bb 3>; [INV]
else
goto <bb 5>; [INV]
;; succ: 3
;; 5
(gdb) p e->src->preds
$16 = 0x7ffff4fba140 = {<edge 0x7ffff4d05b20 (4 -> 3)>}
^ permalink raw reply [flat|nested] 8+ messages in thread
* [Bug testsuite/103270] [12 regression] gcc.dg/vect/pr96698.c inner loop turned from hot to cold after r12-4526
2021-11-16 7:41 [Bug testsuite/103270] New: [12 regression] gcc.dg/vect/pr96698.c inner loop turned from hot to cold after r12-4526 luoxhu at gcc dot gnu.org
` (4 preceding siblings ...)
2021-11-23 5:31 ` luoxhu at gcc dot gnu.org
@ 2021-12-21 3:50 ` cvs-commit at gcc dot gnu.org
2021-12-22 2:49 ` luoxhu at gcc dot gnu.org
6 siblings, 0 replies; 8+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2021-12-21 3:50 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103270
--- Comment #6 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Xiong Hu Luo <luoxhu@gcc.gnu.org>:
https://gcc.gnu.org/g:46bfe1b0e11c4797c5926e0754fae2848026376c
commit r12-6085-g46bfe1b0e11c4797c5926e0754fae2848026376c
Author: Xionghu Luo <luoxhu@linux.ibm.com>
Date: Mon Dec 20 21:10:09 2021 -0600
Fix incorrect loop exit edge probability [PR103270]
r12-4526 cancelled jump thread path rotates loop. It exposes a issue in
profile-estimate when predict_extra_loop_exits, outer loop's exit edge
is marked as inner loop's extra loop exit and set with incorrect
prediction, then a hot inner loop will become cold loop finally through
optimizations, this patch add loop check when searching extra exit edges
to avoid unexpected predict_edge from predict_paths_for_bb.
Regression tested on P8LE.
gcc/ChangeLog:
2021-12-21 Xionghu Luo <luoxhu@linux.ibm.com>
PR middle-end/103270
* predict.c (predict_extra_loop_exits): Add loop parameter.
(predict_loops): Call with loop argument.
gcc/testsuite/ChangeLog:
2021-12-21 Xionghu Luo <luoxhu@linux.ibm.com>
PR middle-end/103270
* gcc.dg/pr103270.c: New test.
^ permalink raw reply [flat|nested] 8+ messages in thread
* [Bug testsuite/103270] [12 regression] gcc.dg/vect/pr96698.c inner loop turned from hot to cold after r12-4526
2021-11-16 7:41 [Bug testsuite/103270] New: [12 regression] gcc.dg/vect/pr96698.c inner loop turned from hot to cold after r12-4526 luoxhu at gcc dot gnu.org
` (5 preceding siblings ...)
2021-12-21 3:50 ` cvs-commit at gcc dot gnu.org
@ 2021-12-22 2:49 ` luoxhu at gcc dot gnu.org
6 siblings, 0 replies; 8+ messages in thread
From: luoxhu at gcc dot gnu.org @ 2021-12-22 2:49 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103270
luoxhu at gcc dot gnu.org changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|NEW |RESOLVED
Resolution|--- |FIXED
--- Comment #7 from luoxhu at gcc dot gnu.org ---
Fixed on master.
^ permalink raw reply [flat|nested] 8+ messages in thread
end of thread, other threads:[~2021-12-22 2:49 UTC | newest]
Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-11-16 7:41 [Bug testsuite/103270] New: [12 regression] gcc.dg/vect/pr96698.c inner loop turned from hot to cold after r12-4526 luoxhu at gcc dot gnu.org
2021-11-16 8:19 ` [Bug testsuite/103270] " rguenth at gcc dot gnu.org
2021-11-17 1:54 ` luoxhu at gcc dot gnu.org
2021-11-23 5:13 ` luoxhu at gcc dot gnu.org
2021-11-23 5:13 ` luoxhu at gcc dot gnu.org
2021-11-23 5:31 ` luoxhu at gcc dot gnu.org
2021-12-21 3:50 ` cvs-commit at gcc dot gnu.org
2021-12-22 2:49 ` luoxhu at gcc dot gnu.org
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).