public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed
* [Bug testsuite/103270] New: [12 regression] gcc.dg/vect/pr96698.c inner loop turned from hot to cold after r12-4526
@ 2021-11-16  7:41 luoxhu at gcc dot gnu.org
  2021-11-16  8:19 ` [Bug testsuite/103270] " rguenth at gcc dot gnu.org
                   ` (6 more replies)
  0 siblings, 7 replies; 8+ messages in thread
From: luoxhu at gcc dot gnu.org @ 2021-11-16  7:41 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103270

            Bug ID: 103270
           Summary: [12 regression] gcc.dg/vect/pr96698.c inner loop
                    turned from hot to cold after r12-4526
           Product: gcc
           Version: 12.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: testsuite
          Assignee: unassigned at gcc dot gnu.org
          Reporter: luoxhu at gcc dot gnu.org
  Target Milestone: ---

For the testcase gcc.dg/vect/pr96698.c, the inner loop was hot (preheader count
< loop count), but it is NOT now after r12-4526, bb 3's profile count 354334801
is only 1/2 of the preheader bb 5's profile count 719407024.

But I guess it should be fixed in tree-ssa-loop-ch.c when copy_headers, there
are profile count update there, this case should be handled specially when the
single exit loop only has two bbs and the old header is new exit->src, no need
to scale down the old header profile count to preserve the hotness of the loop.


pr96698.c.138t.lim2:
void test (int a, int * i)
{
  int i__lsm.5;
  int c;
  int b;
  int _22;
  int _23;
  int _24;  <bb 2> [local count: 88915474]:
  if (a_12(D) <= 4)
    goto <bb 7>; [89.00%]
  else
    goto <bb 6>; [11.00%]

  <bb 7> [local count: 79134772]:
  i__lsm.5_11 = *i_16(D);
  goto <bb 5>; [100.00%]

  <bb 9> [local count: 116930484]:

  <bb 3> [local count: 354334801]:
  # b_3 = PHI <b_15(9), 0(5)>
  # c_17 = PHI <b_3(9), 0(5)>
  # i__lsm.5_20 = PHI <i__lsm.5_4(9), i__lsm.5_1(5)>
  _22 = i__lsm.5_20;
  _23 = a_2 & c_17;
  _24 = _22 + _23;
  i__lsm.5_4 = _24;
  b_15 = b_3 + -1;
  if (b_15 != -11)
    goto <bb 9>; [33.00%]
  else
    goto <bb 4>; [67.00%]

  <bb 4> [local count: 719407024]:
  # i__lsm.5_7 = PHI <i__lsm.5_4(3)>
  a_14 = a_2 + 1;
  if (a_14 <= 4)
    goto <bb 8>; [89.00%]
  else
    goto <bb 10>; [11.00%]

  <bb 8> [local count: 640272252]:

  <bb 5> [local count: 719407024]:
  # a_2 = PHI <a_14(8), a_12(D)(7)>
  # i__lsm.5_1 = PHI <i__lsm.5_7(8), i__lsm.5_11(7)>
  goto <bb 3>; [100.00%]

  <bb 10> [local count: 79134772]:
  # i__lsm.5_5 = PHI <i__lsm.5_7(4)>
  *i_16(D) = i__lsm.5_5;

  <bb 6> [local count: 88915474]:
  return;
}

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [Bug testsuite/103270] [12 regression] gcc.dg/vect/pr96698.c inner loop turned from hot to cold after r12-4526
  2021-11-16  7:41 [Bug testsuite/103270] New: [12 regression] gcc.dg/vect/pr96698.c inner loop turned from hot to cold after r12-4526 luoxhu at gcc dot gnu.org
@ 2021-11-16  8:19 ` rguenth at gcc dot gnu.org
  2021-11-17  1:54 ` luoxhu at gcc dot gnu.org
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: rguenth at gcc dot gnu.org @ 2021-11-16  8:19 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103270

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |hubicka at gcc dot gnu.org
   Last reconfirmed|                            |2021-11-16
   Target Milestone|---                         |12.0
           Keywords|                            |missed-optimization
             Status|UNCONFIRMED                 |NEW
     Ever confirmed|0                           |1

--- Comment #1 from Richard Biener <rguenth at gcc dot gnu.org> ---
So you say this is a problem with loop header copying, that would mean the
issue is really latent and general, no?  Header copying uses
gimple_duplicate_sese_region and has no own profile updating.  I guess its
profile updating code isn't designed to cope with copying a region with
"side"-entries (we are ignoring the backedge here).  Not sure if we can
somehow generally handle those (maybe we can learn from tracer or threader
here).

Honza?

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [Bug testsuite/103270] [12 regression] gcc.dg/vect/pr96698.c inner loop turned from hot to cold after r12-4526
  2021-11-16  7:41 [Bug testsuite/103270] New: [12 regression] gcc.dg/vect/pr96698.c inner loop turned from hot to cold after r12-4526 luoxhu at gcc dot gnu.org
  2021-11-16  8:19 ` [Bug testsuite/103270] " rguenth at gcc dot gnu.org
@ 2021-11-17  1:54 ` luoxhu at gcc dot gnu.org
  2021-11-23  5:13 ` luoxhu at gcc dot gnu.org
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: luoxhu at gcc dot gnu.org @ 2021-11-17  1:54 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103270

--- Comment #2 from luoxhu at gcc dot gnu.org ---
(In reply to Richard Biener from comment #1)
> So you say this is a problem with loop header copying, that would mean the
> issue is really latent and general, no?  Header copying uses
> gimple_duplicate_sese_region and has no own profile updating.  I guess its
> profile updating code isn't designed to cope with copying a region with
> "side"-entries (we are ignoring the backedge here).  Not sure if we can
> somehow generally handle those (maybe we can learn from tracer or threader
> here).
> 
> Honza?

Yes, it seems to be a general issue in gimple_duplicate_sese_region, the inner
loop cfg was:

8
|
3<-- 
| \ |
5  4  

And it is modified by ch_base::copy_headers->gimple_duplicate_sese_region to(
entry edge is 8->3, exit edge is 3->4):

8
|
12
|
4<-- 
|   |
3---
|
5

bb 12 is copied block from bb 3 as new preheader, bb 3 is rotated to be new
exit of the loop, bb 3 and bb 12 are adjusted count to "total_count -
entry_count" (354334800) and "entry_count"(719407024), at last bb 3 and bb 4
will be merged to one block by gimple_merge_blocks later by TODO_cleanup_cfg
with much smaller
count than preheader.


gimple_duplicate_sese_region:

  if (total_count.initialized_p () && entry_count.initialized_p ())
    {
      scale_bbs_frequencies_profile_count (region, n_region,
                                           total_count - entry_count,
                                           total_count);
      scale_bbs_frequencies_profile_count (region_copy, n_region, entry_count,
                                           total_count);
    }


Obviously, region of bb 3's profile count shouldn't be decreased from
"total_count" to "total_count - entry_count", it executes at every execution of
the loop.  Simply adjust it back to total_count and region_copy to entry_count
will cause some other cases fail. And at the moment edge 3->4 is still not a
backedge now?

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [Bug testsuite/103270] [12 regression] gcc.dg/vect/pr96698.c inner loop turned from hot to cold after r12-4526
  2021-11-16  7:41 [Bug testsuite/103270] New: [12 regression] gcc.dg/vect/pr96698.c inner loop turned from hot to cold after r12-4526 luoxhu at gcc dot gnu.org
  2021-11-16  8:19 ` [Bug testsuite/103270] " rguenth at gcc dot gnu.org
  2021-11-17  1:54 ` luoxhu at gcc dot gnu.org
@ 2021-11-23  5:13 ` luoxhu at gcc dot gnu.org
  2021-11-23  5:13 ` luoxhu at gcc dot gnu.org
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: luoxhu at gcc dot gnu.org @ 2021-11-23  5:13 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103270

--- Comment #3 from luoxhu at gcc dot gnu.org ---
The profile count is correct but something wrong with edge probability, and it
turns out that r12-4526 exposes a long-existing issue in
profile_estimate:predict_extra_loop_exits, when searching extra exit edges for
inner loop, it goes out and find a edge belongs to *outer loop*, setting that
edge with predict value 33%, then predict_loops won't reset that edge for outer
loop.
I drafted a patch to ignore EDGE_DFS_BACK edges when iterating in
predict_extra_loop_exits, then inner loop becomes hot again.

diff base/pr103270.c.047t.profile_estimate
patched/pr103270.c.047t.profile_estimate  -U15

 Predictions for bb 5
 1 edges in bb 5 predicted to even probabilities
 Predictions for bb 6
-  first match heuristics: 33.00%
-  combined heuristics: 33.00%
+  first match heuristics: 91.67%
+  combined heuristics: 91.67%
   opcode values nonequal (on trees) heuristics of edge 6->11 (ignored): 66.00%
-  extra loop exit heuristics of edge 6->11: 33.00%
+  loop iterations heuristics of edge 6->7: 8.33%
 Predictions for bb 11
 1 edges in bb 11 predicted to even probabilities
 Predictions for bb 7
 1 edges in bb 7 predicted to even probabilitie

…

-  <bb 2> [local count: 88915474]:
+  <bb 2> [local count: 6029625]:
   goto <bb 8>; [100.00%]

-  <bb 3> [local count: 354334800]:
+  <bb 3> [local count: 536870913]:
   _1 = *i_19(D);
   _2 = a_4 & c_6;
   _3 = _1 + _2;
   *i_19(D) = _3;

-  <bb 4> [local count: 708669601]:
+  <bb 4> [local count: 1073741824]:
   # c_6 = PHI <c_7(11), b_5(3)>
   # d_8 = PHI <0(11), 1(3)>
   if (d_8 == 0)
     goto <bb 3>; [50.00%]
   else
     goto <bb 5>; [50.00%]

-  <bb 5> [local count: 354334800]:
+  <bb 5> [local count: 536870913]:
   # c_21 = PHI <c_6(4)>
   b_18 = b_5 + -1;

-  <bb 6> [local count: 1073741824]:
+  <bb 6> [local count: 585656064]:
   # b_5 = PHI <0(10), b_18(5)>
   # c_7 = PHI <0(10), c_21(5)>
   if (b_5 != -11)
-    goto <bb 11>; [33.00%]
+    goto <bb 11>; [91.67%]
   else
-    goto <bb 7>; [67.00%]
+    goto <bb 7>; [8.33%]

-  <bb 11> [local count: 354334800]:
+  <bb 11> [local count: 536870913]:
   goto <bb 4>; [100.00%]

-  <bb 7> [local count: 719407024]:
+  <bb 7> [local count: 48785151]:
   a_16 = a_4 + 1;

-  <bb 8> [local count: 808322498]:
+  <bb 8> [local count: 54814777]:
   # a_4 = PHI <a_12(D)(2), a_16(7)>
   if (a_4 <= 4)
     goto <bb 10>; [89.00%]
   else
     goto <bb 9>; [11.00%]

-  <bb 10> [local count: 719407024]:
+  <bb 10> [local count: 48785151]:
   goto <bb 6>; [100.00%]

-  <bb 9> [local count: 88915474]:
+  <bb 9> [local count: 6029625]:
   return;

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [Bug testsuite/103270] [12 regression] gcc.dg/vect/pr96698.c inner loop turned from hot to cold after r12-4526
  2021-11-16  7:41 [Bug testsuite/103270] New: [12 regression] gcc.dg/vect/pr96698.c inner loop turned from hot to cold after r12-4526 luoxhu at gcc dot gnu.org
                   ` (2 preceding siblings ...)
  2021-11-23  5:13 ` luoxhu at gcc dot gnu.org
@ 2021-11-23  5:13 ` luoxhu at gcc dot gnu.org
  2021-11-23  5:31 ` luoxhu at gcc dot gnu.org
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: luoxhu at gcc dot gnu.org @ 2021-11-23  5:13 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103270

--- Comment #4 from luoxhu at gcc dot gnu.org ---
Created attachment 51851
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=51851&action=edit
Fix incorrect loop exit edge probability

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [Bug testsuite/103270] [12 regression] gcc.dg/vect/pr96698.c inner loop turned from hot to cold after r12-4526
  2021-11-16  7:41 [Bug testsuite/103270] New: [12 regression] gcc.dg/vect/pr96698.c inner loop turned from hot to cold after r12-4526 luoxhu at gcc dot gnu.org
                   ` (3 preceding siblings ...)
  2021-11-23  5:13 ` luoxhu at gcc dot gnu.org
@ 2021-11-23  5:31 ` luoxhu at gcc dot gnu.org
  2021-12-21  3:50 ` cvs-commit at gcc dot gnu.org
  2021-12-22  2:49 ` luoxhu at gcc dot gnu.org
  6 siblings, 0 replies; 8+ messages in thread
From: luoxhu at gcc dot gnu.org @ 2021-11-23  5:31 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103270

--- Comment #5 from luoxhu at gcc dot gnu.org ---
;; Loop 0
;;  header 0, latch 1
;;  depth 0, outer -1
;;  nodes: 0 1 2 3 4 5 6 11 7 8 10 9
;;
;; Loop 1
;;  header 8, latch 7
;;  depth 1, outer 0
;;  nodes: 8 7 6 10 5 4 11 3
;;
;; Loop 2
;;  header 6, latch 5
;;  depth 2, outer 1
;;  nodes: 6 5 4 11 3
;;
;; Loop 3
;;  header 4, latch 3
;;  depth 3, outer 2
;;  nodes: 4 3
;; 2 succs { 8 }
;; 3 succs { 4 }
;; 4 succs { 3 5 }
;; 5 succs { 6 }
;; 6 succs { 11 7 }
;; 11 succs { 4 }
;; 7 succs { 8 }
;; 8 succs { 10 9 }
;; 10 succs { 6 }
;; 9 succs { 1 }

The CFG is:

2
|
8<----
| \  |
10 9 |
|    |
6----7
6<----
|    | 
11   | 
|    |
4<-  | 
| \| |
5  3 |
|    |
------

When iterating loop 3 in predict_extra_loop_exits, exit edge is 4->5, it finds
edge 3->4 for statement "if (d_8 == 0)", and set all e->src->preds with
"predict_paths_leading_to_edge (e1, PRED_LOOP_EXTRA_EXIT, NOT_TAKEN);".

(gdb) pbb 3
;; basic block 3, loop depth 3
;;  pred:       4
_1 = *i_19(D);
_2 = a_4 & c_6;
_3 = _1 + _2;
*i_19(D) = _3;
;;  succ:       4

(gdb) pbb 4
;; basic block 4, loop depth 3
;;  pred:       11
;;              3
# c_6 = PHI <c_7(11), b_5(3)>
# d_8 = PHI <0(11), 1(3)>
if (d_8 == 0)
  goto <bb 3>; [INV]
else
  goto <bb 5>; [INV]
;;  succ:       3
;;              5 
(gdb) p e->src->preds
$16 = 0x7ffff4fba140 = {<edge 0x7ffff4d05b20 (4 -> 3)>}

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [Bug testsuite/103270] [12 regression] gcc.dg/vect/pr96698.c inner loop turned from hot to cold after r12-4526
  2021-11-16  7:41 [Bug testsuite/103270] New: [12 regression] gcc.dg/vect/pr96698.c inner loop turned from hot to cold after r12-4526 luoxhu at gcc dot gnu.org
                   ` (4 preceding siblings ...)
  2021-11-23  5:31 ` luoxhu at gcc dot gnu.org
@ 2021-12-21  3:50 ` cvs-commit at gcc dot gnu.org
  2021-12-22  2:49 ` luoxhu at gcc dot gnu.org
  6 siblings, 0 replies; 8+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2021-12-21  3:50 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103270

--- Comment #6 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Xiong Hu Luo <luoxhu@gcc.gnu.org>:

https://gcc.gnu.org/g:46bfe1b0e11c4797c5926e0754fae2848026376c

commit r12-6085-g46bfe1b0e11c4797c5926e0754fae2848026376c
Author: Xionghu Luo <luoxhu@linux.ibm.com>
Date:   Mon Dec 20 21:10:09 2021 -0600

    Fix incorrect loop exit edge probability [PR103270]

    r12-4526 cancelled jump thread path rotates loop. It exposes a issue in
    profile-estimate when predict_extra_loop_exits, outer loop's exit edge
    is marked as inner loop's extra loop exit and set with incorrect
    prediction, then a hot inner loop will become cold loop finally through
    optimizations, this patch add loop check when searching extra exit edges
    to avoid unexpected predict_edge from predict_paths_for_bb.

    Regression tested on P8LE.

    gcc/ChangeLog:

    2021-12-21  Xionghu Luo  <luoxhu@linux.ibm.com>

            PR middle-end/103270
            * predict.c (predict_extra_loop_exits): Add loop parameter.
            (predict_loops): Call with loop argument.

    gcc/testsuite/ChangeLog:

    2021-12-21  Xionghu Luo  <luoxhu@linux.ibm.com>

            PR middle-end/103270
            * gcc.dg/pr103270.c: New test.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [Bug testsuite/103270] [12 regression] gcc.dg/vect/pr96698.c inner loop turned from hot to cold after r12-4526
  2021-11-16  7:41 [Bug testsuite/103270] New: [12 regression] gcc.dg/vect/pr96698.c inner loop turned from hot to cold after r12-4526 luoxhu at gcc dot gnu.org
                   ` (5 preceding siblings ...)
  2021-12-21  3:50 ` cvs-commit at gcc dot gnu.org
@ 2021-12-22  2:49 ` luoxhu at gcc dot gnu.org
  6 siblings, 0 replies; 8+ messages in thread
From: luoxhu at gcc dot gnu.org @ 2021-12-22  2:49 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103270

luoxhu at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |FIXED

--- Comment #7 from luoxhu at gcc dot gnu.org ---
Fixed on master.

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2021-12-22  2:49 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-11-16  7:41 [Bug testsuite/103270] New: [12 regression] gcc.dg/vect/pr96698.c inner loop turned from hot to cold after r12-4526 luoxhu at gcc dot gnu.org
2021-11-16  8:19 ` [Bug testsuite/103270] " rguenth at gcc dot gnu.org
2021-11-17  1:54 ` luoxhu at gcc dot gnu.org
2021-11-23  5:13 ` luoxhu at gcc dot gnu.org
2021-11-23  5:13 ` luoxhu at gcc dot gnu.org
2021-11-23  5:31 ` luoxhu at gcc dot gnu.org
2021-12-21  3:50 ` cvs-commit at gcc dot gnu.org
2021-12-22  2:49 ` luoxhu at gcc dot gnu.org

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).