From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: by sourceware.org (Postfix, from userid 48) id F06ED3858D28; Tue, 14 Dec 2021 22:19:20 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org F06ED3858D28 From: "sss@li-snyder.org" To: gcc-bugs@gcc.gnu.org Subject: [Bug tree-optimization/103721] New: [12 regression] wrong code generated for loop with conditional (jump threading?) Date: Tue, 14 Dec 2021 22:19:20 +0000 X-Bugzilla-Reason: CC X-Bugzilla-Type: new X-Bugzilla-Watch-Reason: None X-Bugzilla-Product: gcc X-Bugzilla-Component: tree-optimization X-Bugzilla-Version: 12.0 X-Bugzilla-Keywords: X-Bugzilla-Severity: normal X-Bugzilla-Who: sss@li-snyder.org X-Bugzilla-Status: UNCONFIRMED X-Bugzilla-Resolution: X-Bugzilla-Priority: P3 X-Bugzilla-Assigned-To: unassigned at gcc dot gnu.org X-Bugzilla-Target-Milestone: --- X-Bugzilla-Flags: X-Bugzilla-Changed-Fields: bug_id short_desc product version bug_status bug_severity priority component assigned_to reporter target_milestone Message-ID: Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/ Auto-Submitted: auto-generated MIME-Version: 1.0 X-BeenThere: gcc-bugs@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-bugs mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 14 Dec 2021 22:19:21 -0000 https://gcc.gnu.org/bugzilla/show_bug.cgi?id=3D103721 Bug ID: 103721 Summary: [12 regression] wrong code generated for loop with conditional (jump threading?) Product: gcc Version: 12.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: tree-optimization Assignee: unassigned at gcc dot gnu.org Reporter: sss@li-snyder.org Target Milestone: --- hi - With a recent checkout of gcc12 (20211214), on a x86_64-pc-linux-gnu host, the following source is miscompiled with -O2: -- x.cc -------------------------------------------------- bool flag(); const int* bar(); const int* f (const int* world) { const int* searchVolume =3D world; const int* currentVolume =3D nullptr; while (currentVolume !=3D searchVolume && searchVolume) { currentVolume =3D searchVolume; if (flag()) searchVolume =3D bar(); } return (currentVolume); } ---------------------------------------------------------- This can be demonstrated with this test harness: -- test.cc ------------------------------------------------- #include const int* f (const int* world); bool flag() { return true; } int ii[3] =3D {0}; int ipos =3D 0; const int *bar() { if (ipos =3D=3D 2) return nullptr; return &ii[++ipos]; } int main() { const int* i =3D f (&ii[0]); printf ("%d\n", i - &ii[0]); return 0; } ------------------------------------------------------------ I expect this to print 2, which it does when compiled with -O0 or O1. But with -O2: $ g++ -c -O2 x.cc -o x.o $ g++ -o test test.cc x.o $ ./test 0 Here is the generated code for the function f (directives removed for readability): _Z1fPKi: .LFB0: pushq %rbx movq %rdi, %rbx testq %rdi, %rdi je .L2 call _Z4flagv testb %al, %al jne .L10 .L2: movq %rbx, %rax popq %rbx ret .L10: call _Z3barv movq %rbx, %rax popq %rbx ret As one can see, in the generated code, the loop is not executed more than once. It appears as if the fact that searchVolume can be changed within the if statement is being lost --- if the conditional is commented out, then things work as expected. Things seem to go bad by at least the threadfull1 pass. In the previous pass, 110t.mergephi2, we have: ----------------------------------------------------------------- ;; Function f (_Z1fPKi, funcdef_no=3D0, decl_uid=3D2370, cgraph_uid=3D1, symbol_order=3D0) Removing basic block 6 const int * f (const int * world) { const int * currentVolume; const int * searchVolume; bool _1; bool _2; bool _3; bool _14; const int * _16; [local count: 118111600]: goto ; [100.00%] [local count: 955630225]: _14 =3D flag (); if (_14 !=3D 0) goto ; [33.00%] else goto ; [67.00%] [local count: 315357972]: _16 =3D bar (); [local count: 1073741824]: # searchVolume_4 =3D PHI <_16(4), world_7(D)(2), searchVolume_4(3)> # currentVolume_5 =3D PHI _1 =3D searchVolume_4 !=3D currentVolume_5; _2 =3D searchVolume_4 !=3D 0B; _3 =3D _1 & _2; if (_3 !=3D 0) goto ; [89.00%] else goto ; [11.00%] [local count: 118111600]: return currentVolume_5; } ----------------------------------------------------------------- which does seem to represent the loop correctly. But in 111t.threadfull1, this has been changed to: ------------------------------------------------------------------ ;; Function f (_Z1fPKi, funcdef_no=3D0, decl_uid=3D2370, cgraph_uid=3D1, symbol_order=3D0) Disambiguating loop 1 with multiple latches Merged latch edges of loop 1 ;; 2 loops found ;; ;; Loop 0 ;; header 0, latch 1 ;; depth 0, outer -1 ;; nodes: 0 1 2 3 4 8 9 7 ;; ;; Loop 1 ;; header 9, latch 8 ;; depth 1, outer 0 ;; nodes: 9 8 4 3 ;; 2 succs { 9 } ;; 3 succs { 4 8 } ;; 4 succs { 8 } ;; 8 succs { 9 } ;; 9 succs { 3 7 } ;; 7 succs { 1 } Jump threading proved probability of edge 9->7 too small (it is 11.0% (gues= sed) should be always (guessed)) SSA replacement table N_i -> { O_1 ... O_j } means that N_i replaces O_1, ..., O_j searchVolume_12 -> { searchVolume_4 } currentVolume_17 -> { currentVolume_5 } .MEM_18 -> { .MEM_6 } _19 -> { _1 } _20 -> { _2 } _21 -> { _3 } currentVolume_22 -> { currentVolume_5 } .MEM_23 -> { .MEM_6 } Incremental SSA update started at block: 9 Number of blocks in CFG: 11 Number of blocks to update: 6 ( 55%) Merging blocks 2 and 9 Merging blocks 8 and 10 const int * f (const int * world) { const int * currentVolume; const int * searchVolume; bool _1; bool _2; bool _3; bool _14; const int * _16; bool _19; bool _20; bool _21; [local count: 118111600]: _1 =3D world_7(D) !=3D 0B; _2 =3D world_7(D) !=3D 0B; _3 =3D _1 & _2; if (_3 !=3D 0) goto ; [97.00%] else goto ; [3.00%] [local count: 955630225]: _14 =3D flag (); if (_14 !=3D 0) goto ; [33.00%] else goto ; [67.00%] [local count: 315357972]: _16 =3D bar (); [local count: 955630224]: # searchVolume_11 =3D PHI <_16(4), world_7(D)(3)> # currentVolume_8 =3D PHI _19 =3D currentVolume_8 !=3D searchVolume_11; _20 =3D searchVolume_11 !=3D 0B; _21 =3D _19 & _20; [local count: 118111600]: # currentVolume_22 =3D PHI <0B(2), currentVolume_8(5)> return currentVolume_22; } ------------------------------------------------------------------ which has no backwards branch.=