From: "acoplan at gcc dot gnu.org"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug target/113070] [14 regression] [AArch64] [PGO/LTO] Miscompilation of go compiler
Date: Fri, 12 Jan 2024 16:52:42 +0000
X-Bugzilla-Reason: CC
X-Bugzilla-Type: changed
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: gcc
X-Bugzilla-Component: target
X-Bugzilla-Version: 14.0
X-Bugzilla-Keywords: lto, wrong-code
X-Bugzilla-Severity: normal
X-Bugzilla-Who: acoplan at gcc dot gnu.org
X-Bugzilla-Status: ASSIGNED
X-Bugzilla-Priority: P1
X-Bugzilla-Assigned-To: acoplan at gcc dot gnu.org
X-Bugzilla-Target-Milestone: 14.0
X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113070

--- Comment #7 from Alex Coplan ---
Just to give a concrete example / reduced testcase where this goes wrong (to
aid review).

For the following testcase (reduced from libiberty) with -O2
-mlate-ldp-fusion:

struct {
  unsigned D;
  int E;
} * sha1_process_block_ctx;
void *sha1_process_block_buffer;
int sha1_process_block_ctx_1, sha1_process_block_ctx_0,
    sha1_process_block_ctx_3, sha1_process_block_d, sha1_process_block_e,
    sha1_process_block_tm, sha1_process_block_a, sha1_process_block_x_6,
    sha1_process_block_x_14, sha1_process_block_x_15;
unsigned sha1_process_block_ctx_2;
void sha1_process_block() {
  int *words = sha1_process_block_buffer;
  int endp = *words, x_0;
  int x[6];
  unsigned b, c;
  while (endp) {
    int t = 0;
    for (; t < 6;)
      t = *words;
    sha1_process_block_a +=
        sha1_process_block_ctx_2 + 8348 + sha1_process_block_tm;
    x_0 += sha1_process_block_tm = x[73];
    b += sha1_process_block_x_15 = sha1_process_block_tm;
    sha1_process_block_a += b | 1;
    sha1_process_block_tm = sha1_process_block_x_14 ^ 8;
    sha1_process_block_e = sha1_process_block_tm;
    sha1_process_block_tm = x[8];
    c += sha1_process_block_x_14 = sha1_process_block_tm;
    b += sha1_process_block_x_15;
    sha1_process_block_tm = x_0 ^ x[3];
    sha1_process_block_a += sha1_process_block_tm;
    sha1_process_block_tm = x[4] ^ x[15];
    sha1_process_block_e +=
        sha1_process_block_a + b ^ sha1_process_block_d + sha1_process_block_tm;
    sha1_process_block_tm = sha1_process_block_x_6 ^ x[15];
    sha1_process_block_d +=
        sha1_process_block_e >> 5 +
        (sha1_process_block_x_6 = sha1_process_block_tm);
    sha1_process_block_ctx_0 += sha1_process_block_ctx_1 +=
        sha1_process_block_ctx_2 += c;
    sha1_process_block_ctx_3 += sha1_process_block_ctx->E +=
        sha1_process_block_e;
  }
}

we try to do this:

fusing pair [L=0] (200,199), base=31, hazards: (27,54), move_range: (54,54)

with the initial IR:

    insn i200 in bb3 [ebb3] at point 102:
      +---------------------------
      |  200: [sp:DI+0x64]=x0:SI
      |      REG_DEAD x0:SI
      +---------------------------
      uses:
        use of set r0:i37 (x0:SI)
        use of phi node r31:a12 (sp:DI)
          appears inside an address
      defines:
        set mem:i200

    insn i198 in bb3 [ebb3] at point 104:
      +---------------------------
      |  198: [sp:DI+0x6c]=x2:SI
      |      REG_DEAD x2:SI
      +---------------------------
      uses:
        use of set r2:i81 (x2:SI)
        use of phi node r31:a12 (sp:DI)
          appears inside an address
      defines:
        set mem:i198
          used by insn i27 in bb3 [ebb3] at point 108

    insn i54 in bb3 [ebb3] at point 106:
      +--------------------------
      |   54: x2:SI=x16:SI<<0x1
      +--------------------------
      uses:
        SI use of set r16:i28 (x16:DI)
      defines:
        set r2:i54 (x2:SI)
          used by insn i199 in bb3 [ebb3] at point 110

    insn i27 in bb3 [ebb3] at point 108:
      +--------------------------------------------
      |   27: x0:DI=zero_extend([x1:DI+0x18])
      |      REG_EQUAL [const(`*.LANCHOR0'+0x18)]
      +--------------------------------------------
      uses:
        use of set r1:i223 (x1:DI)
          appears inside an address
        use of set mem:i198
      defines:
        set r0:i27 (x0:DI)
          live out from bb3 [ebb3] at point 114
          used by phi node r0:a15 (x0:DI) in ebb6 at point 116

    insn i199 in bb3 [ebb3] at point 110:
      +---------------------------
      |  199: [sp:DI+0x68]=x2:SI
      |      REG_DEAD x2:SI
      +---------------------------
      uses:
        use of set r2:i54 (x2:SI)
        use of phi node r31:a12 (sp:DI)
          appears inside an address
      defines:
        set mem:i199
          used by phi node mem:a15 in ebb6 at point 116

as it stands, after fusing that pair, we have:

    insn i200 in bb3 [ebb3] at point 102:
      +--------------------------
      |  200: clobber [scratch]
      +--------------------------
      defines:
        set mem:i200

    insn i198 in bb3 [ebb3] at point 104:
      +---------------------------
      |  198: [sp:DI+0x6c]=x2:SI
      |      REG_DEAD x2:SI
      +---------------------------
      uses:
        use of set r2:i81 (x2:SI)
        use of phi node r31:a12 (sp:DI)
          appears inside an address
      defines:
        set mem:i198
          used by insn i27 in bb3 [ebb3] at point 108

    insn i54 in bb3 [ebb3] at point 106:
      +--------------------------
      |   54: x2:SI=x16:SI<<0x1
      +--------------------------
      uses:
        SI use of set r16:i28 (x16:DI)
      defines:
        set r2:i54 (x2:SI)
          used by insn i244 in bb3 [ebb3] at point 107

    insn i244 in bb3 [ebb3] at point 107:
      +--------------------------------------------
      |  244: [sp:DI+0x64]=unspec[x0:SI,x2:SI] 38
      +--------------------------------------------
      uses:
        use of set r0:i37 (x0:SI)
        use of set r2:i54 (x2:SI)
        use of phi node r31:a12 (sp:DI)
          appears inside an address
      defines:
        set mem:i244

    insn i27 in bb3 [ebb3] at point 108:
      +--------------------------------------------
      |   27: x0:DI=zero_extend([x1:DI+0x18])
      |      REG_EQUAL [const(`*.LANCHOR0'+0x18)]
      +--------------------------------------------
      uses:
        use of set r1:i223 (x1:DI)
          appears inside an address
        use of set mem:i198
      defines:
        set r0:i27 (x0:DI)
          live out from bb3 [ebb3] at point 114
          used by phi node r0:a15 (x0:DI) in ebb6 at point 116

    insn i199 in bb3 [ebb3] at point 110:
      +--------------------------
      |  199: clobber [scratch]
      +--------------------------
      defines:
        set mem:i199
          used by phi node mem:a15 in ebb6 at point 116

The use problem is already visible here: i27 is consuming mem from i198, but it
should be consuming mem from our newly-inserted stp (i244).

The def problem is visible if we look in GDB:

(gdb) call debug (i2)
insn i199 in bb3 [ebb3] at point 110:
  +--------------------------
  |  199: clobber [scratch]
  +--------------------------
  defines:
    set mem:i199
      used by phi node mem:a15 in ebb6 at point 116
(gdb) call debug (i2->defs ()[0])
set mem:i199 in bb3 [ebb3] at point 110
  used by phi node mem:a15 in ebb6 at point 116
(gdb) call debug (i2->defs ()[0]->prev_def ())
set mem:i198 in bb3 [ebb3] at point 104
  used by insn i27 in bb3 [ebb3] at point 108

here the previous def should be our new stp (i244) instead of i198.

I have patches to fix both of these issues.
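
To make the two problems easier to picture, here is a toy model of the memory
def chain in plain C. It is only a sketch and does not use the RTL-SSA API:
the struct and function names (mem_def, mem_use, def_before, insert_def,
rewire_use, toy_example) are made up for illustration, and it says nothing
about how the actual patches work. It only shows the state the chain should
end up in once a new store is inserted at point 107.

```c
#include <stddef.h>

/* Toy model only: these types and names are hypothetical and are not
   part of RTL-SSA.  Memory defs form a chain ordered by program point.  */
struct mem_def {
  int insn;             /* insn uid, e.g. 198 */
  int point;            /* program point, e.g. 104 */
  struct mem_def *prev; /* previous def of memory, or NULL */
};

struct mem_use {
  int insn;
  int point;
  struct mem_def *def;  /* the def of memory that this use consumes */
};

/* Return the nearest def strictly before POINT, walking back from LAST.  */
struct mem_def *def_before(struct mem_def *last, int point) {
  while (last && last->point >= point)
    last = last->prev;
  return last;
}

/* Link NEW_DEF into the chain ending at LAST, keeping point order.
   Assumes NEW_DEF does not come after LAST.  Fixing the prev link of
   the def that follows NEW_DEF corresponds to the "def problem".  */
void insert_def(struct mem_def *last, struct mem_def *new_def) {
  struct mem_def *next = NULL;
  struct mem_def *d = last;
  while (d && d->point > new_def->point) {
    next = d;
    d = d->prev;
  }
  new_def->prev = d;
  if (next)
    next->prev = new_def;
}

/* Re-point USE at the nearest preceding def (the "use problem").  */
void rewire_use(struct mem_use *use, struct mem_def *last) {
  use->def = def_before(last, use->point);
}

/* Rebuild the situation from the dump: defs i200@102, i198@104,
   i199@110, and i27@108 reading memory from i198.  Insert i244@107 and
   return 1 if both i199's previous def and i27's def are now i244.  */
int toy_example(void) {
  struct mem_def d200 = {200, 102, NULL};
  struct mem_def d198 = {198, 104, &d200};
  struct mem_def d199 = {199, 110, &d198};
  struct mem_use u27 = {27, 108, &d198};
  struct mem_def d244 = {244, 107, NULL};

  insert_def(&d199, &d244);
  rewire_use(&u27, &d199);

  return d199.prev == &d244 && u27.def == &d244;
}
```

In this model, skipping insert_def's fix-up of the following def leaves i199
pointing back at i198 (what the GDB session shows), and skipping rewire_use
leaves i27 consuming mem from i198 (what the post-fusion dump shows).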