From: "rsandifo at gcc dot gnu.org"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug middle-end/94941] New: Expansion of some internal fns can drop the lhs on the floor
Date: Mon, 04 May 2020 08:49:46 +0000

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94941

Bug ID: 94941
Summary: Expansion of some internal fns can drop the lhs on the floor
Product: gcc
Version: 10.0
Status: UNCONFIRMED
Keywords: wrong-code
Severity: normal
Priority: P3
Component: middle-end
Assignee: unassigned at gcc dot gnu.org
Reporter: rsandifo at gcc dot gnu.org
Target Milestone: ---

internal-fn.c:expand_mask_load_optab_fn uses expand_insn to emit the load
instruction, but doesn't then test whether the coerced output operand is
the same as the target of the gcall.  It might not be, for example, in
unoptimised code, where the target of the gcall expands to a MEM rtx and
the load insn requires a REG destination.  We need the equivalent of the
following code from expand_while_optab_fn:

  if (!rtx_equal_p (lhs_rtx, ops[0].value))
    emit_move_insn (lhs_rtx, ops[0].value);

This can be seen for AArch64 with the following test, compiled with
-O0 -march=armv8.2-a+sve:

----------------------------------------------------------
#include <arm_sve.h>

svfloat32_t
foo (float *ptr)
{
  svbool_t pg = svptrue_pat_b32 (SV_VL1);
  svfloat32_t res = svld1 (pg, ptr);
  return res;
}

int
main (void)
{
  svbool_t pg = svptrue_pat_b32 (SV_VL1);
  float x[1] = { 1 };
  if (svptest_any (pg, svcmpne (pg, foo (x), 1.0)))
    __builtin_abort ();
  return 0;
}
----------------------------------------------------------

We emit:

;; res_5 = .MASK_LOAD (ptr_4(D), 4B, _2);

(insn 9 8 10 (set (reg/f:DI 96)
        (mem/f/c:DI (plus:DI (reg/f:DI 87 virtual-stack-vars)
                (const_poly_int:DI [-40, -32])) [3 ptr+0 S8 A64])) "/tmp/foo.c":7:21 -1
     (nil))

(insn 10 9 0 (set (reg:VNx4SF 97)
        (unspec:VNx4SF [
                (reg:VNx4BI 92 [ _2 ])
                (mem:VNx4SF (reg/f:DI 96) [0 MEM [(float *)ptr_4(D)]+0 S[16, 16] A8])
            ] UNSPEC_LD1_SVE)) "/tmp/foo.c":7:21 -1
     (nil))

but don't store reg 97 to the stack slot for "res".  Then the return
statement loads from "res":

(insn 12 11 0 (set (reg:VNx4SF 93 [ _6 ])
        (unspec:VNx4SF [
                (subreg:VNx4BI (reg:VNx16BI 98) 0)
                (mem/c:VNx4SF (plus:DI (reg/f:DI 87 virtual-stack-vars)
                        (const_poly_int:DI [-32, -32])) [2 res+0 S[16, 16] A128])
            ] UNSPEC_PRED_X)) "/tmp/foo.c":8:10 -1
     (nil))

meaning we return uninitialised stack contents.

The same problem affects expand_load_lanes_optab_fn and
expand_gather_load_optab_fn.

I think this problem has existed since the mask load/store functions were
introduced, but it was probably latent until GCC 10 because nothing would
use them in unoptimised code.
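
For readers who don't know the expanders, the failure mode described above can be modelled with a small standalone C sketch.  Everything in it is a hypothetical stand-in rather than GCC code: "struct loc" plays the role of an rtx, toy_expand_load plays the role of expand_insn coercing a MEM output operand into a fresh REG, and the two expand_* functions show the expansion without and with the copy-back check.  It only illustrates the control flow, under those assumptions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for rtx kinds; these are not GCC's real types.  */
enum loc_kind { LOC_REG, LOC_MEM };

struct loc
{
  enum loc_kind kind;
  int value;			/* Contents of the location.  */
};

/* Toy model of expand_insn for a load pattern that only accepts a REG
   destination.  If REQUESTED is not a REG, the result is written to
   SCRATCH (a fresh temporary register) instead.  Returns the location
   that actually received the loaded value.  */
static struct loc *
toy_expand_load (struct loc *requested, struct loc *scratch, int loaded)
{
  assert (requested != NULL && scratch != NULL);
  struct loc *dest = requested;
  if (requested->kind != LOC_REG)
    {
      scratch->kind = LOC_REG;
      dest = scratch;
    }
  dest->value = loaded;
  return dest;
}

/* Models the bug: the coerced output operand is never compared with the
   lhs, so a MEM lhs is left untouched.  */
static void
expand_buggy (struct loc *lhs, int loaded)
{
  struct loc scratch;
  toy_expand_load (lhs, &scratch, loaded);
}

/* Models the fix: copy the result back when it landed somewhere other
   than the lhs (the analogue of the rtx_equal_p/emit_move_insn pair).  */
static void
expand_fixed (struct loc *lhs, int loaded)
{
  struct loc scratch;
  struct loc *out = toy_expand_load (lhs, &scratch, loaded);
  if (out != lhs)
    lhs->value = out->value;
}
```

In this model, calling expand_buggy on a LOC_MEM location leaves its previous contents in place, which corresponds to "res" never being stored in the testcase, while expand_fixed writes the loaded value through.  A LOC_REG location is written directly in both cases, which matches the bug staying latent when the lhs expands to a register.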