From mboxrd@z Thu Jan 1 00:00:00 1970
From: "jakub at gcc dot gnu.org"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug target/114566] [11/12/13/14 Regression] Misaligned vmovaps when compiling with stack-protector-strong for znver4
Date: Thu, 04 Apr 2024 17:08:16 +0000
X-Bugzilla-Reason: CC
X-Bugzilla-Type: changed
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: gcc
X-Bugzilla-Component: target
X-Bugzilla-Version: 13.2.0
X-Bugzilla-Keywords: wrong-code
X-Bugzilla-Severity: normal
X-Bugzilla-Who: jakub at gcc dot gnu.org
X-Bugzilla-Status: NEW
X-Bugzilla-Priority: P2
X-Bugzilla-Assigned-To: unassigned at gcc dot gnu.org
X-Bugzilla-Target-Milestone: 11.5
X-Bugzilla-Changed-Fields: cc short_desc
Content-Type: text/plain; charset="UTF-8"
X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/
Auto-Submitted: auto-generated

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114566

Jakub Jelinek changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |avieira at gcc dot gnu.org
            Summary|[11/12/13 Regression]       |[11/12/13/14 Regression]
                   |Misaligned vmovaps when     |Misaligned vmovaps when
                   |compiling with              |compiling with
                   |stack-protector-strong for  |stack-protector-strong for
                   |znver4                      |znver4

--- Comment #14 from Jakub Jelinek ---
Ah, it is the
          /*
             The vector size of the epilogue is smaller than that of the main loop
             so the alignment is either the same or lower. This means the dr will
             thus by definition be aligned.  */
          STMT_VINFO_DR_INFO (stmt_vinfo)->base_misaligned = false;
that clears base_misaligned, but somehow nothing forced the higher alignment
on the var before.  And the assumption is just wrong.  In the main loop it is
using 512-bit vectors and we have base_alignment 16, offset_alignment 32, so
for the V16SFmode accesses in the main vectorized loop (the same access as the
one in the vectorized epilogue) vect_compute_data_ref_alignment already gave
up earlier:
  if (drb->offset_alignment < vect_align_c
      || !step_preserves_misalignment_p
      /* We need to know whether the step wrt the vectorized loop is
         negative when computing the starting misalignment below.  */
      || TREE_CODE (drb->step) != INTEGER_CST)
    {
      if (dump_enabled_p ())
        dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
                         "Unknown alignment for access: %T\n", ref);
      return;
    }
and only in the V8SFmode case in the epilogue, because vect_align_c is 32
there rather than 64, does it go further and trigger
  if (base_alignment < vect_align_c)
    {
      unsigned int max_alignment;
      tree base = get_base_for_alignment (drb->base_address, &max_alignment);
      if (max_alignment < vect_align_c
          || !vect_can_force_dr_alignment_p (base,
                                             vect_align_c * BITS_PER_UNIT))
        {
          if (dump_enabled_p ())
            dump_printf_loc (MSG_NOTE, vect_location,
                             "can't force alignment of ref: %T\n", ref);
          return;
        }

      /* Force the alignment of the decl.
         NOTE: This is the only change to the code we make during
         the analysis phase, before deciding to vectorize the loop.
         */
      if (dump_enabled_p ())
        dump_printf_loc (MSG_NOTE, vect_location,
                         "force alignment of %T\n", ref);
      dr_info->base_decl = base;
      dr_info->base_misaligned = true;
      base_misalignment = 0;
    }
So, if we don't want to force higher base alignment just because of some
accesses in the vectorizable epilogue, I think we need to recompute the
alignment/misalignment there as well.

Marking for 14 as well because I believe the trunk commit just made it latent
there rather than fixed.
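
For illustration only, the two quoted checks can be sketched as a small standalone model. This is not GCC code: the DataRef struct, the Outcome enum and classify are hypothetical names, the step-related conditions of the first check are left out, and max_alignment = 64 and can_force = true are assumed values that the comment does not state.

```cpp
#include <cassert>

// Hypothetical, simplified model of the two checks quoted above.
// It only mirrors the comparisons against vect_align_c; it is not GCC code.
enum class Outcome {
  UnknownAlignment,     // first check fails: "Unknown alignment for access"
  CannotForce,          // "can't force alignment of ref"
  ForcedBaseAlignment,  // "force alignment of ...", base_misaligned = true
  AlreadyAligned        // base alignment already sufficient
};

struct DataRef {
  unsigned base_alignment;    // bytes, 16 in the case described above
  unsigned offset_alignment;  // bytes, 32 in the case described above
  unsigned max_alignment;     // bytes, assumed value (not given in the comment)
  bool can_force;             // stands in for vect_can_force_dr_alignment_p
};

constexpr Outcome classify(const DataRef &dr, unsigned vect_align_c) {
  // Step-related conditions of the real check are omitted in this sketch.
  if (dr.offset_alignment < vect_align_c)
    return Outcome::UnknownAlignment;
  if (dr.base_alignment < vect_align_c) {
    if (dr.max_alignment < vect_align_c || !dr.can_force)
      return Outcome::CannotForce;
    return Outcome::ForcedBaseAlignment;
  }
  return Outcome::AlreadyAligned;
}

// Values from the comment: base_alignment 16, offset_alignment 32.
constexpr DataRef example_ref{16, 32, 64, true};

// 512-bit main loop (vect_align_c = 64): analysis gives up early.
static_assert(classify(example_ref, 64) == Outcome::UnknownAlignment, "");
// 256-bit epilogue (vect_align_c = 32): base alignment gets forced.
static_assert(classify(example_ref, 32) == Outcome::ForcedBaseAlignment, "");
```

Under these assumptions only the epilogue asks for the base alignment to be forced. So by the time the epilogue code clears base_misaligned, nothing in the main loop analysis has raised the alignment of the variable.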