From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: by sourceware.org (Postfix, from userid 48) id 61FE63860763; Wed, 14 Feb 2024 20:12:02 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 61FE63860763 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org; s=default; t=1707941522; bh=2QwCtuXIEDlWOXshMBF65i+JaO2MMpz0QsxftCXKw7Q=; h=From:To:Subject:Date:In-Reply-To:References:From; b=jj39shi8wC80MfDImfJaEAkrtEPaJYaL4YWyYtDOO2iWeyzWLqLQjRtO7xlaMBkJM O0ezmdA256JTtFY3F4G2yFdpbl9cWjkPQ+LPhYLd5jskAow67RlKXElQgCeAtMQ9ac yQmbNiR2kTlWmr4BBlbbB3tOR064Bv0gzBdhd+S4= From: "tnfchris at gcc dot gnu.org" To: gcc-bugs@gcc.gnu.org Subject: [Bug middle-end/111156] [14 Regression] aarch64 aarch64/sve/mask_struct_store_4.c failures Date: Wed, 14 Feb 2024 20:11:59 +0000 X-Bugzilla-Reason: CC X-Bugzilla-Type: changed X-Bugzilla-Watch-Reason: None X-Bugzilla-Product: gcc X-Bugzilla-Component: middle-end X-Bugzilla-Version: 14.0 X-Bugzilla-Keywords: testsuite-fail, wrong-code X-Bugzilla-Severity: normal X-Bugzilla-Who: tnfchris at gcc dot gnu.org X-Bugzilla-Status: NEW X-Bugzilla-Resolution: X-Bugzilla-Priority: P1 X-Bugzilla-Assigned-To: unassigned at gcc dot gnu.org X-Bugzilla-Target-Milestone: 14.0 X-Bugzilla-Flags: X-Bugzilla-Changed-Fields: cc Message-ID: In-Reply-To: References: Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/ Auto-Submitted: auto-generated MIME-Version: 1.0 List-Id: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=3D111156 Tamar Christina changed: What |Removed |Added ---------------------------------------------------------------------------- CC| |rguenth at gcc dot gnu.org --- Comment #12 from Tamar Christina --- The commit that caused it is: commit g:a1558e9ad856938f165f838733955b331ebbec09 Author: Richard Biener Date: Wed Aug 23 14:28:26 2023 +0200 tree-optimization/111115 - SLP of masked stores The following adds the capability to do SLP on .MASK_STORE, I do not plan to add interleaving support. specifically this change: diff --git a/gcc/tree-vect-data-refs.cc b/gcc/tree-vect-data-refs.cc index 3e9a284666c..a2caf6cb1c7 100644 --- a/gcc/tree-vect-data-refs.cc +++ b/gcc/tree-vect-data-refs.cc @@ -3048,8 +3048,7 @@ can_group_stmts_p (stmt_vec_info stmt1_info, stmt_vec_info stmt2_info, like those created by build_mask_conversion. */ tree mask1 =3D gimple_call_arg (call1, 2); tree mask2 =3D gimple_call_arg (call2, 2); - if (!operand_equal_p (mask1, mask2, 0) - && (ifn =3D=3D IFN_MASK_STORE || !allow_slp_p)) + if (!operand_equal_p (mask1, mask2, 0) && !allow_slp_p) { mask1 =3D strip_conversion (mask1); if (!mask1) With the change it now incorrectly thinks that the two masks (a <=3D7, a > = 2) are the same which is why one of the masks go missing. Part of it is that the boolean is used in a weird way. During vect_analyze_data_ref_accesses where this difference is important we pass t= rue in the initial check. but the || before made it so that we checked the MASK_STOREs still. Now it means during analysis we never check. later on in the same method we check it again but with false as the argument for determining STMT_VINFO_SLP_VECT_ONLY. The debug statement there is weird btw, as it says: if (dump_enabled_p () && STMT_VINFO_SLP_VECT_ONLY (stmtinfo_a)) dump_printf_loc (MSG_NOTE, vect_location, "Load suitable for SLP vectorization only.\n"); but as far as I can see, stmtinfo_a can be a store too, based on the checks= for DR_IS_READ (dra) just slightly higher up. The patch that added this check (g:997636716c5dde7d59d026726a6f58918069f122) says it's because the vectorizer doesn't support SLP of masked loads, and I can't tell if we do now. If we do, the boolean should be dropped.. if we don't, we probably need the check back to allow the check for stores. It looks like this check us being used to disable STMT_VINFO_SLP_VECT_ONLY for loads, which is a bit counter intuitive and feels like a hack rather than just doing: STMT_VINFO_SLP_VECT_ONLY (stmtinfo_a) - =3D !can_group_stmts_p (stmtinfo_a, stmtinfo_b, false); + =3D !can_group_stmts_p (stmtinfo_a, stmtinfo_b) + && DR_IS_WRITE (dra); So I think the boolean should be dropped and just reject loads for STMT_VINFO_SLP_VECT_ONLY... This also seems to give much better codegen... in any case, richi?=