From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: by sourceware.org (Postfix, from userid 48) id 494723858C78; Mon, 7 Mar 2022 07:18:43 +0000 (GMT)
DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 494723858C78
From: "rguenth at gcc dot gnu.org"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug target/101929] [12 Regression] r12-7319 regress x264_r by 4% on CLX.
Date: Mon, 07 Mar 2022 07:18:43 +0000
X-Bugzilla-Reason: CC
X-Bugzilla-Type: changed
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: gcc
X-Bugzilla-Component: target
X-Bugzilla-Version: 12.0
X-Bugzilla-Keywords: missed-optimization
X-Bugzilla-Severity: normal
X-Bugzilla-Who: rguenth at gcc dot gnu.org
X-Bugzilla-Status: ASSIGNED
X-Bugzilla-Resolution:
X-Bugzilla-Priority: P3
X-Bugzilla-Assigned-To: rguenth at gcc dot gnu.org
X-Bugzilla-Target-Milestone: 12.0
X-Bugzilla-Flags:
X-Bugzilla-Changed-Fields:
Message-ID:
In-Reply-To:
References:
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/
Auto-Submitted: auto-generated
MIME-Version: 1.0
X-BeenThere: gcc-bugs@gcc.gnu.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Gcc-bugs mailing list
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Mon, 07 Mar 2022 07:18:43 -0000

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101929

--- Comment #7 from Richard Biener ---
diff --git a/gcc/tree-vect-slp.cc b/gcc/tree-vect-slp.cc
index 9188d727e33..7f1f12fb6c6 100644
--- a/gcc/tree-vect-slp.cc
+++ b/gcc/tree-vect-slp.cc
@@ -2374,7 +2375,7 @@ fail:
              n_vector_builds++;
            }
        }
-      if (all_uniform_p
+      if ((all_uniform_p && !two_operators)
          || n_vector_builds > 1
          || (n_vector_builds == children.length ()
              && is_a <gphi *> (stmt_info->stmt)))

will re-enable the vectorization - it evades the vec_construct cost bump by
instead using scalar_to_vec (aka splat), which has not yet been fixed to
account for a possible GPR to xmm move (so it would be a temporary "solution"
at best).

Another change to mute the effect somewhat (but not fixing x264) that was
mentioned is

diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc
index b2bf90576d5..acf2cc977b4 100644
--- a/gcc/config/i386/i386.cc
+++ b/gcc/config/i386/i386.cc
@@ -22595,7 +22595,7 @@ ix86_builtin_vectorization_cost (enum vect_cost_for_stmt type_of_cost,
       case vec_construct:
        {
          /* N element inserts into SSE vectors.  */
-         int cost = TYPE_VECTOR_SUBPARTS (vectype) * ix86_cost->sse_op;
+         int cost = (TYPE_VECTOR_SUBPARTS (vectype) - 1) * ix86_cost->sse_op;
          /* One vinserti128 for combining two SSE vectors for AVX256.  */
          if (GET_MODE_BITSIZE (mode) == 256)
            cost += ix86_vec_cost (mode, ix86_cost->addss);

which makes sense, as the initial value of the xmm destination is now costed
separately when it comes from a GPR and is free when it comes from an xmm
register.