From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <gcc-bugzilla@gcc.gnu.org>
Received: by sourceware.org (Postfix, from userid 48)
 id BFE2A3858C74; Mon,  7 Mar 2022 08:22:01 +0000 (GMT)
DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org BFE2A3858C74
From: "crazylht at gmail dot com" <gcc-bugzilla@gcc.gnu.org>
To: gcc-bugs@gcc.gnu.org
Subject: [Bug target/101929] [12 Regression] r12-7319 regress x264_r by 4% on
 CLX.
Date: Mon, 07 Mar 2022 08:22:01 +0000
X-Bugzilla-Reason: CC
X-Bugzilla-Type: changed
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: gcc
X-Bugzilla-Component: target
X-Bugzilla-Version: 12.0
X-Bugzilla-Keywords: missed-optimization
X-Bugzilla-Severity: normal
X-Bugzilla-Who: crazylht at gmail dot com
X-Bugzilla-Status: ASSIGNED
X-Bugzilla-Resolution: 
X-Bugzilla-Priority: P3
X-Bugzilla-Assigned-To: rguenth at gcc dot gnu.org
X-Bugzilla-Target-Milestone: 12.0
X-Bugzilla-Flags: 
X-Bugzilla-Changed-Fields: 
Message-ID: <bug-101929-4-NLAwPG8nSP@http.gcc.gnu.org/bugzilla/>
In-Reply-To: <bug-101929-4@http.gcc.gnu.org/bugzilla/>
References: <bug-101929-4@http.gcc.gnu.org/bugzilla/>
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/
Auto-Submitted: auto-generated
MIME-Version: 1.0
X-BeenThere: gcc-bugs@gcc.gnu.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Gcc-bugs mailing list <gcc-bugs.gcc.gnu.org>
List-Unsubscribe: <https://gcc.gnu.org/mailman/options/gcc-bugs>,
 <mailto:gcc-bugs-request@gcc.gnu.org?subject=unsubscribe>
List-Archive: <https://gcc.gnu.org/pipermail/gcc-bugs/>
List-Post: <mailto:gcc-bugs@gcc.gnu.org>
List-Help: <mailto:gcc-bugs-request@gcc.gnu.org?subject=help>
List-Subscribe: <https://gcc.gnu.org/mailman/listinfo/gcc-bugs>,
 <mailto:gcc-bugs-request@gcc.gnu.org?subject=subscribe>
X-List-Received-Date: Mon, 07 Mar 2022 08:22:01 -0000

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=3D101929
--- Comment #8 from Hongtao.liu <crazylht at gmail dot com> ---
(In reply to Richard Biener from comment #7)
> diff --git a/gcc/tree-vect-slp.cc b/gcc/tree-vect-slp.cc
> index 9188d727e33..7f1f12fb6c6 100644
> --- a/gcc/tree-vect-slp.cc
> +++ b/gcc/tree-vect-slp.cc
> @@ -2374,7 +2375,7 @@ fail:
>                 n_vector_builds++;
>             }
>         }
> -      if (all_uniform_p
> +      if ((all_uniform_p && !two_operators)
>           || n_vector_builds > 1
>           || (n_vector_builds =3D=3D children.length ()
>               && is_a <gphi *> (stmt_info->stmt)))
>=20
>=20
> will re-enable the vectorization - it evades the vect_construct cost bump
> by instead using scalar_to_vec (aka splat) which has not yet been fixed to
> account for a possible gpr to xmm move (so it would be a temporary
> "solution" at best).
>=20
> Another change to mute the effect somewhat (but not fixing x264) that was
> mentioned is
>=20
> diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc
> index b2bf90576d5..acf2cc977b4 100644
> --- a/gcc/config/i386/i386.cc
> +++ b/gcc/config/i386/i386.cc
> @@ -22595,7 +22595,7 @@ ix86_builtin_vectorization_cost (enum
> vect_cost_for_stmt type_of_cost,
>        case vec_construct:
>         {
>           /* N element inserts into SSE vectors.  */
> -         int cost =3D TYPE_VECTOR_SUBPARTS (vectype) * ix86_cost->sse_op;
> +         int cost =3D (TYPE_VECTOR_SUBPARTS (vectype) - 1) *
> ix86_cost->sse_op;

n - 1 is right for a 128-bit vector, but for a 256-bit vector shouldn't it be
n - 2, since we have a separate cost for vinserti128? And n - 4 for a 512-bit
one.
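
To spell out the arithmetic, here is a rough sketch. It is not GCC code: the
helper names are made up for illustration, and it assumes the inserts that
combine 128-bit lanes (vinserti128 and similar) stay costed separately from
the per-element inserts.

```c
#include <assert.h>

/* Hypothetical helpers for illustration only, not part of GCC.  */

/* Number of 128-bit lanes in a vector of VECTOR_BITS bits.  */
static int
num_lanes (int vector_bits)
{
  assert (vector_bits >= 128 && vector_bits % 128 == 0);
  return vector_bits / 128;
}

/* Per-element inserts: a lane holding K elements needs K - 1 inserts,
   so NELTS elements spread over L lanes need NELTS - L.  */
static int
element_inserts (int nelts, int vector_bits)
{
  return nelts - num_lanes (vector_bits);
}

/* Inserts that glue the lanes together, assumed to be charged
   separately from the element inserts: L - 1.  */
static int
lane_combines (int vector_bits)
{
  return num_lanes (vector_bits) - 1;
}
```

Under that assumption the element part is n - 1, n - 2 and n - 4 for 128-,
256- and 512-bit vectors, and the two parts together always add up to n - 1
operations.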