From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <gcc-bugzilla@gcc.gnu.org>
Received: by sourceware.org (Postfix, from userid 48)
 id 5D8E5395202C; Thu, 17 Mar 2022 12:31:32 +0000 (GMT)
DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 5D8E5395202C
From: "rguenth at gcc dot gnu.org" <gcc-bugzilla@gcc.gnu.org>
To: gcc-bugs@gcc.gnu.org
Subject: [Bug target/104912] [12 Regression] 416.gamess regression after
 r12-7612-g69619acd8d9b58
Date: Thu, 17 Mar 2022 12:31:32 +0000
X-Bugzilla-Reason: CC
X-Bugzilla-Type: changed
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: gcc
X-Bugzilla-Component: target
X-Bugzilla-Version: 12.0
X-Bugzilla-Keywords: missed-optimization
X-Bugzilla-Severity: normal
X-Bugzilla-Who: rguenth at gcc dot gnu.org
X-Bugzilla-Status: ASSIGNED
X-Bugzilla-Resolution: 
X-Bugzilla-Priority: P3
X-Bugzilla-Assigned-To: rguenth at gcc dot gnu.org
X-Bugzilla-Target-Milestone: 12.0
X-Bugzilla-Flags: 
X-Bugzilla-Changed-Fields: attachments.created
Message-ID: <bug-104912-4-YtwErmBNHe@http.gcc.gnu.org/bugzilla/>
In-Reply-To: <bug-104912-4@http.gcc.gnu.org/bugzilla/>
References: <bug-104912-4@http.gcc.gnu.org/bugzilla/>
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/
Auto-Submitted: auto-generated
MIME-Version: 1.0
X-BeenThere: gcc-bugs@gcc.gnu.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Gcc-bugs mailing list <gcc-bugs.gcc.gnu.org>
List-Unsubscribe: <https://gcc.gnu.org/mailman/options/gcc-bugs>,
 <mailto:gcc-bugs-request@gcc.gnu.org?subject=unsubscribe>
List-Archive: <https://gcc.gnu.org/pipermail/gcc-bugs/>
List-Post: <mailto:gcc-bugs@gcc.gnu.org>
List-Help: <mailto:gcc-bugs-request@gcc.gnu.org?subject=help>
List-Subscribe: <https://gcc.gnu.org/mailman/listinfo/gcc-bugs>,
 <mailto:gcc-bugs-request@gcc.gnu.org?subject=subscribe>
X-List-Received-Date: Thu, 17 Mar 2022 12:31:32 -0000

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104912
--- Comment #6 from Richard Biener <rguenth at gcc dot gnu.org> ---
Created attachment 52640
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=52640&action=edit
patch

Like this: the patch counts the number of vector stmts and the number of
strided loads/stores in the loop body and then, when finishing up, does:

+void
+ix86_vector_costs::finish_cost (const vector_costs *scalar_costs)
+{
+  m_finished = true;
+  if (m_costing_for_scalar)
+    return;
+
+  /* When we have more than one strided load or store and the
+     number of strided stores is high compared to all vector
+     stmts in the body we require at least an estimated
+     improvement due to the vectorization of a factor of two.  */
+  if (m_n_body_strided_load_store > 1
+      && m_n_body_stmts / m_n_body_strided_load_store < 4)
+    {
+      unsigned vf = 1;
+      if (is_a <loop_vec_info> (m_vinfo))
+       vf = vect_vf_for_cost (as_a <loop_vec_info> (m_vinfo));
+      if (scalar_costs->prologue_cost () * vf < 2 * body_cost ())
+       m_costs[vect_body] *= 2;
+    }
+}


The scaling of m_costs[vect_body] will make the vectorization unprofitable.
Instead of a hard limit like this we could also scale the strided load
cost based on the overall number of them, for example by adding
m_n_body_strided_load_store squared to the cost.
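
For illustration only, the two variants described above can be compared in a
small standalone sketch.  This is not GCC code: toy_costs, hard_limit_cost and
squared_penalty_cost are hypothetical names, the struct merely stands in for
the counters the patch keeps in ix86_vector_costs, and scalar_cost is an
assumed stand-in for whatever scalar cost figure the check compares against.

```cpp
// Hypothetical stand-in for the counters kept by the patch; not GCC code.
struct toy_costs
{
  unsigned n_body_stmts;               // vector stmts in the loop body
  unsigned n_body_strided_load_store;  // strided loads/stores among them
  unsigned scalar_cost;                // assumed scalar cost figure
  unsigned vector_body_cost;           // cost of the vectorized body
};

// Hard-limit variant: if more than one strided access is present, they make
// up a large share of the body, and the estimated gain is below a factor of
// two, double the body cost so the vectorization looks unprofitable.
unsigned
hard_limit_cost (const toy_costs &c, unsigned vf)
{
  unsigned cost = c.vector_body_cost;
  // The "> 1" test comes first, so the division never sees a zero divisor.
  if (c.n_body_strided_load_store > 1
      && c.n_body_stmts / c.n_body_strided_load_store < 4
      && c.scalar_cost * vf < 2 * cost)
    cost *= 2;
  return cost;
}

// Alternative variant: no threshold, just add the squared number of strided
// loads/stores to the body cost.
unsigned
squared_penalty_cost (const toy_costs &c)
{
  return c.vector_body_cost
	 + c.n_body_strided_load_store * c.n_body_strided_load_store;
}
```

With 6 body stmts, 2 strided accesses, a scalar cost of 10, a vector body cost
of 60 and vf = 4, the hard limit doubles the cost to 120 while the squared
penalty only yields 64.  With 20 body stmts instead of 6 the hard limit leaves
the cost at 60.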

Note that the "true" cost would only be visible with a scheduling model
that takes dependences into account.  For this particular case it is all
hand-waving anyway, since the true cost is the versioning/branching
overhead, not the vectorized loop body, and the low number of iterations
makes this particularly visible.  So for 416.gamess it will all be a hack...