From: "tnfchris at gcc dot gnu.org"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug middle-end/115597] New: [15 Regression] vectorizer takes 20+ h compiling 510.parest in SPECCPU2017 since g:46bb4ce4d30ab749d40f6f4cef6f1fb7c7813452
Date: Sun, 23 Jun 2024 08:24:29 +0000

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115597

Bug ID: 115597
Summary: [15 Regression] vectorizer takes 20+ h compiling 510.parest in SPECCPU2017 since g:46bb4ce4d30ab749d40f6f4cef6f1fb7c7813452
Product: gcc
Version: 15.0
Status: UNCONFIRMED
Keywords: compile-time-hog
Severity: normal
Priority: P3
Component: middle-end
Assignee: unassigned at gcc dot gnu.org
Reporter: tnfchris at gcc dot gnu.org
CC: rguenth at gcc dot gnu.org
Target Milestone: ---
Target: aarch64*

Created attachment 58496
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=58496&action=edit
slp dump graph

Since:

commit 46bb4ce4d30ab749d40f6f4cef6f1fb7c7813452 (HEAD)
Author: Richard Biener
Date:   Wed Jun 19 12:57:27 2024 +0200

    tree-optimization/114413 - SLP CSE after permute optimization

    We currently fail to re-CSE SLP nodes after optimizing permutes
    which results in off cost estimates.  For gcc.dg/vect/bb-slp-32.c
    this shows in not re-using the SLP node with the load and arithmetic
    for both the store and the reduction.

    The following implements CSE by re-bst-mapping nodes as finalization
    part of vect_optimize_slp.  I've tried to make the CSE part of permute
    materialization but it isn't a very good fit there.  I've not bothered
    to implement something more complete, also handling external defs or
    defs without SLP_TREE_SCALAR_STMTS.

    I realize this might result in more BB SLP which in turn might slow
    down code given costing for BB SLP is difficult (even that we now
    vectorize gcc.dg/vect/bb-slp-32.c on x86_64 might be not a good idea).
    This is nevertheless feeding more accurate info to costing which is
    good.

            PR tree-optimization/114413
            * tree-vect-slp.cc (release_scalar_stmts_to_slp_tree_map): New
            function, split out from ...
            (vect_analyze_slp): ... here.  Call it.
            (vect_cse_slp_nodes): New function.
            (vect_optimize_slp): Call it.

            * gcc.dg/vect/bb-slp-32.c: Expect CSE and vectorization on x86.

Compilation takes an extremely long time in 510.parest_r. The problem seems to
be that vect_cse_slp_nodes visits the same nodes twice.

It looks like the function has no visited set, and the hot loop in parest
(when vectorizable thanks to libmvec) has many TWO_OPERANDS nodes and one of
them is rooted at the top level.

vect_cse_slp_nodes seems to skip VEC_PERM_EXPR but not its children, as such
it ends up visiting the same subgraphs multiple times. The graph in parest has
so many TWO_OPERAND nodes that essentially compilation never finishes.

I believe this function needs a visited node set.

example call graph:

#334 0x00000000018a1e14 in vect_cse_slp_nodes (bst_map=0x41627a0, node=@0x4132f40: 0x3df2ec0)
    at /opt/buildAgent/work/5c94c4ced6ebfcd0/gcc/tree-vect-slp.cc:6111
#335 0x00000000018a1e14 in vect_cse_slp_nodes (bst_map=0x41627a0, node=@0x41321a0: 0x3df2b90)
    at /opt/buildAgent/work/5c94c4ced6ebfcd0/gcc/tree-vect-slp.cc:6111
#336 0x00000000018a1e14 in vect_cse_slp_nodes (bst_map=0x41627a0, node=@0x4130b00: 0x3df2860)
    at /opt/buildAgent/work/5c94c4ced6ebfcd0/gcc/tree-vect-slp.cc:6111
#337 0x00000000018a1e14 in vect_cse_slp_nodes (bst_map=0x41627a0, node=@0x41348a0: 0x3df2530)
    at /opt/buildAgent/work/5c94c4ced6ebfcd0/gcc/tree-vect-slp.cc:6111
#338 0x00000000018a1e14 in vect_cse_slp_nodes (bst_map=0x41627a0, node=@0x3b8b0d0: 0x3df2310)
    at /opt/buildAgent/work/5c94c4ced6ebfcd0/gcc/tree-vect-slp.cc:6111
#339 0x00000000018a1e14 in vect_cse_slp_nodes (bst_map=0x41627a0, node=@0x41348f0: 0x3dee928)
    at /opt/buildAgent/work/5c94c4ced6ebfcd0/gcc/tree-vect-slp.cc:6111
#340 0x00000000018a1e14 in vect_cse_slp_nodes (bst_map=0x41627a0, node=@0x4134500: 0x3dee460)
    at /opt/buildAgent/work/5c94c4ced6ebfcd0/gcc/tree-vect-slp.cc:6111
#341 0x00000000018a1e14 in vect_cse_slp_nodes (bst_map=0x41627a0, node=@0x3c14600: 0x3ded690)
    at /opt/buildAgent/work/5c94c4ced6ebfcd0/gcc/tree-vect-slp.cc:6111
#342 0x00000000018a1e14 in vect_cse_slp_nodes (bst_map=0x41627a0, node=@0x3ca75f0: 0x3de7910)
    at /opt/buildAgent/work/5c94c4ced6ebfcd0/gcc/tree-vect-slp.cc:6111
#343 0x00000000018a1e14 in vect_cse_slp_nodes (bst_map=0x41627a0, node=@0x3e28590: 0x3de8768)
    at /opt/buildAgent/work/5c94c4ced6ebfcd0/gcc/tree-vect-slp.cc:6111
#344 0x00000000018a1e14 in vect_cse_slp_nodes (bst_map=0x41627a0, node=@0x3c2e4b8: 0x3de7778)
    at /opt/buildAgent/work/5c94c4ced6ebfcd0/gcc/tree-vect-slp.cc:6111
#345 0x00000000018a1e14 in vect_cse_slp_nodes (bst_map=0x41627a0, node=@0x3da5e58: 0x3de7dd8)
    at /opt/buildAgent/work/5c94c4ced6ebfcd0/gcc/tree-vect-slp.cc:6111
#346 0x00000000018a1e14 in vect_cse_slp_nodes (bst_map=0x41627a0, node=@0x41d0770: 0x3de7f70)
    at /opt/buildAgent/work/5c94c4ced6ebfcd0/gcc/tree-vect-slp.cc:6111
#347 0x00000000018a1f20 in vect_optimize_slp (vinfo=0x3e291c0)
    at /opt/buildAgent/work/5c94c4ced6ebfcd0/gcc/tree-vect-slp.cc:6128
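
To illustrate why a missing visited set can blow up like this, here is a minimal standalone sketch. It is not the GCC code: Node, walk_naive, walk_visited and make_chain are hypothetical names made up for this example, and the graph is an artificial chain in which every node has two operands that both point at the same child, as a rough stand-in for a graph with many shared two-operand subgraphs.

```cpp
#include <cstddef>
#include <memory>
#include <unordered_set>
#include <vector>

// Hypothetical stand-in for an SLP node: only the child links matter here.
struct Node {
  std::vector<Node *> children;
};

// Recursive walk without a visited set. A shared child is walked again
// for every path that reaches it, so the cost is the number of paths.
std::size_t walk_naive(const Node *n) {
  std::size_t visits = 1;
  for (const Node *c : n->children)
    visits += walk_naive(c);
  return visits;
}

// Same walk with a visited set. Each node is processed at most once,
// so the cost is the number of nodes.
std::size_t walk_visited(const Node *n,
                         std::unordered_set<const Node *> &seen) {
  if (!seen.insert(n).second)
    return 0;  // already handled through another parent
  std::size_t visits = 1;
  for (const Node *c : n->children)
    visits += walk_visited(c, seen);
  return visits;
}

// Build one leaf plus `depth` nodes stacked on top of it, where each
// node has two operands that are both the node directly below it.
// The last element of the returned vector is the root.
std::vector<std::unique_ptr<Node>> make_chain(int depth) {
  std::vector<std::unique_ptr<Node>> nodes;
  nodes.push_back(std::make_unique<Node>());  // leaf
  for (int i = 0; i < depth; ++i) {
    auto n = std::make_unique<Node>();
    Node *child = nodes.back().get();
    n->children = {child, child};
    nodes.push_back(std::move(n));
  }
  return nodes;
}
```

With make_chain(20) the graph has only 21 nodes, but walk_naive on the root makes 2^21 - 1 = 2097151 calls, while walk_visited makes 21. Each extra level of sharing doubles the naive cost, which would be consistent with a compile that effectively never finishes on a deep graph.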