From: "rguenth at gcc dot gnu.org"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug tree-optimization/110221] With AVX512 fully masking gfortran.dg/pr68146.f ICEs
Date: Fri, 10 Nov 2023 11:34:43 +0000
X-Bugzilla-Reason: CC
X-Bugzilla-Type: changed
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: gcc
X-Bugzilla-Component: tree-optimization
X-Bugzilla-Version: 14.0
X-Bugzilla-Keywords: ice-on-valid-code
X-Bugzilla-Severity: normal
X-Bugzilla-Who: rguenth at gcc dot gnu.org
X-Bugzilla-Status: NEW
X-Bugzilla-Priority: P3
X-Bugzilla-Assigned-To: unassigned at gcc dot gnu.org
X-Bugzilla-Target-Milestone: ---
X-Bugzilla-Changed-Fields: everconfirmed keywords bug_status cf_reconfirmed_on cc blocked
X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110221

Richard Biener changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
     Ever confirmed|0                           |1
           Keywords|                            |ice-on-valid-code
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2023-11-10
                 CC|                            |rsandifo at gcc dot gnu.org
             Blocks|                            |53947

--- Comment #2 from Richard Biener ---
So in this case the stmt requiring the loop mask is only "indirectly"
invariant, as the mask itself is inside of the loop but with invariant
operands.

What works is to avoid scheduling internal def vectorized stmts outside of
the loop.  That then leaves any possible invariant motion to the LIM pass,
at least when no loop masking/len is required.

So I'm testing the following.

diff --git a/gcc/tree-vect-slp.cc b/gcc/tree-vect-slp.cc
index 3e5814c3a31..80e279d8f50 100644
--- a/gcc/tree-vect-slp.cc
+++ b/gcc/tree-vect-slp.cc
@@ -9081,6 +9081,16 @@ vect_schedule_slp_node (vec_info *vinfo,
       /* Emit other stmts after the children vectorized defs which is
          earliest possible.  */
       gimple *last_stmt = NULL;
+      if (auto loop_vinfo = dyn_cast <loop_vec_info> (vinfo))
+        if (LOOP_VINFO_FULLY_MASKED_P (loop_vinfo)
+            || LOOP_VINFO_FULLY_WITH_LENGTH_P (loop_vinfo))
+          {
+            /* But avoid scheduling internal defs outside of the loop when
+               we might have only implicitly tracked loop mask/len defs.  */
+            gimple_stmt_iterator si
+              = gsi_after_labels (LOOP_VINFO_LOOP (loop_vinfo)->header);
+            last_stmt = *si;
+          }
       bool seen_vector_def = false;
       FOR_EACH_VEC_ELT (SLP_TREE_CHILDREN (node), i, child)
         if (SLP_TREE_DEF_TYPE (child) == vect_internal_def)

Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53947
[Bug 53947] [meta-bug] vectorizer missed-optimizations
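
For readers unfamiliar with the scheduling code, the idea behind the patch above can be sketched with a toy model.  This is not GCC code: ToyLoop, insertion_point and the integer "positions" are hypothetical stand-ins, and the real function compares gimple stmts rather than integers.  The sketch only shows the effect of seeding last_stmt with the first stmt of the loop header: the insertion point can then never fall before the loop, even when all children defs are outside of it.

```cpp
#include <algorithm>
#include <vector>

// Toy model only (assumption, not the GCC data structures): a stmt is
// identified by its position in a linear instruction stream, where a
// larger value means "later".
struct ToyLoop {
  int header_pos;     // position of the first stmt after labels in the header
  bool fully_masked;  // stands in for the fully-masked / with-length checks
};

// Return the position after which a vectorized stmt would be emitted:
// the latest of its children's defs, with the loop header as a lower
// bound when the loop uses masking or lengths.
int insertion_point(const std::vector<int> &child_def_pos,
                    const ToyLoop &loop) {
  int last = -1;  // -1 models last_stmt == NULL
  if (loop.fully_masked)
    last = loop.header_pos;  // the seeding step the patch adds
  for (int pos : child_def_pos)
    last = std::max(last, pos);
  return last;
}
```

With children defs at positions 2 and 5 and the loop header at 10, the unseeded variant picks 5 (before the loop), while the seeded variant picks 10 (inside the loop), where an implicitly tracked loop mask would be available.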