From: "rguenth at gcc dot gnu.org"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug tree-optimization/113358] OpenMP inhibits vectorization
Date: Mon, 15 Jan 2024 08:16:56 +0000
X-Bugzilla-Product: gcc
X-Bugzilla-Component: tree-optimization
X-Bugzilla-Version: 13.2.0
X-Bugzilla-Keywords: missed-optimization
X-Bugzilla-Severity: normal
X-Bugzilla-Status: NEW
X-Bugzilla-Priority: P3

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113358

Richard Biener changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |rguenth at gcc dot gnu.org

--- Comment #3 from Richard Biener ---
The issue with block.c is

Analyzing loop at block.c:22
block.c:22:39: note: === analyze_loop_nest ===
block.c:22:39: note: === vect_analyze_loop_form ===
block.c:22:39: note: === get_loop_niters ===
block.c:22:39: missed: not vectorized: number of iterations cannot be computed.
block.c:22:39: missed: bad loop form.
block.c:22:39: missed: couldn't vectorize loop

We fail to compute an expression for the number of scalar iterations in the
innermost loop.  That's because we have 'j < J + BLOCK && j < n' as the
terminating condition.

I suspect that the blocking should peel the case where J + BLOCK > n,
basically

 if (J + BLOCK > n || I + BLOCK > n)
   {
      ... blocking nest with < n exit condition
   }
 else
   {
      ... blocking nest with < {J,I} + BLOCK exit condition
   }

The vectorizer (or rather niter analysis) could try to recover in a similar
way by using 'assumptions' - basically we can compute the number of
iterations to be BLOCK if we assume that J + BLOCK <= n.  The exit condition
looks like

  _145 = J_86 + 999;
  ...
  [local count: 958878294]:
  # j_88 = PHI ...
  j_58 = j_88 + 1;
  _63 = n_49(D) > j_58;
  _64 = j_58 <= _145;
  _65 = _63 & _64;
  if (_65 != 0)

We could try to pattern-match this NE_EXPR (we need to choose which
condition we use as the assumption and which to base the niters on).

Another possibility would be (I think this came up in another bug report
as well) to use j < MIN (J + BLOCK, n).  The following source modification
works:

  for (int i = I; i < I + BLOCK && i < n; i++) {
    int m = J + BLOCK > n ? n : J + BLOCK;
    for (int j = J; j < m; j++) {

Whether it's a generally profitable transform or should be matched again
only during niter analysis I'm not sure (if the MIN is loop invariant
and this is an exit condition it surely is profitable).
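
For illustration, here is a minimal C sketch of the three loop shapes
discussed in this comment.  block.c itself is not shown here, so the kernel
(a tile sum over a row-major n x n array), the function names and the
BLOCK value of 4 are all assumptions made up for the example.  It only shows
that the variants visit the same elements; it makes no claim about what any
particular GCC version does with them.

```c
#include <assert.h>

enum { BLOCK = 4 };  /* hypothetical tile size, chosen for the example */

/* Original shape: the inner exit test is 'j < J + BLOCK && j < n',
   the two-part condition that niter analysis reports it cannot handle. */
long tile_sum_dual(const int *a, int n, int I, int J)
{
    assert(n > 0 && I >= 0 && J >= 0);
    long sum = 0;
    for (int i = I; i < I + BLOCK && i < n; i++)
        for (int j = J; j < J + BLOCK && j < n; j++)
            sum += a[i * n + j];
    return sum;
}

/* MIN shape: the inner bound is hoisted as MIN (J + BLOCK, n), so the
   inner loop has a single, loop-invariant upper bound. */
long tile_sum_min(const int *a, int n, int I, int J)
{
    assert(n > 0 && I >= 0 && J >= 0);
    long sum = 0;
    for (int i = I; i < I + BLOCK && i < n; i++) {
        int m = J + BLOCK > n ? n : J + BLOCK;
        for (int j = J; j < m; j++)
            sum += a[i * n + j];
    }
    return sum;
}

/* Peeled shape: full tiles take a nest whose bounds depend only on BLOCK.
   Assumption: the edge branch conservatively keeps both conditions, since
   only one of the two dimensions may be clipped by n. */
long tile_sum_peeled(const int *a, int n, int I, int J)
{
    assert(n > 0 && I >= 0 && J >= 0);
    long sum = 0;
    if (J + BLOCK > n || I + BLOCK > n) {
        for (int i = I; i < I + BLOCK && i < n; i++)
            for (int j = J; j < J + BLOCK && j < n; j++)
                sum += a[i * n + j];
    } else {
        for (int i = I; i < I + BLOCK; i++)
            for (int j = J; j < J + BLOCK; j++)
                sum += a[i * n + j];
    }
    return sum;
}
```

Calling all three on the same array with the same (I, J) should return the
same sum for full tiles and for tiles clipped by n.  To see how a given
compiler treats each shape, the vectorizer's missed-optimization notes
(for example from -fopt-info-vec-missed) can be compared per function.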