From: "rguenth at gcc dot gnu.org"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug middle-end/113205] [14 Regression] internal compiler error: in backward_pass, at tree-vect-slp.cc:5346 since r14-3220
Date: Wed, 10 Jan 2024 08:12:45 +0000

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113205

Richard Biener changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |rguenth at gcc dot gnu.org,
                   |                            |rsandifo at gcc dot gnu.org
           See Also|                            |https://gcc.gnu.org/bugzill
                   |                            |a/show_bug.cgi?id=110935

--- Comment #4 from Richard Biener ---
OK, so this should already reproduce before the change when removing the
invariant add (p + 8000).
The issue seems to be that SLP build ends up with an unsupported load
permutation when we try with V2SImode vectorization after V4SImode is
scrapped because of cost issues.  We have

t.c:18:10: note: node 0x6471a48 (max_nunits=2, refcnt=2) vector(2) int
t.c:18:10: note: op template: _3 = MEM[(int *)i.0_1 + 4B];
t.c:18:10: note: stmt 0 _3 = MEM[(int *)i.0_1 + 4B];
t.c:18:10: note: stmt 1 _5 = MEM[(int *)i.0_1 + 12B];
t.c:18:10: note: stmt 2 _4 = MEM[(int *)i.0_1 + 8B];
t.c:18:10: note: stmt 3 _2 = *i.0_1;
t.c:18:10: note: load permutation { 1 3 2 0 }

I'm not sure whether that's a supported situation.  Changing the code to be
more graceful like

diff --git a/gcc/tree-vect-slp.cc b/gcc/tree-vect-slp.cc
index b6cce55ce90..a12214bc1ad 100644
--- a/gcc/tree-vect-slp.cc
+++ b/gcc/tree-vect-slp.cc
@@ -5343,8 +5343,8 @@ vect_optimize_slp_pass::backward_pass ()
            }
        }
 
-      gcc_assert (min_layout_cost.is_possible ());
-      partition.layout = min_layout_i;
+      if (min_layout_cost.is_possible ())
+       partition.layout = min_layout_i;
     }
 }

then yields

t.c:18:10: note: SLP optimize permutations:
t.c:18:10: note: 1: { 1, 3, 2, 0 }
t.c:18:10: note: SLP optimize partitions:
t.c:18:10: note: -------------
t.c:18:10: note: partition 0 (layout 0):
t.c:18:10: note: nodes:
t.c:18:10: note: - 0x5f0d9b0:
t.c:18:10: note: weight: 1.000000
t.c:18:10: note: out weight: 1.000000 (degree 1)
t.c:18:10: note: op template: _20 = (int) _19;
t.c:18:10: note: edges:
t.c:18:10: note: - 0x5f0d9b0 --> [2] 0x5f0d928
t.c:18:10: note: layout 0: rejected
t.c:18:10: note: layout 1: rejected
t.c:18:10: note: -------------
t.c:18:10: note: partition 1 (layout 1):
t.c:18:10: note: nodes:
t.c:18:10: note: - 0x5f0da38:
t.c:18:10: note: weight: 1.000000
t.c:18:10: note: out weight: 1.000000 (degree 1)
t.c:18:10: note: op template: _3 = MEM[(int *)i.0_1 + 4B];
t.c:18:10: note: edges:
t.c:18:10: note: - 0x5f0da38 --> [2] 0x5f0d928
t.c:18:10: note: layout 0: rejected
t.c:18:10: note: layout 1: rejected
t.c:18:10: note: -------------
t.c:18:10: note: partition 2 (layout 1):
t.c:18:10: note: nodes:
t.c:18:10: note: - 0x5f0d928:
t.c:18:10: note: weight: 1.000000
t.c:18:10: note: out weight: 1.000000 (degree 1)
t.c:18:10: note: op template: _21 = _3 * _20;
t.c:18:10: note: edges:
t.c:18:10: note: - 0x5f0d928 --> [3] 0x5f0d8a0
t.c:18:10: note: - 0x5f0d9b0 [0] --> 0x5f0d928
t.c:18:10: note: - 0x5f0da38 [1] --> 0x5f0d928
t.c:18:10: note: layout 0: rejected
t.c:18:10: note: layout 1: rejected
t.c:18:10: note: -------------
t.c:18:10: note: partition 3 (layout 1):
t.c:18:10: note: nodes:
t.c:18:10: note: - 0x5f0d8a0:
t.c:18:10: note: weight: 1.000000
t.c:18:10: note: op template: _22 = (unsigned int) _21;
t.c:18:10: note: edges:
t.c:18:10: note: - 0x5f0d928 [2] --> 0x5f0d8a0
t.c:18:10: note: layout 0:
t.c:18:10: note: {depth: 1.000000, total: 1.000000}
t.c:18:10: note: + {depth: 0.000000, total: 0.000000}
t.c:18:10: note: + {depth: 0.000000, total: 0.000000}
t.c:18:10: note: = {depth: 1.000000, total: 1.000000}
t.c:18:10: note: layout 1: (*)
t.c:18:10: note: {depth: 0.000000, total: 0.000000}
t.c:18:10: note: + {depth: 0.000000, total: 0.000000}
t.c:18:10: note: + {depth: 0.000000, total: 0.000000}
t.c:18:10: note: = {depth: 0.000000, total: 0.000000}
t.c:18:10: note: inserting permutation node in place of 0x5f0d9b0
t.c:18:10: note: recording new base alignment for i.0_1
...
t.c:18:10: note: vectorizing permutation op0[3] op0[0] op0[2] op0[1]
t.c:18:10: note: vectorizing permutation op0[3] op0[0] op0[2] op0[1]
t.c:18:10: note: as vops0[1][1] vops0[0][0], vops0[1][0] vops0[0][1]
t.c:18:10: missed: unsupported vect permute { 1 2 }
t.c:18:10: note: Building vector operands of 0x5f0db48 from scalars instead
...
t.c:18:10: note: removing SLP instance operations starting from: _25 = _24 + _40;
t.c:18:10: missed: not vectorized: bad operation in basic block.
t.c:18:10: note: ***** Analysis failed with vector mode V8QI
t.c:18:10: note: ***** Re-trying analysis with vector mode V4QI

and the ICE is gone.  I'm not sure if we can "recover" in this way or whether
leaving partition.layout unchanged could lead to wrong-code if it were
actually possible to code generate it, thus whether it's really the inability
to generate the permute that triggers this issue.

Related to PR110935, with -Ofast we should elide the unsupported permute.
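
To make the numbers in the dumps above easier to follow, here is a minimal
standalone C++ sketch.  It is not GCC code: the helper names load_permutation,
invert and split_lane are made up for this note, and it assumes 4-byte int
elements and two-lane vectors as in V2SImode.  It shows how the byte offsets
4, 12, 8, 0 of the four loads give the load permutation { 1 3 2 0 }, that the
inserted permutation op0[3] op0[0] op0[2] op0[1] is numerically the inverse of
that load permutation (whether the optimizer derives it exactly that way is an
assumption here), and how a lane index splits into the vops0[vector][lane]
pairs printed in the dump.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helper (not a GCC function): map each scalar load's byte
// offset from the group's base address to the element index it reads.
std::vector<unsigned> load_permutation(const std::vector<unsigned>& byte_offsets,
                                       unsigned elem_size)
{
    assert(elem_size > 0);
    std::vector<unsigned> perm;
    perm.reserve(byte_offsets.size());
    for (unsigned off : byte_offsets) {
        assert(off % elem_size == 0);  // loads must be element-aligned
        perm.push_back(off / elem_size);
    }
    return perm;
}

// Hypothetical helper: invert a permutation, so that inv[perm[i]] == i.
std::vector<unsigned> invert(const std::vector<unsigned>& perm)
{
    std::vector<unsigned> inv(perm.size());
    for (std::size_t i = 0; i < perm.size(); ++i) {
        assert(perm[i] < perm.size());
        inv[perm[i]] = static_cast<unsigned>(i);
    }
    return inv;
}

// Hypothetical helper: split a scalar lane index into
// (vector number, lane within that vector) for vectors of nunits lanes.
std::pair<unsigned, unsigned> split_lane(unsigned lane, unsigned nunits)
{
    assert(nunits > 0);
    return {lane / nunits, lane % nunits};
}
```

For example, load_permutation({4, 12, 8, 0}, 4) gives { 1, 3, 2, 0 }, invert
of that gives { 3, 0, 2, 1 }, and split_lane(3, 2) gives (1, 1), which matches
the first entry vops0[1][1] in the dump.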