From mboxrd@z Thu Jan  1 00:00:00 1970
Received: by sourceware.org (Postfix, from userid 48)
	id 859B2385801A; Tue, 23 Feb 2021 16:40:33 +0000 (GMT)
DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 859B2385801A
From: "tnfchris at gcc dot gnu.org"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug tree-optimization/99220] New: [11 Regression] ICE during vectorization when multiple instances do the same calculation but have different num lanes
Date: Tue, 23 Feb 2021 16:40:33 +0000
X-Bugzilla-Reason: CC
X-Bugzilla-Type: new
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: gcc
X-Bugzilla-Component: tree-optimization
X-Bugzilla-Version: 11.0
X-Bugzilla-Keywords: ice-on-valid-code
X-Bugzilla-Severity: normal
X-Bugzilla-Who: tnfchris at gcc dot gnu.org
X-Bugzilla-Status: UNCONFIRMED
X-Bugzilla-Resolution:
X-Bugzilla-Priority: P3
X-Bugzilla-Assigned-To: tnfchris at gcc dot gnu.org
X-Bugzilla-Target-Milestone: ---
X-Bugzilla-Flags:
X-Bugzilla-Changed-Fields: bug_id short_desc product version bug_status keywords bug_severity priority component assigned_to reporter target_milestone cf_gcctarget
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/
Auto-Submitted: auto-generated
MIME-Version: 1.0
X-BeenThere: gcc-bugs@gcc.gnu.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Gcc-bugs mailing list
X-List-Received-Date: Tue, 23 Feb 2021 16:40:33 -0000

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99220

            Bug ID: 99220
           Summary: [11 Regression] ICE during vectorization when multiple
                    instances do the same calculation but have different num
                    lanes
           Product: gcc
           Version: 11.0
            Status: UNCONFIRMED
          Keywords: ice-on-valid-code
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: tnfchris at gcc dot gnu.org
          Reporter: tnfchris at gcc dot gnu.org
  Target Milestone: ---
            Target: aarch64-*

The following testcase

class a {
  float b;
  float c;

public:
  a(float d, float e) : b(d), c(e) {}
  a operator+(a d) { return a(b + d.b, c + d.c); }
  a operator-(a d) { return a(b - d.b, c - d.c); }
  a operator*(a d) { return a(b * b - c * c, b * c + c * d.b); }
};
long f;
a *g;
class {
  a *h;
  long i;
  a *j;

public:
  void k() {
    a l = h[0], m = g[i], n = l * g[1], o = l * j[8];
    g[i] = m + n;
    g[i + 1] = m - n;
    j[f] = o;
  }
} p;
main() { p.k(); }

crashes with

aarch64-none-elf-g++ -w -march=armv8.3-a -O3 -S main.cpp

because two nodes end up with the same pointer. In the loop that analyzes
all the instances, optimize_load_redistribution_1 does

      if (value)
        {
          SLP_TREE_REF_COUNT (value)++;
          SLP_TREE_CHILDREN (root)[i] = value;
          vect_free_slp_tree (node);
        }

when doing a replacement. When this is done and the refcount for the node
reaches 0, the node is freed, which allows libc to return the same pointer
again on the next call to new, which it does.

First instance

note:   node 0x5325f48 (max_nunits=1, refcnt=2)
note:   op: VEC_PERM_EXPR
note:     { }
note:     lane permutation { 0[0] 1[1] 0[2] 1[3] }
note:     children 0x5325db0 0x5325200

Second instance

note:   node 0x5325f48 (max_nunits=1, refcnt=1)
note:   op: VEC_PERM_EXPR
note:     { }
note:     lane permutation { 0[0] 1[1] }
note:     children 0x53255b8 0x5325530

This will end up with the illegal construction of

note:   node 0x53258e8 (max_nunits=2, refcnt=2)
note:   op template: slp_patt_57 = .COMPLEX_MUL (_16, _16);
note:     stmt 0 _16 = _14 - _15;
note:     stmt 1 _23 = _17 + _22;
note:     children 0x53257d8 0x5325d28
note:   node 0x53257d8 (max_nunits=2, refcnt=3)
note:   op template: l$b_4 = MEM[(const struct a &)_3].b;
note:     stmt 0 l$b_4 = MEM[(const struct a &)_3].b;
note:     stmt 1 l$c_5 = MEM[(const struct a &)_3].c;
note:     load permutation { 0 1 }
note:   node 0x5325d28 (max_nunits=2, refcnt=8)
note:   op template: l$b_4 = MEM[(const struct a &)_3].b;
note:     stmt 0 l$b_4 = MEM[(const struct a &)_3].b;
note:     stmt 1 l$c_5 = MEM[(const struct a &)_3].c;
note:     stmt 2 l$b_4 = MEM[(const struct a &)_3].b;
note:     stmt 3 l$c_5 = MEM[(const struct a &)_3].c;
note:     load permutation { 0 1 0 1 }

To prevent this we need to add these temporary VEC_PERM_EXPR nodes to the
bst_map cache and increase their refcnt by one more.
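
For illustration only, the address-reuse hazard and the effect of the extra
reference can be sketched outside of GCC. Nothing below is vectorizer code:
Node, Pool and stale_hit are hypothetical names, and the pool deliberately
hands back the most recently freed slot first so that the reuse is
deterministic (a real allocator may or may not behave this way).

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_set>
#include <vector>

// Hypothetical stand-in for an SLP node; only refcount and lane count matter.
struct Node {
    int refcnt = 0;
    int lanes = 0;
};

// Assumed toy allocator: a fixed pool with a LIFO free list, so a freed
// slot is the next one handed out.
class Pool {
    std::vector<Node> slots_;   // never resized, so addresses stay stable
    std::vector<Node*> free_;
public:
    explicit Pool(std::size_t n) : slots_(n) {
        for (auto it = slots_.rbegin(); it != slots_.rend(); ++it)
            free_.push_back(&*it);
    }
    Node* alloc(int lanes) {
        assert(!free_.empty());
        Node* p = free_.back();
        free_.pop_back();
        p->refcnt = 1;
        p->lanes = lanes;
        return p;
    }
    // Drop one reference; the slot becomes reusable when the count hits 0.
    void unref(Node* p) {
        if (--p->refcnt == 0)
            free_.push_back(p);
    }
};

// Returns true when a freshly allocated node is wrongly treated as
// "already seen" because it reuses the address of a freed one.
bool stale_hit(bool pin_in_cache) {
    Pool pool(4);
    std::unordered_set<const Node*> visited;  // cache keyed by address

    Node* first = pool.alloc(4);   // temporary node of the first instance
    visited.insert(first);
    if (pin_in_cache)
        ++first->refcnt;           // the cache holds its own reference
    pool.unref(first);             // the replacement drops the caller's reference

    Node* second = pool.alloc(2);  // second instance, different lane count
    return visited.count(second) != 0;
}
```

In this sketch stale_hit(false) reports a hit, because the second node lands
on the address of the freed first node even though its lane count differs,
while stale_hit(true) does not, because the extra reference keeps the first
node alive for as long as the cache refers to it.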