From: "rguenth at gcc dot gnu.org"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug tree-optimization/99728] code pessimization when using wrapper classes around SIMD types
Date: Fri, 02 Jul 2021 09:52:43 +0000
X-Bugzilla-Product: gcc
X-Bugzilla-Component: tree-optimization
X-Bugzilla-Version: 10.2.1
X-Bugzilla-Keywords: missed-optimization
X-Bugzilla-Severity: normal
X-Bugzilla-Status: NEW
X-Bugzilla-Priority: P3
X-Bugzilla-Assigned-To: unassigned at gcc dot gnu.org
X-Bugzilla-Changed-Fields: attachments.created

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99728

--- Comment #13 from Richard Biener ---
Created attachment 51100
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=51100&action=edit
hack

The attached tries to rewrite the aggregate assignments into a load/store
sequence, producing

_33 = VIEW_CONVERT_EXPR(d_42(D)->lam2D.32702);
VIEW_CONVERT_EXPR(d_42(D)->lam1D.32701) = _33;

from originally

d_42(D)->lam1D.32701 = d_42(D)->lam2D.32702;

That's a bit ugly and still falls short of doing the full store-motion, but
at least it now moves all but the above store:

...
_35 = _36 + val$v_63;
_30 = VIEW_CONVERT_EXPR(_56);
VIEW_CONVERT_EXPR(*d_28(D).lam1D.32701) = _30;
*d_28(D).lam2D.32702.vD.32579 = _35;
il_33 = il_69 + 1;
l_34 = l_68 + 2;
if (lmax_26(D) >= l_34)
  goto ; [89.00%]
else
  goto ; [11.00%]

[local count: 850510901]:
goto ; [100.00%]

[local count: 105119324]:
# _84 = PHI <_30(3)>
# _85 = PHI <_35(3)>
# d__v_lsm.37_86 = PHI
# d__v_lsm.38_87 = PHI
# d__v_lsm.39_88 = PHI
# d__v_lsm.40_89 = PHI
MEM[(struct TvsimpleD.32577 *)d_28(D) + 192B].vD.32579 = d__v_lsm.37_86;
MEM[(struct TvsimpleD.32577 *)d_28(D) + 224B].vD.32579 = d__v_lsm.38_87;
MEM[(struct TvsimpleD.32577 *)d_28(D) + 256B].vD.32579 = d__v_lsm.39_88;
MEM[(struct TvsimpleD.32577 *)d_28(D) + 288B].vD.32579 = d__v_lsm.40_89;
VIEW_CONVERT_EXPR(*d_28(D).lam1D.32701) = _84;
*d_28(D).lam2D.32702.vD.32579 = _85;

The dependence analysis of store-motion considers the last stores (refs 14
and 15) dependent:

Querying dependency of refs 2 and 15: dependent.
Querying RAW dependencies of ref 2 in loop 1: dependent
Querying dependency of refs 13 and 14: dependent.
Querying RAW dependencies of ref 13 in loop 1: dependent
Querying dependency of refs 14 and 13: dependent.
Querying SM WAR dependencies of ref 14 in loop 1: dependent
Querying dependency of refs 15 and 2: dependent.
Querying SM WAR dependencies of ref 15 in loop 1: dependent

That's the usual issue of LIM needing to identify "identical" refs but
apparently failing to do so for:

Memory reference 2: MEM[(const struct Tvsimple *)d_28(D) + 128B].v
Memory reference 15: *d_28(D).lam2.v

which is because we don't factor in the offset contained in the MEM_REF.
I'll look into doing that independently of the "hack" (which I'm not sure is
an appropriate way of avoiding changing LIM to deal with aggregates ...)
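
To illustrate why refs 2 and 15 compare as different unless the MEM_REF offset is folded in, here is a small stand-alone model in Python. It is only a sketch of the idea and not how LIM represents or hashes references. The `Ref` class, `canonical` helper, the outer struct name `Data`, and the 32-byte vector size are all assumptions made for illustration; the only facts taken from the dump are the two reference spellings and the implied placement of `lam2` at byte 128.

```python
from dataclasses import dataclass

# Hypothetical layout table: (struct, field) -> (byte offset, field type).
# lam2 at byte 128 follows from refs 2 and 15 being "identical";
# the name "Data" and the 32-byte vector size are assumptions.
LAYOUT = {
    ("Data", "lam2"): (128, "Tvsimple"),
    ("Tvsimple", "v"): (0, "vec"),
}
SIZES = {"vec": 32}


@dataclass(frozen=True)
class Ref:
    base: str        # SSA name of the pointer, e.g. "d_28(D)"
    base_type: str   # struct type the pointer is viewed as
    mem_offset: int  # constant byte offset carried by the MEM_REF itself
    path: tuple      # field names applied on top of the MEM_REF


def canonical(ref):
    """Fold the MEM_REF offset and all field offsets into (base, offset, size)."""
    offset, ty = ref.mem_offset, ref.base_type
    for field in ref.path:
        field_off, ty = LAYOUT[(ty, field)]
        offset += field_off
    return (ref.base, offset, SIZES[ty])


# Memory reference 2: MEM[(const struct Tvsimple *)d_28(D) + 128B].v
ref2 = Ref("d_28(D)", "Tvsimple", 128, ("v",))
# Memory reference 15: *d_28(D).lam2.v
ref15 = Ref("d_28(D)", "Data", 0, ("lam2", "v"))

# Compared by spelling, the two refs differ; compared by folded offset, they match.
same_syntactically = ref2 == ref15
same_canonically = canonical(ref2) == canonical(ref15)
```

In this model the two references only match once the `+ 128B` carried by the MEM_REF is added to the component offsets, which mirrors the missing step described above.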