From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <gcc-bugzilla@gcc.gnu.org>
Received: by sourceware.org (Postfix, from userid 48)
 id 23F573850437; Fri,  2 Jul 2021 09:52:44 +0000 (GMT)
DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 23F573850437
From: "rguenth at gcc dot gnu.org" <gcc-bugzilla@gcc.gnu.org>
To: gcc-bugs@gcc.gnu.org
Subject: [Bug tree-optimization/99728] code pessimization when using wrapper
 classes around SIMD types
Date: Fri, 02 Jul 2021 09:52:43 +0000
X-Bugzilla-Reason: CC
X-Bugzilla-Type: changed
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: gcc
X-Bugzilla-Component: tree-optimization
X-Bugzilla-Version: 10.2.1
X-Bugzilla-Keywords: missed-optimization
X-Bugzilla-Severity: normal
X-Bugzilla-Who: rguenth at gcc dot gnu.org
X-Bugzilla-Status: NEW
X-Bugzilla-Resolution: 
X-Bugzilla-Priority: P3
X-Bugzilla-Assigned-To: unassigned at gcc dot gnu.org
X-Bugzilla-Target-Milestone: ---
X-Bugzilla-Flags: 
X-Bugzilla-Changed-Fields: attachments.created
Message-ID: <bug-99728-4-K2baQ3kC33@http.gcc.gnu.org/bugzilla/>
In-Reply-To: <bug-99728-4@http.gcc.gnu.org/bugzilla/>
References: <bug-99728-4@http.gcc.gnu.org/bugzilla/>
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/
Auto-Submitted: auto-generated
MIME-Version: 1.0
X-BeenThere: gcc-bugs@gcc.gnu.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Gcc-bugs mailing list <gcc-bugs.gcc.gnu.org>
List-Unsubscribe: <https://gcc.gnu.org/mailman/options/gcc-bugs>,
 <mailto:gcc-bugs-request@gcc.gnu.org?subject=unsubscribe>
List-Archive: <https://gcc.gnu.org/pipermail/gcc-bugs/>
List-Post: <mailto:gcc-bugs@gcc.gnu.org>
List-Help: <mailto:gcc-bugs-request@gcc.gnu.org?subject=help>
List-Subscribe: <https://gcc.gnu.org/mailman/listinfo/gcc-bugs>,
 <mailto:gcc-bugs-request@gcc.gnu.org?subject=subscribe>
X-List-Received-Date: Fri, 02 Jul 2021 09:52:44 -0000

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99728
--- Comment #13 from Richard Biener <rguenth at gcc dot gnu.org> ---
Created attachment 51100
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=51100&action=edit
hack

The attached hack tries to rewrite the aggregate assignments into a
load/store sequence, producing

  _33 = VIEW_CONVERT_EXPR<vector(32) unsigned char>(d_42(D)->lam2D.32702);
  VIEW_CONVERT_EXPR<vector(32) unsigned char>(d_42(D)->lam1D.32701) = _33;

from originally

  d_42(D)->lam1D.32701 = d_42(D)->lam2D.32702;
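
At the source level the rewrite can be pictured roughly as below. This is
only a sketch, not what the attached patch does internally: the layouts of
Tvsimple and Data are assumptions made up for illustration (only the names
lam1, lam2 and v appear in the dumps), and the two helper functions are
hypothetical.

```cpp
#include <array>
#include <cstring>

// Hypothetical stand-in for the SIMD wrapper class (layout is an assumption).
struct Tvsimple {
    alignas(32) double v[4];
};

// Hypothetical enclosing struct with just the two members named in the dump.
struct Data {
    Tvsimple lam1;
    Tvsimple lam2;
};

// Aggregate assignment, corresponding to the original single statement.
inline void copy_aggregate(Data *d) {
    d->lam1 = d->lam2;
}

// The same copy split into one 32-byte load and one 32-byte store,
// mirroring the VIEW_CONVERT_EXPR<vector(32) unsigned char> pair.
inline void copy_load_store(Data *d) {
    std::array<unsigned char, sizeof(Tvsimple)> tmp;       // plays the role of _33
    std::memcpy(tmp.data(), &d->lam2, sizeof(Tvsimple));   // load
    std::memcpy(&d->lam1, tmp.data(), sizeof(Tvsimple));   // store
}
```

Both functions leave the same bytes in lam1; the second form merely exposes
a separate load and a separate store that a pass can reason about
individually.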

That's a bit ugly and still falls short of doing the full store motion, but
at least it now moves all but the above store:

...
  _35 = _36 + val$v_63;
  _30 = VIEW_CONVERT_EXPR<vector(32) unsigned char>(_56);
  VIEW_CONVERT_EXPR<vector(32) unsigned char>(*d_28(D).lam1D.32701) = _30;
  *d_28(D).lam2D.32702.vD.32579 = _35;
  il_33 = il_69 + 1;
  l_34 = l_68 + 2;
  if (lmax_26(D) >= l_34)
    goto <bb 6>; [89.00%]
  else
    goto <bb 7>; [11.00%]

  <bb 6> [local count: 850510901]:
  goto <bb 3>; [100.00%]

  <bb 7> [local count: 105119324]:
  # _84 = PHI <_30(3)>
  # _85 = PHI <_35(3)>
  # d__v_lsm.37_86 = PHI <d__v_lsm.37_74(3)>
  # d__v_lsm.38_87 = PHI <d__v_lsm.38_75(3)>
  # d__v_lsm.39_88 = PHI <d__v_lsm.39_76(3)>
  # d__v_lsm.40_89 = PHI <d__v_lsm.40_77(3)>
  MEM[(struct TvsimpleD.32577 *)d_28(D) + 192B].vD.32579 = d__v_lsm.37_86;
  MEM[(struct TvsimpleD.32577 *)d_28(D) + 224B].vD.32579 = d__v_lsm.38_87;
  MEM[(struct TvsimpleD.32577 *)d_28(D) + 256B].vD.32579 = d__v_lsm.39_88;
  MEM[(struct TvsimpleD.32577 *)d_28(D) + 288B].vD.32579 = d__v_lsm.40_89;
  VIEW_CONVERT_EXPR<vector(32) unsigned char>(*d_28(D).lam1D.32701) = _84;
  *d_28(D).lam2D.32702.vD.32579 = _85;
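
For reference, the effect of store motion on the d__v_lsm temporaries in
the dump can be sketched by hand at the source level. The struct Acc and
both functions are hypothetical and much simpler than the testcase; the
point is only to show a store being sunk from the loop body to the loop
exit.

```cpp
// Hypothetical one-member wrapper, loosely modelled on the wrapper classes
// discussed in this bug.
struct Acc {
    double v;
};

// Before: acc->v is loaded and stored on every iteration.
inline void sum_in_loop(Acc *acc, const double *x, int n) {
    for (int i = 0; i < n; ++i)
        acc->v = acc->v + x[i];
}

// After store motion, written by hand: the value lives in a local
// (the counterpart of a d__v_lsm temporary) and is stored once after the loop.
inline void sum_store_moved(Acc *acc, const double *x, int n) {
    double v_lsm = acc->v;
    for (int i = 0; i < n; ++i)
        v_lsm = v_lsm + x[i];
    acc->v = v_lsm;
}
```

The two versions agree only if nothing else in the loop reads or writes
acc->v through another reference (for example x pointing into *acc), which
is why the transformation depends on the dependence queries.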

The dependence analysis of store motion considers the last stores (refs 14
and 15) dependent:

Querying dependency of refs 2 and 15: dependent.
Querying RAW dependencies of ref 2 in loop 1: dependent
Querying dependency of refs 13 and 14: dependent.
Querying RAW dependencies of ref 13 in loop 1: dependent
Querying dependency of refs 14 and 13: dependent.
Querying SM WAR dependencies of ref 14 in loop 1: dependent
Querying dependency of refs 15 and 2: dependent.
Querying SM WAR dependencies of ref 15 in loop 1: dependent

That's the usual issue of LIM needing to identify "identical" refs
but apparently failing to do so for:

Memory reference 2: MEM[(const struct Tvsimple *)d_28(D) + 128B].v
Memory reference 15: *d_28(D).lam2.v

which is because we don't factor in the offset contained in the MEM_REF.
I'll look into doing that independently of the "hack" (which I'm not sure is
an appropriate way of avoiding changing LIM to deal with aggregates ...)
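
To illustrate why the two refs name the same location once the offset is
factored in, here is a small sketch. It is not GCC code and does not
reflect LIM's actual data structures: the Ref triple, both helper functions
and the Data layout are assumptions, with the layout chosen only so that
lam2 lands at byte offset 128 as in the dump.

```cpp
#include <cstddef>

// Hypothetical wrapper type (layout is an assumption).
struct Tvsimple {
    alignas(32) double v[4];
};

// Hypothetical layout that places lam2 at byte offset 128.
struct Data {
    Tvsimple other[3];  // bytes 0..95
    Tvsimple lam1;      // byte 96
    Tvsimple lam2;      // byte 128
};
static_assert(offsetof(Data, lam2) == 128, "layout assumption of this sketch");

// A memory reference reduced to (base pointer, constant byte offset, size).
struct Ref {
    const void *base;
    std::size_t offset;
    std::size_t size;
};

inline bool same_ref(const Ref &a, const Ref &b) {
    return a.base == b.base && a.offset == b.offset && a.size == b.size;
}

// Ref 2: MEM[(const struct Tvsimple *)d + 128B].v
// Here the constant offset is carried by the MEM_REF itself.
inline Ref ref_via_mem_offset(const Data *d) {
    return Ref{d, 128 + offsetof(Tvsimple, v), sizeof(Tvsimple::v)};
}

// Ref 15: *d.lam2.v
// Here the offset comes from the component path instead.
inline Ref ref_via_components(const Data *d) {
    return Ref{d, offsetof(Data, lam2) + offsetof(Tvsimple, v),
               sizeof(Tvsimple::v)};
}
```

With both offsets folded into one number the two refs compare equal, which
is the kind of identity the dependence queries above would need to see.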