From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <gcc-bugzilla@gcc.gnu.org>
Received: by sourceware.org (Postfix, from userid 48)
 id 0C4BE39540C4; Wed,  7 Jul 2021 13:59:36 +0000 (GMT)
DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 0C4BE39540C4
From: "rguenth at gcc dot gnu.org" <gcc-bugzilla@gcc.gnu.org>
To: gcc-bugs@gcc.gnu.org
Subject: [Bug tree-optimization/99728] code pessimization when using wrapper
 classes around SIMD types
Date: Wed, 07 Jul 2021 13:59:35 +0000
X-Bugzilla-Reason: CC
X-Bugzilla-Type: changed
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: gcc
X-Bugzilla-Component: tree-optimization
X-Bugzilla-Version: 10.2.1
X-Bugzilla-Keywords: missed-optimization
X-Bugzilla-Severity: normal
X-Bugzilla-Who: rguenth at gcc dot gnu.org
X-Bugzilla-Status: ASSIGNED
X-Bugzilla-Resolution: 
X-Bugzilla-Priority: P3
X-Bugzilla-Assigned-To: rguenth at gcc dot gnu.org
X-Bugzilla-Target-Milestone: ---
X-Bugzilla-Flags: 
X-Bugzilla-Changed-Fields: cc
Message-ID: <bug-99728-4-PAU9CjziDL@http.gcc.gnu.org/bugzilla/>
In-Reply-To: <bug-99728-4@http.gcc.gnu.org/bugzilla/>
References: <bug-99728-4@http.gcc.gnu.org/bugzilla/>
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/
Auto-Submitted: auto-generated
MIME-Version: 1.0
X-BeenThere: gcc-bugs@gcc.gnu.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Gcc-bugs mailing list <gcc-bugs.gcc.gnu.org>
List-Unsubscribe: <https://gcc.gnu.org/mailman/options/gcc-bugs>,
 <mailto:gcc-bugs-request@gcc.gnu.org?subject=unsubscribe>
List-Archive: <https://gcc.gnu.org/pipermail/gcc-bugs/>
List-Post: <mailto:gcc-bugs@gcc.gnu.org>
List-Help: <mailto:gcc-bugs-request@gcc.gnu.org?subject=help>
List-Subscribe: <https://gcc.gnu.org/mailman/listinfo/gcc-bugs>,
 <mailto:gcc-bugs-request@gcc.gnu.org?subject=subscribe>
X-List-Received-Date: Wed, 07 Jul 2021 13:59:36 -0000

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99728

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jason at gcc dot gnu.org
--- Comment #17 from Richard Biener <rguenth at gcc dot gnu.org> ---
So we have

  val$v_62 = *d_28(D).lam1D.32701.vD.32579;
  *d_28(D).lam1D.32701 = *d_28(D).lam2D.32702;
  *d_28(D).lam2D.32702.vD.32579 = _34;

I believe that the SRA pass has the best analysis capabilities to eventually
decompose aggregate copies into register pieces (with cost considerations).
In particular it knows (but without flow info) what types the sub-accesses
use.  That matters since we want the aggregate copy replaced with pieces
that match the rest of the accesses (here because of LIM's restrictions).

In particular we'd like to use 'vector double' typed accesses here, something
the middle-end usually avoids for block copies, which is what aggregate
copies are to the middle-end.

That said, it would be _much_ easier if the frontend, with its
language-specific semantic knowledge, could avoid doing block copies for such
simple wrappers and instead perform a (recursive) memberwise copy (for
single-member aggregates).
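
At source level, the difference between the two copy forms can be sketched
as follows.  This is only an illustration: Inner, Outer, vec2d, copy_block
and copy_memberwise are hypothetical names, memcpy merely stands in for an
untyped block copy, and the vector type uses the GCC vector extension.

```cpp
#include <cassert>
#include <cstring>

// GCC/Clang vector extension: two doubles in one 16-byte vector.
typedef double vec2d __attribute__((vector_size(16)));

// Single-member wrappers, nested one level (hypothetical names).
struct Inner { vec2d v; };
struct Outer { Inner in; };

// Block copy: the whole object is copied as untyped bytes.
// memcpy is used here only to model that; dst and src must not overlap.
inline void copy_block(Outer &dst, const Outer &src) {
    std::memcpy(&dst, &src, sizeof(Outer));
}

// Recursive memberwise copy: descend to the single vector member
// and assign it using its own (vector) type.
inline void copy_memberwise(Outer &dst, const Outer &src) {
    dst.in.v = src.in.v;
}
```

Both functions leave dst with the same value; the second one expresses the
copy as a single access of vector type, matching the other accesses to the
member.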

Of course the simple fix in source is to add

  Tvsimple &operator=(const Tvsimple &other) { v = other.v; return *this; }

producing optimal code.  Jason - would you consider this premature
"optimization" in the C++ frontend?  It doesn't seem that there's
an operator= synthesized; instead we directly emit

   <<cleanup_point <<< Unknown tree: expr_stmt
  (void) (d->lam1 = *(const struct Tvsimple &) &d->lam2) >>>>>;

from

    d.lam1 = d.lam2;

from build_over_call which has a series of optimizations at

  else if (DECL_ASSIGNMENT_OPERATOR_P (fn)
           && DECL_OVERLOADED_OPERATOR_IS (fn, NOP_EXPR)
           && trivial_fn_p (fn))
    {
...
      if (is_really_empty_class (type, /*ignore_vptr*/true))
        {
          /* Avoid copying empty classes.  */
          val = build2 (COMPOUND_EXPR, type, arg, to);
          suppress_warning (val, OPT_Wunused);
        }
      else if (tree_int_cst_equal (TYPE_SIZE (type), TYPE_SIZE (as_base)))
        {
          if (is_std_init_list (type)
              && conv_binds_ref_to_prvalue (convs[1]))
            warning_at (loc, OPT_Winit_list_lifetime,
                        "assignment from temporary %<initializer_list%> does "
                        "not extend the lifetime of the underlying array");
          arg = cp_build_fold_indirect_ref (arg);
          val = build2 (MODIFY_EXPR, TREE_TYPE (to), to, arg);

so we handle empty classes; maybe we can also handle single data-member
classes (not sure how exactly to test for this - walking TYPE_FIELDS
repeatedly for each considered assignment would be slow, I guess).
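
One way around the repeated walk would be to compute the answer once per
type and remember it.  The sketch below only models that idea with
stand-in data structures: Decl, RecordType, DeclKind, single_data_member
and the walks counter are all hypothetical and are not GCC's tree API.

```cpp
#include <cassert>
#include <unordered_map>
#include <vector>

// Hypothetical stand-ins for front-end data structures; not GCC's tree API.
enum class DeclKind { Field, Other };

struct RecordType;

struct Decl {
    DeclKind kind;
    const RecordType *record;  // non-null when the field itself has class type
};

struct RecordType {
    std::vector<Decl> fields;  // models the chain a TYPE_FIELDS walk would visit
};

// Counts how many field walks were actually performed (illustration only).
static int walks = 0;

static bool single_data_member(const RecordType &t);

// Walks the fields once: true if there is exactly one data member,
// looking through nested single-member wrappers recursively.
static bool single_data_member_uncached(const RecordType &t) {
    ++walks;
    const Decl *only = nullptr;
    for (const Decl &d : t.fields) {
        if (d.kind != DeclKind::Field) continue;  // skip non-data entries
        if (only) return false;                   // second data member found
        only = &d;
    }
    if (!only) return false;                      // no data member at all
    return only->record ? single_data_member(*only->record) : true;
}

// Cached front end: each type is walked at most once.
static bool single_data_member(const RecordType &t) {
    static std::unordered_map<const RecordType *, bool> cache;
    auto it = cache.find(&t);
    if (it != cache.end()) return it->second;
    bool result = single_data_member_uncached(t);
    cache.emplace(&t, result);
    return result;
}
```

With this shape, a second query for the same type is answered from the
cache without another walk.  In a real front end a flag on the type would
presumably serve the same purpose as the map used here.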