From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <richard.guenther@gmail.com>
Received: from mail-ed1-x52e.google.com (mail-ed1-x52e.google.com [IPv6:2a00:1450:4864:20::52e])
	by sourceware.org (Postfix) with ESMTPS id DD3D43858D35;
	Sat,  5 Nov 2022 11:38:24 +0000 (GMT)
DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org DD3D43858D35
Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=gmail.com
Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=gmail.com
Received: by mail-ed1-x52e.google.com with SMTP id l11so11027207edb.4;
        Sat, 05 Nov 2022 04:38:24 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
        d=gmail.com; s=20210112;
        h=cc:to:subject:message-id:date:from:in-reply-to:references
         :mime-version:from:to:cc:subject:date:message-id:reply-to;
        bh=CuVhHBDbFfy2PcN/oNIaiQAAjonf6JBDsjxKzZjYxtk=;
        b=X5qm2pSa5Wo0OYgehalVTFzEo0Nl1J7CHkp7vb4r+Uvluwe0XmGhrsM0czRGpgmHi6
         1jPwYDgZsSk2GjlL2/OQefgnok+bv/0bQ48p1izSZGc+jOoR79Vg0ijf+T0iSSzQ76z2
         LvRFV0GAzqf2USQ1z43+Gwbu8syxNJBDv7er+te2e6tM8vKU7O94xGO1ehZ9HXb6N8hq
         6EpAs96jYiHI71NuJvbeS+5+SPwt6jD6VjMESfqPIrjjDtLO8eMiw0LRTT2qxSPqseUZ
         PdJ2WifqekejUvmX+Yi5991sCrFKbApQdMeUn3aUSSxvQWV/KKTJ12/ntmqFa1Mwq8iv
         Nc4w==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
        d=1e100.net; s=20210112;
        h=cc:to:subject:message-id:date:from:in-reply-to:references
         :mime-version:x-gm-message-state:from:to:cc:subject:date:message-id
         :reply-to;
        bh=CuVhHBDbFfy2PcN/oNIaiQAAjonf6JBDsjxKzZjYxtk=;
        b=mTSEbnBfE5XYVUI3KkDJyPAoF/0Z5dD602NYTq32tfbtBBDnnQoPMvYqknsHP+cd3p
         7d8CiqcU9qsMEG7YRAqK+m9G87y/I6zAVt2mnCUBbH/ocicVPYkhNDWnG4IizkZB+574
         76qRrQ6E4I1yvnyyKy0sf4x7cgqDo928bTjiHHLyZ5EtT8GdjIAqakY/9ZMB1RIpcb5e
         5W5/PZ2gU0dq1PnWPSxQPmfNnY5/TdJxs567G1ZJvvBmaRcBtfHE0F5zKMHYGGnoEmSO
         /LAxqSuYirmNNpaqFcPdZDgffx2kjvn73O3YZMRI20FEOvFsSOr8/OqEyWzeRem6u7QW
         pZTw==
X-Gm-Message-State: ACrzQf16axDAj1pr47mvMoVyV22e/QpagexbrEGbH0i09saDneqtKqSO
	yegNqqpIAtf/9MVmjYLKHBw8UZ/gA3JP0QVqmpU=
X-Google-Smtp-Source: AMsMyM5rMpBB588qzPbFAImCb936Sw/27IBmSnWINEOxgAxlLMB4Z7FTF3YyqhzxgfChZU1re5rTkH7HE2td+vdVQ58=
X-Received: by 2002:a05:6402:3457:b0:463:2017:ae64 with SMTP id
 l23-20020a056402345700b004632017ae64mr36250575edc.218.1667648303578; Sat, 05
 Nov 2022 04:38:23 -0700 (PDT)
MIME-Version: 1.0
References: <20221031024235.110995-1-guojiufu@linux.ibm.com> <daf54634-cb3e-a7f8-213d-c18ba781a3ef@gmail.com>
In-Reply-To: <daf54634-cb3e-a7f8-213d-c18ba781a3ef@gmail.com>
From: Richard Biener <richard.guenther@gmail.com>
Date: Sat, 5 Nov 2022 12:38:11 +0100
Message-ID: <CAFiYyc1gtCSC5563LAWDGEn1EAcbpkcCqjj5JMEqnRyKMTmr6Q@mail.gmail.com>
Subject: Re: [RFC] propgation leap over memory copy for struct
To: Jeff Law <jeffreyalaw@gmail.com>
Cc: Jiufu Guo <guojiufu@linux.ibm.com>, gcc-patches@gcc.gnu.org, pinskia@gcc.gnu.org, 
	dje.gcc@gmail.com, linkw@gcc.gnu.org, segher@kernel.crashing.org, 
	rguenth@gcc.gnu.org
Content-Type: text/plain; charset="UTF-8"
X-Spam-Status: No, score=-1.9 required=5.0 tests=BAYES_00,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,FREEMAIL_FROM,RCVD_IN_DNSWL_NONE,SPF_HELO_NONE,SPF_PASS,TXREP autolearn=ham autolearn_force=no version=3.4.6
X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org
List-Id: <gcc-patches.gcc.gnu.org>

On Mon, Oct 31, 2022 at 11:14 PM Jeff Law via Gcc-patches
<gcc-patches@gcc.gnu.org> wrote:
>
>
> On 10/30/22 20:42, Jiufu Guo via Gcc-patches wrote:
> > Hi,
> >
> > We know that for struct variable assignment, memory copy may be used.
> > And for memcpy, we may load and store as many bytes as possible at one time.
> > While this may not be best here:
> > 1. Before/after struct variable assignment, the variable may be operated on.
> > And it is hard for some optimizations to leap over memcpy.  Then some struct
> > operations may be sub-optimal.  Like the issue in PR65421.
> > 2. The size of a struct is mostly constant, so the memcpy would be expanded.  Using
> > a small size to load/store and executing in parallel may not be slower than using
> > a large size to load/store. (Sure, more registers may be used for smaller bytes.)
> >
> >
> > In PR65421, For source code as below:
> > ////////t.c
> > #define FN 4
> > typedef struct { double a[FN]; } A;
> >
> > A foo (const A *a) { return *a; }
> > A bar (const A a) { return a; }
>
> So the first question in my mind is can we do better at the gimple
> phase?  For the second case in particular can't we just "return a"
> rather than copying a into <retval> then returning <retval>?  This feels
> a lot like the return value optimization from C++.  I'm not sure if it
> applies to the first case or not, it's been a long time since I looked
> at NRV optimizations, but it might be worth poking around in there a bit
> (tree-nrv.cc).
>
>
> But even so, these kinds of things are still bound to happen, so it's
> probably worth thinking about if we can do better in RTL as well.
>
>
> The first thing that comes to my mind is to annotate memcpy calls that
> are structure assignments.  The idea here is that we may want to expand
> a memcpy differently in those cases.   Changing how we expand an opaque
> memcpy call is unlikely to be beneficial in most cases.  But changing
> how we expand a structure copy may be beneficial by exposing the
> underlying field values.   This would roughly correspond to your method #1.
>
> Or instead of changing how we expand, teach the optimizers about these
> annotated memcpy calls -- they're just a copy of each field.   That's
> how CSE and the propagators could treat them. After some point we'd
> lower them in the usual ways, but at least early in the RTL pipeline we
> could keep them as annotated memcpy calls.  This roughly corresponds to
> your second suggestion.

In the end it depends on the access patterns, so some analysis like the
one SRA performs would be nice.  The issue with expanding memcpy on GIMPLE
is that we currently cannot express 'rep movsb' or other target-specific
sequences from the cpymem-like optabs on GIMPLE, and recovering those
from piecewise copies on RTL is going to be difficult.
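
To make the two forms concrete, here is a minimal sketch using the struct
from the testcase.  The function names copy_block and copy_piecewise are
made up for illustration and are not anything in GCC; the point is only
that the first form is what a block-move expander sees, while the second
is what field-wise analysis would want to see.

```c
#include <string.h>

#define FN 4
typedef struct { double a[FN]; } A;

/* Block copy: one opaque memory move of sizeof (A) bytes.  This is the
   form that can be expanded into a target-specific sequence.  */
static void
copy_block (A *dst, const A *src)
{
  memcpy (dst, src, sizeof (A));
}

/* Piecewise copy: every element is a separate load and store, so each
   value is individually visible to later passes.  Getting back from
   this form to a single block move is the hard direction.  */
static void
copy_piecewise (A *dst, const A *src)
{
  for (int i = 0; i < FN; i++)
    dst->a[i] = src->a[i];
}
```

Both produce the same bytes in *dst; which one is preferable depends on
how the surrounding code accesses the struct.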

>
> jeff