From: Richard Biener
Date: Sat, 5 Nov 2022 12:38:11 +0100
Subject: Re: [RFC] propgation leap over memory copy for struct
To: Jeff Law
Cc: Jiufu Guo, gcc-patches@gcc.gnu.org, pinskia@gcc.gnu.org, dje.gcc@gmail.com, linkw@gcc.gnu.org, segher@kernel.crashing.org, rguenth@gcc.gnu.org
References: <20221031024235.110995-1-guojiufu@linux.ibm.com>

On Mon, Oct 31, 2022 at 11:14 PM Jeff Law via Gcc-patches wrote:
>
>
> On 10/30/22 20:42, Jiufu Guo via Gcc-patches wrote:
> > Hi,
> >
> > We know that for struct variable assignment, memory copy may be used.
> > And for memcpy, we may load and store as many bytes as possible at one time.
> > While it may not be best here:
> > 1. Before/after struct variable assignment, the variable may be operated on.
> > And it is hard for some optimizations to leap over memcpy. Then some struct
> > operations may be sub-optimal. Like the issue in PR65421.
> > 2. The size of struct is constant mostly, the memcpy would be expanded. Using
> > small size to load/store and executing in parallel may not be slower than using
> > large size to load/store. (sure, more registers may be used for smaller bytes.)
> >
> >
> > In PR65421, for source code as below:
> > ////////t.c
> > #define FN 4
> > typedef struct { double a[FN]; } A;
> >
> > A foo (const A *a) { return *a; }
> > A bar (const A a) { return a; }
>
> So the first question in my mind is can we do better at the gimple
> phase? For the second case in particular can't we just "return a"
> rather than copying a into then returning ? This feels
> a lot like the return value optimization from C++. I'm not sure if it
> applies to the first case or not, it's been a long time since I looked
> at NRV optimizations, but it might be worth poking around in there a bit
> (tree-nrv.cc).
>
>
> But even so, these kinds of things are still bound to happen, so it's
> probably worth thinking about if we can do better in RTL as well.
>
>
> The first thing that comes to my mind is to annotate memcpy calls that
> are structure assignments. The idea here is that we may want to expand
> a memcpy differently in those cases. Changing how we expand an opaque
> memcpy call is unlikely to be beneficial in most cases. But changing
> how we expand a structure copy may be beneficial by exposing the
> underlying field values. This would roughly correspond to your method #1.
>
> Or instead of changing how we expand, teach the optimizers about these
> annotated memcpy calls -- they're just a copy of each field. That's
> how CSE and the propagators could treat them. After some point we'd
> lower them in the usual ways, but at least early in the RTL pipeline we
> could keep them as annotated memcpy calls. This roughly corresponds to
> your second suggestion.

In the end it depends on the access patterns, so some analysis like the
one SRA performs would be nice. The issue with expanding memcpy on GIMPLE
is that we currently cannot express 'rep; movb;' or other target-specific
sequences from the cpymem-like optabs on GIMPLE, and recovering those
from piecewise copies on RTL is going to be difficult.

>
> jeff
>
>
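
As a minimal sketch of the two views of a structure copy discussed in this
thread: the block below reuses the t.c test case quoted above and sets a
block copy (memcpy) next to a per-field copy. The helpers copy_block and
copy_fields are hypothetical names used only for illustration. They are not
GCC functions and say nothing about how GCC actually expands either form;
they only show that the two are equivalent at the source level, which is
the property an optimizer would rely on when treating an annotated memcpy
as "a copy of each field".

```c
#include <string.h>

#define FN 4
typedef struct { double a[FN]; } A;

/* The two functions from the quoted PR65421 test case. */
A foo (const A *a) { return *a; }
A bar (const A a) { return a; }

/* Hypothetical helper: copy the whole object as one opaque block,
   the way a plain memcpy call sees it.  */
A copy_block (const A *src)
{
  A dst;
  memcpy (&dst, src, sizeof dst);
  return dst;
}

/* Hypothetical helper: copy element by element, which exposes each
   underlying value as a separate load/store.  */
A copy_fields (const A *src)
{
  A dst;
  for (int i = 0; i < FN; i++)
    dst.a[i] = src->a[i];
  return dst;
}
```

Both helpers produce the same object representation for A, since the
struct holds only an array of double and so has no padding between
elements.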