Date: Mon, 31 Oct 2022 16:13:38 -0600
Subject: Re: [RFC] propgation leap over memory copy for struct
To: Jiufu Guo , gcc-patches@gcc.gnu.org
Cc: segher@kernel.crashing.org, rguenth@gcc.gnu.org, pinskia@gcc.gnu.org, linkw@gcc.gnu.org, dje.gcc@gmail.com
References: <20221031024235.110995-1-guojiufu@linux.ibm.com>
From: Jeff Law
In-Reply-To: <20221031024235.110995-1-guojiufu@linux.ibm.com>

On 10/30/22 20:42, Jiufu Guo via Gcc-patches wrote:
> Hi,
>
> We know that for struct variable assignment, a memory copy may be used.
> And for memcpy, we may load and store as many bytes as possible at one time.
> But that may not be best here:
> 1. Before/after a struct variable assignment, the variable may be operated on,
> and it is hard for some optimizations to leap over the memcpy.  Then some
> struct operations may be sub-optimal, like the issue in PR65421.
> 2. The size of the struct is mostly constant, so the memcpy would be expanded.
> Using small sizes to load/store and executing in parallel may be no slower
> than using large sizes to load/store.  (Sure, more registers may be used for
> the smaller accesses.)
>
> In PR65421, for the source code below:
> //////// t.c
> #define FN 4
> typedef struct { double a[FN]; } A;
>
> A foo (const A *a) { return *a; }
> A bar (const A a) { return a; }

So the first question in my mind is: can we do better at the gimple phase?  For the second case in particular, can't we just "return a" rather than copying a into a temporary and then returning that?  This feels a lot like the return value optimization from C++.  I'm not sure if it applies to the first case or not; it's been a long time since I looked at NRV optimizations, but it might be worth poking around in there a bit (tree-nrv.cc).

But even so, these kinds of things are still bound to happen, so it's probably worth thinking about whether we can do better in RTL as well.

The first thing that comes to my mind is to annotate memcpy calls that are structure assignments.  The idea here is that we may want to expand a memcpy differently in those cases.  Changing how we expand an opaque memcpy call is unlikely to be beneficial in most cases.  But changing how we expand a structure copy may be beneficial by exposing the underlying field values.  This would roughly correspond to your method #1.

Or instead of changing how we expand, teach the optimizers about these annotated memcpy calls -- they're just a copy of each field.  That's how CSE and the propagators could treat them.  After some point we'd lower them in the usual ways, but at least early in the RTL pipeline we could keep them as annotated memcpy calls.  This roughly corresponds to your second suggestion.

jeff