Date: Sat, 5 Nov 2022 15:13:55 +0100 (CET)
From: Richard Biener
To: Jiufu Guo
cc: Segher Boessenkool, Jeff Law, gcc-patches@gcc.gnu.org, rguenth@gcc.gnu.org,
    pinskia@gcc.gnu.org, linkw@gcc.gnu.org, dje.gcc@gmail.com
Subject: Re: [RFC] propgation leap over memory copy for struct
In-Reply-To: <7e1qqnwb36.fsf@pike.rch.stglabs.ibm.com>
Message-ID: <381qr8s3-53n-pr61-7r1n-6q8q71nsqnq@fhfr.qr>
References: <20221031024235.110995-1-guojiufu@linux.ibm.com>
    <20221101004956.GL25951@gate.crashing.org>
    <7e1qqnwb36.fsf@pike.rch.stglabs.ibm.com>
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="-1463801166-1605410050-1667657636=:12246"

This message is in MIME format.  The first part should be readable text,
while the remaining parts are likely unreadable without MIME-aware tools.

---1463801166-1605410050-1667657636=:12246
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8BIT

On Tue, 1 Nov 2022, Jiufu Guo wrote:

> Segher Boessenkool writes:
>
> > On Mon, Oct 31, 2022 at 04:13:38PM -0600, Jeff Law wrote:
> >> On 10/30/22 20:42, Jiufu Guo via Gcc-patches wrote:
> >> >We know that for struct variable assignment, memory copy may be used.
> >> >And for memcpy, we may load and store as many bytes as possible at one time.
> >> >While it may not be best here:
> >
> >> So the first question in my mind is can we do better at the gimple
> >> phase?  For the second case in particular can't we just "return a"
> >> rather than copying a into then returning ?
> >> This feels a lot like the return value optimization from C++.  I'm not
> >> sure if it applies to the first case or not, it's been a long time since
> >> I looked at NRV optimizations, but it might be worth poking around in
> >> there a bit (tree-nrv.cc).
> >
> > If it is a bigger struct you end up with quite a lot of stuff in
> > registers.  GCC will eventually put that all in memory so it will work
> > out fine in the end, but you are likely to get inefficient code.
> Yes.  We may need to use memory to save registers for a big struct.
> For small structs it may be practical to use registers.  We may leverage
> the idea that some types of small structs are passed to functions in
> registers.
>
> >
> > OTOH, 8 bytes isn't as big as we would want these days, is it?  So it
> > would be useful to put smaller temporaries, say 32 bytes and smaller,
> > in registers instead of in memory.
> I think you mean: we should try to use registers to avoid memory accesses,
> and using registers would be OK for larger memcpys (32 bytes).
> Great suggestion, thanks a lot!
>
> Like the idea below:
> [r100:TI, r101:TI] = src; //Or r100:OI/OO = src;
> dest = [r100:TI, r101:TI];
>
> Currently, for an 8-byte structure, we are using TImode for it.
> And the subreg/fwprop/cse passes are able to optimize it as expected.
> Two concerns here: larger int modes (OI/OO/..) may not be introduced yet;
> I'm not sure if the current infrastructure supports using two or more
> registers for one structure.
>
> >
> >> But even so, these kinds of things are still bound to happen, so it's
> >> probably worth thinking about if we can do better in RTL as well.
> >
> > Always.
> > It is a mistake to think that having better high-level
> > optimisations means that you don't need good low-level optimisations
> > anymore: in fact deficiencies there become more glaringly apparent if
> > the early pipeline opts become better :-)
> Understood, thanks :)
>
> >
> >> The first thing that comes to my mind is to annotate memcpy calls that
> >> are structure assignments.  The idea here is that we may want to expand
> >> a memcpy differently in those cases.  Changing how we expand an opaque
> >> memcpy call is unlikely to be beneficial in most cases.  But changing
> >> how we expand a structure copy may be beneficial by exposing the
> >> underlying field values.  This would roughly correspond to your method
> >> #1.
> >>
> >> Or instead of changing how we expand, teach the optimizers about these
> >> annotated memcpy calls -- they're just a copy of each field.  That's
> >> how CSE and the propagators could treat them.  After some point we'd
> >> lower them in the usual ways, but at least early in the RTL pipeline we
> >> could keep them as annotated memcpy calls.  This roughly corresponds to
> >> your second suggestion.
> >
> > Ideally this won't ever make it as far as RTL, if the structures do not
> > need to go via memory.  All high-level optimisations should have been
> > done earlier, and hopefully it was not expand itself that forced stuff
> > into memory!  :-/
> Currently, after early gimple optimization, struct member accesses
> may still need to go through memory (if the mode of the struct is BLK).
> For example:
>
> _Bool foo (const A a) { return a.a[0] > 1.0; }
>
> The optimized gimple would be:
> _1 = a.a[0];
> _3 = _1 > 1.0e+0;
> return _3;
>
> During expand to RTL, parm 'a' is first stored to memory from the arg regs,
> and "a.a[0]" is then read back from memory.  It may be better to use
> "f1" for "a.a[0]" here.
>
> Maybe, method3 is similar to your idea: using "parallel:BLK {DF;DF;DF;DF}"
> for the struct (BLK may be changed), and using 4 DF registers to access
> the structure in the expand pass.

I think for cases like this it might be a good idea to perform
SRA-like analysis at RTL expansion time when we know how parameters
arrive (in pieces) and take that knowledge into account when
assigning the RTL to a decl.  The same applies for the return ABI.
Since we rely on RTL to elide copies to/from return/argument
registers/slots we have to assign "layout compatible" registers
to the corresponding auto vars.

> Thanks again for your kind and helpful comments!
>
> BR,
> Jeff(Jiufu)
>
> >
> >
> > Segher
>

-- 
Richard Biener
SUSE Software Solutions Germany GmbH, Frankenstrasse 146, 90461 Nuernberg,
Germany; GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman;
HRB 36809 (AG Nuernberg)
---1463801166-1605410050-1667657636=:12246--