Date: Wed, 7 Dec 2022 09:23:39 -0600
From: Segher Boessenkool
To: Jiufu Guo
Cc: gcc-patches@gcc.gnu.org, dje.gcc@gmail.com, linkw@gcc.gnu.org,
 ebotcazou@libertysurf.fr, steven@gcc.gnu.org, rguenther@suse.de,
 jeffreyalaw@gmail.com
Subject: Re: [PATCH V3] Use reg mode to move sub blocks for parameters and returns
Message-ID: <20221207152339.GA25951@gate.crashing.org>
In-Reply-To: <20221207120008.126895-1-guojiufu@linux.ibm.com>

Hi!

On Wed, Dec 07, 2022 at 08:00:08PM +0800, Jiufu Guo wrote:
> When assigning a parameter to a variable, or assigning a variable to
> return value with struct type, "block move" are used to expand
> the assignment.
> It would be better to use the register mode according
> to the target/ABI to move the blocks if the parameter/return is passed
> through registers. And then this would raise more opportunities for
> other optimization passes(cse/dse/xprop).
>
> As the example code (like code in PR65421):
>
> typedef struct SA {double a[3];} A;
> A ret_arg_pt (A *a) {return *a;} // on ppc64le, expect only 3 lfd(s)
> A ret_arg (A a) {return a;} // just empty fun body
> void st_arg (A a, A *p) {*p = a;} //only 3 stfd(s)

What is this like if you use [5] instead?  Or use an ABI without
homogeneous aggregates?

> +static void
> +move_sub_blocks (rtx to_rtx, tree from, machine_mode sub_mode, bool nontemporal)
> +{
> +  HOST_WIDE_INT size, sub_size;
> +  int len;
> +
> +  gcc_assert (MEM_P (to_rtx));
> +
> +  size = MEM_SIZE (to_rtx).to_constant ();
> +  sub_size = GET_MODE_SIZE (sub_mode).to_constant ();
> +  len = size / sub_size;

Unrelated, but a pet peeve: it is much more modern (and imo much better
taste) to not put all declarations at the start; just declare at first
use:

  gcc_assert (MEM_P (to_rtx));

  HOST_WIDE_INT size = MEM_SIZE (to_rtx).to_constant ();
  HOST_WIDE_INT sub_size = GET_MODE_SIZE (sub_mode).to_constant ();
  int len = size / sub_size;

> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/pr65421-1.c
> @@ -0,0 +1,15 @@
> +/* PR target/65421 */
> +/* { dg-options "-O2" } */
> +/* { dg-require-effective-target has_arch_ppc64 } */
> +
> +typedef struct SA
> +{
> +  double a[2];
> +  long l;
> +} A;
> +
> +/* std 3 param regs to return slot */
> +A ret_arg (A a) {return a;}
> +/* { dg-final { scan-assembler-times {\mstd 4,0\(3\)\s} 1 } } */
> +/* { dg-final { scan-assembler-times {\mstd 5,8\(3\)\s} 1 } } */
> +/* { dg-final { scan-assembler-times {\mstd 6,16\(3\)\s} 1 } } */

This is only correct on certain ABIs, probably only ELFv2 even.

We certainly can improve the homogeneous aggregates stuff, but please
make sure you don't degrade all other stuff?
Older ABIs, as well as cases where things are not a homogeneous
aggregate, for example because they are too big.  Can you please add
tests for such cases?


Segher