From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from smtp-out2.suse.de (smtp-out2.suse.de [195.135.220.29]) by sourceware.org (Postfix) with ESMTPS id 06F3F38432D6; Thu, 24 Nov 2022 07:31:17 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 06F3F38432D6 Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=suse.de Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=suse.de Received: from relay2.suse.de (relay2.suse.de [149.44.160.134]) by smtp-out2.suse.de (Postfix) with ESMTP id AED481F8BE; Thu, 24 Nov 2022 07:31:15 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_rsa; t=1669275075; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: in-reply-to:in-reply-to:references:references; bh=2zuLlaasBnfubDYpNADifbqtiWUaL4QRz6/7Dj5A2PM=; b=Nvt1P12FcsM+UjJiPoIR5/86GJ6t2y3mRNbNuu7vBHCE3C95IUIDNFjaFDTgAQIIxhzHbR Wk5aYfFo5Uyi0hhLzfIWGBsbWhJz0cFTiDmuiYqQe2u5Go6QIhfWkJ9ZmrefAM7GTxy/AM yB8ZKNM5eD+hP6Ax+0wZYJgkbS3RaFE= DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_ed25519; t=1669275075; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: in-reply-to:in-reply-to:references:references; bh=2zuLlaasBnfubDYpNADifbqtiWUaL4QRz6/7Dj5A2PM=; b=sjLGTJEsNn1Ilw3zqranuaDcvxUfzw442aRoK75zB6HIxNOahkwJsK2j78a0/awlnos2uD GVAkEBQgjgllG7Cg== Received: from wotan.suse.de (wotan.suse.de [10.160.0.1]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by relay2.suse.de (Postfix) with ESMTPS id 6EF092C142; Thu, 24 Nov 2022 07:31:15 +0000 (UTC) Date: Thu, 24 Nov 2022 07:31:15 +0000 (UTC) From: Richard Biener To: Jiufu Guo cc: Jeff Law , gcc-patches@gcc.gnu.org, segher@kernel.crashing.org, dje.gcc@gmail.com, linkw@gcc.gnu.org Subject: Re: [PATCH V2] Use subscalar mode to move struct block for parameter In-Reply-To: <7efseas7f1.fsf@pike.rch.stglabs.ibm.com> Message-ID: References: <20221117061549.178481-1-guojiufu@linux.ibm.com> <7ea64lroo6.fsf@pike.rch.stglabs.ibm.com> <9424d98e-a95f-58ae-9764-bcf8b4f503dc@gmail.com> <7efseas7f1.fsf@pike.rch.stglabs.ibm.com> User-Agent: Alpine 2.22 (LSU 394 2020-01-19) MIME-Version: 1.0 Content-Type: multipart/mixed; boundary="-1609957120-603841475-1669275075=:3995" X-Spam-Status: No, score=-11.0 required=5.0 tests=BAYES_00,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,GIT_PATCH_0,KAM_SHORT,SPF_HELO_NONE,SPF_PASS,TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org List-Id: This message is in MIME format. The first part should be readable text, while the remaining parts are likely unreadable without MIME-aware tools. ---1609957120-603841475-1669275075=:3995 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 8BIT On Wed, 23 Nov 2022, Jiufu Guo wrote: > Hi Jeff, > > Thanks a lot for your comments! Sorry for the late response ... > Jeff Law writes: > > > On 11/20/22 20:07, Jiufu Guo wrote: > >> Jiufu Guo writes: > >> > >>> Hi, > >>> > >>> As mentioned in the previous version patch: > >>> https://gcc.gnu.org/pipermail/gcc-patches/2022-October/604646.html > >>> The suboptimal code is generated for "assigning from parameter" or > >>> "assigning to return value". > >>> This patch enhances the assignment from parameters like the below > >>> cases: > >>> /////case1.c > >>> typedef struct SA {double a[3];long l; } A; > >>> A ret_arg (A a) {return a;} > >>> void st_arg (A a, A *p) {*p = a;} > >>> > >>> ////case2.c > >>> typedef struct SA {double a[3];} A; > >>> A ret_arg (A a) {return a;} > >>> void st_arg (A a, A *p) {*p = a;} > >>> > >>> For this patch, bootstrap and regtest pass on ppc64{,le} > >>> and x86_64. > >>> * Besides asking for help reviewing this patch, I would like to > >>> consult comments about enhancing for "assigning to returns". > >> I updated the patch to fix the issue for returns. This patch > >> adds a flag DECL_USEDBY_RETURN_P to indicate if a var is used > >> by a return stmt. This patch fix the issue in expand pass only, > >> so, we would try to update the patch to avoid this flag. > >> > >> diff --git a/gcc/cfgexpand.cc b/gcc/cfgexpand.cc > >> index dd29ffffc03..09b8ec64cea 100644 > >> --- a/gcc/cfgexpand.cc > >> +++ b/gcc/cfgexpand.cc > >> @@ -2158,6 +2158,20 @@ expand_used_vars (bitmap forced_stack_vars) > >> frame_phase = off ? align - off : 0; > >> } > >> + /* Collect VARs on returns. */ > >> + if (DECL_RESULT (current_function_decl)) > >> + { > >> + edge_iterator ei; > >> + edge e; > >> + FOR_EACH_EDGE (e, ei, EXIT_BLOCK_PTR_FOR_FN (cfun)->preds) > >> + if (greturn *ret = safe_dyn_cast (last_stmt (e->src))) > >> + { > >> + tree val = gimple_return_retval (ret); > >> + if (val && VAR_P (val)) > >> + DECL_USEDBY_RETURN_P (val) = 1; you probably want to check && auto_var_in_fn (val, ...) since val might be global? > >> + } > >> + } > >> + > >> /* Set TREE_USED on all variables in the local_decls. */ > >> FOR_EACH_LOCAL_DECL (cfun, i, var) > >> TREE_USED (var) = 1; > >> diff --git a/gcc/expr.cc b/gcc/expr.cc > >> index d9407432ea5..20973649963 100644 > >> --- a/gcc/expr.cc > >> +++ b/gcc/expr.cc > >> @@ -6045,6 +6045,52 @@ expand_assignment (tree to, tree from, bool nontemporal) > >> return; > >> } I miss an explanatory comment here on that the following is heuristics and its reasoning. > >> + if ((TREE_CODE (from) == PARM_DECL && DECL_INCOMING_RTL (from) > >> + && TYPE_MODE (TREE_TYPE (from)) == BLKmode Why check TYPE_MODE here? Do you want AGGREGATE_TYPE_P on the type instead? > >> + && (GET_CODE (DECL_INCOMING_RTL (from)) == PARALLEL > >> + || REG_P (DECL_INCOMING_RTL (from)))) > >> + || (VAR_P (to) && DECL_USEDBY_RETURN_P (to) > >> + && TYPE_MODE (TREE_TYPE (to)) == BLKmode Likewise. > >> + && GET_CODE (DECL_RTL (DECL_RESULT (current_function_decl))) > >> + == PARALLEL)) Not REG_P here? > >> + { > >> + push_temp_slots (); > >> + rtx par_ret; > >> + machine_mode mode; > >> + par_ret = TREE_CODE (from) == PARM_DECL > >> + ? DECL_INCOMING_RTL (from) > >> + : DECL_RTL (DECL_RESULT (current_function_decl)); > >> + mode = GET_CODE (par_ret) == PARALLEL > >> + ? GET_MODE (XEXP (XVECEXP (par_ret, 0, 0), 0)) > >> + : word_mode; > >> + int mode_size = GET_MODE_SIZE (mode).to_constant (); > >> + int size = INTVAL (expr_size (from)); > >> + > >> + /* If/How the parameter using submode, it dependes on the size and > >> + position of the parameter. Here using heurisitic number. */ > >> + int hurstc_num = 8; > > > > Where did this come from and what does it mean? > Sorry for does not make this clear. We know that an aggregate arg may be > on registers partially or totally, as assign_parm_adjust_entry_rtl. > For an example, if a parameter with 12 words and the target/ABI only > allow 8 gprs for arguments, then the parameter could use 8 regs at most > and left part in stack. I also wonder about the exact semantics of the parallels we get here. + int size = INTVAL (expr_size (from)); esp. when you use sth as simple as this. Shouldn't you instead look at to_rtx since that's already expanded? For returns that should be the desired layout to match 'from' to, no? Maybe it's better to not try sharing the code for both incoming and return copies for clarity? Also, what happens if there's a copy from a PARM_DECL to a DECL_USEDBY_RETURN_P decl? Which heuristic takes precedent? I think that at least the place of the copy improvement and the way you compute DECL_USEDBY_RETURN_P is reasonable. Thanks, Richard. > > > > > > Note that BLKmode subword values passed in registers can be either > > right or left justified.  I think you also need to worry about > > endianness here. > Since the subword is used to move block(read from source mem and then > store to destination mem with register mode), and this would keep to use > the same endianness on reg like move_block_from_reg. So, the patch does > not check the endianness. > > If any concerns and sugguestions, please point out, thanks! > > BR, > Jeff (Jiufu) > > > > > > Jeff > -- Richard Biener SUSE Software Solutions Germany GmbH, Frankenstrasse 146, 90461 Nuernberg, Germany; GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman; HRB 36809 (AG Nuernberg) ---1609957120-603841475-1669275075=:3995--