From: Jiufu Guo
To: gcc-patches@gcc.gnu.org
Cc: segher@kernel.crashing.org, dje.gcc@gmail.com, linkw@gcc.gnu.org, rguenther@suse.de, jeffreyalaw@gmail.com
Subject: Re: [PATCH V2] Update block move for struct param or returns
References: <20221124094148.125303-1-guojiufu@linux.ibm.com>
Date: Fri, 25 Nov 2022 13:12:16 +0800
In-Reply-To: <20221124094148.125303-1-guojiufu@linux.ibm.com> (Jiufu Guo's message of "Thu, 24 Nov 2022 17:41:48 +0800")
Message-ID: <7esfi7r51r.fsf@pike.rch.stglabs.ibm.com>

Based on the discussions in previous mails:
https://gcc.gnu.org/pipermail/gcc-patches/2022-November/607139.html
https://gcc.gnu.org/pipermail/gcc-patches/2022-November/607197.html

I will update the patch accordingly, and then submit a new version.

BR,
Jeff (Jiufu)

Jiufu Guo writes:

> Hi,
>
> When assigning a parameter to a variable, or assigning a variable to
> the return value with struct type, a "block move" is used to expand
> the assignment. It would be better to use the register mode according
> to the target/ABI to move the blocks. This would then open up more
> opportunities for other optimization passes (cse/dse/xprop).
>
> As the example code (like code in PR65421):
>
> typedef struct SA {double a[3];} A;
> A ret_arg_pt (A *a){return *a;} // on ppc64le, only 3 lfd(s)
> A ret_arg (A a) {return a;} // just empty fun body
> void st_arg (A a, A *p) {*p = a;} // only 3 stfd(s)
>
> This patch is based on the previous version, which supports assignments
> from parameters:
> https://gcc.gnu.org/pipermail/gcc-patches/2022-November/605709.html
> This patch also supports returns.
>
> I also tried to update gimplify/nrv to replace "return D.xxx;" with
> "return <retval>;". However, there is one issue: "<retval>" with
> PARALLEL code cannot be accessed through address/component_ref.
> This issue blocks a few passes (e.g. sra, expand).
>
> On ppc64, some dead stores are not eliminated. e.g.
> for ret_arg:
>         .cfi_startproc
>         std 4,56(1) // redundant
>         std 5,64(1) // redundant
>         std 6,72(1) // redundant
>         std 4,0(3)
>         std 5,8(3)
>         std 6,16(3)
>         blr
>
> Bootstrapped and regtested on ppc64le and x86_64.
>
> I'm wondering if this patch could be committed first.
> Thanks for the comments and suggestions.
>
>
> BR,
> Jeff (Jiufu)
>
>         PR target/65421
>
> gcc/ChangeLog:
>
>         * cfgexpand.cc (expand_used_vars): Add collecting return VARs.
>         (expand_gimple_stmt_1): Call expand_special_struct_assignment.
>         (pass_expand::execute): Free collections of return VARs.
>         * expr.cc (expand_special_struct_assignment): New function.
>         * expr.h (expand_special_struct_assignment): Declare.
>
> gcc/testsuite/ChangeLog:
>
>         * gcc.target/powerpc/pr65421-1.c: New test.
>         * gcc.target/powerpc/pr65421.c: New test.
>
> ---
>  gcc/cfgexpand.cc                             | 37 +++++++++++++++++
>  gcc/expr.cc                                  | 43 ++++++++++++++++++++
>  gcc/expr.h                                   |  3 ++
>  gcc/testsuite/gcc.target/powerpc/pr65421-1.c | 21 ++++++++++
>  gcc/testsuite/gcc.target/powerpc/pr65421.c   | 19 +++++++++
>  5 files changed, 123 insertions(+)
>  create mode 100644 gcc/testsuite/gcc.target/powerpc/pr65421-1.c
>  create mode 100644 gcc/testsuite/gcc.target/powerpc/pr65421.c
>
> diff --git a/gcc/cfgexpand.cc b/gcc/cfgexpand.cc
> index dd29ffffc03..f185de39341 100644
> --- a/gcc/cfgexpand.cc
> +++ b/gcc/cfgexpand.cc
> @@ -341,6 +341,9 @@ static hash_map<tree, size_t> *decl_to_stack_part;
>     all of them in one big sweep.  */
>  static bitmap_obstack stack_var_bitmap_obstack;
>
> +/* Those VARs on returns.  */
> +static bitmap return_vars;
> +
>  /* An array of indices such that stack_vars[stack_vars_sorted[i]].size
>     is non-decreasing.  */
>  static size_t *stack_vars_sorted;
> @@ -2158,6 +2161,24 @@ expand_used_vars (bitmap forced_stack_vars)
>        frame_phase = off ? align - off : 0;
>      }
>
> +  /* Collect VARs on returns.  */
> +  return_vars = NULL;
> +  if (DECL_RESULT (current_function_decl)
> +      && TYPE_MODE (TREE_TYPE (DECL_RESULT (current_function_decl))) == BLKmode)
> +    {
> +      return_vars = BITMAP_ALLOC (NULL);
> +
> +      edge_iterator ei;
> +      edge e;
> +      FOR_EACH_EDGE (e, ei, EXIT_BLOCK_PTR_FOR_FN (cfun)->preds)
> +        if (greturn *ret = safe_dyn_cast <greturn *> (last_stmt (e->src)))
> +          {
> +            tree val = gimple_return_retval (ret);
> +            if (val && VAR_P (val))
> +              bitmap_set_bit (return_vars, DECL_UID (val));
> +          }
> +    }
> +
>    /* Set TREE_USED on all variables in the local_decls.  */
>    FOR_EACH_LOCAL_DECL (cfun, i, var)
>      TREE_USED (var) = 1;
> @@ -3942,6 +3963,17 @@ expand_gimple_stmt_1 (gimple *stmt)
>                /* This is a clobber to mark the going out of scope for
>                   this LHS.  */
>                expand_clobber (lhs);
> +            else if ((TREE_CODE (rhs) == PARM_DECL && DECL_INCOMING_RTL (rhs)
> +                      && TYPE_MODE (TREE_TYPE (rhs)) == BLKmode
> +                      && (GET_CODE (DECL_INCOMING_RTL (rhs)) == PARALLEL
> +                          || REG_P (DECL_INCOMING_RTL (rhs))))
> +                     || (VAR_P (lhs) && return_vars
> +                         && DECL_RTL_SET_P (DECL_RESULT (current_function_decl))
> +                         && GET_CODE (
> +                              DECL_RTL (DECL_RESULT (current_function_decl)))
> +                              == PARALLEL
> +                         && bitmap_bit_p (return_vars, DECL_UID (lhs))))
> +              expand_special_struct_assignment (lhs, rhs);
>              else
>                expand_assignment (lhs, rhs,
>                                   gimple_assign_nontemporal_move_p (
> @@ -7025,6 +7057,11 @@ pass_expand::execute (function *fun)
>    /* After expanding, the return labels are no longer needed. */
>    return_label = NULL;
>    naked_return_label = NULL;
> +  if (return_vars)
> +    {
> +      BITMAP_FREE (return_vars);
> +      return_vars = NULL;
> +    }
>
>    /* After expanding, the tm_restart map is no longer needed.  */
>    if (fun->gimple_df->tm_restart)
> diff --git a/gcc/expr.cc b/gcc/expr.cc
> index d9407432ea5..6ffd9439188 100644
> --- a/gcc/expr.cc
> +++ b/gcc/expr.cc
> @@ -5559,6 +5559,49 @@ mem_ref_refers_to_non_mem_p (tree ref)
>    return non_mem_decl_p (base);
>  }
>
> +/* Expand the assignment from parameter or to returns if it needs
> +   "block move" on struct type.  */
> +
> +void
> +expand_special_struct_assignment (tree to, tree from)
> +{
> +  rtx result;
> +
> +  push_temp_slots ();
> +  rtx par_ret = TREE_CODE (from) == PARM_DECL
> +                  ? DECL_INCOMING_RTL (from)
> +                  : DECL_RTL (DECL_RESULT (current_function_decl));
> +  machine_mode mode = GET_CODE (par_ret) == PARALLEL
> +                        ? GET_MODE (XEXP (XVECEXP (par_ret, 0, 0), 0))
> +                        : word_mode;
> +  int mode_size = GET_MODE_SIZE (mode).to_constant ();
> +  int size = INTVAL (expr_size (from));
> +  rtx to_rtx = expand_expr (to, NULL_RTX, VOIDmode, EXPAND_WRITE);
> +
> +  /* Here using a heuristic number for how many words may pass via gprs.  */
> +  int hurstc_num = 8;
> +  if (size < mode_size || (size % mode_size) != 0
> +      || (GET_CODE (par_ret) != PARALLEL && size > (mode_size * hurstc_num)))
> +    result = store_expr (from, to_rtx, 0, false, false);
> +  else
> +    {
> +      rtx from_rtx
> +        = expand_expr (from, NULL_RTX, GET_MODE (to_rtx), EXPAND_NORMAL);
> +      for (int i = 0; i < size / mode_size; i++)
> +        {
> +          rtx temp = gen_reg_rtx (mode);
> +          rtx src = adjust_address (from_rtx, mode, mode_size * i);
> +          rtx dest = adjust_address (to_rtx, mode, mode_size * i);
> +          emit_move_insn (temp, src);
> +          emit_move_insn (dest, temp);
> +        }
> +      result = to_rtx;
> +    }
> +
> +  preserve_temp_slots (result);
> +  pop_temp_slots ();
> +}
> +
>  /* Expand an assignment that stores the value of FROM into TO.  If NONTEMPORAL
>     is true, try generating a nontemporal store.  */
>
> diff --git a/gcc/expr.h b/gcc/expr.h
> index 08b59b8d869..10527f23a56 100644
> --- a/gcc/expr.h
> +++ b/gcc/expr.h
> @@ -281,6 +281,9 @@ extern void get_bit_range (poly_uint64_pod *, poly_uint64_pod *, tree,
>  /* Expand an assignment that stores the value of FROM into TO.  */
>  extern void expand_assignment (tree, tree, bool);
>
> +/* Expand an assignment from parameters or to returns.  */
> +extern void expand_special_struct_assignment (tree, tree);
> +
>  /* Generate code for computing expression EXP,
>     and storing the value into TARGET.
>     If SUGGEST_REG is nonzero, copy the value through a register
> diff --git a/gcc/testsuite/gcc.target/powerpc/pr65421-1.c b/gcc/testsuite/gcc.target/powerpc/pr65421-1.c
> new file mode 100644
> index 00000000000..f55a0fe0002
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/pr65421-1.c
> @@ -0,0 +1,21 @@
> +/* PR target/65421 */
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -m64" } */
> +
> +typedef struct SA
> +{
> +  double a[3];
> +  long l;
> +} A;
> +
> +A ret_arg_pt (A *a){return *a;}
> +
> +A ret_arg (A a) {return a;}
> +
> +void st_arg (A a, A *p) {*p = a;}
> +
> +/* { dg-final { scan-assembler-times {\mlxvd2x\M} 2 } } */
> +/* { dg-final { scan-assembler-times {\mstxvd2x\M} 2 } } */
> +/* { dg-final { scan-assembler-times {\mstd\M} 8 } } */
> +/* { dg-final { scan-assembler-times {\mblr\M} 3 } } */
> +/* { dg-final { scan-assembler-times {(?n)^\s+[a-z]} 16 } } */
> diff --git a/gcc/testsuite/gcc.target/powerpc/pr65421.c b/gcc/testsuite/gcc.target/powerpc/pr65421.c
> new file mode 100644
> index 00000000000..26e85468470
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/pr65421.c
> @@ -0,0 +1,19 @@
> +/* PR target/65421 */
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -m64" } */
> +
> +typedef struct SA
> +{
> +  double a[3];
> +} A;
> +
> +A ret_arg_pt (A *a){return *a;}
> +
> +A ret_arg (A a) {return a;}
> +
> +void st_arg (A a, A *p) {*p = a;}
> +
> +/* { dg-final { scan-assembler-times {\mlfd\M} 3 } } */
> +/* { dg-final { scan-assembler-times {\mstfd\M} 3 } } */
> +/* { dg-final { scan-assembler-times {\mblr\M} 3 } } */
> +/* { dg-final { scan-assembler-times {(?n)^\s+[a-z]} 9 } } */
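
For readers following the thread, the size checks and the per-chunk copy loop
in expand_special_struct_assignment can be modelled in plain C. This is only a
sketch under stated assumptions, not GCC code: copy_by_chunks, MAX_CHUNK,
HEURISTIC_CHUNKS and the in_parallel flag are hypothetical names, memcpy stands
in for both the store_expr fallback and the register moves, and the chunk size
plays the role of mode_size from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical names throughout; this models the patch's decision logic
   in plain C and does not use any GCC internals.  */
enum { MAX_CHUNK = 16, HEURISTIC_CHUNKS = 8 };

/* Copy SIZE bytes from SRC to DST in CHUNK-sized pieces when the same
   conditions as in the patch hold; otherwise fall back to one block copy.
   IN_PARALLEL models "the incoming/return RTL is a PARALLEL".
   Returns 1 for the chunked path, 0 for the fallback.  */
static int
copy_by_chunks (void *dst, const void *src, size_t size,
                size_t chunk, int in_parallel)
{
  assert (dst != NULL && src != NULL);

  if (chunk == 0 || chunk > MAX_CHUNK          /* guard for this sketch only */
      || size < chunk || size % chunk != 0     /* same tests as the patch */
      || (!in_parallel && size > chunk * HEURISTIC_CHUNKS))
    {
      memcpy (dst, src, size);                 /* stands in for store_expr */
      return 0;
    }

  for (size_t i = 0; i < size / chunk; i++)
    {
      unsigned char temp[MAX_CHUNK];           /* stands in for the pseudo reg */
      memcpy (temp, (const unsigned char *) src + i * chunk, chunk);
      memcpy ((unsigned char *) dst + i * chunk, temp, chunk);
    }
  return 1;
}
```

With the struct from pr65421.c (three doubles, 24 bytes) and an 8-byte chunk,
the sketch takes the chunked path with three load/store pairs, which mirrors
the three lfd/stfd instructions the test expects. A size that is not a
multiple of the chunk, or a non-PARALLEL source larger than eight chunks,
takes the block-copy fallback.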