From: "Kewen.Lin"
Date: Thu, 1 Jun 2023 13:24:28 +0800
Subject: Re: [PATCH V3] VECT: Change flow of decrement IV
To: "juzhe.zhong@rivai.ai"
Cc: "richard.sandiford", rguenther, gcc-patches
References: <20230601043617.173986-1-juzhe.zhong@rivai.ai> <489BFF1AE94B6B81+2023060113000785279135@rivai.ai>
In-Reply-To: <489BFF1AE94B6B81+2023060113000785279135@rivai.ai>

Hi,

on 2023/6/1 13:00, juzhe.zhong@rivai.ai wrote:
> This patch is no difference from V2.

I support this patch based on the testing and SPEC2017 evaluation
results on Power (see my comments on patch v2).

> Just add PR tree-optimization/109971 as Kewen's suggested.

Thanks for adding that. I was expecting you to add it when committing,
not really requesting a new version. :)

btw, the PR marker(s) trigger scripts that post some commit info (commit
link, commit log) as a comment on the specified PR(s), so people can
easily find the connections between PRs and the commits that fix them
or make progress on them.

BR,
Kewen

>
> Already bootstrapped and Regression on X86 no difference.
>
> Ok for trunk?
>
> juzhe.zhong@rivai.ai
>
> *From:* juzhe.zhong
> *Date:* 2023-06-01 12:36
> *To:* gcc-patches
> *CC:* richard.sandiford; rguenther; linkw; Ju-Zhe Zhong
> *Subject:* [PATCH V3] VECT: Change flow of decrement IV
> From: Ju-Zhe Zhong
>
> Follow Richi's suggestion, I change current decrement IV flow from:
>
> do {
>    remain -= MIN (vf, remain);
> } while (remain != 0);
>
> into:
>
> do {
>    old_remain = remain;
>    len = MIN (vf, remain);
>    remain -= vf;
> } while (old_remain >= vf);
>
> to enhance SCEV.
>
> Include fixes from kewen.
>
>
> This patch will need to wait for Kewen's test feedback.
>
> Testing on X86 is on-going
>
> Co-Authored by: Kewen Lin
>
>   PR tree-optimization/109971
>
> gcc/ChangeLog:
>
>         * tree-vect-loop-manip.cc (vect_set_loop_controls_directly): Change decrement IV flow.
>         (vect_set_loop_condition_partial_vectors): Ditto.
>
> ---
> gcc/tree-vect-loop-manip.cc | 36 +++++++++++++++++++++++++-----------
> 1 file changed, 25 insertions(+), 11 deletions(-)
>
> diff --git a/gcc/tree-vect-loop-manip.cc b/gcc/tree-vect-loop-manip.cc
> index acf3642ceb2..3f735945e67 100644
> --- a/gcc/tree-vect-loop-manip.cc
> +++ b/gcc/tree-vect-loop-manip.cc
> @@ -483,7 +483,7 @@ vect_set_loop_controls_directly (class loop *loop, loop_vec_info loop_vinfo,
>                                   gimple_stmt_iterator loop_cond_gsi,
>                                   rgroup_controls *rgc, tree niters,
>                                   tree niters_skip, bool might_wrap_p,
> -                                 tree *iv_step)
> +                                 tree *iv_step, tree *compare_step)
>  {
>    tree compare_type = LOOP_VINFO_RGROUP_COMPARE_TYPE (loop_vinfo);
>    tree iv_type = LOOP_VINFO_RGROUP_IV_TYPE (loop_vinfo);
> @@ -538,9 +538,9 @@ vect_set_loop_controls_directly (class loop *loop, loop_vec_info loop_vinfo,
>    ...
>    vect__4.8_28 = .LEN_LOAD (_17, 32B, _36, 0);
>    ...
> -   ivtmp_35 = ivtmp_9 - _36;
> +   ivtmp_35 = ivtmp_9 - POLY_INT_CST [4, 4];
>    ...
> -   if (ivtmp_35 != 0)
> +   if (ivtmp_9 > POLY_INT_CST [4, 4])
>      goto ; [83.33%]
>    else
>      goto ; [16.67%]
> @@ -549,13 +549,15 @@ vect_set_loop_controls_directly (class loop *loop, loop_vec_info loop_vinfo,
>        tree step = rgc->controls.length () == 1 ? rgc->controls[0]
>        : make_ssa_name (iv_type);
>        /* Create decrement IV.  */
> -      create_iv (nitems_total, MINUS_EXPR, step, NULL_TREE, loop, &incr_gsi,
> -                 insert_after, &index_before_incr, &index_after_incr);
> +      create_iv (nitems_total, MINUS_EXPR, nitems_step, NULL_TREE, loop,
> +                 &incr_gsi, insert_after, &index_before_incr,
> +                 &index_after_incr);
>        gimple_seq_add_stmt (header_seq, gimple_build_assign (step, MIN_EXPR,
>     index_before_incr,
>     nitems_step));
>        *iv_step = step;
> -      return index_after_incr;
> +      *compare_step = nitems_step;
> +      return index_before_incr;
>      }
>    /* Create increment IV.  */
> @@ -825,6 +827,7 @@ vect_set_loop_condition_partial_vectors (class loop *loop,
>       arbitrarily pick the last.  */
>    tree test_ctrl = NULL_TREE;
>    tree iv_step = NULL_TREE;
> +  tree compare_step = NULL_TREE;
>    rgroup_controls *rgc;
>    rgroup_controls *iv_rgc = nullptr;
>    unsigned int i;
> @@ -861,7 +864,7 @@ vect_set_loop_condition_partial_vectors (class loop *loop,
>          &preheader_seq, &header_seq,
>          loop_cond_gsi, rgc, niters,
>          niters_skip, might_wrap_p,
> -        &iv_step);
> +        &iv_step, &compare_step);
>     iv_rgc = rgc;
>   }
> @@ -884,10 +887,21 @@ vect_set_loop_condition_partial_vectors (class loop *loop,
>    /* Get a boolean result that tells us whether to iterate.  */
>    edge exit_edge = single_exit (loop);
> -  tree_code code = (exit_edge->flags & EDGE_TRUE_VALUE) ? EQ_EXPR : NE_EXPR;
> -  tree zero_ctrl = build_zero_cst (TREE_TYPE (test_ctrl));
> -  gcond *cond_stmt = gimple_build_cond (code, test_ctrl, zero_ctrl,
> -                                        NULL_TREE, NULL_TREE);
> +  gcond *cond_stmt;
> +  if (LOOP_VINFO_USING_DECREMENTING_IV_P (loop_vinfo))
> +    {
> +      gcc_assert (compare_step);
> +      tree_code code = (exit_edge->flags & EDGE_TRUE_VALUE) ? LE_EXPR : GT_EXPR;
> +      cond_stmt = gimple_build_cond (code, test_ctrl, compare_step, NULL_TREE,
> +                                     NULL_TREE);
> +    }
> +  else
> +    {
> +      tree_code code = (exit_edge->flags & EDGE_TRUE_VALUE) ? EQ_EXPR : NE_EXPR;
> +      tree zero_ctrl = build_zero_cst (TREE_TYPE (test_ctrl));
> +      cond_stmt
> +        = gimple_build_cond (code, test_ctrl, zero_ctrl, NULL_TREE, NULL_TREE);
> +    }
>    gsi_insert_before (&loop_cond_gsi, cond_stmt, GSI_SAME_STMT);
>    /* The loop iterates (NITERS - 1) / VF + 1 times.
> --
> 2.36.3
>
>
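
For reference, the flow change described in the quoted commit message can
be sketched outside GCC as below. This is only an illustrative model, not
GCC code: the helper names lens_old_flow and lens_new_flow are
hypothetical, vf is modelled as a plain unsigned integer rather than a
POLY_INT_CST, and the exit test uses '>' to match the GT_EXPR built in the
diff (the pseudocode in the commit message writes '>='). Under those
assumptions, both loops should hand out the same sequence of lengths.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper (not GCC code): the old flow from the commit message.
// The IV is decremented by the variable length, so the loop ends at zero.
std::vector<std::uint64_t> lens_old_flow(std::uint64_t remain, std::uint64_t vf) {
  assert(vf != 0 && remain != 0);  // assumption: at least one item to process
  std::vector<std::uint64_t> lens;
  do {
    std::uint64_t len = std::min(vf, remain);
    lens.push_back(len);
    remain -= len;
  } while (remain != 0);
  return lens;
}

// Hypothetical helper (not GCC code): the new flow. The IV is decremented by
// the loop-invariant step vf, and the exit test compares the value from
// before the decrement against vf.
std::vector<std::uint64_t> lens_new_flow(std::uint64_t remain, std::uint64_t vf) {
  assert(vf != 0 && remain != 0);  // assumption: at least one item to process
  std::vector<std::uint64_t> lens;
  std::uint64_t old_remain;
  do {
    old_remain = remain;
    lens.push_back(std::min(vf, remain));
    // Unsigned wrap is possible on the final pass; the wrapped value is
    // never read because the loop exits right after.
    remain -= vf;
  } while (old_remain > vf);
  return lens;
}
```

For example, lens_new_flow(10, 4) and lens_old_flow(10, 4) both yield
4, 4, 2. In this model, using '>=' as the exit test would run one extra
pass with length 0 whenever remain is an exact multiple of vf (e.g.
remain = 8, vf = 4).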