From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mx0a-001b2d01.pphosted.com (mx0a-001b2d01.pphosted.com [148.163.156.1]) by sourceware.org (Postfix) with ESMTPS id 02D683858C27 for ; Wed, 1 Sep 2021 03:30:44 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 02D683858C27 Received: from pps.filterd (m0098409.ppops.net [127.0.0.1]) by mx0a-001b2d01.pphosted.com (8.16.0.43/8.16.0.43) with SMTP id 18133a6r124550; Tue, 31 Aug 2021 23:30:43 -0400 Received: from pps.reinject (localhost [127.0.0.1]) by mx0a-001b2d01.pphosted.com with ESMTP id 3at17f8h7q-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Tue, 31 Aug 2021 23:30:43 -0400 Received: from m0098409.ppops.net (m0098409.ppops.net [127.0.0.1]) by pps.reinject (8.16.0.43/8.16.0.43) with SMTP id 181354Vt130379; Tue, 31 Aug 2021 23:30:43 -0400 Received: from ppma02dal.us.ibm.com (a.bd.3ea9.ip4.static.sl-reverse.com [169.62.189.10]) by mx0a-001b2d01.pphosted.com with ESMTP id 3at17f8h7c-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Tue, 31 Aug 2021 23:30:43 -0400 Received: from pps.filterd (ppma02dal.us.ibm.com [127.0.0.1]) by ppma02dal.us.ibm.com (8.16.1.2/8.16.1.2) with SMTP id 1813IieM026688; Wed, 1 Sep 2021 03:30:42 GMT Received: from b03cxnp07027.gho.boulder.ibm.com (b03cxnp07027.gho.boulder.ibm.com [9.17.130.14]) by ppma02dal.us.ibm.com with ESMTP id 3aqcsdnvt3-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Wed, 01 Sep 2021 03:30:42 +0000 Received: from b03ledav006.gho.boulder.ibm.com (b03ledav006.gho.boulder.ibm.com [9.17.130.237]) by b03cxnp07027.gho.boulder.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id 1813UfNV26280410 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Wed, 1 Sep 2021 03:30:41 GMT Received: from b03ledav006.gho.boulder.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id 37FB4C605D; Wed, 1 Sep 2021 03:30:41 +0000 (GMT) Received: from b03ledav006.gho.boulder.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id C64C1C6055; Wed, 1 Sep 2021 03:30:40 +0000 (GMT) Received: from pike (unknown [9.5.12.127]) by b03ledav006.gho.boulder.ibm.com (Postfix) with ESMTPS; Wed, 1 Sep 2021 03:30:40 +0000 (GMT) From: Jiufu Guo To: Richard Biener Cc: gcc-patches@gcc.gnu.org, amker.cheng@gmail.com, wschmidt@linux.ibm.com, segher@kernel.crashing.org, dje.gcc@gmail.com, jlaw@tachyum.com Subject: Re: [PATCH] Set bound/cmp/control for until wrap loop. References: <20210830061535.146396-1-guojiufu@linux.ibm.com> <70ea087486feb85536100fa538c877f4@imap.linux.ibm.com> Date: Wed, 01 Sep 2021 11:30:37 +0800 In-Reply-To: (Richard Biener's message of "Tue, 31 Aug 2021 15:37:39 +0200 (CEST)") Message-ID: <7esfypezvm.fsf@pike.rch.stglabs.ibm.com> User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/25.2 (gnu/linux) MIME-Version: 1.0 Content-Type: text/plain; format=flowed X-TM-AS-GCONF: 00 X-Proofpoint-GUID: oUSMzo4F1scIM8X56zWPKcoeiR0M1PVR X-Proofpoint-ORIG-GUID: Rz0JkB0uJLj-8n1jIhGYE2ZK7GeU_-2T X-Proofpoint-Virus-Version: vendor=fsecure engine=2.50.10434:6.0.391, 18.0.790 definitions=2021-08-31_10:2021-08-31, 2021-08-31 signatures=0 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 bulkscore=0 adultscore=0 mlxscore=0 suspectscore=0 phishscore=0 priorityscore=1501 impostorscore=0 spamscore=0 malwarescore=0 lowpriorityscore=0 mlxlogscore=999 clxscore=1015 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2107140000 definitions=main-2109010016 X-Spam-Status: No, score=-11.9 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_EF, GIT_PATCH_0, RCVD_IN_MSPIKE_H3, RCVD_IN_MSPIKE_WL, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.4 X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 01 Sep 2021 03:30:55 -0000 Richard Biener writes: > On Tue, 31 Aug 2021, guojiufu wrote: > >> On 2021-08-30 20:02, Richard Biener wrote: >> > On Mon, 30 Aug 2021, guojiufu wrote: >> > >> >> On 2021-08-30 14:15, Jiufu Guo wrote: >> >> > Hi, >> >> > >> >> > In patch r12-3136, niter->control, niter->bound and >> >> > niter->cmp are >> >> > derived from number_of_iterations_lt. While for 'until >> >> > wrap condition', >> >> > the calculation in number_of_iterations_lt is not align >> >> > the requirements >> >> > on the define of them and requirements in >> >> > determine_exit_conditions. >> >> > >> >> > This patch calculate niter->control, niter->bound and >> >> > niter->cmp in >> >> > number_of_iterations_until_wrap. >> >> > >> >> > The ICEs in the PR are pass with this patch. >> >> > Bootstrap and reg-tests pass on ppc64/ppc64le and x86. >> >> > Is this ok for trunk? >> >> > >> >> > BR. >> >> > Jiufu Guo >> >> > >> >> Add ChangeLog: >> >> > create mode 100644 gcc/testsuite/gcc.dg/pr102087.c >> >> > >> >> > diff --git a/gcc/tree-ssa-loop-niter.c >> >> > b/gcc/tree-ssa-loop-niter.c >> >> > index 7af92d1c893..747f04d3ce0 100644 >> >> > --- a/gcc/tree-ssa-loop-niter.c >> >> > +++ b/gcc/tree-ssa-loop-niter.c >> >> > @@ -1482,7 +1482,7 @@ number_of_iterations_until_wrap >> >> > (class loop *, >> >> > tree type, affine_iv *iv0, >> >> > affine_iv *iv1, class >> >> > tree_niter_desc *niter) >> >> > { >> >> > tree niter_type = unsigned_type_for (type); >> >> > - tree step, num, assumptions, may_be_zero; >> >> > + tree step, num, assumptions, may_be_zero, span; >> >> > wide_int high, low, max, min; >> >> > >> >> > may_be_zero = fold_build2 (LE_EXPR, boolean_type_node, >> >> > iv1->base, >> >> > iv0->base); >> >> > @@ -1513,6 +1513,8 @@ number_of_iterations_until_wrap >> >> > (class loop *, >> >> > tree type, affine_iv *iv0, >> >> > low = wi::to_wide (iv0->base); >> >> > else >> >> > low = min; >> >> > + >> >> > + niter->control = *iv1; >> >> > } >> >> > /* {base, -C} < n. */ >> >> > else if (tree_int_cst_sign_bit (iv0->step) && >> >> > integer_zerop >> >> > (iv1->step)) >> >> > @@ -1533,6 +1535,8 @@ number_of_iterations_until_wrap >> >> > (class loop *, >> >> > tree type, affine_iv *iv0, >> >> > high = wi::to_wide (iv1->base); >> >> > else >> >> > high = max; >> >> > + >> >> > + niter->control = *iv0; >> >> > } >> >> > else >> >> > return false; >> > >> > it looks like the above two should already be in effect from >> > the >> > caller (guarding with integer_nozerop)? >> >> I add them just because set these fields in one function. >> Yes, they have been set in caller already, I could remove them >> here. >> >> > >> >> > @@ -1556,6 +1560,14 @@ number_of_iterations_until_wrap >> >> > (class loop *, >> >> > tree type, affine_iv *iv0, >> >> > niter->assumptions, assumptions); >> >> > >> >> > niter->control.no_overflow = false; >> >> > + niter->control.base = fold_build2 (MINUS_EXPR, >> >> > niter_type, >> >> > + niter->control.base, >> >> > niter->control.step); >> > >> > how do we know IVn - STEP doesn't already wrap? >> >> The last IV value is just cross the max/min value of the type >> at the last iteration, then IVn - STEP is the nearest value >> to max(or min) and not wrap. >> >> > A comment might be >> > good to explain you're turning the simplified exit condition >> > into >> > >> > { IVbase - STEP, +, STEP } != niter * STEP + (IVbase - >> > STEP) >> > >> > which, when mathematically looking at it makes me wonder why >> > there's >> > the seemingly redundant '- STEP' term? Also is NE_EXPR >> > really >> > correct since STEP might be not 1? Only for non equality >> > compares >> > the '- STEP' should matter? >> >> I need to add comments for this. This is a little tricky. >> The last value of the original IV just cross max/min at most >> one STEP, >> at there wrapping already happen. >> Using "{IVbase, +, STEP} != niter * STEP + IVbase" is not wrong >> in the aspect of exit condition. >> >> But this would not work well with existing code: >> like determine_exit_conditions, which will convert NE_EXP to >> LT_EXPR/GT_EXPR. And so, the '- STEP' is added to adjust the >> IV.base and bound, with '- STEP' the bound will be the last >> value >> just before wrap. > > Hmm. The control IV is documented as > > /* The simplified shape of the exit condition. The loop exits > if > CONTROL CMP BOUND is false, where CMP is one of NE_EXPR, > LT_EXPR, or GT_EXPR, and step of CONTROL is positive if CMP > is > LE_EXPR and negative if CMP is GE_EXPR. This information > is used > by loop unrolling. */ > affine_iv control; > > but determine_exit_conditions seems to assume the IV does not > wrap? Strictly speaking , I would say yes, determine_exit_conditions assume IV does not wrap: there is code: if (cmp == LT_EXPR) assum = fold_build2 (GE_EXPR, boolean_type_node, bound, fold_build2 (PLUS_EXPR, type, min, delta)); else xxxx This means if 'bound' is the value after wrap, the 'assum' with be false. This is also the reason that we may need to biase 'bound' and 'base' by 'step * 1'. Because, in our case like "while(n In fact determine_exit_conditions seems to just build ->base CMP > bound > where bound is the IV bound biased by #unroll * step - step. So > how > does biasing by step * 1 help? determine_exit_conditions adjust bound by #unroll * step - step for more check, like transform to "base cmp new-bound"; for this checking, 'bound' is ok with wrapped value, but for the 'assum' as above (assum = bound >= min + #unroll * step - step), biasing "step * 1" would be fine. > > Does the control IV wrap in our case? I think, this is a key question, which may affect if it make sense for above transform. Thanks! Let me explain my understanding: for original exit condition (like n > Richard. > >> Thanks again for your review! >> >> BR. >> Jiufu >> >> > >> > Richard. >> > >> >> > + span = fold_build2 (MULT_EXPR, niter_type, >> >> > niter->niter, >> >> > + fold_convert (niter_type, >> >> > niter->control.step)); >> >> > + niter->bound = fold_build2 (PLUS_EXPR, niter_type, >> >> > span, >> >> > + fold_convert (niter_type, >> >> > niter->control.base)); >> >> > + niter->bound = fold_convert (type, niter->bound); >> >> > + niter->cmp = NE_EXPR; >> >> > >> >> > return true; >> >> > } >> >> > diff --git a/gcc/testsuite/gcc.dg/pr102087.c >> >> > b/gcc/testsuite/gcc.dg/pr102087.c >> >> > new file mode 100644 >> >> > index 00000000000..ef1f9f5cba9 >> >> > --- /dev/null >> >> > +++ b/gcc/testsuite/gcc.dg/pr102087.c >> >> > @@ -0,0 +1,25 @@ >> >> > +/* { dg-do compile } */ >> >> > +/* { dg-options "-O3" } */ >> >> > + >> >> > +unsigned __attribute__ ((noinline)) >> >> > +foo (int *__restrict__ a, int *__restrict__ b, unsigned >> >> > l, unsigned n) >> >> > +{ >> >> > + while (n < ++l) >> >> > + *a++ = *b++ + 1; >> >> > + return l; >> >> > +} >> >> > + >> >> > +volatile int a[1]; >> >> > +unsigned b; >> >> > +int c; >> >> > + >> >> > +int >> >> > +check () >> >> > +{ >> >> > + int d; >> >> > + for (; b > 1; b++) >> >> > + for (c = 0; c < 2; c++) >> >> > + for (d = 0; d < 2; d++) >> >> > + a[0]; >> >> > + return 0; >> >> > +} >> >> > diff --git a/gcc/testsuite/gcc.dg/vect/pr101145_3.c >> >> > b/gcc/testsuite/gcc.dg/vect/pr101145_3.c >> >> > index 99289afec0b..40cb0240aaa 100644 >> >> > --- a/gcc/testsuite/gcc.dg/vect/pr101145_3.c >> >> > +++ b/gcc/testsuite/gcc.dg/vect/pr101145_3.c >> >> > @@ -1,5 +1,6 @@ >> >> > /* { dg-require-effective-target vect_int } */ >> >> > /* { dg-options "-O3 -fdump-tree-vect-details" } */ >> >> > + >> >> > #define TYPE int * >> >> > #define MIN ((TYPE)0) >> >> > #define MAX ((TYPE)((long long)-1)) >> >> > @@ -10,4 +11,5 @@ >> >> > >> >> > #include "pr101145.inc" >> >> > >> >> > -/* { dg-final { scan-tree-dump-times "vectorized 1 loops" >> >> > 2 "vect" } } >> >> > */ >> >> > +/* pointer size may not be vectorized, checking niter is >> >> > ok. */ >> >> > +/* { dg-final { scan-tree-dump "Symbolic number of >> >> > iterations is" "vect" >> >> > } >> >> > } */ >> >> >> >>