Subject: [PATCH v2] Fold (add -1; zero_ext; add +1) operations to zero_ext when not overflow (PR37451, part of PR61837)
To: Segher Boessenkool, luoxhu--- via Gcc-patches, linkw@gcc.gnu.org, joseph@codesourcery.com, richard.sandiford@arm.com, jakub@redhat.com
Cc: wschmidt@linux.ibm.com
References: <20200415084755.72653-1-luoxhu@linux.ibm.com> <20200417012140.GJ26902@gate.crashing.org> <20200417163202.GM26902@gate.crashing.org> <06f01ea4-2a9d-2314-0136-d333512c70dd@linux.ibm.com> <25612ac8-73f7-8cef-f2d6-8b220337dec3@linux.ibm.com>
From: luoxhu
Date: Tue, 12 May 2020 14:48:40 +0800
In-Reply-To: <25612ac8-73f7-8cef-f2d6-8b220337dec3@linux.ibm.com>

This is a minor refinement of the iteration non-overflow check, plus a
testcase, for stage 1.

The "subtract/extend/add" sequence has existed for a long time and still
annoys us (PR37451, part of PR61837) when converting from 32 bits to
64 bits, because the ctr register is used as a 64-bit register on
powerpc64.  Andrew Pinski had a patch, but it caused some issues and was
reverted by Joseph S. Myers (PR37451, PR37782).

Andrew:
http://gcc.gnu.org/ml/gcc-patches/2008-09/msg01070.html
http://gcc.gnu.org/ml/gcc-patches/2008-10/msg01321.html

Joseph:
https://gcc.gnu.org/legacy-ml/gcc-patches/2011-11/msg02405.html

We can simplify "subtract/extend/add" to a plain extend when the number
of loop iterations is known to be less than MODE_MAX-1 (the
simplification is NOT done when counter+0x1 would overflow).

Bootstrapped and regression tested on Power8-LE.

gcc/ChangeLog

2020-05-12  Xiong Hu Luo

	PR rtl-optimization/37451, part of PR target/61837
	* loop-doloop.c (doloop_modify): Simplify (add -1; zero_ext; add +1)
	to zero_ext when not wrapping overflow.

gcc/testsuite/ChangeLog

2020-05-12  Xiong Hu Luo

	PR rtl-optimization/37451, part of PR target/61837
	* gcc.target/powerpc/doloop-2.c: New test.
---
 gcc/loop-doloop.c                           | 46 ++++++++++++++++++++-
 gcc/testsuite/gcc.target/powerpc/doloop-2.c | 14 +++++++
 2 files changed, 59 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/doloop-2.c

diff --git a/gcc/loop-doloop.c b/gcc/loop-doloop.c
index db6a014e43d..16372382a22 100644
--- a/gcc/loop-doloop.c
+++ b/gcc/loop-doloop.c
@@ -477,7 +477,51 @@ doloop_modify (class loop *loop, class niter_desc *desc,
     }
 
   if (increment_count)
-    count = simplify_gen_binary (PLUS, mode, count, const1_rtx);
+    {
+      /* Fold (add -1; zero_ext; add +1) operations to zero_ext.  i.e:
+
+         73: r145:SI=r123:DI#0-0x1
+         74: r144:DI=zero_extend (r145:SI)
+         75: r143:DI=r144:DI+0x1
+         ...
+         31: r135:CC=cmp (r123:DI,0)
+         72: {pc={(r143:DI!=0x1)?L70:pc};r143:DI=r143:DI-0x1;clobber
+         scratch;clobber scratch;}
+
+         r123:DI#0-0x1 is param count derived from loop->niter_expr equal to the
+         loop iterations, if loop iterations expression doesn't overflow, then
+         (zero_extend (r123:DI#0-1))+1 could be simplified to zero_extend only.
+       */
+      bool simplify_zext = false;
+      rtx extop0 = XEXP (count, 0);
+      if (GET_CODE (count) == ZERO_EXTEND && GET_CODE (extop0) == PLUS)
+        {
+          rtx addop0 = XEXP (extop0, 0);
+          rtx addop1 = XEXP (extop0, 1);
+
+          int nonoverflow = 0;
+          unsigned int_mode
+            = GET_MODE_PRECISION (as_a GET_MODE (addop0));
+          unsigned HOST_WIDE_INT int_mode_max
+            = (HOST_WIDE_INT_1U << (int_mode - 1) << 1) - 1;
+          if (get_max_loop_iterations (loop, &iterations)
+              && wi::ltu_p (iterations, int_mode_max))
+            nonoverflow = 1;
+
+          if (nonoverflow
+              && CONST_SCALAR_INT_P (addop1)
+              && GET_MODE_PRECISION (mode) == int_mode * 2
+              && addop1 == GEN_INT (-1))
+            {
+              count = simplify_gen_unary (ZERO_EXTEND, mode, addop0,
+                                          GET_MODE (addop0));
+              simplify_zext = true;
+            }
+        }
+
+      if (!simplify_zext)
+        count = simplify_gen_binary (PLUS, mode, count, const1_rtx);
+    }
 
   /* Insert initialization of the count register into the loop header.  */
   start_sequence ();
diff --git a/gcc/testsuite/gcc.target/powerpc/doloop-2.c b/gcc/testsuite/gcc.target/powerpc/doloop-2.c
new file mode 100644
index 00000000000..dc8516bb0ab
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/doloop-2.c
@@ -0,0 +1,14 @@
+/* { dg-do compile { target powerpc*-*-* } } */
+/* { dg-options "-O2 -fno-unroll-loops" } */
+
+int f(int l, int *a)
+{
+  int i;
+  for(i = 0;i < l; i++)
+    a[i] = i;
+  return l;
+}
+
+/* { dg-final { scan-assembler-not "-1" } } */
+/* { dg-final { scan-assembler "bdnz" } } */
+/* { dg-final { scan-assembler-times "mtctr" 1 } } */
-- 
2.21.0.777.g83232e3864