Subject: Re: [PATCH] Don't peel extra copy of loop in unroller for loops with exit at end
To: Andrew Pinski
Cc: GCC Patches
From: Pat Haugen
Date: Mon, 17 Oct 2016 14:03:00 -0000

On 10/14/2016 10:29 PM, Andrew Pinski wrote:
>>> This patch bumps the iteration count by 1 for loops with the exit at
>>> the end so that it represents the number of times the loop body is
>>> executed, and therefore removes the need to always execute that first
>>> peeled copy. With this change, when the number of executions of the
>>> loop is an even multiple of the unroll factor then the code will jump
>>> to the unrolled loop immediately instead of executing all the switch
>>> code and peeled copies of the loop and then falling into the unrolled
>>> loop. This change also reduces code size by removing a peeled copy of
>>> the loop.
>>>
>>> Bootstrap/regtest on powerpc64le with no new regressions. Ok for trunk?
>>
>> This patch or
>>         PR rtl-optimization/68212
>>         * cfgloopmanip.c (duplicate_loop_to_header_edge): Use preheader edge
>>         frequency when computing scale factor for peeled copies.
>>         * loop-unroll.c (unroll_loop_runtime_iterations): Fix freq/count
>>         values for switch/peel blocks/edges.
>>
>> Caused a ~2.7-3.5% regression in coremarks with -funroll-all-loops.
> I should say on ThunderX (aarch64-linux-gnu).

Sorry to hear about the degradation. Do you have more details on which
patch and/or what specifically causes the degradation? This patch should
only affect the execution path outside the unrolled loop (worst case is
probably for loops that execute once). The pr68212 patch is just
correcting some of the block frequency/count issues, so they're not as
screwed up as they were.

Thanks,
Pat
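
For reference, here is a rough source-level analogy of the control flow described in the quoted patch summary. It is only a sketch: the real transformation happens on RTL in loop-unroll.c, and the function names, the array-sum body, and the unroll factor of 4 are all made up for illustration. It shows the "after" shape, where a remainder of zero goes straight to the unrolled loop and the peeled copies run only for the leftover iterations.

```c
#include <stddef.h>

enum { UNROLL = 4 };  /* hypothetical unroll factor, chosen for illustration */

/* Reference loop with the exit test at the end; the body runs n times
   (n >= 1 is assumed, as for any do-while loop). */
static long sum_simple(const int *a, size_t n)
{
    long s = 0;
    size_t i = 0;
    do {
        s += a[i++];
    } while (i < n);
    return s;
}

/* Hand-written analogy of runtime unrolling: a switch selects how many
   peeled copies of the body to run, then the unrolled loop handles the
   rest in groups of UNROLL. */
static long sum_unrolled(const int *a, size_t n)
{
    long s = 0;
    size_t i = 0;

    /* Execution path outside the unrolled loop. When n is an exact
       multiple of UNROLL, no peeled copy is executed at all. */
    switch (n % UNROLL) {
    case 3: s += a[i++]; /* fall through */
    case 2: s += a[i++]; /* fall through */
    case 1: s += a[i++]; /* fall through */
    default: break;
    }

    /* The remaining count (n - i) is a multiple of UNROLL, possibly zero. */
    while (i < n) {
        s += a[i++];
        s += a[i++];
        s += a[i++];
        s += a[i++];
    }
    return s;
}
```

In this shape a loop that executes once (n == 1) still pays for the remainder computation and the switch, and never enters the unrolled loop, which is the worst case mentioned above.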