To: GCC Development
From: "Kewen.Lin"
Subject: Question on tree LIM
Message-ID: <1338ef7b-57f4-a376-5827-c85392ed53a8@linux.ibm.com>
Date: Fri, 2 Jul 2021 11:33:14 +0800

Hi,

I am investigating a performance degradation in SPEC2017 exchange2_r: with loop vectorization enabled at -O2, it degrades by 6%. After some isolation, I found that it isn't directly caused by vectorization itself but is exposed by it: some statements for the vectorization condition checks are hoisted out of the loop, and they increase register pressure, which finally results in more spilling than before. If I simply disable tree lim4, the gap becomes smaller (just 40%+ of the original); if I further disable RTL LIM, it shrinks to 30% of the original.
It seems to indicate that there is some room for improvement in both LIMs.

From a quick scan of tree LIM, I noticed that it does not seem to take register pressure into account at all. Is that intentional? I am wondering what the design philosophy behind it is. Is it because register pressure is hard to model well at this point? If so, it seems to put the burden on the later RA, which then needs good rematerialization support.

btw, the example loop is at line 1150 of src exchange2.fppized.f90:

   1150    block(rnext:9, 7, i7) = block(rnext:9, 7, i7) + 10

The extra statements hoisted after vectorizing this loop (with the cheap cost model, btw) are:

    _686 = (integer(kind=8)) rnext_679;
    _1111 = (sizetype) _19;
    _1112 = _1111 * 12;
    _1927 = _1112 + 12;
  * _1895 = _1927 - _2650;
    _1113 = (unsigned long) rnext_679;
  * niters.6220_1128 = 10 - _1113;
  * _1021 = 9 - _1113;
  * bnd.6221_940 = niters.6220_1128 >> 2;
  * niters_vector_mult_vf.6222_939 = niters.6220_1128 & 18446744073709551612;
    _144 = niters_vector_mult_vf.6222_939 + _1113;
    tmp.6223_934 = (integer(kind=8)) _144;
    S.823_1004 = _1021 <= 2 ? _686 : tmp.6223_934;
  * ivtmp.6410_289 = (unsigned long) S.823_1004;

PS: * marks a statement whose result has a long live interval.

BR,
Kewen
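For anyone who wants to follow the arithmetic in the hoisted niters statements, here is a small Python sketch that evaluates them for a given rnext. It is only an illustration, not GCC code: the function name `hoisted_values` is hypothetical, `unsigned long` is assumed to be 64 bits wide, and rnext is assumed to lie in 1..9 (the loop covers rnext:9). The constant 18446744073709551612 equals 2^64 - 4, so the mask clears the two low bits and the shift divides by 4, which suggests a vectorization factor of 4; reading `_1021 <= 2` as "too few iterations for the vector loop" is likewise my interpretation.

```python
MASK64 = (1 << 64) - 1  # assumption: `unsigned long` is 64 bits wide


def hoisted_values(rnext):
    """Evaluate the hoisted niters statements for one rnext (assumed 1..9)."""
    u = rnext & MASK64                          # _1113 = (unsigned long) rnext_679
    niters = (10 - u) & MASK64                  # niters.6220_1128 = 10 - _1113
    last = (9 - u) & MASK64                     # _1021 = 9 - _1113
    bnd = niters >> 2                           # bnd.6221_940 = niters >> 2
    niters_vf = niters & 18446744073709551612   # niters_vector_mult_vf.6222_939
    resume = (niters_vf + u) & MASK64           # _144, then tmp.6223_934
    start = rnext if last <= 2 else resume      # S.823_1004
    return {"niters": niters, "bnd": bnd, "niters_vf": niters_vf, "start": start}


if __name__ == "__main__":
    for rnext in (1, 6, 7):
        print(rnext, hoisted_values(rnext))
```

With rnext = 7, for example, there are only 3 scalar iterations, so bnd is 0 and S.823_1004 stays at rnext. All of these values depend only on rnext, which is why they can be hoisted and why the starred ones then stay live for a long time.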