Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm
Subject: [PATCH, rs6000] don't use unaligned vsx for memset of less than 32 bytes
From: Aaron Sawdey
To: GCC Patches
Date: Mon, 25 Jun 2018 15:41:00 -0000
Content-Type: multipart/mixed; boundary="=-IjVffaNpXHhwRVRZ2HgQ"
Mime-Version: 1.0
Message-Id: <979a1eeceb7c4c3f7b2068e9b924970760d695ff.camel@linux.ibm.com>

--=-IjVffaNpXHhwRVRZ2HgQ
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: 7bit

In gcc 8 I added support for unaligned vsx in the builtin expansion of
memset(x,0,y).
Turns out that for memset of less than 32 bytes, this doesn't really
help much, and it also runs into an egregious load-hit-store case in
the CPU2006 components gcc and hmmer.

This patch reverts to the previous (gcc 7) behavior for memset of
16-31 bytes, which is to use vsx stores only if the target is 16-byte
aligned. For 32 bytes or more, unaligned vsx stores will still be
used. Performance testing of the memset expansion shows that not much
is given up by using scalar stores for 16-31 bytes, and CPU2006 runs
show the performance regression is fixed.

Regstrap passes on powerpc64le, OK for trunk and backport to 8?

Thanks,
    Aaron

2018-06-25  Aaron Sawdey

	* config/rs6000/rs6000-string.c (expand_block_clear): Don't use
	unaligned vsx for 16B memset.

-- 
Aaron Sawdey, Ph.D.  acsawdey@linux.vnet.ibm.com
050-2/C113  (507) 253-7520 home: 507/263-0782
IBM Linux Technology Center - PPC Toolchain

--=-IjVffaNpXHhwRVRZ2HgQ
Content-Disposition: attachment; filename="memset1632.patch"
Content-Transfer-Encoding: 7bit
Content-Type: text/x-patch; name="memset1632.patch"; charset="UTF-8"

Index: gcc/config/rs6000/rs6000-string.c
===================================================================
--- gcc/config/rs6000/rs6000-string.c	(revision 261808)
+++ gcc/config/rs6000/rs6000-string.c	(working copy)
@@ -90,7 +90,9 @@
       machine_mode mode = BLKmode;
       rtx dest;
 
-      if (bytes >= 16 && TARGET_ALTIVEC && (align >= 128 || TARGET_EFFICIENT_UNALIGNED_VSX))
+      if (TARGET_ALTIVEC
+	  && ((bytes >= 16 && align >= 128)
+	      || (bytes >= 32 && TARGET_EFFICIENT_UNALIGNED_VSX)))
 	{
 	  clear_bytes = 16;
 	  mode = V4SImode;

--=-IjVffaNpXHhwRVRZ2HgQ--
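
For readers who want to try the size/alignment rule outside the compiler, here is a minimal standalone sketch of the decision the message describes. It is not the GCC source. The function names and the two bool parameters are hypothetical stand-ins: in GCC, TARGET_ALTIVEC and TARGET_EFFICIENT_UNALIGNED_VSX are target flags rather than arguments, and the alignment is tracked in bits, which is assumed here as well.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the check before the patch (gcc 8): a clear of
   16 or more bytes may use a 16-byte vector store when the destination
   is known to be 128-bit aligned OR unaligned VSX is efficient.  */
bool
use_vector_store_before (unsigned bytes, unsigned align_bits,
                         bool altivec, bool efficient_unaligned_vsx)
{
  return bytes >= 16 && altivec
         && (align_bits >= 128 || efficient_unaligned_vsx);
}

/* Hypothetical model of the check after the patch: aligned
   destinations still qualify from 16 bytes up, but the unaligned VSX
   path is only taken for 32 bytes or more.  */
bool
use_vector_store_after (unsigned bytes, unsigned align_bits,
                        bool altivec, bool efficient_unaligned_vsx)
{
  return altivec
         && ((bytes >= 16 && align_bits >= 128)
             || (bytes >= 32 && efficient_unaligned_vsx));
}

/* A few worked cases; each assert holds for the predicates above.  */
void
check_examples (void)
{
  /* 16 bytes, byte-aligned: vector store before, scalar after.  */
  assert (use_vector_store_before (16, 8, true, true));
  assert (!use_vector_store_after (16, 8, true, true));

  /* 16 bytes, 128-bit aligned: vector store in both versions.  */
  assert (use_vector_store_after (16, 128, true, true));

  /* 31 bytes, 64-bit aligned: below the 32-byte unaligned threshold.  */
  assert (!use_vector_store_after (31, 64, true, true));

  /* 32 bytes, byte-aligned: unaligned VSX is still used.  */
  assert (use_vector_store_after (32, 8, true, true));
}
```

Only the 16-31 byte, not-16-byte-aligned case changes between the two predicates, which matches the range the message says goes back to scalar stores.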