From mboxrd@z Thu Jan 1 00:00:00 1970
Subject: Re: [PATCH v3 1/2] powerpc: Add optimized strncpy for POWER9
To: libc-alpha@sourceware.org
References: <20200929152103.18564-1-rzinsly@linux.ibm.com>
From: Raphael M Zinsly
Message-ID: <1930c5ea-6a2c-d1bd-3265-36bdc3d56652@linux.ibm.com>
Date: Tue, 29 Sep 2020 12:22:33 -0300
In-Reply-To: <20200929152103.18564-1-rzinsly@linux.ibm.com>

Benchtest output:

                                       generic_strncpy   __strncpy_power9  __strncpy_power8  __strncpy_power7  __strncpy_ppc
Length   16, n   16, alignment  1/ 1:  7.12492           2.55148           2.55079           5.71316           9.05306
Length   16, n   16, alignment  1/ 1:  6.71919           2.51696           2.56187           5.92056           9.43297
Length   16, n   16, alignment  1/ 2:  6.65909           2.53904           2.55074           5.6362            9.30194
Length   16, n   16, alignment  2/ 1:  6.50693           2.51671           2.82125           5.9298            9.18833
Length    2, n    4, alignment  7/ 2:  7.68477           2.27528           5.07192           4.8353            8.4619
Length    4, n    2, alignment  2/ 7:  6.03804           1.6644            2.32355           2.8178            6.27398
Length    2, n    4, alignment  7/ 2:  7.68944           2.31009           4.25078           4.83038           8.59554
Length    4, n    2, alignment  2/ 7:  6.04246           1.66875           2.31775           2.73826           6.8358
Length   16, n   16, alignment  2/ 2:  6.50729           2.51669           2.83075           5.91498           9.60274
Length   16, n   16, alignment  2/ 2:  6.3441            2.51684           2.82829           5.64233           9.29031
Length   16, n   16, alignment  2/ 4:  6.33989           2.51866           2.82089           5.59129           9.50426
Length   16, n   16, alignment  4/ 2:  7.88012           2.51145           3.44369           5.91774           9.50185
Length    4, n    8, alignment  6/ 4:  7.88965           2.27023           5.2189            4.67992           9.06714
Length    8, n    4, alignment  4/ 6:  5.8805            1.69238           2.67589           2.95865           7.70634
Length    4, n    8, alignment  6/ 4:  7.69107           2.29724           5.2196            4.68409           9.07751
Length    8, n    4, alignment  4/ 6:  6.33989           1.69501           2.67179           2.95862           7.75311
Length   16, n   16, alignment  3/ 3:  6.58968           2.51681           3.14295           5.92364           8.86981
Length   16, n   16, alignment  3/ 3:  6.76256           2.51385           3.14379           5.91558           9.02347
Length   16, n   16, alignment  3/ 6:  6.76734           2.53841           3.08182           5.90924           8.98558
Length   16, n   16, alignment  6/ 3:  6.67014           2.51618           4.16905           5.94761           7.81751
Length    8, n   16, alignment  5/ 6:  7.70082           2.30026           4.59182           5.41689           10.5428
Length   16, n    8, alignment  6/ 5:  5.63868           1.87873           2.32929           4.5053            5.78866
Length    8, n   16, alignment  5/ 6:  7.40013           2.2999            4.23768           5.41724           10.1649
Length   16, n    8, alignment  6/ 5:  5.63858           1.87872           2.32768           4.58045           6.02812
Length   16, n   16, alignment  4/ 4:  7.37003           2.5167            3.50594           5.91125           8.93866
Length   16, n   16, alignment  4/ 4:  7.51015           2.51684           3.58684           5.91127           8.60509
Length   16, n   16, alignment  4/ 0:  7.42056           2.51149           3.38179           5.92321           8.86607
Length   16, n   16, alignment  0/ 4:  6.6704            1.87853           2.44519           5.91475           7.68788
Length   16, n   32, alignment  4/ 0:  11.0276           3.0727            6.01877           6.9094            11.4447
Length   32, n   16, alignment  0/ 4:  6.90919           1.87852           2.45708           5.91217           6.7671
Length   16, n   32, alignment  4/ 0:  9.76588           3.07257           5.92168           6.81253           11.8936
Length   32, n   16, alignment  0/ 4:  6.90342           1.88296           2.44527           5.91673           7.68469
Length   16, n   16, alignment  5/ 5:  6.90186           2.51712           3.91963           5.91852           9.46308
Length   16, n   16, alignment  5/ 5:  6.58716           2.51626           3.94884           5.91303           9.59648
Length   16, n   16, alignment  5/ 2:  6.92421           2.52057           3.80827           5.91558           9.3486
Length   16, n   16, alignment  2/ 5:  6.50526           2.53369           2.82035           5.91729           9.065
Length   32, n   64, alignment  3/ 2:  14.0395           3.79978           6.41657           11.19             13.9713
Length   64, n   32, alignment  2/ 3:  9.85699           2.75331           3.21559           8.23056           11.4077
Length   32, n   64, alignment  3/ 2:  14.0923           3.8037            6.38851           11.4514           15.9838
Length   64, n   32, alignment  2/ 3:  9.4437            2.75344           3.21249           8.21276           13.9496
Length   16, n   16, alignment  6/ 6:  6.33989           2.51408           4.38486           5.91681           7.37203
Length   16, n   16, alignment  6/ 6:  6.76503           2.51645           4.26454           5.9103            7.87574
Length   16, n   16, alignment  6/ 4:  6.51654           2.51654           4.24635           5.91578           7.17827
Length   16, n   16, alignment  4/ 6:  7.28735           2.53335           3.54029           5.92337           8.63075
Length   64, n  128, alignment  2/ 4:  15.4973           4.98808           7.34157           11.5113           16.7688
Length  128, n   64, alignment  4/ 2:  11.6235           3.54914           4.80814           10.3103           11.6194
Length   64, n  128, alignment  2/ 4:  15.4979           5.02559           7.28236           11.5045           22.1309
Length  128, n   64, alignment  4/ 2:  11.6138           3.53841           4.80527           10.3293           19.5239
Length   16, n   16, alignment  7/ 7:  6.84212           2.51109           5.0585            5.7457            7.2307
Length   16, n   16, alignment  7/ 7:  6.86215           2.50957           5.06541           5.91726           8.55044
Length   16, n   16, alignment  7/ 6:  6.97428           2.51876           5.05053           5.92637           7.07715
Length   16, n   16, alignment  6/ 7:  7.01347           2.53448           4.38004           5.93278           7.86288
Length  128, n  256, alignment  1/ 6:  17.9407           7.92071           9.38384           16.9419           20.6065
Length  256, n  128, alignment  6/ 1:  13.3609           4.7983            7.967             12.5699           14.9996
Length  128, n  256, alignment  1/ 6:  17.9371           7.69161           9.36672           16.739            38.9048
Length  256, n  128, alignment  6/ 1:  13.3632           4.87671           7.80194           12.7028           33.9017
Length    8, n   16, alignment  0/ 0:  7.4529            2.29963           3.62737           4.22665           7.50268
Length   32, n   16, alignment  0/ 0:  6.86674           1.87853           2.45092           2.41528           7.30161
Length    8, n   16, alignment  7/ 2:  7.40103           2.29399           3.75703           5.43637           8.45285
Length   32, n   16, alignment  7/ 2:  7.72683           2.35278           5.04996           5.93629           7.18881
Length   16, n   32, alignment  0/ 0:  9.87066           3.17511           4.89448           4.41405           10.3408
Length   64, n   32, alignment  0/ 0:  8.06217           2.32926           2.94508           2.71275           8.11769
Length   16, n   32, alignment  6/ 4:  9.50052           3.07627           6.37858           9.46793           10.1393
Length   64, n   32, alignment  6/ 4:  9.7197            2.75154           4.47331           7.73667           9.26558
Length   32, n   64, alignment  0/ 0:  10.9157           3.79013           4.83041           4.97713           11.5486
Length  128, n   64, alignment  0/ 0:  9.28057           3.15788           3.5178            4.23091           11.0874
Length   32, n   64, alignment  5/ 6:  14.0472           3.8515            7.26431           10.1343           12.8115
Length  128, n   64, alignment  5/ 6:  11.5493           3.5659            5.05553           9.1005            13.4053
Length   64, n  128, alignment  0/ 0:  12.0056           4.94615           6.45436           7.06235           14.4743
Length  256, n  128, alignment  0/ 0:  7.87506           4.49546           6.4492            5.38877           12.1437
Length   64, n  128, alignment  4/ 0:  12.4174           4.99773           7.73749           11.1452           16.1494
Length  256, n  128, alignment  4/ 0:  12.2601           4.88446           6.95948           13.3726           16.7583
Length  128, n  256, alignment  0/ 0:  13.9215           7.51155           7.87942           8.79876           20.4226
Length  512, n  256, alignment  0/ 0:  10.5798           6.77319           8.79757           9.03297           20.0197
Length  128, n  256, alignment  3/ 2:  18.0213           7.57884           9.89436           18.7839           20.5445
Length  512, n  256, alignment  3/ 2:  16.9909           7.07957           9.9271            23.2621           25.2442
Length  256, n  512, alignment  0/ 0:  17.6825           12.3074           13.3245           13.9381           28.7687
Length 1024, n  512, alignment  0/ 0:  16.3837           10.8306           16.6999           16.6797           38.0562
Length  256, n  512, alignment  2/ 4:  23.1953           13.0445           14.324            26.8918           30.2049
Length 1024, n  512, alignment  2/ 4:  25.4059           12.0938           17.2483           41.4883           47.2025
Length  512, n 1024, alignment  0/ 0:  21.029            17.1782           19.4815           21.0035           43.2361
Length 2048, n 1024, alignment  0/ 0:  28.5154           19.3221           36.9624           35.482            68.4792
Length  512, n 1024, alignment  1/ 6:  32.4103           17.9272           21.5421           46.6099           55.059
Length 2048, n 1024, alignment  1/ 6:  43.0516           21.6315           37.8787           77.7889           83.4195
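
To compare implementations row by row, the output above can be parsed with a short script. The following is only a minimal sketch, not part of the patch or of the glibc benchtests: the names `IMPLS`, `ROW_RE`, `parse_row` and `ratio` are hypothetical, the column order is taken from the header line of the output, and the ratio is only meaningful as a speedup under the assumption that each value is a timing where lower is better.

```python
import re

# Column order copied from the header line of the benchtest output.
IMPLS = [
    "generic_strncpy",
    "__strncpy_power9",
    "__strncpy_power8",
    "__strncpy_power7",
    "__strncpy_ppc",
]

# Matches the row label, e.g. "Length 16, n 16, alignment 1/ 1:",
# with any amount of padding between the fields.
ROW_RE = re.compile(
    r"Length\s+(\d+),\s+n\s+(\d+),\s+alignment\s+(\d+)/\s*(\d+):\s+(.*)"
)


def parse_row(line):
    """Parse one row into (length, n, align1, align2, {impl: value})."""
    m = ROW_RE.match(line.strip())
    if m is None:
        raise ValueError(f"not a benchtest row: {line!r}")
    length, n, align1, align2 = (int(g) for g in m.groups()[:4])
    values = [float(v) for v in m.group(5).split()]
    if len(values) != len(IMPLS):
        raise ValueError(f"expected {len(IMPLS)} values, got {len(values)}")
    return length, n, align1, align2, dict(zip(IMPLS, values))


def ratio(values, baseline="__strncpy_power8", candidate="__strncpy_power9"):
    """Return baseline / candidate for one row.

    Assumption: values are timings (lower is better), so a result
    above 1.0 means the candidate did better than the baseline.
    """
    return values[baseline] / values[candidate]


if __name__ == "__main__":
    sample = ("Length 2048, n 1024, alignment 1/ 6: "
              "43.0516 21.6315 37.8787 77.7889 83.4195")
    length, n, align1, align2, values = parse_row(sample)
    print(f"Length {length}, n {n}: power8/power9 = {ratio(values):.2f}")
```

Passing a different `baseline` (for example `"generic_strncpy"`) gives the same comparison against another column.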