From mboxrd@z Thu Jan 1 00:00:00 1970
Subject: Re: [PATCH 1/2] powerpc: Optimized strncpy for POWER9
To: libc-alpha@sourceware.org
References: <20200820182917.12602-1-rzinsly@linux.ibm.com>
From: Raphael M Zinsly
Message-ID: <1baf3865-6775-c8d0-b5c3-d5c59c4dd532@linux.ibm.com>
Date: Thu, 20 Aug 2020 15:31:12 -0300
In-Reply-To: <20200820182917.12602-1-rzinsly@linux.ibm.com>

Here is the make bench output:

                                       generic_strncpy   __strncpy_power9  __strncpy_power8  __strncpy_power7  __strncpy_ppc
Length   16, n   16, alignment  1/ 1:  7.11694           2.77348           2.80296           6.5724            10.4471
Length   16, n   16, alignment  1/ 1:  7.1557            2.75968           2.805             6.5748            10.5064
Length   16, n   16, alignment  1/ 2:  7.17956           2.79127           2.79964           6.57323           10.3281
Length   16, n   16, alignment  2/ 1:  7.15841           2.77364           3.10582           6.2332            10.331
Length    2, n    4, alignment  7/ 2:  8.90911           2.4623            5.38449           5.64873           9.36348
Length    4, n    2, alignment  2/ 7:  6.65395           1.84558           2.58298           3.10566           7.46376
Length    2, n    4, alignment  7/ 2:  8.70625           2.41166           5.38131           5.73421           9.64285
Length    4, n    2, alignment  2/ 7:  6.65458           1.84354           2.58382           3.64721           6.96163
Length   16, n   16, alignment  2/ 2:  7.01778           2.77373           3.10668           6.58047           10.6006
Length   16, n   16, alignment  2/ 2:  7.53778           2.75789           3.10591           6.2277            10.2613
Length   16, n   16, alignment  2/ 4:  7.13828           2.79132           3.10567           6.56847           10.619
Length   16, n   16, alignment  4/ 2:  7.38659           2.77668           3.70851           6.54537           9.17368
Length    4, n    8, alignment  6/ 4:  8.71748           2.45183           5.76669           4.65782           10.014
Length    8, n    4, alignment  4/ 6:  6.5504            1.83463           2.96574           2.66227           8.49964
Length    4, n    8, alignment  6/ 4:  8.96461           2.4499            5.78384           5.32287           9.79641
Length    8, n    4, alignment  4/ 6:  6.48083           1.83265           2.9783            3.38632           8.51888
Length   16, n   16, alignment  3/ 3:  7.7538            2.77353           3.29008           6.55912           9.94143
Length   16, n   16, alignment  3/ 3:  7.75279           2.76148           3.30616           6.5445            9.98866
Length   16, n   16, alignment  3/ 6:  7.21486           2.79444           3.33712           6.24747           10.113
Length   16, n   16, alignment  6/ 3:  6.99138           2.77778           4.50777           6.22522           8.53482
Length    8, n   16, alignment  5/ 6:  8.26994           2.77966           4.60681           6.10938           10.5975
Length   16, n    8, alignment  6/ 5:  6.28062           2.07193           2.57761           4.95636           6.48035
Length    8, n   16, alignment  5/ 6:  8.17113           2.43559           4.27753           5.95453           11.1796
Length   16, n    8, alignment  6/ 5:  6.21214           2.07239           2.57714           4.96762           6.76041
Length   16, n   16, alignment  4/ 4:  7.31373           2.77573           3.78349           6.19349           8.91432
Length   16, n   16, alignment  4/ 4:  7.32226           2.75658           3.70319           6.60792           9.17307
Length   16, n   16, alignment  4/ 0:  7.58812           2.76841           3.71554           6.54282           8.90051
Length   16, n   16, alignment  0/ 4:  6.92871           2.06944           2.66876           6.63947           8.08171
Length   16, n   32, alignment  4/ 0:  10.2972           3.3192            6.53695           7.77295           12.332
Length   32, n   16, alignment  0/ 4:  6.98056           2.06954           2.66445           6.54976           7.65286
Length   16, n   32, alignment  4/ 0:  10.5356           3.31343           6.53813           7.72029           12.2915
Length   32, n   16, alignment  0/ 4:  7.36068           2.06945           2.66424           6.21052           8.07614
Length   16, n   16, alignment  5/ 5:  7.2122            2.77732           4.17451           6.55383           10.4887
Length   16, n   16, alignment  5/ 5:  7.34438           2.77512           4.17191           6.56873           10.5664
Length   16, n   16, alignment  5/ 2:  7.15746           2.76198           4.14481           6.56235           10.7391
Length   16, n   16, alignment  2/ 5:  7.19372           2.79273           3.10693           6.56984           10.2697
Length   32, n   64, alignment  3/ 2:  15.3918           4.22964           7.0146            12.5809           13.8661
Length   64, n   32, alignment  2/ 3:  10.5331           3.02942           3.54253           9.19106           12.9356
Length   32, n   64, alignment  3/ 2:  15.369            4.17282           7.36163           12.5759           16.8501
Length   64, n   32, alignment  2/ 3:  10.5585           3.01971           3.52885           9.03369           15.6663
Length   16, n   16, alignment  6/ 6:  7.0405            2.77527           4.53842           6.54733           7.99437
Length   16, n   16, alignment  6/ 6:  7.02801           2.76059           4.52873           6.53536           8.45713
Length   16, n   16, alignment  6/ 4:  7.42011           2.77669           4.52223           6.57756           7.9899
Length   16, n   16, alignment  4/ 6:  7.37787           2.77507           3.77821           6.57058           9.17396
Length   64, n  128, alignment  2/ 4:  17.188            5.33493           8.00394           12.6196           19.1784
Length  128, n   64, alignment  4/ 2:  12.7962           3.91004           5.42994           11.294            12.5273
Length   64, n  128, alignment  2/ 4:  17.2298           5.2748            8.15392           12.6039           24.3802
Length  128, n   64, alignment  4/ 2:  12.7866           3.87534           5.3334            11.8516           21.6528
Length   16, n   16, alignment  7/ 7:  7.75015           2.76775           5.59024           6.57976           8.42318
Length   16, n   16, alignment  7/ 7:  7.81681           2.75691           5.56801           6.55397           10.0378
Length   16, n   16, alignment  7/ 6:  7.75225           2.77446           5.56813           6.57349           8.49645
Length   16, n   16, alignment  6/ 7:  7.23237           2.79186           4.51528           6.55304           8.63443
Length  128, n  256, alignment  1/ 6:  19.8414           8.37691           10.3445           18.4838           22.8314
Length  256, n  128, alignment  6/ 1:  14.7972           5.38498           8.83611           13.8521           16.6154
Length  128, n  256, alignment  1/ 6:  19.8497           8.37754           10.3469           18.2655           43.3568
Length  256, n  128, alignment  6/ 1:  14.7542           5.31075           8.75314           13.7759           37.6351
Length    8, n   16, alignment  0/ 0:  8.19872           2.45818           4.27602           4.6578            7.98513
Length   32, n   16, alignment  0/ 0:  6.92066           2.07115           2.66465           2.66381           7.75655
Length    8, n   16, alignment  7/ 2:  8.18253           2.42685           4.70317           6.01808           9.35743
Length   32, n   16, alignment  7/ 2:  7.79714           2.60074           5.58717           6.64181           6.98583
Length   16, n   32, alignment  0/ 0:  10.4715           3.40184           6.28388           4.86146           11.0819
Length   64, n   32, alignment  0/ 0:  10.4403           2.54135           3.07109           3.38791           9.35196
Length   16, n   32, alignment  6/ 4:  10.7077           3.34867           7.01321           10.4278           11.2951
Length   64, n   32, alignment  6/ 4:  10.9215           3.03041           5.04324           8.30023           11.2648
Length   32, n   64, alignment  0/ 0:  12.0062           4.09428           5.32372           5.48319           14.1455
Length  128, n   64, alignment  0/ 0:  10.1803           3.47282           3.83134           4.21557           10.6674
Length   32, n   64, alignment  5/ 6:  15.4165           4.16297           7.78876           10.8762           15.4308
Length  128, n   64, alignment  5/ 6:  12.7332           3.91667           5.8014            10.5869           14.0961
Length   64, n  128, alignment  0/ 0:  13.238            5.24242           6.90661           8.05566           15.9848
Length  256, n  128, alignment  0/ 0:  8.759             4.9483            6.98675           6.11489           12.6755
Length   64, n  128, alignment  4/ 0:  13.6593           5.27931           8.60925           12.5916           17.5016
Length  256, n  128, alignment  4/ 0:  13.4801           5.37114           7.47485           14.0585           17.4517
Length  128, n  256, alignment  0/ 0:  15.3147           8.02462           8.92006           9.67769           20.3757
Length  512, n  256, alignment  0/ 0:  11.5638           7.22535           9.80468           9.93597           21.3421
Length  128, n  256, alignment  3/ 2:  19.8948           8.15967           10.9435           20.6146           22.4146
Length  512, n  256, alignment  3/ 2:  18.681            7.77864           10.9269           25.9269           28.0105
Length  256, n  512, alignment  0/ 0:  19.4894           13.7363           14.8394           15.4064           31.6341
Length 1024, n  512, alignment  0/ 0:  18.0108           11.8737           18.1779           18.5072           41.5425
Length  256, n  512, alignment  2/ 4:  25.5662           14.5189           16.1872           29.5395           33.7587
Length 1024, n  512, alignment  2/ 4:  28.0079           13.2347           19.067            48.1998           52.3078
Length  512, n 1024, alignment  0/ 0:  23.1385           25.4237           21.2303           23.632            47.4502
Length 2048, n 1024, alignment  0/ 0:  31.201            21.308            40.6351           39.04             75.0329
Length  512, n 1024, alignment  1/ 6:  35.6234           27.0042           24.4711           51.3364           60.6277
Length 2048, n 1024, alignment  1/ 6:  47.442            24.0381           41.6616           85.4832           91.8897