From: Raphael M Zinsly <rzinsly@linux.ibm.com>
To: libc-alpha@sourceware.org
Subject: Re: [PATCH 2/2] powerpc: Optimzed stpncpy for POWER9
Date: Thu, 20 Aug 2020 15:31:59 -0300
Message-ID: <ee5e66ce-1067-d315-b242-02689a8372d3@linux.ibm.com>
In-Reply-To: <20200820182917.12602-2-rzinsly@linux.ibm.com>
Here is the make bench output:
generic_stpncpy __stpncpy_power9 __stpncpy_power8 __stpncpy_power7 __stpncpy_ppc
Length 16, n 16, alignment 1/ 1: 7.31792 2.79249 2.98207 6.20964 11.2262
Length 16, n 16, alignment 1/ 1: 7.26441 2.79883 2.97986 6.09795 11.1118
Length 16, n 16, alignment 1/ 2: 7.22475 2.82518 2.98169 6.18967 10.9933
Length 16, n 16, alignment 2/ 1: 7.28211 2.78851 3.1079 6.06067 10.4232
Length 2, n 4, alignment 7/ 2: 9.30193 2.4733 4.30086 4.74387 9.25328
Length 4, n 2, alignment 2/ 7: 6.7756 1.91031 2.93946 3.24475 7.76389
Length 2, n 4, alignment 7/ 2: 8.81319 2.4726 4.57341 4.74421 9.44667
Length 4, n 2, alignment 2/ 7: 6.77806 1.9118 2.93637 3.1857 7.00171
Length 16, n 16, alignment 2/ 2: 7.35335 2.80104 3.10653 5.85492 10.5689
Length 16, n 16, alignment 2/ 2: 7.14308 2.78571 3.10889 6.10044 10.4816
Length 16, n 16, alignment 2/ 4: 7.21628 2.81563 3.10724 6.14674 10.6005
Length 16, n 16, alignment 4/ 2: 7.47713 2.80531 3.80081 5.86977 9.43599
Length 4, n 8, alignment 6/ 4: 8.63537 2.4676 5.53825 4.1877 9.88309
Length 8, n 4, alignment 4/ 6: 6.63429 1.91051 3.10751 2.76472 8.4156
Length 4, n 8, alignment 6/ 4: 8.59304 2.43152 5.30288 4.16475 9.77498
Length 8, n 4, alignment 4/ 6: 6.63843 1.91047 3.19713 2.69566 8.67023
Length 16, n 16, alignment 3/ 3: 7.45277 2.80045 3.42433 6.06204 9.92282
Length 16, n 16, alignment 3/ 3: 8.04191 2.78645 3.43317 5.99773 10.0662
Length 16, n 16, alignment 3/ 6: 7.5816 2.81606 3.44168 6.0801 9.94673
Length 16, n 16, alignment 6/ 3: 7.10582 2.80176 5.03947 6.06942 8.40249
Length 8, n 16, alignment 5/ 6: 8.19747 2.42028 4.30043 5.0752 11.3093
Length 16, n 8, alignment 6/ 5: 6.37287 2.07239 2.56322 4.36972 6.52164
Length 8, n 16, alignment 5/ 6: 8.25022 2.45124 4.05051 5.02258 10.8683
Length 16, n 8, alignment 6/ 5: 6.31868 2.07215 2.83061 4.44584 7.14464
Length 16, n 16, alignment 4/ 4: 7.54408 2.80105 3.82846 5.71392 9.91359
Length 16, n 16, alignment 4/ 4: 7.66265 2.79063 3.86233 6.06489 9.31705
Length 16, n 16, alignment 4/ 0: 7.84286 2.79896 3.83148 6.08954 9.55253
Length 16, n 16, alignment 0/ 4: 7.36697 2.07019 2.66533 6.13894 7.75685
Length 16, n 32, alignment 4/ 0: 10.3819 3.33088 6.32994 7.24949 12.3827
Length 32, n 16, alignment 0/ 4: 7.15586 2.07172 2.66097 6.11743 7.56448
Length 16, n 32, alignment 4/ 0: 10.3262 3.35225 6.34556 7.3211 12.2527
Length 32, n 16, alignment 0/ 4: 7.13287 2.07265 2.6613 6.17878 7.61901
Length 16, n 16, alignment 5/ 5: 7.22471 2.80128 4.65776 6.15455 9.93333
Length 16, n 16, alignment 5/ 5: 7.22458 2.78586 4.65874 6.06763 9.87968
Length 16, n 16, alignment 5/ 2: 7.22718 2.79127 4.65999 6.025 10.3775
Length 16, n 16, alignment 2/ 5: 7.73485 2.8025 3.10754 6.08303 10.3871
Length 32, n 64, alignment 3/ 2: 13.7685 4.1256 7.04965 11.5105 15.3903
Length 64, n 32, alignment 2/ 3: 10.526 3.05149 3.59497 8.45078 13.7462
Length 32, n 64, alignment 3/ 2: 13.7681 4.11611 7.08236 11.5129 16.6004
Length 64, n 32, alignment 2/ 3: 10.962 3.05712 3.60447 8.43981 15.4906
Length 16, n 16, alignment 6/ 6: 7.30916 2.80056 5.03985 6.16331 8.43692
Length 16, n 16, alignment 6/ 6: 7.31688 2.7914 5.02931 6.12345 8.42848
Length 16, n 16, alignment 6/ 4: 7.7402 2.7993 5.04435 6.02685 8.28199
Length 16, n 16, alignment 4/ 6: 7.79103 2.82496 3.82464 6.0778 9.31532
Length 64, n 128, alignment 2/ 4: 15.4969 5.3714 8.09812 12.6067 18.7831
Length 128, n 64, alignment 4/ 2: 12.9023 3.93138 5.46487 10.7071 13.3253
Length 64, n 128, alignment 2/ 4: 15.4998 5.42611 7.88843 12.6007 24.0491
Length 128, n 64, alignment 4/ 2: 12.8971 3.94646 5.49689 11.1747 21.5779
Length 16, n 16, alignment 7/ 7: 7.68992 2.78151 6.14775 6.19397 8.38412
Length 16, n 16, alignment 7/ 7: 7.90811 2.7803 6.11502 6.17383 8.78371
Length 16, n 16, alignment 7/ 6: 7.45456 2.80173 5.93657 6.15191 8.38489
Length 16, n 16, alignment 6/ 7: 7.44846 2.80238 5.03654 6.1154 8.41589
Length 128, n 256, alignment 1/ 6: 17.9114 8.39532 10.3246 17.9457 21.9452
Length 256, n 128, alignment 6/ 1: 14.8346 5.41104 8.89047 13.5379 17.1437
Length 128, n 256, alignment 1/ 6: 17.9118 8.39985 10.3271 17.9503 42.0831
Length 256, n 128, alignment 6/ 1: 14.8306 5.40714 9.04492 13.5227 37.819
Length 8, n 16, alignment 0/ 0: 8.19945 2.46752 4.04264 4.62897 8.22975
Length 32, n 16, alignment 0/ 0: 7.23617 2.07229 2.66504 2.66683 7.93411
Length 8, n 16, alignment 7/ 2: 8.26373 2.41779 4.18003 5.31418 9.0473
Length 32, n 16, alignment 7/ 2: 7.46119 2.63992 6.16424 6.14534 7.28237
Length 16, n 32, alignment 0/ 0: 10.1282 3.42401 5.00287 5.02318 11.4985
Length 64, n 32, alignment 0/ 0: 9.29452 2.57779 2.79807 3.1362 10.9532
Length 16, n 32, alignment 6/ 4: 10.2194 3.30297 7.48371 10.4067 11.2264
Length 64, n 32, alignment 6/ 4: 10.6887 3.04976 5.13062 8.10511 11.1225
Length 32, n 64, alignment 0/ 0: 12.1806 4.09924 5.12341 6.14159 14.0965
Length 128, n 64, alignment 0/ 0: 10.1569 3.52625 3.88528 4.65782 11.3018
Length 32, n 64, alignment 5/ 6: 13.7795 4.13456 8.53476 10.2846 15.1556
Length 128, n 64, alignment 5/ 6: 12.8171 3.92765 5.82505 10.3559 15.0831
Length 64, n 128, alignment 0/ 0: 13.6328 5.33523 6.43324 7.92213 16.4658
Length 256, n 128, alignment 0/ 0: 8.92495 4.97169 7.13044 6.30158 12.9039
Length 64, n 128, alignment 4/ 0: 13.8393 5.36588 7.52682 11.5294 17.5523
Length 256, n 128, alignment 4/ 0: 13.5309 5.36019 7.56527 13.3503 17.8202
Length 128, n 256, alignment 0/ 0: 15.2956 8.14449 8.79678 9.69352 21.2463
Length 512, n 256, alignment 0/ 0: 11.5667 7.22974 10.1355 10.2592 21.5805
Length 128, n 256, alignment 3/ 2: 18.0152 8.21506 10.9175 20.4131 22.3927
Length 512, n 256, alignment 3/ 2: 18.7328 7.81909 11.251 25.0633 29.2378
Length 256, n 512, alignment 0/ 0: 17.5135 13.9768 15.6849 16.1219 30.9344
Length 1024, n 512, alignment 0/ 0: 17.988 11.8498 18.4388 18.7385 41.5762
Length 256, n 512, alignment 2/ 4: 23.3724 14.8026 15.9182 28.6762 33.9031
Length 1024, n 512, alignment 2/ 4: 27.9562 13.2785 19.5893 46.9671 52.4943
Length 512, n 1024, alignment 0/ 0: 23.3637 25.283 21.2536 23.4228 55.6501
Length 2048, n 1024, alignment 0/ 0: 31.303 21.2731 40.7001 38.8365 75.1105
Length 512, n 1024, alignment 1/ 6: 33.0535 26.873 24.8167 51.5917 56.236
Length 2048, n 1024, alignment 1/ 6: 47.5444 24.0206 42.5163 86.0245 92.5819