On Mon, Apr 29, 2024 at 1:20 PM H.J. Lu wrote:

> On Mon, Apr 29, 2024 at 10:42 AM Sunil Pandey wrote:
> >
> > On Sun, Apr 28, 2024 at 9:17 AM H.J. Lu wrote:
> >>
> >> On Sun, Apr 28, 2024 at 9:13 AM Sunil Pandey wrote:
> >> >
> >> > On Sat, Apr 27, 2024 at 7:13 PM abush wang wrote:
> >> >>
> >> >> Actually, I was handling performance issue from libmicro in our distro OS.
> >> >> I found that the performance degradation of localtime_r benchmark from libmicro is blame to strlen.
> >> >> So I abstracted this test case.
> >> >>
> >> >
> >> > Can you consistently reproduce strlen perf behaviour by running multiple times back-to-back?
> >> >
> >> > You can see high swing from run
> >>
> >> Hi Sunil,
> >>
> >> Intel(R) Xeon(R) Gold 6133 CPU @ 2.50GHz is SKX. Please add this test to
> >> benchtests/bench-strlen.c and check its performance on SKX.
> >>
> >> --
> >> H.J.
> >
> > I collected the glibc micro-benchmark data for the string length in question.
> >
> > 2.38 evex data:
> >
> > length=4, alignment=4: 4.40
> > length=4, alignment=0: 4.29
> > length=4, alignment=0: 3.64
> > length=4, alignment=7: 3.64
> > length=4, alignment=2: 3.64
> >
> > 2.28 evex data:
> >
> > Length 4, alignment 4: 6.46875
> > Length 4, alignment 0: 6.5
> > Length 4, alignment 0: 6.53125
> > Length 4, alignment 7: 6.46875
> > Length 4, alignment 2: 6.53125
> >
> > Data collected on Machine: Intel(R) Xeon(R) Gold 6148 CPU @ 2.40GHz
> >
> > 2.38 perf numbers are better than 2.28 as expected.
>
> 1. Please compare AVX2 vs EVEX strlen on glibc master branch.
> 2. Please check strlen on strings of length == 4 and alignments = 0, 1, 2, 3.
>
> --
> H.J.
>

Data from master branch:

Data collected on Machine: Intel(R) Xeon(R) Gold 6148 CPU @ 2.40GHz

                          __strlen_evex   __strlen_avx2
=======================================================
length=4, alignment=0:    5.00            5.11
length=4, alignment=1:    4.92            4.80
length=4, alignment=2:    4.82            4.62
length=4, alignment=3:    4.62            4.92
length=4, alignment=4:    4.44            4.44
length=4, alignment=5:    4.59            4.29
length=4, alignment=6:    4.39            4.29
length=4, alignment=7:    4.14            4.14
length=4, alignment=8:    4.19            4.00
length=4, alignment=9:    4.00            4.00
length=4, alignment=10:   4.31            3.87
length=4, alignment=11:   3.96            3.87
length=4, alignment=12:   3.86            3.75
length=4, alignment=13:   3.75            3.75
length=4, alignment=14:   3.64            3.64
length=4, alignment=15:   3.64            3.72
length=4, alignment=16:   3.64            3.53
length=4, alignment=17:   3.63            3.53
length=4, alignment=18:   4.12            3.53
length=4, alignment=19:   3.43            3.43
length=4, alignment=20:   3.43            3.43
length=4, alignment=21:   3.33            3.33
length=4, alignment=22:   3.33            3.42
length=4, alignment=23:   3.33            3.33
length=4, alignment=24:   3.33            3.33
length=4, alignment=25:   3.33            3.33
length=4, alignment=26:   3.96            3.33
length=4, alignment=27:   3.33            3.41
length=4, alignment=28:   3.33            3.33
length=4, alignment=29:   3.41            3.33
length=4, alignment=30:   3.33            3.41
length=4, alignment=31:   3.33            3.33

--Sunil
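
For anyone who wants a quick sanity check outside the glibc tree, here is a rough standalone sketch of the same kind of measurement: strlen on a length-4 string at alignments 0 through 31. It is not the benchtests/bench-strlen.c harness that produced the numbers above, and its output (wall-clock nanoseconds per call, loop overhead included) is not comparable to them. The helper name time_strlen_ns, the STRLEN_BENCH_MAIN macro, the 64-byte base alignment and the iteration count are all made up for illustration. It also times whichever strlen the dynamic loader selects, so it cannot compare __strlen_evex against __strlen_avx2 by itself.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L   /* for clock_gettime and posix_memalign */
#endif
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

/* Hypothetical helper, not part of glibc benchtests.  Returns the average
   wall-clock nanoseconds per strlen call for a string of LEN bytes that
   starts ALIGN bytes past a 64-byte-aligned base.  Returns a negative
   value on error.  */
static double
time_strlen_ns (size_t len, size_t align, size_t iters)
{
  if (iters == 0)
    return -1.0;

  /* Room for the offset, the string body and the terminating NUL.  */
  size_t total = align + len + 1;
  void *mem = NULL;
  if (posix_memalign (&mem, 64, total) != 0)
    return -1.0;

  char *base = mem;
  memset (base, 'a', total);
  char *s = base + align;
  s[len] = '\0';

  /* Call through a volatile function pointer so the compiler cannot
     fold the call or hoist it out of the loop.  */
  size_t (*volatile fn) (const char *) = strlen;
  volatile size_t sink = 0;

  struct timespec t0, t1;
  clock_gettime (CLOCK_MONOTONIC, &t0);
  for (size_t i = 0; i < iters; i++)
    sink += fn (s);
  clock_gettime (CLOCK_MONOTONIC, &t1);

  /* Every call must have returned LEN; otherwise report an error.  */
  int ok = (sink == len * iters);
  free (mem);
  if (!ok)
    return -1.0;

  double ns = (double) (t1.tv_sec - t0.tv_sec) * 1e9
              + (double) (t1.tv_nsec - t0.tv_nsec);
  return ns / (double) iters;
}

#ifdef STRLEN_BENCH_MAIN
int
main (void)
{
  /* Same sweep as the table: length 4, alignments 0..31.  */
  for (size_t align = 0; align < 32; align++)
    printf ("length=4, alignment=%zu: %.2f ns/call\n", align,
            time_strlen_ns (4, align, 10000000));
  return 0;
}
#endif
```

Build with something like "gcc -O2 -DSTRLEN_BENCH_MAIN" and run it several times back-to-back, since single runs can swing noticeably. For the actual EVEX vs AVX2 comparison, benchtests/bench-strlen.c remains the right tool because it calls each implementation directly.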