public inbox for libc-alpha@sourceware.org
* x86: Prepare `strrchr-evex` and `strrchr-evex512` for AVX10
@ 2023-09-21 14:38 Noah Goldstein
  2023-09-21 14:39 ` Noah Goldstein
  2023-10-04 18:48 ` Noah Goldstein
  0 siblings, 2 replies; 12+ messages in thread
From: Noah Goldstein @ 2023-09-21 14:38 UTC
  To: libc-alpha; +Cc: goldstein.w.n, hjl.tools, carlos

This commit refactors `strrchr-evex` and `strrchr-evex512` to use a
common implementation: `strrchr-evex-base.S`.

The motivation is that `strrchr-evex` needed to be refactored to not
use 64-bit mask registers in preparation for AVX10.

Once vec-width mask register combining is removed, the EVEX and
EVEX512 implementations can easily share one file without any major
overhead.
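
As a rough illustration of what "mask register combining" means here
(a C sketch only, not part of the patch; the function names are
hypothetical and 32-bit masks stand in for one EVEX256 byte vector
each): the old code concatenated two per-vector match masks into one
64-bit mask and scanned it once, while the new code tests the later
vector first and falls back to the earlier one.

```c
#include <stdint.h>

/* Index of the highest set bit in m, or -1 if m is zero.  */
static int highest_bit32(uint32_t m)
{
    int idx = -1;
    while (m != 0) {
        idx++;
        m >>= 1;
    }
    return idx;
}

/* Combined style: merge two 32-bit masks into one 64-bit value and
   scan it once.  This is the step that needs a 64-bit mask.  */
static int last_match_combined(uint32_t lo, uint32_t hi)
{
    uint64_t both = ((uint64_t) hi << 32) | lo;
    int idx = -1;
    while (both != 0) {
        idx++;
        both >>= 1;
    }
    return idx;
}

/* Split style: check the later vector first, then the earlier one.
   Only vec-width (32-bit here) masks are ever needed.  */
static int last_match_split(uint32_t lo, uint32_t hi)
{
    if (hi != 0)
        return 32 + highest_bit32(hi);
    return highest_bit32(lo);
}
```

Both functions return the same index for any pair of masks; the split
form trades one extra branch for never widening past the vector's
own mask width.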

The net result is performance improvements (measured on TGL) for both
`strrchr-evex` and `strrchr-evex512`. Note, however, that there are
some regressions in the test suite, and many of the cases that make
up the total geomean of improvement/regression across bench-strrchr
may be cold. The point of the performance measurement is to show
there are no major regressions; the primary motivation is
preparation for AVX10.

Benchmarks were taken on TGL:
https://www.intel.com/content/www/us/en/products/sku/213799/intel-core-i711850h-processor-24m-cache-up-to-4-80-ghz/specifications.html

EVEX geometric_mean(N=5) of all benchmarks New / Original   : 0.74
EVEX512 geometric_mean(N=5) of all benchmarks New / Original: 0.87

Full check passes on x86.
---
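Note for readers (not part of the patch): the return paths below
repeatedly use a `blsmsk` / `and` / `bsr` sequence to pick the last
search-CHAR match that is not past the end of the string. A minimal C
sketch of that idiom on a 32-bit mask follows; the helper names are
hypothetical, and a portable loop stands in for `bsr`.

```c
#include <stdint.h>

/* C equivalent of BLSMSK: set every bit up to and including the
   lowest set bit of x.  For x == 0 this yields all ones.  */
static uint32_t blsmsk32(uint32_t x)
{
    return x ^ (x - 1);
}

/* match_mask has a 1 for each search-CHAR match in the vector,
   zero_mask has a 1 for each zero CHAR.  Returns the index of the
   last match at or before the first zero CHAR, or -1 if none.  */
static int last_match_in_bounds(uint32_t match_mask, uint32_t zero_mask)
{
    /* Drop matches that lie past the end of the string.  */
    uint32_t in_bounds = match_mask & blsmsk32(zero_mask);
    int idx = -1;
    /* Stand-in for BSR: find the highest remaining set bit.  */
    while (in_bounds != 0) {
        idx++;
        in_bounds >>= 1;
    }
    return idx;
}
```

For the bytes "abca\0a" searched for 'a', match_mask is 0x29 and
zero_mask is 0x10; the match at index 5 is masked off because it lies
after the terminator, leaving index 3 as the result.
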
 sysdeps/x86_64/multiarch/strrchr-evex-base.S | 466 ++++++++++++-------
 sysdeps/x86_64/multiarch/strrchr-evex.S      | 392 +---------------
 sysdeps/x86_64/multiarch/wcsrchr-evex.S      |   1 +
 3 files changed, 294 insertions(+), 565 deletions(-)

diff --git a/sysdeps/x86_64/multiarch/strrchr-evex-base.S b/sysdeps/x86_64/multiarch/strrchr-evex-base.S
index 58b2853ab6..2c98f07fca 100644
--- a/sysdeps/x86_64/multiarch/strrchr-evex-base.S
+++ b/sysdeps/x86_64/multiarch/strrchr-evex-base.S
@@ -25,240 +25,354 @@
 # include <sysdep.h>
 
 # ifdef USE_AS_WCSRCHR
+#  if VEC_SIZE == 64
+#   define RCX_M	cx
+#   define kortestM	kortestw
+#  else
+#   define RCX_M	cl
+#   define kortestM	kortestb
+#  endif
+
+#  define SHIFT_REG	VRCX
+#  define VPCOMPRESS	vpcompressd
 #  define CHAR_SIZE	4
-#  define VPBROADCAST   vpbroadcastd
-#  define VPCMPEQ	vpcmpeqd
-#  define VPMINU	vpminud
+#  define VPMIN	vpminud
 #  define VPTESTN	vptestnmd
+#  define VPTEST	vptestmd
+#  define VPBROADCAST	vpbroadcastd
+#  define VPCMPEQ	vpcmpeqd
+#  define VPCMP	vpcmpd
 # else
+#  define SHIFT_REG	VRDI
+#  define VPCOMPRESS	vpcompressb
 #  define CHAR_SIZE	1
-#  define VPBROADCAST   vpbroadcastb
-#  define VPCMPEQ	vpcmpeqb
-#  define VPMINU	vpminub
+#  define VPMIN	vpminub
 #  define VPTESTN	vptestnmb
+#  define VPTEST	vptestmb
+#  define VPBROADCAST	vpbroadcastb
+#  define VPCMPEQ	vpcmpeqb
+#  define VPCMP	vpcmpb
+
+#  define RCX_M	VRCX
+#  define kortestM	KORTEST
 # endif
 
-# define PAGE_SIZE	4096
+# define VMATCH	VMM(0)
 # define CHAR_PER_VEC	(VEC_SIZE / CHAR_SIZE)
+# define PAGE_SIZE	4096
 
 	.section SECTION(.text), "ax", @progbits
-/* Aligning entry point to 64 byte, provides better performance for
-   one vector length string.  */
-ENTRY_P2ALIGN (STRRCHR, 6)
-
-	/* Broadcast CHAR to VMM(0).  */
-	VPBROADCAST %esi, %VMM(0)
+	/* Aligning entry point to 64 byte, provides better performance for
+	   one vector length string.  */
+ENTRY_P2ALIGN(STRRCHR, 6)
 	movl	%edi, %eax
-	sall	$20, %eax
-	cmpl	$((PAGE_SIZE - VEC_SIZE) << 20), %eax
-	ja	L(page_cross)
+	/* Broadcast CHAR to VMATCH.  */
+	VPBROADCAST %esi, %VMATCH
 
-L(page_cross_continue):
-	/* Compare [w]char for null, mask bit will be set for match.  */
-	VMOVU	(%rdi), %VMM(1)
+	andl	$(PAGE_SIZE - 1), %eax
+	cmpl	$(PAGE_SIZE - VEC_SIZE), %eax
+	jg	L(cross_page_boundary)
 
-	VPTESTN	%VMM(1), %VMM(1), %k1
-	KMOV	%k1, %VRCX
-	test	%VRCX, %VRCX
-	jz	L(align_more)
-
-	VPCMPEQ	%VMM(1), %VMM(0), %k0
-	KMOV	%k0, %VRAX
-	BLSMSK	%VRCX, %VRCX
-	and	%VRCX, %VRAX
-	jz	L(ret)
-
-	BSR	%VRAX, %VRAX
+	VMOVU	(%rdi), %VMM(1)
+	/* k0 has a 1 for each zero CHAR in YMM1.  */
+	VPTESTN	%VMM(1), %VMM(1), %k0
+	KMOV	%k0, %VGPR(rsi)
+	test	%VGPR(rsi), %VGPR(rsi)
+	jz	L(aligned_more)
+	/* fallthrough: zero CHAR in first VEC.  */
+L(page_cross_return):
+	/* K1 has a 1 for each search CHAR match in VEC(1).  */
+	VPCMPEQ	%VMATCH, %VMM(1), %k1
+	KMOV	%k1, %VGPR(rax)
+	/* Build mask up until first zero CHAR (used to mask off
+	   potential search CHAR matches past the end of the string).  */
+	blsmsk	%VGPR(rsi), %VGPR(rsi)
+	and	%VGPR(rsi), %VGPR(rax)
+	jz	L(ret0)
+	/* Get last match (the `and` removed any out of bounds matches).  */
+	bsr	%VGPR(rax), %VGPR(rax)
 # ifdef USE_AS_WCSRCHR
 	leaq	(%rdi, %rax, CHAR_SIZE), %rax
 # else
-	add	%rdi, %rax
+	addq	%rdi, %rax
 # endif
-L(ret):
+L(ret0):
 	ret
 
-L(vector_x2_end):
-	VPCMPEQ	%VMM(2), %VMM(0), %k2
-	KMOV	%k2, %VRAX
-	BLSMSK	%VRCX, %VRCX
-	and	%VRCX, %VRAX
-	jz	L(vector_x1_ret)
-
-	BSR	%VRAX, %VRAX
-	leaq	(VEC_SIZE)(%rdi, %rax, CHAR_SIZE), %rax
-	ret
-
-	/* Check the first vector at very last to look for match.  */
-L(vector_x1_ret):
-	VPCMPEQ %VMM(1), %VMM(0), %k2
-	KMOV	%k2, %VRAX
-	test	%VRAX, %VRAX
-	jz	L(ret)
-
-	BSR	%VRAX, %VRAX
+	/* Returns for first vec x1/x2/x3 have hard coded backward
+	   search path for earlier matches.  */
+	.p2align 4,, 6
+L(first_vec_x1):
+	VPCMPEQ	%VMATCH, %VMM(2), %k1
+	KMOV	%k1, %VGPR(rax)
+	blsmsk	%VGPR(rcx), %VGPR(rcx)
+	/* eax non-zero if search CHAR in range.  */
+	and	%VGPR(rcx), %VGPR(rax)
+	jnz	L(first_vec_x1_return)
+
+	/* fallthrough: no match in YMM2 then need to check for earlier
+	   matches (in YMM1).  */
+	.p2align 4,, 4
+L(first_vec_x0_test):
+	VPCMPEQ	%VMATCH, %VMM(1), %k1
+	KMOV	%k1, %VGPR(rax)
+	test	%VGPR(rax), %VGPR(rax)
+	jz	L(ret1)
+	bsr	%VGPR(rax), %VGPR(rax)
 # ifdef USE_AS_WCSRCHR
 	leaq	(%rsi, %rax, CHAR_SIZE), %rax
 # else
-	add	%rsi, %rax
+
+	addq	%rsi, %rax
 # endif
+L(ret1):
+	ret
+
+	.p2align 4,, 10
+L(first_vec_x3):
+	VPCMPEQ	%VMATCH, %VMM(4), %k1
+	KMOV	%k1, %VGPR(rax)
+	blsmsk	%VGPR(rcx), %VGPR(rcx)
+	/* If no search CHAR match in range check YMM1/YMM2/YMM3.  */
+	and	%VGPR(rcx), %VGPR(rax)
+	jz	L(first_vec_x1_or_x2)
+	bsr	%VGPR(rax), %VGPR(rax)
+	leaq	(VEC_SIZE * 3)(%rdi, %rax, CHAR_SIZE), %rax
+	ret
+	.p2align 4,, 4
+
+L(first_vec_x2):
+	VPCMPEQ	%VMATCH, %VMM(3), %k1
+	KMOV	%k1, %VGPR(rax)
+	blsmsk	%VGPR(rcx), %VGPR(rcx)
+	/* Check YMM3 for last match first. If no match try YMM2/YMM1.  */
+	and	%VGPR(rcx), %VGPR(rax)
+	jz	L(first_vec_x0_x1_test)
+	bsr	%VGPR(rax), %VGPR(rax)
+	leaq	(VEC_SIZE * 2)(%r8, %rax, CHAR_SIZE), %rax
 	ret
 
-L(align_more):
-	/* Zero r8 to store match result.  */
-	xorl	%r8d, %r8d
-	/* Save pointer of first vector, in case if no match found.  */
+	.p2align 4,, 6
+L(first_vec_x0_x1_test):
+	VPCMPEQ	%VMATCH, %VMM(2), %k1
+	KMOV	%k1, %VGPR(rax)
+	/* Check YMM2 for last match first. If no match try YMM1.  */
+	test	%VGPR(rax), %VGPR(rax)
+	jz	L(first_vec_x0_test)
+	.p2align 4,, 4
+L(first_vec_x1_return):
+	bsr	%VGPR(rax), %VGPR(rax)
+	leaq	(VEC_SIZE)(%r8, %rax, CHAR_SIZE), %rax
+	ret
+
+	.p2align 4,, 12
+L(aligned_more):
+L(page_cross_continue):
+	/* Need to keep original pointer in case VEC(1) has last match.  */
 	movq	%rdi, %rsi
-	/* Align pointer to vector size.  */
 	andq	$-VEC_SIZE, %rdi
-	/* Loop unroll for 2 vector loop.  */
-	VMOVA	(VEC_SIZE)(%rdi), %VMM(2)
+
+	VMOVU	VEC_SIZE(%rdi), %VMM(2)
 	VPTESTN	%VMM(2), %VMM(2), %k0
 	KMOV	%k0, %VRCX
+	movq	%rdi, %r8
 	test	%VRCX, %VRCX
-	jnz	L(vector_x2_end)
+	jnz	L(first_vec_x1)
+
+	VMOVU	(VEC_SIZE * 2)(%rdi), %VMM(3)
+	VPTESTN	%VMM(3), %VMM(3), %k0
+	KMOV	%k0, %VRCX
+
+	test	%VRCX, %VRCX
+	jnz	L(first_vec_x2)
+
+	VMOVU	(VEC_SIZE * 3)(%rdi), %VMM(4)
+	VPTESTN	%VMM(4), %VMM(4), %k0
+	KMOV	%k0, %VRCX
+
+	/* Intentionally use 64-bit here.  EVEX256 version needs 1-byte
+	   padding for efficient nop before loop alignment.  */
+	test	%rcx, %rcx
+	jnz	L(first_vec_x3)
 
-	/* Save pointer of second vector, in case if no match
-	   found.  */
-	movq	%rdi, %r9
-	/* Align address to VEC_SIZE * 2 for loop.  */
 	andq	$-(VEC_SIZE * 2), %rdi
+	.p2align 4
+L(first_aligned_loop):
+	/* Preserve VEC(1), VEC(2), VEC(3), and VEC(4) until we can
+	   guarantee they don't store a match.  */
+	VMOVA	(VEC_SIZE * 4)(%rdi), %VMM(5)
+	VMOVA	(VEC_SIZE * 5)(%rdi), %VMM(6)
 
-	.p2align 4,,11
-L(loop):
-	/* 2 vector loop, as it provide better performance as compared
-	   to 4 vector loop.  */
-	VMOVA	(VEC_SIZE * 2)(%rdi), %VMM(3)
-	VMOVA	(VEC_SIZE * 3)(%rdi), %VMM(4)
-	VPCMPEQ	%VMM(3), %VMM(0), %k1
-	VPCMPEQ	%VMM(4), %VMM(0), %k2
-	VPMINU	%VMM(3), %VMM(4), %VMM(5)
-	VPTESTN	%VMM(5), %VMM(5), %k0
-	KOR	%k1, %k2, %k3
-	subq	$-(VEC_SIZE * 2), %rdi
-	/* If k0 and k3 zero, match and end of string not found.  */
-	KORTEST	%k0, %k3
-	jz	L(loop)
-
-	/* If k0 is non zero, end of string found.  */
-	KORTEST %k0, %k0
-	jnz	L(endloop)
-
-	lea	VEC_SIZE(%rdi), %r8
-	/* A match found, it need to be stored in r8 before loop
-	   continue.  */
-	/* Check second vector first.  */
-	KMOV	%k2, %VRDX
-	test	%VRDX, %VRDX
-	jnz	L(loop_vec_x2_match)
+	VPCMP	$4, %VMM(5), %VMATCH, %k2
+	VPCMP	$4, %VMM(6), %VMATCH, %k3{%k2}
+
+	VPMIN	%VMM(5), %VMM(6), %VMM(7)
+
+	VPTEST	%VMM(7), %VMM(7), %k1{%k3}
+	subq	$(VEC_SIZE * -2), %rdi
+	kortestM %k1, %k1
+	jc	L(first_aligned_loop)
 
+	VPTESTN	%VMM(7), %VMM(7), %k1
 	KMOV	%k1, %VRDX
-	/* Match is in first vector, rdi offset need to be subtracted
-	  by VEC_SIZE.  */
-	sub	$VEC_SIZE, %r8
-
-	/* If second vector doesn't have match, first vector must
-	   have match.  */
-L(loop_vec_x2_match):
-	BSR	%VRDX, %VRDX
-# ifdef USE_AS_WCSRCHR
-	sal	$2, %rdx
-# endif
-	add	%rdx, %r8
-	jmp	L(loop)
+	test	%VRDX, %VRDX
+	jz	L(second_aligned_loop_prep)
 
-L(endloop):
-	/* Check if string end in first loop vector.  */
-	VPTESTN	%VMM(3), %VMM(3), %k0
-	KMOV	%k0, %VRCX
-	test	%VRCX, %VRCX
-	jnz	L(loop_vector_x1_end)
+	kortestM %k3, %k3
+	jnc	L(return_first_aligned_loop)
 
-	/* Check if it has match in first loop vector.  */
-	KMOV	%k1, %VRAX
+	.p2align 4,, 6
+L(first_vec_x1_or_x2_or_x3):
+	VPCMPEQ	%VMM(4), %VMATCH, %k4
+	KMOV	%k4, %VRAX
 	test	%VRAX, %VRAX
-	jz	L(loop_vector_x2_end)
+	jz	L(first_vec_x1_or_x2)
+	bsr	%VRAX, %VRAX
+	leaq	(VEC_SIZE * 3)(%r8, %rax, CHAR_SIZE), %rax
+	ret
 
-	BSR	%VRAX, %VRAX
-	leaq	(%rdi, %rax, CHAR_SIZE), %r8
 
-	/* String must end in second loop vector.  */
-L(loop_vector_x2_end):
-	VPTESTN	%VMM(4), %VMM(4), %k0
+	.p2align 4,, 8
+L(return_first_aligned_loop):
+	VPTESTN	%VMM(5), %VMM(5), %k0
 	KMOV	%k0, %VRCX
+	blsmsk	%VRCX, %VRCX
+	jnc	L(return_first_new_match_first)
+	blsmsk	%VRDX, %VRDX
+	VPCMPEQ	%VMM(6), %VMATCH, %k0
+	KMOV	%k0, %VRAX
+	addq	$VEC_SIZE, %rdi
+	and	%VRDX, %VRAX
+	jnz	L(return_first_new_match_ret)
+	subq	$VEC_SIZE, %rdi
+L(return_first_new_match_first):
 	KMOV	%k2, %VRAX
-	BLSMSK	%VRCX, %VRCX
-	/* Check if it has match in second loop vector.  */
+# ifdef USE_AS_WCSRCHR
+	xorl	$((1 << CHAR_PER_VEC)- 1), %VRAX
 	and	%VRCX, %VRAX
-	jz	L(check_last_match)
+# else
+	andn	%VRCX, %VRAX, %VRAX
+# endif
+	jz	L(first_vec_x1_or_x2_or_x3)
+L(return_first_new_match_ret):
+	bsr	%VRAX, %VRAX
+	leaq	(VEC_SIZE * 2)(%rdi, %rax, CHAR_SIZE), %rax
+	ret
 
-	BSR	%VRAX, %VRAX
-	leaq	(VEC_SIZE)(%rdi, %rax, CHAR_SIZE), %rax
+	.p2align 4,, 10
+L(first_vec_x1_or_x2):
+	VPCMPEQ	%VMM(3), %VMATCH, %k3
+	KMOV	%k3, %VRAX
+	test	%VRAX, %VRAX
+	jz	L(first_vec_x0_x1_test)
+	bsr	%VRAX, %VRAX
+	leaq	(VEC_SIZE * 2)(%r8, %rax, CHAR_SIZE), %rax
 	ret
 
-	/* String end in first loop vector.  */
-L(loop_vector_x1_end):
-	KMOV	%k1, %VRAX
-	BLSMSK	%VRCX, %VRCX
-	/* Check if it has match in second loop vector.  */
-	and	%VRCX, %VRAX
-	jz	L(check_last_match)
 
-	BSR	%VRAX, %VRAX
-	leaq	(%rdi, %rax, CHAR_SIZE), %rax
-	ret
+	.p2align 4
+	/* We can throw away the work done for the first 4x checks here
+	   as we have a later match. This is the 'fast' path per se.  */
+L(second_aligned_loop_prep):
+L(second_aligned_loop_set_furthest_match):
+	movq	%rdi, %rsi
+	VMOVA	%VMM(5), %VMM(7)
+	VMOVA	%VMM(6), %VMM(8)
+	.p2align 4
+L(second_aligned_loop):
+	VMOVU	(VEC_SIZE * 4)(%rdi), %VMM(5)
+	VMOVU	(VEC_SIZE * 5)(%rdi), %VMM(6)
+	VPCMP	$4, %VMM(5), %VMATCH, %k2
+	VPCMP	$4, %VMM(6), %VMATCH, %k3{%k2}
+
+	VPMIN	%VMM(5), %VMM(6), %VMM(4)
+
+	VPTEST	%VMM(4), %VMM(4), %k1{%k3}
+	subq	$(VEC_SIZE * -2), %rdi
+	KMOV	%k1, %VRCX
+	inc	%RCX_M
+	jz	L(second_aligned_loop)
+	VPTESTN	%VMM(4), %VMM(4), %k1
+	KMOV	%k1, %VRDX
+	test	%VRDX, %VRDX
+	jz	L(second_aligned_loop_set_furthest_match)
 
-	/* No match in first and second loop vector.  */
-L(check_last_match):
-	/* Check if any match recorded in r8.  */
-	test	%r8, %r8
-	jz	L(vector_x2_ret)
-	movq	%r8, %rax
+	kortestM %k3, %k3
+	jnc	L(return_new_match)
+	/* branch here because there is a significant advantage in terms
+	   of output dependency chance in using edx.  */
+
+
+L(return_old_match):
+	VPCMPEQ	%VMM(8), %VMATCH, %k0
+	KMOV	%k0, %VRCX
+	bsr	%VRCX, %VRCX
+	jnz	L(return_old_match_ret)
+
+	VPCMPEQ	%VMM(7), %VMATCH, %k0
+	KMOV	%k0, %VRCX
+	bsr	%VRCX, %VRCX
+	subq	$VEC_SIZE, %rsi
+L(return_old_match_ret):
+	leaq	(VEC_SIZE * 3)(%rsi, %rcx, CHAR_SIZE), %rax
 	ret
 
-	/* No match recorded in r8. Check the second saved vector
-	   in beginning.  */
-L(vector_x2_ret):
-	VPCMPEQ %VMM(2), %VMM(0), %k2
-	KMOV	%k2, %VRAX
-	test	%VRAX, %VRAX
-	jz	L(vector_x1_ret)
 
-	/* Match found in the second saved vector.  */
-	BSR	%VRAX, %VRAX
-	leaq	(VEC_SIZE)(%r9, %rax, CHAR_SIZE), %rax
+L(return_new_match):
+	VPTESTN	%VMM(5), %VMM(5), %k0
+	KMOV	%k0, %VRCX
+	blsmsk	%VRCX, %VRCX
+	jnc	L(return_new_match_first)
+	dec	%VRDX
+	VPCMPEQ	%VMM(6), %VMATCH, %k0
+	KMOV	%k0, %VRAX
+	addq	$VEC_SIZE, %rdi
+	and	%VRDX, %VRAX
+	jnz	L(return_new_match_ret)
+	subq	$VEC_SIZE, %rdi
+L(return_new_match_first):
+	KMOV	%k2, %VRAX
+# ifdef USE_AS_WCSRCHR
+	xorl	$((1 << CHAR_PER_VEC)- 1), %VRAX
+	and	%VRCX, %VRAX
+# else
+	andn	%VRCX, %VRAX, %VRAX
+# endif
+	jz	L(return_old_match)
+L(return_new_match_ret):
+	bsr	%VRAX, %VRAX
+	leaq	(VEC_SIZE * 2)(%rdi, %rax, CHAR_SIZE), %rax
 	ret
 
-L(page_cross):
-	mov	%rdi, %rax
-	movl	%edi, %ecx
+	.p2align 4,, 4
+L(cross_page_boundary):
+	xorq	%rdi, %rax
+	mov	$-1, %VRDX
+	VMOVU	(PAGE_SIZE - VEC_SIZE)(%rax), %VMM(6)
+	VPTESTN	%VMM(6), %VMM(6), %k0
+	KMOV	%k0, %VRSI
 
 # ifdef USE_AS_WCSRCHR
-	/* Calculate number of compare result bits to be skipped for
-	   wide string alignment adjustment.  */
-	andl	$(VEC_SIZE - 1), %ecx
-	sarl	$2, %ecx
+	movl	%edi, %ecx
+	and	$(VEC_SIZE - 1), %ecx
+	shrl	$2, %ecx
 # endif
-	/* ecx contains number of w[char] to be skipped as a result
-	   of address alignment.  */
-	andq    $-VEC_SIZE, %rax
-	VMOVA	(%rax), %VMM(1)
-	VPTESTN	%VMM(1), %VMM(1), %k1
-	KMOV	%k1, %VRAX
-	SHR     %cl, %VRAX
-	jz	L(page_cross_continue)
-	VPCMPEQ	%VMM(1), %VMM(0), %k0
-	KMOV	%k0, %VRDX
-	SHR     %cl, %VRDX
-	BLSMSK	%VRAX, %VRAX
-	and	%VRDX, %VRAX
-	jz	L(ret)
-	BSR	%VRAX, %VRAX
+	shlx	%SHIFT_REG, %VRDX, %VRDX
+
 # ifdef USE_AS_WCSRCHR
-	leaq	(%rdi, %rax, CHAR_SIZE), %rax
+	kmovw	%edx, %k1
 # else
-	add	%rdi, %rax
+	KMOV	%VRDX, %k1
 # endif
 
-	ret
-END (STRRCHR)
+	VPCOMPRESS %VMM(6), %VMM(1){%k1}{z}
+	/* We could technically just jmp back after the vpcompress but
+	   it doesn't save any 16-byte blocks.  */
+
+	shrx	%SHIFT_REG, %VRSI, %VRSI
+	test	%VRSI, %VRSI
+	jnz	L(page_cross_return)
+	jmp	L(page_cross_continue)
+	/* 1-byte from cache line.  */
+END(STRRCHR)
 #endif
diff --git a/sysdeps/x86_64/multiarch/strrchr-evex.S b/sysdeps/x86_64/multiarch/strrchr-evex.S
index 85e3b0119f..b606e6f69c 100644
--- a/sysdeps/x86_64/multiarch/strrchr-evex.S
+++ b/sysdeps/x86_64/multiarch/strrchr-evex.S
@@ -1,394 +1,8 @@
-/* strrchr/wcsrchr optimized with 256-bit EVEX instructions.
-   Copyright (C) 2021-2023 Free Software Foundation, Inc.
-   This file is part of the GNU C Library.
-
-   The GNU C Library is free software; you can redistribute it and/or
-   modify it under the terms of the GNU Lesser General Public
-   License as published by the Free Software Foundation; either
-   version 2.1 of the License, or (at your option) any later version.
-
-   The GNU C Library is distributed in the hope that it will be useful,
-   but WITHOUT ANY WARRANTY; without even the implied warranty of
-   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
-   Lesser General Public License for more details.
-
-   You should have received a copy of the GNU Lesser General Public
-   License along with the GNU C Library; if not, see
-   <https://www.gnu.org/licenses/>.  */
-
-#include <isa-level.h>
-
-#if ISA_SHOULD_BUILD (4)
-
-# include <sysdep.h>
-
 # ifndef STRRCHR
 #  define STRRCHR	__strrchr_evex
 # endif
 
-# include "x86-evex256-vecs.h"
-
-# ifdef USE_AS_WCSRCHR
-#  define SHIFT_REG	rsi
-#  define kunpck_2x	kunpckbw
-#  define kmov_2x	kmovd
-#  define maskz_2x	ecx
-#  define maskm_2x	eax
-#  define CHAR_SIZE	4
-#  define VPMIN	vpminud
-#  define VPTESTN	vptestnmd
-#  define VPTEST	vptestmd
-#  define VPBROADCAST	vpbroadcastd
-#  define VPCMPEQ	vpcmpeqd
-#  define VPCMP	vpcmpd
-
-#  define USE_WIDE_CHAR
-# else
-#  define SHIFT_REG	rdi
-#  define kunpck_2x	kunpckdq
-#  define kmov_2x	kmovq
-#  define maskz_2x	rcx
-#  define maskm_2x	rax
-
-#  define CHAR_SIZE	1
-#  define VPMIN	vpminub
-#  define VPTESTN	vptestnmb
-#  define VPTEST	vptestmb
-#  define VPBROADCAST	vpbroadcastb
-#  define VPCMPEQ	vpcmpeqb
-#  define VPCMP	vpcmpb
-# endif
-
-# include "reg-macros.h"
-
-# define VMATCH	VMM(0)
-# define CHAR_PER_VEC	(VEC_SIZE / CHAR_SIZE)
-# define PAGE_SIZE	4096
-
-	.section SECTION(.text), "ax", @progbits
-ENTRY_P2ALIGN(STRRCHR, 6)
-	movl	%edi, %eax
-	/* Broadcast CHAR to VMATCH.  */
-	VPBROADCAST %esi, %VMATCH
-
-	andl	$(PAGE_SIZE - 1), %eax
-	cmpl	$(PAGE_SIZE - VEC_SIZE), %eax
-	jg	L(cross_page_boundary)
-L(page_cross_continue):
-	VMOVU	(%rdi), %VMM(1)
-	/* k0 has a 1 for each zero CHAR in VEC(1).  */
-	VPTESTN	%VMM(1), %VMM(1), %k0
-	KMOV	%k0, %VRSI
-	test	%VRSI, %VRSI
-	jz	L(aligned_more)
-	/* fallthrough: zero CHAR in first VEC.  */
-	/* K1 has a 1 for each search CHAR match in VEC(1).  */
-	VPCMPEQ	%VMATCH, %VMM(1), %k1
-	KMOV	%k1, %VRAX
-	/* Build mask up until first zero CHAR (used to mask of
-	   potential search CHAR matches past the end of the string).
-	 */
-	blsmsk	%VRSI, %VRSI
-	and	%VRSI, %VRAX
-	jz	L(ret0)
-	/* Get last match (the `and` removed any out of bounds matches).
-	 */
-	bsr	%VRAX, %VRAX
-# ifdef USE_AS_WCSRCHR
-	leaq	(%rdi, %rax, CHAR_SIZE), %rax
-# else
-	addq	%rdi, %rax
-# endif
-L(ret0):
-	ret
-
-	/* Returns for first vec x1/x2/x3 have hard coded backward
-	   search path for earlier matches.  */
-	.p2align 4,, 6
-L(first_vec_x1):
-	VPCMPEQ	%VMATCH, %VMM(2), %k1
-	KMOV	%k1, %VRAX
-	blsmsk	%VRCX, %VRCX
-	/* eax non-zero if search CHAR in range.  */
-	and	%VRCX, %VRAX
-	jnz	L(first_vec_x1_return)
-
-	/* fallthrough: no match in VEC(2) then need to check for
-	   earlier matches (in VEC(1)).  */
-	.p2align 4,, 4
-L(first_vec_x0_test):
-	VPCMPEQ	%VMATCH, %VMM(1), %k1
-	KMOV	%k1, %VRAX
-	test	%VRAX, %VRAX
-	jz	L(ret1)
-	bsr	%VRAX, %VRAX
-# ifdef USE_AS_WCSRCHR
-	leaq	(%rsi, %rax, CHAR_SIZE), %rax
-# else
-	addq	%rsi, %rax
-# endif
-L(ret1):
-	ret
-
-	.p2align 4,, 10
-L(first_vec_x1_or_x2):
-	VPCMPEQ	%VMM(3), %VMATCH, %k3
-	VPCMPEQ	%VMM(2), %VMATCH, %k2
-	/* K2 and K3 have 1 for any search CHAR match. Test if any
-	   matches between either of them. Otherwise check VEC(1).  */
-	KORTEST %k2, %k3
-	jz	L(first_vec_x0_test)
-
-	/* Guaranteed that VEC(2) and VEC(3) are within range so merge
-	   the two bitmasks then get last result.  */
-	kunpck_2x %k2, %k3, %k3
-	kmov_2x	%k3, %maskm_2x
-	bsr	%maskm_2x, %maskm_2x
-	leaq	(VEC_SIZE * 1)(%r8, %rax, CHAR_SIZE), %rax
-	ret
-
-	.p2align 4,, 7
-L(first_vec_x3):
-	VPCMPEQ	%VMATCH, %VMM(4), %k1
-	KMOV	%k1, %VRAX
-	blsmsk	%VRCX, %VRCX
-	/* If no search CHAR match in range check VEC(1)/VEC(2)/VEC(3).
-	 */
-	and	%VRCX, %VRAX
-	jz	L(first_vec_x1_or_x2)
-	bsr	%VRAX, %VRAX
-	leaq	(VEC_SIZE * 3)(%rdi, %rax, CHAR_SIZE), %rax
-	ret
-
-
-	.p2align 4,, 6
-L(first_vec_x0_x1_test):
-	VPCMPEQ	%VMATCH, %VMM(2), %k1
-	KMOV	%k1, %VRAX
-	/* Check VEC(2) for last match first. If no match try VEC(1).
-	 */
-	test	%VRAX, %VRAX
-	jz	L(first_vec_x0_test)
-	.p2align 4,, 4
-L(first_vec_x1_return):
-	bsr	%VRAX, %VRAX
-	leaq	(VEC_SIZE)(%rdi, %rax, CHAR_SIZE), %rax
-	ret
-
-
-	.p2align 4,, 10
-L(first_vec_x2):
-	VPCMPEQ	%VMATCH, %VMM(3), %k1
-	KMOV	%k1, %VRAX
-	blsmsk	%VRCX, %VRCX
-	/* Check VEC(3) for last match first. If no match try
-	   VEC(2)/VEC(1).  */
-	and	%VRCX, %VRAX
-	jz	L(first_vec_x0_x1_test)
-	bsr	%VRAX, %VRAX
-	leaq	(VEC_SIZE * 2)(%rdi, %rax, CHAR_SIZE), %rax
-	ret
-
-
-	.p2align 4,, 12
-L(aligned_more):
-	/* Need to keep original pointer in case VEC(1) has last match.
-	 */
-	movq	%rdi, %rsi
-	andq	$-VEC_SIZE, %rdi
-
-	VMOVU	VEC_SIZE(%rdi), %VMM(2)
-	VPTESTN	%VMM(2), %VMM(2), %k0
-	KMOV	%k0, %VRCX
-
-	test	%VRCX, %VRCX
-	jnz	L(first_vec_x1)
-
-	VMOVU	(VEC_SIZE * 2)(%rdi), %VMM(3)
-	VPTESTN	%VMM(3), %VMM(3), %k0
-	KMOV	%k0, %VRCX
-
-	test	%VRCX, %VRCX
-	jnz	L(first_vec_x2)
-
-	VMOVU	(VEC_SIZE * 3)(%rdi), %VMM(4)
-	VPTESTN	%VMM(4), %VMM(4), %k0
-	KMOV	%k0, %VRCX
-	movq	%rdi, %r8
-	test	%VRCX, %VRCX
-	jnz	L(first_vec_x3)
-
-	andq	$-(VEC_SIZE * 2), %rdi
-	.p2align 4,, 10
-L(first_aligned_loop):
-	/* Preserve VEC(1), VEC(2), VEC(3), and VEC(4) until we can
-	   guarantee they don't store a match.  */
-	VMOVA	(VEC_SIZE * 4)(%rdi), %VMM(5)
-	VMOVA	(VEC_SIZE * 5)(%rdi), %VMM(6)
-
-	VPCMPEQ	%VMM(5), %VMATCH, %k2
-	vpxord	%VMM(6), %VMATCH, %VMM(7)
-
-	VPMIN	%VMM(5), %VMM(6), %VMM(8)
-	VPMIN	%VMM(8), %VMM(7), %VMM(7)
-
-	VPTESTN	%VMM(7), %VMM(7), %k1
-	subq	$(VEC_SIZE * -2), %rdi
-	KORTEST %k1, %k2
-	jz	L(first_aligned_loop)
-
-	VPCMPEQ	%VMM(6), %VMATCH, %k3
-	VPTESTN	%VMM(8), %VMM(8), %k1
-
-	/* If k1 is zero, then we found a CHAR match but no null-term.
-	   We can now safely throw out VEC1-4.  */
-	KTEST	%k1, %k1
-	jz	L(second_aligned_loop_prep)
-
-	KORTEST %k2, %k3
-	jnz	L(return_first_aligned_loop)
-
-
-	.p2align 4,, 6
-L(first_vec_x1_or_x2_or_x3):
-	VPCMPEQ	%VMM(4), %VMATCH, %k4
-	KMOV	%k4, %VRAX
-	bsr	%VRAX, %VRAX
-	jz	L(first_vec_x1_or_x2)
-	leaq	(VEC_SIZE * 3)(%r8, %rax, CHAR_SIZE), %rax
-	ret
-
-
-	.p2align 4,, 8
-L(return_first_aligned_loop):
-	VPTESTN	%VMM(5), %VMM(5), %k0
-
-	/* Combined results from VEC5/6.  */
-	kunpck_2x %k0, %k1, %k0
-	kmov_2x	%k0, %maskz_2x
-
-	blsmsk	%maskz_2x, %maskz_2x
-	kunpck_2x %k2, %k3, %k3
-	kmov_2x	%k3, %maskm_2x
-	and	%maskz_2x, %maskm_2x
-	jz	L(first_vec_x1_or_x2_or_x3)
-
-	bsr	%maskm_2x, %maskm_2x
-	leaq	(VEC_SIZE * 2)(%rdi, %rax, CHAR_SIZE), %rax
-	ret
-
-	.p2align 4
-	/* We can throw away the work done for the first 4x checks here
-	   as we have a later match. This is the 'fast' path persay.
-	 */
-L(second_aligned_loop_prep):
-L(second_aligned_loop_set_furthest_match):
-	movq	%rdi, %rsi
-	/* Ideally we would safe k2/k3 but `kmov/kunpck` take uops on
-	   port0 and have noticeable overhead in the loop.  */
-	VMOVA	%VMM(5), %VMM(7)
-	VMOVA	%VMM(6), %VMM(8)
-	.p2align 4
-L(second_aligned_loop):
-	VMOVU	(VEC_SIZE * 4)(%rdi), %VMM(5)
-	VMOVU	(VEC_SIZE * 5)(%rdi), %VMM(6)
-	VPCMPEQ	%VMM(5), %VMATCH, %k2
-	vpxord	%VMM(6), %VMATCH, %VMM(3)
-
-	VPMIN	%VMM(5), %VMM(6), %VMM(4)
-	VPMIN	%VMM(3), %VMM(4), %VMM(3)
-
-	VPTESTN	%VMM(3), %VMM(3), %k1
-	subq	$(VEC_SIZE * -2), %rdi
-	KORTEST %k1, %k2
-	jz	L(second_aligned_loop)
-	VPCMPEQ	%VMM(6), %VMATCH, %k3
-	VPTESTN	%VMM(4), %VMM(4), %k1
-	KTEST	%k1, %k1
-	jz	L(second_aligned_loop_set_furthest_match)
-
-	/* branch here because we know we have a match in VEC7/8 but
-	   might not in VEC5/6 so the latter is expected to be less
-	   likely.  */
-	KORTEST %k2, %k3
-	jnz	L(return_new_match)
-
-L(return_old_match):
-	VPCMPEQ	%VMM(8), %VMATCH, %k0
-	KMOV	%k0, %VRCX
-	bsr	%VRCX, %VRCX
-	jnz	L(return_old_match_ret)
-
-	VPCMPEQ	%VMM(7), %VMATCH, %k0
-	KMOV	%k0, %VRCX
-	bsr	%VRCX, %VRCX
-	subq	$VEC_SIZE, %rsi
-L(return_old_match_ret):
-	leaq	(VEC_SIZE * 3)(%rsi, %rcx, CHAR_SIZE), %rax
-	ret
-
-	.p2align 4,, 10
-L(return_new_match):
-	VPTESTN	%VMM(5), %VMM(5), %k0
-
-	/* Combined results from VEC5/6.  */
-	kunpck_2x %k0, %k1, %k0
-	kmov_2x	%k0, %maskz_2x
-
-	blsmsk	%maskz_2x, %maskz_2x
-	kunpck_2x %k2, %k3, %k3
-	kmov_2x	%k3, %maskm_2x
-
-	/* Match at end was out-of-bounds so use last known match.  */
-	and	%maskz_2x, %maskm_2x
-	jz	L(return_old_match)
-
-	bsr	%maskm_2x, %maskm_2x
-	leaq	(VEC_SIZE * 2)(%rdi, %rax, CHAR_SIZE), %rax
-	ret
-
-L(cross_page_boundary):
-	/* eax contains all the page offset bits of src (rdi). `xor rdi,
-	   rax` sets pointer will all page offset bits cleared so
-	   offset of (PAGE_SIZE - VEC_SIZE) will get last aligned VEC
-	   before page cross (guaranteed to be safe to read). Doing this
-	   as opposed to `movq %rdi, %rax; andq $-VEC_SIZE, %rax` saves
-	   a bit of code size.  */
-	xorq	%rdi, %rax
-	VMOVU	(PAGE_SIZE - VEC_SIZE)(%rax), %VMM(1)
-	VPTESTN	%VMM(1), %VMM(1), %k0
-	KMOV	%k0, %VRCX
-
-	/* Shift out zero CHAR matches that are before the beginning of
-	   src (rdi).  */
-# ifdef USE_AS_WCSRCHR
-	movl	%edi, %esi
-	andl	$(VEC_SIZE - 1), %esi
-	shrl	$2, %esi
-# endif
-	shrx	%VGPR(SHIFT_REG), %VRCX, %VRCX
-
-	test	%VRCX, %VRCX
-	jz	L(page_cross_continue)
+#include "x86-evex512-vecs.h"
+#include "reg-macros.h"
 
-	/* Found zero CHAR so need to test for search CHAR.  */
-	VPCMP	$0, %VMATCH, %VMM(1), %k1
-	KMOV	%k1, %VRAX
-	/* Shift out search CHAR matches that are before the beginning of
-	   src (rdi).  */
-	shrx	%VGPR(SHIFT_REG), %VRAX, %VRAX
-
-	/* Check if any search CHAR match in range.  */
-	blsmsk	%VRCX, %VRCX
-	and	%VRCX, %VRAX
-	jz	L(ret3)
-	bsr	%VRAX, %VRAX
-# ifdef USE_AS_WCSRCHR
-	leaq	(%rdi, %rax, CHAR_SIZE), %rax
-# else
-	addq	%rdi, %rax
-# endif
-L(ret3):
-	ret
-END(STRRCHR)
-#endif
+#include "strrchr-evex-base.S"
diff --git a/sysdeps/x86_64/multiarch/wcsrchr-evex.S b/sysdeps/x86_64/multiarch/wcsrchr-evex.S
index e5c5fe3bf2..a584cd3f43 100644
--- a/sysdeps/x86_64/multiarch/wcsrchr-evex.S
+++ b/sysdeps/x86_64/multiarch/wcsrchr-evex.S
@@ -4,4 +4,5 @@
 
 #define STRRCHR	WCSRCHR
 #define USE_AS_WCSRCHR 1
+#define USE_WIDE_CHAR 1
 #include "strrchr-evex.S"
-- 
2.34.1



* Re: x86: Prepare `strrchr-evex` and `strrchr-evex512` for AVX10
  2023-09-21 14:38 x86: Prepare `strrchr-evex` and `strrchr-evex512` for AVX10 Noah Goldstein
@ 2023-09-21 14:39 ` Noah Goldstein
  2023-09-21 15:16   ` H.J. Lu
  2023-10-04 18:48 ` Noah Goldstein
  1 sibling, 1 reply; 12+ messages in thread
From: Noah Goldstein @ 2023-09-21 14:39 UTC
  To: libc-alpha; +Cc: hjl.tools, carlos

[-- Attachment #1: Type: text/plain, Size: 33942 bytes --]

On Thu, Sep 21, 2023 at 9:38 AM Noah Goldstein <goldstein.w.n@gmail.com> wrote:
>
> This commit refactors `strrchr-evex` and `strrchr-evex512` to use a
> common implementation: `strrchr-evex-base.S`.
>
> The motivation is that `strrchr-evex` needed to be refactored to not
> use 64-bit mask registers in preparation for AVX10.
>
> Once vec-width mask register combining is removed, the EVEX and
> EVEX512 implementations can easily share one file without any major
> overhead.
>
> The net result is performance improvements (measured on TGL) for both
> `strrchr-evex` and `strrchr-evex512`. Note, however, that there are
> some regressions in the test suite, and many of the cases that make
> up the total geomean of improvement/regression across bench-strrchr
> may be cold. The point of the performance measurement is to show
> there are no major regressions; the primary motivation is
> preparation for AVX10.
>
> Benchmarks were taken on TGL:
> https://www.intel.com/content/www/us/en/products/sku/213799/intel-core-i711850h-processor-24m-cache-up-to-4-80-ghz/specifications.html
>
> EVEX geometric_mean(N=5) of all benchmarks New / Original   : 0.74
> EVEX512 geometric_mean(N=5) of all benchmarks New / Original: 0.87
Full summary attached here.

>
> Full check passes on x86.
> ---
>  sysdeps/x86_64/multiarch/strrchr-evex-base.S | 466 ++++++++++++-------
>  sysdeps/x86_64/multiarch/strrchr-evex.S      | 392 +---------------
>  sysdeps/x86_64/multiarch/wcsrchr-evex.S      |   1 +
>  3 files changed, 294 insertions(+), 565 deletions(-)
>
> diff --git a/sysdeps/x86_64/multiarch/strrchr-evex-base.S b/sysdeps/x86_64/multiarch/strrchr-evex-base.S
> index 58b2853ab6..2c98f07fca 100644
> --- a/sysdeps/x86_64/multiarch/strrchr-evex-base.S
> +++ b/sysdeps/x86_64/multiarch/strrchr-evex-base.S
> @@ -25,240 +25,354 @@
>  # include <sysdep.h>
>
>  # ifdef USE_AS_WCSRCHR
> +#  if VEC_SIZE == 64
> +#   define RCX_M       cx
> +#   define kortestM    kortestw
> +#  else
> +#   define RCX_M       cl
> +#   define kortestM    kortestb
> +#  endif
> +
> +#  define SHIFT_REG    VRCX
> +#  define VPCOMPRESS   vpcompressd
>  #  define CHAR_SIZE    4
> -#  define VPBROADCAST   vpbroadcastd
> -#  define VPCMPEQ      vpcmpeqd
> -#  define VPMINU       vpminud
> +#  define VPMIN        vpminud
>  #  define VPTESTN      vptestnmd
> +#  define VPTEST       vptestmd
> +#  define VPBROADCAST  vpbroadcastd
> +#  define VPCMPEQ      vpcmpeqd
> +#  define VPCMP        vpcmpd
>  # else
> +#  define SHIFT_REG    VRDI
> +#  define VPCOMPRESS   vpcompressb
>  #  define CHAR_SIZE    1
> -#  define VPBROADCAST   vpbroadcastb
> -#  define VPCMPEQ      vpcmpeqb
> -#  define VPMINU       vpminub
> +#  define VPMIN        vpminub
>  #  define VPTESTN      vptestnmb
> +#  define VPTEST       vptestmb
> +#  define VPBROADCAST  vpbroadcastb
> +#  define VPCMPEQ      vpcmpeqb
> +#  define VPCMP        vpcmpb
> +
> +#  define RCX_M        VRCX
> +#  define kortestM     KORTEST
>  # endif
>
> -# define PAGE_SIZE     4096
> +# define VMATCH        VMM(0)
>  # define CHAR_PER_VEC  (VEC_SIZE / CHAR_SIZE)
> +# define PAGE_SIZE     4096
>
>         .section SECTION(.text), "ax", @progbits
> -/* Aligning entry point to 64 byte, provides better performance for
> -   one vector length string.  */
> -ENTRY_P2ALIGN (STRRCHR, 6)
> -
> -       /* Broadcast CHAR to VMM(0).  */
> -       VPBROADCAST %esi, %VMM(0)
> +       /* Aligning entry point to 64 byte, provides better performance for
> +          one vector length string.  */
> +ENTRY_P2ALIGN(STRRCHR, 6)
>         movl    %edi, %eax
> -       sall    $20, %eax
> -       cmpl    $((PAGE_SIZE - VEC_SIZE) << 20), %eax
> -       ja      L(page_cross)
> +       /* Broadcast CHAR to VMATCH.  */
> +       VPBROADCAST %esi, %VMATCH
>
> -L(page_cross_continue):
> -       /* Compare [w]char for null, mask bit will be set for match.  */
> -       VMOVU   (%rdi), %VMM(1)
> +       andl    $(PAGE_SIZE - 1), %eax
> +       cmpl    $(PAGE_SIZE - VEC_SIZE), %eax
> +       jg      L(cross_page_boundary)
>
> -       VPTESTN %VMM(1), %VMM(1), %k1
> -       KMOV    %k1, %VRCX
> -       test    %VRCX, %VRCX
> -       jz      L(align_more)
> -
> -       VPCMPEQ %VMM(1), %VMM(0), %k0
> -       KMOV    %k0, %VRAX
> -       BLSMSK  %VRCX, %VRCX
> -       and     %VRCX, %VRAX
> -       jz      L(ret)
> -
> -       BSR     %VRAX, %VRAX
> +       VMOVU   (%rdi), %VMM(1)
> +       /* k0 has a 1 for each zero CHAR in YMM1.  */
> +       VPTESTN %VMM(1), %VMM(1), %k0
> +       KMOV    %k0, %VGPR(rsi)
> +       test    %VGPR(rsi), %VGPR(rsi)
> +       jz      L(aligned_more)
> +       /* fallthrough: zero CHAR in first VEC.  */
> +L(page_cross_return):
> +       /* K1 has a 1 for each search CHAR match in VEC(1).  */
> +       VPCMPEQ %VMATCH, %VMM(1), %k1
> +       KMOV    %k1, %VGPR(rax)
> +       /* Build mask up until first zero CHAR (used to mask off
> +          potential search CHAR matches past the end of the string).  */
> +       blsmsk  %VGPR(rsi), %VGPR(rsi)
> +       and     %VGPR(rsi), %VGPR(rax)
> +       jz      L(ret0)
> +       /* Get last match (the `and` removed any out of bounds matches).  */
> +       bsr     %VGPR(rax), %VGPR(rax)
>  # ifdef USE_AS_WCSRCHR
>         leaq    (%rdi, %rax, CHAR_SIZE), %rax
>  # else
> -       add     %rdi, %rax
> +       addq    %rdi, %rax
>  # endif
> -L(ret):
> +L(ret0):
>         ret
>
> -L(vector_x2_end):
> -       VPCMPEQ %VMM(2), %VMM(0), %k2
> -       KMOV    %k2, %VRAX
> -       BLSMSK  %VRCX, %VRCX
> -       and     %VRCX, %VRAX
> -       jz      L(vector_x1_ret)
> -
> -       BSR     %VRAX, %VRAX
> -       leaq    (VEC_SIZE)(%rdi, %rax, CHAR_SIZE), %rax
> -       ret
> -
> -       /* Check the first vector at very last to look for match.  */
> -L(vector_x1_ret):
> -       VPCMPEQ %VMM(1), %VMM(0), %k2
> -       KMOV    %k2, %VRAX
> -       test    %VRAX, %VRAX
> -       jz      L(ret)
> -
> -       BSR     %VRAX, %VRAX
> +       /* Returns for first vec x1/x2/x3 have hard coded backward
> +          search path for earlier matches.  */
> +       .p2align 4,, 6
> +L(first_vec_x1):
> +       VPCMPEQ %VMATCH, %VMM(2), %k1
> +       KMOV    %k1, %VGPR(rax)
> +       blsmsk  %VGPR(rcx), %VGPR(rcx)
> +       /* eax non-zero if search CHAR in range.  */
> +       and     %VGPR(rcx), %VGPR(rax)
> +       jnz     L(first_vec_x1_return)
> +
> +       /* fallthrough: no match in YMM2 then need to check for earlier
> +          matches (in YMM1).  */
> +       .p2align 4,, 4
> +L(first_vec_x0_test):
> +       VPCMPEQ %VMATCH, %VMM(1), %k1
> +       KMOV    %k1, %VGPR(rax)
> +       test    %VGPR(rax), %VGPR(rax)
> +       jz      L(ret1)
> +       bsr     %VGPR(rax), %VGPR(rax)
>  # ifdef USE_AS_WCSRCHR
>         leaq    (%rsi, %rax, CHAR_SIZE), %rax
>  # else
> -       add     %rsi, %rax
> +
> +       addq    %rsi, %rax
>  # endif
> +L(ret1):
> +       ret
> +
> +       .p2align 4,, 10
> +L(first_vec_x3):
> +       VPCMPEQ %VMATCH, %VMM(4), %k1
> +       KMOV    %k1, %VGPR(rax)
> +       blsmsk  %VGPR(rcx), %VGPR(rcx)
> +       /* If no search CHAR match in range check YMM1/YMM2/YMM3.  */
> +       and     %VGPR(rcx), %VGPR(rax)
> +       jz      L(first_vec_x1_or_x2)
> +       bsr     %VGPR(rax), %VGPR(rax)
> +       leaq    (VEC_SIZE * 3)(%rdi, %rax, CHAR_SIZE), %rax
> +       ret
> +       .p2align 4,, 4
> +
> +L(first_vec_x2):
> +       VPCMPEQ %VMATCH, %VMM(3), %k1
> +       KMOV    %k1, %VGPR(rax)
> +       blsmsk  %VGPR(rcx), %VGPR(rcx)
> +       /* Check YMM3 for last match first. If no match try YMM2/YMM1.  */
> +       and     %VGPR(rcx), %VGPR(rax)
> +       jz      L(first_vec_x0_x1_test)
> +       bsr     %VGPR(rax), %VGPR(rax)
> +       leaq    (VEC_SIZE * 2)(%r8, %rax, CHAR_SIZE), %rax
>         ret
>
> -L(align_more):
> -       /* Zero r8 to store match result.  */
> -       xorl    %r8d, %r8d
> -       /* Save pointer of first vector, in case if no match found.  */
> +       .p2align 4,, 6
> +L(first_vec_x0_x1_test):
> +       VPCMPEQ %VMATCH, %VMM(2), %k1
> +       KMOV    %k1, %VGPR(rax)
> +       /* Check YMM2 for last match first. If no match try YMM1.  */
> +       test    %VGPR(rax), %VGPR(rax)
> +       jz      L(first_vec_x0_test)
> +       .p2align 4,, 4
> +L(first_vec_x1_return):
> +       bsr     %VGPR(rax), %VGPR(rax)
> +       leaq    (VEC_SIZE)(%r8, %rax, CHAR_SIZE), %rax
> +       ret
> +
> +       .p2align 4,, 12
> +L(aligned_more):
> +L(page_cross_continue):
> +       /* Need to keep original pointer in case VEC(1) has last match.  */
>         movq    %rdi, %rsi
> -       /* Align pointer to vector size.  */
>         andq    $-VEC_SIZE, %rdi
> -       /* Loop unroll for 2 vector loop.  */
> -       VMOVA   (VEC_SIZE)(%rdi), %VMM(2)
> +
> +       VMOVU   VEC_SIZE(%rdi), %VMM(2)
>         VPTESTN %VMM(2), %VMM(2), %k0
>         KMOV    %k0, %VRCX
> +       movq    %rdi, %r8
>         test    %VRCX, %VRCX
> -       jnz     L(vector_x2_end)
> +       jnz     L(first_vec_x1)
> +
> +       VMOVU   (VEC_SIZE * 2)(%rdi), %VMM(3)
> +       VPTESTN %VMM(3), %VMM(3), %k0
> +       KMOV    %k0, %VRCX
> +
> +       test    %VRCX, %VRCX
> +       jnz     L(first_vec_x2)
> +
> +       VMOVU   (VEC_SIZE * 3)(%rdi), %VMM(4)
> +       VPTESTN %VMM(4), %VMM(4), %k0
> +       KMOV    %k0, %VRCX
> +
> +       /* Intentionally use 64-bit here.  EVEX256 version needs 1-byte
> +          padding for efficient nop before loop alignment.  */
> +       test    %rcx, %rcx
> +       jnz     L(first_vec_x3)
>
> -       /* Save pointer of second vector, in case if no match
> -          found.  */
> -       movq    %rdi, %r9
> -       /* Align address to VEC_SIZE * 2 for loop.  */
>         andq    $-(VEC_SIZE * 2), %rdi
> +       .p2align 4
> +L(first_aligned_loop):
> +       /* Preserve VEC(1), VEC(2), VEC(3), and VEC(4) until we can
> +          guarantee they don't store a match.  */
> +       VMOVA   (VEC_SIZE * 4)(%rdi), %VMM(5)
> +       VMOVA   (VEC_SIZE * 5)(%rdi), %VMM(6)
>
> -       .p2align 4,,11
> -L(loop):
> -       /* 2 vector loop, as it provide better performance as compared
> -          to 4 vector loop.  */
> -       VMOVA   (VEC_SIZE * 2)(%rdi), %VMM(3)
> -       VMOVA   (VEC_SIZE * 3)(%rdi), %VMM(4)
> -       VPCMPEQ %VMM(3), %VMM(0), %k1
> -       VPCMPEQ %VMM(4), %VMM(0), %k2
> -       VPMINU  %VMM(3), %VMM(4), %VMM(5)
> -       VPTESTN %VMM(5), %VMM(5), %k0
> -       KOR     %k1, %k2, %k3
> -       subq    $-(VEC_SIZE * 2), %rdi
> -       /* If k0 and k3 zero, match and end of string not found.  */
> -       KORTEST %k0, %k3
> -       jz      L(loop)
> -
> -       /* If k0 is non zero, end of string found.  */
> -       KORTEST %k0, %k0
> -       jnz     L(endloop)
> -
> -       lea     VEC_SIZE(%rdi), %r8
> -       /* A match found, it need to be stored in r8 before loop
> -          continue.  */
> -       /* Check second vector first.  */
> -       KMOV    %k2, %VRDX
> -       test    %VRDX, %VRDX
> -       jnz     L(loop_vec_x2_match)
> +       VPCMP   $4, %VMM(5), %VMATCH, %k2
> +       VPCMP   $4, %VMM(6), %VMATCH, %k3{%k2}
> +
> +       VPMIN   %VMM(5), %VMM(6), %VMM(7)
> +
> +       VPTEST  %VMM(7), %VMM(7), %k1{%k3}
> +       subq    $(VEC_SIZE * -2), %rdi
> +       kortestM %k1, %k1
> +       jc      L(first_aligned_loop)
>
> +       VPTESTN %VMM(7), %VMM(7), %k1
>         KMOV    %k1, %VRDX
> -       /* Match is in first vector, rdi offset need to be subtracted
> -         by VEC_SIZE.  */
> -       sub     $VEC_SIZE, %r8
> -
> -       /* If second vector doesn't have match, first vector must
> -          have match.  */
> -L(loop_vec_x2_match):
> -       BSR     %VRDX, %VRDX
> -# ifdef USE_AS_WCSRCHR
> -       sal     $2, %rdx
> -# endif
> -       add     %rdx, %r8
> -       jmp     L(loop)
> +       test    %VRDX, %VRDX
> +       jz      L(second_aligned_loop_prep)
>
> -L(endloop):
> -       /* Check if string end in first loop vector.  */
> -       VPTESTN %VMM(3), %VMM(3), %k0
> -       KMOV    %k0, %VRCX
> -       test    %VRCX, %VRCX
> -       jnz     L(loop_vector_x1_end)
> +       kortestM %k3, %k3
> +       jnc     L(return_first_aligned_loop)
>
> -       /* Check if it has match in first loop vector.  */
> -       KMOV    %k1, %VRAX
> +       .p2align 4,, 6
> +L(first_vec_x1_or_x2_or_x3):
> +       VPCMPEQ %VMM(4), %VMATCH, %k4
> +       KMOV    %k4, %VRAX
>         test    %VRAX, %VRAX
> -       jz      L(loop_vector_x2_end)
> +       jz      L(first_vec_x1_or_x2)
> +       bsr     %VRAX, %VRAX
> +       leaq    (VEC_SIZE * 3)(%r8, %rax, CHAR_SIZE), %rax
> +       ret
>
> -       BSR     %VRAX, %VRAX
> -       leaq    (%rdi, %rax, CHAR_SIZE), %r8
>
> -       /* String must end in second loop vector.  */
> -L(loop_vector_x2_end):
> -       VPTESTN %VMM(4), %VMM(4), %k0
> +       .p2align 4,, 8
> +L(return_first_aligned_loop):
> +       VPTESTN %VMM(5), %VMM(5), %k0
>         KMOV    %k0, %VRCX
> +       blsmsk  %VRCX, %VRCX
> +       jnc     L(return_first_new_match_first)
> +       blsmsk  %VRDX, %VRDX
> +       VPCMPEQ %VMM(6), %VMATCH, %k0
> +       KMOV    %k0, %VRAX
> +       addq    $VEC_SIZE, %rdi
> +       and     %VRDX, %VRAX
> +       jnz     L(return_first_new_match_ret)
> +       subq    $VEC_SIZE, %rdi
> +L(return_first_new_match_first):
>         KMOV    %k2, %VRAX
> -       BLSMSK  %VRCX, %VRCX
> -       /* Check if it has match in second loop vector.  */
> +# ifdef USE_AS_WCSRCHR
> +       xorl    $((1 << CHAR_PER_VEC)- 1), %VRAX
>         and     %VRCX, %VRAX
> -       jz      L(check_last_match)
> +# else
> +       andn    %VRCX, %VRAX, %VRAX
> +# endif
> +       jz      L(first_vec_x1_or_x2_or_x3)
> +L(return_first_new_match_ret):
> +       bsr     %VRAX, %VRAX
> +       leaq    (VEC_SIZE * 2)(%rdi, %rax, CHAR_SIZE), %rax
> +       ret
>
> -       BSR     %VRAX, %VRAX
> -       leaq    (VEC_SIZE)(%rdi, %rax, CHAR_SIZE), %rax
> +       .p2align 4,, 10
> +L(first_vec_x1_or_x2):
> +       VPCMPEQ %VMM(3), %VMATCH, %k3
> +       KMOV    %k3, %VRAX
> +       test    %VRAX, %VRAX
> +       jz      L(first_vec_x0_x1_test)
> +       bsr     %VRAX, %VRAX
> +       leaq    (VEC_SIZE * 2)(%r8, %rax, CHAR_SIZE), %rax
>         ret
>
> -       /* String end in first loop vector.  */
> -L(loop_vector_x1_end):
> -       KMOV    %k1, %VRAX
> -       BLSMSK  %VRCX, %VRCX
> -       /* Check if it has match in second loop vector.  */
> -       and     %VRCX, %VRAX
> -       jz      L(check_last_match)
>
> -       BSR     %VRAX, %VRAX
> -       leaq    (%rdi, %rax, CHAR_SIZE), %rax
> -       ret
> +       .p2align 4
> +       /* We can throw away the work done for the first 4x checks here
> +          as we have a later match. This is the 'fast' path per se.  */
> +L(second_aligned_loop_prep):
> +L(second_aligned_loop_set_furthest_match):
> +       movq    %rdi, %rsi
> +       VMOVA   %VMM(5), %VMM(7)
> +       VMOVA   %VMM(6), %VMM(8)
> +       .p2align 4
> +L(second_aligned_loop):
> +       VMOVU   (VEC_SIZE * 4)(%rdi), %VMM(5)
> +       VMOVU   (VEC_SIZE * 5)(%rdi), %VMM(6)
> +       VPCMP   $4, %VMM(5), %VMATCH, %k2
> +       VPCMP   $4, %VMM(6), %VMATCH, %k3{%k2}
> +
> +       VPMIN   %VMM(5), %VMM(6), %VMM(4)
> +
> +       VPTEST  %VMM(4), %VMM(4), %k1{%k3}
> +       subq    $(VEC_SIZE * -2), %rdi
> +       KMOV    %k1, %VRCX
> +       inc     %RCX_M
> +       jz      L(second_aligned_loop)
> +       VPTESTN %VMM(4), %VMM(4), %k1
> +       KMOV    %k1, %VRDX
> +       test    %VRDX, %VRDX
> +       jz      L(second_aligned_loop_set_furthest_match)
>
> -       /* No match in first and second loop vector.  */
> -L(check_last_match):
> -       /* Check if any match recorded in r8.  */
> -       test    %r8, %r8
> -       jz      L(vector_x2_ret)
> -       movq    %r8, %rax
> +       kortestM %k3, %k3
> +       jnc     L(return_new_match)
> +       /* branch here because there is a significant advantage in terms
> +          of output dependency chance in using edx.  */
> +
> +
> +L(return_old_match):
> +       VPCMPEQ %VMM(8), %VMATCH, %k0
> +       KMOV    %k0, %VRCX
> +       bsr     %VRCX, %VRCX
> +       jnz     L(return_old_match_ret)
> +
> +       VPCMPEQ %VMM(7), %VMATCH, %k0
> +       KMOV    %k0, %VRCX
> +       bsr     %VRCX, %VRCX
> +       subq    $VEC_SIZE, %rsi
> +L(return_old_match_ret):
> +       leaq    (VEC_SIZE * 3)(%rsi, %rcx, CHAR_SIZE), %rax
>         ret
>
> -       /* No match recorded in r8. Check the second saved vector
> -          in beginning.  */
> -L(vector_x2_ret):
> -       VPCMPEQ %VMM(2), %VMM(0), %k2
> -       KMOV    %k2, %VRAX
> -       test    %VRAX, %VRAX
> -       jz      L(vector_x1_ret)
>
> -       /* Match found in the second saved vector.  */
> -       BSR     %VRAX, %VRAX
> -       leaq    (VEC_SIZE)(%r9, %rax, CHAR_SIZE), %rax
> +L(return_new_match):
> +       VPTESTN %VMM(5), %VMM(5), %k0
> +       KMOV    %k0, %VRCX
> +       blsmsk  %VRCX, %VRCX
> +       jnc     L(return_new_match_first)
> +       dec     %VRDX
> +       VPCMPEQ %VMM(6), %VMATCH, %k0
> +       KMOV    %k0, %VRAX
> +       addq    $VEC_SIZE, %rdi
> +       and     %VRDX, %VRAX
> +       jnz     L(return_new_match_ret)
> +       subq    $VEC_SIZE, %rdi
> +L(return_new_match_first):
> +       KMOV    %k2, %VRAX
> +# ifdef USE_AS_WCSRCHR
> +       xorl    $((1 << CHAR_PER_VEC)- 1), %VRAX
> +       and     %VRCX, %VRAX
> +# else
> +       andn    %VRCX, %VRAX, %VRAX
> +# endif
> +       jz      L(return_old_match)
> +L(return_new_match_ret):
> +       bsr     %VRAX, %VRAX
> +       leaq    (VEC_SIZE * 2)(%rdi, %rax, CHAR_SIZE), %rax
>         ret
>
> -L(page_cross):
> -       mov     %rdi, %rax
> -       movl    %edi, %ecx
> +       .p2align 4,, 4
> +L(cross_page_boundary):
> +       xorq    %rdi, %rax
> +       mov     $-1, %VRDX
> +       VMOVU   (PAGE_SIZE - VEC_SIZE)(%rax), %VMM(6)
> +       VPTESTN %VMM(6), %VMM(6), %k0
> +       KMOV    %k0, %VRSI
>
>  # ifdef USE_AS_WCSRCHR
> -       /* Calculate number of compare result bits to be skipped for
> -          wide string alignment adjustment.  */
> -       andl    $(VEC_SIZE - 1), %ecx
> -       sarl    $2, %ecx
> +       movl    %edi, %ecx
> +       and     $(VEC_SIZE - 1), %ecx
> +       shrl    $2, %ecx
>  # endif
> -       /* ecx contains number of w[char] to be skipped as a result
> -          of address alignment.  */
> -       andq    $-VEC_SIZE, %rax
> -       VMOVA   (%rax), %VMM(1)
> -       VPTESTN %VMM(1), %VMM(1), %k1
> -       KMOV    %k1, %VRAX
> -       SHR     %cl, %VRAX
> -       jz      L(page_cross_continue)
> -       VPCMPEQ %VMM(1), %VMM(0), %k0
> -       KMOV    %k0, %VRDX
> -       SHR     %cl, %VRDX
> -       BLSMSK  %VRAX, %VRAX
> -       and     %VRDX, %VRAX
> -       jz      L(ret)
> -       BSR     %VRAX, %VRAX
> +       shlx    %SHIFT_REG, %VRDX, %VRDX
> +
>  # ifdef USE_AS_WCSRCHR
> -       leaq    (%rdi, %rax, CHAR_SIZE), %rax
> +       kmovw   %edx, %k1
>  # else
> -       add     %rdi, %rax
> +       KMOV    %VRDX, %k1
>  # endif
>
> -       ret
> -END (STRRCHR)
> +       VPCOMPRESS %VMM(6), %VMM(1){%k1}{z}
> +       /* We could technically just jmp back after the vpcompress but
> +          it doesn't save any 16-byte blocks.  */
> +
> +       shrx    %SHIFT_REG, %VRSI, %VRSI
> +       test    %VRSI, %VRSI
> +       jnz     L(page_cross_return)
> +       jmp     L(page_cross_continue)
> +       /* 1-byte from cache line.  */
> +END(STRRCHR)
>  #endif
> diff --git a/sysdeps/x86_64/multiarch/strrchr-evex.S b/sysdeps/x86_64/multiarch/strrchr-evex.S
> index 85e3b0119f..b606e6f69c 100644
> --- a/sysdeps/x86_64/multiarch/strrchr-evex.S
> +++ b/sysdeps/x86_64/multiarch/strrchr-evex.S
> @@ -1,394 +1,8 @@
> -/* strrchr/wcsrchr optimized with 256-bit EVEX instructions.
> -   Copyright (C) 2021-2023 Free Software Foundation, Inc.
> -   This file is part of the GNU C Library.
> -
> -   The GNU C Library is free software; you can redistribute it and/or
> -   modify it under the terms of the GNU Lesser General Public
> -   License as published by the Free Software Foundation; either
> -   version 2.1 of the License, or (at your option) any later version.
> -
> -   The GNU C Library is distributed in the hope that it will be useful,
> -   but WITHOUT ANY WARRANTY; without even the implied warranty of
> -   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> -   Lesser General Public License for more details.
> -
> -   You should have received a copy of the GNU Lesser General Public
> -   License along with the GNU C Library; if not, see
> -   <https://www.gnu.org/licenses/>.  */
> -
> -#include <isa-level.h>
> -
> -#if ISA_SHOULD_BUILD (4)
> -
> -# include <sysdep.h>
> -
>  # ifndef STRRCHR
>  #  define STRRCHR      __strrchr_evex
>  # endif
>
> -# include "x86-evex256-vecs.h"
> -
> -# ifdef USE_AS_WCSRCHR
> -#  define SHIFT_REG    rsi
> -#  define kunpck_2x    kunpckbw
> -#  define kmov_2x      kmovd
> -#  define maskz_2x     ecx
> -#  define maskm_2x     eax
> -#  define CHAR_SIZE    4
> -#  define VPMIN        vpminud
> -#  define VPTESTN      vptestnmd
> -#  define VPTEST       vptestmd
> -#  define VPBROADCAST  vpbroadcastd
> -#  define VPCMPEQ      vpcmpeqd
> -#  define VPCMP        vpcmpd
> -
> -#  define USE_WIDE_CHAR
> -# else
> -#  define SHIFT_REG    rdi
> -#  define kunpck_2x    kunpckdq
> -#  define kmov_2x      kmovq
> -#  define maskz_2x     rcx
> -#  define maskm_2x     rax
> -
> -#  define CHAR_SIZE    1
> -#  define VPMIN        vpminub
> -#  define VPTESTN      vptestnmb
> -#  define VPTEST       vptestmb
> -#  define VPBROADCAST  vpbroadcastb
> -#  define VPCMPEQ      vpcmpeqb
> -#  define VPCMP        vpcmpb
> -# endif
> -
> -# include "reg-macros.h"
> -
> -# define VMATCH        VMM(0)
> -# define CHAR_PER_VEC  (VEC_SIZE / CHAR_SIZE)
> -# define PAGE_SIZE     4096
> -
> -       .section SECTION(.text), "ax", @progbits
> -ENTRY_P2ALIGN(STRRCHR, 6)
> -       movl    %edi, %eax
> -       /* Broadcast CHAR to VMATCH.  */
> -       VPBROADCAST %esi, %VMATCH
> -
> -       andl    $(PAGE_SIZE - 1), %eax
> -       cmpl    $(PAGE_SIZE - VEC_SIZE), %eax
> -       jg      L(cross_page_boundary)
> -L(page_cross_continue):
> -       VMOVU   (%rdi), %VMM(1)
> -       /* k0 has a 1 for each zero CHAR in VEC(1).  */
> -       VPTESTN %VMM(1), %VMM(1), %k0
> -       KMOV    %k0, %VRSI
> -       test    %VRSI, %VRSI
> -       jz      L(aligned_more)
> -       /* fallthrough: zero CHAR in first VEC.  */
> -       /* K1 has a 1 for each search CHAR match in VEC(1).  */
> -       VPCMPEQ %VMATCH, %VMM(1), %k1
> -       KMOV    %k1, %VRAX
> -       /* Build mask up until first zero CHAR (used to mask of
> -          potential search CHAR matches past the end of the string).
> -        */
> -       blsmsk  %VRSI, %VRSI
> -       and     %VRSI, %VRAX
> -       jz      L(ret0)
> -       /* Get last match (the `and` removed any out of bounds matches).
> -        */
> -       bsr     %VRAX, %VRAX
> -# ifdef USE_AS_WCSRCHR
> -       leaq    (%rdi, %rax, CHAR_SIZE), %rax
> -# else
> -       addq    %rdi, %rax
> -# endif
> -L(ret0):
> -       ret
> -
> -       /* Returns for first vec x1/x2/x3 have hard coded backward
> -          search path for earlier matches.  */
> -       .p2align 4,, 6
> -L(first_vec_x1):
> -       VPCMPEQ %VMATCH, %VMM(2), %k1
> -       KMOV    %k1, %VRAX
> -       blsmsk  %VRCX, %VRCX
> -       /* eax non-zero if search CHAR in range.  */
> -       and     %VRCX, %VRAX
> -       jnz     L(first_vec_x1_return)
> -
> -       /* fallthrough: no match in VEC(2) then need to check for
> -          earlier matches (in VEC(1)).  */
> -       .p2align 4,, 4
> -L(first_vec_x0_test):
> -       VPCMPEQ %VMATCH, %VMM(1), %k1
> -       KMOV    %k1, %VRAX
> -       test    %VRAX, %VRAX
> -       jz      L(ret1)
> -       bsr     %VRAX, %VRAX
> -# ifdef USE_AS_WCSRCHR
> -       leaq    (%rsi, %rax, CHAR_SIZE), %rax
> -# else
> -       addq    %rsi, %rax
> -# endif
> -L(ret1):
> -       ret
> -
> -       .p2align 4,, 10
> -L(first_vec_x1_or_x2):
> -       VPCMPEQ %VMM(3), %VMATCH, %k3
> -       VPCMPEQ %VMM(2), %VMATCH, %k2
> -       /* K2 and K3 have 1 for any search CHAR match. Test if any
> -          matches between either of them. Otherwise check VEC(1).  */
> -       KORTEST %k2, %k3
> -       jz      L(first_vec_x0_test)
> -
> -       /* Guaranteed that VEC(2) and VEC(3) are within range so merge
> -          the two bitmasks then get last result.  */
> -       kunpck_2x %k2, %k3, %k3
> -       kmov_2x %k3, %maskm_2x
> -       bsr     %maskm_2x, %maskm_2x
> -       leaq    (VEC_SIZE * 1)(%r8, %rax, CHAR_SIZE), %rax
> -       ret
> -
> -       .p2align 4,, 7
> -L(first_vec_x3):
> -       VPCMPEQ %VMATCH, %VMM(4), %k1
> -       KMOV    %k1, %VRAX
> -       blsmsk  %VRCX, %VRCX
> -       /* If no search CHAR match in range check VEC(1)/VEC(2)/VEC(3).
> -        */
> -       and     %VRCX, %VRAX
> -       jz      L(first_vec_x1_or_x2)
> -       bsr     %VRAX, %VRAX
> -       leaq    (VEC_SIZE * 3)(%rdi, %rax, CHAR_SIZE), %rax
> -       ret
> -
> -
> -       .p2align 4,, 6
> -L(first_vec_x0_x1_test):
> -       VPCMPEQ %VMATCH, %VMM(2), %k1
> -       KMOV    %k1, %VRAX
> -       /* Check VEC(2) for last match first. If no match try VEC(1).
> -        */
> -       test    %VRAX, %VRAX
> -       jz      L(first_vec_x0_test)
> -       .p2align 4,, 4
> -L(first_vec_x1_return):
> -       bsr     %VRAX, %VRAX
> -       leaq    (VEC_SIZE)(%rdi, %rax, CHAR_SIZE), %rax
> -       ret
> -
> -
> -       .p2align 4,, 10
> -L(first_vec_x2):
> -       VPCMPEQ %VMATCH, %VMM(3), %k1
> -       KMOV    %k1, %VRAX
> -       blsmsk  %VRCX, %VRCX
> -       /* Check VEC(3) for last match first. If no match try
> -          VEC(2)/VEC(1).  */
> -       and     %VRCX, %VRAX
> -       jz      L(first_vec_x0_x1_test)
> -       bsr     %VRAX, %VRAX
> -       leaq    (VEC_SIZE * 2)(%rdi, %rax, CHAR_SIZE), %rax
> -       ret
> -
> -
> -       .p2align 4,, 12
> -L(aligned_more):
> -       /* Need to keep original pointer in case VEC(1) has last match.
> -        */
> -       movq    %rdi, %rsi
> -       andq    $-VEC_SIZE, %rdi
> -
> -       VMOVU   VEC_SIZE(%rdi), %VMM(2)
> -       VPTESTN %VMM(2), %VMM(2), %k0
> -       KMOV    %k0, %VRCX
> -
> -       test    %VRCX, %VRCX
> -       jnz     L(first_vec_x1)
> -
> -       VMOVU   (VEC_SIZE * 2)(%rdi), %VMM(3)
> -       VPTESTN %VMM(3), %VMM(3), %k0
> -       KMOV    %k0, %VRCX
> -
> -       test    %VRCX, %VRCX
> -       jnz     L(first_vec_x2)
> -
> -       VMOVU   (VEC_SIZE * 3)(%rdi), %VMM(4)
> -       VPTESTN %VMM(4), %VMM(4), %k0
> -       KMOV    %k0, %VRCX
> -       movq    %rdi, %r8
> -       test    %VRCX, %VRCX
> -       jnz     L(first_vec_x3)
> -
> -       andq    $-(VEC_SIZE * 2), %rdi
> -       .p2align 4,, 10
> -L(first_aligned_loop):
> -       /* Preserve VEC(1), VEC(2), VEC(3), and VEC(4) until we can
> -          guarantee they don't store a match.  */
> -       VMOVA   (VEC_SIZE * 4)(%rdi), %VMM(5)
> -       VMOVA   (VEC_SIZE * 5)(%rdi), %VMM(6)
> -
> -       VPCMPEQ %VMM(5), %VMATCH, %k2
> -       vpxord  %VMM(6), %VMATCH, %VMM(7)
> -
> -       VPMIN   %VMM(5), %VMM(6), %VMM(8)
> -       VPMIN   %VMM(8), %VMM(7), %VMM(7)
> -
> -       VPTESTN %VMM(7), %VMM(7), %k1
> -       subq    $(VEC_SIZE * -2), %rdi
> -       KORTEST %k1, %k2
> -       jz      L(first_aligned_loop)
> -
> -       VPCMPEQ %VMM(6), %VMATCH, %k3
> -       VPTESTN %VMM(8), %VMM(8), %k1
> -
> -       /* If k1 is zero, then we found a CHAR match but no null-term.
> -          We can now safely throw out VEC1-4.  */
> -       KTEST   %k1, %k1
> -       jz      L(second_aligned_loop_prep)
> -
> -       KORTEST %k2, %k3
> -       jnz     L(return_first_aligned_loop)
> -
> -
> -       .p2align 4,, 6
> -L(first_vec_x1_or_x2_or_x3):
> -       VPCMPEQ %VMM(4), %VMATCH, %k4
> -       KMOV    %k4, %VRAX
> -       bsr     %VRAX, %VRAX
> -       jz      L(first_vec_x1_or_x2)
> -       leaq    (VEC_SIZE * 3)(%r8, %rax, CHAR_SIZE), %rax
> -       ret
> -
> -
> -       .p2align 4,, 8
> -L(return_first_aligned_loop):
> -       VPTESTN %VMM(5), %VMM(5), %k0
> -
> -       /* Combined results from VEC5/6.  */
> -       kunpck_2x %k0, %k1, %k0
> -       kmov_2x %k0, %maskz_2x
> -
> -       blsmsk  %maskz_2x, %maskz_2x
> -       kunpck_2x %k2, %k3, %k3
> -       kmov_2x %k3, %maskm_2x
> -       and     %maskz_2x, %maskm_2x
> -       jz      L(first_vec_x1_or_x2_or_x3)
> -
> -       bsr     %maskm_2x, %maskm_2x
> -       leaq    (VEC_SIZE * 2)(%rdi, %rax, CHAR_SIZE), %rax
> -       ret
> -
> -       .p2align 4
> -       /* We can throw away the work done for the first 4x checks here
> -          as we have a later match. This is the 'fast' path persay.
> -        */
> -L(second_aligned_loop_prep):
> -L(second_aligned_loop_set_furthest_match):
> -       movq    %rdi, %rsi
> -       /* Ideally we would safe k2/k3 but `kmov/kunpck` take uops on
> -          port0 and have noticeable overhead in the loop.  */
> -       VMOVA   %VMM(5), %VMM(7)
> -       VMOVA   %VMM(6), %VMM(8)
> -       .p2align 4
> -L(second_aligned_loop):
> -       VMOVU   (VEC_SIZE * 4)(%rdi), %VMM(5)
> -       VMOVU   (VEC_SIZE * 5)(%rdi), %VMM(6)
> -       VPCMPEQ %VMM(5), %VMATCH, %k2
> -       vpxord  %VMM(6), %VMATCH, %VMM(3)
> -
> -       VPMIN   %VMM(5), %VMM(6), %VMM(4)
> -       VPMIN   %VMM(3), %VMM(4), %VMM(3)
> -
> -       VPTESTN %VMM(3), %VMM(3), %k1
> -       subq    $(VEC_SIZE * -2), %rdi
> -       KORTEST %k1, %k2
> -       jz      L(second_aligned_loop)
> -       VPCMPEQ %VMM(6), %VMATCH, %k3
> -       VPTESTN %VMM(4), %VMM(4), %k1
> -       KTEST   %k1, %k1
> -       jz      L(second_aligned_loop_set_furthest_match)
> -
> -       /* branch here because we know we have a match in VEC7/8 but
> -          might not in VEC5/6 so the latter is expected to be less
> -          likely.  */
> -       KORTEST %k2, %k3
> -       jnz     L(return_new_match)
> -
> -L(return_old_match):
> -       VPCMPEQ %VMM(8), %VMATCH, %k0
> -       KMOV    %k0, %VRCX
> -       bsr     %VRCX, %VRCX
> -       jnz     L(return_old_match_ret)
> -
> -       VPCMPEQ %VMM(7), %VMATCH, %k0
> -       KMOV    %k0, %VRCX
> -       bsr     %VRCX, %VRCX
> -       subq    $VEC_SIZE, %rsi
> -L(return_old_match_ret):
> -       leaq    (VEC_SIZE * 3)(%rsi, %rcx, CHAR_SIZE), %rax
> -       ret
> -
> -       .p2align 4,, 10
> -L(return_new_match):
> -       VPTESTN %VMM(5), %VMM(5), %k0
> -
> -       /* Combined results from VEC5/6.  */
> -       kunpck_2x %k0, %k1, %k0
> -       kmov_2x %k0, %maskz_2x
> -
> -       blsmsk  %maskz_2x, %maskz_2x
> -       kunpck_2x %k2, %k3, %k3
> -       kmov_2x %k3, %maskm_2x
> -
> -       /* Match at end was out-of-bounds so use last known match.  */
> -       and     %maskz_2x, %maskm_2x
> -       jz      L(return_old_match)
> -
> -       bsr     %maskm_2x, %maskm_2x
> -       leaq    (VEC_SIZE * 2)(%rdi, %rax, CHAR_SIZE), %rax
> -       ret
> -
> -L(cross_page_boundary):
> -       /* eax contains all the page offset bits of src (rdi). `xor rdi,
> -          rax` sets pointer will all page offset bits cleared so
> -          offset of (PAGE_SIZE - VEC_SIZE) will get last aligned VEC
> -          before page cross (guaranteed to be safe to read). Doing this
> -          as opposed to `movq %rdi, %rax; andq $-VEC_SIZE, %rax` saves
> -          a bit of code size.  */
> -       xorq    %rdi, %rax
> -       VMOVU   (PAGE_SIZE - VEC_SIZE)(%rax), %VMM(1)
> -       VPTESTN %VMM(1), %VMM(1), %k0
> -       KMOV    %k0, %VRCX
> -
> -       /* Shift out zero CHAR matches that are before the beginning of
> -          src (rdi).  */
> -# ifdef USE_AS_WCSRCHR
> -       movl    %edi, %esi
> -       andl    $(VEC_SIZE - 1), %esi
> -       shrl    $2, %esi
> -# endif
> -       shrx    %VGPR(SHIFT_REG), %VRCX, %VRCX
> -
> -       test    %VRCX, %VRCX
> -       jz      L(page_cross_continue)
> +#include "x86-evex512-vecs.h"
> +#include "reg-macros.h"
>
> -       /* Found zero CHAR so need to test for search CHAR.  */
> -       VPCMP   $0, %VMATCH, %VMM(1), %k1
> -       KMOV    %k1, %VRAX
> -       /* Shift out search CHAR matches that are before the beginning of
> -          src (rdi).  */
> -       shrx    %VGPR(SHIFT_REG), %VRAX, %VRAX
> -
> -       /* Check if any search CHAR match in range.  */
> -       blsmsk  %VRCX, %VRCX
> -       and     %VRCX, %VRAX
> -       jz      L(ret3)
> -       bsr     %VRAX, %VRAX
> -# ifdef USE_AS_WCSRCHR
> -       leaq    (%rdi, %rax, CHAR_SIZE), %rax
> -# else
> -       addq    %rdi, %rax
> -# endif
> -L(ret3):
> -       ret
> -END(STRRCHR)
> -#endif
> +#include "strrchr-evex-base.S"
> diff --git a/sysdeps/x86_64/multiarch/wcsrchr-evex.S b/sysdeps/x86_64/multiarch/wcsrchr-evex.S
> index e5c5fe3bf2..a584cd3f43 100644
> --- a/sysdeps/x86_64/multiarch/wcsrchr-evex.S
> +++ b/sysdeps/x86_64/multiarch/wcsrchr-evex.S
> @@ -4,4 +4,5 @@
>
>  #define STRRCHR        WCSRCHR
>  #define USE_AS_WCSRCHR 1
> +#define USE_WIDE_CHAR 1
>  #include "strrchr-evex.S"
> --
> 2.34.1
>
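
For readers following the return paths in the patch, the `blsmsk` / `and` /
`bsr` sequence used at L(page_cross_return) and in the first_vec_x* labels
can be modelled in C roughly as below. This is an illustrative sketch only
and is not part of the patch: the function names, the fixed 32-bit mask
width, and the -1 "no match" convention are assumptions made for the
example. The assembly works on VEC_SIZE-dependent mask registers and
returns a pointer (or NULL) rather than an index.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of the mask handling, not glibc code.
   match_mask: bit i set if CHAR i of the vector equals the search CHAR.
   zero_mask:  bit i set if CHAR i of the vector is the null terminator.
   Returns the index of the last match at or before the first null
   terminator, or -1 if there is none.  */
int
last_match_before_null (uint32_t match_mask, uint32_t zero_mask)
{
  /* blsmsk: all bits up to and including the lowest set bit.  With a
     zero input this yields all ones, i.e. every position is in bounds.  */
  uint32_t in_bounds = zero_mask ^ (zero_mask - 1);

  /* and: drop matches that lie past the end of the string.  */
  uint32_t valid = match_mask & in_bounds;
  if (valid == 0)
    return -1;

  /* bsr: index of the highest set bit, written as a portable loop.  */
  int idx = 0;
  while (valid >>= 1)
    idx++;
  return idx;
}

/* Small self-check using hand-built masks (assumed example data).  */
void
last_match_examples (void)
{
  /* "abca\0a": 'a' at 0, 3 and 5, null at 4; the match at 5 is ignored.  */
  assert (last_match_before_null (0x29, 0x10) == 3);
  /* Only match lies past the terminator.  */
  assert (last_match_before_null (0x20, 0x10) == -1);
  /* Searching for the terminator itself: the mask is inclusive.  */
  assert (last_match_before_null (0x10, 0x10) == 4);
}
```

Because the in-bounds mask includes the terminator position, the same
sequence also covers a search for the null CHAR itself without a separate
branch.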

[-- Attachment #2: strrchr-evex512-data-tgl.txt --]
[-- Type: text/plain, Size: 165398 bytes --]

Results For: strrchr
align,freq ,len  ,max_char ,pos  ,seek ,strrchr-dev ,strrchr-glibc ,strrchr-dev/strrchr-glibc 
0    ,1    ,1    ,127      ,0    ,0    ,2.953       ,3.078         ,0.959                     
0    ,1    ,1    ,127      ,0    ,23   ,3.178       ,3.188         ,0.997                     
0    ,1    ,10   ,127      ,9    ,0    ,3.081       ,3.463         ,0.89                      
0    ,1    ,10   ,127      ,9    ,23   ,3.145       ,3.437         ,0.915                     
0    ,1    ,1024 ,127      ,0    ,0    ,3.123       ,3.626         ,0.861                     
0    ,1    ,1024 ,127      ,0    ,23   ,19.23       ,21.471        ,0.896                     
0    ,1    ,1024 ,127      ,1024 ,0    ,17.362      ,20.028        ,0.867                     
0    ,1    ,1024 ,127      ,1024 ,23   ,18.9        ,21.757        ,0.869                     
0    ,1    ,1024 ,127      ,144  ,0    ,4.986       ,6.808         ,0.732                     
0    ,1    ,1024 ,127      ,144  ,23   ,17.706      ,21.908        ,0.808                     
0    ,1    ,1024 ,127      ,192  ,0    ,5.393       ,7.996         ,0.674                     
0    ,1    ,1024 ,127      ,192  ,23   ,17.277      ,21.515        ,0.803                     
0    ,1    ,1024 ,127      ,240  ,0    ,5.226       ,8.055         ,0.649                     
0    ,1    ,1024 ,127      ,240  ,23   ,17.189      ,22.304        ,0.771                     
0    ,1    ,1024 ,127      ,288  ,0    ,7.933       ,8.578         ,0.925                     
0    ,1    ,1024 ,127      ,288  ,23   ,18.501      ,24.366        ,0.759                     
0    ,1    ,1024 ,127      ,48   ,0    ,3.053       ,4.341         ,0.703                     
0    ,1    ,1024 ,127      ,48   ,23   ,19.11       ,21.532        ,0.888                     
0    ,1    ,1024 ,127      ,736  ,0    ,12.208      ,15.197        ,0.803                     
0    ,1    ,1024 ,127      ,736  ,23   ,18.065      ,23.566        ,0.767                     
0    ,1    ,1024 ,127      ,784  ,0    ,13.641      ,16.227        ,0.841                     
0    ,1    ,1024 ,127      ,784  ,23   ,18.869      ,24.555        ,0.768                     
0    ,1    ,1024 ,127      ,832  ,0    ,14.175      ,16.458        ,0.861                     
0    ,1    ,1024 ,127      ,832  ,23   ,18.311      ,22.605        ,0.81                      
0    ,1    ,1024 ,127      ,880  ,0    ,13.896      ,16.366        ,0.849                     
0    ,1    ,1024 ,127      ,880  ,23   ,19.03       ,25.616        ,0.743                     
0    ,1    ,1024 ,127      ,928  ,0    ,15.214      ,18.45         ,0.825                     
0    ,1    ,1024 ,127      ,928  ,23   ,19.255      ,23.225        ,0.829                     
0    ,1    ,1024 ,127      ,96   ,0    ,4.884       ,4.969         ,0.983                     
0    ,1    ,1024 ,127      ,96   ,23   ,18.357      ,22.832        ,0.804                     
0    ,1    ,1024 ,127      ,976  ,0    ,15.796      ,18.741        ,0.843                     
0    ,1    ,1024 ,127      ,976  ,23   ,18.915      ,24.523        ,0.771                     
0    ,1    ,1072 ,127      ,1024 ,0    ,17.09       ,19.652        ,0.87                      
0    ,1    ,1072 ,127      ,1024 ,23   ,18.211      ,23.276        ,0.782                     
0    ,1    ,11   ,127      ,10   ,0    ,3.136       ,3.301         ,0.95                      
0    ,1    ,11   ,127      ,10   ,23   ,3.16        ,3.516         ,0.899                     
0    ,1    ,112  ,127      ,144  ,0    ,4.981       ,4.323         ,1.152                     
0    ,1    ,112  ,127      ,144  ,23   ,5.125       ,5.25          ,0.976                     
0    ,1    ,112  ,127      ,16   ,0    ,3.098       ,3.151         ,0.983                     
0    ,1    ,112  ,127      ,16   ,23   ,4.441       ,5.117         ,0.868                     
0    ,1    ,112  ,127      ,256  ,0    ,4.933       ,4.223         ,1.168                     
0    ,1    ,112  ,127      ,256  ,23   ,5.031       ,5.075         ,0.991                     
0    ,1    ,112  ,127      ,64   ,0    ,4.957       ,4.164         ,1.191                     
0    ,1    ,112  ,127      ,64   ,23   ,5.006       ,4.21          ,1.189                     
0    ,1    ,112  ,127      ,96   ,0    ,4.933       ,4.019         ,1.227                     
0    ,1    ,112  ,127      ,96   ,23   ,5.031       ,5.669         ,0.887                     
0    ,1    ,1120 ,127      ,1024 ,0    ,17.092      ,20.59         ,0.83                      
0    ,1    ,1120 ,127      ,1024 ,23   ,18.837      ,22.851        ,0.824                     
0    ,1    ,1168 ,127      ,1024 ,0    ,17.058      ,22.111        ,0.771                     
0    ,1    ,1168 ,127      ,1024 ,23   ,22.504      ,29.475        ,0.763                     
0    ,1    ,12   ,127      ,11   ,0    ,3.009       ,3.155         ,0.954                     
0    ,1    ,12   ,127      ,11   ,23   ,3.19        ,3.153         ,1.012                     
0    ,1    ,1216 ,127      ,1024 ,0    ,17.007      ,20.269        ,0.839                     
0    ,1    ,1216 ,127      ,1024 ,23   ,20.889      ,26.587        ,0.786                     
0    ,1    ,1264 ,127      ,1024 ,0    ,16.923      ,20.842        ,0.812                     
0    ,1    ,1264 ,127      ,1024 ,23   ,21.588      ,26.289        ,0.821                     
0    ,1    ,128  ,127      ,0    ,0    ,3.099       ,3.174         ,0.976                     
0    ,1    ,128  ,127      ,0    ,23   ,5.803       ,8.722         ,0.665                     
0    ,1    ,128  ,127      ,112  ,0    ,4.933       ,3.916         ,1.26                      
0    ,1    ,128  ,127      ,112  ,23   ,5.254       ,8.698         ,0.604                     
0    ,1    ,128  ,127      ,128  ,0    ,5.091       ,6.146         ,0.828                     
0    ,1    ,128  ,127      ,128  ,23   ,6.28        ,8.127         ,0.773                     
0    ,1    ,128  ,127      ,144  ,0    ,5.207       ,6.103         ,0.853                     
0    ,1    ,128  ,127      ,144  ,23   ,6.443       ,8.413         ,0.766                     
0    ,1    ,128  ,127      ,192  ,0    ,5.085       ,6.159         ,0.826                     
0    ,1    ,128  ,127      ,192  ,23   ,6.341       ,9.672         ,0.656                     
0    ,1    ,128  ,127      ,240  ,0    ,5.088       ,6.309         ,0.806                     
0    ,1    ,128  ,127      ,240  ,23   ,6.289       ,8.934         ,0.704                     
0    ,1    ,128  ,127      ,288  ,0    ,5.289       ,6.064         ,0.872                     
0    ,1    ,128  ,127      ,288  ,23   ,6.258       ,9.495         ,0.659                     
0    ,1    ,128  ,127      ,32   ,0    ,3.094       ,3.209         ,0.964                     
0    ,1    ,128  ,127      ,32   ,23   ,6.02        ,8.555         ,0.704                     
0    ,1    ,128  ,127      ,48   ,0    ,3.099       ,3.3           ,0.939                     
0    ,1    ,128  ,127      ,48   ,23   ,5.846       ,8.252         ,0.708                     
0    ,1    ,128  ,127      ,80   ,0    ,4.958       ,4.269         ,1.161                     
0    ,1    ,128  ,127      ,80   ,23   ,5.379       ,7.371         ,0.73                      
0    ,1    ,128  ,127      ,96   ,0    ,4.956       ,4.144         ,1.196                     
0    ,1    ,128  ,127      ,96   ,23   ,5.423       ,7.668         ,0.707                     
0    ,1    ,13   ,127      ,12   ,0    ,3.053       ,3.478         ,0.878                     
0    ,1    ,13   ,127      ,12   ,23   ,3.169       ,3.306         ,0.959                     
0    ,1    ,1312 ,127      ,1024 ,0    ,16.929      ,22.15         ,0.764                     
0    ,1    ,1312 ,127      ,1024 ,23   ,23.496      ,28.888        ,0.813                     
0    ,1    ,14   ,127      ,13   ,0    ,3.087       ,3.311         ,0.932                     
0    ,1    ,14   ,127      ,13   ,23   ,3.145       ,3.27          ,0.962                     
0    ,1    ,144  ,127      ,128  ,0    ,5.095       ,6.041         ,0.843                     
0    ,1    ,144  ,127      ,128  ,23   ,5.325       ,6.875         ,0.775                     
0    ,1    ,15   ,127      ,14   ,0    ,3.136       ,3.056         ,1.026                     
0    ,1    ,15   ,127      ,14   ,23   ,3.145       ,3.309         ,0.95                      
0    ,1    ,16   ,127      ,0    ,0    ,3.116       ,3.867         ,0.806                     
0    ,1    ,16   ,127      ,0    ,23   ,3.149       ,3.453         ,0.912                     
0    ,1    ,16   ,127      ,144  ,0    ,3.083       ,3.039         ,1.015                     
0    ,1    ,16   ,127      ,144  ,23   ,3.852       ,3.699         ,1.041                     
0    ,1    ,16   ,127      ,15   ,0    ,3.048       ,3.481         ,0.875                     
0    ,1    ,16   ,127      ,15   ,23   ,3.204       ,3.403         ,0.942                     
0    ,1    ,16   ,127      ,16   ,0    ,3.13        ,3.178         ,0.985                     
0    ,1    ,16   ,127      ,16   ,23   ,3.768       ,3.776         ,0.998                     
0    ,1    ,16   ,127      ,192  ,0    ,3.099       ,3.305         ,0.938                     
0    ,1    ,16   ,127      ,192  ,23   ,3.755       ,3.683         ,1.02                      
0    ,1    ,16   ,127      ,240  ,0    ,3.122       ,3.284         ,0.951                     
0    ,1    ,16   ,127      ,240  ,23   ,3.764       ,3.709         ,1.015                     
0    ,1    ,16   ,127      ,256  ,0    ,3.166       ,3.44          ,0.921                     
0    ,1    ,16   ,127      ,256  ,23   ,3.792       ,3.847         ,0.986                     
0    ,1    ,16   ,127      ,288  ,0    ,3.114       ,3.166         ,0.983                     
0    ,1    ,16   ,127      ,288  ,23   ,4.129       ,3.874         ,1.066                     
0    ,1    ,16   ,127      ,48   ,0    ,3.11        ,3.159         ,0.985                     
0    ,1    ,16   ,127      ,48   ,23   ,3.774       ,3.804         ,0.992                     
0    ,1    ,16   ,127      ,64   ,0    ,3.098       ,3.101         ,0.999                     
0    ,1    ,16   ,127      ,64   ,23   ,3.773       ,3.994         ,0.945                     
0    ,1    ,16   ,127      ,96   ,0    ,3.13        ,3.151         ,0.993                     
0    ,1    ,16   ,127      ,96   ,23   ,3.794       ,3.755         ,1.01                      
0    ,1    ,160  ,127      ,144  ,0    ,5.076       ,6.069         ,0.836                     
0    ,1    ,160  ,127      ,144  ,23   ,5.318       ,6.691         ,0.795                     
0    ,1    ,160  ,127      ,16   ,0    ,3.084       ,3.166         ,0.974                     
0    ,1    ,160  ,127      ,16   ,23   ,5.866       ,8.417         ,0.697                     
0    ,1    ,160  ,127      ,256  ,0    ,5.387       ,5.984         ,0.9                       
0    ,1    ,160  ,127      ,256  ,23   ,6.289       ,8.095         ,0.777                     
0    ,1    ,160  ,127      ,64   ,0    ,5.007       ,4.453         ,1.124                     
0    ,1    ,160  ,127      ,64   ,23   ,5.595       ,7.571         ,0.739                     
0    ,1    ,160  ,127      ,96   ,0    ,4.981       ,4.675         ,1.065                     
0    ,1    ,160  ,127      ,96   ,23   ,5.388       ,7.77          ,0.693                     
0    ,1    ,17   ,127      ,16   ,0    ,3.024       ,3.367         ,0.898                     
0    ,1    ,17   ,127      ,16   ,23   ,3.153       ,3.303         ,0.954                     
0    ,1    ,176  ,127      ,128  ,0    ,5.107       ,6.03          ,0.847                     
0    ,1    ,176  ,127      ,128  ,23   ,5.166       ,6.57          ,0.786                     
0    ,1    ,176  ,127      ,160  ,0    ,5.087       ,5.925         ,0.859                     
0    ,1    ,176  ,127      ,160  ,23   ,5.204       ,7.107         ,0.732                     
0    ,1    ,176  ,127      ,32   ,0    ,3.098       ,3.584         ,0.865                     
0    ,1    ,176  ,127      ,32   ,23   ,5.966       ,8.015         ,0.744                     
0    ,1    ,1760 ,127      ,2048 ,0    ,26.801      ,28.778        ,0.931                     
0    ,1    ,1760 ,127      ,2048 ,23   ,27.873      ,30.757        ,0.906                     
0    ,1    ,1760 ,127      ,288  ,0    ,7.739       ,9.704         ,0.798                     
0    ,1    ,1760 ,127      ,288  ,23   ,27.582      ,33.667        ,0.819                     
0    ,1    ,18   ,127      ,17   ,0    ,3.166       ,3.214         ,0.985                     
0    ,1    ,18   ,127      ,17   ,23   ,3.153       ,3.391         ,0.93                      
0    ,1    ,1808 ,127      ,2048 ,0    ,27.859      ,31.173        ,0.894                     
0    ,1    ,1808 ,127      ,2048 ,23   ,29.615      ,32.244        ,0.918                     
0    ,1    ,1808 ,127      ,240  ,0    ,5.02        ,7.922         ,0.634                     
0    ,1    ,1808 ,127      ,240  ,23   ,28.216      ,32.914        ,0.857                     
0    ,1    ,1856 ,127      ,192  ,0    ,4.952       ,8.102         ,0.611                     
0    ,1    ,1856 ,127      ,192  ,23   ,28.129      ,32.761        ,0.859                     
0    ,1    ,1856 ,127      ,2048 ,0    ,28.685      ,30.899        ,0.928                     
0    ,1    ,1856 ,127      ,2048 ,23   ,30.153      ,32.325        ,0.933                     
0    ,1    ,19   ,127      ,18   ,0    ,3.064       ,3.373         ,0.908                     
0    ,1    ,19   ,127      ,18   ,23   ,3.174       ,3.314         ,0.958                     
0    ,1    ,1904 ,127      ,144  ,0    ,4.963       ,6.725         ,0.738                     
0    ,1    ,1904 ,127      ,144  ,23   ,28.694      ,32.558        ,0.881                     
0    ,1    ,1904 ,127      ,2048 ,0    ,28.546      ,30.818        ,0.926                     
0    ,1    ,1904 ,127      ,2048 ,23   ,29.761      ,32.772        ,0.908                     
0    ,1    ,192  ,127      ,176  ,0    ,5.245       ,6.23          ,0.842                     
0    ,1    ,192  ,127      ,176  ,23   ,5.649       ,8.73          ,0.647                     
0    ,1    ,1952 ,127      ,2048 ,0    ,29.867      ,31.426        ,0.95                      
0    ,1    ,1952 ,127      ,2048 ,23   ,31.458      ,33.255        ,0.946                     
0    ,1    ,1952 ,127      ,96   ,0    ,4.814       ,4.921         ,0.978                     
0    ,1    ,1952 ,127      ,96   ,23   ,30.899      ,33.712        ,0.917                     
0    ,1    ,2    ,127      ,1    ,0    ,2.919       ,3.139         ,0.93                      
0    ,1    ,2    ,127      ,1    ,23   ,3.178       ,3.483         ,0.912                     
0    ,1    ,20   ,127      ,19   ,0    ,3.038       ,3.106         ,0.978                     
0    ,1    ,20   ,127      ,19   ,23   ,3.137       ,3.349         ,0.937                     
0    ,1    ,2000 ,127      ,2048 ,0    ,30.484      ,32.671        ,0.933                     
0    ,1    ,2000 ,127      ,2048 ,23   ,31.584      ,33.561        ,0.941                     
0    ,1    ,2000 ,127      ,48   ,0    ,3.038       ,4.75          ,0.64                      
0    ,1    ,2000 ,127      ,48   ,23   ,33.027      ,35.259        ,0.937                     
0    ,1    ,2048 ,127      ,0    ,0    ,3.039       ,4.422         ,0.687                     
0    ,1    ,2048 ,127      ,0    ,23   ,33.124      ,38.138        ,0.869                     
0    ,1    ,2048 ,127      ,1024 ,0    ,15.974      ,22.101        ,0.723                     
0    ,1    ,2048 ,127      ,1024 ,23   ,34.138      ,39.756        ,0.859                     
0    ,1    ,2048 ,127      ,128  ,0    ,5.028       ,6.031         ,0.834                     
0    ,1    ,2048 ,127      ,128  ,23   ,32.031      ,35.342        ,0.906                     
0    ,1    ,2048 ,127      ,144  ,0    ,4.971       ,6.858         ,0.725                     
0    ,1    ,2048 ,127      ,144  ,23   ,31.915      ,36.571        ,0.873                     
0    ,1    ,2048 ,127      ,1760 ,0    ,27.017      ,30.804        ,0.877                     
0    ,1    ,2048 ,127      ,1760 ,23   ,36.742      ,46.636        ,0.788                     
0    ,1    ,2048 ,127      ,1808 ,0    ,27.686      ,31.56         ,0.877                     
0    ,1    ,2048 ,127      ,1808 ,23   ,35.858      ,41.571        ,0.863                     
0    ,1    ,2048 ,127      ,1856 ,0    ,28.132      ,30.865        ,0.911                     
0    ,1    ,2048 ,127      ,1856 ,23   ,35.061      ,43.113        ,0.813                     
0    ,1    ,2048 ,127      ,1904 ,0    ,28.136      ,31.06         ,0.906                     
0    ,1    ,2048 ,127      ,1904 ,23   ,37.04       ,51.291        ,0.722                     
0    ,1    ,2048 ,127      ,192  ,0    ,5.131       ,7.662         ,0.67                      
0    ,1    ,2048 ,127      ,192  ,23   ,31.421      ,36.868        ,0.852                     
0    ,1    ,2048 ,127      ,1952 ,0    ,29.478      ,32.207        ,0.915                     
0    ,1    ,2048 ,127      ,1952 ,23   ,35.721      ,41.805        ,0.854                     
0    ,1    ,2048 ,127      ,2000 ,0    ,30.193      ,33.171        ,0.91                      
0    ,1    ,2048 ,127      ,2000 ,23   ,37.127      ,47.851        ,0.776                     
0    ,1    ,2048 ,127      ,2048 ,0    ,32.362      ,33.903        ,0.955                     
0    ,1    ,2048 ,127      ,2048 ,23   ,34.087      ,35.347        ,0.964                     
0    ,1    ,2048 ,127      ,240  ,0    ,4.845       ,7.533         ,0.643                     
0    ,1    ,2048 ,127      ,240  ,23   ,31.459      ,36.229        ,0.868                     
0    ,1    ,2048 ,127      ,256  ,0    ,8.423       ,8.247         ,1.021                     
0    ,1    ,2048 ,127      ,256  ,23   ,32.5        ,36.053        ,0.901                     
0    ,1    ,2048 ,127      ,288  ,0    ,7.616       ,9.511         ,0.801                     
0    ,1    ,2048 ,127      ,288  ,23   ,32.728      ,37.416        ,0.875                     
0    ,1    ,2048 ,127      ,32   ,0    ,3.749       ,4.91          ,0.764                     
0    ,1    ,2048 ,127      ,32   ,23   ,33.571      ,35.647        ,0.942                     
0    ,1    ,2048 ,127      ,4096 ,0    ,30.151      ,37.401        ,0.806                     
0    ,1    ,2048 ,127      ,4096 ,23   ,33.08       ,35.407        ,0.934                     
0    ,1    ,2048 ,127      ,48   ,0    ,3.024       ,4.447         ,0.68                      
0    ,1    ,2048 ,127      ,48   ,23   ,32.7        ,35.269        ,0.927                     
0    ,1    ,2048 ,127      ,512  ,0    ,9.928       ,13.903        ,0.714                     
0    ,1    ,2048 ,127      ,512  ,23   ,32.663      ,35.359        ,0.924                     
0    ,1    ,2048 ,127      ,64   ,0    ,4.802       ,3.823         ,1.256                     
0    ,1    ,2048 ,127      ,64   ,23   ,32.43       ,35.213        ,0.921                     
0    ,1    ,2048 ,127      ,96   ,0    ,4.813       ,5.266         ,0.914                     
0    ,1    ,2048 ,127      ,96   ,23   ,32.365      ,35.394        ,0.914                     
0    ,1    ,208  ,127      ,16   ,0    ,3.147       ,3.166         ,0.994                     
0    ,1    ,208  ,127      ,16   ,23   ,7.466       ,9.269         ,0.805                     
0    ,1    ,208  ,127      ,192  ,0    ,4.927       ,7.07          ,0.697                     
0    ,1    ,208  ,127      ,192  ,23   ,5.551       ,8.845         ,0.628                     
0    ,1    ,208  ,127      ,256  ,0    ,5.082       ,7.057         ,0.72                      
0    ,1    ,208  ,127      ,256  ,23   ,7.29        ,10.092        ,0.722                     
0    ,1    ,208  ,127      ,48   ,0    ,3.083       ,3.198         ,0.964                     
0    ,1    ,208  ,127      ,48   ,23   ,7.076       ,10.236        ,0.691                     
0    ,1    ,208  ,127      ,64   ,0    ,4.981       ,4.078         ,1.221                     
0    ,1    ,208  ,127      ,64   ,23   ,6.322       ,8.175         ,0.773                     
0    ,1    ,2096 ,127      ,2048 ,0    ,31.566      ,33.729        ,0.936                     
0    ,1    ,2096 ,127      ,2048 ,23   ,36.351      ,48.19         ,0.754                     
0    ,1    ,21   ,127      ,20   ,0    ,3.119       ,3.137         ,0.994                     
0    ,1    ,21   ,127      ,20   ,23   ,3.129       ,3.465         ,0.903                     
0    ,1    ,2144 ,127      ,2048 ,0    ,31.428      ,33.892        ,0.927                     
0    ,1    ,2144 ,127      ,2048 ,23   ,35.634      ,42.19         ,0.845                     
0    ,1    ,2192 ,127      ,2048 ,0    ,31.851      ,36.199        ,0.88                      
0    ,1    ,2192 ,127      ,2048 ,23   ,37.356      ,43.984        ,0.849                     
0    ,1    ,22   ,127      ,21   ,0    ,3.092       ,3.221         ,0.96                      
0    ,1    ,22   ,127      ,21   ,23   ,3.169       ,3.238         ,0.979                     
0    ,1    ,224  ,127      ,128  ,0    ,5.114       ,6.103         ,0.838                     
0    ,1    ,224  ,127      ,128  ,23   ,5.591       ,7.366         ,0.759                     
0    ,1    ,224  ,127      ,208  ,0    ,5.383       ,8.35          ,0.645                     
0    ,1    ,224  ,127      ,208  ,23   ,5.77        ,8.116         ,0.711                     
0    ,1    ,224  ,127      ,288  ,0    ,5.229       ,6.715         ,0.779                     
0    ,1    ,224  ,127      ,288  ,23   ,7.445       ,9.99          ,0.745                     
0    ,1    ,224  ,127      ,32   ,0    ,3.098       ,3.182         ,0.974                     
0    ,1    ,224  ,127      ,32   ,23   ,6.725       ,9.277         ,0.725                     
0    ,1    ,224  ,127      ,512  ,0    ,5.134       ,6.787         ,0.756                     
0    ,1    ,224  ,127      ,512  ,23   ,6.925       ,9.44          ,0.734                     
0    ,1    ,2240 ,127      ,2048 ,0    ,31.668      ,34.929        ,0.907                     
0    ,1    ,2240 ,127      ,2048 ,23   ,38.233      ,47.12         ,0.811                     
0    ,1    ,2288 ,127      ,2048 ,0    ,31.751      ,34.151        ,0.93                      
0    ,1    ,2288 ,127      ,2048 ,23   ,38.224      ,47.005        ,0.813                     
0    ,1    ,23   ,127      ,22   ,0    ,3.113       ,3.17          ,0.982                     
0    ,1    ,23   ,127      ,22   ,23   ,3.145       ,3.409         ,0.923                     
0    ,1    ,2336 ,127      ,2048 ,0    ,31.737      ,33.776        ,0.94                      
0    ,1    ,2336 ,127      ,2048 ,23   ,38.77       ,45.634        ,0.85                      
0    ,1    ,24   ,127      ,23   ,0    ,3.045       ,3.081         ,0.989                     
0    ,1    ,24   ,127      ,23   ,23   ,3.154       ,3.374         ,0.935                     
0    ,1    ,240  ,127      ,224  ,0    ,5.533       ,7.803         ,0.709                     
0    ,1    ,240  ,127      ,224  ,23   ,5.021       ,7.161         ,0.701                     
0    ,1    ,25   ,127      ,24   ,0    ,3.058       ,3.305         ,0.925                     
0    ,1    ,25   ,127      ,24   ,23   ,3.163       ,3.288         ,0.962                     
0    ,1    ,256  ,127      ,0    ,0    ,3.099       ,3.291         ,0.942                     
0    ,1    ,256  ,127      ,0    ,23   ,8.672       ,12.097        ,0.717                     
0    ,1    ,256  ,127      ,112  ,0    ,4.981       ,4.872         ,1.023                     
0    ,1    ,256  ,127      ,112  ,23   ,8.213       ,10.799        ,0.761                     
0    ,1    ,256  ,127      ,144  ,0    ,5.365       ,6.079         ,0.883                     
0    ,1    ,256  ,127      ,144  ,23   ,7.458       ,10.201        ,0.731                     
0    ,1    ,256  ,127      ,16   ,0    ,3.13        ,3.315         ,0.944                     
0    ,1    ,256  ,127      ,16   ,23   ,9.39        ,11.868        ,0.791                     
0    ,1    ,256  ,127      ,160  ,0    ,5.197       ,6.426         ,0.809                     
0    ,1    ,256  ,127      ,160  ,23   ,7.439       ,9.818         ,0.758                     
0    ,1    ,256  ,127      ,192  ,0    ,5.394       ,6.988         ,0.772                     
0    ,1    ,256  ,127      ,192  ,23   ,7.062       ,9.702         ,0.728                     
0    ,1    ,256  ,127      ,208  ,0    ,5.073       ,6.989         ,0.726                     
0    ,1    ,256  ,127      ,208  ,23   ,6.976       ,11.408        ,0.612                     
0    ,1    ,256  ,127      ,240  ,0    ,5.395       ,7.002         ,0.771                     
0    ,1    ,256  ,127      ,240  ,23   ,7.176       ,10.974        ,0.654                     
0    ,1    ,256  ,127      ,256  ,0    ,7.82        ,8.248         ,0.948                     
0    ,1    ,256  ,127      ,256  ,23   ,8.751       ,12.726        ,0.688                     
0    ,1    ,256  ,127      ,288  ,0    ,8.132       ,8.052         ,1.01                      
0    ,1    ,256  ,127      ,288  ,23   ,8.855       ,10.821        ,0.818                     
0    ,1    ,256  ,127      ,48   ,0    ,3.18        ,3.458         ,0.92                      
0    ,1    ,256  ,127      ,48   ,23   ,8.687       ,11.542        ,0.753                     
0    ,1    ,256  ,127      ,64   ,0    ,5.014       ,4.221         ,1.188                     
0    ,1    ,256  ,127      ,64   ,23   ,8.102       ,10.699        ,0.757                     
0    ,1    ,256  ,127      ,96   ,0    ,4.933       ,4.059         ,1.215                     
0    ,1    ,256  ,127      ,96   ,23   ,8.175       ,10.479        ,0.78                      
0    ,1    ,26   ,127      ,25   ,0    ,3.023       ,3.209         ,0.942                     
0    ,1    ,26   ,127      ,25   ,23   ,3.295       ,3.323         ,0.992                     
0    ,1    ,27   ,127      ,26   ,0    ,3.039       ,3.175         ,0.957                     
0    ,1    ,27   ,127      ,26   ,23   ,3.311       ,3.318         ,0.998                     
0    ,1    ,272  ,127      ,128  ,0    ,5.201       ,6.129         ,0.849                     
0    ,1    ,272  ,127      ,128  ,23   ,7.471       ,9.733         ,0.768                     
0    ,1    ,272  ,127      ,240  ,0    ,5.495       ,7.266         ,0.756                     
0    ,1    ,272  ,127      ,240  ,23   ,7.348       ,11.8          ,0.623                     
0    ,1    ,272  ,127      ,256  ,0    ,7.966       ,8.277         ,0.962                     
0    ,1    ,272  ,127      ,256  ,23   ,7.957       ,9.253         ,0.86                      
0    ,1    ,272  ,127      ,32   ,0    ,3.114       ,3.3           ,0.944                     
0    ,1    ,272  ,127      ,32   ,23   ,11.581      ,11.002        ,1.053                     
0    ,1    ,272  ,127      ,512  ,0    ,7.898       ,8.493         ,0.93                      
0    ,1    ,272  ,127      ,512  ,23   ,8.492       ,12.399        ,0.685                     
0    ,1    ,28   ,127      ,27   ,0    ,3.038       ,3.329         ,0.913                     
0    ,1    ,28   ,127      ,27   ,23   ,3.332       ,3.397         ,0.981                     
0    ,1    ,288  ,127      ,272  ,0    ,7.768       ,7.912         ,0.982                     
0    ,1    ,288  ,127      ,272  ,23   ,7.954       ,10.425        ,0.763                     
0    ,1    ,29   ,127      ,28   ,0    ,3.067       ,3.245         ,0.945                     
0    ,1    ,29   ,127      ,28   ,23   ,3.204       ,3.42          ,0.937                     
0    ,1    ,3    ,127      ,2    ,0    ,2.998       ,3.362         ,0.892                     
0    ,1    ,3    ,127      ,2    ,23   ,3.177       ,3.366         ,0.944                     
0    ,1    ,30   ,127      ,29   ,0    ,3.067       ,3.39          ,0.905                     
0    ,1    ,30   ,127      ,29   ,23   ,3.169       ,3.5           ,0.906                     
0    ,1    ,304  ,127      ,16   ,0    ,3.114       ,3.283         ,0.949                     
0    ,1    ,304  ,127      ,16   ,23   ,9.138       ,11.629        ,0.786                     
0    ,1    ,304  ,127      ,256  ,0    ,7.948       ,8.377         ,0.949                     
0    ,1    ,304  ,127      ,256  ,23   ,7.994       ,8.97          ,0.891                     
0    ,1    ,304  ,127      ,64   ,0    ,4.957       ,4.276         ,1.159                     
0    ,1    ,304  ,127      ,64   ,23   ,8.033       ,10.141        ,0.792                     
0    ,1    ,31   ,127      ,30   ,0    ,3.075       ,3.344         ,0.92                      
0    ,1    ,31   ,127      ,30   ,23   ,3.161       ,3.434         ,0.921                     
0    ,1    ,32   ,127      ,0    ,0    ,3.114       ,3.167         ,0.983                     
0    ,1    ,32   ,127      ,0    ,23   ,3.131       ,4.026         ,0.778                     
0    ,1    ,32   ,127      ,128  ,0    ,3.099       ,3.198         ,0.969                     
0    ,1    ,32   ,127      ,128  ,23   ,3.813       ,3.793         ,1.005                     
0    ,1    ,32   ,127      ,144  ,0    ,3.099       ,3.268         ,0.948                     
0    ,1    ,32   ,127      ,144  ,23   ,3.942       ,3.699         ,1.065                     
0    ,1    ,32   ,127      ,16   ,0    ,3.099       ,3.252         ,0.953                     
0    ,1    ,32   ,127      ,16   ,23   ,3.146       ,3.563         ,0.883                     
0    ,1    ,32   ,127      ,192  ,0    ,3.099       ,3.214         ,0.964                     
0    ,1    ,32   ,127      ,192  ,23   ,3.833       ,3.796         ,1.01                      
0    ,1    ,32   ,127      ,240  ,0    ,3.114       ,3.068         ,1.015                     
0    ,1    ,32   ,127      ,240  ,23   ,3.793       ,3.999         ,0.948                     
0    ,1    ,32   ,127      ,288  ,0    ,3.114       ,3.229         ,0.964                     
0    ,1    ,32   ,127      ,288  ,23   ,3.755       ,4.339         ,0.865                     
0    ,1    ,32   ,127      ,31   ,0    ,3.099       ,3.268         ,0.948                     
0    ,1    ,32   ,127      ,31   ,23   ,3.144       ,3.375         ,0.932                     
0    ,1    ,32   ,127      ,32   ,0    ,3.109       ,3.166         ,0.982                     
0    ,1    ,32   ,127      ,32   ,23   ,3.749       ,3.771         ,0.994                     
0    ,1    ,32   ,127      ,48   ,0    ,3.13        ,3.284         ,0.953                     
0    ,1    ,32   ,127      ,48   ,23   ,3.736       ,3.773         ,0.99                      
0    ,1    ,32   ,127      ,96   ,0    ,3.091       ,3.174         ,0.974                     
0    ,1    ,32   ,127      ,96   ,23   ,3.785       ,3.832         ,0.988                     
0    ,1    ,320  ,127      ,128  ,0    ,5.065       ,6.166         ,0.821                     
0    ,1    ,320  ,127      ,128  ,23   ,7.503       ,11.099        ,0.676                     
0    ,1    ,320  ,127      ,192  ,0    ,5.379       ,8.151         ,0.66                      
0    ,1    ,320  ,127      ,192  ,23   ,7.318       ,11.752        ,0.623                     
0    ,1    ,320  ,127      ,32   ,0    ,3.099       ,3.439         ,0.901                     
0    ,1    ,320  ,127      ,32   ,23   ,8.722       ,12.582        ,0.693                     
0    ,1    ,320  ,127      ,512  ,0    ,8.005       ,8.763         ,0.913                     
0    ,1    ,320  ,127      ,512  ,23   ,8.642       ,13.466        ,0.642                     
0    ,1    ,352  ,127      ,256  ,0    ,7.73        ,7.652         ,1.01                      
0    ,1    ,352  ,127      ,256  ,23   ,8.957       ,9.988         ,0.897                     
0    ,1    ,352  ,127      ,64   ,0    ,4.957       ,4.031         ,1.23                      
0    ,1    ,352  ,127      ,64   ,23   ,8.06        ,11.231        ,0.718                     
0    ,1    ,368  ,127      ,128  ,0    ,5.088       ,6.068         ,0.839                     
0    ,1    ,368  ,127      ,128  ,23   ,7.415       ,10.921        ,0.679                     
0    ,1    ,368  ,127      ,144  ,0    ,5.147       ,7.052         ,0.73                      
0    ,1    ,368  ,127      ,144  ,23   ,7.529       ,11.268        ,0.668                     
0    ,1    ,368  ,127      ,512  ,0    ,7.866       ,8.924         ,0.881                     
0    ,1    ,368  ,127      ,512  ,23   ,8.798       ,12.517        ,0.703                     
0    ,1    ,4    ,127      ,3    ,0    ,3.114       ,3.148         ,0.989                     
0    ,1    ,4    ,127      ,3    ,23   ,3.161       ,3.288         ,0.961                     
0    ,1    ,400  ,127      ,256  ,0    ,7.881       ,7.791         ,1.011                     
0    ,1    ,400  ,127      ,256  ,23   ,9.582       ,14.172        ,0.676                     
0    ,1    ,416  ,127      ,128  ,0    ,5.089       ,6.431         ,0.791                     
0    ,1    ,416  ,127      ,128  ,23   ,9.002       ,12.069        ,0.746                     
0    ,1    ,416  ,127      ,512  ,0    ,8.506       ,9.524         ,0.893                     
0    ,1    ,416  ,127      ,512  ,23   ,10.297      ,13.686        ,0.752                     
0    ,1    ,416  ,127      ,96   ,0    ,4.932       ,5.159         ,0.956                     
0    ,1    ,416  ,127      ,96   ,23   ,9.48        ,12.207        ,0.777                     
0    ,1    ,448  ,127      ,256  ,0    ,7.779       ,8.728         ,0.891                     
0    ,1    ,448  ,127      ,256  ,23   ,9.679       ,13.841        ,0.699                     
0    ,1    ,464  ,127      ,48   ,0    ,3.083       ,3.242         ,0.951                     
0    ,1    ,464  ,127      ,48   ,23   ,10.09       ,14.314        ,0.705                     
0    ,1    ,464  ,127      ,512  ,0    ,8.895       ,10.239        ,0.869                     
0    ,1    ,464  ,127      ,512  ,23   ,10.076      ,13.941        ,0.723                     
0    ,1    ,48   ,127      ,32   ,0    ,3.083       ,3.423         ,0.901                     
0    ,1    ,48   ,127      ,32   ,23   ,3.161       ,3.694         ,0.856                     
0    ,1    ,496  ,127      ,256  ,0    ,8.113       ,8.964         ,0.905                     
0    ,1    ,496  ,127      ,256  ,23   ,9.628       ,12.635        ,0.762                     
0    ,1    ,5    ,127      ,4    ,0    ,2.96        ,3.057         ,0.968                     
0    ,1    ,5    ,127      ,4    ,23   ,3.164       ,3.422         ,0.925                     
0    ,1    ,512  ,127      ,0    ,0    ,3.083       ,4.111         ,0.75                      
0    ,1    ,512  ,127      ,0    ,23   ,11.966      ,14.633        ,0.818                     
0    ,1    ,512  ,127      ,144  ,0    ,5.114       ,6.373         ,0.802                     
0    ,1    ,512  ,127      ,144  ,23   ,10.685      ,13.732        ,0.778                     
0    ,1    ,512  ,127      ,192  ,0    ,5.273       ,7.845         ,0.672                     
0    ,1    ,512  ,127      ,192  ,23   ,10.666      ,14.213        ,0.75                      
0    ,1    ,512  ,127      ,224  ,0    ,5.286       ,8.042         ,0.657                     
0    ,1    ,512  ,127      ,224  ,23   ,10.377      ,15.298        ,0.678                     
0    ,1    ,512  ,127      ,240  ,0    ,5.419       ,8.092         ,0.67                      
0    ,1    ,512  ,127      ,240  ,23   ,10.246      ,15.382        ,0.666                     
0    ,1    ,512  ,127      ,272  ,0    ,7.864       ,9.463         ,0.831                     
0    ,1    ,512  ,127      ,272  ,23   ,11.848      ,14.992        ,0.79                      
0    ,1    ,512  ,127      ,288  ,0    ,8.865       ,7.956         ,1.114                     
0    ,1    ,512  ,127      ,288  ,23   ,11.47       ,14.51         ,0.79                      
0    ,1    ,512  ,127      ,320  ,0    ,8.35        ,9.526         ,0.877                     
0    ,1    ,512  ,127      ,320  ,23   ,11.58       ,15.567        ,0.744                     
0    ,1    ,512  ,127      ,368  ,0    ,8.559       ,9.328         ,0.918                     
0    ,1    ,512  ,127      ,368  ,23   ,10.747      ,15.593        ,0.689                     
0    ,1    ,512  ,127      ,416  ,0    ,8.594       ,10.529        ,0.816                     
0    ,1    ,512  ,127      ,416  ,23   ,11.284      ,14.946        ,0.755                     
0    ,1    ,512  ,127      ,464  ,0    ,9.515       ,11.655        ,0.816                     
0    ,1    ,512  ,127      ,464  ,23   ,11.494      ,16.584        ,0.693                     
0    ,1    ,512  ,127      ,48   ,0    ,3.098       ,4.732         ,0.655                     
0    ,1    ,512  ,127      ,48   ,23   ,12.04       ,15.473        ,0.778                     
0    ,1    ,512  ,127      ,512  ,0    ,10.208      ,12.077        ,0.845                     
0    ,1    ,512  ,127      ,512  ,23   ,11.86       ,15.423        ,0.769                     
0    ,1    ,512  ,127      ,96   ,0    ,4.957       ,4.58          ,1.082                     
0    ,1    ,512  ,127      ,96   ,23   ,11.308      ,13.901        ,0.813                     
0    ,1    ,544  ,127      ,256  ,0    ,8.042       ,8.428         ,0.954                     
0    ,1    ,544  ,127      ,256  ,23   ,11.444      ,16.021        ,0.714                     
0    ,1    ,560  ,127      ,512  ,0    ,10.263      ,11.774        ,0.872                     
0    ,1    ,560  ,127      ,512  ,23   ,10.401      ,15.021        ,0.692                     
0    ,1    ,6    ,127      ,5    ,0    ,3.079       ,3.237         ,0.951                     
0    ,1    ,6    ,127      ,5    ,23   ,3.177       ,3.331         ,0.954                     
0    ,1    ,608  ,127      ,512  ,0    ,10.274      ,10.897        ,0.943                     
0    ,1    ,608  ,127      ,512  ,23   ,11.243      ,12.839        ,0.876                     
0    ,1    ,64   ,127      ,0    ,0    ,3.098       ,3.378         ,0.917                     
0    ,1    ,64   ,127      ,0    ,23   ,4.35        ,5.456         ,0.797                     
0    ,1    ,64   ,127      ,144  ,0    ,4.991       ,4.189         ,1.191                     
0    ,1    ,64   ,127      ,144  ,23   ,5.102       ,5.032         ,1.014                     
0    ,1    ,64   ,127      ,16   ,0    ,3.121       ,3.117         ,1.001                     
0    ,1    ,64   ,127      ,16   ,23   ,4.704       ,4.996         ,0.942                     
0    ,1    ,64   ,127      ,192  ,0    ,4.981       ,4.048         ,1.23                      
0    ,1    ,64   ,127      ,192  ,23   ,5.092       ,5.123         ,0.994                     
0    ,1    ,64   ,127      ,240  ,0    ,5.006       ,4.037         ,1.24                      
0    ,1    ,64   ,127      ,240  ,23   ,5.006       ,5.54          ,0.904                     
0    ,1    ,64   ,127      ,256  ,0    ,4.981       ,4.277         ,1.165                     
0    ,1    ,64   ,127      ,256  ,23   ,5.032       ,4.932         ,1.02                      
0    ,1    ,64   ,127      ,288  ,0    ,4.957       ,3.897         ,1.272                     
0    ,1    ,64   ,127      ,288  ,23   ,5.246       ,5.288         ,0.992                     
0    ,1    ,64   ,127      ,48   ,0    ,3.091       ,3.249         ,0.951                     
0    ,1    ,64   ,127      ,48   ,23   ,4.393       ,5.008         ,0.877                     
0    ,1    ,64   ,127      ,64   ,0    ,4.974       ,4.005         ,1.242                     
0    ,1    ,64   ,127      ,64   ,23   ,5.147       ,4.958         ,1.038                     
0    ,1    ,64   ,127      ,96   ,0    ,4.957       ,4.148         ,1.195                     
0    ,1    ,64   ,127      ,96   ,23   ,5.032       ,5.032         ,1.0                       
0    ,1    ,656  ,127      ,512  ,0    ,10.123      ,12.522        ,0.808                     
0    ,1    ,656  ,127      ,512  ,23   ,13.767      ,19.72         ,0.698                     
0    ,1    ,7    ,127      ,6    ,0    ,3.375       ,3.032         ,1.113                     
0    ,1    ,7    ,127      ,6    ,23   ,3.219       ,3.565         ,0.903                     
0    ,1    ,704  ,127      ,512  ,0    ,10.395      ,13.15         ,0.79                      
0    ,1    ,704  ,127      ,512  ,23   ,14.495      ,19.966        ,0.726                     
0    ,1    ,736  ,127      ,1024 ,0    ,12.282      ,15.006        ,0.818                     
0    ,1    ,736  ,127      ,1024 ,23   ,13.497      ,16.572        ,0.814                     
0    ,1    ,736  ,127      ,288  ,0    ,7.659       ,9.138         ,0.838                     
0    ,1    ,736  ,127      ,288  ,23   ,12.953      ,16.572        ,0.782                     
0    ,1    ,752  ,127      ,512  ,0    ,10.177      ,12.643        ,0.805                     
0    ,1    ,752  ,127      ,512  ,23   ,13.022      ,16.383        ,0.795                     
0    ,1    ,784  ,127      ,1024 ,0    ,13.396      ,16.674        ,0.803                     
0    ,1    ,784  ,127      ,1024 ,23   ,15.41       ,18.729        ,0.823                     
0    ,1    ,784  ,127      ,240  ,0    ,5.42        ,8.15          ,0.665                     
0    ,1    ,784  ,127      ,240  ,23   ,13.429      ,18.319        ,0.733                     
0    ,1    ,8    ,127      ,7    ,0    ,3.142       ,3.395         ,0.925                     
0    ,1    ,8    ,127      ,7    ,23   ,3.16        ,3.437         ,0.919                     
0    ,1    ,80   ,127      ,128  ,0    ,4.969       ,4.257         ,1.167                     
0    ,1    ,80   ,127      ,128  ,23   ,5.048       ,5.026         ,1.004                     
0    ,1    ,80   ,127      ,32   ,0    ,3.098       ,3.053         ,1.015                     
0    ,1    ,80   ,127      ,32   ,23   ,4.382       ,5.119         ,0.856                     
0    ,1    ,80   ,127      ,48   ,0    ,3.083       ,3.3           ,0.934                     
0    ,1    ,80   ,127      ,48   ,23   ,4.645       ,5.007         ,0.928                     
0    ,1    ,80   ,127      ,64   ,0    ,4.934       ,3.898         ,1.266                     
0    ,1    ,80   ,127      ,64   ,23   ,5.06        ,4.547         ,1.113                     
0    ,1    ,800  ,127      ,512  ,0    ,10.443      ,12.587        ,0.83                      
0    ,1    ,800  ,127      ,512  ,23   ,15.132      ,19.943        ,0.759                     
0    ,1    ,832  ,127      ,1024 ,0    ,14.326      ,16.382        ,0.875                     
0    ,1    ,832  ,127      ,1024 ,23   ,15.299      ,18.528        ,0.826                     
0    ,1    ,832  ,127      ,192  ,0    ,5.276       ,7.963         ,0.663                     
0    ,1    ,832  ,127      ,192  ,23   ,13.365      ,18.251        ,0.732                     
0    ,1    ,880  ,127      ,1024 ,0    ,14.095      ,16.638        ,0.847                     
0    ,1    ,880  ,127      ,1024 ,23   ,15.424      ,18.966        ,0.813                     
0    ,1    ,880  ,127      ,144  ,0    ,5.135       ,7.68          ,0.669                     
0    ,1    ,880  ,127      ,144  ,23   ,14.477      ,18.979        ,0.763                     
0    ,1    ,9    ,127      ,8    ,0    ,3.004       ,3.37          ,0.891                     
0    ,1    ,9    ,127      ,8    ,23   ,3.167       ,3.375         ,0.938                     
0    ,1    ,928  ,127      ,1024 ,0    ,15.141      ,17.342        ,0.873                     
0    ,1    ,928  ,127      ,1024 ,23   ,17.292      ,19.89         ,0.869                     
0    ,1    ,928  ,127      ,96   ,0    ,4.893       ,4.551         ,1.075                     
0    ,1    ,928  ,127      ,96   ,23   ,16.483      ,19.632        ,0.84                      
0    ,1    ,96   ,127      ,80   ,0    ,4.957       ,4.127         ,1.201                     
0    ,1    ,96   ,127      ,80   ,23   ,5.056       ,4.769         ,1.06                      
0    ,1    ,976  ,127      ,1024 ,0    ,15.793      ,18.878        ,0.837                     
0    ,1    ,976  ,127      ,1024 ,23   ,16.935      ,20.985        ,0.807                     
0    ,1    ,976  ,127      ,48   ,0    ,3.053       ,5.109         ,0.598                     
0    ,1    ,976  ,127      ,48   ,23   ,17.511      ,20.885        ,0.838                     
0    ,16   ,1    ,127      ,0    ,23   ,3.178       ,3.183         ,0.998                     
0    ,16   ,10   ,127      ,9    ,23   ,3.178       ,3.339         ,0.952                     
0    ,16   ,1024 ,127      ,0    ,23   ,19.829      ,21.609        ,0.918                     
0    ,16   ,1024 ,127      ,1024 ,23   ,21.951      ,28.869        ,0.76                      
0    ,16   ,1024 ,127      ,144  ,23   ,17.617      ,21.48         ,0.82                      
0    ,16   ,1024 ,127      ,192  ,23   ,17.276      ,21.582        ,0.8                       
0    ,16   ,1024 ,127      ,240  ,23   ,17.258      ,21.54         ,0.801                     
0    ,16   ,1024 ,127      ,288  ,23   ,18.942      ,24.062        ,0.787                     
0    ,16   ,1024 ,127      ,48   ,23   ,19.643      ,21.643        ,0.908                     
0    ,16   ,1024 ,127      ,736  ,23   ,19.786      ,26.735        ,0.74                      
0    ,16   ,1024 ,127      ,784  ,23   ,20.751      ,27.499        ,0.755                     
0    ,16   ,1024 ,127      ,832  ,23   ,20.361      ,26.124        ,0.779                     
0    ,16   ,1024 ,127      ,880  ,23   ,20.246      ,27.91         ,0.725                     
0    ,16   ,1024 ,127      ,928  ,23   ,21.546      ,29.372        ,0.734                     
0    ,16   ,1024 ,127      ,96   ,23   ,18.497      ,21.586        ,0.857                     
0    ,16   ,1024 ,127      ,976  ,23   ,20.828      ,27.87         ,0.747                     
0    ,16   ,1072 ,127      ,1024 ,23   ,20.655      ,28.688        ,0.72                      
0    ,16   ,11   ,127      ,10   ,23   ,3.161       ,3.289         ,0.961                     
0    ,16   ,112  ,127      ,144  ,23   ,5.037       ,5.538         ,0.91                      
0    ,16   ,112  ,127      ,16   ,23   ,5.223       ,4.982         ,1.048                     
0    ,16   ,112  ,127      ,256  ,23   ,5.031       ,4.205         ,1.196                     
0    ,16   ,112  ,127      ,64   ,23   ,4.96        ,4.366         ,1.136                     
0    ,16   ,112  ,127      ,96   ,23   ,5.056       ,4.515         ,1.12                      
0    ,16   ,1120 ,127      ,1024 ,23   ,21.363      ,29.238        ,0.731                     
0    ,16   ,1168 ,127      ,1024 ,23   ,23.945      ,31.263        ,0.766                     
0    ,16   ,12   ,127      ,11   ,23   ,3.169       ,3.427         ,0.925                     
0    ,16   ,1216 ,127      ,1024 ,23   ,24.466      ,32.432        ,0.754                     
0    ,16   ,1264 ,127      ,1024 ,23   ,24.712      ,32.256        ,0.766                     
0    ,16   ,128  ,127      ,0    ,23   ,5.847       ,8.711         ,0.671                     
0    ,16   ,128  ,127      ,112  ,23   ,6.145       ,7.732         ,0.795                     
0    ,16   ,128  ,127      ,128  ,23   ,5.276       ,7.65          ,0.69                      
0    ,16   ,128  ,127      ,144  ,23   ,5.336       ,7.38          ,0.723                     
0    ,16   ,128  ,127      ,192  ,23   ,5.107       ,8.406         ,0.608                     
0    ,16   ,128  ,127      ,240  ,23   ,5.435       ,9.265         ,0.587                     
0    ,16   ,128  ,127      ,288  ,23   ,5.202       ,7.929         ,0.656                     
0    ,16   ,128  ,127      ,32   ,23   ,5.764       ,8.552         ,0.674                     
0    ,16   ,128  ,127      ,48   ,23   ,5.848       ,8.097         ,0.722                     
0    ,16   ,128  ,127      ,80   ,23   ,5.206       ,7.592         ,0.686                     
0    ,16   ,128  ,127      ,96   ,23   ,5.108       ,7.379         ,0.692                     
0    ,16   ,13   ,127      ,12   ,23   ,3.194       ,3.406         ,0.938                     
0    ,16   ,1312 ,127      ,1024 ,23   ,26.031      ,33.817        ,0.77                      
0    ,16   ,14   ,127      ,13   ,23   ,3.177       ,3.358         ,0.946                     
0    ,16   ,144  ,127      ,128  ,23   ,5.86        ,6.332         ,0.926                     
0    ,16   ,15   ,127      ,14   ,23   ,3.164       ,3.245         ,0.975                     
0    ,16   ,16   ,127      ,0    ,23   ,3.228       ,3.534         ,0.913                     
0    ,16   ,16   ,127      ,144  ,23   ,3.288       ,3.115         ,1.056                     
0    ,16   ,16   ,127      ,15   ,23   ,3.145       ,3.262         ,0.964                     
0    ,16   ,16   ,127      ,16   ,23   ,3.301       ,3.383         ,0.976                     
0    ,16   ,16   ,127      ,192  ,23   ,3.269       ,3.13          ,1.045                     
0    ,16   ,16   ,127      ,240  ,23   ,3.183       ,3.236         ,0.984                     
0    ,16   ,16   ,127      ,256  ,23   ,3.981       ,3.837         ,1.037                     
0    ,16   ,16   ,127      ,288  ,23   ,3.877       ,4.116         ,0.942                     
0    ,16   ,16   ,127      ,48   ,23   ,3.175       ,3.314         ,0.958                     
0    ,16   ,16   ,127      ,64   ,23   ,3.193       ,3.62          ,0.882                     
0    ,16   ,16   ,127      ,96   ,23   ,3.581       ,3.383         ,1.059                     
0    ,16   ,160  ,127      ,144  ,23   ,5.32        ,6.222         ,0.855                     
0    ,16   ,160  ,127      ,16   ,23   ,6.11        ,8.151         ,0.75                      
0    ,16   ,160  ,127      ,256  ,23   ,5.155       ,6.437         ,0.801                     
0    ,16   ,160  ,127      ,64   ,23   ,5.599       ,7.838         ,0.714                     
0    ,16   ,160  ,127      ,96   ,23   ,5.369       ,8.173         ,0.657                     
0    ,16   ,17   ,127      ,16   ,23   ,3.222       ,3.187         ,1.011                     
0    ,16   ,176  ,127      ,128  ,23   ,5.141       ,6.528         ,0.788                     
0    ,16   ,176  ,127      ,160  ,23   ,5.29        ,6.267         ,0.844                     
0    ,16   ,176  ,127      ,32   ,23   ,5.59        ,8.172         ,0.684                     
0    ,16   ,1760 ,127      ,2048 ,23   ,33.731      ,50.403        ,0.669                     
0    ,16   ,1760 ,127      ,288  ,23   ,27.6        ,32.727        ,0.843                     
0    ,16   ,18   ,127      ,17   ,23   ,3.16        ,3.445         ,0.917                     
0    ,16   ,1808 ,127      ,2048 ,23   ,35.232      ,56.189        ,0.627                     
0    ,16   ,1808 ,127      ,240  ,23   ,28.033      ,33.664        ,0.833                     
0    ,16   ,1856 ,127      ,192  ,23   ,28.71       ,34.266        ,0.838                     
0    ,16   ,1856 ,127      ,2048 ,23   ,35.865      ,60.854        ,0.589                     
0    ,16   ,19   ,127      ,18   ,23   ,3.169       ,3.382         ,0.937                     
0    ,16   ,1904 ,127      ,144  ,23   ,28.703      ,32.628        ,0.88                      
0    ,16   ,1904 ,127      ,2048 ,23   ,36.013      ,67.08         ,0.537                     
0    ,16   ,192  ,127      ,176  ,23   ,5.646       ,7.328         ,0.77                      
0    ,16   ,1952 ,127      ,2048 ,23   ,41.552      ,59.332        ,0.7                       
0    ,16   ,1952 ,127      ,96   ,23   ,31.231      ,35.501        ,0.88                      
0    ,16   ,2    ,127      ,1    ,23   ,3.146       ,3.179         ,0.989                     
0    ,16   ,20   ,127      ,19   ,23   ,3.169       ,3.246         ,0.976                     
0    ,16   ,2000 ,127      ,2048 ,23   ,46.0        ,70.366        ,0.654                     
0    ,16   ,2000 ,127      ,48   ,23   ,32.599      ,34.543        ,0.944                     
0    ,16   ,2048 ,127      ,0    ,23   ,33.828      ,36.499        ,0.927                     
0    ,16   ,2048 ,127      ,1024 ,23   ,38.208      ,45.153        ,0.846                     
0    ,16   ,2048 ,127      ,128  ,23   ,32.134      ,35.071        ,0.916                     
0    ,16   ,2048 ,127      ,144  ,23   ,33.146      ,35.794        ,0.926                     
0    ,16   ,2048 ,127      ,1760 ,23   ,39.068      ,54.347        ,0.719                     
0    ,16   ,2048 ,127      ,1808 ,23   ,48.313      ,57.817        ,0.836                     
0    ,16   ,2048 ,127      ,1856 ,23   ,43.439      ,50.915        ,0.853                     
0    ,16   ,2048 ,127      ,1904 ,23   ,47.154      ,51.681        ,0.912                     
0    ,16   ,2048 ,127      ,192  ,23   ,32.255      ,36.219        ,0.891                     
0    ,16   ,2048 ,127      ,1952 ,23   ,44.268      ,53.135        ,0.833                     
0    ,16   ,2048 ,127      ,2000 ,23   ,43.611      ,63.146        ,0.691                     
0    ,16   ,2048 ,127      ,2048 ,23   ,45.643      ,81.625        ,0.559                     
0    ,16   ,2048 ,127      ,240  ,23   ,31.904      ,35.888        ,0.889                     
0    ,16   ,2048 ,127      ,256  ,23   ,32.517      ,36.453        ,0.892                     
0    ,16   ,2048 ,127      ,288  ,23   ,33.122      ,43.043        ,0.77                      
0    ,16   ,2048 ,127      ,32   ,23   ,33.409      ,35.815        ,0.933                     
0    ,16   ,2048 ,127      ,4096 ,23   ,36.371      ,45.522        ,0.799                     
0    ,16   ,2048 ,127      ,48   ,23   ,33.454      ,34.874        ,0.959                     
0    ,16   ,2048 ,127      ,512  ,23   ,33.709      ,40.226        ,0.838                     
0    ,16   ,2048 ,127      ,64   ,23   ,32.639      ,35.727        ,0.914                     
0    ,16   ,2048 ,127      ,96   ,23   ,32.574      ,35.928        ,0.907                     
0    ,16   ,208  ,127      ,16   ,23   ,7.337       ,9.42          ,0.779                     
0    ,16   ,208  ,127      ,192  ,23   ,5.757       ,7.05          ,0.817                     
0    ,16   ,208  ,127      ,256  ,23   ,5.216       ,7.696         ,0.678                     
0    ,16   ,208  ,127      ,48   ,23   ,6.807       ,8.815         ,0.772                     
0    ,16   ,208  ,127      ,64   ,23   ,6.015       ,8.369         ,0.719                     
0    ,16   ,2096 ,127      ,2048 ,23   ,40.09       ,64.884        ,0.618                     
0    ,16   ,21   ,127      ,20   ,23   ,3.16        ,3.162         ,1.0                       
0    ,16   ,2144 ,127      ,2048 ,23   ,48.477      ,58.556        ,0.828                     
0    ,16   ,2192 ,127      ,2048 ,23   ,63.926      ,60.205        ,1.062                     
0    ,16   ,22   ,127      ,21   ,23   ,3.145       ,3.339         ,0.942                     
0    ,16   ,224  ,127      ,128  ,23   ,5.718       ,7.235         ,0.79                      
0    ,16   ,224  ,127      ,208  ,23   ,5.548       ,7.047         ,0.787                     
0    ,16   ,224  ,127      ,288  ,23   ,5.101       ,8.664         ,0.589                     
0    ,16   ,224  ,127      ,32   ,23   ,7.018       ,9.411         ,0.746                     
0    ,16   ,224  ,127      ,512  ,23   ,5.132       ,8.694         ,0.59                      
0    ,16   ,2240 ,127      ,2048 ,23   ,60.524      ,62.649        ,0.966                     
0    ,16   ,2288 ,127      ,2048 ,23   ,59.027      ,62.563        ,0.943                     
0    ,16   ,23   ,127      ,22   ,23   ,3.145       ,3.322         ,0.947                     
0    ,16   ,2336 ,127      ,2048 ,23   ,63.776      ,61.038        ,1.045                     
0    ,16   ,24   ,127      ,23   ,23   ,3.161       ,3.279         ,0.964                     
0    ,16   ,240  ,127      ,224  ,23   ,5.249       ,7.014         ,0.748                     
0    ,16   ,25   ,127      ,24   ,23   ,3.153       ,3.599         ,0.876                     
0    ,16   ,256  ,127      ,0    ,23   ,8.794       ,11.951        ,0.736                     
0    ,16   ,256  ,127      ,112  ,23   ,8.294       ,11.472        ,0.723                     
0    ,16   ,256  ,127      ,144  ,23   ,7.403       ,9.753         ,0.759                     
0    ,16   ,256  ,127      ,16   ,23   ,9.535       ,12.03         ,0.793                     
0    ,16   ,256  ,127      ,160  ,23   ,7.526       ,10.763        ,0.699                     
0    ,16   ,256  ,127      ,192  ,23   ,7.1         ,10.586        ,0.671                     
0    ,16   ,256  ,127      ,208  ,23   ,6.854       ,10.525        ,0.651                     
0    ,16   ,256  ,127      ,240  ,23   ,7.021       ,10.085        ,0.696                     
0    ,16   ,256  ,127      ,256  ,23   ,6.91        ,10.876        ,0.635                     
0    ,16   ,256  ,127      ,288  ,23   ,6.844       ,9.532         ,0.718                     
0    ,16   ,256  ,127      ,48   ,23   ,8.751       ,11.453        ,0.764                     
0    ,16   ,256  ,127      ,64   ,23   ,8.093       ,10.078        ,0.803                     
0    ,16   ,256  ,127      ,96   ,23   ,8.113       ,11.152        ,0.727                     
0    ,16   ,26   ,127      ,25   ,23   ,3.178       ,3.445         ,0.922                     
0    ,16   ,27   ,127      ,26   ,23   ,3.129       ,3.313         ,0.944                     
0    ,16   ,272  ,127      ,128  ,23   ,7.529       ,9.611         ,0.783                     
0    ,16   ,272  ,127      ,240  ,23   ,7.257       ,11.017        ,0.659                     
0    ,16   ,272  ,127      ,256  ,23   ,8.071       ,10.838        ,0.745                     
0    ,16   ,272  ,127      ,32   ,23   ,8.546       ,11.136        ,0.767                     
0    ,16   ,272  ,127      ,512  ,23   ,8.012       ,10.355        ,0.774                     
0    ,16   ,28   ,127      ,27   ,23   ,3.161       ,3.163         ,0.999                     
0    ,16   ,288  ,127      ,272  ,23   ,8.095       ,9.776         ,0.828                     
0    ,16   ,29   ,127      ,28   ,23   ,3.146       ,3.391         ,0.928                     
0    ,16   ,3    ,127      ,2    ,23   ,3.178       ,3.3           ,0.963                     
0    ,16   ,30   ,127      ,29   ,23   ,3.161       ,3.427         ,0.922                     
0    ,16   ,304  ,127      ,16   ,23   ,8.915       ,11.096        ,0.803                     
0    ,16   ,304  ,127      ,256  ,23   ,8.059       ,11.274        ,0.715                     
0    ,16   ,304  ,127      ,64   ,23   ,8.13        ,9.605         ,0.846                     
0    ,16   ,31   ,127      ,30   ,23   ,3.183       ,3.352         ,0.95                      
0    ,16   ,32   ,127      ,0    ,23   ,3.231       ,3.34          ,0.967                     
0    ,16   ,32   ,127      ,128  ,23   ,3.129       ,3.508         ,0.892                     
0    ,16   ,32   ,127      ,144  ,23   ,3.102       ,3.278         ,0.946                     
0    ,16   ,32   ,127      ,16   ,23   ,3.145       ,3.114         ,1.01                      
0    ,16   ,32   ,127      ,192  ,23   ,3.145       ,3.366         ,0.934                     
0    ,16   ,32   ,127      ,240  ,23   ,3.13        ,3.364         ,0.93                      
0    ,16   ,32   ,127      ,288  ,23   ,3.114       ,3.483         ,0.894                     
0    ,16   ,32   ,127      ,31   ,23   ,3.261       ,3.327         ,0.98                      
0    ,16   ,32   ,127      ,32   ,23   ,3.314       ,3.168         ,1.046                     
0    ,16   ,32   ,127      ,48   ,23   ,3.161       ,3.262         ,0.969                     
0    ,16   ,32   ,127      ,96   ,23   ,3.122       ,3.212         ,0.972                     
0    ,16   ,320  ,127      ,128  ,23   ,7.366       ,11.526        ,0.639                     
0    ,16   ,320  ,127      ,192  ,23   ,7.328       ,10.465        ,0.7                       
0    ,16   ,320  ,127      ,32   ,23   ,8.567       ,12.111        ,0.707                     
0    ,16   ,320  ,127      ,512  ,23   ,8.91        ,10.598        ,0.841                     
0    ,16   ,352  ,127      ,256  ,23   ,9.068       ,11.707        ,0.775                     
0    ,16   ,352  ,127      ,64   ,23   ,8.151       ,10.747        ,0.758                     
0    ,16   ,368  ,127      ,128  ,23   ,7.496       ,10.471        ,0.716                     
0    ,16   ,368  ,127      ,144  ,23   ,7.333       ,11.062        ,0.663                     
0    ,16   ,368  ,127      ,512  ,23   ,8.118       ,11.506        ,0.705                     
0    ,16   ,4    ,127      ,3    ,23   ,3.193       ,3.297         ,0.969                     
0    ,16   ,400  ,127      ,256  ,23   ,9.651       ,14.36         ,0.672                     
0    ,16   ,416  ,127      ,128  ,23   ,9.108       ,11.58         ,0.786                     
0    ,16   ,416  ,127      ,512  ,23   ,10.225      ,13.17         ,0.776                     
0    ,16   ,416  ,127      ,96   ,23   ,9.482       ,12.78         ,0.742                     
0    ,16   ,448  ,127      ,256  ,23   ,9.795       ,14.125        ,0.693                     
0    ,16   ,464  ,127      ,48   ,23   ,10.255      ,14.002        ,0.732                     
0    ,16   ,464  ,127      ,512  ,23   ,11.831      ,14.218        ,0.832                     
0    ,16   ,48   ,127      ,32   ,23   ,3.16        ,3.367         ,0.939                     
0    ,16   ,496  ,127      ,256  ,23   ,9.657       ,13.671        ,0.706                     
0    ,16   ,5    ,127      ,4    ,23   ,3.186       ,3.492         ,0.912                     
0    ,16   ,512  ,127      ,0    ,23   ,12.048      ,15.64         ,0.77                      
0    ,16   ,512  ,127      ,144  ,23   ,10.972      ,14.349        ,0.765                     
0    ,16   ,512  ,127      ,192  ,23   ,10.035      ,14.292        ,0.702                     
0    ,16   ,512  ,127      ,224  ,23   ,9.99        ,14.743        ,0.678                     
0    ,16   ,512  ,127      ,240  ,23   ,11.028      ,15.141        ,0.728                     
0    ,16   ,512  ,127      ,272  ,23   ,11.218      ,15.102        ,0.743                     
0    ,16   ,512  ,127      ,288  ,23   ,11.271      ,15.712        ,0.717                     
0    ,16   ,512  ,127      ,320  ,23   ,10.968      ,15.475        ,0.709                     
0    ,16   ,512  ,127      ,368  ,23   ,11.241      ,14.911        ,0.754                     
0    ,16   ,512  ,127      ,416  ,23   ,11.936      ,16.568        ,0.72                      
0    ,16   ,512  ,127      ,464  ,23   ,11.853      ,17.239        ,0.688                     
0    ,16   ,512  ,127      ,48   ,23   ,11.926      ,14.573        ,0.818                     
0    ,16   ,512  ,127      ,512  ,23   ,12.265      ,16.872        ,0.727                     
0    ,16   ,512  ,127      ,96   ,23   ,11.607      ,14.806        ,0.784                     
0    ,16   ,544  ,127      ,256  ,23   ,11.361      ,15.435        ,0.736                     
0    ,16   ,560  ,127      ,512  ,23   ,12.756      ,16.698        ,0.764                     
0    ,16   ,6    ,127      ,5    ,23   ,3.185       ,3.305         ,0.964                     
0    ,16   ,608  ,127      ,512  ,23   ,12.317      ,18.386        ,0.67                      
0    ,16   ,64   ,127      ,0    ,23   ,4.466       ,5.478         ,0.815                     
0    ,16   ,64   ,127      ,144  ,23   ,4.425       ,4.982         ,0.888                     
0    ,16   ,64   ,127      ,16   ,23   ,4.49        ,5.02          ,0.894                     
0    ,16   ,64   ,127      ,192  ,23   ,4.419       ,5.075         ,0.871                     
0    ,16   ,64   ,127      ,240  ,23   ,4.382       ,5.779         ,0.758                     
0    ,16   ,64   ,127      ,256  ,23   ,4.38        ,4.957         ,0.884                     
0    ,16   ,64   ,127      ,288  ,23   ,4.425       ,5.432         ,0.815                     
0    ,16   ,64   ,127      ,48   ,23   ,4.441       ,5.016         ,0.885                     
0    ,16   ,64   ,127      ,64   ,23   ,4.465       ,4.973         ,0.898                     
0    ,16   ,64   ,127      ,96   ,23   ,4.47        ,5.006         ,0.893                     
0    ,16   ,656  ,127      ,512  ,23   ,14.392      ,19.933        ,0.722                     
0    ,16   ,7    ,127      ,6    ,23   ,3.177       ,3.253         ,0.976                     
0    ,16   ,704  ,127      ,512  ,23   ,14.324      ,20.486        ,0.699                     
0    ,16   ,736  ,127      ,1024 ,23   ,15.263      ,20.274        ,0.753                     
0    ,16   ,736  ,127      ,288  ,23   ,13.001      ,18.038        ,0.721                     
0    ,16   ,752  ,127      ,512  ,23   ,14.322      ,20.224        ,0.708                     
0    ,16   ,784  ,127      ,1024 ,23   ,17.573      ,21.717        ,0.809                     
0    ,16   ,784  ,127      ,240  ,23   ,13.68       ,17.887        ,0.765                     
0    ,16   ,8    ,127      ,7    ,23   ,3.145       ,3.262         ,0.964                     
0    ,16   ,80   ,127      ,128  ,23   ,5.008       ,4.269         ,1.173                     
0    ,16   ,80   ,127      ,32   ,23   ,4.699       ,5.069         ,0.927                     
0    ,16   ,80   ,127      ,48   ,23   ,4.499       ,5.031         ,0.894                     
0    ,16   ,80   ,127      ,64   ,23   ,5.065       ,4.132         ,1.226                     
0    ,16   ,800  ,127      ,512  ,23   ,16.19       ,22.15         ,0.731                     
0    ,16   ,832  ,127      ,1024 ,23   ,17.574      ,23.476        ,0.749                     
0    ,16   ,832  ,127      ,192  ,23   ,14.099      ,18.214        ,0.774                     
0    ,16   ,880  ,127      ,1024 ,23   ,18.259      ,22.914        ,0.797                     
0    ,16   ,880  ,127      ,144  ,23   ,14.059      ,18.869        ,0.745                     
0    ,16   ,9    ,127      ,8    ,23   ,3.161       ,3.323         ,0.951                     
0    ,16   ,928  ,127      ,1024 ,23   ,18.622      ,25.614        ,0.727                     
0    ,16   ,928  ,127      ,96   ,23   ,16.562      ,19.474        ,0.85                      
0    ,16   ,96   ,127      ,80   ,23   ,5.057       ,4.359         ,1.16                      
0    ,16   ,976  ,127      ,1024 ,23   ,18.923      ,26.838        ,0.705                     
0    ,16   ,976  ,127      ,48   ,23   ,17.096      ,21.254        ,0.804                     
0    ,256  ,1    ,127      ,0    ,23   ,3.146       ,3.214         ,0.979                     
0    ,256  ,10   ,127      ,9    ,23   ,3.137       ,3.178         ,0.987                     
0    ,256  ,1024 ,127      ,0    ,23   ,19.994      ,21.784        ,0.918                     
0    ,256  ,1024 ,127      ,1024 ,23   ,21.983      ,28.581        ,0.769                     
0    ,256  ,1024 ,127      ,144  ,23   ,18.137      ,20.588        ,0.881                     
0    ,256  ,1024 ,127      ,192  ,23   ,17.06       ,25.282        ,0.675                     
0    ,256  ,1024 ,127      ,240  ,23   ,17.087      ,21.437        ,0.797                     
0    ,256  ,1024 ,127      ,288  ,23   ,18.225      ,24.604        ,0.741                     
0    ,256  ,1024 ,127      ,48   ,23   ,19.586      ,21.465        ,0.912                     
0    ,256  ,1024 ,127      ,736  ,23   ,19.255      ,25.27         ,0.762                     
0    ,256  ,1024 ,127      ,784  ,23   ,20.553      ,27.743        ,0.741                     
0    ,256  ,1024 ,127      ,832  ,23   ,19.867      ,29.565        ,0.672                     
0    ,256  ,1024 ,127      ,880  ,23   ,19.854      ,29.602        ,0.671                     
0    ,256  ,1024 ,127      ,928  ,23   ,21.875      ,27.64         ,0.791                     
0    ,256  ,1024 ,127      ,96   ,23   ,19.071      ,21.496        ,0.887                     
0    ,256  ,1024 ,127      ,976  ,23   ,21.498      ,27.902        ,0.77                      
0    ,256  ,1072 ,127      ,1024 ,23   ,20.903      ,27.906        ,0.749                     
0    ,256  ,11   ,127      ,10   ,23   ,3.144       ,3.305         ,0.951                     
0    ,256  ,112  ,127      ,144  ,23   ,5.07        ,4.641         ,1.092                     
0    ,256  ,112  ,127      ,16   ,23   ,4.403       ,5.206         ,0.846                     
0    ,256  ,112  ,127      ,256  ,23   ,4.957       ,4.315         ,1.149                     
0    ,256  ,112  ,127      ,64   ,23   ,5.045       ,3.835         ,1.315                     
0    ,256  ,112  ,127      ,96   ,23   ,5.057       ,4.119         ,1.228                     
0    ,256  ,1120 ,127      ,1024 ,23   ,21.671      ,28.532        ,0.76                      
0    ,256  ,1168 ,127      ,1024 ,23   ,24.189      ,34.961        ,0.692                     
0    ,256  ,12   ,127      ,11   ,23   ,3.145       ,3.541         ,0.888                     
0    ,256  ,1216 ,127      ,1024 ,23   ,25.203      ,35.574        ,0.708                     
0    ,256  ,1264 ,127      ,1024 ,23   ,24.902      ,34.122        ,0.73                      
0    ,256  ,128  ,127      ,0    ,23   ,5.888       ,8.837         ,0.666                     
0    ,256  ,128  ,127      ,112  ,23   ,5.258       ,7.4           ,0.711                     
0    ,256  ,128  ,127      ,128  ,23   ,5.34        ,7.324         ,0.729                     
0    ,256  ,128  ,127      ,144  ,23   ,5.227       ,7.499         ,0.697                     
0    ,256  ,128  ,127      ,192  ,23   ,5.338       ,8.648         ,0.617                     
0    ,256  ,128  ,127      ,240  ,23   ,5.691       ,8.385         ,0.679                     
0    ,256  ,128  ,127      ,288  ,23   ,6.039       ,8.97          ,0.673                     
0    ,256  ,128  ,127      ,32   ,23   ,5.875       ,8.509         ,0.69                      
0    ,256  ,128  ,127      ,48   ,23   ,6.03        ,8.378         ,0.72                      
0    ,256  ,128  ,127      ,80   ,23   ,5.997       ,8.062         ,0.744                     
0    ,256  ,128  ,127      ,96   ,23   ,5.271       ,8.035         ,0.656                     
0    ,256  ,13   ,127      ,12   ,23   ,3.152       ,3.482         ,0.905                     
0    ,256  ,1312 ,127      ,1024 ,23   ,26.122      ,32.511        ,0.803                     
0    ,256  ,14   ,127      ,13   ,23   ,3.161       ,3.427         ,0.922                     
0    ,256  ,144  ,127      ,128  ,23   ,5.318       ,6.503         ,0.818                     
0    ,256  ,15   ,127      ,14   ,23   ,3.145       ,3.358         ,0.936                     
0    ,256  ,16   ,127      ,0    ,23   ,3.14        ,3.331         ,0.943                     
0    ,256  ,16   ,127      ,144  ,23   ,3.145       ,3.491         ,0.901                     
0    ,256  ,16   ,127      ,15   ,23   ,3.145       ,3.273         ,0.961                     
0    ,256  ,16   ,127      ,16   ,23   ,3.145       ,3.349         ,0.939                     
0    ,256  ,16   ,127      ,192  ,23   ,3.145       ,3.114         ,1.01                      
0    ,256  ,16   ,127      ,240  ,23   ,3.124       ,3.264         ,0.957                     
0    ,256  ,16   ,127      ,256  ,23   ,3.099       ,3.717         ,0.834                     
0    ,256  ,16   ,127      ,288  ,23   ,3.145       ,3.894         ,0.808                     
0    ,256  ,16   ,127      ,48   ,23   ,3.17        ,3.366         ,0.942                     
0    ,256  ,16   ,127      ,64   ,23   ,3.16        ,3.366         ,0.939                     
0    ,256  ,16   ,127      ,96   ,23   ,3.144       ,3.145         ,1.0                       
0    ,256  ,160  ,127      ,144  ,23   ,5.181       ,6.164         ,0.841                     
0    ,256  ,160  ,127      ,16   ,23   ,5.727       ,8.65          ,0.662                     
0    ,256  ,160  ,127      ,256  ,23   ,5.121       ,6.453         ,0.794                     
0    ,256  ,160  ,127      ,64   ,23   ,5.819       ,7.465         ,0.78                      
0    ,256  ,160  ,127      ,96   ,23   ,5.101       ,8.167         ,0.625                     
0    ,256  ,17   ,127      ,16   ,23   ,3.16        ,3.323         ,0.951                     
0    ,256  ,176  ,127      ,128  ,23   ,5.168       ,6.314         ,0.818                     
0    ,256  ,176  ,127      ,160  ,23   ,5.22        ,6.309         ,0.827                     
0    ,256  ,176  ,127      ,32   ,23   ,5.722       ,8.539         ,0.67                      
0    ,256  ,1760 ,127      ,2048 ,23   ,32.889      ,43.99         ,0.748                     
0    ,256  ,1760 ,127      ,288  ,23   ,26.559      ,32.553        ,0.816                     
0    ,256  ,18   ,127      ,17   ,23   ,3.202       ,3.454         ,0.927                     
0    ,256  ,1808 ,127      ,2048 ,23   ,34.767      ,52.228        ,0.666                     
0    ,256  ,1808 ,127      ,240  ,23   ,27.299      ,31.891        ,0.856                     
0    ,256  ,1856 ,127      ,192  ,23   ,27.555      ,31.557        ,0.873                     
0    ,256  ,1856 ,127      ,2048 ,23   ,35.551      ,49.377        ,0.72                      
0    ,256  ,19   ,127      ,18   ,23   ,3.137       ,3.339         ,0.94                      
0    ,256  ,1904 ,127      ,144  ,23   ,28.06       ,31.927        ,0.879                     
0    ,256  ,1904 ,127      ,2048 ,23   ,35.478      ,47.111        ,0.753                     
0    ,256  ,192  ,127      ,176  ,23   ,5.577       ,7.232         ,0.771                     
0    ,256  ,1952 ,127      ,2048 ,23   ,37.287      ,59.21         ,0.63                      
0    ,256  ,1952 ,127      ,96   ,23   ,30.677      ,33.516        ,0.915                     
0    ,256  ,2    ,127      ,1    ,23   ,3.145       ,3.325         ,0.946                     
0    ,256  ,20   ,127      ,19   ,23   ,3.227       ,3.314         ,0.974                     
0    ,256  ,2000 ,127      ,2048 ,23   ,37.38       ,50.221        ,0.744                     
0    ,256  ,2000 ,127      ,48   ,23   ,30.747      ,33.806        ,0.91                      
0    ,256  ,2048 ,127      ,0    ,23   ,33.031      ,35.835        ,0.922                     
0    ,256  ,2048 ,127      ,1024 ,23   ,36.334      ,44.119        ,0.824                     
0    ,256  ,2048 ,127      ,128  ,23   ,31.88       ,35.272        ,0.904                     
0    ,256  ,2048 ,127      ,144  ,23   ,31.578      ,34.847        ,0.906                     
0    ,256  ,2048 ,127      ,1760 ,23   ,38.149      ,48.362        ,0.789                     
0    ,256  ,2048 ,127      ,1808 ,23   ,43.271      ,50.273        ,0.861                     
0    ,256  ,2048 ,127      ,1856 ,23   ,42.912      ,50.014        ,0.858                     
0    ,256  ,2048 ,127      ,1904 ,23   ,51.249      ,50.177        ,1.021                     
0    ,256  ,2048 ,127      ,192  ,23   ,31.015      ,36.799        ,0.843                     
0    ,256  ,2048 ,127      ,1952 ,23   ,51.777      ,51.757        ,1.0                       
0    ,256  ,2048 ,127      ,2000 ,23   ,60.602      ,60.728        ,0.998                     
0    ,256  ,2048 ,127      ,2048 ,23   ,51.292      ,56.859        ,0.902                     
0    ,256  ,2048 ,127      ,240  ,23   ,31.21       ,35.351        ,0.883                     
0    ,256  ,2048 ,127      ,256  ,23   ,32.582      ,36.327        ,0.897                     
0    ,256  ,2048 ,127      ,288  ,23   ,32.641      ,37.189        ,0.878                     
0    ,256  ,2048 ,127      ,32   ,23   ,33.0        ,35.158        ,0.939                     
0    ,256  ,2048 ,127      ,4096 ,23   ,46.652      ,54.16         ,0.861                     
0    ,256  ,2048 ,127      ,48   ,23   ,33.248      ,37.148        ,0.895                     
0    ,256  ,2048 ,127      ,512  ,23   ,33.596      ,39.634        ,0.848                     
0    ,256  ,2048 ,127      ,64   ,23   ,32.438      ,35.412        ,0.916                     
0    ,256  ,2048 ,127      ,96   ,23   ,32.241      ,36.44         ,0.885                     
0    ,256  ,208  ,127      ,16   ,23   ,6.961       ,9.348         ,0.745                     
0    ,256  ,208  ,127      ,192  ,23   ,5.231       ,7.171         ,0.729                     
0    ,256  ,208  ,127      ,256  ,23   ,4.941       ,7.662         ,0.645                     
0    ,256  ,208  ,127      ,48   ,23   ,6.592       ,9.107         ,0.724                     
0    ,256  ,208  ,127      ,64   ,23   ,6.441       ,8.076         ,0.798                     
0    ,256  ,2096 ,127      ,2048 ,23   ,40.313      ,56.168        ,0.718                     
0    ,256  ,21   ,127      ,20   ,23   ,3.153       ,3.4           ,0.927                     
0    ,256  ,2144 ,127      ,2048 ,23   ,57.465      ,55.629        ,1.033                     
0    ,256  ,2192 ,127      ,2048 ,23   ,47.3        ,55.284        ,0.856                     
0    ,256  ,22   ,127      ,21   ,23   ,3.195       ,3.254         ,0.982                     
0    ,256  ,224  ,127      ,128  ,23   ,5.415       ,7.406         ,0.731                     
0    ,256  ,224  ,127      ,208  ,23   ,5.4         ,7.517         ,0.718                     
0    ,256  ,224  ,127      ,288  ,23   ,5.108       ,8.419         ,0.607                     
0    ,256  ,224  ,127      ,32   ,23   ,6.738       ,9.569         ,0.704                     
0    ,256  ,224  ,127      ,512  ,23   ,5.075       ,7.537         ,0.673                     
0    ,256  ,2240 ,127      ,2048 ,23   ,53.873      ,55.138        ,0.977                     
0    ,256  ,2288 ,127      ,2048 ,23   ,54.51       ,58.666        ,0.929                     
0    ,256  ,23   ,127      ,22   ,23   ,3.144       ,3.145         ,1.0                       
0    ,256  ,2336 ,127      ,2048 ,23   ,61.72       ,56.847        ,1.086                     
0    ,256  ,24   ,127      ,23   ,23   ,3.145       ,3.313         ,0.949                     
0    ,256  ,240  ,127      ,224  ,23   ,5.183       ,6.867         ,0.755                     
0    ,256  ,25   ,127      ,24   ,23   ,3.169       ,3.333         ,0.951                     
0    ,256  ,256  ,127      ,0    ,23   ,8.681       ,12.197        ,0.712                     
0    ,256  ,256  ,127      ,112  ,23   ,8.259       ,10.596        ,0.779                     
0    ,256  ,256  ,127      ,144  ,23   ,7.334       ,9.898         ,0.741                     
0    ,256  ,256  ,127      ,16   ,23   ,9.449       ,12.266        ,0.77                      
0    ,256  ,256  ,127      ,160  ,23   ,7.509       ,10.0          ,0.751                     
0    ,256  ,256  ,127      ,192  ,23   ,6.896       ,9.401         ,0.734                     
0    ,256  ,256  ,127      ,208  ,23   ,6.825       ,10.7          ,0.638                     
0    ,256  ,256  ,127      ,240  ,23   ,7.694       ,10.464        ,0.735                     
0    ,256  ,256  ,127      ,256  ,23   ,7.033       ,9.86          ,0.713                     
0    ,256  ,256  ,127      ,288  ,23   ,7.227       ,9.406         ,0.768                     
0    ,256  ,256  ,127      ,48   ,23   ,8.621       ,12.415        ,0.694                     
0    ,256  ,256  ,127      ,64   ,23   ,8.982       ,10.854        ,0.828                     
0    ,256  ,256  ,127      ,96   ,23   ,8.387       ,10.322        ,0.813                     
0    ,256  ,26   ,127      ,25   ,23   ,3.16        ,3.305         ,0.956                     
0    ,256  ,27   ,127      ,26   ,23   ,3.129       ,3.261         ,0.959                     
0    ,256  ,272  ,127      ,128  ,23   ,9.172       ,9.648         ,0.951                     
0    ,256  ,272  ,127      ,240  ,23   ,6.991       ,11.363        ,0.615                     
0    ,256  ,272  ,127      ,256  ,23   ,8.142       ,10.487        ,0.776                     
0    ,256  ,272  ,127      ,32   ,23   ,10.148      ,11.691        ,0.868                     
0    ,256  ,272  ,127      ,512  ,23   ,8.368       ,10.967        ,0.763                     
0    ,256  ,28   ,127      ,27   ,23   ,3.186       ,3.313         ,0.961                     
0    ,256  ,288  ,127      ,272  ,23   ,7.975       ,10.551        ,0.756                     
0    ,256  ,29   ,127      ,28   ,23   ,3.144       ,3.374         ,0.932                     
0    ,256  ,3    ,127      ,2    ,23   ,3.145       ,3.398         ,0.925                     
0    ,256  ,30   ,127      ,29   ,23   ,3.152       ,3.425         ,0.92                      
0    ,256  ,304  ,127      ,16   ,23   ,10.272      ,11.657        ,0.881                     
0    ,256  ,304  ,127      ,256  ,23   ,8.008       ,10.999        ,0.728                     
0    ,256  ,304  ,127      ,64   ,23   ,9.544       ,9.819         ,0.972                     
0    ,256  ,31   ,127      ,30   ,23   ,3.145       ,3.23          ,0.974                     
0    ,256  ,32   ,127      ,0    ,23   ,3.145       ,4.03          ,0.78                      
0    ,256  ,32   ,127      ,128  ,23   ,3.13        ,3.263         ,0.959                     
0    ,256  ,32   ,127      ,144  ,23   ,3.145       ,3.246         ,0.969                     
0    ,256  ,32   ,127      ,16   ,23   ,3.145       ,3.544         ,0.887                     
0    ,256  ,32   ,127      ,192  ,23   ,3.144       ,3.262         ,0.964                     
0    ,256  ,32   ,127      ,240  ,23   ,3.144       ,3.113         ,1.01                      
0    ,256  ,32   ,127      ,288  ,23   ,3.144       ,3.099         ,1.015                     
0    ,256  ,32   ,127      ,31   ,23   ,3.152       ,3.289         ,0.958                     
0    ,256  ,32   ,127      ,32   ,23   ,3.14        ,3.186         ,0.986                     
0    ,256  ,32   ,127      ,48   ,23   ,3.144       ,3.23          ,0.973                     
0    ,256  ,32   ,127      ,96   ,23   ,3.145       ,3.265         ,0.963                     
0    ,256  ,320  ,127      ,128  ,23   ,10.148      ,10.832        ,0.937                     
0    ,256  ,320  ,127      ,192  ,23   ,7.227       ,10.995        ,0.657                     
0    ,256  ,320  ,127      ,32   ,23   ,11.107      ,12.936        ,0.859                     
0    ,256  ,320  ,127      ,512  ,23   ,8.956       ,10.671        ,0.839                     
0    ,256  ,352  ,127      ,256  ,23   ,9.058       ,11.587        ,0.782                     
0    ,256  ,352  ,127      ,64   ,23   ,10.604      ,11.707        ,0.906                     
0    ,256  ,368  ,127      ,128  ,23   ,10.45       ,11.429        ,0.914                     
0    ,256  ,368  ,127      ,144  ,23   ,7.747       ,11.321        ,0.684                     
0    ,256  ,368  ,127      ,512  ,23   ,9.028       ,11.435        ,0.789                     
0    ,256  ,4    ,127      ,3    ,23   ,3.168       ,3.289         ,0.963                     
0    ,256  ,400  ,127      ,256  ,23   ,11.227      ,14.437        ,0.778                     
0    ,256  ,416  ,127      ,128  ,23   ,11.666      ,13.116        ,0.889                     
0    ,256  ,416  ,127      ,512  ,23   ,10.019      ,12.995        ,0.771                     
0    ,256  ,416  ,127      ,96   ,23   ,9.753       ,12.526        ,0.779                     
0    ,256  ,448  ,127      ,256  ,23   ,12.314      ,13.949        ,0.883                     
0    ,256  ,464  ,127      ,48   ,23   ,10.197      ,14.128        ,0.722                     
0    ,256  ,464  ,127      ,512  ,23   ,10.467      ,14.219        ,0.736                     
0    ,256  ,48   ,127      ,32   ,23   ,3.145       ,3.189         ,0.986                     
0    ,256  ,496  ,127      ,256  ,23   ,11.836      ,14.011        ,0.845                     
0    ,256  ,5    ,127      ,4    ,23   ,3.145       ,3.411         ,0.922                     
0    ,256  ,512  ,127      ,0    ,23   ,12.587      ,15.042        ,0.837                     
0    ,256  ,512  ,127      ,144  ,23   ,11.395      ,13.993        ,0.814                     
0    ,256  ,512  ,127      ,192  ,23   ,9.988       ,14.644        ,0.682                     
0    ,256  ,512  ,127      ,224  ,23   ,9.985       ,15.975        ,0.625                     
0    ,256  ,512  ,127      ,240  ,23   ,10.083      ,14.035        ,0.718                     
0    ,256  ,512  ,127      ,272  ,23   ,11.22       ,16.058        ,0.699                     
0    ,256  ,512  ,127      ,288  ,23   ,11.384      ,15.452        ,0.737                     
0    ,256  ,512  ,127      ,320  ,23   ,10.717      ,15.884        ,0.675                     
0    ,256  ,512  ,127      ,368  ,23   ,10.69       ,15.46         ,0.691                     
0    ,256  ,512  ,127      ,416  ,23   ,13.053      ,16.36         ,0.798                     
0    ,256  ,512  ,127      ,464  ,23   ,13.77       ,15.835        ,0.87                      
0    ,256  ,512  ,127      ,48   ,23   ,12.562      ,15.047        ,0.835                     
0    ,256  ,512  ,127      ,512  ,23   ,13.772      ,16.889        ,0.815                     
0    ,256  ,512  ,127      ,96   ,23   ,12.085      ,13.615        ,0.888                     
0    ,256  ,544  ,127      ,256  ,23   ,12.012      ,15.513        ,0.774                     
0    ,256  ,560  ,127      ,512  ,23   ,12.756      ,16.063        ,0.794                     
0    ,256  ,6    ,127      ,5    ,23   ,3.178       ,3.238         ,0.981                     
0    ,256  ,608  ,127      ,512  ,23   ,12.669      ,17.017        ,0.745                     
0    ,256  ,64   ,127      ,0    ,23   ,4.403       ,5.565         ,0.791                     
0    ,256  ,64   ,127      ,144  ,23   ,4.545       ,4.981         ,0.912                     
0    ,256  ,64   ,127      ,16   ,23   ,4.47        ,4.998         ,0.894                     
0    ,256  ,64   ,127      ,192  ,23   ,4.466       ,4.969         ,0.899                     
0    ,256  ,64   ,127      ,240  ,23   ,4.545       ,6.049         ,0.751                     
0    ,256  ,64   ,127      ,256  ,23   ,4.477       ,4.908         ,0.912                     
0    ,256  ,64   ,127      ,288  ,23   ,4.522       ,5.154         ,0.877                     
0    ,256  ,64   ,127      ,48   ,23   ,4.414       ,5.14          ,0.859                     
0    ,256  ,64   ,127      ,64   ,23   ,4.523       ,5.051         ,0.895                     
0    ,256  ,64   ,127      ,96   ,23   ,4.667       ,5.006         ,0.932                     
0    ,256  ,656  ,127      ,512  ,23   ,15.319      ,20.812        ,0.736                     
0    ,256  ,7    ,127      ,6    ,23   ,3.153       ,3.187         ,0.989                     
0    ,256  ,704  ,127      ,512  ,23   ,15.843      ,20.638        ,0.768                     
0    ,256  ,736  ,127      ,1024 ,23   ,14.656      ,19.421        ,0.755                     
0    ,256  ,736  ,127      ,288  ,23   ,12.779      ,18.999        ,0.673                     
0    ,256  ,752  ,127      ,512  ,23   ,15.777      ,21.166        ,0.745                     
0    ,256  ,784  ,127      ,1024 ,23   ,16.668      ,26.065        ,0.639                     
0    ,256  ,784  ,127      ,240  ,23   ,13.586      ,18.55         ,0.732                     
0    ,256  ,8    ,127      ,7    ,23   ,3.145       ,3.306         ,0.951                     
0    ,256  ,80   ,127      ,128  ,23   ,5.302       ,4.408         ,1.203                     
0    ,256  ,80   ,127      ,32   ,23   ,4.686       ,5.006         ,0.936                     
0    ,256  ,80   ,127      ,48   ,23   ,5.019       ,5.006         ,1.003                     
0    ,256  ,80   ,127      ,64   ,23   ,5.089       ,4.077         ,1.248                     
0    ,256  ,800  ,127      ,512  ,23   ,16.956      ,22.366        ,0.758                     
0    ,256  ,832  ,127      ,1024 ,23   ,17.372      ,23.477        ,0.74                      
0    ,256  ,832  ,127      ,192  ,23   ,13.515      ,18.271        ,0.74                      
0    ,256  ,880  ,127      ,1024 ,23   ,17.435      ,26.974        ,0.646                     
0    ,256  ,880  ,127      ,144  ,23   ,13.977      ,21.575        ,0.648                     
0    ,256  ,9    ,127      ,8    ,23   ,3.13        ,3.298         ,0.949                     
0    ,256  ,928  ,127      ,1024 ,23   ,18.427      ,24.849        ,0.742                     
0    ,256  ,928  ,127      ,96   ,23   ,16.439      ,20.171        ,0.815                     
0    ,256  ,96   ,127      ,80   ,23   ,5.082       ,4.669         ,1.089                     
0    ,256  ,976  ,127      ,1024 ,23   ,20.996      ,26.154        ,0.803                     
0    ,256  ,976  ,127      ,48   ,23   ,17.031      ,20.964        ,0.812                     
0    ,4    ,1    ,127      ,0    ,23   ,3.162       ,3.298         ,0.959                     
0    ,4    ,10   ,127      ,9    ,23   ,3.185       ,3.322         ,0.959                     
0    ,4    ,1024 ,127      ,0    ,23   ,19.798      ,22.015        ,0.899                     
0    ,4    ,1024 ,127      ,1024 ,23   ,20.387      ,24.541        ,0.831                     
0    ,4    ,1024 ,127      ,144  ,23   ,17.878      ,21.999        ,0.813                     
0    ,4    ,1024 ,127      ,192  ,23   ,17.345      ,22.191        ,0.782                     
0    ,4    ,1024 ,127      ,240  ,23   ,17.3        ,21.851        ,0.792                     
0    ,4    ,1024 ,127      ,288  ,23   ,18.659      ,23.492        ,0.794                     
0    ,4    ,1024 ,127      ,48   ,23   ,19.771      ,21.374        ,0.925                     
0    ,4    ,1024 ,127      ,736  ,23   ,19.632      ,27.25         ,0.72                      
0    ,4    ,1024 ,127      ,784  ,23   ,19.537      ,25.063        ,0.78                      
0    ,4    ,1024 ,127      ,832  ,23   ,20.347      ,27.578        ,0.738                     
0    ,4    ,1024 ,127      ,880  ,23   ,20.138      ,26.659        ,0.755                     
0    ,4    ,1024 ,127      ,928  ,23   ,20.246      ,25.858        ,0.783                     
0    ,4    ,1024 ,127      ,96   ,23   ,18.607      ,23.141        ,0.804                     
0    ,4    ,1024 ,127      ,976  ,23   ,20.446      ,27.822        ,0.735                     
0    ,4    ,1072 ,127      ,1024 ,23   ,19.632      ,26.754        ,0.734                     
0    ,4    ,11   ,127      ,10   ,23   ,3.144       ,3.366         ,0.934                     
0    ,4    ,112  ,127      ,144  ,23   ,5.008       ,4.143         ,1.209                     
0    ,4    ,112  ,127      ,16   ,23   ,4.793       ,5.226         ,0.917                     
0    ,4    ,112  ,127      ,256  ,23   ,5.006       ,4.205         ,1.191                     
0    ,4    ,112  ,127      ,64   ,23   ,5.033       ,3.76          ,1.339                     
0    ,4    ,112  ,127      ,96   ,23   ,5.03        ,4.083         ,1.232                     
0    ,4    ,1120 ,127      ,1024 ,23   ,21.229      ,28.983        ,0.732                     
0    ,4    ,1168 ,127      ,1024 ,23   ,23.294      ,31.851        ,0.731                     
0    ,4    ,12   ,127      ,11   ,23   ,3.145       ,3.3           ,0.953                     
0    ,4    ,1216 ,127      ,1024 ,23   ,22.808      ,30.774        ,0.741                     
0    ,4    ,1264 ,127      ,1024 ,23   ,22.829      ,30.015        ,0.761                     
0    ,4    ,128  ,127      ,0    ,23   ,5.875       ,8.679         ,0.677                     
0    ,4    ,128  ,127      ,112  ,23   ,5.429       ,7.169         ,0.757                     
0    ,4    ,128  ,127      ,128  ,23   ,5.21        ,7.586         ,0.687                     
0    ,4    ,128  ,127      ,144  ,23   ,5.433       ,7.332         ,0.741                     
0    ,4    ,128  ,127      ,192  ,23   ,5.176       ,8.312         ,0.623                     
0    ,4    ,128  ,127      ,240  ,23   ,5.132       ,8.998         ,0.57                      
0    ,4    ,128  ,127      ,288  ,23   ,5.104       ,9.468         ,0.539                     
0    ,4    ,128  ,127      ,32   ,23   ,5.724       ,8.499         ,0.674                     
0    ,4    ,128  ,127      ,48   ,23   ,5.891       ,8.831         ,0.667                     
0    ,4    ,128  ,127      ,80   ,23   ,5.306       ,7.753         ,0.684                     
0    ,4    ,128  ,127      ,96   ,23   ,5.329       ,7.752         ,0.687                     
0    ,4    ,13   ,127      ,12   ,23   ,3.161       ,3.383         ,0.934                     
0    ,4    ,1312 ,127      ,1024 ,23   ,24.76       ,31.739        ,0.78                      
0    ,4    ,14   ,127      ,13   ,23   ,3.145       ,3.262         ,0.964                     
0    ,4    ,144  ,127      ,128  ,23   ,5.193       ,6.291         ,0.825                     
0    ,4    ,15   ,127      ,14   ,23   ,3.195       ,3.331         ,0.959                     
0    ,4    ,16   ,127      ,0    ,23   ,3.145       ,3.261         ,0.965                     
0    ,4    ,16   ,127      ,144  ,23   ,3.829       ,3.872         ,0.989                     
0    ,4    ,16   ,127      ,15   ,23   ,3.193       ,3.306         ,0.966                     
0    ,4    ,16   ,127      ,16   ,23   ,3.151       ,3.338         ,0.944                     
0    ,4    ,16   ,127      ,192  ,23   ,3.793       ,3.802         ,0.998                     
0    ,4    ,16   ,127      ,240  ,23   ,3.794       ,3.737         ,1.015                     
0    ,4    ,16   ,127      ,256  ,23   ,3.793       ,3.954         ,0.959                     
0    ,4    ,16   ,127      ,288  ,23   ,3.773       ,4.71          ,0.801                     
0    ,4    ,16   ,127      ,48   ,23   ,3.146       ,3.452         ,0.911                     
0    ,4    ,16   ,127      ,64   ,23   ,3.792       ,4.271         ,0.888                     
0    ,4    ,16   ,127      ,96   ,23   ,3.794       ,3.874         ,0.98                      
0    ,4    ,160  ,127      ,144  ,23   ,5.216       ,6.293         ,0.829                     
0    ,4    ,160  ,127      ,16   ,23   ,5.818       ,8.62          ,0.675                     
0    ,4    ,160  ,127      ,256  ,23   ,5.166       ,6.405         ,0.806                     
0    ,4    ,160  ,127      ,64   ,23   ,5.283       ,8.162         ,0.647                     
0    ,4    ,160  ,127      ,96   ,23   ,5.323       ,7.597         ,0.701                     
0    ,4    ,17   ,127      ,16   ,23   ,3.177       ,3.349         ,0.949                     
0    ,4    ,176  ,127      ,128  ,23   ,5.154       ,6.772         ,0.761                     
0    ,4    ,176  ,127      ,160  ,23   ,5.29        ,6.374         ,0.83                      
0    ,4    ,176  ,127      ,32   ,23   ,5.934       ,8.192         ,0.724                     
0    ,4    ,1760 ,127      ,2048 ,23   ,28.552      ,34.773        ,0.821                     
0    ,4    ,1760 ,127      ,288  ,23   ,27.336      ,33.413        ,0.818                     
0    ,4    ,18   ,127      ,17   ,23   ,3.145       ,3.272         ,0.961                     
0    ,4    ,1808 ,127      ,2048 ,23   ,30.705      ,35.005        ,0.877                     
0    ,4    ,1808 ,127      ,240  ,23   ,28.109      ,33.235        ,0.846                     
0    ,4    ,1856 ,127      ,192  ,23   ,28.075      ,33.044        ,0.85                      
0    ,4    ,1856 ,127      ,2048 ,23   ,30.324      ,35.97         ,0.843                     
0    ,4    ,19   ,127      ,18   ,23   ,3.153       ,3.297         ,0.956                     
0    ,4    ,1904 ,127      ,144  ,23   ,29.103      ,33.633        ,0.865                     
0    ,4    ,1904 ,127      ,2048 ,23   ,30.218      ,36.616        ,0.825                     
0    ,4    ,192  ,127      ,176  ,23   ,5.578       ,7.171         ,0.778                     
0    ,4    ,1952 ,127      ,2048 ,23   ,32.19       ,36.491        ,0.882                     
0    ,4    ,1952 ,127      ,96   ,23   ,32.428      ,33.789        ,0.96                      
0    ,4    ,2    ,127      ,1    ,23   ,3.169       ,3.526         ,0.899                     
0    ,4    ,20   ,127      ,19   ,23   ,3.177       ,3.436         ,0.924                     
0    ,4    ,2000 ,127      ,2048 ,23   ,32.532      ,37.286        ,0.873                     
0    ,4    ,2000 ,127      ,48   ,23   ,31.538      ,34.406        ,0.917                     
0    ,4    ,2048 ,127      ,0    ,23   ,33.676      ,36.427        ,0.924                     
0    ,4    ,2048 ,127      ,1024 ,23   ,35.274      ,42.614        ,0.828                     
0    ,4    ,2048 ,127      ,128  ,23   ,31.809      ,35.747        ,0.89                      
0    ,4    ,2048 ,127      ,144  ,23   ,32.151      ,35.856        ,0.897                     
0    ,4    ,2048 ,127      ,1760 ,23   ,35.579      ,43.618        ,0.816                     
0    ,4    ,2048 ,127      ,1808 ,23   ,36.049      ,45.062        ,0.8                       
0    ,4    ,2048 ,127      ,1856 ,23   ,36.828      ,49.901        ,0.738                     
0    ,4    ,2048 ,127      ,1904 ,23   ,36.841      ,45.334        ,0.813                     
0    ,4    ,2048 ,127      ,192  ,23   ,30.984      ,36.171        ,0.857                     
0    ,4    ,2048 ,127      ,1952 ,23   ,37.746      ,46.988        ,0.803                     
0    ,4    ,2048 ,127      ,2000 ,23   ,37.25       ,47.263        ,0.788                     
0    ,4    ,2048 ,127      ,2048 ,23   ,34.147      ,39.186        ,0.871                     
0    ,4    ,2048 ,127      ,240  ,23   ,32.204      ,36.242        ,0.889                     
0    ,4    ,2048 ,127      ,256  ,23   ,32.52       ,36.001        ,0.903                     
0    ,4    ,2048 ,127      ,288  ,23   ,33.167      ,38.29         ,0.866                     
0    ,4    ,2048 ,127      ,32   ,23   ,33.135      ,36.306        ,0.913                     
0    ,4    ,2048 ,127      ,4096 ,23   ,33.545      ,36.543        ,0.918                     
0    ,4    ,2048 ,127      ,48   ,23   ,33.287      ,35.443        ,0.939                     
0    ,4    ,2048 ,127      ,512  ,23   ,33.864      ,39.277        ,0.862                     
0    ,4    ,2048 ,127      ,64   ,23   ,32.572      ,35.72         ,0.912                     
0    ,4    ,2048 ,127      ,96   ,23   ,32.803      ,35.505        ,0.924                     
0    ,4    ,208  ,127      ,16   ,23   ,7.079       ,9.082         ,0.779                     
0    ,4    ,208  ,127      ,192  ,23   ,5.241       ,6.938         ,0.755                     
0    ,4    ,208  ,127      ,256  ,23   ,5.545       ,7.796         ,0.711                     
0    ,4    ,208  ,127      ,48   ,23   ,6.855       ,10.459        ,0.655                     
0    ,4    ,208  ,127      ,64   ,23   ,6.697       ,9.267         ,0.723                     
0    ,4    ,2096 ,127      ,2048 ,23   ,35.759      ,42.74         ,0.837                     
0    ,4    ,21   ,127      ,20   ,23   ,3.16        ,3.411         ,0.927                     
0    ,4    ,2144 ,127      ,2048 ,23   ,37.852      ,49.296        ,0.768                     
0    ,4    ,2192 ,127      ,2048 ,23   ,39.406      ,49.179        ,0.801                     
0    ,4    ,22   ,127      ,21   ,23   ,3.177       ,3.428         ,0.927                     
0    ,4    ,224  ,127      ,128  ,23   ,5.963       ,7.29          ,0.818                     
0    ,4    ,224  ,127      ,208  ,23   ,5.457       ,7.215         ,0.756                     
0    ,4    ,224  ,127      ,288  ,23   ,5.17        ,8.362         ,0.618                     
0    ,4    ,224  ,127      ,32   ,23   ,6.713       ,8.96          ,0.749                     
0    ,4    ,224  ,127      ,512  ,23   ,5.765       ,7.395         ,0.78                      
0    ,4    ,2240 ,127      ,2048 ,23   ,38.646      ,51.874        ,0.745                     
0    ,4    ,2288 ,127      ,2048 ,23   ,38.086      ,46.635        ,0.817                     
0    ,4    ,23   ,127      ,22   ,23   ,3.156       ,3.304         ,0.955                     
0    ,4    ,2336 ,127      ,2048 ,23   ,40.864      ,49.39         ,0.827                     
0    ,4    ,24   ,127      ,23   ,23   ,3.145       ,3.297         ,0.954                     
0    ,4    ,240  ,127      ,224  ,23   ,5.151       ,7.398         ,0.696                     
0    ,4    ,25   ,127      ,24   ,23   ,3.161       ,3.417         ,0.925                     
0    ,4    ,256  ,127      ,0    ,23   ,8.701       ,11.556        ,0.753                     
0    ,4    ,256  ,127      ,112  ,23   ,8.171       ,11.121        ,0.735                     
0    ,4    ,256  ,127      ,144  ,23   ,7.464       ,9.71          ,0.769                     
0    ,4    ,256  ,127      ,16   ,23   ,9.452       ,11.721        ,0.806                     
0    ,4    ,256  ,127      ,160  ,23   ,7.531       ,10.249        ,0.735                     
0    ,4    ,256  ,127      ,192  ,23   ,7.164       ,9.711         ,0.738                     
0    ,4    ,256  ,127      ,208  ,23   ,6.934       ,10.41         ,0.666                     
0    ,4    ,256  ,127      ,240  ,23   ,6.931       ,10.384        ,0.667                     
0    ,4    ,256  ,127      ,256  ,23   ,6.969       ,10.847        ,0.642                     
0    ,4    ,256  ,127      ,288  ,23   ,7.26        ,10.342        ,0.702                     
0    ,4    ,256  ,127      ,48   ,23   ,8.714       ,11.77         ,0.74                      
0    ,4    ,256  ,127      ,64   ,23   ,8.201       ,10.318        ,0.795                     
0    ,4    ,256  ,127      ,96   ,23   ,8.216       ,11.089        ,0.741                     
0    ,4    ,26   ,127      ,25   ,23   ,3.162       ,3.391         ,0.932                     
0    ,4    ,27   ,127      ,26   ,23   ,3.153       ,3.483         ,0.905                     
0    ,4    ,272  ,127      ,128  ,23   ,7.51        ,10.377        ,0.724                     
0    ,4    ,272  ,127      ,240  ,23   ,6.987       ,11.404        ,0.613                     
0    ,4    ,272  ,127      ,256  ,23   ,8.364       ,10.109        ,0.827                     
0    ,4    ,272  ,127      ,32   ,23   ,8.669       ,11.119        ,0.78                      
0    ,4    ,272  ,127      ,512  ,23   ,8.073       ,10.11         ,0.798                     
0    ,4    ,28   ,127      ,27   ,23   ,3.153       ,3.491         ,0.903                     
0    ,4    ,288  ,127      ,272  ,23   ,7.898       ,10.284        ,0.768                     
0    ,4    ,29   ,127      ,28   ,23   ,3.16        ,3.428         ,0.922                     
0    ,4    ,3    ,127      ,2    ,23   ,3.173       ,3.265         ,0.972                     
0    ,4    ,30   ,127      ,29   ,23   ,3.152       ,3.444         ,0.915                     
0    ,4    ,304  ,127      ,16   ,23   ,8.734       ,11.574        ,0.755                     
0    ,4    ,304  ,127      ,256  ,23   ,7.853       ,10.389        ,0.756                     
0    ,4    ,304  ,127      ,64   ,23   ,8.111       ,9.896         ,0.82                      
0    ,4    ,31   ,127      ,30   ,23   ,3.16        ,3.48          ,0.908                     
0    ,4    ,32   ,127      ,0    ,23   ,3.154       ,3.714         ,0.849                     
0    ,4    ,32   ,127      ,128  ,23   ,3.812       ,3.973         ,0.959                     
0    ,4    ,32   ,127      ,144  ,23   ,3.793       ,3.775         ,1.005                     
0    ,4    ,32   ,127      ,16   ,23   ,3.16        ,3.456         ,0.915                     
0    ,4    ,32   ,127      ,192  ,23   ,3.775       ,3.735         ,1.01                      
0    ,4    ,32   ,127      ,240  ,23   ,3.792       ,3.754         ,1.01                      
0    ,4    ,32   ,127      ,288  ,23   ,3.813       ,4.472         ,0.853                     
0    ,4    ,32   ,127      ,31   ,23   ,3.145       ,3.288         ,0.956                     
0    ,4    ,32   ,127      ,32   ,23   ,3.161       ,3.285         ,0.962                     
0    ,4    ,32   ,127      ,48   ,23   ,3.144       ,3.245         ,0.969                     
0    ,4    ,32   ,127      ,96   ,23   ,3.153       ,3.376         ,0.934                     
0    ,4    ,320  ,127      ,128  ,23   ,7.404       ,11.083        ,0.668                     
0    ,4    ,320  ,127      ,192  ,23   ,7.372       ,11.067        ,0.666                     
0    ,4    ,320  ,127      ,32   ,23   ,8.683       ,12.286        ,0.707                     
0    ,4    ,320  ,127      ,512  ,23   ,8.9         ,11.483        ,0.775                     
0    ,4    ,352  ,127      ,256  ,23   ,8.961       ,11.195        ,0.8                       
0    ,4    ,352  ,127      ,64   ,23   ,8.125       ,11.297        ,0.719                     
0    ,4    ,368  ,127      ,128  ,23   ,7.463       ,11.081        ,0.674                     
0    ,4    ,368  ,127      ,144  ,23   ,7.645       ,11.844        ,0.645                     
0    ,4    ,368  ,127      ,512  ,23   ,8.956       ,11.084        ,0.808                     
0    ,4    ,4    ,127      ,3    ,23   ,3.156       ,3.322         ,0.95                      
0    ,4    ,400  ,127      ,256  ,23   ,9.632       ,13.603        ,0.708                     
0    ,4    ,416  ,127      ,128  ,23   ,9.081       ,12.395        ,0.733                     
0    ,4    ,416  ,127      ,512  ,23   ,10.163      ,13.657        ,0.744                     
0    ,4    ,416  ,127      ,96   ,23   ,9.499       ,12.73         ,0.746                     
0    ,4    ,448  ,127      ,256  ,23   ,9.794       ,14.234        ,0.688                     
0    ,4    ,464  ,127      ,48   ,23   ,10.019      ,14.79         ,0.677                     
0    ,4    ,464  ,127      ,512  ,23   ,10.971      ,14.271        ,0.769                     
0    ,4    ,48   ,127      ,32   ,23   ,3.161       ,3.333         ,0.948                     
0    ,4    ,496  ,127      ,256  ,23   ,9.582       ,14.229        ,0.673                     
0    ,4    ,5    ,127      ,4    ,23   ,3.177       ,3.312         ,0.959                     
0    ,4    ,512  ,127      ,0    ,23   ,11.964      ,15.131        ,0.791                     
0    ,4    ,512  ,127      ,144  ,23   ,10.739      ,13.798        ,0.778                     
0    ,4    ,512  ,127      ,192  ,23   ,10.087      ,14.271        ,0.707                     
0    ,4    ,512  ,127      ,224  ,23   ,9.986       ,15.099        ,0.661                     
0    ,4    ,512  ,127      ,240  ,23   ,10.143      ,14.624        ,0.694                     
0    ,4    ,512  ,127      ,272  ,23   ,11.388      ,15.172        ,0.751                     
0    ,4    ,512  ,127      ,288  ,23   ,11.387      ,16.032        ,0.71                      
0    ,4    ,512  ,127      ,320  ,23   ,10.736      ,15.141        ,0.709                     
0    ,4    ,512  ,127      ,368  ,23   ,10.783      ,16.038        ,0.672                     
0    ,4    ,512  ,127      ,416  ,23   ,12.213      ,15.788        ,0.774                     
0    ,4    ,512  ,127      ,464  ,23   ,12.305      ,16.43         ,0.749                     
0    ,4    ,512  ,127      ,48   ,23   ,12.029      ,15.382        ,0.782                     
0    ,4    ,512  ,127      ,512  ,23   ,12.13       ,17.72         ,0.685                     
0    ,4    ,512  ,127      ,96   ,23   ,11.375      ,14.748        ,0.771                     
0    ,4    ,544  ,127      ,256  ,23   ,11.385      ,16.65         ,0.684                     
0    ,4    ,560  ,127      ,512  ,23   ,12.883      ,17.571        ,0.733                     
0    ,4    ,6    ,127      ,5    ,23   ,3.169       ,3.314         ,0.956                     
0    ,4    ,608  ,127      ,512  ,23   ,12.714      ,17.098        ,0.744                     
0    ,4    ,64   ,127      ,0    ,23   ,4.58        ,5.489         ,0.834                     
0    ,4    ,64   ,127      ,144  ,23   ,4.425       ,6.144         ,0.72                      
0    ,4    ,64   ,127      ,16   ,23   ,4.646       ,5.062         ,0.918                     
0    ,4    ,64   ,127      ,192  ,23   ,4.534       ,5.142         ,0.882                     
0    ,4    ,64   ,127      ,240  ,23   ,4.402       ,5.264         ,0.836                     
0    ,4    ,64   ,127      ,256  ,23   ,5.035       ,4.959         ,1.015                     
0    ,4    ,64   ,127      ,288  ,23   ,5.154       ,5.17          ,0.997                     
0    ,4    ,64   ,127      ,48   ,23   ,4.474       ,5.026         ,0.89                      
0    ,4    ,64   ,127      ,64   ,23   ,4.547       ,4.981         ,0.913                     
0    ,4    ,64   ,127      ,96   ,23   ,4.403       ,5.721         ,0.77                      
0    ,4    ,656  ,127      ,512  ,23   ,14.344      ,20.61         ,0.696                     
0    ,4    ,7    ,127      ,6    ,23   ,3.161       ,3.203         ,0.987                     
0    ,4    ,704  ,127      ,512  ,23   ,14.396      ,21.377        ,0.673                     
0    ,4    ,736  ,127      ,1024 ,23   ,13.777      ,17.84         ,0.772                     
0    ,4    ,736  ,127      ,288  ,23   ,13.273      ,17.739        ,0.748                     
0    ,4    ,752  ,127      ,512  ,23   ,14.422      ,19.826        ,0.727                     
0    ,4    ,784  ,127      ,1024 ,23   ,15.137      ,18.974        ,0.798                     
0    ,4    ,784  ,127      ,240  ,23   ,13.579      ,18.497        ,0.734                     
0    ,4    ,8    ,127      ,7    ,23   ,3.161       ,3.391         ,0.932                     
0    ,4    ,80   ,127      ,128  ,23   ,5.101       ,4.346         ,1.174                     
0    ,4    ,80   ,127      ,32   ,23   ,4.404       ,5.007         ,0.88                      
0    ,4    ,80   ,127      ,48   ,23   ,4.644       ,5.06          ,0.918                     
0    ,4    ,80   ,127      ,64   ,23   ,5.102       ,4.101         ,1.244                     
0    ,4    ,800  ,127      ,512  ,23   ,16.352      ,22.201        ,0.737                     
0    ,4    ,832  ,127      ,1024 ,23   ,15.508      ,20.118        ,0.771                     
0    ,4    ,832  ,127      ,192  ,23   ,13.498      ,18.374        ,0.735                     
0    ,4    ,880  ,127      ,1024 ,23   ,15.342      ,20.264        ,0.757                     
0    ,4    ,880  ,127      ,144  ,23   ,14.276      ,18.818        ,0.759                     
0    ,4    ,9    ,127      ,8    ,23   ,3.169       ,3.255         ,0.973                     
0    ,4    ,928  ,127      ,1024 ,23   ,17.968      ,22.487        ,0.799                     
0    ,4    ,928  ,127      ,96   ,23   ,16.562      ,20.072        ,0.825                     
0    ,4    ,96   ,127      ,80   ,23   ,5.082       ,3.996         ,1.272                     
0    ,4    ,976  ,127      ,1024 ,23   ,17.833      ,23.204        ,0.769                     
0    ,4    ,976  ,127      ,48   ,23   ,17.332      ,21.29         ,0.814                     
0    ,64   ,1    ,127      ,0    ,23   ,3.146       ,3.184         ,0.988                     
0    ,64   ,10   ,127      ,9    ,23   ,3.145       ,3.44          ,0.914                     
0    ,64   ,1024 ,127      ,0    ,23   ,20.039      ,22.143        ,0.905                     
0    ,64   ,1024 ,127      ,1024 ,23   ,22.081      ,29.115        ,0.758                     
0    ,64   ,1024 ,127      ,144  ,23   ,17.639      ,21.645        ,0.815                     
0    ,64   ,1024 ,127      ,192  ,23   ,16.92       ,21.35         ,0.793                     
0    ,64   ,1024 ,127      ,240  ,23   ,17.088      ,22.019        ,0.776                     
0    ,64   ,1024 ,127      ,288  ,23   ,18.252      ,23.549        ,0.775                     
0    ,64   ,1024 ,127      ,48   ,23   ,19.372      ,21.377        ,0.906                     
0    ,64   ,1024 ,127      ,736  ,23   ,19.648      ,25.354        ,0.775                     
0    ,64   ,1024 ,127      ,784  ,23   ,20.901      ,27.673        ,0.755                     
0    ,64   ,1024 ,127      ,832  ,23   ,20.144      ,26.621        ,0.757                     
0    ,64   ,1024 ,127      ,880  ,23   ,20.418      ,26.813        ,0.761                     
0    ,64   ,1024 ,127      ,928  ,23   ,21.239      ,28.753        ,0.739                     
0    ,64   ,1024 ,127      ,96   ,23   ,18.924      ,21.453        ,0.882                     
0    ,64   ,1024 ,127      ,976  ,23   ,21.965      ,27.76         ,0.791                     
0    ,64   ,1072 ,127      ,1024 ,23   ,21.249      ,27.779        ,0.765                     
0    ,64   ,11   ,127      ,10   ,23   ,3.152       ,3.437         ,0.917                     
0    ,64   ,112  ,127      ,144  ,23   ,5.037       ,4.655         ,1.082                     
0    ,64   ,112  ,127      ,16   ,23   ,4.644       ,5.125         ,0.906                     
0    ,64   ,112  ,127      ,256  ,23   ,5.056       ,4.575         ,1.105                     
0    ,64   ,112  ,127      ,64   ,23   ,5.032       ,4.125         ,1.22                      
0    ,64   ,112  ,127      ,96   ,23   ,5.104       ,4.12          ,1.239                     
0    ,64   ,1120 ,127      ,1024 ,23   ,22.537      ,29.024        ,0.776                     
0    ,64   ,1168 ,127      ,1024 ,23   ,24.492      ,31.453        ,0.779                     
0    ,64   ,12   ,127      ,11   ,23   ,3.145       ,3.252         ,0.967                     
0    ,64   ,1216 ,127      ,1024 ,23   ,25.027      ,32.609        ,0.767                     
0    ,64   ,1264 ,127      ,1024 ,23   ,23.763      ,31.546        ,0.753                     
0    ,64   ,128  ,127      ,0    ,23   ,5.938       ,8.653         ,0.686                     
0    ,64   ,128  ,127      ,112  ,23   ,5.254       ,7.502         ,0.7                       
0    ,64   ,128  ,127      ,128  ,23   ,5.203       ,7.439         ,0.699                     
0    ,64   ,128  ,127      ,144  ,23   ,5.38        ,7.253         ,0.742                     
0    ,64   ,128  ,127      ,192  ,23   ,5.281       ,7.743         ,0.682                     
0    ,64   ,128  ,127      ,240  ,23   ,5.311       ,8.393         ,0.633                     
0    ,64   ,128  ,127      ,288  ,23   ,5.254       ,7.59          ,0.692                     
0    ,64   ,128  ,127      ,32   ,23   ,5.969       ,8.369         ,0.713                     
0    ,64   ,128  ,127      ,48   ,23   ,5.906       ,8.093         ,0.73                      
0    ,64   ,128  ,127      ,80   ,23   ,5.253       ,7.627         ,0.689                     
0    ,64   ,128  ,127      ,96   ,23   ,5.255       ,7.744         ,0.679                     
0    ,64   ,13   ,127      ,12   ,23   ,3.144       ,3.39          ,0.928                     
0    ,64   ,1312 ,127      ,1024 ,23   ,26.18       ,32.955        ,0.794                     
0    ,64   ,14   ,127      ,13   ,23   ,3.177       ,3.374         ,0.941                     
0    ,64   ,144  ,127      ,128  ,23   ,5.183       ,6.504         ,0.797                     
0    ,64   ,15   ,127      ,14   ,23   ,3.169       ,3.298         ,0.961                     
0    ,64   ,16   ,127      ,0    ,23   ,3.151       ,3.396         ,0.928                     
0    ,64   ,16   ,127      ,144  ,23   ,3.144       ,3.13          ,1.005                     
0    ,64   ,16   ,127      ,15   ,23   ,3.16        ,3.374         ,0.937                     
0    ,64   ,16   ,127      ,16   ,23   ,3.15        ,3.225         ,0.977                     
0    ,64   ,16   ,127      ,192  ,23   ,3.161       ,3.366         ,0.939                     
0    ,64   ,16   ,127      ,240  ,23   ,3.162       ,3.324         ,0.951                     
0    ,64   ,16   ,127      ,256  ,23   ,3.177       ,3.456         ,0.919                     
0    ,64   ,16   ,127      ,288  ,23   ,3.177       ,5.36          ,0.593                     
0    ,64   ,16   ,127      ,48   ,23   ,3.161       ,3.367         ,0.939                     
0    ,64   ,16   ,127      ,64   ,23   ,3.177       ,3.365         ,0.944                     
0    ,64   ,16   ,127      ,96   ,23   ,3.16        ,3.366         ,0.939                     
0    ,64   ,160  ,127      ,144  ,23   ,5.19        ,6.626         ,0.783                     
0    ,64   ,160  ,127      ,16   ,23   ,6.034       ,8.181         ,0.738                     
0    ,64   ,160  ,127      ,256  ,23   ,5.192       ,6.469         ,0.803                     
0    ,64   ,160  ,127      ,64   ,23   ,5.285       ,7.661         ,0.69                      
0    ,64   ,160  ,127      ,96   ,23   ,5.4         ,7.961         ,0.678                     
0    ,64   ,17   ,127      ,16   ,23   ,3.145       ,3.269         ,0.962                     
0    ,64   ,176  ,127      ,128  ,23   ,5.157       ,6.404         ,0.805                     
0    ,64   ,176  ,127      ,160  ,23   ,5.17        ,6.074         ,0.851                     
0    ,64   ,176  ,127      ,32   ,23   ,5.752       ,8.549         ,0.673                     
0    ,64   ,1760 ,127      ,2048 ,23   ,32.844      ,42.674        ,0.77                      
0    ,64   ,1760 ,127      ,288  ,23   ,26.748      ,31.352        ,0.853                     
0    ,64   ,18   ,127      ,17   ,23   ,3.145       ,3.246         ,0.969                     
0    ,64   ,1808 ,127      ,2048 ,23   ,34.984      ,52.589        ,0.665                     
0    ,64   ,1808 ,127      ,240  ,23   ,27.354      ,31.114        ,0.879                     
0    ,64   ,1856 ,127      ,192  ,23   ,26.956      ,31.912        ,0.845                     
0    ,64   ,1856 ,127      ,2048 ,23   ,34.706      ,49.761        ,0.697                     
0    ,64   ,19   ,127      ,18   ,23   ,3.16        ,3.305         ,0.956                     
0    ,64   ,1904 ,127      ,144  ,23   ,27.98       ,32.474        ,0.862                     
0    ,64   ,1904 ,127      ,2048 ,23   ,35.375      ,47.52         ,0.744                     
0    ,64   ,192  ,127      ,176  ,23   ,5.522       ,7.308         ,0.756                     
0    ,64   ,1952 ,127      ,2048 ,23   ,37.207      ,55.957        ,0.665                     
0    ,64   ,1952 ,127      ,96   ,23   ,30.383      ,33.955        ,0.895                     
0    ,64   ,2    ,127      ,1    ,23   ,3.145       ,3.449         ,0.912                     
0    ,64   ,20   ,127      ,19   ,23   ,3.186       ,3.288         ,0.969                     
0    ,64   ,2000 ,127      ,2048 ,23   ,37.863      ,50.048        ,0.757                     
0    ,64   ,2000 ,127      ,48   ,23   ,31.004      ,33.297        ,0.931                     
0    ,64   ,2048 ,127      ,0    ,23   ,33.109      ,35.365        ,0.936                     
0    ,64   ,2048 ,127      ,1024 ,23   ,36.863      ,46.556        ,0.792                     
0    ,64   ,2048 ,127      ,128  ,23   ,31.899      ,35.557        ,0.897                     
0    ,64   ,2048 ,127      ,144  ,23   ,32.337      ,35.749        ,0.905                     
0    ,64   ,2048 ,127      ,1760 ,23   ,38.362      ,49.274        ,0.779                     
0    ,64   ,2048 ,127      ,1808 ,23   ,47.114      ,51.886        ,0.908                     
0    ,64   ,2048 ,127      ,1856 ,23   ,42.602      ,50.278        ,0.847                     
0    ,64   ,2048 ,127      ,1904 ,23   ,55.644      ,54.311        ,1.025                     
0    ,64   ,2048 ,127      ,192  ,23   ,31.061      ,35.115        ,0.885                     
0    ,64   ,2048 ,127      ,1952 ,23   ,53.408      ,53.41         ,1.0                       
0    ,64   ,2048 ,127      ,2000 ,23   ,46.835      ,61.275        ,0.764                     
0    ,64   ,2048 ,127      ,2048 ,23   ,48.348      ,61.244        ,0.789                     
0    ,64   ,2048 ,127      ,240  ,23   ,31.128      ,35.186        ,0.885                     
0    ,64   ,2048 ,127      ,256  ,23   ,32.639      ,36.766        ,0.888                     
0    ,64   ,2048 ,127      ,288  ,23   ,32.194      ,36.297        ,0.887                     
0    ,64   ,2048 ,127      ,32   ,23   ,33.772      ,36.12         ,0.935                     
0    ,64   ,2048 ,127      ,4096 ,23   ,44.43       ,65.192        ,0.682                     
0    ,64   ,2048 ,127      ,48   ,23   ,32.937      ,34.616        ,0.952                     
0    ,64   ,2048 ,127      ,512  ,23   ,33.896      ,39.432        ,0.86                      
0    ,64   ,2048 ,127      ,64   ,23   ,32.893      ,35.49         ,0.927                     
0    ,64   ,2048 ,127      ,96   ,23   ,32.17       ,35.893        ,0.896                     
0    ,64   ,208  ,127      ,16   ,23   ,6.976       ,9.14          ,0.763                     
0    ,64   ,208  ,127      ,192  ,23   ,5.517       ,6.905         ,0.799                     
0    ,64   ,208  ,127      ,256  ,23   ,5.435       ,7.378         ,0.737                     
0    ,64   ,208  ,127      ,48   ,23   ,6.491       ,9.187         ,0.707                     
0    ,64   ,208  ,127      ,64   ,23   ,6.565       ,7.989         ,0.822                     
0    ,64   ,2096 ,127      ,2048 ,23   ,39.554      ,55.173        ,0.717                     
0    ,64   ,21   ,127      ,20   ,23   ,3.161       ,3.255         ,0.971                     
0    ,64   ,2144 ,127      ,2048 ,23   ,56.928      ,58.526        ,0.973                     
0    ,64   ,2192 ,127      ,2048 ,23   ,58.545      ,55.186        ,1.061                     
0    ,64   ,22   ,127      ,21   ,23   ,3.161       ,3.203         ,0.987                     
0    ,64   ,224  ,127      ,128  ,23   ,5.687       ,7.27          ,0.782                     
0    ,64   ,224  ,127      ,208  ,23   ,5.21        ,7.013         ,0.743                     
0    ,64   ,224  ,127      ,288  ,23   ,5.05        ,8.043         ,0.628                     
0    ,64   ,224  ,127      ,32   ,23   ,6.655       ,9.114         ,0.73                      
0    ,64   ,224  ,127      ,512  ,23   ,5.129       ,7.491         ,0.685                     
0    ,64   ,2240 ,127      ,2048 ,23   ,58.563      ,55.918        ,1.047                     
0    ,64   ,2288 ,127      ,2048 ,23   ,58.429      ,55.175        ,1.059                     
0    ,64   ,23   ,127      ,22   ,23   ,3.144       ,3.262         ,0.964                     
0    ,64   ,2336 ,127      ,2048 ,23   ,67.585      ,57.044        ,1.185                     
0    ,64   ,24   ,127      ,23   ,23   ,3.159       ,3.309         ,0.955                     
0    ,64   ,240  ,127      ,224  ,23   ,5.236       ,7.518         ,0.696                     
0    ,64   ,25   ,127      ,24   ,23   ,3.16        ,3.499         ,0.903                     
0    ,64   ,256  ,127      ,0    ,23   ,8.762       ,11.848        ,0.739                     
0    ,64   ,256  ,127      ,112  ,23   ,8.157       ,10.941        ,0.746                     
0    ,64   ,256  ,127      ,144  ,23   ,7.565       ,10.157        ,0.745                     
0    ,64   ,256  ,127      ,16   ,23   ,9.48        ,12.01         ,0.789                     
0    ,64   ,256  ,127      ,160  ,23   ,7.404       ,9.94          ,0.745                     
0    ,64   ,256  ,127      ,192  ,23   ,7.011       ,10.15         ,0.691                     
0    ,64   ,256  ,127      ,208  ,23   ,6.977       ,10.371        ,0.673                     
0    ,64   ,256  ,127      ,240  ,23   ,7.2         ,10.791        ,0.667                     
0    ,64   ,256  ,127      ,256  ,23   ,6.949       ,10.915        ,0.637                     
0    ,64   ,256  ,127      ,288  ,23   ,7.02        ,10.324        ,0.68                      
0    ,64   ,256  ,127      ,48   ,23   ,8.71        ,12.04         ,0.723                     
0    ,64   ,256  ,127      ,64   ,23   ,8.111       ,10.848        ,0.748                     
0    ,64   ,256  ,127      ,96   ,23   ,8.318       ,10.306        ,0.807                     
0    ,64   ,26   ,127      ,25   ,23   ,3.153       ,3.394         ,0.929                     
0    ,64   ,27   ,127      ,26   ,23   ,3.161       ,3.349         ,0.944                     
0    ,64   ,272  ,127      ,128  ,23   ,7.589       ,9.763         ,0.777                     
0    ,64   ,272  ,127      ,240  ,23   ,6.828       ,11.202        ,0.61                      
0    ,64   ,272  ,127      ,256  ,23   ,8.03        ,10.276        ,0.781                     
0    ,64   ,272  ,127      ,32   ,23   ,8.762       ,11.419        ,0.767                     
0    ,64   ,272  ,127      ,512  ,23   ,7.813       ,10.083        ,0.775                     
0    ,64   ,28   ,127      ,27   ,23   ,3.161       ,3.5           ,0.903                     
0    ,64   ,288  ,127      ,272  ,23   ,8.07        ,10.086        ,0.8                       
0    ,64   ,29   ,127      ,28   ,23   ,3.161       ,3.347         ,0.944                     
0    ,64   ,3    ,127      ,2    ,23   ,3.178       ,3.302         ,0.962                     
0    ,64   ,30   ,127      ,29   ,23   ,3.177       ,3.604         ,0.881                     
0    ,64   ,304  ,127      ,16   ,23   ,8.735       ,11.45         ,0.763                     
0    ,64   ,304  ,127      ,256  ,23   ,8.095       ,10.279        ,0.788                     
0    ,64   ,304  ,127      ,64   ,23   ,8.173       ,10.129        ,0.807                     
0    ,64   ,31   ,127      ,30   ,23   ,3.161       ,3.383         ,0.934                     
0    ,64   ,32   ,127      ,0    ,23   ,3.161       ,3.328         ,0.95                      
0    ,64   ,32   ,127      ,128  ,23   ,3.145       ,3.365         ,0.935                     
0    ,64   ,32   ,127      ,144  ,23   ,3.161       ,3.246         ,0.974                     
0    ,64   ,32   ,127      ,16   ,23   ,3.145       ,3.383         ,0.93                      
0    ,64   ,32   ,127      ,192  ,23   ,3.161       ,3.145         ,1.005                     
0    ,64   ,32   ,127      ,240  ,23   ,3.145       ,3.262         ,0.964                     
0    ,64   ,32   ,127      ,288  ,23   ,3.161       ,3.35          ,0.944                     
0    ,64   ,32   ,127      ,31   ,23   ,3.165       ,3.305         ,0.957                     
0    ,64   ,32   ,127      ,32   ,23   ,3.166       ,3.163         ,1.001                     
0    ,64   ,32   ,127      ,48   ,23   ,3.145       ,3.473         ,0.906                     
0    ,64   ,32   ,127      ,96   ,23   ,3.153       ,3.34          ,0.944                     
0    ,64   ,320  ,127      ,128  ,23   ,7.522       ,10.625        ,0.708                     
0    ,64   ,320  ,127      ,192  ,23   ,6.864       ,11.256        ,0.61                      
0    ,64   ,320  ,127      ,32   ,23   ,8.782       ,12.663        ,0.694                     
0    ,64   ,320  ,127      ,512  ,23   ,8.992       ,11.797        ,0.762                     
0    ,64   ,352  ,127      ,256  ,23   ,8.935       ,11.31         ,0.79                      
0    ,64   ,352  ,127      ,64   ,23   ,8.15        ,12.042        ,0.677                     
0    ,64   ,368  ,127      ,128  ,23   ,7.513       ,11.204        ,0.671                     
0    ,64   ,368  ,127      ,144  ,23   ,7.674       ,11.428        ,0.671                     
0    ,64   ,368  ,127      ,512  ,23   ,8.317       ,11.54         ,0.721                     
0    ,64   ,4    ,127      ,3    ,23   ,3.168       ,3.357         ,0.944                     
0    ,64   ,400  ,127      ,256  ,23   ,11.646      ,15.042        ,0.774                     
0    ,64   ,416  ,127      ,128  ,23   ,10.087      ,12.554        ,0.803                     
0    ,64   ,416  ,127      ,512  ,23   ,10.185      ,13.658        ,0.746                     
0    ,64   ,416  ,127      ,96   ,23   ,9.486       ,12.415        ,0.764                     
0    ,64   ,448  ,127      ,256  ,23   ,12.026      ,14.413        ,0.834                     
0    ,64   ,464  ,127      ,48   ,23   ,10.167      ,13.469        ,0.755                     
0    ,64   ,464  ,127      ,512  ,23   ,10.272      ,14.88         ,0.69                      
0    ,64   ,48   ,127      ,32   ,23   ,3.144       ,3.433         ,0.916                     
0    ,64   ,496  ,127      ,256  ,23   ,9.631       ,13.89         ,0.693                     
0    ,64   ,5    ,127      ,4    ,23   ,3.161       ,3.341         ,0.946                     
0    ,64   ,512  ,127      ,0    ,23   ,12.66       ,15.548        ,0.814                     
0    ,64   ,512  ,127      ,144  ,23   ,10.741      ,13.792        ,0.779                     
0    ,64   ,512  ,127      ,192  ,23   ,9.987       ,16.142        ,0.619                     
0    ,64   ,512  ,127      ,224  ,23   ,10.083      ,14.313        ,0.704                     
0    ,64   ,512  ,127      ,240  ,23   ,9.984       ,14.853        ,0.672                     
0    ,64   ,512  ,127      ,272  ,23   ,11.329      ,16.623        ,0.682                     
0    ,64   ,512  ,127      ,288  ,23   ,11.385      ,15.399        ,0.739                     
0    ,64   ,512  ,127      ,320  ,23   ,10.655      ,15.312        ,0.696                     
0    ,64   ,512  ,127      ,368  ,23   ,10.903      ,19.033        ,0.573                     
0    ,64   ,512  ,127      ,416  ,23   ,11.917      ,16.12         ,0.739                     
0    ,64   ,512  ,127      ,464  ,23   ,13.572      ,16.613        ,0.817                     
0    ,64   ,512  ,127      ,48   ,23   ,12.607      ,15.017        ,0.84                      
0    ,64   ,512  ,127      ,512  ,23   ,13.403      ,16.757        ,0.8                       
0    ,64   ,512  ,127      ,96   ,23   ,11.976      ,13.067        ,0.917                     
0    ,64   ,544  ,127      ,256  ,23   ,12.365      ,15.798        ,0.783                     
0    ,64   ,560  ,127      ,512  ,23   ,12.314      ,16.035        ,0.768                     
0    ,64   ,6    ,127      ,5    ,23   ,3.153       ,3.264         ,0.966                     
0    ,64   ,608  ,127      ,512  ,23   ,12.215      ,16.937        ,0.721                     
0    ,64   ,64   ,127      ,0    ,23   ,4.706       ,5.479         ,0.859                     
0    ,64   ,64   ,127      ,144  ,23   ,4.425       ,5.032         ,0.879                     
0    ,64   ,64   ,127      ,16   ,23   ,4.534       ,5.08          ,0.893                     
0    ,64   ,64   ,127      ,192  ,23   ,4.404       ,5.307         ,0.83                      
0    ,64   ,64   ,127      ,240  ,23   ,4.424       ,5.586         ,0.792                     
0    ,64   ,64   ,127      ,256  ,23   ,4.523       ,4.998         ,0.905                     
0    ,64   ,64   ,127      ,288  ,23   ,4.402       ,5.587         ,0.788                     
0    ,64   ,64   ,127      ,48   ,23   ,4.404       ,5.033         ,0.875                     
0    ,64   ,64   ,127      ,64   ,23   ,4.402       ,5.03          ,0.875                     
0    ,64   ,64   ,127      ,96   ,23   ,4.692       ,5.006         ,0.937                     
0    ,64   ,656  ,127      ,512  ,23   ,15.147      ,20.982        ,0.722                     
0    ,64   ,7    ,127      ,6    ,23   ,3.161       ,3.322         ,0.951                     
0    ,64   ,704  ,127      ,512  ,23   ,16.274      ,21.22         ,0.767                     
0    ,64   ,736  ,127      ,1024 ,23   ,14.662      ,20.05         ,0.731                     
0    ,64   ,736  ,127      ,288  ,23   ,13.172      ,18.485        ,0.713                     
0    ,64   ,752  ,127      ,512  ,23   ,14.427      ,20.594        ,0.701                     
0    ,64   ,784  ,127      ,1024 ,23   ,16.441      ,22.664        ,0.725                     
0    ,64   ,784  ,127      ,240  ,23   ,13.402      ,17.85         ,0.751                     
0    ,64   ,8    ,127      ,7    ,23   ,3.161       ,3.482         ,0.908                     
0    ,64   ,80   ,127      ,128  ,23   ,5.032       ,4.142         ,1.215                     
0    ,64   ,80   ,127      ,32   ,23   ,4.522       ,4.981         ,0.908                     
0    ,64   ,80   ,127      ,48   ,23   ,4.644       ,5.048         ,0.92                      
0    ,64   ,80   ,127      ,64   ,23   ,5.035       ,4.293         ,1.173                     
0    ,64   ,800  ,127      ,512  ,23   ,17.382      ,24.301        ,0.715                     
0    ,64   ,832  ,127      ,1024 ,23   ,17.625      ,22.821        ,0.772                     
0    ,64   ,832  ,127      ,192  ,23   ,13.428      ,18.869        ,0.712                     
0    ,64   ,880  ,127      ,1024 ,23   ,17.879      ,23.056        ,0.775                     
0    ,64   ,880  ,127      ,144  ,23   ,14.114      ,18.776        ,0.752                     
0    ,64   ,9    ,127      ,8    ,23   ,3.168       ,3.321         ,0.954                     
0    ,64   ,928  ,127      ,1024 ,23   ,18.496      ,24.652        ,0.75                      
0    ,64   ,928  ,127      ,96   ,23   ,16.785      ,20.217        ,0.83                      
0    ,64   ,96   ,127      ,80   ,23   ,5.058       ,4.14          ,1.222                     
0    ,64   ,976  ,127      ,1024 ,23   ,19.09       ,25.806        ,0.74                      
0    ,64   ,976  ,127      ,48   ,23   ,17.251      ,20.872        ,0.827                     
1    ,1    ,2048 ,127      ,32   ,0    ,3.097       ,3.169         ,0.977                     
1    ,1    ,2048 ,127      ,32   ,23   ,33.081      ,35.633        ,0.928                     
1    ,1    ,256  ,127      ,64   ,0    ,5.057       ,4.086         ,1.238                     
1    ,1    ,256  ,127      ,64   ,23   ,7.95        ,11.379        ,0.699                     
1    ,16   ,2048 ,127      ,32   ,23   ,33.403      ,35.357        ,0.945                     
1    ,16   ,256  ,127      ,64   ,23   ,9.528       ,11.335        ,0.841                     
1    ,256  ,2048 ,127      ,32   ,23   ,32.954      ,34.96         ,0.943                     
1    ,256  ,256  ,127      ,64   ,23   ,9.146       ,10.794        ,0.847                     
1    ,4    ,2048 ,127      ,32   ,23   ,33.177      ,35.606        ,0.932                     
1    ,4    ,256  ,127      ,64   ,23   ,7.868       ,11.304        ,0.696                     
1    ,64   ,2048 ,127      ,32   ,23   ,33.723      ,35.765        ,0.943                     
1    ,64   ,256  ,127      ,64   ,23   ,9.038       ,11.406        ,0.792                     
105  ,1    ,256  ,127      ,64   ,0    ,5.634       ,3.956         ,1.424                     
105  ,1    ,256  ,127      ,64   ,23   ,8.222       ,11.162        ,0.737                     
105  ,16   ,256  ,127      ,64   ,23   ,8.202       ,11.547        ,0.71                      
105  ,256  ,256  ,127      ,64   ,23   ,10.737      ,11.705        ,0.917                     
105  ,4    ,256  ,127      ,64   ,23   ,8.221       ,11.32         ,0.726                     
105  ,64   ,256  ,127      ,64   ,23   ,8.221       ,11.637        ,0.706                     
15   ,1    ,256  ,127      ,64   ,0    ,5.098       ,3.704         ,1.376                     
15   ,1    ,256  ,127      ,64   ,23   ,7.842       ,11.545        ,0.679                     
15   ,16   ,256  ,127      ,64   ,23   ,7.974       ,11.892        ,0.671                     
15   ,256  ,256  ,127      ,64   ,23   ,9.086       ,11.35         ,0.801                     
15   ,4    ,256  ,127      ,64   ,23   ,7.946       ,11.221        ,0.708                     
15   ,64   ,256  ,127      ,64   ,23   ,9.209       ,11.459        ,0.804                     
2    ,1    ,2048 ,127      ,64   ,0    ,4.814       ,5.635         ,0.854                     
2    ,1    ,2048 ,127      ,64   ,23   ,32.353      ,35.062        ,0.923                     
2    ,1    ,256  ,127      ,64   ,0    ,4.65        ,3.772         ,1.233                     
2    ,1    ,256  ,127      ,64   ,23   ,7.992       ,11.405        ,0.701                     
2    ,16   ,2048 ,127      ,64   ,23   ,32.676      ,36.217        ,0.902                     
2    ,16   ,256  ,127      ,64   ,23   ,8.101       ,11.102        ,0.73                      
2    ,256  ,2048 ,127      ,64   ,23   ,32.515      ,34.674        ,0.938                     
2    ,256  ,256  ,127      ,64   ,23   ,9.231       ,11.704        ,0.789                     
2    ,4    ,2048 ,127      ,64   ,23   ,32.761      ,35.128        ,0.933                     
2    ,4    ,256  ,127      ,64   ,23   ,7.913       ,11.403        ,0.694                     
2    ,64   ,2048 ,127      ,64   ,23   ,32.715      ,35.393        ,0.924                     
2    ,64   ,256  ,127      ,64   ,23   ,9.352       ,11.399        ,0.82                      
3    ,1    ,2048 ,127      ,128  ,0    ,4.762       ,7.398         ,0.644                     
3    ,1    ,2048 ,127      ,128  ,23   ,32.033      ,36.078        ,0.888                     
3    ,1    ,256  ,127      ,64   ,0    ,4.802       ,3.903         ,1.23                      
3    ,1    ,256  ,127      ,64   ,23   ,8.052       ,12.555        ,0.641                     
3    ,16   ,2048 ,127      ,128  ,23   ,32.283      ,35.991        ,0.897                     
3    ,16   ,256  ,127      ,64   ,23   ,8.082       ,11.55         ,0.7                       
3    ,256  ,2048 ,127      ,128  ,23   ,31.861      ,35.979        ,0.886                     
3    ,256  ,256  ,127      ,64   ,23   ,9.626       ,11.665        ,0.825                     
3    ,4    ,2048 ,127      ,128  ,23   ,31.805      ,36.182        ,0.879                     
3    ,4    ,256  ,127      ,64   ,23   ,8.113       ,11.608        ,0.699                     
3    ,64   ,2048 ,127      ,128  ,23   ,31.828      ,36.292        ,0.877                     
3    ,64   ,256  ,127      ,64   ,23   ,9.415       ,11.211        ,0.84                      
30   ,1    ,256  ,127      ,64   ,0    ,4.805       ,4.243         ,1.133                     
30   ,1    ,256  ,127      ,64   ,23   ,8.108       ,12.058        ,0.672                     
30   ,16   ,256  ,127      ,64   ,23   ,8.033       ,11.376        ,0.706                     
30   ,256  ,256  ,127      ,64   ,23   ,9.271       ,11.242        ,0.825                     
30   ,4    ,256  ,127      ,64   ,23   ,7.963       ,11.373        ,0.7                       
30   ,64   ,256  ,127      ,64   ,23   ,9.405       ,11.978        ,0.785                     
4    ,1    ,2048 ,127      ,256  ,0    ,8.184       ,9.293         ,0.881                     
4    ,1    ,2048 ,127      ,256  ,23   ,32.519      ,36.906        ,0.881                     
4    ,1    ,256  ,127      ,64   ,0    ,4.612       ,3.916         ,1.178                     
4    ,1    ,256  ,127      ,64   ,23   ,8.086       ,11.768        ,0.687                     
4    ,16   ,2048 ,127      ,256  ,23   ,32.358      ,36.867        ,0.878                     
4    ,16   ,256  ,127      ,64   ,23   ,8.252       ,11.421        ,0.723                     
4    ,256  ,2048 ,127      ,256  ,23   ,32.511      ,36.341        ,0.895                     
4    ,256  ,256  ,127      ,64   ,23   ,9.471       ,11.957        ,0.792                     
4    ,4    ,2048 ,127      ,256  ,23   ,32.482      ,36.985        ,0.878                     
4    ,4    ,256  ,127      ,64   ,23   ,8.109       ,11.491        ,0.706                     
4    ,64   ,2048 ,127      ,256  ,23   ,32.575      ,35.95         ,0.906                     
4    ,64   ,256  ,127      ,64   ,23   ,9.371       ,11.319        ,0.828                     
4080 ,1    ,31   ,127      ,30   ,0    ,5.576       ,4.368         ,1.276                     
4080 ,1    ,31   ,127      ,30   ,23   ,5.659       ,4.474         ,1.265                     
4080 ,1    ,32   ,127      ,31   ,0    ,5.663       ,4.347         ,1.303                     
4080 ,1    ,32   ,127      ,31   ,23   ,5.632       ,4.474         ,1.259                     
4080 ,16   ,31   ,127      ,30   ,23   ,5.687       ,4.606         ,1.235                     
4080 ,16   ,32   ,127      ,31   ,23   ,6.282       ,4.5           ,1.396                     
4080 ,256  ,31   ,127      ,30   ,23   ,5.658       ,4.477         ,1.264                     
4080 ,256  ,32   ,127      ,31   ,23   ,5.701       ,4.393         ,1.298                     
4080 ,4    ,31   ,127      ,30   ,23   ,5.658       ,4.476         ,1.264                     
4080 ,4    ,32   ,127      ,31   ,23   ,5.659       ,4.336         ,1.305                     
4080 ,64   ,31   ,127      ,30   ,23   ,5.659       ,4.622         ,1.224                     
4080 ,64   ,32   ,127      ,31   ,23   ,5.688       ,4.5           ,1.264                     
4081 ,1    ,29   ,127      ,28   ,0    ,5.521       ,4.253         ,1.298                     
4081 ,1    ,29   ,127      ,28   ,23   ,5.747       ,4.778         ,1.203                     
4081 ,1    ,30   ,127      ,29   ,0    ,5.493       ,4.55          ,1.207                     
4081 ,1    ,30   ,127      ,29   ,23   ,5.661       ,4.652         ,1.217                     
4081 ,16   ,29   ,127      ,28   ,23   ,5.689       ,4.544         ,1.252                     
4081 ,16   ,30   ,127      ,29   ,23   ,6.183       ,4.74          ,1.304                     
4081 ,256  ,29   ,127      ,28   ,23   ,5.702       ,4.477         ,1.274                     
4081 ,256  ,30   ,127      ,29   ,23   ,5.659       ,4.375         ,1.293                     
4081 ,4    ,29   ,127      ,28   ,23   ,5.687       ,4.77          ,1.192                     
4081 ,4    ,30   ,127      ,29   ,23   ,5.687       ,4.519         ,1.258                     
4081 ,64   ,29   ,127      ,28   ,23   ,5.689       ,4.668         ,1.219                     
4081 ,64   ,30   ,127      ,29   ,23   ,5.717       ,4.647         ,1.23                      
4082 ,1    ,27   ,127      ,26   ,0    ,5.467       ,4.284         ,1.276                     
4082 ,1    ,27   ,127      ,26   ,23   ,5.659       ,4.668         ,1.212                     
4082 ,1    ,28   ,127      ,27   ,0    ,5.493       ,4.366         ,1.258                     
4082 ,1    ,28   ,127      ,27   ,23   ,5.717       ,4.58          ,1.248                     
4082 ,16   ,27   ,127      ,26   ,23   ,5.794       ,4.337         ,1.336                     
4082 ,16   ,28   ,127      ,27   ,23   ,5.66        ,4.476         ,1.264                     
4082 ,256  ,27   ,127      ,26   ,23   ,5.659       ,4.402         ,1.286                     
4082 ,256  ,28   ,127      ,27   ,23   ,5.659       ,4.531         ,1.249                     
4082 ,4    ,27   ,127      ,26   ,23   ,5.69        ,4.381         ,1.299                     
4082 ,4    ,28   ,127      ,27   ,23   ,5.66        ,4.499         ,1.258                     
4082 ,64   ,27   ,127      ,26   ,23   ,5.743       ,4.5           ,1.276                     
4082 ,64   ,28   ,127      ,27   ,23   ,5.66        ,4.668         ,1.213                     
4083 ,1    ,25   ,127      ,24   ,0    ,5.467       ,4.348         ,1.257                     
4083 ,1    ,25   ,127      ,24   ,23   ,5.632       ,4.644         ,1.213                     
4083 ,1    ,26   ,127      ,25   ,0    ,5.441       ,4.369         ,1.245                     
4083 ,1    ,26   ,127      ,25   ,23   ,5.659       ,4.508         ,1.255                     
4083 ,16   ,25   ,127      ,24   ,23   ,5.779       ,4.644         ,1.244                     
4083 ,16   ,26   ,127      ,25   ,23   ,5.631       ,4.545         ,1.239                     
4083 ,256  ,25   ,127      ,24   ,23   ,5.706       ,4.337         ,1.316                     
4083 ,256  ,26   ,127      ,25   ,23   ,5.659       ,4.769         ,1.187                     
4083 ,4    ,25   ,127      ,24   ,23   ,5.701       ,4.667         ,1.222                     
4083 ,4    ,26   ,127      ,25   ,23   ,5.689       ,4.381         ,1.299                     
4083 ,64   ,25   ,127      ,24   ,23   ,5.688       ,4.477         ,1.27                      
4083 ,64   ,26   ,127      ,25   ,23   ,5.702       ,4.49          ,1.27                      
4084 ,1    ,23   ,127      ,22   ,0    ,5.441       ,4.328         ,1.257                     
4084 ,1    ,23   ,127      ,22   ,23   ,5.659       ,4.655         ,1.216                     
4084 ,1    ,24   ,127      ,23   ,0    ,5.584       ,4.327         ,1.291                     
4084 ,1    ,24   ,127      ,23   ,23   ,5.689       ,4.358         ,1.305                     
4084 ,16   ,23   ,127      ,22   ,23   ,5.706       ,4.545         ,1.255                     
4084 ,16   ,24   ,127      ,23   ,23   ,5.687       ,4.545         ,1.251                     
4084 ,256  ,23   ,127      ,22   ,23   ,5.658       ,4.499         ,1.258                     
4084 ,256  ,24   ,127      ,23   ,23   ,5.717       ,4.598         ,1.243                     
4084 ,4    ,23   ,127      ,22   ,23   ,5.659       ,4.746         ,1.192                     
4084 ,4    ,24   ,127      ,23   ,23   ,5.66        ,4.665         ,1.213                     
4084 ,64   ,23   ,127      ,22   ,23   ,5.659       ,4.638         ,1.22                      
4084 ,64   ,24   ,127      ,23   ,23   ,5.687       ,4.897         ,1.161                     
4085 ,1    ,21   ,127      ,20   ,0    ,5.603       ,4.296         ,1.304                     
4085 ,1    ,21   ,127      ,20   ,23   ,5.717       ,4.381         ,1.305                     
4085 ,1    ,22   ,127      ,21   ,0    ,5.604       ,4.53          ,1.237                     
4085 ,1    ,22   ,127      ,21   ,23   ,5.678       ,4.547         ,1.249                     
4085 ,16   ,21   ,127      ,20   ,23   ,5.659       ,4.568         ,1.239                     
4085 ,16   ,22   ,127      ,21   ,23   ,5.659       ,4.544         ,1.245                     
4085 ,256  ,21   ,127      ,20   ,23   ,5.688       ,4.668         ,1.218                     
4085 ,256  ,22   ,127      ,21   ,23   ,5.658       ,4.644         ,1.219                     
4085 ,4    ,21   ,127      ,20   ,23   ,5.736       ,4.381         ,1.309                     
4085 ,4    ,22   ,127      ,21   ,23   ,5.692       ,4.499         ,1.265                     
4085 ,64   ,21   ,127      ,20   ,23   ,5.687       ,4.403         ,1.292                     
4085 ,64   ,22   ,127      ,21   ,23   ,5.688       ,4.847         ,1.173                     
4086 ,1    ,19   ,127      ,18   ,0    ,5.466       ,4.462         ,1.225                     
4086 ,1    ,19   ,127      ,18   ,23   ,5.661       ,4.476         ,1.265                     
4086 ,1    ,20   ,127      ,19   ,0    ,5.467       ,4.256         ,1.284                     
4086 ,1    ,20   ,127      ,19   ,23   ,5.631       ,4.597         ,1.225                     
4086 ,16   ,19   ,127      ,18   ,23   ,5.687       ,4.381         ,1.298                     
4086 ,16   ,20   ,127      ,19   ,23   ,5.735       ,4.644         ,1.235                     
4086 ,256  ,19   ,127      ,18   ,23   ,5.632       ,4.545         ,1.239                     
4086 ,256  ,20   ,127      ,19   ,23   ,5.631       ,4.425         ,1.273                     
4086 ,4    ,19   ,127      ,18   ,23   ,5.717       ,4.699         ,1.217                     
4086 ,4    ,20   ,127      ,19   ,23   ,5.734       ,5.147         ,1.114                     
4086 ,64   ,19   ,127      ,18   ,23   ,5.659       ,4.359         ,1.298                     
4086 ,64   ,20   ,127      ,19   ,23   ,5.717       ,4.499         ,1.271                     
4087 ,1    ,17   ,127      ,16   ,0    ,5.522       ,4.674         ,1.181                     
4087 ,1    ,17   ,127      ,16   ,23   ,5.659       ,4.499         ,1.258                     
4087 ,1    ,18   ,127      ,17   ,0    ,5.607       ,4.401         ,1.274                     
4087 ,1    ,18   ,127      ,17   ,23   ,5.688       ,4.402         ,1.292                     
4087 ,16   ,17   ,127      ,16   ,23   ,5.812       ,4.499         ,1.292                     
4087 ,16   ,18   ,127      ,17   ,23   ,5.716       ,4.538         ,1.26                      
4087 ,256  ,17   ,127      ,16   ,23   ,5.688       ,4.638         ,1.226                     
4087 ,256  ,18   ,127      ,17   ,23   ,5.802       ,4.522         ,1.283                     
4087 ,4    ,17   ,127      ,16   ,23   ,5.717       ,4.337         ,1.318                     
4087 ,4    ,18   ,127      ,17   ,23   ,5.682       ,4.476         ,1.269                     
4087 ,64   ,17   ,127      ,16   ,23   ,5.659       ,4.899         ,1.155                     
4087 ,64   ,18   ,127      ,17   ,23   ,5.688       ,4.477         ,1.271                     
4088 ,1    ,15   ,127      ,14   ,0    ,5.551       ,4.566         ,1.216                     
4088 ,1    ,15   ,127      ,14   ,23   ,5.848       ,4.521         ,1.293                     
4088 ,1    ,16   ,127      ,15   ,0    ,5.469       ,4.524         ,1.209                     
4088 ,1    ,16   ,127      ,15   ,23   ,5.688       ,4.454         ,1.277                     
4088 ,16   ,15   ,127      ,14   ,23   ,5.658       ,4.402         ,1.285                     
4088 ,16   ,16   ,127      ,15   ,23   ,5.658       ,4.597         ,1.231                     
4088 ,256  ,15   ,127      ,14   ,23   ,5.659       ,4.698         ,1.204                     
4088 ,256  ,16   ,127      ,15   ,23   ,5.705       ,4.477         ,1.274                     
4088 ,4    ,15   ,127      ,14   ,23   ,5.78        ,4.5           ,1.284                     
4088 ,4    ,16   ,127      ,15   ,23   ,5.693       ,4.469         ,1.274                     
4088 ,64   ,15   ,127      ,14   ,23   ,5.687       ,4.499         ,1.264                     
4088 ,64   ,16   ,127      ,15   ,23   ,5.676       ,4.524         ,1.255                     
4089 ,1    ,13   ,127      ,12   ,0    ,5.616       ,5.667         ,0.991                     
4089 ,1    ,13   ,127      ,12   ,23   ,5.688       ,4.515         ,1.26                      
4089 ,1    ,14   ,127      ,13   ,0    ,5.549       ,4.362         ,1.272                     
4089 ,1    ,14   ,127      ,13   ,23   ,5.659       ,4.403         ,1.285                     
4089 ,16   ,13   ,127      ,12   ,23   ,5.718       ,4.75          ,1.204                     
4089 ,16   ,14   ,127      ,13   ,23   ,5.688       ,4.598         ,1.237                     
4089 ,256  ,13   ,127      ,12   ,23   ,5.766       ,4.499         ,1.282                     
4089 ,256  ,14   ,127      ,13   ,23   ,5.688       ,4.597         ,1.237                     
4089 ,4    ,13   ,127      ,12   ,23   ,5.688       ,4.568         ,1.245                     
4089 ,4    ,14   ,127      ,13   ,23   ,5.688       ,4.563         ,1.247                     
4089 ,64   ,13   ,127      ,12   ,23   ,5.688       ,4.544         ,1.252                     
4089 ,64   ,14   ,127      ,13   ,23   ,5.742       ,4.499         ,1.276                     
4090 ,1    ,11   ,127      ,10   ,0    ,5.549       ,4.651         ,1.193                     
4090 ,1    ,11   ,127      ,10   ,23   ,5.687       ,4.402         ,1.292                     
4090 ,1    ,12   ,127      ,11   ,0    ,5.414       ,4.686         ,1.155                     
4090 ,1    ,12   ,127      ,11   ,23   ,5.717       ,4.522         ,1.264                     
4090 ,16   ,11   ,127      ,10   ,23   ,5.632       ,4.496         ,1.253                     
4090 ,16   ,12   ,127      ,11   ,23   ,5.746       ,4.523         ,1.27                      
4090 ,256  ,11   ,127      ,10   ,23   ,5.659       ,4.597         ,1.231                     
4090 ,256  ,12   ,127      ,11   ,23   ,5.66        ,4.359         ,1.299                     
4090 ,4    ,11   ,127      ,10   ,23   ,5.674       ,4.645         ,1.222                     
4090 ,4    ,12   ,127      ,11   ,23   ,5.677       ,4.5           ,1.262                     
4090 ,64   ,11   ,127      ,10   ,23   ,5.659       ,4.621         ,1.225                     
4090 ,64   ,12   ,127      ,11   ,23   ,5.659       ,4.535         ,1.248                     
4091 ,1    ,10   ,127      ,9    ,0    ,5.466       ,4.59          ,1.191                     
4091 ,1    ,10   ,127      ,9    ,23   ,5.659       ,4.523         ,1.251                     
4091 ,1    ,9    ,127      ,8    ,0    ,5.505       ,4.284         ,1.285                     
4091 ,1    ,9    ,127      ,8    ,23   ,5.659       ,4.499         ,1.258                     
4091 ,16   ,10   ,127      ,9    ,23   ,5.717       ,4.545         ,1.258                     
4091 ,16   ,9    ,127      ,8    ,23   ,5.687       ,4.522         ,1.258                     
4091 ,256  ,10   ,127      ,9    ,23   ,5.659       ,4.568         ,1.239                     
4091 ,256  ,9    ,127      ,8    ,23   ,5.63        ,4.522         ,1.245                     
4091 ,4    ,10   ,127      ,9    ,23   ,5.717       ,4.5           ,1.27                      
4091 ,4    ,9    ,127      ,8    ,23   ,5.718       ,4.546         ,1.258                     
4091 ,64   ,10   ,127      ,9    ,23   ,5.687       ,4.693         ,1.212                     
4091 ,64   ,9    ,127      ,8    ,23   ,5.659       ,4.522         ,1.251                     
4092 ,1    ,7    ,127      ,6    ,0    ,5.904       ,4.489         ,1.315                     
4092 ,1    ,7    ,127      ,6    ,23   ,5.734       ,4.492         ,1.277                     
4092 ,1    ,8    ,127      ,7    ,0    ,5.468       ,4.407         ,1.241                     
4092 ,1    ,8    ,127      ,7    ,23   ,5.687       ,4.693         ,1.212                     
4092 ,16   ,7    ,127      ,6    ,23   ,5.659       ,4.644         ,1.219                     
4092 ,16   ,8    ,127      ,7    ,23   ,5.659       ,4.898         ,1.155                     
4092 ,256  ,7    ,127      ,6    ,23   ,5.688       ,4.382         ,1.298                     
4092 ,256  ,8    ,127      ,7    ,23   ,5.631       ,4.608         ,1.222                     
4092 ,4    ,7    ,127      ,6    ,23   ,5.688       ,4.403         ,1.292                     
4092 ,4    ,8    ,127      ,7    ,23   ,5.688       ,4.544         ,1.252                     
4092 ,64   ,7    ,127      ,6    ,23   ,5.688       ,4.597         ,1.237                     
4092 ,64   ,8    ,127      ,7    ,23   ,5.687       ,4.646         ,1.224                     
4093 ,1    ,5    ,127      ,4    ,0    ,5.339       ,4.35          ,1.227                     
4093 ,1    ,5    ,127      ,4    ,23   ,5.716       ,4.497         ,1.271                     
4093 ,1    ,6    ,127      ,5    ,0    ,5.473       ,4.369         ,1.253                     
4093 ,1    ,6    ,127      ,5    ,23   ,5.718       ,4.568         ,1.252                     
4093 ,16   ,5    ,127      ,4    ,23   ,5.717       ,4.358         ,1.312                     
4093 ,16   ,6    ,127      ,5    ,23   ,5.688       ,4.499         ,1.264                     
4093 ,256  ,5    ,127      ,4    ,23   ,5.748       ,4.337         ,1.325                     
4093 ,256  ,6    ,127      ,5    ,23   ,5.67        ,4.454         ,1.273                     
4093 ,4    ,5    ,127      ,4    ,23   ,5.717       ,4.789         ,1.194                     
4093 ,4    ,6    ,127      ,5    ,23   ,5.688       ,4.875         ,1.167                     
4093 ,64   ,5    ,127      ,4    ,23   ,5.674       ,4.414         ,1.286                     
4093 ,64   ,6    ,127      ,5    ,23   ,5.907       ,4.397         ,1.343                     
4094 ,1    ,3    ,127      ,2    ,0    ,5.445       ,4.133         ,1.317                     
4094 ,1    ,3    ,127      ,2    ,23   ,5.718       ,4.499         ,1.271                     
4094 ,1    ,4    ,127      ,3    ,0    ,5.735       ,4.4           ,1.303                     
4094 ,1    ,4    ,127      ,3    ,23   ,5.689       ,4.477         ,1.271                     
4094 ,16   ,3    ,127      ,2    ,23   ,5.747       ,4.57          ,1.258                     
4094 ,16   ,4    ,127      ,3    ,23   ,5.805       ,4.622         ,1.256                     
4094 ,256  ,3    ,127      ,2    ,23   ,5.66        ,4.48          ,1.263                     
4094 ,256  ,4    ,127      ,3    ,23   ,5.748       ,4.378         ,1.313                     
4094 ,4    ,3    ,127      ,2    ,23   ,5.659       ,4.518         ,1.253                     
4094 ,4    ,4    ,127      ,3    ,23   ,5.688       ,4.512         ,1.261                     
4094 ,64   ,3    ,127      ,2    ,23   ,5.718       ,4.36          ,1.311                     
4094 ,64   ,4    ,127      ,3    ,23   ,5.687       ,4.381         ,1.298                     
4095 ,1    ,1    ,127      ,0    ,0    ,4.077       ,4.055         ,1.005                     
4095 ,1    ,1    ,127      ,0    ,23   ,5.573       ,4.622         ,1.206                     
4095 ,1    ,2    ,127      ,1    ,0    ,5.532       ,4.115         ,1.344                     
4095 ,1    ,2    ,127      ,1    ,23   ,5.72        ,4.5           ,1.271                     
4095 ,16   ,1    ,127      ,0    ,23   ,5.353       ,4.404         ,1.216                     
4095 ,16   ,2    ,127      ,1    ,23   ,5.707       ,4.599         ,1.241                     
4095 ,256  ,1    ,127      ,0    ,23   ,5.393       ,4.577         ,1.178                     
4095 ,256  ,2    ,127      ,1    ,23   ,5.72        ,4.525         ,1.264                     
4095 ,4    ,1    ,127      ,0    ,23   ,5.509       ,4.383         ,1.257                     
4095 ,4    ,2    ,127      ,1    ,23   ,5.69        ,4.546         ,1.252                     
4095 ,64   ,1    ,127      ,0    ,23   ,5.31        ,4.339         ,1.224                     
4095 ,64   ,2    ,127      ,1    ,23   ,5.719       ,4.548         ,1.258                     
45   ,1    ,256  ,127      ,64   ,0    ,5.411       ,3.993         ,1.355                     
45   ,1    ,256  ,127      ,64   ,23   ,8.029       ,11.493        ,0.699                     
45   ,16   ,256  ,127      ,64   ,23   ,8.142       ,11.596        ,0.702                     
45   ,256  ,256  ,127      ,64   ,23   ,9.462       ,11.824        ,0.8                       
45   ,4    ,256  ,127      ,64   ,23   ,8.106       ,11.413        ,0.71                      
45   ,64   ,256  ,127      ,64   ,23   ,9.655       ,11.369        ,0.849                     
5    ,1    ,2048 ,127      ,512  ,0    ,9.959       ,14.513        ,0.686                     
5    ,1    ,2048 ,127      ,512  ,23   ,33.09       ,36.424        ,0.908                     
5    ,1    ,256  ,127      ,64   ,0    ,4.693       ,3.885         ,1.208                     
5    ,1    ,256  ,127      ,64   ,23   ,8.114       ,12.26         ,0.662                     
5    ,16   ,2048 ,127      ,512  ,23   ,33.891      ,39.937        ,0.849                     
5    ,16   ,256  ,127      ,64   ,23   ,8.081       ,11.652        ,0.693                     
5    ,256  ,2048 ,127      ,512  ,23   ,34.159      ,39.246        ,0.87                      
5    ,256  ,256  ,127      ,64   ,23   ,9.46        ,12.217        ,0.774                     
5    ,4    ,2048 ,127      ,512  ,23   ,34.314      ,39.586        ,0.867                     
5    ,4    ,256  ,127      ,64   ,23   ,8.104       ,11.498        ,0.705                     
5    ,64   ,2048 ,127      ,512  ,23   ,33.874      ,39.485        ,0.858                     
5    ,64   ,256  ,127      ,64   ,23   ,9.593       ,11.388        ,0.842                     
6    ,1    ,2048 ,127      ,1024 ,0    ,17.103      ,19.603        ,0.872                     
6    ,1    ,2048 ,127      ,1024 ,23   ,33.706      ,37.99         ,0.887                     
6    ,1    ,256  ,127      ,64   ,0    ,4.831       ,4.211         ,1.147                     
6    ,1    ,256  ,127      ,64   ,23   ,8.156       ,11.409        ,0.715                     
6    ,16   ,2048 ,127      ,1024 ,23   ,36.265      ,44.042        ,0.823                     
6    ,16   ,256  ,127      ,64   ,23   ,8.123       ,11.167        ,0.727                     
6    ,256  ,2048 ,127      ,1024 ,23   ,37.489      ,44.428        ,0.844                     
6    ,256  ,256  ,127      ,64   ,23   ,9.529       ,11.198        ,0.851                     
6    ,4    ,2048 ,127      ,1024 ,23   ,34.235      ,42.948        ,0.797                     
6    ,4    ,256  ,127      ,64   ,23   ,8.042       ,11.644        ,0.691                     
6    ,64   ,2048 ,127      ,1024 ,23   ,36.438      ,43.778        ,0.832                     
6    ,64   ,256  ,127      ,64   ,23   ,8.144       ,11.219        ,0.726                     
60   ,1    ,256  ,127      ,64   ,0    ,4.613       ,3.868         ,1.193                     
60   ,1    ,256  ,127      ,64   ,23   ,8.055       ,11.715        ,0.688                     
60   ,16   ,256  ,127      ,64   ,23   ,8.138       ,11.888        ,0.685                     
60   ,256  ,256  ,127      ,64   ,23   ,9.701       ,11.935        ,0.813                     
60   ,4    ,256  ,127      ,64   ,23   ,8.132       ,11.762        ,0.691                     
60   ,64   ,256  ,127      ,64   ,23   ,9.608       ,11.642        ,0.825                     
7    ,1    ,2048 ,127      ,2048 ,0    ,28.209      ,33.848        ,0.833                     
7    ,1    ,2048 ,127      ,2048 ,23   ,33.134      ,35.77         ,0.926                     
7    ,1    ,256  ,127      ,64   ,0    ,4.7         ,4.373         ,1.075                     
7    ,1    ,256  ,127      ,64   ,23   ,8.105       ,11.577        ,0.7                       
7    ,16   ,2048 ,127      ,2048 ,23   ,44.67       ,85.315        ,0.524                     
7    ,16   ,256  ,127      ,64   ,23   ,8.104       ,11.19         ,0.724                     
7    ,256  ,2048 ,127      ,2048 ,23   ,52.731      ,58.99         ,0.894                     
7    ,256  ,256  ,127      ,64   ,23   ,9.461       ,11.938        ,0.793                     
7    ,4    ,2048 ,127      ,2048 ,23   ,34.073      ,39.541        ,0.862                     
7    ,4    ,256  ,127      ,64   ,23   ,8.09        ,11.641        ,0.695                     
7    ,64   ,2048 ,127      ,2048 ,23   ,61.246      ,65.594        ,0.934                     
7    ,64   ,256  ,127      ,64   ,23   ,8.134       ,11.558        ,0.704                     
75   ,1    ,256  ,127      ,64   ,0    ,5.008       ,4.024         ,1.244                     
75   ,1    ,256  ,127      ,64   ,23   ,8.146       ,11.817        ,0.689                     
75   ,16   ,256  ,127      ,64   ,23   ,8.145       ,11.39         ,0.715                     
75   ,256  ,256  ,127      ,64   ,23   ,10.768      ,11.612        ,0.927                     
75   ,4    ,256  ,127      ,64   ,23   ,8.105       ,11.661        ,0.695                     
75   ,64   ,256  ,127      ,64   ,23   ,8.137       ,11.039        ,0.737                     
8    ,1    ,2048 ,127      ,4096 ,0    ,31.237      ,38.429        ,0.813                     
8    ,1    ,2048 ,127      ,4096 ,23   ,32.959      ,34.762        ,0.948                     
8    ,16   ,2048 ,127      ,4096 ,23   ,36.619      ,45.644        ,0.802                     
8    ,256  ,2048 ,127      ,4096 ,23   ,45.167      ,55.267        ,0.817                     
8    ,4    ,2048 ,127      ,4096 ,23   ,32.942      ,35.871        ,0.918                     
8    ,64   ,2048 ,127      ,4096 ,23   ,39.52       ,54.964        ,0.719                     
90   ,1    ,256  ,127      ,64   ,0    ,4.762       ,3.896         ,1.222                     
90   ,1    ,256  ,127      ,64   ,23   ,8.123       ,10.953        ,0.742                     
90   ,16   ,256  ,127      ,64   ,23   ,8.231       ,11.242        ,0.732                     
90   ,256  ,256  ,127      ,64   ,23   ,10.765      ,11.764        ,0.915                     
90   ,4    ,256  ,127      ,64   ,23   ,8.104       ,11.525        ,0.703                     
90   ,64   ,256  ,127      ,64   ,23   ,8.175       ,11.522        ,0.71    

[-- Attachment #3: strrchr-evex-data-tgl.txt --]
[-- Type: text/plain, Size: 165417 bytes --]


Results For: strrchr
align,freq ,len  ,max_char ,pos  ,seek ,strrchr-dev ,strrchr-glibc ,strrchr-dev/strrchr-glibc 
0    ,1    ,1    ,127      ,0    ,0    ,2.966       ,2.983         ,0.994                     
0    ,1    ,1    ,127      ,0    ,23   ,3.178       ,3.188         ,0.997                     
0    ,1    ,10   ,127      ,9    ,0    ,3.081       ,3.007         ,1.025                     
0    ,1    ,10   ,127      ,9    ,23   ,3.145       ,3.148         ,0.999                     
0    ,1    ,1024 ,127      ,0    ,0    ,3.727       ,3.046         ,1.224                     
0    ,1    ,1024 ,127      ,0    ,23   ,20.697      ,31.221        ,0.663                     
0    ,1    ,1024 ,127      ,1024 ,0    ,18.007      ,31.044        ,0.58                      
0    ,1    ,1024 ,127      ,1024 ,23   ,20.643      ,30.514        ,0.677                     
0    ,1    ,1024 ,127      ,144  ,0    ,6.371       ,9.031         ,0.705                     
0    ,1    ,1024 ,127      ,144  ,23   ,18.349      ,30.847        ,0.595                     
0    ,1    ,1024 ,127      ,192  ,0    ,7.457       ,10.31         ,0.723                     
0    ,1    ,1024 ,127      ,192  ,23   ,18.031      ,30.368        ,0.594                     
0    ,1    ,1024 ,127      ,240  ,0    ,6.63        ,9.71          ,0.683                     
0    ,1    ,1024 ,127      ,240  ,23   ,18.012      ,31.189        ,0.578                     
0    ,1    ,1024 ,127      ,288  ,0    ,9.467       ,12.061        ,0.785                     
0    ,1    ,1024 ,127      ,288  ,23   ,19.968      ,30.068        ,0.664                     
0    ,1    ,1024 ,127      ,48   ,0    ,4.437       ,4.235         ,1.048                     
0    ,1    ,1024 ,127      ,48   ,23   ,20.137      ,30.478        ,0.661                     
0    ,1    ,1024 ,127      ,736  ,0    ,13.541      ,22.361        ,0.606                     
0    ,1    ,1024 ,127      ,736  ,23   ,19.175      ,29.777        ,0.644                     
0    ,1    ,1024 ,127      ,784  ,0    ,15.315      ,23.778        ,0.644                     
0    ,1    ,1024 ,127      ,784  ,23   ,20.807      ,32.904        ,0.632                     
0    ,1    ,1024 ,127      ,832  ,0    ,15.846      ,25.454        ,0.623                     
0    ,1    ,1024 ,127      ,832  ,23   ,19.736      ,32.652        ,0.604                     
0    ,1    ,1024 ,127      ,880  ,0    ,14.95       ,25.394        ,0.589                     
0    ,1    ,1024 ,127      ,880  ,23   ,20.867      ,32.337        ,0.645                     
0    ,1    ,1024 ,127      ,928  ,0    ,15.721      ,27.709        ,0.567                     
0    ,1    ,1024 ,127      ,928  ,23   ,20.862      ,32.035        ,0.651                     
0    ,1    ,1024 ,127      ,96   ,0    ,5.724       ,5.77          ,0.992                     
0    ,1    ,1024 ,127      ,96   ,23   ,19.75       ,31.262        ,0.632                     
0    ,1    ,1024 ,127      ,976  ,0    ,17.488      ,28.876        ,0.606                     
0    ,1    ,1024 ,127      ,976  ,23   ,20.566      ,32.876        ,0.626                     
0    ,1    ,1072 ,127      ,1024 ,0    ,17.105      ,30.693        ,0.557                     
0    ,1    ,1072 ,127      ,1024 ,23   ,19.859      ,32.556        ,0.61                      
0    ,1    ,11   ,127      ,10   ,0    ,3.137       ,3.482         ,0.901                     
0    ,1    ,11   ,127      ,10   ,23   ,3.17        ,3.153         ,1.005                     
0    ,1    ,112  ,127      ,144  ,0    ,4.981       ,5.101         ,0.977                     
0    ,1    ,112  ,127      ,144  ,23   ,5.006       ,6.439         ,0.777                     
0    ,1    ,112  ,127      ,16   ,0    ,3.099       ,3.038         ,1.02                      
0    ,1    ,112  ,127      ,16   ,23   ,4.667       ,5.834         ,0.8                       
0    ,1    ,112  ,127      ,256  ,0    ,4.932       ,4.991         ,0.988                     
0    ,1    ,112  ,127      ,256  ,23   ,5.031       ,6.201         ,0.811                     
0    ,1    ,112  ,127      ,64   ,0    ,4.957       ,4.304         ,1.152                     
0    ,1    ,112  ,127      ,64   ,23   ,4.984       ,5.876         ,0.848                     
0    ,1    ,112  ,127      ,96   ,0    ,4.977       ,4.887         ,1.019                     
0    ,1    ,112  ,127      ,96   ,23   ,5.031       ,5.661         ,0.889                     
0    ,1    ,1120 ,127      ,1024 ,0    ,17.107      ,30.402        ,0.563                     
0    ,1    ,1120 ,127      ,1024 ,23   ,20.414      ,34.412        ,0.593                     
0    ,1    ,1168 ,127      ,1024 ,0    ,16.961      ,30.616        ,0.554                     
0    ,1    ,1168 ,127      ,1024 ,23   ,24.161      ,37.76         ,0.64                      
0    ,1    ,12   ,127      ,11   ,0    ,3.021       ,3.071         ,0.984                     
0    ,1    ,12   ,127      ,11   ,23   ,3.178       ,3.153         ,1.008                     
0    ,1    ,1216 ,127      ,1024 ,0    ,17.254      ,30.827        ,0.56                      
0    ,1    ,1216 ,127      ,1024 ,23   ,21.954      ,37.513        ,0.585                     
0    ,1    ,1264 ,127      ,1024 ,0    ,17.85       ,30.564        ,0.584                     
0    ,1    ,1264 ,127      ,1024 ,23   ,22.583      ,37.29         ,0.606                     
0    ,1    ,128  ,127      ,0    ,0    ,3.102       ,3.06          ,1.014                     
0    ,1    ,128  ,127      ,0    ,23   ,6.278       ,8.376         ,0.75                      
0    ,1    ,128  ,127      ,112  ,0    ,4.933       ,4.908         ,1.005                     
0    ,1    ,128  ,127      ,112  ,23   ,5.135       ,9.127         ,0.563                     
0    ,1    ,128  ,127      ,128  ,0    ,5.274       ,8.098         ,0.651                     
0    ,1    ,128  ,127      ,128  ,23   ,6.36        ,8.124         ,0.783                     
0    ,1    ,128  ,127      ,144  ,0    ,5.307       ,8.109         ,0.654                     
0    ,1    ,128  ,127      ,144  ,23   ,6.349       ,8.178         ,0.776                     
0    ,1    ,128  ,127      ,192  ,0    ,5.182       ,8.209         ,0.631                     
0    ,1    ,128  ,127      ,192  ,23   ,6.996       ,8.218         ,0.851                     
0    ,1    ,128  ,127      ,240  ,0    ,5.183       ,8.091         ,0.641                     
0    ,1    ,128  ,127      ,240  ,23   ,6.993       ,8.208         ,0.852                     
0    ,1    ,128  ,127      ,288  ,0    ,5.208       ,8.162         ,0.638                     
0    ,1    ,128  ,127      ,288  ,23   ,6.581       ,8.194         ,0.803                     
0    ,1    ,128  ,127      ,32   ,0    ,3.091       ,4.337         ,0.713                     
0    ,1    ,128  ,127      ,32   ,23   ,6.502       ,8.12          ,0.801                     
0    ,1    ,128  ,127      ,48   ,0    ,3.098       ,4.296         ,0.721                     
0    ,1    ,128  ,127      ,48   ,23   ,5.909       ,8.148         ,0.725                     
0    ,1    ,128  ,127      ,80   ,0    ,4.957       ,4.2           ,1.18                      
0    ,1    ,128  ,127      ,80   ,23   ,5.257       ,8.161         ,0.644                     
0    ,1    ,128  ,127      ,96   ,0    ,4.956       ,4.888         ,1.014                     
0    ,1    ,128  ,127      ,96   ,23   ,5.255       ,7.117         ,0.738                     
0    ,1    ,13   ,127      ,12   ,0    ,3.016       ,3.298         ,0.915                     
0    ,1    ,13   ,127      ,12   ,23   ,3.169       ,3.13          ,1.013                     
0    ,1    ,1312 ,127      ,1024 ,0    ,18.457      ,30.698        ,0.601                     
0    ,1    ,1312 ,127      ,1024 ,23   ,24.591      ,40.376        ,0.609                     
0    ,1    ,14   ,127      ,13   ,0    ,3.135       ,2.981         ,1.052                     
0    ,1    ,14   ,127      ,13   ,23   ,3.144       ,3.153         ,0.997                     
0    ,1    ,144  ,127      ,128  ,0    ,5.096       ,8.032         ,0.635                     
0    ,1    ,144  ,127      ,128  ,23   ,5.156       ,8.561         ,0.602                     
0    ,1    ,15   ,127      ,14   ,0    ,3.062       ,2.946         ,1.039                     
0    ,1    ,15   ,127      ,14   ,23   ,3.145       ,3.145         ,1.0                       
0    ,1    ,16   ,127      ,0    ,0    ,3.115       ,3.059         ,1.018                     
0    ,1    ,16   ,127      ,0    ,23   ,3.516       ,3.148         ,1.117                     
0    ,1    ,16   ,127      ,144  ,0    ,3.084       ,3.038         ,1.015                     
0    ,1    ,16   ,127      ,144  ,23   ,3.852       ,3.917         ,0.983                     
0    ,1    ,16   ,127      ,15   ,0    ,3.048       ,3.052         ,0.999                     
0    ,1    ,16   ,127      ,15   ,23   ,3.187       ,3.114         ,1.024                     
0    ,1    ,16   ,127      ,16   ,0    ,3.094       ,3.066         ,1.009                     
0    ,1    ,16   ,127      ,16   ,23   ,3.787       ,3.886         ,0.974                     
0    ,1    ,16   ,127      ,192  ,0    ,3.099       ,3.053         ,1.015                     
0    ,1    ,16   ,127      ,192  ,23   ,3.755       ,3.9           ,0.963                     
0    ,1    ,16   ,127      ,240  ,0    ,3.122       ,3.053         ,1.023                     
0    ,1    ,16   ,127      ,240  ,23   ,3.765       ,3.827         ,0.984                     
0    ,1    ,16   ,127      ,256  ,0    ,3.099       ,3.097         ,1.0                       
0    ,1    ,16   ,127      ,256  ,23   ,3.792       ,3.699         ,1.025                     
0    ,1    ,16   ,127      ,288  ,0    ,3.115       ,3.053         ,1.02                      
0    ,1    ,16   ,127      ,288  ,23   ,4.518       ,3.9           ,1.158                     
0    ,1    ,16   ,127      ,48   ,0    ,3.107       ,3.085         ,1.007                     
0    ,1    ,16   ,127      ,48   ,23   ,3.804       ,4.083         ,0.932                     
0    ,1    ,16   ,127      ,64   ,0    ,3.098       ,3.099         ,1.0                       
0    ,1    ,16   ,127      ,64   ,23   ,3.773       ,4.212         ,0.896                     
0    ,1    ,16   ,127      ,96   ,0    ,3.13        ,3.038         ,1.03                      
0    ,1    ,16   ,127      ,96   ,23   ,3.792       ,4.001         ,0.948                     
0    ,1    ,160  ,127      ,144  ,0    ,5.091       ,8.008         ,0.636                     
0    ,1    ,160  ,127      ,144  ,23   ,5.185       ,8.412         ,0.616                     
0    ,1    ,160  ,127      ,16   ,0    ,3.101       ,3.053         ,1.016                     
0    ,1    ,160  ,127      ,16   ,23   ,5.906       ,10.008        ,0.59                      
0    ,1    ,160  ,127      ,256  ,0    ,5.131       ,8.093         ,0.634                     
0    ,1    ,160  ,127      ,256  ,23   ,6.288       ,8.097         ,0.777                     
0    ,1    ,160  ,127      ,64   ,0    ,5.007       ,4.308         ,1.162                     
0    ,1    ,160  ,127      ,64   ,23   ,5.503       ,8.187         ,0.672                     
0    ,1    ,160  ,127      ,96   ,0    ,4.981       ,4.95          ,1.006                     
0    ,1    ,160  ,127      ,96   ,23   ,5.765       ,7.035         ,0.819                     
0    ,1    ,17   ,127      ,16   ,0    ,3.054       ,3.074         ,0.994                     
0    ,1    ,17   ,127      ,16   ,23   ,3.153       ,3.125         ,1.009                     
0    ,1    ,176  ,127      ,128  ,0    ,5.183       ,8.065         ,0.643                     
0    ,1    ,176  ,127      ,128  ,23   ,5.168       ,8.353         ,0.619                     
0    ,1    ,176  ,127      ,160  ,0    ,5.21        ,8.047         ,0.647                     
0    ,1    ,176  ,127      ,160  ,23   ,5.183       ,9.184         ,0.564                     
0    ,1    ,176  ,127      ,32   ,0    ,3.098       ,4.342         ,0.713                     
0    ,1    ,176  ,127      ,32   ,23   ,5.854       ,8.052         ,0.727                     
0    ,1    ,1760 ,127      ,2048 ,0    ,28.253      ,68.431        ,0.413                     
0    ,1    ,1760 ,127      ,2048 ,23   ,29.553      ,67.922        ,0.435                     
0    ,1    ,1760 ,127      ,288  ,0    ,9.083       ,11.037        ,0.823                     
0    ,1    ,1760 ,127      ,288  ,23   ,28.852      ,49.864        ,0.579                     
0    ,1    ,18   ,127      ,17   ,0    ,3.385       ,2.987         ,1.133                     
0    ,1    ,18   ,127      ,17   ,23   ,3.166       ,3.145         ,1.007                     
0    ,1    ,1808 ,127      ,2048 ,0    ,28.994      ,70.121        ,0.413                     
0    ,1    ,1808 ,127      ,2048 ,23   ,31.177      ,71.111        ,0.438                     
0    ,1    ,1808 ,127      ,240  ,0    ,7.319       ,9.865         ,0.742                     
0    ,1    ,1808 ,127      ,240  ,23   ,29.899      ,48.525        ,0.616                     
0    ,1    ,1856 ,127      ,192  ,0    ,6.708       ,9.712         ,0.691                     
0    ,1    ,1856 ,127      ,192  ,23   ,30.362      ,53.506        ,0.567                     
0    ,1    ,1856 ,127      ,2048 ,0    ,29.728      ,69.492        ,0.428                     
0    ,1    ,1856 ,127      ,2048 ,23   ,31.795      ,70.51         ,0.451                     
0    ,1    ,19   ,127      ,18   ,0    ,3.069       ,3.079         ,0.997                     
0    ,1    ,19   ,127      ,18   ,23   ,3.161       ,3.136         ,1.008                     
0    ,1    ,1904 ,127      ,144  ,0    ,6.13        ,8.565         ,0.716                     
0    ,1    ,1904 ,127      ,144  ,23   ,30.397      ,50.784        ,0.599                     
0    ,1    ,1904 ,127      ,2048 ,0    ,29.915      ,70.101        ,0.427                     
0    ,1    ,1904 ,127      ,2048 ,23   ,30.958      ,70.837        ,0.437                     
0    ,1    ,192  ,127      ,176  ,0    ,5.21        ,8.08          ,0.645                     
0    ,1    ,192  ,127      ,176  ,23   ,5.859       ,11.223        ,0.522                     
0    ,1    ,1952 ,127      ,2048 ,0    ,30.047      ,74.454        ,0.404                     
0    ,1    ,1952 ,127      ,2048 ,23   ,32.931      ,73.662        ,0.447                     
0    ,1    ,1952 ,127      ,96   ,0    ,6.32        ,5.187         ,1.218                     
0    ,1    ,1952 ,127      ,96   ,23   ,32.679      ,75.722        ,0.432                     
0    ,1    ,2    ,127      ,1    ,0    ,2.919       ,2.967         ,0.984                     
0    ,1    ,2    ,127      ,1    ,23   ,3.178       ,3.122         ,1.018                     
0    ,1    ,20   ,127      ,19   ,0    ,3.039       ,2.995         ,1.015                     
0    ,1    ,20   ,127      ,19   ,23   ,3.137       ,3.113         ,1.008                     
0    ,1    ,2000 ,127      ,2048 ,0    ,30.331      ,75.947        ,0.399                     
0    ,1    ,2000 ,127      ,2048 ,23   ,32.174      ,73.461        ,0.438                     
0    ,1    ,2000 ,127      ,48   ,0    ,4.286       ,4.168         ,1.028                     
0    ,1    ,2000 ,127      ,48   ,23   ,34.849      ,67.532        ,0.516                     
0    ,1    ,2048 ,127      ,0    ,0    ,4.751       ,3.133         ,1.517                     
0    ,1    ,2048 ,127      ,0    ,23   ,35.011      ,76.157        ,0.46                      
0    ,1    ,2048 ,127      ,1024 ,0    ,18.037      ,33.074        ,0.545                     
0    ,1    ,2048 ,127      ,1024 ,23   ,34.924      ,59.377        ,0.588                     
0    ,1    ,2048 ,127      ,128  ,0    ,5.029       ,7.956         ,0.632                     
0    ,1    ,2048 ,127      ,128  ,23   ,33.228      ,57.793        ,0.575                     
0    ,1    ,2048 ,127      ,144  ,0    ,6.473       ,8.183         ,0.791                     
0    ,1    ,2048 ,127      ,144  ,23   ,33.341      ,55.555        ,0.6                       
0    ,1    ,2048 ,127      ,1760 ,0    ,28.416      ,68.45         ,0.415                     
0    ,1    ,2048 ,127      ,1760 ,23   ,37.008      ,61.149        ,0.605                     
0    ,1    ,2048 ,127      ,1808 ,0    ,29.105      ,69.316        ,0.42                      
0    ,1    ,2048 ,127      ,1808 ,23   ,38.015      ,60.564        ,0.628                     
0    ,1    ,2048 ,127      ,1856 ,0    ,29.132      ,70.656        ,0.412                     
0    ,1    ,2048 ,127      ,1856 ,23   ,36.962      ,60.026        ,0.616                     
0    ,1    ,2048 ,127      ,1904 ,0    ,29.385      ,71.07         ,0.413                     
0    ,1    ,2048 ,127      ,1904 ,23   ,38.724      ,61.505        ,0.63                      
0    ,1    ,2048 ,127      ,192  ,0    ,6.697       ,9.995         ,0.67                      
0    ,1    ,2048 ,127      ,192  ,23   ,32.848      ,71.547        ,0.459                     
0    ,1    ,2048 ,127      ,1952 ,0    ,30.459      ,71.618        ,0.425                     
0    ,1    ,2048 ,127      ,1952 ,23   ,36.818      ,61.726        ,0.596                     
0    ,1    ,2048 ,127      ,2000 ,0    ,31.646      ,74.669        ,0.424                     
0    ,1    ,2048 ,127      ,2000 ,23   ,40.474      ,63.49         ,0.637                     
0    ,1    ,2048 ,127      ,2048 ,0    ,31.931      ,74.167        ,0.431                     
0    ,1    ,2048 ,127      ,2048 ,23   ,35.59       ,80.296        ,0.443                     
0    ,1    ,2048 ,127      ,240  ,0    ,6.502       ,10.214        ,0.637                     
0    ,1    ,2048 ,127      ,240  ,23   ,33.179      ,68.227        ,0.486                     
0    ,1    ,2048 ,127      ,256  ,0    ,8.792       ,12.314        ,0.714                     
0    ,1    ,2048 ,127      ,256  ,23   ,34.148      ,78.522        ,0.435                     
0    ,1    ,2048 ,127      ,288  ,0    ,9.02        ,11.495        ,0.785                     
0    ,1    ,2048 ,127      ,288  ,23   ,34.193      ,75.257        ,0.454                     
0    ,1    ,2048 ,127      ,32   ,0    ,6.315       ,4.21          ,1.5                       
0    ,1    ,2048 ,127      ,32   ,23   ,34.46       ,67.577        ,0.51                      
0    ,1    ,2048 ,127      ,4096 ,0    ,30.586      ,79.128        ,0.387                     
0    ,1    ,2048 ,127      ,4096 ,23   ,34.698      ,79.702        ,0.435                     
0    ,1    ,2048 ,127      ,48   ,0    ,4.254       ,4.254         ,1.0                       
0    ,1    ,2048 ,127      ,48   ,23   ,34.526      ,69.505        ,0.497                     
0    ,1    ,2048 ,127      ,512  ,0    ,10.018      ,20.415        ,0.491                     
0    ,1    ,2048 ,127      ,512  ,23   ,33.986      ,63.296        ,0.537                     
0    ,1    ,2048 ,127      ,64   ,0    ,4.85        ,4.104         ,1.182                     
0    ,1    ,2048 ,127      ,64   ,23   ,33.907      ,70.864        ,0.478                     
0    ,1    ,2048 ,127      ,96   ,0    ,6.03        ,5.237         ,1.151                     
0    ,1    ,2048 ,127      ,96   ,23   ,33.924      ,75.7          ,0.448                     
0    ,1    ,208  ,127      ,16   ,0    ,3.153       ,3.053         ,1.033                     
0    ,1    ,208  ,127      ,16   ,23   ,6.941       ,10.069        ,0.689                     
0    ,1    ,208  ,127      ,192  ,0    ,4.919       ,9.862         ,0.499                     
0    ,1    ,208  ,127      ,192  ,23   ,5.153       ,11.23         ,0.459                     
0    ,1    ,208  ,127      ,256  ,0    ,5.292       ,9.883         ,0.535                     
0    ,1    ,208  ,127      ,256  ,23   ,7.241       ,9.359         ,0.774                     
0    ,1    ,208  ,127      ,48   ,0    ,3.083       ,4.325         ,0.713                     
0    ,1    ,208  ,127      ,48   ,23   ,6.971       ,10.149        ,0.687                     
0    ,1    ,208  ,127      ,64   ,0    ,4.982       ,4.156         ,1.199                     
0    ,1    ,208  ,127      ,64   ,23   ,6.229       ,9.976         ,0.624                     
0    ,1    ,2096 ,127      ,2048 ,0    ,31.43       ,75.699        ,0.415                     
0    ,1    ,2096 ,127      ,2048 ,23   ,38.359      ,63.822        ,0.601                     
0    ,1    ,21   ,127      ,20   ,0    ,3.127       ,3.035         ,1.03                      
0    ,1    ,21   ,127      ,20   ,23   ,3.129       ,3.106         ,1.007                     
0    ,1    ,2144 ,127      ,2048 ,0    ,32.396      ,77.421        ,0.418                     
0    ,1    ,2144 ,127      ,2048 ,23   ,36.676      ,63.418        ,0.578                     
0    ,1    ,2192 ,127      ,2048 ,0    ,33.077      ,76.825        ,0.431                     
0    ,1    ,2192 ,127      ,2048 ,23   ,38.77       ,64.917        ,0.597                     
0    ,1    ,22   ,127      ,21   ,0    ,3.094       ,2.994         ,1.033                     
0    ,1    ,22   ,127      ,21   ,23   ,3.169       ,3.13          ,1.012                     
0    ,1    ,224  ,127      ,128  ,0    ,5.208       ,8.112         ,0.642                     
0    ,1    ,224  ,127      ,128  ,23   ,5.823       ,10.535        ,0.553                     
0    ,1    ,224  ,127      ,208  ,0    ,5.149       ,9.883         ,0.521                     
0    ,1    ,224  ,127      ,208  ,23   ,5.373       ,10.333        ,0.52                      
0    ,1    ,224  ,127      ,288  ,0    ,5.563       ,9.722         ,0.572                     
0    ,1    ,224  ,127      ,288  ,23   ,7.817       ,9.402         ,0.831                     
0    ,1    ,224  ,127      ,32   ,0    ,3.099       ,4.295         ,0.722                     
0    ,1    ,224  ,127      ,32   ,23   ,7.105       ,9.802         ,0.725                     
0    ,1    ,224  ,127      ,512  ,0    ,5.133       ,9.836         ,0.522                     
0    ,1    ,224  ,127      ,512  ,23   ,7.786       ,9.311         ,0.836                     
0    ,1    ,2240 ,127      ,2048 ,0    ,32.561      ,77.729        ,0.419                     
0    ,1    ,2240 ,127      ,2048 ,23   ,39.404      ,65.423        ,0.602                     
0    ,1    ,2288 ,127      ,2048 ,0    ,33.339      ,76.88         ,0.434                     
0    ,1    ,2288 ,127      ,2048 ,23   ,39.434      ,67.761        ,0.582                     
0    ,1    ,23   ,127      ,22   ,0    ,3.069       ,3.002         ,1.022                     
0    ,1    ,23   ,127      ,22   ,23   ,3.146       ,3.228         ,0.974                     
0    ,1    ,2336 ,127      ,2048 ,0    ,32.8        ,77.427        ,0.424                     
0    ,1    ,2336 ,127      ,2048 ,23   ,40.023      ,67.47         ,0.593                     
0    ,1    ,24   ,127      ,23   ,0    ,3.024       ,3.01          ,1.004                     
0    ,1    ,24   ,127      ,23   ,23   ,3.267       ,3.129         ,1.044                     
0    ,1    ,240  ,127      ,224  ,0    ,5.294       ,9.826         ,0.539                     
0    ,1    ,240  ,127      ,224  ,23   ,5.064       ,11.095        ,0.456                     
0    ,1    ,25   ,127      ,24   ,0    ,3.054       ,3.017         ,1.012                     
0    ,1    ,25   ,127      ,24   ,23   ,3.144       ,3.114         ,1.01                      
0    ,1    ,256  ,127      ,0    ,0    ,3.098       ,3.061         ,1.012                     
0    ,1    ,256  ,127      ,0    ,23   ,9.171       ,11.843        ,0.774                     
0    ,1    ,256  ,127      ,112  ,0    ,4.981       ,5.05          ,0.986                     
0    ,1    ,256  ,127      ,112  ,23   ,9.383       ,10.471        ,0.896                     
0    ,1    ,256  ,127      ,144  ,0    ,5.363       ,8.16          ,0.657                     
0    ,1    ,256  ,127      ,144  ,23   ,7.489       ,12.064        ,0.621                     
0    ,1    ,256  ,127      ,16   ,0    ,3.13        ,3.195         ,0.98                      
0    ,1    ,256  ,127      ,16   ,23   ,10.519      ,12.539        ,0.839                     
0    ,1    ,256  ,127      ,160  ,0    ,6.025       ,8.128         ,0.741                     
0    ,1    ,256  ,127      ,160  ,23   ,7.804       ,11.652        ,0.67                      
0    ,1    ,256  ,127      ,192  ,0    ,5.392       ,9.785         ,0.551                     
0    ,1    ,256  ,127      ,192  ,23   ,7.06        ,12.141        ,0.582                     
0    ,1    ,256  ,127      ,208  ,0    ,5.352       ,9.807         ,0.546                     
0    ,1    ,256  ,127      ,208  ,23   ,7.208       ,12.283        ,0.587                     
0    ,1    ,256  ,127      ,240  ,0    ,5.564       ,9.779         ,0.569                     
0    ,1    ,256  ,127      ,240  ,23   ,6.929       ,11.489        ,0.603                     
0    ,1    ,256  ,127      ,256  ,0    ,8.085       ,11.22         ,0.721                     
0    ,1    ,256  ,127      ,256  ,23   ,8.975       ,11.182        ,0.803                     
0    ,1    ,256  ,127      ,288  ,0    ,7.95        ,11.247        ,0.707                     
0    ,1    ,256  ,127      ,288  ,23   ,8.597       ,11.107        ,0.774                     
0    ,1    ,256  ,127      ,48   ,0    ,3.181       ,4.303         ,0.739                     
0    ,1    ,256  ,127      ,48   ,23   ,8.712       ,11.591        ,0.752                     
0    ,1    ,256  ,127      ,64   ,0    ,5.006       ,4.261         ,1.175                     
0    ,1    ,256  ,127      ,64   ,23   ,8.513       ,11.664        ,0.73                      
0    ,1    ,256  ,127      ,96   ,0    ,4.932       ,4.935         ,0.999                     
0    ,1    ,256  ,127      ,96   ,23   ,8.432       ,10.503        ,0.803                     
0    ,1    ,26   ,127      ,25   ,0    ,3.024       ,3.038         ,0.995                     
0    ,1    ,26   ,127      ,25   ,23   ,3.296       ,3.161         ,1.043                     
0    ,1    ,27   ,127      ,26   ,0    ,3.046       ,3.046         ,1.0                       
0    ,1    ,27   ,127      ,26   ,23   ,3.314       ,3.269         ,1.014                     
0    ,1    ,272  ,127      ,128  ,0    ,5.182       ,8.086         ,0.641                     
0    ,1    ,272  ,127      ,128  ,23   ,7.459       ,12.355        ,0.604                     
0    ,1    ,272  ,127      ,240  ,0    ,5.474       ,9.739         ,0.562                     
0    ,1    ,272  ,127      ,240  ,23   ,8.765       ,11.08         ,0.791                     
0    ,1    ,272  ,127      ,256  ,0    ,7.913       ,11.219        ,0.705                     
0    ,1    ,272  ,127      ,256  ,23   ,7.794       ,11.429        ,0.682                     
0    ,1    ,272  ,127      ,32   ,0    ,3.114       ,4.294         ,0.725                     
0    ,1    ,272  ,127      ,32   ,23   ,8.761       ,11.618        ,0.754                     
0    ,1    ,272  ,127      ,512  ,0    ,7.786       ,11.251        ,0.692                     
0    ,1    ,272  ,127      ,512  ,23   ,8.554       ,10.759        ,0.795                     
0    ,1    ,28   ,127      ,27   ,0    ,3.038       ,3.039         ,1.0                       
0    ,1    ,28   ,127      ,27   ,23   ,3.331       ,3.216         ,1.036                     
0    ,1    ,288  ,127      ,272  ,0    ,8.597       ,11.465        ,0.75                      
0    ,1    ,288  ,127      ,272  ,23   ,8.251       ,11.85         ,0.696                     
0    ,1    ,29   ,127      ,28   ,0    ,3.068       ,3.016         ,1.017                     
0    ,1    ,29   ,127      ,28   ,23   ,3.161       ,3.361         ,0.94                      
0    ,1    ,3    ,127      ,2    ,0    ,2.989       ,3.129         ,0.956                     
0    ,1    ,3    ,127      ,2    ,23   ,3.177       ,3.129         ,1.015                     
0    ,1    ,30   ,127      ,29   ,0    ,3.067       ,3.038         ,1.01                      
0    ,1    ,30   ,127      ,29   ,23   ,3.161       ,3.254         ,0.971                     
0    ,1    ,304  ,127      ,16   ,0    ,3.114       ,3.053         ,1.02                      
0    ,1    ,304  ,127      ,16   ,23   ,9.079       ,11.926        ,0.761                     
0    ,1    ,304  ,127      ,256  ,0    ,7.965       ,11.28         ,0.706                     
0    ,1    ,304  ,127      ,256  ,23   ,9.262       ,11.358        ,0.816                     
0    ,1    ,304  ,127      ,64   ,0    ,4.957       ,4.287         ,1.156                     
0    ,1    ,304  ,127      ,64   ,23   ,8.067       ,11.67         ,0.691                     
0    ,1    ,31   ,127      ,30   ,0    ,3.084       ,3.053         ,1.01                      
0    ,1    ,31   ,127      ,30   ,23   ,3.161       ,3.244         ,0.974                     
0    ,1    ,32   ,127      ,0    ,0    ,3.114       ,3.053         ,1.02                      
0    ,1    ,32   ,127      ,0    ,23   ,3.336       ,4.092         ,0.815                     
0    ,1    ,32   ,127      ,128  ,0    ,3.097       ,4.319         ,0.717                     
0    ,1    ,32   ,127      ,128  ,23   ,3.812       ,4.425         ,0.861                     
0    ,1    ,32   ,127      ,144  ,0    ,3.099       ,4.254         ,0.728                     
0    ,1    ,32   ,127      ,144  ,23   ,3.954       ,4.317         ,0.916                     
0    ,1    ,32   ,127      ,16   ,0    ,3.098       ,3.024         ,1.025                     
0    ,1    ,32   ,127      ,16   ,23   ,3.161       ,4.388         ,0.72                      
0    ,1    ,32   ,127      ,192  ,0    ,3.099       ,4.338         ,0.714                     
0    ,1    ,32   ,127      ,192  ,23   ,3.832       ,4.295         ,0.892                     
0    ,1    ,32   ,127      ,240  ,0    ,3.114       ,4.294         ,0.725                     
0    ,1    ,32   ,127      ,240  ,23   ,3.794       ,4.404         ,0.861                     
0    ,1    ,32   ,127      ,288  ,0    ,3.114       ,4.36          ,0.714                     
0    ,1    ,32   ,127      ,288  ,23   ,3.754       ,4.316         ,0.87                      
0    ,1    ,32   ,127      ,31   ,0    ,3.098       ,3.038         ,1.02                      
0    ,1    ,32   ,127      ,31   ,23   ,3.145       ,4.249         ,0.74                      
0    ,1    ,32   ,127      ,32   ,0    ,3.109       ,4.281         ,0.726                     
0    ,1    ,32   ,127      ,32   ,23   ,3.766       ,4.472         ,0.842                     
0    ,1    ,32   ,127      ,48   ,0    ,3.13        ,4.279         ,0.731                     
0    ,1    ,32   ,127      ,48   ,23   ,3.736       ,4.403         ,0.849                     
0    ,1    ,32   ,127      ,96   ,0    ,3.09        ,4.289         ,0.721                     
0    ,1    ,32   ,127      ,96   ,23   ,3.764       ,4.408         ,0.854                     
0    ,1    ,320  ,127      ,128  ,0    ,5.156       ,8.122         ,0.635                     
0    ,1    ,320  ,127      ,128  ,23   ,7.77        ,13.797        ,0.563                     
0    ,1    ,320  ,127      ,192  ,0    ,5.669       ,10.041        ,0.565                     
0    ,1    ,320  ,127      ,192  ,23   ,8.175       ,14.602        ,0.56                      
0    ,1    ,320  ,127      ,32   ,0    ,3.136       ,4.316         ,0.727                     
0    ,1    ,320  ,127      ,32   ,23   ,8.783       ,13.062        ,0.672                     
0    ,1    ,320  ,127      ,512  ,0    ,7.919       ,12.819        ,0.618                     
0    ,1    ,320  ,127      ,512  ,23   ,10.115      ,12.419        ,0.814                     
0    ,1    ,352  ,127      ,256  ,0    ,7.735       ,11.274        ,0.686                     
0    ,1    ,352  ,127      ,256  ,23   ,9.6         ,13.855        ,0.693                     
0    ,1    ,352  ,127      ,64   ,0    ,4.957       ,3.96          ,1.252                     
0    ,1    ,352  ,127      ,64   ,23   ,7.969       ,13.439        ,0.593                     
0    ,1    ,368  ,127      ,128  ,0    ,5.183       ,8.086         ,0.641                     
0    ,1    ,368  ,127      ,128  ,23   ,7.421       ,13.63         ,0.544                     
0    ,1    ,368  ,127      ,144  ,0    ,5.151       ,8.241         ,0.625                     
0    ,1    ,368  ,127      ,144  ,23   ,8.851       ,13.793        ,0.642                     
0    ,1    ,368  ,127      ,512  ,0    ,7.865       ,12.802        ,0.614                     
0    ,1    ,368  ,127      ,512  ,23   ,9.26        ,12.768        ,0.725                     
0    ,1    ,4    ,127      ,3    ,0    ,3.096       ,2.98          ,1.039                     
0    ,1    ,4    ,127      ,3    ,23   ,3.161       ,3.113         ,1.015                     
0    ,1    ,400  ,127      ,256  ,0    ,7.736       ,11.296        ,0.685                     
0    ,1    ,400  ,127      ,256  ,23   ,10.525      ,15.428        ,0.682                     
0    ,1    ,416  ,127      ,128  ,0    ,5.183       ,8.06          ,0.643                     
0    ,1    ,416  ,127      ,128  ,23   ,9.009       ,15.068        ,0.598                     
0    ,1    ,416  ,127      ,512  ,0    ,8.501       ,14.465        ,0.588                     
0    ,1    ,416  ,127      ,512  ,23   ,11.092      ,14.164        ,0.783                     
0    ,1    ,416  ,127      ,96   ,0    ,5.868       ,5.336         ,1.1                       
0    ,1    ,416  ,127      ,96   ,23   ,10.886      ,13.708        ,0.794                     
0    ,1    ,448  ,127      ,256  ,0    ,7.892       ,11.261        ,0.701                     
0    ,1    ,448  ,127      ,256  ,23   ,11.196      ,16.908        ,0.662                     
0    ,1    ,464  ,127      ,48   ,0    ,4.529       ,4.649         ,0.974                     
0    ,1    ,464  ,127      ,48   ,23   ,11.244      ,16.064        ,0.7                       
0    ,1    ,464  ,127      ,512  ,0    ,9.139       ,16.035        ,0.57                      
0    ,1    ,464  ,127      ,512  ,23   ,11.079      ,15.665        ,0.707                     
0    ,1    ,48   ,127      ,32   ,0    ,3.083       ,4.299         ,0.717                     
0    ,1    ,48   ,127      ,32   ,23   ,3.16        ,4.478         ,0.706                     
0    ,1    ,496  ,127      ,256  ,0    ,8.043       ,11.297        ,0.712                     
0    ,1    ,496  ,127      ,256  ,23   ,10.418      ,16.771        ,0.621                     
0    ,1    ,5    ,127      ,4    ,0    ,2.959       ,3.002         ,0.986                     
0    ,1    ,5    ,127      ,4    ,23   ,3.161       ,3.157         ,1.001                     
0    ,1    ,512  ,127      ,0    ,0    ,3.65        ,3.125         ,1.168                     
0    ,1    ,512  ,127      ,0    ,23   ,13.168      ,17.987        ,0.732                     
0    ,1    ,512  ,127      ,144  ,0    ,5.335       ,8.158         ,0.654                     
0    ,1    ,512  ,127      ,144  ,23   ,11.687      ,17.881        ,0.654                     
0    ,1    ,512  ,127      ,192  ,0    ,5.42        ,9.795         ,0.553                     
0    ,1    ,512  ,127      ,192  ,23   ,12.338      ,17.877        ,0.69                      
0    ,1    ,512  ,127      ,224  ,0    ,6.793       ,9.788         ,0.694                     
0    ,1    ,512  ,127      ,224  ,23   ,11.314      ,17.583        ,0.643                     
0    ,1    ,512  ,127      ,240  ,0    ,5.424       ,9.777         ,0.555                     
0    ,1    ,512  ,127      ,240  ,23   ,11.273      ,17.824        ,0.632                     
0    ,1    ,512  ,127      ,272  ,0    ,9.427       ,11.25         ,0.838                     
0    ,1    ,512  ,127      ,272  ,23   ,12.878      ,18.038        ,0.714                     
0    ,1    ,512  ,127      ,288  ,0    ,9.145       ,11.191        ,0.817                     
0    ,1    ,512  ,127      ,288  ,23   ,11.651      ,17.873        ,0.652                     
0    ,1    ,512  ,127      ,320  ,0    ,9.542       ,13.189        ,0.723                     
0    ,1    ,512  ,127      ,320  ,23   ,11.951      ,19.499        ,0.613                     
0    ,1    ,512  ,127      ,368  ,0    ,9.8         ,12.866        ,0.762                     
0    ,1    ,512  ,127      ,368  ,23   ,12.844      ,17.906        ,0.717                     
0    ,1    ,512  ,127      ,416  ,0    ,9.116       ,14.445        ,0.631                     
0    ,1    ,512  ,127      ,416  ,23   ,13.135      ,17.839        ,0.736                     
0    ,1    ,512  ,127      ,464  ,0    ,10.514      ,16.266        ,0.646                     
0    ,1    ,512  ,127      ,464  ,23   ,12.92       ,20.069        ,0.644                     
0    ,1    ,512  ,127      ,48   ,0    ,4.724       ,4.276         ,1.105                     
0    ,1    ,512  ,127      ,48   ,23   ,13.223      ,17.628        ,0.75                      
0    ,1    ,512  ,127      ,512  ,0    ,10.688      ,17.722        ,0.603                     
0    ,1    ,512  ,127      ,512  ,23   ,13.09       ,17.489        ,0.748                     
0    ,1    ,512  ,127      ,96   ,0    ,5.706       ,5.566         ,1.025                     
0    ,1    ,512  ,127      ,96   ,23   ,11.866      ,16.929        ,0.701                     
0    ,1    ,544  ,127      ,256  ,0    ,8.801       ,11.243        ,0.783                     
0    ,1    ,544  ,127      ,256  ,23   ,12.377      ,19.524        ,0.634                     
0    ,1    ,560  ,127      ,512  ,0    ,10.123      ,17.939        ,0.564                     
0    ,1    ,560  ,127      ,512  ,23   ,12.011      ,18.084        ,0.664                     
0    ,1    ,6    ,127      ,5    ,0    ,3.015       ,3.022         ,0.998                     
0    ,1    ,6    ,127      ,5    ,23   ,3.209       ,3.154         ,1.017                     
0    ,1    ,608  ,127      ,512  ,0    ,10.31       ,17.63         ,0.585                     
0    ,1    ,608  ,127      ,512  ,23   ,12.612      ,19.844        ,0.636                     
0    ,1    ,64   ,127      ,0    ,0    ,3.148       ,3.098         ,1.016                     
0    ,1    ,64   ,127      ,0    ,23   ,4.838       ,5.521         ,0.876                     
0    ,1    ,64   ,127      ,144  ,0    ,4.957       ,4.097         ,1.21                      
0    ,1    ,64   ,127      ,144  ,23   ,5.025       ,5.769         ,0.871                     
0    ,1    ,64   ,127      ,16   ,0    ,3.122       ,3.068         ,1.018                     
0    ,1    ,64   ,127      ,16   ,23   ,4.587       ,5.319         ,0.862                     
0    ,1    ,64   ,127      ,192  ,0    ,4.981       ,4.271         ,1.166                     
0    ,1    ,64   ,127      ,192  ,23   ,5.032       ,5.623         ,0.895                     
0    ,1    ,64   ,127      ,240  ,0    ,5.006       ,4.177         ,1.198                     
0    ,1    ,64   ,127      ,240  ,23   ,5.006       ,5.673         ,0.882                     
0    ,1    ,64   ,127      ,256  ,0    ,4.981       ,4.175         ,1.193                     
0    ,1    ,64   ,127      ,256  ,23   ,5.032       ,5.549         ,0.907                     
0    ,1    ,64   ,127      ,288  ,0    ,4.957       ,3.979         ,1.246                     
0    ,1    ,64   ,127      ,288  ,23   ,5.109       ,5.632         ,0.907                     
0    ,1    ,64   ,127      ,48   ,0    ,3.091       ,4.308         ,0.717                     
0    ,1    ,64   ,127      ,48   ,23   ,4.452       ,4.805         ,0.927                     
0    ,1    ,64   ,127      ,64   ,0    ,4.974       ,4.264         ,1.166                     
0    ,1    ,64   ,127      ,64   ,23   ,5.098       ,5.947         ,0.857                     
0    ,1    ,64   ,127      ,96   ,0    ,4.957       ,4.204         ,1.179                     
0    ,1    ,64   ,127      ,96   ,23   ,5.031       ,5.785         ,0.87                      
0    ,1    ,656  ,127      ,512  ,0    ,10.199      ,17.635        ,0.578                     
0    ,1    ,656  ,127      ,512  ,23   ,14.606      ,23.294        ,0.627                     
0    ,1    ,7    ,127      ,6    ,0    ,3.041       ,3.046         ,0.998                     
0    ,1    ,7    ,127      ,6    ,23   ,3.177       ,3.213         ,0.989                     
0    ,1    ,704  ,127      ,512  ,0    ,10.761      ,17.58         ,0.612                     
0    ,1    ,704  ,127      ,512  ,23   ,15.862      ,25.677        ,0.618                     
0    ,1    ,736  ,127      ,1024 ,0    ,12.243      ,21.981        ,0.557                     
0    ,1    ,736  ,127      ,1024 ,23   ,14.538      ,22.232        ,0.654                     
0    ,1    ,736  ,127      ,288  ,0    ,9.216       ,11.867        ,0.777                     
0    ,1    ,736  ,127      ,288  ,23   ,13.535      ,21.782        ,0.621                     
0    ,1    ,752  ,127      ,512  ,0    ,10.445      ,17.665        ,0.591                     
0    ,1    ,752  ,127      ,512  ,23   ,16.323      ,23.339        ,0.699                     
0    ,1    ,784  ,127      ,1024 ,0    ,13.184      ,23.673        ,0.557                     
0    ,1    ,784  ,127      ,1024 ,23   ,16.755      ,24.668        ,0.679                     
0    ,1    ,784  ,127      ,240  ,0    ,7.166       ,9.836         ,0.729                     
0    ,1    ,784  ,127      ,240  ,23   ,15.091      ,23.458        ,0.643                     
0    ,1    ,8    ,127      ,7    ,0    ,3.187       ,3.056         ,1.043                     
0    ,1    ,8    ,127      ,7    ,23   ,3.161       ,3.138         ,1.007                     
0    ,1    ,80   ,127      ,128  ,0    ,4.932       ,4.095         ,1.204                     
0    ,1    ,80   ,127      ,128  ,23   ,5.032       ,5.639         ,0.892                     
0    ,1    ,80   ,127      ,32   ,0    ,3.099       ,4.279         ,0.724                     
0    ,1    ,80   ,127      ,32   ,23   ,4.479       ,5.031         ,0.89                      
0    ,1    ,80   ,127      ,48   ,0    ,3.116       ,4.304         ,0.724                     
0    ,1    ,80   ,127      ,48   ,23   ,4.404       ,4.925         ,0.894                     
0    ,1    ,80   ,127      ,64   ,0    ,4.934       ,4.224         ,1.168                     
0    ,1    ,80   ,127      ,64   ,23   ,5.059       ,4.798         ,1.054                     
0    ,1    ,800  ,127      ,512  ,0    ,11.268      ,17.689        ,0.637                     
0    ,1    ,800  ,127      ,512  ,23   ,16.309      ,26.236        ,0.622                     
0    ,1    ,832  ,127      ,1024 ,0    ,14.401      ,25.327        ,0.569                     
0    ,1    ,832  ,127      ,1024 ,23   ,17.302      ,25.021        ,0.692                     
0    ,1    ,832  ,127      ,192  ,0    ,7.306       ,9.64          ,0.758                     
0    ,1    ,832  ,127      ,192  ,23   ,14.672      ,24.988        ,0.587                     
0    ,1    ,880  ,127      ,1024 ,0    ,14.098      ,25.311        ,0.557                     
0    ,1    ,880  ,127      ,1024 ,23   ,16.511      ,25.174        ,0.656                     
0    ,1    ,880  ,127      ,144  ,0    ,6.144       ,8.067         ,0.762                     
0    ,1    ,880  ,127      ,144  ,23   ,15.766      ,25.381        ,0.621                     
0    ,1    ,9    ,127      ,8    ,0    ,2.987       ,3.049         ,0.98                      
0    ,1    ,9    ,127      ,8    ,23   ,3.161       ,3.146         ,1.005                     
0    ,1    ,928  ,127      ,1024 ,0    ,15.142      ,27.182        ,0.557                     
0    ,1    ,928  ,127      ,1024 ,23   ,18.63       ,27.478        ,0.678                     
0    ,1    ,928  ,127      ,96   ,0    ,6.498       ,5.501         ,1.181                     
0    ,1    ,928  ,127      ,96   ,23   ,17.987      ,26.647        ,0.675                     
0    ,1    ,96   ,127      ,80   ,0    ,5.009       ,4.094         ,1.223                     
0    ,1    ,96   ,127      ,80   ,23   ,5.056       ,6.327         ,0.799                     
0    ,1    ,976  ,127      ,1024 ,0    ,15.793      ,28.863        ,0.547                     
0    ,1    ,976  ,127      ,1024 ,23   ,18.728      ,29.107        ,0.643                     
0    ,1    ,976  ,127      ,48   ,0    ,4.366       ,4.255         ,1.026                     
0    ,1    ,976  ,127      ,48   ,23   ,19.248      ,29.085        ,0.662                     
0    ,16   ,1    ,127      ,0    ,23   ,3.178       ,3.115         ,1.02                      
0    ,16   ,10   ,127      ,9    ,23   ,3.175       ,3.171         ,1.001                     
0    ,16   ,1024 ,127      ,0    ,23   ,21.03       ,32.617        ,0.645                     
0    ,16   ,1024 ,127      ,1024 ,23   ,23.634      ,43.183        ,0.547                     
0    ,16   ,1024 ,127      ,144  ,23   ,18.486      ,31.015        ,0.596                     
0    ,16   ,1024 ,127      ,192  ,23   ,18.879      ,32.072        ,0.589                     
0    ,16   ,1024 ,127      ,240  ,23   ,17.554      ,31.078        ,0.565                     
0    ,16   ,1024 ,127      ,288  ,23   ,20.114      ,31.785        ,0.633                     
0    ,16   ,1024 ,127      ,48   ,23   ,20.98       ,31.748        ,0.661                     
0    ,16   ,1024 ,127      ,736  ,23   ,21.027      ,36.03         ,0.584                     
0    ,16   ,1024 ,127      ,784  ,23   ,21.708      ,37.96         ,0.572                     
0    ,16   ,1024 ,127      ,832  ,23   ,20.724      ,38.599        ,0.537                     
0    ,16   ,1024 ,127      ,880  ,23   ,21.531      ,38.464        ,0.56                      
0    ,16   ,1024 ,127      ,928  ,23   ,22.492      ,40.187        ,0.56                      
0    ,16   ,1024 ,127      ,96   ,23   ,18.992      ,29.835        ,0.637                     
0    ,16   ,1024 ,127      ,976  ,23   ,22.148      ,41.162        ,0.538                     
0    ,16   ,1072 ,127      ,1024 ,23   ,21.906      ,41.764        ,0.525                     
0    ,16   ,11   ,127      ,10   ,23   ,3.161       ,3.113         ,1.015                     
0    ,16   ,112  ,127      ,144  ,23   ,5.042       ,5.202         ,0.969                     
0    ,16   ,112  ,127      ,16   ,23   ,5.289       ,6.177         ,0.856                     
0    ,16   ,112  ,127      ,256  ,23   ,5.031       ,4.957         ,1.015                     
0    ,16   ,112  ,127      ,64   ,23   ,4.96        ,6.004         ,0.826                     
0    ,16   ,112  ,127      ,96   ,23   ,5.056       ,5.224         ,0.968                     
0    ,16   ,1120 ,127      ,1024 ,23   ,22.463      ,43.551        ,0.516                     
0    ,16   ,1168 ,127      ,1024 ,23   ,24.796      ,44.674        ,0.555                     
0    ,16   ,12   ,127      ,11   ,23   ,3.18        ,3.13          ,1.016                     
0    ,16   ,1216 ,127      ,1024 ,23   ,25.978      ,46.24         ,0.562                     
0    ,16   ,1264 ,127      ,1024 ,23   ,25.774      ,46.273        ,0.557                     
0    ,16   ,128  ,127      ,0    ,23   ,5.984       ,8.405         ,0.712                     
0    ,16   ,128  ,127      ,112  ,23   ,5.738       ,8.347         ,0.687                     
0    ,16   ,128  ,127      ,128  ,23   ,5.443       ,7.079         ,0.769                     
0    ,16   ,128  ,127      ,144  ,23   ,5.337       ,7.081         ,0.754                     
0    ,16   ,128  ,127      ,192  ,23   ,5.295       ,7.116         ,0.744                     
0    ,16   ,128  ,127      ,240  ,23   ,6.962       ,7.045         ,0.988                     
0    ,16   ,128  ,127      ,288  ,23   ,6.599       ,7.011         ,0.941                     
0    ,16   ,128  ,127      ,32   ,23   ,5.669       ,8.216         ,0.69                      
0    ,16   ,128  ,127      ,48   ,23   ,5.696       ,8.064         ,0.706                     
0    ,16   ,128  ,127      ,80   ,23   ,5.422       ,8.103         ,0.669                     
0    ,16   ,128  ,127      ,96   ,23   ,5.229       ,7.084         ,0.738                     
0    ,16   ,13   ,127      ,12   ,23   ,3.193       ,3.145         ,1.015                     
0    ,16   ,1312 ,127      ,1024 ,23   ,26.92       ,47.544        ,0.566                     
0    ,16   ,14   ,127      ,13   ,23   ,3.177       ,3.121         ,1.018                     
0    ,16   ,144  ,127      ,128  ,23   ,6.203       ,8.199         ,0.757                     
0    ,16   ,15   ,127      ,14   ,23   ,3.152       ,3.129         ,1.007                     
0    ,16   ,16   ,127      ,0    ,23   ,3.307       ,3.13          ,1.057                     
0    ,16   ,16   ,127      ,144  ,23   ,3.246       ,3.131         ,1.037                     
0    ,16   ,16   ,127      ,15   ,23   ,3.145       ,3.137         ,1.003                     
0    ,16   ,16   ,127      ,16   ,23   ,3.26        ,3.145         ,1.036                     
0    ,16   ,16   ,127      ,192  ,23   ,3.304       ,3.129         ,1.056                     
0    ,16   ,16   ,127      ,240  ,23   ,3.162       ,3.107         ,1.018                     
0    ,16   ,16   ,127      ,256  ,23   ,3.773       ,3.72          ,1.014                     
0    ,16   ,16   ,127      ,288  ,23   ,4.254       ,3.759         ,1.132                     
0    ,16   ,16   ,127      ,48   ,23   ,3.175       ,3.138         ,1.012                     
0    ,16   ,16   ,127      ,64   ,23   ,3.341       ,3.13          ,1.068                     
0    ,16   ,16   ,127      ,96   ,23   ,3.248       ,3.146         ,1.033                     
0    ,16   ,160  ,127      ,144  ,23   ,5.97        ,8.205         ,0.728                     
0    ,16   ,160  ,127      ,16   ,23   ,5.982       ,10.111        ,0.592                     
0    ,16   ,160  ,127      ,256  ,23   ,5.179       ,8.168         ,0.634                     
0    ,16   ,160  ,127      ,64   ,23   ,5.473       ,8.078         ,0.677                     
0    ,16   ,160  ,127      ,96   ,23   ,5.106       ,7.049         ,0.724                     
0    ,16   ,17   ,127      ,16   ,23   ,3.214       ,3.129         ,1.027                     
0    ,16   ,176  ,127      ,128  ,23   ,5.143       ,8.206         ,0.627                     
0    ,16   ,176  ,127      ,160  ,23   ,5.422       ,8.171         ,0.664                     
0    ,16   ,176  ,127      ,32   ,23   ,5.656       ,8.191         ,0.691                     
0    ,16   ,1760 ,127      ,2048 ,23   ,35.492      ,56.611        ,0.627                     
0    ,16   ,1760 ,127      ,288  ,23   ,28.885      ,48.89         ,0.591                     
0    ,16   ,18   ,127      ,17   ,23   ,3.161       ,3.145         ,1.005                     
0    ,16   ,1808 ,127      ,2048 ,23   ,36.698      ,70.045        ,0.524                     
0    ,16   ,1808 ,127      ,240  ,23   ,29.66       ,52.863        ,0.561                     
0    ,16   ,1856 ,127      ,192  ,23   ,30.6        ,53.677        ,0.57                      
0    ,16   ,1856 ,127      ,2048 ,23   ,37.172      ,61.444        ,0.605                     
0    ,16   ,19   ,127      ,18   ,23   ,3.182       ,3.146         ,1.012                     
0    ,16   ,1904 ,127      ,144  ,23   ,29.959      ,51.087        ,0.586                     
0    ,16   ,1904 ,127      ,2048 ,23   ,38.236      ,61.262        ,0.624                     
0    ,16   ,192  ,127      ,176  ,23   ,5.827       ,9.934         ,0.587                     
0    ,16   ,1952 ,127      ,2048 ,23   ,39.716      ,70.817        ,0.561                     
0    ,16   ,1952 ,127      ,96   ,23   ,32.285      ,74.592        ,0.433                     
0    ,16   ,2    ,127      ,1    ,23   ,3.145       ,3.128         ,1.005                     
0    ,16   ,20   ,127      ,19   ,23   ,3.173       ,3.13          ,1.014                     
0    ,16   ,2000 ,127      ,2048 ,23   ,39.806      ,70.894        ,0.561                     
0    ,16   ,2000 ,127      ,48   ,23   ,34.576      ,68.146        ,0.507                     
0    ,16   ,2048 ,127      ,0    ,23   ,35.847      ,74.808        ,0.479                     
0    ,16   ,2048 ,127      ,1024 ,23   ,39.274      ,67.11         ,0.585                     
0    ,16   ,2048 ,127      ,128  ,23   ,33.458      ,55.192        ,0.606                     
0    ,16   ,2048 ,127      ,144  ,23   ,34.526      ,56.472        ,0.611                     
0    ,16   ,2048 ,127      ,1760 ,23   ,39.921      ,66.213        ,0.603                     
0    ,16   ,2048 ,127      ,1808 ,23   ,45.015      ,70.109        ,0.642                     
0    ,16   ,2048 ,127      ,1856 ,23   ,45.023      ,69.55         ,0.647                     
0    ,16   ,2048 ,127      ,1904 ,23   ,44.574      ,68.752        ,0.648                     
0    ,16   ,2048 ,127      ,192  ,23   ,33.805      ,73.112        ,0.462                     
0    ,16   ,2048 ,127      ,1952 ,23   ,42.448      ,68.393        ,0.621                     
0    ,16   ,2048 ,127      ,2000 ,23   ,48.733      ,70.232        ,0.694                     
0    ,16   ,2048 ,127      ,2048 ,23   ,46.714      ,77.943        ,0.599                     
0    ,16   ,2048 ,127      ,240  ,23   ,33.05       ,76.675        ,0.431                     
0    ,16   ,2048 ,127      ,256  ,23   ,34.01       ,73.68         ,0.462                     
0    ,16   ,2048 ,127      ,288  ,23   ,34.587      ,75.347        ,0.459                     
0    ,16   ,2048 ,127      ,32   ,23   ,35.275      ,70.797        ,0.498                     
0    ,16   ,2048 ,127      ,4096 ,23   ,38.113      ,60.415        ,0.631                     
0    ,16   ,2048 ,127      ,48   ,23   ,35.108      ,70.415        ,0.499                     
0    ,16   ,2048 ,127      ,512  ,23   ,35.473      ,60.246        ,0.589                     
0    ,16   ,2048 ,127      ,64   ,23   ,34.609      ,71.716        ,0.483                     
0    ,16   ,2048 ,127      ,96   ,23   ,33.815      ,76.222        ,0.444                     
0    ,16   ,208  ,127      ,16   ,23   ,6.825       ,11.993        ,0.569                     
0    ,16   ,208  ,127      ,192  ,23   ,5.085       ,10.389        ,0.489                     
0    ,16   ,208  ,127      ,256  ,23   ,5.216       ,10.3          ,0.506                     
0    ,16   ,208  ,127      ,48   ,23   ,6.97        ,9.8           ,0.711                     
0    ,16   ,208  ,127      ,64   ,23   ,5.983       ,9.901         ,0.604                     
0    ,16   ,2096 ,127      ,2048 ,23   ,41.568      ,73.24         ,0.568                     
0    ,16   ,21   ,127      ,20   ,23   ,3.161       ,3.16          ,1.0                       
0    ,16   ,2144 ,127      ,2048 ,23   ,49.168      ,71.646        ,0.686                     
0    ,16   ,2192 ,127      ,2048 ,23   ,44.934      ,76.146        ,0.59                      
0    ,16   ,22   ,127      ,21   ,23   ,3.145       ,3.161         ,0.995                     
0    ,16   ,224  ,127      ,128  ,23   ,5.726       ,10.327        ,0.554                     
0    ,16   ,224  ,127      ,208  ,23   ,5.298       ,10.397        ,0.51                      
0    ,16   ,224  ,127      ,288  ,23   ,6.613       ,10.518        ,0.629                     
0    ,16   ,224  ,127      ,32   ,23   ,7.196       ,9.892         ,0.727                     
0    ,16   ,224  ,127      ,512  ,23   ,5.99        ,11.037        ,0.543                     
0    ,16   ,2240 ,127      ,2048 ,23   ,45.542      ,73.966        ,0.616                     
0    ,16   ,2288 ,127      ,2048 ,23   ,48.387      ,76.458        ,0.633                     
0    ,16   ,23   ,127      ,22   ,23   ,3.144       ,3.146         ,1.0                       
0    ,16   ,2336 ,127      ,2048 ,23   ,50.185      ,76.054        ,0.66                      
0    ,16   ,24   ,127      ,23   ,23   ,3.161       ,3.171         ,0.997                     
0    ,16   ,240  ,127      ,224  ,23   ,5.351       ,10.344        ,0.517                     
0    ,16   ,25   ,127      ,24   ,23   ,3.153       ,3.152         ,1.0                       
0    ,16   ,256  ,127      ,0    ,23   ,9.623       ,11.924        ,0.807                     
0    ,16   ,256  ,127      ,112  ,23   ,9.269       ,10.427        ,0.889                     
0    ,16   ,256  ,127      ,144  ,23   ,7.465       ,11.97         ,0.624                     
0    ,16   ,256  ,127      ,16   ,23   ,9.781       ,12.62         ,0.775                     
0    ,16   ,256  ,127      ,160  ,23   ,8.275       ,11.177        ,0.74                      
0    ,16   ,256  ,127      ,192  ,23   ,7.05        ,13.24         ,0.532                     
0    ,16   ,256  ,127      ,208  ,23   ,6.829       ,13.588        ,0.503                     
0    ,16   ,256  ,127      ,240  ,23   ,7.194       ,12.471        ,0.577                     
0    ,16   ,256  ,127      ,256  ,23   ,7.23        ,12.824        ,0.564                     
0    ,16   ,256  ,127      ,288  ,23   ,7.207       ,12.308        ,0.586                     
0    ,16   ,256  ,127      ,48   ,23   ,8.624       ,11.379        ,0.758                     
0    ,16   ,256  ,127      ,64   ,23   ,8.494       ,11.648        ,0.729                     
0    ,16   ,256  ,127      ,96   ,23   ,8.546       ,10.387        ,0.823                     
0    ,16   ,26   ,127      ,25   ,23   ,3.179       ,3.144         ,1.011                     
0    ,16   ,27   ,127      ,26   ,23   ,3.13        ,3.137         ,0.998                     
0    ,16   ,272  ,127      ,128  ,23   ,7.369       ,12.033        ,0.612                     
0    ,16   ,272  ,127      ,240  ,23   ,8.258       ,12.267        ,0.673                     
0    ,16   ,272  ,127      ,256  ,23   ,8.719       ,13.187        ,0.661                     
0    ,16   ,272  ,127      ,32   ,23   ,8.724       ,11.585        ,0.753                     
0    ,16   ,272  ,127      ,512  ,23   ,8.646       ,13.317        ,0.649                     
0    ,16   ,28   ,127      ,27   ,23   ,3.161       ,3.106         ,1.018                     
0    ,16   ,288  ,127      ,272  ,23   ,8.143       ,13.183        ,0.618                     
0    ,16   ,29   ,127      ,28   ,23   ,3.145       ,3.153         ,0.998                     
0    ,16   ,3    ,127      ,2    ,23   ,3.177       ,3.123         ,1.017                     
0    ,16   ,30   ,127      ,29   ,23   ,3.161       ,3.129         ,1.01                      
0    ,16   ,304  ,127      ,16   ,23   ,8.854       ,11.821        ,0.749                     
0    ,16   ,304  ,127      ,256  ,23   ,8.491       ,13.005        ,0.653                     
0    ,16   ,304  ,127      ,64   ,23   ,8.11        ,11.547        ,0.702                     
0    ,16   ,31   ,127      ,30   ,23   ,3.28        ,3.226         ,1.017                     
0    ,16   ,32   ,127      ,0    ,23   ,3.26        ,4.041         ,0.807                     
0    ,16   ,32   ,127      ,128  ,23   ,3.196       ,4.403         ,0.726                     
0    ,16   ,32   ,127      ,144  ,23   ,3.103       ,4.237         ,0.732                     
0    ,16   ,32   ,127      ,16   ,23   ,3.145       ,4.265         ,0.737                     
0    ,16   ,32   ,127      ,192  ,23   ,3.072       ,4.104         ,0.749                     
0    ,16   ,32   ,127      ,240  ,23   ,3.129       ,4.083         ,0.766                     
0    ,16   ,32   ,127      ,288  ,23   ,3.1         ,4.218         ,0.735                     
0    ,16   ,32   ,127      ,31   ,23   ,3.243       ,4.127         ,0.786                     
0    ,16   ,32   ,127      ,32   ,23   ,3.325       ,4.151         ,0.801                     
0    ,16   ,32   ,127      ,48   ,23   ,3.161       ,3.949         ,0.8                       
0    ,16   ,32   ,127      ,96   ,23   ,3.123       ,4.091         ,0.763                     
0    ,16   ,320  ,127      ,128  ,23   ,7.368       ,13.653        ,0.54                      
0    ,16   ,320  ,127      ,192  ,23   ,8.2         ,15.602        ,0.526                     
0    ,16   ,320  ,127      ,32   ,23   ,8.527       ,13.259        ,0.643                     
0    ,16   ,320  ,127      ,512  ,23   ,9.268       ,14.598        ,0.635                     
0    ,16   ,352  ,127      ,256  ,23   ,9.0         ,15.637        ,0.576                     
0    ,16   ,352  ,127      ,64   ,23   ,8.15        ,13.307        ,0.612                     
0    ,16   ,368  ,127      ,128  ,23   ,8.383       ,13.626        ,0.615                     
0    ,16   ,368  ,127      ,144  ,23   ,8.738       ,13.402        ,0.652                     
0    ,16   ,368  ,127      ,512  ,23   ,8.013       ,15.175        ,0.528                     
0    ,16   ,4    ,127      ,3    ,23   ,3.185       ,3.178         ,1.002                     
0    ,16   ,400  ,127      ,256  ,23   ,9.876       ,17.022        ,0.58                      
0    ,16   ,416  ,127      ,128  ,23   ,9.353       ,14.737        ,0.635                     
0    ,16   ,416  ,127      ,512  ,23   ,11.28       ,17.477        ,0.645                     
0    ,16   ,416  ,127      ,96   ,23   ,11.276      ,13.553        ,0.832                     
0    ,16   ,448  ,127      ,256  ,23   ,10.361      ,18.149        ,0.571                     
0    ,16   ,464  ,127      ,48   ,23   ,10.945      ,16.122        ,0.679                     
0    ,16   ,464  ,127      ,512  ,23   ,12.365      ,20.385        ,0.607                     
0    ,16   ,48   ,127      ,32   ,23   ,3.161       ,4.385         ,0.721                     
0    ,16   ,496  ,127      ,256  ,23   ,11.126      ,18.731        ,0.594                     
0    ,16   ,5    ,127      ,4    ,23   ,3.252       ,3.129         ,1.039                     
0    ,16   ,512  ,127      ,0    ,23   ,13.419      ,18.116        ,0.741                     
0    ,16   ,512  ,127      ,144  ,23   ,12.206      ,18.023        ,0.677                     
0    ,16   ,512  ,127      ,192  ,23   ,11.357      ,18.959        ,0.599                     
0    ,16   ,512  ,127      ,224  ,23   ,11.465      ,18.068        ,0.635                     
0    ,16   ,512  ,127      ,240  ,23   ,10.725      ,18.228        ,0.588                     
0    ,16   ,512  ,127      ,272  ,23   ,11.838      ,19.515        ,0.607                     
0    ,16   ,512  ,127      ,288  ,23   ,12.537      ,18.75         ,0.669                     
0    ,16   ,512  ,127      ,320  ,23   ,11.995      ,20.802        ,0.577                     
0    ,16   ,512  ,127      ,368  ,23   ,12.626      ,19.625        ,0.643                     
0    ,16   ,512  ,127      ,416  ,23   ,13.4        ,20.43         ,0.656                     
0    ,16   ,512  ,127      ,464  ,23   ,13.757      ,22.155        ,0.621                     
0    ,16   ,512  ,127      ,48   ,23   ,13.391      ,17.544        ,0.763                     
0    ,16   ,512  ,127      ,512  ,23   ,13.293      ,21.231        ,0.626                     
0    ,16   ,512  ,127      ,96   ,23   ,12.955      ,18.117        ,0.715                     
0    ,16   ,544  ,127      ,256  ,23   ,12.47       ,19.929        ,0.626                     
0    ,16   ,560  ,127      ,512  ,23   ,13.334      ,22.265        ,0.599                     
0    ,16   ,6    ,127      ,5    ,23   ,3.177       ,3.13          ,1.015                     
0    ,16   ,608  ,127      ,512  ,23   ,13.605      ,24.624        ,0.552                     
0    ,16   ,64   ,127      ,0    ,23   ,4.787       ,5.604         ,0.854                     
0    ,16   ,64   ,127      ,144  ,23   ,4.546       ,4.818         ,0.943                     
0    ,16   ,64   ,127      ,16   ,23   ,4.78        ,5.597         ,0.854                     
0    ,16   ,64   ,127      ,192  ,23   ,4.368       ,4.587         ,0.952                     
0    ,16   ,64   ,127      ,240  ,23   ,4.382       ,4.669         ,0.938                     
0    ,16   ,64   ,127      ,256  ,23   ,4.515       ,4.962         ,0.91                      
0    ,16   ,64   ,127      ,288  ,23   ,5.0         ,4.761         ,1.05                      
0    ,16   ,64   ,127      ,48   ,23   ,4.483       ,4.717         ,0.95                      
0    ,16   ,64   ,127      ,64   ,23   ,4.724       ,4.877         ,0.969                     
0    ,16   ,64   ,127      ,96   ,23   ,4.591       ,4.761         ,0.964                     
0    ,16   ,656  ,127      ,512  ,23   ,15.683      ,26.327        ,0.596                     
0    ,16   ,7    ,127      ,6    ,23   ,3.176       ,3.138         ,1.012                     
0    ,16   ,704  ,127      ,512  ,23   ,14.907      ,27.812        ,0.536                     
0    ,16   ,736  ,127      ,1024 ,23   ,15.953      ,29.405        ,0.543                     
0    ,16   ,736  ,127      ,288  ,23   ,13.63       ,23.485        ,0.58                      
0    ,16   ,752  ,127      ,512  ,23   ,15.665      ,27.982        ,0.56                      
0    ,16   ,784  ,127      ,1024 ,23   ,17.003      ,32.086        ,0.53                      
0    ,16   ,784  ,127      ,240  ,23   ,14.689      ,24.693        ,0.595                     
0    ,16   ,8    ,127      ,7    ,23   ,3.145       ,3.144         ,1.0                       
0    ,16   ,80   ,127      ,128  ,23   ,5.009       ,4.162         ,1.204                     
0    ,16   ,80   ,127      ,32   ,23   ,4.546       ,4.87          ,0.934                     
0    ,16   ,80   ,127      ,48   ,23   ,4.621       ,4.966         ,0.931                     
0    ,16   ,80   ,127      ,64   ,23   ,5.06        ,4.314         ,1.173                     
0    ,16   ,800  ,127      ,512  ,23   ,18.055      ,29.272        ,0.617                     
0    ,16   ,832  ,127      ,1024 ,23   ,17.744      ,34.476        ,0.515                     
0    ,16   ,832  ,127      ,192  ,23   ,14.77       ,27.017        ,0.547                     
0    ,16   ,880  ,127      ,1024 ,23   ,19.394      ,34.054        ,0.57                      
0    ,16   ,880  ,127      ,144  ,23   ,15.364      ,25.505        ,0.602                     
0    ,16   ,9    ,127      ,8    ,23   ,3.161       ,3.137         ,1.008                     
0    ,16   ,928  ,127      ,1024 ,23   ,19.455      ,36.575        ,0.532                     
0    ,16   ,928  ,127      ,96   ,23   ,18.127      ,26.639        ,0.68                      
0    ,16   ,96   ,127      ,80   ,23   ,5.057       ,5.975         ,0.846                     
0    ,16   ,976  ,127      ,1024 ,23   ,20.724      ,39.059        ,0.531                     
0    ,16   ,976  ,127      ,48   ,23   ,18.825      ,28.964        ,0.65                      
0    ,256  ,1    ,127      ,0    ,23   ,3.145       ,3.111         ,1.011                     
0    ,256  ,10   ,127      ,9    ,23   ,3.147       ,3.162         ,0.995                     
0    ,256  ,1024 ,127      ,0    ,23   ,20.995      ,32.521        ,0.646                     
0    ,256  ,1024 ,127      ,1024 ,23   ,23.193      ,42.593        ,0.545                     
0    ,256  ,1024 ,127      ,144  ,23   ,18.952      ,31.567        ,0.6                       
0    ,256  ,1024 ,127      ,192  ,23   ,18.257      ,33.915        ,0.538                     
0    ,256  ,1024 ,127      ,240  ,23   ,18.035      ,31.351        ,0.575                     
0    ,256  ,1024 ,127      ,288  ,23   ,19.344      ,34.431        ,0.562                     
0    ,256  ,1024 ,127      ,48   ,23   ,20.982      ,31.888        ,0.658                     
0    ,256  ,1024 ,127      ,736  ,23   ,20.297      ,35.2          ,0.577                     
0    ,256  ,1024 ,127      ,784  ,23   ,21.678      ,38.066        ,0.569                     
0    ,256  ,1024 ,127      ,832  ,23   ,21.415      ,44.494        ,0.481                     
0    ,256  ,1024 ,127      ,880  ,23   ,21.383      ,42.537        ,0.503                     
0    ,256  ,1024 ,127      ,928  ,23   ,22.856      ,39.22         ,0.583                     
0    ,256  ,1024 ,127      ,96   ,23   ,19.732      ,30.802        ,0.641                     
0    ,256  ,1024 ,127      ,976  ,23   ,22.649      ,41.739        ,0.543                     
0    ,256  ,1072 ,127      ,1024 ,23   ,22.346      ,41.278        ,0.541                     
0    ,256  ,11   ,127      ,10   ,23   ,3.145       ,3.139         ,1.002                     
0    ,256  ,112  ,127      ,144  ,23   ,4.959       ,4.934         ,1.005                     
0    ,256  ,112  ,127      ,16   ,23   ,4.665       ,6.018         ,0.775                     
0    ,256  ,112  ,127      ,256  ,23   ,4.957       ,5.05          ,0.982                     
0    ,256  ,112  ,127      ,64   ,23   ,5.033       ,5.845         ,0.861                     
0    ,256  ,112  ,127      ,96   ,23   ,5.057       ,5.126         ,0.986                     
0    ,256  ,1120 ,127      ,1024 ,23   ,23.281      ,45.061        ,0.517                     
0    ,256  ,1168 ,127      ,1024 ,23   ,26.091      ,48.367        ,0.539                     
0    ,256  ,12   ,127      ,11   ,23   ,3.145       ,3.114         ,1.01                      
0    ,256  ,1216 ,127      ,1024 ,23   ,26.159      ,53.35         ,0.49                      
0    ,256  ,1264 ,127      ,1024 ,23   ,25.924      ,48.606        ,0.533                     
0    ,256  ,128  ,127      ,0    ,23   ,6.408       ,8.333         ,0.769                     
0    ,256  ,128  ,127      ,112  ,23   ,5.176       ,8.366         ,0.619                     
0    ,256  ,128  ,127      ,128  ,23   ,5.258       ,7.04          ,0.747                     
0    ,256  ,128  ,127      ,144  ,23   ,5.229       ,7.166         ,0.73                      
0    ,256  ,128  ,127      ,192  ,23   ,5.448       ,7.081         ,0.769                     
0    ,256  ,128  ,127      ,240  ,23   ,6.203       ,7.073         ,0.877                     
0    ,256  ,128  ,127      ,288  ,23   ,6.745       ,7.082         ,0.953                     
0    ,256  ,128  ,127      ,32   ,23   ,5.907       ,8.168         ,0.723                     
0    ,256  ,128  ,127      ,48   ,23   ,5.724       ,8.108         ,0.706                     
0    ,256  ,128  ,127      ,80   ,23   ,5.767       ,8.201         ,0.703                     
0    ,256  ,128  ,127      ,96   ,23   ,5.132       ,7.085         ,0.724                     
0    ,256  ,13   ,127      ,12   ,23   ,3.153       ,3.113         ,1.013                     
0    ,256  ,1312 ,127      ,1024 ,23   ,27.159      ,47.255        ,0.575                     
0    ,256  ,14   ,127      ,13   ,23   ,3.161       ,3.129         ,1.01                      
0    ,256  ,144  ,127      ,128  ,23   ,5.197       ,8.221         ,0.632                     
0    ,256  ,15   ,127      ,14   ,23   ,3.162       ,3.113         ,1.016                     
0    ,256  ,16   ,127      ,0    ,23   ,3.14        ,3.135         ,1.002                     
0    ,256  ,16   ,127      ,144  ,23   ,3.145       ,3.13          ,1.005                     
0    ,256  ,16   ,127      ,15   ,23   ,3.145       ,3.098         ,1.015                     
0    ,256  ,16   ,127      ,16   ,23   ,3.151       ,3.114         ,1.012                     
0    ,256  ,16   ,127      ,192  ,23   ,3.166       ,3.113         ,1.017                     
0    ,256  ,16   ,127      ,240  ,23   ,3.13        ,3.156         ,0.992                     
0    ,256  ,16   ,127      ,256  ,23   ,3.097       ,3.159         ,0.981                     
0    ,256  ,16   ,127      ,288  ,23   ,3.541       ,3.113         ,1.137                     
0    ,256  ,16   ,127      ,48   ,23   ,3.162       ,3.13          ,1.01                      
0    ,256  ,16   ,127      ,64   ,23   ,3.161       ,3.129         ,1.01                      
0    ,256  ,16   ,127      ,96   ,23   ,3.145       ,3.146         ,1.0                       
0    ,256  ,160  ,127      ,144  ,23   ,5.198       ,8.201         ,0.634                     
0    ,256  ,160  ,127      ,16   ,23   ,5.727       ,10.25         ,0.559                     
0    ,256  ,160  ,127      ,256  ,23   ,5.119       ,8.135         ,0.629                     
0    ,256  ,160  ,127      ,64   ,23   ,5.675       ,8.1           ,0.701                     
0    ,256  ,160  ,127      ,96   ,23   ,5.398       ,7.016         ,0.769                     
0    ,256  ,17   ,127      ,16   ,23   ,3.162       ,3.138         ,1.008                     
0    ,256  ,176  ,127      ,128  ,23   ,5.311       ,8.28          ,0.641                     
0    ,256  ,176  ,127      ,160  ,23   ,5.222       ,8.238         ,0.634                     
0    ,256  ,176  ,127      ,32   ,23   ,5.722       ,8.181         ,0.699                     
0    ,256  ,1760 ,127      ,2048 ,23   ,34.608      ,89.776        ,0.385                     
0    ,256  ,1760 ,127      ,288  ,23   ,28.533      ,48.166        ,0.592                     
0    ,256  ,18   ,127      ,17   ,23   ,3.203       ,3.152         ,1.016                     
0    ,256  ,1808 ,127      ,2048 ,23   ,35.887      ,91.709        ,0.391                     
0    ,256  ,1808 ,127      ,240  ,23   ,28.671      ,49.654        ,0.577                     
0    ,256  ,1856 ,127      ,192  ,23   ,29.022      ,52.13         ,0.557                     
0    ,256  ,1856 ,127      ,2048 ,23   ,36.931      ,93.721        ,0.394                     
0    ,256  ,19   ,127      ,18   ,23   ,3.137       ,3.161         ,0.992                     
0    ,256  ,1904 ,127      ,144  ,23   ,29.576      ,49.294        ,0.6                       
0    ,256  ,1904 ,127      ,2048 ,23   ,36.626      ,94.523        ,0.387                     
0    ,256  ,192  ,127      ,176  ,23   ,6.107       ,9.917         ,0.616                     
0    ,256  ,1952 ,127      ,2048 ,23   ,38.931      ,95.247        ,0.409                     
0    ,256  ,1952 ,127      ,96   ,23   ,32.075      ,71.856        ,0.446                     
0    ,256  ,2    ,127      ,1    ,23   ,3.182       ,3.155         ,1.009                     
0    ,256  ,20   ,127      ,19   ,23   ,3.13        ,3.138         ,0.997                     
0    ,256  ,2000 ,127      ,2048 ,23   ,39.058      ,97.955        ,0.399                     
0    ,256  ,2000 ,127      ,48   ,23   ,32.114      ,67.92         ,0.473                     
0    ,256  ,2048 ,127      ,0    ,23   ,34.208      ,74.932        ,0.457                     
0    ,256  ,2048 ,127      ,1024 ,23   ,37.833      ,66.225        ,0.571                     
0    ,256  ,2048 ,127      ,128  ,23   ,33.786      ,55.182        ,0.612                     
0    ,256  ,2048 ,127      ,144  ,23   ,32.879      ,54.88         ,0.599                     
0    ,256  ,2048 ,127      ,1760 ,23   ,39.653      ,96.363        ,0.411                     
0    ,256  ,2048 ,127      ,1808 ,23   ,41.037      ,95.344        ,0.43                      
0    ,256  ,2048 ,127      ,1856 ,23   ,40.334      ,97.052        ,0.416                     
0    ,256  ,2048 ,127      ,1904 ,23   ,40.043      ,98.632        ,0.406                     
0    ,256  ,2048 ,127      ,192  ,23   ,32.42       ,69.851        ,0.464                     
0    ,256  ,2048 ,127      ,1952 ,23   ,41.346      ,98.624        ,0.419                     
0    ,256  ,2048 ,127      ,2000 ,23   ,44.497      ,99.075        ,0.449                     
0    ,256  ,2048 ,127      ,2048 ,23   ,50.177      ,101.98        ,0.492                     
0    ,256  ,2048 ,127      ,240  ,23   ,31.858      ,64.934        ,0.491                     
0    ,256  ,2048 ,127      ,256  ,23   ,33.747      ,73.774        ,0.457                     
0    ,256  ,2048 ,127      ,288  ,23   ,34.017      ,70.809        ,0.48                      
0    ,256  ,2048 ,127      ,32   ,23   ,34.665      ,71.363        ,0.486                     
0    ,256  ,2048 ,127      ,4096 ,23   ,40.907      ,99.14         ,0.413                     
0    ,256  ,2048 ,127      ,48   ,23   ,34.625      ,72.867        ,0.475                     
0    ,256  ,2048 ,127      ,512  ,23   ,35.083      ,60.616        ,0.579                     
0    ,256  ,2048 ,127      ,64   ,23   ,33.65       ,70.549        ,0.477                     
0    ,256  ,2048 ,127      ,96   ,23   ,33.793      ,75.736        ,0.446                     
0    ,256  ,208  ,127      ,16   ,23   ,6.838       ,11.802        ,0.579                     
0    ,256  ,208  ,127      ,192  ,23   ,5.147       ,10.302        ,0.5                       
0    ,256  ,208  ,127      ,256  ,23   ,4.952       ,10.514        ,0.471                     
0    ,256  ,208  ,127      ,48   ,23   ,6.404       ,9.814         ,0.653                     
0    ,256  ,208  ,127      ,64   ,23   ,6.119       ,9.818         ,0.623                     
0    ,256  ,2096 ,127      ,2048 ,23   ,43.082      ,99.885        ,0.431                     
0    ,256  ,21   ,127      ,20   ,23   ,3.153       ,3.161         ,0.998                     
0    ,256  ,2144 ,127      ,2048 ,23   ,44.896      ,99.713        ,0.45                      
0    ,256  ,2192 ,127      ,2048 ,23   ,44.061      ,103.953       ,0.424                     
0    ,256  ,22   ,127      ,21   ,23   ,3.161       ,3.157         ,1.001                     
0    ,256  ,224  ,127      ,128  ,23   ,5.684       ,10.595        ,0.536                     
0    ,256  ,224  ,127      ,208  ,23   ,5.738       ,10.349        ,0.554                     
0    ,256  ,224  ,127      ,288  ,23   ,6.29        ,10.206        ,0.616                     
0    ,256  ,224  ,127      ,32   ,23   ,6.692       ,9.961         ,0.672                     
0    ,256  ,224  ,127      ,512  ,23   ,5.972       ,10.199        ,0.586                     
0    ,256  ,2240 ,127      ,2048 ,23   ,45.254      ,105.449       ,0.429                     
0    ,256  ,2288 ,127      ,2048 ,23   ,47.722      ,110.098       ,0.433                     
0    ,256  ,23   ,127      ,22   ,23   ,3.145       ,3.159         ,0.996                     
0    ,256  ,2336 ,127      ,2048 ,23   ,45.988      ,107.339       ,0.428                     
0    ,256  ,24   ,127      ,23   ,23   ,3.145       ,3.137         ,1.002                     
0    ,256  ,240  ,127      ,224  ,23   ,5.063       ,10.327        ,0.49                      
0    ,256  ,25   ,127      ,24   ,23   ,3.179       ,3.129         ,1.016                     
0    ,256  ,256  ,127      ,0    ,23   ,9.808       ,11.888        ,0.825                     
0    ,256  ,256  ,127      ,112  ,23   ,9.119       ,10.736        ,0.849                     
0    ,256  ,256  ,127      ,144  ,23   ,7.8         ,11.884        ,0.656                     
0    ,256  ,256  ,127      ,16   ,23   ,9.935       ,12.56         ,0.791                     
0    ,256  ,256  ,127      ,160  ,23   ,7.897       ,11.49         ,0.687                     
0    ,256  ,256  ,127      ,192  ,23   ,6.937       ,13.168        ,0.527                     
0    ,256  ,256  ,127      ,208  ,23   ,6.821       ,13.191        ,0.517                     
0    ,256  ,256  ,127      ,240  ,23   ,7.795       ,12.363        ,0.631                     
0    ,256  ,256  ,127      ,256  ,23   ,7.324       ,12.444        ,0.589                     
0    ,256  ,256  ,127      ,288  ,23   ,6.86        ,12.339        ,0.556                     
0    ,256  ,256  ,127      ,48   ,23   ,9.216       ,11.493        ,0.802                     
0    ,256  ,256  ,127      ,64   ,23   ,9.632       ,11.584        ,0.831                     
0    ,256  ,256  ,127      ,96   ,23   ,8.64        ,10.337        ,0.836                     
0    ,256  ,26   ,127      ,25   ,23   ,3.153       ,3.139         ,1.004                     
0    ,256  ,27   ,127      ,26   ,23   ,3.143       ,3.154         ,0.997                     
0    ,256  ,272  ,127      ,128  ,23   ,9.022       ,12.021        ,0.75                      
0    ,256  ,272  ,127      ,240  ,23   ,8.432       ,12.27         ,0.687                     
0    ,256  ,272  ,127      ,256  ,23   ,7.897       ,13.253        ,0.596                     
0    ,256  ,272  ,127      ,32   ,23   ,10.314      ,11.56         ,0.892                     
0    ,256  ,272  ,127      ,512  ,23   ,8.091       ,13.26         ,0.61                      
0    ,256  ,28   ,127      ,27   ,23   ,3.153       ,3.137         ,1.005                     
0    ,256  ,288  ,127      ,272  ,23   ,8.222       ,13.111        ,0.627                     
0    ,256  ,29   ,127      ,28   ,23   ,3.145       ,3.137         ,1.002                     
0    ,256  ,3    ,127      ,2    ,23   ,3.145       ,3.12          ,1.008                     
0    ,256  ,30   ,127      ,29   ,23   ,3.154       ,3.129         ,1.008                     
0    ,256  ,304  ,127      ,16   ,23   ,10.148      ,11.965        ,0.848                     
0    ,256  ,304  ,127      ,256  ,23   ,8.439       ,13.122        ,0.643                     
0    ,256  ,304  ,127      ,64   ,23   ,9.707       ,11.612        ,0.836                     
0    ,256  ,31   ,127      ,30   ,23   ,3.145       ,3.118         ,1.009                     
0    ,256  ,32   ,127      ,0    ,23   ,3.294       ,4.254         ,0.774                     
0    ,256  ,32   ,127      ,128  ,23   ,3.129       ,4.216         ,0.742                     
0    ,256  ,32   ,127      ,144  ,23   ,3.146       ,3.93          ,0.8                       
0    ,256  ,32   ,127      ,16   ,23   ,3.144       ,4.445         ,0.707                     
0    ,256  ,32   ,127      ,192  ,23   ,3.146       ,3.951         ,0.796                     
0    ,256  ,32   ,127      ,240  ,23   ,3.146       ,3.928         ,0.801                     
0    ,256  ,32   ,127      ,288  ,23   ,3.145       ,3.89          ,0.808                     
0    ,256  ,32   ,127      ,31   ,23   ,3.153       ,3.998         ,0.789                     
0    ,256  ,32   ,127      ,32   ,23   ,3.139       ,4.265         ,0.736                     
0    ,256  ,32   ,127      ,48   ,23   ,3.145       ,3.91          ,0.804                     
0    ,256  ,32   ,127      ,96   ,23   ,3.145       ,4.16          ,0.756                     
0    ,256  ,320  ,127      ,128  ,23   ,10.429      ,15.385        ,0.678                     
0    ,256  ,320  ,127      ,192  ,23   ,7.607       ,15.432        ,0.493                     
0    ,256  ,320  ,127      ,32   ,23   ,11.133      ,14.572        ,0.764                     
0    ,256  ,320  ,127      ,512  ,23   ,8.932       ,14.624        ,0.611                     
0    ,256  ,352  ,127      ,256  ,23   ,9.886       ,15.646        ,0.632                     
0    ,256  ,352  ,127      ,64   ,23   ,10.65       ,14.572        ,0.731                     
0    ,256  ,368  ,127      ,128  ,23   ,11.401      ,15.301        ,0.745                     
0    ,256  ,368  ,127      ,144  ,23   ,8.264       ,13.702        ,0.603                     
0    ,256  ,368  ,127      ,512  ,23   ,8.616       ,15.144        ,0.569                     
0    ,256  ,4    ,127      ,3    ,23   ,3.162       ,3.114         ,1.015                     
0    ,256  ,400  ,127      ,256  ,23   ,11.687      ,18.239        ,0.641                     
0    ,256  ,416  ,127      ,128  ,23   ,12.144      ,16.428        ,0.739                     
0    ,256  ,416  ,127      ,512  ,23   ,12.079      ,17.753        ,0.68                      
0    ,256  ,416  ,127      ,96   ,23   ,10.491      ,13.483        ,0.778                     
0    ,256  ,448  ,127      ,256  ,23   ,12.594      ,19.867        ,0.634                     
0    ,256  ,464  ,127      ,48   ,23   ,11.254      ,15.945        ,0.706                     
0    ,256  ,464  ,127      ,512  ,23   ,11.854      ,19.922        ,0.595                     
0    ,256  ,48   ,127      ,32   ,23   ,3.145       ,4.451         ,0.707                     
0    ,256  ,496  ,127      ,256  ,23   ,12.836      ,19.879        ,0.646                     
0    ,256  ,5    ,127      ,4    ,23   ,3.145       ,3.159         ,0.995                     
0    ,256  ,512  ,127      ,0    ,23   ,13.748      ,19.369        ,0.71                      
0    ,256  ,512  ,127      ,144  ,23   ,12.617      ,17.936        ,0.703                     
0    ,256  ,512  ,127      ,192  ,23   ,11.4        ,18.98         ,0.601                     
0    ,256  ,512  ,127      ,224  ,23   ,10.899      ,18.326        ,0.595                     
0    ,256  ,512  ,127      ,240  ,23   ,11.663      ,18.585        ,0.628                     
0    ,256  ,512  ,127      ,272  ,23   ,11.804      ,19.468        ,0.606                     
0    ,256  ,512  ,127      ,288  ,23   ,12.152      ,18.828        ,0.645                     
0    ,256  ,512  ,127      ,320  ,23   ,11.961      ,20.233        ,0.591                     
0    ,256  ,512  ,127      ,368  ,23   ,11.43       ,20.012        ,0.571                     
0    ,256  ,512  ,127      ,416  ,23   ,15.23       ,20.898        ,0.729                     
0    ,256  ,512  ,127      ,464  ,23   ,15.202      ,24.027        ,0.633                     
0    ,256  ,512  ,127      ,48   ,23   ,13.519      ,18.997        ,0.712                     
0    ,256  ,512  ,127      ,512  ,23   ,15.203      ,22.853        ,0.665                     
0    ,256  ,512  ,127      ,96   ,23   ,13.257      ,18.363        ,0.722                     
0    ,256  ,544  ,127      ,256  ,23   ,12.716      ,20.915        ,0.608                     
0    ,256  ,560  ,127      ,512  ,23   ,13.536      ,22.471        ,0.602                     
0    ,256  ,6    ,127      ,5    ,23   ,3.195       ,3.145         ,1.016                     
0    ,256  ,608  ,127      ,512  ,23   ,14.538      ,26.111        ,0.557                     
0    ,256  ,64   ,127      ,0    ,23   ,4.958       ,5.599         ,0.885                     
0    ,256  ,64   ,127      ,144  ,23   ,4.425       ,4.659         ,0.95                      
0    ,256  ,64   ,127      ,16   ,23   ,4.393       ,5.397         ,0.814                     
0    ,256  ,64   ,127      ,192  ,23   ,4.635       ,4.647         ,0.997                     
0    ,256  ,64   ,127      ,240  ,23   ,4.425       ,4.841         ,0.914                     
0    ,256  ,64   ,127      ,256  ,23   ,4.385       ,4.909         ,0.893                     
0    ,256  ,64   ,127      ,288  ,23   ,4.409       ,4.756         ,0.927                     
0    ,256  ,64   ,127      ,48   ,23   ,4.423       ,4.751         ,0.931                     
0    ,256  ,64   ,127      ,64   ,23   ,4.554       ,4.837         ,0.942                     
0    ,256  ,64   ,127      ,96   ,23   ,4.424       ,4.761         ,0.929                     
0    ,256  ,656  ,127      ,512  ,23   ,16.779      ,27.434        ,0.612                     
0    ,256  ,7    ,127      ,6    ,23   ,3.145       ,3.114         ,1.01                      
0    ,256  ,704  ,127      ,512  ,23   ,16.6        ,28.904        ,0.574                     
0    ,256  ,736  ,127      ,1024 ,23   ,16.03       ,28.876        ,0.555                     
0    ,256  ,736  ,127      ,288  ,23   ,13.525      ,23.905        ,0.566                     
0    ,256  ,752  ,127      ,512  ,23   ,17.491      ,29.082        ,0.601                     
0    ,256  ,784  ,127      ,1024 ,23   ,17.826      ,32.865        ,0.542                     
0    ,256  ,784  ,127      ,240  ,23   ,14.898      ,24.147        ,0.617                     
0    ,256  ,8    ,127      ,7    ,23   ,3.155       ,3.13          ,1.008                     
0    ,256  ,80   ,127      ,128  ,23   ,5.645       ,4.262         ,1.325                     
0    ,256  ,80   ,127      ,32   ,23   ,4.439       ,4.751         ,0.934                     
0    ,256  ,80   ,127      ,48   ,23   ,5.758       ,4.923         ,1.169                     
0    ,256  ,80   ,127      ,64   ,23   ,5.084       ,4.286         ,1.186                     
0    ,256  ,800  ,127      ,512  ,23   ,18.076      ,30.285        ,0.597                     
0    ,256  ,832  ,127      ,1024 ,23   ,18.708      ,33.198        ,0.564                     
0    ,256  ,832  ,127      ,192  ,23   ,15.115      ,26.679        ,0.567                     
0    ,256  ,880  ,127      ,1024 ,23   ,17.833      ,38.807        ,0.46                      
0    ,256  ,880  ,127      ,144  ,23   ,15.093      ,28.699        ,0.526                     
0    ,256  ,9    ,127      ,8    ,23   ,3.129       ,3.13          ,1.0                       
0    ,256  ,928  ,127      ,1024 ,23   ,19.642      ,36.094        ,0.544                     
0    ,256  ,928  ,127      ,96   ,23   ,17.184      ,26.193        ,0.656                     
0    ,256  ,96   ,127      ,80   ,23   ,5.083       ,5.849         ,0.869                     
0    ,256  ,976  ,127      ,1024 ,23   ,20.744      ,38.766        ,0.535                     
0    ,256  ,976  ,127      ,48   ,23   ,18.3        ,28.799        ,0.635                     
0    ,4    ,1    ,127      ,0    ,23   ,3.189       ,3.123         ,1.021                     
0    ,4    ,10   ,127      ,9    ,23   ,3.416       ,3.146         ,1.086                     
0    ,4    ,1024 ,127      ,0    ,23   ,21.229      ,32.925        ,0.645                     
0    ,4    ,1024 ,127      ,1024 ,23   ,21.881      ,34.081        ,0.642                     
0    ,4    ,1024 ,127      ,144  ,23   ,19.561      ,31.178        ,0.627                     
0    ,4    ,1024 ,127      ,192  ,23   ,18.653      ,32.203        ,0.579                     
0    ,4    ,1024 ,127      ,240  ,23   ,18.409      ,31.112        ,0.592                     
0    ,4    ,1024 ,127      ,288  ,23   ,19.658      ,31.827        ,0.618                     
0    ,4    ,1024 ,127      ,48   ,23   ,21.037      ,32.342        ,0.65                      
0    ,4    ,1024 ,127      ,736  ,23   ,20.337      ,33.524        ,0.607                     
0    ,4    ,1024 ,127      ,784  ,23   ,21.445      ,34.117        ,0.629                     
0    ,4    ,1024 ,127      ,832  ,23   ,21.533      ,34.527        ,0.624                     
0    ,4    ,1024 ,127      ,880  ,23   ,21.687      ,34.633        ,0.626                     
0    ,4    ,1024 ,127      ,928  ,23   ,21.153      ,33.67         ,0.628                     
0    ,4    ,1024 ,127      ,96   ,23   ,20.155      ,29.978        ,0.672                     
0    ,4    ,1024 ,127      ,976  ,23   ,21.678      ,35.654        ,0.608                     
0    ,4    ,1072 ,127      ,1024 ,23   ,21.123      ,34.521        ,0.612                     
0    ,4    ,11   ,127      ,10   ,23   ,3.162       ,3.13          ,1.01                      
0    ,4    ,112  ,127      ,144  ,23   ,5.052       ,5.034         ,1.003                     
0    ,4    ,112  ,127      ,16   ,23   ,4.425       ,5.792         ,0.764                     
0    ,4    ,112  ,127      ,256  ,23   ,5.006       ,6.101         ,0.821                     
0    ,4    ,112  ,127      ,64   ,23   ,5.033       ,5.887         ,0.855                     
0    ,4    ,112  ,127      ,96   ,23   ,5.03        ,4.983         ,1.01                      
0    ,4    ,1120 ,127      ,1024 ,23   ,22.284      ,38.165        ,0.584                     
0    ,4    ,1168 ,127      ,1024 ,23   ,24.774      ,39.761        ,0.623                     
0    ,4    ,12   ,127      ,11   ,23   ,3.145       ,3.164         ,0.994                     
0    ,4    ,1216 ,127      ,1024 ,23   ,23.491      ,40.304        ,0.583                     
0    ,4    ,1264 ,127      ,1024 ,23   ,24.01       ,41.854        ,0.574                     
0    ,4    ,128  ,127      ,0    ,23   ,6.084       ,8.365         ,0.727                     
0    ,4    ,128  ,127      ,112  ,23   ,5.174       ,8.403         ,0.616                     
0    ,4    ,128  ,127      ,128  ,23   ,5.421       ,7.06          ,0.768                     
0    ,4    ,128  ,127      ,144  ,23   ,5.184       ,7.047         ,0.736                     
0    ,4    ,128  ,127      ,192  ,23   ,5.491       ,7.715         ,0.712                     
0    ,4    ,128  ,127      ,240  ,23   ,5.373       ,7.062         ,0.761                     
0    ,4    ,128  ,127      ,288  ,23   ,5.105       ,8.125         ,0.628                     
0    ,4    ,128  ,127      ,32   ,23   ,5.814       ,8.124         ,0.716                     
0    ,4    ,128  ,127      ,48   ,23   ,5.892       ,8.308         ,0.709                     
0    ,4    ,128  ,127      ,80   ,23   ,5.221       ,8.148         ,0.641                     
0    ,4    ,128  ,127      ,96   ,23   ,5.329       ,7.084         ,0.752                     
0    ,4    ,13   ,127      ,12   ,23   ,3.269       ,3.145         ,1.039                     
0    ,4    ,1312 ,127      ,1024 ,23   ,26.146      ,41.256        ,0.634                     
0    ,4    ,14   ,127      ,13   ,23   ,3.379       ,3.146         ,1.074                     
0    ,4    ,144  ,127      ,128  ,23   ,5.192       ,8.278         ,0.627                     
0    ,4    ,15   ,127      ,14   ,23   ,3.409       ,3.154         ,1.081                     
0    ,4    ,16   ,127      ,0    ,23   ,3.304       ,3.105         ,1.064                     
0    ,4    ,16   ,127      ,144  ,23   ,3.792       ,3.755         ,1.01                      
0    ,4    ,16   ,127      ,15   ,23   ,3.416       ,3.13          ,1.091                     
0    ,4    ,16   ,127      ,16   ,23   ,3.145       ,3.104         ,1.013                     
0    ,4    ,16   ,127      ,192  ,23   ,3.794       ,3.735         ,1.016                     
0    ,4    ,16   ,127      ,240  ,23   ,3.793       ,3.738         ,1.015                     
0    ,4    ,16   ,127      ,256  ,23   ,3.792       ,3.718         ,1.02                      
0    ,4    ,16   ,127      ,288  ,23   ,3.774       ,3.735         ,1.01                      
0    ,4    ,16   ,127      ,48   ,23   ,3.146       ,3.294         ,0.955                     
0    ,4    ,16   ,127      ,64   ,23   ,3.794       ,4.182         ,0.907                     
0    ,4    ,16   ,127      ,96   ,23   ,3.795       ,3.758         ,1.01                      
0    ,4    ,160  ,127      ,144  ,23   ,5.195       ,8.347         ,0.622                     
0    ,4    ,160  ,127      ,16   ,23   ,5.786       ,10.103        ,0.573                     
0    ,4    ,160  ,127      ,256  ,23   ,5.17        ,8.193         ,0.631                     
0    ,4    ,160  ,127      ,64   ,23   ,5.323       ,8.733         ,0.609                     
0    ,4    ,160  ,127      ,96   ,23   ,5.183       ,7.014         ,0.739                     
0    ,4    ,17   ,127      ,16   ,23   ,3.413       ,3.113         ,1.096                     
0    ,4    ,176  ,127      ,128  ,23   ,5.17        ,8.287         ,0.624                     
0    ,4    ,176  ,127      ,160  ,23   ,5.168       ,8.264         ,0.625                     
0    ,4    ,176  ,127      ,32   ,23   ,5.874       ,8.185         ,0.718                     
0    ,4    ,1760 ,127      ,2048 ,23   ,29.927      ,50.544        ,0.592                     
0    ,4    ,1760 ,127      ,288  ,23   ,29.193      ,49.597        ,0.589                     
0    ,4    ,18   ,127      ,17   ,23   ,3.413       ,3.107         ,1.098                     
0    ,4    ,1808 ,127      ,2048 ,23   ,32.575      ,51.894        ,0.628                     
0    ,4    ,1808 ,127      ,240  ,23   ,29.396      ,51.122        ,0.575                     
0    ,4    ,1856 ,127      ,192  ,23   ,29.112      ,53.641        ,0.543                     
0    ,4    ,1856 ,127      ,2048 ,23   ,31.841      ,53.597        ,0.594                     
0    ,4    ,19   ,127      ,18   ,23   ,3.372       ,3.138         ,1.074                     
0    ,4    ,1904 ,127      ,144  ,23   ,30.702      ,51.52         ,0.596                     
0    ,4    ,1904 ,127      ,2048 ,23   ,31.593      ,53.459        ,0.591                     
0    ,4    ,192  ,127      ,176  ,23   ,6.012       ,9.456         ,0.636                     
0    ,4    ,1952 ,127      ,2048 ,23   ,33.684      ,55.53         ,0.607                     
0    ,4    ,1952 ,127      ,96   ,23   ,33.579      ,74.528        ,0.451                     
0    ,4    ,2    ,127      ,1    ,23   ,3.17        ,3.154         ,1.005                     
0    ,4    ,20   ,127      ,19   ,23   ,3.398       ,3.137         ,1.083                     
0    ,4    ,2000 ,127      ,2048 ,23   ,33.546      ,59.596        ,0.563                     
0    ,4    ,2000 ,127      ,48   ,23   ,33.296      ,72.924        ,0.457                     
0    ,4    ,2048 ,127      ,0    ,23   ,35.011      ,75.292        ,0.465                     
0    ,4    ,2048 ,127      ,1024 ,23   ,36.402      ,60.247        ,0.604                     
0    ,4    ,2048 ,127      ,128  ,23   ,33.147      ,55.691        ,0.595                     
0    ,4    ,2048 ,127      ,144  ,23   ,33.487      ,56.037        ,0.598                     
0    ,4    ,2048 ,127      ,1760 ,23   ,35.333      ,60.26         ,0.586                     
0    ,4    ,2048 ,127      ,1808 ,23   ,37.315      ,61.681        ,0.605                     
0    ,4    ,2048 ,127      ,1856 ,23   ,38.287      ,62.945        ,0.608                     
0    ,4    ,2048 ,127      ,1904 ,23   ,37.2        ,61.528        ,0.605                     
0    ,4    ,2048 ,127      ,192  ,23   ,32.918      ,74.6          ,0.441                     
0    ,4    ,2048 ,127      ,1952 ,23   ,39.265      ,61.925        ,0.634                     
0    ,4    ,2048 ,127      ,2000 ,23   ,38.031      ,64.413        ,0.59                      
0    ,4    ,2048 ,127      ,2048 ,23   ,35.754      ,58.421        ,0.612                     
0    ,4    ,2048 ,127      ,240  ,23   ,33.402      ,84.165        ,0.397                     
0    ,4    ,2048 ,127      ,256  ,23   ,33.676      ,57.183        ,0.589                     
0    ,4    ,2048 ,127      ,288  ,23   ,34.024      ,68.829        ,0.494                     
0    ,4    ,2048 ,127      ,32   ,23   ,34.884      ,68.449        ,0.51                      
0    ,4    ,2048 ,127      ,4096 ,23   ,34.791      ,56.642        ,0.614                     
0    ,4    ,2048 ,127      ,48   ,23   ,34.709      ,71.496        ,0.485                     
0    ,4    ,2048 ,127      ,512  ,23   ,35.209      ,72.761        ,0.484                     
0    ,4    ,2048 ,127      ,64   ,23   ,34.169      ,68.652        ,0.498                     
0    ,4    ,2048 ,127      ,96   ,23   ,33.851      ,76.877        ,0.44                      
0    ,4    ,208  ,127      ,16   ,23   ,6.634       ,10.117        ,0.656                     
0    ,4    ,208  ,127      ,192  ,23   ,5.204       ,10.407        ,0.5                       
0    ,4    ,208  ,127      ,256  ,23   ,5.025       ,10.398        ,0.483                     
0    ,4    ,208  ,127      ,48   ,23   ,6.818       ,9.864         ,0.691                     
0    ,4    ,208  ,127      ,64   ,23   ,6.15        ,11.212        ,0.549                     
0    ,4    ,2096 ,127      ,2048 ,23   ,38.448      ,65.044        ,0.591                     
0    ,4    ,21   ,127      ,20   ,23   ,3.38        ,3.189         ,1.06                      
0    ,4    ,2144 ,127      ,2048 ,23   ,39.558      ,67.803        ,0.583                     
0    ,4    ,2192 ,127      ,2048 ,23   ,40.226      ,67.577        ,0.595                     
0    ,4    ,22   ,127      ,21   ,23   ,3.398       ,3.129         ,1.086                     
0    ,4    ,224  ,127      ,128  ,23   ,5.862       ,10.404        ,0.563                     
0    ,4    ,224  ,127      ,208  ,23   ,5.117       ,10.453        ,0.49                      
0    ,4    ,224  ,127      ,288  ,23   ,5.552       ,10.188        ,0.545                     
0    ,4    ,224  ,127      ,32   ,23   ,6.597       ,9.858         ,0.669                     
0    ,4    ,224  ,127      ,512  ,23   ,6.093       ,10.265        ,0.594                     
0    ,4    ,2240 ,127      ,2048 ,23   ,41.342      ,73.717        ,0.561                     
0    ,4    ,2288 ,127      ,2048 ,23   ,39.63       ,66.747        ,0.594                     
0    ,4    ,23   ,127      ,22   ,23   ,3.364       ,3.13          ,1.075                     
0    ,4    ,2336 ,127      ,2048 ,23   ,41.089      ,68.674        ,0.598                     
0    ,4    ,24   ,127      ,23   ,23   ,3.363       ,3.122         ,1.077                     
0    ,4    ,240  ,127      ,224  ,23   ,4.999       ,10.397        ,0.481                     
0    ,4    ,25   ,127      ,24   ,23   ,3.372       ,3.177         ,1.062                     
0    ,4    ,256  ,127      ,0    ,23   ,9.329       ,11.914        ,0.783                     
0    ,4    ,256  ,127      ,112  ,23   ,9.401       ,10.469        ,0.898                     
0    ,4    ,256  ,127      ,144  ,23   ,7.476       ,12.028        ,0.622                     
0    ,4    ,256  ,127      ,16   ,23   ,10.092      ,12.589        ,0.802                     
0    ,4    ,256  ,127      ,160  ,23   ,7.404       ,11.255        ,0.658                     
0    ,4    ,256  ,127      ,192  ,23   ,7.12        ,13.224        ,0.538                     
0    ,4    ,256  ,127      ,208  ,23   ,7.211       ,13.511        ,0.534                     
0    ,4    ,256  ,127      ,240  ,23   ,7.24        ,12.403        ,0.584                     
0    ,4    ,256  ,127      ,256  ,23   ,7.288       ,13.327        ,0.547                     
0    ,4    ,256  ,127      ,288  ,23   ,7.217       ,13.207        ,0.546                     
0    ,4    ,256  ,127      ,48   ,23   ,8.714       ,11.558        ,0.754                     
0    ,4    ,256  ,127      ,64   ,23   ,8.556       ,11.556        ,0.74                      
0    ,4    ,256  ,127      ,96   ,23   ,8.525       ,10.371        ,0.822                     
0    ,4    ,26   ,127      ,25   ,23   ,3.381       ,3.152         ,1.073                     
0    ,4    ,27   ,127      ,26   ,23   ,3.389       ,3.152         ,1.075                     
0    ,4    ,272  ,127      ,128  ,23   ,7.694       ,12.151        ,0.633                     
0    ,4    ,272  ,127      ,240  ,23   ,7.906       ,12.27         ,0.644                     
0    ,4    ,272  ,127      ,256  ,23   ,7.905       ,13.394        ,0.59                      
0    ,4    ,272  ,127      ,32   ,23   ,8.849       ,11.598        ,0.763                     
0    ,4    ,272  ,127      ,512  ,23   ,8.095       ,11.734        ,0.69                      
0    ,4    ,28   ,127      ,27   ,23   ,3.384       ,3.129         ,1.081                     
0    ,4    ,288  ,127      ,272  ,23   ,8.026       ,13.171        ,0.609                     
0    ,4    ,29   ,127      ,28   ,23   ,3.381       ,3.139         ,1.077                     
0    ,4    ,3    ,127      ,2    ,23   ,3.269       ,3.145         ,1.039                     
0    ,4    ,30   ,127      ,29   ,23   ,3.372       ,3.187         ,1.058                     
0    ,4    ,304  ,127      ,16   ,23   ,8.756       ,11.82         ,0.741                     
0    ,4    ,304  ,127      ,256  ,23   ,8.907       ,13.235        ,0.673                     
0    ,4    ,304  ,127      ,64   ,23   ,8.111       ,11.558        ,0.702                     
0    ,4    ,31   ,127      ,30   ,23   ,3.381       ,3.182         ,1.062                     
0    ,4    ,32   ,127      ,0    ,23   ,3.154       ,4.129         ,0.764                     
0    ,4    ,32   ,127      ,128  ,23   ,3.819       ,4.478         ,0.853                     
0    ,4    ,32   ,127      ,144  ,23   ,3.793       ,4.525         ,0.838                     
0    ,4    ,32   ,127      ,16   ,23   ,3.381       ,4.244         ,0.797                     
0    ,4    ,32   ,127      ,192  ,23   ,3.773       ,4.478         ,0.843                     
0    ,4    ,32   ,127      ,240  ,23   ,3.793       ,4.456         ,0.851                     
0    ,4    ,32   ,127      ,288  ,23   ,3.812       ,4.401         ,0.866                     
0    ,4    ,32   ,127      ,31   ,23   ,3.145       ,4.042         ,0.778                     
0    ,4    ,32   ,127      ,32   ,23   ,3.161       ,4.165         ,0.759                     
0    ,4    ,32   ,127      ,48   ,23   ,3.145       ,3.93          ,0.8                       
0    ,4    ,32   ,127      ,96   ,23   ,3.153       ,4.115         ,0.766                     
0    ,4    ,320  ,127      ,128  ,23   ,7.408       ,13.864        ,0.534                     
0    ,4    ,320  ,127      ,192  ,23   ,7.92        ,15.185        ,0.522                     
0    ,4    ,320  ,127      ,32   ,23   ,8.738       ,13.263        ,0.659                     
0    ,4    ,320  ,127      ,512  ,23   ,8.978       ,14.86         ,0.604                     
0    ,4    ,352  ,127      ,256  ,23   ,9.221       ,15.638        ,0.59                      
0    ,4    ,352  ,127      ,64   ,23   ,8.11        ,13.443        ,0.603                     
0    ,4    ,368  ,127      ,128  ,23   ,7.464       ,13.426        ,0.556                     
0    ,4    ,368  ,127      ,144  ,23   ,8.305       ,13.496        ,0.615                     
0    ,4    ,368  ,127      ,512  ,23   ,9.87        ,14.626        ,0.675                     
0    ,4    ,4    ,127      ,3    ,23   ,3.364       ,3.145         ,1.069                     
0    ,4    ,400  ,127      ,256  ,23   ,9.651       ,17.245        ,0.56                      
0    ,4    ,416  ,127      ,128  ,23   ,8.96        ,15.351        ,0.584                     
0    ,4    ,416  ,127      ,512  ,23   ,11.059      ,16.572        ,0.667                     
0    ,4    ,416  ,127      ,96   ,23   ,10.955      ,13.677        ,0.801                     
0    ,4    ,448  ,127      ,256  ,23   ,11.008      ,18.706        ,0.588                     
0    ,4    ,464  ,127      ,48   ,23   ,10.974      ,16.299        ,0.673                     
0    ,4    ,464  ,127      ,512  ,23   ,11.864      ,18.434        ,0.644                     
0    ,4    ,48   ,127      ,32   ,23   ,3.381       ,4.34          ,0.779                     
0    ,4    ,496  ,127      ,256  ,23   ,10.955      ,18.15         ,0.604                     
0    ,4    ,5    ,127      ,4    ,23   ,3.398       ,3.129         ,1.086                     
0    ,4    ,512  ,127      ,0    ,23   ,12.717      ,18.22         ,0.698                     
0    ,4    ,512  ,127      ,144  ,23   ,12.246      ,17.831        ,0.687                     
0    ,4    ,512  ,127      ,192  ,23   ,13.258      ,19.142        ,0.693                     
0    ,4    ,512  ,127      ,224  ,23   ,11.262      ,18.509        ,0.608                     
0    ,4    ,512  ,127      ,240  ,23   ,10.573      ,18.54         ,0.57                      
0    ,4    ,512  ,127      ,272  ,23   ,13.103      ,20.554        ,0.637                     
0    ,4    ,512  ,127      ,288  ,23   ,11.572      ,19.118        ,0.605                     
0    ,4    ,512  ,127      ,320  ,23   ,11.551      ,20.477        ,0.564                     
0    ,4    ,512  ,127      ,368  ,23   ,11.303      ,18.651        ,0.606                     
0    ,4    ,512  ,127      ,416  ,23   ,13.57       ,20.674        ,0.656                     
0    ,4    ,512  ,127      ,464  ,23   ,13.268      ,20.304        ,0.653                     
0    ,4    ,512  ,127      ,48   ,23   ,12.819      ,17.733        ,0.723                     
0    ,4    ,512  ,127      ,512  ,23   ,13.489      ,19.795        ,0.681                     
0    ,4    ,512  ,127      ,96   ,23   ,12.452      ,17.013        ,0.732                     
0    ,4    ,544  ,127      ,256  ,23   ,12.251      ,20.025        ,0.612                     
0    ,4    ,560  ,127      ,512  ,23   ,14.421      ,20.791        ,0.694                     
0    ,4    ,6    ,127      ,5    ,23   ,3.389       ,3.137         ,1.08                      
0    ,4    ,608  ,127      ,512  ,23   ,13.703      ,23.099        ,0.593                     
0    ,4    ,64   ,127      ,0    ,23   ,5.108       ,5.62          ,0.909                     
0    ,4    ,64   ,127      ,144  ,23   ,4.544       ,5.466         ,0.831                     
0    ,4    ,64   ,127      ,16   ,23   ,4.454       ,5.468         ,0.815                     
0    ,4    ,64   ,127      ,192  ,23   ,4.505       ,4.692         ,0.96                      
0    ,4    ,64   ,127      ,240  ,23   ,4.402       ,4.582         ,0.961                     
0    ,4    ,64   ,127      ,256  ,23   ,5.04        ,5.589         ,0.902                     
0    ,4    ,64   ,127      ,288  ,23   ,5.402       ,6.091         ,0.887                     
0    ,4    ,64   ,127      ,48   ,23   ,4.534       ,4.728         ,0.959                     
0    ,4    ,64   ,127      ,64   ,23   ,4.466       ,4.856         ,0.92                      
0    ,4    ,64   ,127      ,96   ,23   ,4.403       ,5.58          ,0.789                     
0    ,4    ,656  ,127      ,512  ,23   ,15.486      ,24.502        ,0.632                     
0    ,4    ,7    ,127      ,6    ,23   ,3.38        ,3.144         ,1.075                     
0    ,4    ,704  ,127      ,512  ,23   ,15.775      ,26.988        ,0.585                     
0    ,4    ,736  ,127      ,1024 ,23   ,16.268      ,24.42         ,0.666                     
0    ,4    ,736  ,127      ,288  ,23   ,14.222      ,23.443        ,0.607                     
0    ,4    ,752  ,127      ,512  ,23   ,15.471      ,28.21         ,0.548                     
0    ,4    ,784  ,127      ,1024 ,23   ,16.419      ,25.692        ,0.639                     
0    ,4    ,784  ,127      ,240  ,23   ,14.361      ,24.782        ,0.58                      
0    ,4    ,8    ,127      ,7    ,23   ,3.269       ,3.153         ,1.037                     
0    ,4    ,80   ,127      ,128  ,23   ,5.086       ,4.261         ,1.194                     
0    ,4    ,80   ,127      ,32   ,23   ,4.523       ,4.832         ,0.936                     
0    ,4    ,80   ,127      ,48   ,23   ,4.404       ,4.894         ,0.9                       
0    ,4    ,80   ,127      ,64   ,23   ,5.085       ,4.416         ,1.151                     
0    ,4    ,800  ,127      ,512  ,23   ,16.351      ,28.026        ,0.583                     
0    ,4    ,832  ,127      ,1024 ,23   ,16.484      ,28.077        ,0.587                     
0    ,4    ,832  ,127      ,192  ,23   ,14.224      ,27.43         ,0.519                     
0    ,4    ,880  ,127      ,1024 ,23   ,16.732      ,29.64         ,0.565                     
0    ,4    ,880  ,127      ,144  ,23   ,14.929      ,25.334        ,0.589                     
0    ,4    ,9    ,127      ,8    ,23   ,3.389       ,3.129         ,1.083                     
0    ,4    ,928  ,127      ,1024 ,23   ,19.296      ,29.706        ,0.65                      
0    ,4    ,928  ,127      ,96   ,23   ,18.388      ,27.001        ,0.681                     
0    ,4    ,96   ,127      ,80   ,23   ,5.082       ,5.881         ,0.864                     
0    ,4    ,976  ,127      ,1024 ,23   ,19.453      ,31.259        ,0.622                     
0    ,4    ,976  ,127      ,48   ,23   ,18.845      ,28.953        ,0.651                     
0    ,64   ,1    ,127      ,0    ,23   ,3.146       ,3.115         ,1.01                      
0    ,64   ,10   ,127      ,9    ,23   ,3.145       ,3.21          ,0.98                      
0    ,64   ,1024 ,127      ,0    ,23   ,20.923      ,32.608        ,0.642                     
0    ,64   ,1024 ,127      ,1024 ,23   ,23.048      ,42.425        ,0.543                     
0    ,64   ,1024 ,127      ,144  ,23   ,18.206      ,30.662        ,0.594                     
0    ,64   ,1024 ,127      ,192  ,23   ,18.069      ,31.476        ,0.574                     
0    ,64   ,1024 ,127      ,240  ,23   ,17.928      ,31.223        ,0.574                     
0    ,64   ,1024 ,127      ,288  ,23   ,19.612      ,31.291        ,0.627                     
0    ,64   ,1024 ,127      ,48   ,23   ,20.944      ,31.891        ,0.657                     
0    ,64   ,1024 ,127      ,736  ,23   ,20.205      ,35.662        ,0.567                     
0    ,64   ,1024 ,127      ,784  ,23   ,22.332      ,37.985        ,0.588                     
0    ,64   ,1024 ,127      ,832  ,23   ,21.975      ,38.248        ,0.575                     
0    ,64   ,1024 ,127      ,880  ,23   ,22.075      ,37.476        ,0.589                     
0    ,64   ,1024 ,127      ,928  ,23   ,22.21       ,38.655        ,0.575                     
0    ,64   ,1024 ,127      ,96   ,23   ,19.513      ,29.741        ,0.656                     
0    ,64   ,1024 ,127      ,976  ,23   ,22.11       ,41.075        ,0.538                     
0    ,64   ,1072 ,127      ,1024 ,23   ,21.811      ,40.77         ,0.535                     
0    ,64   ,11   ,127      ,10   ,23   ,3.153       ,3.149         ,1.001                     
0    ,64   ,112  ,127      ,144  ,23   ,5.032       ,5.077         ,0.991                     
0    ,64   ,112  ,127      ,16   ,23   ,4.521       ,5.862         ,0.771                     
0    ,64   ,112  ,127      ,256  ,23   ,5.056       ,4.957         ,1.02                      
0    ,64   ,112  ,127      ,64   ,23   ,5.036       ,5.793         ,0.869                     
0    ,64   ,112  ,127      ,96   ,23   ,5.082       ,5.127         ,0.991                     
0    ,64   ,1120 ,127      ,1024 ,23   ,21.617      ,43.445        ,0.498                     
0    ,64   ,1168 ,127      ,1024 ,23   ,25.798      ,44.944        ,0.574                     
0    ,64   ,12   ,127      ,11   ,23   ,3.145       ,3.129         ,1.005                     
0    ,64   ,1216 ,127      ,1024 ,23   ,26.363      ,47.657        ,0.553                     
0    ,64   ,1264 ,127      ,1024 ,23   ,24.709      ,46.382        ,0.533                     
0    ,64   ,128  ,127      ,0    ,23   ,6.284       ,8.313         ,0.756                     
0    ,64   ,128  ,127      ,112  ,23   ,5.153       ,8.324         ,0.619                     
0    ,64   ,128  ,127      ,128  ,23   ,5.131       ,7.027         ,0.73                      
0    ,64   ,128  ,127      ,144  ,23   ,5.133       ,7.103         ,0.723                     
0    ,64   ,128  ,127      ,192  ,23   ,5.158       ,7.086         ,0.728                     
0    ,64   ,128  ,127      ,240  ,23   ,6.442       ,7.118         ,0.905                     
0    ,64   ,128  ,127      ,288  ,23   ,6.356       ,7.377         ,0.862                     
0    ,64   ,128  ,127      ,32   ,23   ,5.763       ,8.18          ,0.704                     
0    ,64   ,128  ,127      ,48   ,23   ,5.724       ,8.1           ,0.707                     
0    ,64   ,128  ,127      ,80   ,23   ,5.197       ,8.276         ,0.628                     
0    ,64   ,128  ,127      ,96   ,23   ,5.131       ,7.085         ,0.724                     
0    ,64   ,13   ,127      ,12   ,23   ,3.145       ,3.163         ,0.994                     
0    ,64   ,1312 ,127      ,1024 ,23   ,27.245      ,46.87         ,0.581                     
0    ,64   ,14   ,127      ,13   ,23   ,3.169       ,3.137         ,1.01                      
0    ,64   ,144  ,127      ,128  ,23   ,5.201       ,8.234         ,0.632                     
0    ,64   ,15   ,127      ,14   ,23   ,3.185       ,3.131         ,1.017                     
0    ,64   ,16   ,127      ,0    ,23   ,3.161       ,3.12          ,1.013                     
0    ,64   ,16   ,127      ,144  ,23   ,3.145       ,3.129         ,1.005                     
0    ,64   ,16   ,127      ,15   ,23   ,3.145       ,3.137         ,1.002                     
0    ,64   ,16   ,127      ,16   ,23   ,3.15        ,3.224         ,0.977                     
0    ,64   ,16   ,127      ,192  ,23   ,3.161       ,3.129         ,1.01                      
0    ,64   ,16   ,127      ,240  ,23   ,3.162       ,3.091         ,1.023                     
0    ,64   ,16   ,127      ,256  ,23   ,3.178       ,3.099         ,1.026                     
0    ,64   ,16   ,127      ,288  ,23   ,3.178       ,3.138         ,1.013                     
0    ,64   ,16   ,127      ,48   ,23   ,3.16        ,3.13          ,1.01                      
0    ,64   ,16   ,127      ,64   ,23   ,3.178       ,3.15          ,1.009                     
0    ,64   ,16   ,127      ,96   ,23   ,3.162       ,3.246         ,0.974                     
0    ,64   ,160  ,127      ,144  ,23   ,5.205       ,8.229         ,0.633                     
0    ,64   ,160  ,127      ,16   ,23   ,5.727       ,10.16         ,0.564                     
0    ,64   ,160  ,127      ,256  ,23   ,5.181       ,8.203         ,0.632                     
0    ,64   ,160  ,127      ,64   ,23   ,5.164       ,8.187         ,0.631                     
0    ,64   ,160  ,127      ,96   ,23   ,5.703       ,7.087         ,0.805                     
0    ,64   ,17   ,127      ,16   ,23   ,3.144       ,3.153         ,0.997                     
0    ,64   ,176  ,127      ,128  ,23   ,5.141       ,8.199         ,0.627                     
0    ,64   ,176  ,127      ,160  ,23   ,5.166       ,8.256         ,0.626                     
0    ,64   ,176  ,127      ,32   ,23   ,5.751       ,8.144         ,0.706                     
0    ,64   ,1760 ,127      ,2048 ,23   ,33.518      ,90.3          ,0.371                     
0    ,64   ,1760 ,127      ,288  ,23   ,28.158      ,48.213        ,0.584                     
0    ,64   ,18   ,127      ,17   ,23   ,3.144       ,3.13          ,1.005                     
0    ,64   ,1808 ,127      ,2048 ,23   ,36.239      ,92.257        ,0.393                     
0    ,64   ,1808 ,127      ,240  ,23   ,28.35       ,49.771        ,0.57                      
0    ,64   ,1856 ,127      ,192  ,23   ,28.301      ,52.002        ,0.544                     
0    ,64   ,1856 ,127      ,2048 ,23   ,36.676      ,92.464        ,0.397                     
0    ,64   ,19   ,127      ,18   ,23   ,3.161       ,3.172         ,0.996                     
0    ,64   ,1904 ,127      ,144  ,23   ,28.952      ,50.126        ,0.578                     
0    ,64   ,1904 ,127      ,2048 ,23   ,36.834      ,95.74         ,0.385                     
0    ,64   ,192  ,127      ,176  ,23   ,5.884       ,9.745         ,0.604                     
0    ,64   ,1952 ,127      ,2048 ,23   ,39.927      ,97.024        ,0.412                     
0    ,64   ,1952 ,127      ,96   ,23   ,32.057      ,72.31         ,0.443                     
0    ,64   ,2    ,127      ,1    ,23   ,3.145       ,3.171         ,0.992                     
0    ,64   ,20   ,127      ,19   ,23   ,3.212       ,3.113         ,1.032                     
0    ,64   ,2000 ,127      ,2048 ,23   ,39.434      ,98.274        ,0.401                     
0    ,64   ,2000 ,127      ,48   ,23   ,32.92       ,68.817        ,0.478                     
0    ,64   ,2048 ,127      ,0    ,23   ,34.774      ,73.787        ,0.471                     
0    ,64   ,2048 ,127      ,1024 ,23   ,38.931      ,67.973        ,0.573                     
0    ,64   ,2048 ,127      ,128  ,23   ,33.431      ,56.005        ,0.597                     
0    ,64   ,2048 ,127      ,144  ,23   ,33.777      ,57.092        ,0.592                     
0    ,64   ,2048 ,127      ,1760 ,23   ,38.731      ,96.808        ,0.4                       
0    ,64   ,2048 ,127      ,1808 ,23   ,40.316      ,96.533        ,0.418                     
0    ,64   ,2048 ,127      ,1856 ,23   ,40.257      ,97.776        ,0.412                     
0    ,64   ,2048 ,127      ,1904 ,23   ,45.745      ,97.821        ,0.468                     
0    ,64   ,2048 ,127      ,192  ,23   ,32.566      ,68.826        ,0.473                     
0    ,64   ,2048 ,127      ,1952 ,23   ,44.333      ,100.817       ,0.44                      
0    ,64   ,2048 ,127      ,2000 ,23   ,44.596      ,98.634        ,0.452                     
0    ,64   ,2048 ,127      ,2048 ,23   ,47.728      ,100.91        ,0.473                     
0    ,64   ,2048 ,127      ,240  ,23   ,32.23       ,68.393        ,0.471                     
0    ,64   ,2048 ,127      ,256  ,23   ,33.926      ,69.822        ,0.486                     
0    ,64   ,2048 ,127      ,288  ,23   ,33.529      ,67.376        ,0.498                     
0    ,64   ,2048 ,127      ,32   ,23   ,35.219      ,69.795        ,0.505                     
0    ,64   ,2048 ,127      ,4096 ,23   ,40.277      ,98.401        ,0.409                     
0    ,64   ,2048 ,127      ,48   ,23   ,34.324      ,69.433        ,0.494                     
0    ,64   ,2048 ,127      ,512  ,23   ,35.471      ,60.545        ,0.586                     
0    ,64   ,2048 ,127      ,64   ,23   ,34.467      ,70.137        ,0.491                     
0    ,64   ,2048 ,127      ,96   ,23   ,33.566      ,74.891        ,0.448                     
0    ,64   ,208  ,127      ,16   ,23   ,6.757       ,11.877        ,0.569                     
0    ,64   ,208  ,127      ,192  ,23   ,5.334       ,10.356        ,0.515                     
0    ,64   ,208  ,127      ,256  ,23   ,5.31        ,10.372        ,0.512                     
0    ,64   ,208  ,127      ,48   ,23   ,6.563       ,9.828         ,0.668                     
0    ,64   ,208  ,127      ,64   ,23   ,6.289       ,9.936         ,0.633                     
0    ,64   ,2096 ,127      ,2048 ,23   ,40.138      ,100.921       ,0.398                     
0    ,64   ,21   ,127      ,20   ,23   ,3.161       ,3.129         ,1.01                      
0    ,64   ,2144 ,127      ,2048 ,23   ,42.27       ,101.966       ,0.415                     
0    ,64   ,2192 ,127      ,2048 ,23   ,51.551      ,104.027       ,0.496                     
0    ,64   ,22   ,127      ,21   ,23   ,3.16        ,3.145         ,1.005                     
0    ,64   ,224  ,127      ,128  ,23   ,5.748       ,10.422        ,0.552                     
0    ,64   ,224  ,127      ,208  ,23   ,5.442       ,10.412        ,0.523                     
0    ,64   ,224  ,127      ,288  ,23   ,6.03        ,10.098        ,0.597                     
0    ,64   ,224  ,127      ,32   ,23   ,6.964       ,9.947         ,0.7                       
0    ,64   ,224  ,127      ,512  ,23   ,5.284       ,10.209        ,0.518                     
0    ,64   ,2240 ,127      ,2048 ,23   ,54.392      ,104.889       ,0.519                     
0    ,64   ,2288 ,127      ,2048 ,23   ,52.139      ,103.332       ,0.505                     
0    ,64   ,23   ,127      ,22   ,23   ,3.161       ,3.145         ,1.005                     
0    ,64   ,2336 ,127      ,2048 ,23   ,57.209      ,107.056       ,0.534                     
0    ,64   ,24   ,127      ,23   ,23   ,3.145       ,3.145         ,1.0                       
0    ,64   ,240  ,127      ,224  ,23   ,5.583       ,10.346        ,0.54                      
0    ,64   ,25   ,127      ,24   ,23   ,3.161       ,3.138         ,1.007                     
0    ,64   ,256  ,127      ,0    ,23   ,9.268       ,11.776        ,0.787                     
0    ,64   ,256  ,127      ,112  ,23   ,8.775       ,10.688        ,0.821                     
0    ,64   ,256  ,127      ,144  ,23   ,7.436       ,12.002        ,0.62                      
0    ,64   ,256  ,127      ,16   ,23   ,10.026      ,12.544        ,0.799                     
0    ,64   ,256  ,127      ,160  ,23   ,8.227       ,12.147        ,0.677                     
0    ,64   ,256  ,127      ,192  ,23   ,7.011       ,13.215        ,0.531                     
0    ,64   ,256  ,127      ,208  ,23   ,7.584       ,13.176        ,0.576                     
0    ,64   ,256  ,127      ,240  ,23   ,6.927       ,12.464        ,0.556                     
0    ,64   ,256  ,127      ,256  ,23   ,7.429       ,12.486        ,0.595                     
0    ,64   ,256  ,127      ,288  ,23   ,7.09        ,12.374        ,0.573                     
0    ,64   ,256  ,127      ,48   ,23   ,8.714       ,11.553        ,0.754                     
0    ,64   ,256  ,127      ,64   ,23   ,8.751       ,11.52         ,0.76                      
0    ,64   ,256  ,127      ,96   ,23   ,8.36        ,10.373        ,0.806                     
0    ,64   ,26   ,127      ,25   ,23   ,3.153       ,3.107         ,1.015                     
0    ,64   ,27   ,127      ,26   ,23   ,3.161       ,3.114         ,1.015                     
0    ,64   ,272  ,127      ,128  ,23   ,7.544       ,12.147        ,0.621                     
0    ,64   ,272  ,127      ,240  ,23   ,8.597       ,11.843        ,0.726                     
0    ,64   ,272  ,127      ,256  ,23   ,8.732       ,13.529        ,0.645                     
0    ,64   ,272  ,127      ,32   ,23   ,8.731       ,11.674        ,0.748                     
0    ,64   ,272  ,127      ,512  ,23   ,7.924       ,12.98         ,0.61                      
0    ,64   ,28   ,127      ,27   ,23   ,3.161       ,3.137         ,1.008                     
0    ,64   ,288  ,127      ,272  ,23   ,8.21        ,13.143        ,0.625                     
0    ,64   ,29   ,127      ,28   ,23   ,3.161       ,3.169         ,0.998                     
0    ,64   ,3    ,127      ,2    ,23   ,3.188       ,3.114         ,1.024                     
0    ,64   ,30   ,127      ,29   ,23   ,3.177       ,3.161         ,1.005                     
0    ,64   ,304  ,127      ,16   ,23   ,8.715       ,11.891        ,0.733                     
0    ,64   ,304  ,127      ,256  ,23   ,8.62        ,13.102        ,0.658                     
0    ,64   ,304  ,127      ,64   ,23   ,8.05        ,11.606        ,0.694                     
0    ,64   ,31   ,127      ,30   ,23   ,3.17        ,3.145         ,1.008                     
0    ,64   ,32   ,127      ,0    ,23   ,3.29        ,4.007         ,0.821                     
0    ,64   ,32   ,127      ,128  ,23   ,3.145       ,4.38          ,0.718                     
0    ,64   ,32   ,127      ,144  ,23   ,3.18        ,4.194         ,0.758                     
0    ,64   ,32   ,127      ,16   ,23   ,3.145       ,4.309         ,0.73                      
0    ,64   ,32   ,127      ,192  ,23   ,3.161       ,4.125         ,0.766                     
0    ,64   ,32   ,127      ,240  ,23   ,3.144       ,4.215         ,0.746                     
0    ,64   ,32   ,127      ,288  ,23   ,3.16        ,4.174         ,0.757                     
0    ,64   ,32   ,127      ,31   ,23   ,3.16        ,4.106         ,0.77                      
0    ,64   ,32   ,127      ,32   ,23   ,3.172       ,4.202         ,0.755                     
0    ,64   ,32   ,127      ,48   ,23   ,3.145       ,3.91          ,0.804                     
0    ,64   ,32   ,127      ,96   ,23   ,3.153       ,4.194         ,0.752                     
0    ,64   ,320  ,127      ,128  ,23   ,7.421       ,13.733        ,0.54                      
0    ,64   ,320  ,127      ,192  ,23   ,7.242       ,15.821        ,0.458                     
0    ,64   ,320  ,127      ,32   ,23   ,8.713       ,13.22         ,0.659                     
0    ,64   ,320  ,127      ,512  ,23   ,9.425       ,14.435        ,0.653                     
0    ,64   ,352  ,127      ,256  ,23   ,8.98        ,15.717        ,0.571                     
0    ,64   ,352  ,127      ,64   ,23   ,8.089       ,13.371        ,0.605                     
0    ,64   ,368  ,127      ,128  ,23   ,7.761       ,13.513        ,0.574                     
0    ,64   ,368  ,127      ,144  ,23   ,9.364       ,14.781        ,0.634                     
0    ,64   ,368  ,127      ,512  ,23   ,8.312       ,16.533        ,0.503                     
0    ,64   ,4    ,127      ,3    ,23   ,3.169       ,3.121         ,1.015                     
0    ,64   ,400  ,127      ,256  ,23   ,12.427      ,16.853        ,0.737                     
0    ,64   ,416  ,127      ,128  ,23   ,9.934       ,14.931        ,0.665                     
0    ,64   ,416  ,127      ,512  ,23   ,11.042      ,17.611        ,0.627                     
0    ,64   ,416  ,127      ,96   ,23   ,10.902      ,13.841        ,0.788                     
0    ,64   ,448  ,127      ,256  ,23   ,13.192      ,19.293        ,0.684                     
0    ,64   ,464  ,127      ,48   ,23   ,10.887      ,16.063        ,0.678                     
0    ,64   ,464  ,127      ,512  ,23   ,10.452      ,20.032        ,0.522                     
0    ,64   ,48   ,127      ,32   ,23   ,3.145       ,4.431         ,0.71                      
0    ,64   ,496  ,127      ,256  ,23   ,10.85       ,18.059        ,0.601                     
0    ,64   ,5    ,127      ,4    ,23   ,3.161       ,3.121         ,1.013                     
0    ,64   ,512  ,127      ,0    ,23   ,13.905      ,18.108        ,0.768                     
0    ,64   ,512  ,127      ,144  ,23   ,11.345      ,18.043        ,0.629                     
0    ,64   ,512  ,127      ,192  ,23   ,10.846      ,21.107        ,0.514                     
0    ,64   ,512  ,127      ,224  ,23   ,11.853      ,18.175        ,0.652                     
0    ,64   ,512  ,127      ,240  ,23   ,11.11       ,17.979        ,0.618                     
0    ,64   ,512  ,127      ,272  ,23   ,12.445      ,19.733        ,0.631                     
0    ,64   ,512  ,127      ,288  ,23   ,12.124      ,18.273        ,0.663                     
0    ,64   ,512  ,127      ,320  ,23   ,12.038      ,20.144        ,0.598                     
0    ,64   ,512  ,127      ,368  ,23   ,12.373      ,21.351        ,0.58                      
0    ,64   ,512  ,127      ,416  ,23   ,13.084      ,20.901        ,0.626                     
0    ,64   ,512  ,127      ,464  ,23   ,15.886      ,22.274        ,0.713                     
0    ,64   ,512  ,127      ,48   ,23   ,13.235      ,17.512        ,0.756                     
0    ,64   ,512  ,127      ,512  ,23   ,14.935      ,21.371        ,0.699                     
0    ,64   ,512  ,127      ,96   ,23   ,13.316      ,17.011        ,0.783                     
0    ,64   ,544  ,127      ,256  ,23   ,12.968      ,19.863        ,0.653                     
0    ,64   ,560  ,127      ,512  ,23   ,13.813      ,22.373        ,0.617                     
0    ,64   ,6    ,127      ,5    ,23   ,3.162       ,3.121         ,1.013                     
0    ,64   ,608  ,127      ,512  ,23   ,12.486      ,24.564        ,0.508                     
0    ,64   ,64   ,127      ,0    ,23   ,5.164       ,5.539         ,0.932                     
0    ,64   ,64   ,127      ,144  ,23   ,4.426       ,4.706         ,0.94                      
0    ,64   ,64   ,127      ,16   ,23   ,4.415       ,5.514         ,0.801                     
0    ,64   ,64   ,127      ,192  ,23   ,4.441       ,4.702         ,0.944                     
0    ,64   ,64   ,127      ,240  ,23   ,4.425       ,4.682         ,0.945                     
0    ,64   ,64   ,127      ,256  ,23   ,4.49        ,4.898         ,0.917                     
0    ,64   ,64   ,127      ,288  ,23   ,4.424       ,4.682         ,0.945                     
0    ,64   ,64   ,127      ,48   ,23   ,4.403       ,4.708         ,0.935                     
0    ,64   ,64   ,127      ,64   ,23   ,4.487       ,4.856         ,0.924                     
0    ,64   ,64   ,127      ,96   ,23   ,4.551       ,4.67          ,0.975                     
0    ,64   ,656  ,127      ,512  ,23   ,16.55       ,26.161        ,0.633                     
0    ,64   ,7    ,127      ,6    ,23   ,3.161       ,3.146         ,1.005                     
0    ,64   ,704  ,127      ,512  ,23   ,16.307      ,30.314        ,0.538                     
0    ,64   ,736  ,127      ,1024 ,23   ,16.321      ,29.145        ,0.56                      
0    ,64   ,736  ,127      ,288  ,23   ,13.73       ,23.139        ,0.593                     
0    ,64   ,752  ,127      ,512  ,23   ,15.28       ,26.936        ,0.567                     
0    ,64   ,784  ,127      ,1024 ,23   ,17.05       ,31.876        ,0.535                     
0    ,64   ,784  ,127      ,240  ,23   ,14.325      ,24.491        ,0.585                     
0    ,64   ,8    ,127      ,7    ,23   ,3.161       ,3.133         ,1.009                     
0    ,64   ,80   ,127      ,128  ,23   ,5.032       ,4.221         ,1.192                     
0    ,64   ,80   ,127      ,32   ,23   ,4.523       ,4.742         ,0.954                     
0    ,64   ,80   ,127      ,48   ,23   ,4.644       ,4.948         ,0.939                     
0    ,64   ,80   ,127      ,64   ,23   ,5.035       ,4.42          ,1.139                     
0    ,64   ,800  ,127      ,512  ,23   ,18.791      ,28.887        ,0.651                     
0    ,64   ,832  ,127      ,1024 ,23   ,18.448      ,33.59         ,0.549                     
0    ,64   ,832  ,127      ,192  ,23   ,14.425      ,27.058        ,0.533                     
0    ,64   ,880  ,127      ,1024 ,23   ,17.871      ,33.927        ,0.527                     
0    ,64   ,880  ,127      ,144  ,23   ,14.666      ,24.774        ,0.592                     
0    ,64   ,9    ,127      ,8    ,23   ,3.161       ,3.154         ,1.002                     
0    ,64   ,928  ,127      ,1024 ,23   ,19.164      ,36.795        ,0.521                     
0    ,64   ,928  ,127      ,96   ,23   ,18.11       ,26.37         ,0.687                     
0    ,64   ,96   ,127      ,80   ,23   ,5.064       ,5.669         ,0.893                     
0    ,64   ,976  ,127      ,1024 ,23   ,20.975      ,38.908        ,0.539                     
0    ,64   ,976  ,127      ,48   ,23   ,19.089      ,28.833        ,0.662                     
1    ,1    ,2048 ,127      ,32   ,0    ,3.097       ,4.435         ,0.698                     
1    ,1    ,2048 ,127      ,32   ,23   ,34.182      ,68.524        ,0.499                     
1    ,1    ,256  ,127      ,64   ,0    ,5.693       ,3.8           ,1.498                     
1    ,1    ,256  ,127      ,64   ,23   ,8.011       ,11.012        ,0.727                     
1    ,16   ,2048 ,127      ,32   ,23   ,35.054      ,71.311        ,0.492                     
1    ,16   ,256  ,127      ,64   ,23   ,9.662       ,12.811        ,0.754                     
1    ,256  ,2048 ,127      ,32   ,23   ,34.659      ,69.267        ,0.5                       
1    ,256  ,256  ,127      ,64   ,23   ,9.45        ,12.737        ,0.742                     
1    ,4    ,2048 ,127      ,32   ,23   ,35.208      ,70.182        ,0.502                     
1    ,4    ,256  ,127      ,64   ,23   ,7.962       ,11.32         ,0.703                     
1    ,64   ,2048 ,127      ,32   ,23   ,35.422      ,73.54         ,0.482                     
1    ,64   ,256  ,127      ,64   ,23   ,9.679       ,12.445        ,0.778                     
105  ,1    ,256  ,127      ,64   ,0    ,5.779       ,3.956         ,1.461                     
105  ,1    ,256  ,127      ,64   ,23   ,8.356       ,11.535        ,0.724                     
105  ,16   ,256  ,127      ,64   ,23   ,8.202       ,11.597        ,0.707                     
105  ,256  ,256  ,127      ,64   ,23   ,10.862      ,13.195        ,0.823                     
105  ,4    ,256  ,127      ,64   ,23   ,8.336       ,11.596        ,0.719                     
105  ,64   ,256  ,127      ,64   ,23   ,8.326       ,11.598        ,0.718                     
15   ,1    ,256  ,127      ,64   ,0    ,5.07        ,3.778         ,1.342                     
15   ,1    ,256  ,127      ,64   ,23   ,7.984       ,11.459        ,0.697                     
15   ,16   ,256  ,127      ,64   ,23   ,8.104       ,11.323        ,0.716                     
15   ,256  ,256  ,127      ,64   ,23   ,9.27        ,12.671        ,0.732                     
15   ,4    ,256  ,127      ,64   ,23   ,8.292       ,11.245        ,0.737                     
15   ,64   ,256  ,127      ,64   ,23   ,9.8         ,11.088        ,0.884                     
2    ,1    ,2048 ,127      ,64   ,0    ,4.808       ,4.698         ,1.023                     
2    ,1    ,2048 ,127      ,64   ,23   ,33.909      ,69.607        ,0.487                     
2    ,1    ,256  ,127      ,64   ,0    ,4.691       ,3.783         ,1.24                      
2    ,1    ,256  ,127      ,64   ,23   ,7.925       ,11.316        ,0.7                       
2    ,16   ,2048 ,127      ,64   ,23   ,34.004      ,71.486        ,0.476                     
2    ,16   ,256  ,127      ,64   ,23   ,8.061       ,11.364        ,0.709                     
2    ,256  ,2048 ,127      ,64   ,23   ,33.833      ,69.847        ,0.484                     
2    ,256  ,256  ,127      ,64   ,23   ,9.405       ,12.79         ,0.735                     
2    ,4    ,2048 ,127      ,64   ,23   ,34.94       ,69.048        ,0.506                     
2    ,4    ,256  ,127      ,64   ,23   ,7.96        ,11.273        ,0.706                     
2    ,64   ,2048 ,127      ,64   ,23   ,34.158      ,71.32         ,0.479                     
2    ,64   ,256  ,127      ,64   ,23   ,9.707       ,11.195        ,0.867                     
3    ,1    ,2048 ,127      ,128  ,0    ,4.917       ,8.438         ,0.583                     
3    ,1    ,2048 ,127      ,128  ,23   ,33.387      ,55.942        ,0.597                     
3    ,1    ,256  ,127      ,64   ,0    ,4.756       ,3.865         ,1.231                     
3    ,1    ,256  ,127      ,64   ,23   ,8.17        ,11.748        ,0.695                     
3    ,16   ,2048 ,127      ,128  ,23   ,34.344      ,55.245        ,0.622                     
3    ,16   ,256  ,127      ,64   ,23   ,8.108       ,11.524        ,0.704                     
3    ,256  ,2048 ,127      ,128  ,23   ,33.726      ,55.718        ,0.605                     
3    ,256  ,256  ,127      ,64   ,23   ,9.697       ,12.942        ,0.749                     
3    ,4    ,2048 ,127      ,128  ,23   ,33.379      ,55.738        ,0.599                     
3    ,4    ,256  ,127      ,64   ,23   ,8.083       ,11.516        ,0.702                     
3    ,64   ,2048 ,127      ,128  ,23   ,33.404      ,55.606        ,0.601                     
3    ,64   ,256  ,127      ,64   ,23   ,9.321       ,11.571        ,0.806                     
30   ,1    ,256  ,127      ,64   ,0    ,4.853       ,4.037         ,1.202                     
30   ,1    ,256  ,127      ,64   ,23   ,7.981       ,11.422        ,0.699                     
30   ,16   ,256  ,127      ,64   ,23   ,8.07        ,11.502        ,0.702                     
30   ,256  ,256  ,127      ,64   ,23   ,9.259       ,12.764        ,0.725                     
30   ,4    ,256  ,127      ,64   ,23   ,8.002       ,11.336        ,0.706                     
30   ,64   ,256  ,127      ,64   ,23   ,9.314       ,11.452        ,0.813                     
4    ,1    ,2048 ,127      ,256  ,0    ,9.384       ,11.959        ,0.785                     
4    ,1    ,2048 ,127      ,256  ,23   ,34.209      ,77.702        ,0.44                      
4    ,1    ,256  ,127      ,64   ,0    ,4.613       ,3.861         ,1.195                     
4    ,1    ,256  ,127      ,64   ,23   ,8.366       ,11.437        ,0.731                     
4    ,16   ,2048 ,127      ,256  ,23   ,33.529      ,73.372        ,0.457                     
4    ,16   ,256  ,127      ,64   ,23   ,8.519       ,11.612        ,0.734                     
4    ,256  ,2048 ,127      ,256  ,23   ,34.09       ,72.41         ,0.471                     
4    ,256  ,256  ,127      ,64   ,23   ,9.358       ,13.062        ,0.716                     
4    ,4    ,2048 ,127      ,256  ,23   ,34.377      ,57.552        ,0.597                     
4    ,4    ,256  ,127      ,64   ,23   ,8.413       ,11.557        ,0.728                     
4    ,64   ,2048 ,127      ,256  ,23   ,33.699      ,68.19         ,0.494                     
4    ,64   ,256  ,127      ,64   ,23   ,9.573       ,11.41         ,0.839                     
4080 ,1    ,31   ,127      ,30   ,0    ,5.576       ,4.86          ,1.147                     
4080 ,1    ,31   ,127      ,30   ,23   ,5.658       ,5.113         ,1.107                     
4080 ,1    ,32   ,127      ,31   ,0    ,5.661       ,4.837         ,1.17                      
4080 ,1    ,32   ,127      ,31   ,23   ,5.677       ,6.393         ,0.888                     
4080 ,16   ,31   ,127      ,30   ,23   ,5.688       ,5.031         ,1.131                     
4080 ,16   ,32   ,127      ,31   ,23   ,6.125       ,6.26          ,0.979                     
4080 ,256  ,31   ,127      ,30   ,23   ,5.659       ,4.982         ,1.136                     
4080 ,256  ,32   ,127      ,31   ,23   ,5.688       ,6.245         ,0.911                     
4080 ,4    ,31   ,127      ,30   ,23   ,5.66        ,5.005         ,1.131                     
4080 ,4    ,32   ,127      ,31   ,23   ,5.659       ,6.198         ,0.913                     
4080 ,64   ,31   ,127      ,30   ,23   ,5.66        ,5.006         ,1.131                     
4080 ,64   ,32   ,127      ,31   ,23   ,5.688       ,6.293         ,0.904                     
4081 ,1    ,29   ,127      ,28   ,0    ,5.52        ,4.86          ,1.136                     
4081 ,1    ,29   ,127      ,28   ,23   ,5.857       ,5.316         ,1.102                     
4081 ,1    ,30   ,127      ,29   ,0    ,5.494       ,4.884         ,1.125                     
4081 ,1    ,30   ,127      ,29   ,23   ,5.675       ,5.175         ,1.097                     
4081 ,16   ,29   ,127      ,28   ,23   ,5.659       ,5.056         ,1.119                     
4081 ,16   ,30   ,127      ,29   ,23   ,6.333       ,5.149         ,1.23                      
4081 ,256  ,29   ,127      ,28   ,23   ,5.659       ,4.981         ,1.136                     
4081 ,256  ,30   ,127      ,29   ,23   ,5.659       ,4.98          ,1.136                     
4081 ,4    ,29   ,127      ,28   ,23   ,5.688       ,5.177         ,1.099                     
4081 ,4    ,30   ,127      ,29   ,23   ,5.688       ,5.164         ,1.101                     
4081 ,64   ,29   ,127      ,28   ,23   ,5.687       ,5.056         ,1.125                     
4081 ,64   ,30   ,127      ,29   ,23   ,5.717       ,5.056         ,1.131                     
4082 ,1    ,27   ,127      ,26   ,0    ,5.488       ,4.837         ,1.135                     
4082 ,1    ,27   ,127      ,26   ,23   ,5.659       ,5.133         ,1.102                     
4082 ,1    ,28   ,127      ,27   ,0    ,5.494       ,4.882         ,1.125                     
4082 ,1    ,28   ,127      ,27   ,23   ,5.732       ,5.233         ,1.095                     
4082 ,16   ,27   ,127      ,26   ,23   ,5.751       ,4.956         ,1.16                      
4082 ,16   ,28   ,127      ,27   ,23   ,5.659       ,4.98          ,1.136                     
4082 ,256  ,27   ,127      ,26   ,23   ,5.66        ,5.047         ,1.122                     
4082 ,256  ,28   ,127      ,27   ,23   ,5.659       ,5.031         ,1.125                     
4082 ,4    ,27   ,127      ,26   ,23   ,5.687       ,5.005         ,1.136                     
4082 ,4    ,28   ,127      ,27   ,23   ,5.658       ,5.005         ,1.13                      
4082 ,64   ,27   ,127      ,26   ,23   ,5.68        ,5.006         ,1.135                     
4082 ,64   ,28   ,127      ,27   ,23   ,5.659       ,5.056         ,1.119                     
4083 ,1    ,25   ,127      ,24   ,0    ,5.467       ,4.916         ,1.112                     
4083 ,1    ,25   ,127      ,24   ,23   ,5.63        ,5.096         ,1.105                     
4083 ,1    ,26   ,127      ,25   ,0    ,5.441       ,4.86          ,1.12                      
4083 ,1    ,26   ,127      ,25   ,23   ,5.659       ,5.023         ,1.127                     
4083 ,16   ,25   ,127      ,24   ,23   ,5.746       ,5.03          ,1.142                     
4083 ,16   ,26   ,127      ,25   ,23   ,5.631       ,5.031         ,1.119                     
4083 ,256  ,25   ,127      ,24   ,23   ,5.688       ,4.956         ,1.148                     
4083 ,256  ,26   ,127      ,25   ,23   ,5.659       ,5.031         ,1.125                     
4083 ,4    ,25   ,127      ,24   ,23   ,5.704       ,5.084         ,1.122                     
4083 ,4    ,26   ,127      ,25   ,23   ,5.688       ,5.006         ,1.136                     
4083 ,64   ,25   ,127      ,24   ,23   ,5.688       ,4.981         ,1.142                     
4083 ,64   ,26   ,127      ,25   ,23   ,5.693       ,4.956         ,1.149                     
4084 ,1    ,23   ,127      ,22   ,0    ,5.441       ,4.815         ,1.13                      
4084 ,1    ,23   ,127      ,22   ,23   ,5.66        ,5.15          ,1.099                     
4084 ,1    ,24   ,127      ,23   ,0    ,5.585       ,4.814         ,1.16                      
4084 ,1    ,24   ,127      ,23   ,23   ,5.729       ,4.98          ,1.15                      
4084 ,16   ,23   ,127      ,22   ,23   ,5.659       ,5.057         ,1.119                     
4084 ,16   ,24   ,127      ,23   ,23   ,5.688       ,5.056         ,1.125                     
4084 ,256  ,23   ,127      ,22   ,23   ,5.675       ,5.062         ,1.121                     
4084 ,256  ,24   ,127      ,23   ,23   ,5.718       ,4.981         ,1.148                     
4084 ,4    ,23   ,127      ,22   ,23   ,5.659       ,5.026         ,1.126                     
4084 ,4    ,24   ,127      ,23   ,23   ,5.659       ,5.038         ,1.123                     
4084 ,64   ,23   ,127      ,22   ,23   ,5.659       ,5.005         ,1.131                     
4084 ,64   ,24   ,127      ,23   ,23   ,5.688       ,5.037         ,1.129                     
4085 ,1    ,21   ,127      ,20   ,0    ,5.561       ,4.813         ,1.155                     
4085 ,1    ,21   ,127      ,20   ,23   ,5.711       ,5.006         ,1.141                     
4085 ,1    ,22   ,127      ,21   ,0    ,5.605       ,4.837         ,1.159                     
4085 ,1    ,22   ,127      ,21   ,23   ,5.66        ,5.06          ,1.119                     
4085 ,16   ,21   ,127      ,20   ,23   ,5.659       ,5.081         ,1.114                     
4085 ,16   ,22   ,127      ,21   ,23   ,5.659       ,5.056         ,1.119                     
4085 ,256  ,21   ,127      ,20   ,23   ,5.687       ,5.056         ,1.125                     
4085 ,256  ,22   ,127      ,21   ,23   ,5.701       ,5.031         ,1.133                     
4085 ,4    ,21   ,127      ,20   ,23   ,5.688       ,5.005         ,1.136                     
4085 ,4    ,22   ,127      ,21   ,23   ,5.689       ,5.005         ,1.136                     
4085 ,64   ,21   ,127      ,20   ,23   ,5.706       ,5.031         ,1.134                     
4085 ,64   ,22   ,127      ,21   ,23   ,5.688       ,5.03          ,1.131                     
4086 ,1    ,19   ,127      ,18   ,0    ,5.441       ,4.963         ,1.096                     
4086 ,1    ,19   ,127      ,18   ,23   ,5.683       ,4.98          ,1.141                     
4086 ,1    ,20   ,127      ,19   ,0    ,5.467       ,4.864         ,1.124                     
4086 ,1    ,20   ,127      ,19   ,23   ,5.631       ,4.996         ,1.127                     
4086 ,16   ,19   ,127      ,18   ,23   ,5.688       ,5.005         ,1.136                     
4086 ,16   ,20   ,127      ,19   ,23   ,5.716       ,5.031         ,1.136                     
4086 ,256  ,19   ,127      ,18   ,23   ,5.631       ,5.056         ,1.114                     
4086 ,256  ,20   ,127      ,19   ,23   ,5.631       ,5.078         ,1.109                     
4086 ,4    ,19   ,127      ,18   ,23   ,5.716       ,5.005         ,1.142                     
4086 ,4    ,20   ,127      ,19   ,23   ,5.717       ,5.028         ,1.137                     
4086 ,64   ,19   ,127      ,18   ,23   ,5.659       ,4.98          ,1.136                     
4086 ,64   ,20   ,127      ,19   ,23   ,5.718       ,5.005         ,1.142                     
4087 ,1    ,17   ,127      ,16   ,0    ,5.512       ,4.79          ,1.151                     
4087 ,1    ,17   ,127      ,16   ,23   ,5.66        ,5.005         ,1.131                     
4087 ,1    ,18   ,127      ,17   ,0    ,5.607       ,4.767         ,1.176                     
4087 ,1    ,18   ,127      ,17   ,23   ,5.689       ,5.031         ,1.131                     
4087 ,16   ,17   ,127      ,16   ,23   ,5.827       ,5.006         ,1.164                     
4087 ,16   ,18   ,127      ,17   ,23   ,5.717       ,5.031         ,1.136                     
4087 ,256  ,17   ,127      ,16   ,23   ,5.688       ,5.056         ,1.125                     
4087 ,256  ,18   ,127      ,17   ,23   ,5.748       ,5.031         ,1.143                     
4087 ,4    ,17   ,127      ,16   ,23   ,5.717       ,4.956         ,1.154                     
4087 ,4    ,18   ,127      ,17   ,23   ,5.659       ,4.981         ,1.136                     
4087 ,64   ,17   ,127      ,16   ,23   ,5.658       ,5.031         ,1.125                     
4087 ,64   ,18   ,127      ,17   ,23   ,5.687       ,5.116         ,1.112                     
4088 ,1    ,15   ,127      ,14   ,0    ,5.551       ,4.82          ,1.152                     
4088 ,1    ,15   ,127      ,14   ,23   ,5.78        ,5.031         ,1.149                     
4088 ,1    ,16   ,127      ,15   ,0    ,5.47        ,4.904         ,1.115                     
4088 ,1    ,16   ,127      ,15   ,23   ,5.688       ,4.956         ,1.148                     
4088 ,16   ,15   ,127      ,14   ,23   ,5.659       ,5.031         ,1.125                     
4088 ,16   ,16   ,127      ,15   ,23   ,5.66        ,5.0           ,1.132                     
4088 ,256  ,15   ,127      ,14   ,23   ,5.659       ,4.956         ,1.142                     
4088 ,256  ,16   ,127      ,15   ,23   ,5.688       ,4.981         ,1.142                     
4088 ,4    ,15   ,127      ,14   ,23   ,5.747       ,5.006         ,1.148                     
4088 ,4    ,16   ,127      ,15   ,23   ,5.689       ,4.982         ,1.142                     
4088 ,64   ,15   ,127      ,14   ,23   ,5.688       ,5.005         ,1.136                     
4088 ,64   ,16   ,127      ,15   ,23   ,5.66        ,5.006         ,1.131                     
4089 ,1    ,13   ,127      ,12   ,0    ,5.608       ,6.821         ,0.822                     
4089 ,1    ,13   ,127      ,12   ,23   ,5.688       ,5.048         ,1.127                     
4089 ,1    ,14   ,127      ,13   ,0    ,5.549       ,4.725         ,1.174                     
4089 ,1    ,14   ,127      ,13   ,23   ,5.659       ,5.031         ,1.125                     
4089 ,16   ,13   ,127      ,12   ,23   ,5.739       ,5.006         ,1.146                     
4089 ,16   ,14   ,127      ,13   ,23   ,5.688       ,4.98          ,1.142                     
4089 ,256  ,13   ,127      ,12   ,23   ,5.688       ,5.012         ,1.135                     
4089 ,256  ,14   ,127      ,13   ,23   ,5.688       ,4.98          ,1.142                     
4089 ,4    ,13   ,127      ,12   ,23   ,5.688       ,5.082         ,1.119                     
4089 ,4    ,14   ,127      ,13   ,23   ,5.687       ,5.137         ,1.107                     
4089 ,64   ,13   ,127      ,12   ,23   ,5.688       ,5.056         ,1.125                     
4089 ,64   ,14   ,127      ,13   ,23   ,5.717       ,5.006         ,1.142                     
4090 ,1    ,11   ,127      ,10   ,0    ,5.581       ,5.162         ,1.081                     
4090 ,1    ,11   ,127      ,10   ,23   ,5.688       ,5.03          ,1.131                     
4090 ,1    ,12   ,127      ,11   ,0    ,5.415       ,5.184         ,1.045                     
4090 ,1    ,12   ,127      ,11   ,23   ,5.718       ,5.031         ,1.137                     
4090 ,16   ,11   ,127      ,10   ,23   ,5.668       ,4.956         ,1.144                     
4090 ,16   ,12   ,127      ,11   ,23   ,5.746       ,5.032         ,1.142                     
4090 ,256  ,11   ,127      ,10   ,23   ,5.659       ,4.98          ,1.136                     
4090 ,256  ,12   ,127      ,11   ,23   ,5.741       ,4.977         ,1.154                     
4090 ,4    ,11   ,127      ,10   ,23   ,5.658       ,5.032         ,1.125                     
4090 ,4    ,12   ,127      ,11   ,23   ,5.66        ,5.006         ,1.131                     
4090 ,64   ,11   ,127      ,10   ,23   ,5.659       ,5.006         ,1.13                      
4090 ,64   ,12   ,127      ,11   ,23   ,5.733       ,5.006         ,1.145                     
4091 ,1    ,10   ,127      ,9    ,0    ,5.415       ,4.843         ,1.118                     
4091 ,1    ,10   ,127      ,9    ,23   ,5.659       ,5.032         ,1.125                     
4091 ,1    ,9    ,127      ,8    ,0    ,5.452       ,4.767         ,1.144                     
4091 ,1    ,9    ,127      ,8    ,23   ,5.659       ,5.005         ,1.131                     
4091 ,16   ,10   ,127      ,9    ,23   ,5.718       ,5.056         ,1.131                     
4091 ,16   ,9    ,127      ,8    ,23   ,5.688       ,5.032         ,1.13                      
4091 ,256  ,10   ,127      ,9    ,23   ,5.659       ,5.083         ,1.113                     
4091 ,256  ,9    ,127      ,8    ,23   ,5.632       ,5.032         ,1.119                     
4091 ,4    ,10   ,127      ,9    ,23   ,5.718       ,5.006         ,1.142                     
4091 ,4    ,9    ,127      ,8    ,23   ,5.717       ,5.057         ,1.13                      
4091 ,64   ,10   ,127      ,9    ,23   ,5.704       ,5.084         ,1.122                     
4091 ,64   ,9    ,127      ,8    ,23   ,5.659       ,5.03          ,1.125                     
4092 ,1    ,7    ,127      ,6    ,0    ,5.795       ,4.863         ,1.192                     
4092 ,1    ,7    ,127      ,6    ,23   ,5.713       ,4.981         ,1.147                     
4092 ,1    ,8    ,127      ,7    ,0    ,5.469       ,4.886         ,1.119                     
4092 ,1    ,8    ,127      ,7    ,23   ,5.688       ,5.084         ,1.119                     
4092 ,16   ,7    ,127      ,6    ,23   ,5.659       ,5.061         ,1.118                     
4092 ,16   ,8    ,127      ,7    ,23   ,5.658       ,5.03          ,1.125                     
4092 ,256  ,7    ,127      ,6    ,23   ,5.688       ,5.007         ,1.136                     
4092 ,256  ,8    ,127      ,7    ,23   ,5.695       ,5.007         ,1.137                     
4092 ,4    ,7    ,127      ,6    ,23   ,5.688       ,5.068         ,1.122                     
4092 ,4    ,8    ,127      ,7    ,23   ,5.687       ,5.056         ,1.125                     
4092 ,64   ,7    ,127      ,6    ,23   ,5.704       ,4.98          ,1.145                     
4092 ,64   ,8    ,127      ,7    ,23   ,5.688       ,5.032         ,1.13                      
4093 ,1    ,5    ,127      ,4    ,0    ,5.313       ,4.837         ,1.098                     
4093 ,1    ,5    ,127      ,4    ,23   ,5.704       ,5.138         ,1.11                      
4093 ,1    ,6    ,127      ,5    ,0    ,5.49        ,4.861         ,1.129                     
4093 ,1    ,6    ,127      ,5    ,23   ,5.717       ,5.082         ,1.125                     
4093 ,16   ,5    ,127      ,4    ,23   ,5.718       ,5.021         ,1.139                     
4093 ,16   ,6    ,127      ,5    ,23   ,5.688       ,5.005         ,1.136                     
4093 ,256  ,5    ,127      ,4    ,23   ,5.872       ,4.956         ,1.185                     
4093 ,256  ,6    ,127      ,5    ,23   ,5.631       ,4.956         ,1.136                     
4093 ,4    ,5    ,127      ,4    ,23   ,5.716       ,5.031         ,1.136                     
4093 ,4    ,6    ,127      ,5    ,23   ,5.687       ,5.03          ,1.131                     
4093 ,64   ,5    ,127      ,4    ,23   ,5.659       ,4.972         ,1.138                     
4093 ,64   ,6    ,127      ,5    ,23   ,5.688       ,5.006         ,1.136                     
4094 ,1    ,3    ,127      ,2    ,0    ,5.345       ,4.723         ,1.132                     
4094 ,1    ,3    ,127      ,2    ,23   ,5.719       ,5.005         ,1.143                     
4094 ,1    ,4    ,127      ,3    ,0    ,5.753       ,4.767         ,1.207                     
4094 ,1    ,4    ,127      ,3    ,23   ,5.689       ,4.98          ,1.142                     
4094 ,16   ,3    ,127      ,2    ,23   ,5.749       ,5.062         ,1.136                     
4094 ,16   ,4    ,127      ,3    ,23   ,5.717       ,5.006         ,1.142                     
4094 ,256  ,3    ,127      ,2    ,23   ,5.661       ,4.932         ,1.148                     
4094 ,256  ,4    ,127      ,3    ,23   ,5.748       ,4.981         ,1.154                     
4094 ,4    ,3    ,127      ,2    ,23   ,5.66        ,5.005         ,1.131                     
4094 ,4    ,4    ,127      ,3    ,23   ,5.698       ,5.005         ,1.138                     
4094 ,64   ,3    ,127      ,2    ,23   ,5.718       ,5.0           ,1.144                     
4094 ,64   ,4    ,127      ,3    ,23   ,5.689       ,5.005         ,1.137                     
4095 ,1    ,1    ,127      ,0    ,0    ,4.077       ,3.561         ,1.145                     
4095 ,1    ,1    ,127      ,0    ,23   ,5.475       ,5.007         ,1.093                     
4095 ,1    ,2    ,127      ,1    ,0    ,5.464       ,4.946         ,1.105                     
4095 ,1    ,2    ,127      ,1    ,23   ,5.72        ,5.005         ,1.143                     
4095 ,16   ,1    ,127      ,0    ,23   ,5.421       ,5.033         ,1.077                     
4095 ,16   ,2    ,127      ,1    ,23   ,5.661       ,4.981         ,1.137                     
4095 ,256  ,1    ,127      ,0    ,23   ,5.413       ,4.974         ,1.088                     
4095 ,256  ,2    ,127      ,1    ,23   ,5.72        ,5.034         ,1.136                     
4095 ,4    ,1    ,127      ,0    ,23   ,5.418       ,5.008         ,1.082                     
4095 ,4    ,2    ,127      ,1    ,23   ,5.689       ,5.057         ,1.125                     
4095 ,64   ,1    ,127      ,0    ,23   ,5.325       ,4.957         ,1.074                     
4095 ,64   ,2    ,127      ,1    ,23   ,5.733       ,5.059         ,1.133                     
45   ,1    ,256  ,127      ,64   ,0    ,4.861       ,3.978         ,1.222                     
45   ,1    ,256  ,127      ,64   ,23   ,8.041       ,11.454        ,0.702                     
45   ,16   ,256  ,127      ,64   ,23   ,8.175       ,11.625        ,0.703                     
45   ,256  ,256  ,127      ,64   ,23   ,9.37        ,12.999        ,0.721                     
45   ,4    ,256  ,127      ,64   ,23   ,8.129       ,11.549        ,0.704                     
45   ,64   ,256  ,127      ,64   ,23   ,10.391      ,11.431        ,0.909                     
5    ,1    ,2048 ,127      ,512  ,0    ,13.039      ,18.823        ,0.693                     
5    ,1    ,2048 ,127      ,512  ,23   ,34.602      ,79.639        ,0.434                     
5    ,1    ,256  ,127      ,64   ,0    ,4.676       ,4.007         ,1.167                     
5    ,1    ,256  ,127      ,64   ,23   ,8.091       ,11.639        ,0.695                     
5    ,16   ,2048 ,127      ,512  ,23   ,35.788      ,60.545        ,0.591                     
5    ,16   ,256  ,127      ,64   ,23   ,8.083       ,11.504        ,0.703                     
5    ,256  ,2048 ,127      ,512  ,23   ,35.448      ,60.085        ,0.59                      
5    ,256  ,256  ,127      ,64   ,23   ,9.772       ,11.601        ,0.842                     
5    ,4    ,2048 ,127      ,512  ,23   ,36.09       ,58.237        ,0.62                      
5    ,4    ,256  ,127      ,64   ,23   ,8.101       ,11.504        ,0.704                     
5    ,64   ,2048 ,127      ,512  ,23   ,35.453      ,60.442        ,0.587                     
5    ,64   ,256  ,127      ,64   ,23   ,9.632       ,11.439        ,0.842                     
6    ,1    ,2048 ,127      ,1024 ,0    ,18.531      ,29.79         ,0.622                     
6    ,1    ,2048 ,127      ,1024 ,23   ,35.371      ,57.648        ,0.614                     
6    ,1    ,256  ,127      ,64   ,0    ,4.875       ,4.037         ,1.208                     
6    ,1    ,256  ,127      ,64   ,23   ,8.062       ,11.617        ,0.694                     
6    ,16   ,2048 ,127      ,1024 ,23   ,37.311      ,66.298        ,0.563                     
6    ,16   ,256  ,127      ,64   ,23   ,8.117       ,11.491        ,0.706                     
6    ,256  ,2048 ,127      ,1024 ,23   ,40.04       ,65.602        ,0.61                      
6    ,256  ,256  ,127      ,64   ,23   ,9.415       ,11.533        ,0.816                     
6    ,4    ,2048 ,127      ,1024 ,23   ,35.439      ,60.21         ,0.589                     
6    ,4    ,256  ,127      ,64   ,23   ,8.203       ,11.582        ,0.708                     
6    ,64   ,2048 ,127      ,1024 ,23   ,38.201      ,66.0          ,0.579                     
6    ,64   ,256  ,127      ,64   ,23   ,8.092       ,11.425        ,0.708                     
60   ,1    ,256  ,127      ,64   ,0    ,4.59        ,3.875         ,1.184                     
60   ,1    ,256  ,127      ,64   ,23   ,8.382       ,11.496        ,0.729                     
60   ,16   ,256  ,127      ,64   ,23   ,8.134       ,11.624        ,0.7                       
60   ,256  ,256  ,127      ,64   ,23   ,9.944       ,11.474        ,0.867                     
60   ,4    ,256  ,127      ,64   ,23   ,8.321       ,11.537        ,0.721                     
60   ,64   ,256  ,127      ,64   ,23   ,9.687       ,11.535        ,0.84                      
7    ,1    ,2048 ,127      ,2048 ,0    ,29.34       ,74.074        ,0.396                     
7    ,1    ,2048 ,127      ,2048 ,23   ,34.479      ,79.704        ,0.433                     
7    ,1    ,256  ,127      ,64   ,0    ,4.767       ,4.047         ,1.178                     
7    ,1    ,256  ,127      ,64   ,23   ,8.158       ,11.55         ,0.706                     
7    ,16   ,2048 ,127      ,2048 ,23   ,48.164      ,77.397        ,0.622                     
7    ,16   ,256  ,127      ,64   ,23   ,8.103       ,11.594        ,0.699                     
7    ,256  ,2048 ,127      ,2048 ,23   ,46.693      ,99.52         ,0.469                     
7    ,256  ,256  ,127      ,64   ,23   ,9.556       ,11.495        ,0.831                     
7    ,4    ,2048 ,127      ,2048 ,23   ,35.553      ,58.184        ,0.611                     
7    ,4    ,256  ,127      ,64   ,23   ,8.294       ,11.619        ,0.714                     
7    ,64   ,2048 ,127      ,2048 ,23   ,41.073      ,101.721       ,0.404                     
7    ,64   ,256  ,127      ,64   ,23   ,8.404       ,11.592        ,0.725                     
75   ,1    ,256  ,127      ,64   ,0    ,5.009       ,4.035         ,1.241                     
75   ,1    ,256  ,127      ,64   ,23   ,8.211       ,11.62         ,0.707                     
75   ,16   ,256  ,127      ,64   ,23   ,8.476       ,11.6          ,0.731                     
75   ,256  ,256  ,127      ,64   ,23   ,11.548      ,13.3          ,0.868                     
75   ,4    ,256  ,127      ,64   ,23   ,8.103       ,11.563        ,0.701                     
75   ,64   ,256  ,127      ,64   ,23   ,8.354       ,11.619        ,0.719                     
8    ,1    ,2048 ,127      ,4096 ,0    ,33.862      ,70.934        ,0.477                     
8    ,1    ,2048 ,127      ,4096 ,23   ,34.545      ,77.425        ,0.446                     
8    ,16   ,2048 ,127      ,4096 ,23   ,38.225      ,60.633        ,0.63                      
8    ,256  ,2048 ,127      ,4096 ,23   ,41.67       ,102.809       ,0.405                     
8    ,4    ,2048 ,127      ,4096 ,23   ,34.59       ,56.665        ,0.61                      
8    ,64   ,2048 ,127      ,4096 ,23   ,41.385      ,99.125        ,0.418                     
90   ,1    ,256  ,127      ,64   ,0    ,4.784       ,3.861         ,1.239                     
90   ,1    ,256  ,127      ,64   ,23   ,8.655       ,11.568        ,0.748                     
90   ,16   ,256  ,127      ,64   ,23   ,8.955       ,11.543        ,0.776                     
90   ,256  ,256  ,127      ,64   ,23   ,11.125      ,13.154        ,0.846                     
90   ,4    ,256  ,127      ,64   ,23   ,8.104       ,11.518        ,0.704                     
90   ,64   ,256  ,127      ,64   ,23   ,8.446       ,11.554        ,0.731                     
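
For readers of the table above: the last column appears to be the quotient of the two timing columns before it, and the summary figures quoted in this thread are geometric means of such quotients. A minimal sketch of that arithmetic, assuming the two timing columns are New and Original in that order (the column headers are not repeated at this point) and using three rows copied from the table. The helper names are made up for illustration.

```python
import math

# (new_time, original_time, ratio_as_printed) copied from three table rows.
rows = [
    (29.34, 74.074, 0.396),
    (33.862, 70.934, 0.477),
    (4.077, 3.561, 1.145),
]

def ratio(new, original):
    """New / Original; below 1.0 means the new code took less time."""
    return new / original

def geometric_mean(values):
    """Geometric mean computed as exp of the mean of the logarithms."""
    return math.exp(sum(math.log(v) for v in values) / len(values))

# The printed ratio column agrees with new / original to three decimals.
for new, original, printed in rows:
    assert abs(ratio(new, original) - printed) < 5e-4

sample = geometric_mean([ratio(n, o) for n, o, _ in rows])
print(f"geomean of the three sample rows: {sample:.3f}")
```

The geometric mean is the usual choice for averaging ratios because a 2x slowdown and a 2x speedup cancel to 1.0, which an arithmetic mean would not give.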


* Re: x86: Prepare `strrchr-evex` and `strrchr-evex512` for AVX10
  2023-09-21 14:39 ` Noah Goldstein
@ 2023-09-21 15:16   ` H.J. Lu
  2023-09-21 19:19     ` Noah Goldstein
  0 siblings, 1 reply; 12+ messages in thread
From: H.J. Lu @ 2023-09-21 15:16 UTC (permalink / raw)
  To: Noah Goldstein; +Cc: libc-alpha, carlos

On Thu, Sep 21, 2023 at 7:39 AM Noah Goldstein <goldstein.w.n@gmail.com> wrote:
>
> On Thu, Sep 21, 2023 at 9:38 AM Noah Goldstein <goldstein.w.n@gmail.com> wrote:
> >
> > This commit refactors `strrchr-evex` and `strrchr-evex512` to use a
> > common implementation: `strrchr-evex-base.S`.
> >
> > The motivation is that `strrchr-evex` needed to be refactored to not use
> > 64-bit masked registers in preparation for AVX10.
> >
> > Once vec-width masked register combining was removed, the EVEX and
> > EVEX512 implementations can easily be implemented in the same file
> > without any major overhead.
> >
> > The net result is performance improvements (measured on TGL) for both
> > `strrchr-evex` and `strrchr-evex512`. Note, however, that there are some
> > regressions in the test suite, and many of the cases that drive the
> > total geomean of improvement/regression across bench-strrchr may be
> > cold. The point of the performance measurement is to show there
> > are no major regressions, but the primary motivation is preparation
> > for AVX10.
> >
> > Benchmarks were taken on TGL:
> > https://www.intel.com/content/www/us/en/products/sku/213799/intel-core-i711850h-processor-24m-cache-up-to-4-80-ghz/specifications.html
> >
> > EVEX geometric_mean(N=5) of all benchmarks New / Original   : 0.74
> > EVEX512 geometric_mean(N=5) of all benchmarks New / Original: 0.87
> Full summary attached here.

The results look good to me.  I believe that this is the only 256-bit
EVEX function with 64-bit mask instructions.
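
To illustrate the point about 64-bit masks: with 32-byte vectors each byte compare yields a 32-bit mask. The old first_vec_x1_or_x2 path merged two of them (kunpckdq, then kmovq and a 64-bit bsr), while the new code tests VEC(3) and then VEC(2) separately. A rough Python model of the two approaches, assuming masks are plain integers with bit i set for a match at byte i. The function names are invented for this sketch and it is not the glibc code.

```python
VEC_SIZE = 32  # bytes per 256-bit vector, one mask bit per byte

def last_match_combined(mask_vec2, mask_vec3):
    """Old style: merge two 32-bit masks into a 64-bit mask, then one bsr."""
    combined = (mask_vec3 << VEC_SIZE) | mask_vec2  # needs a 64-bit mask
    if combined == 0:
        return None
    return combined.bit_length() - 1  # index of the highest set bit

def last_match_separate(mask_vec2, mask_vec3):
    """New style: check the later vector first, using only 32-bit masks."""
    if mask_vec3:
        return VEC_SIZE + mask_vec3.bit_length() - 1
    if mask_vec2:
        return mask_vec2.bit_length() - 1
    return None
```

Both functions return the byte offset of the last match relative to the start of VEC(2), or None when neither vector matches. The second form costs an extra branch but never handles a mask wider than the vector's own.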

> >
> > Full check passes on x86.
> > ---
> >  sysdeps/x86_64/multiarch/strrchr-evex-base.S | 466 ++++++++++++-------
> >  sysdeps/x86_64/multiarch/strrchr-evex.S      | 392 +---------------
> >  sysdeps/x86_64/multiarch/wcsrchr-evex.S      |   1 +
> >  3 files changed, 294 insertions(+), 565 deletions(-)
> >
> > diff --git a/sysdeps/x86_64/multiarch/strrchr-evex-base.S b/sysdeps/x86_64/multiarch/strrchr-evex-base.S
> > index 58b2853ab6..2c98f07fca 100644
> > --- a/sysdeps/x86_64/multiarch/strrchr-evex-base.S
> > +++ b/sysdeps/x86_64/multiarch/strrchr-evex-base.S
> > @@ -25,240 +25,354 @@
> >  # include <sysdep.h>
> >
> >  # ifdef USE_AS_WCSRCHR
> > +#  if VEC_SIZE == 64
> > +#   define RCX_M       cx
> > +#   define kortestM    kortestw
> > +#  else
> > +#   define RCX_M       cl
> > +#   define kortestM    kortestb
> > +#  endif
> > +
> > +#  define SHIFT_REG    VRCX
> > +#  define VPCOMPRESS   vpcompressd
> >  #  define CHAR_SIZE    4
> > -#  define VPBROADCAST   vpbroadcastd
> > -#  define VPCMPEQ      vpcmpeqd
> > -#  define VPMINU       vpminud
> > +#  define VPMIN        vpminud
> >  #  define VPTESTN      vptestnmd
> > +#  define VPTEST       vptestmd
> > +#  define VPBROADCAST  vpbroadcastd
> > +#  define VPCMPEQ      vpcmpeqd
> > +#  define VPCMP        vpcmpd
> >  # else
> > +#  define SHIFT_REG    VRDI
> > +#  define VPCOMPRESS   vpcompressb
> >  #  define CHAR_SIZE    1
> > -#  define VPBROADCAST   vpbroadcastb
> > -#  define VPCMPEQ      vpcmpeqb
> > -#  define VPMINU       vpminub
> > +#  define VPMIN        vpminub
> >  #  define VPTESTN      vptestnmb
> > +#  define VPTEST       vptestmb
> > +#  define VPBROADCAST  vpbroadcastb
> > +#  define VPCMPEQ      vpcmpeqb
> > +#  define VPCMP        vpcmpb
> > +
> > +#  define RCX_M        VRCX
> > +#  define kortestM     KORTEST
> >  # endif
> >
> > -# define PAGE_SIZE     4096
> > +# define VMATCH        VMM(0)
> >  # define CHAR_PER_VEC  (VEC_SIZE / CHAR_SIZE)
> > +# define PAGE_SIZE     4096
> >
> >         .section SECTION(.text), "ax", @progbits
> > -/* Aligning entry point to 64 byte, provides better performance for
> > -   one vector length string.  */
> > -ENTRY_P2ALIGN (STRRCHR, 6)
> > -
> > -       /* Broadcast CHAR to VMM(0).  */
> > -       VPBROADCAST %esi, %VMM(0)
> > +       /* Aligning entry point to 64 byte, provides better performance for
> > +          one vector length string.  */
> > +ENTRY_P2ALIGN(STRRCHR, 6)
> >         movl    %edi, %eax
> > -       sall    $20, %eax
> > -       cmpl    $((PAGE_SIZE - VEC_SIZE) << 20), %eax
> > -       ja      L(page_cross)
> > +       /* Broadcast CHAR to VMATCH.  */
> > +       VPBROADCAST %esi, %VMATCH
> >
> > -L(page_cross_continue):
> > -       /* Compare [w]char for null, mask bit will be set for match.  */
> > -       VMOVU   (%rdi), %VMM(1)
> > +       andl    $(PAGE_SIZE - 1), %eax
> > +       cmpl    $(PAGE_SIZE - VEC_SIZE), %eax
> > +       jg      L(cross_page_boundary)
> >
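
The new entry check masks the address down to its page offset and compares it against PAGE_SIZE - VEC_SIZE. A small Python model of that test, assuming 4 KiB pages and an unaligned VEC_SIZE-byte load from the start of the string. `crosses_page` is an illustrative name only.

```python
PAGE_SIZE = 4096

def crosses_page(addr, vec_size):
    """True if a vec_size-byte load starting at addr touches the next page."""
    page_offset = addr & (PAGE_SIZE - 1)       # andl $(PAGE_SIZE - 1), %eax
    return page_offset > PAGE_SIZE - vec_size  # cmpl ...; jg L(cross_page_boundary)
```

An offset of exactly PAGE_SIZE - VEC_SIZE is still safe, since the load then ends on the last byte of the page.
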
> > -       VPTESTN %VMM(1), %VMM(1), %k1
> > -       KMOV    %k1, %VRCX
> > -       test    %VRCX, %VRCX
> > -       jz      L(align_more)
> > -
> > -       VPCMPEQ %VMM(1), %VMM(0), %k0
> > -       KMOV    %k0, %VRAX
> > -       BLSMSK  %VRCX, %VRCX
> > -       and     %VRCX, %VRAX
> > -       jz      L(ret)
> > -
> > -       BSR     %VRAX, %VRAX
> > +       VMOVU   (%rdi), %VMM(1)
> > +       /* k0 has a 1 for each zero CHAR in YMM1.  */
> > +       VPTESTN %VMM(1), %VMM(1), %k0
> > +       KMOV    %k0, %VGPR(rsi)
> > +       test    %VGPR(rsi), %VGPR(rsi)
> > +       jz      L(aligned_more)
> > +       /* fallthrough: zero CHAR in first VEC.  */
> > +L(page_cross_return):
> > +       /* K1 has a 1 for each search CHAR match in VEC(1).  */
> > +       VPCMPEQ %VMATCH, %VMM(1), %k1
> > +       KMOV    %k1, %VGPR(rax)
> > +       /* Build mask up until first zero CHAR (used to mask off
> > +          potential search CHAR matches past the end of the string).  */
> > +       blsmsk  %VGPR(rsi), %VGPR(rsi)
> > +       and     %VGPR(rsi), %VGPR(rax)
> > +       jz      L(ret0)
> > +       /* Get last match (the `and` removed any out of bounds matches).  */
> > +       bsr     %VGPR(rax), %VGPR(rax)
> >  # ifdef USE_AS_WCSRCHR
> >         leaq    (%rdi, %rax, CHAR_SIZE), %rax
> >  # else
> > -       add     %rdi, %rax
> > +       addq    %rdi, %rax
> >  # endif
> > -L(ret):
> > +L(ret0):
> >         ret
> >
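
The blsmsk / and / bsr sequence in the first-vector path can be modeled on plain integers. A hedged sketch, assuming bit i of each mask corresponds to CHAR i of the vector. `masks_for` and `last_match_before_nul` are hypothetical helpers, not glibc functions.

```python
def masks_for(data, ch):
    """Build per-byte masks the way VPCMPEQ and VPTESTN would."""
    match = sum(1 << i for i, b in enumerate(data) if b == ch)
    zero = sum(1 << i for i, b in enumerate(data) if b == 0)
    return match, zero

def last_match_before_nul(match_mask, zero_mask):
    """Index of the last match at or before the first NUL, else None.

    zero_mask must be non-zero, i.e. the vector contains the terminator.
    """
    assert zero_mask != 0
    in_bounds = zero_mask ^ (zero_mask - 1)  # blsmsk: bits up to lowest set bit
    hits = match_mask & in_bounds            # drop matches past the terminator
    if hits == 0:
        return None                          # jz L(ret0)
    return hits.bit_length() - 1             # bsr: highest set bit
```

Because blsmsk keeps the lowest set bit itself, searching for the terminator returns its own position, which matches what strrchr is expected to do for a zero CHAR.
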
> > -L(vector_x2_end):
> > -       VPCMPEQ %VMM(2), %VMM(0), %k2
> > -       KMOV    %k2, %VRAX
> > -       BLSMSK  %VRCX, %VRCX
> > -       and     %VRCX, %VRAX
> > -       jz      L(vector_x1_ret)
> > -
> > -       BSR     %VRAX, %VRAX
> > -       leaq    (VEC_SIZE)(%rdi, %rax, CHAR_SIZE), %rax
> > -       ret
> > -
> > -       /* Check the first vector at very last to look for match.  */
> > -L(vector_x1_ret):
> > -       VPCMPEQ %VMM(1), %VMM(0), %k2
> > -       KMOV    %k2, %VRAX
> > -       test    %VRAX, %VRAX
> > -       jz      L(ret)
> > -
> > -       BSR     %VRAX, %VRAX
> > +       /* Returns for first vec x1/x2/x3 have hard coded backward
> > +          search path for earlier matches.  */
> > +       .p2align 4,, 6
> > +L(first_vec_x1):
> > +       VPCMPEQ %VMATCH, %VMM(2), %k1
> > +       KMOV    %k1, %VGPR(rax)
> > +       blsmsk  %VGPR(rcx), %VGPR(rcx)
> > +       /* eax non-zero if search CHAR in range.  */
> > +       and     %VGPR(rcx), %VGPR(rax)
> > +       jnz     L(first_vec_x1_return)
> > +
> > +       /* fallthrough: no match in YMM2 then need to check for earlier
> > +          matches (in YMM1).  */
> > +       .p2align 4,, 4
> > +L(first_vec_x0_test):
> > +       VPCMPEQ %VMATCH, %VMM(1), %k1
> > +       KMOV    %k1, %VGPR(rax)
> > +       test    %VGPR(rax), %VGPR(rax)
> > +       jz      L(ret1)
> > +       bsr     %VGPR(rax), %VGPR(rax)
> >  # ifdef USE_AS_WCSRCHR
> >         leaq    (%rsi, %rax, CHAR_SIZE), %rax
> >  # else
> > -       add     %rsi, %rax
> > +
> > +       addq    %rsi, %rax
> >  # endif
> > +L(ret1):
> > +       ret
> > +
> > +       .p2align 4,, 10
> > +L(first_vec_x3):
> > +       VPCMPEQ %VMATCH, %VMM(4), %k1
> > +       KMOV    %k1, %VGPR(rax)
> > +       blsmsk  %VGPR(rcx), %VGPR(rcx)
> > +       /* If no search CHAR match in range check YMM1/YMM2/YMM3.  */
> > +       and     %VGPR(rcx), %VGPR(rax)
> > +       jz      L(first_vec_x1_or_x2)
> > +       bsr     %VGPR(rax), %VGPR(rax)
> > +       leaq    (VEC_SIZE * 3)(%rdi, %rax, CHAR_SIZE), %rax
> > +       ret
> > +       .p2align 4,, 4
> > +
> > +L(first_vec_x2):
> > +       VPCMPEQ %VMATCH, %VMM(3), %k1
> > +       KMOV    %k1, %VGPR(rax)
> > +       blsmsk  %VGPR(rcx), %VGPR(rcx)
> > +       /* Check YMM3 for last match first. If no match try YMM2/YMM1.  */
> > +       and     %VGPR(rcx), %VGPR(rax)
> > +       jz      L(first_vec_x0_x1_test)
> > +       bsr     %VGPR(rax), %VGPR(rax)
> > +       leaq    (VEC_SIZE * 2)(%r8, %rax, CHAR_SIZE), %rax
> >         ret
> >
> > -L(align_more):
> > -       /* Zero r8 to store match result.  */
> > -       xorl    %r8d, %r8d
> > -       /* Save pointer of first vector, in case if no match found.  */
> > +       .p2align 4,, 6
> > +L(first_vec_x0_x1_test):
> > +       VPCMPEQ %VMATCH, %VMM(2), %k1
> > +       KMOV    %k1, %VGPR(rax)
> > +       /* Check YMM2 for last match first. If no match try YMM1.  */
> > +       test    %VGPR(rax), %VGPR(rax)
> > +       jz      L(first_vec_x0_test)
> > +       .p2align 4,, 4
> > +L(first_vec_x1_return):
> > +       bsr     %VGPR(rax), %VGPR(rax)
> > +       leaq    (VEC_SIZE)(%r8, %rax, CHAR_SIZE), %rax
> > +       ret
> > +
> > +       .p2align 4,, 12
> > +L(aligned_more):
> > +L(page_cross_continue):
> > +       /* Need to keep original pointer in case VEC(1) has last match.  */
> >         movq    %rdi, %rsi
> > -       /* Align pointer to vector size.  */
> >         andq    $-VEC_SIZE, %rdi
> > -       /* Loop unroll for 2 vector loop.  */
> > -       VMOVA   (VEC_SIZE)(%rdi), %VMM(2)
> > +
> > +       VMOVU   VEC_SIZE(%rdi), %VMM(2)
> >         VPTESTN %VMM(2), %VMM(2), %k0
> >         KMOV    %k0, %VRCX
> > +       movq    %rdi, %r8
> >         test    %VRCX, %VRCX
> > -       jnz     L(vector_x2_end)
> > +       jnz     L(first_vec_x1)
> > +
> > +       VMOVU   (VEC_SIZE * 2)(%rdi), %VMM(3)
> > +       VPTESTN %VMM(3), %VMM(3), %k0
> > +       KMOV    %k0, %VRCX
> > +
> > +       test    %VRCX, %VRCX
> > +       jnz     L(first_vec_x2)
> > +
> > +       VMOVU   (VEC_SIZE * 3)(%rdi), %VMM(4)
> > +       VPTESTN %VMM(4), %VMM(4), %k0
> > +       KMOV    %k0, %VRCX
> > +
> > +       /* Intentionally use 64-bit here.  EVEX256 version needs 1-byte
> > +          padding for efficient nop before loop alignment.  */
> > +       test    %rcx, %rcx
> > +       jnz     L(first_vec_x3)
> >
> > -       /* Save pointer of second vector, in case if no match
> > -          found.  */
> > -       movq    %rdi, %r9
> > -       /* Align address to VEC_SIZE * 2 for loop.  */
> >         andq    $-(VEC_SIZE * 2), %rdi
> > +       .p2align 4
> > +L(first_aligned_loop):
> > +       /* Preserve VEC(1), VEC(2), VEC(3), and VEC(4) until we can
> > +          guarantee they don't store a match.  */
> > +       VMOVA   (VEC_SIZE * 4)(%rdi), %VMM(5)
> > +       VMOVA   (VEC_SIZE * 5)(%rdi), %VMM(6)
> >
> > -       .p2align 4,,11
> > -L(loop):
> > -       /* 2 vector loop, as it provide better performance as compared
> > -          to 4 vector loop.  */
> > -       VMOVA   (VEC_SIZE * 2)(%rdi), %VMM(3)
> > -       VMOVA   (VEC_SIZE * 3)(%rdi), %VMM(4)
> > -       VPCMPEQ %VMM(3), %VMM(0), %k1
> > -       VPCMPEQ %VMM(4), %VMM(0), %k2
> > -       VPMINU  %VMM(3), %VMM(4), %VMM(5)
> > -       VPTESTN %VMM(5), %VMM(5), %k0
> > -       KOR     %k1, %k2, %k3
> > -       subq    $-(VEC_SIZE * 2), %rdi
> > -       /* If k0 and k3 zero, match and end of string not found.  */
> > -       KORTEST %k0, %k3
> > -       jz      L(loop)
> > -
> > -       /* If k0 is non zero, end of string found.  */
> > -       KORTEST %k0, %k0
> > -       jnz     L(endloop)
> > -
> > -       lea     VEC_SIZE(%rdi), %r8
> > -       /* A match found, it need to be stored in r8 before loop
> > -          continue.  */
> > -       /* Check second vector first.  */
> > -       KMOV    %k2, %VRDX
> > -       test    %VRDX, %VRDX
> > -       jnz     L(loop_vec_x2_match)
> > +       VPCMP   $4, %VMM(5), %VMATCH, %k2
> > +       VPCMP   $4, %VMM(6), %VMATCH, %k3{%k2}
> > +
> > +       VPMIN   %VMM(5), %VMM(6), %VMM(7)
> > +
> > +       VPTEST  %VMM(7), %VMM(7), %k1{%k3}
> > +       subq    $(VEC_SIZE * -2), %rdi
> > +       kortestM %k1, %k1
> > +       jc      L(first_aligned_loop)
> >
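
The loop condition folds three checks into one mask. VPCMP $4 is a not-equal compare, the second compare and the VPTEST are masked by the previous result, and kortest sets CF only when the resulting mask is all ones, so `jc` keeps looping only while neither vector holds the search CHAR or a zero CHAR. A Python model of that chain, assuming byte-sized CHARs and vectors given as equal-length byte strings. `loop_continues` is a made-up name for this sketch.

```python
def loop_continues(vec5, vec6, ch):
    """True when neither vector contains ch or a zero byte."""
    n = len(vec5)
    all_ones = (1 << n) - 1
    # VPCMP $4: bit set where the lane is NOT equal to the search CHAR.
    k2 = sum(1 << i for i in range(n) if vec5[i] != ch)
    # Same compare on the second vector, masked by k2.
    k3 = sum(1 << i for i in range(n) if vec6[i] != ch) & k2
    # VPMIN: a zero in either vector shows up as a zero lane.
    vmin = [min(a, b) for a, b in zip(vec5, vec6)]
    # VPTEST masked by k3: bit set where the min lane is non-zero.
    k1 = sum(1 << i for i in range(n) if vmin[i] != 0) & k3
    # kortest sets CF when the mask is all ones; jc loops again.
    return k1 == all_ones
```

A single clear bit in k1 is enough to leave the loop, and the code after the loop then works out whether it was a match, a terminator, or both.
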
> > +       VPTESTN %VMM(7), %VMM(7), %k1
> >         KMOV    %k1, %VRDX
> > -       /* Match is in first vector, rdi offset need to be subtracted
> > -         by VEC_SIZE.  */
> > -       sub     $VEC_SIZE, %r8
> > -
> > -       /* If second vector doesn't have match, first vector must
> > -          have match.  */
> > -L(loop_vec_x2_match):
> > -       BSR     %VRDX, %VRDX
> > -# ifdef USE_AS_WCSRCHR
> > -       sal     $2, %rdx
> > -# endif
> > -       add     %rdx, %r8
> > -       jmp     L(loop)
> > +       test    %VRDX, %VRDX
> > +       jz      L(second_aligned_loop_prep)
> >
> > -L(endloop):
> > -       /* Check if string end in first loop vector.  */
> > -       VPTESTN %VMM(3), %VMM(3), %k0
> > -       KMOV    %k0, %VRCX
> > -       test    %VRCX, %VRCX
> > -       jnz     L(loop_vector_x1_end)
> > +       kortestM %k3, %k3
> > +       jnc     L(return_first_aligned_loop)
> >
> > -       /* Check if it has match in first loop vector.  */
> > -       KMOV    %k1, %VRAX
> > +       .p2align 4,, 6
> > +L(first_vec_x1_or_x2_or_x3):
> > +       VPCMPEQ %VMM(4), %VMATCH, %k4
> > +       KMOV    %k4, %VRAX
> >         test    %VRAX, %VRAX
> > -       jz      L(loop_vector_x2_end)
> > +       jz      L(first_vec_x1_or_x2)
> > +       bsr     %VRAX, %VRAX
> > +       leaq    (VEC_SIZE * 3)(%r8, %rax, CHAR_SIZE), %rax
> > +       ret
> >
> > -       BSR     %VRAX, %VRAX
> > -       leaq    (%rdi, %rax, CHAR_SIZE), %r8
> >
> > -       /* String must end in second loop vector.  */
> > -L(loop_vector_x2_end):
> > -       VPTESTN %VMM(4), %VMM(4), %k0
> > +       .p2align 4,, 8
> > +L(return_first_aligned_loop):
> > +       VPTESTN %VMM(5), %VMM(5), %k0
> >         KMOV    %k0, %VRCX
> > +       blsmsk  %VRCX, %VRCX
> > +       jnc     L(return_first_new_match_first)
> > +       blsmsk  %VRDX, %VRDX
> > +       VPCMPEQ %VMM(6), %VMATCH, %k0
> > +       KMOV    %k0, %VRAX
> > +       addq    $VEC_SIZE, %rdi
> > +       and     %VRDX, %VRAX
> > +       jnz     L(return_first_new_match_ret)
> > +       subq    $VEC_SIZE, %rdi
> > +L(return_first_new_match_first):
> >         KMOV    %k2, %VRAX
> > -       BLSMSK  %VRCX, %VRCX
> > -       /* Check if it has match in second loop vector.  */
> > +# ifdef USE_AS_WCSRCHR
> > +       xorl    $((1 << CHAR_PER_VEC)- 1), %VRAX
> >         and     %VRCX, %VRAX
> > -       jz      L(check_last_match)
> > +# else
> > +       andn    %VRCX, %VRAX, %VRAX
> > +# endif
> > +       jz      L(first_vec_x1_or_x2_or_x3)
> > +L(return_first_new_match_ret):
> > +       bsr     %VRAX, %VRAX
> > +       leaq    (VEC_SIZE * 2)(%rdi, %rax, CHAR_SIZE), %rax
> > +       ret
> >
> > -       BSR     %VRAX, %VRAX
> > -       leaq    (VEC_SIZE)(%rdi, %rax, CHAR_SIZE), %rax
> > +       .p2align 4,, 10
> > +L(first_vec_x1_or_x2):
> > +       VPCMPEQ %VMM(3), %VMATCH, %k3
> > +       KMOV    %k3, %VRAX
> > +       test    %VRAX, %VRAX
> > +       jz      L(first_vec_x0_x1_test)
> > +       bsr     %VRAX, %VRAX
> > +       leaq    (VEC_SIZE * 2)(%r8, %rax, CHAR_SIZE), %rax
> >         ret
> >
> > -       /* String end in first loop vector.  */
> > -L(loop_vector_x1_end):
> > -       KMOV    %k1, %VRAX
> > -       BLSMSK  %VRCX, %VRCX
> > -       /* Check if it has match in second loop vector.  */
> > -       and     %VRCX, %VRAX
> > -       jz      L(check_last_match)
> >
> > -       BSR     %VRAX, %VRAX
> > -       leaq    (%rdi, %rax, CHAR_SIZE), %rax
> > -       ret
> > +       .p2align 4
> > +       /* We can throw away the work done for the first 4x checks here
> > +          as we have a later match. This is the 'fast' path per se.  */
> > +L(second_aligned_loop_prep):
> > +L(second_aligned_loop_set_furthest_match):
> > +       movq    %rdi, %rsi
> > +       VMOVA   %VMM(5), %VMM(7)
> > +       VMOVA   %VMM(6), %VMM(8)
> > +       .p2align 4
> > +L(second_aligned_loop):
> > +       VMOVU   (VEC_SIZE * 4)(%rdi), %VMM(5)
> > +       VMOVU   (VEC_SIZE * 5)(%rdi), %VMM(6)
> > +       VPCMP   $4, %VMM(5), %VMATCH, %k2
> > +       VPCMP   $4, %VMM(6), %VMATCH, %k3{%k2}
> > +
> > +       VPMIN   %VMM(5), %VMM(6), %VMM(4)
> > +
> > +       VPTEST  %VMM(4), %VMM(4), %k1{%k3}
> > +       subq    $(VEC_SIZE * -2), %rdi
> > +       KMOV    %k1, %VRCX
> > +       inc     %RCX_M
> > +       jz      L(second_aligned_loop)
> > +       VPTESTN %VMM(4), %VMM(4), %k1
> > +       KMOV    %k1, %VRDX
> > +       test    %VRDX, %VRDX
> > +       jz      L(second_aligned_loop_set_furthest_match)
> >
> > -       /* No match in first and second loop vector.  */
> > -L(check_last_match):
> > -       /* Check if any match recorded in r8.  */
> > -       test    %r8, %r8
> > -       jz      L(vector_x2_ret)
> > -       movq    %r8, %rax
> > +       kortestM %k3, %k3
> > +       jnc     L(return_new_match)
> > +       /* branch here because there is a significant advantage in terms
> > +          of output dependency chance in using edx.  */
> > +
> > +
> > +L(return_old_match):
> > +       VPCMPEQ %VMM(8), %VMATCH, %k0
> > +       KMOV    %k0, %VRCX
> > +       bsr     %VRCX, %VRCX
> > +       jnz     L(return_old_match_ret)
> > +
> > +       VPCMPEQ %VMM(7), %VMATCH, %k0
> > +       KMOV    %k0, %VRCX
> > +       bsr     %VRCX, %VRCX
> > +       subq    $VEC_SIZE, %rsi
> > +L(return_old_match_ret):
> > +       leaq    (VEC_SIZE * 3)(%rsi, %rcx, CHAR_SIZE), %rax
> >         ret
> >
> > -       /* No match recorded in r8. Check the second saved vector
> > -          in beginning.  */
> > -L(vector_x2_ret):
> > -       VPCMPEQ %VMM(2), %VMM(0), %k2
> > -       KMOV    %k2, %VRAX
> > -       test    %VRAX, %VRAX
> > -       jz      L(vector_x1_ret)
> >
> > -       /* Match found in the second saved vector.  */
> > -       BSR     %VRAX, %VRAX
> > -       leaq    (VEC_SIZE)(%r9, %rax, CHAR_SIZE), %rax
> > +L(return_new_match):
> > +       VPTESTN %VMM(5), %VMM(5), %k0
> > +       KMOV    %k0, %VRCX
> > +       blsmsk  %VRCX, %VRCX
> > +       jnc     L(return_new_match_first)
> > +       dec     %VRDX
> > +       VPCMPEQ %VMM(6), %VMATCH, %k0
> > +       KMOV    %k0, %VRAX
> > +       addq    $VEC_SIZE, %rdi
> > +       and     %VRDX, %VRAX
> > +       jnz     L(return_new_match_ret)
> > +       subq    $VEC_SIZE, %rdi
> > +L(return_new_match_first):
> > +       KMOV    %k2, %VRAX
> > +# ifdef USE_AS_WCSRCHR
> > +       xorl    $((1 << CHAR_PER_VEC)- 1), %VRAX
> > +       and     %VRCX, %VRAX
> > +# else
> > +       andn    %VRCX, %VRAX, %VRAX
> > +# endif
> > +       jz      L(return_old_match)
> > +L(return_new_match_ret):
> > +       bsr     %VRAX, %VRAX
> > +       leaq    (VEC_SIZE * 2)(%rdi, %rax, CHAR_SIZE), %rax
> >         ret
> >
> > -L(page_cross):
> > -       mov     %rdi, %rax
> > -       movl    %edi, %ecx
> > +       .p2align 4,, 4
> > +L(cross_page_boundary):
> > +       xorq    %rdi, %rax
> > +       mov     $-1, %VRDX
> > +       VMOVU   (PAGE_SIZE - VEC_SIZE)(%rax), %VMM(6)
> > +       VPTESTN %VMM(6), %VMM(6), %k0
> > +       KMOV    %k0, %VRSI
> >
> >  # ifdef USE_AS_WCSRCHR
> > -       /* Calculate number of compare result bits to be skipped for
> > -          wide string alignment adjustment.  */
> > -       andl    $(VEC_SIZE - 1), %ecx
> > -       sarl    $2, %ecx
> > +       movl    %edi, %ecx
> > +       and     $(VEC_SIZE - 1), %ecx
> > +       shrl    $2, %ecx
> >  # endif
> > -       /* ecx contains number of w[char] to be skipped as a result
> > -          of address alignment.  */
> > -       andq    $-VEC_SIZE, %rax
> > -       VMOVA   (%rax), %VMM(1)
> > -       VPTESTN %VMM(1), %VMM(1), %k1
> > -       KMOV    %k1, %VRAX
> > -       SHR     %cl, %VRAX
> > -       jz      L(page_cross_continue)
> > -       VPCMPEQ %VMM(1), %VMM(0), %k0
> > -       KMOV    %k0, %VRDX
> > -       SHR     %cl, %VRDX
> > -       BLSMSK  %VRAX, %VRAX
> > -       and     %VRDX, %VRAX
> > -       jz      L(ret)
> > -       BSR     %VRAX, %VRAX
> > +       shlx    %SHIFT_REG, %VRDX, %VRDX
> > +
> >  # ifdef USE_AS_WCSRCHR
> > -       leaq    (%rdi, %rax, CHAR_SIZE), %rax
> > +       kmovw   %edx, %k1
> >  # else
> > -       add     %rdi, %rax
> > +       KMOV    %VRDX, %k1
> >  # endif
> >
> > -       ret
> > -END (STRRCHR)
> > +       VPCOMPRESS %VMM(6), %VMM(1){%k1}{z}
> > +       /* We could technically just jmp back after the vpcompress but
> > +          it doesn't save any 16-byte blocks.  */
> > +
> > +       shrx    %SHIFT_REG, %VRSI, %VRSI
> > +       test    %VRSI, %VRSI
> > +       jnz     L(page_cross_return)
> > +       jmp     L(page_cross_continue)
> > +       /* 1-byte from cache line.  */
> > +END(STRRCHR)
> >  #endif
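
The new page-cross path loads the last aligned vector of the page and uses a shifted all-ones mask with VPCOMPRESS to slide the in-string lanes down to lane 0, zero-filling the rest. A byte-level Python model of that idea, assuming `offset` is the string start's offset within the aligned vector. The function names are illustrative, and the wcsrchr variant (4-byte lanes) is not modeled.

```python
def page_cross_load(aligned_vec, offset):
    """Keep lanes >= offset, pack them to the front, zero-fill the tail."""
    n = len(aligned_vec)
    keep_mask = (-1 << offset) & ((1 << n) - 1)  # shlx on an all-ones register
    kept = [b for i, b in enumerate(aligned_vec) if (keep_mask >> i) & 1]
    return bytes(kept) + bytes(n - len(kept))    # {z}: unused lanes become 0

def shifted_zero_mask(aligned_vec, offset):
    """Zero-CHAR mask with the lanes before the string start shifted out."""
    zero_mask = sum(1 << i for i, b in enumerate(aligned_vec) if b == 0)
    return zero_mask >> offset                   # shrx
```

The result looks like what an unaligned load from the string start would have produced, so the normal first-vector code can be reused without reading past the page.
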
> > diff --git a/sysdeps/x86_64/multiarch/strrchr-evex.S b/sysdeps/x86_64/multiarch/strrchr-evex.S
> > index 85e3b0119f..b606e6f69c 100644
> > --- a/sysdeps/x86_64/multiarch/strrchr-evex.S
> > +++ b/sysdeps/x86_64/multiarch/strrchr-evex.S
> > @@ -1,394 +1,8 @@
> > -/* strrchr/wcsrchr optimized with 256-bit EVEX instructions.
> > -   Copyright (C) 2021-2023 Free Software Foundation, Inc.
> > -   This file is part of the GNU C Library.
> > -
> > -   The GNU C Library is free software; you can redistribute it and/or
> > -   modify it under the terms of the GNU Lesser General Public
> > -   License as published by the Free Software Foundation; either
> > -   version 2.1 of the License, or (at your option) any later version.
> > -
> > -   The GNU C Library is distributed in the hope that it will be useful,
> > -   but WITHOUT ANY WARRANTY; without even the implied warranty of
> > -   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> > -   Lesser General Public License for more details.
> > -
> > -   You should have received a copy of the GNU Lesser General Public
> > -   License along with the GNU C Library; if not, see
> > -   <https://www.gnu.org/licenses/>.  */
> > -
> > -#include <isa-level.h>
> > -
> > -#if ISA_SHOULD_BUILD (4)
> > -
> > -# include <sysdep.h>
> > -
> >  # ifndef STRRCHR
> >  #  define STRRCHR      __strrchr_evex
> >  # endif
> >
> > -# include "x86-evex256-vecs.h"
> > -
> > -# ifdef USE_AS_WCSRCHR
> > -#  define SHIFT_REG    rsi
> > -#  define kunpck_2x    kunpckbw
> > -#  define kmov_2x      kmovd
> > -#  define maskz_2x     ecx
> > -#  define maskm_2x     eax
> > -#  define CHAR_SIZE    4
> > -#  define VPMIN        vpminud
> > -#  define VPTESTN      vptestnmd
> > -#  define VPTEST       vptestmd
> > -#  define VPBROADCAST  vpbroadcastd
> > -#  define VPCMPEQ      vpcmpeqd
> > -#  define VPCMP        vpcmpd
> > -
> > -#  define USE_WIDE_CHAR
> > -# else
> > -#  define SHIFT_REG    rdi
> > -#  define kunpck_2x    kunpckdq
> > -#  define kmov_2x      kmovq
> > -#  define maskz_2x     rcx
> > -#  define maskm_2x     rax
> > -
> > -#  define CHAR_SIZE    1
> > -#  define VPMIN        vpminub
> > -#  define VPTESTN      vptestnmb
> > -#  define VPTEST       vptestmb
> > -#  define VPBROADCAST  vpbroadcastb
> > -#  define VPCMPEQ      vpcmpeqb
> > -#  define VPCMP        vpcmpb
> > -# endif
> > -
> > -# include "reg-macros.h"
> > -
> > -# define VMATCH        VMM(0)
> > -# define CHAR_PER_VEC  (VEC_SIZE / CHAR_SIZE)
> > -# define PAGE_SIZE     4096
> > -
> > -       .section SECTION(.text), "ax", @progbits
> > -ENTRY_P2ALIGN(STRRCHR, 6)
> > -       movl    %edi, %eax
> > -       /* Broadcast CHAR to VMATCH.  */
> > -       VPBROADCAST %esi, %VMATCH
> > -
> > -       andl    $(PAGE_SIZE - 1), %eax
> > -       cmpl    $(PAGE_SIZE - VEC_SIZE), %eax
> > -       jg      L(cross_page_boundary)
> > -L(page_cross_continue):
> > -       VMOVU   (%rdi), %VMM(1)
> > -       /* k0 has a 1 for each zero CHAR in VEC(1).  */
> > -       VPTESTN %VMM(1), %VMM(1), %k0
> > -       KMOV    %k0, %VRSI
> > -       test    %VRSI, %VRSI
> > -       jz      L(aligned_more)
> > -       /* fallthrough: zero CHAR in first VEC.  */
> > -       /* K1 has a 1 for each search CHAR match in VEC(1).  */
> > -       VPCMPEQ %VMATCH, %VMM(1), %k1
> > -       KMOV    %k1, %VRAX
> > -       /* Build mask up until first zero CHAR (used to mask of
> > -          potential search CHAR matches past the end of the string).
> > -        */
> > -       blsmsk  %VRSI, %VRSI
> > -       and     %VRSI, %VRAX
> > -       jz      L(ret0)
> > -       /* Get last match (the `and` removed any out of bounds matches).
> > -        */
> > -       bsr     %VRAX, %VRAX
> > -# ifdef USE_AS_WCSRCHR
> > -       leaq    (%rdi, %rax, CHAR_SIZE), %rax
> > -# else
> > -       addq    %rdi, %rax
> > -# endif
> > -L(ret0):
> > -       ret
> > -
> > -       /* Returns for first vec x1/x2/x3 have hard coded backward
> > -          search path for earlier matches.  */
> > -       .p2align 4,, 6
> > -L(first_vec_x1):
> > -       VPCMPEQ %VMATCH, %VMM(2), %k1
> > -       KMOV    %k1, %VRAX
> > -       blsmsk  %VRCX, %VRCX
> > -       /* eax non-zero if search CHAR in range.  */
> > -       and     %VRCX, %VRAX
> > -       jnz     L(first_vec_x1_return)
> > -
> > -       /* fallthrough: no match in VEC(2) then need to check for
> > -          earlier matches (in VEC(1)).  */
> > -       .p2align 4,, 4
> > -L(first_vec_x0_test):
> > -       VPCMPEQ %VMATCH, %VMM(1), %k1
> > -       KMOV    %k1, %VRAX
> > -       test    %VRAX, %VRAX
> > -       jz      L(ret1)
> > -       bsr     %VRAX, %VRAX
> > -# ifdef USE_AS_WCSRCHR
> > -       leaq    (%rsi, %rax, CHAR_SIZE), %rax
> > -# else
> > -       addq    %rsi, %rax
> > -# endif
> > -L(ret1):
> > -       ret
> > -
> > -       .p2align 4,, 10
> > -L(first_vec_x1_or_x2):
> > -       VPCMPEQ %VMM(3), %VMATCH, %k3
> > -       VPCMPEQ %VMM(2), %VMATCH, %k2
> > -       /* K2 and K3 have 1 for any search CHAR match. Test if any
> > -          matches between either of them. Otherwise check VEC(1).  */
> > -       KORTEST %k2, %k3
> > -       jz      L(first_vec_x0_test)
> > -
> > -       /* Guaranteed that VEC(2) and VEC(3) are within range so merge
> > -          the two bitmasks then get last result.  */
> > -       kunpck_2x %k2, %k3, %k3
> > -       kmov_2x %k3, %maskm_2x
> > -       bsr     %maskm_2x, %maskm_2x
> > -       leaq    (VEC_SIZE * 1)(%r8, %rax, CHAR_SIZE), %rax
> > -       ret
> > -
> > -       .p2align 4,, 7
> > -L(first_vec_x3):
> > -       VPCMPEQ %VMATCH, %VMM(4), %k1
> > -       KMOV    %k1, %VRAX
> > -       blsmsk  %VRCX, %VRCX
> > -       /* If no search CHAR match in range check VEC(1)/VEC(2)/VEC(3).
> > -        */
> > -       and     %VRCX, %VRAX
> > -       jz      L(first_vec_x1_or_x2)
> > -       bsr     %VRAX, %VRAX
> > -       leaq    (VEC_SIZE * 3)(%rdi, %rax, CHAR_SIZE), %rax
> > -       ret
> > -
> > -
> > -       .p2align 4,, 6
> > -L(first_vec_x0_x1_test):
> > -       VPCMPEQ %VMATCH, %VMM(2), %k1
> > -       KMOV    %k1, %VRAX
> > -       /* Check VEC(2) for last match first. If no match try VEC(1).
> > -        */
> > -       test    %VRAX, %VRAX
> > -       jz      L(first_vec_x0_test)
> > -       .p2align 4,, 4
> > -L(first_vec_x1_return):
> > -       bsr     %VRAX, %VRAX
> > -       leaq    (VEC_SIZE)(%rdi, %rax, CHAR_SIZE), %rax
> > -       ret
> > -
> > -
> > -       .p2align 4,, 10
> > -L(first_vec_x2):
> > -       VPCMPEQ %VMATCH, %VMM(3), %k1
> > -       KMOV    %k1, %VRAX
> > -       blsmsk  %VRCX, %VRCX
> > -       /* Check VEC(3) for last match first. If no match try
> > -          VEC(2)/VEC(1).  */
> > -       and     %VRCX, %VRAX
> > -       jz      L(first_vec_x0_x1_test)
> > -       bsr     %VRAX, %VRAX
> > -       leaq    (VEC_SIZE * 2)(%rdi, %rax, CHAR_SIZE), %rax
> > -       ret
> > -
> > -
> > -       .p2align 4,, 12
> > -L(aligned_more):
> > -       /* Need to keep original pointer in case VEC(1) has last match.
> > -        */
> > -       movq    %rdi, %rsi
> > -       andq    $-VEC_SIZE, %rdi
> > -
> > -       VMOVU   VEC_SIZE(%rdi), %VMM(2)
> > -       VPTESTN %VMM(2), %VMM(2), %k0
> > -       KMOV    %k0, %VRCX
> > -
> > -       test    %VRCX, %VRCX
> > -       jnz     L(first_vec_x1)
> > -
> > -       VMOVU   (VEC_SIZE * 2)(%rdi), %VMM(3)
> > -       VPTESTN %VMM(3), %VMM(3), %k0
> > -       KMOV    %k0, %VRCX
> > -
> > -       test    %VRCX, %VRCX
> > -       jnz     L(first_vec_x2)
> > -
> > -       VMOVU   (VEC_SIZE * 3)(%rdi), %VMM(4)
> > -       VPTESTN %VMM(4), %VMM(4), %k0
> > -       KMOV    %k0, %VRCX
> > -       movq    %rdi, %r8
> > -       test    %VRCX, %VRCX
> > -       jnz     L(first_vec_x3)
> > -
> > -       andq    $-(VEC_SIZE * 2), %rdi
> > -       .p2align 4,, 10
> > -L(first_aligned_loop):
> > -       /* Preserve VEC(1), VEC(2), VEC(3), and VEC(4) until we can
> > -          guarantee they don't store a match.  */
> > -       VMOVA   (VEC_SIZE * 4)(%rdi), %VMM(5)
> > -       VMOVA   (VEC_SIZE * 5)(%rdi), %VMM(6)
> > -
> > -       VPCMPEQ %VMM(5), %VMATCH, %k2
> > -       vpxord  %VMM(6), %VMATCH, %VMM(7)
> > -
> > -       VPMIN   %VMM(5), %VMM(6), %VMM(8)
> > -       VPMIN   %VMM(8), %VMM(7), %VMM(7)
> > -
> > -       VPTESTN %VMM(7), %VMM(7), %k1
> > -       subq    $(VEC_SIZE * -2), %rdi
> > -       KORTEST %k1, %k2
> > -       jz      L(first_aligned_loop)
> > -
> > -       VPCMPEQ %VMM(6), %VMATCH, %k3
> > -       VPTESTN %VMM(8), %VMM(8), %k1
> > -
> > -       /* If k1 is zero, then we found a CHAR match but no null-term.
> > -          We can now safely throw out VEC1-4.  */
> > -       KTEST   %k1, %k1
> > -       jz      L(second_aligned_loop_prep)
> > -
> > -       KORTEST %k2, %k3
> > -       jnz     L(return_first_aligned_loop)
> > -
> > -
> > -       .p2align 4,, 6
> > -L(first_vec_x1_or_x2_or_x3):
> > -       VPCMPEQ %VMM(4), %VMATCH, %k4
> > -       KMOV    %k4, %VRAX
> > -       bsr     %VRAX, %VRAX
> > -       jz      L(first_vec_x1_or_x2)
> > -       leaq    (VEC_SIZE * 3)(%r8, %rax, CHAR_SIZE), %rax
> > -       ret
> > -
> > -
> > -       .p2align 4,, 8
> > -L(return_first_aligned_loop):
> > -       VPTESTN %VMM(5), %VMM(5), %k0
> > -
> > -       /* Combined results from VEC5/6.  */
> > -       kunpck_2x %k0, %k1, %k0
> > -       kmov_2x %k0, %maskz_2x
> > -
> > -       blsmsk  %maskz_2x, %maskz_2x
> > -       kunpck_2x %k2, %k3, %k3
> > -       kmov_2x %k3, %maskm_2x
> > -       and     %maskz_2x, %maskm_2x
> > -       jz      L(first_vec_x1_or_x2_or_x3)
> > -
> > -       bsr     %maskm_2x, %maskm_2x
> > -       leaq    (VEC_SIZE * 2)(%rdi, %rax, CHAR_SIZE), %rax
> > -       ret
> > -
> > -       .p2align 4
> > -       /* We can throw away the work done for the first 4x checks here
> > -          as we have a later match. This is the 'fast' path persay.
> > -        */
> > -L(second_aligned_loop_prep):
> > -L(second_aligned_loop_set_furthest_match):
> > -       movq    %rdi, %rsi
> > -       /* Ideally we would safe k2/k3 but `kmov/kunpck` take uops on
> > -          port0 and have noticeable overhead in the loop.  */
> > -       VMOVA   %VMM(5), %VMM(7)
> > -       VMOVA   %VMM(6), %VMM(8)
> > -       .p2align 4
> > -L(second_aligned_loop):
> > -       VMOVU   (VEC_SIZE * 4)(%rdi), %VMM(5)
> > -       VMOVU   (VEC_SIZE * 5)(%rdi), %VMM(6)
> > -       VPCMPEQ %VMM(5), %VMATCH, %k2
> > -       vpxord  %VMM(6), %VMATCH, %VMM(3)
> > -
> > -       VPMIN   %VMM(5), %VMM(6), %VMM(4)
> > -       VPMIN   %VMM(3), %VMM(4), %VMM(3)
> > -
> > -       VPTESTN %VMM(3), %VMM(3), %k1
> > -       subq    $(VEC_SIZE * -2), %rdi
> > -       KORTEST %k1, %k2
> > -       jz      L(second_aligned_loop)
> > -       VPCMPEQ %VMM(6), %VMATCH, %k3
> > -       VPTESTN %VMM(4), %VMM(4), %k1
> > -       KTEST   %k1, %k1
> > -       jz      L(second_aligned_loop_set_furthest_match)
> > -
> > -       /* branch here because we know we have a match in VEC7/8 but
> > -          might not in VEC5/6 so the latter is expected to be less
> > -          likely.  */
> > -       KORTEST %k2, %k3
> > -       jnz     L(return_new_match)
> > -
> > -L(return_old_match):
> > -       VPCMPEQ %VMM(8), %VMATCH, %k0
> > -       KMOV    %k0, %VRCX
> > -       bsr     %VRCX, %VRCX
> > -       jnz     L(return_old_match_ret)
> > -
> > -       VPCMPEQ %VMM(7), %VMATCH, %k0
> > -       KMOV    %k0, %VRCX
> > -       bsr     %VRCX, %VRCX
> > -       subq    $VEC_SIZE, %rsi
> > -L(return_old_match_ret):
> > -       leaq    (VEC_SIZE * 3)(%rsi, %rcx, CHAR_SIZE), %rax
> > -       ret
> > -
> > -       .p2align 4,, 10
> > -L(return_new_match):
> > -       VPTESTN %VMM(5), %VMM(5), %k0
> > -
> > -       /* Combined results from VEC5/6.  */
> > -       kunpck_2x %k0, %k1, %k0
> > -       kmov_2x %k0, %maskz_2x
> > -
> > -       blsmsk  %maskz_2x, %maskz_2x
> > -       kunpck_2x %k2, %k3, %k3
> > -       kmov_2x %k3, %maskm_2x
> > -
> > -       /* Match at end was out-of-bounds so use last known match.  */
> > -       and     %maskz_2x, %maskm_2x
> > -       jz      L(return_old_match)
> > -
> > -       bsr     %maskm_2x, %maskm_2x
> > -       leaq    (VEC_SIZE * 2)(%rdi, %rax, CHAR_SIZE), %rax
> > -       ret
> > -
> > -L(cross_page_boundary):
> > -       /* eax contains all the page offset bits of src (rdi). `xor rdi,
> > -          rax` sets pointer will all page offset bits cleared so
> > -          offset of (PAGE_SIZE - VEC_SIZE) will get last aligned VEC
> > -          before page cross (guaranteed to be safe to read). Doing this
> > -          as opposed to `movq %rdi, %rax; andq $-VEC_SIZE, %rax` saves
> > -          a bit of code size.  */
> > -       xorq    %rdi, %rax
> > -       VMOVU   (PAGE_SIZE - VEC_SIZE)(%rax), %VMM(1)
> > -       VPTESTN %VMM(1), %VMM(1), %k0
> > -       KMOV    %k0, %VRCX
> > -
> > -       /* Shift out zero CHAR matches that are before the beginning of
> > -          src (rdi).  */
> > -# ifdef USE_AS_WCSRCHR
> > -       movl    %edi, %esi
> > -       andl    $(VEC_SIZE - 1), %esi
> > -       shrl    $2, %esi
> > -# endif
> > -       shrx    %VGPR(SHIFT_REG), %VRCX, %VRCX
> > -
> > -       test    %VRCX, %VRCX
> > -       jz      L(page_cross_continue)
> > +#include "x86-evex512-vecs.h"
> > +#include "reg-macros.h"
> >
> > -       /* Found zero CHAR so need to test for search CHAR.  */
> > -       VPCMP   $0, %VMATCH, %VMM(1), %k1
> > -       KMOV    %k1, %VRAX
> > -       /* Shift out search CHAR matches that are before the beginning of
> > -          src (rdi).  */
> > -       shrx    %VGPR(SHIFT_REG), %VRAX, %VRAX
> > -
> > -       /* Check if any search CHAR match in range.  */
> > -       blsmsk  %VRCX, %VRCX
> > -       and     %VRCX, %VRAX
> > -       jz      L(ret3)
> > -       bsr     %VRAX, %VRAX
> > -# ifdef USE_AS_WCSRCHR
> > -       leaq    (%rdi, %rax, CHAR_SIZE), %rax
> > -# else
> > -       addq    %rdi, %rax
> > -# endif
> > -L(ret3):
> > -       ret
> > -END(STRRCHR)
> > -#endif
> > +#include "strrchr-evex-base.S"
> > diff --git a/sysdeps/x86_64/multiarch/wcsrchr-evex.S b/sysdeps/x86_64/multiarch/wcsrchr-evex.S
> > index e5c5fe3bf2..a584cd3f43 100644
> > --- a/sysdeps/x86_64/multiarch/wcsrchr-evex.S
> > +++ b/sysdeps/x86_64/multiarch/wcsrchr-evex.S
> > @@ -4,4 +4,5 @@
> >
> >  #define STRRCHR        WCSRCHR
> >  #define USE_AS_WCSRCHR 1
> > +#define USE_WIDE_CHAR 1
> >  #include "strrchr-evex.S"
> > --
> > 2.34.1
> >



-- 
H.J.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: x86: Prepare `strrchr-evex` and `strrchr-evex512` for AVX10
  2023-09-21 15:16   ` H.J. Lu
@ 2023-09-21 19:19     ` Noah Goldstein
  0 siblings, 0 replies; 12+ messages in thread
From: Noah Goldstein @ 2023-09-21 19:19 UTC (permalink / raw)
  To: H.J. Lu; +Cc: libc-alpha, carlos

On Thu, Sep 21, 2023 at 10:17 AM H.J. Lu <hjl.tools@gmail.com> wrote:
>
> On Thu, Sep 21, 2023 at 7:39 AM Noah Goldstein <goldstein.w.n@gmail.com> wrote:
> >
> > On Thu, Sep 21, 2023 at 9:38 AM Noah Goldstein <goldstein.w.n@gmail.com> wrote:
> > >
> > > This commit refactors `strrchr-evex` and `strrchr-evex512` to use a
> > > common implementation: `strrchr-evex-base.S`.
> > >
> > > The motivation is that `strrchr-evex` needed to be refactored to not
> > > use 64-bit mask registers in preparation for AVX10.
> > >
> > > Once vec-width mask register combining is removed, the EVEX and
> > > EVEX512 implementations can easily be implemented in the same file
> > > without any major overhead.
> > >
> > > The net result is performance improvements (measured on TGL) for both
> > > `strrchr-evex` and `strrchr-evex512`. Note, however, that there are
> > > some regressions in the test suite, and many of the cases that drive
> > > the total geomean of improvement/regression across bench-strrchr may
> > > be cold. The point of the performance measurement is to show there
> > > are no major regressions; the primary motivation is preparation
> > > for AVX10.
> > >
> > > Benchmarks were taken on TGL:
> > > https://www.intel.com/content/www/us/en/products/sku/213799/intel-core-i711850h-processor-24m-cache-up-to-4-80-ghz/specifications.html
> > >
> > > EVEX geometric_mean(N=5) of all benchmarks New / Original   : 0.74
> > > EVEX512 geometric_mean(N=5) of all benchmarks New / Original: 0.87
> > Full summary attached here.
>
> The results look good to me.  I believe that this is the only 256-bit
> EVEX function with 64-bit mask instructions.
Yeah, but the reason for having separate ymm/zmm implementations was
that the ymm version would combine mask registers. Once we no longer
want to do that, we can just merge the two.
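
To illustrate the difference, here is a minimal C sketch (not the glibc
code). It assumes 32 chars per vector, GCC/Clang builtins, and made-up
helper names. `m_*` are search-CHAR match masks and `z_*` are zero-CHAR
masks for two adjacent vectors; the "combined" variant needs a 64-bit
mask, the "split" variant only ever touches vector-width masks.

```c
#include <stdint.h>

/* Same effect as the BMI1 `blsmsk` instruction: set every bit up to and
   including the lowest set bit (all ones if x == 0).  */
static inline uint64_t blsmsk64 (uint64_t x) { return x ^ (x - 1); }
static inline uint32_t blsmsk32 (uint32_t x) { return x ^ (x - 1); }

/* Hypothetical helper: merge both vectors' masks into 64 bits, then
   mask off matches past the terminator and take the highest bit.
   Returns the CHAR index of the last in-range match, or -1.  */
static int
last_match_combined (uint32_t m_lo, uint32_t m_hi,
                     uint32_t z_lo, uint32_t z_hi)
{
  uint64_t m = ((uint64_t) m_hi << 32) | m_lo;
  uint64_t z = ((uint64_t) z_hi << 32) | z_lo;
  m &= blsmsk64 (z);
  return m ? 63 - __builtin_clzll (m) : -1;
}

/* Hypothetical helper: same result using only 32-bit masks.  */
static int
last_match_split (uint32_t m_lo, uint32_t m_hi,
                  uint32_t z_lo, uint32_t z_hi)
{
  if (z_lo == 0)
    {
      /* Terminator (if any) is in the high vector.  */
      uint32_t hi = m_hi & blsmsk32 (z_hi);
      if (hi)
        return 32 + (31 - __builtin_clz (hi));
      /* No in-range match in the high vector; the whole low vector
         is in range.  */
      return m_lo ? 31 - __builtin_clz (m_lo) : -1;
    }
  /* Terminator is in the low vector; the high vector is out of range.  */
  uint32_t lo = m_lo & blsmsk32 (z_lo);
  return lo ? 31 - __builtin_clz (lo) : -1;
}
```

The split form costs an extra branch but never needs a mask wider than
one vector's worth of chars, which is the property that matters here.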
>
> > >
> > > Full check passes on x86.
> > > ---
> > >  sysdeps/x86_64/multiarch/strrchr-evex-base.S | 466 ++++++++++++-------
> > >  sysdeps/x86_64/multiarch/strrchr-evex.S      | 392 +---------------
> > >  sysdeps/x86_64/multiarch/wcsrchr-evex.S      |   1 +
> > >  3 files changed, 294 insertions(+), 565 deletions(-)
> > >
> > > diff --git a/sysdeps/x86_64/multiarch/strrchr-evex-base.S b/sysdeps/x86_64/multiarch/strrchr-evex-base.S
> > > index 58b2853ab6..2c98f07fca 100644
> > > --- a/sysdeps/x86_64/multiarch/strrchr-evex-base.S
> > > +++ b/sysdeps/x86_64/multiarch/strrchr-evex-base.S
> > > @@ -25,240 +25,354 @@
> > >  # include <sysdep.h>
> > >
> > >  # ifdef USE_AS_WCSRCHR
> > > +#  if VEC_SIZE == 64
> > > +#   define RCX_M       cx
> > > +#   define kortestM    kortestw
> > > +#  else
> > > +#   define RCX_M       cl
> > > +#   define kortestM    kortestb
> > > +#  endif
> > > +
> > > +#  define SHIFT_REG    VRCX
> > > +#  define VPCOMPRESS   vpcompressd
> > >  #  define CHAR_SIZE    4
> > > -#  define VPBROADCAST   vpbroadcastd
> > > -#  define VPCMPEQ      vpcmpeqd
> > > -#  define VPMINU       vpminud
> > > +#  define VPMIN        vpminud
> > >  #  define VPTESTN      vptestnmd
> > > +#  define VPTEST       vptestmd
> > > +#  define VPBROADCAST  vpbroadcastd
> > > +#  define VPCMPEQ      vpcmpeqd
> > > +#  define VPCMP        vpcmpd
> > >  # else
> > > +#  define SHIFT_REG    VRDI
> > > +#  define VPCOMPRESS   vpcompressb
> > >  #  define CHAR_SIZE    1
> > > -#  define VPBROADCAST   vpbroadcastb
> > > -#  define VPCMPEQ      vpcmpeqb
> > > -#  define VPMINU       vpminub
> > > +#  define VPMIN        vpminub
> > >  #  define VPTESTN      vptestnmb
> > > +#  define VPTEST       vptestmb
> > > +#  define VPBROADCAST  vpbroadcastb
> > > +#  define VPCMPEQ      vpcmpeqb
> > > +#  define VPCMP        vpcmpb
> > > +
> > > +#  define RCX_M        VRCX
> > > +#  define kortestM     KORTEST
> > >  # endif
> > >
> > > -# define PAGE_SIZE     4096
> > > +# define VMATCH        VMM(0)
> > >  # define CHAR_PER_VEC  (VEC_SIZE / CHAR_SIZE)
> > > +# define PAGE_SIZE     4096
> > >
> > >         .section SECTION(.text), "ax", @progbits
> > > -/* Aligning entry point to 64 byte, provides better performance for
> > > -   one vector length string.  */
> > > -ENTRY_P2ALIGN (STRRCHR, 6)
> > > -
> > > -       /* Broadcast CHAR to VMM(0).  */
> > > -       VPBROADCAST %esi, %VMM(0)
> > > +       /* Aligning entry point to 64 byte, provides better performance for
> > > +          one vector length string.  */
> > > +ENTRY_P2ALIGN(STRRCHR, 6)
> > >         movl    %edi, %eax
> > > -       sall    $20, %eax
> > > -       cmpl    $((PAGE_SIZE - VEC_SIZE) << 20), %eax
> > > -       ja      L(page_cross)
> > > +       /* Broadcast CHAR to VMATCH.  */
> > > +       VPBROADCAST %esi, %VMATCH
> > >
> > > -L(page_cross_continue):
> > > -       /* Compare [w]char for null, mask bit will be set for match.  */
> > > -       VMOVU   (%rdi), %VMM(1)
> > > +       andl    $(PAGE_SIZE - 1), %eax
> > > +       cmpl    $(PAGE_SIZE - VEC_SIZE), %eax
> > > +       jg      L(cross_page_boundary)
> > >
> > > -       VPTESTN %VMM(1), %VMM(1), %k1
> > > -       KMOV    %k1, %VRCX
> > > -       test    %VRCX, %VRCX
> > > -       jz      L(align_more)
> > > -
> > > -       VPCMPEQ %VMM(1), %VMM(0), %k0
> > > -       KMOV    %k0, %VRAX
> > > -       BLSMSK  %VRCX, %VRCX
> > > -       and     %VRCX, %VRAX
> > > -       jz      L(ret)
> > > -
> > > -       BSR     %VRAX, %VRAX
> > > +       VMOVU   (%rdi), %VMM(1)
> > > +       /* k0 has a 1 for each zero CHAR in YMM1.  */
> > > +       VPTESTN %VMM(1), %VMM(1), %k0
> > > +       KMOV    %k0, %VGPR(rsi)
> > > +       test    %VGPR(rsi), %VGPR(rsi)
> > > +       jz      L(aligned_more)
> > > +       /* fallthrough: zero CHAR in first VEC.  */
> > > +L(page_cross_return):
> > > +       /* K1 has a 1 for each search CHAR match in VEC(1).  */
> > > +       VPCMPEQ %VMATCH, %VMM(1), %k1
> > > +       KMOV    %k1, %VGPR(rax)
> > > +       /* Build mask up until first zero CHAR (used to mask off
> > > +          potential search CHAR matches past the end of the string).  */
> > > +       blsmsk  %VGPR(rsi), %VGPR(rsi)
> > > +       and     %VGPR(rsi), %VGPR(rax)
> > > +       jz      L(ret0)
> > > +       /* Get last match (the `and` removed any out of bounds matches).  */
> > > +       bsr     %VGPR(rax), %VGPR(rax)
> > >  # ifdef USE_AS_WCSRCHR
> > >         leaq    (%rdi, %rax, CHAR_SIZE), %rax
> > >  # else
> > > -       add     %rdi, %rax
> > > +       addq    %rdi, %rax
> > >  # endif
> > > -L(ret):
> > > +L(ret0):
> > >         ret
> > >
> > > -L(vector_x2_end):
> > > -       VPCMPEQ %VMM(2), %VMM(0), %k2
> > > -       KMOV    %k2, %VRAX
> > > -       BLSMSK  %VRCX, %VRCX
> > > -       and     %VRCX, %VRAX
> > > -       jz      L(vector_x1_ret)
> > > -
> > > -       BSR     %VRAX, %VRAX
> > > -       leaq    (VEC_SIZE)(%rdi, %rax, CHAR_SIZE), %rax
> > > -       ret
> > > -
> > > -       /* Check the first vector at very last to look for match.  */
> > > -L(vector_x1_ret):
> > > -       VPCMPEQ %VMM(1), %VMM(0), %k2
> > > -       KMOV    %k2, %VRAX
> > > -       test    %VRAX, %VRAX
> > > -       jz      L(ret)
> > > -
> > > -       BSR     %VRAX, %VRAX
> > > +       /* Returns for first vec x1/x2/x3 have hard coded backward
> > > +          search path for earlier matches.  */
> > > +       .p2align 4,, 6
> > > +L(first_vec_x1):
> > > +       VPCMPEQ %VMATCH, %VMM(2), %k1
> > > +       KMOV    %k1, %VGPR(rax)
> > > +       blsmsk  %VGPR(rcx), %VGPR(rcx)
> > > +       /* eax non-zero if search CHAR in range.  */
> > > +       and     %VGPR(rcx), %VGPR(rax)
> > > +       jnz     L(first_vec_x1_return)
> > > +
> > > +       /* fallthrough: no match in YMM2 then need to check for earlier
> > > +          matches (in YMM1).  */
> > > +       .p2align 4,, 4
> > > +L(first_vec_x0_test):
> > > +       VPCMPEQ %VMATCH, %VMM(1), %k1
> > > +       KMOV    %k1, %VGPR(rax)
> > > +       test    %VGPR(rax), %VGPR(rax)
> > > +       jz      L(ret1)
> > > +       bsr     %VGPR(rax), %VGPR(rax)
> > >  # ifdef USE_AS_WCSRCHR
> > >         leaq    (%rsi, %rax, CHAR_SIZE), %rax
> > >  # else
> > > -       add     %rsi, %rax
> > > +
> > > +       addq    %rsi, %rax
> > >  # endif
> > > +L(ret1):
> > > +       ret
> > > +
> > > +       .p2align 4,, 10
> > > +L(first_vec_x3):
> > > +       VPCMPEQ %VMATCH, %VMM(4), %k1
> > > +       KMOV    %k1, %VGPR(rax)
> > > +       blsmsk  %VGPR(rcx), %VGPR(rcx)
> > > +       /* If no search CHAR match in range check YMM1/YMM2/YMM3.  */
> > > +       and     %VGPR(rcx), %VGPR(rax)
> > > +       jz      L(first_vec_x1_or_x2)
> > > +       bsr     %VGPR(rax), %VGPR(rax)
> > > +       leaq    (VEC_SIZE * 3)(%rdi, %rax, CHAR_SIZE), %rax
> > > +       ret
> > > +       .p2align 4,, 4
> > > +
> > > +L(first_vec_x2):
> > > +       VPCMPEQ %VMATCH, %VMM(3), %k1
> > > +       KMOV    %k1, %VGPR(rax)
> > > +       blsmsk  %VGPR(rcx), %VGPR(rcx)
> > > +       /* Check YMM3 for last match first. If no match try YMM2/YMM1.  */
> > > +       and     %VGPR(rcx), %VGPR(rax)
> > > +       jz      L(first_vec_x0_x1_test)
> > > +       bsr     %VGPR(rax), %VGPR(rax)
> > > +       leaq    (VEC_SIZE * 2)(%r8, %rax, CHAR_SIZE), %rax
> > >         ret
> > >
> > > -L(align_more):
> > > -       /* Zero r8 to store match result.  */
> > > -       xorl    %r8d, %r8d
> > > -       /* Save pointer of first vector, in case if no match found.  */
> > > +       .p2align 4,, 6
> > > +L(first_vec_x0_x1_test):
> > > +       VPCMPEQ %VMATCH, %VMM(2), %k1
> > > +       KMOV    %k1, %VGPR(rax)
> > > +       /* Check YMM2 for last match first. If no match try YMM1.  */
> > > +       test    %VGPR(rax), %VGPR(rax)
> > > +       jz      L(first_vec_x0_test)
> > > +       .p2align 4,, 4
> > > +L(first_vec_x1_return):
> > > +       bsr     %VGPR(rax), %VGPR(rax)
> > > +       leaq    (VEC_SIZE)(%r8, %rax, CHAR_SIZE), %rax
> > > +       ret
> > > +
> > > +       .p2align 4,, 12
> > > +L(aligned_more):
> > > +L(page_cross_continue):
> > > +       /* Need to keep original pointer in case VEC(1) has last match.  */
> > >         movq    %rdi, %rsi
> > > -       /* Align pointer to vector size.  */
> > >         andq    $-VEC_SIZE, %rdi
> > > -       /* Loop unroll for 2 vector loop.  */
> > > -       VMOVA   (VEC_SIZE)(%rdi), %VMM(2)
> > > +
> > > +       VMOVU   VEC_SIZE(%rdi), %VMM(2)
> > >         VPTESTN %VMM(2), %VMM(2), %k0
> > >         KMOV    %k0, %VRCX
> > > +       movq    %rdi, %r8
> > >         test    %VRCX, %VRCX
> > > -       jnz     L(vector_x2_end)
> > > +       jnz     L(first_vec_x1)
> > > +
> > > +       VMOVU   (VEC_SIZE * 2)(%rdi), %VMM(3)
> > > +       VPTESTN %VMM(3), %VMM(3), %k0
> > > +       KMOV    %k0, %VRCX
> > > +
> > > +       test    %VRCX, %VRCX
> > > +       jnz     L(first_vec_x2)
> > > +
> > > +       VMOVU   (VEC_SIZE * 3)(%rdi), %VMM(4)
> > > +       VPTESTN %VMM(4), %VMM(4), %k0
> > > +       KMOV    %k0, %VRCX
> > > +
> > > +       /* Intentionally use 64-bit here.  EVEX256 version needs 1-byte
> > > +          padding for efficient nop before loop alignment.  */
> > > +       test    %rcx, %rcx
> > > +       jnz     L(first_vec_x3)
> > >
> > > -       /* Save pointer of second vector, in case if no match
> > > -          found.  */
> > > -       movq    %rdi, %r9
> > > -       /* Align address to VEC_SIZE * 2 for loop.  */
> > >         andq    $-(VEC_SIZE * 2), %rdi
> > > +       .p2align 4
> > > +L(first_aligned_loop):
> > > +       /* Preserve VEC(1), VEC(2), VEC(3), and VEC(4) until we can
> > > +          guarantee they don't store a match.  */
> > > +       VMOVA   (VEC_SIZE * 4)(%rdi), %VMM(5)
> > > +       VMOVA   (VEC_SIZE * 5)(%rdi), %VMM(6)
> > >
> > > -       .p2align 4,,11
> > > -L(loop):
> > > -       /* 2 vector loop, as it provide better performance as compared
> > > -          to 4 vector loop.  */
> > > -       VMOVA   (VEC_SIZE * 2)(%rdi), %VMM(3)
> > > -       VMOVA   (VEC_SIZE * 3)(%rdi), %VMM(4)
> > > -       VPCMPEQ %VMM(3), %VMM(0), %k1
> > > -       VPCMPEQ %VMM(4), %VMM(0), %k2
> > > -       VPMINU  %VMM(3), %VMM(4), %VMM(5)
> > > -       VPTESTN %VMM(5), %VMM(5), %k0
> > > -       KOR     %k1, %k2, %k3
> > > -       subq    $-(VEC_SIZE * 2), %rdi
> > > -       /* If k0 and k3 zero, match and end of string not found.  */
> > > -       KORTEST %k0, %k3
> > > -       jz      L(loop)
> > > -
> > > -       /* If k0 is non zero, end of string found.  */
> > > -       KORTEST %k0, %k0
> > > -       jnz     L(endloop)
> > > -
> > > -       lea     VEC_SIZE(%rdi), %r8
> > > -       /* A match found, it need to be stored in r8 before loop
> > > -          continue.  */
> > > -       /* Check second vector first.  */
> > > -       KMOV    %k2, %VRDX
> > > -       test    %VRDX, %VRDX
> > > -       jnz     L(loop_vec_x2_match)
> > > +       VPCMP   $4, %VMM(5), %VMATCH, %k2
> > > +       VPCMP   $4, %VMM(6), %VMATCH, %k3{%k2}
> > > +
> > > +       VPMIN   %VMM(5), %VMM(6), %VMM(7)
> > > +
> > > +       VPTEST  %VMM(7), %VMM(7), %k1{%k3}
> > > +       subq    $(VEC_SIZE * -2), %rdi
> > > +       kortestM %k1, %k1
> > > +       jc      L(first_aligned_loop)
> > >
> > > +       VPTESTN %VMM(7), %VMM(7), %k1
> > >         KMOV    %k1, %VRDX
> > > -       /* Match is in first vector, rdi offset need to be subtracted
> > > -         by VEC_SIZE.  */
> > > -       sub     $VEC_SIZE, %r8
> > > -
> > > -       /* If second vector doesn't have match, first vector must
> > > -          have match.  */
> > > -L(loop_vec_x2_match):
> > > -       BSR     %VRDX, %VRDX
> > > -# ifdef USE_AS_WCSRCHR
> > > -       sal     $2, %rdx
> > > -# endif
> > > -       add     %rdx, %r8
> > > -       jmp     L(loop)
> > > +       test    %VRDX, %VRDX
> > > +       jz      L(second_aligned_loop_prep)
> > >
> > > -L(endloop):
> > > -       /* Check if string end in first loop vector.  */
> > > -       VPTESTN %VMM(3), %VMM(3), %k0
> > > -       KMOV    %k0, %VRCX
> > > -       test    %VRCX, %VRCX
> > > -       jnz     L(loop_vector_x1_end)
> > > +       kortestM %k3, %k3
> > > +       jnc     L(return_first_aligned_loop)
> > >
> > > -       /* Check if it has match in first loop vector.  */
> > > -       KMOV    %k1, %VRAX
> > > +       .p2align 4,, 6
> > > +L(first_vec_x1_or_x2_or_x3):
> > > +       VPCMPEQ %VMM(4), %VMATCH, %k4
> > > +       KMOV    %k4, %VRAX
> > >         test    %VRAX, %VRAX
> > > -       jz      L(loop_vector_x2_end)
> > > +       jz      L(first_vec_x1_or_x2)
> > > +       bsr     %VRAX, %VRAX
> > > +       leaq    (VEC_SIZE * 3)(%r8, %rax, CHAR_SIZE), %rax
> > > +       ret
> > >
> > > -       BSR     %VRAX, %VRAX
> > > -       leaq    (%rdi, %rax, CHAR_SIZE), %r8
> > >
> > > -       /* String must end in second loop vector.  */
> > > -L(loop_vector_x2_end):
> > > -       VPTESTN %VMM(4), %VMM(4), %k0
> > > +       .p2align 4,, 8
> > > +L(return_first_aligned_loop):
> > > +       VPTESTN %VMM(5), %VMM(5), %k0
> > >         KMOV    %k0, %VRCX
> > > +       blsmsk  %VRCX, %VRCX
> > > +       jnc     L(return_first_new_match_first)
> > > +       blsmsk  %VRDX, %VRDX
> > > +       VPCMPEQ %VMM(6), %VMATCH, %k0
> > > +       KMOV    %k0, %VRAX
> > > +       addq    $VEC_SIZE, %rdi
> > > +       and     %VRDX, %VRAX
> > > +       jnz     L(return_first_new_match_ret)
> > > +       subq    $VEC_SIZE, %rdi
> > > +L(return_first_new_match_first):
> > >         KMOV    %k2, %VRAX
> > > -       BLSMSK  %VRCX, %VRCX
> > > -       /* Check if it has match in second loop vector.  */
> > > +# ifdef USE_AS_WCSRCHR
> > > +       xorl    $((1 << CHAR_PER_VEC)- 1), %VRAX
> > >         and     %VRCX, %VRAX
> > > -       jz      L(check_last_match)
> > > +# else
> > > +       andn    %VRCX, %VRAX, %VRAX
> > > +# endif
> > > +       jz      L(first_vec_x1_or_x2_or_x3)
> > > +L(return_first_new_match_ret):
> > > +       bsr     %VRAX, %VRAX
> > > +       leaq    (VEC_SIZE * 2)(%rdi, %rax, CHAR_SIZE), %rax
> > > +       ret
> > >
> > > -       BSR     %VRAX, %VRAX
> > > -       leaq    (VEC_SIZE)(%rdi, %rax, CHAR_SIZE), %rax
> > > +       .p2align 4,, 10
> > > +L(first_vec_x1_or_x2):
> > > +       VPCMPEQ %VMM(3), %VMATCH, %k3
> > > +       KMOV    %k3, %VRAX
> > > +       test    %VRAX, %VRAX
> > > +       jz      L(first_vec_x0_x1_test)
> > > +       bsr     %VRAX, %VRAX
> > > +       leaq    (VEC_SIZE * 2)(%r8, %rax, CHAR_SIZE), %rax
> > >         ret
> > >
> > > -       /* String end in first loop vector.  */
> > > -L(loop_vector_x1_end):
> > > -       KMOV    %k1, %VRAX
> > > -       BLSMSK  %VRCX, %VRCX
> > > -       /* Check if it has match in second loop vector.  */
> > > -       and     %VRCX, %VRAX
> > > -       jz      L(check_last_match)
> > >
> > > -       BSR     %VRAX, %VRAX
> > > -       leaq    (%rdi, %rax, CHAR_SIZE), %rax
> > > -       ret
> > > +       .p2align 4
> > > +       /* We can throw away the work done for the first 4x checks here
> > > +          as we have a later match. This is the 'fast' path per se.  */
> > > +L(second_aligned_loop_prep):
> > > +L(second_aligned_loop_set_furthest_match):
> > > +       movq    %rdi, %rsi
> > > +       VMOVA   %VMM(5), %VMM(7)
> > > +       VMOVA   %VMM(6), %VMM(8)
> > > +       .p2align 4
> > > +L(second_aligned_loop):
> > > +       VMOVU   (VEC_SIZE * 4)(%rdi), %VMM(5)
> > > +       VMOVU   (VEC_SIZE * 5)(%rdi), %VMM(6)
> > > +       VPCMP   $4, %VMM(5), %VMATCH, %k2
> > > +       VPCMP   $4, %VMM(6), %VMATCH, %k3{%k2}
> > > +
> > > +       VPMIN   %VMM(5), %VMM(6), %VMM(4)
> > > +
> > > +       VPTEST  %VMM(4), %VMM(4), %k1{%k3}
> > > +       subq    $(VEC_SIZE * -2), %rdi
> > > +       KMOV    %k1, %VRCX
> > > +       inc     %RCX_M
> > > +       jz      L(second_aligned_loop)
> > > +       VPTESTN %VMM(4), %VMM(4), %k1
> > > +       KMOV    %k1, %VRDX
> > > +       test    %VRDX, %VRDX
> > > +       jz      L(second_aligned_loop_set_furthest_match)
> > >
> > > -       /* No match in first and second loop vector.  */
> > > -L(check_last_match):
> > > -       /* Check if any match recorded in r8.  */
> > > -       test    %r8, %r8
> > > -       jz      L(vector_x2_ret)
> > > -       movq    %r8, %rax
> > > +       kortestM %k3, %k3
> > > +       jnc     L(return_new_match)
> > > +       /* branch here because there is a significant advantage in terms
> > > +          of output dependency chance in using edx.  */
> > > +
> > > +
> > > +L(return_old_match):
> > > +       VPCMPEQ %VMM(8), %VMATCH, %k0
> > > +       KMOV    %k0, %VRCX
> > > +       bsr     %VRCX, %VRCX
> > > +       jnz     L(return_old_match_ret)
> > > +
> > > +       VPCMPEQ %VMM(7), %VMATCH, %k0
> > > +       KMOV    %k0, %VRCX
> > > +       bsr     %VRCX, %VRCX
> > > +       subq    $VEC_SIZE, %rsi
> > > +L(return_old_match_ret):
> > > +       leaq    (VEC_SIZE * 3)(%rsi, %rcx, CHAR_SIZE), %rax
> > >         ret
> > >
> > > -       /* No match recorded in r8. Check the second saved vector
> > > -          in beginning.  */
> > > -L(vector_x2_ret):
> > > -       VPCMPEQ %VMM(2), %VMM(0), %k2
> > > -       KMOV    %k2, %VRAX
> > > -       test    %VRAX, %VRAX
> > > -       jz      L(vector_x1_ret)
> > >
> > > -       /* Match found in the second saved vector.  */
> > > -       BSR     %VRAX, %VRAX
> > > -       leaq    (VEC_SIZE)(%r9, %rax, CHAR_SIZE), %rax
> > > +L(return_new_match):
> > > +       VPTESTN %VMM(5), %VMM(5), %k0
> > > +       KMOV    %k0, %VRCX
> > > +       blsmsk  %VRCX, %VRCX
> > > +       jnc     L(return_new_match_first)
> > > +       dec     %VRDX
> > > +       VPCMPEQ %VMM(6), %VMATCH, %k0
> > > +       KMOV    %k0, %VRAX
> > > +       addq    $VEC_SIZE, %rdi
> > > +       and     %VRDX, %VRAX
> > > +       jnz     L(return_new_match_ret)
> > > +       subq    $VEC_SIZE, %rdi
> > > +L(return_new_match_first):
> > > +       KMOV    %k2, %VRAX
> > > +# ifdef USE_AS_WCSRCHR
> > > +       xorl    $((1 << CHAR_PER_VEC)- 1), %VRAX
> > > +       and     %VRCX, %VRAX
> > > +# else
> > > +       andn    %VRCX, %VRAX, %VRAX
> > > +# endif
> > > +       jz      L(return_old_match)
> > > +L(return_new_match_ret):
> > > +       bsr     %VRAX, %VRAX
> > > +       leaq    (VEC_SIZE * 2)(%rdi, %rax, CHAR_SIZE), %rax
> > >         ret
> > >
> > > -L(page_cross):
> > > -       mov     %rdi, %rax
> > > -       movl    %edi, %ecx
> > > +       .p2align 4,, 4
> > > +L(cross_page_boundary):
> > > +       xorq    %rdi, %rax
> > > +       mov     $-1, %VRDX
> > > +       VMOVU   (PAGE_SIZE - VEC_SIZE)(%rax), %VMM(6)
> > > +       VPTESTN %VMM(6), %VMM(6), %k0
> > > +       KMOV    %k0, %VRSI
> > >
> > >  # ifdef USE_AS_WCSRCHR
> > > -       /* Calculate number of compare result bits to be skipped for
> > > -          wide string alignment adjustment.  */
> > > -       andl    $(VEC_SIZE - 1), %ecx
> > > -       sarl    $2, %ecx
> > > +       movl    %edi, %ecx
> > > +       and     $(VEC_SIZE - 1), %ecx
> > > +       shrl    $2, %ecx
> > >  # endif
> > > -       /* ecx contains number of w[char] to be skipped as a result
> > > -          of address alignment.  */
> > > -       andq    $-VEC_SIZE, %rax
> > > -       VMOVA   (%rax), %VMM(1)
> > > -       VPTESTN %VMM(1), %VMM(1), %k1
> > > -       KMOV    %k1, %VRAX
> > > -       SHR     %cl, %VRAX
> > > -       jz      L(page_cross_continue)
> > > -       VPCMPEQ %VMM(1), %VMM(0), %k0
> > > -       KMOV    %k0, %VRDX
> > > -       SHR     %cl, %VRDX
> > > -       BLSMSK  %VRAX, %VRAX
> > > -       and     %VRDX, %VRAX
> > > -       jz      L(ret)
> > > -       BSR     %VRAX, %VRAX
> > > +       shlx    %SHIFT_REG, %VRDX, %VRDX
> > > +
> > >  # ifdef USE_AS_WCSRCHR
> > > -       leaq    (%rdi, %rax, CHAR_SIZE), %rax
> > > +       kmovw   %edx, %k1
> > >  # else
> > > -       add     %rdi, %rax
> > > +       KMOV    %VRDX, %k1
> > >  # endif
> > >
> > > -       ret
> > > -END (STRRCHR)
> > > +       VPCOMPRESS %VMM(6), %VMM(1){%k1}{z}
> > > +       /* We could technically just jmp back after the vpcompress but
> > > +          it doesn't save any 16-byte blocks.  */
> > > +
> > > +       shrx    %SHIFT_REG, %VRSI, %VRSI
> > > +       test    %VRSI, %VRSI
> > > +       jnz     L(page_cross_return)
> > > +       jmp     L(page_cross_continue)
> > > +       /* 1-byte from cache line.  */
> > > +END(STRRCHR)
> > >  #endif
> > > diff --git a/sysdeps/x86_64/multiarch/strrchr-evex.S b/sysdeps/x86_64/multiarch/strrchr-evex.S
> > > index 85e3b0119f..b606e6f69c 100644
> > > --- a/sysdeps/x86_64/multiarch/strrchr-evex.S
> > > +++ b/sysdeps/x86_64/multiarch/strrchr-evex.S
> > > @@ -1,394 +1,8 @@
> > > -/* strrchr/wcsrchr optimized with 256-bit EVEX instructions.
> > > -   Copyright (C) 2021-2023 Free Software Foundation, Inc.
> > > -   This file is part of the GNU C Library.
> > > -
> > > -   The GNU C Library is free software; you can redistribute it and/or
> > > -   modify it under the terms of the GNU Lesser General Public
> > > -   License as published by the Free Software Foundation; either
> > > -   version 2.1 of the License, or (at your option) any later version.
> > > -
> > > -   The GNU C Library is distributed in the hope that it will be useful,
> > > -   but WITHOUT ANY WARRANTY; without even the implied warranty of
> > > -   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> > > -   Lesser General Public License for more details.
> > > -
> > > -   You should have received a copy of the GNU Lesser General Public
> > > -   License along with the GNU C Library; if not, see
> > > -   <https://www.gnu.org/licenses/>.  */
> > > -
> > > -#include <isa-level.h>
> > > -
> > > -#if ISA_SHOULD_BUILD (4)
> > > -
> > > -# include <sysdep.h>
> > > -
> > >  # ifndef STRRCHR
> > >  #  define STRRCHR      __strrchr_evex
> > >  # endif
> > >
> > > -# include "x86-evex256-vecs.h"
> > > -
> > > -# ifdef USE_AS_WCSRCHR
> > > -#  define SHIFT_REG    rsi
> > > -#  define kunpck_2x    kunpckbw
> > > -#  define kmov_2x      kmovd
> > > -#  define maskz_2x     ecx
> > > -#  define maskm_2x     eax
> > > -#  define CHAR_SIZE    4
> > > -#  define VPMIN        vpminud
> > > -#  define VPTESTN      vptestnmd
> > > -#  define VPTEST       vptestmd
> > > -#  define VPBROADCAST  vpbroadcastd
> > > -#  define VPCMPEQ      vpcmpeqd
> > > -#  define VPCMP        vpcmpd
> > > -
> > > -#  define USE_WIDE_CHAR
> > > -# else
> > > -#  define SHIFT_REG    rdi
> > > -#  define kunpck_2x    kunpckdq
> > > -#  define kmov_2x      kmovq
> > > -#  define maskz_2x     rcx
> > > -#  define maskm_2x     rax
> > > -
> > > -#  define CHAR_SIZE    1
> > > -#  define VPMIN        vpminub
> > > -#  define VPTESTN      vptestnmb
> > > -#  define VPTEST       vptestmb
> > > -#  define VPBROADCAST  vpbroadcastb
> > > -#  define VPCMPEQ      vpcmpeqb
> > > -#  define VPCMP        vpcmpb
> > > -# endif
> > > -
> > > -# include "reg-macros.h"
> > > -
> > > -# define VMATCH        VMM(0)
> > > -# define CHAR_PER_VEC  (VEC_SIZE / CHAR_SIZE)
> > > -# define PAGE_SIZE     4096
> > > -
> > > -       .section SECTION(.text), "ax", @progbits
> > > -ENTRY_P2ALIGN(STRRCHR, 6)
> > > -       movl    %edi, %eax
> > > -       /* Broadcast CHAR to VMATCH.  */
> > > -       VPBROADCAST %esi, %VMATCH
> > > -
> > > -       andl    $(PAGE_SIZE - 1), %eax
> > > -       cmpl    $(PAGE_SIZE - VEC_SIZE), %eax
> > > -       jg      L(cross_page_boundary)
> > > -L(page_cross_continue):
> > > -       VMOVU   (%rdi), %VMM(1)
> > > -       /* k0 has a 1 for each zero CHAR in VEC(1).  */
> > > -       VPTESTN %VMM(1), %VMM(1), %k0
> > > -       KMOV    %k0, %VRSI
> > > -       test    %VRSI, %VRSI
> > > -       jz      L(aligned_more)
> > > -       /* fallthrough: zero CHAR in first VEC.  */
> > > -       /* K1 has a 1 for each search CHAR match in VEC(1).  */
> > > -       VPCMPEQ %VMATCH, %VMM(1), %k1
> > > -       KMOV    %k1, %VRAX
> > > -       /* Build mask up until first zero CHAR (used to mask of
> > > -          potential search CHAR matches past the end of the string).
> > > -        */
> > > -       blsmsk  %VRSI, %VRSI
> > > -       and     %VRSI, %VRAX
> > > -       jz      L(ret0)
> > > -       /* Get last match (the `and` removed any out of bounds matches).
> > > -        */
> > > -       bsr     %VRAX, %VRAX
> > > -# ifdef USE_AS_WCSRCHR
> > > -       leaq    (%rdi, %rax, CHAR_SIZE), %rax
> > > -# else
> > > -       addq    %rdi, %rax
> > > -# endif
> > > -L(ret0):
> > > -       ret
> > > -
> > > -       /* Returns for first vec x1/x2/x3 have hard coded backward
> > > -          search path for earlier matches.  */
> > > -       .p2align 4,, 6
> > > -L(first_vec_x1):
> > > -       VPCMPEQ %VMATCH, %VMM(2), %k1
> > > -       KMOV    %k1, %VRAX
> > > -       blsmsk  %VRCX, %VRCX
> > > -       /* eax non-zero if search CHAR in range.  */
> > > -       and     %VRCX, %VRAX
> > > -       jnz     L(first_vec_x1_return)
> > > -
> > > -       /* fallthrough: no match in VEC(2) then need to check for
> > > -          earlier matches (in VEC(1)).  */
> > > -       .p2align 4,, 4
> > > -L(first_vec_x0_test):
> > > -       VPCMPEQ %VMATCH, %VMM(1), %k1
> > > -       KMOV    %k1, %VRAX
> > > -       test    %VRAX, %VRAX
> > > -       jz      L(ret1)
> > > -       bsr     %VRAX, %VRAX
> > > -# ifdef USE_AS_WCSRCHR
> > > -       leaq    (%rsi, %rax, CHAR_SIZE), %rax
> > > -# else
> > > -       addq    %rsi, %rax
> > > -# endif
> > > -L(ret1):
> > > -       ret
> > > -
> > > -       .p2align 4,, 10
> > > -L(first_vec_x1_or_x2):
> > > -       VPCMPEQ %VMM(3), %VMATCH, %k3
> > > -       VPCMPEQ %VMM(2), %VMATCH, %k2
> > > -       /* K2 and K3 have 1 for any search CHAR match. Test if any
> > > -          matches between either of them. Otherwise check VEC(1).  */
> > > -       KORTEST %k2, %k3
> > > -       jz      L(first_vec_x0_test)
> > > -
> > > -       /* Guaranteed that VEC(2) and VEC(3) are within range so merge
> > > -          the two bitmasks then get last result.  */
> > > -       kunpck_2x %k2, %k3, %k3
> > > -       kmov_2x %k3, %maskm_2x
> > > -       bsr     %maskm_2x, %maskm_2x
> > > -       leaq    (VEC_SIZE * 1)(%r8, %rax, CHAR_SIZE), %rax
> > > -       ret
> > > -
> > > -       .p2align 4,, 7
> > > -L(first_vec_x3):
> > > -       VPCMPEQ %VMATCH, %VMM(4), %k1
> > > -       KMOV    %k1, %VRAX
> > > -       blsmsk  %VRCX, %VRCX
> > > -       /* If no search CHAR match in range check VEC(1)/VEC(2)/VEC(3).
> > > -        */
> > > -       and     %VRCX, %VRAX
> > > -       jz      L(first_vec_x1_or_x2)
> > > -       bsr     %VRAX, %VRAX
> > > -       leaq    (VEC_SIZE * 3)(%rdi, %rax, CHAR_SIZE), %rax
> > > -       ret
> > > -
> > > -
> > > -       .p2align 4,, 6
> > > -L(first_vec_x0_x1_test):
> > > -       VPCMPEQ %VMATCH, %VMM(2), %k1
> > > -       KMOV    %k1, %VRAX
> > > -       /* Check VEC(2) for last match first. If no match try VEC(1).
> > > -        */
> > > -       test    %VRAX, %VRAX
> > > -       jz      L(first_vec_x0_test)
> > > -       .p2align 4,, 4
> > > -L(first_vec_x1_return):
> > > -       bsr     %VRAX, %VRAX
> > > -       leaq    (VEC_SIZE)(%rdi, %rax, CHAR_SIZE), %rax
> > > -       ret
> > > -
> > > -
> > > -       .p2align 4,, 10
> > > -L(first_vec_x2):
> > > -       VPCMPEQ %VMATCH, %VMM(3), %k1
> > > -       KMOV    %k1, %VRAX
> > > -       blsmsk  %VRCX, %VRCX
> > > -       /* Check VEC(3) for last match first. If no match try
> > > -          VEC(2)/VEC(1).  */
> > > -       and     %VRCX, %VRAX
> > > -       jz      L(first_vec_x0_x1_test)
> > > -       bsr     %VRAX, %VRAX
> > > -       leaq    (VEC_SIZE * 2)(%rdi, %rax, CHAR_SIZE), %rax
> > > -       ret
> > > -
> > > -
> > > -       .p2align 4,, 12
> > > -L(aligned_more):
> > > -       /* Need to keep original pointer in case VEC(1) has last match.
> > > -        */
> > > -       movq    %rdi, %rsi
> > > -       andq    $-VEC_SIZE, %rdi
> > > -
> > > -       VMOVU   VEC_SIZE(%rdi), %VMM(2)
> > > -       VPTESTN %VMM(2), %VMM(2), %k0
> > > -       KMOV    %k0, %VRCX
> > > -
> > > -       test    %VRCX, %VRCX
> > > -       jnz     L(first_vec_x1)
> > > -
> > > -       VMOVU   (VEC_SIZE * 2)(%rdi), %VMM(3)
> > > -       VPTESTN %VMM(3), %VMM(3), %k0
> > > -       KMOV    %k0, %VRCX
> > > -
> > > -       test    %VRCX, %VRCX
> > > -       jnz     L(first_vec_x2)
> > > -
> > > -       VMOVU   (VEC_SIZE * 3)(%rdi), %VMM(4)
> > > -       VPTESTN %VMM(4), %VMM(4), %k0
> > > -       KMOV    %k0, %VRCX
> > > -       movq    %rdi, %r8
> > > -       test    %VRCX, %VRCX
> > > -       jnz     L(first_vec_x3)
> > > -
> > > -       andq    $-(VEC_SIZE * 2), %rdi
> > > -       .p2align 4,, 10
> > > -L(first_aligned_loop):
> > > -       /* Preserve VEC(1), VEC(2), VEC(3), and VEC(4) until we can
> > > -          guarantee they don't store a match.  */
> > > -       VMOVA   (VEC_SIZE * 4)(%rdi), %VMM(5)
> > > -       VMOVA   (VEC_SIZE * 5)(%rdi), %VMM(6)
> > > -
> > > -       VPCMPEQ %VMM(5), %VMATCH, %k2
> > > -       vpxord  %VMM(6), %VMATCH, %VMM(7)
> > > -
> > > -       VPMIN   %VMM(5), %VMM(6), %VMM(8)
> > > -       VPMIN   %VMM(8), %VMM(7), %VMM(7)
> > > -
> > > -       VPTESTN %VMM(7), %VMM(7), %k1
> > > -       subq    $(VEC_SIZE * -2), %rdi
> > > -       KORTEST %k1, %k2
> > > -       jz      L(first_aligned_loop)
> > > -
> > > -       VPCMPEQ %VMM(6), %VMATCH, %k3
> > > -       VPTESTN %VMM(8), %VMM(8), %k1
> > > -
> > > -       /* If k1 is zero, then we found a CHAR match but no null-term.
> > > -          We can now safely throw out VEC1-4.  */
> > > -       KTEST   %k1, %k1
> > > -       jz      L(second_aligned_loop_prep)
> > > -
> > > -       KORTEST %k2, %k3
> > > -       jnz     L(return_first_aligned_loop)
> > > -
> > > -
> > > -       .p2align 4,, 6
> > > -L(first_vec_x1_or_x2_or_x3):
> > > -       VPCMPEQ %VMM(4), %VMATCH, %k4
> > > -       KMOV    %k4, %VRAX
> > > -       bsr     %VRAX, %VRAX
> > > -       jz      L(first_vec_x1_or_x2)
> > > -       leaq    (VEC_SIZE * 3)(%r8, %rax, CHAR_SIZE), %rax
> > > -       ret
> > > -
> > > -
> > > -       .p2align 4,, 8
> > > -L(return_first_aligned_loop):
> > > -       VPTESTN %VMM(5), %VMM(5), %k0
> > > -
> > > -       /* Combined results from VEC5/6.  */
> > > -       kunpck_2x %k0, %k1, %k0
> > > -       kmov_2x %k0, %maskz_2x
> > > -
> > > -       blsmsk  %maskz_2x, %maskz_2x
> > > -       kunpck_2x %k2, %k3, %k3
> > > -       kmov_2x %k3, %maskm_2x
> > > -       and     %maskz_2x, %maskm_2x
> > > -       jz      L(first_vec_x1_or_x2_or_x3)
> > > -
> > > -       bsr     %maskm_2x, %maskm_2x
> > > -       leaq    (VEC_SIZE * 2)(%rdi, %rax, CHAR_SIZE), %rax
> > > -       ret
> > > -
> > > -       .p2align 4
> > > -       /* We can throw away the work done for the first 4x checks here
> > > -          as we have a later match. This is the 'fast' path persay.
> > > -        */
> > > -L(second_aligned_loop_prep):
> > > -L(second_aligned_loop_set_furthest_match):
> > > -       movq    %rdi, %rsi
> > > -       /* Ideally we would safe k2/k3 but `kmov/kunpck` take uops on
> > > -          port0 and have noticeable overhead in the loop.  */
> > > -       VMOVA   %VMM(5), %VMM(7)
> > > -       VMOVA   %VMM(6), %VMM(8)
> > > -       .p2align 4
> > > -L(second_aligned_loop):
> > > -       VMOVU   (VEC_SIZE * 4)(%rdi), %VMM(5)
> > > -       VMOVU   (VEC_SIZE * 5)(%rdi), %VMM(6)
> > > -       VPCMPEQ %VMM(5), %VMATCH, %k2
> > > -       vpxord  %VMM(6), %VMATCH, %VMM(3)
> > > -
> > > -       VPMIN   %VMM(5), %VMM(6), %VMM(4)
> > > -       VPMIN   %VMM(3), %VMM(4), %VMM(3)
> > > -
> > > -       VPTESTN %VMM(3), %VMM(3), %k1
> > > -       subq    $(VEC_SIZE * -2), %rdi
> > > -       KORTEST %k1, %k2
> > > -       jz      L(second_aligned_loop)
> > > -       VPCMPEQ %VMM(6), %VMATCH, %k3
> > > -       VPTESTN %VMM(4), %VMM(4), %k1
> > > -       KTEST   %k1, %k1
> > > -       jz      L(second_aligned_loop_set_furthest_match)
> > > -
> > > -       /* branch here because we know we have a match in VEC7/8 but
> > > -          might not in VEC5/6 so the latter is expected to be less
> > > -          likely.  */
> > > -       KORTEST %k2, %k3
> > > -       jnz     L(return_new_match)
> > > -
> > > -L(return_old_match):
> > > -       VPCMPEQ %VMM(8), %VMATCH, %k0
> > > -       KMOV    %k0, %VRCX
> > > -       bsr     %VRCX, %VRCX
> > > -       jnz     L(return_old_match_ret)
> > > -
> > > -       VPCMPEQ %VMM(7), %VMATCH, %k0
> > > -       KMOV    %k0, %VRCX
> > > -       bsr     %VRCX, %VRCX
> > > -       subq    $VEC_SIZE, %rsi
> > > -L(return_old_match_ret):
> > > -       leaq    (VEC_SIZE * 3)(%rsi, %rcx, CHAR_SIZE), %rax
> > > -       ret
> > > -
> > > -       .p2align 4,, 10
> > > -L(return_new_match):
> > > -       VPTESTN %VMM(5), %VMM(5), %k0
> > > -
> > > -       /* Combined results from VEC5/6.  */
> > > -       kunpck_2x %k0, %k1, %k0
> > > -       kmov_2x %k0, %maskz_2x
> > > -
> > > -       blsmsk  %maskz_2x, %maskz_2x
> > > -       kunpck_2x %k2, %k3, %k3
> > > -       kmov_2x %k3, %maskm_2x
> > > -
> > > -       /* Match at end was out-of-bounds so use last known match.  */
> > > -       and     %maskz_2x, %maskm_2x
> > > -       jz      L(return_old_match)
> > > -
> > > -       bsr     %maskm_2x, %maskm_2x
> > > -       leaq    (VEC_SIZE * 2)(%rdi, %rax, CHAR_SIZE), %rax
> > > -       ret
> > > -
> > > -L(cross_page_boundary):
> > > -       /* eax contains all the page offset bits of src (rdi). `xor rdi,
> > > -          rax` sets pointer will all page offset bits cleared so
> > > -          offset of (PAGE_SIZE - VEC_SIZE) will get last aligned VEC
> > > -          before page cross (guaranteed to be safe to read). Doing this
> > > -          as opposed to `movq %rdi, %rax; andq $-VEC_SIZE, %rax` saves
> > > -          a bit of code size.  */
> > > -       xorq    %rdi, %rax
> > > -       VMOVU   (PAGE_SIZE - VEC_SIZE)(%rax), %VMM(1)
> > > -       VPTESTN %VMM(1), %VMM(1), %k0
> > > -       KMOV    %k0, %VRCX
> > > -
> > > -       /* Shift out zero CHAR matches that are before the beginning of
> > > -          src (rdi).  */
> > > -# ifdef USE_AS_WCSRCHR
> > > -       movl    %edi, %esi
> > > -       andl    $(VEC_SIZE - 1), %esi
> > > -       shrl    $2, %esi
> > > -# endif
> > > -       shrx    %VGPR(SHIFT_REG), %VRCX, %VRCX
> > > -
> > > -       test    %VRCX, %VRCX
> > > -       jz      L(page_cross_continue)
> > > +#include "x86-evex512-vecs.h"
> > > +#include "reg-macros.h"
> > >
> > > -       /* Found zero CHAR so need to test for search CHAR.  */
> > > -       VPCMP   $0, %VMATCH, %VMM(1), %k1
> > > -       KMOV    %k1, %VRAX
> > > -       /* Shift out search CHAR matches that are before the beginning of
> > > -          src (rdi).  */
> > > -       shrx    %VGPR(SHIFT_REG), %VRAX, %VRAX
> > > -
> > > -       /* Check if any search CHAR match in range.  */
> > > -       blsmsk  %VRCX, %VRCX
> > > -       and     %VRCX, %VRAX
> > > -       jz      L(ret3)
> > > -       bsr     %VRAX, %VRAX
> > > -# ifdef USE_AS_WCSRCHR
> > > -       leaq    (%rdi, %rax, CHAR_SIZE), %rax
> > > -# else
> > > -       addq    %rdi, %rax
> > > -# endif
> > > -L(ret3):
> > > -       ret
> > > -END(STRRCHR)
> > > -#endif
> > > +#include "strrchr-evex-base.S"
> > > diff --git a/sysdeps/x86_64/multiarch/wcsrchr-evex.S b/sysdeps/x86_64/multiarch/wcsrchr-evex.S
> > > index e5c5fe3bf2..a584cd3f43 100644
> > > --- a/sysdeps/x86_64/multiarch/wcsrchr-evex.S
> > > +++ b/sysdeps/x86_64/multiarch/wcsrchr-evex.S
> > > @@ -4,4 +4,5 @@
> > >
> > >  #define STRRCHR        WCSRCHR
> > >  #define USE_AS_WCSRCHR 1
> > > +#define USE_WIDE_CHAR 1
> > >  #include "strrchr-evex.S"
> > > --
> > > 2.34.1
> > >
>
>
>
> --
> H.J.


* x86: Prepare `strrchr-evex` and `strrchr-evex512` for AVX10
  2023-09-21 14:38 x86: Prepare `strrchr-evex` and `strrchr-evex512` for AVX10 Noah Goldstein
  2023-09-21 14:39 ` Noah Goldstein
@ 2023-10-04 18:48 ` Noah Goldstein
  2023-10-04 19:00   ` Sunil Pandey
  2023-10-18  9:18   ` Florian Weimer
  1 sibling, 2 replies; 12+ messages in thread
From: Noah Goldstein @ 2023-10-04 18:48 UTC (permalink / raw)
  To: libc-alpha; +Cc: goldstein.w.n, hjl.tools, carlos

This commit refactors `strrchr-evex` and `strrchr-evex512` to use a
common implementation: `strrchr-evex-base.S`.

The motivation is that `strrchr-evex` needed to be refactored to not
use 64-bit mask registers in preparation for AVX10.

Once vec-width mask register combining was removed, the EVEX and
EVEX512 implementations could easily be implemented in the same file
without any major overhead.

The net result is performance improvements (measured on TGL) for both
`strrchr-evex` and `strrchr-evex512`. Note, however, that there are
some regressions in the test suite, and many of the cases that make up
the total geomean of improvement/regression across bench-strrchr may be
cold. The point of the performance measurement is to show there are no
major regressions, but the primary motivation is preparation for
AVX10.

Benchmarks were taken on TGL:
https://www.intel.com/content/www/us/en/products/sku/213799/intel-core-i711850h-processor-24m-cache-up-to-4-80-ghz/specifications.html

EVEX geometric_mean(N=5) of all benchmarks New / Original   : 0.74
EVEX512 geometric_mean(N=5) of all benchmarks New / Original: 0.87

Full check passes on x86.
---
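Reading aid (not part of the commit): the return paths in this patch
keep reusing one idiom. They build a mask of in-bounds positions from
the first zero CHAR with `blsmsk`, `and` it with the search-CHAR match
mask, then take the highest set bit with `bsr`. Since the refactor no
longer merges two vec-width masks into one 64-bit mask, the two loop
vectors are examined one after the other. Below is a minimal C sketch of
that logic, assuming 32-bit per-vector masks and the GCC/Clang builtin
`__builtin_clz`. The helper names (`blsmsk32`, `bsr32`,
`last_match_two_vecs`) are hypothetical and do not exist in glibc.

```c
#include <stdint.h>

/* All bits up to and including the lowest set bit of x, which is what
   the BMI1 `blsmsk` instruction computes.  For x == 0 this yields all
   ones, i.e. "no terminator seen, everything is in bounds".  */
static inline uint32_t
blsmsk32 (uint32_t x)
{
  return x ^ (x - 1);
}

/* Index of the highest set bit, like `bsr`.  x must be non-zero.  */
static inline int
bsr32 (uint32_t x)
{
  return 31 - __builtin_clz (x);
}

/* Hypothetical helper.  match_* has bit i set where CHAR i equals the
   search CHAR; zero_* has bit i set where CHAR i is the terminator.
   Returns the CHAR index of the last in-bounds match across the two
   vectors, or -1 if there is none.  The masks are never concatenated
   into a single 64-bit value.  */
static int
last_match_two_vecs (uint32_t match_lo, uint32_t zero_lo,
                     uint32_t match_hi, uint32_t zero_hi,
                     int chars_per_vec)
{
  if (zero_lo == 0)
    {
      /* The string continues into the second vector, so a match there
         is later than any match in the first vector.  */
      uint32_t m = match_hi & blsmsk32 (zero_hi);
      if (m != 0)
        return chars_per_vec + bsr32 (m);
      /* No terminator in the first vector: all of it is in bounds.  */
      return match_lo != 0 ? bsr32 (match_lo) : -1;
    }
  /* Terminator in the first vector: drop matches past it.  */
  uint32_t m = match_lo & blsmsk32 (zero_lo);
  return m != 0 ? bsr32 (m) : -1;
}
```

Because `blsmsk` keeps the terminator's own bit, searching for the null
CHAR itself still finds the terminator, as strrchr requires.
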
 sysdeps/x86_64/multiarch/strrchr-evex-base.S | 469 ++++++++++++-------
 sysdeps/x86_64/multiarch/strrchr-evex.S      | 392 +---------------
 sysdeps/x86_64/multiarch/wcsrchr-evex.S      |   1 +
 3 files changed, 293 insertions(+), 569 deletions(-)
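
Reading aid (not part of the commit): the entry sequence in the diff
that follows (`andl $(PAGE_SIZE - 1)`, `cmpl $(PAGE_SIZE - VEC_SIZE)`,
`jg`) decides whether an unaligned VEC_SIZE load from the start of the
string could touch the next page. A small C sketch of that test,
assuming 4096-byte pages as in the patch; the function name
`load_may_cross_page` and the constant `PAGE_SIZE_BYTES` are
hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

enum { PAGE_SIZE_BYTES = 4096 };

/* True if a vec_size-byte load starting at addr would read past the end
   of the page containing addr.  An offset of exactly
   PAGE_SIZE_BYTES - vec_size still ends on the last byte of the page,
   so the comparison is strict.  */
static bool
load_may_cross_page (uintptr_t addr, unsigned vec_size)
{
  return (addr & (PAGE_SIZE_BYTES - 1)) > (PAGE_SIZE_BYTES - vec_size);
}
```

When the test is true, the patch instead loads the last aligned vector
of the current page and discards the CHARs that precede the string.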

diff --git a/sysdeps/x86_64/multiarch/strrchr-evex-base.S b/sysdeps/x86_64/multiarch/strrchr-evex-base.S
index 58b2853ab6..cd6a0a870a 100644
--- a/sysdeps/x86_64/multiarch/strrchr-evex-base.S
+++ b/sysdeps/x86_64/multiarch/strrchr-evex-base.S
@@ -1,4 +1,4 @@
-/* Placeholder function, not used by any processor at the moment.
+/* Implementation for strrchr using evex256 and evex512.
    Copyright (C) 2022-2023 Free Software Foundation, Inc.
    This file is part of the GNU C Library.
 
@@ -16,8 +16,6 @@
    License along with the GNU C Library; if not, see
    <https://www.gnu.org/licenses/>.  */
 
-/* UNUSED. Exists purely as reference implementation.  */
-
 #include <isa-level.h>
 
 #if ISA_SHOULD_BUILD (4)
@@ -25,240 +23,351 @@
 # include <sysdep.h>
 
 # ifdef USE_AS_WCSRCHR
+#  if VEC_SIZE == 64
+#   define RCX_M	cx
+#   define KORTEST_M	kortestw
+#  else
+#   define RCX_M	cl
+#   define KORTEST_M	kortestb
+#  endif
+
+#  define SHIFT_REG	VRCX
 #  define CHAR_SIZE	4
-#  define VPBROADCAST   vpbroadcastd
-#  define VPCMPEQ	vpcmpeqd
-#  define VPMINU	vpminud
+#  define VPCMP		vpcmpd
+#  define VPMIN		vpminud
+#  define VPCOMPRESS	vpcompressd
 #  define VPTESTN	vptestnmd
+#  define VPTEST	vptestmd
+#  define VPBROADCAST	vpbroadcastd
+#  define VPCMPEQ	vpcmpeqd
+
 # else
+#  define SHIFT_REG	VRDI
 #  define CHAR_SIZE	1
-#  define VPBROADCAST   vpbroadcastb
-#  define VPCMPEQ	vpcmpeqb
-#  define VPMINU	vpminub
+#  define VPCMP		vpcmpb
+#  define VPMIN		vpminub
+#  define VPCOMPRESS	vpcompressb
 #  define VPTESTN	vptestnmb
+#  define VPTEST	vptestmb
+#  define VPBROADCAST	vpbroadcastb
+#  define VPCMPEQ	vpcmpeqb
+
+#  define RCX_M		VRCX
+#  define KORTEST_M	KORTEST
 # endif
 
-# define PAGE_SIZE	4096
+# define VMATCH		VMM(0)
 # define CHAR_PER_VEC	(VEC_SIZE / CHAR_SIZE)
+# define PAGE_SIZE	4096
 
 	.section SECTION(.text), "ax", @progbits
-/* Aligning entry point to 64 byte, provides better performance for
-   one vector length string.  */
-ENTRY_P2ALIGN (STRRCHR, 6)
-
-	/* Broadcast CHAR to VMM(0).  */
-	VPBROADCAST %esi, %VMM(0)
+	/* Aligning the entry point to 64 bytes provides better performance
+	   for one-vector-length strings.  */
+ENTRY_P2ALIGN(STRRCHR, 6)
 	movl	%edi, %eax
-	sall	$20, %eax
-	cmpl	$((PAGE_SIZE - VEC_SIZE) << 20), %eax
-	ja	L(page_cross)
+	/* Broadcast CHAR to VMATCH.  */
+	VPBROADCAST %esi, %VMATCH
 
-L(page_cross_continue):
-	/* Compare [w]char for null, mask bit will be set for match.  */
-	VMOVU	(%rdi), %VMM(1)
+	andl	$(PAGE_SIZE - 1), %eax
+	cmpl	$(PAGE_SIZE - VEC_SIZE), %eax
+	jg	L(cross_page_boundary)
 
-	VPTESTN	%VMM(1), %VMM(1), %k1
-	KMOV	%k1, %VRCX
-	test	%VRCX, %VRCX
-	jz	L(align_more)
-
-	VPCMPEQ	%VMM(1), %VMM(0), %k0
-	KMOV	%k0, %VRAX
-	BLSMSK	%VRCX, %VRCX
-	and	%VRCX, %VRAX
-	jz	L(ret)
-
-	BSR	%VRAX, %VRAX
+	VMOVU	(%rdi), %VMM(1)
+	/* k0 has a 1 for each zero CHAR in YMM1.  */
+	VPTESTN	%VMM(1), %VMM(1), %k0
+	KMOV	%k0, %VGPR(rsi)
+	test	%VGPR(rsi), %VGPR(rsi)
+	jz	L(aligned_more)
+	/* fallthrough: zero CHAR in first VEC.  */
+L(page_cross_return):
+	/* K1 has a 1 for each search CHAR match in VEC(1).  */
+	VPCMPEQ	%VMATCH, %VMM(1), %k1
+	KMOV	%k1, %VGPR(rax)
+	/* Build mask up until first zero CHAR (used to mask off
+	   potential search CHAR matches past the end of the string).  */
+	blsmsk	%VGPR(rsi), %VGPR(rsi)
+	/* Use `and` here to remove any out of bounds matches so we can
+	   do a reverse scan on `rax` to find the last match.  */
+	and	%VGPR(rsi), %VGPR(rax)
+	jz	L(ret0)
+	/* Get last match.  */
+	bsr	%VGPR(rax), %VGPR(rax)
 # ifdef USE_AS_WCSRCHR
 	leaq	(%rdi, %rax, CHAR_SIZE), %rax
 # else
-	add	%rdi, %rax
+	addq	%rdi, %rax
 # endif
-L(ret):
+L(ret0):
 	ret
 
-L(vector_x2_end):
-	VPCMPEQ	%VMM(2), %VMM(0), %k2
-	KMOV	%k2, %VRAX
-	BLSMSK	%VRCX, %VRCX
-	and	%VRCX, %VRAX
-	jz	L(vector_x1_ret)
-
-	BSR	%VRAX, %VRAX
-	leaq	(VEC_SIZE)(%rdi, %rax, CHAR_SIZE), %rax
-	ret
-
-	/* Check the first vector at very last to look for match.  */
-L(vector_x1_ret):
-	VPCMPEQ %VMM(1), %VMM(0), %k2
-	KMOV	%k2, %VRAX
-	test	%VRAX, %VRAX
-	jz	L(ret)
-
-	BSR	%VRAX, %VRAX
+	/* Returns for first vec x1/x2/x3 have hard coded backward
+	   search path for earlier matches.  */
+	.p2align 4,, 6
+L(first_vec_x1):
+	VPCMPEQ	%VMATCH, %VMM(2), %k1
+	KMOV	%k1, %VGPR(rax)
+	blsmsk	%VGPR(rcx), %VGPR(rcx)
+	/* eax non-zero if search CHAR in range.  */
+	and	%VGPR(rcx), %VGPR(rax)
+	jnz	L(first_vec_x1_return)
+
+	/* fallthrough: no match in YMM2 then need to check for earlier
+	   matches (in YMM1).  */
+	.p2align 4,, 4
+L(first_vec_x0_test):
+	VPCMPEQ	%VMATCH, %VMM(1), %k1
+	KMOV	%k1, %VGPR(rax)
+	test	%VGPR(rax), %VGPR(rax)
+	jz	L(ret1)
+	bsr	%VGPR(rax), %VGPR(rax)
 # ifdef USE_AS_WCSRCHR
 	leaq	(%rsi, %rax, CHAR_SIZE), %rax
 # else
-	add	%rsi, %rax
+	addq	%rsi, %rax
 # endif
+L(ret1):
 	ret
 
-L(align_more):
-	/* Zero r8 to store match result.  */
-	xorl	%r8d, %r8d
-	/* Save pointer of first vector, in case if no match found.  */
+	.p2align 4,, 10
+L(first_vec_x3):
+	VPCMPEQ	%VMATCH, %VMM(4), %k1
+	KMOV	%k1, %VGPR(rax)
+	blsmsk	%VGPR(rcx), %VGPR(rcx)
+	/* If no search CHAR match in range check YMM1/YMM2/YMM3.  */
+	and	%VGPR(rcx), %VGPR(rax)
+	jz	L(first_vec_x1_or_x2)
+	bsr	%VGPR(rax), %VGPR(rax)
+	leaq	(VEC_SIZE * 3)(%rdi, %rax, CHAR_SIZE), %rax
+	ret
+	.p2align 4,, 4
+
+L(first_vec_x2):
+	VPCMPEQ	%VMATCH, %VMM(3), %k1
+	KMOV	%k1, %VGPR(rax)
+	blsmsk	%VGPR(rcx), %VGPR(rcx)
+	/* Check YMM3 for last match first. If no match try YMM2/YMM1.  */
+	and	%VGPR(rcx), %VGPR(rax)
+	jz	L(first_vec_x0_x1_test)
+	bsr	%VGPR(rax), %VGPR(rax)
+	leaq	(VEC_SIZE * 2)(%r8, %rax, CHAR_SIZE), %rax
+	ret
+
+	.p2align 4,, 6
+L(first_vec_x0_x1_test):
+	VPCMPEQ	%VMATCH, %VMM(2), %k1
+	KMOV	%k1, %VGPR(rax)
+	/* Check YMM2 for last match first. If no match try YMM1.  */
+	test	%VGPR(rax), %VGPR(rax)
+	jz	L(first_vec_x0_test)
+	.p2align 4,, 4
+L(first_vec_x1_return):
+	bsr	%VGPR(rax), %VGPR(rax)
+	leaq	(VEC_SIZE)(%r8, %rax, CHAR_SIZE), %rax
+	ret
+
+	.p2align 4,, 12
+L(aligned_more):
+L(page_cross_continue):
+	/* Need to keep original pointer in case VEC(1) has last match.  */
 	movq	%rdi, %rsi
-	/* Align pointer to vector size.  */
 	andq	$-VEC_SIZE, %rdi
-	/* Loop unroll for 2 vector loop.  */
-	VMOVA	(VEC_SIZE)(%rdi), %VMM(2)
+
+	VMOVU	VEC_SIZE(%rdi), %VMM(2)
 	VPTESTN	%VMM(2), %VMM(2), %k0
 	KMOV	%k0, %VRCX
+	movq	%rdi, %r8
 	test	%VRCX, %VRCX
-	jnz	L(vector_x2_end)
+	jnz	L(first_vec_x1)
+
+	VMOVU	(VEC_SIZE * 2)(%rdi), %VMM(3)
+	VPTESTN	%VMM(3), %VMM(3), %k0
+	KMOV	%k0, %VRCX
+
+	test	%VRCX, %VRCX
+	jnz	L(first_vec_x2)
+
+	VMOVU	(VEC_SIZE * 3)(%rdi), %VMM(4)
+	VPTESTN	%VMM(4), %VMM(4), %k0
+	KMOV	%k0, %VRCX
+
+	/* Intentionally use 64-bit here.  EVEX256 version needs 1-byte
+	   padding for efficient nop before loop alignment.  */
+	test	%rcx, %rcx
+	jnz	L(first_vec_x3)
 
-	/* Save pointer of second vector, in case if no match
-	   found.  */
-	movq	%rdi, %r9
-	/* Align address to VEC_SIZE * 2 for loop.  */
 	andq	$-(VEC_SIZE * 2), %rdi
+	.p2align 4
+L(first_aligned_loop):
+	/* Preserve VEC(1), VEC(2), VEC(3), and VEC(4) until we can
+	   guarantee they don't store a match.  */
+	VMOVA	(VEC_SIZE * 4)(%rdi), %VMM(5)
+	VMOVA	(VEC_SIZE * 5)(%rdi), %VMM(6)
 
-	.p2align 4,,11
-L(loop):
-	/* 2 vector loop, as it provide better performance as compared
-	   to 4 vector loop.  */
-	VMOVA	(VEC_SIZE * 2)(%rdi), %VMM(3)
-	VMOVA	(VEC_SIZE * 3)(%rdi), %VMM(4)
-	VPCMPEQ	%VMM(3), %VMM(0), %k1
-	VPCMPEQ	%VMM(4), %VMM(0), %k2
-	VPMINU	%VMM(3), %VMM(4), %VMM(5)
-	VPTESTN	%VMM(5), %VMM(5), %k0
-	KOR	%k1, %k2, %k3
-	subq	$-(VEC_SIZE * 2), %rdi
-	/* If k0 and k3 zero, match and end of string not found.  */
-	KORTEST	%k0, %k3
-	jz	L(loop)
-
-	/* If k0 is non zero, end of string found.  */
-	KORTEST %k0, %k0
-	jnz	L(endloop)
-
-	lea	VEC_SIZE(%rdi), %r8
-	/* A match found, it need to be stored in r8 before loop
-	   continue.  */
-	/* Check second vector first.  */
-	KMOV	%k2, %VRDX
-	test	%VRDX, %VRDX
-	jnz	L(loop_vec_x2_match)
+	VPCMP	$4, %VMM(5), %VMATCH, %k2
+	VPCMP	$4, %VMM(6), %VMATCH, %k3{%k2}
 
+	VPMIN	%VMM(5), %VMM(6), %VMM(7)
+
+	VPTEST	%VMM(7), %VMM(7), %k1{%k3}
+	subq	$(VEC_SIZE * -2), %rdi
+	KORTEST_M %k1, %k1
+	jc	L(first_aligned_loop)
+
+	VPTESTN	%VMM(7), %VMM(7), %k1
 	KMOV	%k1, %VRDX
-	/* Match is in first vector, rdi offset need to be subtracted
-	  by VEC_SIZE.  */
-	sub	$VEC_SIZE, %r8
-
-	/* If second vector doesn't have match, first vector must
-	   have match.  */
-L(loop_vec_x2_match):
-	BSR	%VRDX, %VRDX
-# ifdef USE_AS_WCSRCHR
-	sal	$2, %rdx
-# endif
-	add	%rdx, %r8
-	jmp	L(loop)
+	test	%VRDX, %VRDX
+	jz	L(second_aligned_loop_prep)
 
-L(endloop):
-	/* Check if string end in first loop vector.  */
-	VPTESTN	%VMM(3), %VMM(3), %k0
-	KMOV	%k0, %VRCX
-	test	%VRCX, %VRCX
-	jnz	L(loop_vector_x1_end)
+	KORTEST_M %k3, %k3
+	jnc	L(return_first_aligned_loop)
 
-	/* Check if it has match in first loop vector.  */
-	KMOV	%k1, %VRAX
+	.p2align 4,, 6
+L(first_vec_x1_or_x2_or_x3):
+	VPCMPEQ	%VMM(4), %VMATCH, %k4
+	KMOV	%k4, %VRAX
 	test	%VRAX, %VRAX
-	jz	L(loop_vector_x2_end)
-
-	BSR	%VRAX, %VRAX
-	leaq	(%rdi, %rax, CHAR_SIZE), %r8
+	jz	L(first_vec_x1_or_x2)
+	bsr	%VRAX, %VRAX
+	leaq	(VEC_SIZE * 3)(%r8, %rax, CHAR_SIZE), %rax
+	ret
 
-	/* String must end in second loop vector.  */
-L(loop_vector_x2_end):
-	VPTESTN	%VMM(4), %VMM(4), %k0
+	.p2align 4,, 8
+L(return_first_aligned_loop):
+	VPTESTN	%VMM(5), %VMM(5), %k0
 	KMOV	%k0, %VRCX
+	blsmsk	%VRCX, %VRCX
+	jnc	L(return_first_new_match_first)
+	blsmsk	%VRDX, %VRDX
+	VPCMPEQ	%VMM(6), %VMATCH, %k0
+	KMOV	%k0, %VRAX
+	addq	$VEC_SIZE, %rdi
+	and	%VRDX, %VRAX
+	jnz	L(return_first_new_match_ret)
+	subq	$VEC_SIZE, %rdi
+L(return_first_new_match_first):
 	KMOV	%k2, %VRAX
-	BLSMSK	%VRCX, %VRCX
-	/* Check if it has match in second loop vector.  */
+# ifdef USE_AS_WCSRCHR
+	xorl	$((1 << CHAR_PER_VEC)- 1), %VRAX
 	and	%VRCX, %VRAX
-	jz	L(check_last_match)
+# else
+	andn	%VRCX, %VRAX, %VRAX
+# endif
+	jz	L(first_vec_x1_or_x2_or_x3)
+L(return_first_new_match_ret):
+	bsr	%VRAX, %VRAX
+	leaq	(VEC_SIZE * 2)(%rdi, %rax, CHAR_SIZE), %rax
+	ret
 
-	BSR	%VRAX, %VRAX
-	leaq	(VEC_SIZE)(%rdi, %rax, CHAR_SIZE), %rax
+	.p2align 4,, 10
+L(first_vec_x1_or_x2):
+	VPCMPEQ	%VMM(3), %VMATCH, %k3
+	KMOV	%k3, %VRAX
+	test	%VRAX, %VRAX
+	jz	L(first_vec_x0_x1_test)
+	bsr	%VRAX, %VRAX
+	leaq	(VEC_SIZE * 2)(%r8, %rax, CHAR_SIZE), %rax
 	ret
 
-	/* String end in first loop vector.  */
-L(loop_vector_x1_end):
-	KMOV	%k1, %VRAX
-	BLSMSK	%VRCX, %VRCX
-	/* Check if it has match in second loop vector.  */
-	and	%VRCX, %VRAX
-	jz	L(check_last_match)
+	.p2align 4
+	/* We can throw away the work done for the first 4x checks here
+	   as we have a later match. This is the 'fast' path per se.  */
+L(second_aligned_loop_prep):
+L(second_aligned_loop_set_furthest_match):
+	movq	%rdi, %rsi
+	VMOVA	%VMM(5), %VMM(7)
+	VMOVA	%VMM(6), %VMM(8)
+	.p2align 4
+L(second_aligned_loop):
+	VMOVU	(VEC_SIZE * 4)(%rdi), %VMM(5)
+	VMOVU	(VEC_SIZE * 5)(%rdi), %VMM(6)
+	VPCMP	$4, %VMM(5), %VMATCH, %k2
+	VPCMP	$4, %VMM(6), %VMATCH, %k3{%k2}
+
+	VPMIN	%VMM(5), %VMM(6), %VMM(4)
+
+	VPTEST	%VMM(4), %VMM(4), %k1{%k3}
+	subq	$(VEC_SIZE * -2), %rdi
+	KMOV	%k1, %VRCX
+	inc	%RCX_M
+	jz	L(second_aligned_loop)
+	VPTESTN	%VMM(4), %VMM(4), %k1
+	KMOV	%k1, %VRDX
+	test	%VRDX, %VRDX
+	jz	L(second_aligned_loop_set_furthest_match)
 
-	BSR	%VRAX, %VRAX
-	leaq	(%rdi, %rax, CHAR_SIZE), %rax
-	ret
+	KORTEST_M %k3, %k3
+	jnc	L(return_new_match)
+	/* branch here because there is a significant advantage in terms
+	   of output dependency chance in using edx.  */
 
-	/* No match in first and second loop vector.  */
-L(check_last_match):
-	/* Check if any match recorded in r8.  */
-	test	%r8, %r8
-	jz	L(vector_x2_ret)
-	movq	%r8, %rax
+L(return_old_match):
+	VPCMPEQ	%VMM(8), %VMATCH, %k0
+	KMOV	%k0, %VRCX
+	bsr	%VRCX, %VRCX
+	jnz	L(return_old_match_ret)
+
+	VPCMPEQ	%VMM(7), %VMATCH, %k0
+	KMOV	%k0, %VRCX
+	bsr	%VRCX, %VRCX
+	subq	$VEC_SIZE, %rsi
+L(return_old_match_ret):
+	leaq	(VEC_SIZE * 3)(%rsi, %rcx, CHAR_SIZE), %rax
 	ret
 
-	/* No match recorded in r8. Check the second saved vector
-	   in beginning.  */
-L(vector_x2_ret):
-	VPCMPEQ %VMM(2), %VMM(0), %k2
+L(return_new_match):
+	VPTESTN	%VMM(5), %VMM(5), %k0
+	KMOV	%k0, %VRCX
+	blsmsk	%VRCX, %VRCX
+	jnc	L(return_new_match_first)
+	dec	%VRDX
+	VPCMPEQ	%VMM(6), %VMATCH, %k0
+	KMOV	%k0, %VRAX
+	addq	$VEC_SIZE, %rdi
+	and	%VRDX, %VRAX
+	jnz	L(return_new_match_ret)
+	subq	$VEC_SIZE, %rdi
+L(return_new_match_first):
 	KMOV	%k2, %VRAX
-	test	%VRAX, %VRAX
-	jz	L(vector_x1_ret)
-
-	/* Match found in the second saved vector.  */
-	BSR	%VRAX, %VRAX
-	leaq	(VEC_SIZE)(%r9, %rax, CHAR_SIZE), %rax
+# ifdef USE_AS_WCSRCHR
+	xorl	$((1 << CHAR_PER_VEC)- 1), %VRAX
+	and	%VRCX, %VRAX
+# else
+	andn	%VRCX, %VRAX, %VRAX
+# endif
+	jz	L(return_old_match)
+L(return_new_match_ret):
+	bsr	%VRAX, %VRAX
+	leaq	(VEC_SIZE * 2)(%rdi, %rax, CHAR_SIZE), %rax
 	ret
 
-L(page_cross):
-	mov	%rdi, %rax
-	movl	%edi, %ecx
+	.p2align 4,, 4
+L(cross_page_boundary):
+	xorq	%rdi, %rax
+	mov	$-1, %VRDX
+	VMOVU	(PAGE_SIZE - VEC_SIZE)(%rax), %VMM(6)
+	VPTESTN	%VMM(6), %VMM(6), %k0
+	KMOV	%k0, %VRSI
 
 # ifdef USE_AS_WCSRCHR
-	/* Calculate number of compare result bits to be skipped for
-	   wide string alignment adjustment.  */
-	andl	$(VEC_SIZE - 1), %ecx
-	sarl	$2, %ecx
+	movl	%edi, %ecx
+	and	$(VEC_SIZE - 1), %ecx
+	shrl	$2, %ecx
 # endif
-	/* ecx contains number of w[char] to be skipped as a result
-	   of address alignment.  */
-	andq    $-VEC_SIZE, %rax
-	VMOVA	(%rax), %VMM(1)
-	VPTESTN	%VMM(1), %VMM(1), %k1
-	KMOV	%k1, %VRAX
-	SHR     %cl, %VRAX
-	jz	L(page_cross_continue)
-	VPCMPEQ	%VMM(1), %VMM(0), %k0
-	KMOV	%k0, %VRDX
-	SHR     %cl, %VRDX
-	BLSMSK	%VRAX, %VRAX
-	and	%VRDX, %VRAX
-	jz	L(ret)
-	BSR	%VRAX, %VRAX
+	shlx	%SHIFT_REG, %VRDX, %VRDX
+
 # ifdef USE_AS_WCSRCHR
-	leaq	(%rdi, %rax, CHAR_SIZE), %rax
+	kmovw	%edx, %k1
 # else
-	add	%rdi, %rax
+	KMOV	%VRDX, %k1
 # endif
 
-	ret
-END (STRRCHR)
+	VPCOMPRESS %VMM(6), %VMM(1){%k1}{z}
+	/* We could technically just jmp back after the vpcompress but
+	   it doesn't save any 16-byte blocks.  */
+	shrx	%SHIFT_REG, %VRSI, %VRSI
+	test	%VRSI, %VRSI
+	jnz	L(page_cross_return)
+	jmp	L(page_cross_continue)
+	/* 1-byte from cache line.  */
+END(STRRCHR)
 #endif
diff --git a/sysdeps/x86_64/multiarch/strrchr-evex.S b/sysdeps/x86_64/multiarch/strrchr-evex.S
index 85e3b0119f..3bf6a51014 100644
--- a/sysdeps/x86_64/multiarch/strrchr-evex.S
+++ b/sysdeps/x86_64/multiarch/strrchr-evex.S
@@ -1,394 +1,8 @@
-/* strrchr/wcsrchr optimized with 256-bit EVEX instructions.
-   Copyright (C) 2021-2023 Free Software Foundation, Inc.
-   This file is part of the GNU C Library.
-
-   The GNU C Library is free software; you can redistribute it and/or
-   modify it under the terms of the GNU Lesser General Public
-   License as published by the Free Software Foundation; either
-   version 2.1 of the License, or (at your option) any later version.
-
-   The GNU C Library is distributed in the hope that it will be useful,
-   but WITHOUT ANY WARRANTY; without even the implied warranty of
-   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
-   Lesser General Public License for more details.
-
-   You should have received a copy of the GNU Lesser General Public
-   License along with the GNU C Library; if not, see
-   <https://www.gnu.org/licenses/>.  */
-
-#include <isa-level.h>
-
-#if ISA_SHOULD_BUILD (4)
-
-# include <sysdep.h>
-
 # ifndef STRRCHR
 #  define STRRCHR	__strrchr_evex
 # endif
 
-# include "x86-evex256-vecs.h"
-
-# ifdef USE_AS_WCSRCHR
-#  define SHIFT_REG	rsi
-#  define kunpck_2x	kunpckbw
-#  define kmov_2x	kmovd
-#  define maskz_2x	ecx
-#  define maskm_2x	eax
-#  define CHAR_SIZE	4
-#  define VPMIN	vpminud
-#  define VPTESTN	vptestnmd
-#  define VPTEST	vptestmd
-#  define VPBROADCAST	vpbroadcastd
-#  define VPCMPEQ	vpcmpeqd
-#  define VPCMP	vpcmpd
-
-#  define USE_WIDE_CHAR
-# else
-#  define SHIFT_REG	rdi
-#  define kunpck_2x	kunpckdq
-#  define kmov_2x	kmovq
-#  define maskz_2x	rcx
-#  define maskm_2x	rax
-
-#  define CHAR_SIZE	1
-#  define VPMIN	vpminub
-#  define VPTESTN	vptestnmb
-#  define VPTEST	vptestmb
-#  define VPBROADCAST	vpbroadcastb
-#  define VPCMPEQ	vpcmpeqb
-#  define VPCMP	vpcmpb
-# endif
-
-# include "reg-macros.h"
-
-# define VMATCH	VMM(0)
-# define CHAR_PER_VEC	(VEC_SIZE / CHAR_SIZE)
-# define PAGE_SIZE	4096
-
-	.section SECTION(.text), "ax", @progbits
-ENTRY_P2ALIGN(STRRCHR, 6)
-	movl	%edi, %eax
-	/* Broadcast CHAR to VMATCH.  */
-	VPBROADCAST %esi, %VMATCH
-
-	andl	$(PAGE_SIZE - 1), %eax
-	cmpl	$(PAGE_SIZE - VEC_SIZE), %eax
-	jg	L(cross_page_boundary)
-L(page_cross_continue):
-	VMOVU	(%rdi), %VMM(1)
-	/* k0 has a 1 for each zero CHAR in VEC(1).  */
-	VPTESTN	%VMM(1), %VMM(1), %k0
-	KMOV	%k0, %VRSI
-	test	%VRSI, %VRSI
-	jz	L(aligned_more)
-	/* fallthrough: zero CHAR in first VEC.  */
-	/* K1 has a 1 for each search CHAR match in VEC(1).  */
-	VPCMPEQ	%VMATCH, %VMM(1), %k1
-	KMOV	%k1, %VRAX
-	/* Build mask up until first zero CHAR (used to mask of
-	   potential search CHAR matches past the end of the string).
-	 */
-	blsmsk	%VRSI, %VRSI
-	and	%VRSI, %VRAX
-	jz	L(ret0)
-	/* Get last match (the `and` removed any out of bounds matches).
-	 */
-	bsr	%VRAX, %VRAX
-# ifdef USE_AS_WCSRCHR
-	leaq	(%rdi, %rax, CHAR_SIZE), %rax
-# else
-	addq	%rdi, %rax
-# endif
-L(ret0):
-	ret
-
-	/* Returns for first vec x1/x2/x3 have hard coded backward
-	   search path for earlier matches.  */
-	.p2align 4,, 6
-L(first_vec_x1):
-	VPCMPEQ	%VMATCH, %VMM(2), %k1
-	KMOV	%k1, %VRAX
-	blsmsk	%VRCX, %VRCX
-	/* eax non-zero if search CHAR in range.  */
-	and	%VRCX, %VRAX
-	jnz	L(first_vec_x1_return)
-
-	/* fallthrough: no match in VEC(2) then need to check for
-	   earlier matches (in VEC(1)).  */
-	.p2align 4,, 4
-L(first_vec_x0_test):
-	VPCMPEQ	%VMATCH, %VMM(1), %k1
-	KMOV	%k1, %VRAX
-	test	%VRAX, %VRAX
-	jz	L(ret1)
-	bsr	%VRAX, %VRAX
-# ifdef USE_AS_WCSRCHR
-	leaq	(%rsi, %rax, CHAR_SIZE), %rax
-# else
-	addq	%rsi, %rax
-# endif
-L(ret1):
-	ret
-
-	.p2align 4,, 10
-L(first_vec_x1_or_x2):
-	VPCMPEQ	%VMM(3), %VMATCH, %k3
-	VPCMPEQ	%VMM(2), %VMATCH, %k2
-	/* K2 and K3 have 1 for any search CHAR match. Test if any
-	   matches between either of them. Otherwise check VEC(1).  */
-	KORTEST %k2, %k3
-	jz	L(first_vec_x0_test)
-
-	/* Guaranteed that VEC(2) and VEC(3) are within range so merge
-	   the two bitmasks then get last result.  */
-	kunpck_2x %k2, %k3, %k3
-	kmov_2x	%k3, %maskm_2x
-	bsr	%maskm_2x, %maskm_2x
-	leaq	(VEC_SIZE * 1)(%r8, %rax, CHAR_SIZE), %rax
-	ret
-
-	.p2align 4,, 7
-L(first_vec_x3):
-	VPCMPEQ	%VMATCH, %VMM(4), %k1
-	KMOV	%k1, %VRAX
-	blsmsk	%VRCX, %VRCX
-	/* If no search CHAR match in range check VEC(1)/VEC(2)/VEC(3).
-	 */
-	and	%VRCX, %VRAX
-	jz	L(first_vec_x1_or_x2)
-	bsr	%VRAX, %VRAX
-	leaq	(VEC_SIZE * 3)(%rdi, %rax, CHAR_SIZE), %rax
-	ret
-
-
-	.p2align 4,, 6
-L(first_vec_x0_x1_test):
-	VPCMPEQ	%VMATCH, %VMM(2), %k1
-	KMOV	%k1, %VRAX
-	/* Check VEC(2) for last match first. If no match try VEC(1).
-	 */
-	test	%VRAX, %VRAX
-	jz	L(first_vec_x0_test)
-	.p2align 4,, 4
-L(first_vec_x1_return):
-	bsr	%VRAX, %VRAX
-	leaq	(VEC_SIZE)(%rdi, %rax, CHAR_SIZE), %rax
-	ret
-
-
-	.p2align 4,, 10
-L(first_vec_x2):
-	VPCMPEQ	%VMATCH, %VMM(3), %k1
-	KMOV	%k1, %VRAX
-	blsmsk	%VRCX, %VRCX
-	/* Check VEC(3) for last match first. If no match try
-	   VEC(2)/VEC(1).  */
-	and	%VRCX, %VRAX
-	jz	L(first_vec_x0_x1_test)
-	bsr	%VRAX, %VRAX
-	leaq	(VEC_SIZE * 2)(%rdi, %rax, CHAR_SIZE), %rax
-	ret
-
-
-	.p2align 4,, 12
-L(aligned_more):
-	/* Need to keep original pointer in case VEC(1) has last match.
-	 */
-	movq	%rdi, %rsi
-	andq	$-VEC_SIZE, %rdi
-
-	VMOVU	VEC_SIZE(%rdi), %VMM(2)
-	VPTESTN	%VMM(2), %VMM(2), %k0
-	KMOV	%k0, %VRCX
-
-	test	%VRCX, %VRCX
-	jnz	L(first_vec_x1)
-
-	VMOVU	(VEC_SIZE * 2)(%rdi), %VMM(3)
-	VPTESTN	%VMM(3), %VMM(3), %k0
-	KMOV	%k0, %VRCX
-
-	test	%VRCX, %VRCX
-	jnz	L(first_vec_x2)
-
-	VMOVU	(VEC_SIZE * 3)(%rdi), %VMM(4)
-	VPTESTN	%VMM(4), %VMM(4), %k0
-	KMOV	%k0, %VRCX
-	movq	%rdi, %r8
-	test	%VRCX, %VRCX
-	jnz	L(first_vec_x3)
-
-	andq	$-(VEC_SIZE * 2), %rdi
-	.p2align 4,, 10
-L(first_aligned_loop):
-	/* Preserve VEC(1), VEC(2), VEC(3), and VEC(4) until we can
-	   guarantee they don't store a match.  */
-	VMOVA	(VEC_SIZE * 4)(%rdi), %VMM(5)
-	VMOVA	(VEC_SIZE * 5)(%rdi), %VMM(6)
-
-	VPCMPEQ	%VMM(5), %VMATCH, %k2
-	vpxord	%VMM(6), %VMATCH, %VMM(7)
-
-	VPMIN	%VMM(5), %VMM(6), %VMM(8)
-	VPMIN	%VMM(8), %VMM(7), %VMM(7)
-
-	VPTESTN	%VMM(7), %VMM(7), %k1
-	subq	$(VEC_SIZE * -2), %rdi
-	KORTEST %k1, %k2
-	jz	L(first_aligned_loop)
-
-	VPCMPEQ	%VMM(6), %VMATCH, %k3
-	VPTESTN	%VMM(8), %VMM(8), %k1
-
-	/* If k1 is zero, then we found a CHAR match but no null-term.
-	   We can now safely throw out VEC1-4.  */
-	KTEST	%k1, %k1
-	jz	L(second_aligned_loop_prep)
-
-	KORTEST %k2, %k3
-	jnz	L(return_first_aligned_loop)
-
-
-	.p2align 4,, 6
-L(first_vec_x1_or_x2_or_x3):
-	VPCMPEQ	%VMM(4), %VMATCH, %k4
-	KMOV	%k4, %VRAX
-	bsr	%VRAX, %VRAX
-	jz	L(first_vec_x1_or_x2)
-	leaq	(VEC_SIZE * 3)(%r8, %rax, CHAR_SIZE), %rax
-	ret
-
-
-	.p2align 4,, 8
-L(return_first_aligned_loop):
-	VPTESTN	%VMM(5), %VMM(5), %k0
-
-	/* Combined results from VEC5/6.  */
-	kunpck_2x %k0, %k1, %k0
-	kmov_2x	%k0, %maskz_2x
-
-	blsmsk	%maskz_2x, %maskz_2x
-	kunpck_2x %k2, %k3, %k3
-	kmov_2x	%k3, %maskm_2x
-	and	%maskz_2x, %maskm_2x
-	jz	L(first_vec_x1_or_x2_or_x3)
-
-	bsr	%maskm_2x, %maskm_2x
-	leaq	(VEC_SIZE * 2)(%rdi, %rax, CHAR_SIZE), %rax
-	ret
-
-	.p2align 4
-	/* We can throw away the work done for the first 4x checks here
-	   as we have a later match. This is the 'fast' path persay.
-	 */
-L(second_aligned_loop_prep):
-L(second_aligned_loop_set_furthest_match):
-	movq	%rdi, %rsi
-	/* Ideally we would safe k2/k3 but `kmov/kunpck` take uops on
-	   port0 and have noticeable overhead in the loop.  */
-	VMOVA	%VMM(5), %VMM(7)
-	VMOVA	%VMM(6), %VMM(8)
-	.p2align 4
-L(second_aligned_loop):
-	VMOVU	(VEC_SIZE * 4)(%rdi), %VMM(5)
-	VMOVU	(VEC_SIZE * 5)(%rdi), %VMM(6)
-	VPCMPEQ	%VMM(5), %VMATCH, %k2
-	vpxord	%VMM(6), %VMATCH, %VMM(3)
-
-	VPMIN	%VMM(5), %VMM(6), %VMM(4)
-	VPMIN	%VMM(3), %VMM(4), %VMM(3)
-
-	VPTESTN	%VMM(3), %VMM(3), %k1
-	subq	$(VEC_SIZE * -2), %rdi
-	KORTEST %k1, %k2
-	jz	L(second_aligned_loop)
-	VPCMPEQ	%VMM(6), %VMATCH, %k3
-	VPTESTN	%VMM(4), %VMM(4), %k1
-	KTEST	%k1, %k1
-	jz	L(second_aligned_loop_set_furthest_match)
-
-	/* branch here because we know we have a match in VEC7/8 but
-	   might not in VEC5/6 so the latter is expected to be less
-	   likely.  */
-	KORTEST %k2, %k3
-	jnz	L(return_new_match)
-
-L(return_old_match):
-	VPCMPEQ	%VMM(8), %VMATCH, %k0
-	KMOV	%k0, %VRCX
-	bsr	%VRCX, %VRCX
-	jnz	L(return_old_match_ret)
-
-	VPCMPEQ	%VMM(7), %VMATCH, %k0
-	KMOV	%k0, %VRCX
-	bsr	%VRCX, %VRCX
-	subq	$VEC_SIZE, %rsi
-L(return_old_match_ret):
-	leaq	(VEC_SIZE * 3)(%rsi, %rcx, CHAR_SIZE), %rax
-	ret
-
-	.p2align 4,, 10
-L(return_new_match):
-	VPTESTN	%VMM(5), %VMM(5), %k0
-
-	/* Combined results from VEC5/6.  */
-	kunpck_2x %k0, %k1, %k0
-	kmov_2x	%k0, %maskz_2x
-
-	blsmsk	%maskz_2x, %maskz_2x
-	kunpck_2x %k2, %k3, %k3
-	kmov_2x	%k3, %maskm_2x
-
-	/* Match at end was out-of-bounds so use last known match.  */
-	and	%maskz_2x, %maskm_2x
-	jz	L(return_old_match)
-
-	bsr	%maskm_2x, %maskm_2x
-	leaq	(VEC_SIZE * 2)(%rdi, %rax, CHAR_SIZE), %rax
-	ret
-
-L(cross_page_boundary):
-	/* eax contains all the page offset bits of src (rdi). `xor rdi,
-	   rax` sets pointer will all page offset bits cleared so
-	   offset of (PAGE_SIZE - VEC_SIZE) will get last aligned VEC
-	   before page cross (guaranteed to be safe to read). Doing this
-	   as opposed to `movq %rdi, %rax; andq $-VEC_SIZE, %rax` saves
-	   a bit of code size.  */
-	xorq	%rdi, %rax
-	VMOVU	(PAGE_SIZE - VEC_SIZE)(%rax), %VMM(1)
-	VPTESTN	%VMM(1), %VMM(1), %k0
-	KMOV	%k0, %VRCX
-
-	/* Shift out zero CHAR matches that are before the beginning of
-	   src (rdi).  */
-# ifdef USE_AS_WCSRCHR
-	movl	%edi, %esi
-	andl	$(VEC_SIZE - 1), %esi
-	shrl	$2, %esi
-# endif
-	shrx	%VGPR(SHIFT_REG), %VRCX, %VRCX
-
-	test	%VRCX, %VRCX
-	jz	L(page_cross_continue)
+#include "x86-evex256-vecs.h"
+#include "reg-macros.h"
 
-	/* Found zero CHAR so need to test for search CHAR.  */
-	VPCMP	$0, %VMATCH, %VMM(1), %k1
-	KMOV	%k1, %VRAX
-	/* Shift out search CHAR matches that are before the beginning of
-	   src (rdi).  */
-	shrx	%VGPR(SHIFT_REG), %VRAX, %VRAX
-
-	/* Check if any search CHAR match in range.  */
-	blsmsk	%VRCX, %VRCX
-	and	%VRCX, %VRAX
-	jz	L(ret3)
-	bsr	%VRAX, %VRAX
-# ifdef USE_AS_WCSRCHR
-	leaq	(%rdi, %rax, CHAR_SIZE), %rax
-# else
-	addq	%rdi, %rax
-# endif
-L(ret3):
-	ret
-END(STRRCHR)
-#endif
+#include "strrchr-evex-base.S"
diff --git a/sysdeps/x86_64/multiarch/wcsrchr-evex.S b/sysdeps/x86_64/multiarch/wcsrchr-evex.S
index e5c5fe3bf2..a584cd3f43 100644
--- a/sysdeps/x86_64/multiarch/wcsrchr-evex.S
+++ b/sysdeps/x86_64/multiarch/wcsrchr-evex.S
@@ -4,4 +4,5 @@
 
 #define STRRCHR	WCSRCHR
 #define USE_AS_WCSRCHR 1
+#define USE_WIDE_CHAR 1
 #include "strrchr-evex.S"
-- 
2.34.1
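
For readers following the patch, the single-vector return path (`blsmsk`, then `and`, then `bsr`) can be sketched in C roughly as below. This is only an illustration under stated assumptions: the helper name `last_match_before_null` and the 32-bit mask width are hypothetical and do not appear in glibc.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not part of glibc).
   zero_mask:  bit i is set if character i of the chunk is the null CHAR.
   match_mask: bit i is set if character i equals the search CHAR.
   Returns the index of the last match at or before the first null CHAR,
   or -1 if there is no such match.  */
static int
last_match_before_null (uint32_t zero_mask, uint32_t match_mask)
{
  /* This path is only taken once a null CHAR has been seen.  */
  assert (zero_mask != 0);

  /* Same effect as `blsmsk`: all bits up to and including the lowest
     set bit of zero_mask.  */
  uint32_t in_bounds = zero_mask ^ (zero_mask - 1);

  /* Same effect as the `and`: drop matches past the end of the string.  */
  uint32_t valid = match_mask & in_bounds;
  if (valid == 0)
    return -1;

  /* Same effect as `bsr`: index of the highest set bit.  */
  int idx = 31;
  while ((valid & (UINT32_C (1) << idx)) == 0)
    idx--;
  return idx;
}
```

Because the mask includes the bit of the null CHAR itself, searching for the terminator still yields its position, which is what `strrchr (s, 0)` is expected to return.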


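The entry-point page-cross test (`andl $(PAGE_SIZE - 1)`, `cmpl $(PAGE_SIZE - VEC_SIZE)`, `jg`) and the `xorq %rdi, %rax` trick in `L(cross_page_boundary)` can be sketched in C as below. This is a minimal sketch: the helper names are hypothetical, and `VEC_SIZE` is assumed to be 32 (the evex256 build); it would be 64 for evex512.

```c
#include <stdint.h>

#define PAGE_SIZE 4096u
#define VEC_SIZE 32u /* Assumption: evex256 build.  */

/* Hypothetical helper: nonzero if an unaligned VEC_SIZE-byte load
   starting at addr would read past the end of addr's page.  */
static int
load_may_cross_page (uintptr_t addr)
{
  uint32_t page_offset = (uint32_t) addr & (PAGE_SIZE - 1);
  return page_offset > PAGE_SIZE - VEC_SIZE;
}

/* Hypothetical helper: address of the last VEC_SIZE-aligned vector in
   addr's page.  XOR-ing addr with its own page offset clears the offset
   bits, giving the page base.  */
static uintptr_t
last_vec_in_page (uintptr_t addr)
{
  uintptr_t page_offset = addr & (PAGE_SIZE - 1);
  return (addr ^ page_offset) + (PAGE_SIZE - VEC_SIZE);
}
```

A load from `last_vec_in_page (addr)` stays within the page, so it cannot fault on an unmapped following page; bits for characters before `addr` are then shifted out of the resulting mask.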

* Re: x86: Prepare `strrchr-evex` and `strrchr-evex512` for AVX10
  2023-10-04 18:48 ` Noah Goldstein
@ 2023-10-04 19:00   ` Sunil Pandey
  2023-10-18  9:18   ` Florian Weimer
  1 sibling, 0 replies; 12+ messages in thread
From: Sunil Pandey @ 2023-10-04 19:00 UTC (permalink / raw)
  To: Noah Goldstein; +Cc: libc-alpha, hjl.tools, carlos


On Wed, Oct 4, 2023 at 11:49 AM Noah Goldstein <goldstein.w.n@gmail.com>
wrote:

> This commit refactors `strrchr-evex` and `strrchr-evex512` to use a
> common implementation: `strrchr-evex-base.S`.
>
> The motivation is `strrchr-evex` needed to be refactored to not use
> 64-bit masked registers in preparation for AVX10.
>
> Once vec-width masked register combining was removed, the EVEX and
> EVEX512 implementations can easily be implemented in the same file
> without any major overhead.
>
> The net result is performance improvements (measured on TGL) for both
> `strrchr-evex` and `strrchr-evex512`. Note, however, that there are some
> regressions in the test suite, and many of the cases that make up the
> total geomean of improvement/regression across bench-strrchr may be
> cold. The point of the performance measurement is to show there
> are no major regressions, but the primary motivation is preparation
> for AVX10.
>
> Benchmarks were taken on TGL:
>
> https://www.intel.com/content/www/us/en/products/sku/213799/intel-core-i711850h-processor-24m-cache-up-to-4-80-ghz/specifications.html
>
> EVEX geometric_mean(N=5) of all benchmarks New / Original   : 0.74
> EVEX512 geometric_mean(N=5) of all benchmarks New / Original: 0.87
>
> Full check passes on x86.
> ---
>  sysdeps/x86_64/multiarch/strrchr-evex-base.S | 469 ++++++++++++-------
>  sysdeps/x86_64/multiarch/strrchr-evex.S      | 392 +---------------
>  sysdeps/x86_64/multiarch/wcsrchr-evex.S      |   1 +
>  3 files changed, 293 insertions(+), 569 deletions(-)
>
> diff --git a/sysdeps/x86_64/multiarch/strrchr-evex-base.S
> b/sysdeps/x86_64/multiarch/strrchr-evex-base.S
> index 58b2853ab6..cd6a0a870a 100644
> --- a/sysdeps/x86_64/multiarch/strrchr-evex-base.S
> +++ b/sysdeps/x86_64/multiarch/strrchr-evex-base.S
> @@ -1,4 +1,4 @@
> -/* Placeholder function, not used by any processor at the moment.
> +/* Implementation for strrchr using evex256 and evex512.
>     Copyright (C) 2022-2023 Free Software Foundation, Inc.
>     This file is part of the GNU C Library.
>
> @@ -16,8 +16,6 @@
>     License along with the GNU C Library; if not, see
>     <https://www.gnu.org/licenses/>.  */
>
> -/* UNUSED. Exists purely as reference implementation.  */
> -
>  #include <isa-level.h>
>
>  #if ISA_SHOULD_BUILD (4)
> @@ -25,240 +23,351 @@
>  # include <sysdep.h>
>
>  # ifdef USE_AS_WCSRCHR
> +#  if VEC_SIZE == 64
> +#   define RCX_M       cx
> +#   define KORTEST_M   kortestw
> +#  else
> +#   define RCX_M       cl
> +#   define KORTEST_M   kortestb
> +#  endif
> +
> +#  define SHIFT_REG    VRCX
>  #  define CHAR_SIZE    4
> -#  define VPBROADCAST   vpbroadcastd
> -#  define VPCMPEQ      vpcmpeqd
> -#  define VPMINU       vpminud
> +#  define VPCMP                vpcmpd
> +#  define VPMIN                vpminud
> +#  define VPCOMPRESS   vpcompressd
>  #  define VPTESTN      vptestnmd
> +#  define VPTEST       vptestmd
> +#  define VPBROADCAST  vpbroadcastd
> +#  define VPCMPEQ      vpcmpeqd
> +
>  # else
> +#  define SHIFT_REG    VRDI
>  #  define CHAR_SIZE    1
> -#  define VPBROADCAST   vpbroadcastb
> -#  define VPCMPEQ      vpcmpeqb
> -#  define VPMINU       vpminub
> +#  define VPCMP                vpcmpb
> +#  define VPMIN                vpminub
> +#  define VPCOMPRESS   vpcompressb
>  #  define VPTESTN      vptestnmb
> +#  define VPTEST       vptestmb
> +#  define VPBROADCAST  vpbroadcastb
> +#  define VPCMPEQ      vpcmpeqb
> +
> +#  define RCX_M                VRCX
> +#  define KORTEST_M    KORTEST
>  # endif
>
> -# define PAGE_SIZE     4096
> +# define VMATCH                VMM(0)
>  # define CHAR_PER_VEC  (VEC_SIZE / CHAR_SIZE)
> +# define PAGE_SIZE     4096
>
>         .section SECTION(.text), "ax", @progbits
> -/* Aligning entry point to 64 byte, provides better performance for
> -   one vector length string.  */
> -ENTRY_P2ALIGN (STRRCHR, 6)
> -
> -       /* Broadcast CHAR to VMM(0).  */
> -       VPBROADCAST %esi, %VMM(0)
> +       /* Aligning entry point to 64 byte, provides better performance for
> +          one vector length string.  */
> +ENTRY_P2ALIGN(STRRCHR, 6)
>         movl    %edi, %eax
> -       sall    $20, %eax
> -       cmpl    $((PAGE_SIZE - VEC_SIZE) << 20), %eax
> -       ja      L(page_cross)
> +       /* Broadcast CHAR to VMATCH.  */
> +       VPBROADCAST %esi, %VMATCH
>
> -L(page_cross_continue):
> -       /* Compare [w]char for null, mask bit will be set for match.  */
> -       VMOVU   (%rdi), %VMM(1)
> +       andl    $(PAGE_SIZE - 1), %eax
> +       cmpl    $(PAGE_SIZE - VEC_SIZE), %eax
> +       jg      L(cross_page_boundary)
>
> -       VPTESTN %VMM(1), %VMM(1), %k1
> -       KMOV    %k1, %VRCX
> -       test    %VRCX, %VRCX
> -       jz      L(align_more)
> -
> -       VPCMPEQ %VMM(1), %VMM(0), %k0
> -       KMOV    %k0, %VRAX
> -       BLSMSK  %VRCX, %VRCX
> -       and     %VRCX, %VRAX
> -       jz      L(ret)
> -
> -       BSR     %VRAX, %VRAX
> +       VMOVU   (%rdi), %VMM(1)
> +       /* k0 has a 1 for each zero CHAR in YMM1.  */
> +       VPTESTN %VMM(1), %VMM(1), %k0
> +       KMOV    %k0, %VGPR(rsi)
> +       test    %VGPR(rsi), %VGPR(rsi)
> +       jz      L(aligned_more)
> +       /* fallthrough: zero CHAR in first VEC.  */
> +L(page_cross_return):
> +       /* K1 has a 1 for each search CHAR match in VEC(1).  */
> +       VPCMPEQ %VMATCH, %VMM(1), %k1
> +       KMOV    %k1, %VGPR(rax)
> +       /* Build mask up until first zero CHAR (used to mask off
> +          potential search CHAR matches past the end of the string).  */
> +       blsmsk  %VGPR(rsi), %VGPR(rsi)
> +       /* Use `and` here to remove any out of bounds matches so we can
> +          do a reverse scan on `rax` to find the last match.  */
> +       and     %VGPR(rsi), %VGPR(rax)
> +       jz      L(ret0)
> +       /* Get last match.  */
> +       bsr     %VGPR(rax), %VGPR(rax)
>  # ifdef USE_AS_WCSRCHR
>         leaq    (%rdi, %rax, CHAR_SIZE), %rax
>  # else
> -       add     %rdi, %rax
> +       addq    %rdi, %rax
>  # endif
> -L(ret):
> +L(ret0):
>         ret
>
> -L(vector_x2_end):
> -       VPCMPEQ %VMM(2), %VMM(0), %k2
> -       KMOV    %k2, %VRAX
> -       BLSMSK  %VRCX, %VRCX
> -       and     %VRCX, %VRAX
> -       jz      L(vector_x1_ret)
> -
> -       BSR     %VRAX, %VRAX
> -       leaq    (VEC_SIZE)(%rdi, %rax, CHAR_SIZE), %rax
> -       ret
> -
> -       /* Check the first vector at very last to look for match.  */
> -L(vector_x1_ret):
> -       VPCMPEQ %VMM(1), %VMM(0), %k2
> -       KMOV    %k2, %VRAX
> -       test    %VRAX, %VRAX
> -       jz      L(ret)
> -
> -       BSR     %VRAX, %VRAX
> +       /* Returns for first vec x1/x2/x3 have hard coded backward
> +          search path for earlier matches.  */
> +       .p2align 4,, 6
> +L(first_vec_x1):
> +       VPCMPEQ %VMATCH, %VMM(2), %k1
> +       KMOV    %k1, %VGPR(rax)
> +       blsmsk  %VGPR(rcx), %VGPR(rcx)
> +       /* eax non-zero if search CHAR in range.  */
> +       and     %VGPR(rcx), %VGPR(rax)
> +       jnz     L(first_vec_x1_return)
> +
> +       /* fallthrough: no match in YMM2 then need to check for earlier
> +          matches (in YMM1).  */
> +       .p2align 4,, 4
> +L(first_vec_x0_test):
> +       VPCMPEQ %VMATCH, %VMM(1), %k1
> +       KMOV    %k1, %VGPR(rax)
> +       test    %VGPR(rax), %VGPR(rax)
> +       jz      L(ret1)
> +       bsr     %VGPR(rax), %VGPR(rax)
>  # ifdef USE_AS_WCSRCHR
>         leaq    (%rsi, %rax, CHAR_SIZE), %rax
>  # else
> -       add     %rsi, %rax
> +       addq    %rsi, %rax
>  # endif
> +L(ret1):
>         ret
>
> -L(align_more):
> -       /* Zero r8 to store match result.  */
> -       xorl    %r8d, %r8d
> -       /* Save pointer of first vector, in case if no match found.  */
> +       .p2align 4,, 10
> +L(first_vec_x3):
> +       VPCMPEQ %VMATCH, %VMM(4), %k1
> +       KMOV    %k1, %VGPR(rax)
> +       blsmsk  %VGPR(rcx), %VGPR(rcx)
> +       /* If no search CHAR match in range check YMM1/YMM2/YMM3.  */
> +       and     %VGPR(rcx), %VGPR(rax)
> +       jz      L(first_vec_x1_or_x2)
> +       bsr     %VGPR(rax), %VGPR(rax)
> +       leaq    (VEC_SIZE * 3)(%rdi, %rax, CHAR_SIZE), %rax
> +       ret
> +       .p2align 4,, 4
> +
> +L(first_vec_x2):
> +       VPCMPEQ %VMATCH, %VMM(3), %k1
> +       KMOV    %k1, %VGPR(rax)
> +       blsmsk  %VGPR(rcx), %VGPR(rcx)
> +       /* Check YMM3 for last match first. If no match try YMM2/YMM1.  */
> +       and     %VGPR(rcx), %VGPR(rax)
> +       jz      L(first_vec_x0_x1_test)
> +       bsr     %VGPR(rax), %VGPR(rax)
> +       leaq    (VEC_SIZE * 2)(%r8, %rax, CHAR_SIZE), %rax
> +       ret
> +
> +       .p2align 4,, 6
> +L(first_vec_x0_x1_test):
> +       VPCMPEQ %VMATCH, %VMM(2), %k1
> +       KMOV    %k1, %VGPR(rax)
> +       /* Check YMM2 for last match first. If no match try YMM1.  */
> +       test    %VGPR(rax), %VGPR(rax)
> +       jz      L(first_vec_x0_test)
> +       .p2align 4,, 4
> +L(first_vec_x1_return):
> +       bsr     %VGPR(rax), %VGPR(rax)
> +       leaq    (VEC_SIZE)(%r8, %rax, CHAR_SIZE), %rax
> +       ret
> +
> +       .p2align 4,, 12
> +L(aligned_more):
> +L(page_cross_continue):
> +       /* Need to keep original pointer in case VEC(1) has last match.  */
>         movq    %rdi, %rsi
> -       /* Align pointer to vector size.  */
>         andq    $-VEC_SIZE, %rdi
> -       /* Loop unroll for 2 vector loop.  */
> -       VMOVA   (VEC_SIZE)(%rdi), %VMM(2)
> +
> +       VMOVU   VEC_SIZE(%rdi), %VMM(2)
>         VPTESTN %VMM(2), %VMM(2), %k0
>         KMOV    %k0, %VRCX
> +       movq    %rdi, %r8
>         test    %VRCX, %VRCX
> -       jnz     L(vector_x2_end)
> +       jnz     L(first_vec_x1)
> +
> +       VMOVU   (VEC_SIZE * 2)(%rdi), %VMM(3)
> +       VPTESTN %VMM(3), %VMM(3), %k0
> +       KMOV    %k0, %VRCX
> +
> +       test    %VRCX, %VRCX
> +       jnz     L(first_vec_x2)
> +
> +       VMOVU   (VEC_SIZE * 3)(%rdi), %VMM(4)
> +       VPTESTN %VMM(4), %VMM(4), %k0
> +       KMOV    %k0, %VRCX
> +
> +       /* Intentionally use 64-bit here.  EVEX256 version needs 1-byte
> +          padding for efficient nop before loop alignment.  */
> +       test    %rcx, %rcx
> +       jnz     L(first_vec_x3)
>
> -       /* Save pointer of second vector, in case if no match
> -          found.  */
> -       movq    %rdi, %r9
> -       /* Align address to VEC_SIZE * 2 for loop.  */
>         andq    $-(VEC_SIZE * 2), %rdi
> +       .p2align 4
> +L(first_aligned_loop):
> +       /* Preserve VEC(1), VEC(2), VEC(3), and VEC(4) until we can
> +          guarantee they don't store a match.  */
> +       VMOVA   (VEC_SIZE * 4)(%rdi), %VMM(5)
> +       VMOVA   (VEC_SIZE * 5)(%rdi), %VMM(6)
>
> -       .p2align 4,,11
> -L(loop):
> -       /* 2 vector loop, as it provide better performance as compared
> -          to 4 vector loop.  */
> -       VMOVA   (VEC_SIZE * 2)(%rdi), %VMM(3)
> -       VMOVA   (VEC_SIZE * 3)(%rdi), %VMM(4)
> -       VPCMPEQ %VMM(3), %VMM(0), %k1
> -       VPCMPEQ %VMM(4), %VMM(0), %k2
> -       VPMINU  %VMM(3), %VMM(4), %VMM(5)
> -       VPTESTN %VMM(5), %VMM(5), %k0
> -       KOR     %k1, %k2, %k3
> -       subq    $-(VEC_SIZE * 2), %rdi
> -       /* If k0 and k3 zero, match and end of string not found.  */
> -       KORTEST %k0, %k3
> -       jz      L(loop)
> -
> -       /* If k0 is non zero, end of string found.  */
> -       KORTEST %k0, %k0
> -       jnz     L(endloop)
> -
> -       lea     VEC_SIZE(%rdi), %r8
> -       /* A match found, it need to be stored in r8 before loop
> -          continue.  */
> -       /* Check second vector first.  */
> -       KMOV    %k2, %VRDX
> -       test    %VRDX, %VRDX
> -       jnz     L(loop_vec_x2_match)
> +       VPCMP   $4, %VMM(5), %VMATCH, %k2
> +       VPCMP   $4, %VMM(6), %VMATCH, %k3{%k2}
>
> +       VPMIN   %VMM(5), %VMM(6), %VMM(7)
> +
> +       VPTEST  %VMM(7), %VMM(7), %k1{%k3}
> +       subq    $(VEC_SIZE * -2), %rdi
> +       KORTEST_M %k1, %k1
> +       jc      L(first_aligned_loop)
> +
> +       VPTESTN %VMM(7), %VMM(7), %k1
>         KMOV    %k1, %VRDX
> -       /* Match is in first vector, rdi offset need to be subtracted
> -         by VEC_SIZE.  */
> -       sub     $VEC_SIZE, %r8
> -
> -       /* If second vector doesn't have match, first vector must
> -          have match.  */
> -L(loop_vec_x2_match):
> -       BSR     %VRDX, %VRDX
> -# ifdef USE_AS_WCSRCHR
> -       sal     $2, %rdx
> -# endif
> -       add     %rdx, %r8
> -       jmp     L(loop)
> +       test    %VRDX, %VRDX
> +       jz      L(second_aligned_loop_prep)
>
> -L(endloop):
> -       /* Check if string end in first loop vector.  */
> -       VPTESTN %VMM(3), %VMM(3), %k0
> -       KMOV    %k0, %VRCX
> -       test    %VRCX, %VRCX
> -       jnz     L(loop_vector_x1_end)
> +       KORTEST_M %k3, %k3
> +       jnc     L(return_first_aligned_loop)
>
> -       /* Check if it has match in first loop vector.  */
> -       KMOV    %k1, %VRAX
> +       .p2align 4,, 6
> +L(first_vec_x1_or_x2_or_x3):
> +       VPCMPEQ %VMM(4), %VMATCH, %k4
> +       KMOV    %k4, %VRAX
>         test    %VRAX, %VRAX
> -       jz      L(loop_vector_x2_end)
> -
> -       BSR     %VRAX, %VRAX
> -       leaq    (%rdi, %rax, CHAR_SIZE), %r8
> +       jz      L(first_vec_x1_or_x2)
> +       bsr     %VRAX, %VRAX
> +       leaq    (VEC_SIZE * 3)(%r8, %rax, CHAR_SIZE), %rax
> +       ret
>
> -       /* String must end in second loop vector.  */
> -L(loop_vector_x2_end):
> -       VPTESTN %VMM(4), %VMM(4), %k0
> +       .p2align 4,, 8
> +L(return_first_aligned_loop):
> +       VPTESTN %VMM(5), %VMM(5), %k0
>         KMOV    %k0, %VRCX
> +       blsmsk  %VRCX, %VRCX
> +       jnc     L(return_first_new_match_first)
> +       blsmsk  %VRDX, %VRDX
> +       VPCMPEQ %VMM(6), %VMATCH, %k0
> +       KMOV    %k0, %VRAX
> +       addq    $VEC_SIZE, %rdi
> +       and     %VRDX, %VRAX
> +       jnz     L(return_first_new_match_ret)
> +       subq    $VEC_SIZE, %rdi
> +L(return_first_new_match_first):
>         KMOV    %k2, %VRAX
> -       BLSMSK  %VRCX, %VRCX
> -       /* Check if it has match in second loop vector.  */
> +# ifdef USE_AS_WCSRCHR
> +       xorl    $((1 << CHAR_PER_VEC)- 1), %VRAX
>         and     %VRCX, %VRAX
> -       jz      L(check_last_match)
> +# else
> +       andn    %VRCX, %VRAX, %VRAX
> +# endif
> +       jz      L(first_vec_x1_or_x2_or_x3)
> +L(return_first_new_match_ret):
> +       bsr     %VRAX, %VRAX
> +       leaq    (VEC_SIZE * 2)(%rdi, %rax, CHAR_SIZE), %rax
> +       ret
>
> -       BSR     %VRAX, %VRAX
> -       leaq    (VEC_SIZE)(%rdi, %rax, CHAR_SIZE), %rax
> +       .p2align 4,, 10
> +L(first_vec_x1_or_x2):
> +       VPCMPEQ %VMM(3), %VMATCH, %k3
> +       KMOV    %k3, %VRAX
> +       test    %VRAX, %VRAX
> +       jz      L(first_vec_x0_x1_test)
> +       bsr     %VRAX, %VRAX
> +       leaq    (VEC_SIZE * 2)(%r8, %rax, CHAR_SIZE), %rax
>         ret
>
> -       /* String end in first loop vector.  */
> -L(loop_vector_x1_end):
> -       KMOV    %k1, %VRAX
> -       BLSMSK  %VRCX, %VRCX
> -       /* Check if it has match in second loop vector.  */
> -       and     %VRCX, %VRAX
> -       jz      L(check_last_match)
> +       .p2align 4
> +       /* We can throw away the work done for the first 4x checks here
> +          as we have a later match. This is the 'fast' path per se.  */
> +L(second_aligned_loop_prep):
> +L(second_aligned_loop_set_furthest_match):
> +       movq    %rdi, %rsi
> +       VMOVA   %VMM(5), %VMM(7)
> +       VMOVA   %VMM(6), %VMM(8)
> +       .p2align 4
> +L(second_aligned_loop):
> +       VMOVU   (VEC_SIZE * 4)(%rdi), %VMM(5)
> +       VMOVU   (VEC_SIZE * 5)(%rdi), %VMM(6)
> +       VPCMP   $4, %VMM(5), %VMATCH, %k2
> +       VPCMP   $4, %VMM(6), %VMATCH, %k3{%k2}
> +
> +       VPMIN   %VMM(5), %VMM(6), %VMM(4)
> +
> +       VPTEST  %VMM(4), %VMM(4), %k1{%k3}
> +       subq    $(VEC_SIZE * -2), %rdi
> +       KMOV    %k1, %VRCX
> +       inc     %RCX_M
> +       jz      L(second_aligned_loop)
> +       VPTESTN %VMM(4), %VMM(4), %k1
> +       KMOV    %k1, %VRDX
> +       test    %VRDX, %VRDX
> +       jz      L(second_aligned_loop_set_furthest_match)
>
> -       BSR     %VRAX, %VRAX
> -       leaq    (%rdi, %rax, CHAR_SIZE), %rax
> -       ret
> +       KORTEST_M %k3, %k3
> +       jnc     L(return_new_match)
> +       /* branch here because there is a significant advantage in terms
> +          of output dependency chance in using edx.  */
>
> -       /* No match in first and second loop vector.  */
> -L(check_last_match):
> -       /* Check if any match recorded in r8.  */
> -       test    %r8, %r8
> -       jz      L(vector_x2_ret)
> -       movq    %r8, %rax
> +L(return_old_match):
> +       VPCMPEQ %VMM(8), %VMATCH, %k0
> +       KMOV    %k0, %VRCX
> +       bsr     %VRCX, %VRCX
> +       jnz     L(return_old_match_ret)
> +
> +       VPCMPEQ %VMM(7), %VMATCH, %k0
> +       KMOV    %k0, %VRCX
> +       bsr     %VRCX, %VRCX
> +       subq    $VEC_SIZE, %rsi
> +L(return_old_match_ret):
> +       leaq    (VEC_SIZE * 3)(%rsi, %rcx, CHAR_SIZE), %rax
>         ret
>
> -       /* No match recorded in r8. Check the second saved vector
> -          in beginning.  */
> -L(vector_x2_ret):
> -       VPCMPEQ %VMM(2), %VMM(0), %k2
> +L(return_new_match):
> +       VPTESTN %VMM(5), %VMM(5), %k0
> +       KMOV    %k0, %VRCX
> +       blsmsk  %VRCX, %VRCX
> +       jnc     L(return_new_match_first)
> +       dec     %VRDX
> +       VPCMPEQ %VMM(6), %VMATCH, %k0
> +       KMOV    %k0, %VRAX
> +       addq    $VEC_SIZE, %rdi
> +       and     %VRDX, %VRAX
> +       jnz     L(return_new_match_ret)
> +       subq    $VEC_SIZE, %rdi
> +L(return_new_match_first):
>         KMOV    %k2, %VRAX
> -       test    %VRAX, %VRAX
> -       jz      L(vector_x1_ret)
> -
> -       /* Match found in the second saved vector.  */
> -       BSR     %VRAX, %VRAX
> -       leaq    (VEC_SIZE)(%r9, %rax, CHAR_SIZE), %rax
> +# ifdef USE_AS_WCSRCHR
> +       xorl    $((1 << CHAR_PER_VEC)- 1), %VRAX
> +       and     %VRCX, %VRAX
> +# else
> +       andn    %VRCX, %VRAX, %VRAX
> +# endif
> +       jz      L(return_old_match)
> +L(return_new_match_ret):
> +       bsr     %VRAX, %VRAX
> +       leaq    (VEC_SIZE * 2)(%rdi, %rax, CHAR_SIZE), %rax
>         ret
>
> -L(page_cross):
> -       mov     %rdi, %rax
> -       movl    %edi, %ecx
> +       .p2align 4,, 4
> +L(cross_page_boundary):
> +       xorq    %rdi, %rax
> +       mov     $-1, %VRDX
> +       VMOVU   (PAGE_SIZE - VEC_SIZE)(%rax), %VMM(6)
> +       VPTESTN %VMM(6), %VMM(6), %k0
> +       KMOV    %k0, %VRSI
>
>  # ifdef USE_AS_WCSRCHR
> -       /* Calculate number of compare result bits to be skipped for
> -          wide string alignment adjustment.  */
> -       andl    $(VEC_SIZE - 1), %ecx
> -       sarl    $2, %ecx
> +       movl    %edi, %ecx
> +       and     $(VEC_SIZE - 1), %ecx
> +       shrl    $2, %ecx
>  # endif
> -       /* ecx contains number of w[char] to be skipped as a result
> -          of address alignment.  */
> -       andq    $-VEC_SIZE, %rax
> -       VMOVA   (%rax), %VMM(1)
> -       VPTESTN %VMM(1), %VMM(1), %k1
> -       KMOV    %k1, %VRAX
> -       SHR     %cl, %VRAX
> -       jz      L(page_cross_continue)
> -       VPCMPEQ %VMM(1), %VMM(0), %k0
> -       KMOV    %k0, %VRDX
> -       SHR     %cl, %VRDX
> -       BLSMSK  %VRAX, %VRAX
> -       and     %VRDX, %VRAX
> -       jz      L(ret)
> -       BSR     %VRAX, %VRAX
> +       shlx    %SHIFT_REG, %VRDX, %VRDX
> +
>  # ifdef USE_AS_WCSRCHR
> -       leaq    (%rdi, %rax, CHAR_SIZE), %rax
> +       kmovw   %edx, %k1
>  # else
> -       add     %rdi, %rax
> +       KMOV    %VRDX, %k1
>  # endif
>
> -       ret
> -END (STRRCHR)
> +       VPCOMPRESS %VMM(6), %VMM(1){%k1}{z}
> +       /* We could technically just jmp back after the vpcompress but
> +          it doesn't save any 16-byte blocks.  */
> +       shrx    %SHIFT_REG, %VRSI, %VRSI
> +       test    %VRSI, %VRSI
> +       jnz     L(page_cross_return)
> +       jmp     L(page_cross_continue)
> +       /* 1-byte from cache line.  */
> +END(STRRCHR)
>  #endif
> diff --git a/sysdeps/x86_64/multiarch/strrchr-evex.S
> b/sysdeps/x86_64/multiarch/strrchr-evex.S
> index 85e3b0119f..3bf6a51014 100644
> --- a/sysdeps/x86_64/multiarch/strrchr-evex.S
> +++ b/sysdeps/x86_64/multiarch/strrchr-evex.S
> @@ -1,394 +1,8 @@
> -/* strrchr/wcsrchr optimized with 256-bit EVEX instructions.
> -   Copyright (C) 2021-2023 Free Software Foundation, Inc.
> -   This file is part of the GNU C Library.
> -
> -   The GNU C Library is free software; you can redistribute it and/or
> -   modify it under the terms of the GNU Lesser General Public
> -   License as published by the Free Software Foundation; either
> -   version 2.1 of the License, or (at your option) any later version.
> -
> -   The GNU C Library is distributed in the hope that it will be useful,
> -   but WITHOUT ANY WARRANTY; without even the implied warranty of
> -   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> -   Lesser General Public License for more details.
> -
> -   You should have received a copy of the GNU Lesser General Public
> -   License along with the GNU C Library; if not, see
> -   <https://www.gnu.org/licenses/>.  */
> -
> -#include <isa-level.h>
> -
> -#if ISA_SHOULD_BUILD (4)
> -
> -# include <sysdep.h>
> -
>  # ifndef STRRCHR
>  #  define STRRCHR      __strrchr_evex
>  # endif
>
> -# include "x86-evex256-vecs.h"
> -
> -# ifdef USE_AS_WCSRCHR
> -#  define SHIFT_REG    rsi
> -#  define kunpck_2x    kunpckbw
> -#  define kmov_2x      kmovd
> -#  define maskz_2x     ecx
> -#  define maskm_2x     eax
> -#  define CHAR_SIZE    4
> -#  define VPMIN        vpminud
> -#  define VPTESTN      vptestnmd
> -#  define VPTEST       vptestmd
> -#  define VPBROADCAST  vpbroadcastd
> -#  define VPCMPEQ      vpcmpeqd
> -#  define VPCMP        vpcmpd
> -
> -#  define USE_WIDE_CHAR
> -# else
> -#  define SHIFT_REG    rdi
> -#  define kunpck_2x    kunpckdq
> -#  define kmov_2x      kmovq
> -#  define maskz_2x     rcx
> -#  define maskm_2x     rax
> -
> -#  define CHAR_SIZE    1
> -#  define VPMIN        vpminub
> -#  define VPTESTN      vptestnmb
> -#  define VPTEST       vptestmb
> -#  define VPBROADCAST  vpbroadcastb
> -#  define VPCMPEQ      vpcmpeqb
> -#  define VPCMP        vpcmpb
> -# endif
> -
> -# include "reg-macros.h"
> -
> -# define VMATCH        VMM(0)
> -# define CHAR_PER_VEC  (VEC_SIZE / CHAR_SIZE)
> -# define PAGE_SIZE     4096
> -
> -       .section SECTION(.text), "ax", @progbits
> -ENTRY_P2ALIGN(STRRCHR, 6)
> -       movl    %edi, %eax
> -       /* Broadcast CHAR to VMATCH.  */
> -       VPBROADCAST %esi, %VMATCH
> -
> -       andl    $(PAGE_SIZE - 1), %eax
> -       cmpl    $(PAGE_SIZE - VEC_SIZE), %eax
> -       jg      L(cross_page_boundary)
> -L(page_cross_continue):
> -       VMOVU   (%rdi), %VMM(1)
> -       /* k0 has a 1 for each zero CHAR in VEC(1).  */
> -       VPTESTN %VMM(1), %VMM(1), %k0
> -       KMOV    %k0, %VRSI
> -       test    %VRSI, %VRSI
> -       jz      L(aligned_more)
> -       /* fallthrough: zero CHAR in first VEC.  */
> -       /* K1 has a 1 for each search CHAR match in VEC(1).  */
> -       VPCMPEQ %VMATCH, %VMM(1), %k1
> -       KMOV    %k1, %VRAX
> -       /* Build mask up until first zero CHAR (used to mask of
> -          potential search CHAR matches past the end of the string).
> -        */
> -       blsmsk  %VRSI, %VRSI
> -       and     %VRSI, %VRAX
> -       jz      L(ret0)
> -       /* Get last match (the `and` removed any out of bounds matches).
> -        */
> -       bsr     %VRAX, %VRAX
> -# ifdef USE_AS_WCSRCHR
> -       leaq    (%rdi, %rax, CHAR_SIZE), %rax
> -# else
> -       addq    %rdi, %rax
> -# endif
> -L(ret0):
> -       ret
> -
> -       /* Returns for first vec x1/x2/x3 have hard coded backward
> -          search path for earlier matches.  */
> -       .p2align 4,, 6
> -L(first_vec_x1):
> -       VPCMPEQ %VMATCH, %VMM(2), %k1
> -       KMOV    %k1, %VRAX
> -       blsmsk  %VRCX, %VRCX
> -       /* eax non-zero if search CHAR in range.  */
> -       and     %VRCX, %VRAX
> -       jnz     L(first_vec_x1_return)
> -
> -       /* fallthrough: no match in VEC(2) then need to check for
> -          earlier matches (in VEC(1)).  */
> -       .p2align 4,, 4
> -L(first_vec_x0_test):
> -       VPCMPEQ %VMATCH, %VMM(1), %k1
> -       KMOV    %k1, %VRAX
> -       test    %VRAX, %VRAX
> -       jz      L(ret1)
> -       bsr     %VRAX, %VRAX
> -# ifdef USE_AS_WCSRCHR
> -       leaq    (%rsi, %rax, CHAR_SIZE), %rax
> -# else
> -       addq    %rsi, %rax
> -# endif
> -L(ret1):
> -       ret
> -
> -       .p2align 4,, 10
> -L(first_vec_x1_or_x2):
> -       VPCMPEQ %VMM(3), %VMATCH, %k3
> -       VPCMPEQ %VMM(2), %VMATCH, %k2
> -       /* K2 and K3 have 1 for any search CHAR match. Test if any
> -          matches between either of them. Otherwise check VEC(1).  */
> -       KORTEST %k2, %k3
> -       jz      L(first_vec_x0_test)
> -
> -       /* Guaranteed that VEC(2) and VEC(3) are within range so merge
> -          the two bitmasks then get last result.  */
> -       kunpck_2x %k2, %k3, %k3
> -       kmov_2x %k3, %maskm_2x
> -       bsr     %maskm_2x, %maskm_2x
> -       leaq    (VEC_SIZE * 1)(%r8, %rax, CHAR_SIZE), %rax
> -       ret
> -
> -       .p2align 4,, 7
> -L(first_vec_x3):
> -       VPCMPEQ %VMATCH, %VMM(4), %k1
> -       KMOV    %k1, %VRAX
> -       blsmsk  %VRCX, %VRCX
> -       /* If no search CHAR match in range check VEC(1)/VEC(2)/VEC(3).
> -        */
> -       and     %VRCX, %VRAX
> -       jz      L(first_vec_x1_or_x2)
> -       bsr     %VRAX, %VRAX
> -       leaq    (VEC_SIZE * 3)(%rdi, %rax, CHAR_SIZE), %rax
> -       ret
> -
> -
> -       .p2align 4,, 6
> -L(first_vec_x0_x1_test):
> -       VPCMPEQ %VMATCH, %VMM(2), %k1
> -       KMOV    %k1, %VRAX
> -       /* Check VEC(2) for last match first. If no match try VEC(1).
> -        */
> -       test    %VRAX, %VRAX
> -       jz      L(first_vec_x0_test)
> -       .p2align 4,, 4
> -L(first_vec_x1_return):
> -       bsr     %VRAX, %VRAX
> -       leaq    (VEC_SIZE)(%rdi, %rax, CHAR_SIZE), %rax
> -       ret
> -
> -
> -       .p2align 4,, 10
> -L(first_vec_x2):
> -       VPCMPEQ %VMATCH, %VMM(3), %k1
> -       KMOV    %k1, %VRAX
> -       blsmsk  %VRCX, %VRCX
> -       /* Check VEC(3) for last match first. If no match try
> -          VEC(2)/VEC(1).  */
> -       and     %VRCX, %VRAX
> -       jz      L(first_vec_x0_x1_test)
> -       bsr     %VRAX, %VRAX
> -       leaq    (VEC_SIZE * 2)(%rdi, %rax, CHAR_SIZE), %rax
> -       ret
> -
> -
> -       .p2align 4,, 12
> -L(aligned_more):
> -       /* Need to keep original pointer in case VEC(1) has last match.
> -        */
> -       movq    %rdi, %rsi
> -       andq    $-VEC_SIZE, %rdi
> -
> -       VMOVU   VEC_SIZE(%rdi), %VMM(2)
> -       VPTESTN %VMM(2), %VMM(2), %k0
> -       KMOV    %k0, %VRCX
> -
> -       test    %VRCX, %VRCX
> -       jnz     L(first_vec_x1)
> -
> -       VMOVU   (VEC_SIZE * 2)(%rdi), %VMM(3)
> -       VPTESTN %VMM(3), %VMM(3), %k0
> -       KMOV    %k0, %VRCX
> -
> -       test    %VRCX, %VRCX
> -       jnz     L(first_vec_x2)
> -
> -       VMOVU   (VEC_SIZE * 3)(%rdi), %VMM(4)
> -       VPTESTN %VMM(4), %VMM(4), %k0
> -       KMOV    %k0, %VRCX
> -       movq    %rdi, %r8
> -       test    %VRCX, %VRCX
> -       jnz     L(first_vec_x3)
> -
> -       andq    $-(VEC_SIZE * 2), %rdi
> -       .p2align 4,, 10
> -L(first_aligned_loop):
> -       /* Preserve VEC(1), VEC(2), VEC(3), and VEC(4) until we can
> -          guarantee they don't store a match.  */
> -       VMOVA   (VEC_SIZE * 4)(%rdi), %VMM(5)
> -       VMOVA   (VEC_SIZE * 5)(%rdi), %VMM(6)
> -
> -       VPCMPEQ %VMM(5), %VMATCH, %k2
> -       vpxord  %VMM(6), %VMATCH, %VMM(7)
> -
> -       VPMIN   %VMM(5), %VMM(6), %VMM(8)
> -       VPMIN   %VMM(8), %VMM(7), %VMM(7)
> -
> -       VPTESTN %VMM(7), %VMM(7), %k1
> -       subq    $(VEC_SIZE * -2), %rdi
> -       KORTEST %k1, %k2
> -       jz      L(first_aligned_loop)
> -
> -       VPCMPEQ %VMM(6), %VMATCH, %k3
> -       VPTESTN %VMM(8), %VMM(8), %k1
> -
> -       /* If k1 is zero, then we found a CHAR match but no null-term.
> -          We can now safely throw out VEC1-4.  */
> -       KTEST   %k1, %k1
> -       jz      L(second_aligned_loop_prep)
> -
> -       KORTEST %k2, %k3
> -       jnz     L(return_first_aligned_loop)
> -
> -
> -       .p2align 4,, 6
> -L(first_vec_x1_or_x2_or_x3):
> -       VPCMPEQ %VMM(4), %VMATCH, %k4
> -       KMOV    %k4, %VRAX
> -       bsr     %VRAX, %VRAX
> -       jz      L(first_vec_x1_or_x2)
> -       leaq    (VEC_SIZE * 3)(%r8, %rax, CHAR_SIZE), %rax
> -       ret
> -
> -
> -       .p2align 4,, 8
> -L(return_first_aligned_loop):
> -       VPTESTN %VMM(5), %VMM(5), %k0
> -
> -       /* Combined results from VEC5/6.  */
> -       kunpck_2x %k0, %k1, %k0
> -       kmov_2x %k0, %maskz_2x
> -
> -       blsmsk  %maskz_2x, %maskz_2x
> -       kunpck_2x %k2, %k3, %k3
> -       kmov_2x %k3, %maskm_2x
> -       and     %maskz_2x, %maskm_2x
> -       jz      L(first_vec_x1_or_x2_or_x3)
> -
> -       bsr     %maskm_2x, %maskm_2x
> -       leaq    (VEC_SIZE * 2)(%rdi, %rax, CHAR_SIZE), %rax
> -       ret
> -
> -       .p2align 4
> -       /* We can throw away the work done for the first 4x checks here
> -          as we have a later match. This is the 'fast' path persay.
> -        */
> -L(second_aligned_loop_prep):
> -L(second_aligned_loop_set_furthest_match):
> -       movq    %rdi, %rsi
> -       /* Ideally we would safe k2/k3 but `kmov/kunpck` take uops on
> -          port0 and have noticeable overhead in the loop.  */
> -       VMOVA   %VMM(5), %VMM(7)
> -       VMOVA   %VMM(6), %VMM(8)
> -       .p2align 4
> -L(second_aligned_loop):
> -       VMOVU   (VEC_SIZE * 4)(%rdi), %VMM(5)
> -       VMOVU   (VEC_SIZE * 5)(%rdi), %VMM(6)
> -       VPCMPEQ %VMM(5), %VMATCH, %k2
> -       vpxord  %VMM(6), %VMATCH, %VMM(3)
> -
> -       VPMIN   %VMM(5), %VMM(6), %VMM(4)
> -       VPMIN   %VMM(3), %VMM(4), %VMM(3)
> -
> -       VPTESTN %VMM(3), %VMM(3), %k1
> -       subq    $(VEC_SIZE * -2), %rdi
> -       KORTEST %k1, %k2
> -       jz      L(second_aligned_loop)
> -       VPCMPEQ %VMM(6), %VMATCH, %k3
> -       VPTESTN %VMM(4), %VMM(4), %k1
> -       KTEST   %k1, %k1
> -       jz      L(second_aligned_loop_set_furthest_match)
> -
> -       /* branch here because we know we have a match in VEC7/8 but
> -          might not in VEC5/6 so the latter is expected to be less
> -          likely.  */
> -       KORTEST %k2, %k3
> -       jnz     L(return_new_match)
> -
> -L(return_old_match):
> -       VPCMPEQ %VMM(8), %VMATCH, %k0
> -       KMOV    %k0, %VRCX
> -       bsr     %VRCX, %VRCX
> -       jnz     L(return_old_match_ret)
> -
> -       VPCMPEQ %VMM(7), %VMATCH, %k0
> -       KMOV    %k0, %VRCX
> -       bsr     %VRCX, %VRCX
> -       subq    $VEC_SIZE, %rsi
> -L(return_old_match_ret):
> -       leaq    (VEC_SIZE * 3)(%rsi, %rcx, CHAR_SIZE), %rax
> -       ret
> -
> -       .p2align 4,, 10
> -L(return_new_match):
> -       VPTESTN %VMM(5), %VMM(5), %k0
> -
> -       /* Combined results from VEC5/6.  */
> -       kunpck_2x %k0, %k1, %k0
> -       kmov_2x %k0, %maskz_2x
> -
> -       blsmsk  %maskz_2x, %maskz_2x
> -       kunpck_2x %k2, %k3, %k3
> -       kmov_2x %k3, %maskm_2x
> -
> -       /* Match at end was out-of-bounds so use last known match.  */
> -       and     %maskz_2x, %maskm_2x
> -       jz      L(return_old_match)
> -
> -       bsr     %maskm_2x, %maskm_2x
> -       leaq    (VEC_SIZE * 2)(%rdi, %rax, CHAR_SIZE), %rax
> -       ret
> -
> -L(cross_page_boundary):
> -       /* eax contains all the page offset bits of src (rdi). `xor rdi,
> -          rax` sets pointer will all page offset bits cleared so
> -          offset of (PAGE_SIZE - VEC_SIZE) will get last aligned VEC
> -          before page cross (guaranteed to be safe to read). Doing this
> -          as opposed to `movq %rdi, %rax; andq $-VEC_SIZE, %rax` saves
> -          a bit of code size.  */
> -       xorq    %rdi, %rax
> -       VMOVU   (PAGE_SIZE - VEC_SIZE)(%rax), %VMM(1)
> -       VPTESTN %VMM(1), %VMM(1), %k0
> -       KMOV    %k0, %VRCX
> -
> -       /* Shift out zero CHAR matches that are before the beginning of
> -          src (rdi).  */
> -# ifdef USE_AS_WCSRCHR
> -       movl    %edi, %esi
> -       andl    $(VEC_SIZE - 1), %esi
> -       shrl    $2, %esi
> -# endif
> -       shrx    %VGPR(SHIFT_REG), %VRCX, %VRCX
> -
> -       test    %VRCX, %VRCX
> -       jz      L(page_cross_continue)
> +#include "x86-evex256-vecs.h"
> +#include "reg-macros.h"
>
> -       /* Found zero CHAR so need to test for search CHAR.  */
> -       VPCMP   $0, %VMATCH, %VMM(1), %k1
> -       KMOV    %k1, %VRAX
> -       /* Shift out search CHAR matches that are before the beginning of
> -          src (rdi).  */
> -       shrx    %VGPR(SHIFT_REG), %VRAX, %VRAX
> -
> -       /* Check if any search CHAR match in range.  */
> -       blsmsk  %VRCX, %VRCX
> -       and     %VRCX, %VRAX
> -       jz      L(ret3)
> -       bsr     %VRAX, %VRAX
> -# ifdef USE_AS_WCSRCHR
> -       leaq    (%rdi, %rax, CHAR_SIZE), %rax
> -# else
> -       addq    %rdi, %rax
> -# endif
> -L(ret3):
> -       ret
> -END(STRRCHR)
> -#endif
> +#include "strrchr-evex-base.S"
> diff --git a/sysdeps/x86_64/multiarch/wcsrchr-evex.S b/sysdeps/x86_64/multiarch/wcsrchr-evex.S
> index e5c5fe3bf2..a584cd3f43 100644
> --- a/sysdeps/x86_64/multiarch/wcsrchr-evex.S
> +++ b/sysdeps/x86_64/multiarch/wcsrchr-evex.S
> @@ -4,4 +4,5 @@
>
>  #define STRRCHR        WCSRCHR
>  #define USE_AS_WCSRCHR 1
> +#define USE_WIDE_CHAR 1
>  #include "strrchr-evex.S"
> --
> 2.34.1
>
>
LGTM

Reviewed-by: Sunil K Pandey <skpgkp2@gmail.com>

* Re: x86: Prepare `strrchr-evex` and `strrchr-evex512` for AVX10
  2023-10-04 18:48 ` Noah Goldstein
  2023-10-04 19:00   ` Sunil Pandey
@ 2023-10-18  9:18   ` Florian Weimer
  2023-11-01 21:04     ` Florian Weimer
  1 sibling, 1 reply; 12+ messages in thread
From: Florian Weimer @ 2023-10-18  9:18 UTC (permalink / raw)
  To: Noah Goldstein; +Cc: libc-alpha, hjl.tools, carlos

* Noah Goldstein:

> This commit refactors `strrchr-evex` and `strrchr-evex512` to use a
> common implementation: `strrchr-evex-base.S`.
>
> The motivation is that `strrchr-evex` needed to be refactored to not
> use 64-bit mask registers in preparation for AVX10.
>
> Once vec-width mask register combining was removed, the EVEX and
> EVEX512 implementations could easily be implemented in the same file
> without any major overhead.
>
> The net result is performance improvements (measured on TGL) for both
> `strrchr-evex` and `strrchr-evex512`. Note, however, that there are
> some regressions in the test suite, and many of the cases that make
> up the total geomean of improvement/regression across bench-strrchr
> may be cold. The point of the performance measurement is to show
> there are no major regressions; the primary motivation is
> preparation for AVX10.
>
> Benchmarks were taken on TGL:
> https://www.intel.com/content/www/us/en/products/sku/213799/intel-core-i711850h-processor-24m-cache-up-to-4-80-ghz/specifications.html
>
> EVEX geometric_mean(N=5) of all benchmarks New / Original   : 0.74
> EVEX512 geometric_mean(N=5) of all benchmarks New / Original: 0.87
>
> Full check passes on x86.
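
For readers following the mask-register point in the quoted commit message, here is a minimal scalar sketch in C of the two shapes involved. It assumes VEC_SIZE = 32 and byte characters, so each vector yields a 32-bit mask. The function names are hypothetical and this is only a model of the idea, not the code in `strrchr-evex-base.S`: the first function merges two per-vector masks into one 64-bit value (roughly what `kunpckdq` + `kmovq` did in the removed code), while the second keeps each mask in 32 bits and should return the same index.

```c
#include <assert.h>
#include <stdint.h>

/* Index of the highest set bit, or -1 if mask is zero (models `bsr`).  */
static int
last_bit (uint64_t mask)
{
  int idx = -1;
  while (mask != 0)
    {
      idx++;
      mask >>= 1;
    }
  return idx;
}

/* Old shape: combine two 32-bit masks into one 64-bit mask, keep only
   matches at or before the first zero CHAR (models `blsmsk` + `and`),
   then take the last remaining match.  */
static int
last_match_combined (uint32_t match_lo, uint32_t match_hi,
		     uint32_t zero_lo, uint32_t zero_hi)
{
  uint64_t match = ((uint64_t) match_hi << 32) | match_lo;
  uint64_t zero = ((uint64_t) zero_hi << 32) | zero_lo;
  if (zero != 0)
    match &= zero ^ (zero - 1);
  return last_bit (match);
}

/* Hypothetical 32-bit-only shape: handle each vector's mask on its
   own, so no 64-bit mask value is ever needed.  */
static int
last_match_split (uint32_t match_lo, uint32_t match_hi,
		  uint32_t zero_lo, uint32_t zero_hi)
{
  if (zero_lo != 0)
    /* String ends in the first vector; the second one is out of range.  */
    return last_bit (match_lo & (zero_lo ^ (zero_lo - 1)));
  if (zero_hi != 0)
    match_hi &= zero_hi ^ (zero_hi - 1);
  if (match_hi != 0)
    return 32 + last_bit (match_hi);
  return last_bit (match_lo);
}
```

Both functions return a character index relative to the first vector, or -1 when there is no in-range match.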

I believe this caused some sort of regression because when we upgraded
glibc in the Fedora rawhide buildroot, a lot of things started failing:

  glibc-2.38.9000-13.fc40 broke rawhide buildroot on x86_64
  <https://bugzilla.redhat.com/show_bug.cgi?id=2244688>

The list of changes relative to the previous version is rather short:

- stdlib: fix grouping verification with multi-byte thousands separator (bug 30964)
- build-many-glibcs: Check for required system tools
- x86: Prepare `strrchr-evex` and `strrchr-evex512` for AVX10
- aarch64: Optimise vecmath logs
- aarch64: Cosmetic change in SVE exp routines
- aarch64: Optimize SVE cos & cosf
- aarch64: Improve vecmath sin routines
- nss: Get rid of alloca usage in makedb's write_output.
- debug: Add regression tests for BZ 30932
- Fix FORTIFY_SOURCE false positive
- nss: Rearrange and sort Makefile variables
- inet: Rearrange and sort Makefile variables
- Fix off-by-one OOB write in iconv/tst-iconv-mt

And this patch is the most likely one to cause issues.  I will try to
revert the patch and see if it fixes the observed issues.

Thanks,
Florian


* Re: x86: Prepare `strrchr-evex` and `strrchr-evex512` for AVX10
  2023-10-18  9:18   ` Florian Weimer
@ 2023-11-01 21:04     ` Florian Weimer
  2023-11-01 21:11       ` Noah Goldstein
  0 siblings, 1 reply; 12+ messages in thread
From: Florian Weimer @ 2023-11-01 21:04 UTC (permalink / raw)
  To: Florian Weimer
  Cc: Noah Goldstein, libc-alpha, hjl.tools, carlos, Sunil Pandey

* Florian Weimer:

> And this patch is the most likely one to cause issues.  I will try to
> revert the patch and see if it fixes the observed issues.

We did the revert and the issues were gone.  So I think this commit is
faulty.

* Re: x86: Prepare `strrchr-evex` and `strrchr-evex512` for AVX10
  2023-11-01 21:04     ` Florian Weimer
@ 2023-11-01 21:11       ` Noah Goldstein
  2023-11-01 21:22         ` Noah Goldstein
  0 siblings, 1 reply; 12+ messages in thread
From: Noah Goldstein @ 2023-11-01 21:11 UTC (permalink / raw)
  To: Florian Weimer
  Cc: Florian Weimer, libc-alpha, hjl.tools, carlos, Sunil Pandey

On Wed, Nov 1, 2023 at 4:04 PM Florian Weimer <fw@deneb.enyo.de> wrote:
>
> We did the revert and the issues were gone.  So I think this commit is
> faulty.

Bah, didn't see your last email.
Thank you for reverting. Will look into the issue.

* Re: x86: Prepare `strrchr-evex` and `strrchr-evex512` for AVX10
  2023-11-01 21:11       ` Noah Goldstein
@ 2023-11-01 21:22         ` Noah Goldstein
  2023-11-01 22:17           ` Noah Goldstein
  2023-11-02  6:44           ` Florian Weimer
  0 siblings, 2 replies; 12+ messages in thread
From: Noah Goldstein @ 2023-11-01 21:22 UTC (permalink / raw)
  To: Florian Weimer
  Cc: Florian Weimer, libc-alpha, hjl.tools, carlos, Sunil Pandey

On Wed, Nov 1, 2023 at 4:11 PM Noah Goldstein <goldstein.w.n@gmail.com> wrote:
>
> On Wed, Nov 1, 2023 at 4:04 PM Florian Weimer <fw@deneb.enyo.de> wrote:
> >
> > We did the revert and the issues were gone.  So I think this commit is
> > faulty.
>
> Bah, didn't see your last email.
> Thank you for reverting. Will look into the issue.

Okay, the bug is a missing VBMI2 check. The VBMI2 code isn't really
needed, though, so I will update and repost with the ISA requirements fixed.
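
As an illustration of the kind of check that was missing, the following is a minimal sketch in C, assuming GCC or Clang's `<cpuid.h>` on x86. The helper names are hypothetical, and glibc's own selection goes through its internal CPU feature machinery rather than code like this. AVX512_VBMI2 is reported in CPUID leaf 7, subleaf 0, ECX bit 6. Note that the CPUID bit alone does not show that the OS has enabled AVX-512 state, so a real dispatcher needs more than this.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#if defined (__x86_64__) || defined (__i386__)
# include <cpuid.h>
#endif

/* CPUID.(EAX=7,ECX=0):ECX bit 6 reports AVX512_VBMI2.  */
#define CPUID7_ECX_AVX512_VBMI2 (UINT32_C (1) << 6)

/* Pure helper: decode the feature bit from a leaf-7 ECX value.  */
static bool
ecx_has_avx512_vbmi2 (uint32_t leaf7_ecx)
{
  return (leaf7_ecx & CPUID7_ECX_AVX512_VBMI2) != 0;
}

/* Hypothetical runtime query.  Returns false on non-x86 targets or
   when CPUID leaf 7 is unavailable.  Does not check OS support.  */
static bool
cpu_reports_avx512_vbmi2 (void)
{
#if defined (__x86_64__) || defined (__i386__)
  unsigned int eax, ebx, ecx, edx;
  if (!__get_cpuid_count (7, 0, &eax, &ebx, &ecx, &edx))
    return false;
  return ecx_has_avx512_vbmi2 (ecx);
#else
  return false;
#endif
}
```

An implementation that uses a VBMI2 instruction would only be selected when a check of this kind succeeds; otherwise the dispatcher would fall back to a variant that does not need it.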

* Re: x86: Prepare `strrchr-evex` and `strrchr-evex512` for AVX10
  2023-11-01 21:22         ` Noah Goldstein
@ 2023-11-01 22:17           ` Noah Goldstein
  2023-11-02  6:44           ` Florian Weimer
  1 sibling, 0 replies; 12+ messages in thread
From: Noah Goldstein @ 2023-11-01 22:17 UTC (permalink / raw)
  To: Florian Weimer
  Cc: Florian Weimer, libc-alpha, hjl.tools, carlos, Sunil Pandey

On Wed, Nov 1, 2023 at 4:22 PM Noah Goldstein <goldstein.w.n@gmail.com> wrote:
>
> On Wed, Nov 1, 2023 at 4:11 PM Noah Goldstein <goldstein.w.n@gmail.com> wrote:
> >
> > On Wed, Nov 1, 2023 at 4:04 PM Florian Weimer <fw@deneb.enyo.de> wrote:
> > >
> > > We did the revert and the issues were gone.  So I think this commit is
> > > faulty.
> >
> > Bah, didn't see your last email.
> > Thank you for reverting. Will look into the issue.
>
> Okay, the bug is a missing VBMI2 check. The VBMI2 code isn't really
> needed, though, so I will update and repost with the ISA requirements fixed.

Posted fix.

* Re: x86: Prepare `strrchr-evex` and `strrchr-evex512` for AVX10
  2023-11-01 21:22         ` Noah Goldstein
  2023-11-01 22:17           ` Noah Goldstein
@ 2023-11-02  6:44           ` Florian Weimer
  1 sibling, 0 replies; 12+ messages in thread
From: Florian Weimer @ 2023-11-02  6:44 UTC (permalink / raw)
  To: Noah Goldstein
  Cc: Florian Weimer, libc-alpha, hjl.tools, carlos, Sunil Pandey

* Noah Goldstein:

> Okay, the bug is a missing VBMI2 check. The VBMI2 code isn't really
> needed, though, so I will update and repost with the ISA requirements fixed.

Thanks.  The missing check explains why I couldn't reproduce it on the
lab machines I tried, which must have had VBMI2.  The Fedora builders
apparently don't.

Thread overview: 12+ messages
2023-09-21 14:38 x86: Prepare `strrchr-evex` and `strrchr-evex512` for AVX10 Noah Goldstein
2023-09-21 14:39 ` Noah Goldstein
2023-09-21 15:16   ` H.J. Lu
2023-09-21 19:19     ` Noah Goldstein
2023-10-04 18:48 ` Noah Goldstein
2023-10-04 19:00   ` Sunil Pandey
2023-10-18  9:18   ` Florian Weimer
2023-11-01 21:04     ` Florian Weimer
2023-11-01 21:11       ` Noah Goldstein
2023-11-01 21:22         ` Noah Goldstein
2023-11-01 22:17           ` Noah Goldstein
2023-11-02  6:44           ` Florian Weimer
