public inbox for libc-alpha@sourceware.org
* [PATCH] aarch64: fix strcpy and strnlen for big-endian
@ 2020-05-15  2:12 Lexi Shao
  2020-05-15 10:03 ` Szabolcs Nagy
  0 siblings, 1 reply; 3+ messages in thread
From: Lexi Shao @ 2020-05-15  2:12 UTC
  To: libc-alpha; +Cc: zhangxuelei4, liucheng32, shaolexi

This patch fixes the optimized implementation of strcpy and strnlen
on a big-endian arm64 machine.

The optimized method uses NEON, which can process 128 bits with one
instruction. On a big-endian machine, the byte order should be reversed
across the whole 128-bit quadword. But the instruction
	rev64	datav.16b, datav.16b
reverses the bytes within each 64-bit half rather than across all 128 bits.
There is no rev128 instruction to reverse all 128 bits, but we can fix this
by moving the two halves into the data registers in swapped order.

Fixes 0237b61526e7("aarch64: Optimized implementation of strcpy") and
2911cb68ed3d("aarch64: Optimized implementation of strnlen").

Signed-off-by: Lexi Shao <shaolexi@huawei.com>
---
 sysdeps/aarch64/strcpy.S  | 5 +++++
 sysdeps/aarch64/strnlen.S | 5 +++++
 2 files changed, 10 insertions(+)
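
Illustration only, not part of the patch: the lane order described above can
be checked with a small C model. This is a sketch under assumptions. The
names vec128, load_q, rev64_16b, cmeq_zero and first_nul are hypothetical and
do not exist in glibc, and the rev + clz tail is assumed to match what
follows the changed lines in both routines.

```c
#include <stdint.h>

/* Hypothetical model of a 128-bit vector register as two 64-bit lanes. */
typedef struct { uint64_t d[2]; } vec128;

/* Byte-swap a 64-bit value, like a scalar `rev`. */
static uint64_t bswap64(uint64_t x)
{
    uint64_t r = 0;
    for (int i = 0; i < 8; i++) {
        r = (r << 8) | (x & 0xff);
        x >>= 8;
    }
    return r;
}

/* Count leading zero bits, like `clz` (returns 64 for x == 0). */
static int clz64(uint64_t x)
{
    int n = 0;
    while (n < 64 && !(x >> 63)) {
        x <<= 1;
        n++;
    }
    return n;
}

/* Model of a 16-byte `ldr q` load.  Little-endian: memory byte 0 lands in
   the least significant byte of d[0].  Big-endian: the 16 bytes form one
   big-endian integer, so memory byte 0 lands in the most significant byte
   of d[1].  */
static vec128 load_q(const uint8_t *p, int big_endian)
{
    vec128 v = { { 0, 0 } };
    for (int i = 0; i < 16; i++) {
        int bit = big_endian ? 8 * (15 - i) : 8 * i;
        v.d[bit / 64] |= (uint64_t) p[i] << (bit % 64);
    }
    return v;
}

/* Model of `rev64 v.16b, v.16b`: reverse bytes within each 64-bit lane. */
static vec128 rev64_16b(vec128 v)
{
    v.d[0] = bswap64(v.d[0]);
    v.d[1] = bswap64(v.d[1]);
    return v;
}

/* Model of `cmeq v.16b, v.16b, #0`: a zero byte becomes 0xff, else 0. */
static vec128 cmeq_zero(vec128 v)
{
    vec128 r = { { 0, 0 } };
    for (int lane = 0; lane < 2; lane++)
        for (int b = 0; b < 8; b++)
            if (((v.d[lane] >> (8 * b)) & 0xff) == 0)
                r.d[lane] |= (uint64_t) 0xff << (8 * b);
    return r;
}

/* Offset of the first NUL in a 16-byte block.  swap_lanes selects the
   patched big-endian order (data1 = d[1], data2 = d[0]).  The rev + clz
   tail is an assumption about the surrounding code.  */
int first_nul(const uint8_t *p, int big_endian, int swap_lanes)
{
    vec128 v = load_q(p, big_endian);
    if (big_endian)
        v = rev64_16b(v);
    v = cmeq_zero(v);
    uint64_t data1 = swap_lanes ? v.d[1] : v.d[0];
    uint64_t data2 = swap_lanes ? v.d[0] : v.d[1];
    int base = data1 != 0 ? 0 : 8;              /* cmp + csel */
    uint64_t sel = data1 != 0 ? data1 : data2;
    return base + clz64(bswap64(sel)) / 8;      /* rev + clz, lsr 3 */
}
```

For a 16-byte block holding "hello", a NUL at offset 5 and non-zero filler
after it, first_nul(blk, 0, 0) and first_nul(blk, 1, 1) both give 5, while
the unpatched big-endian order first_nul(blk, 1, 0) gives 13.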

diff --git a/sysdeps/aarch64/strcpy.S b/sysdeps/aarch64/strcpy.S
index 52c21c9..08859dd 100644
--- a/sysdeps/aarch64/strcpy.S
+++ b/sysdeps/aarch64/strcpy.S
@@ -234,8 +234,13 @@ L(entry_no_page_cross):
 #endif
 	/* ���loc */
 	cmeq	datav.16b, datav.16b, #0
+#ifdef __AARCH64EB__
+	mov	data1, datav.d[1]
+	mov	data2, datav.d[0]
+#else
 	mov	data1, datav.d[0]
 	mov	data2, datav.d[1]
+#endif
 	cmp	data1, 0
 	csel	data1, data1, data2, ne
 	mov	pos, 8
diff --git a/sysdeps/aarch64/strnlen.S b/sysdeps/aarch64/strnlen.S
index 5981247..086a5c7 100644
--- a/sysdeps/aarch64/strnlen.S
+++ b/sysdeps/aarch64/strnlen.S
@@ -154,8 +154,13 @@ L(loop_end):
 	   byte.  */
 
 	cmeq	datav.16b, datav.16b, #0
+#ifdef __AARCH64EB__
+	mov	data1, datav.d[1]
+	mov	data2, datav.d[0]
+#else
 	mov	data1, datav.d[0]
 	mov	data2, datav.d[1]
+#endif
 	cmp	data1, 0
 	csel	data1, data1, data2, ne
 	sub	len, src, srcin
-- 
2.12.3



* Re: [PATCH] aarch64: fix strcpy and strnlen for big-endian
  2020-05-15  2:12 [PATCH] aarch64: fix strcpy and strnlen for big-endian Lexi Shao
@ 2020-05-15 10:03 ` Szabolcs Nagy
  2020-05-15 10:40   ` shaolexi
  0 siblings, 1 reply; 3+ messages in thread
From: Szabolcs Nagy @ 2020-05-15 10:03 UTC
  To: Lexi Shao; +Cc: libc-alpha, liucheng32

The 05/15/2020 10:12, Lexi Shao wrote:
> This patch fixes the optimized implementation of strcpy and strnlen
> on a big-endian arm64 machine.
> 
> The optimized method uses NEON, which can process 128 bits with one
> instruction. On a big-endian machine, the byte order should be reversed
> across the whole 128-bit quadword. But the instruction
> 	rev64	datav.16b, datav.16b
> reverses the bytes within each 64-bit half rather than across all 128 bits.
> There is no rev128 instruction to reverse all 128 bits, but we can fix this
> by moving the two halves into the data registers in swapped order.
> 
> Fixes 0237b61526e7("aarch64: Optimized implementation of strcpy") and
> 2911cb68ed3d("aarch64: Optimized implementation of strnlen").
> 
> Signed-off-by: Lexi Shao <shaolexi@huawei.com>

Please add the bug reference to the title i.e.
append [BZ #25824]

note the patch was corrupted below; you might want
to check if it's something on your side. (in this
case i could fix it because it was in the context)

with those fixed it's ok to commit,

Reviewed-by: Szabolcs Nagy  <szabolcs.nagy@arm.com>

if you don't have commit rights then i can commit
this for you.

> ---
>  sysdeps/aarch64/strcpy.S  | 5 +++++
>  sysdeps/aarch64/strnlen.S | 5 +++++
>  2 files changed, 10 insertions(+)
> 
> diff --git a/sysdeps/aarch64/strcpy.S b/sysdeps/aarch64/strcpy.S
> index 52c21c9..08859dd 100644
> --- a/sysdeps/aarch64/strcpy.S
> +++ b/sysdeps/aarch64/strcpy.S
> @@ -234,8 +234,13 @@ L(entry_no_page_cross):
>  #endif
>  	/* ���loc */
corrupt: ^^^^^^^^^

>  	cmeq	datav.16b, datav.16b, #0
> +#ifdef __AARCH64EB__
> +	mov	data1, datav.d[1]
> +	mov	data2, datav.d[0]
> +#else
>  	mov	data1, datav.d[0]
>  	mov	data2, datav.d[1]
> +#endif
>  	cmp	data1, 0
>  	csel	data1, data1, data2, ne
>  	mov	pos, 8
> diff --git a/sysdeps/aarch64/strnlen.S b/sysdeps/aarch64/strnlen.S
> index 5981247..086a5c7 100644
> --- a/sysdeps/aarch64/strnlen.S
> +++ b/sysdeps/aarch64/strnlen.S
> @@ -154,8 +154,13 @@ L(loop_end):
>  	   byte.  */
>  
>  	cmeq	datav.16b, datav.16b, #0
> +#ifdef __AARCH64EB__
> +	mov	data1, datav.d[1]
> +	mov	data2, datav.d[0]
> +#else
>  	mov	data1, datav.d[0]
>  	mov	data2, datav.d[1]
> +#endif
>  	cmp	data1, 0
>  	csel	data1, data1, data2, ne
>  	sub	len, src, srcin
> -- 
> 2.12.3
> 


* RE: [PATCH] aarch64: fix strcpy and strnlen for big-endian
  2020-05-15 10:03 ` Szabolcs Nagy
@ 2020-05-15 10:40   ` shaolexi
  0 siblings, 0 replies; 3+ messages in thread
From: shaolexi @ 2020-05-15 10:40 UTC
  To: Szabolcs Nagy; +Cc: libc-alpha, liucheng (G)

The 05/15/2020 18:04, Szabolcs Nagy wrote:
>The 05/15/2020 10:12, Lexi Shao wrote:
>> This patch fixes the optimized implementation of strcpy and strnlen on
>> a big-endian arm64 machine.
>>
>> The optimized method uses NEON, which can process 128 bits with one
>> instruction. On a big-endian machine, the byte order should be reversed
>> across the whole 128-bit quadword. But the instruction
>>      rev64   datav.16b, datav.16b
>> reverses the bytes within each 64-bit half rather than across all 128 bits.
>> There is no rev128 instruction to reverse all 128 bits, but we can fix this
>> by moving the two halves into the data registers in swapped order.
>>
>> Fixes 0237b61526e7("aarch64: Optimized implementation of strcpy") and
>> 2911cb68ed3d("aarch64: Optimized implementation of strnlen").
>>
>> Signed-off-by: Lexi Shao <shaolexi@huawei.com>
>
>Please add the bug reference to the title i.e.
>append [BZ #25824]
>
>note the patch was corrupted below; you might want to check if it's something on your side. (in this case i could fix it because it was in the context)
>
>with those fixed it's ok to commit,
>
>Reviewed-by: Szabolcs Nagy  <szabolcs.nagy@arm.com>
>
>if you don't have commit rights then i can commit this for you.

No, I don't have commit rights. I will send out a new patch soon that fixes the corruption and the title; please push the commit for me. Thanks!

>
>> ---
