public inbox for libc-alpha@sourceware.org
* [PATCH v2] aarch64: fix strcpy and strnlen for big-endian [BZ #25824]
@ 2020-05-15 11:07 Wilco Dijkstra
  2020-05-15 11:39 ` Szabolcs Nagy
  0 siblings, 1 reply; 3+ messages in thread
From: Wilco Dijkstra @ 2020-05-15 11:07 UTC (permalink / raw)
  To: libc-alpha, shaolexi

Hi,

This looks good to me. Are you planning to post a patch to fix strlen_asimd.S as well? That has 3 more incorrect uses of rev64 that should be fixed in the same way.

Cheers,
Wilco


* Re: [PATCH v2] aarch64: fix strcpy and strnlen for big-endian [BZ #25824]
  2020-05-15 11:07 [PATCH v2] aarch64: fix strcpy and strnlen for big-endian [BZ #25824] Wilco Dijkstra
@ 2020-05-15 11:39 ` Szabolcs Nagy
  0 siblings, 0 replies; 3+ messages in thread
From: Szabolcs Nagy @ 2020-05-15 11:39 UTC (permalink / raw)
  To: Wilco Dijkstra; +Cc: libc-alpha, shaolexi

The 05/15/2020 12:07, Wilco Dijkstra wrote:
> This looks good to me. Are you planning to post a patch to fix strlen_asimd.S as well? That has 3 more incorrect uses of rev64 that should be fixed in the same way.

i committed v2 so the bug should be fixed for most users,
strlen_asimd.S seems to only affect falkor and kunpeng920.

it would be nice to fix that too (in case somebody needs
to backport a fix), but without a fix we can decide to
remove that variant; we plan to add an improved strlen
that may work for kunpeng920 too.


* [PATCH v2] aarch64: fix strcpy and strnlen for big-endian [BZ #25824]
@ 2020-05-15 10:48 Lexi Shao
  0 siblings, 0 replies; 3+ messages in thread
From: Lexi Shao @ 2020-05-15 10:48 UTC (permalink / raw)
  To: libc-alpha, szabolcs.nagy; +Cc: liucheng32, shaolexi

This patch fixes the optimized implementations of strcpy and strnlen
on big-endian arm64 machines.

The optimized code uses NEON, which can process 128 bits with one
instruction. On a big-endian machine, the byte order must be reversed
across the whole 128-bit quadword. But the instruction
	rev64	datav.16b, datav.16b
reverses the bytes within each 64-bit half rather than across all
128 bits. There is no rev128 instruction to reverse the full 128 bits,
but we can fix this by moving the two halves into the data registers
in swapped order.
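
As a rough illustration of why swapping the two halves is enough, the
following C sketch models the 128-bit register as a 16-byte array. The
function names (rev64_model, swap_halves, rev128_model,
fix_matches_full_reverse) are hypothetical and exist only for this
sketch; they are not part of glibc. It shows that a per-half reversal
followed by reading the halves in swapped order gives the same bytes as
a full 16-byte reversal.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Model of "rev64 v.16b": reverse the bytes inside each 8-byte half. */
static void rev64_model(const uint8_t in[16], uint8_t out[16])
{
    for (int i = 0; i < 8; i++) {
        out[i] = in[7 - i];        /* low half reversed in place */
        out[8 + i] = in[15 - i];   /* high half reversed in place */
    }
}

/* Model of reading d[1] first and d[0] second: exchange the halves. */
static void swap_halves(const uint8_t in[16], uint8_t out[16])
{
    memcpy(out, in + 8, 8);
    memcpy(out + 8, in, 8);
}

/* Reference: reverse all 16 bytes (the missing "rev128"). */
static void rev128_model(const uint8_t in[16], uint8_t out[16])
{
    for (int i = 0; i < 16; i++)
        out[i] = in[15 - i];
}

/* Returns 1 when rev64 followed by a half swap equals a full reversal. */
static int fix_matches_full_reverse(const uint8_t v[16])
{
    uint8_t per_half[16], fixed[16], full[16];

    assert(v != NULL);
    rev64_model(v, per_half);
    swap_halves(per_half, fixed);
    rev128_model(v, full);
    return memcmp(fixed, full, 16) == 0;
}
```

Calling fix_matches_full_reverse on any 16-byte input should return 1,
while comparing the output of rev64_model alone against rev128_model
shows the mismatch that the bug report describes.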

Fixes 0237b61526e7("aarch64: Optimized implementation of strcpy") and
2911cb68ed3d("aarch64: Optimized implementation of strnlen").

Signed-off-by: Lexi Shao <shaolexi@huawei.com>
Reviewed-by: Szabolcs Nagy  <szabolcs.nagy@arm.com>
---
 sysdeps/aarch64/strcpy.S  | 5 +++++
 sysdeps/aarch64/strnlen.S | 5 +++++
 2 files changed, 10 insertions(+)

diff --git a/sysdeps/aarch64/strcpy.S b/sysdeps/aarch64/strcpy.S
index 548130e..a8ff52c 100644
--- a/sysdeps/aarch64/strcpy.S
+++ b/sysdeps/aarch64/strcpy.S
@@ -234,8 +234,13 @@ L(entry_no_page_cross):
 #endif
 	/* calculate the loc value */
 	cmeq	datav.16b, datav.16b, #0
+#ifdef __AARCH64EB__
+	mov	data1, datav.d[1]
+	mov	data2, datav.d[0]
+#else
 	mov	data1, datav.d[0]
 	mov	data2, datav.d[1]
+#endif
 	cmp	data1, 0
 	csel	data1, data1, data2, ne
 	mov	pos, 8
diff --git a/sysdeps/aarch64/strnlen.S b/sysdeps/aarch64/strnlen.S
index 5981247..086a5c7 100644
--- a/sysdeps/aarch64/strnlen.S
+++ b/sysdeps/aarch64/strnlen.S
@@ -154,8 +154,13 @@ L(loop_end):
 	   byte.  */
 
 	cmeq	datav.16b, datav.16b, #0
+#ifdef __AARCH64EB__
+	mov	data1, datav.d[1]
+	mov	data2, datav.d[0]
+#else
 	mov	data1, datav.d[0]
 	mov	data2, datav.d[1]
+#endif
 	cmp	data1, 0
 	csel	data1, data1, data2, ne
 	sub	len, src, srcin
-- 
2.12.3
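
For readers following the hunks above, the unchanged little-endian
sequence (cmeq, two mov instructions, cmp, csel) can be modelled in C
roughly as below. This is only a sketch: the function name
first_nul_index is hypothetical, and the final index computation is an
assumption, since the instructions after "mov pos, 8" are not shown in
the diff context.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Returns the index of the first zero byte in a 16-byte chunk,
   or -1 if the chunk contains no zero byte.  */
static int first_nul_index(const uint8_t chunk[16])
{
    uint8_t mask[16];
    uint64_t lo = 0, hi = 0;

    assert(chunk != NULL);

    /* Model of "cmeq datav.16b, datav.16b, #0": 0xff where byte == 0. */
    for (int i = 0; i < 16; i++)
        mask[i] = (chunk[i] == 0) ? 0xff : 0x00;

    /* Model of the two mov instructions: pack each half into 64 bits,
       with the lowest-addressed byte in the least significant position. */
    for (int i = 7; i >= 0; i--) {
        lo = (lo << 8) | mask[i];
        hi = (hi << 8) | mask[8 + i];
    }

    /* Model of cmp/csel: prefer the low half when it has a match. */
    uint64_t pick = lo ? lo : hi;
    int pos = lo ? 0 : 8;
    if (pick == 0)
        return -1;

    /* Assumed remainder: count the matching byte's offset in the half. */
    int byte = 0;
    while ((pick & 0xff) == 0) {
        pick >>= 8;
        byte++;
    }
    return pos + byte;
}
```

The half that holds the lowest-addressed bytes has to be examined
first, which is why the order of the two mov instructions matters on
big-endian targets.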

