* [PATCH] AArch64: Improve backwards memmove performance
@ 2020-08-20 11:46 Wilco Dijkstra
2020-08-25 12:15 ` Adhemerval Zanella
From: Wilco Dijkstra @ 2020-08-20 11:46 UTC
To: 'GNU C Library'
On some microarchitectures, performance of the backwards memmove improves if
the stores use STR with decreasing addresses. So change the backwards memmove
loop in memcpy_advsimd.S to use 2x STR rather than STP.
Passes the glibc regression tests. OK for commit?
---
diff --git a/sysdeps/aarch64/multiarch/memcpy_advsimd.S b/sysdeps/aarch64/multiarch/memcpy_advsimd.S
index d4ba74777744c8bb5a83e43ab2d63ad8dab35203..48bb6d7ca425197907eaef2307fb3939e69baa15 100644
--- a/sysdeps/aarch64/multiarch/memcpy_advsimd.S
+++ b/sysdeps/aarch64/multiarch/memcpy_advsimd.S
@@ -223,12 +223,13 @@ L(copy_long_backwards):
b.ls L(copy64_from_start)
L(loop64_backwards):
- stp A_q, B_q, [dstend, -32]
+ str B_q, [dstend, -16]
+ str A_q, [dstend, -32]
ldp A_q, B_q, [srcend, -96]
- stp C_q, D_q, [dstend, -64]
+ str D_q, [dstend, -48]
+ str C_q, [dstend, -64]!
ldp C_q, D_q, [srcend, -128]
sub srcend, srcend, 64
- sub dstend, dstend, 64
subs count, count, 64
b.hi L(loop64_backwards)
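
The store pattern of the new loop can be modelled in C. This is a minimal sketch under stated assumptions, not the glibc code: `copy_backwards_model`, `store16` and `store_log` are hypothetical names, the size is assumed to be a nonzero multiple of 64, and the software pipelining of the loads in the real loop is left out. It only illustrates that each 64-byte block is written with four 16-byte stores at strictly decreasing addresses, and that the pre-index writeback (`!`) on the last STR takes over the job of the removed `sub dstend, dstend, 64`.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical log of destination offsets, in the order stores are issued. */
#define MAX_STORES 64
static size_t store_log[MAX_STORES];
static size_t store_count;

/* Models one 16-byte STR of a Q register to dst_base + off. */
static void store16(unsigned char *dst_base, size_t off, const unsigned char *q)
{
    memcpy(dst_base + off, q, 16);
    if (store_count < MAX_STORES)
        store_log[store_count++] = off;
}

/* Copies n bytes from the end towards the start, 64 bytes per iteration.
   Assumption: n is a nonzero multiple of 64. Safe for overlapping buffers
   when dst is above src, which is the backwards memmove case. */
static void copy_backwards_model(unsigned char *dst, const unsigned char *src,
                                 size_t n)
{
    assert(n != 0 && n % 64 == 0);
    size_t end = n; /* plays the role of dstend/srcend, as an offset */
    while (end != 0) {
        unsigned char a[16], b[16], c[16], d[16];
        /* Load the whole block first; the real loop holds it in A_q..D_q. */
        memcpy(a, src + end - 32, 16);
        memcpy(b, src + end - 16, 16);
        memcpy(c, src + end - 64, 16);
        memcpy(d, src + end - 48, 16);
        /* Four single stores, highest address first (2x STR instead of STP). */
        store16(dst, end - 16, b);
        store16(dst, end - 32, a);
        store16(dst, end - 48, d);
        store16(dst, end - 64, c);
        /* The "!" writeback on the last STR performs this update of dstend. */
        end -= 64;
    }
}
```

Copying 128 bytes with this model issues eight stores whose offsets run from 112 down to 0 in steps of 16, whereas the STP form writes each 32-byte pair as a single store.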
* Re: [PATCH] AArch64: Improve backwards memmove performance
2020-08-20 11:46 [PATCH] AArch64: Improve backwards memmove performance Wilco Dijkstra
@ 2020-08-25 12:15 ` Adhemerval Zanella
From: Adhemerval Zanella @ 2020-08-25 12:15 UTC
To: libc-alpha
On 20/08/2020 08:46, Wilco Dijkstra wrote:
> On some microarchitectures, performance of the backwards memmove improves if
> the stores use STR with decreasing addresses. So change the backwards memmove
> loop in memcpy_advsimd.S to use 2x STR rather than STP.
>
> Passes the glibc regression tests. OK for commit?
LGTM, thanks. Does it make any difference to use the same strategy on the
last iteration at L(copy64_from_start) as well?
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
>
> ---
> diff --git a/sysdeps/aarch64/multiarch/memcpy_advsimd.S b/sysdeps/aarch64/multiarch/memcpy_advsimd.S
> index d4ba74777744c8bb5a83e43ab2d63ad8dab35203..48bb6d7ca425197907eaef2307fb3939e69baa15 100644
> --- a/sysdeps/aarch64/multiarch/memcpy_advsimd.S
> +++ b/sysdeps/aarch64/multiarch/memcpy_advsimd.S
> @@ -223,12 +223,13 @@ L(copy_long_backwards):
> b.ls L(copy64_from_start)
>
> L(loop64_backwards):
> - stp A_q, B_q, [dstend, -32]
> + str B_q, [dstend, -16]
> + str A_q, [dstend, -32]
> ldp A_q, B_q, [srcend, -96]
> - stp C_q, D_q, [dstend, -64]
> + str D_q, [dstend, -48]
> + str C_q, [dstend, -64]!
> ldp C_q, D_q, [srcend, -128]
> sub srcend, srcend, 64
> - sub dstend, dstend, 64
> subs count, count, 64
> b.hi L(loop64_backwards)
>
>