public inbox for libc-ports@sourceware.org
* [PATCH v3] ARM: Improve armv7 memcpy performance.
@ 2013-09-09  9:40 Will Newton
From: Will Newton @ 2013-09-09  9:40 UTC
  To: libc-ports; +Cc: patches


Only enter the aligned copy loop when the source and destination
buffers have the same alignment modulo 8, so that both can be brought
to 8-byte alignment. This improves performance slightly on Cortex-A9
and Cortex-A15 cores for large copies with buffers that are 4-byte
aligned but not 8-byte aligned.

ports/ChangeLog.arm:

2013-08-30  Will Newton  <will.newton@linaro.org>

	* sysdeps/arm/armv7/multiarch/memcpy_impl.S: Tighten check
	on entry to aligned copy loop to improve performance.
---
 ports/sysdeps/arm/armv7/multiarch/memcpy_impl.S | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

Changes in v3:
 - Fixed comments

diff --git a/ports/sysdeps/arm/armv7/multiarch/memcpy_impl.S b/ports/sysdeps/arm/armv7/multiarch/memcpy_impl.S
index 3decad6..330bb2d 100644
--- a/ports/sysdeps/arm/armv7/multiarch/memcpy_impl.S
+++ b/ports/sysdeps/arm/armv7/multiarch/memcpy_impl.S
@@ -369,8 +369,8 @@ ENTRY(memcpy)
 	cfi_adjust_cfa_offset (FRAME_SIZE)
 	cfi_rel_offset (tmp2, 0)
 	cfi_remember_state
-	and	tmp2, src, #3
-	and	tmp1, dst, #3
+	and	tmp2, src, #7
+	and	tmp1, dst, #7
 	cmp	tmp1, tmp2
 	bne	.Lcpy_notaligned

@@ -381,9 +381,9 @@ ENTRY(memcpy)
 	vmov.f32	s0, s0
 #endif

-	/* SRC and DST have the same mutual 32-bit alignment, but we may
+	/* SRC and DST have the same mutual 64-bit alignment, but we may
 	   still need to pre-copy some bytes to get to natural alignment.
-	   We bring DST into full 64-bit alignment.  */
+	   We bring SRC and DST into full 64-bit alignment.  */
 	lsls	tmp2, dst, #29
 	beq	1f
 	rsbs	tmp2, tmp2, #0
-- 
1.8.1.4


Thread overview: 6+ messages
2013-09-09  9:40 [PATCH v3] ARM: Improve armv7 memcpy performance Will Newton
2013-09-09 13:39 ` Joseph S. Myers
2013-09-09 16:06   ` Will Newton
2013-09-09 17:11     ` Joseph S. Myers
2013-09-09 17:46       ` Ondřej Bílka
2013-09-09 21:02       ` Carlos O'Donell