public inbox for newlib-cvs@sourceware.org
* [newlib-cygwin] Adjust writeback in non-zero memset
@ 2018-11-06 15:01 Richard Earnshaw
From: Richard Earnshaw @ 2018-11-06 15:01 UTC
  To: newlib-cvs

https://sourceware.org/git/gitweb.cgi?p=newlib-cygwin.git;h=d80db600664bec381230be85955b54884f21a619

commit d80db600664bec381230be85955b54884f21a619
Author: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
Date:   Tue Nov 6 14:42:10 2018 +0000

    Adjust writeback in non-zero memset
    
    This fixes an inefficiency in the non-zero memset.  Delaying the writeback
    until the end of the loop is slightly faster on some cores - this shows
    ~5% performance gain on Cortex-A53 when doing large non-zero memsets.
    
    Tested against the GLIBC testsuite.

Diff:
---
 newlib/libc/machine/aarch64/memset.S | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/newlib/libc/machine/aarch64/memset.S b/newlib/libc/machine/aarch64/memset.S
index 799e7b7..7c8fe58 100644
--- a/newlib/libc/machine/aarch64/memset.S
+++ b/newlib/libc/machine/aarch64/memset.S
@@ -142,10 +142,10 @@ L(set_long):
 	b.eq	L(try_zva)
 L(no_zva):
 	sub	count, dstend, dst	/* Count is 16 too large.  */
-	add	dst, dst, 16
+	sub	dst, dst, 16		/* Dst is biased by -32.  */
 	sub	count, count, 64 + 16	/* Adjust count and bias for loop.  */
-1:	stp	q0, q0, [dst], 64
-	stp	q0, q0, [dst, -32]
+1:	stp	q0, q0, [dst, 32]
+	stp	q0, q0, [dst, 64]!
 L(tail64):
 	subs	count, count, 64
 	b.hi	1b
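
For readers checking the address arithmetic in the hunk above, here is a
minimal C sketch that models the two loop bodies and compares the bytes they
write. It is an illustration only, not part of the patch: store32, old_loop,
new_loop, loops_match, the buffer size and the starting offset of 16 are all
hypothetical, and the loop is simplified to a fixed iteration count instead
of the count/subs/b.hi control flow.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum { BUF_SIZE = 1024, MAX_ITERS = 15 };

/* Hypothetical helper: models one `stp q0, q0, [addr]`, a 32-byte store. */
static void store32(unsigned char *buf, ptrdiff_t off, unsigned char c)
{
    assert(off >= 0 && off + 32 <= BUF_SIZE);
    memset(buf + off, c, 32);
}

/* Loop before the patch: post-index writeback on the first store. */
static ptrdiff_t old_loop(unsigned char *buf, ptrdiff_t dst, int iters,
                          unsigned char c)
{
    dst += 16;                      /* add dst, dst, 16 */
    for (int i = 0; i < iters; i++) {
        store32(buf, dst, c);       /* stp q0, q0, [dst], 64 */
        dst += 64;                  /* ...post-index: dst updated after store */
        store32(buf, dst - 32, c);  /* stp q0, q0, [dst, -32] */
    }
    return dst;
}

/* Loop after the patch: pre-index writeback on the second store. */
static ptrdiff_t new_loop(unsigned char *buf, ptrdiff_t dst, int iters,
                          unsigned char c)
{
    dst -= 16;                      /* sub dst, dst, 16 */
    for (int i = 0; i < iters; i++) {
        store32(buf, dst + 32, c);  /* stp q0, q0, [dst, 32] */
        dst += 64;                  /* pre-index: dst updated before store */
        store32(buf, dst, c);       /* stp q0, q0, [dst, 64]! */
    }
    return dst;
}

/* Returns 1 if both models write identical bytes and the final dst of the
   new loop is 32 lower than that of the old loop, otherwise 0. */
int loops_match(int iters)
{
    unsigned char a[BUF_SIZE] = {0};
    unsigned char b[BUF_SIZE] = {0};

    if (iters < 0 || iters > MAX_ITERS)
        return 0;

    ptrdiff_t old_end = old_loop(a, 16, iters, 0xAB);
    ptrdiff_t new_end = new_loop(b, 16, iters, 0xAB);

    return memcmp(a, b, BUF_SIZE) == 0 && new_end + 32 == old_end;
}
```

In this model both versions store to the same offsets on every iteration,
and the only difference is that dst ends up 32 lower in the new version,
which is consistent with the "Dst is biased by -32" comment in the patch.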

