From: Siddhesh Poyarekar <siddhesh@sourceware.org>
To: libc-alpha@sourceware.org
Cc: Szabolcs Nagy <szabolcs.nagy@arm.com>
Subject: [PING*2][PATCH] aarch64: Improve strcmp unaligned performance
Date: Tue, 12 Dec 2017 18:20:00 -0000
Message-ID: <2846b978-d7c5-487d-007c-65077e28ad61@sourceware.org>
In-Reply-To: <3bdd6a7c-25ad-07ca-ade1-691d1d4b6a44@sourceware.org>

Ping!

On Monday 04 December 2017 09:37 PM, Siddhesh Poyarekar wrote:
> Ping!
> 
> On Friday 01 December 2017 10:56 AM, Siddhesh Poyarekar wrote:
>> Replace the simple byte-wise compare in the misaligned case with a
>> dword compare with page boundary checks in place.  For simplicity I've
>> chosen a 4K page boundary so that we don't have to query the actual
>> page size on the system.
>>
>> This results in up to 3x improvement in performance in the unaligned
>> case on falkor and about 2.5x improvement on mustang as measured using
>> bench-strcmp.
>>
>> 	* sysdeps/aarch64/strcmp.S (misaligned8): Compare dword at a
>> 	time whenever possible.
>> ---
>>  sysdeps/aarch64/strcmp.S | 31 +++++++++++++++++++++++++++++--
>>  1 file changed, 29 insertions(+), 2 deletions(-)
>>
>> diff --git a/sysdeps/aarch64/strcmp.S b/sysdeps/aarch64/strcmp.S
>> index e99d662..c260e1d 100644
>> --- a/sysdeps/aarch64/strcmp.S
>> +++ b/sysdeps/aarch64/strcmp.S
>> @@ -72,6 +72,7 @@ L(start_realigned):
>>  	cbz	syndrome, L(loop_aligned)
>>  	/* End of performance-critical section  -- one 64B cache line.  */
>>  
>> +L(end):
>>  #ifndef	__AARCH64EB__
>>  	rev	syndrome, syndrome
>>  	rev	data1, data1
>> @@ -145,12 +146,38 @@ L(mutual_align):
>>  	b	L(start_realigned)
>>  
>>  L(misaligned8):
>> -	/* We can do better than this.  */
>> +	/* Align SRC1 to 8 bytes and then compare 8 bytes at a time, always
>> +	   checking to make sure that we don't access beyond page boundary in
>> +	   SRC2.  */
>> +	tst	src1, #7
>> +	b.eq	L(loop_misaligned)
>> +L(do_misaligned):
>>  	ldrb	data1w, [src1], #1
>>  	ldrb	data2w, [src2], #1
>>  	cmp	data1w, #1
>>  	ccmp	data1w, data2w, #0, cs	/* NZCV = 0b0000.  */
>> -	b.eq	L(misaligned8)
>> +	b.ne	L(done)
>> +	tst	src1, #7
>> +	b.ne	L(misaligned8)
>> +
>> +L(loop_misaligned):
>> +	/* Test if we are within the last dword of a 4K page.  If yes then
>> +	   jump back to the misaligned loop to compare a byte at a time.  */
>> +	and	tmp1, src2, #0xff8
>> +	eor	tmp1, tmp1, #0xff8
>> +	cbz	tmp1, L(do_misaligned)
>> +	ldr	data1, [src1], #8
>> +	ldr	data2, [src2], #8
>> +
>> +	sub	tmp1, data1, zeroones
>> +	orr	tmp2, data1, #REP8_7f
>> +	eor	diff, data1, data2	/* Non-zero if differences found.  */
>> +	bic	has_nul, tmp1, tmp2	/* Non-zero if NUL terminator.  */
>> +	orr	syndrome, diff, has_nul
>> +	cbz	syndrome, L(loop_misaligned)
>> +	b	L(end)
>> +
>> +L(done):
>>  	sub	result, data1, data2
>>  	RET
>>  END(strcmp)
>>
> 
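
For anyone reading the quoted patch without the rest of strcmp.S at hand,
the misaligned path can be sketched in C roughly as below.  This is only an
illustration under stated assumptions, not the code in the patch: the names
near_page_end, has_nul and strcmp_dword_sketch are made up here, the
mismatching byte is located with a plain byte loop instead of the rev/clz
sequence at L(end), and the 8-byte loads may read past the terminating NUL,
which assembly can get away with inside a page but which strict ISO C does
not permit, so callers are assumed to pass padded buffers.

```c
#include <stdint.h>
#include <string.h>

#define REP8_01 0x0101010101010101ULL
#define REP8_7F 0x7f7f7f7f7f7f7f7fULL

/* Non-zero when P lies in the last 8 bytes of a 4K page, i.e. when an
   8-byte load from P might cross into the next page.  Mirrors the
   and/eor/cbz test on src2 in the patch.  */
static int near_page_end(const void *p)
{
    return ((uintptr_t)p & 0xff8) == 0xff8;
}

/* Non-zero if and only if some byte of X is zero.  Same arithmetic as the
   sub/orr/bic sequence that computes has_nul in the patch.  */
static uint64_t has_nul(uint64_t x)
{
    return (x - REP8_01) & ~(x | REP8_7F);
}

/* Hypothetical C rendering of the misaligned strategy.  Assumes both
   buffers are padded so that 8-byte loads never leave them.  */
int strcmp_dword_sketch(const char *s1, const char *s2)
{
    for (;;) {
        /* Dword step: only when s1 is 8-byte aligned and the load from
           s2 cannot cross a 4K boundary.  */
        if (((uintptr_t)s1 & 7) == 0 && !near_page_end(s2)) {
            uint64_t d1, d2;
            memcpy(&d1, s1, sizeof d1);
            memcpy(&d2, s2, sizeof d2);
            /* Syndrome is zero when the dwords match and hold no NUL.  */
            if (((d1 ^ d2) | has_nul(d1)) == 0) {
                s1 += 8;
                s2 += 8;
                continue;
            }
            /* Otherwise fall through; the byte loop below finds the
               mismatch or NUL within these 8 bytes.  */
        }

        /* Byte step: aligns s1, walks past a page end in s2, or locates
           the byte that ends the comparison.  */
        unsigned char c1 = (unsigned char)*s1++;
        unsigned char c2 = (unsigned char)*s2++;
        if (c1 == 0 || c1 != c2)
            return c1 - c2;
    }
}
```

The page test is conservative: a load from an address whose page offset is
exactly 0xff8 would not cross the boundary, but treating the whole last
dword as unsafe keeps the check to a single mask and compare.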
