From: Alejandro Colomar <alx.manpages@gmail.com>
To: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
Cc: 'GNU C Library' <libc-alpha@sourceware.org>
Subject: Re: [PATCH 1/1] string: Add stpecpy(3)
Date: Fri, 23 Dec 2022 23:40:03 +0100 [thread overview]
Message-ID: <d5f982a2-b2d4-b18f-8ab2-d4cc40f15f1d@gmail.com> (raw)
In-Reply-To: <PAWPR08MB89828B94DCA3DC9F566FF3DB83E99@PAWPR08MB8982.eurprd08.prod.outlook.com>
[-- Attachment #1.1: Type: text/plain, Size: 3969 bytes --]
Hi Wilco,
On 12/23/22 19:35, Wilco Dijkstra wrote:
> Hi Alex,
(a)
>
>> if (dst == end)
>> return end;
>> if (stp_unlikely(dst == NULL)) // Allow chaining with stpeprintf().
>> return NULL;
>
>> Oh, and the two branches above can be optimized into a branch that returns dst.
>
> How? There will be 2 branches since you're doing 2 checks here...
That is equivalent to:
(b)
if (dst == end)
return dst;
if (stp_unlikely(dst == NULL)) // Allow chaining with stpeprintf().
return dst;
which itself is equivalent to:
(c)
if ((dst == end) || stp_unlikely(dst == NULL))
return dst;
which still has a branch in ||, due to the shortcut of the boolean operator.
However, the compiler is allowed to transform it into bitwise, since there are
no side effects, no UB, and the result would be the same:
(d)
if ((dst == end) | stp_unlikely(dst == NULL))
return dst;
Which doesn't have a hidden branch.
I tried GCC, and (b) and (c) produce the same assembly code, but slightly
different than (a) (but almost identical, nothing significant). (d) produces
considerably different assembly. Tried under -O3 -march=native. I don't know
enough assembly to judge which is better; I'll copy the results here for the
curious. I guess the compiler seems to prefer an extra branch here over
unconditionally doing bitwise operations; it very likely knows more than I do.
Cheers,
Alex
alx@asus5775:~/src/alx/libstp$ diff -u a.s b.s
--- a.s 2022-12-23 23:27:57.788103834 +0100
+++ b.s 2022-12-23 23:28:13.463919271 +0100
@@ -59,9 +59,9 @@
.cfi_offset 3, -16
movq %rsi, %rbx
cmpq %rsi, %rdi
- je .L9
- testq %rdi, %rdi
je .L12
+ testq %rdi, %rdi
+ je .L13
movq %rbx, %rcx
movq %rdx, %rsi
xorl %edx, %edx
@@ -79,7 +79,7 @@
.L11:
.cfi_restore_state
movb $0, -1(%rbx)
-.L9:
+.L12:
movq %rbx, %rax
popq %rbx
.cfi_remember_state
@@ -87,7 +87,7 @@
ret
.p2align 4,,10
.p2align 3
-.L12:
+.L13:
.cfi_restore_state
xorl %eax, %eax
popq %rbx
alx@asus5775:~/src/alx/libstp$ diff -u b.s c.s
alx@asus5775:~/src/alx/libstp$ diff -u c.s d.s
--- c.s 2022-12-23 23:29:07.315367548 +0100
+++ d.s 2022-12-23 23:28:58.007455133 +0100
@@ -11,16 +11,16 @@
.cfi_offset 3, -16
movq %rsi, %rbx
cmpq %rsi, %rdi
- je .L2
+ je .L4
testq %rdi, %rdi
- je .L5
+ je .L4
movq %rbx, %rcx
movq %rdx, %rsi
xorl %edx, %edx
subq %rdi, %rcx
call memccpy@PLT
testq %rax, %rax
- je .L4
+ je .L3
decq %rax
popq %rbx
.cfi_remember_state
@@ -28,20 +28,19 @@
ret
.p2align 4,,10
.p2align 3
-.L4:
+.L3:
.cfi_restore_state
- movb $0, -1(%rbx)
-.L2:
movq %rbx, %rax
+ movb $0, -1(%rbx)
popq %rbx
.cfi_remember_state
.cfi_def_cfa_offset 8
ret
.p2align 4,,10
.p2align 3
-.L5:
+.L4:
.cfi_restore_state
- xorl %eax, %eax
+ movq %rdi, %rax
popq %rbx
.cfi_def_cfa_offset 8
ret
@@ -59,16 +58,16 @@
.cfi_offset 3, -16
movq %rsi, %rbx
cmpq %rsi, %rdi
- je .L12
+ je .L10
testq %rdi, %rdi
- je .L13
+ je .L10
movq %rbx, %rcx
movq %rdx, %rsi
xorl %edx, %edx
subq %rdi, %rcx
call memccpy@PLT
testq %rax, %rax
- je .L11
+ je .L9
decq %rax
popq %rbx
.cfi_remember_state
@@ -76,20 +75,19 @@
ret
.p2align 4,,10
.p2align 3
-.L11:
+.L9:
.cfi_restore_state
- movb $0, -1(%rbx)
-.L12:
movq %rbx, %rax
+ movb $0, -1(%rbx)
popq %rbx
.cfi_remember_state
.cfi_def_cfa_offset 8
ret
.p2align 4,,10
.p2align 3
-.L13:
+.L10:
.cfi_restore_state
- xorl %eax, %eax
+ movq %rdi, %rax
popq %rbx
.cfi_def_cfa_offset 8
ret
>
> Cheers,
> Wilco
--
<http://www.alejandro-colomar.es/>
[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 833 bytes --]
next prev parent reply other threads:[~2022-12-23 22:40 UTC|newest]
Thread overview: 26+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-12-23 18:35 Wilco Dijkstra
2022-12-23 22:40 ` Alejandro Colomar [this message]
-- strict thread matches above, loose matches on Subject: below --
2022-12-23 23:24 Wilco Dijkstra
2022-12-24 0:05 ` Alejandro Colomar
2022-12-24 0:26 ` Alejandro Colomar
2022-12-25 1:52 ` Noah Goldstein
2022-12-25 14:37 ` Alejandro Colomar
2022-12-25 22:31 ` Noah Goldstein
2022-12-26 0:26 ` Alejandro Colomar
2022-12-26 0:32 ` Noah Goldstein
2022-12-26 0:37 ` Alejandro Colomar
2022-12-26 2:43 ` Noah Goldstein
2022-12-26 22:25 ` Alejandro Colomar
2022-12-26 23:24 ` Alejandro Colomar
2022-12-26 23:52 ` Alejandro Colomar
2022-12-27 0:12 ` Alejandro Colomar
2022-12-23 14:59 Wilco Dijkstra
2022-12-23 17:03 ` Alejandro Colomar
2022-12-23 17:27 ` Alejandro Colomar
2022-12-22 21:42 [PATCH 0/1] " Alejandro Colomar
2022-12-22 21:42 ` [PATCH 1/1] " Alejandro Colomar
2022-12-23 7:02 ` Sam James
2022-12-23 12:26 ` Alejandro Colomar
2022-12-23 12:29 ` Alejandro Colomar
2022-12-23 17:21 ` Alejandro Colomar
2022-12-31 15:13 ` Sam James
2022-12-31 15:15 ` Alejandro Colomar
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=d5f982a2-b2d4-b18f-8ab2-d4cc40f15f1d@gmail.com \
--to=alx.manpages@gmail.com \
--cc=Wilco.Dijkstra@arm.com \
--cc=libc-alpha@sourceware.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).