* [PATCH] Optimize epilogue in thumb __aeabi_{memmove,memset} implementations
@ 2019-09-30 14:55 Christos Gentsos
2019-09-30 15:31 ` Richard Earnshaw (lists)
0 siblings, 1 reply; 5+ messages in thread
From: Christos Gentsos @ 2019-09-30 14:55 UTC (permalink / raw)
To: newlib
The same pop instruction that is used to restore registers can be used
to return from the function (as it is already done in other function
implementations).
---
newlib/libc/machine/arm/aeabi_memmove-thumb.S | 4 +---
newlib/libc/machine/arm/aeabi_memset-thumb.S | 4 +---
2 files changed, 2 insertions(+), 6 deletions(-)
diff --git a/newlib/libc/machine/arm/aeabi_memmove-thumb.S b/newlib/libc/machine/arm/aeabi_memmove-thumb.S
index 61a72581..a0aad852 100644
--- a/newlib/libc/machine/arm/aeabi_memmove-thumb.S
+++ b/newlib/libc/machine/arm/aeabi_memmove-thumb.S
@@ -49,9 +49,7 @@ __aeabi_memmove:
subs r3, r3, #1
bcs 1b
2:
- pop {r4}
- pop {r1}
- bx r1
+ pop {r4, pc}
3:
movs r3, #0
cmp r2, #0
diff --git a/newlib/libc/machine/arm/aeabi_memset-thumb.S b/newlib/libc/machine/arm/aeabi_memset-thumb.S
index aa8f2719..5bb80b20 100644
--- a/newlib/libc/machine/arm/aeabi_memset-thumb.S
+++ b/newlib/libc/machine/arm/aeabi_memset-thumb.S
@@ -110,9 +110,7 @@ __aeabi_memset:
cmp r4, r3
bne 8b
9:
- pop {r4, r5, r6}
- pop {r1}
- bx r1
+ pop {r4, r5, r6, pc}
10:
movs r3, r0
movs r4, r1
--
2.23.0
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH] Optimize epilogue in thumb __aeabi_{memmove,memset} implementations
2019-09-30 14:55 [PATCH] Optimize epilogue in thumb __aeabi_{memmove,memset} implementations Christos Gentsos
@ 2019-09-30 15:31 ` Richard Earnshaw (lists)
2019-09-30 15:37 ` Richard Earnshaw (lists)
0 siblings, 1 reply; 5+ messages in thread
From: Richard Earnshaw (lists) @ 2019-09-30 15:31 UTC (permalink / raw)
To: Christos Gentsos, newlib
On 30/09/2019 15:55, Christos Gentsos wrote:
> The same pop instruction that is used to restore registers can be used
> to return from the function (as it is already done in other function
> implementations).
> ---
> newlib/libc/machine/arm/aeabi_memmove-thumb.S | 4 +---
> newlib/libc/machine/arm/aeabi_memset-thumb.S | 4 +---
> 2 files changed, 2 insertions(+), 6 deletions(-)
>
> diff --git a/newlib/libc/machine/arm/aeabi_memmove-thumb.S b/newlib/libc/machine/arm/aeabi_memmove-thumb.S
> index 61a72581..a0aad852 100644
> --- a/newlib/libc/machine/arm/aeabi_memmove-thumb.S
> +++ b/newlib/libc/machine/arm/aeabi_memmove-thumb.S
> @@ -49,9 +49,7 @@ __aeabi_memmove:
> subs r3, r3, #1
> bcs 1b
> 2:
> - pop {r4}
> - pop {r1}
> - bx r1
> + pop {r4, pc}
> 3:
> movs r3, #0
> cmp r2, #0
> diff --git a/newlib/libc/machine/arm/aeabi_memset-thumb.S b/newlib/libc/machine/arm/aeabi_memset-thumb.S
> index aa8f2719..5bb80b20 100644
> --- a/newlib/libc/machine/arm/aeabi_memset-thumb.S
> +++ b/newlib/libc/machine/arm/aeabi_memset-thumb.S
> @@ -110,9 +110,7 @@ __aeabi_memset:
> cmp r4, r3
> bne 8b
> 9:
> - pop {r4, r5, r6}
> - pop {r1}
> - bx r1
> + pop {r4, r5, r6, pc}
> 10:
> movs r3, r0
> movs r4, r1
>
No. That isn't interworking clean on armv4t, which we still need to
support.
Sorry.
R.
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH] Optimize epilogue in thumb __aeabi_{memmove,memset} implementations
2019-09-30 15:31 ` Richard Earnshaw (lists)
@ 2019-09-30 15:37 ` Richard Earnshaw (lists)
2019-09-30 17:17 ` Christos Gentsos
0 siblings, 1 reply; 5+ messages in thread
From: Richard Earnshaw (lists) @ 2019-09-30 15:37 UTC (permalink / raw)
To: Christos Gentsos, newlib
On 30/09/2019 16:31, Richard Earnshaw (lists) wrote:
> On 30/09/2019 15:55, Christos Gentsos wrote:
>> The same pop instruction that is used to restore registers can be used
>> to return from the function (as it is already done in other function
>> implementations).
>> ---
>> Â newlib/libc/machine/arm/aeabi_memmove-thumb.S | 4 +---
>> Â newlib/libc/machine/arm/aeabi_memset-thumb.SÂ | 4 +---
>> Â 2 files changed, 2 insertions(+), 6 deletions(-)
>>
>> diff --git a/newlib/libc/machine/arm/aeabi_memmove-thumb.S
>> b/newlib/libc/machine/arm/aeabi_memmove-thumb.S
>> index 61a72581..a0aad852 100644
>> --- a/newlib/libc/machine/arm/aeabi_memmove-thumb.S
>> +++ b/newlib/libc/machine/arm/aeabi_memmove-thumb.S
>> @@ -49,9 +49,7 @@ __aeabi_memmove:
>>      subs   r3, r3, #1
>>      bcs   1b
>> Â 2:
>> -   pop   {r4}
>> -   pop   {r1}
>> -   bx   r1
>> +   pop   {r4, pc}
>> Â 3:
>>      movs   r3, #0
>>      cmp   r2, #0
>> diff --git a/newlib/libc/machine/arm/aeabi_memset-thumb.S
>> b/newlib/libc/machine/arm/aeabi_memset-thumb.S
>> index aa8f2719..5bb80b20 100644
>> --- a/newlib/libc/machine/arm/aeabi_memset-thumb.S
>> +++ b/newlib/libc/machine/arm/aeabi_memset-thumb.S
>> @@ -110,9 +110,7 @@ __aeabi_memset:
>>      cmp   r4, r3
>>      bne   8b
>> Â 9:
>> -   pop   {r4, r5, r6}
>> -   pop   {r1}
>> -   bx   r1
>> +   pop   {r4, r5, r6, pc}
>> Â 10:
>>      movs   r3, r0
>>      movs   r4, r1
>>
>
> No. That isn't interworking clean on armv4t, which we still need to
> support.
>
> Sorry.
>
> R.
However, a patch that tests __ARM_ARCH >=5 and uses your improved
sequence only in that case (preserving the old code otherwise) would
probably be OK :-)
R.
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH] Optimize epilogue in thumb __aeabi_{memmove,memset} implementations
2019-09-30 15:37 ` Richard Earnshaw (lists)
@ 2019-09-30 17:17 ` Christos Gentsos
2019-10-07 14:31 ` Richard Earnshaw (lists)
0 siblings, 1 reply; 5+ messages in thread
From: Christos Gentsos @ 2019-09-30 17:17 UTC (permalink / raw)
To: Richard Earnshaw (lists), newlib
On Mon, Sep 30 2019 at 16:37:26 +0100, Richard Earnshaw (lists) wrote:
> On 30/09/2019 16:31, Richard Earnshaw (lists) wrote:
>> On 30/09/2019 15:55, Christos Gentsos wrote:
>>> The same pop instruction that is used to restore registers can be used
>>> to return from the function (as it is already done in other function
>>> implementations).
>>> ---
>>> newlib/libc/machine/arm/aeabi_memmove-thumb.S | 4 +---
>>> newlib/libc/machine/arm/aeabi_memset-thumb.S | 4 +---
>>> 2 files changed, 2 insertions(+), 6 deletions(-)
>>>
>>> diff --git a/newlib/libc/machine/arm/aeabi_memmove-thumb.S
>>> b/newlib/libc/machine/arm/aeabi_memmove-thumb.S
>>> index 61a72581..a0aad852 100644
>>> --- a/newlib/libc/machine/arm/aeabi_memmove-thumb.S
>>> +++ b/newlib/libc/machine/arm/aeabi_memmove-thumb.S
>>> @@ -49,9 +49,7 @@ __aeabi_memmove:
>>> subs r3, r3, #1
>>> bcs 1b
>>> 2:
>>> - pop {r4}
>>> - pop {r1}
>>> - bx r1
>>> + pop {r4, pc}
>>> 3:
>>> movs r3, #0
>>> cmp r2, #0
>>> diff --git a/newlib/libc/machine/arm/aeabi_memset-thumb.S
>>> b/newlib/libc/machine/arm/aeabi_memset-thumb.S
>>> index aa8f2719..5bb80b20 100644
>>> --- a/newlib/libc/machine/arm/aeabi_memset-thumb.S
>>> +++ b/newlib/libc/machine/arm/aeabi_memset-thumb.S
>>> @@ -110,9 +110,7 @@ __aeabi_memset:
>>> cmp r4, r3
>>> bne 8b
>>> 9:
>>> - pop {r4, r5, r6}
>>> - pop {r1}
>>> - bx r1
>>> + pop {r4, r5, r6, pc}
>>> 10:
>>> movs r3, r0
>>> movs r4, r1
>>>
>>
>> No. That isn't interworking clean on armv4t, which we still need to
>> support.
>>
>> Sorry.
>>
>> R.
>
> However, a patch that tests __ARM_ARCH >=5 and uses your improved
> sequence only in that case (preserving the old code otherwise) would
> probably be OK :-)
>
> R.
Oh sorry then, I wasn't aware of that, thanks for the correction. I
re-made the patch such that it now checks for __ARM_ARCH, as per your
suggestion. Does it look better?
Thanks,
Christos
---
newlib/libc/machine/arm/aeabi_memmove-thumb.S | 4 ++++
newlib/libc/machine/arm/aeabi_memset-thumb.S | 4 ++++
2 files changed, 8 insertions(+)
diff --git a/newlib/libc/machine/arm/aeabi_memmove-thumb.S b/newlib/libc/machine/arm/aeabi_memmove-thumb.S
index 61a72581..465a5a19 100644
--- a/newlib/libc/machine/arm/aeabi_memmove-thumb.S
+++ b/newlib/libc/machine/arm/aeabi_memmove-thumb.S
@@ -49,9 +49,13 @@ __aeabi_memmove:
subs r3, r3, #1
bcs 1b
2:
+#if __ARM_ARCH >= 5
+ pop {r4, pc}
+#else
pop {r4}
pop {r1}
bx r1
+#endif
3:
movs r3, #0
cmp r2, #0
diff --git a/newlib/libc/machine/arm/aeabi_memset-thumb.S b/newlib/libc/machine/arm/aeabi_memset-thumb.S
index aa8f2719..52094a7b 100644
--- a/newlib/libc/machine/arm/aeabi_memset-thumb.S
+++ b/newlib/libc/machine/arm/aeabi_memset-thumb.S
@@ -110,9 +110,13 @@ __aeabi_memset:
cmp r4, r3
bne 8b
9:
+#if __ARM_ARCH >= 5
+ pop {r4, r5, r6, pc}
+#else
pop {r4, r5, r6}
pop {r1}
bx r1
+#endif
10:
movs r3, r0
movs r4, r1
--
2.23.0
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH] Optimize epilogue in thumb __aeabi_{memmove,memset} implementations
2019-09-30 17:17 ` Christos Gentsos
@ 2019-10-07 14:31 ` Richard Earnshaw (lists)
0 siblings, 0 replies; 5+ messages in thread
From: Richard Earnshaw (lists) @ 2019-10-07 14:31 UTC (permalink / raw)
To: Christos Gentsos, newlib
Thanks. I tweaked this slightly to include the compatibility header
acle-compat.h and put this in.
R.
On 30/09/2019 18:16, Christos Gentsos wrote:
> On Mon, Sep 30 2019 at 16:37:26 +0100, Richard Earnshaw (lists) wrote:
>> On 30/09/2019 16:31, Richard Earnshaw (lists) wrote:
>>> On 30/09/2019 15:55, Christos Gentsos wrote:
>>>> The same pop instruction that is used to restore registers can be used
>>>> to return from the function (as it is already done in other function
>>>> implementations).
>>>> ---
>>>> Â newlib/libc/machine/arm/aeabi_memmove-thumb.S | 4 +---
>>>> Â newlib/libc/machine/arm/aeabi_memset-thumb.SÂ | 4 +---
>>>> Â 2 files changed, 2 insertions(+), 6 deletions(-)
>>>>
>>>> diff --git a/newlib/libc/machine/arm/aeabi_memmove-thumb.S
>>>> b/newlib/libc/machine/arm/aeabi_memmove-thumb.S
>>>> index 61a72581..a0aad852 100644
>>>> --- a/newlib/libc/machine/arm/aeabi_memmove-thumb.S
>>>> +++ b/newlib/libc/machine/arm/aeabi_memmove-thumb.S
>>>> @@ -49,9 +49,7 @@ __aeabi_memmove:
>>>>      subs   r3, r3, #1
>>>>      bcs   1b
>>>> Â 2:
>>>> -   pop   {r4}
>>>> -   pop   {r1}
>>>> -   bx   r1
>>>> +   pop   {r4, pc}
>>>> Â 3:
>>>>      movs   r3, #0
>>>>      cmp   r2, #0
>>>> diff --git a/newlib/libc/machine/arm/aeabi_memset-thumb.S
>>>> b/newlib/libc/machine/arm/aeabi_memset-thumb.S
>>>> index aa8f2719..5bb80b20 100644
>>>> --- a/newlib/libc/machine/arm/aeabi_memset-thumb.S
>>>> +++ b/newlib/libc/machine/arm/aeabi_memset-thumb.S
>>>> @@ -110,9 +110,7 @@ __aeabi_memset:
>>>>      cmp   r4, r3
>>>>      bne   8b
>>>> Â 9:
>>>> -   pop   {r4, r5, r6}
>>>> -   pop   {r1}
>>>> -   bx   r1
>>>> +   pop   {r4, r5, r6, pc}
>>>> Â 10:
>>>>      movs   r3, r0
>>>>      movs   r4, r1
>>>>
>>>
>>> No. That isn't interworking clean on armv4t, which we still need to
>>> support.
>>>
>>> Sorry.
>>>
>>> R.
>>
>> However, a patch that tests __ARM_ARCH >=5 and uses your improved
>> sequence only in that case (preserving the old code otherwise) would
>> probably be OK :-)
>>
>> R.
>
> Oh sorry then, I wasn't aware of that, thanks for the correction. I
> re-made the patch such that it now checks for __ARM_ARCH, as per your
> suggestion. Does it look better?
>
> Thanks,
> Christos
> ---
> newlib/libc/machine/arm/aeabi_memmove-thumb.S | 4 ++++
> newlib/libc/machine/arm/aeabi_memset-thumb.S | 4 ++++
> 2 files changed, 8 insertions(+)
>
> diff --git a/newlib/libc/machine/arm/aeabi_memmove-thumb.S b/newlib/libc/machine/arm/aeabi_memmove-thumb.S
> index 61a72581..465a5a19 100644
> --- a/newlib/libc/machine/arm/aeabi_memmove-thumb.S
> +++ b/newlib/libc/machine/arm/aeabi_memmove-thumb.S
> @@ -49,9 +49,13 @@ __aeabi_memmove:
> subs r3, r3, #1
> bcs 1b
> 2:
> +#if __ARM_ARCH >= 5
> + pop {r4, pc}
> +#else
> pop {r4}
> pop {r1}
> bx r1
> +#endif
> 3:
> movs r3, #0
> cmp r2, #0
> diff --git a/newlib/libc/machine/arm/aeabi_memset-thumb.S b/newlib/libc/machine/arm/aeabi_memset-thumb.S
> index aa8f2719..52094a7b 100644
> --- a/newlib/libc/machine/arm/aeabi_memset-thumb.S
> +++ b/newlib/libc/machine/arm/aeabi_memset-thumb.S
> @@ -110,9 +110,13 @@ __aeabi_memset:
> cmp r4, r3
> bne 8b
> 9:
> +#if __ARM_ARCH >= 5
> + pop {r4, r5, r6, pc}
> +#else
> pop {r4, r5, r6}
> pop {r1}
> bx r1
> +#endif
> 10:
> movs r3, r0
> movs r4, r1
>
^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2019-10-07 14:31 UTC | newest]
Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-09-30 14:55 [PATCH] Optimize epilogue in thumb __aeabi_{memmove,memset} implementations Christos Gentsos
2019-09-30 15:31 ` Richard Earnshaw (lists)
2019-09-30 15:37 ` Richard Earnshaw (lists)
2019-09-30 17:17 ` Christos Gentsos
2019-10-07 14:31 ` Richard Earnshaw (lists)
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).