* [PATCH, ARM, v2] Improve 64 bit division performance
@ 2014-05-22 10:08 Charles Baylis
2014-06-11 9:30 ` Charles Baylis
0 siblings, 1 reply; 22+ messages in thread
From: Charles Baylis @ 2014-05-22 10:08 UTC
To: Richard Earnshaw; +Cc: GCC Patches, Ramana Radhakrishnan
[-- Attachment #1: Type: text/plain, Size: 7908 bytes --]
On 1 May 2014 16:41, Richard Earnshaw <rearnsha@arm.com> wrote:
> I think really, you've got three independent changes here:
Version 2 of this patch is now a 9-patch series which addresses most of
the following. The exceptions are discussed below.
> 1) Optimize the prologue/epilogue sequences when ldrd is available.
> 2) Replace the call to __gnu_ldivmod_helper with __udivmoddi4
I assume you mean __gnu_uldivmod_helper here, as __gnu_ldivmod_helper
performs signed division and can't be directly replaced with the
unsigned division performed by __udivmoddi4.
> 3) Optimize the code to __aeabi_ldivmod.
Converting to call __udivmoddi4, fixing up the signedness of the operands
and results, and optimising the code are all one change.
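For illustration only (this is not code from the patch series), the sign
handling can be sketched in C. udivmod64 is a hypothetical stand-in for
__udivmoddi4 and sdivmod64 is an invented name; the sketch assumes the
caller has already rejected a zero divisor, as test_div_by_zero does in
the assembly.

```c
#include <stdint.h>

/* Hypothetical stand-in for __udivmoddi4: returns the unsigned quotient
   and stores the unsigned remainder through rp.  */
static uint64_t
udivmod64 (uint64_t n, uint64_t d, uint64_t *rp)
{
  *rp = n % d;
  return n / d;
}

/* Signed divide built on the unsigned one.  Negative operands are negated
   before the call; afterwards the quotient is negated when the operand
   signs differ and the remainder takes the sign of the numerator.  */
static int64_t
sdivmod64 (int64_t n, int64_t d, int64_t *rp)
{
  uint64_t un = (uint64_t) n;
  uint64_t ud = (uint64_t) d;
  uint64_t uq, ur;
  int neg_n = n < 0;
  int neg_d = d < 0;

  if (neg_n)
    un = 0 - un;	/* unsigned negation wraps, so this is well defined */
  if (neg_d)
    ud = 0 - ud;

  uq = udivmod64 (un, ud, &ur);

  if (neg_n != neg_d)
    uq = 0 - uq;
  if (neg_n)
    ur = 0 - ur;

  *rp = (int64_t) ur;
  return (int64_t) uq;
}
```

The four branches in the assembly correspond to the four combinations of
neg_n and neg_d here; the assembly simply unrolls them so that each path
performs only the negations it needs.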
> Ideally, therefore, this is a three patch series, but it's then missing
> a few bits.
>
> 4) Step 2 can also be trivially applied to bpabi-v6m.S as well, since
> it's a direct swap of one function for another (unless I've misread the
> changes, I think the ABI of the two helper functions are the same).
For __aeabi_uldivmod this is true. For __aeabi_ldivmod this is not
trivial as the signedness fix-ups must be written.
> 5) Step 4 then makes __gnu_ldivmod_helper in bpabi.c a dead function
> which can be deleted. This is good because currently pulling in either
> 64-bit division function causes both these helper functions to be pulled
> in and thus the whole of the 64-bit div-mod code for both signed and
> unsigned values. That's particularly unfortunate for ARMv6m class
> devices as that's potentially a lot of redundant code.
Similarly, this applies to __gnu_uldivmod_helper, not __gnu_ldivmod_helper.
I've included two patches which do the trivial steps for the unsigned case.
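To show why the unsigned swap is a direct one, here is a rough C sketch
of the two interfaces side by side. Both function names are invented for
illustration. The first mirrors the shape of the helper removed in patch
9; the second is a plain shift-and-subtract loop, which is only an
assumption about how a combined divide could look and is not libgcc's
actual __udivmoddi4.

```c
#include <stdint.h>

/* Shape of the removed helper: divide first, then rebuild the remainder
   with an extra multiply and subtract.  The '/' stands in for the call
   to __udivdi3.  */
static uint64_t
uldivmod_helper_like (uint64_t a, uint64_t b, uint64_t *remainder)
{
  uint64_t quotient = a / b;

  *remainder = a - b * quotient;
  return quotient;
}

/* Same three arguments and same return value, but quotient and remainder
   fall out of a single pass.  Before each shift r is at most
   n >> (i + 1), which is below 2^63, so r << 1 cannot overflow.  */
static uint64_t
udivmoddi4_like (uint64_t n, uint64_t d, uint64_t *rp)
{
  uint64_t q = 0;
  uint64_t r = 0;
  int i;

  for (i = 63; i >= 0; i--)
    {
      r = (r << 1) | ((n >> i) & 1);
      if (r >= d)
	{
	  r -= d;
	  q |= (uint64_t) 1 << i;
	}
    }

  *rp = r;
  return q;
}
```

Because the argument order and the result are identical, the caller's
register and stack set-up does not change when one is swapped for the
other.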
>
> Finally, I know this was the original code, but the complete lack of
> comments in this code made reviewing even the trivial parts a complete
> nightmare -- it took me half an hour before I remembered that
> __udivmoddi4 took three parameters, the third of which was on the stack:
> thus the messing around with sp/ip in the prologue wasn't just trivial
> padding but a necessary part of the function. Please could you add, at
> least some short comments clarifying the register disposition on input
> and what that prologue code is up to...
Done.
> Finally, how was this code tested?
It has been built and "make check" has been run with no regressions on:
arm-unknown-linux-gnueabihf --with-mode=thumb --with-arch=armv7-a
arm-unknown-linux-gnueabihf --with-mode=arm --with-arch=armv7-a
arm-unknown-linux-gnueabi --with-mode=arm --with-arch=armv5te
arm-unknown-linux-gnueabi --with-mode=arm --with-arch=armv4t
I have also run a simple test harness which checks the result of
several 64 bit division operations where gcc has been built with the
above configurations.
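The harness itself is not attached. A minimal sketch of that kind of
check, assuming nothing about the real one (check_sdivmod, run_checks and
the value table are all invented here), could look like this. On an ARM
EABI target the 64 bit '/' and '%' operators are expected to end up in
__aeabi_ldivmod; the volatile copies are there so the compiler does not
fold the divisions away.

```c
#include <stdint.h>

/* Return 1 if n / d and n % d satisfy the usual identities, 0 if not.
   Division by zero and INT64_MIN / -1 are skipped because they are
   undefined in C.  */
static int
check_sdivmod (int64_t n, int64_t d)
{
  volatile int64_t vn = n;
  volatile int64_t vd = d;
  int64_t q, r;
  uint64_t ar, ad;

  if (d == 0 || (n == INT64_MIN && d == -1))
    return 1;

  q = vn / vd;
  r = vn % vd;

  /* n == q * d + r, computed in unsigned arithmetic to avoid overflow.  */
  if ((uint64_t) q * (uint64_t) d + (uint64_t) r != (uint64_t) n)
    return 0;

  /* A non-zero remainder has the sign of the numerator.  */
  if (r != 0 && ((r < 0) != (n < 0)))
    return 0;

  /* |r| < |d|.  */
  ar = r < 0 ? 0 - (uint64_t) r : (uint64_t) r;
  ad = d < 0 ? 0 - (uint64_t) d : (uint64_t) d;
  return ar < ad;
}

/* Run every pair from a small table of edge values; return the number
   of failing pairs.  */
static int
run_checks (void)
{
  static const int64_t v[] = {
    0, 1, -1, 2, -2, 7, -7,
    0x100000000LL, -0x100000000LL, INT64_MAX, INT64_MIN
  };
  const int count = (int) (sizeof v / sizeof v[0]);
  int failures = 0;
  int i, j;

  for (i = 0; i < count; i++)
    for (j = 0; j < count; j++)
      if (!check_sdivmod (v[i], v[j]))
	failures++;

  return failures;
}
```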
I am not currently set up with a way to test v6M, so those parts aren't tested.
> Anyway, some additional comments below:
>
> Don't repeat the function name for multiple tweaks to the same function;
> as mentioned above, if these are really separate changes they should be
> in separate submissions. Mixing unrelated changes just makes the
> reviewing step that much harder.
Done.
>> + strd ip,lr, [sp, #-16]!
>
> Space after comma.
Done
> Also, since you've essentially rewritten the entire function, can you
> please also reformat them to follow the coding style of the rest of the
> file: namely "<tab>OP<tab>operands".
Done
>> #else
>> + sub sp, sp, #8
>> do_push {sp, lr}
>> #endif
>
> Please add a comment that the value at *sp is the address of the
> slot for the remainder.
Done
>> +#if defined(__thumb2__) && CAN_USE_LDRD
>> + sub ip, sp, #8
>> + strd ip,lr, [sp, #-16]!
>
> Space after comma.
Done
>> #else
>> + sub sp, sp, #8
>> do_push {sp, lr}
>> #endif
>> + cmp xxh, #0
>> + blt 1f
>> + cmp yyh, #0
>> + blt 2f
>> +
>> +98: cfi_push 98b - __aeabi_ldivmod, 0xe, -0xc, 0x10
>
> The CFI push should really precede your conditional tests, it relates to
> the do_push expression.
Done.
>> + bl SYM(__udivmoddi4) __PLT__
>> + ldr lr, [sp, #4]
>> +#if CAN_USE_LDRD
>> + ldrd r2, r3, [sp, #8]
>> + add sp, sp, #16
>> +#else
>> + add sp, sp, #8
>> + do_pop {r2, r3}
>> +#endif
>
> You're missing a CFI pop, which is needed when the values on the stack
> go out of scope.
The existing code doesn't do this. Since there are multiple exit
points from the optimised function the existing cfi_* macros aren't
sufficient (there is no cfi_save_state/cfi_restore_state), so I have
included a patch which uses the gas .cfi_* directives. This may be
interesting on non-DWARF or non-ELF platforms, if any are still
supported.
>> + RET
>> +1: /* xxh:xxl is negative */
>> + rsbs xxl, xxl, #0
>
> We're using unified syntax, so NEGS is preferable.
Done
>> + sbc xxh, xxh, xxh, lsl #1
>
> Worthy of a comment, Thumb2 has no RSC instruction, so use X - 2X.
Done
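For anyone checking the arithmetic, the NEGS/SBC pair can be modelled in
C as below. This is only a sketch: neg64_via_halves is an invented name,
and the carry handling follows my reading of the ARM flag rules (after a
subtract, C is set when there was no borrow).

```c
#include <stdint.h>

/* Negate the 64 bit value hi:lo using 32 bit operations only.  */
static uint64_t
neg64_via_halves (uint32_t lo, uint32_t hi)
{
  /* negs lo, lo: lo = 0 - lo.  There is no borrow only when lo is 0,
     so the carry flag is set exactly in that case.  */
  uint32_t carry = (lo == 0);
  uint32_t nlo = 0u - lo;

  /* sbc hi, hi, hi, lsl #1: hi = hi - 2 * hi - (1 - C).
     hi - 2 * hi is -hi, which gives the effect of RSC without needing
     that instruction.  */
  uint32_t nhi = hi - (hi << 1) - (1u - carry);

  return ((uint64_t) nhi << 32) | nlo;
}
```

For example, lo = 1 and hi = 0 gives nlo = 0xffffffff with the carry
clear, so nhi = 0 - 0 - 1 = 0xffffffff and the result is -1 as expected.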
>> + cmp yyh, #0
>> + blt 3f
>> +98: cfi_push 98b - __aeabi_ldivmod, 0xe, -0xc, 0x10
>
> This CFI push looks wrong. You've already pushed things earlier. On
> the other hand, you should save the state before the CFI pop above, so
> that you can restore the state again for the next (ie this) block of code.
Done (see above)
>> +98: cfi_push 98b - __aeabi_ldivmod, 0xe, -0xc, 0x10
>> + bl SYM(__udivmoddi4) __PLT__
>> + ldr lr, [sp, #4]
>> +#if CAN_USE_LDRD
>> + ldrd r2, r3, [sp, #8]
>> + add sp, sp, #16
>> +#else
>> + add sp, sp, #8
>> + do_pop {r2, r3}
>> +#endif
>> + rsbs yyl, yyl, #0
>> + sbc yyh, yyh, yyh, lsl #1
>> + RET
>>
>> #endif /* L_aeabi_ldivmod */
>>
>
> You use the LDRD vs do_pop sequence identically several times. To avoid
> a lot of ifdefs, it might be worth considering a macro for this idiom to
> reduce the overall amount of conditionalized code.
Done.
The updated patch series is attached. Hopefully, patches 1 through 6
are now ready. Patches 7 through 9 can be dropped if necessary.
0001-Whitespace.patch
2014-05-22 Charles Baylis <charles.baylis@linaro.org>
* config/arm/bpabi.S (__aeabi_uldivmod): Fix whitespace.
(__aeabi_ldivmod): Fix whitespace.
0002-Add-comments.patch
2014-05-22 Charles Baylis <charles.baylis@linaro.org>
* config/arm/bpabi.S (__aeabi_uldivmod, __aeabi_ldivmod): Add comment
describing register usage on function entry and exit.
0003-Optimise-__aeabi_uldivmod-stack-manipulation.patch
2014-05-22 Charles Baylis <charles.baylis@linaro.org>
* config/arm/bpabi.S (__aeabi_uldivmod): Optimise stack pointer
manipulation.
0004-Optimise-__aeabi_uldivmod.patch
2014-05-22 Charles Baylis <charles.baylis@linaro.org>
* config/arm/bpabi.S (__aeabi_uldivmod): Perform division using call
to __udivmoddi4.
0005-Optimise-__aeabi_ldivmod-stack-manipulation.patch
2014-05-22 Charles Baylis <charles.baylis@linaro.org>
* config/arm/bpabi.S (__aeabi_ldivmod): Optimise stack manipulation.
0006-Optimise-__aeabi_ldivmod.patch
2014-05-22 Charles Baylis <charles.baylis@linaro.org>
* config/arm/bpabi.S (__aeabi_ldivmod): Perform division using
__udivmoddi4, and fixups for negative operands.
0007-Fix-cfi-annotations.patch
2014-05-22 Charles Baylis <charles.baylis@linaro.org>
* config/arm/bpabi.S (__aeabi_ldivmod, __aeabi_uldivmod,
push_for_divide, pop_for_divide): Use .cfi_* directives for DWARF
annotations. Fix DWARF information.
0008-Use-__udivmoddi4-for-v6M-aeabi_uldivmod.patch
2014-05-22 Charles Baylis <charles.baylis@linaro.org>
* config/arm/bpabi-v6m.S (__aeabi_uldivmod): Perform division using
__udivmoddi4.
0009-Remove-__gnu_uldivmod_helper.patch
2014-05-22 Charles Baylis <charles.baylis@linaro.org>
* config/arm/bpabi.c (__gnu_uldivmod_helper): Remove.
[-- Attachment #2: 0001-Whitespace.patch --]
[-- Type: text/x-patch, Size: 1935 bytes --]
From d1453a43717939baf4f96a2c1ddb8b775f592c91 Mon Sep 17 00:00:00 2001
From: Charles Baylis <charles.baylis@linaro.org>
Date: Mon, 12 May 2014 17:39:18 +0100
Subject: [PATCH 1/9] Whitespace
2014-05-12 Charles Baylis <charles.baylis@linaro.org>
* config/arm/bpabi.S (__aeabi_uldivmod): Fix whitespace.
(__aeabi_ldivmod): Fix whitespace.
---
libgcc/config/arm/bpabi.S | 36 ++++++++++++++++++------------------
1 file changed, 18 insertions(+), 18 deletions(-)
diff --git a/libgcc/config/arm/bpabi.S b/libgcc/config/arm/bpabi.S
index 7772301..f47d715 100644
--- a/libgcc/config/arm/bpabi.S
+++ b/libgcc/config/arm/bpabi.S
@@ -124,20 +124,20 @@ ARM_FUNC_START aeabi_ulcmp
ARM_FUNC_START aeabi_ldivmod
cfi_start __aeabi_ldivmod, LSYM(Lend_aeabi_ldivmod)
- test_div_by_zero signed
+ test_div_by_zero signed
- sub sp, sp, #8
+ sub sp, sp, #8
#if defined(__thumb2__)
- mov ip, sp
- push {ip, lr}
+ mov ip, sp
+ push {ip, lr}
#else
- do_push {sp, lr}
+ do_push {sp, lr}
#endif
98: cfi_push 98b - __aeabi_ldivmod, 0xe, -0xc, 0x10
- bl SYM(__gnu_ldivmod_helper) __PLT__
- ldr lr, [sp, #4]
- add sp, sp, #8
- do_pop {r2, r3}
+ bl SYM(__gnu_ldivmod_helper) __PLT__
+ ldr lr, [sp, #4]
+ add sp, sp, #8
+ do_pop {r2, r3}
RET
cfi_end LSYM(Lend_aeabi_ldivmod)
@@ -147,20 +147,20 @@ ARM_FUNC_START aeabi_ldivmod
ARM_FUNC_START aeabi_uldivmod
cfi_start __aeabi_uldivmod, LSYM(Lend_aeabi_uldivmod)
- test_div_by_zero unsigned
+ test_div_by_zero unsigned
- sub sp, sp, #8
+ sub sp, sp, #8
#if defined(__thumb2__)
- mov ip, sp
- push {ip, lr}
+ mov ip, sp
+ push {ip, lr}
#else
- do_push {sp, lr}
+ do_push {sp, lr}
#endif
98: cfi_push 98b - __aeabi_uldivmod, 0xe, -0xc, 0x10
- bl SYM(__gnu_uldivmod_helper) __PLT__
- ldr lr, [sp, #4]
- add sp, sp, #8
- do_pop {r2, r3}
+ bl SYM(__gnu_uldivmod_helper) __PLT__
+ ldr lr, [sp, #4]
+ add sp, sp, #8
+ do_pop {r2, r3}
RET
cfi_end LSYM(Lend_aeabi_uldivmod)
--
1.9.1
[-- Attachment #3: 0002-Add-comments.patch --]
[-- Type: text/x-patch, Size: 1294 bytes --]
From 8d420bee9b818cd51d836aab25be7e92f11afe15 Mon Sep 17 00:00:00 2001
From: Charles Baylis <charles.baylis@linaro.org>
Date: Tue, 13 May 2014 13:10:58 +0100
Subject: [PATCH 2/9] Add comments
2014-05-12 Charles Baylis <charles.baylis@linaro.org>
* config/arm/bpabi.S (__aeabi_uldivmod, __aeabi_ldivmod): Add comment
describing register usage on function entry and exit.
---
libgcc/config/arm/bpabi.S | 16 ++++++++++++++++
1 file changed, 16 insertions(+)
diff --git a/libgcc/config/arm/bpabi.S b/libgcc/config/arm/bpabi.S
index f47d715..ae76cd3 100644
--- a/libgcc/config/arm/bpabi.S
+++ b/libgcc/config/arm/bpabi.S
@@ -122,6 +122,14 @@ ARM_FUNC_START aeabi_ulcmp
#ifdef L_aeabi_ldivmod
+/* Perform 64 bit signed division.
+ Inputs:
+ r0:r1 numerator
+ r2:r3 denominator
+ Outputs:
+ r0:r1 quotient
+ r2:r3 remainder
+ */
ARM_FUNC_START aeabi_ldivmod
cfi_start __aeabi_ldivmod, LSYM(Lend_aeabi_ldivmod)
test_div_by_zero signed
@@ -145,6 +153,14 @@ ARM_FUNC_START aeabi_ldivmod
#ifdef L_aeabi_uldivmod
+/* Perform 64 bit unsigned division.
+ Inputs:
+ r0:r1 numerator
+ r2:r3 denominator
+ Outputs:
+ r0:r1 quotient
+ r2:r3 remainder
+ */
ARM_FUNC_START aeabi_uldivmod
cfi_start __aeabi_uldivmod, LSYM(Lend_aeabi_uldivmod)
test_div_by_zero unsigned
--
1.9.1
[-- Attachment #4: 0003-Optimise-__aeabi_uldivmod-stack-manipulation.patch --]
[-- Type: text/x-patch, Size: 2346 bytes --]
From 8e996cd947311966665c13d46b4ee75c145d31f5 Mon Sep 17 00:00:00 2001
From: Charles Baylis <charles.baylis@linaro.org>
Date: Mon, 12 May 2014 17:40:36 +0100
Subject: [PATCH 3/9] Optimise __aeabi_uldivmod (stack manipulation)
2014-05-12 Charles Baylis <charles.baylis@linaro.org>
* config/arm/bpabi.S (__aeabi_uldivmod): Optimise stack pointer
manipulation.
---
libgcc/config/arm/bpabi.S | 54 +++++++++++++++++++++++++++++++++++++----------
1 file changed, 43 insertions(+), 11 deletions(-)
diff --git a/libgcc/config/arm/bpabi.S b/libgcc/config/arm/bpabi.S
index ae76cd3..67246b0 100644
--- a/libgcc/config/arm/bpabi.S
+++ b/libgcc/config/arm/bpabi.S
@@ -120,6 +120,46 @@ ARM_FUNC_START aeabi_ulcmp
#endif
.endm
+/* we can use STRD/LDRD on v5TE and later, and any Thumb-2 architecture. */
+#if (defined(__ARM_EABI__) \
+ && (defined(__thumb2__) \
+ || (__ARM_ARCH >= 5 && defined(__TARGET_FEATURE_DSP))))
+#define CAN_USE_LDRD 1
+#else
+#define CAN_USE_LDRD 0
+#endif
+
+/* set up stack frame for call to __udivmoddi4. At the end of the macro the
+ stack is arranged as follows:
+ sp+12 / space for remainder
+ sp+8 \ (written by __udivmoddi4)
+ sp+4 lr
+ sp+0 sp+8 [rp (remainder pointer) argument for __udivmoddi4]
+
+ */
+.macro push_for_divide fname
+#if defined(__thumb2__) && CAN_USE_LDRD
+ sub ip, sp, #8
+ strd ip, lr, [sp, #-16]!
+#else
+ sub sp, sp, #8
+ do_push {sp, lr}
+#endif
+98: cfi_push 98b - \fname, 0xe, -0xc, 0x10
+.endm
+
+/* restore stack */
+.macro pop_for_divide
+ ldr lr, [sp, #4]
+#if CAN_USE_LDRD
+ ldrd r2, r3, [sp, #8]
+ add sp, sp, #16
+#else
+ add sp, sp, #8
+ do_pop {r2, r3}
+#endif
+.endm
+
#ifdef L_aeabi_ldivmod
/* Perform 64 bit signed division.
@@ -165,18 +205,10 @@ ARM_FUNC_START aeabi_uldivmod
cfi_start __aeabi_uldivmod, LSYM(Lend_aeabi_uldivmod)
test_div_by_zero unsigned
- sub sp, sp, #8
-#if defined(__thumb2__)
- mov ip, sp
- push {ip, lr}
-#else
- do_push {sp, lr}
-#endif
-98: cfi_push 98b - __aeabi_uldivmod, 0xe, -0xc, 0x10
+ push_for_divide __aeabi_uldivmod
+ /* arguments in (r0:r1), (r2:r3) and *sp */
bl SYM(__gnu_uldivmod_helper) __PLT__
- ldr lr, [sp, #4]
- add sp, sp, #8
- do_pop {r2, r3}
+ pop_for_divide
RET
cfi_end LSYM(Lend_aeabi_uldivmod)
--
1.9.1
[-- Attachment #5: 0004-Optimise-__aeabi_uldivmod.patch --]
[-- Type: text/x-patch, Size: 886 bytes --]
From 06da1a0fb2ccc21d41bc83cfcd4db99fca45dd44 Mon Sep 17 00:00:00 2001
From: Charles Baylis <charles.baylis@linaro.org>
Date: Wed, 14 May 2014 14:15:58 +0100
Subject: [PATCH 4/9] Optimise __aeabi_uldivmod
2014-05-12 Charles Baylis <charles.baylis@linaro.org>
* config/arm/bpabi.S (__aeabi_uldivmod): Perform division using call
to __udivmoddi4.
---
libgcc/config/arm/bpabi.S | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/libgcc/config/arm/bpabi.S b/libgcc/config/arm/bpabi.S
index 67246b0..927e37f 100644
--- a/libgcc/config/arm/bpabi.S
+++ b/libgcc/config/arm/bpabi.S
@@ -207,7 +207,7 @@ ARM_FUNC_START aeabi_uldivmod
push_for_divide __aeabi_uldivmod
/* arguments in (r0:r1), (r2:r3) and *sp */
- bl SYM(__gnu_uldivmod_helper) __PLT__
+ bl SYM(__udivmoddi4) __PLT__
pop_for_divide
RET
cfi_end LSYM(Lend_aeabi_uldivmod)
--
1.9.1
[-- Attachment #6: 0005-Optimise-__aeabi_ldivmod-stack-manipulation.patch --]
[-- Type: text/x-patch, Size: 1152 bytes --]
From 6cfc99a2d765346b77c3d6d8c29e958d877e400c Mon Sep 17 00:00:00 2001
From: Charles Baylis <charles.baylis@linaro.org>
Date: Mon, 12 May 2014 17:41:44 +0100
Subject: [PATCH 5/9] Optimise __aeabi_ldivmod (stack manipulation)
2014-05-12 Charles Baylis <charles.baylis@linaro.org>
* config/arm/bpabi.S (__aeabi_ldivmod): Optimise stack manipulation.
---
libgcc/config/arm/bpabi.S | 14 +++-----------
1 file changed, 3 insertions(+), 11 deletions(-)
diff --git a/libgcc/config/arm/bpabi.S b/libgcc/config/arm/bpabi.S
index 927e37f..3f9ece5 100644
--- a/libgcc/config/arm/bpabi.S
+++ b/libgcc/config/arm/bpabi.S
@@ -174,18 +174,10 @@ ARM_FUNC_START aeabi_ldivmod
cfi_start __aeabi_ldivmod, LSYM(Lend_aeabi_ldivmod)
test_div_by_zero signed
- sub sp, sp, #8
-#if defined(__thumb2__)
- mov ip, sp
- push {ip, lr}
-#else
- do_push {sp, lr}
-#endif
-98: cfi_push 98b - __aeabi_ldivmod, 0xe, -0xc, 0x10
+ push_for_divide __aeabi_ldivmod
+ /* arguments in (r0:r1), (r2:r3) and *sp */
bl SYM(__gnu_ldivmod_helper) __PLT__
- ldr lr, [sp, #4]
- add sp, sp, #8
- do_pop {r2, r3}
+ pop_for_divide
RET
cfi_end LSYM(Lend_aeabi_ldivmod)
--
1.9.1
[-- Attachment #7: 0006-Optimise-__aeabi_ldivmod.patch --]
[-- Type: text/x-patch, Size: 2073 bytes --]
From 6987775f7523248144c84d036cbdc3ae0ba6b5da Mon Sep 17 00:00:00 2001
From: Charles Baylis <charles.baylis@linaro.org>
Date: Wed, 14 May 2014 14:17:30 +0100
Subject: [PATCH 6/9] Optimise __aeabi_ldivmod
2014-05-12 Charles Baylis <charles.baylis@linaro.org>
* config/arm/bpabi.S (__aeabi_ldivmod): Perform division using
__udivmoddi4, and fixups for negative operands.
---
libgcc/config/arm/bpabi.S | 41 ++++++++++++++++++++++++++++++++++++++++-
1 file changed, 40 insertions(+), 1 deletion(-)
diff --git a/libgcc/config/arm/bpabi.S b/libgcc/config/arm/bpabi.S
index 3f9ece5..c044167 100644
--- a/libgcc/config/arm/bpabi.S
+++ b/libgcc/config/arm/bpabi.S
@@ -175,10 +175,49 @@ ARM_FUNC_START aeabi_ldivmod
test_div_by_zero signed
push_for_divide __aeabi_ldivmod
+ cmp xxh, #0
+ blt 1f
+ cmp yyh, #0
+ blt 2f
+ /* arguments in (r0:r1), (r2:r3) and *sp */
+ bl SYM(__udivmoddi4) __PLT__
+ pop_for_divide
+ RET
+
+1: /* xxh:xxl is negative */
+ negs xxl, xxl
+ sbc xxh, xxh, xxh, lsl #1 /* Thumb-2 has no RSC, so use X - 2X */
+ cmp yyh, #0
+ blt 3f
+ /* arguments in (r0:r1), (r2:r3) and *sp */
+ bl SYM(__udivmoddi4) __PLT__
+ pop_for_divide
+ negs xxl, xxl
+ sbc xxh, xxh, xxh, lsl #1 /* Thumb-2 has no RSC, so use X - 2X */
+ negs yyl, yyl
+ sbc yyh, yyh, yyh, lsl #1 /* Thumb-2 has no RSC, so use X - 2X */
+ RET
+
+2: /* only yyh:yyl is negative */
+ negs yyl, yyl
+ sbc yyh, yyh, yyh, lsl #1 /* Thumb-2 has no RSC, so use X - 2X */
+ /* arguments in (r0:r1), (r2:r3) and *sp */
+ bl SYM(__udivmoddi4) __PLT__
+ pop_for_divide
+ negs xxl, xxl
+ sbc xxh, xxh, xxh, lsl #1 /* Thumb-2 has no RSC, so use X - 2X */
+ RET
+
+3: /* both xxh:xxl and yyh:yyl are negative */
+ negs yyl, yyl
+ sbc yyh, yyh, yyh, lsl #1 /* Thumb-2 has no RSC, so use X - 2X */
/* arguments in (r0:r1), (r2:r3) and *sp */
- bl SYM(__gnu_ldivmod_helper) __PLT__
+ bl SYM(__udivmoddi4) __PLT__
pop_for_divide
+ negs yyl, yyl
+ sbc yyh, yyh, yyh, lsl #1 /* Thumb-2 has no RSC, so use X - 2X */
RET
+
cfi_end LSYM(Lend_aeabi_ldivmod)
#endif /* L_aeabi_ldivmod */
--
1.9.1
[-- Attachment #8: 0007-Fix-cfi-annotations.patch --]
[-- Type: text/x-patch, Size: 3421 bytes --]
From d2be16c4cb112fe32100f4c4c250fa5acd72c777 Mon Sep 17 00:00:00 2001
From: Charles Baylis <charles.baylis@linaro.org>
Date: Wed, 14 May 2014 15:42:19 +0100
Subject: [PATCH 7/9] Fix cfi annotations
2014-05-12 Charles Baylis <charles.baylis@linaro.org>
* config/arm/bpabi.S (__aeabi_ldivmod, __aeabi_uldivmod,
push_for_divide, pop_for_divide): Use .cfi_* directives for DWARF
annotations. Fix DWARF information.
---
libgcc/config/arm/bpabi.S | 21 ++++++++++++++++-----
1 file changed, 16 insertions(+), 5 deletions(-)
diff --git a/libgcc/config/arm/bpabi.S b/libgcc/config/arm/bpabi.S
index c044167..959ecb1 100644
--- a/libgcc/config/arm/bpabi.S
+++ b/libgcc/config/arm/bpabi.S
@@ -22,6 +22,8 @@
see the files COPYING3 and COPYING.RUNTIME respectively. If not, see
<http://www.gnu.org/licenses/>. */
+ .cfi_sections .debug_frame
+
#ifdef __ARM_EABI__
/* Some attributes that are common to all routines in this file. */
/* Tag_ABI_align_needed: This code does not require 8-byte
@@ -145,7 +147,8 @@ ARM_FUNC_START aeabi_ulcmp
sub sp, sp, #8
do_push {sp, lr}
#endif
-98: cfi_push 98b - \fname, 0xe, -0xc, 0x10
+ .cfi_adjust_cfa_offset 16
+ .cfi_offset 14, -12
.endm
/* restore stack */
@@ -158,6 +161,8 @@ ARM_FUNC_START aeabi_ulcmp
add sp, sp, #8
do_pop {r2, r3}
#endif
+ .cfi_restore 14
+ .cfi_adjust_cfa_offset 0
.endm
#ifdef L_aeabi_ldivmod
@@ -171,7 +176,7 @@ ARM_FUNC_START aeabi_ulcmp
r2:r3 remainder
*/
ARM_FUNC_START aeabi_ldivmod
- cfi_start __aeabi_ldivmod, LSYM(Lend_aeabi_ldivmod)
+ .cfi_startproc
test_div_by_zero signed
push_for_divide __aeabi_ldivmod
@@ -181,16 +186,19 @@ ARM_FUNC_START aeabi_ldivmod
blt 2f
/* arguments in (r0:r1), (r2:r3) and *sp */
bl SYM(__udivmoddi4) __PLT__
+ .cfi_remember_state
pop_for_divide
RET
1: /* xxh:xxl is negative */
+ .cfi_restore_state
negs xxl, xxl
sbc xxh, xxh, xxh, lsl #1 /* Thumb-2 has no RSC, so use X - 2X */
cmp yyh, #0
blt 3f
/* arguments in (r0:r1), (r2:r3) and *sp */
bl SYM(__udivmoddi4) __PLT__
+ .cfi_remember_state
pop_for_divide
negs xxl, xxl
sbc xxh, xxh, xxh, lsl #1 /* Thumb-2 has no RSC, so use X - 2X */
@@ -199,16 +207,19 @@ ARM_FUNC_START aeabi_ldivmod
RET
2: /* only yyh:yyl is negative */
+ .cfi_restore_state
negs yyl, yyl
sbc yyh, yyh, yyh, lsl #1 /* Thumb-2 has no RSC, so use X - 2X */
/* arguments in (r0:r1), (r2:r3) and *sp */
bl SYM(__udivmoddi4) __PLT__
+ .cfi_remember_state
pop_for_divide
negs xxl, xxl
sbc xxh, xxh, xxh, lsl #1 /* Thumb-2 has no RSC, so use X - 2X */
RET
3: /* both xxh:xxl and yyh:yyl are negative */
+ .cfi_restore_state
negs yyl, yyl
sbc yyh, yyh, yyh, lsl #1 /* Thumb-2 has no RSC, so use X - 2X */
/* arguments in (r0:r1), (r2:r3) and *sp */
@@ -218,7 +229,7 @@ ARM_FUNC_START aeabi_ldivmod
sbc yyh, yyh, yyh, lsl #1 /* Thumb-2 has no RSC, so use X - 2X */
RET
- cfi_end LSYM(Lend_aeabi_ldivmod)
+ .cfi_endproc
#endif /* L_aeabi_ldivmod */
@@ -233,7 +244,7 @@ ARM_FUNC_START aeabi_ldivmod
r2:r3 remainder
*/
ARM_FUNC_START aeabi_uldivmod
- cfi_start __aeabi_uldivmod, LSYM(Lend_aeabi_uldivmod)
+ .cfi_startproc
test_div_by_zero unsigned
push_for_divide __aeabi_uldivmod
@@ -241,7 +252,7 @@ ARM_FUNC_START aeabi_uldivmod
bl SYM(__udivmoddi4) __PLT__
pop_for_divide
RET
- cfi_end LSYM(Lend_aeabi_uldivmod)
+ .cfi_endproc
#endif /* L_aeabi_divmod */
--
1.9.1
[-- Attachment #9: 0008-Use-__udivmoddi4-for-v6M-aeabi_uldivmod.patch --]
[-- Type: text/x-patch, Size: 837 bytes --]
From 314caabcda45861371cfed73b29cf77c652da018 Mon Sep 17 00:00:00 2001
From: Charles Baylis <charles.baylis@linaro.org>
Date: Mon, 12 May 2014 18:51:54 +0100
Subject: [PATCH 8/9] Use __udivmoddi4 for v6M aeabi_uldivmod
2014-05-12 Charles Baylis <charles.baylis@linaro.org>
* config/arm/bpabi-v6m.S (__aeabi_uldivmod): Perform division using
__udivmoddi4.
---
libgcc/config/arm/bpabi-v6m.S | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/libgcc/config/arm/bpabi-v6m.S b/libgcc/config/arm/bpabi-v6m.S
index 0bf2e55..d549fa6 100644
--- a/libgcc/config/arm/bpabi-v6m.S
+++ b/libgcc/config/arm/bpabi-v6m.S
@@ -148,7 +148,7 @@ FUNC_START aeabi_uldivmod
mov r0, sp
push {r0, lr}
ldr r0, [sp, #8]
- bl SYM(__gnu_uldivmod_helper)
+ bl SYM(__udivmoddi4)
ldr r3, [sp, #4]
mov lr, r3
add sp, sp, #8
--
1.9.1
[-- Attachment #10: 0009-Remove-__gnu_uldivmod_helper.patch --]
[-- Type: text/x-patch, Size: 1303 bytes --]
From 540a4016a32953cb37dbe2f7ea1ec3e1cf0bfdcf Mon Sep 17 00:00:00 2001
From: Charles Baylis <charles.baylis@linaro.org>
Date: Mon, 12 May 2014 18:58:04 +0100
Subject: [PATCH 9/9] Remove __gnu_uldivmod_helper
2014-05-12 Charles Baylis <charles.baylis@linaro.org>
* config/arm/bpabi.c (__gnu_uldivmod_helper): Remove.
---
libgcc/config/arm/bpabi.c | 14 --------------
1 file changed, 14 deletions(-)
diff --git a/libgcc/config/arm/bpabi.c b/libgcc/config/arm/bpabi.c
index 7b155cc..e90d044 100644
--- a/libgcc/config/arm/bpabi.c
+++ b/libgcc/config/arm/bpabi.c
@@ -26,9 +26,6 @@ extern long long __divdi3 (long long, long long);
extern unsigned long long __udivdi3 (unsigned long long,
unsigned long long);
extern long long __gnu_ldivmod_helper (long long, long long, long long *);
-extern unsigned long long __gnu_uldivmod_helper (unsigned long long,
- unsigned long long,
- unsigned long long *);
long long
@@ -43,14 +40,3 @@ __gnu_ldivmod_helper (long long a,
return quotient;
}
-unsigned long long
-__gnu_uldivmod_helper (unsigned long long a,
- unsigned long long b,
- unsigned long long *remainder)
-{
- unsigned long long quotient;
-
- quotient = __udivdi3 (a, b);
- *remainder = a - b * quotient;
- return quotient;
-}
--
1.9.1
* Re: [PATCH, ARM, v2] Improve 64 bit division performance
2014-05-22 10:08 [PATCH, ARM, v2] Improve 64 bit division performance Charles Baylis
@ 2014-06-11 9:30 ` Charles Baylis
2014-06-11 9:33 ` Richard Earnshaw
0 siblings, 1 reply; 22+ messages in thread
From: Charles Baylis @ 2014-06-11 9:30 UTC
To: Richard Earnshaw; +Cc: GCC Patches, Ramana Radhakrishnan
ping?
* Re: [PATCH, ARM, v2] Improve 64 bit division performance
2014-06-11 9:30 ` Charles Baylis
@ 2014-06-11 9:33 ` Richard Earnshaw
2014-06-11 10:20 ` [PATCH 1/9] Whitespace Charles Baylis
0 siblings, 1 reply; 22+ messages in thread
From: Richard Earnshaw @ 2014-06-11 9:33 UTC (permalink / raw)
To: Charles Baylis; +Cc: GCC Patches, Ramana Radhakrishnan
On 11/06/14 10:30, Charles Baylis wrote:
> ping?
>
Sorry, can you resend this as a series of mails, with one patch per mail.
R.
> On 22 May 2014 11:08, Charles Baylis <charles.baylis@linaro.org> wrote:
>> On 1 May 2014 16:41, Richard Earnshaw <rearnsha@arm.com> wrote:
>>> I think really, you've got three independent changes here:
>>
>> Version 2 of this patch series is now a 9 patch series which addresses
>> most of the following. Exceptions discussed below.
>>
>>> 1) Optimize the prologue/epilogue sequences when ldrd is available.
>>> 2) Replace the call to __gnu_ldivmod_helper with __udivmoddi4
>>
>> I assume you mean __gnu_uldivmod_helper here, as __gnu_ldivmod_helper
>> performs signed division and can't be directly replaced with the
>> unsigned division performed by __udivmoddi4.
>>
>>> 3) Optimize the code to __aeabi_ldivmod.
>>
>> Converting to call __udivmoddi4, fixing up signedness of operands and
>> results and optimisation are all one change.
>>
>>> Ideally, therefore, this is a three patch series, but it's then missing
>>> a few bits.
>>>
>>> 4) Step 2 can also be trivially applied to bpabi-v6m.S as well, since
>>> it's a direct swap of one function for another (unless I've misread the
>>> changes, I think the ABI of the two helper functions are the same).
>>
>> For __aeabi_uldivmod this is true. For __aeabi_ldivmod this is not
>> trivial as the signedness fix-ups must be written.
>>
>>> 5) Step 4 then makes __gnu_ldivmod_helper in bpabi.c a dead function
>>> which can be deleted. This is good because currently pulling in either
>>> 64-bit division function causes both these helper functions to be pulled
>>> in and thus the whole of the 64-bit div-mod code for both signed and
>>> unsigned values. That's particularly unfortunate for ARMv6m class
>>> devices as that's potentially a lot of redundant code.
>>
>> Similarly, __gnu_uldivmod_helper not __gnu_ldivmod_helper.
>>
>> I've included two patches which do the trivial steps for the unsigned case.
>>
>>>
>>> Finally, I know this was the original code, but the complete lack of
>>> comments in this code made reviewing even the trivial parts a complete
>>> nightmare -- it took me half an hour before I remembered that
>>> __udivmoddi4 took three parameters, the third of which was on the stack:
>>> thus the messing around with sp/ip in the prologue wasn't just trivial
>>> padding but a necessary part of the function. Please could you add, at
>>> least some short comments clarifying the register disposition on input
>>> and what that prologue code is up to...
>>
>> Done.
>>
>>> Finally, how was this code tested?
>>
>> It has been built and "make check" has been run with no regressions on:
>> arm-unknown-linux-gnueabihf --with-mode=thumb --with-arch=armv7-a
>> arm-unknown-linux-gnueabihf --with-mode=arm --with-arch=armv7-a
>> arm-unknown-linux-gnueabi --with-mode=arm --with-arch=armv5te
>> arm-unknown-linux-gnueabi --with-mode=arm --with-arch=armv4t
>>
>> I have also run a simple test harness which checks the result of
>> several 64 bit division operations where gcc has been built with the
>> above configurations.
>>
>> I am not currently set up with a way to test v6M, so those parts aren't tested.
>>
>>> Anyway, some additional comments below:
>>>
>>> Don't repeat the function name for multiple tweaks to the same function;
>>> as mentioned above, if these are really separate changes they should be
>>> in separate submissions. Mixing unrelated changes just makes the
>>> reviewing step that much harder.
>>
>> Done.
>>
>>
>>>> + strd ip,lr, [sp, #-16]!
>>>
>>> Space after comma.
>>
>> Done
>>
>>> Also, since you've essentially rewritten the entire function, can you
>>> please also reformat them to follow the coding style of the rest of the
>>> file: namely "<tab>OP<tab>operands".
>>
>> Done
>>
>>>> #else
>>>> + sub sp, sp, #8
>>>> do_push {sp, lr}
>>>> #endif
>>>
>>> Please add a comment that the value at *sp is the address of the
>>> slot for the remainder.
>>
>> Done
>>>> +#if defined(__thumb2__) && CAN_USE_LDRD
>>>> + sub ip, sp, #8
>>>> + strd ip,lr, [sp, #-16]!
>>>
>>> Space after comma.
>>
>> Done
>>
>>>> #else
>>>> + sub sp, sp, #8
>>>> do_push {sp, lr}
>>>> #endif
>>>> + cmp xxh, #0
>>>> + blt 1f
>>>> + cmp yyh, #0
>>>> + blt 2f
>>>> +
>>>> +98: cfi_push 98b - __aeabi_ldivmod, 0xe, -0xc, 0x10
>>>
>>> The CFI push should really precede your conditional tests, it relates to
>>> the do_push expression.
>>
>> Done.
>>
>>>> + bl SYM(__udivmoddi4) __PLT__
>>>> + ldr lr, [sp, #4]
>>>> +#if CAN_USE_LDRD
>>>> + ldrd r2, r3, [sp, #8]
>>>> + add sp, sp, #16
>>>> +#else
>>>> + add sp, sp, #8
>>>> + do_pop {r2, r3}
>>>> +#endif
>>>
>>> You're missing a CFI pop, which is needed when the values on the stack
>>> go out of scope.
>>
>> The existing code doesn't do this. Since there are multiple exit
>> points from the optimised function the existing cfi_* macros aren't
>> sufficient (there is no cfi_save_state/cfi_restore_state), so I have
>> included a patch which uses the gas .cfi_* directives. This may be
>> interesting on non-DWARF or non-ELF platforms, if any are still
>> supported.
>>
>>>> + RET
>>>> +1: /* xxh:xxl is negative */
>>>> + rsbs xxl, xxl, #0
>>>
>>> We're using unified syntax, so NEGS is preferable.
>>
>> Done
>>
>>>> + sbc xxh, xxh, xxh, lsl #1
>>>
>>> Worthy of a comment, Thumb2 has no RSC instruction, so use X - 2X.
>>
>> Done
>>
>>>> + cmp yyh, #0
>>>> + blt 3f
>>>> +98: cfi_push 98b - __aeabi_ldivmod, 0xe, -0xc, 0x10
>>>
>>> This CFI push looks wrong. You've already pushed things earlier. On
>>> the other hand, you should save the state before the CFI pop above, so
>>> that you can restore the state again for the next (ie this) block of code.
>>
>> Done (see above)
>>
>>>> +98: cfi_push 98b - __aeabi_ldivmod, 0xe, -0xc, 0x10
>>>> + bl SYM(__udivmoddi4) __PLT__
>>>> + ldr lr, [sp, #4]
>>>> +#if CAN_USE_LDRD
>>>> + ldrd r2, r3, [sp, #8]
>>>> + add sp, sp, #16
>>>> +#else
>>>> + add sp, sp, #8
>>>> + do_pop {r2, r3}
>>>> +#endif
>>>> + rsbs yyl, yyl, #0
>>>> + sbc yyh, yyh, yyh, lsl #1
>>>> + RET
>>>>
>>>> #endif /* L_aeabi_ldivmod */
>>>>
>>>
>>> You use the LDRD vs do_pop sequence identically several times. To avoid
>>> a lot of ifdefs, it might be worth considering a macro for this idiom to
>>> reduce the overall amount of conditionalized code.
>>
>> Done.
>>
>>
>> The updated patch series is attached. Hopefully, patches 1 through 6
>> are now ready. Patches 7 through 9 can be dropped if necessary.
>>
>>
>>
>>
>> 0001-Whitespace.patch
>>
>> 2014-05-22 Charles Baylis <charles.baylis@linaro.org>
>>
>> * config/arm/bpabi.S (__aeabi_uldivmod): Fix whitespace.
>> (__aeabi_ldivmod): Fix whitespace.
>>
>>
>>
>> 0002-Add-comments.patch
>>
>> 2014-05-22 Charles Baylis <charles.baylis@linaro.org>
>>
>> * config/arm/bpabi.S (__aeabi_uldivmod, __aeabi_ldivmod): Add comment
>> describing register usage on function entry and exit.
>>
>>
>>
>> 0003-Optimise-__aeabi_uldivmod-stack-manipulation.patch
>>
>> 2014-05-22 Charles Baylis <charles.baylis@linaro.org>
>>
>> * config/arm/bpabi.S (__aeabi_uldivmod): Optimise stack pointer
>> manipulation.
>>
>>
>>
>> 0004-Optimise-__aeabi_uldivmod.patch
>>
>> 2014-05-22 Charles Baylis <charles.baylis@linaro.org>
>>
>> * config/arm/bpabi.S (__aeabi_uldivmod): Perform division using call
>> to __udivmoddi4.
>>
>>
>>
>> 0005-Optimise-__aeabi_ldivmod-stack-manipulation.patch
>>
>> 2014-05-22 Charles Baylis <charles.baylis@linaro.org>
>>
>> * config/arm/bpabi.S (__aeabi_ldivmod): Optimise stack manipulation.
>>
>>
>>
>> 0006-Optimise-__aeabi_ldivmod.patch
>>
>> 2014-05-22 Charles Baylis <charles.baylis@linaro.org>
>>
>> * config/arm/bpabi.S (__aeabi_ldivmod): Perform division using
>> __udivmoddi4, and fixups for negative operands.
>>
>>
>>
>> 0007-Fix-cfi-annotations.patch
>>
>> 2014-05-22 Charles Baylis <charles.baylis@linaro.org>
>>
>> * config/arm/bpabi.S (__aeabi_ldivmod, __aeabi_uldivmod,
>> push_for_divide, pop_for_divide): Use .cfi_* directives for DWARF
>> annotations. Fix DWARF information.
>>
>>
>>
>> 0008-Use-__udivmoddi4-for-v6M-aeabi_uldivmod.patch
>>
>> 2014-05-22 Charles Baylis <charles.baylis@linaro.org>
>>
>> * config/arm/bpabi-v6m.S (__aeabi_uldivmod): Perform division using
>> __udivmoddi4.
>>
>>
>>
>> 0009-Remove-__gnu_uldivmod_helper.patch
>>
>> 2014-05-22 Charles Baylis <charles.baylis@linaro.org>
>>
>> * config/arm/bpabi.c (__gnu_uldivmod_helper): Remove.
>
^ permalink raw reply [flat|nested] 22+ messages in thread
* [PATCH 7/9] Fix cfi annotations
2014-06-11 10:20 ` [PATCH 1/9] Whitespace Charles Baylis
` (6 preceding siblings ...)
2014-06-11 10:20 ` [PATCH 3/9] Optimise __aeabi_uldivmod (stack manipulation) Charles Baylis
@ 2014-06-11 10:20 ` Charles Baylis
2014-06-18 14:04 ` Richard Earnshaw
2014-06-11 12:55 ` [PATCH 1/9] Whitespace Richard Earnshaw
8 siblings, 1 reply; 22+ messages in thread
From: Charles Baylis @ 2014-06-11 10:20 UTC (permalink / raw)
To: rearnsha; +Cc: gcc-patches, Ramana.Radhakrishnan
2014-05-22 Charles Baylis <charles.baylis@linaro.org>
* config/arm/bpabi.S (__aeabi_ldivmod, __aeabi_uldivmod,
push_for_divide, pop_for_divide): Use .cfi_* directives for DWARF
annotations. Fix DWARF information.
---
libgcc/config/arm/bpabi.S | 21 ++++++++++++++++-----
1 file changed, 16 insertions(+), 5 deletions(-)
diff --git a/libgcc/config/arm/bpabi.S b/libgcc/config/arm/bpabi.S
index c044167..959ecb1 100644
--- a/libgcc/config/arm/bpabi.S
+++ b/libgcc/config/arm/bpabi.S
@@ -22,6 +22,8 @@
see the files COPYING3 and COPYING.RUNTIME respectively. If not, see
<http://www.gnu.org/licenses/>. */
+ .cfi_sections .debug_frame
+
#ifdef __ARM_EABI__
/* Some attributes that are common to all routines in this file. */
/* Tag_ABI_align_needed: This code does not require 8-byte
@@ -145,7 +147,8 @@ ARM_FUNC_START aeabi_ulcmp
sub sp, sp, #8
do_push {sp, lr}
#endif
-98: cfi_push 98b - \fname, 0xe, -0xc, 0x10
+ .cfi_adjust_cfa_offset 16
+ .cfi_offset 14, -12
.endm
/* restore stack */
@@ -158,6 +161,8 @@ ARM_FUNC_START aeabi_ulcmp
add sp, sp, #8
do_pop {r2, r3}
#endif
+ .cfi_restore 14
+ .cfi_adjust_cfa_offset 0
.endm
#ifdef L_aeabi_ldivmod
@@ -171,7 +176,7 @@ ARM_FUNC_START aeabi_ulcmp
r2:r3 remainder
*/
ARM_FUNC_START aeabi_ldivmod
- cfi_start __aeabi_ldivmod, LSYM(Lend_aeabi_ldivmod)
+ .cfi_startproc
test_div_by_zero signed
push_for_divide __aeabi_ldivmod
@@ -181,16 +186,19 @@ ARM_FUNC_START aeabi_ldivmod
blt 2f
/* arguments in (r0:r1), (r2:r3) and *sp */
bl SYM(__udivmoddi4) __PLT__
+ .cfi_remember_state
pop_for_divide
RET
1: /* xxh:xxl is negative */
+ .cfi_restore_state
negs xxl, xxl
sbc xxh, xxh, xxh, lsl #1 /* Thumb-2 has no RSC, so use X - 2X */
cmp yyh, #0
blt 3f
/* arguments in (r0:r1), (r2:r3) and *sp */
bl SYM(__udivmoddi4) __PLT__
+ .cfi_remember_state
pop_for_divide
negs xxl, xxl
sbc xxh, xxh, xxh, lsl #1 /* Thumb-2 has no RSC, so use X - 2X */
@@ -199,16 +207,19 @@ ARM_FUNC_START aeabi_ldivmod
RET
2: /* only yyh:yyl is negative */
+ .cfi_restore_state
negs yyl, yyl
sbc yyh, yyh, yyh, lsl #1 /* Thumb-2 has no RSC, so use X - 2X */
/* arguments in (r0:r1), (r2:r3) and *sp */
bl SYM(__udivmoddi4) __PLT__
+ .cfi_remember_state
pop_for_divide
negs xxl, xxl
sbc xxh, xxh, xxh, lsl #1 /* Thumb-2 has no RSC, so use X - 2X */
RET
3: /* both xxh:xxl and yyh:yyl are negative */
+ .cfi_restore_state
negs yyl, yyl
sbc yyh, yyh, yyh, lsl #1 /* Thumb-2 has no RSC, so use X - 2X */
/* arguments in (r0:r1), (r2:r3) and *sp */
@@ -218,7 +229,7 @@ ARM_FUNC_START aeabi_ldivmod
sbc yyh, yyh, yyh, lsl #1 /* Thumb-2 has no RSC, so use X - 2X */
RET
- cfi_end LSYM(Lend_aeabi_ldivmod)
+ .cfi_endproc
#endif /* L_aeabi_ldivmod */
@@ -233,7 +244,7 @@ ARM_FUNC_START aeabi_ldivmod
r2:r3 remainder
*/
ARM_FUNC_START aeabi_uldivmod
- cfi_start __aeabi_uldivmod, LSYM(Lend_aeabi_uldivmod)
+ .cfi_startproc
test_div_by_zero unsigned
push_for_divide __aeabi_uldivmod
@@ -241,7 +252,7 @@ ARM_FUNC_START aeabi_uldivmod
bl SYM(__udivmoddi4) __PLT__
pop_for_divide
RET
- cfi_end LSYM(Lend_aeabi_uldivmod)
+ .cfi_endproc
#endif /* L_aeabi_divmod */
--
1.9.1
^ permalink raw reply [flat|nested] 22+ messages in thread
* [PATCH 6/9] Optimise __aeabi_ldivmod
2014-06-11 10:20 ` [PATCH 1/9] Whitespace Charles Baylis
2014-06-11 10:20 ` [PATCH 2/9] Add comments Charles Baylis
2014-06-11 10:20 ` [PATCH 8/9] Use __udivmoddi4 for v6M aeabi_uldivmod Charles Baylis
@ 2014-06-11 10:20 ` Charles Baylis
2014-06-18 14:03 ` Richard Earnshaw
2014-06-11 10:20 ` [PATCH 5/9] Optimise __aeabi_ldivmod (stack manipulation) Charles Baylis
` (5 subsequent siblings)
8 siblings, 1 reply; 22+ messages in thread
From: Charles Baylis @ 2014-06-11 10:20 UTC (permalink / raw)
To: rearnsha; +Cc: gcc-patches, Ramana.Radhakrishnan
2014-05-22 Charles Baylis <charles.baylis@linaro.org>
* config/arm/bpabi.S (__aeabi_ldivmod): Perform division using
__udivmoddi4, and fixups for negative operands.
---
libgcc/config/arm/bpabi.S | 41 ++++++++++++++++++++++++++++++++++++++++-
1 file changed, 40 insertions(+), 1 deletion(-)
diff --git a/libgcc/config/arm/bpabi.S b/libgcc/config/arm/bpabi.S
index 3f9ece5..c044167 100644
--- a/libgcc/config/arm/bpabi.S
+++ b/libgcc/config/arm/bpabi.S
@@ -175,10 +175,49 @@ ARM_FUNC_START aeabi_ldivmod
test_div_by_zero signed
push_for_divide __aeabi_ldivmod
+ cmp xxh, #0
+ blt 1f
+ cmp yyh, #0
+ blt 2f
+ /* arguments in (r0:r1), (r2:r3) and *sp */
+ bl SYM(__udivmoddi4) __PLT__
+ pop_for_divide
+ RET
+
+1: /* xxh:xxl is negative */
+ negs xxl, xxl
+ sbc xxh, xxh, xxh, lsl #1 /* Thumb-2 has no RSC, so use X - 2X */
+ cmp yyh, #0
+ blt 3f
+ /* arguments in (r0:r1), (r2:r3) and *sp */
+ bl SYM(__udivmoddi4) __PLT__
+ pop_for_divide
+ negs xxl, xxl
+ sbc xxh, xxh, xxh, lsl #1 /* Thumb-2 has no RSC, so use X - 2X */
+ negs yyl, yyl
+ sbc yyh, yyh, yyh, lsl #1 /* Thumb-2 has no RSC, so use X - 2X */
+ RET
+
+2: /* only yyh:yyl is negative */
+ negs yyl, yyl
+ sbc yyh, yyh, yyh, lsl #1 /* Thumb-2 has no RSC, so use X - 2X */
+ /* arguments in (r0:r1), (r2:r3) and *sp */
+ bl SYM(__udivmoddi4) __PLT__
+ pop_for_divide
+ negs xxl, xxl
+ sbc xxh, xxh, xxh, lsl #1 /* Thumb-2 has no RSC, so use X - 2X */
+ RET
+
+3: /* both xxh:xxl and yyh:yyl are negative */
+ negs yyl, yyl
+ sbc yyh, yyh, yyh, lsl #1 /* Thumb-2 has no RSC, so use X - 2X */
/* arguments in (r0:r1), (r2:r3) and *sp */
- bl SYM(__gnu_ldivmod_helper) __PLT__
+ bl SYM(__udivmoddi4) __PLT__
pop_for_divide
+ negs yyl, yyl
+ sbc yyh, yyh, yyh, lsl #1 /* Thumb-2 has no RSC, so use X - 2X */
RET
+
cfi_end LSYM(Lend_aeabi_ldivmod)
#endif /* L_aeabi_ldivmod */
--
1.9.1
^ permalink raw reply [flat|nested] 22+ messages in thread
* [PATCH 5/9] Optimise __aeabi_ldivmod (stack manipulation)
2014-06-11 10:20 ` [PATCH 1/9] Whitespace Charles Baylis
` (2 preceding siblings ...)
2014-06-11 10:20 ` [PATCH 6/9] Optimise __aeabi_ldivmod Charles Baylis
@ 2014-06-11 10:20 ` Charles Baylis
2014-06-18 13:53 ` Richard Earnshaw
2014-06-11 10:20 ` [PATCH 9/9] Remove __gnu_uldivmod_helper Charles Baylis
` (4 subsequent siblings)
8 siblings, 1 reply; 22+ messages in thread
From: Charles Baylis @ 2014-06-11 10:20 UTC (permalink / raw)
To: rearnsha; +Cc: gcc-patches, Ramana.Radhakrishnan
2014-05-22 Charles Baylis <charles.baylis@linaro.org>
* config/arm/bpabi.S (__aeabi_ldivmod): Optimise stack manipulation.
---
libgcc/config/arm/bpabi.S | 14 +++-----------
1 file changed, 3 insertions(+), 11 deletions(-)
diff --git a/libgcc/config/arm/bpabi.S b/libgcc/config/arm/bpabi.S
index 927e37f..3f9ece5 100644
--- a/libgcc/config/arm/bpabi.S
+++ b/libgcc/config/arm/bpabi.S
@@ -174,18 +174,10 @@ ARM_FUNC_START aeabi_ldivmod
cfi_start __aeabi_ldivmod, LSYM(Lend_aeabi_ldivmod)
test_div_by_zero signed
- sub sp, sp, #8
-#if defined(__thumb2__)
- mov ip, sp
- push {ip, lr}
-#else
- do_push {sp, lr}
-#endif
-98: cfi_push 98b - __aeabi_ldivmod, 0xe, -0xc, 0x10
+ push_for_divide __aeabi_ldivmod
+ /* arguments in (r0:r1), (r2:r3) and *sp */
bl SYM(__gnu_ldivmod_helper) __PLT__
- ldr lr, [sp, #4]
- add sp, sp, #8
- do_pop {r2, r3}
+ pop_for_divide
RET
cfi_end LSYM(Lend_aeabi_ldivmod)
--
1.9.1
^ permalink raw reply [flat|nested] 22+ messages in thread
* [PATCH 8/9] Use __udivmoddi4 for v6M aeabi_uldivmod
2014-06-11 10:20 ` [PATCH 1/9] Whitespace Charles Baylis
2014-06-11 10:20 ` [PATCH 2/9] Add comments Charles Baylis
@ 2014-06-11 10:20 ` Charles Baylis
2014-06-18 14:04 ` Richard Earnshaw
2014-06-11 10:20 ` [PATCH 6/9] Optimise __aeabi_ldivmod Charles Baylis
` (6 subsequent siblings)
8 siblings, 1 reply; 22+ messages in thread
From: Charles Baylis @ 2014-06-11 10:20 UTC (permalink / raw)
To: rearnsha; +Cc: gcc-patches, Ramana.Radhakrishnan
2014-05-22 Charles Baylis <charles.baylis@linaro.org>
* config/arm/bpabi-v6m.S (__aeabi_uldivmod): Perform division using
__udivmoddi4.
---
libgcc/config/arm/bpabi-v6m.S | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/libgcc/config/arm/bpabi-v6m.S b/libgcc/config/arm/bpabi-v6m.S
index 0bf2e55..d549fa6 100644
--- a/libgcc/config/arm/bpabi-v6m.S
+++ b/libgcc/config/arm/bpabi-v6m.S
@@ -148,7 +148,7 @@ FUNC_START aeabi_uldivmod
mov r0, sp
push {r0, lr}
ldr r0, [sp, #8]
- bl SYM(__gnu_uldivmod_helper)
+ bl SYM(__udivmoddi4)
ldr r3, [sp, #4]
mov lr, r3
add sp, sp, #8
--
1.9.1
^ permalink raw reply [flat|nested] 22+ messages in thread
* [PATCH 4/9] Optimise __aeabi_uldivmod
2014-06-11 10:20 ` [PATCH 1/9] Whitespace Charles Baylis
` (4 preceding siblings ...)
2014-06-11 10:20 ` [PATCH 9/9] Remove __gnu_uldivmod_helper Charles Baylis
@ 2014-06-11 10:20 ` Charles Baylis
2014-06-18 13:53 ` Richard Earnshaw
2014-06-11 10:20 ` [PATCH 3/9] Optimise __aeabi_uldivmod (stack manipulation) Charles Baylis
` (2 subsequent siblings)
8 siblings, 1 reply; 22+ messages in thread
From: Charles Baylis @ 2014-06-11 10:20 UTC (permalink / raw)
To: rearnsha; +Cc: gcc-patches, Ramana.Radhakrishnan
2014-05-22 Charles Baylis <charles.baylis@linaro.org>
* config/arm/bpabi.S (__aeabi_uldivmod): Perform division using call
to __udivmoddi4.
---
libgcc/config/arm/bpabi.S | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/libgcc/config/arm/bpabi.S b/libgcc/config/arm/bpabi.S
index 67246b0..927e37f 100644
--- a/libgcc/config/arm/bpabi.S
+++ b/libgcc/config/arm/bpabi.S
@@ -207,7 +207,7 @@ ARM_FUNC_START aeabi_uldivmod
push_for_divide __aeabi_uldivmod
/* arguments in (r0:r1), (r2:r3) and *sp */
- bl SYM(__gnu_uldivmod_helper) __PLT__
+ bl SYM(__udivmoddi4) __PLT__
pop_for_divide
RET
cfi_end LSYM(Lend_aeabi_uldivmod)
--
1.9.1
^ permalink raw reply [flat|nested] 22+ messages in thread
* [PATCH 9/9] Remove __gnu_uldivmod_helper
2014-06-11 10:20 ` [PATCH 1/9] Whitespace Charles Baylis
` (3 preceding siblings ...)
2014-06-11 10:20 ` [PATCH 5/9] Optimise __aeabi_ldivmod (stack manipulation) Charles Baylis
@ 2014-06-11 10:20 ` Charles Baylis
2014-06-18 14:09 ` Richard Earnshaw
2014-06-11 10:20 ` [PATCH 4/9] Optimise __aeabi_uldivmod Charles Baylis
` (3 subsequent siblings)
8 siblings, 1 reply; 22+ messages in thread
From: Charles Baylis @ 2014-06-11 10:20 UTC (permalink / raw)
To: rearnsha; +Cc: gcc-patches, Ramana.Radhakrishnan
2014-05-22 Charles Baylis <charles.baylis@linaro.org>
* config/arm/bpabi.c (__gnu_uldivmod_helper): Remove.
---
libgcc/config/arm/bpabi.c | 14 --------------
1 file changed, 14 deletions(-)
diff --git a/libgcc/config/arm/bpabi.c b/libgcc/config/arm/bpabi.c
index 7b155cc..e90d044 100644
--- a/libgcc/config/arm/bpabi.c
+++ b/libgcc/config/arm/bpabi.c
@@ -26,9 +26,6 @@ extern long long __divdi3 (long long, long long);
extern unsigned long long __udivdi3 (unsigned long long,
unsigned long long);
extern long long __gnu_ldivmod_helper (long long, long long, long long *);
-extern unsigned long long __gnu_uldivmod_helper (unsigned long long,
- unsigned long long,
- unsigned long long *);
long long
@@ -43,14 +40,3 @@ __gnu_ldivmod_helper (long long a,
return quotient;
}
-unsigned long long
-__gnu_uldivmod_helper (unsigned long long a,
- unsigned long long b,
- unsigned long long *remainder)
-{
- unsigned long long quotient;
-
- quotient = __udivdi3 (a, b);
- *remainder = a - b * quotient;
- return quotient;
-}
--
1.9.1
^ permalink raw reply [flat|nested] 22+ messages in thread
* [PATCH 1/9] Whitespace
2014-06-11 9:33 ` Richard Earnshaw
@ 2014-06-11 10:20 ` Charles Baylis
2014-06-11 10:20 ` [PATCH 2/9] Add comments Charles Baylis
` (8 more replies)
0 siblings, 9 replies; 22+ messages in thread
From: Charles Baylis @ 2014-06-11 10:20 UTC (permalink / raw)
To: rearnsha; +Cc: gcc-patches, Ramana.Radhakrishnan
2014-05-22 Charles Baylis <charles.baylis@linaro.org>
* config/arm/bpabi.S (__aeabi_uldivmod): Fix whitespace.
(__aeabi_ldivmod): Fix whitespace.
---
libgcc/config/arm/bpabi.S | 36 ++++++++++++++++++------------------
1 file changed, 18 insertions(+), 18 deletions(-)
diff --git a/libgcc/config/arm/bpabi.S b/libgcc/config/arm/bpabi.S
index 7772301..f47d715 100644
--- a/libgcc/config/arm/bpabi.S
+++ b/libgcc/config/arm/bpabi.S
@@ -124,20 +124,20 @@ ARM_FUNC_START aeabi_ulcmp
ARM_FUNC_START aeabi_ldivmod
cfi_start __aeabi_ldivmod, LSYM(Lend_aeabi_ldivmod)
- test_div_by_zero signed
+ test_div_by_zero signed
- sub sp, sp, #8
+ sub sp, sp, #8
#if defined(__thumb2__)
- mov ip, sp
- push {ip, lr}
+ mov ip, sp
+ push {ip, lr}
#else
- do_push {sp, lr}
+ do_push {sp, lr}
#endif
98: cfi_push 98b - __aeabi_ldivmod, 0xe, -0xc, 0x10
- bl SYM(__gnu_ldivmod_helper) __PLT__
- ldr lr, [sp, #4]
- add sp, sp, #8
- do_pop {r2, r3}
+ bl SYM(__gnu_ldivmod_helper) __PLT__
+ ldr lr, [sp, #4]
+ add sp, sp, #8
+ do_pop {r2, r3}
RET
cfi_end LSYM(Lend_aeabi_ldivmod)
@@ -147,20 +147,20 @@ ARM_FUNC_START aeabi_ldivmod
ARM_FUNC_START aeabi_uldivmod
cfi_start __aeabi_uldivmod, LSYM(Lend_aeabi_uldivmod)
- test_div_by_zero unsigned
+ test_div_by_zero unsigned
- sub sp, sp, #8
+ sub sp, sp, #8
#if defined(__thumb2__)
- mov ip, sp
- push {ip, lr}
+ mov ip, sp
+ push {ip, lr}
#else
- do_push {sp, lr}
+ do_push {sp, lr}
#endif
98: cfi_push 98b - __aeabi_uldivmod, 0xe, -0xc, 0x10
- bl SYM(__gnu_uldivmod_helper) __PLT__
- ldr lr, [sp, #4]
- add sp, sp, #8
- do_pop {r2, r3}
+ bl SYM(__gnu_uldivmod_helper) __PLT__
+ ldr lr, [sp, #4]
+ add sp, sp, #8
+ do_pop {r2, r3}
RET
cfi_end LSYM(Lend_aeabi_uldivmod)
--
1.9.1
^ permalink raw reply [flat|nested] 22+ messages in thread
* [PATCH 3/9] Optimise __aeabi_uldivmod (stack manipulation)
2014-06-11 10:20 ` [PATCH 1/9] Whitespace Charles Baylis
` (5 preceding siblings ...)
2014-06-11 10:20 ` [PATCH 4/9] Optimise __aeabi_uldivmod Charles Baylis
@ 2014-06-11 10:20 ` Charles Baylis
2014-06-18 13:52 ` Richard Earnshaw
2014-06-11 10:20 ` [PATCH 7/9] Fix cfi annotations Charles Baylis
2014-06-11 12:55 ` [PATCH 1/9] Whitespace Richard Earnshaw
8 siblings, 1 reply; 22+ messages in thread
From: Charles Baylis @ 2014-06-11 10:20 UTC (permalink / raw)
To: rearnsha; +Cc: gcc-patches, Ramana.Radhakrishnan
2014-05-22 Charles Baylis <charles.baylis@linaro.org>
* config/arm/bpabi.S (__aeabi_uldivmod): Optimise stack pointer
manipulation.
---
libgcc/config/arm/bpabi.S | 54 +++++++++++++++++++++++++++++++++++++----------
1 file changed, 43 insertions(+), 11 deletions(-)
diff --git a/libgcc/config/arm/bpabi.S b/libgcc/config/arm/bpabi.S
index ae76cd3..67246b0 100644
--- a/libgcc/config/arm/bpabi.S
+++ b/libgcc/config/arm/bpabi.S
@@ -120,6 +120,46 @@ ARM_FUNC_START aeabi_ulcmp
#endif
.endm
+/* we can use STRD/LDRD on v5TE and later, and any Thumb-2 architecture. */
+#if (defined(__ARM_EABI__) \
+ && (defined(__thumb2__) \
+ || (__ARM_ARCH >= 5 && defined(__TARGET_FEATURE_DSP))))
+#define CAN_USE_LDRD 1
+#else
+#define CAN_USE_LDRD 0
+#endif
+
+/* set up stack frame for call to __udivmoddi4. At the end of the macro the
+ stack is arranged as follows:
+ sp+12 / space for remainder
+ sp+8 \ (written by __udivmoddi4)
+ sp+4 lr
+ sp+0 sp+8 [rp (remainder pointer) argument for __udivmoddi4]
+
+ */
+.macro push_for_divide fname
+#if defined(__thumb2__) && CAN_USE_LDRD
+ sub ip, sp, #8
+ strd ip, lr, [sp, #-16]!
+#else
+ sub sp, sp, #8
+ do_push {sp, lr}
+#endif
+98: cfi_push 98b - \fname, 0xe, -0xc, 0x10
+.endm
+
+/* restore stack */
+.macro pop_for_divide
+ ldr lr, [sp, #4]
+#if CAN_USE_LDRD
+ ldrd r2, r3, [sp, #8]
+ add sp, sp, #16
+#else
+ add sp, sp, #8
+ do_pop {r2, r3}
+#endif
+.endm
+
#ifdef L_aeabi_ldivmod
/* Perform 64 bit signed division.
@@ -165,18 +205,10 @@ ARM_FUNC_START aeabi_uldivmod
cfi_start __aeabi_uldivmod, LSYM(Lend_aeabi_uldivmod)
test_div_by_zero unsigned
- sub sp, sp, #8
-#if defined(__thumb2__)
- mov ip, sp
- push {ip, lr}
-#else
- do_push {sp, lr}
-#endif
-98: cfi_push 98b - __aeabi_uldivmod, 0xe, -0xc, 0x10
+ push_for_divide __aeabi_uldivmod
+ /* arguments in (r0:r1), (r2:r3) and *sp */
bl SYM(__gnu_uldivmod_helper) __PLT__
- ldr lr, [sp, #4]
- add sp, sp, #8
- do_pop {r2, r3}
+ pop_for_divide
RET
cfi_end LSYM(Lend_aeabi_uldivmod)
--
1.9.1
^ permalink raw reply [flat|nested] 22+ messages in thread
* [PATCH 2/9] Add comments
2014-06-11 10:20 ` [PATCH 1/9] Whitespace Charles Baylis
@ 2014-06-11 10:20 ` Charles Baylis
2014-06-11 12:55 ` Richard Earnshaw
2014-06-11 10:20 ` [PATCH 8/9] Use __udivmoddi4 for v6M aeabi_uldivmod Charles Baylis
` (7 subsequent siblings)
8 siblings, 1 reply; 22+ messages in thread
From: Charles Baylis @ 2014-06-11 10:20 UTC (permalink / raw)
To: rearnsha; +Cc: gcc-patches, Ramana.Radhakrishnan
2014-05-22 Charles Baylis <charles.baylis@linaro.org>
* config/arm/bpabi.S (__aeabi_uldivmod, __aeabi_ldivmod): Add comment
describing register usage on function entry and exit.
---
libgcc/config/arm/bpabi.S | 16 ++++++++++++++++
1 file changed, 16 insertions(+)
diff --git a/libgcc/config/arm/bpabi.S b/libgcc/config/arm/bpabi.S
index f47d715..ae76cd3 100644
--- a/libgcc/config/arm/bpabi.S
+++ b/libgcc/config/arm/bpabi.S
@@ -122,6 +122,14 @@ ARM_FUNC_START aeabi_ulcmp
#ifdef L_aeabi_ldivmod
+/* Perform 64 bit signed division.
+ Inputs:
+ r0:r1 numerator
+ r2:r3 denominator
+ Outputs:
+ r0:r1 quotient
+ r2:r3 remainder
+ */
ARM_FUNC_START aeabi_ldivmod
cfi_start __aeabi_ldivmod, LSYM(Lend_aeabi_ldivmod)
test_div_by_zero signed
@@ -145,6 +153,14 @@ ARM_FUNC_START aeabi_ldivmod
#ifdef L_aeabi_uldivmod
+/* Perform 64 bit unsigned division.
+ Inputs:
+ r0:r1 numerator
+ r2:r3 denominator
+ Outputs:
+ r0:r1 quotient
+ r2:r3 remainder
+ */
ARM_FUNC_START aeabi_uldivmod
cfi_start __aeabi_uldivmod, LSYM(Lend_aeabi_uldivmod)
test_div_by_zero unsigned
--
1.9.1
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH 1/9] Whitespace
2014-06-11 10:20 ` [PATCH 1/9] Whitespace Charles Baylis
` (7 preceding siblings ...)
2014-06-11 10:20 ` [PATCH 7/9] Fix cfi annotations Charles Baylis
@ 2014-06-11 12:55 ` Richard Earnshaw
2014-06-18 15:55 ` Charles Baylis
8 siblings, 1 reply; 22+ messages in thread
From: Richard Earnshaw @ 2014-06-11 12:55 UTC (permalink / raw)
To: Charles Baylis; +Cc: gcc-patches, Ramana Radhakrishnan
On 11/06/14 11:19, Charles Baylis wrote:
> 2014-05-22 Charles Baylis <charles.baylis@linaro.org>
>
> * config/arm/bpabi.S (__aeabi_uldivmod): Fix whitespace.
> (__aeabi_ldivmod): Fix whitespace.
This is OK, but please wait until the others are ready to go in.
R.
> ---
> libgcc/config/arm/bpabi.S | 36 ++++++++++++++++++------------------
> 1 file changed, 18 insertions(+), 18 deletions(-)
>
> diff --git a/libgcc/config/arm/bpabi.S b/libgcc/config/arm/bpabi.S
> index 7772301..f47d715 100644
> --- a/libgcc/config/arm/bpabi.S
> +++ b/libgcc/config/arm/bpabi.S
> @@ -124,20 +124,20 @@ ARM_FUNC_START aeabi_ulcmp
>
> ARM_FUNC_START aeabi_ldivmod
> cfi_start __aeabi_ldivmod, LSYM(Lend_aeabi_ldivmod)
> - test_div_by_zero signed
> + test_div_by_zero signed
>
> - sub sp, sp, #8
> + sub sp, sp, #8
> #if defined(__thumb2__)
> - mov ip, sp
> - push {ip, lr}
> + mov ip, sp
> + push {ip, lr}
> #else
> - do_push {sp, lr}
> + do_push {sp, lr}
> #endif
> 98: cfi_push 98b - __aeabi_ldivmod, 0xe, -0xc, 0x10
> - bl SYM(__gnu_ldivmod_helper) __PLT__
> - ldr lr, [sp, #4]
> - add sp, sp, #8
> - do_pop {r2, r3}
> + bl SYM(__gnu_ldivmod_helper) __PLT__
> + ldr lr, [sp, #4]
> + add sp, sp, #8
> + do_pop {r2, r3}
> RET
> cfi_end LSYM(Lend_aeabi_ldivmod)
>
> @@ -147,20 +147,20 @@ ARM_FUNC_START aeabi_ldivmod
>
> ARM_FUNC_START aeabi_uldivmod
> cfi_start __aeabi_uldivmod, LSYM(Lend_aeabi_uldivmod)
> - test_div_by_zero unsigned
> + test_div_by_zero unsigned
>
> - sub sp, sp, #8
> + sub sp, sp, #8
> #if defined(__thumb2__)
> - mov ip, sp
> - push {ip, lr}
> + mov ip, sp
> + push {ip, lr}
> #else
> - do_push {sp, lr}
> + do_push {sp, lr}
> #endif
> 98: cfi_push 98b - __aeabi_uldivmod, 0xe, -0xc, 0x10
> - bl SYM(__gnu_uldivmod_helper) __PLT__
> - ldr lr, [sp, #4]
> - add sp, sp, #8
> - do_pop {r2, r3}
> + bl SYM(__gnu_uldivmod_helper) __PLT__
> + ldr lr, [sp, #4]
> + add sp, sp, #8
> + do_pop {r2, r3}
> RET
> cfi_end LSYM(Lend_aeabi_uldivmod)
>
>
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH 2/9] Add comments
2014-06-11 10:20 ` [PATCH 2/9] Add comments Charles Baylis
@ 2014-06-11 12:55 ` Richard Earnshaw
0 siblings, 0 replies; 22+ messages in thread
From: Richard Earnshaw @ 2014-06-11 12:55 UTC (permalink / raw)
To: Charles Baylis; +Cc: gcc-patches, Ramana Radhakrishnan
On 11/06/14 11:19, Charles Baylis wrote:
> 2014-05-22 Charles Baylis <charles.baylis@linaro.org>
>
> * config/arm/bpabi.S (__aeabi_uldivmod, __aeabi_ldivmod): Add comment
> describing register usage on function entry and exit.
OK.
R.
> ---
> libgcc/config/arm/bpabi.S | 16 ++++++++++++++++
> 1 file changed, 16 insertions(+)
>
> diff --git a/libgcc/config/arm/bpabi.S b/libgcc/config/arm/bpabi.S
> index f47d715..ae76cd3 100644
> --- a/libgcc/config/arm/bpabi.S
> +++ b/libgcc/config/arm/bpabi.S
> @@ -122,6 +122,14 @@ ARM_FUNC_START aeabi_ulcmp
>
> #ifdef L_aeabi_ldivmod
>
> +/* Perform 64 bit signed division.
> + Inputs:
> + r0:r1 numerator
> + r2:r3 denominator
> + Outputs:
> + r0:r1 quotient
> + r2:r3 remainder
> + */
> ARM_FUNC_START aeabi_ldivmod
> cfi_start __aeabi_ldivmod, LSYM(Lend_aeabi_ldivmod)
> test_div_by_zero signed
> @@ -145,6 +153,14 @@ ARM_FUNC_START aeabi_ldivmod
>
> #ifdef L_aeabi_uldivmod
>
> +/* Perform 64 bit unsigned division.
> + Inputs:
> + r0:r1 numerator
> + r2:r3 denominator
> + Outputs:
> + r0:r1 quotient
> + r2:r3 remainder
> + */
> ARM_FUNC_START aeabi_uldivmod
> cfi_start __aeabi_uldivmod, LSYM(Lend_aeabi_uldivmod)
> test_div_by_zero unsigned
>
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH 3/9] Optimise __aeabi_uldivmod (stack manipulation)
2014-06-11 10:20 ` [PATCH 3/9] Optimise __aeabi_uldivmod (stack manipulation) Charles Baylis
@ 2014-06-18 13:52 ` Richard Earnshaw
0 siblings, 0 replies; 22+ messages in thread
From: Richard Earnshaw @ 2014-06-18 13:52 UTC (permalink / raw)
To: Charles Baylis; +Cc: gcc-patches, Ramana Radhakrishnan
On 11/06/14 11:19, Charles Baylis wrote:
> 2014-05-22 Charles Baylis <charles.baylis@linaro.org>
>
> * config/arm/bpabi.S (__aeabi_uldivmod): Optimise stack pointer
> manipulation.
OK.
R.
> ---
> libgcc/config/arm/bpabi.S | 54 +++++++++++++++++++++++++++++++++++++----------
> 1 file changed, 43 insertions(+), 11 deletions(-)
>
> diff --git a/libgcc/config/arm/bpabi.S b/libgcc/config/arm/bpabi.S
> index ae76cd3..67246b0 100644
> --- a/libgcc/config/arm/bpabi.S
> +++ b/libgcc/config/arm/bpabi.S
> @@ -120,6 +120,46 @@ ARM_FUNC_START aeabi_ulcmp
> #endif
> .endm
>
> +/* we can use STRD/LDRD on v5TE and later, and any Thumb-2 architecture. */
> +#if (defined(__ARM_EABI__) \
> + && (defined(__thumb2__) \
> + || (__ARM_ARCH >= 5 && defined(__TARGET_FEATURE_DSP))))
> +#define CAN_USE_LDRD 1
> +#else
> +#define CAN_USE_LDRD 0
> +#endif
> +
> +/* set up stack frame for call to __udivmoddi4. At the end of the macro the
> + stack is arranged as follows:
> + sp+12 / space for remainder
> + sp+8 \ (written by __udivmoddi4)
> + sp+4 lr
> + sp+0 sp+8 [rp (remainder pointer) argument for __udivmoddi4]
> +
> + */
> +.macro push_for_divide fname
> +#if defined(__thumb2__) && CAN_USE_LDRD
> + sub ip, sp, #8
> + strd ip, lr, [sp, #-16]!
> +#else
> + sub sp, sp, #8
> + do_push {sp, lr}
> +#endif
> +98: cfi_push 98b - \fname, 0xe, -0xc, 0x10
> +.endm
> +
> +/* restore stack */
> +.macro pop_for_divide
> + ldr lr, [sp, #4]
> +#if CAN_USE_LDRD
> + ldrd r2, r3, [sp, #8]
> + add sp, sp, #16
> +#else
> + add sp, sp, #8
> + do_pop {r2, r3}
> +#endif
> +.endm
> +
> #ifdef L_aeabi_ldivmod
>
> /* Perform 64 bit signed division.
> @@ -165,18 +205,10 @@ ARM_FUNC_START aeabi_uldivmod
> cfi_start __aeabi_uldivmod, LSYM(Lend_aeabi_uldivmod)
> test_div_by_zero unsigned
>
> - sub sp, sp, #8
> -#if defined(__thumb2__)
> - mov ip, sp
> - push {ip, lr}
> -#else
> - do_push {sp, lr}
> -#endif
> -98: cfi_push 98b - __aeabi_uldivmod, 0xe, -0xc, 0x10
> + push_for_divide __aeabi_uldivmod
> + /* arguments in (r0:r1), (r2:r3) and *sp */
> bl SYM(__gnu_uldivmod_helper) __PLT__
> - ldr lr, [sp, #4]
> - add sp, sp, #8
> - do_pop {r2, r3}
> + pop_for_divide
> RET
> cfi_end LSYM(Lend_aeabi_uldivmod)
>
>
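For readers following the stack layout comment in push_for_divide, the frame can be pictured as a small C struct. This is only an illustrative sketch for a 32-bit ARM target: the names `divide_frame` and `make_divide_frame` are hypothetical and appear nowhere in the patch, and the remainder is kept as two raw words so that no byte order is assumed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of the 16 bytes that push_for_divide leaves at sp.
   Field names are illustrative only; addresses are 32-bit values. */
struct divide_frame {
    uint32_t rp;           /* sp+0:  pointer argument, holds the address sp+8 */
    uint32_t saved_lr;     /* sp+4:  caller's return address */
    uint32_t rem_words[2]; /* sp+8 and sp+12: remainder written by the callee */
};

/* The offsets must match the comment in the macro. */
static_assert(offsetof(struct divide_frame, rp) == 0, "rp at sp+0");
static_assert(offsetof(struct divide_frame, saved_lr) == 4, "lr at sp+4");
static_assert(offsetof(struct divide_frame, rem_words) == 8, "remainder at sp+8");
static_assert(sizeof(struct divide_frame) == 16, "frame is 16 bytes");

/* Build the frame contents from the stack pointer and lr on entry.
   Both code paths in the macro store (entry sp - 8) at the new sp. */
struct divide_frame make_divide_frame(uint32_t sp_on_entry, uint32_t lr)
{
    struct divide_frame f;
    uint32_t new_sp = sp_on_entry - 16u;

    f.rp = new_sp + 8u;   /* same value as sp_on_entry - 8 */
    f.saved_lr = lr;
    f.rem_words[0] = 0;   /* filled in later by the division routine */
    f.rem_words[1] = 0;
    return f;
}
```

The pointer has to live on the stack because r0:r1 and r2:r3 are already taken by the two 64-bit operands, which is why the first word of the frame is the address of the remainder slot.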
* Re: [PATCH 5/9] Optimise __aeabi_ldivmod (stack manipulation)
2014-06-11 10:20 ` [PATCH 5/9] Optimise __aeabi_ldivmod (stack manipulation) Charles Baylis
@ 2014-06-18 13:53 ` Richard Earnshaw
0 siblings, 0 replies; 22+ messages in thread
From: Richard Earnshaw @ 2014-06-18 13:53 UTC (permalink / raw)
To: Charles Baylis; +Cc: gcc-patches, Ramana Radhakrishnan
On 11/06/14 11:19, Charles Baylis wrote:
> 2014-05-22 Charles Baylis <charles.baylis@linaro.org>
>
> * config/arm/bpabi.S (__aeabi_ldivmod): Optimise stack manipulation.
OK.
R.
> ---
> libgcc/config/arm/bpabi.S | 14 +++-----------
> 1 file changed, 3 insertions(+), 11 deletions(-)
>
> diff --git a/libgcc/config/arm/bpabi.S b/libgcc/config/arm/bpabi.S
> index 927e37f..3f9ece5 100644
> --- a/libgcc/config/arm/bpabi.S
> +++ b/libgcc/config/arm/bpabi.S
> @@ -174,18 +174,10 @@ ARM_FUNC_START aeabi_ldivmod
> cfi_start __aeabi_ldivmod, LSYM(Lend_aeabi_ldivmod)
> test_div_by_zero signed
>
> - sub sp, sp, #8
> -#if defined(__thumb2__)
> - mov ip, sp
> - push {ip, lr}
> -#else
> - do_push {sp, lr}
> -#endif
> -98: cfi_push 98b - __aeabi_ldivmod, 0xe, -0xc, 0x10
> + push_for_divide __aeabi_ldivmod
> + /* arguments in (r0:r1), (r2:r3) and *sp */
> bl SYM(__gnu_ldivmod_helper) __PLT__
> - ldr lr, [sp, #4]
> - add sp, sp, #8
> - do_pop {r2, r3}
> + pop_for_divide
> RET
> cfi_end LSYM(Lend_aeabi_ldivmod)
>
>
* Re: [PATCH 4/9] Optimise __aeabi_uldivmod
2014-06-11 10:20 ` [PATCH 4/9] Optimise __aeabi_uldivmod Charles Baylis
@ 2014-06-18 13:53 ` Richard Earnshaw
0 siblings, 0 replies; 22+ messages in thread
From: Richard Earnshaw @ 2014-06-18 13:53 UTC (permalink / raw)
To: Charles Baylis; +Cc: gcc-patches, Ramana Radhakrishnan
On 11/06/14 11:19, Charles Baylis wrote:
> 2014-05-22 Charles Baylis <charles.baylis@linaro.org>
>
> * config/arm/bpabi.S (__aeabi_uldivmod): Perform division using call
> to __udivmoddi4.
OK.
R.
> ---
> libgcc/config/arm/bpabi.S | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/libgcc/config/arm/bpabi.S b/libgcc/config/arm/bpabi.S
> index 67246b0..927e37f 100644
> --- a/libgcc/config/arm/bpabi.S
> +++ b/libgcc/config/arm/bpabi.S
> @@ -207,7 +207,7 @@ ARM_FUNC_START aeabi_uldivmod
>
> push_for_divide __aeabi_uldivmod
> /* arguments in (r0:r1), (r2:r3) and *sp */
> - bl SYM(__gnu_uldivmod_helper) __PLT__
> + bl SYM(__udivmoddi4) __PLT__
> pop_for_divide
> RET
> cfi_end LSYM(Lend_aeabi_uldivmod)
>
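The one-line swap works because both routines share the same contract: two 64-bit operands in, quotient returned, remainder stored through a pointer. The sketch below models that contract in portable C. The names `helper_model` and `udivmod_model` are hypothetical, and the shift-and-subtract loop is only a stand-in for the real __udivmoddi4, which libgcc implements differently.

```c
#include <assert.h>
#include <stdint.h>

/* Model of the old helper: divide first, then recover the remainder
   with a multiply and subtract. */
uint64_t helper_model(uint64_t a, uint64_t b, uint64_t *remainder)
{
    assert(b != 0);
    uint64_t quotient = a / b;       /* stands in for __udivdi3 */
    *remainder = a - b * quotient;
    return quotient;
}

/* Model of the __udivmoddi4 contract: quotient and remainder come out
   of a single pass, so no extra multiply is needed afterwards. */
uint64_t udivmod_model(uint64_t n, uint64_t d, uint64_t *rp)
{
    assert(d != 0);
    uint64_t q = 0;
    uint64_t r = 0;

    for (int i = 63; i >= 0; i--) {
        /* r holds at most the bits of n consumed so far, so it is below
           2^63 before this shift and the shift cannot overflow. */
        r = (r << 1) | ((n >> i) & 1u);
        if (r >= d) {
            r -= d;
            q |= (uint64_t)1 << i;
        }
    }
    if (rp)
        *rp = r;
    return q;
}
```

Both models return the same quotient and remainder for any non-zero divisor, which is the property the patch relies on when it changes only the branch target.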
* Re: [PATCH 6/9] Optimise __aeabi_ldivmod
2014-06-11 10:20 ` [PATCH 6/9] Optimise __aeabi_ldivmod Charles Baylis
@ 2014-06-18 14:03 ` Richard Earnshaw
0 siblings, 0 replies; 22+ messages in thread
From: Richard Earnshaw @ 2014-06-18 14:03 UTC (permalink / raw)
To: Charles Baylis; +Cc: gcc-patches, Ramana Radhakrishnan
On 11/06/14 11:19, Charles Baylis wrote:
> 2014-05-22 Charles Baylis <charles.baylis@linaro.org>
>
> * config/arm/bpabi.S (__aeabi_ldivmod): Perform division using
> __udivmoddi4, and fixups for negative operands.
OK.
> ---
> libgcc/config/arm/bpabi.S | 41 ++++++++++++++++++++++++++++++++++++++++-
> 1 file changed, 40 insertions(+), 1 deletion(-)
>
> diff --git a/libgcc/config/arm/bpabi.S b/libgcc/config/arm/bpabi.S
> index 3f9ece5..c044167 100644
> --- a/libgcc/config/arm/bpabi.S
> +++ b/libgcc/config/arm/bpabi.S
> @@ -175,10 +175,49 @@ ARM_FUNC_START aeabi_ldivmod
> test_div_by_zero signed
>
> push_for_divide __aeabi_ldivmod
> + cmp xxh, #0
> + blt 1f
> + cmp yyh, #0
> + blt 2f
> + /* arguments in (r0:r1), (r2:r3) and *sp */
> + bl SYM(__udivmoddi4) __PLT__
> + pop_for_divide
> + RET
> +
> +1: /* xxh:xxl is negative */
> + negs xxl, xxl
> + sbc xxh, xxh, xxh, lsl #1 /* Thumb-2 has no RSC, so use X - 2X */
> + cmp yyh, #0
> + blt 3f
> + /* arguments in (r0:r1), (r2:r3) and *sp */
> + bl SYM(__udivmoddi4) __PLT__
> + pop_for_divide
> + negs xxl, xxl
> + sbc xxh, xxh, xxh, lsl #1 /* Thumb-2 has no RSC, so use X - 2X */
> + negs yyl, yyl
> + sbc yyh, yyh, yyh, lsl #1 /* Thumb-2 has no RSC, so use X - 2X */
> + RET
> +
> +2: /* only yyh:yyl is negative */
> + negs yyl, yyl
> + sbc yyh, yyh, yyh, lsl #1 /* Thumb-2 has no RSC, so use X - 2X */
> + /* arguments in (r0:r1), (r2:r3) and *sp */
> + bl SYM(__udivmoddi4) __PLT__
> + pop_for_divide
> + negs xxl, xxl
> + sbc xxh, xxh, xxh, lsl #1 /* Thumb-2 has no RSC, so use X - 2X */
> + RET
> +
> +3: /* both xxh:xxl and yyh:yyl are negative */
> + negs yyl, yyl
> + sbc yyh, yyh, yyh, lsl #1 /* Thumb-2 has no RSC, so use X - 2X */
> /* arguments in (r0:r1), (r2:r3) and *sp */
> - bl SYM(__gnu_ldivmod_helper) __PLT__
> + bl SYM(__udivmoddi4) __PLT__
> pop_for_divide
> + negs yyl, yyl
> + sbc yyh, yyh, yyh, lsl #1 /* Thumb-2 has no RSC, so use X - 2X */
> RET
> +
> cfi_end LSYM(Lend_aeabi_ldivmod)
>
> #endif /* L_aeabi_ldivmod */
>
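The four branches in this patch amount to the usual sign fixups around an unsigned divide: negate any negative operand, divide, then negate the quotient when the signs differ and the remainder when the numerator was negative. A rough C model follows. The names `neg64_halves` and `ldivmod_model` are hypothetical, the `/` and `%` operators stand in for the __udivmoddi4 call, and the final conversions to int64_t assume a two's complement target such as GCC provides.

```c
#include <assert.h>
#include <stdint.h>

/* Negate a 64-bit value held as two 32-bit halves, mirroring
       negs lo, lo
       sbc  hi, hi, hi, lsl #1
   NEGS leaves the carry flag clear (a borrow) exactly when lo != 0. */
void neg64_halves(uint32_t *lo, uint32_t *hi)
{
    uint32_t borrow = (*lo != 0);

    *lo = 0u - *lo;
    *hi = *hi - (*hi << 1) - borrow;   /* hi - 2*hi - borrow == -hi - borrow */
}

/* Signed divide built from an unsigned divide plus sign fixups. */
int64_t ldivmod_model(int64_t n, int64_t d, int64_t *rem)
{
    assert(d != 0);
    int n_neg = n < 0;
    int d_neg = d < 0;

    /* Unsigned negation avoids overflow for INT64_MIN. */
    uint64_t un = n_neg ? (uint64_t)0 - (uint64_t)n : (uint64_t)n;
    uint64_t ud = d_neg ? (uint64_t)0 - (uint64_t)d : (uint64_t)d;

    uint64_t uq = un / ud;             /* stands in for __udivmoddi4 */
    uint64_t ur = un % ud;

    if (n_neg != d_neg)
        uq = (uint64_t)0 - uq;         /* quotient is negative if signs differ */
    if (n_neg)
        ur = (uint64_t)0 - ur;         /* remainder takes the numerator's sign */

    *rem = (int64_t)ur;
    return (int64_t)uq;
}
```

Label 1 in the assembly corresponds to `n_neg` alone, label 2 to `d_neg` alone, and label 3 to both, where only the remainder is negated after the call.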
* Re: [PATCH 7/9] Fix cfi annotations
2014-06-11 10:20 ` [PATCH 7/9] Fix cfi annotations Charles Baylis
@ 2014-06-18 14:04 ` Richard Earnshaw
0 siblings, 0 replies; 22+ messages in thread
From: Richard Earnshaw @ 2014-06-18 14:04 UTC (permalink / raw)
To: Charles Baylis; +Cc: gcc-patches, Ramana Radhakrishnan
On 11/06/14 11:19, Charles Baylis wrote:
> 2014-05-22 Charles Baylis <charles.baylis@linaro.org>
>
> * config/arm/bpabi.S (__aeabi_ldivmod, __aeabi_uldivmod,
> push_for_divide, pop_for_divide): Use .cfi_* directives for DWARF
> annotations. Fix DWARF information.
OK.
> ---
> libgcc/config/arm/bpabi.S | 21 ++++++++++++++++-----
> 1 file changed, 16 insertions(+), 5 deletions(-)
>
> diff --git a/libgcc/config/arm/bpabi.S b/libgcc/config/arm/bpabi.S
> index c044167..959ecb1 100644
> --- a/libgcc/config/arm/bpabi.S
> +++ b/libgcc/config/arm/bpabi.S
> @@ -22,6 +22,8 @@
> see the files COPYING3 and COPYING.RUNTIME respectively. If not, see
> <http://www.gnu.org/licenses/>. */
>
> + .cfi_sections .debug_frame
> +
> #ifdef __ARM_EABI__
> /* Some attributes that are common to all routines in this file. */
> /* Tag_ABI_align_needed: This code does not require 8-byte
> @@ -145,7 +147,8 @@ ARM_FUNC_START aeabi_ulcmp
> sub sp, sp, #8
> do_push {sp, lr}
> #endif
> -98: cfi_push 98b - \fname, 0xe, -0xc, 0x10
> + .cfi_adjust_cfa_offset 16
> + .cfi_offset 14, -12
> .endm
>
> /* restore stack */
> @@ -158,6 +161,8 @@ ARM_FUNC_START aeabi_ulcmp
> add sp, sp, #8
> do_pop {r2, r3}
> #endif
> + .cfi_restore 14
> + .cfi_adjust_cfa_offset 0
> .endm
>
> #ifdef L_aeabi_ldivmod
> @@ -171,7 +176,7 @@ ARM_FUNC_START aeabi_ulcmp
> r2:r3 remainder
> */
> ARM_FUNC_START aeabi_ldivmod
> - cfi_start __aeabi_ldivmod, LSYM(Lend_aeabi_ldivmod)
> + .cfi_startproc
> test_div_by_zero signed
>
> push_for_divide __aeabi_ldivmod
> @@ -181,16 +186,19 @@ ARM_FUNC_START aeabi_ldivmod
> blt 2f
> /* arguments in (r0:r1), (r2:r3) and *sp */
> bl SYM(__udivmoddi4) __PLT__
> + .cfi_remember_state
> pop_for_divide
> RET
>
> 1: /* xxh:xxl is negative */
> + .cfi_restore_state
> negs xxl, xxl
> sbc xxh, xxh, xxh, lsl #1 /* Thumb-2 has no RSC, so use X - 2X */
> cmp yyh, #0
> blt 3f
> /* arguments in (r0:r1), (r2:r3) and *sp */
> bl SYM(__udivmoddi4) __PLT__
> + .cfi_remember_state
> pop_for_divide
> negs xxl, xxl
> sbc xxh, xxh, xxh, lsl #1 /* Thumb-2 has no RSC, so use X - 2X */
> @@ -199,16 +207,19 @@ ARM_FUNC_START aeabi_ldivmod
> RET
>
> 2: /* only yyh:yyl is negative */
> + .cfi_restore_state
> negs yyl, yyl
> sbc yyh, yyh, yyh, lsl #1 /* Thumb-2 has no RSC, so use X - 2X */
> /* arguments in (r0:r1), (r2:r3) and *sp */
> bl SYM(__udivmoddi4) __PLT__
> + .cfi_remember_state
> pop_for_divide
> negs xxl, xxl
> sbc xxh, xxh, xxh, lsl #1 /* Thumb-2 has no RSC, so use X - 2X */
> RET
>
> 3: /* both xxh:xxl and yyh:yyl are negative */
> + .cfi_restore_state
> negs yyl, yyl
> sbc yyh, yyh, yyh, lsl #1 /* Thumb-2 has no RSC, so use X - 2X */
> /* arguments in (r0:r1), (r2:r3) and *sp */
> @@ -218,7 +229,7 @@ ARM_FUNC_START aeabi_ldivmod
> sbc yyh, yyh, yyh, lsl #1 /* Thumb-2 has no RSC, so use X - 2X */
> RET
>
> - cfi_end LSYM(Lend_aeabi_ldivmod)
> + .cfi_endproc
>
> #endif /* L_aeabi_ldivmod */
>
> @@ -233,7 +244,7 @@ ARM_FUNC_START aeabi_ldivmod
> r2:r3 remainder
> */
> ARM_FUNC_START aeabi_uldivmod
> - cfi_start __aeabi_uldivmod, LSYM(Lend_aeabi_uldivmod)
> + .cfi_startproc
> test_div_by_zero unsigned
>
> push_for_divide __aeabi_uldivmod
> @@ -241,7 +252,7 @@ ARM_FUNC_START aeabi_uldivmod
> bl SYM(__udivmoddi4) __PLT__
> pop_for_divide
> RET
> - cfi_end LSYM(Lend_aeabi_uldivmod)
> + .cfi_endproc
>
> #endif /* L_aeabi_divmod */
>
>
* Re: [PATCH 8/9] Use __udivmoddi4 for v6M aeabi_uldivmod
2014-06-11 10:20 ` [PATCH 8/9] Use __udivmoddi4 for v6M aeabi_uldivmod Charles Baylis
@ 2014-06-18 14:04 ` Richard Earnshaw
0 siblings, 0 replies; 22+ messages in thread
From: Richard Earnshaw @ 2014-06-18 14:04 UTC (permalink / raw)
To: Charles Baylis; +Cc: gcc-patches, Ramana Radhakrishnan
On 11/06/14 11:19, Charles Baylis wrote:
> 2014-05-22 Charles Baylis <charles.baylis@linaro.org>
>
> * config/arm/bpabi-v6m.S (__aeabi_uldivmod): Perform division using
> __udivmoddi4.
OK.
R.
> ---
> libgcc/config/arm/bpabi-v6m.S | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/libgcc/config/arm/bpabi-v6m.S b/libgcc/config/arm/bpabi-v6m.S
> index 0bf2e55..d549fa6 100644
> --- a/libgcc/config/arm/bpabi-v6m.S
> +++ b/libgcc/config/arm/bpabi-v6m.S
> @@ -148,7 +148,7 @@ FUNC_START aeabi_uldivmod
> mov r0, sp
> push {r0, lr}
> ldr r0, [sp, #8]
> - bl SYM(__gnu_uldivmod_helper)
> + bl SYM(__udivmoddi4)
> ldr r3, [sp, #4]
> mov lr, r3
> add sp, sp, #8
>
* Re: [PATCH 9/9] Remove __gnu_uldivmod_helper
2014-06-11 10:20 ` [PATCH 9/9] Remove __gnu_uldivmod_helper Charles Baylis
@ 2014-06-18 14:09 ` Richard Earnshaw
0 siblings, 0 replies; 22+ messages in thread
From: Richard Earnshaw @ 2014-06-18 14:09 UTC (permalink / raw)
To: Charles Baylis; +Cc: gcc-patches, Ramana Radhakrishnan
On 11/06/14 11:19, Charles Baylis wrote:
> 2014-05-22 Charles Baylis <charles.baylis@linaro.org>
>
> * config/arm/bpabi.c (__gnu_uldivmod_helper): Remove.
OK.
R.
> ---
> libgcc/config/arm/bpabi.c | 14 --------------
> 1 file changed, 14 deletions(-)
>
> diff --git a/libgcc/config/arm/bpabi.c b/libgcc/config/arm/bpabi.c
> index 7b155cc..e90d044 100644
> --- a/libgcc/config/arm/bpabi.c
> +++ b/libgcc/config/arm/bpabi.c
> @@ -26,9 +26,6 @@ extern long long __divdi3 (long long, long long);
> extern unsigned long long __udivdi3 (unsigned long long,
> unsigned long long);
> extern long long __gnu_ldivmod_helper (long long, long long, long long *);
> -extern unsigned long long __gnu_uldivmod_helper (unsigned long long,
> - unsigned long long,
> - unsigned long long *);
>
>
> long long
> @@ -43,14 +40,3 @@ __gnu_ldivmod_helper (long long a,
> return quotient;
> }
>
> -unsigned long long
> -__gnu_uldivmod_helper (unsigned long long a,
> - unsigned long long b,
> - unsigned long long *remainder)
> -{
> - unsigned long long quotient;
> -
> - quotient = __udivdi3 (a, b);
> - *remainder = a - b * quotient;
> - return quotient;
> -}
>
* Re: [PATCH 1/9] Whitespace
2014-06-11 12:55 ` [PATCH 1/9] Whitespace Richard Earnshaw
@ 2014-06-18 15:55 ` Charles Baylis
0 siblings, 0 replies; 22+ messages in thread
From: Charles Baylis @ 2014-06-18 15:55 UTC (permalink / raw)
To: Richard Earnshaw; +Cc: gcc-patches, Ramana Radhakrishnan
On 11 June 2014 13:55, Richard Earnshaw <rearnsha@arm.com> wrote:
> On 11/06/14 11:19, Charles Baylis wrote:
>> 2014-05-22 Charles Baylis <charles.baylis@linaro.org>
>>
>> * config/arm/bpabi.S (__aeabi_uldivmod): Fix whitespace.
>> (__aeabi_ldivmod): Fix whitespace.
>
> This is OK, but please wait until the others are ready to go in.
The series is now committed as r211789-r211797.
Thread overview: 22+ messages
2014-05-22 10:08 [PATCH, ARM, v2] Improve 64 bit division performance Charles Baylis
2014-06-11 9:30 ` Charles Baylis
2014-06-11 9:33 ` Richard Earnshaw
2014-06-11 10:20 ` [PATCH 1/9] Whitespace Charles Baylis
2014-06-11 10:20 ` [PATCH 2/9] Add comments Charles Baylis
2014-06-11 12:55 ` Richard Earnshaw
2014-06-11 10:20 ` [PATCH 8/9] Use __udivmoddi4 for v6M aeabi_uldivmod Charles Baylis
2014-06-18 14:04 ` Richard Earnshaw
2014-06-11 10:20 ` [PATCH 6/9] Optimise __aeabi_ldivmod Charles Baylis
2014-06-18 14:03 ` Richard Earnshaw
2014-06-11 10:20 ` [PATCH 5/9] Optimise __aeabi_ldivmod (stack manipulation) Charles Baylis
2014-06-18 13:53 ` Richard Earnshaw
2014-06-11 10:20 ` [PATCH 9/9] Remove __gnu_uldivmod_helper Charles Baylis
2014-06-18 14:09 ` Richard Earnshaw
2014-06-11 10:20 ` [PATCH 4/9] Optimise __aeabi_uldivmod Charles Baylis
2014-06-18 13:53 ` Richard Earnshaw
2014-06-11 10:20 ` [PATCH 3/9] Optimise __aeabi_uldivmod (stack manipulation) Charles Baylis
2014-06-18 13:52 ` Richard Earnshaw
2014-06-11 10:20 ` [PATCH 7/9] Fix cfi annotations Charles Baylis
2014-06-18 14:04 ` Richard Earnshaw
2014-06-11 12:55 ` [PATCH 1/9] Whitespace Richard Earnshaw
2014-06-18 15:55 ` Charles Baylis