public inbox for newlib@sourceware.org
 help / color / mirror / Atom feed
* [Arm] SP and SL initialization refactored
@ 2019-03-19 13:55 Alexander Fedotov
  2019-03-26 10:53 ` Alexander Fedotov
  2019-04-10 12:26 ` Richard Earnshaw (lists)
  0 siblings, 2 replies; 12+ messages in thread
From: Alexander Fedotov @ 2019-03-19 13:55 UTC (permalink / raw)
  To: Richard Earnshaw (lists), Nick Clifton, Tamar Christina, Newlib; +Cc: nd

[-- Attachment #1: Type: text/plain, Size: 476 bytes --]

Hi all

In continue of threads
https://sourceware.org/ml/newlib/2019/msg00117.html and
https://sourceware.org/ml/newlib/2019/msg00102.html

I have reworked stack pointer initialization and stack limit as well.
General idea behind of all these changes is to provide hook for
flexibility (including semihosting when debug tap is used) and improve
readability of existing crt0 code.
And now code makes SP initialization in Thumb mode :)

Any comments are very appreciated.

Alex

[-- Attachment #2: 0001-SP-initialization-changes.patch --]
[-- Type: application/x-patch, Size: 12156 bytes --]

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [Arm] SP and SL initialization refactored
  2019-03-19 13:55 [Arm] SP and SL initialization refactored Alexander Fedotov
@ 2019-03-26 10:53 ` Alexander Fedotov
  2019-03-26 12:42   ` Nick Clifton
  2019-04-10 12:26 ` Richard Earnshaw (lists)
  1 sibling, 1 reply; 12+ messages in thread
From: Alexander Fedotov @ 2019-03-26 10:53 UTC (permalink / raw)
  To: Richard Earnshaw (lists), Nick Clifton, Tamar Christina, Newlib; +Cc: nd

ping

On Tue, Mar 19, 2019 at 4:54 PM Alexander Fedotov <alfedotov@gmail.com> wrote:
>
> Hi all
>
> In continue of threads
> https://sourceware.org/ml/newlib/2019/msg00117.html and
> https://sourceware.org/ml/newlib/2019/msg00102.html
>
> I have reworked stack pointer initialization and stack limit as well.
> General idea behind of all these changes is to provide hook for
> flexibility (including semihosting when debug tap is used) and improve
> readability of existing crt0 code.
> And now code makes SP initialization in Thumb mode :)
>
> Any comments are very appreciated.
>
> Alex



-- 
Best regards,
AF

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [Arm] SP and SL initialization refactored
  2019-03-26 10:53 ` Alexander Fedotov
@ 2019-03-26 12:42   ` Nick Clifton
  2019-03-29 19:03     ` Tamar Christina
  0 siblings, 1 reply; 12+ messages in thread
From: Nick Clifton @ 2019-03-26 12:42 UTC (permalink / raw)
  To: Alexander Fedotov, Richard Earnshaw (lists), Tamar Christina, Newlib; +Cc: nd

Hi Alex,

> ping

Sorry - I am not very involved with newlib these days, but I can
say that the changes look reasonable to me.

Cheers
  Nick



^ permalink raw reply	[flat|nested] 12+ messages in thread

* RE: [Arm] SP and SL initialization refactored
  2019-03-26 12:42   ` Nick Clifton
@ 2019-03-29 19:03     ` Tamar Christina
  2019-04-05 13:36       ` Alexander Fedotov
  0 siblings, 1 reply; 12+ messages in thread
From: Tamar Christina @ 2019-03-29 19:03 UTC (permalink / raw)
  To: nickc, Alexander Fedotov, Richard Earnshaw, Newlib; +Cc: nd

Hi Alex,

I'm not that familiar with CMSIS so I'll leave that part up to Richard,
But the rest of the patch looks good to me as well.

Cheers,
Tamar

> -----Original Message-----
> From: Nick Clifton <nickc@redhat.com>
> Sent: Tuesday, March 26, 2019 12:43
> To: Alexander Fedotov <alfedotov@gmail.com>; Richard Earnshaw
> <Richard.Earnshaw@arm.com>; Tamar Christina
> <Tamar.Christina@arm.com>; Newlib <newlib@sourceware.org>
> Cc: nd <nd@arm.com>
> Subject: Re: [Arm] SP and SL initialization refactored
> 
> Hi Alex,
> 
> > ping
> 
> Sorry - I am not very involved with newlib these days, but I can say that the
> changes look reasonable to me.
> 
> Cheers
>   Nick
> 
> 


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [Arm] SP and SL initialization refactored
  2019-03-29 19:03     ` Tamar Christina
@ 2019-04-05 13:36       ` Alexander Fedotov
  2019-04-10 12:28         ` Richard Earnshaw (lists)
  0 siblings, 1 reply; 12+ messages in thread
From: Alexander Fedotov @ 2019-04-05 13:36 UTC (permalink / raw)
  Cc: Richard Earnshaw, Tamar Christina, Newlib, nd

I've found a typo at the end of _stack_init routine:
+    b        lr
+    FN_END
must be
+    bx       lr
+    FN_END

I think we can fix this directly in the original patch before apply.

Alex


On Fri, Mar 29, 2019 at 10:03 PM Tamar Christina
<Tamar.Christina@arm.com> wrote:
>
> Hi Alex,
>
> I'm not that familiar with CMSIS so I'll leave that part up to Richard,
> But the rest of the patch looks good to me as well.
>
> Cheers,
> Tamar
>
> > -----Original Message-----
> > From: Nick Clifton <nickc@redhat.com>
> > Sent: Tuesday, March 26, 2019 12:43
> > To: Alexander Fedotov <alfedotov@gmail.com>; Richard Earnshaw
> > <Richard.Earnshaw@arm.com>; Tamar Christina
> > <Tamar.Christina@arm.com>; Newlib <newlib@sourceware.org>
> > Cc: nd <nd@arm.com>
> > Subject: Re: [Arm] SP and SL initialization refactored
> >
> > Hi Alex,
> >
> > > ping
> >
> > Sorry - I am not very involved with newlib these days, but I can say that the
> > changes look reasonable to me.
> >
> > Cheers
> >   Nick
> >
> >
>


--
Best regards,
AF

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [Arm] SP and SL initialization refactored
  2019-03-19 13:55 [Arm] SP and SL initialization refactored Alexander Fedotov
  2019-03-26 10:53 ` Alexander Fedotov
@ 2019-04-10 12:26 ` Richard Earnshaw (lists)
  2019-04-10 14:23   ` Corinna Vinschen
  1 sibling, 1 reply; 12+ messages in thread
From: Richard Earnshaw (lists) @ 2019-04-10 12:26 UTC (permalink / raw)
  To: Alexander Fedotov, Nick Clifton, Tamar Christina, Newlib; +Cc: nd

On 19/03/2019 13:54, Alexander Fedotov wrote:
> Hi all
> 
> In continue of threads
> https://sourceware.org/ml/newlib/2019/msg00117.html and
> https://sourceware.org/ml/newlib/2019/msg00102.html
> 
> I have reworked stack pointer initialization and stack limit as well.
> General idea behind of all these changes is to provide hook for
> flexibility (including semihosting when debug tap is used) and improve
> readability of existing crt0 code.
> And now code makes SP initialization in Thumb mode :)
> 
> Any comments are very appreciated.
> 
> Alex
> 

Apologies once again for the delayed review.

A small plea: when posting patches, please can you ensure that your
attachments are text/patch, or something similar, so that replies will
include the contents (and I can view them directly in the email rather
than having to mess about extracting the attachment).

Before this goes in, the changes will need to be mirrored into
newlib/libc/sys/arm.  The two should be kept in sync as far as possible.

now to the patch...

+/* Current Program Status Register (CPSR)
+	M[4:0]		Mode bits. M[4] is always 1 for 32-bit modes.
+	T[5]			1: Thumb, 0: ARM instruction set
+	F[6]			1: disables FIQ
+	I[7]			1: disables IRQ
+	A[8]			1: disables imprecise aborts
+	E[9]			0: Little-endian, 1: Big-endian
+	J[24]			1: Jazelle instruction set
+ */

Please note in the comment that this only applies to A- and R- profiles
(and legacy Arm).

+.macro FN_START

Please rename to FN_EH_START (and similarly for FN_END) - we already
have FUNC_START and the similar name is confusing.

+
+/*******************************************************************************
+* User mode only:            This routine makes default target specific
Stack
+*   +-----+ <- SL_sys,     Pointer initialization for different
processor modes:
+*   |     |    SL_usr      FIQ, Abort, IRQ, Undefined, Supervisor,
System (User)
+*   | SYS |                and setups a default Stack Limit in-case the
code has
+*   | USR | -=0x10000      been compiled with "-mapcs-stack-check" for
FIQ and
+*   |     |                System (User) modes.
+*   +-----+ <- initial SP,
+*           becomes SP_sys   Hard-wiring SL value is not ideal, since
there is
+*           and SL_usr     currently no support for checking that the
heap and
+*                          stack have not collided, or that this
default 64k is
+* All modes:               enough for the program being executed.
However, it
+*   +-----+ <- SL_sys,     ensures that this simple crt0 world will not


Similarly here.  Also this comment wraps in an 80-column window, so
please reformat to avoid that.

+	/* M profile doesn't have CPSR register.  */
+#if (__ARM_ARCH_PROFILE != 'M')
+	/* Following code is compatible for both ARM and Thumb ISA */
+	mrs		r4, CPSR
+	tst		r4, #(CPSR_M_MASK)	/* Test mode bits - in User of all are 0 */

This code is clearly ARM/Thumb2 only.  Have you tested that this builds
for, say ARM7TDMI in thumb state?  If you use a build of gcc configured
with --with-multilib-list=aprofile or --with-multilib-list=rmprofile
you'll get pretty comprehensive checking of all the variants that need
to be supported (beware, it can take quite a long time, even on a fast
machine).  At the very least, this code needs to be checked on:

armv4t Arm and Thumb states
armv6-m Thumb state
armv7-a Arm and Thumb states
armv7-m Thumb state

+	#ifdef THUMB1_ONLY
+	adds	r2, #128
+	adds	r2, #128
+	mov		sl, r2
+	#else
+	add		sl, r2, #256
+	#endif

the ifdef should start in the first column.  If you really want to
reflect the nesting, then use a number of spaces between the # and the
"ifdef".

Please can you also clean up the indentation of the instructions: some
have one tab indent after the mnemonic, some two.


+	b		lr
+	FN_END

??? This can't be right, you don't have a label called 'lr'.  Probably
you meant to use the appropriate return instruction.

+	/* System (shares regs with User) mode, interrupts disabled */
+	mov		r1, #(CPSR_M_SYS|CPSR_M_32BIT|CPSR_I_MASK|CPSR_F_MASK)
+	msr		CPSR_c, r1
+	mov		sp, r3
+

This implies dropping support for Armv3 as system mode was added in
Armv4.  That's probably not an issue now as GCC has also dropped such
support, but it should be added to the News.

Overall, I quite like the direction this is taking, but there's more
work to yet.

R.

R.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [Arm] SP and SL initialization refactored
  2019-04-05 13:36       ` Alexander Fedotov
@ 2019-04-10 12:28         ` Richard Earnshaw (lists)
  0 siblings, 0 replies; 12+ messages in thread
From: Richard Earnshaw (lists) @ 2019-04-10 12:28 UTC (permalink / raw)
  To: Alexander Fedotov; +Cc: Tamar Christina, Newlib, nd

On 05/04/2019 14:35, Alexander Fedotov wrote:
> I've found a typo at the end of _stack_init routine:
> +    b        lr
> +    FN_END
> must be
> +    bx       lr

BX isn't available on Armv4.  We're not dropping support for that yet,
you you'll need to play some ISA-based macro tricks.  See the assembly
functions in the libc/machine/arm dir for some ideas.

R.

> +    FN_END
> 
> I think we can fix this directly in the original patch before apply.
> 
> Alex
> 
> 
> On Fri, Mar 29, 2019 at 10:03 PM Tamar Christina
> <Tamar.Christina@arm.com> wrote:
>>
>> Hi Alex,
>>
>> I'm not that familiar with CMSIS so I'll leave that part up to Richard,
>> But the rest of the patch looks good to me as well.
>>
>> Cheers,
>> Tamar
>>
>>> -----Original Message-----
>>> From: Nick Clifton <nickc@redhat.com>
>>> Sent: Tuesday, March 26, 2019 12:43
>>> To: Alexander Fedotov <alfedotov@gmail.com>; Richard Earnshaw
>>> <Richard.Earnshaw@arm.com>; Tamar Christina
>>> <Tamar.Christina@arm.com>; Newlib <newlib@sourceware.org>
>>> Cc: nd <nd@arm.com>
>>> Subject: Re: [Arm] SP and SL initialization refactored
>>>
>>> Hi Alex,
>>>
>>>> ping
>>>
>>> Sorry - I am not very involved with newlib these days, but I can say that the
>>> changes look reasonable to me.
>>>
>>> Cheers
>>>   Nick
>>>
>>>
>>
> 
> 
> --
> Best regards,
> AF
> 

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [Arm] SP and SL initialization refactored
  2019-04-10 12:26 ` Richard Earnshaw (lists)
@ 2019-04-10 14:23   ` Corinna Vinschen
  2019-04-10 14:55     ` Richard Earnshaw (lists)
  0 siblings, 1 reply; 12+ messages in thread
From: Corinna Vinschen @ 2019-04-10 14:23 UTC (permalink / raw)
  To: Richard Earnshaw (lists)
  Cc: Alexander Fedotov, Nick Clifton, Tamar Christina, Newlib, nd

[-- Attachment #1: Type: text/plain, Size: 1310 bytes --]

Hi Richard,

On Apr 10 13:26, Richard Earnshaw (lists) wrote:
> On 19/03/2019 13:54, Alexander Fedotov wrote:
> > Hi all
> > 
> > In continue of threads
> > https://sourceware.org/ml/newlib/2019/msg00117.html and
> > https://sourceware.org/ml/newlib/2019/msg00102.html
> > 
> > I have reworked stack pointer initialization and stack limit as well.
> > General idea behind of all these changes is to provide hook for
> > flexibility (including semihosting when debug tap is used) and improve
> > readability of existing crt0 code.
> > And now code makes SP initialization in Thumb mode :)
> > 
> > Any comments are very appreciated.
> > 
> > Alex
> > 
> 
> Apologies once again for the delayed review.
> 
> A small plea: when posting patches, please can you ensure that your
> attachments are text/patch, or something similar, so that replies will
> include the contents (and I can view them directly in the email rather
> than having to mess about extracting the attachment).

Ideally the patches are inline patches sent with `git send-email' or
equivalent.  I added some wording in terms of how to provide patches to
https://sourceware.org/newlib/ lately.  Do you have a suggestion for
improving this?


Thanks,
Corinna

-- 
Corinna Vinschen
Cygwin Maintainer
Red Hat

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [Arm] SP and SL initialization refactored
  2019-04-10 14:23   ` Corinna Vinschen
@ 2019-04-10 14:55     ` Richard Earnshaw (lists)
  2019-04-10 15:58       ` Alexander Fedotov
  0 siblings, 1 reply; 12+ messages in thread
From: Richard Earnshaw (lists) @ 2019-04-10 14:55 UTC (permalink / raw)
  To: Alexander Fedotov, Nick Clifton, Tamar Christina, Newlib, nd

On 10/04/2019 15:23, Corinna Vinschen wrote:
> Hi Richard,
> 
> On Apr 10 13:26, Richard Earnshaw (lists) wrote:
>> On 19/03/2019 13:54, Alexander Fedotov wrote:
>>> Hi all
>>>
>>> In continue of threads
>>> https://sourceware.org/ml/newlib/2019/msg00117.html and
>>> https://sourceware.org/ml/newlib/2019/msg00102.html
>>>
>>> I have reworked stack pointer initialization and stack limit as well.
>>> General idea behind of all these changes is to provide hook for
>>> flexibility (including semihosting when debug tap is used) and improve
>>> readability of existing crt0 code.
>>> And now code makes SP initialization in Thumb mode :)
>>>
>>> Any comments are very appreciated.
>>>
>>> Alex
>>>
>>
>> Apologies once again for the delayed review.
>>
>> A small plea: when posting patches, please can you ensure that your
>> attachments are text/patch, or something similar, so that replies will
>> include the contents (and I can view them directly in the email rather
>> than having to mess about extracting the attachment).
> 
> Ideally the patches are inline patches sent with `git send-email' or
> equivalent.  I added some wording in terms of how to provide patches to
> https://sourceware.org/newlib/ lately.  Do you have a suggestion for
> improving this?
> 
> 
> Thanks,
> Corinna
> 

Not really;  When I use git format-patch with --attach I get
text/x-patch as the MIME type, which is fine.  Problems only start if
users have poorly configured mail client that attach as octet-stream or
other binary types.  Then mail clients can't display the attachment
inline as they don't know the format is really displayable.

R.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [Arm] SP and SL initialization refactored
  2019-04-10 14:55     ` Richard Earnshaw (lists)
@ 2019-04-10 15:58       ` Alexander Fedotov
  2019-04-12 12:38         ` Alexander Fedotov
  0 siblings, 1 reply; 12+ messages in thread
From: Alexander Fedotov @ 2019-04-10 15:58 UTC (permalink / raw)
  To: Richard Earnshaw (lists); +Cc: Tamar Christina, Newlib, nd

Hi Richard

New patch is below (only for libgloss).
If it's fine with you I will do the same changes in newlib/libc/sys/arm as well.

>This code is clearly ARM/Thumb2 only.  Have you tested that this builds
>for, say ARM7TDMI in thumb state?  If you use a build of gcc configured
>with --with-multilib-list=aprofile or --with-multilib-list=rmprofile
>you'll get pretty comprehensive checking of all the variants that need
>to be supported (beware, it can take quite a long time, even on a fast
>machine).  At the very least, this code needs to be checked on:

I have built gcc with --with-multilib-list=rmprofile
Unfortunately I have no ability to check on all of them. At least this code is
verified on Cortex-r52 that was the aim.



>This implies dropping support for Armv3 as system mode was added in
>Armv4.  That's probably not an issue now as GCC has also dropped such
>support, but it should be added to the News.
I don't see News file for newlib. But I have preserved original
version in the patch.
I suggest to drop Armv3 by another commit.


I can't get why description is corrupted for _stack_init function in
the patch below while lines
are less than 80 symbols :(


From c8ff1ecf005dba2a4a69c5754b1911c0d351aba5 Mon Sep 17 00:00:00 2001
From: Alexander Fedotov <alfedotov@gmail.com>
Date: Wed, 23 Jan 2019 13:33:14 +0300
Subject: [PATCH] SP initialization changes:   1. set default value in
 semihosting case as well   2. moved existing SP & SL init code for processor
 modes in separate routine and made it as "hook"   3. init SP for processor
 modes in Thumb mode as well Add new macro FN_RETURN, FN_EH_START and
 FN_EH_END.

---
 libgloss/arm/arm.h  |  26 +++++
 libgloss/arm/crt0.S | 279 +++++++++++++++++++++++++++++++-------------
 2 files changed, 224 insertions(+), 81 deletions(-)

diff --git a/libgloss/arm/arm.h b/libgloss/arm/arm.h
index 0489f2d92..04b3a5ac7 100644
--- a/libgloss/arm/arm.h
+++ b/libgloss/arm/arm.h
@@ -61,4 +61,30 @@
 # define HAVE_CALL_INDIRECT
 #endif

+/* A and R profiles (and legacy Arm).
+    Current Program Status Register (CPSR)
+    M[4:0]        Mode bits. M[4] is always 1 for 32-bit modes.
+    T[5]            1: Thumb, 0: ARM instruction set
+    F[6]            1: disables FIQ
+    I[7]            1: disables IRQ
+    A[8]            1: disables imprecise aborts
+    E[9]            0: Little-endian, 1: Big-endian
+    J[24]            1: Jazelle instruction set
+ */
+#define CPSR_M_USR            0x00    /* User mode */
+#define CPSR_M_FIQ            0x01    /* Fast Interrupt mode */
+#define CPSR_M_IRQ            0x02    /* Interrupt mode */
+#define CPSR_M_SVR            0x03    /* Supervisor mode */
+#define CPSR_M_MON            0x06    /* Monitor mode */
+#define CPSR_M_ABT            0x07    /* Abort mode */
+#define CPSR_M_HYP            0x0A    /* Hypervisor mode */
+#define CPSR_M_UND            0x0B    /* Undefined mode */
+#define CPSR_M_SYS            0x0F    /* System mode */
+#define CPSR_M_32BIT        0x10    /* 32-bit mode */
+#define CPSR_T_BIT            0x20    /* Thumb bit */
+#define CPSR_F_MASK            0x40    /* FIQ bit */
+#define CPSR_I_MASK            0x80    /* IRQ bit */
+
+#define CPSR_M_MASK            0x0F    /* Mode mask except M[4] */
+
 #endif /* _LIBGLOSS_ARM_H */
diff --git a/libgloss/arm/crt0.S b/libgloss/arm/crt0.S
index c708f63d8..89256d5e8 100644
--- a/libgloss/arm/crt0.S
+++ b/libgloss/arm/crt0.S
@@ -59,6 +59,21 @@
 .endm
 #endif

+/* Annotation for EABI unwinding tables. */
+.macro FN_EH_START
+#if defined(__ELF__) && !defined(__USING_SJLJ_EXCEPTIONS__)
+    .fnstart
+#endif
+.endm
+
+.macro FN_EH_END
+#if defined(__ELF__) && !defined(__USING_SJLJ_EXCEPTIONS__)
+    /* Protect against unhandled exceptions.  */
+    .cantunwind
+    .fnend
+#endif
+.endm
+
 .macro indirect_call reg
 #ifdef HAVE_CALL_INDIRECT
     blx \reg
@@ -68,14 +83,170 @@
 #endif
 .endm

-    .align     0
+/* For armv4t and newer, toolchains will transparently convert
+   'bx lr' to 'mov pc, lr' if needed. GCC has deprecated support
+   for anything older than armv4t, but this should handle that
+   corner case in case anyone needs it anyway */
+.macro  FN_RETURN
+#if __ARM_ARCH <= 4 && __ARM_ARCH_ISA_THUMB == 0
+    mov    pc, lr
+#else
+    bx    lr
+#endif
+.endm
+
+
+
+/******************************************************************************
+* User mode only:           This routine makes default target specific Stack
+*   +-----+ <- SL_sys,    Pointer initialization for different processor modes:
+*   |     |    SL_usr     FIQ, Abort, IRQ, Undefined, Supervisor, System (User)
+*   | SYS |               and setups a default Stack Limit in-case the code has
+*   | USR | -=0x10000     been compiled with "-mapcs-stack-check" for FIQ and
+*   |     |               System (User) modes.
+*   |     |
+*   +-----+ <- initial SP,
+*           becomes SP_sys   Hard-wiring SL value is not ideal, since there is
+*           and SL_usr     currently no support for checking that the heap and
+*                          stack have not collided, or that this default 64k is
+* All modes:               is enough for the program being executed. However,
+*   +-----+ <- SL_sys,     it ensures that this simple crt0 world will not
+*   |     |    SL_usr      immediately cause an overflow event.
+*   | SYS |
+*   | USR | -=0x10000        We go through all execution modes and set up SP
+*   |     |                for each of them.
+*   +-----+ <- SP_sys,
+*   |     |    SP_usr      Note:
+*   | SVC | -= 0x8000        Mode switch via CPSR is not allowed once in
+*   |     |                  non-privileged mode, so we take care not to enter
+*   +-----+ <- SP_svc        "User" to set up its sp, and also skip most
+*   |     |                  operations if already in that mode.
+*   | IRQ | -= 0x2000
+*   |     |                Input parameters:
+* ^ +-----+ <- SP_und       - sp - Initialized SP
+* s |     |                 - r2 - May contain SL value from semihosting
+* t | UND | -= 0x1000              SYS_HEAPINFO call
+* a |     |                Scratch registers:
+* c +-----+ <- SP_und       - r1 - new value of CPSR
+* k |     |                 - r2 - intermediate value (in standalone mode)
+*   | ABT | -= 0x1000       - r3 - new SP value
+* g |     |                 - r4 - save/restore CPSR on entry/exit
+* r +-----+ <- SP_abt,
+* o |     |    SL_fiq        Declared as "weak" so that user can write and use
+* w | FIQ | -= 0x1000      his own implementation if current doesn't fit.
+* t |     |
+* h +-----+ <- initial SP,
+*           becomes SP_fiq
+*
+******************************************************************************/
+    .align    0
+    FUNC_START    _stack_init
+    .weak FUNCTION (_stack_init)
+    FN_EH_START
+
+    /* M profile doesn't have CPSR register.  */
+#if (__ARM_ARCH_PROFILE != 'M')
+    /* Following code is compatible for both ARM and Thumb ISA */
+    mrs    r4, CPSR
+    /* Test mode bits - in User of all are 0 */
+    tst    r4, #(CPSR_M_MASK)
+    /* "eq" means r4 AND #0x0F is 0 */
+    beq    .Lskip_cpu_modes
+
+    mov    r3, sp /* save input SP value */
+
+    /* FIQ mode, interrupts disabled */
+    mov    r1, #(CPSR_M_FIQ|CPSR_M_32BIT|CPSR_I_MASK|CPSR_F_MASK)
+    msr    CPSR_c, r1
+    mov    sp, r3
+    sub    sl, sp, #0x1000    /* FIQ mode has its own SL */
+
+    /* Abort mode, interrupts disabled */
+    mov    r3, sl
+    mov    r1, #(CPSR_M_ABT|CPSR_M_32BIT|CPSR_I_MASK|CPSR_F_MASK)
+    msr    CPSR_c, r1
+    mov    sp, r3
+    sub    r3, r3, #0x1000
+
+    /* Undefined mode, interrupts disabled */
+    mov    r1, #(CPSR_M_UND|CPSR_M_32BIT|CPSR_I_MASK|CPSR_F_MASK)
+    msr    CPSR_c, r1
+    mov    sp, r3
+    sub    r3, r3, #0x1000
+
+    /* IRQ mode, interrupts disabled */
+    mov    r1, #(CPSR_M_IRQ|CPSR_M_32BIT|CPSR_I_MASK|CPSR_F_MASK)
+    msr    CPSR_c, r1
+    mov    sp, r3
+    sub    r3, r3, #0x2000
+
+    /* Supervisory mode, interrupts disabled */
+    mov    r1, #(CPSR_M_SVR|CPSR_M_32BIT|CPSR_I_MASK|CPSR_F_MASK)
+    msr    CPSR_c, r1
+    mov    sp, r3
+
+    sub    r3, r3, #0x8000    /* Min size 32k */
+    bic    r3, r3, #0x00FF    /* Align with current 64k block */
+    bic    r3, r3, #0xFF00
+
+# if __ARM_ARCH > 3
+    /* System (shares regs with User) mode, interrupts disabled */
+    mov    r1, #(CPSR_M_SYS|CPSR_M_32BIT|CPSR_I_MASK|CPSR_F_MASK)
+    msr    CPSR_c, r1
+    mov    sp, r3
+# else
+    /* Keep this for ARMv3, but GCC actually dropped it. */
+    /* Move value into user mode sp without changing modes, */
+    /* via '^' form of ldm */
+    str    r3, [r3, #-4]
+    ldmdb    r3, {sp}^
+# endif
+
+    /* Back to original mode, presumably SVC, with diabled FIQ/IRQ */
+    orr    r4, r4, #(CPSR_I_MASK|CPSR_F_MASK)
+    msr    CPSR_c, r4
+
+.Lskip_cpu_modes:
+#endif
+
+    /* Set SL register */
+#if defined (ARM_RDI_MONITOR) /* semihosting */
+    cmp    r2, #0
+    beq    .Lsl_forced_zero
+    /* allow slop for stack overflow handling and small frames */
+# ifdef THUMB1_ONLY
+    adds    r2, #128
+    adds    r2, #128
+    mov    sl, r2
+# else
+    add    sl, r2, #256
+# endif
+.Lsl_forced_zero:
+
+#else /* standalone */
+    /* r3 contains SP for System/User mode. Set SL = SP - 0x10000 */
+    #ifdef THUMB1_ONLY
+    movs    r2, #64
+    lsls    r2, r2, #10
+    subs    r2, r3, r2
+    mov    sl, r2
+    #else
+    /* Still assumes 256bytes below sl */
+    sub    sl, r3, #64 << 10
+    #endif
+#endif
+
+    FN_RETURN
+    FN_EH_END

+
+/*******************************************************************************
+* Main library startup code.
+*******************************************************************************/
+    .align     0
     FUNC_START    _mainCRTStartup
     FUNC_START    _start
-#if defined(__ELF__) && !defined(__USING_SJLJ_EXCEPTIONS__)
-    /* Annotation for EABI unwinding tables.  */
-    .fnstart
-#endif
+    FN_EH_START

     /* __ARM_ARCH_PROFILE is defined from GCC 4.8 onwards, however
__ARM_ARCH_7A
     has been defined since 4.2 onwards, which is when v7-a support was added
@@ -148,34 +319,27 @@
           to skip setting sp/sl to 0 here.
         - Considering M-profile processors, We might want to initialize
           sp by the first entry of vector table and return 0 to SYS_HEAPINFO
-          semihosting call, which will be skipped here. */
+          semihosting call, which will be skipped here.
+        - Considering R-profile processors there is no automatic SP
init by hardware
+          so we need to initialize it by default value. */
+    ldr    r3, .Lstack
     cmp    r1, #0
     beq    .LC26
-    mov    sp, r1
+    mov    r3, r1
 .LC26:
-    cmp    r2, #0
-    beq    .LC27
-    /*  allow slop for stack overflow handling and small frames */
-#ifdef THUMB1_ONLY
-    adds    r2, #128
-    adds    r2, #128
-    mov    sl, r2
-#else
-    add    sl, r2, #256
-#endif
-.LC27:
-#else
+    mov    sp, r3
+
+    /* r2 (SL value) will be used in _stack_init */
+    bl FUNCTION (_stack_init)
+
+
+#else /* standalone */
     /*  Set up the stack pointer to a fixed value */
     /*  Changes by toralf:
         - Allow linker script to provide stack via __stack symbol - see
           defintion of .Lstack
         - Provide "hooks" that may be used by the application to add
-          custom init code - see .Lhwinit and .Lswinit
-        - Go through all execution modes and set up stack for each of them.
-          Loosely based on init.s from ARM/Motorola example code.
-              Note: Mode switch via CPSR is not allowed once in non-privileged
-            mode, so we take care not to enter "User" to set up its sp,
-            and also skip most operations if already in that mode. */
+          custom init code - see .Lhwinit and .Lswinit */

     ldr    r3, .Lstack
     cmp    r3, #0
@@ -194,57 +358,11 @@
          have somehow missed it below (in which case it gets the same
          value as FIQ - not ideal, but better than nothing.) */
     mov    sp, r3
-#ifdef PREFER_THUMB
-    /* XXX Fill in stack assignments for interrupt modes.  */
-#else
-    mrs    r2, CPSR
-    tst    r2, #0x0F    /* Test mode bits - in User of all are 0 */
-    beq    .LC23        /* "eq" means r2 AND #0x0F is 0 */
-    msr     CPSR_c, #0xD1    /* FIRQ mode, interrupts disabled */
-    mov     sp, r3
-    sub    sl, sp, #0x1000    /* This mode also has its own sl (see below) */
-
-    mov    r3, sl
-    msr     CPSR_c, #0xD7    /* Abort mode, interrupts disabled */
-    mov    sp, r3
-    sub    r3, r3, #0x1000
-
-    msr     CPSR_c, #0xDB    /* Undefined mode, interrupts disabled */
-    mov    sp, r3
-    sub    r3, r3, #0x1000

-    msr     CPSR_c, #0xD2    /* IRQ mode, interrupts disabled */
-    mov    sp, r3
-    sub    r3, r3, #0x2000
-
-    msr     CPSR_c, #0xD3    /* Supervisory mode, interrupts disabled */
+    /* we don't care of r2 value in standalone */
+    bl FUNCTION (_stack_init)

-    mov    sp, r3
-    sub    r3, r3, #0x8000    /* Min size 32k */
-    bic    r3, r3, #0x00FF    /* Align with current 64k block */
-    bic    r3, r3, #0xFF00

-    str    r3, [r3, #-4]    /* Move value into user mode sp without */
-    ldmdb    r3, {sp}^       /* changing modes, via '^' form of ldm */
-    orr    r2, r2, #0xC0    /* Back to original mode, presumably SVC, */
-    msr    CPSR_c, r2    /* with FIQ/IRQ disable bits forced to 1 */
-#endif
-.LC23:
-    /* Setup a default stack-limit in-case the code has been
-       compiled with "-mapcs-stack-check".  Hard-wiring this value
-       is not ideal, since there is currently no support for
-       checking that the heap and stack have not collided, or that
-       this default 64k is enough for the program being executed.
-       However, it ensures that this simple crt0 world will not
-       immediately cause an overflow event:  */
-#ifdef THUMB1_ONLY
-    movs    r2, #64
-    lsls    r2, r2, #10
-    subs    r2, r3, r2
-    mov    sl, r2
-#else
-    sub    sl, r3, #64 << 10    /* Still assumes 256bytes below sl */
-#endif
 #endif
 #endif
     /* Zero the memory in the .bss section.  */
@@ -445,6 +563,8 @@ change_back:
     swi    SWI_Exit
 #endif

+    FN_EH_END
+
     /* For Thumb, constants must be after the code since only
        positive offsets are supported for PC relative addresses.  */

@@ -464,8 +584,6 @@ change_back:
 #else
     .word    0x80000            /* Top of RAM on the PIE board.  */
 #endif
-.Lstack:
-    .word    __stack
 .Lhwinit:
     .word    FUNCTION (hardware_init_hook)
 .Lswinit:
@@ -478,17 +596,16 @@ change_back:
        and only if, a normal version of the same symbol isn't provided
        e.g. by a linker script or another object file.) */

-    .weak __stack
     .weak FUNCTION (hardware_init_hook)
     .weak FUNCTION (software_init_hook)
 #endif

 #endif
-#if defined(__ELF__) && !defined(__USING_SJLJ_EXCEPTIONS__)
-    /* Protect against unhandled exceptions.  */
-    .cantunwind
-    .fnend
-#endif
+
+.Lstack:
+    .word    __stack
+    .weak    __stack
+
 .LC1:
     .word    __bss_start__
 .LC2:
-- 
2.20.1.windows.1



On Wed, Apr 10, 2019 at 5:55 PM Richard Earnshaw (lists)
<Richard.Earnshaw@arm.com> wrote:
>
> On 10/04/2019 15:23, Corinna Vinschen wrote:
> > Hi Richard,
> >
> > On Apr 10 13:26, Richard Earnshaw (lists) wrote:
> >> On 19/03/2019 13:54, Alexander Fedotov wrote:
> >>> Hi all
> >>>
> >>> In continue of threads
> >>> https://sourceware.org/ml/newlib/2019/msg00117.html and
> >>> https://sourceware.org/ml/newlib/2019/msg00102.html
> >>>
> >>> I have reworked stack pointer initialization and stack limit as well.
> >>> General idea behind of all these changes is to provide hook for
> >>> flexibility (including semihosting when debug tap is used) and improve
> >>> readability of existing crt0 code.
> >>> And now code makes SP initialization in Thumb mode :)
> >>>
> >>> Any comments are very appreciated.
> >>>
> >>> Alex
> >>>
> >>
> >> Apologies once again for the delayed review.
> >>
> >> A small plea: when posting patches, please can you ensure that your
> >> attachments are text/patch, or something similar, so that replies will
> >> include the contents (and I can view them directly in the email rather
> >> than having to mess about extracting the attachment).
> >
> > Ideally the patches are inline patches sent with `git send-email' or
> > equivalent.  I added some wording in terms of how to provide patches to
> > https://sourceware.org/newlib/ lately.  Do you have a suggestion for
> > improving this?
> >
> >
> > Thanks,
> > Corinna
> >
>
> Not really;  When I use git format-patch with --attach I get
> text/x-patch as the MIME type, which is fine.  Problems only start if
> users have poorly configured mail client that attach as octet-stream or
> other binary types.  Then mail clients can't display the attachment
> inline as they don't know the format is really displayable.
>
> R.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [Arm] SP and SL initialization refactored
  2019-04-10 15:58       ` Alexander Fedotov
@ 2019-04-12 12:38         ` Alexander Fedotov
  2019-04-15 10:21           ` Alexander Fedotov
  0 siblings, 1 reply; 12+ messages in thread
From: Alexander Fedotov @ 2019-04-12 12:38 UTC (permalink / raw)
  To: Richard Earnshaw (lists); +Cc: Tamar Christina, Newlib, nd

Hi Richard
New patch (rebased on
https://sourceware.org/ml/newlib/2019/msg00164.html ) is below.

OK to apply on master ?


From b7ad80f440e593b363bd0240e8d2476da7b6160d Mon Sep 17 00:00:00 2001
From: Alexander Fedotov <alfedotov@gmail.com>
Date: Wed, 23 Jan 2019 13:33:14 +0300
Subject: [PATCH] [Arm] Stack Pointer and Stack Limit initialization
 refactored.

SP initialization changes:
  1. set default value in semihosting case as well
  2. moved existing SP & SL init code for processor modes in separate
routine and made it as "hook"
  3. init SP for processor modes in Thumb mode as well

Add new macro FN_RETURN, FN_EH_START and FN_EH_END.
---
 libgloss/arm/arm.h         |  26 ++++
 libgloss/arm/crt0.S        | 277 ++++++++++++++++++++++++++-----------
 newlib/libc/sys/arm/arm.h  |  26 ++++
 newlib/libc/sys/arm/crt0.S | 277 ++++++++++++++++++++++++++-----------
 4 files changed, 440 insertions(+), 166 deletions(-)

diff --git a/libgloss/arm/arm.h b/libgloss/arm/arm.h
index 0489f2d92..04b3a5ac7 100644
--- a/libgloss/arm/arm.h
+++ b/libgloss/arm/arm.h
@@ -61,4 +61,30 @@
 # define HAVE_CALL_INDIRECT
 #endif

+/* A and R profiles (and legacy Arm).
+    Current Program Status Register (CPSR)
+    M[4:0]        Mode bits. M[4] is always 1 for 32-bit modes.
+    T[5]            1: Thumb, 0: ARM instruction set
+    F[6]            1: disables FIQ
+    I[7]            1: disables IRQ
+    A[8]            1: disables imprecise aborts
+    E[9]            0: Little-endian, 1: Big-endian
+    J[24]            1: Jazelle instruction set
+ */
+#define CPSR_M_USR            0x00    /* User mode */
+#define CPSR_M_FIQ            0x01    /* Fast Interrupt mode */
+#define CPSR_M_IRQ            0x02    /* Interrupt mode */
+#define CPSR_M_SVR            0x03    /* Supervisor mode */
+#define CPSR_M_MON            0x06    /* Monitor mode */
+#define CPSR_M_ABT            0x07    /* Abort mode */
+#define CPSR_M_HYP            0x0A    /* Hypervisor mode */
+#define CPSR_M_UND            0x0B    /* Undefined mode */
+#define CPSR_M_SYS            0x0F    /* System mode */
+#define CPSR_M_32BIT        0x10    /* 32-bit mode */
+#define CPSR_T_BIT            0x20    /* Thumb bit */
+#define CPSR_F_MASK            0x40    /* FIQ bit */
+#define CPSR_I_MASK            0x80    /* IRQ bit */
+
+#define CPSR_M_MASK            0x0F    /* Mode mask except M[4] */
+
 #endif /* _LIBGLOSS_ARM_H */
diff --git a/libgloss/arm/crt0.S b/libgloss/arm/crt0.S
index 1deb73aa5..95aa3b578 100644
--- a/libgloss/arm/crt0.S
+++ b/libgloss/arm/crt0.S
@@ -59,6 +59,21 @@
 .endm
 #endif

+/* Annotation for EABI unwinding tables. */
+.macro FN_EH_START
+#if defined(__ELF__) && !defined(__USING_SJLJ_EXCEPTIONS__)
+    .fnstart
+#endif
+.endm
+
+.macro FN_EH_END
+#if defined(__ELF__) && !defined(__USING_SJLJ_EXCEPTIONS__)
+    /* Protect against unhandled exceptions.  */
+    .cantunwind
+    .fnend
+#endif
+.endm
+
 .macro indirect_call reg
 #ifdef HAVE_CALL_INDIRECT
     blx \reg
@@ -68,16 +83,170 @@
 #endif
 .endm

+/* For armv4t and newer, toolchains will transparently convert
+   'bx lr' to 'mov pc, lr' if needed. GCC has deprecated support
+   for anything older than armv4t, but this should handle that
+   corner case in case anyone needs it anyway */
+.macro  FN_RETURN
+#if __ARM_ARCH <= 4 && __ARM_ARCH_ISA_THUMB == 0
+    mov    pc, lr
+#else
+    bx    lr
+#endif
+.endm
+
+
+
+/******************************************************************************
+* User mode only:           This routine makes default target specific Stack
+*   +-----+ <- SL_sys,    Pointer initialization for different processor modes:
+*   |     |    SL_usr     FIQ, Abort, IRQ, Undefined, Supervisor, System (User)
+*   | SYS |               and setups a default Stack Limit in-case the code has
+*   | USR | -=0x10000     been compiled with "-mapcs-stack-check" for FIQ and
+*   |     |               System (User) modes.
+*   |     |
+*   +-----+ <- initial SP,
+*           becomes SP_sys   Hard-wiring SL value is not ideal, since there is
+*           and SL_usr     currently no support for checking that the heap and
+*                          stack have not collided, or that this default 64k is
+* All modes:               is enough for the program being executed. However,
+*   +-----+ <- SL_sys,     it ensures that this simple crt0 world will not
+*   |     |    SL_usr      immediately cause an overflow event.
+*   | SYS |
+*   | USR | -=0x10000        We go through all execution modes and set up SP
+*   |     |                for each of them.
+*   +-----+ <- SP_sys,
+*   |     |    SP_usr      Note:
+*   | SVC | -= 0x8000        Mode switch via CPSR is not allowed once in
+*   |     |                  non-privileged mode, so we take care not to enter
+*   +-----+ <- SP_svc        "User" to set up its sp, and also skip most
+*   |     |                  operations if already in that mode.
+*   | IRQ | -= 0x2000
+*   |     |                Input parameters:
+* ^ +-----+ <- SP_und       - sp - Initialized SP
+* s |     |                 - r2 - May contain SL value from semihosting
+* t | UND | -= 0x1000              SYS_HEAPINFO call
+* a |     |                Scratch registers:
+* c +-----+ <- SP_und       - r1 - new value of CPSR
+* k |     |                 - r2 - intermediate value (in standalone mode)
+*   | ABT | -= 0x1000       - r3 - new SP value
+* g |     |                 - r4 - save/restore CPSR on entry/exit
+* r +-----+ <- SP_abt,
+* o |     |    SL_fiq        Declared as "weak" so that user can write and use
+* w | FIQ | -= 0x1000      his own implementation if current doesn't fit.
+* t |     |
+* h +-----+ <- initial SP,
+*           becomes SP_fiq
+*
+******************************************************************************/
+    .align    0
+    FUNC_START    _stack_init
+    .weak FUNCTION (_stack_init)
+    FN_EH_START
+
+    /* M profile doesn't have CPSR register.  */
+#if (__ARM_ARCH_PROFILE != 'M')
+    /* Following code is compatible for both ARM and Thumb ISA */
+    mrs    r4, CPSR
+    /* Test mode bits - in User of all are 0 */
+    tst    r4, #(CPSR_M_MASK)
+    /* "eq" means r4 AND #0x0F is 0 */
+    beq    .Lskip_cpu_modes
+
+    mov    r3, sp /* save input SP value */
+
+    /* FIQ mode, interrupts disabled */
+    mov    r1, #(CPSR_M_FIQ|CPSR_M_32BIT|CPSR_I_MASK|CPSR_F_MASK)
+    msr    CPSR_c, r1
+    mov    sp, r3
+    sub    sl, sp, #0x1000    /* FIQ mode has its own SL */
+
+    /* Abort mode, interrupts disabled */
+    mov    r3, sl
+    mov    r1, #(CPSR_M_ABT|CPSR_M_32BIT|CPSR_I_MASK|CPSR_F_MASK)
+    msr    CPSR_c, r1
+    mov    sp, r3
+    sub    r3, r3, #0x1000
+
+    /* Undefined mode, interrupts disabled */
+    mov    r1, #(CPSR_M_UND|CPSR_M_32BIT|CPSR_I_MASK|CPSR_F_MASK)
+    msr    CPSR_c, r1
+    mov    sp, r3
+    sub    r3, r3, #0x1000
+
+    /* IRQ mode, interrupts disabled */
+    mov    r1, #(CPSR_M_IRQ|CPSR_M_32BIT|CPSR_I_MASK|CPSR_F_MASK)
+    msr    CPSR_c, r1
+    mov    sp, r3
+    sub    r3, r3, #0x2000
+
+    /* Supervisory mode, interrupts disabled */
+    mov    r1, #(CPSR_M_SVR|CPSR_M_32BIT|CPSR_I_MASK|CPSR_F_MASK)
+    msr    CPSR_c, r1
+    mov    sp, r3
+
+    sub    r3, r3, #0x8000    /* Min size 32k */
+    bic    r3, r3, #0x00FF    /* Align with current 64k block */
+    bic    r3, r3, #0xFF00
+
+# if __ARM_ARCH >= 4
+    /* System (shares regs with User) mode, interrupts disabled */
+    mov    r1, #(CPSR_M_SYS|CPSR_M_32BIT|CPSR_I_MASK|CPSR_F_MASK)
+    msr    CPSR_c, r1
+    mov    sp, r3
+# else
+    /* Keep this for ARMv3, but GCC actually dropped it. */
+    /* Move value into user mode sp without changing modes, */
+    /* via '^' form of ldm */
+    str    r3, [r3, #-4]
+    ldmdb    r3, {sp}^
+# endif
+
+    /* Back to original mode, presumably SVC, with diabled FIQ/IRQ */
+    orr    r4, r4, #(CPSR_I_MASK|CPSR_F_MASK)
+    msr    CPSR_c, r4
+
+.Lskip_cpu_modes:
+#endif
+
+    /* Set SL register */
+#if defined (ARM_RDI_MONITOR) /* semihosting */
+    cmp    r2, #0
+    beq    .Lsl_forced_zero
+    /* allow slop for stack overflow handling and small frames */
+# ifdef THUMB1_ONLY
+    adds    r2, #128
+    adds    r2, #128
+    mov    sl, r2
+# else
+    add    sl, r2, #256
+# endif
+.Lsl_forced_zero:
+
+#else /* standalone */
+    /* r3 contains SP for System/User mode. Set SL = SP - 0x10000 */
+    #ifdef THUMB1_ONLY
+    movs    r2, #64
+    lsls    r2, r2, #10
+    subs    r2, r3, r2
+    mov    sl, r2
+    #else
+    /* Still assumes 256bytes below sl */
+    sub    sl, r3, #64 << 10
+    #endif
+#endif
+
+    FN_RETURN
+    FN_EH_END
+
+
 /*******************************************************************************
 * Main library startup code.
 *******************************************************************************/
     .align     0
     FUNC_START    _mainCRTStartup
     FUNC_START    _start
-#if defined(__ELF__) && !defined(__USING_SJLJ_EXCEPTIONS__)
-    /* Annotation for EABI unwinding tables.  */
-    .fnstart
-#endif
+    FN_EH_START

     /* __ARM_ARCH_PROFILE is defined from GCC 4.8 onwards, however
__ARM_ARCH_7A
     has been defined since 4.2 onwards, which is when v7-a support was added
@@ -150,36 +319,27 @@
           to skip setting sp/sl to 0 here.
         - Considering M-profile processors, We might want to initialize
           sp by the first entry of vector table and return 0 to SYS_HEAPINFO
-          semihosting call, which will be skipped here.  */
+          semihosting call, which will be skipped here.
+        - Considering R-profile processors there is no automatic SP
init by hardware
+          so we need to initialize it by default value. */
+    ldr    r3, .Lstack
     cmp    r1, #0
     beq    .LC26
-    mov    sp, r1
+    mov    r3, r1
 .LC26:
-    cmp    r2, #0
-    beq    .LC27
+    mov    sp, r3

-    /*  Allow slop for stack overflow handling and small frames.  */
-#ifdef THUMB1_ONLY
-    adds    r2, #128
-    adds    r2, #128
-    mov    sl, r2
-#else
-    add    sl, r2, #256
-#endif
+    /* r2 (SL value) will be used in _stack_init */
+    bl FUNCTION (_stack_init)

-.LC27:
-#else
-    /*  Set up the stack pointer to a fixed value.  */
+
+#else /* standalone */
+    /*  Set up the stack pointer to a fixed value. */
     /*  Changes by toralf:
         - Allow linker script to provide stack via __stack symbol - see
           defintion of .Lstack
         - Provide "hooks" that may be used by the application to add
-          custom init code - see .Lhwinit and .Lswinit
-        - Go through all execution modes and set up stack for each of them.
-          Loosely based on init.s from ARM/Motorola example code.
-              Note: Mode switch via CPSR is not allowed once in non-privileged
-            mode, so we take care not to enter "User" to set up its sp,
-            and also skip most operations if already in that mode.  */
+          custom init code - see .Lhwinit and .Lswinit */

     ldr    r3, .Lstack
     cmp    r3, #0
@@ -198,57 +358,10 @@
          have somehow missed it below (in which case it gets the same
          value as FIQ - not ideal, but better than nothing).  */
     mov    sp, r3
-#ifdef PREFER_THUMB
-    /* XXX Fill in stack assignments for interrupt modes.  */
-#else
-    mrs    r2, CPSR
-    tst    r2, #0x0F    /* Test mode bits - in User of all are 0.  */
-    beq    .LC23        /* "eq" means r2 AND #0x0F is 0.  */
-    msr     CPSR_c, #0xD1    /* FIRQ mode, interrupts disabled.  */
-    mov     sp, r3
-    sub    sl, sp, #0x1000    /* This mode also has its own sl (see below).  */
-
-    mov    r3, sl
-    msr     CPSR_c, #0xD7    /* Abort mode, interrupts disabled.  */
-    mov    sp, r3
-    sub    r3, r3, #0x1000
-
-    msr     CPSR_c, #0xDB    /* Undefined mode, interrupts disabled.  */
-    mov    sp, r3
-    sub    r3, r3, #0x1000

-    msr     CPSR_c, #0xD2    /* IRQ mode, interrupts disabled.  */
-    mov    sp, r3
-    sub    r3, r3, #0x2000
-
-    msr     CPSR_c, #0xD3    /* Supervisory mode, interrupts disabled.  */
+    /* we don't care of r2 value in standalone */
+    bl FUNCTION (_stack_init)

-    mov    sp, r3
-    sub    r3, r3, #0x8000    /* Min size 32k.  */
-    bic    r3, r3, #0x00FF    /* Align with current 64k block.  */
-    bic    r3, r3, #0xFF00
-
-    str    r3, [r3, #-4]    /* Move value into user mode sp without */
-    ldmdb    r3, {sp}^       /* changing modes, via '^' form of ldm.  */
-    orr    r2, r2, #0xC0    /* Back to original mode, presumably SVC, */
-    msr    CPSR_c, r2    /* with FIQ/IRQ disable bits forced to 1.  */
-#endif
-.LC23:
-    /* Setup a default stack-limit in-case the code has been
-       compiled with "-mapcs-stack-check".  Hard-wiring this value
-       is not ideal, since there is currently no support for
-       checking that the heap and stack have not collided, or that
-       this default 64k is enough for the program being executed.
-       However, it ensures that this simple crt0 world will not
-       immediately cause an overflow event:  */
-#ifdef THUMB1_ONLY
-    movs    r2, #64
-    lsls    r2, r2, #10
-    subs    r2, r3, r2
-    mov    sl, r2
-#else
-    sub    sl, r3, #64 << 10    /* Still assumes 256bytes below sl.  */
-#endif
 #endif
 #endif
     /* Zero the memory in the .bss section.  */
@@ -447,6 +560,8 @@ change_back:
     swi    SWI_Exit
 #endif

+    FN_EH_END
+
     /* For Thumb, constants must be after the code since only
        positive offsets are supported for PC relative addresses.  */
     .align 0
@@ -464,9 +579,6 @@ change_back:
 #else
     .word    0x80000            /* Top of RAM on the PIE board.  */
 #endif
-
-.Lstack:
-    .word    __stack
 .Lhwinit:
     .word    FUNCTION (hardware_init_hook)
 .Lswinit:
@@ -479,17 +591,16 @@ change_back:
        and only if, a normal version of the same symbol isn't provided
        e.g. by a linker script or another object file.)  */

-    .weak __stack
     .weak FUNCTION (hardware_init_hook)
     .weak FUNCTION (software_init_hook)
 #endif

 #endif
-#if defined(__ELF__) && !defined(__USING_SJLJ_EXCEPTIONS__)
-    /* Protect against unhandled exceptions.  */
-    .cantunwind
-    .fnend
-#endif
+
+.Lstack:
+    .word    __stack
+    .weak    __stack
+
 .LC1:
     .word    __bss_start__
 .LC2:
diff --git a/newlib/libc/sys/arm/arm.h b/newlib/libc/sys/arm/arm.h
index 0489f2d92..04b3a5ac7 100644
--- a/newlib/libc/sys/arm/arm.h
+++ b/newlib/libc/sys/arm/arm.h
@@ -61,4 +61,30 @@
 # define HAVE_CALL_INDIRECT
 #endif

+/* A and R profiles (and legacy Arm).
+    Current Program Status Register (CPSR)
+    M[4:0]        Mode bits. M[4] is always 1 for 32-bit modes.
+    T[5]            1: Thumb, 0: ARM instruction set
+    F[6]            1: disables FIQ
+    I[7]            1: disables IRQ
+    A[8]            1: disables imprecise aborts
+    E[9]            0: Little-endian, 1: Big-endian
+    J[24]            1: Jazelle instruction set
+ */
+#define CPSR_M_USR            0x00    /* User mode */
+#define CPSR_M_FIQ            0x01    /* Fast Interrupt mode */
+#define CPSR_M_IRQ            0x02    /* Interrupt mode */
+#define CPSR_M_SVR            0x03    /* Supervisor mode */
+#define CPSR_M_MON            0x06    /* Monitor mode */
+#define CPSR_M_ABT            0x07    /* Abort mode */
+#define CPSR_M_HYP            0x0A    /* Hypervisor mode */
+#define CPSR_M_UND            0x0B    /* Undefined mode */
+#define CPSR_M_SYS            0x0F    /* System mode */
+#define CPSR_M_32BIT        0x10    /* 32-bit mode */
+#define CPSR_T_BIT            0x20    /* Thumb bit */
+#define CPSR_F_MASK            0x40    /* FIQ bit */
+#define CPSR_I_MASK            0x80    /* IRQ bit */
+
+#define CPSR_M_MASK            0x0F    /* Mode mask except M[4] */
+
 #endif /* _LIBGLOSS_ARM_H */
diff --git a/newlib/libc/sys/arm/crt0.S b/newlib/libc/sys/arm/crt0.S
index 7a6b40d9a..0929a435e 100644
--- a/newlib/libc/sys/arm/crt0.S
+++ b/newlib/libc/sys/arm/crt0.S
@@ -59,6 +59,21 @@
 .endm
 #endif

+/* Annotation for EABI unwinding tables. */
+.macro FN_EH_START
+#if defined(__ELF__) && !defined(__USING_SJLJ_EXCEPTIONS__)
+    .fnstart
+#endif
+.endm
+
+.macro FN_EH_END
+#if defined(__ELF__) && !defined(__USING_SJLJ_EXCEPTIONS__)
+    /* Protect against unhandled exceptions.  */
+    .cantunwind
+    .fnend
+#endif
+.endm
+
 .macro indirect_call reg
 #ifdef HAVE_CALL_INDIRECT
     blx \reg
@@ -68,16 +83,170 @@
 #endif
 .endm

+/* For armv4t and newer, toolchains will transparently convert
+   'bx lr' to 'mov pc, lr' if needed. GCC has deprecated support
+   for anything older than armv4t, but this should handle that
+   corner case in case anyone needs it anyway */
+.macro  FN_RETURN
+#if __ARM_ARCH <= 4 && __ARM_ARCH_ISA_THUMB == 0
+    mov    pc, lr
+#else
+    bx    lr
+#endif
+.endm
+
+
+
+/******************************************************************************
+* User mode only:           This routine makes default target specific Stack
+*   +-----+ <- SL_sys,    Pointer initialization for different processor modes:
+*   |     |    SL_usr     FIQ, Abort, IRQ, Undefined, Supervisor, System (User)
+*   | SYS |               and setups a default Stack Limit in-case the code has
+*   | USR | -=0x10000     been compiled with "-mapcs-stack-check" for FIQ and
+*   |     |               System (User) modes.
+*   |     |
+*   +-----+ <- initial SP,
+*           becomes SP_sys   Hard-wiring SL value is not ideal, since there is
+*           and SL_usr     currently no support for checking that the heap and
+*                          stack have not collided, or that this default 64k is
+* All modes:               is enough for the program being executed. However,
+*   +-----+ <- SL_sys,     it ensures that this simple crt0 world will not
+*   |     |    SL_usr      immediately cause an overflow event.
+*   | SYS |
+*   | USR | -=0x10000        We go through all execution modes and set up SP
+*   |     |                for each of them.
+*   +-----+ <- SP_sys,
+*   |     |    SP_usr      Note:
+*   | SVC | -= 0x8000        Mode switch via CPSR is not allowed once in
+*   |     |                  non-privileged mode, so we take care not to enter
+*   +-----+ <- SP_svc        "User" to set up its sp, and also skip most
+*   |     |                  operations if already in that mode.
+*   | IRQ | -= 0x2000
+*   |     |                Input parameters:
+* ^ +-----+ <- SP_und       - sp - Initialized SP
+* s |     |                 - r2 - May contain SL value from semihosting
+* t | UND | -= 0x1000              SYS_HEAPINFO call
+* a |     |                Scratch registers:
+* c +-----+ <- SP_und       - r1 - new value of CPSR
+* k |     |                 - r2 - intermediate value (in standalone mode)
+*   | ABT | -= 0x1000       - r3 - new SP value
+* g |     |                 - r4 - save/restore CPSR on entry/exit
+* r +-----+ <- SP_abt,
+* o |     |    SL_fiq        Declared as "weak" so that user can write and use
+* w | FIQ | -= 0x1000      his own implementation if current doesn't fit.
+* t |     |
+* h +-----+ <- initial SP,
+*           becomes SP_fiq
+*
+******************************************************************************/
+    .align    0
+    FUNC_START    _stack_init
+    .weak FUNCTION (_stack_init)
+    FN_EH_START
+
+    /* M profile doesn't have CPSR register.  */
+#if (__ARM_ARCH_PROFILE != 'M')
+    /* Following code is compatible for both ARM and Thumb ISA */
+    mrs    r4, CPSR
+    /* Test mode bits - in User of all are 0 */
+    tst    r4, #(CPSR_M_MASK)
+    /* "eq" means r4 AND #0x0F is 0 */
+    beq    .Lskip_cpu_modes
+
+    mov    r3, sp /* save input SP value */
+
+    /* FIQ mode, interrupts disabled */
+    mov    r1, #(CPSR_M_FIQ|CPSR_M_32BIT|CPSR_I_MASK|CPSR_F_MASK)
+    msr    CPSR_c, r1
+    mov    sp, r3
+    sub    sl, sp, #0x1000    /* FIQ mode has its own SL */
+
+    /* Abort mode, interrupts disabled */
+    mov    r3, sl
+    mov    r1, #(CPSR_M_ABT|CPSR_M_32BIT|CPSR_I_MASK|CPSR_F_MASK)
+    msr    CPSR_c, r1
+    mov    sp, r3
+    sub    r3, r3, #0x1000
+
+    /* Undefined mode, interrupts disabled */
+    mov    r1, #(CPSR_M_UND|CPSR_M_32BIT|CPSR_I_MASK|CPSR_F_MASK)
+    msr    CPSR_c, r1
+    mov    sp, r3
+    sub    r3, r3, #0x1000
+
+    /* IRQ mode, interrupts disabled */
+    mov    r1, #(CPSR_M_IRQ|CPSR_M_32BIT|CPSR_I_MASK|CPSR_F_MASK)
+    msr    CPSR_c, r1
+    mov    sp, r3
+    sub    r3, r3, #0x2000
+
+    /* Supervisory mode, interrupts disabled */
+    mov    r1, #(CPSR_M_SVR|CPSR_M_32BIT|CPSR_I_MASK|CPSR_F_MASK)
+    msr    CPSR_c, r1
+    mov    sp, r3
+
+    sub    r3, r3, #0x8000    /* Min size 32k */
+    bic    r3, r3, #0x00FF    /* Align with current 64k block */
+    bic    r3, r3, #0xFF00
+
+# if __ARM_ARCH >= 4
+    /* System (shares regs with User) mode, interrupts disabled */
+    mov    r1, #(CPSR_M_SYS|CPSR_M_32BIT|CPSR_I_MASK|CPSR_F_MASK)
+    msr    CPSR_c, r1
+    mov    sp, r3
+# else
+    /* Keep this for ARMv3, but GCC actually dropped it. */
+    /* Move value into user mode sp without changing modes, */
+    /* via '^' form of ldm */
+    str    r3, [r3, #-4]
+    ldmdb    r3, {sp}^
+# endif
+
+    /* Back to original mode, presumably SVC, with diabled FIQ/IRQ */
+    orr    r4, r4, #(CPSR_I_MASK|CPSR_F_MASK)
+    msr    CPSR_c, r4
+
+.Lskip_cpu_modes:
+#endif
+
+    /* Set SL register */
+#if defined (ARM_RDI_MONITOR) /* semihosting */
+    cmp    r2, #0
+    beq    .Lsl_forced_zero
+    /* allow slop for stack overflow handling and small frames */
+# ifdef THUMB1_ONLY
+    adds    r2, #128
+    adds    r2, #128
+    mov    sl, r2
+# else
+    add    sl, r2, #256
+# endif
+.Lsl_forced_zero:
+
+#else /* standalone */
+    /* r3 contains SP for System/User mode. Set SL = SP - 0x10000 */
+    #ifdef THUMB1_ONLY
+    movs    r2, #64
+    lsls    r2, r2, #10
+    subs    r2, r3, r2
+    mov    sl, r2
+    #else
+    /* Still assumes 256bytes below sl */
+    sub    sl, r3, #64 << 10
+    #endif
+#endif
+
+    FN_RETURN
+    FN_EH_END
+
+
 /*******************************************************************************
 * Main library startup code.
 *******************************************************************************/
     .align     0
     FUNC_START    _mainCRTStartup
     FUNC_START    _start
-#if defined(__ELF__) && !defined(__USING_SJLJ_EXCEPTIONS__)
-    /* Annotation for EABI unwinding tables.  */
-    .fnstart
-#endif
+    FN_EH_START

 /* Start by setting up a stack.  */
 #ifdef ARM_RDP_MONITOR
@@ -130,36 +299,27 @@
           to skip setting sp/sl to 0 here.
         - Considering M-profile processors, We might want to initialize
           sp by the first entry of vector table and return 0 to SYS_HEAPINFO
-          semihosting call, which will be skipped here.  */
+          semihosting call, which will be skipped here.
+        - Considering R-profile processors there is no automatic SP
init by hardware
+          so we need to initialize it by default value. */
+    ldr    r3, .Lstack
     cmp    r1, #0
     beq    .LC26
-    mov    sp, r1
+    mov    r3, r1
 .LC26:
-    cmp    r2, #0
-    beq    .LC27
+    mov    sp, r3

-    /*  Allow slop for stack overflow handling and small frames.  */
-#ifdef THUMB1_ONLY
-    adds    r2, #128
-    adds    r2, #128
-    mov    sl, r2
-#else
-    add    sl, r2, #256
-#endif
+    /* r2 (SL value) will be used in _stack_init */
+    bl FUNCTION (_stack_init)

-.LC27:
-#else
-    /*  Set up the stack pointer to a fixed value.  */
+
+#else /* standalone */
+    /*  Set up the stack pointer to a fixed value. */
     /*  Changes by toralf:
         - Allow linker script to provide stack via __stack symbol - see
           defintion of .Lstack
         - Provide "hooks" that may be used by the application to add
-          custom init code - see .Lhwinit and .Lswinit
-        - Go through all execution modes and set up stack for each of them.
-          Loosely based on init.s from ARM/Motorola example code.
-              Note: Mode switch via CPSR is not allowed once in non-privileged
-            mode, so we take care not to enter "User" to set up its sp,
-            and also skip most operations if already in that mode.  */
+          custom init code - see .Lhwinit and .Lswinit */

     ldr    r3, .Lstack
     cmp    r3, #0
@@ -178,57 +338,10 @@
          have somehow missed it below (in which case it gets the same
          value as FIQ - not ideal, but better than nothing).  */
     mov    sp, r3
-#ifdef PREFER_THUMB
-    /* XXX Fill in stack assignments for interrupt modes.  */
-#else
-    mrs    r2, CPSR
-    tst    r2, #0x0F    /* Test mode bits - in User of all are 0.  */
-    beq    .LC23        /* "eq" means r2 AND #0x0F is 0.  */
-    msr     CPSR_c, #0xD1    /* FIRQ mode, interrupts disabled.  */
-    mov     sp, r3
-    sub    sl, sp, #0x1000    /* This mode also has its own sl (see below).  */
-
-    mov    r3, sl
-    msr     CPSR_c, #0xD7    /* Abort mode, interrupts disabled.  */
-    mov    sp, r3
-    sub    r3, r3, #0x1000
-
-    msr     CPSR_c, #0xDB    /* Undefined mode, interrupts disabled.  */
-    mov    sp, r3
-    sub    r3, r3, #0x1000

-    msr     CPSR_c, #0xD2    /* IRQ mode, interrupts disabled.  */
-    mov    sp, r3
-    sub    r3, r3, #0x2000
-
-    msr     CPSR_c, #0xD3    /* Supervisory mode, interrupts disabled.  */
+    /* we don't care of r2 value in standalone */
+    bl FUNCTION (_stack_init)

-    mov    sp, r3
-    sub    r3, r3, #0x8000    /* Min size 32k.  */
-    bic    r3, r3, #0x00FF    /* Align with current 64k block.  */
-    bic    r3, r3, #0xFF00
-
-    str    r3, [r3, #-4]    /* Move value into user mode sp without */
-    ldmdb    r3, {sp}^       /* changing modes, via '^' form of ldm.  */
-    orr    r2, r2, #0xC0    /* Back to original mode, presumably SVC, */
-    msr    CPSR_c, r2    /* with FIQ/IRQ disable bits forced to 1.  */
-#endif
-.LC23:
-    /* Setup a default stack-limit in-case the code has been
-       compiled with "-mapcs-stack-check".  Hard-wiring this value
-       is not ideal, since there is currently no support for
-       checking that the heap and stack have not collided, or that
-       this default 64k is enough for the program being executed.
-       However, it ensures that this simple crt0 world will not
-       immediately cause an overflow event:  */
-#ifdef THUMB1_ONLY
-    movs    r2, #64
-    lsls    r2, r2, #10
-    subs    r2, r3, r2
-    mov    sl, r2
-#else
-    sub    sl, r3, #64 << 10    /* Still assumes 256bytes below sl.  */
-#endif
 #endif
 #endif
     /* Zero the memory in the .bss section.  */
@@ -421,6 +534,8 @@ change_back:
     swi    SWI_Exit
 #endif

+    FN_EH_END
+
     /* For Thumb, constants must be after the code since only
        positive offsets are supported for PC relative addresses.  */
     .align 0
@@ -438,9 +553,6 @@ change_back:
 #else
     .word    0x80000            /* Top of RAM on the PIE board.  */
 #endif
-
-.Lstack:
-    .word    __stack
 .Lhwinit:
     .word    FUNCTION (hardware_init_hook)
 .Lswinit:
@@ -453,17 +565,16 @@ change_back:
        and only if, a normal version of the same symbol isn't provided
        e.g. by a linker script or another object file.)  */

-    .weak __stack
     .weak FUNCTION (hardware_init_hook)
     .weak FUNCTION (software_init_hook)
 #endif

 #endif
-#if defined(__ELF__) && !defined(__USING_SJLJ_EXCEPTIONS__)
-    /* Protect against unhandled exceptions.  */
-    .cantunwind
-    .fnend
-#endif
+
+.Lstack:
+    .word    __stack
+    .weak    __stack
+
 .LC1:
     .word    __bss_start__
 .LC2:
-- 
2.20.1.windows.1


On Wed, Apr 10, 2019 at 6:57 PM Alexander Fedotov <alfedotov@gmail.com> wrote:
>
> Hi Richard
>
> New patch is below (only for libgloss).
> If it's fine with you I will do the same changes in newlib/libc/sys/arm as well.
>
> >This code is clearly ARM/Thumb2 only.  Have you tested that this builds
> >for, say ARM7TDMI in thumb state?  If you use a build of gcc configured
> >with --with-multilib-list=aprofile or --with-multilib-list=rmprofile
> >you'll get pretty comprehensive checking of all the variants that need
> >to be supported (beware, it can take quite a long time, even on a fast
> >machine).  At the very least, this code needs to be checked on:
>
> I have built gcc with --with-multilib-list=rmprofile
> Unfortunately I have no ability to check on all of them. At least this code is
> verified on Cortex-r52 that was the aim.
>
>
>
> >This implies dropping support for Armv3 as system mode was added in
> >Armv4.  That's probably not an issue now as GCC has also dropped such
> >support, but it should be added to the News.
> I don't see News file for newlib. But I have preserved original
> version in the patch.
> I suggest to drop Armv3 by another commit.
>
>
> I can't get why description is corrupted for _stack_init function in
> the patch below while lines
> are less than 80 symbols :(
>
>
> From c8ff1ecf005dba2a4a69c5754b1911c0d351aba5 Mon Sep 17 00:00:00 2001
> From: Alexander Fedotov <alfedotov@gmail.com>
> Date: Wed, 23 Jan 2019 13:33:14 +0300
> Subject: [PATCH] SP initialization changes:   1. set default value in
>  semihosting case as well   2. moved existing SP & SL init code for processor
>  modes in separate routine and made it as "hook"   3. init SP for processor
>  modes in Thumb mode as well Add new macro FN_RETURN, FN_EH_START and
>  FN_EH_END.
>
> ---
>  libgloss/arm/arm.h  |  26 +++++
>  libgloss/arm/crt0.S | 279 +++++++++++++++++++++++++++++++-------------
>  2 files changed, 224 insertions(+), 81 deletions(-)
>
> diff --git a/libgloss/arm/arm.h b/libgloss/arm/arm.h
> index 0489f2d92..04b3a5ac7 100644
> --- a/libgloss/arm/arm.h
> +++ b/libgloss/arm/arm.h
> @@ -61,4 +61,30 @@
>  # define HAVE_CALL_INDIRECT
>  #endif
>
> +/* A and R profiles (and legacy Arm).
> +    Current Program Status Register (CPSR)
> +    M[4:0]        Mode bits. M[4] is always 1 for 32-bit modes.
> +    T[5]            1: Thumb, 0: ARM instruction set
> +    F[6]            1: disables FIQ
> +    I[7]            1: disables IRQ
> +    A[8]            1: disables imprecise aborts
> +    E[9]            0: Little-endian, 1: Big-endian
> +    J[24]            1: Jazelle instruction set
> + */
> +#define CPSR_M_USR            0x00    /* User mode */
> +#define CPSR_M_FIQ            0x01    /* Fast Interrupt mode */
> +#define CPSR_M_IRQ            0x02    /* Interrupt mode */
> +#define CPSR_M_SVR            0x03    /* Supervisor mode */
> +#define CPSR_M_MON            0x06    /* Monitor mode */
> +#define CPSR_M_ABT            0x07    /* Abort mode */
> +#define CPSR_M_HYP            0x0A    /* Hypervisor mode */
> +#define CPSR_M_UND            0x0B    /* Undefined mode */
> +#define CPSR_M_SYS            0x0F    /* System mode */
> +#define CPSR_M_32BIT        0x10    /* 32-bit mode */
> +#define CPSR_T_BIT            0x20    /* Thumb bit */
> +#define CPSR_F_MASK            0x40    /* FIQ bit */
> +#define CPSR_I_MASK            0x80    /* IRQ bit */
> +
> +#define CPSR_M_MASK            0x0F    /* Mode mask except M[4] */
> +
>  #endif /* _LIBGLOSS_ARM_H */
> diff --git a/libgloss/arm/crt0.S b/libgloss/arm/crt0.S
> index c708f63d8..89256d5e8 100644
> --- a/libgloss/arm/crt0.S
> +++ b/libgloss/arm/crt0.S
> @@ -59,6 +59,21 @@
>  .endm
>  #endif
>
> +/* Annotation for EABI unwinding tables. */
> +.macro FN_EH_START
> +#if defined(__ELF__) && !defined(__USING_SJLJ_EXCEPTIONS__)
> +    .fnstart
> +#endif
> +.endm
> +
> +.macro FN_EH_END
> +#if defined(__ELF__) && !defined(__USING_SJLJ_EXCEPTIONS__)
> +    /* Protect against unhandled exceptions.  */
> +    .cantunwind
> +    .fnend
> +#endif
> +.endm
> +
>  .macro indirect_call reg
>  #ifdef HAVE_CALL_INDIRECT
>      blx \reg
> @@ -68,14 +83,170 @@
>  #endif
>  .endm
>
> -    .align     0
> +/* For armv4t and newer, toolchains will transparently convert
> +   'bx lr' to 'mov pc, lr' if needed. GCC has deprecated support
> +   for anything older than armv4t, but this should handle that
> +   corner case in case anyone needs it anyway */
> +.macro  FN_RETURN
> +#if __ARM_ARCH <= 4 && __ARM_ARCH_ISA_THUMB == 0
> +    mov    pc, lr
> +#else
> +    bx    lr
> +#endif
> +.endm
> +
> +
> +
> +/******************************************************************************
> +* User mode only:           This routine makes default target specific Stack
> +*   +-----+ <- SL_sys,    Pointer initialization for different processor modes:
> +*   |     |    SL_usr     FIQ, Abort, IRQ, Undefined, Supervisor, System (User)
> +*   | SYS |               and setups a default Stack Limit in-case the code has
> +*   | USR | -=0x10000     been compiled with "-mapcs-stack-check" for FIQ and
> +*   |     |               System (User) modes.
> +*   |     |
> +*   +-----+ <- initial SP,
> +*           becomes SP_sys   Hard-wiring SL value is not ideal, since there is
> +*           and SL_usr     currently no support for checking that the heap and
> +*                          stack have not collided, or that this default 64k is
> +* All modes:               is enough for the program being executed. However,
> +*   +-----+ <- SL_sys,     it ensures that this simple crt0 world will not
> +*   |     |    SL_usr      immediately cause an overflow event.
> +*   | SYS |
> +*   | USR | -=0x10000        We go through all execution modes and set up SP
> +*   |     |                for each of them.
> +*   +-----+ <- SP_sys,
> +*   |     |    SP_usr      Note:
> +*   | SVC | -= 0x8000        Mode switch via CPSR is not allowed once in
> +*   |     |                  non-privileged mode, so we take care not to enter
> +*   +-----+ <- SP_svc        "User" to set up its sp, and also skip most
> +*   |     |                  operations if already in that mode.
> +*   | IRQ | -= 0x2000
> +*   |     |                Input parameters:
> +* ^ +-----+ <- SP_und       - sp - Initialized SP
> +* s |     |                 - r2 - May contain SL value from semihosting
> +* t | UND | -= 0x1000              SYS_HEAPINFO call
> +* a |     |                Scratch registers:
> +* c +-----+ <- SP_und       - r1 - new value of CPSR
> +* k |     |                 - r2 - intermediate value (in standalone mode)
> +*   | ABT | -= 0x1000       - r3 - new SP value
> +* g |     |                 - r4 - save/restore CPSR on entry/exit
> +* r +-----+ <- SP_abt,
> +* o |     |    SL_fiq        Declared as "weak" so that user can write and use
> +* w | FIQ | -= 0x1000      his own implementation if current doesn't fit.
> +* t |     |
> +* h +-----+ <- initial SP,
> +*           becomes SP_fiq
> +*
> +******************************************************************************/
> +    .align    0
> +    FUNC_START    _stack_init
> +    .weak FUNCTION (_stack_init)
> +    FN_EH_START
> +
> +    /* M profile doesn't have CPSR register.  */
> +#if (__ARM_ARCH_PROFILE != 'M')
> +    /* Following code is compatible for both ARM and Thumb ISA */
> +    mrs    r4, CPSR
> +    /* Test mode bits - in User of all are 0 */
> +    tst    r4, #(CPSR_M_MASK)
> +    /* "eq" means r4 AND #0x0F is 0 */
> +    beq    .Lskip_cpu_modes
> +
> +    mov    r3, sp /* save input SP value */
> +
> +    /* FIQ mode, interrupts disabled */
> +    mov    r1, #(CPSR_M_FIQ|CPSR_M_32BIT|CPSR_I_MASK|CPSR_F_MASK)
> +    msr    CPSR_c, r1
> +    mov    sp, r3
> +    sub    sl, sp, #0x1000    /* FIQ mode has its own SL */
> +
> +    /* Abort mode, interrupts disabled */
> +    mov    r3, sl
> +    mov    r1, #(CPSR_M_ABT|CPSR_M_32BIT|CPSR_I_MASK|CPSR_F_MASK)
> +    msr    CPSR_c, r1
> +    mov    sp, r3
> +    sub    r3, r3, #0x1000
> +
> +    /* Undefined mode, interrupts disabled */
> +    mov    r1, #(CPSR_M_UND|CPSR_M_32BIT|CPSR_I_MASK|CPSR_F_MASK)
> +    msr    CPSR_c, r1
> +    mov    sp, r3
> +    sub    r3, r3, #0x1000
> +
> +    /* IRQ mode, interrupts disabled */
> +    mov    r1, #(CPSR_M_IRQ|CPSR_M_32BIT|CPSR_I_MASK|CPSR_F_MASK)
> +    msr    CPSR_c, r1
> +    mov    sp, r3
> +    sub    r3, r3, #0x2000
> +
> +    /* Supervisory mode, interrupts disabled */
> +    mov    r1, #(CPSR_M_SVR|CPSR_M_32BIT|CPSR_I_MASK|CPSR_F_MASK)
> +    msr    CPSR_c, r1
> +    mov    sp, r3
> +
> +    sub    r3, r3, #0x8000    /* Min size 32k */
> +    bic    r3, r3, #0x00FF    /* Align with current 64k block */
> +    bic    r3, r3, #0xFF00
> +
> +# if __ARM_ARCH > 3
> +    /* System (shares regs with User) mode, interrupts disabled */
> +    mov    r1, #(CPSR_M_SYS|CPSR_M_32BIT|CPSR_I_MASK|CPSR_F_MASK)
> +    msr    CPSR_c, r1
> +    mov    sp, r3
> +# else
> +    /* Keep this for ARMv3, but GCC actually dropped it. */
> +    /* Move value into user mode sp without changing modes, */
> +    /* via '^' form of ldm */
> +    str    r3, [r3, #-4]
> +    ldmdb    r3, {sp}^
> +# endif
> +
> +    /* Back to original mode, presumably SVC, with diabled FIQ/IRQ */
> +    orr    r4, r4, #(CPSR_I_MASK|CPSR_F_MASK)
> +    msr    CPSR_c, r4
> +
> +.Lskip_cpu_modes:
> +#endif
> +
> +    /* Set SL register */
> +#if defined (ARM_RDI_MONITOR) /* semihosting */
> +    cmp    r2, #0
> +    beq    .Lsl_forced_zero
> +    /* allow slop for stack overflow handling and small frames */
> +# ifdef THUMB1_ONLY
> +    adds    r2, #128
> +    adds    r2, #128
> +    mov    sl, r2
> +# else
> +    add    sl, r2, #256
> +# endif
> +.Lsl_forced_zero:
> +
> +#else /* standalone */
> +    /* r3 contains SP for System/User mode. Set SL = SP - 0x10000 */
> +    #ifdef THUMB1_ONLY
> +    movs    r2, #64
> +    lsls    r2, r2, #10
> +    subs    r2, r3, r2
> +    mov    sl, r2
> +    #else
> +    /* Still assumes 256bytes below sl */
> +    sub    sl, r3, #64 << 10
> +    #endif
> +#endif
> +
> +    FN_RETURN
> +    FN_EH_END
>
> +
> +/*******************************************************************************
> +* Main library startup code.
> +*******************************************************************************/
> +    .align     0
>      FUNC_START    _mainCRTStartup
>      FUNC_START    _start
> -#if defined(__ELF__) && !defined(__USING_SJLJ_EXCEPTIONS__)
> -    /* Annotation for EABI unwinding tables.  */
> -    .fnstart
> -#endif
> +    FN_EH_START
>
>      /* __ARM_ARCH_PROFILE is defined from GCC 4.8 onwards, however
> __ARM_ARCH_7A
>      has been defined since 4.2 onwards, which is when v7-a support was added
> @@ -148,34 +319,27 @@
>            to skip setting sp/sl to 0 here.
>          - Considering M-profile processors, We might want to initialize
>            sp by the first entry of vector table and return 0 to SYS_HEAPINFO
> -          semihosting call, which will be skipped here. */
> +          semihosting call, which will be skipped here.
> +        - Considering R-profile processors there is no automatic SP
> init by hardware
> +          so we need to initialize it by default value. */
> +    ldr    r3, .Lstack
>      cmp    r1, #0
>      beq    .LC26
> -    mov    sp, r1
> +    mov    r3, r1
>  .LC26:
> -    cmp    r2, #0
> -    beq    .LC27
> -    /*  allow slop for stack overflow handling and small frames */
> -#ifdef THUMB1_ONLY
> -    adds    r2, #128
> -    adds    r2, #128
> -    mov    sl, r2
> -#else
> -    add    sl, r2, #256
> -#endif
> -.LC27:
> -#else
> +    mov    sp, r3
> +
> +    /* r2 (SL value) will be used in _stack_init */
> +    bl FUNCTION (_stack_init)
> +
> +
> +#else /* standalone */
>      /*  Set up the stack pointer to a fixed value */
>      /*  Changes by toralf:
>          - Allow linker script to provide stack via __stack symbol - see
>            defintion of .Lstack
>          - Provide "hooks" that may be used by the application to add
> -          custom init code - see .Lhwinit and .Lswinit
> -        - Go through all execution modes and set up stack for each of them.
> -          Loosely based on init.s from ARM/Motorola example code.
> -              Note: Mode switch via CPSR is not allowed once in non-privileged
> -            mode, so we take care not to enter "User" to set up its sp,
> -            and also skip most operations if already in that mode. */
> +          custom init code - see .Lhwinit and .Lswinit */
>
>      ldr    r3, .Lstack
>      cmp    r3, #0
> @@ -194,57 +358,11 @@
>           have somehow missed it below (in which case it gets the same
>           value as FIQ - not ideal, but better than nothing.) */
>      mov    sp, r3
> -#ifdef PREFER_THUMB
> -    /* XXX Fill in stack assignments for interrupt modes.  */
> -#else
> -    mrs    r2, CPSR
> -    tst    r2, #0x0F    /* Test mode bits - in User of all are 0 */
> -    beq    .LC23        /* "eq" means r2 AND #0x0F is 0 */
> -    msr     CPSR_c, #0xD1    /* FIRQ mode, interrupts disabled */
> -    mov     sp, r3
> -    sub    sl, sp, #0x1000    /* This mode also has its own sl (see below) */
> -
> -    mov    r3, sl
> -    msr     CPSR_c, #0xD7    /* Abort mode, interrupts disabled */
> -    mov    sp, r3
> -    sub    r3, r3, #0x1000
> -
> -    msr     CPSR_c, #0xDB    /* Undefined mode, interrupts disabled */
> -    mov    sp, r3
> -    sub    r3, r3, #0x1000
>
> -    msr     CPSR_c, #0xD2    /* IRQ mode, interrupts disabled */
> -    mov    sp, r3
> -    sub    r3, r3, #0x2000
> -
> -    msr     CPSR_c, #0xD3    /* Supervisory mode, interrupts disabled */
> +    /* we don't care of r2 value in standalone */
> +    bl FUNCTION (_stack_init)
>
> -    mov    sp, r3
> -    sub    r3, r3, #0x8000    /* Min size 32k */
> -    bic    r3, r3, #0x00FF    /* Align with current 64k block */
> -    bic    r3, r3, #0xFF00
>
> -    str    r3, [r3, #-4]    /* Move value into user mode sp without */
> -    ldmdb    r3, {sp}^       /* changing modes, via '^' form of ldm */
> -    orr    r2, r2, #0xC0    /* Back to original mode, presumably SVC, */
> -    msr    CPSR_c, r2    /* with FIQ/IRQ disable bits forced to 1 */
> -#endif
> -.LC23:
> -    /* Setup a default stack-limit in-case the code has been
> -       compiled with "-mapcs-stack-check".  Hard-wiring this value
> -       is not ideal, since there is currently no support for
> -       checking that the heap and stack have not collided, or that
> -       this default 64k is enough for the program being executed.
> -       However, it ensures that this simple crt0 world will not
> -       immediately cause an overflow event:  */
> -#ifdef THUMB1_ONLY
> -    movs    r2, #64
> -    lsls    r2, r2, #10
> -    subs    r2, r3, r2
> -    mov    sl, r2
> -#else
> -    sub    sl, r3, #64 << 10    /* Still assumes 256bytes below sl */
> -#endif
>  #endif
>  #endif
>      /* Zero the memory in the .bss section.  */
> @@ -445,6 +563,8 @@ change_back:
>      swi    SWI_Exit
>  #endif
>
> +    FN_EH_END
> +
>      /* For Thumb, constants must be after the code since only
>         positive offsets are supported for PC relative addresses.  */
>
> @@ -464,8 +584,6 @@ change_back:
>  #else
>      .word    0x80000            /* Top of RAM on the PIE board.  */
>  #endif
> -.Lstack:
> -    .word    __stack
>  .Lhwinit:
>      .word    FUNCTION (hardware_init_hook)
>  .Lswinit:
> @@ -478,17 +596,16 @@ change_back:
>         and only if, a normal version of the same symbol isn't provided
>         e.g. by a linker script or another object file.) */
>
> -    .weak __stack
>      .weak FUNCTION (hardware_init_hook)
>      .weak FUNCTION (software_init_hook)
>  #endif
>
>  #endif
> -#if defined(__ELF__) && !defined(__USING_SJLJ_EXCEPTIONS__)
> -    /* Protect against unhandled exceptions.  */
> -    .cantunwind
> -    .fnend
> -#endif
> +
> +.Lstack:
> +    .word    __stack
> +    .weak    __stack
> +
>  .LC1:
>      .word    __bss_start__
>  .LC2:
> --
> 2.20.1.windows.1
>
>
>
> On Wed, Apr 10, 2019 at 5:55 PM Richard Earnshaw (lists)
> <Richard.Earnshaw@arm.com> wrote:
> >
> > On 10/04/2019 15:23, Corinna Vinschen wrote:
> > > Hi Richard,
> > >
> > > On Apr 10 13:26, Richard Earnshaw (lists) wrote:
> > >> On 19/03/2019 13:54, Alexander Fedotov wrote:
> > >>> Hi all
> > >>>
> > >>> In continue of threads
> > >>> https://sourceware.org/ml/newlib/2019/msg00117.html and
> > >>> https://sourceware.org/ml/newlib/2019/msg00102.html
> > >>>
> > >>> I have reworked stack pointer initialization and stack limit as well.
> > >>> General idea behind of all these changes is to provide hook for
> > >>> flexibility (including semihosting when debug tap is used) and improve
> > >>> readability of existing crt0 code.
> > >>> And now code makes SP initialization in Thumb mode :)
> > >>>
> > >>> Any comments are very appreciated.
> > >>>
> > >>> Alex
> > >>>
> > >>
> > >> Apologies once again for the delayed review.
> > >>
> > >> A small plea: when posting patches, please can you ensure that your
> > >> attachments are text/patch, or something similar, so that replies will
> > >> include the contents (and I can view them directly in the email rather
> > >> than having to mess about extracting the attachment).
> > >
> > > Ideally the patches are inline patches sent with `git send-email' or
> > > equivalent.  I added some wording in terms of how to provide patches to
> > > https://sourceware.org/newlib/ lately.  Do you have a suggestion for
> > > improving this?
> > >
> > >
> > > Thanks,
> > > Corinna
> > >
> >
> > Not really;  When I use git format-patch with --attach I get
> > text/x-patch as the MIME type, which is fine.  Problems only start if
> > users have poorly configured mail client that attach as octet-stream or
> > other binary types.  Then mail clients can't display the attachment
> > inline as they don't know the format is really displayable.
> >
> > R.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [Arm] SP and SL initialization refactored
  2019-04-12 12:38         ` Alexander Fedotov
@ 2019-04-15 10:21           ` Alexander Fedotov
  0 siblings, 0 replies; 12+ messages in thread
From: Alexander Fedotov @ 2019-04-15 10:21 UTC (permalink / raw)
  To: Richard Earnshaw (lists); +Cc: Tamar Christina, Newlib, nd

New patch (includes libgloss and libc crt0) is here
https://sourceware.org/ml/newlib/2019/msg00170.html

Alex

On Fri, Apr 12, 2019 at 3:38 PM Alexander Fedotov <alfedotov@gmail.com> wrote:
>
> Hi Richard
> New patch (rebased on
> https://sourceware.org/ml/newlib/2019/msg00164.html ) is below.
>
> OK to apply on master ?
>
>
> From b7ad80f440e593b363bd0240e8d2476da7b6160d Mon Sep 17 00:00:00 2001
> From: Alexander Fedotov <alfedotov@gmail.com>
> Date: Wed, 23 Jan 2019 13:33:14 +0300
> Subject: [PATCH] [Arm] Stack Pointer and Stack Limit initialization
>  refactored.
>
> SP initialization changes:
>   1. set default value in semihosting case as well
>   2. moved existing SP & SL init code for processor modes in separate
> routine and made it as "hook"
>   3. init SP for processor modes in Thumb mode as well
>
> Add new macro FN_RETURN, FN_EH_START and FN_EH_END.
> ---
>  libgloss/arm/arm.h         |  26 ++++
>  libgloss/arm/crt0.S        | 277 ++++++++++++++++++++++++++-----------
>  newlib/libc/sys/arm/arm.h  |  26 ++++
>  newlib/libc/sys/arm/crt0.S | 277 ++++++++++++++++++++++++++-----------
>  4 files changed, 440 insertions(+), 166 deletions(-)
>
> diff --git a/libgloss/arm/arm.h b/libgloss/arm/arm.h
> index 0489f2d92..04b3a5ac7 100644
> --- a/libgloss/arm/arm.h
> +++ b/libgloss/arm/arm.h
> @@ -61,4 +61,30 @@
>  # define HAVE_CALL_INDIRECT
>  #endif
>
> +/* A and R profiles (and legacy Arm).
> +    Current Program Status Register (CPSR)
> +    M[4:0]        Mode bits. M[4] is always 1 for 32-bit modes.
> +    T[5]            1: Thumb, 0: ARM instruction set
> +    F[6]            1: disables FIQ
> +    I[7]            1: disables IRQ
> +    A[8]            1: disables imprecise aborts
> +    E[9]            0: Little-endian, 1: Big-endian
> +    J[24]            1: Jazelle instruction set
> + */
> +#define CPSR_M_USR            0x00    /* User mode */
> +#define CPSR_M_FIQ            0x01    /* Fast Interrupt mode */
> +#define CPSR_M_IRQ            0x02    /* Interrupt mode */
> +#define CPSR_M_SVR            0x03    /* Supervisor mode */
> +#define CPSR_M_MON            0x06    /* Monitor mode */
> +#define CPSR_M_ABT            0x07    /* Abort mode */
> +#define CPSR_M_HYP            0x0A    /* Hypervisor mode */
> +#define CPSR_M_UND            0x0B    /* Undefined mode */
> +#define CPSR_M_SYS            0x0F    /* System mode */
> +#define CPSR_M_32BIT        0x10    /* 32-bit mode */
> +#define CPSR_T_BIT            0x20    /* Thumb bit */
> +#define CPSR_F_MASK            0x40    /* FIQ bit */
> +#define CPSR_I_MASK            0x80    /* IRQ bit */
> +
> +#define CPSR_M_MASK            0x0F    /* Mode mask except M[4] */
> +
>  #endif /* _LIBGLOSS_ARM_H */
> diff --git a/libgloss/arm/crt0.S b/libgloss/arm/crt0.S
> index 1deb73aa5..95aa3b578 100644
> --- a/libgloss/arm/crt0.S
> +++ b/libgloss/arm/crt0.S
> @@ -59,6 +59,21 @@
>  .endm
>  #endif
>
> +/* Annotation for EABI unwinding tables. */
> +.macro FN_EH_START
> +#if defined(__ELF__) && !defined(__USING_SJLJ_EXCEPTIONS__)
> +    .fnstart
> +#endif
> +.endm
> +
> +.macro FN_EH_END
> +#if defined(__ELF__) && !defined(__USING_SJLJ_EXCEPTIONS__)
> +    /* Protect against unhandled exceptions.  */
> +    .cantunwind
> +    .fnend
> +#endif
> +.endm
> +
>  .macro indirect_call reg
>  #ifdef HAVE_CALL_INDIRECT
>      blx \reg
> @@ -68,16 +83,170 @@
>  #endif
>  .endm
>
> +/* For armv4t and newer, toolchains will transparently convert
> +   'bx lr' to 'mov pc, lr' if needed. GCC has deprecated support
> +   for anything older than armv4t, but this should handle that
> +   corner case in case anyone needs it anyway */
> +.macro  FN_RETURN
> +#if __ARM_ARCH <= 4 && __ARM_ARCH_ISA_THUMB == 0
> +    mov    pc, lr
> +#else
> +    bx    lr
> +#endif
> +.endm
> +
> +
> +
> +/******************************************************************************
> +* User mode only:           This routine makes default target specific Stack
> +*   +-----+ <- SL_sys,    Pointer initialization for different processor modes:
> +*   |     |    SL_usr     FIQ, Abort, IRQ, Undefined, Supervisor, System (User)
> +*   | SYS |               and setups a default Stack Limit in-case the code has
> +*   | USR | -=0x10000     been compiled with "-mapcs-stack-check" for FIQ and
> +*   |     |               System (User) modes.
> +*   |     |
> +*   +-----+ <- initial SP,
> +*           becomes SP_sys   Hard-wiring SL value is not ideal, since there is
> +*           and SL_usr     currently no support for checking that the heap and
> +*                          stack have not collided, or that this default 64k is
> +* All modes:               is enough for the program being executed. However,
> +*   +-----+ <- SL_sys,     it ensures that this simple crt0 world will not
> +*   |     |    SL_usr      immediately cause an overflow event.
> +*   | SYS |
> +*   | USR | -=0x10000        We go through all execution modes and set up SP
> +*   |     |                for each of them.
> +*   +-----+ <- SP_sys,
> +*   |     |    SP_usr      Note:
> +*   | SVC | -= 0x8000        Mode switch via CPSR is not allowed once in
> +*   |     |                  non-privileged mode, so we take care not to enter
> +*   +-----+ <- SP_svc        "User" to set up its sp, and also skip most
> +*   |     |                  operations if already in that mode.
> +*   | IRQ | -= 0x2000
> +*   |     |                Input parameters:
> +* ^ +-----+ <- SP_und       - sp - Initialized SP
> +* s |     |                 - r2 - May contain SL value from semihosting
> +* t | UND | -= 0x1000              SYS_HEAPINFO call
> +* a |     |                Scratch registers:
> +* c +-----+ <- SP_und       - r1 - new value of CPSR
> +* k |     |                 - r2 - intermediate value (in standalone mode)
> +*   | ABT | -= 0x1000       - r3 - new SP value
> +* g |     |                 - r4 - save/restore CPSR on entry/exit
> +* r +-----+ <- SP_abt,
> +* o |     |    SL_fiq        Declared as "weak" so that user can write and use
> +* w | FIQ | -= 0x1000      his own implementation if current doesn't fit.
> +* t |     |
> +* h +-----+ <- initial SP,
> +*           becomes SP_fiq
> +*
> +******************************************************************************/
> +    .align    0
> +    FUNC_START    _stack_init
> +    .weak FUNCTION (_stack_init)
> +    FN_EH_START
> +
> +    /* M profile doesn't have CPSR register.  */
> +#if (__ARM_ARCH_PROFILE != 'M')
> +    /* Following code is compatible for both ARM and Thumb ISA */
> +    mrs    r4, CPSR
> +    /* Test mode bits - in User of all are 0 */
> +    tst    r4, #(CPSR_M_MASK)
> +    /* "eq" means r4 AND #0x0F is 0 */
> +    beq    .Lskip_cpu_modes
> +
> +    mov    r3, sp /* save input SP value */
> +
> +    /* FIQ mode, interrupts disabled */
> +    mov    r1, #(CPSR_M_FIQ|CPSR_M_32BIT|CPSR_I_MASK|CPSR_F_MASK)
> +    msr    CPSR_c, r1
> +    mov    sp, r3
> +    sub    sl, sp, #0x1000    /* FIQ mode has its own SL */
> +
> +    /* Abort mode, interrupts disabled */
> +    mov    r3, sl
> +    mov    r1, #(CPSR_M_ABT|CPSR_M_32BIT|CPSR_I_MASK|CPSR_F_MASK)
> +    msr    CPSR_c, r1
> +    mov    sp, r3
> +    sub    r3, r3, #0x1000
> +
> +    /* Undefined mode, interrupts disabled */
> +    mov    r1, #(CPSR_M_UND|CPSR_M_32BIT|CPSR_I_MASK|CPSR_F_MASK)
> +    msr    CPSR_c, r1
> +    mov    sp, r3
> +    sub    r3, r3, #0x1000
> +
> +    /* IRQ mode, interrupts disabled */
> +    mov    r1, #(CPSR_M_IRQ|CPSR_M_32BIT|CPSR_I_MASK|CPSR_F_MASK)
> +    msr    CPSR_c, r1
> +    mov    sp, r3
> +    sub    r3, r3, #0x2000
> +
> +    /* Supervisory mode, interrupts disabled */
> +    mov    r1, #(CPSR_M_SVR|CPSR_M_32BIT|CPSR_I_MASK|CPSR_F_MASK)
> +    msr    CPSR_c, r1
> +    mov    sp, r3
> +
> +    sub    r3, r3, #0x8000    /* Min size 32k */
> +    bic    r3, r3, #0x00FF    /* Align with current 64k block */
> +    bic    r3, r3, #0xFF00
> +
> +# if __ARM_ARCH >= 4
> +    /* System (shares regs with User) mode, interrupts disabled */
> +    mov    r1, #(CPSR_M_SYS|CPSR_M_32BIT|CPSR_I_MASK|CPSR_F_MASK)
> +    msr    CPSR_c, r1
> +    mov    sp, r3
> +# else
> +    /* Keep this for ARMv3, but GCC actually dropped it. */
> +    /* Move value into user mode sp without changing modes, */
> +    /* via '^' form of ldm */
> +    str    r3, [r3, #-4]
> +    ldmdb    r3, {sp}^
> +# endif
> +
> +    /* Back to original mode, presumably SVC, with diabled FIQ/IRQ */
> +    orr    r4, r4, #(CPSR_I_MASK|CPSR_F_MASK)
> +    msr    CPSR_c, r4
> +
> +.Lskip_cpu_modes:
> +#endif
> +
> +    /* Set SL register */
> +#if defined (ARM_RDI_MONITOR) /* semihosting */
> +    cmp    r2, #0
> +    beq    .Lsl_forced_zero
> +    /* allow slop for stack overflow handling and small frames */
> +# ifdef THUMB1_ONLY
> +    adds    r2, #128
> +    adds    r2, #128
> +    mov    sl, r2
> +# else
> +    add    sl, r2, #256
> +# endif
> +.Lsl_forced_zero:
> +
> +#else /* standalone */
> +    /* r3 contains SP for System/User mode. Set SL = SP - 0x10000 */
> +    #ifdef THUMB1_ONLY
> +    movs    r2, #64
> +    lsls    r2, r2, #10
> +    subs    r2, r3, r2
> +    mov    sl, r2
> +    #else
> +    /* Still assumes 256bytes below sl */
> +    sub    sl, r3, #64 << 10
> +    #endif
> +#endif
> +
> +    FN_RETURN
> +    FN_EH_END
> +
> +
>  /*******************************************************************************
>  * Main library startup code.
>  *******************************************************************************/
>      .align     0
>      FUNC_START    _mainCRTStartup
>      FUNC_START    _start
> -#if defined(__ELF__) && !defined(__USING_SJLJ_EXCEPTIONS__)
> -    /* Annotation for EABI unwinding tables.  */
> -    .fnstart
> -#endif
> +    FN_EH_START
>
>      /* __ARM_ARCH_PROFILE is defined from GCC 4.8 onwards, however
> __ARM_ARCH_7A
>      has been defined since 4.2 onwards, which is when v7-a support was added
> @@ -150,36 +319,27 @@
>            to skip setting sp/sl to 0 here.
>          - Considering M-profile processors, We might want to initialize
>            sp by the first entry of vector table and return 0 to SYS_HEAPINFO
> -          semihosting call, which will be skipped here.  */
> +          semihosting call, which will be skipped here.
> +        - Considering R-profile processors there is no automatic SP
> init by hardware
> +          so we need to initialize it by default value. */
> +    ldr    r3, .Lstack
>      cmp    r1, #0
>      beq    .LC26
> -    mov    sp, r1
> +    mov    r3, r1
>  .LC26:
> -    cmp    r2, #0
> -    beq    .LC27
> +    mov    sp, r3
>
> -    /*  Allow slop for stack overflow handling and small frames.  */
> -#ifdef THUMB1_ONLY
> -    adds    r2, #128
> -    adds    r2, #128
> -    mov    sl, r2
> -#else
> -    add    sl, r2, #256
> -#endif
> +    /* r2 (SL value) will be used in _stack_init */
> +    bl FUNCTION (_stack_init)
>
> -.LC27:
> -#else
> -    /*  Set up the stack pointer to a fixed value.  */
> +
> +#else /* standalone */
> +    /*  Set up the stack pointer to a fixed value. */
>      /*  Changes by toralf:
>          - Allow linker script to provide stack via __stack symbol - see
>            defintion of .Lstack
>          - Provide "hooks" that may be used by the application to add
> -          custom init code - see .Lhwinit and .Lswinit
> -        - Go through all execution modes and set up stack for each of them.
> -          Loosely based on init.s from ARM/Motorola example code.
> -              Note: Mode switch via CPSR is not allowed once in non-privileged
> -            mode, so we take care not to enter "User" to set up its sp,
> -            and also skip most operations if already in that mode.  */
> +          custom init code - see .Lhwinit and .Lswinit */
>
>      ldr    r3, .Lstack
>      cmp    r3, #0
> @@ -198,57 +358,10 @@
>           have somehow missed it below (in which case it gets the same
>           value as FIQ - not ideal, but better than nothing).  */
>      mov    sp, r3
> -#ifdef PREFER_THUMB
> -    /* XXX Fill in stack assignments for interrupt modes.  */
> -#else
> -    mrs    r2, CPSR
> -    tst    r2, #0x0F    /* Test mode bits - in User of all are 0.  */
> -    beq    .LC23        /* "eq" means r2 AND #0x0F is 0.  */
> -    msr     CPSR_c, #0xD1    /* FIRQ mode, interrupts disabled.  */
> -    mov     sp, r3
> -    sub    sl, sp, #0x1000    /* This mode also has its own sl (see below).  */
> -
> -    mov    r3, sl
> -    msr     CPSR_c, #0xD7    /* Abort mode, interrupts disabled.  */
> -    mov    sp, r3
> -    sub    r3, r3, #0x1000
> -
> -    msr     CPSR_c, #0xDB    /* Undefined mode, interrupts disabled.  */
> -    mov    sp, r3
> -    sub    r3, r3, #0x1000
>
> -    msr     CPSR_c, #0xD2    /* IRQ mode, interrupts disabled.  */
> -    mov    sp, r3
> -    sub    r3, r3, #0x2000
> -
> -    msr     CPSR_c, #0xD3    /* Supervisory mode, interrupts disabled.  */
> +    /* we don't care of r2 value in standalone */
> +    bl FUNCTION (_stack_init)
>
> -    mov    sp, r3
> -    sub    r3, r3, #0x8000    /* Min size 32k.  */
> -    bic    r3, r3, #0x00FF    /* Align with current 64k block.  */
> -    bic    r3, r3, #0xFF00
> -
> -    str    r3, [r3, #-4]    /* Move value into user mode sp without */
> -    ldmdb    r3, {sp}^       /* changing modes, via '^' form of ldm.  */
> -    orr    r2, r2, #0xC0    /* Back to original mode, presumably SVC, */
> -    msr    CPSR_c, r2    /* with FIQ/IRQ disable bits forced to 1.  */
> -#endif
> -.LC23:
> -    /* Setup a default stack-limit in-case the code has been
> -       compiled with "-mapcs-stack-check".  Hard-wiring this value
> -       is not ideal, since there is currently no support for
> -       checking that the heap and stack have not collided, or that
> -       this default 64k is enough for the program being executed.
> -       However, it ensures that this simple crt0 world will not
> -       immediately cause an overflow event:  */
> -#ifdef THUMB1_ONLY
> -    movs    r2, #64
> -    lsls    r2, r2, #10
> -    subs    r2, r3, r2
> -    mov    sl, r2
> -#else
> -    sub    sl, r3, #64 << 10    /* Still assumes 256bytes below sl.  */
> -#endif
>  #endif
>  #endif
>      /* Zero the memory in the .bss section.  */
> @@ -447,6 +560,8 @@ change_back:
>      swi    SWI_Exit
>  #endif
>
> +    FN_EH_END
> +
>      /* For Thumb, constants must be after the code since only
>         positive offsets are supported for PC relative addresses.  */
>      .align 0
> @@ -464,9 +579,6 @@ change_back:
>  #else
>      .word    0x80000            /* Top of RAM on the PIE board.  */
>  #endif
> -
> -.Lstack:
> -    .word    __stack
>  .Lhwinit:
>      .word    FUNCTION (hardware_init_hook)
>  .Lswinit:
> @@ -479,17 +591,16 @@ change_back:
>         and only if, a normal version of the same symbol isn't provided
>         e.g. by a linker script or another object file.)  */
>
> -    .weak __stack
>      .weak FUNCTION (hardware_init_hook)
>      .weak FUNCTION (software_init_hook)
>  #endif
>
>  #endif
> -#if defined(__ELF__) && !defined(__USING_SJLJ_EXCEPTIONS__)
> -    /* Protect against unhandled exceptions.  */
> -    .cantunwind
> -    .fnend
> -#endif
> +
> +.Lstack:
> +    .word    __stack
> +    .weak    __stack
> +
>  .LC1:
>      .word    __bss_start__
>  .LC2:
> diff --git a/newlib/libc/sys/arm/arm.h b/newlib/libc/sys/arm/arm.h
> index 0489f2d92..04b3a5ac7 100644
> --- a/newlib/libc/sys/arm/arm.h
> +++ b/newlib/libc/sys/arm/arm.h
> @@ -61,4 +61,30 @@
>  # define HAVE_CALL_INDIRECT
>  #endif
>
> +/* A and R profiles (and legacy Arm).
> +    Current Program Status Register (CPSR)
> +    M[4:0]        Mode bits. M[4] is always 1 for 32-bit modes.
> +    T[5]            1: Thumb, 0: ARM instruction set
> +    F[6]            1: disables FIQ
> +    I[7]            1: disables IRQ
> +    A[8]            1: disables imprecise aborts
> +    E[9]            0: Little-endian, 1: Big-endian
> +    J[24]            1: Jazelle instruction set
> + */
> +#define CPSR_M_USR            0x00    /* User mode */
> +#define CPSR_M_FIQ            0x01    /* Fast Interrupt mode */
> +#define CPSR_M_IRQ            0x02    /* Interrupt mode */
> +#define CPSR_M_SVR            0x03    /* Supervisor mode */
> +#define CPSR_M_MON            0x06    /* Monitor mode */
> +#define CPSR_M_ABT            0x07    /* Abort mode */
> +#define CPSR_M_HYP            0x0A    /* Hypervisor mode */
> +#define CPSR_M_UND            0x0B    /* Undefined mode */
> +#define CPSR_M_SYS            0x0F    /* System mode */
> +#define CPSR_M_32BIT        0x10    /* 32-bit mode */
> +#define CPSR_T_BIT            0x20    /* Thumb bit */
> +#define CPSR_F_MASK            0x40    /* FIQ bit */
> +#define CPSR_I_MASK            0x80    /* IRQ bit */
> +
> +#define CPSR_M_MASK            0x0F    /* Mode mask except M[4] */
> +
>  #endif /* _LIBGLOSS_ARM_H */
> diff --git a/newlib/libc/sys/arm/crt0.S b/newlib/libc/sys/arm/crt0.S
> index 7a6b40d9a..0929a435e 100644
> --- a/newlib/libc/sys/arm/crt0.S
> +++ b/newlib/libc/sys/arm/crt0.S
> @@ -59,6 +59,21 @@
>  .endm
>  #endif
>
> +/* Annotation for EABI unwinding tables. */
> +.macro FN_EH_START
> +#if defined(__ELF__) && !defined(__USING_SJLJ_EXCEPTIONS__)
> +    .fnstart
> +#endif
> +.endm
> +
> +.macro FN_EH_END
> +#if defined(__ELF__) && !defined(__USING_SJLJ_EXCEPTIONS__)
> +    /* Protect against unhandled exceptions.  */
> +    .cantunwind
> +    .fnend
> +#endif
> +.endm
> +
>  .macro indirect_call reg
>  #ifdef HAVE_CALL_INDIRECT
>      blx \reg
> @@ -68,16 +83,170 @@
>  #endif
>  .endm
>
> +/* For armv4t and newer, toolchains will transparently convert
> +   'bx lr' to 'mov pc, lr' if needed. GCC has deprecated support
> +   for anything older than armv4t, but this should handle that
> +   corner case in case anyone needs it anyway */
> +.macro  FN_RETURN
> +#if __ARM_ARCH <= 4 && __ARM_ARCH_ISA_THUMB == 0
> +    mov    pc, lr
> +#else
> +    bx    lr
> +#endif
> +.endm
> +
> +
> +
> +/******************************************************************************
> +* User mode only:           This routine makes default target specific Stack
> +*   +-----+ <- SL_sys,    Pointer initialization for different processor modes:
> +*   |     |    SL_usr     FIQ, Abort, IRQ, Undefined, Supervisor, System (User)
> +*   | SYS |               and setups a default Stack Limit in-case the code has
> +*   | USR | -=0x10000     been compiled with "-mapcs-stack-check" for FIQ and
> +*   |     |               System (User) modes.
> +*   |     |
> +*   +-----+ <- initial SP,
> +*           becomes SP_sys   Hard-wiring SL value is not ideal, since there is
> +*           and SL_usr     currently no support for checking that the heap and
> +*                          stack have not collided, or that this default 64k is
> +* All modes:               is enough for the program being executed. However,
> +*   +-----+ <- SL_sys,     it ensures that this simple crt0 world will not
> +*   |     |    SL_usr      immediately cause an overflow event.
> +*   | SYS |
> +*   | USR | -=0x10000        We go through all execution modes and set up SP
> +*   |     |                for each of them.
> +*   +-----+ <- SP_sys,
> +*   |     |    SP_usr      Note:
> +*   | SVC | -= 0x8000        Mode switch via CPSR is not allowed once in
> +*   |     |                  non-privileged mode, so we take care not to enter
> +*   +-----+ <- SP_svc        "User" to set up its sp, and also skip most
> +*   |     |                  operations if already in that mode.
> +*   | IRQ | -= 0x2000
> +*   |     |                Input parameters:
> +* ^ +-----+ <- SP_und       - sp - Initialized SP
> +* s |     |                 - r2 - May contain SL value from semihosting
> +* t | UND | -= 0x1000              SYS_HEAPINFO call
> +* a |     |                Scratch registers:
> +* c +-----+ <- SP_und       - r1 - new value of CPSR
> +* k |     |                 - r2 - intermediate value (in standalone mode)
> +*   | ABT | -= 0x1000       - r3 - new SP value
> +* g |     |                 - r4 - save/restore CPSR on entry/exit
> +* r +-----+ <- SP_abt,
> +* o |     |    SL_fiq        Declared as "weak" so that user can write and use
> +* w | FIQ | -= 0x1000      his own implementation if current doesn't fit.
> +* t |     |
> +* h +-----+ <- initial SP,
> +*           becomes SP_fiq
> +*
> +******************************************************************************/
> +    .align    0
> +    FUNC_START    _stack_init
> +    .weak FUNCTION (_stack_init)
> +    FN_EH_START
> +
> +    /* M profile doesn't have CPSR register.  */
> +#if (__ARM_ARCH_PROFILE != 'M')
> +    /* Following code is compatible for both ARM and Thumb ISA */
> +    mrs    r4, CPSR
> +    /* Test mode bits - in User of all are 0 */
> +    tst    r4, #(CPSR_M_MASK)
> +    /* "eq" means r4 AND #0x0F is 0 */
> +    beq    .Lskip_cpu_modes
> +
> +    mov    r3, sp /* save input SP value */
> +
> +    /* FIQ mode, interrupts disabled */
> +    mov    r1, #(CPSR_M_FIQ|CPSR_M_32BIT|CPSR_I_MASK|CPSR_F_MASK)
> +    msr    CPSR_c, r1
> +    mov    sp, r3
> +    sub    sl, sp, #0x1000    /* FIQ mode has its own SL */
> +
> +    /* Abort mode, interrupts disabled */
> +    mov    r3, sl
> +    mov    r1, #(CPSR_M_ABT|CPSR_M_32BIT|CPSR_I_MASK|CPSR_F_MASK)
> +    msr    CPSR_c, r1
> +    mov    sp, r3
> +    sub    r3, r3, #0x1000
> +
> +    /* Undefined mode, interrupts disabled */
> +    mov    r1, #(CPSR_M_UND|CPSR_M_32BIT|CPSR_I_MASK|CPSR_F_MASK)
> +    msr    CPSR_c, r1
> +    mov    sp, r3
> +    sub    r3, r3, #0x1000
> +
> +    /* IRQ mode, interrupts disabled */
> +    mov    r1, #(CPSR_M_IRQ|CPSR_M_32BIT|CPSR_I_MASK|CPSR_F_MASK)
> +    msr    CPSR_c, r1
> +    mov    sp, r3
> +    sub    r3, r3, #0x2000
> +
> +    /* Supervisory mode, interrupts disabled */
> +    mov    r1, #(CPSR_M_SVR|CPSR_M_32BIT|CPSR_I_MASK|CPSR_F_MASK)
> +    msr    CPSR_c, r1
> +    mov    sp, r3
> +
> +    sub    r3, r3, #0x8000    /* Min size 32k */
> +    bic    r3, r3, #0x00FF    /* Align with current 64k block */
> +    bic    r3, r3, #0xFF00
> +
> +# if __ARM_ARCH >= 4
> +    /* System (shares regs with User) mode, interrupts disabled */
> +    mov    r1, #(CPSR_M_SYS|CPSR_M_32BIT|CPSR_I_MASK|CPSR_F_MASK)
> +    msr    CPSR_c, r1
> +    mov    sp, r3
> +# else
> +    /* Keep this for ARMv3, but GCC actually dropped it. */
> +    /* Move value into user mode sp without changing modes, */
> +    /* via '^' form of ldm */
> +    str    r3, [r3, #-4]
> +    ldmdb    r3, {sp}^
> +# endif
> +
> +    /* Back to original mode, presumably SVC, with diabled FIQ/IRQ */
> +    orr    r4, r4, #(CPSR_I_MASK|CPSR_F_MASK)
> +    msr    CPSR_c, r4
> +
> +.Lskip_cpu_modes:
> +#endif
> +
> +    /* Set SL register */
> +#if defined (ARM_RDI_MONITOR) /* semihosting */
> +    cmp    r2, #0
> +    beq    .Lsl_forced_zero
> +    /* allow slop for stack overflow handling and small frames */
> +# ifdef THUMB1_ONLY
> +    adds    r2, #128
> +    adds    r2, #128
> +    mov    sl, r2
> +# else
> +    add    sl, r2, #256
> +# endif
> +.Lsl_forced_zero:
> +
> +#else /* standalone */
> +    /* r3 contains SP for System/User mode. Set SL = SP - 0x10000 */
> +    #ifdef THUMB1_ONLY
> +    movs    r2, #64
> +    lsls    r2, r2, #10
> +    subs    r2, r3, r2
> +    mov    sl, r2
> +    #else
> +    /* Still assumes 256bytes below sl */
> +    sub    sl, r3, #64 << 10
> +    #endif
> +#endif
> +
> +    FN_RETURN
> +    FN_EH_END
> +
> +
>  /*******************************************************************************
>  * Main library startup code.
>  *******************************************************************************/
>      .align     0
>      FUNC_START    _mainCRTStartup
>      FUNC_START    _start
> -#if defined(__ELF__) && !defined(__USING_SJLJ_EXCEPTIONS__)
> -    /* Annotation for EABI unwinding tables.  */
> -    .fnstart
> -#endif
> +    FN_EH_START
>
>  /* Start by setting up a stack.  */
>  #ifdef ARM_RDP_MONITOR
> @@ -130,36 +299,27 @@
>            to skip setting sp/sl to 0 here.
>          - Considering M-profile processors, We might want to initialize
>            sp by the first entry of vector table and return 0 to SYS_HEAPINFO
> -          semihosting call, which will be skipped here.  */
> +          semihosting call, which will be skipped here.
> +        - Considering R-profile processors there is no automatic SP
> init by hardware
> +          so we need to initialize it by default value. */
> +    ldr    r3, .Lstack
>      cmp    r1, #0
>      beq    .LC26
> -    mov    sp, r1
> +    mov    r3, r1
>  .LC26:
> -    cmp    r2, #0
> -    beq    .LC27
> +    mov    sp, r3
>
> -    /*  Allow slop for stack overflow handling and small frames.  */
> -#ifdef THUMB1_ONLY
> -    adds    r2, #128
> -    adds    r2, #128
> -    mov    sl, r2
> -#else
> -    add    sl, r2, #256
> -#endif
> +    /* r2 (SL value) will be used in _stack_init */
> +    bl FUNCTION (_stack_init)
>
> -.LC27:
> -#else
> -    /*  Set up the stack pointer to a fixed value.  */
> +
> +#else /* standalone */
> +    /*  Set up the stack pointer to a fixed value. */
>      /*  Changes by toralf:
>          - Allow linker script to provide stack via __stack symbol - see
>            defintion of .Lstack
>          - Provide "hooks" that may be used by the application to add
> -          custom init code - see .Lhwinit and .Lswinit
> -        - Go through all execution modes and set up stack for each of them.
> -          Loosely based on init.s from ARM/Motorola example code.
> -              Note: Mode switch via CPSR is not allowed once in non-privileged
> -            mode, so we take care not to enter "User" to set up its sp,
> -            and also skip most operations if already in that mode.  */
> +          custom init code - see .Lhwinit and .Lswinit */
>
>      ldr    r3, .Lstack
>      cmp    r3, #0
> @@ -178,57 +338,10 @@
>           have somehow missed it below (in which case it gets the same
>           value as FIQ - not ideal, but better than nothing).  */
>      mov    sp, r3
> -#ifdef PREFER_THUMB
> -    /* XXX Fill in stack assignments for interrupt modes.  */
> -#else
> -    mrs    r2, CPSR
> -    tst    r2, #0x0F    /* Test mode bits - in User of all are 0.  */
> -    beq    .LC23        /* "eq" means r2 AND #0x0F is 0.  */
> -    msr     CPSR_c, #0xD1    /* FIRQ mode, interrupts disabled.  */
> -    mov     sp, r3
> -    sub    sl, sp, #0x1000    /* This mode also has its own sl (see below).  */
> -
> -    mov    r3, sl
> -    msr     CPSR_c, #0xD7    /* Abort mode, interrupts disabled.  */
> -    mov    sp, r3
> -    sub    r3, r3, #0x1000
> -
> -    msr     CPSR_c, #0xDB    /* Undefined mode, interrupts disabled.  */
> -    mov    sp, r3
> -    sub    r3, r3, #0x1000
>
> -    msr     CPSR_c, #0xD2    /* IRQ mode, interrupts disabled.  */
> -    mov    sp, r3
> -    sub    r3, r3, #0x2000
> -
> -    msr     CPSR_c, #0xD3    /* Supervisory mode, interrupts disabled.  */
> +    /* we don't care of r2 value in standalone */
> +    bl FUNCTION (_stack_init)
>
> -    mov    sp, r3
> -    sub    r3, r3, #0x8000    /* Min size 32k.  */
> -    bic    r3, r3, #0x00FF    /* Align with current 64k block.  */
> -    bic    r3, r3, #0xFF00
> -
> -    str    r3, [r3, #-4]    /* Move value into user mode sp without */
> -    ldmdb    r3, {sp}^       /* changing modes, via '^' form of ldm.  */
> -    orr    r2, r2, #0xC0    /* Back to original mode, presumably SVC, */
> -    msr    CPSR_c, r2    /* with FIQ/IRQ disable bits forced to 1.  */
> -#endif
> -.LC23:
> -    /* Setup a default stack-limit in-case the code has been
> -       compiled with "-mapcs-stack-check".  Hard-wiring this value
> -       is not ideal, since there is currently no support for
> -       checking that the heap and stack have not collided, or that
> -       this default 64k is enough for the program being executed.
> -       However, it ensures that this simple crt0 world will not
> -       immediately cause an overflow event:  */
> -#ifdef THUMB1_ONLY
> -    movs    r2, #64
> -    lsls    r2, r2, #10
> -    subs    r2, r3, r2
> -    mov    sl, r2
> -#else
> -    sub    sl, r3, #64 << 10    /* Still assumes 256bytes below sl.  */
> -#endif
>  #endif
>  #endif
>      /* Zero the memory in the .bss section.  */
> @@ -421,6 +534,8 @@ change_back:
>      swi    SWI_Exit
>  #endif
>
> +    FN_EH_END
> +
>      /* For Thumb, constants must be after the code since only
>         positive offsets are supported for PC relative addresses.  */
>      .align 0
> @@ -438,9 +553,6 @@ change_back:
>  #else
>      .word    0x80000            /* Top of RAM on the PIE board.  */
>  #endif
> -
> -.Lstack:
> -    .word    __stack
>  .Lhwinit:
>      .word    FUNCTION (hardware_init_hook)
>  .Lswinit:
> @@ -453,17 +565,16 @@ change_back:
>         and only if, a normal version of the same symbol isn't provided
>         e.g. by a linker script or another object file.)  */
>
> -    .weak __stack
>      .weak FUNCTION (hardware_init_hook)
>      .weak FUNCTION (software_init_hook)
>  #endif
>
>  #endif
> -#if defined(__ELF__) && !defined(__USING_SJLJ_EXCEPTIONS__)
> -    /* Protect against unhandled exceptions.  */
> -    .cantunwind
> -    .fnend
> -#endif
> +
> +.Lstack:
> +    .word    __stack
> +    .weak    __stack
> +
>  .LC1:
>      .word    __bss_start__
>  .LC2:
> --
> 2.20.1.windows.1
>
>
> On Wed, Apr 10, 2019 at 6:57 PM Alexander Fedotov <alfedotov@gmail.com> wrote:
> >
> > Hi Richard
> >
> > New patch is below (only for libgloss).
> > If it's fine with you I will do the same changes in newlib/libc/sys/arm as well.
> >
> > >This code is clearly ARM/Thumb2 only.  Have you tested that this builds
> > >for, say ARM7TDMI in thumb state?  If you use a build of gcc configured
> > >with --with-multilib-list=aprofile or --with-multilib-list=rmprofile
> > >you'll get pretty comprehensive checking of all the variants that need
> > >to be supported (beware, it can take quite a long time, even on a fast
> > >machine).  At the very least, this code needs to be checked on:
> >
> > I have built gcc with --with-multilib-list=rmprofile
> > Unfortunately I have no ability to check on all of them. At least this code is
> > verified on Cortex-r52 that was the aim.
> >
> >
> >
> > >This implies dropping support for Armv3 as system mode was added in
> > >Armv4.  That's probably not an issue now as GCC has also dropped such
> > >support, but it should be added to the News.
> > I don't see News file for newlib. But I have preserved original
> > version in the patch.
> > I suggest to drop Armv3 by another commit.
> >
> >
> > I can't get why description is corrupted for _stack_init function in
> > the patch below while lines
> > are less than 80 symbols :(
> >
> >
> > From c8ff1ecf005dba2a4a69c5754b1911c0d351aba5 Mon Sep 17 00:00:00 2001
> > From: Alexander Fedotov <alfedotov@gmail.com>
> > Date: Wed, 23 Jan 2019 13:33:14 +0300
> > Subject: [PATCH] SP initialization changes:   1. set default value in
> >  semihosting case as well   2. moved existing SP & SL init code for processor
> >  modes in separate routine and made it as "hook"   3. init SP for processor
> >  modes in Thumb mode as well Add new macro FN_RETURN, FN_EH_START and
> >  FN_EH_END.
> >
> > ---
> >  libgloss/arm/arm.h  |  26 +++++
> >  libgloss/arm/crt0.S | 279 +++++++++++++++++++++++++++++++-------------
> >  2 files changed, 224 insertions(+), 81 deletions(-)
> >
> > diff --git a/libgloss/arm/arm.h b/libgloss/arm/arm.h
> > index 0489f2d92..04b3a5ac7 100644
> > --- a/libgloss/arm/arm.h
> > +++ b/libgloss/arm/arm.h
> > @@ -61,4 +61,30 @@
> >  # define HAVE_CALL_INDIRECT
> >  #endif
> >
> > +/* A and R profiles (and legacy Arm).
> > +    Current Program Status Register (CPSR)
> > +    M[4:0]        Mode bits. M[4] is always 1 for 32-bit modes.
> > +    T[5]            1: Thumb, 0: ARM instruction set
> > +    F[6]            1: disables FIQ
> > +    I[7]            1: disables IRQ
> > +    A[8]            1: disables imprecise aborts
> > +    E[9]            0: Little-endian, 1: Big-endian
> > +    J[24]            1: Jazelle instruction set
> > + */
> > +#define CPSR_M_USR            0x00    /* User mode */
> > +#define CPSR_M_FIQ            0x01    /* Fast Interrupt mode */
> > +#define CPSR_M_IRQ            0x02    /* Interrupt mode */
> > +#define CPSR_M_SVR            0x03    /* Supervisor mode */
> > +#define CPSR_M_MON            0x06    /* Monitor mode */
> > +#define CPSR_M_ABT            0x07    /* Abort mode */
> > +#define CPSR_M_HYP            0x0A    /* Hypervisor mode */
> > +#define CPSR_M_UND            0x0B    /* Undefined mode */
> > +#define CPSR_M_SYS            0x0F    /* System mode */
> > +#define CPSR_M_32BIT        0x10    /* 32-bit mode */
> > +#define CPSR_T_BIT            0x20    /* Thumb bit */
> > +#define CPSR_F_MASK            0x40    /* FIQ bit */
> > +#define CPSR_I_MASK            0x80    /* IRQ bit */
> > +
> > +#define CPSR_M_MASK            0x0F    /* Mode mask except M[4] */
> > +
> >  #endif /* _LIBGLOSS_ARM_H */
> > diff --git a/libgloss/arm/crt0.S b/libgloss/arm/crt0.S
> > index c708f63d8..89256d5e8 100644
> > --- a/libgloss/arm/crt0.S
> > +++ b/libgloss/arm/crt0.S
> > @@ -59,6 +59,21 @@
> >  .endm
> >  #endif
> >
> > +/* Annotation for EABI unwinding tables. */
> > +.macro FN_EH_START
> > +#if defined(__ELF__) && !defined(__USING_SJLJ_EXCEPTIONS__)
> > +    .fnstart
> > +#endif
> > +.endm
> > +
> > +.macro FN_EH_END
> > +#if defined(__ELF__) && !defined(__USING_SJLJ_EXCEPTIONS__)
> > +    /* Protect against unhandled exceptions.  */
> > +    .cantunwind
> > +    .fnend
> > +#endif
> > +.endm
> > +
> >  .macro indirect_call reg
> >  #ifdef HAVE_CALL_INDIRECT
> >      blx \reg
> > @@ -68,14 +83,170 @@
> >  #endif
> >  .endm
> >
> > -    .align     0
> > +/* For armv4t and newer, toolchains will transparently convert
> > +   'bx lr' to 'mov pc, lr' if needed. GCC has deprecated support
> > +   for anything older than armv4t, but this should handle that
> > +   corner case in case anyone needs it anyway */
> > +.macro  FN_RETURN
> > +#if __ARM_ARCH <= 4 && __ARM_ARCH_ISA_THUMB == 0
> > +    mov    pc, lr
> > +#else
> > +    bx    lr
> > +#endif
> > +.endm
> > +
> > +
> > +
> > +/******************************************************************************
> > +* User mode only:           This routine makes default target specific Stack
> > +*   +-----+ <- SL_sys,    Pointer initialization for different processor modes:
> > +*   |     |    SL_usr     FIQ, Abort, IRQ, Undefined, Supervisor, System (User)
> > +*   | SYS |               and setups a default Stack Limit in-case the code has
> > +*   | USR | -=0x10000     been compiled with "-mapcs-stack-check" for FIQ and
> > +*   |     |               System (User) modes.
> > +*   |     |
> > +*   +-----+ <- initial SP,
> > +*           becomes SP_sys   Hard-wiring SL value is not ideal, since there is
> > +*           and SL_usr     currently no support for checking that the heap and
> > +*                          stack have not collided, or that this default 64k is
> > +* All modes:               is enough for the program being executed. However,
> > +*   +-----+ <- SL_sys,     it ensures that this simple crt0 world will not
> > +*   |     |    SL_usr      immediately cause an overflow event.
> > +*   | SYS |
> > +*   | USR | -=0x10000        We go through all execution modes and set up SP
> > +*   |     |                for each of them.
> > +*   +-----+ <- SP_sys,
> > +*   |     |    SP_usr      Note:
> > +*   | SVC | -= 0x8000        Mode switch via CPSR is not allowed once in
> > +*   |     |                  non-privileged mode, so we take care not to enter
> > +*   +-----+ <- SP_svc        "User" to set up its sp, and also skip most
> > +*   |     |                  operations if already in that mode.
> > +*   | IRQ | -= 0x2000
> > +*   |     |                Input parameters:
> > +* ^ +-----+ <- SP_und       - sp - Initialized SP
> > +* s |     |                 - r2 - May contain SL value from semihosting
> > +* t | UND | -= 0x1000              SYS_HEAPINFO call
> > +* a |     |                Scratch registers:
> > +* c +-----+ <- SP_und       - r1 - new value of CPSR
> > +* k |     |                 - r2 - intermediate value (in standalone mode)
> > +*   | ABT | -= 0x1000       - r3 - new SP value
> > +* g |     |                 - r4 - save/restore CPSR on entry/exit
> > +* r +-----+ <- SP_abt,
> > +* o |     |    SL_fiq        Declared as "weak" so that user can write and use
> > +* w | FIQ | -= 0x1000      his own implementation if current doesn't fit.
> > +* t |     |
> > +* h +-----+ <- initial SP,
> > +*           becomes SP_fiq
> > +*
> > +******************************************************************************/
> > +    .align    0
> > +    FUNC_START    _stack_init
> > +    .weak FUNCTION (_stack_init)
> > +    FN_EH_START
> > +
> > +    /* M profile doesn't have CPSR register.  */
> > +#if (__ARM_ARCH_PROFILE != 'M')
> > +    /* Following code is compatible for both ARM and Thumb ISA */
> > +    mrs    r4, CPSR
> > +    /* Test mode bits - in User of all are 0 */
> > +    tst    r4, #(CPSR_M_MASK)
> > +    /* "eq" means r4 AND #0x0F is 0 */
> > +    beq    .Lskip_cpu_modes
> > +
> > +    mov    r3, sp /* save input SP value */
> > +
> > +    /* FIQ mode, interrupts disabled */
> > +    mov    r1, #(CPSR_M_FIQ|CPSR_M_32BIT|CPSR_I_MASK|CPSR_F_MASK)
> > +    msr    CPSR_c, r1
> > +    mov    sp, r3
> > +    sub    sl, sp, #0x1000    /* FIQ mode has its own SL */
> > +
> > +    /* Abort mode, interrupts disabled */
> > +    mov    r3, sl
> > +    mov    r1, #(CPSR_M_ABT|CPSR_M_32BIT|CPSR_I_MASK|CPSR_F_MASK)
> > +    msr    CPSR_c, r1
> > +    mov    sp, r3
> > +    sub    r3, r3, #0x1000
> > +
> > +    /* Undefined mode, interrupts disabled */
> > +    mov    r1, #(CPSR_M_UND|CPSR_M_32BIT|CPSR_I_MASK|CPSR_F_MASK)
> > +    msr    CPSR_c, r1
> > +    mov    sp, r3
> > +    sub    r3, r3, #0x1000
> > +
> > +    /* IRQ mode, interrupts disabled */
> > +    mov    r1, #(CPSR_M_IRQ|CPSR_M_32BIT|CPSR_I_MASK|CPSR_F_MASK)
> > +    msr    CPSR_c, r1
> > +    mov    sp, r3
> > +    sub    r3, r3, #0x2000
> > +
> > +    /* Supervisory mode, interrupts disabled */
> > +    mov    r1, #(CPSR_M_SVR|CPSR_M_32BIT|CPSR_I_MASK|CPSR_F_MASK)
> > +    msr    CPSR_c, r1
> > +    mov    sp, r3
> > +
> > +    sub    r3, r3, #0x8000    /* Min size 32k */
> > +    bic    r3, r3, #0x00FF    /* Align with current 64k block */
> > +    bic    r3, r3, #0xFF00
> > +
> > +# if __ARM_ARCH > 3
> > +    /* System (shares regs with User) mode, interrupts disabled */
> > +    mov    r1, #(CPSR_M_SYS|CPSR_M_32BIT|CPSR_I_MASK|CPSR_F_MASK)
> > +    msr    CPSR_c, r1
> > +    mov    sp, r3
> > +# else
> > +    /* Keep this for ARMv3, but GCC actually dropped it. */
> > +    /* Move value into user mode sp without changing modes, */
> > +    /* via '^' form of ldm */
> > +    str    r3, [r3, #-4]
> > +    ldmdb    r3, {sp}^
> > +# endif
> > +
> > +    /* Back to original mode, presumably SVC, with diabled FIQ/IRQ */
> > +    orr    r4, r4, #(CPSR_I_MASK|CPSR_F_MASK)
> > +    msr    CPSR_c, r4
> > +
> > +.Lskip_cpu_modes:
> > +#endif
> > +
> > +    /* Set SL register */
> > +#if defined (ARM_RDI_MONITOR) /* semihosting */
> > +    cmp    r2, #0
> > +    beq    .Lsl_forced_zero
> > +    /* allow slop for stack overflow handling and small frames */
> > +# ifdef THUMB1_ONLY
> > +    adds    r2, #128
> > +    adds    r2, #128
> > +    mov    sl, r2
> > +# else
> > +    add    sl, r2, #256
> > +# endif
> > +.Lsl_forced_zero:
> > +
> > +#else /* standalone */
> > +    /* r3 contains SP for System/User mode. Set SL = SP - 0x10000 */
> > +    #ifdef THUMB1_ONLY
> > +    movs    r2, #64
> > +    lsls    r2, r2, #10
> > +    subs    r2, r3, r2
> > +    mov    sl, r2
> > +    #else
> > +    /* Still assumes 256bytes below sl */
> > +    sub    sl, r3, #64 << 10
> > +    #endif
> > +#endif
> > +
> > +    FN_RETURN
> > +    FN_EH_END
> >
> > +
> > +/*******************************************************************************
> > +* Main library startup code.
> > +*******************************************************************************/
> > +    .align     0
> >      FUNC_START    _mainCRTStartup
> >      FUNC_START    _start
> > -#if defined(__ELF__) && !defined(__USING_SJLJ_EXCEPTIONS__)
> > -    /* Annotation for EABI unwinding tables.  */
> > -    .fnstart
> > -#endif
> > +    FN_EH_START
> >
> >      /* __ARM_ARCH_PROFILE is defined from GCC 4.8 onwards, however
> > __ARM_ARCH_7A
> >      has been defined since 4.2 onwards, which is when v7-a support was added
> > @@ -148,34 +319,27 @@
> >            to skip setting sp/sl to 0 here.
> >          - Considering M-profile processors, We might want to initialize
> >            sp by the first entry of vector table and return 0 to SYS_HEAPINFO
> > -          semihosting call, which will be skipped here. */
> > +          semihosting call, which will be skipped here.
> > +        - Considering R-profile processors there is no automatic SP
> > init by hardware
> > +          so we need to initialize it by default value. */
> > +    ldr    r3, .Lstack
> >      cmp    r1, #0
> >      beq    .LC26
> > -    mov    sp, r1
> > +    mov    r3, r1
> >  .LC26:
> > -    cmp    r2, #0
> > -    beq    .LC27
> > -    /*  allow slop for stack overflow handling and small frames */
> > -#ifdef THUMB1_ONLY
> > -    adds    r2, #128
> > -    adds    r2, #128
> > -    mov    sl, r2
> > -#else
> > -    add    sl, r2, #256
> > -#endif
> > -.LC27:
> > -#else
> > +    mov    sp, r3
> > +
> > +    /* r2 (SL value) will be used in _stack_init */
> > +    bl FUNCTION (_stack_init)
> > +
> > +
> > +#else /* standalone */
> >      /*  Set up the stack pointer to a fixed value */
> >      /*  Changes by toralf:
> >          - Allow linker script to provide stack via __stack symbol - see
> >            defintion of .Lstack
> >          - Provide "hooks" that may be used by the application to add
> > -          custom init code - see .Lhwinit and .Lswinit
> > -        - Go through all execution modes and set up stack for each of them.
> > -          Loosely based on init.s from ARM/Motorola example code.
> > -              Note: Mode switch via CPSR is not allowed once in non-privileged
> > -            mode, so we take care not to enter "User" to set up its sp,
> > -            and also skip most operations if already in that mode. */
> > +          custom init code - see .Lhwinit and .Lswinit */
> >
> >      ldr    r3, .Lstack
> >      cmp    r3, #0
> > @@ -194,57 +358,11 @@
> >           have somehow missed it below (in which case it gets the same
> >           value as FIQ - not ideal, but better than nothing.) */
> >      mov    sp, r3
> > -#ifdef PREFER_THUMB
> > -    /* XXX Fill in stack assignments for interrupt modes.  */
> > -#else
> > -    mrs    r2, CPSR
> > -    tst    r2, #0x0F    /* Test mode bits - in User of all are 0 */
> > -    beq    .LC23        /* "eq" means r2 AND #0x0F is 0 */
> > -    msr     CPSR_c, #0xD1    /* FIRQ mode, interrupts disabled */
> > -    mov     sp, r3
> > -    sub    sl, sp, #0x1000    /* This mode also has its own sl (see below) */
> > -
> > -    mov    r3, sl
> > -    msr     CPSR_c, #0xD7    /* Abort mode, interrupts disabled */
> > -    mov    sp, r3
> > -    sub    r3, r3, #0x1000
> > -
> > -    msr     CPSR_c, #0xDB    /* Undefined mode, interrupts disabled */
> > -    mov    sp, r3
> > -    sub    r3, r3, #0x1000
> >
> > -    msr     CPSR_c, #0xD2    /* IRQ mode, interrupts disabled */
> > -    mov    sp, r3
> > -    sub    r3, r3, #0x2000
> > -
> > -    msr     CPSR_c, #0xD3    /* Supervisory mode, interrupts disabled */
> > +    /* we don't care of r2 value in standalone */
> > +    bl FUNCTION (_stack_init)
> >
> > -    mov    sp, r3
> > -    sub    r3, r3, #0x8000    /* Min size 32k */
> > -    bic    r3, r3, #0x00FF    /* Align with current 64k block */
> > -    bic    r3, r3, #0xFF00
> >
> > -    str    r3, [r3, #-4]    /* Move value into user mode sp without */
> > -    ldmdb    r3, {sp}^       /* changing modes, via '^' form of ldm */
> > -    orr    r2, r2, #0xC0    /* Back to original mode, presumably SVC, */
> > -    msr    CPSR_c, r2    /* with FIQ/IRQ disable bits forced to 1 */
> > -#endif
> > -.LC23:
> > -    /* Setup a default stack-limit in-case the code has been
> > -       compiled with "-mapcs-stack-check".  Hard-wiring this value
> > -       is not ideal, since there is currently no support for
> > -       checking that the heap and stack have not collided, or that
> > -       this default 64k is enough for the program being executed.
> > -       However, it ensures that this simple crt0 world will not
> > -       immediately cause an overflow event:  */
> > -#ifdef THUMB1_ONLY
> > -    movs    r2, #64
> > -    lsls    r2, r2, #10
> > -    subs    r2, r3, r2
> > -    mov    sl, r2
> > -#else
> > -    sub    sl, r3, #64 << 10    /* Still assumes 256bytes below sl */
> > -#endif
> >  #endif
> >  #endif
> >      /* Zero the memory in the .bss section.  */
> > @@ -445,6 +563,8 @@ change_back:
> >      swi    SWI_Exit
> >  #endif
> >
> > +    FN_EH_END
> > +
> >      /* For Thumb, constants must be after the code since only
> >         positive offsets are supported for PC relative addresses.  */
> >
> > @@ -464,8 +584,6 @@ change_back:
> >  #else
> >      .word    0x80000            /* Top of RAM on the PIE board.  */
> >  #endif
> > -.Lstack:
> > -    .word    __stack
> >  .Lhwinit:
> >      .word    FUNCTION (hardware_init_hook)
> >  .Lswinit:
> > @@ -478,17 +596,16 @@ change_back:
> >         and only if, a normal version of the same symbol isn't provided
> >         e.g. by a linker script or another object file.) */
> >
> > -    .weak __stack
> >      .weak FUNCTION (hardware_init_hook)
> >      .weak FUNCTION (software_init_hook)
> >  #endif
> >
> >  #endif
> > -#if defined(__ELF__) && !defined(__USING_SJLJ_EXCEPTIONS__)
> > -    /* Protect against unhandled exceptions.  */
> > -    .cantunwind
> > -    .fnend
> > -#endif
> > +
> > +.Lstack:
> > +    .word    __stack
> > +    .weak    __stack
> > +
> >  .LC1:
> >      .word    __bss_start__
> >  .LC2:
> > --
> > 2.20.1.windows.1
> >
> >
> >
> > On Wed, Apr 10, 2019 at 5:55 PM Richard Earnshaw (lists)
> > <Richard.Earnshaw@arm.com> wrote:
> > >
> > > On 10/04/2019 15:23, Corinna Vinschen wrote:
> > > > Hi Richard,
> > > >
> > > > On Apr 10 13:26, Richard Earnshaw (lists) wrote:
> > > >> On 19/03/2019 13:54, Alexander Fedotov wrote:
> > > >>> Hi all
> > > >>>
> > > >>> In continue of threads
> > > >>> https://sourceware.org/ml/newlib/2019/msg00117.html and
> > > >>> https://sourceware.org/ml/newlib/2019/msg00102.html
> > > >>>
> > > >>> I have reworked stack pointer initialization and stack limit as well.
> > > >>> General idea behind of all these changes is to provide hook for
> > > >>> flexibility (including semihosting when debug tap is used) and improve
> > > >>> readability of existing crt0 code.
> > > >>> And now code makes SP initialization in Thumb mode :)
> > > >>>
> > > >>> Any comments are very appreciated.
> > > >>>
> > > >>> Alex
> > > >>>
> > > >>
> > > >> Apologies once again for the delayed review.
> > > >>
> > > >> A small plea: when posting patches, please can you ensure that your
> > > >> attachments are text/patch, or something similar, so that replies will
> > > >> include the contents (and I can view them directly in the email rather
> > > >> than having to mess about extracting the attachment).
> > > >
> > > > Ideally the patches are inline patches sent with `git send-email' or
> > > > equivalent.  I added some wording in terms of how to provide patches to
> > > > https://sourceware.org/newlib/ lately.  Do you have a suggestion for
> > > > improving this?
> > > >
> > > >
> > > > Thanks,
> > > > Corinna
> > > >
> > >
> > > Not really;  When I use git format-patch with --attach I get
> > > text/x-patch as the MIME type, which is fine.  Problems only start if
> > > users have poorly configured mail client that attach as octet-stream or
> > > other binary types.  Then mail clients can't display the attachment
> > > inline as they don't know the format is really displayable.
> > >
> > > R.

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2019-04-15 10:21 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-03-19 13:55 [Arm] SP and SL initialization refactored Alexander Fedotov
2019-03-26 10:53 ` Alexander Fedotov
2019-03-26 12:42   ` Nick Clifton
2019-03-29 19:03     ` Tamar Christina
2019-04-05 13:36       ` Alexander Fedotov
2019-04-10 12:28         ` Richard Earnshaw (lists)
2019-04-10 12:26 ` Richard Earnshaw (lists)
2019-04-10 14:23   ` Corinna Vinschen
2019-04-10 14:55     ` Richard Earnshaw (lists)
2019-04-10 15:58       ` Alexander Fedotov
2019-04-12 12:38         ` Alexander Fedotov
2019-04-15 10:21           ` Alexander Fedotov

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).