[ECOS] PowerPC HAL questions.

public inbox for ecos-discuss@sourceware.org
 help / color / mirror / Atom feed

* [ECOS] PowerPC HAL questions.
@ 2000-05-12  9:21 Sergei Organov
  2000-05-15  6:15 ` Nick Garnett
  0 siblings, 1 reply; 5+ messages in thread
From: Sergei Organov @ 2000-05-12  9:21 UTC (permalink / raw)
  To: ecos-discuss

Hello,

While developing FP support for the PowerPC HAL I've got a few questions. Some
of them are actually suggestions for possible improvements.

Q.1. All the writing to the MSR in the assembly code are surrounded by the
'sync' instructions. AFAIK it's because of bug in some PPC chips. However, the
exported macros 'HAL_DISABLE_INTERRUPTS', 'HAL_ENABLE_INTERRUPTS', and
'HAL_RESTORE_INTERRUPTS' don't surround 'mtmsr' instruction by 'sync'
instructions. Is it a bug?

Q.2. Is it worth to make surrounding of 'mtmsr' instruction by 'sync'
instructions configurable? I mean providing ability for processor variants to
change this. It could be more important after addition of "on demand" FPU
context switch support because it requires additional tweaking of the FP bit
in MSR.

Q.3. What's the reason to restore only MSR EE bit in context switches as
opposed to restoring of entire MSR?

Q.4. What's the reason to enable interrupts in the
'hal_interrupt_stack_call_pending_DSRs' routine? Note that when
CYGIMP_HAL_COMMON_INTERRUPTS_USE_INTERRUPT_STACK isn't defined, this routine
isn't invoked, and interrupts thus stay as-is before call to the
'Cyg_Interrupt::call_pending_DSRs_inner'.

Q.5. The following macro

	.macro hal_cpu_int_enable
        mfmsr   r0
        ori     r3,r3,0x8000
        rlwimi  r0,r3,0,16,16
        sync
        mtmsr   r0
        sync
	.endm

could be rewritten as

	.macro hal_cpu_int_enable
        mfmsr   r0
        ori     r0,0x8000
        sync
        mtmsr   r0
        sync
	.endm

that is 1 instruction less. Is there something wrong about the latter
implementation?

Similar question about

	.macro hal_cpu_int_disable
        mfmsr   r0
        li      r3,0
        rlwimi  r0,r3,0,16,16
        sync
        mtmsr   r0
        sync
	.endm

and

	.macro hal_cpu_int_disable
        mfmsr   r0
        rlwinm  r0,r0,0,18,17
        sync
        mtmsr   r0
        sync
	.endm

Q.6. The common level-0 code of VSRs uses SPRG1-3 to save work registers, then
the SPRG1-3 are moved back to GPRs and are stored on the stack by the default
VSRs. Isn't it better to store work registers directly on the stack and use
SPRGs for some global variables such as address of 'hal_interrupt_handlers',
address of 'hal_interrupt_data', etc.? It will also allow to save SRR0 and
SRR1, and then set MSR RI as soon as possible. This could be critical for some
applications (e.g., MPC509 goes to the check-stop state if another exception
occurs when RI bit isn't set, processor hangs, and even internal watchdog
doesn't work. The exception could occur due to spikes on the bus while testing 
by applying high voltage to the ground of the device).

Related issue. In the startup code:

        # set up stack
        lwi     sp,__interrupt_stack
        mtspr   SPRG0,sp        # save in sprg0 for later use

SPRG0 isn't then used in the interrupts: the __interrupt_stack is loaded by
'lwi' (that is actually 2 instructions) instead:

#ifdef CYGIMP_HAL_COMMON_INTERRUPTS_USE_INTERRUPT_STACK
        lwi     r3,__interrupt_stack            # stack top

BR,
Sergei Organov.

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [ECOS] PowerPC HAL questions.
  2000-05-12  9:21 [ECOS] PowerPC HAL questions Sergei Organov
@ 2000-05-15  6:15 ` Nick Garnett
  2000-05-15  7:18   ` Sergei Organov
  2001-09-05  0:10   ` Nick Garnett
  0 siblings, 2 replies; 5+ messages in thread
From: Nick Garnett @ 2000-05-15  6:15 UTC (permalink / raw)
  To: ecos-discuss

Sergei Organov <osv@javad.ru> writes:

> Hello,
> 
> While developing FP support for the PowerPC HAL I've got a few questions. Some
> of them are actually suggestions for possible improvements.
> 
> Q.1. All the writing to the MSR in the assembly code are surrounded by the
> 'sync' instructions. AFAIK it's because of bug in some PPC chips. However, the
> exported macros 'HAL_DISABLE_INTERRUPTS', 'HAL_ENABLE_INTERRUPTS', and
> 'HAL_RESTORE_INTERRUPTS' don't surround 'mtmsr' instruction by 'sync'
> instructions. Is it a bug?

Probably. Several people have worked on this code, so it is mostly
down to individual style. Since at present the lack of these syncs
does not cause any real problems there is little impetus to change it.

> 
> Q.2. Is it worth to make surrounding of 'mtmsr' instruction by 'sync'
> instructions configurable? I mean providing ability for processor variants to
> change this. It could be more important after addition of "on demand" FPU
> context switch support because it requires additional tweaking of the FP bit
> in MSR.

This might be a good thing to do.

> 
> Q.3. What's the reason to restore only MSR EE bit in context switches as
> opposed to restoring of entire MSR?

Because we only want the interrupt enable state to be
per-thread. There are other things in the MSR that can get changed on
a global basis and which we want to stay unchanged across context
switches.

> 
> Q.4. What's the reason to enable interrupts in the
> 'hal_interrupt_stack_call_pending_DSRs' routine? Note that when
> CYGIMP_HAL_COMMON_INTERRUPTS_USE_INTERRUPT_STACK isn't defined, this routine
> isn't invoked, and interrupts thus stay as-is before call to the
> 'Cyg_Interrupt::call_pending_DSRs_inner'.
>

Enabing interrupts while running DSRs is the main reason for doing
this. Since some DSR can run for a long time, they would affect
interrupt latency. If we enabled interrupts during DSR processing
while on a thread stack, we have the possibility of taking nested
interrupts, which would either overflow the stack, or require the
stack to be extra large. So, to avoid this we transfer to the
interrupt stack for DSR processing. A useful side-effect of this is
that we can keep the thread stacks small. This is also the reason for
preserving the EE bit across context switches - interrupts are
disabled in threads that already have an interrupt context stacked and
interrupts are not re-enabled until we have trasferred to another
thread that is capable of taking an interrupt.

Since we cannot do any of this if there is no interrupt stack, then
all of this is conditional on
CYGIMP_HAL_COMMON_INTERRUPTS_USE_INTERRUPT_STACK. We really do not
expect anyone to disable this, the advantages of having a separate
interrupt stack are far greater than the alternatives. However, if it
is disabled, then the system will still work, although with reduced
interrupt latency.

> Q.5. The following macro
> 
> 	.macro hal_cpu_int_enable
>         mfmsr   r0
>         ori     r3,r3,0x8000
>         rlwimi  r0,r3,0,16,16
>         sync
>         mtmsr   r0
>         sync
> 	.endm
> 
> could be rewritten as
> 
> 	.macro hal_cpu_int_enable
>         mfmsr   r0
>         ori     r0,0x8000
>         sync
>         mtmsr   r0
>         sync
> 	.endm
> 
> that is 1 instruction less. Is there something wrong about the latter
> implementation?
>

Not as far as I can see. I would probably have written the second
variant too. This is probably just an example of code being
cut-and-pasted.

> Q.6. The common level-0 code of VSRs uses SPRG1-3 to save work registers, then
> the SPRG1-3 are moved back to GPRs and are stored on the stack by the default
> VSRs. Isn't it better to store work registers directly on the stack and use
> SPRGs for some global variables such as address of 'hal_interrupt_handlers',
> address of 'hal_interrupt_data', etc.? It will also allow to save SRR0 and
> SRR1, and then set MSR RI as soon as possible. This could be critical for some
> applications (e.g., MPC509 goes to the check-stop state if another exception
> occurs when RI bit isn't set, processor hangs, and even internal watchdog
> doesn't work. The exception could occur due to spikes on the bus while testing 
> by applying high voltage to the ground of the device).
> 

There are several reasons for the code being as it is. The first is to
make the exception decode code in the exception_vector macro as fast
as possible by not touching memory more than the once to get the VSR
table entry. We cannot assume that the SP is necessarily valid at this
point. Although the default VSRs make this assumption, VSRs for
handling things like TLB miss should not. If we stashed some data on
the stack in this macro, it is very possible that the default VSRs
would have to pop it again to set up their own saved states. We also
want to provide the option of fast interrupt and exception handling to
applications by allowing them to install their own VSRs. We do this by
entering the VSR with the minimum of change to the machine state that
we can achieve.

> Related issue. In the startup code:
> 
>         # set up stack
>         lwi     sp,__interrupt_stack
>         mtspr   SPRG0,sp        # save in sprg0 for later use
> 
> SPRG0 isn't then used in the interrupts: the __interrupt_stack is loaded by
> 'lwi' (that is actually 2 instructions) instead:
> 
> #ifdef CYGIMP_HAL_COMMON_INTERRUPTS_USE_INTERRUPT_STACK
>         lwi     r3,__interrupt_stack            # stack top
> 

Just an intended optimization that never got finished.

-- 
Nick Garnett
Cygnus Solutions, a Red Hat Company
Cambridge, UK

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [ECOS] PowerPC HAL questions.
  2000-05-15  6:15 ` Nick Garnett
@ 2000-05-15  7:18   ` Sergei Organov
  2000-05-15  7:48     ` Nick Garnett
  2001-09-05  0:10   ` Nick Garnett
  1 sibling, 1 reply; 5+ messages in thread
From: Sergei Organov @ 2000-05-15  7:18 UTC (permalink / raw)
  To: Nick Garnett; +Cc: ecos-discuss

Nick Garnett <nickg@cygnus.co.uk> writes:
> Sergei Organov <osv@javad.ru> writes:
> 
> > Hello,
> > 
> > While developing FP support for the PowerPC HAL I've got a few questions. Some
> > of them are actually suggestions for possible improvements.
> > 
[...]
> > 
> > Q.4. What's the reason to enable interrupts in the
> > 'hal_interrupt_stack_call_pending_DSRs' routine? Note that when
> > CYGIMP_HAL_COMMON_INTERRUPTS_USE_INTERRUPT_STACK isn't defined, this routine
> > isn't invoked, and interrupts thus stay as-is before call to the
> > 'Cyg_Interrupt::call_pending_DSRs_inner'.
> >
> 
> Enabing interrupts while running DSRs is the main reason for doing
> this. Since some DSR can run for a long time, they would affect
> interrupt latency. If we enabled interrupts during DSR processing
> while on a thread stack, we have the possibility of taking nested
> interrupts, which would either overflow the stack, or require the
> stack to be extra large. So, to avoid this we transfer to the
> interrupt stack for DSR processing. A useful side-effect of this is
> that we can keep the thread stacks small. This is also the reason for
> preserving the EE bit across context switches - interrupts are
> disabled in threads that already have an interrupt context stacked and
> interrupts are not re-enabled until we have trasferred to another
> thread that is capable of taking an interrupt.
> 
> Since we cannot do any of this if there is no interrupt stack, then
> all of this is conditional on
> CYGIMP_HAL_COMMON_INTERRUPTS_USE_INTERRUPT_STACK. We really do not
> expect anyone to disable this, the advantages of having a separate
> interrupt stack are far greater than the alternatives. However, if it
> is disabled, then the system will still work, although with reduced
> interrupt latency.

Thanks for explanations.

Now when FP switch "on demand" is configured, I still need to call DSRs
through the HAL even if CYGIMP_HAL_COMMON_INTERRUPTS_USE_INTERRUPT_STACK is
disabled. From your explanation I now understand that I better don't enable
interrupts in this case before entering DSR, right?

[...]
> > Q.6. The common level-0 code of VSRs uses SPRG1-3 to save work registers, then
> > the SPRG1-3 are moved back to GPRs and are stored on the stack by the default
> > VSRs. Isn't it better to store work registers directly on the stack and use
> > SPRGs for some global variables such as address of 'hal_interrupt_handlers',
> > address of 'hal_interrupt_data', etc.? It will also allow to save SRR0 and
> > SRR1, and then set MSR RI as soon as possible. This could be critical for some
> > applications (e.g., MPC509 goes to the check-stop state if another exception
> > occurs when RI bit isn't set, processor hangs, and even internal watchdog
> > doesn't work. The exception could occur due to spikes on the bus while testing 
> > by applying high voltage to the ground of the device).
> > 
> 
> There are several reasons for the code being as it is. The first is to
> make the exception decode code in the exception_vector macro as fast
> as possible by not touching memory more than the once to get the VSR
> table entry. We cannot assume that the SP is necessarily valid at this
> point. Although the default VSRs make this assumption, VSRs for
> handling things like TLB miss should not.

Do you mean that eCos will support virtual memory in the future? Or there are
some other reasons to handle TLB misses? Are there other possible cases when
SP could be invalid? Anyway, it could be handled by providing special level-0
code for given exception vector in the HAL.

> If we stashed some data on the stack in this macro, it is very possible that
> the default VSRs would have to pop it again to set up their own saved
> states.

This seems to be questionable. At least when I wrote "FP unavailable"
VSR, I was forced to save a few registers on the stack anyway, and if VSR
entry code did it for me, I definitely don't pop them back until the end of
the VSR. For me it seems best if level-0 code creates the standard
exception/interrupt stack frame and stores working registers at the standard 
places in the frame. Why then default VSRs need to pop them back?

> We also want to provide the option of fast interrupt and exception handling
> to applications by allowing them to install their own VSRs. We do this by
> entering the VSR with the minimum of change to the machine state that
> we can achieve.

Well, that's a good thing. But unfortunately it creates an overhead in all
existing VSRs in favor of optimizing for some (imaginary ?) cases. Note that
after level-0 code there are only 2 somewhat free working registers (LR and
R3), and thus almost any reasonable VSR will need to store something on stack
anyway ;-) Both default VSRs and those one I wrote for "FP unavailable" VSR
make this fairly obvious for me. 


Sergei.

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [ECOS] PowerPC HAL questions.
  2000-05-15  7:18   ` Sergei Organov
@ 2000-05-15  7:48     ` Nick Garnett
  0 siblings, 0 replies; 5+ messages in thread
From: Nick Garnett @ 2000-05-15  7:48 UTC (permalink / raw)
  To: Sergei Organov; +Cc: ecos-discuss

Sergei Organov <osv@javad.ru> writes:

> Nick Garnett <nickg@cygnus.co.uk> writes:
> > Sergei Organov <osv@javad.ru> writes:

> Now when FP switch "on demand" is configured, I still need to call DSRs
> through the HAL even if CYGIMP_HAL_COMMON_INTERRUPTS_USE_INTERRUPT_STACK is
> disabled. From your explanation I now understand that I better don't enable
> interrupts in this case before entering DSR, right?

That is correct. 

> 
> [...]
> > > Q.6. The common level-0 code of VSRs uses SPRG1-3 to save work registers, then
> > > the SPRG1-3 are moved back to GPRs and are stored on the stack by the default
> > > VSRs. Isn't it better to store work registers directly on the stack and use
> > > SPRGs for some global variables such as address of 'hal_interrupt_handlers',
> > > address of 'hal_interrupt_data', etc.? It will also allow to save SRR0 and
> > > SRR1, and then set MSR RI as soon as possible. This could be critical for some
> > > applications (e.g., MPC509 goes to the check-stop state if another exception
> > > occurs when RI bit isn't set, processor hangs, and even internal watchdog
> > > doesn't work. The exception could occur due to spikes on the bus while testing 
> > > by applying high voltage to the ground of the device).
> > > 
> > 
> > There are several reasons for the code being as it is. The first is to
> > make the exception decode code in the exception_vector macro as fast
> > as possible by not touching memory more than the once to get the VSR
> > table entry. We cannot assume that the SP is necessarily valid at this
> > point. Although the default VSRs make this assumption, VSRs for
> > handling things like TLB miss should not.
> 
> Do you mean that eCos will support virtual memory in the future? Or there are
> some other reasons to handle TLB misses? Are there other possible cases when
> SP could be invalid? Anyway, it could be handled by providing special level-0
> code for given exception vector in the HAL.

We probably will not support virtual memory, but there are often
reasons for running the MMU, like cache control, remapping IO memory
space etc. Things like memory allocators and garbage collectors may
also want to make use of MMU features, and may want to install their
own TLB miss VSR.

> 
> > If we stashed some data on the stack in this macro, it is very possible that
> > the default VSRs would have to pop it again to set up their own saved
> > states.
> 
> This seems to be questionable. At least when I wrote "FP unavailable"
> VSR, I was forced to save a few registers on the stack anyway, and if VSR
> entry code did it for me, I definitely don't pop them back until the end of
> the VSR. For me it seems best if level-0 code creates the standard
> exception/interrupt stack frame and stores working registers at the standard 
> places in the frame. Why then default VSRs need to pop them back?

If we made the standard interrupt frame, we would be making more
assumptions about the stack and the availability of memory below
it. This is not something I am happy for the level-0 code to do.


> 
> > We also want to provide the option of fast interrupt and exception handling
> > to applications by allowing them to install their own VSRs. We do this by
> > entering the VSR with the minimum of change to the machine state that
> > we can achieve.
> 
> Well, that's a good thing. But unfortunately it creates an overhead in all
> existing VSRs in favor of optimizing for some (imaginary ?) cases. Note that
> after level-0 code there are only 2 somewhat free working registers (LR and
> R3), and thus almost any reasonable VSR will need to store something on stack
> anyway ;-) Both default VSRs and those one I wrote for "FP unavailable" VSR
> make this fairly obvious for me. 

The current code gives us minimum latency between the exception being
delivered and entry into the VSR. As you say, few VSRs will not save
extra registers to memory, but how they do that, and where, is their
business. Some might save to a static area, others might only push a
handful onto the stack. I didn't want the level-0 code to make any
premature choices of this sort, hence the effort to keep everything in
registers.



-- 
Nick Garnett
Cygnus Solutions, a Red Hat Company
Cambridge, UK

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [ECOS] PowerPC HAL questions.
  2000-05-15  6:15 ` Nick Garnett
  2000-05-15  7:18   ` Sergei Organov
@ 2001-09-05  0:10   ` Nick Garnett
  1 sibling, 0 replies; 5+ messages in thread
From: Nick Garnett @ 2001-09-05  0:10 UTC (permalink / raw)
  To: ecos-discuss

Sergei Organov <osv@javad.ru> writes:

> Hello,
> 
> While developing FP support for the PowerPC HAL I've got a few questions. Some
> of them are actually suggestions for possible improvements.
> 
> Q.1. All the writing to the MSR in the assembly code are surrounded by the
> 'sync' instructions. AFAIK it's because of bug in some PPC chips. However, the
> exported macros 'HAL_DISABLE_INTERRUPTS', 'HAL_ENABLE_INTERRUPTS', and
> 'HAL_RESTORE_INTERRUPTS' don't surround 'mtmsr' instruction by 'sync'
> instructions. Is it a bug?

Probably. Several people have worked on this code, so it is mostly
down to individual style. Since at present the lack of these syncs
does not cause any real problems there is little impetus to change it.

> 
> Q.2. Is it worth to make surrounding of 'mtmsr' instruction by 'sync'
> instructions configurable? I mean providing ability for processor variants to
> change this. It could be more important after addition of "on demand" FPU
> context switch support because it requires additional tweaking of the FP bit
> in MSR.

This might be a good thing to do.

> 
> Q.3. What's the reason to restore only MSR EE bit in context switches as
> opposed to restoring of entire MSR?

Because we only want the interrupt enable state to be
per-thread. There are other things in the MSR that can get changed on
a global basis and which we want to stay unchanged across context
switches.

> 
> Q.4. What's the reason to enable interrupts in the
> 'hal_interrupt_stack_call_pending_DSRs' routine? Note that when
> CYGIMP_HAL_COMMON_INTERRUPTS_USE_INTERRUPT_STACK isn't defined, this routine
> isn't invoked, and interrupts thus stay as-is before call to the
> 'Cyg_Interrupt::call_pending_DSRs_inner'.
>

Enabing interrupts while running DSRs is the main reason for doing
this. Since some DSR can run for a long time, they would affect
interrupt latency. If we enabled interrupts during DSR processing
while on a thread stack, we have the possibility of taking nested
interrupts, which would either overflow the stack, or require the
stack to be extra large. So, to avoid this we transfer to the
interrupt stack for DSR processing. A useful side-effect of this is
that we can keep the thread stacks small. This is also the reason for
preserving the EE bit across context switches - interrupts are
disabled in threads that already have an interrupt context stacked and
interrupts are not re-enabled until we have trasferred to another
thread that is capable of taking an interrupt.

Since we cannot do any of this if there is no interrupt stack, then
all of this is conditional on
CYGIMP_HAL_COMMON_INTERRUPTS_USE_INTERRUPT_STACK. We really do not
expect anyone to disable this, the advantages of having a separate
interrupt stack are far greater than the alternatives. However, if it
is disabled, then the system will still work, although with reduced
interrupt latency.

> Q.5. The following macro
> 
> 	.macro hal_cpu_int_enable
>         mfmsr   r0
>         ori     r3,r3,0x8000
>         rlwimi  r0,r3,0,16,16
>         sync
>         mtmsr   r0
>         sync
> 	.endm
> 
> could be rewritten as
> 
> 	.macro hal_cpu_int_enable
>         mfmsr   r0
>         ori     r0,0x8000
>         sync
>         mtmsr   r0
>         sync
> 	.endm
> 
> that is 1 instruction less. Is there something wrong about the latter
> implementation?
>

Not as far as I can see. I would probably have written the second
variant too. This is probably just an example of code being
cut-and-pasted.

> Q.6. The common level-0 code of VSRs uses SPRG1-3 to save work registers, then
> the SPRG1-3 are moved back to GPRs and are stored on the stack by the default
> VSRs. Isn't it better to store work registers directly on the stack and use
> SPRGs for some global variables such as address of 'hal_interrupt_handlers',
> address of 'hal_interrupt_data', etc.? It will also allow to save SRR0 and
> SRR1, and then set MSR RI as soon as possible. This could be critical for some
> applications (e.g., MPC509 goes to the check-stop state if another exception
> occurs when RI bit isn't set, processor hangs, and even internal watchdog
> doesn't work. The exception could occur due to spikes on the bus while testing 
> by applying high voltage to the ground of the device).
> 

There are several reasons for the code being as it is. The first is to
make the exception decode code in the exception_vector macro as fast
as possible by not touching memory more than the once to get the VSR
table entry. We cannot assume that the SP is necessarily valid at this
point. Although the default VSRs make this assumption, VSRs for
handling things like TLB miss should not. If we stashed some data on
the stack in this macro, it is very possible that the default VSRs
would have to pop it again to set up their own saved states. We also
want to provide the option of fast interrupt and exception handling to
applications by allowing them to install their own VSRs. We do this by
entering the VSR with the minimum of change to the machine state that
we can achieve.

> Related issue. In the startup code:
> 
>         # set up stack
>         lwi     sp,__interrupt_stack
>         mtspr   SPRG0,sp        # save in sprg0 for later use
> 
> SPRG0 isn't then used in the interrupts: the __interrupt_stack is loaded by
> 'lwi' (that is actually 2 instructions) instead:
> 
> #ifdef CYGIMP_HAL_COMMON_INTERRUPTS_USE_INTERRUPT_STACK
>         lwi     r3,__interrupt_stack            # stack top
> 

Just an intended optimization that never got finished.

-- 
Nick Garnett
Cygnus Solutions, a Red Hat Company
Cambridge, UK

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2001-09-05  0:10 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2000-05-12  9:21 [ECOS] PowerPC HAL questions Sergei Organov
2000-05-15  6:15 ` Nick Garnett
2000-05-15  7:18   ` Sergei Organov
2000-05-15  7:48     ` Nick Garnett
2001-09-05  0:10   ` Nick Garnett

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).