* [ECOS] DSR stops running after heavy interrupts.
From: Joe Porthouse @ 2006-04-05 21:09 UTC
To: 'eCos Discussion'

In a nutshell:
The real time clock DSR stops getting called after several minutes of heavy UART ISR traffic. I have been running into this on and off for a while. Lowering the serial ISR priority seems to help some, but does not eliminate the problem.

Background:
The application is on a custom xScale PXA255 board without redboot. When the problem occurs the real time tick clock simply stops updating. All other aspects of the program seem to work correctly. The real time ISR is still getting called, as are the other ISRs, but the real time clock DSR is no longer called.

In the Vectors.S file I can step through the execution and see what is happening. On return from the ISR the return code is examined to determine if a DSR call should be added to the DSR list. This check is done here:

    // The return value from the handler (in r0) will indicate whether a
    // DSR is to be posted. Pass this together with a pointer to the
    // interrupt object we have just used to the interrupt tidy up

    cmp v1,#CYGNUM_HAL_INTERRUPT_NONE
    beq 17f

When the problem occurs the branch (beq) is taken, which skips adding the DSR to the list and ends the ISR. I can see that R0 is correctly 0x03 but the branch still occurs. The problem may be in how this is getting assembled. In my JTAG tool I see the above code as:

    00008C5C e3740001 CMN R4,#00000001
    00008C60 0a000003 BEQ 00008c74

Obviously there is some assembler substitution going on. If the value is in r0, I'm not sure why v1 is being checked (I am not familiar with the "v" register notation). I am also not sure why the resulting code refers to R4. R4 has a different value than R1 at this point in the execution.

Any ideas on this?

Many thanks,
Joe Porthouse
Toptech Systems, Inc.
--
Before posting, please read the FAQ: http://ecos.sourceware.org/fom/ecos
and search the list archive: http://ecos.sourceware.org/ml/ecos-discuss
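
For readers puzzling over the same disassembly, here is a minimal C++ sketch that pulls apart the instruction word e3740001 quoted in the message. It is not part of eCos; `DataProcImm`, `decode_dp_imm` and `mnemonic` are names made up for this illustration, and it only handles the ARM data-processing form with an immediate operand. Two facts make the output easier to read: v1 is the procedure-call-standard name for r4, and CMN sets the flags from Rn + imm, so `CMN R4,#1` yields EQ exactly when R4 holds -1. An assembler can emit that form when the constant in a `cmp` is negative. This suggests that CYGNUM_HAL_INTERRUPT_NONE is -1 in this build, but that value is inferred from the disassembly and is not stated in the message.

```cpp
#include <cstdint>
#include <string>

// Fields of an ARM (A32) data-processing instruction with an immediate operand.
struct DataProcImm {
    bool          valid;      // true if the word matches this instruction form
    unsigned      cond;       // bits 31..28, 0xE means "always"
    unsigned      opcode;     // bits 24..21, 0xA = CMP, 0xB = CMN
    bool          sets_flags; // bit 20 (S)
    unsigned      rn;         // bits 19..16, first operand register
    std::uint32_t imm;        // 8-bit immediate rotated right by 2 * rotate field
};

// Hypothetical helper: split one instruction word into its fields.
inline DataProcImm decode_dp_imm(std::uint32_t word) {
    DataProcImm d{};
    // Bits 27..26 must be 00 and bit 25 (I) must be 1 for this form.
    d.valid = ((word >> 26) & 0x3u) == 0u && ((word >> 25) & 0x1u) == 1u;
    d.cond = word >> 28;
    d.opcode = (word >> 21) & 0xFu;
    d.sets_flags = ((word >> 20) & 0x1u) != 0u;
    d.rn = (word >> 16) & 0xFu;
    const unsigned rot = ((word >> 8) & 0xFu) * 2u;
    const std::uint32_t imm8 = word & 0xFFu;
    d.imm = (rot == 0u) ? imm8 : ((imm8 >> rot) | (imm8 << (32u - rot)));
    return d;
}

// Only the two compare opcodes relevant to this thread are named.
inline std::string mnemonic(const DataProcImm& d) {
    switch (d.opcode) {
        case 0xA: return "CMP";
        case 0xB: return "CMN";
        default:  return "other";
    }
}
```

Calling `decode_dp_imm(0xE3740001u)` and inspecting `rn`, `imm` and `mnemonic()` shows which register and constant the comparison really uses.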
* Re: [ECOS] DSR stops running after heavy interrupts.
From: Andrew Lunn @ 2006-04-06 6:49 UTC
To: Joe Porthouse; +Cc: 'eCos Discussion'

On Wed, Apr 05, 2006 at 05:09:41PM -0400, Joe Porthouse wrote:

[...]

> Any ideas on this?

The fact that this works for a while and then breaks suggests that something unusual is going on.

When the problem occurs, take a look at the actual contents of the memory which holds these instructions. Has it been corrupted? Be careful with your debugger here. If you just ask it to disassemble the code, it might show you what is in the ELF file, not what is in memory. Do a hex dump and compare the machine code bytes.

Andrew
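
The comparison Andrew describes can be sketched as below. This is only an illustration: `first_mismatch` is a hypothetical helper, and it assumes the expected words (from the ELF or flash image) and the words read back from target memory have already been collected into two vectors by whatever means the debugger offers.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Returns the index of the first word that differs between the image and the
// target dump, or -1 if every compared word matches and the lengths agree.
inline long first_mismatch(const std::vector<std::uint32_t>& image_words,
                           const std::vector<std::uint32_t>& target_words) {
    const std::size_t common = image_words.size() < target_words.size()
                                   ? image_words.size()
                                   : target_words.size();
    for (std::size_t i = 0; i < common; ++i) {
        if (image_words[i] != target_words[i]) {
            return static_cast<long>(i);
        }
    }
    // A length difference counts as a mismatch at the first missing word.
    if (image_words.size() != target_words.size()) {
        return static_cast<long>(common);
    }
    return -1;
}
```

A result of -1 for the region around the `cmp`/`beq` pair would rule out memory corruption of those instructions.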
* Re: [ECOS] DSR stops running after heavy interrupts.
From: Stefan Sommerfeld @ 2006-04-06 9:02 UTC
To: ecos-discuss

Hi,

[...]

> When the problem occurs take a look at the actually contents of memory
> which contains these instructions. Has it been corrupted? Be careful
> with your debugger here. If you just ask it to disassemble the code it
> might show you what is in the elf file, not what is in memory. Do a
> hex dump and compare the machine code bytes.

I have had that experience too, and not just with the UART (I'm using a PXA270). Currently, using my FIFO DSRs, it works better, but please do more research on that topic to fix this once and for all.

Bye...
* RE: [ECOS] DSR stops running after heavy interrupts.
From: Joe Porthouse @ 2006-04-06 21:09 UTC
To: ecos-discuss

Stefan, thanks. I'm glad to know I'm not the only one experiencing this problem.

I have made a little more progress.

I still can't explain the code listed in my first message that checks the return value from the ISR, but I believe it is somehow working correctly. I still believe there may be a problem with R4 being checked instead of R0. I did verify that the memory was the same as my code window, as well as the flash image.

This is what I did find.

DSR calls are being added to the table... thousands of them... just not getting serviced. All the calls that lead to "call_pending_DSRs" seem to originate from the unlock_inner() routine. This routine stops getting called when the problem occurs (you can see the logic below).

    inline void Cyg_Scheduler::unlock()
    {
        // This is an inline wrapper for the real scheduler unlock function in
        // Cyg_Scheduler::unlock_inner().

        // Only do anything if the lock is about to go zero, otherwise we simply
        // decrement and return. As with lock() we do not need any special code
        // to decrement the lock counter.

        CYG_INSTRUMENT_SCHED(UNLOCK,get_sched_lock(),0);

        HAL_REORDER_BARRIER();

        cyg_ucount32 __lock = get_sched_lock() - 1;

        if( __lock == 0 )
            unlock_inner(0);
        else
            set_sched_lock(__lock);

        HAL_REORDER_BARRIER();
    }

Upon examination the __lock value is "6" when unlock() is called at the end of the ISR, thus unlock_inner() never gets called. If I set the variable read by get_sched_lock() back to 1, my DSR calls resume. Mmmmmmm....

So somehow locks are being done without unlocks. I am at a loss to figure out how this is occurring since I do not make lock calls in any of my code. Could interrupt preemption somehow be occurring? Do the hal_disable/enable interrupt calls mess with the lock?

Any good ideas on how to track this down?

Joe Porthouse
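
The gating effect Joe describes can be reproduced with a toy model. The sketch below is not the eCos class: `ToyScheduler` and its members are hypothetical, and `unlock_inner()` is reduced to "run everything pending and drop the lock to zero", which leaves out all of the real scheduler's work. It only shows why a lock count that never returns to 1 before `unlock()` leaves posted DSRs stranded.

```cpp
#include <cstdint>

// Toy stand-in for the scheduler state discussed above (not Cyg_Scheduler).
struct ToyScheduler {
    std::uint32_t sched_lock = 0;   // nesting depth of the scheduler lock
    std::uint32_t pending_dsrs = 0; // DSRs posted but not yet run
    std::uint32_t dsrs_run = 0;     // DSRs that have been executed

    void lock() { ++sched_lock; }

    void post_dsr() { ++pending_dsrs; }

    // Same shape as the quoted unlock(): only the transition to zero does
    // real work, every other call just decrements the counter.
    // Calling this with sched_lock == 0 is a caller error in this toy model.
    void unlock() {
        const std::uint32_t new_lock = sched_lock - 1;
        if (new_lock == 0) {
            unlock_inner();
        } else {
            sched_lock = new_lock;
        }
    }

    // Simplified assumption: run all pending DSRs, then release the lock.
    void unlock_inner() {
        dsrs_run += pending_dsrs;
        pending_dsrs = 0;
        sched_lock = 0;
    }
};
```

With a count that has drifted upward, each lock/unlock pair returns to the drifted value and `pending_dsrs` only grows. Forcing `sched_lock` back to 1 before the next `unlock()` drains the queue, which matches what Joe saw when he patched the variable by hand.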
* Re: [ECOS] DSR stops running after heavy interrupts.
From: Andrew Lunn @ 2006-04-06 21:19 UTC
To: Joe Porthouse; +Cc: ecos-discuss

On Thu, Apr 06, 2006 at 05:08:45PM -0400, Joe Porthouse wrote:

[...]

> So somehow locks are being done without unlocks. I am at a loss to figure
> out how this is occurring since I do not make lock calls in any of my code.
> Could interrupt preemption somehow be occurring? Does the
> hal_disable/enable interrupt calls mess with the lock?
>
> Any good ideas on how to track this down?

Kernel instrumentation.

    CYG_INSTRUMENT_SCHED(UNLOCK,get_sched_lock(),0);

Locks and unlocks are logged. See if you can find a case of a lock without an unlock.

Andrew
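
One way to search such a log is sketched below. It assumes the instrumentation records have already been reduced to a plain sequence of lock and unlock events. The `Event` type and `unmatched_locks` function are hypothetical and say nothing about the actual eCos instrumentation buffer format. Each unlock is paired with the most recent open lock, so whatever is left open at the end marks where the nesting never returned to its earlier depth.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical, already-decoded trace events.
enum class Event { Lock, Unlock };

// Returns the trace indices of Lock events that were never matched by a
// later Unlock, pairing each Unlock with the most recent open Lock.
inline std::vector<std::size_t> unmatched_locks(const std::vector<Event>& trace) {
    std::vector<std::size_t> open; // indices of locks not yet matched
    for (std::size_t i = 0; i < trace.size(); ++i) {
        if (trace[i] == Event::Lock) {
            open.push_back(i);
        } else if (!open.empty()) {
            open.pop_back();
        }
        // An Unlock with nothing open is ignored: the trace may have
        // started in the middle of a locked region.
    }
    return open;
}
```

The pairing is a heuristic. It shows the depth at which the imbalance appears, and the surrounding trace entries then have to be read by hand to see which code path took the extra lock.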
* RE: [ECOS] DSR stops running after heavy interrupts. Bug found?
From: Joe Porthouse @ 2006-04-08 4:18 UTC
To: ecos-discuss

Found it!!!

It took two days to figure out what was happening but I think I have a handle on it. See if this sounds right.

After an ISR executes, if there is an associated DSR to execute, the DSR is added to the DSR list and a scheduler lock is made. Since DSRs are run with interrupts enabled, the scheduler lock will prevent the application code from running until all DSRs finish and release each of the scheduler locks. After adding the DSR to the list, if there is only one scheduler lock (the one just added), then a call must be made to start the first DSR executing. If more than one scheduler lock is in place, then execution must resume from where it left off (DSR or other critical section). The DSR will start after the next scheduler unlock is called.

If the ISR does not have an associated DSR, nothing is added to the DSR list and the scheduler lock is not made, allowing the application or DSR to resume when the ISR finishes.

The problem is in the /hal/arm/arch/current/src/vectors.S file at line 951.

    // The return value from the handler (in r0) will indicate whether a
    // DSR is to be posted. Pass this together with a pointer to the
    // interrupt object we have just used to the interrupt tidy up routine.

    // don't run this for spurious interrupts!
    cmp v1,#CYGNUM_HAL_INTERRUPT_NONE    <-- Incorrectly references R4

    cmp r0,#CYGNUM_HAL_INTERRUPT_NONE    <-- Change to this

The wrong register is referenced to determine if the ISR has a DSR to add to the DSR list. Since any value in R4 other than 0x0001 will call the routines to add a DSR, and I assume most ISRs have a DSR, the default behavior seems to work by chance in most configurations.

In my application my ISR does NOT have an associated DSR. Even though the correct 0x0001 is returned by the ISR, the call to add the DSR is still made. This includes performing a scheduler lock since it expects to release it after the DSR runs, but there is no DSR. I believe there is some type of race condition here that allows the lock to not be released correctly since there is no corresponding DSR in the DSR list.

Modifying only the above line has so far completely solved my issue of losing my DSR execution.

Can someone review the proposed change and, if warranted, add it to CVS? This problem could/will affect any ARM eCos application.

Since "v1" may have correctly referenced "r0" at some time in the past, the other half dozen "v1, v2...v6" references in vectors.S could also be incorrect.

Joe Porthouse
Toptech Systems, Inc.
* Re: [ECOS] DSR stops running after heavy interrupts. Bug found?
From: Andrew Lunn @ 2006-04-09 12:33 UTC
To: Joe Porthouse; +Cc: ecos-discuss

On Sat, Apr 08, 2006 at 12:18:24AM -0400, Joe Porthouse wrote:

[...]

> The problem is in the /hal/arm/arch/current/src/vectors.S file at line 951.
>
> // The return value from the handler (in r0) will indicate whether a
> // DSR is to be posted. Pass this together with a pointer to the
> // interrupt object we have just used to the interrupt tidy up routine.
>
> // don't run this for spurious interrupts!
> cmp v1,#CYGNUM_HAL_INTERRUPT_NONE <-- Incorrectly references R4
>
> cmp r0,#CYGNUM_HAL_INTERRUPT_NONE <-- Change to this
>
> The wrong register is referenced to determine if the ISR has a DSR to add to
> the DSR list. Since any value in R4 other then 0x0001 will call the
> routines to add a DSR, and I assume most ISRs have a DSR, the default
> behavior seems to works by chance in most configurations.

I don't think this is the correct interpretation of this code.

Line 914:

        bl      hal_IRQ_handler         // determine interrupt source
        mov     v1,r0                   // returned vector #

So the vector is now in both r0 and v1 (== r4).

    #if defined(CYGPKG_KERNEL_INSTRUMENT) && \
        defined(CYGDBG_KERNEL_INSTRUMENT_INTR)
        ldr     r0,=RAISE_INTR          // arg0 = type = INTR,RAISE
        mov     r1,v1                   // arg1 = vector
        mov     r2,#0                   // arg2 = 0
        bl      cyg_instrument          // call instrument function
    #endif

        ARM_MODE(r0,10)

        mov     r0,v1                   // vector #

The code above destroys the vector value in r0. Reload it from v1.

    #if defined(CYGDBG_HAL_DEBUG_GDB_CTRLC_SUPPORT) \
        || defined(CYGDBG_HAL_DEBUG_GDB_BREAK_SUPPORT)
        // If we are supporting Ctrl-C interrupts from GDB, we must squirrel
        // away a pointer to the save interrupt state here so that we can
        // plant a breakpoint at some later time.

        .extern hal_saved_interrupt_state
        ldr     r2,=hal_saved_interrupt_state
        str     v6,[r2]
    #endif

        cmp     r0,#CYGNUM_HAL_INTERRUPT_NONE   // spurious interrupt
        bne     10f

Here we check if the vector returned indicates there is a spurious interrupt. If it is not, we jump over this next bit of code.

    #ifdef CYGIMP_HAL_COMMON_INTERRUPTS_IGNORE_SPURIOUS
        // Acknowledge the interrupt
        THUMB_CALL(r1,12,hal_interrupt_acknowledge)
    #else
        mov     r0,v6                   // register frame
        THUMB_CALL(r1,12,hal_spurious_IRQ)
    #endif // CYGIMP_HAL_COMMON_INTERRUPTS_IGNORE_SPURIOUS
        b       spurious_IRQ

Cleans up the interrupt controller after the spurious interrupt. v1 still contains the interrupt vector, i.e. #CYGNUM_HAL_INTERRUPT_NONE.

    10:     ldr     r1,.hal_interrupt_data
        ldr     r1,[r1,v1,lsl #2]       // handler data
        ldr     r2,.hal_interrupt_handlers
        ldr     v3,[r2,v1,lsl #2]       // handler (indexed by vector #)
        mov     r2,v6                   // register frame (this is necessary
                                        // for the ISR too, for ^C detection)

Calculates the address of the function to call.

    #ifdef __thumb__
        ldr     lr,=10f
        bx      v3                      // invoke handler (thumb mode)
        .pool
        .code   16
        .thumb_func
    IRQ_10T:
    10:     ldr     r2,=15f
        bx      r2                      // switch back to ARM mode
        .pool
        .code   32
    15:
    IRQ_15A:
    #else
        mov     lr,pc                   // invoke handler (call indirect
        mov     pc,v3                   // thru v3)
    #endif

Calls the vector, either as Thumb or ARM ISA. v1 (== r4) still contains the vector.

    #ifdef CYGIMP_HAL_COMMON_INTERRUPTS_USE_INTERRUPT_STACK
        // If we are returning from the last nested interrupt, move back
        // to the thread stack. interrupt_end() must be called on the
        // thread stack since it potentially causes a context switch.
        ldr     r2,.irq_level
        ldr     r3,[r2]
        subs    r1,r3,#1
        str     r1,[r2]
        ldreq   sp,[sp]                 // This should be the saved stack pointer
    #endif

        // The return value from the handler (in r0) will indicate whether a
        // DSR is to be posted. Pass this together with a pointer to the
        // interrupt object we have just used to the interrupt tidy up routine.

        // don't run this for spurious interrupts!
        cmp     v1,#CYGNUM_HAL_INTERRUPT_NONE
        beq     17f

v1 is still the vector. If the vector indicates a spurious interrupt we don't have a DSR to call, so skip the next bit of code.

        ldr     r1,.hal_interrupt_objects
        ldr     r1,[r1,v1,lsl #2]
        mov     r2,v6                   // register frame

        THUMB_MODE(r3,10)

        bl      interrupt_end           // post any bottom layer handler

Now run the DSRs if possible etc.

                                        // threads and call scheduler
        ARM_MODE(r1,10)
    17:
    //      mov     r0,sp
    //      bl      show_frame_out

        // return from IRQ is same as return from exception
        b       return_from_exception

So I think the comparison with v1 is correct.

However, from what you are saying it sounds like there needs to be another comparison afterwards. Something like:

        and r0,r0,#2 // CYG_ISR_CALL_DSR
        beq 17f

Andrew
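
The control flow Andrew walks through can be condensed into a small C++ model that keeps the two values apart: the vector number (held in v1) and the ISR return value (held in r0 after the handler call). Everything here is a sketch. `irq_path` and `IrqOutcome` are hypothetical names, and the constant -1 for the "no interrupt" vector is an assumption based on the `CMN R4,#1` disassembly earlier in the thread. The values 1 and 2 for the ISR return bits follow the 0x0001, 0x03 and `#2 // CYG_ISR_CALL_DSR` figures quoted in the messages. The model also assumes the spurious branch rejoins the main path before the second comparison, which the quoted listing does not show.

```cpp
#include <functional>

constexpr int      kInterruptNone = -1; // assumed value of the "no vector" code
constexpr unsigned kIsrHandled    = 1u; // ISR return bit: interrupt handled
constexpr unsigned kIsrCallDsr    = 2u; // ISR return bit: please run the DSR

struct IrqOutcome {
    bool     spurious;             // decoder reported no pending vector
    bool     isr_called;           // a handler was invoked
    bool     interrupt_end_called; // the tidy-up routine was reached
    unsigned isr_ret;              // what the handler returned (r0)
};

// vector: stands in for the result of hal_IRQ_handler.
// isr:    stands in for the handler looked up by vector number.
inline IrqOutcome irq_path(int vector, const std::function<unsigned(int)>& isr) {
    IrqOutcome out{};
    const int v1 = vector;      // "mov v1,r0": keep the vector in a saved register
    unsigned r0 = 0;

    if (v1 != kInterruptNone) { // first check: spurious or not
        r0 = isr(v1);           // r0 is now the ISR return value, v1 is unchanged
        out.isr_called = true;
        out.isr_ret = r0;
    } else {
        out.spurious = true;    // spurious path: no handler is run
    }

    // Second check: again on the vector in v1, not on the ISR return value.
    if (v1 != kInterruptNone) {
        out.interrupt_end_called = true; // isr_ret would be passed along here
    }
    return out;
}
```

In this model an ISR that returns only `kIsrHandled` still reaches the tidy-up step, because the second test looks at the vector and never at the return value.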
* [ECOS] Re: DSR stops running after heavy interrupts. Bug found?
From: Sergei Organov @ 2006-04-10 4:50 UTC
To: ecos-discuss

Andrew Lunn <andrew@lunn.ch> writes:

[...]

> However, from what you are saying it sounds like there needs to be
> another comparison afterwards. Something like:
>
>         and r0,r0,#2 // CYG_ISR_CALL_DSR
>         beq 17f

No, bit checking of the ISR return value is performed inside the interrupt_end() routine:

    if( isr_ret & Cyg_Interrupt::CALL_DSR && intr != NULL ) intr->post_dsr()

-- Sergei.
* Re: [ECOS] Re: DSR stops running after heavy interrupts. Bug found?
From: Nick Garnett @ 2006-04-10 9:36 UTC
To: Sergei Organov; +Cc: ecos-discuss

Sergei Organov <osv@javad.com> writes:

> Andrew Lunn <andrew@lunn.ch> writes:
>
> [...]
>
> > However, from what you are saying it sounds like there needs to be
> > another comparison afterwards. Something like:
> >
> >         and r0,r0,#2 // CYG_ISR_CALL_DSR
> >         beq 17f
>
> No, bit checking of the ISR return value is performed inside the
> interrupt_end() routine:
>
>     if( isr_ret & Cyg_Interrupt::CALL_DSR && intr != NULL ) intr->post_dsr()

Exactly. And there are other housekeeping things that go on in interrupt_end() which cannot be skipped. The most important of these is decrementing the scheduler lock.

I don't really see how the original poster's problem is fixed by trying to skip interrupt_end(), I would only expect doing that to aggravate the problem. The scheduler lock is acquired early in interrupt processing -- before the ISR is called and we know whether there is a DSR to call. interrupt_end() decrements the scheduler lock and as a side-effect may cause any DSRs to be called.

As Andrew has suggested, I think Joe's best way of working out what is happening is to switch on instrumentation and see if he can track down the extra increments of the scheduler lock.

--
Nick Garnett                    eCos Kernel Architect
http://www.ecoscentric.com      The eCos and RedBoot experts
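
Nick's accounting argument can be checked with a toy model. The sketch below is an illustration under stated assumptions and is not eCos code. `ToyKernel` and its members are hypothetical; the lock is taken at interrupt entry as Nick describes; and the stand-in for interrupt_end() posts a DSR only when the CALL_DSR bit (assumed to be 2, as in the snippets above) is set, then always drops the lock. The `call_interrupt_end` flag models any path that takes the lock but bypasses the tidy-up step.

```cpp
#include <cstdint>

constexpr unsigned kIsrHandled = 1u; // ISR return bit: interrupt handled
constexpr unsigned kIsrCallDsr = 2u; // ISR return bit: please run the DSR

// Toy model of the lock accounting around one interrupt (not eCos code).
struct ToyKernel {
    std::uint32_t sched_lock = 0;  // scheduler lock nesting depth
    std::uint32_t dsrs_posted = 0; // DSRs waiting to run
    std::uint32_t dsrs_run = 0;    // DSRs that have run

    // Stand-in for interrupt_end(): post the DSR only if the ISR asked for
    // it, then always drop the lock that was taken at interrupt entry.
    void interrupt_end(unsigned isr_ret) {
        if (isr_ret & kIsrCallDsr) {
            ++dsrs_posted;
        }
        if (--sched_lock == 0) {
            dsrs_run += dsrs_posted; // pending DSRs run when the lock hits zero
            dsrs_posted = 0;
        }
    }

    // One interrupt. Passing call_interrupt_end = false models a path that
    // takes the lock but never reaches the tidy-up routine.
    void interrupt(unsigned isr_ret, bool call_interrupt_end) {
        ++sched_lock; // taken early, before the ISR result is known
        if (call_interrupt_end) {
            interrupt_end(isr_ret);
        }
    }
};
```

In this model an ISR without a DSR is harmless as long as the tidy-up step runs. A single interrupt that skips it leaves `sched_lock` one higher for good, and every later DSR is posted but never run.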
* Re: [ECOS] Re: DSR stops running after heavy interrupts. Bug found?
From: Sergei Organov @ 2006-04-10 10:44 UTC
To: ecos-discuss; +Cc: Nick Garnett

Nick Garnett <nickg@ecoscentric.com> writes:

[...]

> I don't really see how the original poster's problem is fixed by
> trying to skip interrupt_end(), I would only expect doing that to
> aggravate the problem. The scheduler lock is acquired early in
> interrupt processing -- before the ISR is called and we know whether
> there is a DSR to call. interrupt_end() decrements the scheduler lock
> and as a side-effect may cause any DSRs to be called.

A little OT while we are at interrupt_end(). Could you please explain why

    #ifdef CYGPKG_KERNEL_SMP_SUPPORT
        Cyg_Scheduler::lock();
    #endif

is there at the beginning, -- looks like extra scheduler lock without corresponding unlock for SMP case. If not a bug, it seems a comment would be nice to have there.

-- Sergei.
* Re: [ECOS] Re: DSR stops running after heavy interrupts. Bug found? 2006-04-10 10:44 ` Sergei Organov @ 2006-04-10 10:59 ` Nick Garnett 2006-04-10 11:15 ` Sergei Organov 0 siblings, 1 reply; 25+ messages in thread From: Nick Garnett @ 2006-04-10 10:59 UTC (permalink / raw) To: Sergei Organov; +Cc: ecos-discuss Sergei Organov <osv@javad.com> writes: > A little OT while we are at interrupt_end(). Could you please explain > why > > #ifdef CYGPKG_KERNEL_SMP_SUPPORT > Cyg_Scheduler::lock(); > #endif > > is there at the beginning, -- looks like extra scheduler lock without > corresponding unlock for SMP case. If not a bug, it seems a comment > would be nice to have there. In SMP configurations we don't want to claim the scheduler lock in the interrupt VSR because it would block interrupts and scheduler operations on other CPUs. It also requires a spinlock to be claimed, which would require special code to be written -- it's much easier to do the job later. In HALs where SMP is supported, the usual scheduler lock increment is ifdeffed out. Perhaps a comment would be useful, but it seemed like the ifdef surrounding it would be sufficient indication that this was for SMP only. -- Nick Garnett eCos Kernel Architect http://www.ecoscentric.com The eCos and RedBoot experts -- Before posting, please read the FAQ: http://ecos.sourceware.org/fom/ecos and search the list archive: http://ecos.sourceware.org/ml/ecos-discuss ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [ECOS] Re: DSR stops running after heavy interrupts. Bug found?
From: Sergei Organov @ 2006-04-10 11:15 UTC
To: ecos-discuss

Nick Garnett <nickg@ecoscentric.com> writes:
> Sergei Organov <osv@javad.com> writes:
>
>> A little OT while we are at interrupt_end(). Could you please explain
>> why
>>
>> #ifdef CYGPKG_KERNEL_SMP_SUPPORT
>>     Cyg_Scheduler::lock();
>> #endif
>>
>> is there at the beginning, -- looks like extra scheduler lock without
>> corresponding unlock for SMP case. If not a bug, it seems a comment
>> would be nice to have there.
>
> In SMP configurations we don't want to claim the scheduler lock in the
> interrupt VSR because it would block interrupts and scheduler
> operations on other CPUs. It also requires a spinlock to be claimed,
> which would require special code to be written -- it's much easier to
> do the job later. In HALs where SMP is supported, the usual scheduler
> lock increment is ifdeffed out.

Ah, now I see, thanks. It seems non-SMP targets could benefit from this approach as well, couldn't they? Or is there some fundamental difference here? I just think that the SMP variant makes some things better even in the single-CPU case, so it could be a good idea to use the SMP variant for the single-CPU case in those places. Fewer ifdefs would be another gain.

> Perhaps a comment would be useful, but it seemed like the ifdef
> surrounding it would be sufficient indication that this was for SMP
> only.

Please try to look at it from the POV of a reader of interrupt_end(): it's clear that it's for SMP only, but it's absolutely unclear why SMP requires one more scheduler lock. When I looked at it, I failed to find the corresponding unlock(), but didn't pay much attention as I'm not currently interested in SMP. I believe your "In SMP configurations we..." paragraph above would be a nice comment for this piece of code.

-- Sergei.
* RE: [ECOS] Re: DSR stops running after heavy interrupts. Bug found?
From: Joe Porthouse @ 2006-04-10 13:20 UTC
To: ecos-discuss

All,

Many thanks for the replies. I now see my misunderstanding of the intent of vectors.S line 951:

cmp v1,#CYGNUM_HAL_INTERRUPT_NONE

I also see the bit check on the isr_ret value in the interrupt_end() routine.

I am still at a loss as to why my change solved my issue. I do believe it is an issue with servicing an ISR that does not have a DSR. I am still debugging.

BTW, where is the initial scheduler lock performed when an interrupt is generated?

Joe Porthouse
Toptech Systems, Inc.
* RE: [ECOS] Re: DSR stops running after heavy interrupts. Spurious Interrupt!
From: Joe Porthouse @ 2006-04-10 16:41 UTC
To: ecos-discuss

Nick,

Thanks for your reply.

You're right. Not calling the interrupt_end() routine would cause the lock not to be released. After your comment I started looking closer at my modification.

My original modification was to /hal/arm/arch/current/src/vectors.S at line 951:

cmp v1,#CYGNUM_HAL_INTERRUPT_NONE   <-- from this
cmp r0,#CYGNUM_HAL_INTERRUPT_NONE   <-- to this

v1 originally contained the interrupt vector, but I mistakenly believed this was the check of the return value from the ISR. I modified it to look at r0, the return value from the ISR. The return value from the ISR will be 0-3 (really 1 or 3). CYGNUM_HAL_INTERRUPT_NONE is -1! (For some reason I thought it was +1 when looking at the assembly listing.)

Bottom line: my modification made sure that interrupt_end() would always be called, even when v1 == CYGNUM_HAL_INTERRUPT_NONE (spurious interrupt).

I just did a quick test with the original code and verified that when a spurious interrupt occurs, the interrupt_end() routine is not called, the lock is not released, and my problem occurs.

Calling the interrupt_end() routine with a spurious interrupt did not seem to break anything. Was there a reason why interrupt_end() should not be called on spurious interrupts?

Now to figure out why I am getting a spurious interrupt with the simple UART code listed below.

What should I look for in attempting to eliminate spurious interrupts? Can they be eliminated? What modifications to the eCos source or my project are in order for dealing with spurious interrupts correctly?

#define CYGNUM_HAL_INTERRUPT_22 22
#define CYGNUM_HAL_INTERRUPT_21 21
#define CYGNUM_HAL_INTERRUPT_20 20
#define CYG_HAL_PRI_HIGH 0

static cyg_interrupt btuart_interrupt_new_object;
static cyg_handle_t btuart_interrupt_handle;
static cyg_vector_t btuart_interrupt_vector = CYGNUM_HAL_INTERRUPT_21;
static cyg_priority_t btuart_interrupt_priority = CYG_HAL_PRI_HIGH;

unsigned int isr_rx_count = 0;

cyg_uint32 btuart_interrupt_isr(cyg_vector_t vector, cyg_addrword_t data)
{
    unsigned int iir, lsr, rbr;

    iir = PXA255_BTIIR;

    // check if RX FIFO interrupt
    if ((iir & PXA255_IIR_IID_INT_ID_MASK) == PXA255_IIR_IID_RX_FIFO_INT_PENDING)
    {
        lsr = PXA255_BTLSR;
        while (lsr & PXA255_LSR_DR_DATA_READY)
        {
            rbr = PXA255_BTRBR;
            isr_rx_count++;
            lsr = PXA255_BTLSR;
        }
    }

    cyg_interrupt_acknowledge(vector);
    return(CYG_ISR_HANDLED);
}

void serial_port_start(void)
{
    // GPIO and UART inits here...

    cyg_interrupt_create(
        btuart_interrupt_vector,
        btuart_interrupt_priority,
        0,
        &btuart_interrupt_isr,
        0,
        &btuart_interrupt_handle,
        &btuart_interrupt_new_object);
    cyg_interrupt_attach(btuart_interrupt_handle);
    cyg_interrupt_unmask(btuart_interrupt_vector);

    // UART interrupt enabled here...
}

Joe Porthouse
Toptech Systems, Inc.
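The +1/-1 confusion Joe mentions comes from how the comparison is encoded. The value -1 does not fit an ARM CMP immediate, so the assembler emits the equivalent `CMN R4,#1` (v1 is the APCS name for r4), which sets the flags on r4 + 1. The Z flag is therefore set exactly when r4 holds -1. A minimal C model of that equivalence, where the helper names are hypothetical and only the Z flag is modelled:

```c
#include <assert.h>
#include <stdint.h>

/* Value the thread reports for "no interrupt source found". */
#define CYGNUM_HAL_INTERRUPT_NONE (-1)

/* Hypothetical model of the Z flag after `CMN Rn, #imm`:
 * CMN sets flags on Rn + imm, so Z is set when the 32-bit sum wraps to 0. */
int cmn_sets_z(uint32_t rn, uint32_t imm)
{
    return (uint32_t)(rn + imm) == 0u;
}

/* Hypothetical model of the Z flag after `CMP Rn, #value`. */
int cmp_sets_z(uint32_t rn, int32_t value)
{
    return rn == (uint32_t)value;
}

/* `beq 17f` is taken when Z is set, i.e. when v1 holds the NONE marker. */
int branch_taken(uint32_t v1)
{
    int z = cmn_sets_z(v1, 1u);
    /* Both encodings must agree for every register value. */
    assert(z == cmp_sets_z(v1, CYGNUM_HAL_INTERRUPT_NONE));
    return z;
}
```

With v1 holding a real vector such as 21 the branch is not taken; only the -1 marker takes it, which is why the `#00000001` in the disassembly does not mean the code compares against +1.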
* Re: [ECOS] Re: DSR stops running after heavy interrupts. Spurious Interrupt! 2006-04-10 16:41 ` [ECOS] Re: DSR stops running after heavy interrupts. Spurious Interrupt! Joe Porthouse @ 2006-04-10 17:20 ` Nick Garnett 2006-04-10 17:44 ` Andrew Lunn ` (2 more replies) 0 siblings, 3 replies; 25+ messages in thread From: Nick Garnett @ 2006-04-10 17:20 UTC (permalink / raw) To: jporthouse; +Cc: ecos-discuss "Joe Porthouse" <jporthouse@toptech.com> writes: > Nick, > Thanks for you reply. > > Your right. Not calling the interrupt_end() routine would cause the lock > not to be released. After your comment I started looking closer at my > modification. > > My original modification of: > /hal/arm/arch/current/src/vectors.S file at line 951. > cmp v1,#CYGNUM_HAL_INTERRUPT_NONE <-- from this > cmp r0,#CYGNUM_HAL_INTERRUPT_NONE <-- to this > > v1 originally contained the interrupt vector. But I mistakenly believed > this was the check of the return value from the ISR. I modified it to look > at r0, the return value from the isr. The return value from the isr will be > 0-3 (really 1 or 3). The CYGNUM_HAL_INTERRUPT_NONE is -1! (for some reason > I thought it was +1 looking at the assembly listing) > > Bottom line, my modification made sure that interrupt_end() would always be > called, even when v1 == CYGNUM_HAL_INTERRUPT_NONE (spurious interrupt). > > I just did a quick test with the original code and verified that when a > spurious interrupt occurs, the interrupt_end() routine is not called and the > lock is not released and my problem occurs. > > Calling the interrupt_end() routine with a spurious interrupt did not seem > to break anything. This all makes sense. > Was there a reason why interrupt_end() should not be > called on spurious interrupts? I guess it was an attempt to avoid doing more than the absolute minimum on spurious interrupts. It looks like there is a bug in there, since the scheduler lock doesn't get decremented. 
In general, spurious interrupts shouldn't happen, which is why it has managed to lurk here for so long. > > Now to figure out why I am getting a spurious interrupt with the simple UART > code listed below? > > What should I look for in attempting to eliminate spurious interrupts? Can > they be eliminated? The CYGNUM_HAL_INTERRUPT_NONE return from hal_IRQ_handler() only happens when an interrupt occurs but the interrupt controller denies all knowledge of it. One possibility is that hal_IRQ_handler() is decoding a real interrupt wrongly and generating -1 by mistake. What you need to do is find out why hal_IRQ_handler() is returning this value. If you can put a breakpoint in hal_IRQ_handler() where CYGNUM_HAL_INTERRUPT_NONE is returned, then you should be able to look at all the relevant device and interrupt controller registers and find out what is going on. Also, enable assertions, it might tell you something. -- Nick Garnett eCos Kernel Architect http://www.ecoscentric.com The eCos and RedBoot experts -- Before posting, please read the FAQ: http://ecos.sourceware.org/fom/ecos and search the list archive: http://ecos.sourceware.org/ml/ecos-discuss ^ permalink raw reply [flat|nested] 25+ messages in thread
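The bug described in this exchange can be modelled in a few lines of C. This is only a sketch of the control flow as the thread describes it, not the real vectors.S or kernel code: `irq_vsr()`, `dsrs_run_after()` and the counters are hypothetical stand-ins, and the only behaviour modelled is that the lock is taken before the vector is checked and that DSRs run when the lock is about to drop to zero.

```c
#include <assert.h>

#define CYGNUM_HAL_INTERRUPT_NONE (-1)
#define CYG_ISR_HANDLED  1
#define CYG_ISR_CALL_DSR 2

int sched_lock;   /* models the scheduler lock count        */
int dsr_pending;  /* DSRs posted but not yet run            */
int dsr_runs;     /* DSRs that have actually been executed  */

/* Simplified stand-in for interrupt_end(): post the DSR if requested,
 * then release the lock, running pending DSRs when it reaches zero. */
void interrupt_end(int isr_ret, int have_intr)
{
    assert(sched_lock > 0);
    if ((isr_ret & CYG_ISR_CALL_DSR) && have_intr)
        dsr_pending++;
    if (sched_lock == 1) {
        dsr_runs += dsr_pending;
        dsr_pending = 0;
    }
    sched_lock--;
}

/* One pass through the IRQ path. skip_on_spurious selects the original
 * behaviour (branch past interrupt_end when no vector was decoded). */
void irq_vsr(int vector, int isr_ret, int skip_on_spurious)
{
    sched_lock++;                      /* taken before the ISR is called */
    if (vector == CYGNUM_HAL_INTERRUPT_NONE) {
        if (skip_on_spurious)
            return;                    /* lock is never released */
        interrupt_end(0, 0);           /* workaround: still unwind the lock */
        return;
    }
    interrupt_end(isr_ret, 1);
}

/* Optionally take one spurious interrupt, then three clock ticks whose
 * ISR asks for its DSR. Returns how many DSRs ran. */
int dsrs_run_after(int spurious_first, int skip_on_spurious)
{
    int i;
    sched_lock = dsr_pending = dsr_runs = 0;
    if (spurious_first)
        irq_vsr(CYGNUM_HAL_INTERRUPT_NONE, 0, skip_on_spurious);
    for (i = 0; i < 3; i++)            /* vector 5 is arbitrary */
        irq_vsr(5, CYG_ISR_HANDLED | CYG_ISR_CALL_DSR, skip_on_spurious);
    return dsr_runs;
}
```

In this model a single skipped interrupt_end() leaves the lock at 1 forever, so every later DSR is posted but never run, which matches the symptom of the clock ISR still firing while its DSR stays silent.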
* Re: [ECOS] Re: DSR stops running after heavy interrupts. Spurious Interrupt!
From: Andrew Lunn @ 2006-04-10 17:44 UTC
To: Nick Garnett; +Cc: jporthouse, ecos-discuss

> > What should I look for in attempting to eliminate spurious interrupts? Can
> > they be eliminated?
>
> The CYGNUM_HAL_INTERRUPT_NONE return from hal_IRQ_handler() only
> happens when an interrupt occurs but the interrupt controller denies
> all knowledge of it. One possibility is that hal_IRQ_handler() is
> decoding a real interrupt wrongly and generating -1 by mistake.
>
> What you need to do is find out why hal_IRQ_handler() is returning
> this value. If you can put a breakpoint in hal_IRQ_handler()
> where CYGNUM_HAL_INTERRUPT_NONE is returned, then you should be able
> to look at all the relevant device and interrupt controller registers
> and find out what is going on.

Also check that you have level vs edge trigger correct for your hardware.

It could be a hardware error, e.g. a pulse that is too short, a floating interrupt signal, some device which does not get reset when the processor does, etc.

Andrew

-- Before posting, please read the FAQ: http://ecos.sourceware.org/fom/ecos and search the list archive: http://ecos.sourceware.org/ml/ecos-discuss
* RE: [ECOS] Re: DSR stops running after heavy interrupts. Spurious Interrupt!
From: Joe Porthouse @ 2006-04-10 20:49 UTC
To: ecos-discuss

OK, found the source of the spurious interrupts (you're really going to love this one).

While testing to determine why I was receiving occasional spurious interrupts, I noticed that the PXA2X0_ICIP (IRQ Interrupt Pending) register was completely clear, even though the processor had just jumped to the IRQ vector. I checked the interrupt level register and even turned off all other interrupts in the system, but the problem still occurred. It was as if something was clearing the interrupt before the IRQ vector jump occurred, or some intermediate IRQ hardware flag was not getting cleared when PXA2X0_ICIP was cleared from the last interrupt. I was really scratching my head at this point.

My target is an Intel PXA255 XScale processor and I'm using the three built-in UARTs. One of the events on which the built-in UART can generate an interrupt is a "Character Timeout". This event is defined as at least one character being in the receive FIFO and no data having been received for four character times.

This interrupt is cleared if you read from the receive FIFO, set the FCR[RESETRF] bit, or A NEW START BIT IS RECEIVED!!!

So if the RX FIFO is below the trigger point and a timeout occurs, an IRQ request is generated, but if a new start bit is then detected the IRQ request is immediately cleared. :(

Wow, an interrupt that can clear its own IRQ request before service occurs!!! That would surely cause a spurious interrupt.

If my conclusions are correct and I don't want characters to hang out in my RX FIFO, I will need to do one of the following:
#1. Stop using the UART FIFO.
#2. Poll the FIFO for trailing characters.
#3. Live with the spurious interrupts as a processor UART design issue.

I will probably follow through with #3 by commenting out lines 951 and 952 in the /hal/arm/arch/current/src/vectors.S file.

Joe Porthouse
Toptech Systems, Inc.
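Option #2 from Joe's list (poll the FIFO for trailing characters) could look roughly like the sketch below. It assumes the character-timeout interrupt is left disabled, so characters that never reach the FIFO trigger level are picked up by a periodic poll instead. Everything here is hypothetical: `struct mock_uart` replaces the memory-mapped registers with an in-memory FIFO so the logic can run on a host, and the `LSR_DR` bit position follows the common 16550 convention rather than anything confirmed in the thread.

```c
#include <assert.h>
#include <stddef.h>

#define LSR_DR 0x01u  /* data-ready bit; position assumed (16550 convention) */

/* Hypothetical stand-in for the UART receive path. */
struct mock_uart {
    unsigned char fifo[16];
    size_t head, count;
};

/* Models reading the line status register. */
unsigned mock_read_lsr(const struct mock_uart *u)
{
    return u->count ? LSR_DR : 0u;
}

/* Models reading the receive buffer register (pops one byte). */
unsigned char mock_read_rbr(struct mock_uart *u)
{
    unsigned char c;
    assert(u->count > 0);
    c = u->fifo[u->head];
    u->head = (u->head + 1) % sizeof u->fifo;
    u->count--;
    return c;
}

/* Models a character arriving on the wire; dropped if the FIFO is full. */
void mock_receive(struct mock_uart *u, unsigned char c)
{
    if (u->count < sizeof u->fifo) {
        u->fifo[(u->head + u->count) % sizeof u->fifo] = c;
        u->count++;
    }
}

/* Drain whatever is in the FIFO and return the number of bytes read.
 * The same routine can serve the RX-trigger ISR and a periodic poll. */
size_t uart_drain(struct mock_uart *u, unsigned char *out, size_t max)
{
    size_t n = 0;
    while (n < max && (mock_read_lsr(u) & LSR_DR))
        out[n++] = mock_read_rbr(u);
    return n;
}
```

The trade-off is latency: trailing characters wait for the next poll rather than for four character times, so the poll period has to suit the protocol running over the link.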
* [ECOS] Re: DSR stops running after heavy interrupts. Spurious Interrupt! 2006-04-10 20:49 ` Joe Porthouse @ 2006-04-11 4:07 ` Sergei Organov 2006-04-11 8:31 ` Nick Garnett 1 sibling, 0 replies; 25+ messages in thread From: Sergei Organov @ 2006-04-11 4:07 UTC (permalink / raw) To: ecos-discuss "Joe Porthouse" <jporthouse@toptech.com> writes: > Ok, found the source of the Spurious Interrupts, (your really going to love > this one). [...] > Clearing this interrupt occurs if you read from the Receiver FIFO, set the > FCR[RESETRF] bit or A NEW START BIT IS RECEIVED!!! > > So if the RX FIFO is below the trigger point and a timeout occurs an IRQ > request is generated, but if a new start bit is detected the IRQ request is > then immediately cleared. :( > > Wow an interrupt that can clear its own IRQ request before service occurs!!! > That would surely cause a Spurious Interrupt. Sounds like yet another piece of broken hardware design from Intel, -- they failed to deliver reasonable RS232 implementation at the early days of PC, and still fail to do it right in 20 years :( -- Sergei. -- Before posting, please read the FAQ: http://ecos.sourceware.org/fom/ecos and search the list archive: http://ecos.sourceware.org/ml/ecos-discuss ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [ECOS] Re: DSR stops running after heavy interrupts. Spurious Interrupt! 2006-04-10 20:49 ` Joe Porthouse 2006-04-11 4:07 ` Sergei Organov @ 2006-04-11 8:31 ` Nick Garnett 1 sibling, 0 replies; 25+ messages in thread From: Nick Garnett @ 2006-04-11 8:31 UTC (permalink / raw) To: jporthouse; +Cc: ecos-discuss "Joe Porthouse" <jporthouse@toptech.com> writes: > Clearing this interrupt occurs if you read from the Receiver FIFO, set the > FCR[RESETRF] bit or A NEW START BIT IS RECEIVED!!! > > So if the RX FIFO is below the trigger point and a timeout occurs an IRQ > request is generated, but if a new start bit is detected the IRQ request is > then immediately cleared. :( That certainly sounds like the cause of your problems. It sounds like there is no way to fix it without disabling the timeout interrupt. I'm not sure whether it is a bug in the UART for cancelling an interrupt it has raised, or a bug in the interrupt controller for not latching interrupt requests. At least it seems to be a fairly narrow race window, so isn't going to interfere with performance too much. > > Wow an interrupt that can clear its own IRQ request before service occurs!!! > That would surely cause a Spurious Interrupt. > > If my conclusions are correct and I want don't want characters to hang out > in my RX FIFO I will either need to: > #1. Stop using the UART FIFO. > #2. Poll the FIFO for trailing characters. > #3. Live with the Spurious Interrupts as a processor UART design issue. > > I will probably follow through with #3 by commenting out lien 951 and 952 in > the /hal/arm/arch/current/src/vectors.S file. > That sounds like the best approach. I guess we ought to take a look at making this a permanent feature. 
-- Nick Garnett eCos Kernel Architect http://www.ecoscentric.com The eCos and RedBoot experts
* [ECOS] Re: DSR stops running after heavy interrupts. Spurious Interrupt! 2006-04-10 17:20 ` Nick Garnett 2006-04-10 17:44 ` Andrew Lunn 2006-04-10 20:49 ` Joe Porthouse @ 2006-04-11 4:15 ` Sergei Organov 2006-04-11 8:43 ` Nick Garnett 2 siblings, 1 reply; 25+ messages in thread From: Sergei Organov @ 2006-04-11 4:15 UTC (permalink / raw) To: ecos-discuss Nick Garnett <nickg@ecoscentric.com> writes: > "Joe Porthouse" <jporthouse@toptech.com> writes: [...] >> Was there a reason why interrupt_end() should not be >> called on spurious interrupts? > > I guess it was an attempt to avoid doing more than the absolute > minimum on spurious interrupts. It looks like there is a bug in there, > since the scheduler lock doesn't get decremented. In general, spurious > interrupts shouldn't happen, which is why it has managed to lurk here > for so long. Well, I think the right question here is why scheduler lock is incremented at all? I mean if SMP implementations happen to increment it inside the interrupt_end(), then it should be safe for ARM HAL to increment it just before calling interrupt_end(), isn't it? This way spurious interrupt handling code will avoid both scheduler lock increment and interrupt_end() call. Makes sense? -- Sergei. -- Before posting, please read the FAQ: http://ecos.sourceware.org/fom/ecos and search the list archive: http://ecos.sourceware.org/ml/ecos-discuss ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [ECOS] Re: DSR stops running after heavy interrupts. Spurious Interrupt! 2006-04-11 4:15 ` Sergei Organov @ 2006-04-11 8:43 ` Nick Garnett 0 siblings, 0 replies; 25+ messages in thread From: Nick Garnett @ 2006-04-11 8:43 UTC (permalink / raw) To: Sergei Organov; +Cc: ecos-discuss Sergei Organov <osv@javad.com> writes: > Nick Garnett <nickg@ecoscentric.com> writes: > > "Joe Porthouse" <jporthouse@toptech.com> writes: > [...] > > >> Was there a reason why interrupt_end() should not be > >> called on spurious interrupts? > > > > I guess it was an attempt to avoid doing more than the absolute > > minimum on spurious interrupts. It looks like there is a bug in there, > > since the scheduler lock doesn't get decremented. In general, spurious > > interrupts shouldn't happen, which is why it has managed to lurk here > > for so long. > > Well, I think the right question here is why scheduler lock is > incremented at all? I mean if SMP implementations happen to increment it > inside the interrupt_end(), then it should be safe for ARM HAL to > increment it just before calling interrupt_end(), isn't it? This way > spurious interrupt handling code will avoid both scheduler lock > increment and interrupt_end() call. Makes sense? The scheduler lock has several duties. As well as disabling thread suspension and controlling when DSRs are called, it also does duty as an interrupt nesting counter. We only want DSRs to be called when all nested interrupts have been unwound and we are about to return from the first one. The scheduler lock count does this implicitly. But for this to work properly, the scheduler lock must be incremented in the VSR before interrupts are re-enabled and the ISR is called. In the SMP case I decided, at least initially, that nested interrupts would not be supported. It was hard enough keeping track of interrupts going off on different CPUs. 
This allowed me to move the lock operation into interrupt_end(), and avoided having to write any asm code to go into the VSR. SMP is really still in its development phase, and there are a number of things in there that are a little experimental. Moving the scheduler locking to interrupt_end() was one of them. I certainly would not want to do that for any other configuration. -- Nick Garnett eCos Kernel Architect http://www.ecoscentric.com The eCos and RedBoot experts
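Nick's point that the scheduler lock doubles as an interrupt nesting counter can be illustrated with a small host-side model. This is a sketch under stated assumptions, not kernel code: `vsr_enter()`, `nested_scenario()` and the counters are hypothetical, and the real unlock path also handles rescheduling, which is left out here.

```c
#include <assert.h>

int sched_lock;     /* scheduler lock, doubling as nesting depth */
int dsr_posted;     /* DSRs waiting to run                       */
int dsr_run_count;  /* DSRs that have run                        */

/* The VSR bumps the lock before interrupts are re-enabled and the
 * ISR is called, so a nested interrupt sees a count of at least 1. */
void vsr_enter(void)
{
    sched_lock++;
}

/* Simplified interrupt_end(): post a DSR if asked, then unwind one
 * level. Pending DSRs run only when this is the outermost interrupt. */
void interrupt_end(int call_dsr)
{
    assert(sched_lock > 0);
    if (call_dsr)
        dsr_posted++;
    if (sched_lock == 1) {
        dsr_run_count += dsr_posted;
        dsr_posted = 0;
    }
    sched_lock--;
}

/* An interrupt is itself interrupted before it finishes. Returns the
 * number of DSRs that had run when the inner interrupt returned. */
int nested_scenario(void)
{
    int ran_after_inner;
    sched_lock = dsr_posted = dsr_run_count = 0;
    vsr_enter();               /* outer interrupt: lock 0 -> 1            */
    vsr_enter();               /* nested interrupt: lock 1 -> 2           */
    interrupt_end(1);          /* inner posts a DSR; lock 2 -> 1, deferred */
    ran_after_inner = dsr_run_count;
    interrupt_end(1);          /* outer posts a DSR; both run, lock -> 0   */
    return ran_after_inner;
}
```

If the increment were moved into interrupt_end(), the nested interrupt in this scenario would see a count of zero on entry and could run DSRs while the outer ISR is still in progress, which is the reason given above for keeping the increment in the VSR on configurations that allow nesting.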
* [ECOS] How to use the ARM directive DCB in Vectors.S
From: Birahim Larou Fall @ 2006-04-13 7:58 UTC
To: ecos-discuss

I have modified the source file vectors.S for the ARM architecture, and I can't compile this file because DCB is seen as a bad instruction:

packages/hal/arm/arch/current/src/vectors.S:517: Error: bad instruction `c_string DCB "C_string",0'

How do I tell eCos to support ARM directives (DCD, DCB, ...)?

Thanks!

Fall Birahim

-- Before posting, please read the FAQ: http://ecos.sourceware.org/fom/ecos and search the list archive: http://ecos.sourceware.org/ml/ecos-discuss
* Re: [ECOS] How to use the ARM directive DCB in Vectors.S
From: Andrew Lunn @ 2006-04-13 13:28 UTC
To: Birahim Larou Fall; +Cc: ecos-discuss

On Thu, Apr 13, 2006 at 09:53:30AM +0200, Birahim Larou Fall wrote:
> I have modified the source file vectors.S for the ARM architecture, and I can't
> compile this file because DCB is seen as a bad instruction.
> packages/hal/arm/arch/current/src/vectors.S:517: Error: bad instruction
> `c_string DCB "C_string",0'
> How do I tell eCos to support ARM directives (DCD, DCB, ...)?

The problem is that DCD and DCB are directives for ARM's assembler. eCos uses gas, so you need to use the gas equivalents. I suggest you read the gas documentation.

Andrew

-- Before posting, please read the FAQ: http://ecos.sourceware.org/fom/ecos and search the list archive: http://ecos.sourceware.org/ml/ecos-discuss
* Re: [ECOS] How to use the ARM directive DCB in Vectors.S
From: Birahim Larou Fall @ 2006-04-13 13:32 UTC
To: ecos-discuss

Thanks, Andrew. Where can I find the gas documentation?

Fall Birahim

-- Before posting, please read the FAQ: http://ecos.sourceware.org/fom/ecos and search the list archive: http://ecos.sourceware.org/ml/ecos-discuss
* [ECOS] Re: How to use the ARM directive DCB in Vectors.S
From: Daniel Néri @ 2006-04-21 7:40 UTC
To: ecos-discuss

Birahim Larou Fall <BLFall@scmmicro.fr> writes:

> Thanks, Andrew. Where can I find the gas documentation?

gas is part of the GNU binutils tool collection:

    http://sourceware.org/binutils/

Regards,
--
Daniel Néri <daniel.neri@sigicom.se>
Sigicom AB, Stockholm, Sweden
end of thread, other threads:[~2006-04-21 7:40 UTC | newest]

Thread overview: 25+ messages
2006-04-05 21:09 [ECOS] DSR stops running after heavy interrupts Joe Porthouse
2006-04-06  6:49 ` Andrew Lunn
2006-04-06  9:02 ` Stefan Sommerfeld
2006-04-06 21:09 ` Joe Porthouse
2006-04-06 21:19 ` Andrew Lunn
2006-04-08  4:18 ` [ECOS] DSR stops running after heavy interrupts. Bug found? Joe Porthouse
2006-04-09 12:33 ` Andrew Lunn
2006-04-10  4:50 ` [ECOS] " Sergei Organov
2006-04-10  9:36 ` Nick Garnett
2006-04-10 10:44 ` Sergei Organov
2006-04-10 10:59 ` Nick Garnett
2006-04-10 11:15 ` Sergei Organov
2006-04-10 13:20 ` Joe Porthouse
2006-04-10 16:41 ` [ECOS] Re: DSR stops running after heavy interrupts. Spurious Interrupt! Joe Porthouse
2006-04-10 17:20 ` Nick Garnett
2006-04-10 17:44 ` Andrew Lunn
2006-04-10 20:49 ` Joe Porthouse
2006-04-11  4:07 ` Sergei Organov
2006-04-11  8:31 ` Nick Garnett
2006-04-11  4:15 ` Sergei Organov
2006-04-11  8:43 ` Nick Garnett
2006-04-13  7:58 ` [ECOS] How to use the ARM directive DCB in Vectors.S Birahim Larou Fall
2006-04-13 13:28 ` Andrew Lunn
2006-04-13 13:32 ` Birahim Larou Fall
2006-04-21  7:40 ` [ECOS] " Daniel Néri