From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (qmail 22263 invoked by alias); 22 Apr 2011 02:51:30 -0000
Received: (qmail 22254 invoked by uid 22791); 22 Apr 2011 02:51:29 -0000
X-SWARE-Spam-Status: No, hits=-2.0 required=5.0 tests=AWL,BAYES_00,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FROM,RCVD_IN_DNSWL_LOW,RFC_ABUSE_POST,T_TO_NO_BRKTS_FREEMAIL
X-Spam-Check-By: sourceware.org
Received: from mail-vw0-f41.google.com (HELO mail-vw0-f41.google.com) (209.85.212.41) by sourceware.org (qpsmtpd/0.43rc1) with ESMTP; Fri, 22 Apr 2011 02:51:14 +0000
Received: by vws4 with SMTP id 4so311540vws.0 for ; Thu, 21 Apr 2011 19:51:13 -0700 (PDT)
MIME-Version: 1.0
Received: by 10.52.187.194 with SMTP id fu2mr873294vdc.258.1303440673482; Thu, 21 Apr 2011 19:51:13 -0700 (PDT)
Received: by 10.52.168.33 with HTTP; Thu, 21 Apr 2011 19:51:13 -0700 (PDT)
Date: Fri, 22 Apr 2011 02:51:00 -0000
Message-ID:
Subject: [ECOS] Miss calling ASR in sched.cxx
From: kiron
To: ecos-devel@sourceware.org
Content-Type: text/plain; charset=ISO-8859-1
X-IsSubscribed: yes
Mailing-List: contact ecos-devel-help@ecos.sourceware.org; run by ezmlm
Precedence: bulk
List-Id:
List-Subscribe:
List-Post:
List-Help: ,
Sender: ecos-devel-owner@ecos.sourceware.org
X-SW-Source: 2011-04/txt/msg00008.txt.bz2

Hi All,

I am new to this mailing list. If I get something wrong, please correct me.

I am debugging my application on an MPC8313 platform. It has several
pthreads, one of which uses the POSIX timer
(packages/compat/posix/current/src/time.cxx) to drive a state machine.
After the pthread had been running for a few minutes, the timer was
disarmed. I traced the code path and found the following code in
Cyg_Scheduler::unlock_inner in packages/kernel/current/src/sched/sched.cxx:

-------------------------------------------------------------------------------------------------------------------------------------------
    do {
        CYG_PRECONDITION( new_lock==0 ?
                          get_sched_lock() == 1 :
                          ((get_sched_lock() == new_lock) || (get_sched_lock() == new_lock+1)),
                          "sched_lock not at expected value" );

#ifdef CYGIMP_KERNEL_INTERRUPTS_DSRS
        // Call any pending DSRs. Do this here to ensure that any
        // threads that get awakened are properly scheduled.

        if( new_lock == 0 && Cyg_Interrupt::DSRs_pending() )
            Cyg_Interrupt::call_pending_DSRs();
#endif

        Cyg_Thread *current = get_current_thread();

        CYG_ASSERTCLASS( current, "Bad current thread" );

#ifdef CYGFUN_KERNEL_ALL_THREADS_STACK_CHECKING
        // should have CYGVAR_KERNEL_THREADS_LIST
        current = Cyg_Thread::get_list_head();
        while ( current ) {
            current->check_stack();
            current = current->get_list_next();
        }
        current = get_current_thread();
#endif

#ifdef CYGFUN_KERNEL_THREADS_STACK_CHECKING
        current->check_stack();
#endif

        // If the current thread is going to sleep, or someone
        // wants a reschedule, choose another thread to run

        if( current->state != Cyg_Thread::RUNNING || get_need_reschedule() ) {

            CYG_INSTRUMENT_SCHED(RESCHEDULE,0,0);

            // Get the next thread to run from scheduler
            Cyg_Thread *next = scheduler.schedule();

            CYG_CHECK_DATA_PTR( next, "Invalid next thread pointer");
            CYG_ASSERTCLASS( next, "Bad next thread" );

            if( current != next )
            {

                CYG_INSTRUMENT_THREAD(SWITCH,current,next);

                // Count this thread switch
                thread_switches[CYG_KERNEL_CPU_THIS()]++;

#ifdef CYGFUN_KERNEL_THREADS_STACK_CHECKING
                next->check_stack(); // before running it
#endif
                current->timeslice_save();

                // Switch contexts
                HAL_THREAD_SWITCH_CONTEXT( &current->stack_ptr,
                                           &next->stack_ptr );

                // Worry here about possible compiler
                // optimizations across the above call that may try to
                // propogate common subexpresions. We would end up
                // with the expression from one thread in its
                // successor. This is only a worry if we do not save
                // and restore the complete register set. We need a
                // way of marking functions that return into a
                // different context. A temporary fix would be to
                // disable CSE (-fdisable-cse) in the compiler.

                // We return here only when the current thread is
                // rescheduled. There is a bit of housekeeping to do
                // here before we are allowed to go on our way.

                CYG_CHECK_DATA_PTR( current, "Invalid current thread pointer");
                CYG_ASSERTCLASS( current, "Bad current thread" );

                current_thread[CYG_KERNEL_CPU_THIS()] = current; // restore current thread pointer

                current->timeslice_restore();
            }

            clear_need_reschedule(); // finished rescheduling
        }

        if( new_lock == 0 )
        {

#ifdef CYGSEM_KERNEL_SCHED_ASR_SUPPORT

            // Check whether the ASR is pending and not inhibited. If
            // we can call it, then transfer this info to a local
            // variable (call_asr) and clear the pending flag. Note
            // that we only do this if the scheduler lock is about to
            // be zeroed. In any other circumstance we are not
            // unlocking.

            cyg_bool call_asr = false;

            if( (current->asr_inhibit == 0) && current->asr_pending )
            {
                call_asr = true;
                current->asr_pending = false;
            }
#endif

            HAL_REORDER_BARRIER(); // Make sure everything above has happened
                                   // by this point
            zero_sched_lock();     // Clear the lock
            HAL_REORDER_BARRIER();

#ifdef CYGIMP_KERNEL_INTERRUPTS_DSRS

            // Now check whether any DSRs got posted during the thread
            // switch and if so, go around again. Making this test after
            // the lock has been zeroed avoids a race condition in which
            // a DSR could have been posted during a reschedule, but would
            // not be run until the _next_ time we release the sched lock.

            if( Cyg_Interrupt::DSRs_pending() ) {
                inc_sched_lock();   // reclaim the lock
                continue;           // go back to head of loop
            }

#endif
            // Otherwise the lock is zero, we can return.

//            CYG_POSTCONDITION( get_sched_lock() == 0, "sched_lock not zero" );

#ifdef CYGSEM_KERNEL_SCHED_ASR_SUPPORT
            // If the test within the sched_lock indicating that the ASR
            // be called was true, call it here. Calling the ASR must be
            // the very last thing we do here, since it must run as close
            // to "user" state as possible.

            if( call_asr )
                current->asr(current->asr_data);
#endif

        }
        else
        {
            // If new_lock is non-zero then we restore the sched_lock to
            // the value given.
            HAL_REORDER_BARRIER();

            set_sched_lock(new_lock);

            HAL_REORDER_BARRIER();
        }

#ifdef CYGDBG_KERNEL_TRACE_UNLOCK_INNER
        CYG_REPORT_RETURN();
#endif
        return;

    } while( 1 );
-------------------------------------------------------------------------------------------------------------------------------------------

When both CYGSEM_KERNEL_SCHED_ASR_SUPPORT and CYGIMP_KERNEL_INTERRUPTS_DSRS
are enabled, consider the case where the local variable call_asr has been
set to true and the current thread's asr_pending flag has been cleared, but
DSRs are pending. The code then takes the "continue" and goes around the
do/while loop again, call_asr is re-initialized to false, and because
asr_pending is already clear the ASR is never called.

The POSIX timer uses the ASR to deliver signals, so if the ASR call is
missed the signal is not delivered. Unfortunately, the POSIX time subsystem
does not try to set the asr_pending flag again (see alarm_action() in
packages/compat/posix/current/src/time.cxx), so the timer stays disarmed
forever (if it is an interval timer).

Here is a small patch to fix this.

Huang Yi

------------------------
Index: packages/kernel/current/src/sched/sched.cxx
===================================================================
RCS file: /cvs/ecos/ecos/packages/kernel/current/src/sched/sched.cxx,v
retrieving revision 1.20
diff -u -8 -p -r1.20 sched.cxx
--- packages/kernel/current/src/sched/sched.cxx 29 Jan 2009 17:49:50 -0000 1.20
+++ packages/kernel/current/src/sched/sched.cxx 22 Apr 2011 02:42:06 -0000
@@ -131,16 +131,20 @@ inline void *operator new(size_t size, v
 // have when it reschedules this thread back, and leaves this function.
 // When it is non-zero, and the thread is rescheduled, no ASRS are run,
 // or DSRs processed. By doing this, it makes it possible for threads
 // that want to go to sleep to wake up with the scheduler lock in the
 // same state it was in before.

 void Cyg_Scheduler::unlock_inner( cyg_ucount32 new_lock )
 {
+#ifdef CYGSEM_KERNEL_SCHED_ASR_SUPPORT
+    cyg_bool call_asr = false;
+#endif
+
 #ifdef CYGDBG_KERNEL_TRACE_UNLOCK_INNER
     CYG_REPORT_FUNCTION();
 #endif

     do {

         CYG_PRECONDITION( new_lock==0 ? get_sched_lock() == 1 :
                           ((get_sched_lock() == new_lock) || (get_sched_lock() == new_lock+1)),
@@ -229,23 +233,21 @@ void Cyg_Scheduler::unlock_inner( cyg_uc
         }

         if( new_lock == 0 )
         {

 #ifdef CYGSEM_KERNEL_SCHED_ASR_SUPPORT

             // Check whether the ASR is pending and not inhibited. If
-            // we can call it, then transfer this info to a local
+            // we can call it, then transfer this info to a
             // variable (call_asr) and clear the pending flag. Note
             // that we only do this if the scheduler lock is about to
             // be zeroed. In any other circumstance we are not
             // unlocking.
-
-            cyg_bool call_asr = false;

             if( (current->asr_inhibit == 0) && current->asr_pending )
             {
                 call_asr = true;
                 current->asr_pending = false;
             }
 #endif
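
For anyone who wants to see the failure mode in isolation, here is a minimal stand-alone sketch of the loop structure described above. It is not eCos code: FakeThread, unlock_buggy, unlock_fixed and the dsr_rounds parameter are hypothetical names made up for illustration, and "pending DSRs" are modelled as a simple counter. The only thing it is meant to show is that a flag declared inside the loop body is lost on "continue", while the same flag declared before the loop survives.

```cpp
#include <cassert>

// Hypothetical stand-in for the ASR-related state of a thread.
// This is NOT the real Cyg_Thread class.
struct FakeThread {
    bool asr_pending = true;  // an ASR has been posted for this thread
    int  asr_calls   = 0;     // how many times the "ASR" actually ran
};

// Same shape as the original code: call_asr lives inside the loop body,
// so each trip around the loop starts again from false.
int unlock_buggy(FakeThread &t, int dsr_rounds) {
    do {
        bool call_asr = false;
        if (t.asr_pending) {
            call_asr = true;
            t.asr_pending = false;   // pending flag is consumed here
        }
        if (dsr_rounds > 0) {        // models Cyg_Interrupt::DSRs_pending()
            --dsr_rounds;
            continue;                // call_asr is discarded
        }
        if (call_asr)
            ++t.asr_calls;           // models current->asr(...)
        return t.asr_calls;
    } while (true);
}

// Same shape as the patched code: call_asr is declared once, before the
// loop, so it keeps its value across "continue".
int unlock_fixed(FakeThread &t, int dsr_rounds) {
    bool call_asr = false;
    do {
        if (t.asr_pending) {
            call_asr = true;
            t.asr_pending = false;
        }
        if (dsr_rounds > 0) {
            --dsr_rounds;
            continue;                // call_asr is preserved
        }
        if (call_asr)
            ++t.asr_calls;
        return t.asr_calls;
    } while (true);
}
```

With one simulated DSR round, unlock_buggy returns 0 (the ASR is never run) and unlock_fixed returns 1. With no DSR round, both return 1, which matches the observation that the timer only gets lost occasionally.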