public inbox for ecos-devel@sourceware.org
* serious bug in synchronisation primitives
@ 2004-10-27 13:35 sandeep
  2004-10-27 13:44 ` [SMP] " sandeep
  0 siblings, 1 reply; 2+ messages in thread
From: sandeep @ 2004-10-27 13:35 UTC (permalink / raw)
  To: ecos-devel

While going through the execution logs of an assert
build, I found messages complaining 'Locking mutex I
already own' and 'Unlock mutex I do not own'. On
further analysis the source of the problem turned out
to lie in situations like the following (explained
with respect to Cyg_Mutex::lock) --

self()/get_current_thread() reads its value from
current_thread[CYG_KERNEL_CPU_THIS()].
If the thread gets switched out in the middle of this
indexing, the result is wrong.

Consider a thread executing on processor 0 when it
makes the mutex lock call. It has already fetched the
current CPU number (the index into the array) when its
timeslice expires and it gets switched out. The next
time it gets a chance to run, it is on processor 1,
and it continues from where it left off. The id it
obtains for self is then not its own, but that of the
thread now running on processor 0.

I hope the repercussions of this are clear without
detailed explanation. On noticing this, I scanned
mutex.cxx and the other sources in
kernel/current/src/sync, and even a quick scan found a
couple more synchronisation primitives affected by
this bug.

The crux of the problem (as far as I can see into it
at the moment) is: accesses to arrays indexed via
CYG_KERNEL_CPU_THIS() should be done with the
scheduler lock held.

Doing it under sched_lock might be a costly affair in
some places (?? need to check out ??); perhaps briefly
disabling interrupts could be used in those
situations.

Various asserts/tests/normal code paths need to be
checked for direct/indirect accesses to
current_thread, need_reschedule and thread_switches
(the variables I can directly see in sched.hxx)
outside the scheduler lock.

I hope that with help from the list a thorough scan
(an earlier thorough scan for direct/indirect use of
get_sched_lock is still pending) can be run to find
all instances of the problem.


mutex.cxx
---------
cyg_bool Cyg_Mutex::lock(void)
{
    CYG_REPORT_FUNCTYPE("returning %d");

    cyg_bool result = true;
    Cyg_Thread *self = Cyg_Thread::self();

    // Prevent preemption
    Cyg_Scheduler::lock();
...
}

The same situation also appears in --
cyg_bool Cyg_Condition_Variable::wait_inner( Cyg_Mutex *mx )
cyg_bool Cyg_Condition_Variable::wait_inner( Cyg_Mutex *mx, cyg_tick_count timeout )

cnt_sem2.cxx
------------
cyg_bool Cyg_Counting_Semaphore2::wait()
cyg_bool Cyg_Counting_Semaphore2::wait( cyg_tick_count abs_timeout )

cnt_sem.cxx
-----------
cyg_bool Cyg_Counting_Semaphore::wait()
cyg_bool Cyg_Counting_Semaphore::wait( cyg_tick_count timeout )

bin_sem.cxx
-----------
cyg_bool Cyg_Binary_Semaphore::wait()
cyg_bool Cyg_Binary_Semaphore::wait( cyg_tick_count timeout )





* Re: [SMP] serious bug in synchronisation primitives
  2004-10-27 13:35 serious bug in synchronisation primitives sandeep
@ 2004-10-27 13:44 ` sandeep
  0 siblings, 0 replies; 2+ messages in thread
From: sandeep @ 2004-10-27 13:44 UTC (permalink / raw)
  To: sandeep; +Cc: ecos-devel

Oops! I missed mentioning in the previous post that this buggy
situation arises in SMP configurations.
sandeep

