public inbox for pthreads-win32@sourceware.org
 help / color / mirror / Atom feed
* mutexes: "food for thought"
@ 2003-10-18 17:47 Alexander Terekhov
  2004-10-08 12:49 ` Ross Johnson
  0 siblings, 1 reply; 4+ messages in thread
From: Alexander Terekhov @ 2003-10-18 17:47 UTC (permalink / raw)
  To: pthreads-win32

G'Day,

here's "ala futex based" mutex stuff using XCHG. 

No need for CAS. I hope that it will work just fine. 

Can you see any harmful race condition(s) here? 

TIA.

#define SWAP_BASED_MUTEX_FOR_WINDOWS_INITIALIZER { 0, 0 }

struct swap_based_mutex_for_windows {

  atomic<int>                m_lock_status;  // -1: free, 0: locked, 1 
lock-contention
  atomic<auto_reset_event *> m_retry_event;  // DCSI'd 

  void DCSI(); // double-checked serialized initialization
  void slow_lock();
  bool slow_trylock();
  bool slow_timedlock(absolute_timeout const & timeout);
  void release_one_waiter_if_any();

  void lock() {
    if (m_lock_status.swap(0, msync::acq) >= 0) slow_lock();
  } 

  bool trylock() {
    return (m_lock_status.swap(0, msync::acq) < 0) ? true : 
slow_trylock(); 
  }

  bool timedlock(absolute_timeout const & timeout) {
    return (m_lock_status.swap(0, msync::acq) < 0) ? true : 
slow_timedlock(timeout);
  }

  void unlock() {
    if (m_lock_status.swap(-1, msync::rel) > 0) 
release_one_waiter_if_any();
  }

};

void swap_based_mutex_for_windows::slow_lock() {
  DCSI();
  while (m_lock_status.swap(1, msync::acq) >= 0)
    m_retry_event.load(msync::none)->wait();
}

bool swap_based_mutex_for_windows::slow_trylock() {
  DCSI();
  return m_lock_status.swap(1, msync::acq) < 0;
}

bool swap_based_mutex_for_windows::slow_timedlock(absolute_timeout const & 
timeout) {
  DCSI();
  while (m_lock_status.swap(1, msync::acq) >= 0)
    if (!m_retry_event.load(msync::none)->timedwait(timeout)) return 
false;
  return true;
}

void swap_based_mutex_for_windows::release_one_waiter_if_any() {
  m_retry_event.load(msync::none)->set();
}

void swap_based_mutex_for_windows::DCSI() {
  if (!m_retry_event.load(msync::none)) {
    named_windows_mutex_trick guard(this);
    if (!m_retry_event.load(msync::none)) {
      m_retry_event.store(new auto_reset_event(), msync::rel);
      m_lock_status.store(-1, msync::rel);
    }
  }
}

regards,
alexander.

P.S. I've never run it. Just a sketch.

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: mutexes: "food for thought"
  2003-10-18 17:47 mutexes: "food for thought" Alexander Terekhov
@ 2004-10-08 12:49 ` Ross Johnson
  2004-10-26 17:29   ` mutexes: "food for thought" [upcoming XBOX] Alexander Terekhov
  0 siblings, 1 reply; 4+ messages in thread
From: Ross Johnson @ 2004-10-08 12:49 UTC (permalink / raw)
  To: pthreads-win32; +Cc: Alexander Terekhov

Hi all,

Summary: mutex speedups.

Belatedly getting around to Alexander Terekhov's sketched 
enhancements to mutexes (below), I've rewritten the mutex 
routines in pthreads-win32. The main objective was to remove 
the need for the extra critical section (wait_cs) in the 
unlock and timedlock routines.

However, I couldn't get my translation of Alexander's logic 
to work, so have applied Ulrich Drepper's futex based mutex 
algorithms (specifically 'Mutex2') from his paper 
"http://people.redhat.com/drepper/futex.pdf". Some of the 
other ideas in Alexander's sketch, such as postponing full 
initialisation of statically declared mutexes until the slow 
sections of mutex operations (shown as DCSI() in the sketch 
below), have not been included yet, and may not be, because 
it would require recompiling applications before they could 
use the new dll. Postponing saves a compare op in each call 
to lock, timedlock or trylock.

The new code is in CVS if anyone wants to inspect/try it. 
The modified pthreads-win32 dll passes the full testsuite 
and has achieved some very significant speedups - at least 
on my single processor machine. In particular, the rwlock7.c 
test, which intensively exercises reader/writer locks, runs 
approximately 3 times faster than previously. [The 
reader/writer locks in pthreads-win32 are built from 
pthreads-win32 mutexes and condition variables.]

Further speedups were attempted by inlining the [many] calls 
to InterlockedCompareExchange(). This uses the library's own 
assembler version of this routine (X86 only), which was 
originally included for Win9x systems. But surprisingly, 
this canceled out almost all of the speed gains just made. 
It turns out that the 'lock' prefix to the
cmpxchg instruction has this effect on single processor 
systems - as a google search later confirmed - see:

http://gcc.gnu.org/ml/java/2001-03/msg00122.html

Interestingly though, the Windows version of
InterlockedCompareExchange() on single processor systems 
doesn't appear to use the 'lock' prefix as calling it is 
only marginally slower (by approximately 10%) than the new 
pthreads-win32 dll with inlined CMPXCHG minus 'lock' prefix. 
I assume the difference is subroutine call overhead. So, 
rather than build a separate dll for SMP systems, inlining
is currently turned off, sacrificing the 10% speed gain for 
binary portablility.

[If anyone wants to turn inlining on - after checking out 
the code from CVS, change the "#if 0" to "#if 1" at the 
bottom of ptw32_InterlockedCompareExchange.c, and build the 
dll by running "nmake clean VC-inlined", or "make clean 
GC-inlined" for MinGW.]

With all changes included, performance of pthreads mutexes 
is approaching, and in the case of trylock, apparently 
exceeding the performance of Win32 Critical Section calls - 
based on tests\benchtest1.c. But, by avoiding Win32 Critical 
Sections, there is now a possibility that pthreads-win32 
mutexes can exist in process shared memory, which may then 
allow PROCESS_SHARED mutexes and other objects to be 
implemented.

Unless I'm mistaken, the one negative about all of this is 
that threads are no longer guarranteed strict FIFO access to 
the lock. That is, a thread newly requesting the lock can 
sometimes steal the lock off an already waiting thread.

Regards.
Ross

Alexander Terekhov wrote 1 year ago:

>G'Day,
>
>here's "ala futex based" mutex stuff using XCHG. 
>
>No need for CAS. I hope that it will work just fine. 
>
>Can you see any harmful race condition(s) here? 
>
>TIA.
>
>#define SWAP_BASED_MUTEX_FOR_WINDOWS_INITIALIZER { 0, 0 }
>
>struct swap_based_mutex_for_windows {
>
>  atomic<int>                m_lock_status;  // -1: free, 0: locked, 1 
>lock-contention
>  atomic<auto_reset_event *> m_retry_event;  // DCSI'd 
>
>  void DCSI(); // double-checked serialized initialization
>  void slow_lock();
>  bool slow_trylock();
>  bool slow_timedlock(absolute_timeout const & timeout);
>  void release_one_waiter_if_any();
>
>  void lock() {
>    if (m_lock_status.swap(0, msync::acq) >= 0) slow_lock();
>  } 
>
>  bool trylock() {
>    return (m_lock_status.swap(0, msync::acq) < 0) ? true : 
>slow_trylock(); 
>  }
>
>  bool timedlock(absolute_timeout const & timeout) {
>    return (m_lock_status.swap(0, msync::acq) < 0) ? true : 
>slow_timedlock(timeout);
>  }
>
>  void unlock() {
>    if (m_lock_status.swap(-1, msync::rel) > 0) 
>release_one_waiter_if_any();
>  }
>
>};
>
>void swap_based_mutex_for_windows::slow_lock() {
>  DCSI();
>  while (m_lock_status.swap(1, msync::acq) >= 0)
>    m_retry_event.load(msync::none)->wait();
>}
>
>bool swap_based_mutex_for_windows::slow_trylock() {
>  DCSI();
>  return m_lock_status.swap(1, msync::acq) < 0;
>}
>
>bool swap_based_mutex_for_windows::slow_timedlock(absolute_timeout const & 
>timeout) {
>  DCSI();
>  while (m_lock_status.swap(1, msync::acq) >= 0)
>    if (!m_retry_event.load(msync::none)->timedwait(timeout)) return 
>false;
>  return true;
>}
>
>void swap_based_mutex_for_windows::release_one_waiter_if_any() {
>  m_retry_event.load(msync::none)->set();
>}
>
>void swap_based_mutex_for_windows::DCSI() {
>  if (!m_retry_event.load(msync::none)) {
>    named_windows_mutex_trick guard(this);
>    if (!m_retry_event.load(msync::none)) {
>      m_retry_event.store(new auto_reset_event(), msync::rel);
>      m_lock_status.store(-1, msync::rel);
>    }
>  }
>}
>
>regards,
>alexander.
>
>P.S. I've never run it. Just a sketch.
>  
>

^ permalink raw reply	[flat|nested] 4+ messages in thread

* mutexes: "food for thought" [upcoming XBOX]
  2004-10-08 12:49 ` Ross Johnson
@ 2004-10-26 17:29   ` Alexander Terekhov
  0 siblings, 0 replies; 4+ messages in thread
From: Alexander Terekhov @ 2004-10-26 17:29 UTC (permalink / raw)
  To: pthreads-win32

Given

http://www.theinquirer.net/?article=14407
http://www.gamepro.com/microsoft/xbox/hardware/news/35216.shtml

here's "food for thought" illustration. 

// doesn't provide "POSIX-safety" with respect to destruction
class mutex_for_XBOX_NEXT { // noncopyable

  atomic<int>      m_lock_status; // 0: free, 1/-1: locked/contention
  auto_reset_event m_retry_event; // prohibitively slow bin.sema/gate

  template<typename T>
  int attempt_update(int old, int new, T msync) {
    while (!m_lock_status.store_conditional(new, msync)) {
      int fresh = m_lock_status.load_reserved(msync::none);
      if (fresh != old)
        return fresh;
    }
    return old;
  }

public:

  // ctor/dtor [w/o lazy event init] 

  bool trylock() throw() {
    return !(m_lock_status.load_reserved(msync::none) || 
             attempt_update(0, 1, msync::acq)); 
  }

  // bool timedlock() omitted for brevity

  void lock() throw() {
    int old = m_lock_status.load_reserved(msync::none);
    if (old || old = attempt_update(0, 1, msync::acq)) {
      do {
        while (old < 0 || 
               old = attempt_update(1, -1, msync::none)) { 
          m_retry_event.wait();
          old = m_lock_status.load_reserved(msync::none);
          if (!old) break;
        }
      } while (old = attempt_update(0, -1, msync::acq));
    }
  }

  void unlock() throw() {
    if (m_lock_status.load_reserved(msync::none) < 0 || 
        attempt_update(1, 0, msync::rel) < 0) { // or just !SC
      m_lock_status.store(0, msync::rel);
      m_retry_event.set();
    }
  }

};

Oder?

regards,
alexander.

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: mutexes: "food for thought"
       [not found] <4166852F.8090300@callisto.canberra.edu.au>
@ 2004-10-08 12:53 ` Alexander Terekhov
  0 siblings, 0 replies; 4+ messages in thread
From: Alexander Terekhov @ 2004-10-08 12:53 UTC (permalink / raw)
  To: rpj; +Cc: pthreads-win32

Hi Ross,

I'm sorta busy at the moment. Here's a version without DCSI.

More next week.

// doesn't provide "POSIX-safety" with respect to destruction
class swap_based_mutex { // noncopyable 

  atomic<int>      m_lock_status; // 0: free, 1/-1: locked/contention
  auto_reset_event m_retry_event; // bin.sema/gate

public:

  // ctor/dtor [w/o lazy event init] 

  void lock() throw() {
    if (m_lock_status.swap(1, msync::acq)) 
      while (m_lock_status.swap(-1, msync::acq))
        m_retry_event.wait();
  } 

  bool trylock() throw() {
    return !m_lock_status.swap(1, msync::acq) ? 
      true : !m_lock_status.swap(-1, msync::acq);
  }

  bool timedlock(absolute_timeout const & timeout) throw() {
    if (m_lock_status.swap(1, msync::acq)) {
      while (m_lock_status.swap(-1, msync::acq))
        if (!m_retry_event.timedwait(timeout)) 
          return false;
    }  return true;
  }

  void unlock() throw() {
    if (m_lock_status.swap(0, msync::rel) < 0) 
      m_retry_event.set();
  } 
};

As for Ulrich's "take 2", my "take 3" can be found here:

http://listman.redhat.com/archives/phil-list/2003-October/msg00030.html
(Subject: Mutex, Take 3 (for "dumb" futex))

regards,
alexander.

P.S. "POSIX-safety" with respect to destruction:

http://lists.boost.org/MailArchives/boost/msg67616.php
http://lists.boost.org/MailArchives/boost/msg67651.php
http://lists.boost.org/MailArchives/boost/msg67667.php

regards,
alexander.

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2004-10-26 17:29 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2003-10-18 17:47 mutexes: "food for thought" Alexander Terekhov
2004-10-08 12:49 ` Ross Johnson
2004-10-26 17:29   ` mutexes: "food for thought" [upcoming XBOX] Alexander Terekhov
     [not found] <4166852F.8090300@callisto.canberra.edu.au>
2004-10-08 12:53 ` mutexes: "food for thought" Alexander Terekhov

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).