public inbox for pthreads-win32@sourceware.org
* New pthreads-w32 releases available: versions 2.3.0 and 1.7.0
@ 2005-04-12  7:46 Ross Johnson
  2005-04-12 10:10 ` Alexander Terekhov
  0 siblings, 1 reply; 6+ messages in thread
From: Ross Johnson @ 2005-04-12  7:46 UTC (permalink / raw)
  To: Pthreads-Win32 list

Announcing two new releases of pthreads-w32:-
pthreads-w32-2-3-0-release
pthreads-w32-1-7-0-release

Packages are available in self-unpacking zip files (.exe) and gzipped
tar files (.tar.gz) as usual.

See
http://sources.redhat.com/pthreads-win32/

or go directly to:
ftp://sources.redhat.com/pub/pthreads-win32/

Red Hat have a low ftp concurrent user limit. Mirrors are at (available 
as they update):
http://sources.redhat.com/mirrors.html


These releases hopefully fix all known problems with pthread_once() in
both versions 1 and 2 of pthreads-win32. In particular, the starvation
problem that potentially arises after an init_routine cancellation has
been resolved using momentary priority boosting. If it proves to be
robust then there will be no need for a version 3 release as previously
implied (at least, not to fix pthread_once()).

The additional work of managing thread priorities inside of pthread_once
has been kept out of the normal (cancellation-free) pathways so that the
additional normal path overhead is almost nil (i.e. introduces no
additional bus locking or cache coherence operations if cancellation-
free).

The functionality and behaviour of versions 1.7 and 2.3 should be
logically identical. However, the version 2 pthread_once implementation
(based on code posted by Gottlob Frege) is much more efficient.


RELEASE 2.3.0
-------------
(2005-04-12)

General
-------

Release 1.7.0 is the backport of features and bug fixes new in
this release. See earlier notes under Release 2.0.0/General.

Bugs fixed
----------

* Fixed the potential for pthread_once to hang, due to starvation, after
a once_routine cancellation. See comments in pthread_once.c.
Momentary priority boosting is used to ensure that, after a
once_routine is cancelled, the thread that will run the
once_routine is not starved by higher priority waiting threads at
critical times. Priority boosting occurs only AFTER a once_routine
cancellation, and is applied only to that once_control. The
once_routine is run at the thread's normal base priority.

New tests
---------

* once4.c: Aggressively tests pthread_once() under realtime
conditions using threads with varying priorities. Windows'
random priority boosting does not occur for threads with realtime
priority levels.




* Re: New pthreads-w32 releases available: versions 2.3.0 and 1.7.0
  2005-04-12  7:46 New pthreads-w32 releases available: versions 2.3.0 and 1.7.0 Ross Johnson
@ 2005-04-12 10:10 ` Alexander Terekhov
  2005-04-13  7:45   ` Ross Johnson
  0 siblings, 1 reply; 6+ messages in thread
From: Alexander Terekhov @ 2005-04-12 10:10 UTC (permalink / raw)
  To: Ross Johnson; +Cc: Pthreads-Win32 list


[... pthread_once() ...]

I'm not inclined to check the code at the moment, but I can tell
you that robust approach to priority problems is to use locks with
priority protocols on them. Trying to optimize-out mutex
doesn't make much sense here since you need it on slow path (once
per thread per once_control instance at most) only.

Variation of DCSI (either MBR or TLS) with named mutex is the way
to go.

regards,
alexander.


* Re: New pthreads-w32 releases available: versions 2.3.0 and 1.7.0
  2005-04-12 10:10 ` Alexander Terekhov
@ 2005-04-13  7:45   ` Ross Johnson
  2005-04-13  8:21     ` Gottlob Frege
  2005-04-14 16:19     ` Alexander Terekhov
  0 siblings, 2 replies; 6+ messages in thread
From: Ross Johnson @ 2005-04-13  7:45 UTC (permalink / raw)
  To: Pthreads-Win32 list

On Tue, 2005-04-12 at 12:10 +0200, Alexander Terekhov wrote: 
> [... pthread_once() ...]
> 
> I'm not inclined to check the code at the moment, but I can tell
> you that robust approach to priority problems is to use locks with
> priority protocols on them. Trying to optimize-out mutex
> doesn't make much sense here since you need it on slow path (once
> per thread per once_control instance at most) only.

Hi Alexander,

[For anyone new to this issue:
http://sources.redhat.com/ml/pthreads-win32/2005/msg00034.html
]

As Tim Theisen has just kindly reported
( http://sources.redhat.com/ml/pthreads-win32/2005/msg00068.html ), both
of the once_routine cancellation tests have failed on his MP system (no
problem on my UP system though), so you may yet prevail. It's hard to
argue when I don't have a system that I can properly test this stuff on.

So I'm now hoping that someone might take a look and point out the error
and also, that the error is retrievable, because, even though your model
is logically very attractive and robust, there are still aspects of it
that concern me and compel me to look at alternatives. I'll list them
now:

1 - OVERHEAD:
I could live with this if I absolutely had to, but it does make sense to
me to try to optimise the uncontended path - although, not if it can't
be made to work reliably, obviously.

The create/open named mutex seems like a disproportionate overhead for
something that will probably complete uncontested in most situations.
Here's a table showing uncontested speeds (from tests/benchtest1.c) on
my system (Windows 2000, 2400MHz):

benchtest1
=============================================================================

Lock plus unlock on an unlocked mutex.
10000000 iterations

Using                                              Total(msec)   average(usec)
-----------------------------------------------------------------------------
Dummy call x 2                                             78           0.008
InterlockedOp x 2                                         140           0.014
Simple Critical Section                                   140           0.014
Old PT Mutex using a Critical Section (WNT)               281           0.028
Old PT Mutex using a Win32 Mutex (W9x)                  18265           1.827
.............................................................................
PTHREAD_MUTEX_DEFAULT (W9x,WNT)                           297           0.030
PTHREAD_MUTEX_NORMAL (W9x,WNT)                            296           0.030
PTHREAD_MUTEX_ERRORCHECK (W9x,WNT)                       1312           0.131
PTHREAD_MUTEX_RECURSIVE (W9x,WNT)                        1328           0.133
=============================================================================

The (W9x) in the description for the Win32 Mutex timing just means that this
was the type of object used by older versions of the library when running on
Win9x systems. As you can see, Win32 mutexes are about 61 times slower than
the POSIX default mutex type, and about 130 times slower than two Interlocked
operations or a simple (enter+leave) critical section.
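
As a quick check on those ratios, the arithmetic can be reproduced from the
totals in the benchtest1 table. This is only a sketch: the function and
constant names are illustrative and are not part of the library or its tests.

```c
#include <assert.h>

/* Totals (msec, 10,000,000 iterations each) copied from the table above. */
static const double WIN32_MUTEX_MSEC = 18265.0; /* Old PT Mutex using a Win32 Mutex */
static const double PT_DEFAULT_MSEC  = 297.0;   /* PTHREAD_MUTEX_DEFAULT */
static const double INTERLOCKED_MSEC = 140.0;   /* InterlockedOp x 2 */

/* How many times slower one total is than another (hypothetical helper). */
static double slowdown(double slower_msec, double faster_msec)
{
    assert(faster_msec > 0.0);
    return slower_msec / faster_msec;
}
```

slowdown(WIN32_MUTEX_MSEC, PT_DEFAULT_MSEC) comes out at roughly 61.5, and
slowdown(WIN32_MUTEX_MSEC, INTERLOCKED_MSEC) at roughly 130.5.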


2 - SECURITY:
This makes me especially reluctant, given that users may not be aware of
the library's internals. I haven't looked in detail at, and have no
prior experience with, security of Windows objects, but I think I can
justify caution.

I don't think it's possible to guarantee that a 'named' mutex is not
vulnerable across all Windows systems that use the library. The name
must be unique (easy enough to do), but it's not possible to absolutely
prevent other processes from opening and locking the named mutex
accidentally or maliciously. I imagine that, while it's unlikely (but
not impossible) for another process to take the lock and stop the
init_routine from running, it might be possible to take a place in the
queue and ultimately prevent other legitimate waiters from ever
continuing.

According to the MS documentation, WinCE and Win95/98 ignore the
security parameter (and so I assume no security) - and namespaces for
named objects were introduced even later I think (Win 2000?).

The MSDN CreateMutex web page also includes the following note that
implies even more overhead and complexity:

"If you are using a named mutex to limit your application to a single
instance, a malicious user can create this mutex before you do and
prevent your application from starting. To prevent this situation,
create a randomly named mutex and store the name so that it can only be
obtained by an authorized user. Alternatively, you can use a file for
this purpose. To limit your application to one instance per user, create
a locked file in the user's profile directory."

Now, I could have investigated further and further, but I decided to try
to find an alternative first. Something that avoids all of these
'whatifs'. I want to be able to give users some assurances when they use
the library.

3 - TLS:
To use this I believe I need to make pthread_once_t an opaque pointer or
something similar (and PTHREAD_ONCE_INIT an auto-init magic token). This
adds still more overhead to the 'use once' object.

4 - MBR (alternative to TLS):
Not a concern - more a response:
From various URLs (e.g. http://groups.yahoo.com/group/boost/message/15442), I
see that you don't trust Windows Interlocked operations to provide
proper MBR semantics. I recall someone at Microsoft writing that the
unadorned Interlocked routines (those not specifically acquire or
release) insert a full memory barrier on systems that support it.


Anyway, those are the questions that occurred to me at the time, as I
logged the following ChangeLog entry:


2005-03-13 Ross Johnson <rpj at callisto.canberra.edu.au> 

* pthread_once.c (pthread_once): Completely redesigned; a change was
required to the ABI (pthread_once_t_), resulting in a version
compatibility index increment.

NOTES:
The design (based on pseudo code contributed by Gottlob Frege) avoids
creating a kernel object if there is no contention. See URL for details:-
http://sources.redhat.com/ml/pthreads-win32/2005/msg00029.html
This uses late initialisation similar to the technique already used for
pthreads-win32 mutexes and semaphores (from Alexander Terekhov).

The subsequent cancelation cleanup additions (by rpj) could not be implemented
without sacrificing some of the efficiency in Gottlob's design. In particular,
although each once_control uses its own event to block on, a global CS is
required to manage it - since the event must be either re-usable or
re-creatable under cancelation. This is not needed in the non-cancelable
design because it is able to mark the event as closed (forever).

When uncontested, a CS operation is equivalent to an Interlocked operation
in speed. So, in the final design with cancelability, an uncontested
once_control operation involves a minimum of five interlocked operations
(including the LeaveCS operation).
	
ALTERNATIVES:
An alternative design from Alexander Terekhov proposed using a named mutex,
as sketched below:-

  if (!once_control) { // May be in TLS
    named_mutex::guard guard(&once_control2);
      if (!once_control2) {
         <init>
         once_control2 = true;
      }
    once_control = true;
  }
	
A more detailed description of this can be found here:-
http://groups.yahoo.com/group/boost/message/15442

[Although the definition of a suitable PTHREAD_ONCE_INIT precludes use of the
TLS located flag, this is not critical.]

There are three primary concerns though:-
1) The [named] mutex is 'created' even in the uncontended case.
2) A system wide unique name must be generated.
3) Win32 mutexes are VERY slow even in the uncontended case. An uncontested
Win32 mutex lock operation can be 50 (or more) times slower than an
uncontested EnterCS operation.

Ultimately, the named mutex trick is making use of the global locks maintained
by the kernel.
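
For readers who want something compilable, the double-checked shape of the
named mutex sketch above can be written in portable C11 roughly as follows.
This is only an illustration under stated assumptions: a spinlock built on
atomic_flag stands in for the named mutex, the fast-path flag is an ordinary
atomic rather than a TLS variable, and every name here (once_sketch_t,
once_sketch, example_init) is hypothetical and not from pthreads-win32.
Cancellation is not handled at all.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical once-control with the two flags from the sketch above. */
typedef struct {
    atomic_bool done;        /* fast-path flag ("once_control") */
    bool        done_locked; /* only touched under the lock ("once_control2") */
    atomic_flag lock;        /* spinlock standing in for the named mutex */
} once_sketch_t;

#define ONCE_SKETCH_INIT { false, false, ATOMIC_FLAG_INIT }

static void once_sketch(once_sketch_t *oc, void (*init_routine)(void))
{
    assert(oc != NULL && init_routine != NULL);

    /* Fast path: no lock taken once initialisation has been published.
     * The default (sequentially consistent) ordering is used throughout. */
    if (atomic_load(&oc->done))
        return;

    /* Slow path: serialise all late arrivals on the lock. */
    while (atomic_flag_test_and_set(&oc->lock)) {
        /* spin; a real lock would block instead */
    }
    if (!oc->done_locked) {
        init_routine();
        oc->done_locked = true;
    }
    atomic_store(&oc->done, true);
    atomic_flag_clear(&oc->lock);
}

/* Example init routine that counts how often it runs. */
static int init_calls;
static void example_init(void) { init_calls++; }
```

Calling once_sketch() repeatedly on the same once_sketch_t runs example_init
only on the first call; later calls return from the fast path.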


Regards.
Ross



* Re: New pthreads-w32 releases available: versions 2.3.0 and 1.7.0
  2005-04-13  7:45   ` Ross Johnson
@ 2005-04-13  8:21     ` Gottlob Frege
  2005-04-14 16:19     ` Alexander Terekhov
  1 sibling, 0 replies; 6+ messages in thread
From: Gottlob Frege @ 2005-04-13  8:21 UTC (permalink / raw)
  To: Ross Johnson; +Cc: Pthreads-Win32 list

On 4/13/05, Ross Johnson <RossJohnson@homemail.com.au> wrote:
> [...]
> So I'm now hoping that someone might take a look and point out the error
> and also, that the error is retrievable
> [...]

Yeah, I still have 'take another crack at call_once' on my list of things to do.


If we end up needing to mess with thread priorities anyhow (although
only in the cancel case), it might be easiest to just boost everyone's
priority right away, which was my first version (Alex might remember
seeing it on comp.programming.threads).  Something like (from memory,
and with interlocks, etc, left out):


oldPriority = GetPriority();
SetPriority(MAX); // must do this before checking initted - otherwise could be too late!

if (!started++)
{
   do_init();
   initted = true;
}
else
{
     SetPriority(oldPriority); // note that oldPriority may coincidentally be MAX
     while (!initted)
     {
          Sleep(1);
     }
}


Very much like the old original version, but since we mess with the
priorities, we don't have to worry about starvation (even in the worst
case, where a waiter's original priority is MAX, the init still gets a
share of time, assuming a reasonable scheduler).
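
With the interlocks put back in, the idea might look roughly like this in
portable C11. Treat it as a sketch only: get_priority()/set_priority() are
hypothetical stand-ins that just record a number (on Windows they would
presumably wrap the real thread priority calls), PRIO_MAX_SKETCH is an
arbitrary value, the wait loop spins where the original would Sleep(1), and
restoring the initialising thread's priority afterwards is an assumption
that the pseudocode above leaves out.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical priority hooks: they only record a per-thread value. */
#define PRIO_MAX_SKETCH 15
static _Thread_local int current_prio;
static int  get_priority(void)     { return current_prio; }
static void set_priority(int prio) { current_prio = prio; }

typedef struct {
    atomic_int  started; /* number of callers that have arrived so far */
    atomic_bool initted; /* set once the init routine has completed */
} poll_once_t;

static void poll_once(poll_once_t *oc, void (*do_init)(void))
{
    assert(oc != NULL && do_init != NULL);

    int old_prio = get_priority();
    set_priority(PRIO_MAX_SKETCH);   /* boost before looking at any state */

    if (atomic_fetch_add(&oc->started, 1) == 0) {
        do_init();                   /* first arrival runs init while boosted */
        atomic_store(&oc->initted, true);
        set_priority(old_prio);      /* assumption: restore after init */
    } else {
        set_priority(old_prio);      /* waiters drop back before polling */
        while (!atomic_load(&oc->initted)) {
            /* poll; the original sleeps for 1 ms here */
        }
    }
}

/* Example init routine that counts how often it runs. */
static int init_calls;
static void example_init(void) { init_calls++; }
```

A zero-initialised poll_once_t is the "not yet run" state; any number of
calls to poll_once() on it run example_init once.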

From this version, adding cancellation / exception handling should be easy.

But I never liked this version much:
   - not sure of the overhead of SetPriority
   - not sure of portability  (stuff like this is inherently not very
portable, and pthreads-win32 obviously doesn't care, but I care for my
purposes)
   - just 'inelegant' (polling, etc...)


* Re: New pthreads-w32 releases available: versions 2.3.0 and 1.7.0
  2005-04-13  7:45   ` Ross Johnson
  2005-04-13  8:21     ` Gottlob Frege
@ 2005-04-14 16:19     ` Alexander Terekhov
  2005-04-15  3:08       ` Ross Johnson
  1 sibling, 1 reply; 6+ messages in thread
From: Alexander Terekhov @ 2005-04-14 16:19 UTC (permalink / raw)
  To: Ross Johnson; +Cc: Pthreads-Win32 list


[... named mutex and atrocities of MS impl ...]

It's not that hard to implement it using a map with pointers (to
"normal" mutex + refcount structure and addresses of once_control
variables as keys).
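
A rough sketch of the bookkeeping such a map would need, in plain C. This is
an illustration under assumptions, not a worked-out design: the table is a
small fixed array, lock_stub is a placeholder for the "normal" mutex, all
names are hypothetical, and a real version would have to protect the table
itself with a global lock, which is omitted here.

```c
#include <assert.h>
#include <stddef.h>

#define MAP_SLOTS 16 /* arbitrary capacity for the sketch */

typedef struct {
    const void *key;       /* address of the once_control; NULL if slot is free */
    int         refcount;  /* number of threads currently using this entry */
    int         lock_stub; /* placeholder for the per-entry "normal" mutex */
} once_entry_t;

static once_entry_t once_map[MAP_SLOTS];

/* Find or create the entry for key and take a reference on it.
 * Returns NULL if the table is full. Not thread-safe as written. */
static once_entry_t *once_map_acquire(const void *key)
{
    once_entry_t *free_slot = NULL;

    assert(key != NULL);
    for (size_t i = 0; i < MAP_SLOTS; i++) {
        if (once_map[i].key == key) {
            once_map[i].refcount++;
            return &once_map[i];
        }
        if (once_map[i].key == NULL && free_slot == NULL)
            free_slot = &once_map[i];
    }
    if (free_slot != NULL) {
        free_slot->key = key;
        free_slot->refcount = 1;
    }
    return free_slot;
}

/* Drop a reference; the last user frees the slot for reuse. */
static void once_map_release(once_entry_t *entry)
{
    assert(entry != NULL && entry->refcount > 0);
    if (--entry->refcount == 0)
        entry->key = NULL;
}
```

Two threads arriving at the same once_control would get the same entry (and
so the same mutex), and the entry disappears once the last of them releases it.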

[... Windows Interlocked stuff ...]

In the meantime, MS has documented that they are fully fenced
(newer Acq/Rel stuff aside for a moment).

But you'll need acquire semantics for check on fast path. Not a
problem on IA32 (where all loads do have acquire semantics), but
on IA64 (let's pretend that it's still alive ;-) ), you'll need a
bit of assembly to make it fast.
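
In C11 terms (which postdate this thread, so this is only an illustration
and not what the library does), the requirement is an acquire load on the
flag paired with a release store by the initialiser. The names below are
hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

static int         shared_value; /* data produced by the init routine */
static atomic_bool ready;        /* fast-path flag */

/* Initialiser side: write the data, then publish it with a release store. */
static void publish(int value)
{
    shared_value = value;
    atomic_store_explicit(&ready, true, memory_order_release);
}

/* Fast-path side: the acquire load ensures that, if the flag is seen as
 * set, the write to shared_value made before the release store is visible. */
static bool try_consume(int *out)
{
    assert(out != NULL);
    if (!atomic_load_explicit(&ready, memory_order_acquire))
        return false;
    *out = shared_value;
    return true;
}
```

On IA32 the acquire load compiles to an ordinary load, which matches the
point above that the fast path costs nothing extra there.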

regards,
alexander.


* Re: New pthreads-w32 releases available: versions 2.3.0 and 1.7.0
  2005-04-14 16:19     ` Alexander Terekhov
@ 2005-04-15  3:08       ` Ross Johnson
  0 siblings, 0 replies; 6+ messages in thread
From: Ross Johnson @ 2005-04-15  3:08 UTC (permalink / raw)
  To: Pthreads-Win32 list

UPDATE: The latest change to pthread_once.c (current CVS head) passes
all the tests on an MP system, particularly the two tests that failed
with release 2.3.0 (each was run hundreds of times). Thanks to Tim
Theisen for running the tests.

This means that people can have more confidence in 2.4.0 when it comes
out (version 1.7.0 still needs to be confirmed but it uses a different
model again - and the original change that caused v2 to fail doesn't
apply to v1).

On Thu, 2005-04-14 at 18:18 +0200, Alexander Terekhov wrote:
> [... named mutex and atrocities of MS impl ...]
> 
> It's not that hard to implement it using a map with pointers (to
> "normal" mutex + refcount structure and addresses of once_control
> variables as keys).

I won't be rushing a version 3 release now unless more problems require
it, but I will pursue this. It would be nice to leave the priority
hacking to the kernel.

As it's turned out, v1 and v2 can be fixed rather than abandoned, as far
as this problem is concerned at least. Other as yet unknown bugs aside,
the version differences are now just in efficiency and, in most
applications, should not even be noticeable.

Thanks.
Ross


