From: Torvald Riegel <triegel@redhat.com>
To: kemi <kemi.wang@intel.com>, Rich Felker <dalias@libc.org>,
	"H.J. Lu" <hjl.tools@gmail.com>
Cc: Ma Ling <ling.ma.program@gmail.com>,
	GNU C Library <libc-alpha@sourceware.org>,
	"Lu, Hongjiu" <hongjiu.lu@intel.com>,
	"ling.ma" <ling.ml@antfin.com>, Wei Xiao <wei3.xiao@intel.com>
Subject: Re: [PATCH] NUMA spinlock [BZ #23962]
Date: Tue, 15 Jan 2019 12:37:00 -0000
Message-ID: <6eaaec4d0ae349eaf31de1239f27c01dc1f5b6a8.camel@redhat.com>
In-Reply-To: <0b4620c1-a9c5-061e-9636-65d80655a6fd@intel.com>

On Tue, 2019-01-15 at 10:28 +0800, kemi wrote:
> > > "Scalable spinlock" is something of an oxymoron.
> > 
> > No, that's not true at all.  Most high-performance shared-memory
> > synchronization constructs (on typical HW we have today) will do some kind
> > of spinning (and back-off), and there's nothing wrong with that.  This can
> > scale very well.
> > 
> > > Spinlocks are for
> > > situations where contention is extremely rare,
> > 
> > No, the question is rather whether the program needs blocking through the
> > OS (for performance, or for semantics such as PI) or not.  Energy may be
> > another factor.  For example, glibc's current mutexes don't scale well on
> > short critical sections because there's not enough spinning being done.
> > 
> 
> Yes.  That's why we need the pthread.mutex.spin_count tunable interface
> first.

I don't think we need the tunable interface before that.  Where we need to
improve performance most is for applications that don't want to bother
tuning their mutexes -- that's where the broadest gains are overall, I
think.

In turn, that means that we need spinning and back-off that give good
average-case performance -- whether that's through automatic tuning of those
two things at runtime, or through static default values whose performance we
check regularly in the glibc community.

From that perspective, the tunable interface is a nice addition that lets
users fine-tune the setting, but it's not how users would enable spinning in
the first place.

> But that's not enough.  When the tunable is not the bottleneck, the simple
> busy-waiting algorithm of the current adaptive mutex is the major negative
> factor that degrades mutex performance.

Note that I'm not advocating for focusing on just the adaptive mutex type.
IMO, adding this type was a mistake, because whether to spin or not does not
affect the semantics of a mutex.  Performance hints shouldn't be conveyed via
a mutex's type, and all mutex implementations should consider spinning at
least a little.

If we just do something about the adaptive mutexes, then I guess this will
reach few users.  I believe most applications just don't use them, and the
current implementation of adaptive mutexes is so simplistic that there's
not much performance to be gained by switching to them (which is another
reason they have few users).

> That's why I proposed using an MCS-based spin-waiting algorithm for the
> adaptive mutex.

MCS-style spinning (i.e., spinning on memory local to the spinning thread) is
helpful, but I think we should tackle spinning on global memory first (i.e.,
on a location in the mutex that is shared by all the threads trying to
acquire it) -- always with back-off, of course.
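
To make concrete what I mean by spinning on a global location with back-off,
here is a rough sketch using C11 atomics.  This is not glibc code; the
constants, names, and the x86-only pause are made up for illustration only:

  #include <stdatomic.h>
  #include <sched.h>

  #define BACKOFF_MIN 4        /* initial number of pause iterations */
  #define BACKOFF_MAX 1024     /* cap so back-off stays bounded */

  static inline void
  cpu_relax (void)
  {
  #if defined __x86_64__ || defined __i386__
    __builtin_ia32_pause ();
  #else
    sched_yield ();            /* crude fallback, just for the sketch */
  #endif
  }

  void
  spin_lock_backoff (atomic_int *lock)
  {
    unsigned int backoff = BACKOFF_MIN;
    for (;;)
      {
        /* Test-and-test-and-set: only attempt the atomic write when the
           lock looks free, so waiters mostly read the shared line.  */
        if (atomic_load_explicit (lock, memory_order_relaxed) == 0
            && atomic_exchange_explicit (lock, 1, memory_order_acquire) == 0)
          return;

        /* Lock is held or we lost the race: back off before re-reading
           the shared location, doubling the delay up to the cap.  */
        for (unsigned int i = 0; i < backoff; i++)
          cpu_relax ();
        if (backoff < BACKOFF_MAX)
          backoff *= 2;
      }
  }

  void
  spin_unlock_backoff (atomic_int *lock)
  {
    atomic_store_explicit (lock, 0, memory_order_release);
  }

Randomizing the per-iteration delay (e.g., picking it uniformly below the
current cap) is another knob worth trying, since it helps de-synchronize
waiters that would otherwise retry in lockstep.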

> https://sourceware.org/ml/libc-alpha/2019-01/msg00279.html
> 
> Also, with a very small critical section in the workload, this new type of
> mutex with the GNU extension PTHREAD_MUTEX_QUEUESPINNER_NP acts like an MCS
> spinlock and performs much better than the original spinlock.

I don't think we want to have a new type for that.  It may be useful for
experimenting, but it shouldn't be exposed to users as a stable
interface.

Also, have you experimented with different kinds/settings of exponential
back-off?  I only saw plain spinning in your implementation, with no varying
amount of back-off.  The performance comparison should include back-off,
though, as that's one way to work around the contention problems (a bigger
hammer than local spinning, of course, but it can be effective nonetheless,
and it is faster in low-contention cases).

My guess is that a mix of the two -- local spinning on memory shared by just
a few threads running on cores that are close to each other -- would perform
best (e.g., similar to what's done in flat combining).
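
For reference, the "local spinning" building block I have in mind is what an
MCS-style lock does: each waiter enqueues a node of its own and spins only on
that node, so a hand-off touches memory shared by just two threads.  A rough,
non-glibc sketch (types and names made up):

  #include <stdatomic.h>
  #include <stdbool.h>
  #include <stddef.h>

  struct mcs_node
  {
    _Atomic (struct mcs_node *) next;
    atomic_bool locked;              /* true while this waiter must spin */
  };

  typedef _Atomic (struct mcs_node *) mcs_lock_t;   /* tail of the queue */

  void
  mcs_lock (mcs_lock_t *lock, struct mcs_node *node)
  {
    atomic_store_explicit (&node->next, NULL, memory_order_relaxed);
    atomic_store_explicit (&node->locked, true, memory_order_relaxed);

    /* Append ourselves to the queue; the previous tail is our predecessor.  */
    struct mcs_node *pred
      = atomic_exchange_explicit (lock, node, memory_order_acq_rel);
    if (pred == NULL)
      return;                        /* queue was empty: lock acquired */

    /* Link in behind the predecessor and spin on our own node only.  */
    atomic_store_explicit (&pred->next, node, memory_order_release);
    while (atomic_load_explicit (&node->locked, memory_order_acquire))
      ;                              /* local spin; add pause/back-off here */
  }

  void
  mcs_unlock (mcs_lock_t *lock, struct mcs_node *node)
  {
    struct mcs_node *succ
      = atomic_load_explicit (&node->next, memory_order_acquire);
    if (succ == NULL)
      {
        /* No visible successor: try to reset the tail to release the lock.  */
        struct mcs_node *expected = node;
        if (atomic_compare_exchange_strong_explicit (lock, &expected, NULL,
                                                     memory_order_release,
                                                     memory_order_relaxed))
          return;
        /* A successor is enqueueing concurrently; wait until it links in.  */
        while ((succ = atomic_load_explicit (&node->next,
                                             memory_order_acquire)) == NULL)
          ;
      }
    /* Hand the lock directly to the successor; it stops spinning.  */
    atomic_store_explicit (&succ->locked, false, memory_order_release);
  }

The hybrid I'm thinking of would replace the per-thread node with a node
shared by a small cluster of nearby cores, so spinning stays mostly local
while the queue stays short.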

> So, some day, if the adaptive mutex is tuned well enough, it should act like
> an MCS spinlock (or NUMA spinlock) when the workload has small critical
> sections, and perform like a normal mutex when the critical section is too
> big for spin-waiting.

I agree to some extent, but I think that the adaptive mutex type should just
be an alias of the normal mutex type (kept for API compatibility reasons
only).  And there could be other factors than just critical-section size that
determine whether a thread should block using futexes or not.
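
To illustrate the kind of decision I mean, here is a rough spin-then-block
sketch (not glibc's mutex code; the 0/1/2 state encoding, the spin count, and
the helper names are made up).  The acquire path spins a bounded number of
times in user space and only then blocks on a futex; the spin bound is
exactly the sort of thing that could depend on more than critical-section
size:

  #include <stdatomic.h>
  #include <unistd.h>
  #include <sys/syscall.h>
  #include <linux/futex.h>

  /* 0 = unlocked, 1 = locked (no waiters), 2 = locked (maybe waiters).  */

  static void
  futex_wait (atomic_int *addr, int expected)
  {
    syscall (SYS_futex, addr, FUTEX_WAIT_PRIVATE, expected, NULL, NULL, 0);
  }

  static void
  futex_wake (atomic_int *addr)
  {
    syscall (SYS_futex, addr, FUTEX_WAKE_PRIVATE, 1, NULL, NULL, 0);
  }

  void
  hybrid_lock (atomic_int *lock, int max_spins)
  {
    /* Phase 1: spin in user space for a bounded time; max_spins plays the
       role of a spin-count tunable or an automatically tuned value.  */
    for (int i = 0; i < max_spins; i++)
      {
        int v = 0;
        if (atomic_compare_exchange_weak_explicit (lock, &v, 1,
                                                   memory_order_acquire,
                                                   memory_order_relaxed))
          return;
        __builtin_ia32_pause ();   /* x86; substitute an equivalent elsewhere */
      }

    /* Phase 2: mark the lock as contended and block in the kernel until it
       is handed to us.  */
    while (atomic_exchange_explicit (lock, 2, memory_order_acquire) != 0)
      futex_wait (lock, 2);
  }

  void
  hybrid_unlock (atomic_int *lock)
  {
    /* Only enter the kernel if someone may be blocked.  */
    if (atomic_exchange_explicit (lock, 0, memory_order_release) == 2)
      futex_wake (lock);
  }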
