From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <libc-alpha-return-99250-listarch-libc-alpha=sources.redhat.com@sourceware.org>
Received: (qmail 86068 invoked by alias); 15 Jan 2019 02:33:09 -0000
Mailing-List: contact libc-alpha-help@sourceware.org; run by ezmlm
Precedence: bulk
List-Id: <libc-alpha.sourceware.org>
List-Subscribe: <mailto:libc-alpha-subscribe@sourceware.org>
List-Archive: <http://sourceware.org/ml/libc-alpha/>
List-Post: <mailto:libc-alpha@sourceware.org>
List-Help: <mailto:libc-alpha-help@sourceware.org>, <http://sourceware.org/ml/#faqs>
Sender: libc-alpha-owner@sourceware.org
Received: (qmail 86055 invoked by uid 89); 15 Jan 2019 02:33:08 -0000
Authentication-Results: sourceware.org; auth=none
X-Spam-SWARE-Status: No, score=-0.9 required=5.0 tests=BAYES_00,KAM_LAZY_DOMAIN_SECURITY autolearn=no version=3.3.2 spammy=H*RU:HELO, Hx-spam-relays-external:HELO, acts, act
X-HELO: mga07.intel.com
Subject: Re: [PATCH] NUMA spinlock [BZ #23962]
To: Torvald Riegel <triegel@redhat.com>, Rich Felker <dalias@libc.org>,
 "H.J. Lu" <hjl.tools@gmail.com>
Cc: Ma Ling <ling.ma.program@gmail.com>,
 GNU C Library <libc-alpha@sourceware.org>, "Lu, Hongjiu"
 <hongjiu.lu@intel.com>, "ling.ma" <ling.ml@antfin.com>,
 Wei Xiao <wei3.xiao@intel.com>
References: <20181226025019.38752-1-ling.ma@MacBook-Pro-8.local>
 <20190103204338.GU23599@brightrain.aerifal.cx>
 <CAMe9rOoBhZmzuEoPGjhbxkYZ3mOaC-8tPUrSuhmcbnu8J19LpA@mail.gmail.com>
 <20190103212113.GV23599@brightrain.aerifal.cx>
 <5c2bf8859a412759aba26a21b317ea98f6ff8eaf.camel@redhat.com>
From: kemi <kemi.wang@intel.com>
Message-ID: <0b4620c1-a9c5-061e-9636-65d80655a6fd@intel.com>
Date: Tue, 15 Jan 2019 02:33:00 -0000
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101
 Thunderbird/60.2.1
MIME-Version: 1.0
In-Reply-To: <5c2bf8859a412759aba26a21b317ea98f6ff8eaf.camel@redhat.com>
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit
X-SW-Source: 2019-01/txt/msg00320.txt.bz2


>> "Scalable spinlock" is something of an oxymoron.
> 
> No, that's not true at all.  Most high-performance shared-memory
> synchronization constructs (on typical HW we have today) will do some kind
> of spinning (and back-off), and there's nothing wrong about it.  This can
> scale very well. 
> 
>> Spinlocks are for
>> situations where contention is extremely rare,
> 
> No, the question is rather whether the program needs blocking through the
> OS (for performance, or for semantics such as PI) or not.  Energy may be
> another factor.  For example, glibc's current mutexes don't scale well on
> short critical because there's not enough spinning being done.
> 

Yes. That is why we needed the pthread.mutex.spin_count tunable interface earlier.
But that is not enough. Once the spin count is no longer the bottleneck, the simple
busy-waiting algorithm of the current adaptive mutex becomes the major factor degrading
mutex performance. That is why I proposed an MCS-based spin-waiting algorithm for the
adaptive mutex.

https://sourceware.org/ml/libc-alpha/2019-01/msg00279.html
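
For readers unfamiliar with queue-based spinning, here is a minimal sketch of a
classic MCS lock using C11 atomics. It is not the code from the patch linked above.
The names mcs_lock, mcs_node, mcs_acquire, mcs_release and run_counter_demo are
made up for illustration, and sched_yield() stands in for a CPU pause hint so the
demo stays well-behaved when threads outnumber cores.

```c
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* One queue node per waiting thread; each waiter spins only on its own node. */
typedef struct mcs_node {
    _Atomic(struct mcs_node *) next;
    atomic_bool locked;
} mcs_node;

/* The lock itself is just a pointer to the tail of the waiter queue. */
typedef struct {
    _Atomic(mcs_node *) tail;
} mcs_lock;

static void mcs_acquire(mcs_lock *lock, mcs_node *self)
{
    atomic_store_explicit(&self->next, NULL, memory_order_relaxed);
    atomic_store_explicit(&self->locked, true, memory_order_relaxed);

    /* Append ourselves to the queue; the old tail is our predecessor. */
    mcs_node *prev = atomic_exchange_explicit(&lock->tail, self,
                                              memory_order_acq_rel);
    if (prev == NULL)
        return;                 /* queue was empty: we own the lock */

    /* Link behind the predecessor, then spin on our own flag only. */
    atomic_store_explicit(&prev->next, self, memory_order_release);
    while (atomic_load_explicit(&self->locked, memory_order_acquire))
        sched_yield();          /* assumption: stand-in for a pause hint */
}

static void mcs_release(mcs_lock *lock, mcs_node *self)
{
    mcs_node *succ = atomic_load_explicit(&self->next, memory_order_acquire);
    if (succ == NULL) {
        /* No visible successor: try to swing the tail back to empty. */
        mcs_node *expected = self;
        if (atomic_compare_exchange_strong_explicit(&lock->tail, &expected,
                                                    NULL,
                                                    memory_order_acq_rel,
                                                    memory_order_acquire))
            return;
        /* A successor is mid-enqueue; wait until it has linked itself. */
        while ((succ = atomic_load_explicit(&self->next,
                                            memory_order_acquire)) == NULL)
            sched_yield();
    }
    /* Hand the lock directly to the successor. */
    atomic_store_explicit(&succ->locked, false, memory_order_release);
}

/* Hypothetical demo: several threads increment a counter under the lock. */
static mcs_lock demo_lock;
static long demo_counter;
static int demo_iters;

static void *demo_worker(void *arg)
{
    (void)arg;
    mcs_node node;              /* the queue node lives on this thread's stack */
    for (int i = 0; i < demo_iters; i++) {
        mcs_acquire(&demo_lock, &node);
        demo_counter++;         /* very small critical section */
        mcs_release(&demo_lock, &node);
    }
    return NULL;
}

long run_counter_demo(int nthreads, int iters)
{
    pthread_t tids[64];
    if (nthreads > 64)
        nthreads = 64;
    atomic_store(&demo_lock.tail, NULL);
    demo_counter = 0;
    demo_iters = iters;
    for (int i = 0; i < nthreads; i++)
        pthread_create(&tids[i], NULL, demo_worker, NULL);
    for (int i = 0; i < nthreads; i++)
        pthread_join(tids[i], NULL);
    return demo_counter;
}
```

The point of the queue is that each waiter spins on its own node rather than on
the shared lock word, so a release touches only the successor's cache line.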

Also, for workloads with very small critical sections, this new mutex type,
selected with the GNU extension PTHREAD_MUTEX_QUEUESPINNER_NP, acts like an MCS
spinlock and performs much better than the original spinlock.

So some day, if the adaptive mutex is tuned well enough, it should act like an
MCS spinlock (or NUMA spinlock) when the workload has small critical sections, and
perform like a normal mutex when the critical section is too long to spin-wait on.
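
To illustrate the spin-then-block behaviour described here, the following is a
rough user-space sketch built only on standard pthread calls. It is not glibc's
internal adaptive mutex code and not the proposed queue-spinner implementation.
The names spin_then_block_lock and adaptive_counter_demo and the max_spins
parameter (playing the role of a spin count tunable) are assumptions made for
the example.

```c
#include <pthread.h>
#include <sched.h>

/* Try to take the mutex by spinning up to max_spins times, then fall back
   to a blocking lock.  Returns 0 if acquired while spinning, 1 if the
   caller had to block through the OS. */
int spin_then_block_lock(pthread_mutex_t *m, int max_spins)
{
    for (int i = 0; i < max_spins; i++) {
        if (pthread_mutex_trylock(m) == 0)
            return 0;           /* short critical section: spinning paid off */
        sched_yield();          /* assumption: stand-in for a pause hint */
    }
    pthread_mutex_lock(m);      /* critical section too long: block instead */
    return 1;
}

/* Hypothetical demo state: threads increment a shared counter. */
static pthread_mutex_t adaptive_mutex = PTHREAD_MUTEX_INITIALIZER;
static long adaptive_counter;
static int adaptive_iters;
static int adaptive_spins;

static void *adaptive_worker(void *arg)
{
    (void)arg;
    for (int i = 0; i < adaptive_iters; i++) {
        spin_then_block_lock(&adaptive_mutex, adaptive_spins);
        adaptive_counter++;
        pthread_mutex_unlock(&adaptive_mutex);
    }
    return NULL;
}

long adaptive_counter_demo(int nthreads, int iters, int max_spins)
{
    pthread_t tids[64];
    if (nthreads > 64)
        nthreads = 64;
    adaptive_counter = 0;
    adaptive_iters = iters;
    adaptive_spins = max_spins;
    for (int i = 0; i < nthreads; i++)
        pthread_create(&tids[i], NULL, adaptive_worker, NULL);
    for (int i = 0; i < nthreads; i++)
        pthread_join(tids[i], NULL);
    return adaptive_counter;
}
```

In this sketch every spinner retries on the same mutex, which is the simple
busy-waiting pattern; an MCS-style design would instead queue the spinners so
each one waits on its own location.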