From: Adhemerval Zanella <adhemerval.zanella@linaro.org>
To: libc-alpha@sourceware.org
Subject: Re: [PATCH 3/3] Mutex: Avoid useless spinning
Date: Thu, 05 Apr 2018 20:59:00 -0000
Message-ID: <44b112b8-829e-bcfd-4eac-fdd8362862ca@linaro.org>
In-Reply-To: <1522394093-9835-3-git-send-email-kemi.wang@intel.com>
References: <1522394093-9835-1-git-send-email-kemi.wang@intel.com>
 <1522394093-9835-3-git-send-email-kemi.wang@intel.com>

On 30/03/2018 04:14, Kemi Wang wrote:
> Usually, we can't set too short a timeout while spinning on the lock:
> that would probably make a thread trying to acquire the lock go to
> sleep quickly, which weakens the benefit of the pthread adaptive spin
> lock.
>
> However, there is also a problem if we set the timeout too large when
> protecting a small critical section under severe lock contention.  As
> the test results in the last patch show, performance is highly
> affected by the spin count tunable: the smaller the spin count, the
> better the performance improvement.  This is because, under severe
> lock contention, a thread probably spins on the lock until the timeout
> expires before going to sleep.
>
> In this patch, we avoid the useless spinning by making the spinner go
> to sleep if it fails to acquire the lock when the lock appears
> available, as suggested by Tim Chen.
>
> nr_threads    base          COUNT=1000 (head~1)   COUNT=1000 (head)
> 1             51644585      51323778 (-0.6%)      51378551 (-0.5%)
> 2              7914789       9867343 (+24.7%)     11503559 (+45.3%)
> 7              1687620       3430504 (+103.3%)     7817383 (+363.2%)
> 14             1026555       1843458 (+79.6%)      7360883 (+617.0%)
> 28              962001        681965 (-29.1%)      5681945 (+490.6%)
> 56              883770        364879 (-58.7%)      3416068 (+286.5%)
> 112            1150589        415261 (-63.9%)      3255700 (+183.0%)

As before [2], I checked the change on a 64-core aarch64 machine;
unlike the previous patch, this one does seem to show improvements:

nr_threads    base          head (SPIN_COUNT=10)     head (SPIN_COUNT=1000)
1             27566206      28776779 (4.206770)      28778073 (4.211078)
2              8498813       9129102 (6.904173)       7042975 (-20.670782)
7              5019434       5832195 (13.935765)      5098511 (1.550982)
14             4379155       6507212 (32.703053)      5200018 (15.785772)
28             4397464       4584480 (4.079329)       4456767 (1.330628)
56             4020956       3534899 (-13.750237)     4096197 (1.836850)

I would suggest squashing both patches into one for version 2.

> 
> Suggested-by: Tim Chen
> Signed-off-by: Kemi Wang
> ---
>  nptl/pthread_mutex_lock.c | 25 +++++++++++--------------
>  1 file changed, 11 insertions(+), 14 deletions(-)
> 
> diff --git a/nptl/pthread_mutex_lock.c b/nptl/pthread_mutex_lock.c
> index c3aca93..0faee1a 100644
> --- a/nptl/pthread_mutex_lock.c
> +++ b/nptl/pthread_mutex_lock.c
> @@ -127,22 +127,19 @@ __pthread_mutex_lock (pthread_mutex_t *mutex)
>        int cnt = 0;
>        int max_cnt = MIN (__mutex_aconf.spin_count,
>                           mutex->__data.__spins * 2 + 100);
> +
> +      /* MO read while spinning */
>        do
> -        {
> -          if (cnt >= max_cnt)
> -            {
> -              LLL_MUTEX_LOCK (mutex);
> -              break;
> -            }
> -          /* MO read while spinning */
> -          do
> -            {
> -              atomic_spin_nop ();
> -            }
> -          while (atomic_load_relaxed (&mutex->__data.__lock) != 0 &&
> +        {
> +          atomic_spin_nop ();
> +        }
> +      while (atomic_load_relaxed (&mutex->__data.__lock) != 0 &&
>               ++cnt < max_cnt);
> -        }
> -      while (LLL_MUTEX_TRYLOCK (mutex) != 0);
> +      /* Try to acquire the lock if lock is available or the spin count
> +       * is run out, go to sleep if fails
> +       */
> +      if (LLL_MUTEX_TRYLOCK (mutex) != 0)
> +        LLL_MUTEX_LOCK (mutex);
> 
>        mutex->__data.__spins += (cnt - mutex->__data.__spins) / 8;
>      }
> 

Please fix the formatting issue here.
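To make the resulting control flow concrete, below is a minimal,
self-contained sketch of the patched path using C11 atomics.  This is
not the glibc implementation: LLL_MUTEX_TRYLOCK and LLL_MUTEX_LOCK are
glibc-internal macros, and the try_lock/sleep_lock helpers here are
hypothetical stand-ins (glibc's slow path blocks on a futex rather
than yielding):

#include <stdatomic.h>
#include <stdbool.h>
#include <sched.h>

#if defined (__x86_64__) || defined (__i386__)
# define spin_pause() __builtin_ia32_pause () /* like atomic_spin_nop.  */
#else
# define spin_pause() ((void) 0)
#endif

/* Hypothetical stand-in for LLL_MUTEX_TRYLOCK: a single CAS attempt.  */
static bool
try_lock (atomic_int *lock)
{
  int expected = 0;
  return atomic_compare_exchange_strong (lock, &expected, 1);
}

/* Hypothetical stand-in for LLL_MUTEX_LOCK.  Real glibc blocks on a
   futex; yielding in a loop merely keeps this sketch runnable.  */
static void
sleep_lock (atomic_int *lock)
{
  int expected = 0;
  while (!atomic_compare_exchange_weak (lock, &expected, 1))
    {
      expected = 0;
      sched_yield ();
    }
}

/* The patched control flow: spin reading the lock word with relaxed
   loads (no read-modify-write traffic while the owner holds it), then
   make exactly one trylock attempt; if that attempt fails, go to
   sleep instead of spinning again.  */
static void
spin_then_sleep_lock (atomic_int *lock, int max_cnt)
{
  int cnt = 0;

  do
    spin_pause ();
  while (atomic_load_explicit (lock, memory_order_relaxed) != 0
         && ++cnt < max_cnt);

  if (!try_lock (lock))
    sleep_lock (lock);
}

The design point is that a waiter only issues a read-modify-write once
the relaxed loads suggest the lock is free (or the spin budget runs
out); if that single trylock loses the race, sleeping immediately
avoids the useless remainder of the spin budget that the commit
message describes.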