From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 22564 invoked by alias); 27 Jun 2012 21:45:26 -0000 Received: (qmail 22543 invoked by uid 22791); 27 Jun 2012 21:45:25 -0000 X-SWARE-Spam-Status: No, hits=-4.6 required=5.0 tests=AWL,BAYES_00,KHOP_RCVD_UNTRUST,KHOP_THREADED,RCVD_IN_HOSTKARMA_W,RCVD_IN_HOSTKARMA_WL,TW_FN X-Spam-Check-By: sourceware.org Received: from relay1.mentorg.com (HELO relay1.mentorg.com) (192.94.38.131) by sourceware.org (qpsmtpd/0.43rc1) with ESMTP; Wed, 27 Jun 2012 21:45:11 +0000 Received: from svr-orw-exc-10.mgc.mentorg.com ([147.34.98.58]) by relay1.mentorg.com with esmtp id 1Sk02w-0003Na-Po from Maxim_Kuvyrkov@mentor.com ; Wed, 27 Jun 2012 14:45:10 -0700 Received: from SVR-IES-FEM-01.mgc.mentorg.com ([137.202.0.104]) by SVR-ORW-EXC-10.mgc.mentorg.com with Microsoft SMTPSVC(6.0.3790.4675); Wed, 27 Jun 2012 14:44:25 -0700 Received: from [127.0.0.1] (137.202.0.76) by SVR-IES-FEM-01.mgc.mentorg.com (137.202.0.104) with Microsoft SMTP Server id 14.1.289.1; Wed, 27 Jun 2012 22:45:08 +0100 Subject: Re: [PATCH] Optimize libc_lock_lock for MIPS XLP. MIME-Version: 1.0 (Apple Message framework v1278) Content-Type: text/plain; charset="iso-8859-1" From: Maxim Kuvyrkov In-Reply-To: Date: Wed, 27 Jun 2012 21:45:00 -0000 CC: Chris Metcalf , Tom de Vries Content-Transfer-Encoding: quoted-printable Message-ID: <15EB7E17-5692-4221-A1B1-FC16EA236BFF@codesourcery.com> References: <4FD9DB74.8080905@tilera.com> <40CBC472-71CC-4FF3-A452-073B76701215@codesourcery.com> <4FDAA190.3050706@tilera.com> To: "Joseph S. Myers" , GLIBC Devel , Mailing-List: contact libc-ports-help@sourceware.org; run by ezmlm Precedence: bulk List-Id: List-Subscribe: List-Post: List-Help: , Sender: libc-ports-owner@sourceware.org X-SW-Source: 2012-06/txt/msg00092.txt.bz2 On 15/06/2012, at 2:49 PM, Maxim Kuvyrkov wrote: > On 15/06/2012, at 2:44 PM, Chris Metcalf wrote: >=20 >> On 6/14/2012 9:20 PM, Maxim Kuvyrkov wrote: > ... >>> As I read it, in case of a contended lock __lll_lock_wait will reset th= e value of the lock to "2" before calling lll_futex_wait(). I agree that t= here is a timing window in which the other threads will see a value of the = lock greater than "2", but the value will not get as high as hundreds or bi= llions as it will be constantly reset to "2" in atomic_exchange in lll_lock= _wait(). >>>=20 >>> I do not see how threads will get into a busywait state, though. Would= you please elaborate on that? >>=20 >> You are correct. I was thinking the that the while loop had a cmpxchg, = but >> since it's just a straight-up exchange, the flow will be something like: >>=20 >> - Fail to initially call lll_futex_wait() if the lock is contended >> - Fall through to while loop >> - Spin as long as the lock is contended enough that *futex > 2 >> - Enter futex_wait >>=20 >> So a little busy under high contention, but probably settles out reasona= bly >> well. >=20 Attached is an improved patch that also optimizes __libc_lock_trylock using= XLP's atomic instructions. The patch also removes unnecessary indirection step represented by new macr= os lll_add_lock, which is then used to define __libc_lock_lock, and defines= __libc_lock_lock and __libc_lock_trylock directly in lowlevellock.h . Thi= s makes changes outside of ports/ trivial. Tested on MIPS XLP with no regressions. OK to apply for 2.17? -- Maxim Kuvyrkov CodeSourcery / Mentor Graphics Allow overrides of __libc_lock_lock and __libc_lock_trylock. * nptl/sysdeps/pthread/bits/libc-lockP.h (__libc_lock_lock) (__libc_lock_trylock): Allow pre-existing definitions. --- nptl/sysdeps/pthread/bits/libc-lockP.h | 8 ++++++-- 1 files changed, 6 insertions(+), 2 deletions(-) diff --git a/nptl/sysdeps/pthread/bits/libc-lockP.h b/nptl/sysdeps/pthread/= bits/libc-lockP.h index 0ebac91..9c61662 100644 --- a/nptl/sysdeps/pthread/bits/libc-lockP.h +++ b/nptl/sysdeps/pthread/bits/libc-lockP.h @@ -176,8 +176,10 @@ typedef pthread_key_t __libc_key_t; =20 /* Lock the named lock variable. */ #if !defined NOT_IN_libc || defined IS_IN_libpthread -# define __libc_lock_lock(NAME) \ +# ifndef __libc_lock_lock +# define __libc_lock_lock(NAME) \ ({ lll_lock (NAME, LLL_PRIVATE); 0; }) +# endif #else # define __libc_lock_lock(NAME) \ __libc_maybe_call (__pthread_mutex_lock, (&(NAME)), 0) @@ -189,8 +191,10 @@ typedef pthread_key_t __libc_key_t; =20 /* Try to lock the named lock variable. */ #if !defined NOT_IN_libc || defined IS_IN_libpthread -# define __libc_lock_trylock(NAME) \ +# ifndef __libc_lock_trylock +# define __libc_lock_trylock(NAME) \ lll_trylock (NAME) +# endif #else # define __libc_lock_trylock(NAME) \ __libc_maybe_call (__pthread_mutex_trylock, (&(NAME)), 0) --=20 1.7.4.1 Optimize libc_lock_lock for XLP. 2012-06-28 Tom de Vries Maxim Kuvyrkov * sysdeps/unix/sysv/linux/mips/nptl/lowlevellock.h (__libc_lock_lock) (__libc_lock_trylock): Define for XLP. --- sysdeps/unix/sysv/linux/mips/nptl/lowlevellock.h | 39 ++++++++++++++++++= ++- 1 files changed, 37 insertions(+), 2 deletions(-) diff --git a/sysdeps/unix/sysv/linux/mips/nptl/lowlevellock.h b/sysdeps/uni= x/sysv/linux/mips/nptl/lowlevellock.h index 88b601e..a441e6b 100644 --- a/sysdeps/unix/sysv/linux/mips/nptl/lowlevellock.h +++ b/sysdeps/unix/sysv/linux/mips/nptl/lowlevellock.h @@ -1,5 +1,4 @@ -/* Copyright (C) 2003, 2004, 2005, 2006, 2007, 2008, - 2009 Free Software Foundation, Inc. +/* Copyright (C) 2003-2012 Free Software Foundation, Inc. This file is part of the GNU C Library. =20 The GNU C Library is free software; you can redistribute it and/or @@ -291,4 +290,40 @@ extern int __lll_timedwait_tid (int *, const struct ti= mespec *) __res; \ }) =20 +#ifdef _MIPS_ARCH_XLP +/* Implement __libc_lock_lock using exchange_and_add, which expands into + a single LDADD instruction on XLP. This is a simplified expansion of + ({ lll_lock (NAME, LLL_PRIVATE); 0; }). + + __lll_lock_wait_private() resets lock value to '2', which prevents unbo= unded + increase of the lock value and [with billions of threads] overflow. + + As atomic.h currently only supports a full-barrier atomic_exchange_and_= add, + using a full-barrier operation instead of an acquire-barrier operation = is + not beneficial for MIPS in general. Limit this optimization to XLP for + now. */ +# define __libc_lock_lock(NAME) \ + ({ \ + int *__futex =3D &(NAME); \ + if (__builtin_expect (atomic_exchange_and_add (__futex, 1), 0)) \ + __lll_lock_wait_private (__futex); \ + 0; \ + }) + +# define __libc_lock_trylock(NAME) \ + ({ \ + int *__futex =3D &(NAME); \ + int __result; \ + if (atomic_exchange_and_add (__futex, 1) =3D=3D 0) \ + __result =3D 0; \ + else \ + /* The lock is already locked. Set it to 'contended' state to avoid \ + unbounded increase from subsequent trylocks. This slightly degrade= s \ + performance of locked-but-uncontended case, as lll_futex_wake() wil= l be \ + called unnecessarily. */ \ + __result =3D (atomic_exchange_acq (__futex, 2) !=3D 0); \ + __result; \ + }) +#endif + #endif /* lowlevellock.h */ --=20 1.7.4.1