From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Wed, 10 Nov 2021 08:26:09 -0600
Subject: Re: [PATCH v4 0/3] Optimize CAS [BZ #28537]
To: "H.J. Lu", libc-alpha@sourceware.org
Cc: Florian Weimer, Arjan van de Ven, Andreas Schwab
References: <20211110001614.2087610-1-hjl.tools@gmail.com>
From: Paul E Murphy
In-Reply-To: <20211110001614.2087610-1-hjl.tools@gmail.com>
List-Id: Libc-alpha mailing list

On 11/9/21 6:16 PM, H.J. Lu via Libc-alpha wrote:
> CAS instruction is expensive. From the x86 CPU's point of view, getting
> a cache line for writing is more expensive than reading. See Appendix
> A.2 Spinlock in:
>
> https://www.intel.com/content/dam/www/public/us/en/documents/white-papers/xeon-lock-scaling-analysis-paper.pdf
>
> The full compare and swap will grab the cache line exclusive and cause
> excessive cache line bouncing.
>
> Optimize CAS in low level locks and pthread_mutex_lock.c:
>
> 1. Do an atomic load and skip CAS if compare may fail to reduce cache
> line bouncing on contended locks.
> 2. Replace atomic_compare_and_exchange_bool_acq with
> atomic_compare_and_exchange_val_acq to avoid the extra load.
> 3. Drop __glibc_unlikely in __lll_trylock and lll_cond_trylock since we
> don't know if it's actually rare; in the contended case it is clearly not
> rare.

Are you able to share benchmarks of this change? I am curious what
effects this might have on other platforms.

>
> This is the first patch set to optimize CAS. I will investigate the rest
> CAS usages in glibc after this patch set has been accepted.
>
> H.J. Lu (3):
> Reduce CAS in low level locks [BZ #28537]
> Reduce CAS in __pthread_mutex_lock_full [BZ #28537]
> Optimize CAS in __pthread_mutex_lock_full [BZ #28537]
>
> nptl/lowlevellock.c | 12 ++++-----
> nptl/pthread_mutex_lock.c | 53 ++++++++++++++++++++++++++++---------
> sysdeps/nptl/lowlevellock.h | 29 +++++++++++++-------
> 3 files changed, 67 insertions(+), 27 deletions(-)
>
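For anyone comparing platforms, item 1 in the quoted list is the usual
"load before CAS" idea. Below is a minimal sketch of it, assuming C11
<stdatomic.h> rather than glibc's internal atomic macros. The demo_*
names and the 0/1 lock-word encoding are hypothetical and are not taken
from the patch set.

```c
#include <stdatomic.h>

/* Hypothetical lock word for illustration: 0 = unlocked, 1 = locked. */
typedef struct { atomic_int word; } demo_lock;

static void demo_lock_init(demo_lock *l)
{
    atomic_init(&l->word, 0);
}

/* Plain CAS trylock: every attempt asks for the cache line in
   exclusive state, even when the lock is already held.
   Returns 0 on success, 1 if the lock was busy. */
static int demo_trylock_cas(demo_lock *l)
{
    int expected = 0;
    return atomic_compare_exchange_strong_explicit(
               &l->word, &expected, 1,
               memory_order_acquire, memory_order_relaxed) ? 0 : 1;
}

/* Load-first trylock: a relaxed load lets a contended caller bail out
   without attempting the CAS at all.  The load is only a hint; the CAS
   still decides who actually gets the lock. */
static int demo_trylock_load_first(demo_lock *l)
{
    if (atomic_load_explicit(&l->word, memory_order_relaxed) != 0)
        return 1;               /* looks held: skip the CAS */
    return demo_trylock_cas(l);
}

static void demo_unlock(demo_lock *l)
{
    atomic_store_explicit(&l->word, 0, memory_order_release);
}
```

Both variants return the same results. The difference is that the
load-first version performs no write attempt when the lock is visibly
held. Whether that helps on a given architecture is the question the
benchmarks would answer.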