From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-wm1-x32c.google.com (mail-wm1-x32c.google.com [IPv6:2a00:1450:4864:20::32c]) by sourceware.org (Postfix) with ESMTPS id 3DE373858D33; Wed, 19 Apr 2023 21:49:37 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 3DE373858D33 Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=gmail.com Received: by mail-wm1-x32c.google.com with SMTP id ay3-20020a05600c1e0300b003f17289710aso197729wmb.5; Wed, 19 Apr 2023 14:49:37 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20221208; t=1681940976; x=1684532976; h=content-transfer-encoding:mime-version:message-id:references :in-reply-to:subject:cc:to:from:date:from:to:cc:subject:date :message-id:reply-to; bh=+48h9SZOIePZgpFQVq3jZdVBWnHL0UCbtNqwVOL9Vcs=; b=AJK81T5UMsEQC34dgY3RvFLskHF5u2cY6igHkw/AwwbSZdJk0xkNVDUmf56UI3J0XI VVItTxhBrFYGasHPbgsPP3aqU/q+RQzKvcXzbOwhUXbLDTcg3uGDW9041TnKVjuySstq AQquGs82hlj+awROqZgithaKzwtJOtxRFI8VWfcIVpkWLqTt7q8+GV9pOCLz6qloDkpm fmL2sgdWP28wanravEYsYf0f8RwtIPVlECSeHR9NMzAd2k2yeVHIpICmF1dnC4vY8ts4 sW/XbA/yEH9QcgowMvhRyaPS6ybv5WQCO7rBIKkKTMw3jF2MQWr2l84RO1Bv+FIuqIyd oPsQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1681940976; x=1684532976; h=content-transfer-encoding:mime-version:message-id:references :in-reply-to:subject:cc:to:from:date:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=+48h9SZOIePZgpFQVq3jZdVBWnHL0UCbtNqwVOL9Vcs=; b=dpnAgjNynZD50NZFuNtyKW2u2ktHPoyUl6T3FXkB5uZ+UUTJtRSEoaGKzmRv6riGXw nGp7vKDqzTOHA9oCsqjiQCXGX9UfJ1XP0iUmMiaJaoAoqjNqCTZmZLAIpwWUibKWaUgP ZDifyH01Y5duMvjV2y6mPHkIxm0S7Kd0P0ykB3YqQK4lqnVrFcQvZxiMuHKpe7My83Ck MgRJKnzYeR8JukZQD+6Jz1+v+h7Ru9pLR5nRRxjVvW1Oegkp4RqsTbexVmOQhMsH58Vn KjwIRuYU97p8sjCYk3DoTNHSGLQlDcvqr/XTMT+c6bT1i1B6LSxhS4Z0tyjP9Kmq2X3G u3Ng== X-Gm-Message-State: AAQBX9d58rEAJuAjWccHFXm1wA46jRddueBaruC+82BDyFUL6pnOVUP2 2iN/XDu4j1EyXgno9KlV9QM= X-Google-Smtp-Source: AKy350aNSShnuPqWbberajXFmAEZEStmB7v13NCsTKlFRg6lM7g6l3ZEYPRNcNdBn/bm0kBfKRKAJg== X-Received: by 2002:a05:600c:cb:b0:3f1:6f52:74d with SMTP id u11-20020a05600c00cb00b003f16f52074dmr10342653wmm.39.1681940975836; Wed, 19 Apr 2023 14:49:35 -0700 (PDT) Received: from [127.0.0.1] (80-110-214-113.static.upcbusiness.at. [80.110.214.113]) by smtp.gmail.com with ESMTPSA id z18-20020adfd0d2000000b002fae7408544sm142230wrh.108.2023.04.19.14.49.34 (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Wed, 19 Apr 2023 14:49:35 -0700 (PDT) Date: Wed, 19 Apr 2023 23:49:33 +0200 From: Bernhard Reutner-Fischer To: Lipeng Zhu , Lipeng Zhu via Fortran , tkoenig@netcologne.de, fortran@gcc.gnu.org, gcc-patches@gcc.gnu.org CC: hongjiu.lu@intel.com, tianyou.li@intel.com, pan.deng@intel.com, wangyang.guo@intel.com Subject: Re: [PATCH v3] libgfortran: Replace mutex with rwlock In-Reply-To: <20230419070628.1011624-1-lipeng.zhu@intel.com> References: <20221230001607.2232962-1-lipeng.zhu () intel ! com> <20230419070628.1011624-1-lipeng.zhu@intel.com> Message-ID: MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable X-Spam-Status: No, score=-9.3 required=5.0 tests=BAYES_00,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,FREEMAIL_FROM,GIT_PATCH_0,RCVD_IN_DNSWL_NONE,SPF_HELO_NONE,SPF_PASS,TXREP,T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org List-Id: On 19 April 2023 09:06:28 CEST, Lipeng Zhu via Fortran wrote: >This patch try to introduce the rwlock and split the read/write to >unit_root tree and unit_cache with rwlock instead of the mutex to >increase CPU efficiency=2E In the get_gfc_unit function, the percentage >to step into the insert_unit function is around 30%, in most instances, >we can get the unit in the phase of reading the unit_cache or unit_root >tree=2E So split the read/write phase by rwlock would be an approach to >make it more parallel=2E > >BTW, the IPC metrics can gain around 9x in our test >server with 220 cores=2E The benchmark we used is >https://github=2Ecom/rwesson/NEAT > >+#define RD_TO_WRLOCK(rwlock) \ >+ RWUNLOCK (rwlock);\ >+ WRLOCK (rwlock); >+#endif >+ >diff --git a/libgfortran/io/unit=2Ec b/libgfortran/io/unit=2Ec >index 82664dc5f98=2E=2E4312c5f36de 100644 >--- a/libgfortran/io/unit=2Ec >+++ b/libgfortran/io/unit=2Ec >@@ -329,7 +335,7 @@ get_gfc_unit (int n, int do_create) > int c, created =3D 0; >=20 > NOTE ("Unit n=3D%d, do_create =3D %d", n, do_create); >- LOCK (&unit_lock); >+ RDLOCK (&unit_rwlock); >=20 > retry: > for (c =3D 0; c < CACHE_SIZE; c++) >@@ -350,6 +356,7 @@ retry: > if (c =3D=3D 0) > break; > } >+ RD_TO_WRLOCK (&unit_rwlock); So I'm trying to convince myself why it's safe to unlock and only then tak= e the write lock=2E Can you please elaborate/confirm why that's ok? I wouldn't mind a comment like We can release the unit and cache read lock now=2E We might have to alloca= te a (locked) unit, below in do_create=2E Either way, we will manipulate the cache and/or the unit list so we have t= o take a write lock now=2E We don't take the write bit in *addition* to the read lock because=2E=2E (that needlessly complicates releasing the respective locks / it triggers = too much contention when we=2E=2E / =2E=2E=2E?) thanks, >=20 > if (p =3D=3D NULL && do_create) > { >@@ -368,8 +375,8 @@ retry: > if (created) > { > /* Newly created units have their lock held already >- from insert_unit=2E Just unlock UNIT_LOCK and return=2E */ >- UNLOCK (&unit_lock); >+ from insert_unit=2E Just unlock UNIT_RWLOCK and return=2E */ >+ RWUNLOCK (&unit_rwlock); > return p; > } >=20 >@@ -380,7 +387,7 @@ found: > if (! TRYLOCK (&p->lock)) > { > /* assert (p->closed =3D=3D 0); */ >- UNLOCK (&unit_lock); >+ RWUNLOCK (&unit_rwlock); > return p; > } >=20 >@@ -388,14 +395,14 @@ found: > } >=20 >=20 >- UNLOCK (&unit_lock); >+ RWUNLOCK (&unit_rwlock); >=20 > if (p !=3D NULL && (p->child_dtio =3D=3D 0)) > { > LOCK (&p->lock); > if (p->closed) > { >- LOCK (&unit_lock); >+ WRLOCK (&unit_rwlock); > UNLOCK (&p->lock); > if (predec_waiting_locked (p) =3D=3D 0) > destroy_unit_mutex (p); >@@ -593,8 +600,8 @@ init_units (void) > #endif > #endif >=20 >-#ifndef __GTHREAD_MUTEX_INIT >- __GTHREAD_MUTEX_INIT_FUNCTION (&unit_lock); >+#if (!defined(__GTHREAD_RWLOCK_INIT) && !defined(__GTHREAD_MUTEX_INIT)) >+ __GTHREAD_MUTEX_INIT_FUNCTION (&unit_rwlock); > #endif >=20 > if (sizeof (max_offset) =3D=3D 8)