From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from foss.arm.com (foss.arm.com [217.140.110.172]) by sourceware.org (Postfix) with ESMTP id 982E03858286; Fri, 15 Dec 2023 19:23:32 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 982E03858286 Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=foss.arm.com Authentication-Results: sourceware.org; spf=none smtp.mailfrom=foss.arm.com ARC-Filter: OpenARC Filter v1.0.0 sourceware.org 982E03858286 Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=217.140.110.172 ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1702668214; cv=none; b=YGKAOuHE5RV/GOVNaC5GrUZSa/oN1gqExd5MubiarMCswxFXHY85oVddtHQy+hPbk7xxiKGEoSaQux8akWMgz8y4t46IlnhIi3vihLWDORNeuYfo9hPA55Qs6S2WxbLET9Q8F5OgaTZoHO9vjTg8r8Za0DJqTTIwGawCllknpnk= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1702668214; c=relaxed/simple; bh=YPSCqPe9lN+wRt3MJpVY8dAGmcMnkqCdFw/hvJJedHM=; h=Message-ID:Date:MIME-Version:Subject:To:From; b=VbAdrRU2WCCqlE2pyvV7ix2adwaW2ZbCcfztQK2PtWuWYDB089CmYW5rL2OuIlhoYbx1TOyglfLv7NgufAOCI0hTnX2Jf/B2mwL4qFt25iNPxQaOnt3vpxptQG+/mCDdTetDSAGrQysOFmwVIzu7LkRskOQ+pmzxU6uEqBOHJNE= ARC-Authentication-Results: i=1; server2.sourceware.org Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 287F3C15; Fri, 15 Dec 2023 11:24:17 -0800 (PST) Received: from [10.57.2.196] (unknown [10.57.2.196]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id 6E32A3F738; Fri, 15 Dec 2023 11:23:30 -0800 (PST) Message-ID: <49b49e34-59b8-48f3-99cc-12133d76562c@foss.arm.com> Date: Fri, 15 Dec 2023 19:23:29 +0000 MIME-Version: 1.0 User-Agent: Mozilla Thunderbird Subject: Re: [PATCH v7] libgfortran: Replace mutex with rwlock Content-Language: en-GB To: Lipeng Zhu , "Richard Earnshaw (lists)" , jakub@redhat.com Cc: fortran@gcc.gnu.org, gcc-patches@gcc.gnu.org, hongjiu.lu@intel.com, pan.deng@intel.com, rep.dot.nop@gmail.com, tianyou.li@intel.com, tkoenig@netcologne.de, wangyang.guo@intel.com References: <20231209153944.3746165-1-lipeng.zhu@intel.com> <523c6fbf-344c-48b8-8508-7e2acb088606@arm.com> <8192480a-1dfc-4bc6-b141-0254364d3fba@intel.com> From: Richard Earnshaw In-Reply-To: <8192480a-1dfc-4bc6-b141-0254364d3fba@intel.com> Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 8bit X-Spam-Status: No, score=-3493.5 required=5.0 tests=BAYES_00,BODY_8BITS,GIT_PATCH_0,KAM_DMARC_STATUS,KAM_LAZY_DOMAIN_SECURITY,SPF_HELO_NONE,SPF_NONE,TXREP,T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org List-Id: On 15/12/2023 11:31, Lipeng Zhu wrote: > > > On 2023/12/14 23:50, Richard Earnshaw (lists) wrote: >> On 09/12/2023 15:39, Lipeng Zhu wrote: >>> This patch try to introduce the rwlock and split the read/write to >>> unit_root tree and unit_cache with rwlock instead of the mutex to >>> increase CPU efficiency. In the get_gfc_unit function, the percentage >>> to step into the insert_unit function is around 30%, in most instances, >>> we can get the unit in the phase of reading the unit_cache or unit_root >>> tree. So split the read/write phase by rwlock would be an approach to >>> make it more parallel. >>> >>> BTW, the IPC metrics can gain around 9x in our test >>> server with 220 cores. The benchmark we used is >>> https://github.com/rwesson/NEAT >>> >>> libgcc/ChangeLog: >>> >>>     * gthr-posix.h (__GTHREAD_RWLOCK_INIT): New macro. >>>     (__gthrw): New function. >>>     (__gthread_rwlock_rdlock): New function. >>>     (__gthread_rwlock_tryrdlock): New function. >>>     (__gthread_rwlock_wrlock): New function. >>>     (__gthread_rwlock_trywrlock): New function. >>>     (__gthread_rwlock_unlock): New function. >>> >>> libgfortran/ChangeLog: >>> >>>     * io/async.c (DEBUG_LINE): New macro. >>>     * io/async.h (RWLOCK_DEBUG_ADD): New macro. >>>     (CHECK_RDLOCK): New macro. >>>     (CHECK_WRLOCK): New macro. >>>     (TAIL_RWLOCK_DEBUG_QUEUE): New macro. >>>     (IN_RWLOCK_DEBUG_QUEUE): New macro. >>>     (RDLOCK): New macro. >>>     (WRLOCK): New macro. >>>     (RWUNLOCK): New macro. >>>     (RD_TO_WRLOCK): New macro. >>>     (INTERN_RDLOCK): New macro. >>>     (INTERN_WRLOCK): New macro. >>>     (INTERN_RWUNLOCK): New macro. >>>     * io/io.h (struct gfc_unit): Change UNIT_LOCK to UNIT_RWLOCK in >>>     a comment. >>>     (unit_lock): Remove including associated internal_proto. >>>     (unit_rwlock): New declarations including associated internal_proto. >>>     (dec_waiting_unlocked): Use WRLOCK and RWUNLOCK on unit_rwlock >>>     instead of __gthread_mutex_lock and __gthread_mutex_unlock on >>>     unit_lock. >>>     * io/transfer.c (st_read_done_worker): Use WRLOCK and RWUNLOCK on >>>     unit_rwlock instead of LOCK and UNLOCK on unit_lock. >>>     (st_write_done_worker): Likewise. >>>     * io/unit.c: Change UNIT_LOCK to UNIT_RWLOCK in 'IO locking rules' >>>     comment. Use unit_rwlock variable instead of unit_lock variable. >>>     (get_gfc_unit_from_unit_root): New function. >>>     (get_gfc_unit): Use RDLOCK, WRLOCK and RWUNLOCK on unit_rwlock >>>     instead of LOCK and UNLOCK on unit_lock. >>>     (close_unit_1): Use WRLOCK and RWUNLOCK on unit_rwlock instead of >>>     LOCK and UNLOCK on unit_lock. >>>     (close_units): Likewise. >>>     (newunit_alloc): Use RWUNLOCK on unit_rwlock instead of UNLOCK on >>>     unit_lock. >>>     * io/unix.c (find_file): Use RDLOCK and RWUNLOCK on unit_rwlock >>>     instead of LOCK and UNLOCK on unit_lock. >>>     (flush_all_units): Use WRLOCK and RWUNLOCK on unit_rwlock instead >>>     of LOCK and UNLOCK on unit_lock. >>> >> >> It looks like this has broken builds on arm-none-eabi when using newlib: >> >> In file included from >> /work/rearnsha/gnusrc/nightly/gcc-cross/master/libgfortran >> /runtime/error.c:27: >> /work/rearnsha/gnusrc/nightly/gcc-cross/master/libgfortran/io/io.h: In >> function >> ‘dec_waiting_unlocked’: >> /work/rearnsha/gnusrc/nightly/gcc-cross/master/libgfortran/io/io.h:1023:3: error >> : implicit declaration of function ‘WRLOCK’ >> [-Wimplicit-function-declaration] >>   1023 |   WRLOCK (&unit_rwlock); >>        |   ^~~~~~ >> /work/rearnsha/gnusrc/nightly/gcc-cross/master/libgfortran/io/io.h:1025:3: error >> : implicit declaration of function ‘RWUNLOCK’ >> [-Wimplicit-function-declaration] >>   1025 |   RWUNLOCK (&unit_rwlock); >>        |   ^~~~~~~~ >> >> >> R. > > Hi Richard, > > The root cause is that the macro WRLOCK and RWUNLOCK are not defined in > io.h. The reason of x86 platform not failed is that > HAVE_ATOMIC_FETCH_ADD is defined then caused above macros were never > been used. Code logic show as below: > #ifdef HAVE_ATOMIC_FETCH_ADD >   (void) __atomic_fetch_add (&u->waiting, -1, __ATOMIC_RELAXED); > #else >   WRLOCK (&unit_rwlock); >   u->waiting--; >   RWUNLOCK (&unit_rwlock); > #endif > > I just draft a patch try to fix this bug, because I didn't have arm > platform, would you help to validate if it was fixed on arm platform? > > diff --git a/libgfortran/io/io.h b/libgfortran/io/io.h > index 15daa0995b1..c7f0f7d7d9e 100644 > --- a/libgfortran/io/io.h > +++ b/libgfortran/io/io.h > @@ -1020,9 +1020,15 @@ dec_waiting_unlocked (gfc_unit *u) >  #ifdef HAVE_ATOMIC_FETCH_ADD >    (void) __atomic_fetch_add (&u->waiting, -1, __ATOMIC_RELAXED); >  #else > -  WRLOCK (&unit_rwlock); > +#ifdef __GTHREAD_RWLOCK_INIT > +  __gthread_rwlock_wrlock (&unit_rwlock); > +  u->waiting--; > +  __gthread_rwlock_unlock (&unit_rwlock); > +#else > +  __gthread_mutex_lock (&unit_rwlock); >    u->waiting--; > -  RWUNLOCK (&unit_rwlock); > +  __gthread_mutex_unlock (&unit_rwlock); > +#endif >  #endif >  } > > > Lipeng Zhu Hi Lipeng, Thanks for the quick reply. I can confirm that with the above change the bootstrap failure is fixed. However, this shouldn't be considered a formal review; libgfortran is not really my area. I'll be away now until January 2nd. Richard.