Message-ID: <90d205a6-d38a-49d6-ba66-99202e566016@intel.com>
Date: Fri, 22 Dec 2023 14:48:54 +0800
Subject: Re: RE: [PATCH v7] libgfortran: Replace mutex with rwlock
To: Thomas Schwinge
Cc: Jakub Jelinek, fortran@gcc.gnu.org, gcc-patches@gcc.gnu.org, "Deng, Pan", rep.dot.nop@gmail.com, "Li, Tianyou", tkoenig@netcologne.de, "Guo, Wangyang", Tobias Burnus, "H.J. Lu"
References: <20231209153944.3746165-1-lipeng.zhu@intel.com> <87sf45su42.fsf@euler.schwinge.homeip.net> <87bkajsrx4.fsf@euler.schwinge.homeip.net>
From: Lipeng Zhu
Organization: Intel
In-Reply-To: <87bkajsrx4.fsf@euler.schwinge.homeip.net>

Hi Thomas,

On 2023/12/21 19:42, Thomas Schwinge wrote:
> Hi!
>
> On 2023-12-13T21:52:29+0100, I wrote:
>> On 2023-12-12T02:05:26+0000, "Zhu, Lipeng" wrote:
>>> On 2023/12/12 1:45, H.J. Lu wrote:
>>>> On Sat, Dec 9, 2023 at 7:25 PM Zhu, Lipeng wrote:
>>>>> On 2023/12/9 23:23, Jakub Jelinek wrote:
>>>>>> On Sat, Dec 09, 2023 at 10:39:45AM -0500, Lipeng Zhu wrote:
>>>>>>> This patch try to introduce the rwlock and split the read/write to
>>>>>>> unit_root tree and unit_cache with rwlock instead of the mutex to
>>>>>>> increase CPU efficiency. In the get_gfc_unit function, the
>>>>>>> percentage to step into the insert_unit function is around 30%, in
>>>>>>> most instances, we can get the unit in the phase of reading the
>>>>>>> unit_cache or unit_root tree. So split the read/write phase by
>>>>>>> rwlock would be an approach to make it more parallel.
>>>>>>>
>>>>>>> BTW, the IPC metrics can gain around 9x in our test server with
>>>>>>> 220 cores. The benchmark we used is
>>>>>>> https://github.com/rwesson/NEAT
>>
>>>>>> Ok for trunk, thanks.
>>
>>>>> Thanks! Looking forward to landing to trunk.
>>
>>>> Pushed for you.
>
>> I've just filed
>> "'libgomp.fortran/rwlock_1.f90', 'libgomp.fortran/rwlock_3.f90' execution test timeouts".
>> Would you be able to look into that?
>
> See my update in there.
>
>
> Grüße
>  Thomas
> --------------
> Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht München, HRB 106955
>

Since I don't have a GCC Bugzilla account, I am replying in this thread.

Would limiting the tests to some lower 'OMP_NUM_THREADS' be an option, or
should the execution timeout be increased? That said, I can't reproduce the
execution timeout failure on either POWER9 or POWER10 machines. I also tried
decreasing the CPU frequency from 2.6 GHz to 800 MHz, and the test cases
still ran successfully.

> so only a little bit of an improvement of the new "rwlock" libgfortran vs. old "mutex" GCC 10 one, curiously. (But supposedly that depends on the hardware or other factors?)

The rwlock can increase the IPC a lot, but that gain may not show up
clearly in the wall time you listed.

$ grep ^cpu < /proc/cpuinfo | uniq -c
    192 cpu : POWER10 (architected), altivec supported

Native configuration is powerpc64le-unknown-linux-gnu
Schedule of variations:
    unix

PASS: libgomp.fortran/rwlock_1.f90 -O0 (test for excess errors)
PASS: libgomp.fortran/rwlock_1.f90 -O0 execution test
PASS: libgomp.fortran/rwlock_1.f90 -O1 (test for excess errors)
PASS: libgomp.fortran/rwlock_1.f90 -O1 execution test
PASS: libgomp.fortran/rwlock_1.f90 -O2 (test for excess errors)
PASS: libgomp.fortran/rwlock_1.f90 -O2 execution test
PASS: libgomp.fortran/rwlock_1.f90 -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions (test for excess errors)
PASS: libgomp.fortran/rwlock_1.f90 -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions execution test
PASS: libgomp.fortran/rwlock_1.f90 -O3 -g (test for excess errors)
PASS: libgomp.fortran/rwlock_1.f90 -O3 -g execution test
PASS: libgomp.fortran/rwlock_1.f90 -Os (test for excess errors)
PASS: libgomp.fortran/rwlock_1.f90 -Os execution test
PASS: libgomp.fortran/rwlock_2.f90 -O0 (test for excess errors)
PASS: libgomp.fortran/rwlock_2.f90 -O0 execution test
PASS: libgomp.fortran/rwlock_2.f90 -O1 (test for excess errors)
PASS: libgomp.fortran/rwlock_2.f90 -O1 execution test
PASS: libgomp.fortran/rwlock_2.f90 -O2 (test for excess errors)
PASS: libgomp.fortran/rwlock_2.f90 -O2 execution test
PASS: libgomp.fortran/rwlock_2.f90 -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions (test for excess errors)
PASS: libgomp.fortran/rwlock_2.f90 -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions execution test
PASS: libgomp.fortran/rwlock_2.f90 -O3 -g (test for excess errors)
PASS: libgomp.fortran/rwlock_2.f90 -O3 -g execution test
PASS: libgomp.fortran/rwlock_2.f90 -Os (test for excess errors)
PASS: libgomp.fortran/rwlock_2.f90 -Os execution test
PASS: libgomp.fortran/rwlock_3.f90 -O0 (test for excess errors)
PASS: libgomp.fortran/rwlock_3.f90 -O0 execution test
PASS: libgomp.fortran/rwlock_3.f90 -O1 (test for excess errors)
PASS: libgomp.fortran/rwlock_3.f90 -O1 execution test
PASS: libgomp.fortran/rwlock_3.f90 -O2 (test for excess errors)
PASS: libgomp.fortran/rwlock_3.f90 -O2 execution test
PASS: libgomp.fortran/rwlock_3.f90 -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions (test for excess errors)
PASS: libgomp.fortran/rwlock_3.f90 -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions execution test

Lipeng Zhu