From mboxrd@z Thu Jan 1 00:00:00 1970
From: "naohirot@fujitsu.com"
To: 'Wilco Dijkstra'
CC: 'GNU C Library', Szabolcs Nagy
Subject: RE: [PATCH 0/5] Added optimized memcpy/memmove/memset for A64FX
Date: Tue, 13 Apr 2021 12:07:37 +0000
List-Id: Libc-alpha mailing list

Hi Wilco-san,

Thanks for the comments.

I've been continuously updating the first patch since I posted it on Mar. 17 2021,
and have fixed some bugs.
Here is my local repository's commit history:
https://github.com/NaohiroTamura/glibc/commits/patch-20210317

I answer your comments referring to the latest source code above and
the benchtests graphs uploaded to Google Drive.

> From: Wilco Dijkstra
>
> 1. Overall the code is too large due to enormous unroll factors
>
> Our current memcpy is about 300 bytes (that includes memmove), this memcpy is
> ~12 times larger!
> This hurts performance due to the code not fitting in the I-cache for common
> copies.

OK, I'll try to remove unnecessary code which doesn't contribute to a performance
gain, based on benchtests performance data.

> On a modern OoO core you need very little unrolling since ALU operations and
> branches become essentially free while the CPU executes loads and stores. So
> rather than unrolling by 32-64 times, try 4 times - you just need enough to hide the
> taken branch latency.
>

In terms of loop unrolling, I tested several cases in my local environment.
Here is the result.
The source code is based on the latest commit of the branch patch-20210317 in my GitHub repository.
[1] https://github.com/NaohiroTamura/glibc/blob/ec0b55a855529f75bd6f280e59dc2b1c25640490/sysdeps/aarch64/multiarch/memcpy_a64fx.S
[2] https://github.com/NaohiroTamura/glibc/blob/ec0b55a855529f75bd6f280e59dc2b1c25640490/sysdeps/aarch64/multiarch/memset_a64fx.S

Memcpy/memmove uses 8, 4, 2 unrolls, and memset uses 32, 8, 4, 2 unrolls.
This unroll configuration recorded the highest performance.
Memcpy   35 Gbps/sec [3]
Memmove  70 Gbps/sec [4]
Memset   70 Gbps/sec [5]

[3] https://drive.google.com/file/d/1Xz04kV-S1E4tKOKLJRl8KgO8ZdCQqv1O/view
[4] https://drive.google.com/file/d/1QDmt7LMscXIJSpaq2sPOiCKl3nxcLxwk/view
[5] https://drive.google.com/file/d/1rpy7rkIskRs6czTARNIh4yCeh8d-L-cP/view

When memcpy/memmove uses only 4 unrolls and memset uses only 4 unrolls,
the performance degraded by 5 to 15 Gbps/sec at the peak.
Memcpy   30 Gbps/sec [6]
Memmove  65 Gbps/sec [7]
Memset   45 Gbps/sec [8]

[6] https://drive.google.com/file/d/1P-QJGeuHPlfj3ax8GlxRShV0_HVMJWGc/view
[7] https://drive.google.com/file/d/1R2IK5eWr8NEduNnvqkdPZyoNE0oImRcp/view
[8] https://drive.google.com/file/d/1WMZFjzF5WgmfpXSOnAd9YMjLqv1mcsEm/view

> 2. I don't see any special handling for small copies
>
> Even if you want to hyper optimize gigabyte sized copies, small copies are still
> extremely common, so you always want to handle those as quickly (and with as
> little code) as possible. Special casing small copies does not slow down the huge
> copies - the reverse is more likely since you no longer need to handle small cases.
>

Yes, I implemented special handling for copies of 1 byte to 512 bytes [9][10].
SVE code seems faster than ASIMD in the small/medium range too [11][12][13].

[9] https://github.com/NaohiroTamura/glibc/blob/ec0b55a855529f75bd6f280e59dc2b1c25640490/sysdeps/aarch64/multiarch/memcpy_a64fx.S#L176-L267
[10] https://github.com/NaohiroTamura/glibc/blob/ec0b55a855529f75bd6f280e59dc2b1c25640490/sysdeps/aarch64/multiarch/memset_a64fx.S#L68-L78
[11] https://drive.google.com/file/d/1VgkFTrWgjFMQ35btWjqHJbEGMgb3ZE-h/view
[12] https://drive.google.com/file/d/1SJ-WMUEEX73SioT9F7tVEIc4iRa8SfjU/view
[13] https://drive.google.com/file/d/1DPPgh2r6t16Ppe0Cpo5XzkVqWA_AVRUc/view

> 3. Check whether using SVE helps small/medium copies
>
> Run memcpy-random benchmark to see whether it is faster to use SVE for small
> cases or just the SIMD copy on your uarch.
>

Thanks for the memcpy-random benchmark info.
For small/medium copies, I needed to remove the BTI macro from ASM ENTRY in order
to see the distinct performance difference between ASIMD and SVE.
I'll post the patch [14] with the A64FX second patch.

Also, on both the A64FX and ThunderX2 machines, memcpy-random
didn't start due to an mprotect error.
I needed to fix memcpy-random [15].
If this is not wrong, I'll post the patch [15] with the A64FX second patch.

[14] https://github.com/NaohiroTamura/glibc/commit/07ea389846c7c63622b6c0b3aaead3f93e21f356
[15] https://github.com/NaohiroTamura/glibc/commit/ec0b55a855529f75bd6f280e59dc2b1c25640490

> 4. Avoid making the code too general or too specialistic
>
> I see both appearing in the code - trying to deal with different cacheline sizes and
> different vector lengths, and also splitting these out into separate cases. If you
> depend on a particular cacheline size, specialize the code for that and check the
> size in the ifunc selector (as various memsets do already). If you want to handle
> multiple vector sizes, just use a register for the increment rather than repeating
> the same code several times for each vector length.
>

In terms of the cache line size, A64FX is not configurable; it is fixed to 256 bytes.
I've already removed the code to get it [16][17].

[16] https://github.com/NaohiroTamura/glibc/commit/4bcc6d83c970f7a7283abfec753ecf6b697cf6f7
[17] https://github.com/NaohiroTamura/glibc/commit/f2b2c1ca03b50d414e03411ed65e4b131615e865

In terms of Vector Length, I'll remove the code for VL 256 bit and 128 bit,
because Vector Length agnostic code can cover both cases.

> 5. Odd prefetches
>
> I have a hard time believing first prefetching the data to be written, then clearing it
> using DC ZVA (???), then prefetching the same data a 2nd time, before finally
> write the loaded data is helping performance...
> Generally hardware prefetchers are able to do exactly the right thing since
> memcpy is trivial to prefetch.
> So what is the performance gain of each prefetch/clear step? What is the
> difference between memcpy and memmove performance (given memmove
> doesn't do any of this)?

Sorry, the memcpy prefetch code was not right. I noticed this bug and fixed it
soon after posting the first patch [18].
Basically "prfm pstl1keep, [dest_ptr, tmp1]" should be "prfm pldl2keep, [src_ptr, tmp1]".

[18] https://github.com/NaohiroTamura/glibc/commit/f5bf15708830f91fb886b15928158db2e875ac88

Without DC ZVA and L2 prefetch, memcpy and memset performance degraded over 4MB.
Please compare [19] with [22] for memcpy, and [21] with [24] for memset.

Without DC ZVA and L2 prefetch, memmove didn't degrade over 4MB.
Please compare [20] with [23].

The reason why I didn't implement DC ZVA and L2 prefetch in memmove is that
memmove calls memcpy in most cases, and the memmove code only handles backward copy.
Maybe most of the memmove-large benchtest cases are backward copy; I need to check.

DC ZVA and L2 prefetch have to be paired; DC ZVA alone or L2 prefetch alone
doesn't give any improvement.

With DC ZVA and L2 prefetch:
[19] https://drive.google.com/file/d/1mmYaLwzEoytBJZ913jaWmucL0j564Ta7/view
[20] https://drive.google.com/file/d/1Bc_DVGBcDRpvDjxCB_2yOk3MOy5BEiOs/view
[21] https://drive.google.com/file/d/19cHvU2lxF28DW9_Z5_5O6gOOdUmVz_ps/view

Without DC ZVA and L2 prefetch:
[22] https://drive.google.com/file/d/1My6idNuQsrsPVODl0VrqiRbMR9yKGsGS/view
[23] https://drive.google.com/file/d/1q8KhvIqDf27fJ8HGWgjX0nBhgPgGBg_T/view
[24] https://drive.google.com/file/d/1l6pDhuPWDLy5egQ6BhRIYRvshvDeIrGl/view

Thanks.
Naohiro
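
As a rough illustration of the "use a register for the increment" point in
comment 4 above, here is a minimal C sketch of a vector-length-agnostic copy
loop. It is only a portable model, not the A64FX assembly: the function name
copy_vla and the parameter vl (standing in for the hardware vector length in
bytes, which real SVE code would read at run time) are hypothetical, and the
inner memcpy calls merely stand in for one vector load/store pair.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model of a vector-length-agnostic copy.
   `vl` plays the role of the vector length held in a register, so the
   same loop body serves every vector length instead of being repeated
   once per length.  */
static void
copy_vla (unsigned char *dst, const unsigned char *src, size_t n, size_t vl)
{
  assert (vl > 0);

  size_t i = 0;

  /* Main loop: one full "vector" per iteration; the step is a run-time
     value rather than a constant baked into duplicated code paths.  */
  while (n - i >= vl)
    {
      memcpy (dst + i, src + i, vl);
      i += vl;
    }

  /* Tail: models a final, partially active (predicated) vector.  */
  if (i < n)
    memcpy (dst + i, src + i, n - i);
}
```

Calling copy_vla with vl set to 16, 32 or 64 exercises the same code path in
each case, which is the property that would make separate VL 128 bit and
256 bit variants unnecessary.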