From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from esa4.mentor.iphmx.com (esa4.mentor.iphmx.com [68.232.137.252]) by sourceware.org (Postfix) with ESMTPS id 0E8DA385AC24 for ; Tue, 5 Oct 2021 14:27:56 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 0E8DA385AC24 Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=codesourcery.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=mentor.com IronPort-SDR: omNeQPS6K6ej+cVYcnSwmBbGeZTmnSNJOzBRzctuJFN/Wr1z5Scfz+twBlCO5oDdCuaiWUvkxv QVtWL20rnubY+HX9+7ZBqkuCvjsvl3Vr9UDLhmj6eunhwQuTDEiEYC9CAM66G39TqCgn8oA9dd gtXGlqi8IjT7JC9CeNQYvwX07NlFSV4D2eN7jzJD4oLkD7/WGD8rJz4soowi8WBeeaIK6TiECW D/IYxEaKY/bFPO+PemeE6KnrTsATFY0aUkLj+2h6YIC7n8FG76Gn8vIMHGBtyvo+ZSHybXdf4q kGU7CO3QN/3lSpVvnSMkHPwh X-IronPort-AV: E=Sophos;i="5.85,349,1624348800"; d="scan'208";a="66834995" Received: from orw-gwy-02-in.mentorg.com ([192.94.38.167]) by esa4.mentor.iphmx.com with ESMTP; 05 Oct 2021 06:27:55 -0800 IronPort-SDR: SngXR1TEsqqrc6uyYE0/fKFBTCE/A5n3nXRpIuS5Nuap8pBRgGAMG79ppplL9keaQDkFbwmMVn VGiByyNqaq40DO9i6KSVA8NOLt/RTyAAzN6cucJR+Thq3jg0gF9sKy1gVrAFy75GKpPYhEk9bu sqeErUAy2F6dq6iRKJK9HdQdSRy2cRlwlkW9lXT0RXtFrmnPdmB8tjAQxTTFikeE1dxzQLWWGT pQnlCgpxX5mcO5Evx3D96p43cAjPHMXjFNIFEKFR/hEx2AzX24orHCDz27bb9qiPP78ZgG6650 QX0= Subject: Re: [PATCH v6 2/2] BZ #17645, fix slow DSO sorting behavior in dynamic loader -- Algorithm changes To: Adhemerval Zanella , GNU C Library , Florian Weimer References: <30aa9fbd-fc08-9a6f-d34b-accb47abd8af@codesourcery.com> <334c6848-e805-4666-add5-e611f1d7c255@linaro.org> From: Chung-Lin Tang Message-ID: <4fefc9b5-64fa-1384-075c-4b3363d95941@codesourcery.com> Date: Tue, 5 Oct 2021 22:27:49 +0800 User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.13; rv:78.0) Gecko/20100101 Thunderbird/78.14.0 MIME-Version: 1.0 In-Reply-To: <334c6848-e805-4666-add5-e611f1d7c255@linaro.org> Content-Type: text/plain; charset="utf-8"; format=flowed Content-Language: en-US Content-Transfer-Encoding: 7bit X-ClientProxiedBy: SVR-ORW-MBX-07.mgc.mentorg.com (147.34.90.207) To svr-orw-mbx-02.mgc.mentorg.com (147.34.90.202) X-Spam-Status: No, score=-4.4 required=5.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS, KAM_DMARC_STATUS, NICE_REPLY_A, SPF_HELO_PASS, SPF_PASS, TXREP autolearn=no autolearn_force=no version=3.4.4 X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on server2.sourceware.org X-BeenThere: libc-alpha@sourceware.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Libc-alpha mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 05 Oct 2021 14:27:58 -0000 Hi Adhemerval, On 2021/8/11 5:08 AM, Adhemerval Zanella wrote: >> +void >> +_dl_sort_maps (struct link_map **maps, unsigned int nmaps, >> + unsigned int skip, bool for_fini) >> +{ >> + /* Index code for sorting algorithm currently in use. */ >> + static int32_t algorithm = 0; >> + if (__glibc_unlikely (algorithm == 0)) >> + algorithm = TUNABLE_GET (glibc, rtld, dynamic_sort, >> + int32_t, NULL); > This is not race free, a better alternative would be to add a new member on > the GLRO struct and a init function to setup the value after tunables > initialization. Also, but using a enumeration to define to possible values, > there is no need to add the __builtin_unreachable() below. I'm not sure why there can be a race with TUNABLE_GET(), since isn't all tunables set only once during initialization? That said, I agree that placing the algorithm code initializing separately in _dl_sort_maps_init looks better. The changes to this part (relative to my v6 patch) I see you have on the azanella/dso-opt branch: void +_dl_sort_maps_init (void) +{ + int32_t algorithm = TUNABLE_GET (glibc, rtld, dynamic_sort, int32_t, NULL); + GLRO(dl_dso_sort_algo) = algorithm == 1 ? dso_sort_algorithm_original + : dso_sort_algorithm_dfs; +} + +void _dl_sort_maps (struct link_map **maps, unsigned int nmaps, unsigned int skip, bool for_fini) { - /* Index code for sorting algorithm currently in use. */ - static int32_t algorithm = 0; - if (__glibc_unlikely (algorithm == 0)) - algorithm = TUNABLE_GET (glibc, rtld, dynamic_sort, - int32_t, NULL); - /* It can be tempting to use a static function pointer to store and call the current selected sorting algorithm routine, but experimentation shows that current processors still do not handle indirect branches @@ -291,12 +293,10 @@ PTR_MANGLE/DEMANGLE, further impairing performance of small, common input cases. A simple if-case with direct function calls appears to be the fastest. */ - if (__glibc_likely (algorithm == 1)) + if (GLRO(dl_dso_sort_algo) == dso_sort_algorithm_original) _dl_sort_maps_original (maps, nmaps, skip, for_fini); - else if (algorithm == 2) - _dl_sort_maps_dfs (maps, nmaps, skip, for_fini); else - __builtin_unreachable (); + _dl_sort_maps_dfs (maps, nmaps, skip, for_fini); } I'm not sure if defining 'enum dso_sort_algorithm' is really needed? Is __builtin_unreachable() that undesirable? Also, the final code should probably add __glibc_likely() to the default algorithm case. Thanks, Chung-Lin