public inbox for libc-alpha@sourceware.org
 help / color / mirror / Atom feed
From: Adhemerval Zanella <adhemerval.zanella@linaro.org>
To: Chung-Lin Tang <cltang@codesourcery.com>,
	GNU C Library <libc-alpha@sourceware.org>,
	Florian Weimer <fweimer@redhat.com>
Subject: Re: [PATCH v6 2/2] BZ #17645, fix slow DSO sorting behavior in dynamic loader -- Algorithm changes
Date: Tue, 5 Oct 2021 14:15:52 -0300	[thread overview]
Message-ID: <d4360317-23cd-4eea-2607-fafa90f21666@linaro.org> (raw)
In-Reply-To: <4fefc9b5-64fa-1384-075c-4b3363d95941@codesourcery.com>



On 05/10/2021 11:27, Chung-Lin Tang wrote:
> Hi Adhemerval,
> 
> On 2021/8/11 5:08 AM, Adhemerval Zanella wrote:
>>> +void
>>> +_dl_sort_maps (struct link_map **maps, unsigned int nmaps,
>>> +           unsigned int skip, bool for_fini)
>>> +{
>>> +  /* Index code for sorting algorithm currently in use.  */
>>> +  static int32_t algorithm = 0;
>>> +  if (__glibc_unlikely (algorithm == 0))
>>> +    algorithm = TUNABLE_GET (glibc, rtld, dynamic_sort,
>>> +                 int32_t, NULL);
>> This is not race free, a better alternative would be to add a new member on
>> the GLRO struct and a init function to setup the value after tunables
>> initialization.  Also, but using a enumeration to define to possible values,
>> there is no need to add the __builtin_unreachable() below.
> 
> I'm not sure why there can be a race with TUNABLE_GET(), since isn't all tunables
> set only once during initialization?

We are trying to avoid even 'benign' data races, where a static variable might
be concurrent set from a constant value.  One pattern that we used to have
is to check if a syscall was present and set a static variable, similar to
what you do.

Another way to do it would be use relaxed atomic, such as the way mips64
getdents does [1].  But I still think it would be better organized if we
move the selected dl_maps sorting algorithm to a GLRO variable.

[1] sysdeps/unix/sysv/linux/mips/mips64/getdents64.c

> 
> That said, I agree that placing the algorithm code initializing separately in
> _dl_sort_maps_init looks better.
> 
> 
> 
> The changes to this part (relative to my v6 patch) I see you have on the
> azanella/dso-opt branch:
> 
>  void
> +_dl_sort_maps_init (void)
> +{
> +  int32_t algorithm = TUNABLE_GET (glibc, rtld, dynamic_sort, int32_t, NULL);
> +  GLRO(dl_dso_sort_algo) = algorithm == 1 ? dso_sort_algorithm_original
> +                                         : dso_sort_algorithm_dfs;
> +}
> +
> +void
>  _dl_sort_maps (struct link_map **maps, unsigned int nmaps,
>                unsigned int skip, bool for_fini)
>  {
> -  /* Index code for sorting algorithm currently in use.  */
> -  static int32_t algorithm = 0;
> -  if (__glibc_unlikely (algorithm == 0))
> -    algorithm = TUNABLE_GET (glibc, rtld, dynamic_sort,
> -                            int32_t, NULL);
> -
>    /* It can be tempting to use a static function pointer to store and call
>       the current selected sorting algorithm routine, but experimentation
>       shows that current processors still do not handle indirect branches
> @@ -291,12 +293,10 @@
>       PTR_MANGLE/DEMANGLE, further impairing performance of small, common
>       input cases. A simple if-case with direct function calls appears to
>       be the fastest.  */
> -  if (__glibc_likely (algorithm == 1))
> +  if (GLRO(dl_dso_sort_algo) == dso_sort_algorithm_original)
>      _dl_sort_maps_original (maps, nmaps, skip, for_fini);
> -  else if (algorithm == 2)
> -    _dl_sort_maps_dfs (maps, nmaps, skip, for_fini);
>    else
> -    __builtin_unreachable ();
> +    _dl_sort_maps_dfs (maps, nmaps, skip, for_fini);
>  }
> 
> 
> I'm not sure if defining 'enum dso_sort_algorithm' is really needed?
> Is __builtin_unreachable() that undesirable?

The tunables code should already sanitize the input and returns the
possible algorithms as a type, instead of a number.  I think it is
clear to work with predefined type, than hint the compiler explicitly. 

> 
> Also, the final code should probably add __glibc_likely() to the default
> algorithm case.

Sometime I think we abuse the __glibc_{un}likely, specially on short code
like that.  But please reinstate it if you see that it does improve code 
generation.

> 
> Thanks,
> Chung-Lin

  reply	other threads:[~2021-10-05 17:15 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-07-20 14:25 Chung-Lin Tang
2021-08-10 21:08 ` Adhemerval Zanella
2021-10-05 14:27   ` Chung-Lin Tang
2021-10-05 17:15     ` Adhemerval Zanella [this message]
2021-10-15 15:22       ` [PATCH v7 " Chung-Lin Tang
2021-10-19 14:12         ` Adhemerval Zanella

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=d4360317-23cd-4eea-2607-fafa90f21666@linaro.org \
    --to=adhemerval.zanella@linaro.org \
    --cc=cltang@codesourcery.com \
    --cc=fweimer@redhat.com \
    --cc=libc-alpha@sourceware.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).