From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-pg1-x532.google.com (mail-pg1-x532.google.com [IPv6:2607:f8b0:4864:20::532]) by sourceware.org (Postfix) with ESMTPS id 7D2BF3857BBC for ; Fri, 17 Jun 2022 20:35:14 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 7D2BF3857BBC Received: by mail-pg1-x532.google.com with SMTP id l4so4922759pgh.13 for ; Fri, 17 Jun 2022 13:35:14 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:date:from:to:cc:subject:message-id:references :mime-version:content-disposition:in-reply-to; bh=+fGRAiGsQsf4UvgNSAouyKCSPHa1Tc4zndQrrmD9kPc=; b=Rkt8029FXrIqj+R8ioJcIqgU+VMgBaoR67lAl6o7rNpiATKjq8b4L938uKjKyYxlzg FYq+mV7IQygd4wFPROGLRcPj9f0btDJYYDJPOr8lSoiPHAdPOpmfoVP76wNQRSJKa0fL cGAsZ77Mbi5GOIgm5OeGUrQCFC2nRORcdFtoU7sh1FIB9z5EdU5+0VRBU5EH41N6pFhu wMfOywjTaq25AN3LPPth6VBozuT4W3r7RM6lw5/RORmopJFoIQnA8HMExah5c7GXNHnH ySL1VhW/8C7AN9d9lb/c6SRQMDA5IggpOJHfakzsBgIBFdN5NwJVsMEndLWiuV0ZXy0R pZqA== X-Gm-Message-State: AJIora9HKbTxre3+Ken5zeKl5BEDeiRueNKcgHUwnYbc+mzh2nRSd2WA RqBiC/aEsDTi2e16CEfxhhmIuw== X-Google-Smtp-Source: AGRyM1tlk/C1sd7PohT5abw6nyIIceCyubJX0e5CMWp/3sH5IuYs3flOooCxuZ2WFob1v8LqKd010w== X-Received: by 2002:a65:4907:0:b0:3fd:bc3e:fb0a with SMTP id p7-20020a654907000000b003fdbc3efb0amr10785749pgs.123.1655498113217; Fri, 17 Jun 2022 13:35:13 -0700 (PDT) Received: from google.com ([2620:15c:2ce:200:3e97:dac1:ecec:f016]) by smtp.gmail.com with ESMTPSA id ih15-20020a17090b430f00b001ead6fca7e0sm3715530pjb.4.2022.06.17.13.35.12 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 17 Jun 2022 13:35:12 -0700 (PDT) Date: Fri, 17 Jun 2022 13:35:09 -0700 From: Fangrui Song To: Adhemerval Zanella Cc: GNU C Library , Wilco Dijkstra Subject: Re: [PATCH v2 1/4] misc: Optimize internal usage of __libc_single_threaded Message-ID: <20220617203509.jhz3dkwxwu6wx2j4@google.com> References: <20220610163552.3587064-1-adhemerval.zanella@linaro.org> <20220610163552.3587064-2-adhemerval.zanella@linaro.org> <20220616071538.f2w2f45ds27bgu3j@google.com> <693ECAA0-6F3E-4004-91D0-DE7FAFF43F0A@linaro.org> <20220616223040.a6s4sy4mcz5mrpmo@google.com> <9023C141-E75D-494B-BC79-4DB22E2F8268@linaro.org> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii; format=flowed Content-Disposition: inline In-Reply-To: <9023C141-E75D-494B-BC79-4DB22E2F8268@linaro.org> X-Spam-Status: No, score=-26.9 required=5.0 tests=BAYES_00, DKIMWL_WL_MED, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, ENV_AND_HDR_SPF_MATCH, GIT_PATCH_0, KAM_INFOUSMEBIZ, KAM_SHORT, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, TXREP, T_SCC_BODY_TEXT_LINE, USER_IN_DEF_DKIM_WL, USER_IN_DEF_SPF_WL autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: libc-alpha@sourceware.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Libc-alpha mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 17 Jun 2022 20:35:16 -0000 On 2022-06-16, Adhemerval Zanella wrote: > > >> On 16 Jun 2022, at 15:30, Fangrui Song wrote: >> >> On 2022-06-16, Adhemerval Zanella wrote: >>> >>> >>>> On 16 Jun 2022, at 00:15, Fangrui Song wrote: >>>> >>>> On 2022-06-10, Adhemerval Zanella via Libc-alpha wrote: >>>>> By adding an internal hidden_def alias to avoid the GOT indirection. >>>>> On some architecture, __libc_single_thread may be accessed through >>>>> copy relocations and thus it requires to update both copies. >>>>> >>>>> To obtain the correct address of the __libc_single_thread, >>>>> __libc_dlsym is extended to support RTLD_DEFAULT. It searches >>>>> through all scope instead of default local ones. >>>>> >>>>> Checked on x86_64-linux-gnu and i686-linux-gnu. >>>>> --- >>>>> elf/dl-libc.c | 20 ++++++++++++++++++-- >>>>> elf/libc_early_init.c | 9 +++++++++ >>>>> include/sys/single_threaded.h | 11 +++++++++++ >>>>> misc/single_threaded.c | 2 ++ >>>>> nptl/pthread_create.c | 6 +++++- >>>>> 5 files changed, 45 insertions(+), 3 deletions(-) >>>>> >>>>> diff --git a/elf/dl-libc.c b/elf/dl-libc.c >>>>> index 266e068da6..e64f4b9910 100644 >>>>> --- a/elf/dl-libc.c >>>>> +++ b/elf/dl-libc.c >>>>> @@ -16,6 +16,7 @@ >>>>> License along with the GNU C Library; if not, see >>>>> . */ >>>>> >>>>> +#include >>>>> #include >>>>> #include >>>>> #include >>>>> @@ -72,6 +73,7 @@ struct do_dlsym_args >>>>> /* Arguments to do_dlsym. */ >>>>> struct link_map *map; >>>>> const char *name; >>>>> + const void *caller_dlsym; >>>>> >>>>> /* Return values of do_dlsym. */ >>>>> lookup_t loadbase; >>>>> @@ -102,8 +104,21 @@ do_dlsym (void *ptr) >>>>> { >>>>> struct do_dlsym_args *args = (struct do_dlsym_args *) ptr; >>>>> args->ref = NULL; >>>>> - args->loadbase = GLRO(dl_lookup_symbol_x) (args->name, args->map, &args->ref, >>>>> - args->map->l_local_scope, NULL, 0, >>>>> + struct link_map *match = args->map; >>>>> + struct r_scope_elem **scope; >>>>> + if (args->map == RTLD_DEFAULT) >>>>> + { >>>>> + ElfW(Addr) caller = (ElfW(Addr)) args->caller_dlsym; >>>>> + match = _dl_find_dso_for_object (caller); >>>>> + /* It is only used internally, so caller should be always recognized. */ >>>>> + assert (match != NULL); >>>>> + scope = match->l_scope; >>>>> + } >>>>> + else >>>>> + scope = args->map->l_local_scope; >>>>> + >>>>> + args->loadbase = GLRO(dl_lookup_symbol_x) (args->name, match, &args->ref, >>>>> + scope, NULL, 0, >>>>> DL_LOOKUP_RETURN_NEWEST, NULL); >>>>> } >>>>> >>>>> @@ -182,6 +197,7 @@ __libc_dlsym (void *map, const char *name) >>>>> struct do_dlsym_args args; >>>>> args.map = map; >>>>> args.name = name; >>>>> + args.caller_dlsym = RETURN_ADDRESS (0); >>>>> >>>>> #ifdef SHARED >>>>> if (GLRO (dl_dlfcn_hook) != NULL) >>>>> diff --git a/elf/libc_early_init.c b/elf/libc_early_init.c >>>>> index 3c4a19cf6b..7cc2997122 100644 >>>>> --- a/elf/libc_early_init.c >>>>> +++ b/elf/libc_early_init.c >>>>> @@ -16,7 +16,9 @@ >>>>> License along with the GNU C Library; if not, see >>>>> . */ >>>>> >>>>> +#include >>>>> #include >>>>> +#include >>>>> #include >>>>> #include >>>>> #include >>>>> @@ -38,6 +40,13 @@ __libc_early_init (_Bool initial) >>>>> __libc_single_threaded = initial; >>>>> >>>>> #ifdef SHARED >>>>> + /* __libc_single_threaded can be accessed through copy relocations, so it >>>>> + requires to update the external copy. */ >>>>> + __libc_external_single_threaded = __libc_dlsym (RTLD_DEFAULT, >>>>> + "__libc_single_threaded"); >>>>> + assert (__libc_external_single_threaded != NULL); >>>>> + *__libc_external_single_threaded = initial; >>>>> + >>>>> __libc_initial = initial; >>>>> #endif >>>> >>>> I think this whole scheme can be greatly simplified. >>>> >>>> Under the hood, >>>> >>>> extern __typeof (__libc_single_threaded) __EI___libc_single_threaded __asm__("" "__libc_single_threaded"); extern __typeof (__libc_single_threaded) __EI___libc_single_threaded __attribute__((alias ("" "__GI___libc_single_threaded"))) __attribute__ ((__copy__ (__libc_single_threaded))); >>>> * __libc_single_threaded is the STV_HIDDEN symbol with the asm name "__GI___libc_single_threaded" >>>> * __EI___libc_single_threaded is the STV_DEFAULT symbol with the asm name "__libc_single_threaded". It aliases __libc_single_threaded. >>> >>> In fact libc_hidden_proto for SHARED will result in: >>> >>> extern __typeof (__libc_single_threaded) __libc_single_threaded __asm__ ("" "__GI___libc_single_threaded") __attribute__ ((visibility ("hidden")));; >>> >>> So using __libc_single_threaded will be the most simple way. >> >> The code tries to read the initial value of the copy relocated __libc_single_threaded. >> Accessing the STV_DEFAULT symbol does the trick and is simpler than __libc_dlsym. >> (Accessing the STV_HIDDEN symbol just accesses libc.so's own copy.) > >The glibc will still continue to use the __libc_single_threaded value internally, the >__libc_dlsym is used to initialize __libc_external_single_threaded which is used to >update the external one solely to avoid update not being visible on architectures that >use copy relocations. Yes, I know. My point is that to access the possibly external definition (in the presence of a copy relocation), we can access the STV_DEFAULT symbol, instead of using __libc_dlsym. The compiler generates a GOT-generating relocation in -fPIC mode and the linker will create a GLOB_DAT dynamic relocation because the symbol is preemptible/interposable (see https://maskray.me/blog/2020-11-15-explain-gnu-linker-options#bsymbolic-and---dynamic-list how a symbol is considered preemptible). ld.so resolves GLOB_DAT to the copy relocated definition in the executable (if exists). >This is actually not needed on some architectures, aarch64 for instance, but without >it x86 does not see the __libc_single_thread being updated. __libc_single_thread is in a public header sys/single_threaded.h and is used by libstdc++ ext/atomicity.h. A user program compiled with -fno-pic will typically access __libc_single_thread with an absolute or PC-relative relocation. The linker will resolve this relocation to a copy relocation if __libc_single_thread is defined in libc.so.6. aarch64 -fno-pic uses PC-relative relocations. I think some variants of mips avoid -fno-pic code generation which may lead to copy relocation. clang -fno-direct-access-external-data and gcc -mno-direct-extern-access avoids absolute/PC-relative relocation in -fno-pic/-fpie mode, but the options are by no means popular. >> >> Note: The copy relocated __libc_single_threaded likely has a value of zero and is read-only from the executable. >> I assume that this is a conservative and safe value. >> > >It can not be a read-only value because it should be updated by pthread_create. > I mean that the executable does not write to __libc_single_threaded. Its write to __libc_single_threaded will not be visible to libc.so.6. If libc.so.6 decides to write the STV_HIDDEN __libc_single_threaded, it needs to write the STV_DEFAULT __libc_single_threaded to update the possibly copy relocated definition.