Subject: Re: Enable preloading in dlopen-ed shared libraries?
To: Fengkai Sun
Cc: libc-help@sourceware.org, Vivek Das Mohapatra
From: Adhemerval Zanella
Date: Mon, 26 Jul 2021 16:51:31 -0300

On 23/07/2021 23:51, Fengkai Sun wrote:
> Hi Adhemerval,
>
> Thanks so much for your reply! I will explain my idea in detail and sorry
> for the unclearness.
>
>
>> Do you mean by preloading the shared library list using dlmopen in a new
>> namespace? Or do you mean to use RTLD_DEEPBIND with the preload libraries?
>>
>
> I mean making users able to specify a list of preload libraries in a shared
> library's local scope, so that when the library is loaded with
> RTLD_DEEPBIND on, the preload libraries will take precedence over any other
> library except the dlopen-ed one.

This seems a very specific use case to justify adding such complexity and
extra internal state to the loader, so I want to better understand the
issue you are trying to solve and why you can't use the current tools
instead.

>
>
>>> By doing so, the user can easily provide a different definition of a symbol
>>> from the one of the main executable, by enabling RTLD_DEEPBIND.
>>> This is useful under some circumstances.
>>> For example, a dlopen-ed library
>>> may want to use a separate heap from the main heap, and the user can
>>> provide another malloc implementation for that library.
>>
>> But how is this different than the malloc() interposition already supported
>> with LD_PRELOAD?
>>
>
> I found that LD_PRELOAD cannot provide a different definition for a
> dlopen-ed library from the main executable. Let's say we are preloading
> mymalloc.so in an executable:

Yes, LD_PRELOAD acts only on the default global scope.

>
> scope0(global scope): ./main_exe mymalloc.so libc.so libdl.so
> So the reference in the main executable is binded with the definition in
> mymalloc.so.
>
> A dlopen-ed shared library will have such kind of scope, let's call it
> lib1.so:
> scope0(global scope): ./main_exe mymalloc.so libc.so libdl.so
> scope1(local scope): lib1.so libc.so
>
> If lib1.so is loaded without RTLD_DEEPBIND, its reference to malloc will be
> binded to mymalloc.so too. That means shared libraries and the main
> executable are using the same heap, sometimes the user may want to prevent
> it.
> My goal is to preload libraries inside local scope, and it will look like:
> scope1(local scope): lib1.so othermalloc.so libc.so
>
> In this way, the main executable will never see the definition inside
> othermalloc.so, and lib1 can bind to it when RTLD_DEEPBIND is on.

Right, now I have a better grasp of what you are trying to do. I am not
sure if I really like this: besides adding more complexity to the loader
for a very specific use case through environment variables, I think you
could implement it in a different way with a new interface we are aiming
to support in the next release, 2.35 (RTLD_SHARED):

  void *h = dlmopen (LM_ID_NEWLM, "othermalloc.so", RTLD_NOW);
  Lmid_t id;
  dlinfo (h, RTLD_DI_LMID, &id);
  dlmopen (id, "lib1.so", RTLD_SHARED | RTLD_NOW);

The lib1.so will then bind its dependencies only on the newly created
namespace scope (LD_PRELOAD won't be added in this case).
So a subsequent dlopen ("lib1.so", ...) will return the handle to the
lib1.so already created in the extra namespace. (Vivek can correct me
if I am wrong here.)

>
>>> The auditing interface can do the similar thing, but after doing some
>>> experiments, I found that `la_symbind64' cannot catch the bindings of
>>> global variables, and it cannot hook all of the function bindings.
>>
>> The rtld-audit currently only works for symbols which requires a PLT call,
>> the global variables either done with GOT access directly or through copy
>> relocations. I am working on extending la_symbind() to work with bind-now
>> binaries, so it would be called at loading time in symbol resolution
>> instead on the lazy binding resolution.
>>
>>>
>>> Would it be a good idea to add an interface to enable preloading in the
>>> local scope of dlopen-ed shared libraries?
>>
>> I am trying to understand better what you are trying to do here, because
>> you are mixing two different usercases here. The RTLD_DEEPBIND is usually
>> used for shared libraries to use its direct dependencies over the global
>> list, the rtld-audit interfaces are loaded in a different namespace.
>>
>
> I mentioned the rtld-audit interface here because it could provide
> different definitions for the same symbol. In la_symbind(), the user can
> return a different symbol address other than sym->st_value to achieve this:
>
> void *mymalloc(...) {...}
> void *la_symbind(...){
>     if (refcook == lib1_cook && strcmp(symname, "malloc") == 0) return mymalloc;
>     else return sym->st_value;
> }
>
>
> However, la_symbind() currently happens in lazy binding resolution(thank
> you for clarifying it!), so it cannot work on every symbol for now.

Yeah, my plan is to fix it for the next release. The rtld-audit might
indeed be another way to implement it.

>
> Again, I'm sorry for mixing RTLD_DEEPBIND, dlmopen and rtld-audit all
> together in the first email. I hope I've made things clear now.

No problem, this is quite complex due to the multiple ways to change the
loader's dynamic symbol resolution.

>
> --
> Best,
> Fengkai
>
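
For reference, the scope lists discussed in this thread can be modeled with
a small toy program. This is only a sketch under stated assumptions:
struct toy_lib, struct toy_scope, scope_lookup and toy_resolve are
hypothetical names made up for illustration, not glibc internals, and the
symbol lists attached to each library are invented. The model ignores symbol
versioning, visibility and everything else the real loader does. It only
shows the search order: global scope first by default, local scope first
when RTLD_DEEPBIND is set.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy model only: hypothetical types, not glibc's link_map or scope data. */
struct toy_lib {
    const char *name;        /* e.g. "mymalloc.so" */
    const char *defines[4];  /* NULL-terminated list of defined symbols */
};

struct toy_scope {
    const struct toy_lib *libs[8];  /* NULL-terminated, in search order */
};

/* Invented symbol tables for the libraries named in the thread. */
const struct toy_lib main_exe    = { "./main_exe",     { "main", NULL } };
const struct toy_lib mymalloc    = { "mymalloc.so",    { "malloc", "free", NULL } };
const struct toy_lib othermalloc = { "othermalloc.so", { "malloc", "free", NULL } };
const struct toy_lib libc        = { "libc.so",        { "malloc", "free", NULL } };
const struct toy_lib libdl       = { "libdl.so",       { "dlopen", NULL } };
const struct toy_lib lib1        = { "lib1.so",        { "lib1_entry", NULL } };

/* scope0: global scope with mymalloc.so preloaded via LD_PRELOAD. */
const struct toy_scope scope0 = { { &main_exe, &mymalloc, &libc, &libdl, NULL } };
/* scope1: local scope of a dlopen-ed lib1.so as it exists today. */
const struct toy_scope scope1 = { { &lib1, &libc, NULL } };
/* scope1 as proposed in the thread, with othermalloc.so preloaded locally. */
const struct toy_scope scope1_preload = { { &lib1, &othermalloc, &libc, NULL } };

/* Return the first library in the scope that defines sym, or NULL. */
const struct toy_lib *
scope_lookup (const struct toy_scope *scope, const char *sym)
{
    for (size_t i = 0; scope->libs[i] != NULL; i++)
        for (size_t j = 0; scope->libs[i]->defines[j] != NULL; j++)
            if (strcmp (scope->libs[i]->defines[j], sym) == 0)
                return scope->libs[i];
    return NULL;
}

/* Resolve sym for an object whose lookup uses the given scopes.
   local may be NULL (the main executable has no local scope).
   A non-zero deepbind searches the local scope before the global one. */
const char *
toy_resolve (const struct toy_scope *global, const struct toy_scope *local,
             int deepbind, const char *sym)
{
    const struct toy_scope *order[2];
    order[0] = deepbind ? local : global;
    order[1] = deepbind ? global : local;

    for (int i = 0; i < 2; i++) {
        if (order[i] == NULL)
            continue;
        const struct toy_lib *hit = scope_lookup (order[i], sym);
        if (hit != NULL)
            return hit->name;
    }
    return NULL;  /* undefined in every scope */
}
```

In this model, toy_resolve (&scope0, &scope1, 0, "malloc") gives
"mymalloc.so", the same call with deepbind set gives "libc.so", and
toy_resolve (&scope0, &scope1_preload, 1, "malloc") gives "othermalloc.so",
while a lookup from the main executable (local scope NULL) still gives
"mymalloc.so".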