From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (qmail 57364 invoked by alias); 25 Oct 2016 14:32:19 -0000
Mailing-List: contact libc-alpha-help@sourceware.org; run by ezmlm
Precedence: bulk
List-Id:
List-Subscribe:
List-Archive:
List-Post:
List-Help:
Sender: libc-alpha-owner@sourceware.org
Received: (qmail 57123 invoked by uid 89); 25 Oct 2016 14:32:19 -0000
Authentication-Results: sourceware.org; auth=none
X-Virus-Found: No
X-Spam-SWARE-Status: No, score=-3.3 required=5.0 tests=BAYES_00,RP_MATCHES_RCVD,SPF_HELO_PASS autolearn=ham version=3.3.2 spammy=Being, trivial, platforms, interpose
X-HELO: mx1.redhat.com
Subject: Re: Evolution of ELF symbol management
To: Joseph Myers
References: <9727f95a-df3d-ec11-8c1d-9b7ea6cbcaac@redhat.com>
Cc: GNU C Library
From: Florian Weimer
Message-ID: <2e86a3a6-3ad3-6834-4c6c-64836a956dbd@redhat.com>
Date: Tue, 25 Oct 2016 14:32:00 -0000
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Thunderbird/45.4.0
MIME-Version: 1.0
In-Reply-To:
Content-Type: text/plain; charset=windows-1252; format=flowed
Content-Transfer-Encoding: 7bit
X-SW-Source: 2016-10/txt/msg00424.txt.bz2

On 10/18/2016 06:49 PM, Joseph Myers wrote:
> On Tue, 18 Oct 2016, Florian Weimer wrote:
>
>> I think the above sums up the status quo. With this message, I want to start
>> a discussion why this symbol mangling stops at glibc-internal cross-DSO
>> references (or static linking). Wouldn't other system libraries, such as
>> libstdc++, glib, Qt and so on need to do the same thing? After all, if Qt
>> calls foo@GLIBC_2.31, and the main program defines foo (which the static
>> linker automatically exports to enable interposition), we almost certainly
>> would want Qt to continue to call foo@GLIBC_2.31, and not the potentially
>> incompatible implementation of foo in the main program.
>
> We've previously discussed this in the libstdc++ context and I think
> agreed that implementation-namespace versions should be added at least for
> functions used in libstdc++ headers, to allow G++ to stop defining
> _GNU_SOURCE by default. I'd
> consider the namespace issues to apply equally to all language runtime
> libraries - so including any symbols used in libstdc++ but not in the
> headers, for example, and in other language runtimes in GCC where there is
> a meaningful question for the relevant languages of certain C symbol names
> being reserved or not reserved. Language runtimes also include e.g.
> libdfp (on which basis the printf hooks functionality should be exported
> under implementation-namespace names).

Anything providing a plug-in framework is probably a language run-time in
this sense.

> Doing it more systematically for glibc function symbols rather than only
> supporting it for particular privileged language runtimes seems reasonable
> to me.

Fine.

>> To keep things simple, I suggest that for all new function symbols, we declare
>> __libc_foo in the header file, redirect foo to __libc_foo, export both at the
>> same symbol version from the DSO, and make __libc_foo a strong definition and
>> foo a weak one. (We should not add new variable symbols.)
>
> I take it this is __libc_foo independent of which library contains foo (so
> no __libm_foo, __libpthread_foo etc.)?

Yes, this was my intent. I'm not fixated on a particular prefix. I just
want to avoid a new discussion for each symbol.

> There are a few existing __libc_foo exports at public symbol versions. Do
> all those satisfy the rule that where both foo and __libc_foo exist, the
> latest version of foo and the latest version of __libc_foo are aliases or
> otherwise have the same semantics? (It would seem very confusing for old
> and new __libc_* symbols to follow different rules in that regard.)

I found a few symbols which differ in the exported version.
The unprefixed symbol has a regular version, and the prefixed one appears
as GLIBC_PRIVATE. These are:

  clntudp_bufcreate
  fork
  longjmp
  pread
  pwrite
  secure_getenv
  siglongjmp
  system
  vfork

__libc_clntudp_bufcreate is GLIBC_PRIVATE and takes an additional argument
compared to clntudp_bufcreate, so that's a true mismatch. It's used from
libnsl.so.

fork, longjmp, siglongjmp, vfork are aliases for the redirection from
libpthread. They cannot use the usual __ alias because they define that
as well.

I don't know why pread, pwrite have __libc_-prefixed aliases instead of
__-prefixed ones. I think the goal is namespace-cleanliness, so either
one would work.

Unfortunately, my current symbol lists do not include symbol values, so
finding further discrepancies (which have the same version, but different
values) is not entirely straightforward. I'll see what I can do to get
more data (probably using some Perl script; unfortunately dlsym is
currently broken in the presence of multiple symbol versions).

> What should be done where the symbol is only added in the implementation
> namespace - symbols for use in redirection for different standard
> versions, macros, inline functions or *_nonshared.a, for example? Should
> future such symbols also use the __libc_foo namespace (unless there are
> ABI reasons for something else, e.g. the libmvec functions) (so if such a
> practice were implemented before glibc 2.25 came out, __iscanonicall would
> change to __libc_iscanonicall, etc.), or continue being __foo?

I have no strong opinion here. My reason for using __libc_ instead of __
was that other libraries use __ for internal symbols. With __libc_,
collisions seem even less likely.

> What about compilers that do not support redirection?

I worry more about things like mlnlffigen or SWIG which parse header files
and attempt to generate bindings from them.
Our current header files are already rather difficult for such tools to
process. These tools often contain heuristics, in part to avoid
implementing full C, in part to get some sort of API out of the header
file even if parts of it live in preprocessor macros only.

> Right now we have
> many individual #defines in the case where __REDIRECT is not supported.
> If we required support for asm redirection in compilers using the glibc
> headers, it would be possible to define a macro to declare both foo and
> __libc_foo, with the same type and the same attributes (and the same throw
> () information for C++), and do the redirection, all with one macro call.
> Otherwise you get a lot of repetitive boilerplate in headers for every
> such function, since a macro cannot generate a #define of another macro.
> Or you say that compilers without redirection support don't get any of
> these redirections, since they are not semantically required.

Unfortunately, I forgot an important detail: Even in the __GNUC__ case, we
need more than just the declaration. If we just set an asm alias, an
interposing definition supplied by the user will happily interpose the
supposedly-protected alias. This is less relevant for functions in
non-standard headers (which applications would not include accidentally),
but if we add something to a standard header (under _GNU_SOURCE) which is
ripe for collisions, we need to somehow make sure that a user-defined
function of the same name does not end up interposing the alias.

The easiest way to do this is to add a function-style macro which expands
to something that cannot be parsed as part of a function definition. We
discussed using inline functions for this, but this approach was rejected
due to standards compliance concerns. But maybe the desire to put all
this in a single macro definition will make us reconsider.

On the other hand, inline functions are particularly hard on some wrapper
generators.
The internal binding generator of LuaJIT, for example, supports asm
aliases, but ignores inline functions.

> (When you're dealing with API issues as well as ABI then the macro
> solution runs into complications with wanting to declare __libc_foo
> unconditionally for use in libstdc++ headers, but foo only when the right
> feature test macros are defined. Those complications can certainly be
> resolved, e.g. with macros ___GNU to do the declaration whose
> definitions depend on the feature test macros defined, and a first
> solution might well only deal with the ABI issues and leave the API ones
> for later.)

Yes, please. :)

> Being able to make all the declarations with a single macro is attractive,
> since right now I'm sure that lots of the declarations in internal
> include/ headers are in fact suboptimal because they are missing
> attributes present on the public declarations. It would also have the
> potential for defining variants of such macros in future that also do
> *_hidden_proto (for public and internal function names) when building
> glibc. Recall that *_hidden_* are still needed even for internal function
> names, whether or not those names are exported - if exported, failure to
> use *_hidden_* will be visible through localplt test failures, but if not
> exported, less efficient code is still generated in the caller on 32-bit
> x86 if the function isn't visibly hidden.

With consistent internal and external mangling (which was not the goal of
my proposal, but I can see how it is related), we could compile sources
within libc and outside, for unit testing purposes. It would also make it
easier to avoid linknamespace violations. But that needs new aliases for
basically everything.

> In fact we have evidence that the
> headers have had problems for a long time for compilers not defining
> __GNUC__, and those include problems relating to redirection.

Ugh.
Part of the problem is that we don't have a C89 or C++98 implementation
with which we can test easily. I'm not aware of any way to turn GCC or
Clang into a C89-only or a C++98-only compiler (and you said we'd need
some extensions anyway).

>> For existing symbols, we only do this if we receive reports of conflicts
>> causing problems in the field. In this case, we add __libc_foo and the
>> redirect to the header file, and use the current symbol version for the
>> __libc_foo export (not the one of foo).
>
> "causing problems in the field" should be broadly interpreted there - to
> allow adding lots of such functions if someone identifies what's needed to
> make the libstdc++ headers or libraries namespace-clean, for example, or
> for fixing the namespace issues described in
> .

Yes, I would assume this is fine.

> What should be done in the case where __foo already has an export at a
> public symbol version (and we have a use for __libc_foo)? Should we
> arrange for __foo to be declared (with associated redirections) and say
> people should be using that, or add __libc_foo as well? What about where
> __foo is already exported, but that export is a compat symbol (if there
> are any such cases)? Making it not a compat symbol would run into needing
> new exports at new versions on platforms postdating the version where it
> was made a compat symbol, and you don't want the API to be __foo on some
> platforms and __libc_foo on others.

I think this leads to the question whether we should prefer __ over
__libc_ after all because as part of fixing the glibc-internal
linknamespace issues, we often added a __ symbol with a public version
(but sometimes a GLIBC_PRIVATE version as well). I would prefer if we
could reuse those symbol names. I don't think we can switch them to
__libc_ because they were part of the ABI, although they were not part of
any header file.

I'm less convinced now in which direction to move.
Unfortunately, this has ABI impact, so the consequences are far from
trivial.

Maybe we need to take a step back and ask ourselves if we should use
symbol versioning to address this. The two blockers I know of are purely
static links, and the design decision (no doubt for backwards
compatibility) to interpose versioned symbols with unversioned symbols.
The latter is difficult to address, but if we could make the change
somehow, it would enable a nice performance boost in the dynamic linker,
too. But it certainly looks like that for the static link case, we only
have the header files we can tweak to achieve what we want.

Thanks,
Florian