From mboxrd@z Thu Jan 1 00:00:00 1970
From: Ian Lance Taylor
To: rms@gnu.ai.mit.edu
Cc: drepper@cygnus.com, dm@sgi.com, gcc2@cygnus.com, gas2@cygnus.com
Subject: Re: global vars and symbol visibility for mips32/elf
Date: Sat, 10 Aug 1996 19:41:00 -0000
Message-id: <9608110241.AA22482@tweedledumb.cygnus.com>
References: <199608110037.UAA17547@psilocin.gnu.ai.mit.edu>
X-SW-Source: 1996/msg00082.html

   Date: Sat, 10 Aug 1996 20:37:03 -0400
   From: Richard Stallman

       That turns out not to be the case. A shared library is not like
       an archive library. A shared library is a single object.
       Everything that composes a shared library is linked together.
       There is only one set of symbols. There is no way for the linker
       to form any sort of transitive closure operation, because there
       is no longer any distinction between the various object files
       which compose the shared library.

   Although this may be how things work now, it is not really a useful
   way for things to work.

   Dividing an ordinary library into separate members provides two
   benefits:

   * The members you don't refer to, do not take up space in your
     program.
   * The members you don't refer to, do not fill up your program's
     name space.

An ELF shared library only fills up a program's name space in a very
specific sense. If a shared library defines a function foo, and your
program defines a function foo, that does not cause any conflict. Your
program's definition of foo is used. If functions in the shared library
call foo, they will call your program's definition of foo rather than
the shared library's definition. In this sense, shared libraries act
just like archive libraries.

The case which causes trouble is when a program uses a common variable
(an uninitialized variable in C; most languages, including C++, have no
notion of common variables). A common variable is treated as an
undefined reference, unless no definition is seen. In the latter case,
the common variable becomes a definition.
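
As a minimal sketch of what "common variable" means at the C level: a
file-scope declaration with no initializer and no `extern` is a
tentative definition, which compilers of this era emit as a common
symbol. The names `verbose` and `set_verbose` are hypothetical and
chosen only for illustration. Recent GCC releases default to
-fno-common, so reproducing the common-symbol behavior there would need
-fcommon.

```c
#include <assert.h>

/* Tentative definition: no initializer, no `extern`.  Traditionally
   emitted as a common symbol, which the linker treats as a reference
   until it finds no real definition anywhere in the link. */
int verbose;

/* Pure declaration of the same object: never allocates storage and
   never produces a common symbol on its own. */
extern int verbose;

/* Hypothetical helper: set the flag and return its previous value. */
int set_verbose(int on)
{
    assert(on == 0 || on == 1);   /* the flag is treated as boolean */
    int old = verbose;
    verbose = on;
    return old;
}
```

If no other object supplies an initialized `verbose`, the common symbol
becomes the definition and starts out as zero, so the first call to
`set_verbose(1)` would return 0.
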

When a common variable is used, the linker will resolve it against a
definition found in a shared library. This is different from the
handling of an archive library, at least in ELF. In an ELF archive
library, a particular object will not be included in the link merely to
satisfy a common reference; it will only be included to satisfy an
undefined reference.

In some cases, it will be desirable to resolve a common symbol with the
definition in the shared library. This would be desirable when the
actual intent was to use the initialized variable in the shared library.
For example, if a program uses ``int optind;'' rather than ``extern int
optind;'', and it also calls getopt, then, in both a shared library and
an archive library, the common symbol optind will wind up referring to
the initialized variable optind, rather than creating a new,
uninitialized, optind variable.

Thus, the only sense in which shared libraries occupy a program's name
space is that common symbols will sometimes be bound to symbols defined
in a shared library.

Now that I've written this, I see that, since an ELF symbol table
records whether a symbol represents a function or a variable, it would
be possible to make the linker refuse to resolve a common symbol (which
must represent a variable) against a function defined in a shared
library. This will introduce a somewhat confusing situation in that a
single symbol will be both a variable and a function. However, as far as
I can see, if any actual confusion results from this, then the same
program would not have worked correctly in the archive library case
either.

I will make this change. This will, as it happens, fix the particular
test case which started this thread, so I was wrong in believing that
this was a bug in the shared library.

This will reduce the problematic cases in a shared library to global
variables.
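
The proposed rule can be sketched as a small decision function. This is
an illustration under stated assumptions, not the actual GNU ld code:
the `symbol` structure, the enum names, and `binds_to_shared_def` are
all hypothetical, and the two symbol types merely stand in for the
function/variable distinction that the ELF symbol table records.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified symbol model (not taken from ld or BFD). */
enum sym_state { SYM_UNDEFINED, SYM_COMMON, SYM_DEFINED };
enum sym_type  { SYM_OBJECT, SYM_FUNC };   /* variable vs. function */

struct symbol {
    const char    *name;
    enum sym_state state;
    enum sym_type  type;
};

/* Should the program's symbol `ref` be bound to `def`, a symbol of the
   same name found in a shared library? */
bool binds_to_shared_def(const struct symbol *ref, const struct symbol *def)
{
    assert(ref != NULL && def != NULL);

    if (def->state != SYM_DEFINED)
        return false;                    /* nothing to bind to */

    switch (ref->state) {
    case SYM_UNDEFINED:
        return true;                     /* ordinary undefined reference */
    case SYM_COMMON:
        /* Proposed change: a common symbol must be a variable, so it
           is never resolved against a function. */
        return def->type == SYM_OBJECT;
    case SYM_DEFINED:
        return false;                    /* the program's definition wins */
    }
    return false;
}
```

Under this sketch, a common symbol foo in the program would stay a
separate variable when the shared library's foo is a function, and would
bind to the library's foo only when that foo is itself a variable.
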
It will continue to be the case that ELF common symbols and global
variables will be resolved differently when using shared libraries as
opposed to archive libraries. This should be much less of a problem,
since libraries typically have relatively few global variables.

   Doing this right does not require any changes in the shared library
   run-time mechanism. It only requires some way of representing, in
   the shared library's symbol table, a division of external symbols
   into various "library members". Then ld can treat as weak any
   external definitions which are not in the same "library members" as
   some symbol that is referenced. Each "library member" should have
   references as well as definitions; that way, ld can tell that if
   "library member" A is referenced, and it references member B, then
   the definitions in B are not weak.

   With an open-ended format such as ELF, it should not be hard to
   design a way of representing this information, which does not
   conflict with anything else and will not confuse other linkers. If
   ld finds this data, it should act accordingly; otherwise, it should
   do what it does now. That way, each of our tools is upward
   compatible.

   This will make it possible to turn any unshared library into a
   shared library, with no special precautions, and get no change in
   the behavior except for sharing of memory.

I can not see how to implement this without more than doubling the time
it takes to link against a shared library. Linking against a shared
library is a common operation, some shared libraries are large, and the
amount of time it takes to link against them is important. I believe
that this would be a poor tradeoff.

Ian