From mboxrd@z Thu Jan  1 00:00:00 1970
From: Ian Lance Taylor <ian@cygnus.com>
To: rms@gnu.ai.mit.edu
Cc: drepper@cygnus.com, dm@sgi.com, gcc2@cygnus.com, gas2@cygnus.com
Subject: Re: global vars and symbol visibility for mips32/elf
Date: Sat, 10 Aug 1996 19:41:00 -0000
Message-id: <9608110241.AA22482@tweedledumb.cygnus.com>
References: <199608110037.UAA17547@psilocin.gnu.ai.mit.edu>
X-SW-Source: 1996/msg00082.html

   Date: Sat, 10 Aug 1996 20:37:03 -0400
   From: Richard Stallman <rms@gnu.ai.mit.edu>

       That turns out not to be the case.  A shared library is not like an
       archive library.  A shared library is a single object.  Everything
       that composes a shared library is linked together.  There is only one
       set of symbols.  There is no way for the linker to form any sort of
       transitive closure operation, because there is no longer any
       distinction between the various object files which compose the shared
       library.

   Although this may be how things work now, it is not really a useful
   way for things to work.

   Dividing an ordinary library into separate members provides two benefits:

   * The members you don't refer to, do not take up space in your program.
   * The members you don't refer to, do not fill up your program's name space.

An ELF shared library only fills up a program's name space in a very
specific sense.

If a shared library defines a function foo, and your program defines a
function foo, that does not cause any conflict.  Your program's
definition of foo is used.  If functions in the shared library call
foo, they will call your program's definition of foo rather than the
shared library's definition.  In this sense, shared libraries act just
like archive libraries.
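
As a rough illustration of the binding rule just described (a model,
not actual linker code), the following C sketch treats symbol lookup
as a search over definitions in link order, with the program assumed
to come first.  The names ``definition'' and ``bind_symbol'' are
hypothetical, invented for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model: one entry per (symbol, defining object) pair,
   listed in search order.  The program is assumed to be listed first,
   followed by the shared libraries it links against. */
struct definition {
    const char *symbol;   /* symbol name, e.g. "foo" */
    const char *definer;  /* "program" or a library name */
};

/* Return the object whose definition every reference to `symbol`
   binds to: the first match in search order, or NULL if there is
   none.  References made from inside a shared library go through the
   same search, so they also see the program's definition first. */
const char *bind_symbol(const struct definition *defs, size_t count,
                        const char *symbol)
{
    assert(symbol != NULL);
    assert(defs != NULL || count == 0);

    for (size_t i = 0; i < count; i++) {
        if (strcmp(defs[i].symbol, symbol) == 0)
            return defs[i].definer;
    }
    return NULL;
}
```

With the program's foo listed ahead of the shared library's foo, every
lookup of foo yields the program's definition, which is the behavior
described above.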

The case which causes trouble is when a program uses a common variable
(an uninitialized global variable in C; most languages, including C++,
have no notion of common variables).  A common variable is treated as
an undefined reference if a definition is seen elsewhere in the link.
If no definition is seen, the common variable itself becomes the
definition.

When a common variable is used, the linker will resolve it against a
definition found in a shared library.  This is different from the
handling of an archive library, at least in ELF.  In an ELF archive
library, a particular object will not be included in the link merely
to satisfy a common reference; it will only be included to satisfy an
undefined reference.
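
The difference can be summarized in a small C sketch.  This is only a
model of the rule as stated in this message, not code from any linker;
the enum and function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* How the program currently refers to a symbol (hypothetical model). */
enum ref_kind { REF_UNDEFINED, REF_COMMON, REF_DEFINED };

/* The kind of library that holds a candidate definition. */
enum lib_kind { LIB_ARCHIVE, LIB_SHARED };

/* Return true if a library's definition is brought into play merely
   to satisfy this one reference. */
bool library_definition_used(enum ref_kind ref, enum lib_kind lib)
{
    switch (ref) {
    case REF_UNDEFINED:
        /* An archive member is extracted, and a shared library
           definition is bound, for an undefined reference. */
        return true;
    case REF_COMMON:
        /* A shared library definition is bound to a common symbol,
           but an archive member is not extracted just for one. */
        return lib == LIB_SHARED;
    case REF_DEFINED:
        /* The program's own definition is used. */
        return false;
    }
    assert(!"unreachable: unknown ref_kind");
    return false;
}
```

The only row where the two kinds of library disagree is REF_COMMON.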

In some cases, it will be desirable to resolve a common symbol with
the definition in the shared library.  This would be desirable when
the actual intent was to use the initialized variable in the shared
library.  For example, if a program uses ``int optind;'' rather than
``extern int optind;'', and it also calls getopt, then, with either a
shared library or an archive library, the common symbol optind will
wind up referring to the initialized variable optind, rather than
creating a new, uninitialized, optind variable.

Thus, the only sense in which shared libraries occupy a program's name
space is that common symbols will sometimes be bound to symbols
defined in a shared library.

Now that I've written this, I see that, since an ELF symbol table
records whether a symbol represents a function or a variable, it would
be possible to make the linker refuse to resolve a common symbol
(which must represent a variable) against a function defined in a
shared library.  This will introduce a somewhat confusing situation in
that a single symbol will be both a variable and a function.  However,
as far as I can see, if any actual confusion results from this, then
the same program would not have worked correctly in the archive
library case either.
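
A minimal sketch of the check, assuming the linker can see the type
recorded for the shared library's definition.  The enum and function
names are hypothetical; the numeric values mirror the ELF constants
STT_OBJECT and STT_FUNC, and are redefined here so that the sketch
does not depend on <elf.h>.

```c
#include <assert.h>
#include <stdbool.h>

/* Symbol types as recorded in an ELF symbol table.  Only the two
   types relevant to this discussion are modeled. */
enum sym_type {
    SYM_OBJECT = 1,  /* a variable (STT_OBJECT) */
    SYM_FUNC   = 2   /* a function (STT_FUNC) */
};

/* Return true if a common symbol in the program may be resolved
   against a shared library definition of the given type.  A common
   symbol must represent a variable, so a function is refused. */
bool common_binds_to_shared_definition(enum sym_type shared_def_type)
{
    assert(shared_def_type == SYM_OBJECT || shared_def_type == SYM_FUNC);
    return shared_def_type != SYM_FUNC;
}
```

When the function returns false, the common symbol would stay a
separate variable in the program, as it would in the archive library
case.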

I will make this change.  This will, as it happens, fix the particular
test case which started this thread, so I was wrong in believing that
this was a bug in the shared library.

This will reduce the problematic cases in a shared library to global
variables.  It will continue to be the case that ELF common symbols
and global variables will be resolved differently when using shared
libraries as opposed to archive libraries.  This should be much less
of a problem, since libraries typically have relatively few global
variables.

   Doing this right does not require any changes in the shared library
   run-time mechanism.  It only requires some way of representing, in the
   shared library's symbol table, a division of external symbols into
   various "library members".  Then ld can treat as weak any external
   definitions which are not in the same "library members" as some symbol
   that is referenced.  Each "library member" should have references as
   well as definitions; that way, ld can tell that if "library member" A
   is referenced, and it references member B, then the definitions in B
   are not weak.

   With an open-ended format such as ELF, it should not be hard to design
   a way of representing this information, which does not conflict with
   anything else and will not confuse other linkers.  If ld finds this
   data, it should act accordingly; otherwise, it should do what it does
   now.  That way, each of our tools is upward compatible.

   This will make it possible to turn any unshared library into a shared
   library, with no special precautions, and get no change in the
   behavior except for sharing of memory.

I cannot see how to implement this without more than doubling the
time it takes to link against a shared library.  Linking against a
shared library is a common operation, some shared libraries are large,
and the amount of time it takes to link against them is important.  I
believe that this would be a poor tradeoff.
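
To make the extra work concrete, here is a rough C sketch of the
closure over "library members" that the quoted proposal would require
ld to compute.  It is a model under assumptions, not a design: the
representation (a fixed-size reference matrix) and the names
``member_graph'' and ``mark_needed'' are hypothetical, and it says
nothing about how the member information would be stored in the
shared library.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_MEMBERS 16

/* Hypothetical record of the member structure of a shared library:
   refs[i][j] is true when member i references a symbol defined in
   member j. */
struct member_graph {
    size_t count;
    bool refs[MAX_MEMBERS][MAX_MEMBERS];
};

/* Mark `start` and every member reachable from it through references.
   Under the proposal, definitions in members left unmarked would be
   treated as weak.  Call once per member that the program references
   directly, reusing the same `needed` array. */
void mark_needed(const struct member_graph *g, size_t start,
                 bool needed[MAX_MEMBERS])
{
    assert(g->count <= MAX_MEMBERS && start < g->count);

    size_t stack[MAX_MEMBERS];  /* each member is pushed at most once */
    size_t top = 0;

    if (needed[start])
        return;
    needed[start] = true;
    stack[top++] = start;

    while (top > 0) {
        size_t m = stack[--top];
        for (size_t j = 0; j < g->count; j++) {
            if (g->refs[m][j] && !needed[j]) {
                needed[j] = true;
                stack[top++] = j;
            }
        }
    }
}
```

For instance, with three members where A references B and nothing
references C, starting from A marks A and B and leaves C unmarked.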

Ian