From mboxrd@z Thu Jan  1 00:00:00 1970
From: Charles Wilson <cwilson@ece.gatech.edu>
To: DJ Delorie <dj@delorie.com>
Cc: binutils@sources.redhat.com, cygwin@cygwin.com, Paul Sokolovsky <paul.sokolovsky@technologist.com>
Subject: Re: [aida_s@mx12.freecom.ne.jp: A serious bug of "ld --enable-auto-import"]
Date: Sun, 26 Aug 2001 15:35:00 -0000
Message-id: <3B8979A4.5060605@ece.gatech.edu>
References: <3B8884F6.80708@ece.gatech.edu> <200108260530.BAA28221@envy.delorie.com> <3B888D76.6090102@ece.gatech.edu> <200108260613.CAA28557@envy.delorie.com> <3B891172.9000207@ece.gatech.edu> <200108261543.LAA06415@envy.delorie.com> <3B891E23.9090407@ece.gatech.edu> <200108261643.MAA06855@envy.delorie.com>
X-SW-Source: 2001-08/msg00616.html

DJ Delorie wrote:

>>Well, that's interesting.  Since arrays ARE pointers(*), then perhaps 
>>it's enough to change gcc's behavior from
>>
> 
> Not from gcc's perspective.  From C's perspective, array symbols and
> pointer symbols are mostly interchangeable, but they are not the same.
> For example, these two declarations:
> 
> 	extern char *foo;
> 	extern char foo[];
> 
> are *not* the same, and using the wrong one results in a broken
> program.


Thanks for not ridiculing my thinko.  I knew this.


> For our purposes, a pointer is a symbol referencing a four-byte range
> of memory that holds the address of a range of memory that holds a
> sequence of characters, and an array is a symbol referencing a range
> of memory that holds a sequence of characters.  Because a pointer
> requires an extra indirection, gcc is limited in the optimizations it
> can do on it, but dealing with imports becomes simpler because the
> address occurs in exactly one place.
> 
> Since a symbol is always a constant (regardless of what it refers to),
> offsetting it by a constant results in a sum that can always be
> computed at compile time (well, link time) and gcc will always do it
> that way.  This is a fairly fundamental concept in gcc, and I doubt it
> would be practical to tell gcc to do it otherwise.
> 

AHA! But the auto-import code replaces the extra indirection (for 
DATA access into a DLL) with the actual address in the loaded DLL.  (see 
docs pasted below).  Perhaps the auto-import needs to create additional 
pseudo-symbols for index-array access.  E.g.
   hwstr
   hwstr[1]
   hwstr[2]
   hwstr[12]
could each be mapped to *different* "fake" symbols.  Then, the runtime 
loader would just replace them as before -- but this time, with the 
correct (offset) address in the DLL.

Downside: could lead to an explosion of symbols, if there's a lot of 
constant-offset indexing into arrays exported by the DLL.  (Variable 
offsets are computed at runtime, of course.  No problem there.  And it 
seems that ONLY arrays are subject to this problem...if I understand 
correctly)

Oh shoot.  I just realized that the above is garbage.  How will the DLL 
know *which* fake symbols to export?  It can't know how an external 
client will access an array variable, so the DLL has to export fake 
symbols for every conceivable constant index.  This is *possible* -- 
since we're talking about arrays (i.e. with fixed length; these are 
*not* pointers <g>) -- but not really practical.  A simple array 
foo[4096] leads to 4097 exported symbols.  No, that's just silly.

I'm going back to square one on this problem.  I'm out of ideas on this 
one.  Paul?  Paaauuulll?

FWIW, this is what a disassembly of hello.exe looks like (no declspec 
decorators, using the auto-import stuff).  Notice the "fixup" labels 
__fuN__symbol:

00401044 <_main>:
   401044:       55                      push   %ebp
   401045:       89 e5                   mov    %esp,%ebp
   401047:       83 ec 18                sub    $0x18,%esp
   40104a:       e8 8d 00 00 00          call   4010dc <___main>
   40104f:       c6 05 04 41 40 00 21    movb   $0x21,0x404104
                                                      ^^^^^^^^
                                              this is off by 12

00401051 <__fu0__hwstr1>:
   401051:       04 41                   add    $0x41,%al
   401053:       40                      inc    %eax
   401054:       00 21                   add    %ah,(%ecx)
   401056:       c7 45 fc fc 40 40 00    movl   $0x4040fc,0xfffffffc(%ebp)

00401059 <__fu2__hwstr2>:
   401059:       fc                      cld
   40105a:       40                      inc    %eax
   40105b:       40                      inc    %eax
   40105c:       00 8b 45 fc 83 c0       add    %cl,0xc083fc45(%ebx)
   401062:       0c c6                   or     $0xc6,%al
   401064:       00 21                   add    %ah,(%ecx)
   401066:       83 c4 f4                add    $0xfffffff4,%esp
   401069:       68 f8 40 40 00          push   $0x4040f8

0040106a <__fu1__hwstr1>:
   40106a:       f8                      clc
   40106b:       40                      inc    %eax
   40106c:       40                      inc    %eax
   40106d:       00 e8                   add    %ch,%al
   40106f:       71 00                   jno    401071 <__fu1__hwstr1+0x7>
   401071:       00 00                   add    %al,(%eax)
   401073:       83 c4 10                add    $0x10,%esp
   401076:       83 c4 f4                add    $0xfffffff4,%esp
   401079:       68 fc 40 40 00          push   $0x4040fc

0040107a <__fu3__hwstr2>:
   40107a:       fc                      cld
   40107b:       40                      inc    %eax
   40107c:       40                      inc    %eax
   40107d:       00 e8                   add    %ch,%al
   40107f:       61                      popa
   401080:       00 00                   add    %al,(%eax)
   401082:       00 83 c4 10 31 c0       add    %al,0xc03110c4(%ebx)
   401088:       eb 02                   jmp    40108c <__fu3__hwstr2+0x12>
   40108a:       89 f6                   mov    %esi,%esi
   40108c:       89 ec                   mov    %ebp,%esp
   40108e:       5d                      pop    %ebp
   40108f:       c3                      ret

Funky, huh?

--Chuck


Quoting from pe-dll.c:
------------------------------------
Auto-import feature by Paul Sokolovsky

  Quick facts:

  1. With this feature on, DLL clients can import variables from DLL 
without any concern from their side (for example, without any source 
code modifications).

  2. This is done completely in bounds of the PE specification (to be 
fair, there's a place where it pokes nose out of, but in practise it 
works). So, resulting module can be used with any other PE compiler/linker.

  3. Auto-import is fully compatible with standard import method and 
they can be mixed together.

  4. Overheads: space: 8 bytes per imported symbol, plus 20 for each 
reference to it; load time: negligible; virtual/physical memory: should 
be less than effect of DLL relocation, and I sincerely hope it doesn't 
affect DLL sharability (too much).

  Idea

  The obvious and only way to get rid of dllimport insanity is to make 
client access variable directly in the DLL, bypassing extra dereference. 
I.e., whenever client contains something like

  mov dll_var,%eax,

address of dll_var in the command should be relocated to point into 
loaded DLL. The aim is to make OS loader do so, and then make ld help 
with that. Import section of PE made following way: there's a vector of 
structures each describing imports from particular DLL. Each such 
structure points to two other parallel vectors: one holding imported 
names, and one which will hold address of corresponding imported name. 
So, the solution is de-vectorize these structures, making import 
locations be sparse and pointing directly into code. Before continuing, 
it is worth a note that, while authors strives to make PE act ELF-like, 
there're some other people make ELF act PE-like: elfvector, ;-) .

  Implementation

  For each reference of data symbol to be imported from DLL (to set of 
which belong symbols with name <sym>, if __imp_<sym> is found in 
implib), the import fixup entry is generated. That entry is of type 
IMAGE_IMPORT_DESCRIPTOR and stored in .idata$3 subsection. Each fixup 
entry contains pointer to symbol's address within .text section (marked 
with __fuN_<sym> symbol, where N is integer), pointer to DLL name (so, 
DLL name is referenced by multiple entries), and pointer to symbol name 
thunk. Symbol name thunk is singleton vector (__nm_th_<symbol>) pointing 
to IMAGE_IMPORT_BY_NAME structure (__nm_<symbol>) directly containing 
imported name. Here comes that "on the edge" problem mentioned above: PE 
specification rambles that name vector (OriginalFirstThunk) should run 
in parallel with addresses vector (FirstThunk), i.e. that they should 
have same number of elements and terminated with zero. We violate this, 
since FirstThunk points directly into machine code. But in practise, OS 
loader implemented the sane way: it goes thru OriginalFirstThunk and 
puts addresses to FirstThunk, not something else. It once again should 
be noted that dll and symbol name structures are reused across fixup 
entries and should be there anyway to support standard import stuff, so 
sustained overhead is 20 bytes per reference. Other question is whether 
having several IMAGE_IMPORT_DESCRIPTORS for the same DLL is possible. 
Answer is yes, it is done even by native compiler/linker (libth32's 
functions are in fact reside in windows9x kernel32.dll, so if you use 
it, you have two IMAGE_IMPORT_DESCRIPTORS for kernel32.dll). Yet other 
question is whether referencing the same PE structures several times is 
valid. The answer is why not, prohibiting that (detecting violation) 
would require more work on behalf of loader than not doing it.
--------------------------------------------