public inbox for cygwin@cygwin.com
* Regression for OCaml introduced by rebase 4.4.4
@ 2018-02-08 11:47 David Allsopp
  2018-02-08 15:04 ` Corinna Vinschen
  2018-02-08 15:15 ` Corinna Vinschen
  0 siblings, 2 replies; 12+ messages in thread
From: David Allsopp @ 2018-02-08 11:47 UTC (permalink / raw)
  To: cygwin

TL;DR flexlink-compiled DLLs (i.e. ocaml libraries) are broken by the
0x200000000 base address requirement added in rebase 4.4.4. Possible fixes
for this at the bottom.

Commit bfd383 in the rebase sources introduces a new minimum base address
requirement of 0x200000000 for Cygwin64. This is a problem for the correct
operation of the flexdll package and affects ocaml. On a fresh up-to-date
Cygwin64 installation, install the ocaml package:

  $ rebase -i /usr/lib/ocaml/stublibs/*
  /usr/lib/ocaml/stublibs/dllbigarray.so    base 0x000000010000 size 0x00015000 *
  /usr/lib/ocaml/stublibs/dllcamlstr.so     base 0x000000010000 size 0x00014000 *
  /usr/lib/ocaml/stublibs/dllgraphics.so    base 0x000000010000 size 0x00038000 *
  /usr/lib/ocaml/stublibs/dllnums.so        base 0x000000010000 size 0x00011000 *
  /usr/lib/ocaml/stublibs/dllthreads.so     base 0x000000010000 size 0x00025000 *
  /usr/lib/ocaml/stublibs/dllunix.so        base 0x000000010000 size 0x0004c000 *
  /usr/lib/ocaml/stublibs/dllvmthreads.so   base 0x000000010000 size 0x0001f000 *

Here you can see a problem we already know about with flexlink - all
libraries have a base address of 0x10000
(https://github.com/alainfrisch/flexdll/issues/50).

However, this allows you to load libraries dynamically:

  $ ocaml
          OCaml version 4.04.2

  # #load "unix.cma";;
  # #directory "+threads";;
  # #load "threads.cma";;

but not fork (we know about this problem):

  # Unix.fork ();;
        0 [main] ocamlrun 5688 child_info_fork::abort: address space needed by 'dllunix.so' (0x400000) is already occupied
  Exception: Unix.Unix_error (Unix.EAGAIN, "fork", "").

Now do a rebaseall.

  $ rebase -i /usr/lib/ocaml/stublibs/*
  /usr/lib/ocaml/stublibs/dllvmthreads.so   base 0x0003fec20000 size 0x0001f000
  /usr/lib/ocaml/stublibs/dllunix.so        base 0x0003fec40000 size 0x0004c000
  /usr/lib/ocaml/stublibs/dllthreads.so     base 0x0003fec90000 size 0x00025000
  /usr/lib/ocaml/stublibs/dllnums.so        base 0x0003fecc0000 size 0x00011000
  /usr/lib/ocaml/stublibs/dllgraphics.so    base 0x0003fece0000 size 0x00038000
  /usr/lib/ocaml/stublibs/dllcamlstr.so     base 0x0003fed20000 size 0x00014000
  /usr/lib/ocaml/stublibs/dllbigarray.so    base 0x0003fed40000 size 0x00015000

So forking should now be fine. However:

  $ ocaml
          OCaml version 4.04.2

  # #load "unix.cma";;
  Cannot load required shared library dllunix.
  Reason: /usr/lib/ocaml/stublibs/dllunix.so: flexdll error: cannot relocate RELOC_REL32, target is too far: 0xfffffffc013d8b5f  0x13d8b5f.

This is a known problem and fundamental limitation of flexdll (there is no
RELOC_REL64 in COFF). On our CI, we have been using a workaround for the
fork problem at
https://github.com/ocaml/ocaml/blob/trunk/tools/ci-build#L230-L231 but that
no longer works with rebase 4.4.4 because of the new minimum base address.
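
To make the "too far" condition concrete, here is a minimal Python sketch of
the range check. It is not flexdll's code; the helper names are made up, and
it assumes that the first value in the error message is the 64-bit
displacement the relocation would have to encode:

```python
def to_signed64(value):
    """Reinterpret an unsigned 64-bit value as a signed two's-complement integer."""
    return value - (1 << 64) if value & (1 << 63) else value

def fits_rel32(displacement):
    """True if a signed displacement fits in a 32-bit relative relocation."""
    return -(1 << 31) <= displacement < (1 << 31)

# First value from the flexdll error message above (assumed to be the
# required displacement, printed as an unsigned 64-bit number).
disp = to_signed64(0xFFFFFFFC013D8B5F)
print(hex(disp), fits_rel32(disp))
```

Under that assumption the displacement is roughly -16GB, far outside the
+/-2GB a 32-bit relative relocation can reach, and the second value in the
message is simply its low 32 bits.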

It was already the case that rebaseall was breaking OCaml DLLs, but now with
4.4.4 they cannot even be fixed by hand, so it's clearly a good moment to
put some effort into this (read as: I'm offering both coding and testing
time!).

For this to work at all, there needs to be some address space below
0x80000000 which DLLs may be permitted to opt-in to using and which rebase
needs to respect. Assuming that's OK, I think something along the following
lines is needed:

 1. We (ab)use either a DLL characteristics flag or a section header flag to
indicate that the DLL needs to be loaded below 0x80000000
 2. The rebase utility warning for base addresses should take that flag into
account (to the point of requiring < 0x80000000 if this new bit is set in
the image)
[2a. While we're changing validation for the image base, it'd be sensible to
add a check that the supplied address is 64K aligned :$]
 3. The flexlink utility should stop using 0x10000 all the time. Probably
the best way to achieve this is if the rebase utility has a flag which
*sets* the new bit so that flexlink calls rebase after compilation to assign
an improved base address to the DLL. On x86, we don't force a given base
address at all - I assume that Cygwin's binutils stuff is already
rebase-aware and produces sensible base addresses for newly-compiled DLLs,
as I don't recall having ever seen the fork conflict problem on x86 builds
of OCaml?
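
As a rough illustration of points 2 and 2a, the validation could look
something like the Python sketch below. This is not rebase code: the
`needs_low` parameter stands in for the proposed opt-in bit, which does not
exist today, and the function name and messages are invented for the example.

```python
ALIGNMENT = 0x10000        # image bases must be 64K aligned
LOW_LIMIT = 0x80000000     # proposed ceiling for DLLs that opt in
REBASE_MIN = 0x200000000   # minimum base enforced by rebase 4.4.4 on Cygwin64

def check_image_base(base, needs_low=False):
    """Return a list of complaints about a requested image base.

    needs_low models the hypothetical "load me below 0x80000000" flag.
    """
    problems = []
    if base % ALIGNMENT:
        problems.append("not 64K aligned")
    if needs_low and base >= LOW_LIMIT:
        problems.append("must be below 0x80000000")
    if not needs_low and base < REBASE_MIN:
        problems.append("must be at or above 0x200000000")
    return problems

print(check_image_base(0x10000))                   # current flexlink default
print(check_image_base(0x10000, needs_low=True))   # same base with the flag set
```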

Comments on the proposed need for some DLLs to occupy memory below
0x80000000 and on the fixes are much appreciated, thanks!


David


--
Problem reports:       http://cygwin.com/problems.html
FAQ:                   http://cygwin.com/faq/
Documentation:         http://cygwin.com/docs.html
Unsubscribe info:      http://cygwin.com/ml/#unsubscribe-simple


* Re: Regression for OCaml introduced by rebase 4.4.4
  2018-02-08 11:47 Regression for OCaml introduced by rebase 4.4.4 David Allsopp
@ 2018-02-08 15:04 ` Corinna Vinschen
  2018-02-09 11:30   ` David Allsopp
  2018-02-08 15:15 ` Corinna Vinschen
  1 sibling, 1 reply; 12+ messages in thread
From: Corinna Vinschen @ 2018-02-08 15:04 UTC (permalink / raw)
  To: cygwin

[-- Attachment #1: Type: text/plain, Size: 2533 bytes --]

On Feb  8 11:47, David Allsopp wrote:
> TL;DR flexlink-compiled DLLs (i.e. ocaml libraries) are broken by the
> 0x200000000 base address requirement added in rebase 4.4.4. Possible fixes
> for this at the bottom.
> 
> Commit bfd383 in the rebase sources introduces a new minimum base address
> requirement of 0x200000000 for Cygwin64. This is a problem for the correct
> operation of the flexdll package and affects ocaml. On a fresh up-to-date
> Cygwin64 installation, install the ocaml package:
> 
>   $ rebase -i /usr/lib/ocaml/stublibs/*
>   /usr/lib/ocaml/stublibs/dllbigarray.so    base 0x000000010000 size
> 0x00015000 *
>   /usr/lib/ocaml/stublibs/dllcamlstr.so     base 0x000000010000 size
> 0x00014000 *
>   /usr/lib/ocaml/stublibs/dllgraphics.so    base 0x000000010000 size
> 0x00038000 *
> [...]
> Here you can see a problem we already know about with flexlink - all
> libraries have a base address of 0x10000
> (https://github.com/alainfrisch/flexdll/issues/50).
> 
> However, this allows you to load libraries dynamically:
> 
>   $ ocaml
>           OCaml version 4.04.2
> 
>   # #load "unix.cma";;
>   # #directory "+threads";;
>   # #load "threads.cma";;
> 
> but not fork (we know about this problem):
> 
>   # Unix.fork ();;
>         0 [main] ocamlrun 5688 child_info_fork::abort: address space needed
> by 'dllunix.so' (0x400000) is already occupied
>   Exception: Unix.Unix_error (Unix.EAGAIN, "fork", "").
> 
> Now do a rebaseall.
> 
>   $ rebase -i /usr/lib/ocaml/stublibs/*
>   /usr/lib/ocaml/stublibs/dllvmthreads.so   base 0x0003fec20000 size
> 0x0001f000
>   /usr/lib/ocaml/stublibs/dllunix.so        base 0x0003fec40000 size
> 0x0004c000
> [...]
> 
> So forking should now be fine. However:
> 
>   $ ocaml
>           OCaml version 4.04.2
> 
>   # #load "unix.cma";;
>   Cannot load required shared library dllunix.
>   Reason: /usr/lib/ocaml/stublibs/dllunix.so: flexdll error: cannot relocate
> RELOC_REL32, target is too far: 0xfffffffc013d8b5f  0x13d8b5f. 

The problem is this:  Given that the lib is in a safe space anyway,
why do you still try to relocate it?  That's exactly what you don't
have to do anymore and you shouldn't do this.  The DLL is loaded
where it belongs, end of story.  What should another relocation
gain?  So, just switch it off for 64 bit Cygwin, no?


Corinna

-- 
Corinna Vinschen                  Please, send mails regarding Cygwin to
Cygwin Maintainer                 cygwin AT cygwin DOT com
Red Hat

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]


* Re: Regression for OCaml introduced by rebase 4.4.4
  2018-02-08 11:47 Regression for OCaml introduced by rebase 4.4.4 David Allsopp
  2018-02-08 15:04 ` Corinna Vinschen
@ 2018-02-08 15:15 ` Corinna Vinschen
  2018-02-09 11:30   ` David Allsopp
  1 sibling, 1 reply; 12+ messages in thread
From: Corinna Vinschen @ 2018-02-08 15:15 UTC (permalink / raw)
  To: cygwin

[-- Attachment #1: Type: text/plain, Size: 1158 bytes --]

On Feb  8 11:47, David Allsopp wrote:
> TL;DR flexlink-compiled DLLs (i.e. ocaml libraries) are broken by the
> 0x200000000 base address requirement added in rebase 4.4.4. Possible fixes
> for this at the bottom.
> [...]
>   $ ocaml
>           OCaml version 4.04.2
> 
>   # #load "unix.cma";;
>   Cannot load required shared library dllunix.
>   Reason: /usr/lib/ocaml/stublibs/dllunix.so: flexdll error: cannot relocate
> RELOC_REL32, target is too far: 0xfffffffc013d8b5f  0x13d8b5f. 
> 
> This is a known problem and fundamental limitation of flexdll (there is no
> RELOC_REL64 in COFF).

Apart from that, not only Cygwin DLLs but also the Windows system DLLs
are all loaded and relocated to the area beyond 0x1:80000000, so relocation
beyond the 32 bit address space is no generic problem in Windows.  Why
isn't that possible in FlexDLL?  I don't understand this.  To me this looks
like a bug in FlexDLL, not a requirement to let certain DLLs slip through
the cracks.


Corinna

-- 
Corinna Vinschen                  Please, send mails regarding Cygwin to
Cygwin Maintainer                 cygwin AT cygwin DOT com
Red Hat

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]


* RE: Regression for OCaml introduced by rebase 4.4.4
  2018-02-08 15:04 ` Corinna Vinschen
@ 2018-02-09 11:30   ` David Allsopp
  0 siblings, 0 replies; 12+ messages in thread
From: David Allsopp @ 2018-02-09 11:30 UTC (permalink / raw)
  To: cygwin

Corinna Vinschen wrote:
> On Feb  8 11:47, David Allsopp wrote:
> > TL;DR flexlink-compiled DLLs (i.e. ocaml libraries) are broken by the
> > 0x200000000 base address requirement added in rebase 4.4.4. Possible
> > fixes for this at the bottom.
> >
> > Commit bfd383 in the rebase sources introduces a new minimum base
> > address requirement of 0x200000000 for Cygwin64. This is a problem for
> > the correct operation of the flexdll package and affects ocaml. On a
> > fresh up-to-date
> > Cygwin64 installation, install the ocaml package:
> >
> >   $ rebase -i /usr/lib/ocaml/stublibs/*
> >   /usr/lib/ocaml/stublibs/dllbigarray.so    base 0x000000010000 size
> > 0x00015000 *
> >   /usr/lib/ocaml/stublibs/dllcamlstr.so     base 0x000000010000 size
> > 0x00014000 *
> >   /usr/lib/ocaml/stublibs/dllgraphics.so    base 0x000000010000 size
> > 0x00038000 *
> > [...]
> > Here you can see a problem we already know about with flexlink - all
> > libraries have a base address of 0x10000
> > (https://github.com/alainfrisch/flexdll/issues/50).
> >
> > However, this allows you to load libraries dynamically:
> >
> >   $ ocaml
> >           OCaml version 4.04.2
> >
> >   # #load "unix.cma";;
> >   # #directory "+threads";;
> >   # #load "threads.cma";;
> >
> > but not fork (we know about this problem):
> >
> >   # Unix.fork ();;
> >         0 [main] ocamlrun 5688 child_info_fork::abort: address space
> > needed by 'dllunix.so' (0x400000) is already occupied
> >   Exception: Unix.Unix_error (Unix.EAGAIN, "fork", "").
> >
> > Now do a rebaseall.
> >
> >   $ rebase -i /usr/lib/ocaml/stublibs/*
> >   /usr/lib/ocaml/stublibs/dllvmthreads.so   base 0x0003fec20000 size
> > 0x0001f000
> >   /usr/lib/ocaml/stublibs/dllunix.so        base 0x0003fec40000 size
> > 0x0004c000
> > [...]
> >
> > So forking should now be fine. However:
> >
> >   $ ocaml
> >           OCaml version 4.04.2
> >
> >   # #load "unix.cma";;
> >   Cannot load required shared library dllunix.
> >   Reason: /usr/lib/ocaml/stublibs/dllunix.so: flexdll error: cannot
> > relocate RELOC_REL32, target is too far: 0xfffffffc013d8b5f  0x13d8b5f.
> 
> The problem is this:  Given that the lib is in a safe space anyway, why
> do you still try to relocate it?  That's exactly what you don't have to
> do anymore and you shouldn't do this.  The DLL is loaded where it
> belongs, end of story.  What should another relocation gain?  So, just
> switch it off for 64 bit Cygwin, no?

See other message, but we're not relocating the DLL, we're performing COFF relocations on specific symbols which were deferred from link time.


David




* RE: Regression for OCaml introduced by rebase 4.4.4
  2018-02-08 15:15 ` Corinna Vinschen
@ 2018-02-09 11:30   ` David Allsopp
  2018-02-09 11:40     ` Corinna Vinschen
  0 siblings, 1 reply; 12+ messages in thread
From: David Allsopp @ 2018-02-09 11:30 UTC (permalink / raw)
  To: cygwin

[-- Attachment #1: Type: text/plain; charset="utf-8", Size: 3424 bytes --]

Corinna Vinschen wrote:
> On Feb  8 11:47, David Allsopp wrote:
> > TL;DR flexlink-compiled DLLs (i.e. ocaml libraries) are broken by the
> > 0x200000000 base address requirement added in rebase 4.4.4. Possible
> > fixes for this at the bottom.
> > [...]
> >   $ ocaml
> >           OCaml version 4.04.2
> >
> >   # #load "unix.cma";;
> >   Cannot load required shared library dllunix.
> >   Reason: /usr/lib/ocaml/stublibs/dllunix.so: flexdll error: cannot
> > relocate RELOC_REL32, target is too far: 0xfffffffc013d8b5f  0x13d8b5f.
> >
> > This is a known problem and fundamental limitation of flexdll (there
> > is no
> > RELOC_REL64 in COFF).
> 
> Apart from that, not only Cygwin DLLs but also the Windows system DLLs
> are all loaded and relocated to the area beyond 0x1:80000000, so
> relocation beyond the 32 bit address space is no generic problem in
> Windows.  Why isn't that possible in FlexDLL?  I don't understand this.
> To me this looks like a bug in FlexDLL, not a requirement to let certain
> DLLs slip through the cracks.

There's a fuller explanation of what flexdll does and why here: https://github.com/alainfrisch/flexdll/blob/master/README.md. I believe it's not unrelated to some of the black magic going on in Cygwin's autoload.cc, but without (at least at the moment) quite as much self-modifying code.

FlexDLL is "solving" the problem of allowing a dynamically loaded library to refer to symbols in the main application (or in previously dynamically loaded libraries, without loading them a second time, as I believe the Windows loader does). FlexDLL does this by deferring COFF relocations to runtime, which it achieves by sitting in front of both the linker when the DLL is constructed and the application's main (or dllmain). For normal linking, since PE limits code size to 2GB, there is no need for a RELOC_REL64 relocation type. However, because we're actually resolving the symbols dynamically, on 64-bit the DLL may have been loaded too far from the executable (or other DLL) image it's resolving to. (For actual Windows resolution to DLL symbols, you'd be using the stub code generated either by the linker or by __declspec(dllimport), which would similarly be guaranteed to be within the range of RELOC_REL32 because the stub itself is static.)

When this was originally encountered for 64-bit MSVC (this was all added before Cygwin64 existed), the solution at the time was to keep the preferred base addresses low, but what's actually required is that everything is within a 2GB window somewhere in the address space.

I guess one can argue over whether that's a bug or a limitation, but the problem we face is that we can engineer it so that our DLLs and executables are within a 2GB range (having looked again at this in even more detail, we could just as readily do this with addresses > 0x200000000), but we still run the risk of rebase messing up the DLLs.

However, we'll scratch our heads some more on possible alternative solutions, since having a flag for DLLs which says "keep us within a 2GB range somewhere" sounds even less likely to get merged than my previous suggestion.
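
For what it's worth, the "everything within a 2GB window" condition can be written down as a small Python sketch. This is only an illustration, not anything flexlink does; the executable base and all sizes below are assumed values, and only the DLL bases come from the rebase listings earlier in the thread.

```python
WINDOW = 1 << 31  # RELOC_REL32 reaches at most 2GB in either direction

def within_window(images, window=WINDOW):
    """images: list of (base, size). True if one span of `window` bytes covers them all."""
    lowest = min(base for base, _ in images)
    highest = max(base + size for base, size in images)
    return highest - lowest <= window

exe = (0x400000, 0x200000)               # assumed executable base and size
low_dll = (0x10000, 0x4C000)             # dllunix.so at flexlink's default base
rebased_dll = (0x3FEC40000, 0x4C000)     # dllunix.so after rebaseall

print(within_window([exe, low_dll]))      # both images sit low in the address space
print(within_window([exe, rebased_dll]))  # the rebased DLL is far above the executable
```

The check is deliberately conservative: if all images fit in one 2GB span, every displacement between them fits in a signed 32-bit field.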


David


* Re: Regression for OCaml introduced by rebase 4.4.4
  2018-02-09 11:30   ` David Allsopp
@ 2018-02-09 11:40     ` Corinna Vinschen
  2018-02-09 13:11       ` Corinna Vinschen
  2018-02-09 13:13       ` David Allsopp
  0 siblings, 2 replies; 12+ messages in thread
From: Corinna Vinschen @ 2018-02-09 11:40 UTC (permalink / raw)
  To: cygwin

[-- Attachment #1: Type: text/plain, Size: 3962 bytes --]

On Feb  9 11:29, David Allsopp wrote:
> Corinna Vinschen wrote:
> > On Feb  8 11:47, David Allsopp wrote:
> > > TL;DR flexlink-compiled DLLs (i.e. ocaml libraries) are broken by the
> > > 0x200000000 base address requirement added in rebase 4.4.4. Possible
> > > fixes for this at the bottom.
> > > [...]
> > >   $ ocaml
> > >           OCaml version 4.04.2
> > >
> > >   # #load "unix.cma";;
> > >   Cannot load required shared library dllunix.
> > >   Reason: /usr/lib/ocaml/stublibs/dllunix.so: flexdll error: cannot
> > > relocate RELOC_REL32, target is too far: 0xfffffffc013d8b5f  0x13d8b5f.
> > >
> > > This is a known problem and fundamental limitation of flexdll (there
> > > is no
> > > RELOC_REL64 in COFF).
> > 
> > Apart from that, not only Cygwin DLLs but also the Windows system DLLs
> > are all loaded and relocated to the area beyond 0x1:80000000, so
> > relocation beyond the 32 bit address space is no generic problem in
> > Windows.  Why isn't that possible in FlexDLL?  I don't understand this.
> > To me this looks like a bug in FlexDLL, not a requirement to let certain
> > DLLs slip through the cracks.
> 
> There's a more full explanation of what and why for flexdll here:
> https://github.com/alainfrisch/flexdll/blob/master/README.md. I
> believe it's not unrelated to some of the black magic going on in
> Cygwin's autoload.cc, but without (at least at the moment), quite as
> much self-modifying code.
> 
> FlexDLL is "solving" the problem of allowing a dynamically loaded
> library to refer to symbols in the main application (or in previously
> dynamically loaded libraries, without loading them a second time, as
> the Windows loader I believe does). FlexDLL does this by deferring
> COFF relocations to runtime and it achieves that by sitting in front
> of both the linker when the DLL is constructed and also an
> application's main (or dllmain). For normal linking, since PE limits
> code size to 2GB, there is no need for a RELOC_REL64 relocation type.
> However, because we're actually resolving the symbols dynamically, on
> 64-bit the DLL may have been loaded too far from the executable (or
> other DLL) image it's resolving to (for actual Windows resolution to
> DLL symbols, you'd be using the stub code generated either by the
> linker or by __declspec(dllimport), which would similarly be
> guaranteed to be within the range of RELOC_REL32 because the stub
> itself is static).
> 
> When this was originally encountered for 64-bit MSVC (this was all
> added before Cygwin64 existed), the solution at the time was to keep
> the preferred base addresses low, but in reality what's really
> required is that everything is within a 2GB window somewhere in the
> address space.
> 
> I guess one can argue over whether that's a bug or a limitation, but
> the problem we face is that we can engineer it so that our DLLs and
> executables are within a 2GB range (having looked again at this in
> even more detail, we could just as readily do this with addresses >
> 0x200000000), but we still run the risk of rebase messing up the DLLs.
> 
> However, we'll scratch our heads some more on possible alternative
> solutions, since having a flag for DLLs which says "keep us within a
> 2GB range somewhere" sounds even less likely to get merged than
> my previous suggestion.

Two points:

- You are aware that the main executable of 64 bit Cygwin processes is
  loaded to 0x1:00400000, right?  The 2 GB offset problem is already
  imminent.

- What about adding an additional jump table?  The relocation would only
  have to point to the jump table in the vicinity of the DLL in question,
  the jump table points to the actual 64 bit address.  I'm curious why
  this isn't done yet.
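
A rough Python sketch of the jump table idea, purely as an illustration and
not taken from FlexDLL: each table slot is an x86-64 `jmp qword ptr [rip+0]`
followed by the 64-bit absolute target, and the 32-bit relocation is pointed
at the slot instead of the far-away symbol.  All addresses and helper names
below are made up for the example.

```python
import struct

def make_trampoline(target):
    """Encode `jmp qword ptr [rip+0]` followed by the 64-bit absolute target."""
    return b"\xff\x25\x00\x00\x00\x00" + struct.pack("<Q", target)

def rel32_to(site_end, destination):
    """Displacement for a rel32 call/jump whose next instruction starts at site_end."""
    disp = destination - site_end
    if not -(1 << 31) <= disp < (1 << 31):
        raise ValueError("target is too far")
    return disp

far_symbol = 0x401000       # hypothetical function in the executable
call_site_end = 0x3FEC41005 # hypothetical call inside the rebased DLL
table_slot = 0x3FEC8B000    # hypothetical slot close to the DLL

slot_bytes = make_trampoline(far_symbol)             # 14 bytes per slot
print(hex(rel32_to(call_site_end, table_slot)))      # fits in 32 bits
```

This only covers calls and jumps; references to data in the other image
would need a different indirection.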


Corinna

-- 
Corinna Vinschen                  Please, send mails regarding Cygwin to
Cygwin Maintainer                 cygwin AT cygwin DOT com
Red Hat

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]


* Re: Regression for OCaml introduced by rebase 4.4.4
  2018-02-09 11:40     ` Corinna Vinschen
@ 2018-02-09 13:11       ` Corinna Vinschen
  2018-02-09 13:19         ` David Allsopp
  2018-02-09 13:13       ` David Allsopp
  1 sibling, 1 reply; 12+ messages in thread
From: Corinna Vinschen @ 2018-02-09 13:11 UTC (permalink / raw)
  To: cygwin

[-- Attachment #1: Type: text/plain, Size: 2505 bytes --]

On Feb  9 12:40, Corinna Vinschen wrote:
> On Feb  9 11:29, David Allsopp wrote:
> > > Apart from that, not only Cygwin DLLs but also the Windows system DLLs
> > > are all loaded and relocated to the area beyond 0x1:80000000, so
> > > relocation beyond the 32 bit address space is no generic problem in
> > > Windows.  Why isn't that possible in FlexDLL?  I don't understand this.
> > > To me this looks like a bug in FlexDLL, not a requirement to let certain
> > > DLLs slip through the cracks.
> > 
> > There's a more full explanation of what and why for flexdll here:
> > https://github.com/alainfrisch/flexdll/blob/master/README.md. I
> > believe it's not unrelated to some of the black magic going on in
> > Cygwin's autoload.cc, but without (at least at the moment), quite as
> > much self-modifying code.
> > [...]
> > I guess one can argue over whether that's a bug or a limitation, but
> > the problem we face is that we can engineer it so that our DLLs and
> > executables are within a 2GB range (having looked again at this in
> > even more detail, we could just as readily do this with addresses >
> > 0x200000000), but we still run the risk of rebase messing up the DLLs.
> > 
> > However, we'll scratch our heads some more on possible alternative
> > solutions, since having a flag for DLLs which says "keep us within a
> > > 2GB range somewhere" sounds even less likely to get merged than
> > my previous suggestion.
> 
> Two points:
> 
> - You are aware that the main executable of 64 bit Cygwin processes is
>   loaded to 0x1:00400000, right?  The 2 GB offset problem is already
>   imminent.

...and you must not use the 0x0:80000000 - 0x1:00000000 area because that's
reserved for thread stacks.

To clarify, the full layout requirements:

- 0x0:00000000 - 0x0:80000000	Windows
- 0x0:80000000 - 0x1:00000000	Cygwin pthreads (including main thread)
- 0x1:00000000 - 0x1:80000000	Executable
- 0x1:80000000 - 0x2:00000000	Cygwin DLL
- 0x2:00000000 - 0x4:00000000	Rebased DLLs
- 0x4:00000000 - 0x6:00000000	Non-rebased DLLs (hashed default addresses
				generated by binutils ld with
				--enable-auto-image-base (default on Cygwin))
- 0x6:00000000			Start address of the heap, growing upwards
- 0x8:00000000 - 0x700:00000000	Mmaps, allocated downwards
- 0x700:00000000 and beyond	Windows
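
The same layout as a small Python lookup, as a sketch only: the `addr` helper
mirrors the high:low notation used above, the end of the heap region is an
assumption (the table only gives its start address), and the function names
are invented for the example.

```python
def addr(high, low):
    """Build a 64-bit address from the high:low notation used in the table."""
    return (high << 32) | low

LAYOUT = [
    (addr(0x0, 0x00000000), addr(0x0, 0x80000000), "Windows"),
    (addr(0x0, 0x80000000), addr(0x1, 0x00000000), "Cygwin pthreads"),
    (addr(0x1, 0x00000000), addr(0x1, 0x80000000), "Executable"),
    (addr(0x1, 0x80000000), addr(0x2, 0x00000000), "Cygwin DLL"),
    (addr(0x2, 0x00000000), addr(0x4, 0x00000000), "Rebased DLLs"),
    (addr(0x4, 0x00000000), addr(0x6, 0x00000000), "Non-rebased DLLs"),
    # Assumption: the heap is treated as ending where the mmap area begins.
    (addr(0x6, 0x00000000), addr(0x8, 0x00000000), "Heap"),
    (addr(0x8, 0x00000000), addr(0x700, 0x00000000), "Mmaps"),
]

def region(address):
    """Name the layout region that contains the given address."""
    for start, end, name in LAYOUT:
        if start <= address < end:
            return name
    return "Windows"  # 0x700:00000000 and beyond

print(region(0x10000))       # flexlink's default DLL base
print(region(0x3FEC40000))   # dllunix.so after rebaseall
```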


Corinna

-- 
Corinna Vinschen                  Please, send mails regarding Cygwin to
Cygwin Maintainer                 cygwin AT cygwin DOT com
Red Hat

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]


* RE: Regression for OCaml introduced by rebase 4.4.4
  2018-02-09 11:40     ` Corinna Vinschen
  2018-02-09 13:11       ` Corinna Vinschen
@ 2018-02-09 13:13       ` David Allsopp
  2018-02-09 17:11         ` Corinna Vinschen
  1 sibling, 1 reply; 12+ messages in thread
From: David Allsopp @ 2018-02-09 13:13 UTC (permalink / raw)
  To: cygwin

[-- Attachment #1: Type: text/plain; charset="utf-8", Size: 4972 bytes --]

Corinna Vinschen wrote:
> On Feb  9 11:29, David Allsopp wrote:
> > Corinna Vinschen wrote:
> > > On Feb  8 11:47, David Allsopp wrote:
> > > > TL;DR flexlink-compiled DLLs (i.e. ocaml libraries) are broken by
> > > > the
> > > > 0x200000000 base address requirement added in rebase 4.4.4.
> > > > Possible fixes for this at the bottom.
> > > > [...]
> > > >   $ ocaml
> > > >           OCaml version 4.04.2
> > > >
> > > >   # #load "unix.cma";;
> > > >   Cannot load required shared library dllunix.
> > > >   Reason: /usr/lib/ocaml/stublibs/dllunix.so: flexdll error:
> > > > cannot relocate RELOC_REL32, target is too far: 0xfffffffc013d8b5f
> > > > 0x13d8b5f.
> > > >
> > > > This is a known problem and fundamental limitation of flexdll
> > > > (there is no
> > > > RELOC_REL64 in COFF).
> > >
> > > Apart from that, not only Cygwin DLLs but also the Windows system
> > > DLLs are all loaded and relocated to the area beyond 0x1:80000000,
> > > so relocation beyond the 32 bit address space is no generic problem
> > > in Windows.  Why isn't that possible in FlexDLL?  I don't understand
> > > this.
> > > To me this looks like a bug in FlexDLL, not a requirement to let
> > > certain DLLs slip through the cracks.
> >
> > There's a more full explanation of what and why for flexdll here:
> > https://github.com/alainfrisch/flexdll/blob/master/README.md. I
> > believe it's not unrelated to some of the black magic going on in
> > Cygwin's autoload.cc, but without (at least at the moment), quite as
> > much self-modifying code.
> >
> > FlexDLL is "solving" the problem of allowing a dynamically loaded
> > library to refer to symbols in the main application (or in previously
> > dynamically loaded libraries, without loading them a second time, as
> > the Windows loader I believe does). FlexDLL does this by deferring
> > COFF relocations to runtime and it achieves that by sitting in front
> > of both the linker when the DLL is constructed and also an
> > application's main (or dllmain). For normal linking, since PE limits
> > code size to 2GB, there is no need for a RELOC_REL64 relocation type.
> > However, because we're actually resolving the symbols dynamically, on
> > 64-bit the DLL may have been loaded too far from the executable (or
> > other DLL) image it's resolving to (for actual Windows resolution to
> > DLL symbols, you'd be using the stub code generated either by the
> > linker or by __declspec(dllimport), which would similarly be
> > guaranteed to be within the range of RELOC_REL32 because the stub
> > itself is static).
> >
> > When this was originally encountered for 64-bit MSVC (this was all
> > added before Cygwin64 existed), the solution at the time was to keep
> > the preferred base addresses low, but in reality what's really
> > required is that everything is within a 2GB window somewhere in the
> > address space.
> >
> > I guess one can argue over whether that's a bug or a limitation, but
> > the problem we face is that we can engineer it so that our DLLs and
> > executables are within a 2GB range (having looked again at this in
> > even more detail, we could just as readily do this with addresses >
> > 0x200000000), but we still run the risk of rebase messing up the DLLs.
> >
> > However, we'll scratch our heads some more on possible alternative
> > solutions, since having a flag for DLLs which says "keep us within a
> > 2GB range somewhere" sounds even less likely to get merged than
> > my previous suggestion.
> 
> Two points:
> 
> - You are aware that the main executable of 64 bit Cygwin processes is
>   loaded to 0x1:00400000, right?  The 2 GB offset problem is already
>   imminent.

Our executables are also compiled via flexdll's flexlink which sets --image-base in its call to the linker. I don't think the Cygwin DLL does anything which alters that, right? Another "fix" I tried while investigating was to change the --image-base we specified to be within 2GB of where rebase has put the DLLs, which also worked.

> - What about adding an additional jump table?  The relocation would only
>   have to point to the jump table in the vicinity of the DLL in
>   question, the jump table points to the actual 64 bit address.

That was what our head-scratching has arrived at too, which I'm in the process of doing.

> I'm curious why this isn't done yet.

I'm hoping that doing it is going to reveal that it simply wasn't considered in 2008, rather than that it was and there was an issue with it (I think it will just be that it wasn't thought of - like Cygwin at that time in 2008, our x86_64 on Windows support was extremely limited and not receiving much engineering focus).


David


* RE: Regression for OCaml introduced by rebase 4.4.4
  2018-02-09 13:11       ` Corinna Vinschen
@ 2018-02-09 13:19         ` David Allsopp
  0 siblings, 0 replies; 12+ messages in thread
From: David Allsopp @ 2018-02-09 13:19 UTC (permalink / raw)
  To: cygwin

[-- Attachment #1: Type: text/plain; charset="utf-8", Size: 3172 bytes --]

Corinna Vinschen wrote:
> On Feb  9 12:40, Corinna Vinschen wrote:
> > On Feb  9 11:29, David Allsopp wrote:
> > > > Apart from that, not only Cygwin DLLs but also the Windows system
> > > > DLLs are all loaded and relocated to the area beyond 0x1:80000000,
> > > > so relocation beyond the 32 bit address space is no generic
> > > > > problem in Windows.  Why isn't that possible in FlexDLL?  I don't
> > > > > understand this.
> > > > To me this looks like a bug in FlexDLL, not a requirement to let
> > > > certain DLLs slip through the cracks.
> > >
> > > There's a more full explanation of what and why for flexdll here:
> > > https://github.com/alainfrisch/flexdll/blob/master/README.md. I
> > > believe it's not unrelated to some of the black magic going on in
> > > Cygwin's autoload.cc, but without (at least at the moment), quite as
> > > much self-modifying code.
> > > [...]
> > > I guess one can argue over whether that's a bug or a limitation, but
> > > the problem we face is that we can engineer it so that our DLLs and
> > > executables are within a 2GB range (having looked again at this in
> > > even more detail, we could just as readily do this with addresses >
> > > > 0x200000000), but we still run the risk of rebase messing up the
> > > > DLLs.
> > >
> > > However, we'll scratch our heads some more on possible alternative
> > > solutions, since having a flag for DLLs which says "keep us within a
> > > > 2GB range somewhere" sounds even less likely to get merged than
> > > my previous suggestion.
> >
> > Two points:
> >
> > > - You are aware that the main executable of 64 bit Cygwin processes
> > >   is
> >   loaded to 0x1:00400000, right?  The 2 GB offset problem is already
> >   imminent.
> 
> ...and you must not use the 0x0:80000000 - 0x1:00000000 area because
> that's reserved for thread stacks.
> 
> To clarify, the full layout requirements:
> 
> - 0x0:00000000 - 0x0:80000000	Windows
> - 0x0:80000000 - 0x1:00000000	Cygwin pthreads (including main thread)
> - 0x1:00000000 - 0x1:80000000	Executable
> - 0x1:80000000 - 0x2:00000000	Cygwin DLL
> - 0x2:00000000 - 0x4:00000000	Rebased DLLs
> - 0x4:00000000 - 0x6:00000000	Non-rebased DLLs (hashed default
> 				addresses generated by binutils ld with
> 				--enable-auto-image-base (default on Cygwin))
> - 0x6:00000000			Start Address Heap, growing upwards
> - 0x8:00000000 - 0x700:00000000	Mmaps, allocated downwards
> - 0x700:00000000 and beyond	Windows
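
[Editor's illustration] The layout above can be turned into a small lookup table. The following is a minimal sketch in Python, not anything taken from Cygwin itself: the names LAYOUT and classify are hypothetical, and the upper bound of the heap region is an assumption (the list only gives its start address, so the sketch ends it where the mmap region begins).

```python
# Regions transcribed from the layout list above (64-bit Cygwin).
# Each entry is (start, end_exclusive, description).
LAYOUT = [
    (0x0_00000000, 0x0_80000000, "Windows"),
    (0x0_80000000, 0x1_00000000, "Cygwin pthreads"),
    (0x1_00000000, 0x1_80000000, "Executable"),
    (0x1_80000000, 0x2_00000000, "Cygwin DLL"),
    (0x2_00000000, 0x4_00000000, "Rebased DLLs"),
    (0x4_00000000, 0x6_00000000, "Non-rebased DLLs"),
    # Assumption: the list only says the heap starts at 0x6:00000000 and
    # grows upwards; the end used here is simply the start of the mmap area.
    (0x6_00000000, 0x8_00000000, "Heap"),
    (0x8_00000000, 0x700_00000000, "Mmaps"),
]


def classify(addr):
    """Return the name of the region that contains addr."""
    for start, end, name in LAYOUT:
        if start <= addr < end:
            return name
    # Everything at 0x700:00000000 and beyond belongs to Windows.
    return "Windows"


if __name__ == "__main__":
    # 0x10000 is the base address flexlink gave the OCaml stub libraries.
    for addr in (0x10000, 0x1_00400000, 0x2_00000000):
        print(hex(addr), "->", classify(addr))
```

With this table, the 0x10000 base reported by rebase -i for the stub libraries falls in the area the list assigns to Windows, while the default executable base 0x1:00400000 falls in the executable region.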

Thanks for this (and your time on this question generally!). I reckon that the jump tables will sort this and we'll be able to stop doing horrible things with --image-base completely, which should mean that flexlink (and therefore OCaml too) will start properly respecting the Cygwin address space layout! It's a shame that the layout means that the trampolines would always be needed, but I very much doubt their overhead will be significant in any program.
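
[Editor's illustration] The arithmetic behind "the trampolines would always be needed" can be checked directly. This is a sketch under the assumption that a relative 32-bit relocation holds a signed 32-bit displacement; the constant names are hypothetical and the region bounds are the ones quoted in the layout list.

```python
# Largest displacement a signed 32-bit relative relocation can hold.
INT32_MAX = 2**31 - 1

EXE_REGION_END = 0x1_80000000   # exclusive end of the executable region
REBASED_START = 0x2_00000000    # lowest address of a rebased DLL
DEFAULT_EXE_BASE = 0x1_00400000  # default load address of the main executable

# Best case: code at the very end of the executable region referring to
# the very first byte of the rebased DLL region.
min_disp = REBASED_START - EXE_REGION_END
fits = min_disp <= INT32_MAX

# Typical case: distance from the default executable base.
typical_disp = REBASED_START - DEFAULT_EXE_BASE

if __name__ == "__main__":
    print("minimum displacement:", hex(min_disp), "fits in rel32:", fits)
    print("from default exe base:", hex(typical_disp))
```

Even in the best case the displacement is 0x80000000, one more than the largest value a signed 32-bit field can hold, so a direct relative reference from the executable region into the rebased DLL region cannot be encoded.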


David

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Regression for OCaml introduced by rebase 4.4.4
  2018-02-09 13:13       ` David Allsopp
@ 2018-02-09 17:11         ` Corinna Vinschen
  2018-02-15 11:44           ` David Allsopp
  0 siblings, 1 reply; 12+ messages in thread
From: Corinna Vinschen @ 2018-02-09 17:11 UTC (permalink / raw)
  To: cygwin

[-- Attachment #1: Type: text/plain, Size: 2979 bytes --]

On Feb  9 13:12, David Allsopp wrote:
> Corinna Vinschen wrote:
> > On Feb  9 11:29, David Allsopp wrote:
> > > [...]
> > > When this was originally encountered for 64-bit MSVC (this was all
> > > added before Cygwin64 existed), the solution at the time was to keep
> > > the preferred base addresses low, but in reality what's really
> > > required is that everything is within a 2GB window somewhere in the
> > > address space.
> > >
> > > I guess one can argue over whether that's a bug or a limitation, but
> > > the problem we face is that we can engineer it so that our DLLs and
> > > executables are within a 2GB range (having looked again at this in
> > > even more detail, we could just as readily do this with addresses >
> > > 0x200000000), but we still run the risk of rebase messing up the DLLs.
> > >
> > > However, we'll scratch our heads some more on possible alternative
> > > solutions, since having a flag for DLLs which says "keep us within a
> > > > 2GB range somewhere" sounds even less likely to get merged than
> > > my previous suggestion.
> > 
> > Two points:
> > 
> > > - You are aware that the main executable of 64 bit Cygwin processes is
> > >   loaded to 0x1:00400000, right?  The 2 GB offset problem is already
> >   imminent.
> 
> Our executables are also compiled via flexdll's flexlink which sets
> --image-base in its call to the linker. I don't think the Cygwin DLL
> does anything which alters that, right?

It doesn't.  It can't, actually.  But you're breaking assumptions Cygwin
is relying on under the limitations Windows has.

> Another "fix" I tried while
> investigating was to change the --image-base we specified to be within
> 2GB of where rebase has put the DLLs, which also worked.
> 
> > - What about adding an additional jump table?  The relocation would only
> >   have to point to the jump table in the vicinity of the DLL in
> >   question, the jump table points to the actual 64 bit address.
> 
> That was what our head-scratching has arrived at too, which I'm in the
> process of doing.

\o/
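
[Editor's illustration] The jump-table idea discussed above can be modelled in a few lines. This is a simplified simulation, not flexdll's actual implementation: the class JumpTable, the function resolve, the 8-byte slot size and all addresses are hypothetical, and a real trampoline would also need the instruction bytes that perform the indirect jump.

```python
INT32_MIN = -2**31
INT32_MAX = 2**31 - 1


class JumpTable:
    """Table placed near a DLL; each slot stands for one 64-bit target."""

    SLOT_SIZE = 8  # assumption: one absolute 64-bit address per slot

    def __init__(self, base):
        self.base = base
        self.slots = {}  # target address -> slot address

    def slot_for(self, target):
        # Reuse an existing slot so each target is stored only once.
        if target not in self.slots:
            self.slots[target] = self.base + self.SLOT_SIZE * len(self.slots)
        return self.slots[target]


def resolve(next_ip, target, table):
    """Return (displacement, via_table) for a 32-bit relative relocation.

    next_ip is the address the displacement is measured from. If the
    target is out of reach, the relocation is pointed at a nearby table
    slot, which in turn refers to the real 64-bit address.
    """
    disp = target - next_ip
    if INT32_MIN <= disp <= INT32_MAX:
        return disp, False
    slot = table.slot_for(target)
    disp = slot - next_ip
    if not INT32_MIN <= disp <= INT32_MAX:
        raise ValueError("jump table is not within reach of the call site")
    return disp, True


if __name__ == "__main__":
    table = JumpTable(base=0x2_00010000)
    # A call site in a rebased DLL referring to a symbol in the executable.
    print(resolve(0x2_00001005, 0x1_00401000, table))
    # A call site referring to a symbol inside the same DLL.
    print(resolve(0x2_00001005, 0x2_00002000, table))
```

The point of the sketch is only that the relocation itself never has to reach further than the table, which is why the table must sit in the vicinity of the DLL in question.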

> > I'm curious why this isn't done yet.
> 
> I'm hoping that doing it is going to reveal that it simply wasn't
> considered in 2008, rather than that it was and there was an issue
> with it (I think it will just be that it wasn't thought of - like
> Cygwin at that time in 2008, our x86_64 on Windows support was
> extremely limited and not receiving much engineering focus).

We had similar problems back then.  The idea to move the executable
and DLLs beyond the lower 32 bit area was nice as such... only GCC
didn't support it at the time at all.  We had to add the x86-64 medium
and large cmodel implementation to GCC to make this work first.
Cygwin executables are compiled with -mcmodel=medium, IIRC.


Corinna

-- 
Corinna Vinschen                  Please, send mails regarding Cygwin to
Cygwin Maintainer                 cygwin AT cygwin DOT com
Red Hat

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 12+ messages in thread

* RE: Regression for OCaml introduced by rebase 4.4.4
  2018-02-09 17:11         ` Corinna Vinschen
@ 2018-02-15 11:44           ` David Allsopp
  2018-02-15 12:02             ` Corinna Vinschen
  0 siblings, 1 reply; 12+ messages in thread
From: David Allsopp @ 2018-02-15 11:44 UTC (permalink / raw)
  To: cygwin

Corinna Vinschen wrote:
> On Feb  9 13:12, David Allsopp wrote:
> > Corinna Vinschen wrote:
> > > On Feb  9 11:29, David Allsopp wrote:
> > > > [...]
> > > > When this was originally encountered for 64-bit MSVC (this was all

<snip>

> > > I'm curious why this isn't done yet.
> >
> > I'm hoping that doing it is going to reveal that it simply wasn't
> > considered in 2008, rather than that it was and there was an issue
> > with it (I think it will just be that it wasn't thought of - like
> > Cygwin at that time in 2008, our x86_64 on Windows support was
> > extremely limited and not receiving much engineering focus).
> 
> We had similar problems back then.  The idea to move the executable and
> DLLs beyond the lower 32 bit area was nice as such... only GCC didn't
> support it at the time at all.  We had to add the x86-64 medium and
> large cmodel implementation to GCC to make this work first.
> Cygwin executables are compiled with -mcmodel=medium, IIRC.

This all seems to be working nicely, though the mcmodel stuff may also be part of why this wasn't fixed properly in 2008, which is because of data symbols. Everything does seem to be working correctly (there are various int symbols in the ocaml runtime which are always requiring absolute relocations, not relative ones, from the DLLs), but I'm trying to get my head around being certain that that should always be the case.

gcc -Q --help=target seems to show that the default for mcmodel is "32", but I'm struggling to find a description of precisely what that means? If I compile all the units with -mcmodel=small then, as expected, gcc starts generating RELOC_REL32 for data symbols as well. flexdll then starts creating thunks for data symbols, though the Cygwin runtime unsurprisingly blows up before there's time for flexdll's sins to become apparent!

Is it the case with -mcmodel=medium that an external data symbol could never be referred to via a RELOC_REL32? My reading of it was that that would only be the case if the symbol itself refers to data which is large (-mlarge-data-threshold), but it seems to be happening for ints, which are clearly "small". Or have I still not properly understood x64 code models? The remaining question is what the difference between -mcmodel=32 and -mcmodel=medium really is.


David

--
Problem reports:       http://cygwin.com/problems.html
FAQ:                   http://cygwin.com/faq/
Documentation:         http://cygwin.com/docs.html
Unsubscribe info:      http://cygwin.com/ml/#unsubscribe-simple


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Regression for OCaml introduced by rebase 4.4.4
  2018-02-15 11:44           ` David Allsopp
@ 2018-02-15 12:02             ` Corinna Vinschen
  0 siblings, 0 replies; 12+ messages in thread
From: Corinna Vinschen @ 2018-02-15 12:02 UTC (permalink / raw)
  To: cygwin

[-- Attachment #1: Type: text/plain, Size: 3393 bytes --]

On Feb 15 11:44, David Allsopp wrote:
> Corinna Vinschen wrote:
> > On Feb  9 13:12, David Allsopp wrote:
> > > Corinna Vinschen wrote:
> > > > On Feb  9 11:29, David Allsopp wrote:
> > > > > [...]
> > > > > When this was originally encountered for 64-bit MSVC (this was all
> 
> <snip>
> 
> > > > I'm curious why this isn't done yet.
> > >
> > > I'm hoping that doing it is going to reveal that it simply wasn't
> > > considered in 2008, rather than that it was and there was an issue
> > > with it (I think it will just be that it wasn't thought of - like
> > > Cygwin at that time in 2008, our x86_64 on Windows support was
> > > extremely limited and not receiving much engineering focus).
> > 
> > We had similar problems back then.  The idea to move the executable and
> > DLLs beyond the lower 32 bit area was nice as such... only GCC didn't
> > support it at the time at all.  We had to add the x86-64 medium and
> > large cmodel implementation to GCC to make this work first.
> > Cygwin executables are compiled with -mcmodel=medium, IIRC.
> 
> This all seems to be working nicely, though the mcmodel stuff may also
> be part of why this wasn't fixed properly in 2008, which is because of
> data symbols. Everything does seem to be working correctly (there are
> various int symbols in the ocaml runtime which are always requiring
> absolute relocations, not relative ones, from the DLLs), but I'm
> trying to get my head around being certain that that should always be
> the case.
> 
> gcc -Q --help=target seems to show that the default for mcmodel is
> "32", but I'm struggling to find a description of precisely what that
> means? If I compile all the units with -mcmodel=small then, as
> expected, gcc starts generating RELOC_REL32 for data symbols as well.
> flexdll then starts creating thunks for data symbols, though the
> Cygwin runtime unsurprisingly blows up before there's time for
> flexdll's sins to become apparent!
> 
> Is it the case with -mcmodel=medium that an external data symbol could
> never be referred to via a RELOC_REL32? My reading of it was that that
> would only be the case if  the symbol itself refers to data which is
> large (-mlarge-data-threshold), but it seems to be happening for ints,
> which are clearly "small". Or have I still not properly understood x64
> code models? The remaining question is what the difference between
> -mcmodel=32 and -mcmodel=medium really is?

I don't know what cmodel=32 is supposed to be.  The usual models for
x86_64 are small, medium, large.  Small means using 32 bit relocations
for everything.  That's not feasible for Cygwin.  Medium means to use
normal 32 bit relocs for code but an extra trampoline for data (the 32
bit relocs point to a table with the real 64 bit address).  Large means
to use the trampolines for code and data.  Given that Windows DLL entry
point relocation already uses import/export tables anyway, we can get
away with the medium cmodel by default.  I'm not aware of a package
using the large model, though they might exist.

Note:  I'm not a gcc expert, so I'm not sure how the models are
implemented exactly.  Personally I'm just happy that it works :}


HTH,
Corinna

-- 
Corinna Vinschen                  Please, send mails regarding Cygwin to
Cygwin Maintainer                 cygwin AT cygwin DOT com
Red Hat

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2018-02-15 12:02 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-02-08 11:47 Regression for OCaml introduced by rebase 4.4.4 David Allsopp
2018-02-08 15:04 ` Corinna Vinschen
2018-02-09 11:30   ` David Allsopp
2018-02-08 15:15 ` Corinna Vinschen
2018-02-09 11:30   ` David Allsopp
2018-02-09 11:40     ` Corinna Vinschen
2018-02-09 13:11       ` Corinna Vinschen
2018-02-09 13:19         ` David Allsopp
2018-02-09 13:13       ` David Allsopp
2018-02-09 17:11         ` Corinna Vinschen
2018-02-15 11:44           ` David Allsopp
2018-02-15 12:02             ` Corinna Vinschen

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).