public inbox for cygwin-developers@cygwin.com
 help / color / mirror / Atom feed
* About the dll search algorithm of dlopen
@ 2016-06-01  6:39 Michael Haubenwallner
  2016-06-01 11:09 ` Corinna Vinschen
  0 siblings, 1 reply; 32+ messages in thread
From: Michael Haubenwallner @ 2016-06-01  6:39 UTC (permalink / raw)
  To: cygwin-developers

Hi,

two issues with dlopen here (I'm about to prepare patches):

*) The algorithm to combine dll file name variants with the search path
   entries needs to be reordered, as in:
   - for each dll file name variant:
   -   for each search path:
   + for each search path entry:
   +   for each dll file name variant:
         check if useable

*) The directory of the current main executable should be searched
   after LD_LIBRARY_PATH and before /usr/bin:/usr/lib.
   And PATH should be searched before /usr/bin:/usr/lib as well.

   For consistency, IMO, when any searched path ends in either
   x/bin or x/lib, we should search x/bin:x/lib.

Thoughts?

Thanks!
/haubi/

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: About the dll search algorithm of dlopen
  2016-06-01  6:39 About the dll search algorithm of dlopen Michael Haubenwallner
@ 2016-06-01 11:09 ` Corinna Vinschen
  2016-06-01 14:25   ` Michael Haubenwallner
  0 siblings, 1 reply; 32+ messages in thread
From: Corinna Vinschen @ 2016-06-01 11:09 UTC (permalink / raw)
  To: cygwin-developers

[-- Attachment #1: Type: text/plain, Size: 1694 bytes --]

On Jun  1 08:40, Michael Haubenwallner wrote:
> Hi,
> 
> two issues with dlopen here (I'm about to prepare patches):
> 
> *) The algorithm to combine dll file name variants with the search path
>    entries needs to be reordered, as in:
>    - for each dll file name variant:
>    -   for each search path:
>    + for each search path entry:
>    +   for each dll file name variant:
>          check if useable

Rationale?  We only need to find one version of the file and there
usually only is one.  This is mainly for moduled systems like perl,
python, etc.  However, if you can speed up the search process ignore the
question...

> *) The directory of the current main executable should be searched
>    after LD_LIBRARY_PATH and before /usr/bin:/usr/lib.
>    And PATH should be searched before /usr/bin:/usr/lib as well.

Checking the executable path and $PATH are Windows concepts.  dlopen
doesn't do that on POSIX systems and we're not doing that either.

Having said that, LoadLibrary will search the usual paths.  After 2.5.2,
we're leaving XP/2003 behind, and then we probably should tighten the
search algorithm along the lines of

  AddDllDirectory ("/usr/bin");
  AddDllDirectory ("/usr/lib");
  [...]
  LoadLibraryEx (path, NULL, LOAD_LIBRARY_SEARCH_USER_DIRS
			     | LOAD_LIBRARY_SEARCH_SYSTEM32);

>    For consistency, IMO, when any searched path ends in either
>    x/bin or x/lib, we should search x/bin:x/lib.

This might make sense, at least in the direction lib->bin.


Corinna

-- 
Corinna Vinschen                  Please, send mails regarding Cygwin to
Cygwin Maintainer                 cygwin AT cygwin DOT com
Red Hat

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: About the dll search algorithm of dlopen
  2016-06-01 11:09 ` Corinna Vinschen
@ 2016-06-01 14:25   ` Michael Haubenwallner
  2016-06-01 20:18     ` Corinna Vinschen
  0 siblings, 1 reply; 32+ messages in thread
From: Michael Haubenwallner @ 2016-06-01 14:25 UTC (permalink / raw)
  To: cygwin-developers

On 06/01/2016 01:09 PM, Corinna Vinschen wrote:
> On Jun  1 08:40, Michael Haubenwallner wrote:
>> Hi,
>>
>> two issues with dlopen here (I'm about to prepare patches):
>>
>> *) The algorithm to combine dll file name variants with the search path
>>    entries needs to be reordered, as in:
>>    - for each dll file name variant:
>>    -   for each search path:
>>    + for each search path entry:
>>    +   for each dll file name variant:
>>          check if useable
> 
> Rationale?  We only need to find one version of the file and there
> usually only is one.  This is mainly for moduled systems like perl,
> python, etc.

While I indeed didn't face a problem here yet, the rationale behind is
that I need to install a large application that provides its own (portable)
package build & management system, including lots of packages probably installed
in the host (Cygwin) system as well, most likely in (slightly) different versions.

When these package management system now does provide a different dll naming
scheme than the host system, like stick to "liblib.dll" rather than "cyglib.dll",
and the application wants to dlopen its own "liblib.dll", currently the host's
"cyglib.dll" is loaded.

> However, if you can speed up the search process ignore the
> question...

This also depends on whether find_exec really is necessary here.

>> *) The directory of the current main executable should be searched
>>    after LD_LIBRARY_PATH and before /usr/bin:/usr/lib.
>>    And PATH should be searched before /usr/bin:/usr/lib as well.
> 
> Checking the executable path and $PATH are Windows concepts.  dlopen
> doesn't do that on POSIX systems and we're not doing that either.

Agreed, but POSIX also does have the concept of embedded RUNPATH,
which is completely missing in Cygwin as far as I can see.
However, there is one path name that can easily serve as minimal
"embedded RUNPATH" - the executable's directory.

This is where I do have a problem right now:

My own /application/bin/python2.7.exe is linked to libpython2.7.dll,
located in /application/bin. Now there is some python script that does
have some - strange enough - cygwin-conditional code that reads:

  import _ctypes
  _ctypes.dlopen("libpython%d.%d.dll" % sys.version_info[:2])

While this is questionable by itself, it really shouldn't load another
libpython2.7.dll than /application/bin/python2.7.exe has already loaded
just because dlopen using a different search algorithm than CreateProcess().

However, when dlopen is about to search some path list, I can imagine to
search the list of already loaded dlls first as well, but I'd prefer to
leave this up to LoadLibrary...

> Having said that, LoadLibrary will search the usual paths.  After 2.5.2,
> we're leaving XP/2003 behind, and then we probably should tighten the
> search algorithm along the lines of
> 
>   AddDllDirectory ("/usr/bin");
>   AddDllDirectory ("/usr/lib");
>   [...]
>   LoadLibraryEx (path, NULL, LOAD_LIBRARY_SEARCH_USER_DIRS
> 			     | LOAD_LIBRARY_SEARCH_SYSTEM32);

/me fails to see how this does help with the missing embedded RUNPATH.

>>    For consistency, IMO, when any searched path ends in either
>>    x/bin or x/lib, we should search x/bin:x/lib.
> 
> This might make sense, at least in the direction lib->bin.

Fine with me too.

Side note:
We also use some cl.exe/link.exe wrapper that supports LD_PRELOAD,
LD_LIBRARY_PATH, embedded RUNPATH, as well as lazy loading for both
LoadLibrary and CreateProcess: https://github.com/haubi/parity
Basically I'm wondering why Cygwin doesn't provide that (yet?)...

/haubi/

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: About the dll search algorithm of dlopen
  2016-06-01 14:25   ` Michael Haubenwallner
@ 2016-06-01 20:18     ` Corinna Vinschen
  2016-06-01 23:29       ` Yaakov Selkowitz
  2016-08-19 16:37       ` Michael Haubenwallner
  0 siblings, 2 replies; 32+ messages in thread
From: Corinna Vinschen @ 2016-06-01 20:18 UTC (permalink / raw)
  To: cygwin-developers

[-- Attachment #1: Type: text/plain, Size: 5593 bytes --]

On Jun  1 16:26, Michael Haubenwallner wrote:
> On 06/01/2016 01:09 PM, Corinna Vinschen wrote:
> > On Jun  1 08:40, Michael Haubenwallner wrote:
> >> Hi,
> >>
> >> two issues with dlopen here (I'm about to prepare patches):
> >>
> >> *) The algorithm to combine dll file name variants with the search path
> >>    entries needs to be reordered, as in:
> >>    - for each dll file name variant:
> >>    -   for each search path:
> >>    + for each search path entry:
> >>    +   for each dll file name variant:
> >>          check if useable
> > 
> > Rationale?  We only need to find one version of the file and there
> > usually only is one.  This is mainly for moduled systems like perl,
> > python, etc.
> 
> While I indeed didn't face a problem here yet, the rationale behind is
> that I need to install a large application that provides its own (portable)
> package build & management system, including lots of packages probably installed
> in the host (Cygwin) system as well, most likely in (slightly) different versions.
> 
> When these package management system now does provide a different dll naming
> scheme than the host system, like stick to "liblib.dll" rather than "cyglib.dll",
> and the application wants to dlopen its own "liblib.dll", currently the host's
> "cyglib.dll" is loaded.

Sounds a teeny bit artificial, but basically that's possible, yes.

> > However, if you can speed up the search process ignore the
> > question...
> 
> This also depends on whether find_exec really is necessary here.

Not as such necessary, it's just the function used to search in a
search path.  If you want to change that you have to rewrite the
same logic again, just reversed.

One way around YA code duplication could be some kind of path iterator
class which could be used from find_exec as well as from
get_full_path_of_dll.

> >> *) The directory of the current main executable should be searched
> >>    after LD_LIBRARY_PATH and before /usr/bin:/usr/lib.
> >>    And PATH should be searched before /usr/bin:/usr/lib as well.
> > 
> > Checking the executable path and $PATH are Windows concepts.  dlopen
> > doesn't do that on POSIX systems and we're not doing that either.
> 
> Agreed, but POSIX also does have the concept of embedded RUNPATH,
> which is completely missing in Cygwin as far as I can see.

RPATH and RUNPATH are ELF dynamic loader features, not supported by
PE/COFF.

> However, there is one path name that can easily serve as minimal
> "embedded RUNPATH" - the executable's directory.
> 
> This is where I do have a problem right now:
> 
> My own /application/bin/python2.7.exe is linked to libpython2.7.dll,
> located in /application/bin. Now there is some python script that does
> have some - strange enough - cygwin-conditional code that reads:
> 
>   import _ctypes
>   _ctypes.dlopen("libpython%d.%d.dll" % sys.version_info[:2])
> 
> While this is questionable by itself, it really shouldn't load another
> libpython2.7.dll than /application/bin/python2.7.exe has already loaded
> just because dlopen using a different search algorithm than CreateProcess().
> 
> However, when dlopen is about to search some path list, I can imagine to
> search the list of already loaded dlls first as well, but I'd prefer to
> leave this up to LoadLibrary...

This problem would only occur if dlopen is not called with a path.  If
the given pathname is a plain filename, we could simply add a call to
GetModuleHandle and if it succeeds, return the handle (after checking
for RTLD_NODELETE).

> > Having said that, LoadLibrary will search the usual paths.  After 2.5.2,
> > we're leaving XP/2003 behind, and then we probably should tighten the
> > search algorithm along the lines of
> > 
> >   AddDllDirectory ("/usr/bin");
> >   AddDllDirectory ("/usr/lib");
> >   [...]
> >   LoadLibraryEx (path, NULL, LOAD_LIBRARY_SEARCH_USER_DIRS
> > 			     | LOAD_LIBRARY_SEARCH_SYSTEM32);
> 
> /me fails to see how this does help with the missing embedded RUNPATH.

It doesn't.  It just tightens the search path to not load from the cwd
or the application path.  If you want that, add it to LD_LIBRARY_PATH
explicitely.

> >>    For consistency, IMO, when any searched path ends in either
> >>    x/bin or x/lib, we should search x/bin:x/lib.
> > 
> > This might make sense, at least in the direction lib->bin.
> 
> Fine with me too.
> 
> Side note:
> We also use some cl.exe/link.exe wrapper that supports LD_PRELOAD,
> LD_LIBRARY_PATH, embedded RUNPATH, as well as lazy loading for both
> LoadLibrary and CreateProcess: https://github.com/haubi/parity
> Basically I'm wondering why Cygwin doesn't provide that (yet?)...

We discussed implementing our own dynamic loader once, but gave up due
to workload.  Parity is LGPLed and thus can't be included into Cygwin
itself.

DT_RUNTIME should be possible with not too much effort, but would
require gcc/binutils support.  Some tricking with a symbol in the linker
script, setting a pointer to it in _cygwin_crt0_common, and a matching
call to AddDllDirectory comes to mind...

LD_PRELOAD is (kind of) implemented but I think doesn't work as
intended.  Importing symbols is bound to the name of the DLL they came
from in a PE/COFF file.  To implement the full set of ELF dynloader
features would require some major effort, like what parity does.


Corinna

-- 
Corinna Vinschen                  Please, send mails regarding Cygwin to
Cygwin Maintainer                 cygwin AT cygwin DOT com
Red Hat

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: About the dll search algorithm of dlopen
  2016-06-01 20:18     ` Corinna Vinschen
@ 2016-06-01 23:29       ` Yaakov Selkowitz
  2016-08-19 16:37       ` Michael Haubenwallner
  1 sibling, 0 replies; 32+ messages in thread
From: Yaakov Selkowitz @ 2016-06-01 23:29 UTC (permalink / raw)
  To: cygwin-developers

On 2016-06-01 15:17, Corinna Vinschen wrote:
> LD_PRELOAD is (kind of) implemented but I think doesn't work as
> intended.  Importing symbols is bound to the name of the DLL they came
> from in a PE/COFF file.

However, as long as you're overriding symbols in cygwin1.dll, 
cygwin_internal(CW_HOOK,...) allows LD_PRELOAD hacks to work as 
intended.  In fact, there are a few in the distro (artsdsp, datefudge, 
pulsedsp, Xdummy).

-- 
Yaakov

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: About the dll search algorithm of dlopen
  2016-06-01 20:18     ` Corinna Vinschen
  2016-06-01 23:29       ` Yaakov Selkowitz
@ 2016-08-19 16:37       ` Michael Haubenwallner
  2016-08-19 16:39         ` Michael Haubenwallner
  1 sibling, 1 reply; 32+ messages in thread
From: Michael Haubenwallner @ 2016-08-19 16:37 UTC (permalink / raw)
  To: cygwin-developers

On 06/01/2016 10:17 PM, Corinna Vinschen wrote:
> On Jun  1 16:26, Michael Haubenwallner wrote:
>> On 06/01/2016 01:09 PM, Corinna Vinschen wrote:
>>> On Jun  1 08:40, Michael Haubenwallner wrote:
>>>> Hi,
>>>>
>>>> two issues with dlopen here (I'm about to prepare patches):
>>>>
>>>> *) The algorithm to combine dll file name variants with the search path
>>>>    entries needs to be reordered, as in:
>>>>    - for each dll file name variant:
>>>>    -   for each search path:
>>>>    + for each search path entry:
>>>>    +   for each dll file name variant:
>>>>          check if useable

>>> However, if you can speed up the search process ignore the
>>> question...

Not sure if there's a speedup actually, ...

>>
>> This also depends on whether find_exec really is necessary here.
> 
> Not as such necessary, it's just the function used to search in a
> search path.  If you want to change that you have to rewrite the
> same logic again, just reversed.
> 
> One way around YA code duplication could be some kind of path iterator
> class which could be used from find_exec as well as from
> get_full_path_of_dll.

... but:
0001.patch is a draft for some new cygwin::pathfinder class, with
0002.patch adding the executable's directory as searchpath, and
0003.patch to search the PATH environment as well.

Thoughts?
Any idea about different template nesting to be useful somewhere else?

>>>> *) The directory of the current main executable should be searched
>>>>    after LD_LIBRARY_PATH and before /usr/bin:/usr/lib.
>>>>    And PATH should be searched before /usr/bin:/usr/lib as well.
>>>
>>> Checking the executable path and $PATH are Windows concepts.  dlopen
>>> doesn't do that on POSIX systems and we're not doing that either.
>>
>> Agreed, but POSIX also does have the concept of embedded RUNPATH,
>> which is completely missing in Cygwin as far as I can see.
> 
> RPATH and RUNPATH are ELF dynamic loader features, not supported by
> PE/COFF.

In any case, to me it does feel quite important to have the (almost) same
dll search algorithm with dlopen() as with CreateProcess().

>> However, when dlopen is about to search some path list, I can imagine to
>> search the list of already loaded dlls first as well, but I'd prefer to
>> leave this up to LoadLibrary...
> 
> This problem would only occur if dlopen is not called with a path.  If
> the given pathname is a plain filename, we could simply add a call to
> GetModuleHandle and if it succeeds, return the handle (after checking
> for RTLD_NODELETE).

But indeed this might lower the need for same search algorithms...

>> Side note:
>> We also use some cl.exe/link.exe wrapper that supports LD_PRELOAD,
>> LD_LIBRARY_PATH, embedded RUNPATH, as well as lazy loading for both
>> LoadLibrary and CreateProcess: https://github.com/haubi/parity
>> Basically I'm wondering why Cygwin doesn't provide that (yet?)...
> 
> We discussed implementing our own dynamic loader once, but gave up due
> to workload.  Parity is LGPLed and thus can't be included into Cygwin
> itself.

've mentioned Parity more as proof-of-concept than for inclusion,
and accepting workload as reasoning.

Let's see if I can live without the ELF dynloader features.

Thanks!
/haubi/

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: About the dll search algorithm of dlopen
  2016-08-19 16:37       ` Michael Haubenwallner
@ 2016-08-19 16:39         ` Michael Haubenwallner
  2016-08-20 19:32           ` Corinna Vinschen
  0 siblings, 1 reply; 32+ messages in thread
From: Michael Haubenwallner @ 2016-08-19 16:39 UTC (permalink / raw)
  To: cygwin-developers

[-- Attachment #1: Type: text/plain, Size: 3423 bytes --]

(and now with patches attached)

On 08/19/2016 06:37 PM, Michael Haubenwallner wrote:
> On 06/01/2016 10:17 PM, Corinna Vinschen wrote:
>> On Jun  1 16:26, Michael Haubenwallner wrote:
>>> On 06/01/2016 01:09 PM, Corinna Vinschen wrote:
>>>> On Jun  1 08:40, Michael Haubenwallner wrote:
>>>>> Hi,
>>>>>
>>>>> two issues with dlopen here (I'm about to prepare patches):
>>>>>
>>>>> *) The algorithm to combine dll file name variants with the search path
>>>>>    entries needs to be reordered, as in:
>>>>>    - for each dll file name variant:
>>>>>    -   for each search path:
>>>>>    + for each search path entry:
>>>>>    +   for each dll file name variant:
>>>>>          check if useable
> 
>>>> However, if you can speed up the search process ignore the
>>>> question...
> 
> Not sure if there's a speedup actually, ...
> 
>>>
>>> This also depends on whether find_exec really is necessary here.
>>
>> Not as such necessary, it's just the function used to search in a
>> search path.  If you want to change that you have to rewrite the
>> same logic again, just reversed.
>>
>> One way around YA code duplication could be some kind of path iterator
>> class which could be used from find_exec as well as from
>> get_full_path_of_dll.
> 
> ... but:
> 0001.patch is a draft for some new cygwin::pathfinder class, with
> 0002.patch adding the executable's directory as searchpath, and
> 0003.patch to search the PATH environment as well.
> 
> Thoughts?
> Any idea about different template nesting to be useful somewhere else?
> 
>>>>> *) The directory of the current main executable should be searched
>>>>>    after LD_LIBRARY_PATH and before /usr/bin:/usr/lib.
>>>>>    And PATH should be searched before /usr/bin:/usr/lib as well.
>>>>
>>>> Checking the executable path and $PATH are Windows concepts.  dlopen
>>>> doesn't do that on POSIX systems and we're not doing that either.
>>>
>>> Agreed, but POSIX also does have the concept of embedded RUNPATH,
>>> which is completely missing in Cygwin as far as I can see.
>>
>> RPATH and RUNPATH are ELF dynamic loader features, not supported by
>> PE/COFF.
> 
> In any case, to me it does feel quite important to have the (almost) same
> dll search algorithm with dlopen() as with CreateProcess().
> 
>>> However, when dlopen is about to search some path list, I can imagine to
>>> search the list of already loaded dlls first as well, but I'd prefer to
>>> leave this up to LoadLibrary...
>>
>> This problem would only occur if dlopen is not called with a path.  If
>> the given pathname is a plain filename, we could simply add a call to
>> GetModuleHandle and if it succeeds, return the handle (after checking
>> for RTLD_NODELETE).
> 
> But indeed this might lower the need for same search algorithms...
> 
>>> Side note:
>>> We also use some cl.exe/link.exe wrapper that supports LD_PRELOAD,
>>> LD_LIBRARY_PATH, embedded RUNPATH, as well as lazy loading for both
>>> LoadLibrary and CreateProcess: https://github.com/haubi/parity
>>> Basically I'm wondering why Cygwin doesn't provide that (yet?)...
>>
>> We discussed implementing our own dynamic loader once, but gave up due
>> to workload.  Parity is LGPLed and thus can't be included into Cygwin
>> itself.
> 
> 've mentioned Parity more as proof-of-concept than for inclusion,
> and accepting workload as reasoning.
> 
> Let's see if I can live without the ELF dynloader features.
> 
> Thanks!
> /haubi/
> 

[-- Attachment #2: 0001-dlopen-search-each-name-within-one-single-search-dir.patch --]
[-- Type: text/x-patch, Size: 28706 bytes --]

From fd7ed0d5cf2f255792e368bba3bc91c834d44896 Mon Sep 17 00:00:00 2001
From: Michael Haubenwallner <michael.haubenwallner@ssi-schaefer.com>
Date: Tue, 31 May 2016 13:09:11 +0000
Subject: [PATCH 1/3] dlopen: search each name within one single search dir

Search x/bin:x/lib if searching in x/lib.

Introduces and uses new pathfinder template, which
introduces and uses new samlist<vstring> templates.
---
 winsup/cygwin/cygheap_malloc.h |  15 ++
 winsup/cygwin/dlfcn.cc         |  71 +++---
 winsup/cygwin/pathfinder.h     | 116 ++++++++++
 winsup/cygwin/samlist.h        | 253 +++++++++++++++++++++
 winsup/cygwin/vstring.h        | 503 +++++++++++++++++++++++++++++++++++++++++
 5 files changed, 918 insertions(+), 40 deletions(-)
 create mode 100644 winsup/cygwin/pathfinder.h
 create mode 100644 winsup/cygwin/samlist.h
 create mode 100644 winsup/cygwin/vstring.h

diff --git a/winsup/cygwin/cygheap_malloc.h b/winsup/cygwin/cygheap_malloc.h
index 74f0bb6..dfed929 100644
--- a/winsup/cygwin/cygheap_malloc.h
+++ b/winsup/cygwin/cygheap_malloc.h
@@ -52,4 +52,19 @@ char *__reg1 cstrdup1 (const char *);
 void __reg2 cfree_and_set (char *&, char * = NULL);
 }
 
+#ifdef __cplusplus
+
+namespace cygwin {
+
+  template<cygheap_types HeapType>
+    struct allocator
+    {
+      static void * alloc (size_t s) { return cmalloc_abort (HeapType, s); }
+      static void free (void *p) { cfree (p); }
+    };
+
+}
+
+#endif
+
 #endif /*_CYGHEAP_MALLOC_H*/
diff --git a/winsup/cygwin/dlfcn.cc b/winsup/cygwin/dlfcn.cc
index 255a6d5..e2093ab 100644
--- a/winsup/cygwin/dlfcn.cc
+++ b/winsup/cygwin/dlfcn.cc
@@ -20,6 +20,7 @@ details. */
 #include "cygtls.h"
 #include "tls_pbuf.h"
 #include "ntdll.h"
+#include "pathfinder.h"
 
 static void
 set_dl_error (const char *str)
@@ -28,29 +29,6 @@ set_dl_error (const char *str)
   _my_tls.locals.dl_error = 1;
 }
 
-/* Look for an executable file given the name and the environment
-   variable to use for searching (eg., PATH); returns the full
-   pathname (static buffer) if found or NULL if not. */
-inline const char *
-check_path_access (const char *mywinenv, const char *name, path_conv& buf)
-{
-  return find_exec (name, buf, mywinenv, FE_NNF | FE_DLL);
-}
-
-/* Search LD_LIBRARY_PATH for dll, if it exists.  Search /usr/bin and /usr/lib
-   by default.  Return valid full path in path_conv real_filename. */
-static inline bool
-gfpod_helper (const char *name, path_conv &real_filename)
-{
-  if (strchr (name, '/'))
-    real_filename.check (name, PC_SYM_FOLLOW | PC_NULLEMPTY);
-  else if (!check_path_access ("LD_LIBRARY_PATH", name, real_filename))
-    check_path_access ("/usr/bin:/usr/lib", name, real_filename);
-  if (!real_filename.exists ())
-    real_filename.error = ENOENT;
-  return !real_filename.error;
-}
-
 static bool
 get_full_path_of_dll (const char* str, path_conv &real_filename)
 {
@@ -63,38 +41,50 @@ get_full_path_of_dll (const char* str, path_conv &real_filename)
       return false;		/* Yes.  Let caller deal with it. */
     }
 
-  tmp_pathbuf tp;
-  char *name = tp.c_get ();
+  cygwin::samlist<cygwin::vstring> basenames;
 
-  strcpy (name, str);	/* Put it somewhere where we can manipulate it. */
+  const char *basename = strrchr (str, '/');
+  basename = basename ? basename + 1 : str;
 
-  char *basename = strrchr (name, '/');
-  basename = basename ? basename + 1 : name;
-  char *suffix = strrchr (name, '.');
-  if (suffix && suffix < basename)
-    suffix = NULL;
+  int baselen = str + len - basename;
+  const char *suffix = strrchr (basename, '.');
 
+  char const * ext = "";
+  int extlen = 0;
   /* Is suffix ".so"? */
   if (suffix && !strcmp (suffix, ".so"))
     {
       /* Does the file exist? */
-      if (gfpod_helper (name, real_filename))
-	return true;
+      basenames.append (cygwin::vstring::constructor_args (basename, baselen, NULL));
       /* No, replace ".so" with ".dll". */
-      strcpy (suffix, ".dll");
+      baselen -= 3;
+      ext = ".dll";
+      extlen = 4;
     }
   /* Does the filename start with "lib"? */
   if (!strncmp (basename, "lib", 3))
     {
       /* Yes, replace "lib" with "cyg". */
-      strncpy (basename, "cyg", 3);
-      /* Does the file exist? */
-      if (gfpod_helper (name, real_filename))
-	return true;
+      basenames.append (cygwin::vstring::constructor_args ("cyg", 3, basename+3, baselen-3, ext, extlen, NULL));
       /* No, revert back to "lib". */
-      strncpy (basename, "lib", 3);
     }
-  if (gfpod_helper (name, real_filename))
+  basenames.append (cygwin::vstring::constructor_args (basename, baselen, ext, extlen, NULL));
+
+  cygwin::pathfinder finder (basenames);
+
+  if (basename > str)
+    finder.add_searchdir (str, basename - 1 - str);
+  else
+    {
+      /* NOTE: The Windows loader (for linked dlls) does
+	 not use the LD_LIBRARY_PATH environment variable. */
+      finder.add_envsearchpath ("LD_LIBRARY_PATH");
+
+      /* Finally we better have some fallback. */
+      finder.add_searchdir ("/usr/lib", -1);
+    }
+
+  if (finder.check_path_access (real_filename))
     return true;
 
   /* If nothing worked, create a relative path from the original incoming
@@ -113,6 +103,7 @@ dlopen (const char *name, int flags)
 {
   void *ret = NULL;
 
+  debug_printf ("flags %d for %s", flags, name);
   if (name == NULL)
     {
       ret = (void *) GetModuleHandle (NULL); /* handle for the current module */
diff --git a/winsup/cygwin/pathfinder.h b/winsup/cygwin/pathfinder.h
new file mode 100644
index 0000000..6845903
--- /dev/null
+++ b/winsup/cygwin/pathfinder.h
@@ -0,0 +1,116 @@
+/* pathfinder.h: find one of multiple file names in path
+
+   Copyright 2016 Red Hat, Inc.
+
+This file is part of Cygwin.
+
+This software is a copyrighted work licensed under the terms of the
+Cygwin license.  Please consult the file "CYGWIN_LICENSE" for
+details. */
+
+#include "samlist.h"
+#include "vstring.h"
+
+#ifdef __cplusplus
+
+namespace cygwin {
+
+class pathfinder
+{
+public:
+  typedef cygwin::samlist<cygwin::vstring> basenamelist;
+
+private:
+  pathfinder ();
+  pathfinder (pathfinder const &);
+  pathfinder & operator = (pathfinder const &);
+
+  basenamelist basenames_;
+  size_t basenames_maxlen_;
+
+  typedef cygwin::samlist<cygwin::vbuffer>  searchbufferlist;
+  searchbufferlist searchbuffers_;
+
+public:
+  ~pathfinder () {}
+
+  /* We need the basenames to search for first, to allow for optimized
+     memory allocation of each searchpath + basename combination. */
+  pathfinder (cygwin::samlist<cygwin::vstring> & basenames)
+    : basenames_ ()
+    , basenames_maxlen_ ()
+    , searchbuffers_()
+  {
+    basenames.swap(basenames_);
+
+    for (basenamelist::iterator basename = basenames_.begin ();
+	 basename != basenames_.end ();
+	 ++ basename)
+      {
+        if (basenames_maxlen_ < basename->stringlength ())
+	  basenames_maxlen_ = basename->stringlength ();
+      }
+  }
+
+  void add_searchdir (const char *dir, int dirlen)
+  {
+      if (dirlen < 0)
+        dirlen = strlen (dir);
+
+      if (!dirlen)
+        return;
+
+      /* Search "x/bin:x/lib" for "x/lib" */
+      if (dirlen >=4 && !strncmp (dir + dirlen - 4, "/lib", 4))
+	searchbuffers_.append (cygwin::vbuffer::constructor_args (dir, dirlen - 4, "/bin", 4, "/", 1 + basenames_maxlen_, NULL));
+//	searchbuffers_.appendv (dir, dirlen - 4, "/bin", 4, "/", 1 + basenames_maxlen_, NULL);
+
+      /* prealloc buffer in searchdir for any basename we will search for */
+      searchbuffers_.append (cygwin::vbuffer::constructor_args (dir, dirlen, "/", 1 + basenames_maxlen_, NULL));
+  }
+
+  void add_searchpath (const char *path)
+  {
+    while (path && *path)
+      {
+        const char *next = strchr (path, ':');
+        add_searchdir (path, next ? next - path : -1);
+	path = next ? next + 1 : next;
+      }
+  }
+
+  void add_envsearchpath (const char *envpath)
+    {
+      add_searchpath (getenv (envpath));
+    }
+
+  /* Within each searchdir registered, try each registered basename to
+     find as executable.  Returns found dir/basename in real_filename.
+     Returns true when found. */
+  bool check_path_access (path_conv& real_filename)
+  {
+    for (searchbufferlist::iterator dir = searchbuffers_.begin ();
+	 dir != searchbuffers_.end ();
+	 ++dir)
+      for (cygwin::samlist<cygwin::vstring>::iterator name = basenames_.begin ();
+	   name != basenames_.end ();
+	   ++name)
+	{
+	  /* complete the filename path to search for */
+	  memcpy (dir->buffer () + dir->stringlength (), name->string (), name->stringlength () + 1);
+	  debug_printf ("trying %s", dir->buffer ());
+	  real_filename.check (dir->string (), PC_SYM_FOLLOW | PC_POSIX);
+	  if (real_filename.exists () && !real_filename.isdir ())
+	    {
+	      debug_printf (" found %s", dir->buffer ());
+	      return true;
+	    }
+	}
+    real_filename.error = ENOENT;
+    return !real_filename.error;
+  }
+};
+
+}
+
+#endif /* __cplusplus */
diff --git a/winsup/cygwin/samlist.h b/winsup/cygwin/samlist.h
new file mode 100644
index 0000000..aec30ae
--- /dev/null
+++ b/winsup/cygwin/samlist.h
@@ -0,0 +1,253 @@
+/* samlist.h: cygwin::samlist
+
+   Copyright 2016 Red Hat, Inc.
+
+This file is part of Cygwin.
+
+This software is a copyrighted work licensed under the terms of the
+Cygwin license.  Please consult the file "CYGWIN_LICENSE" for
+details. */
+
+#ifdef __cplusplus
+
+namespace cygwin
+{
+  /* basic member for SinglyAllocatedMembers list */
+  class samlist_member
+  {
+    samlist_member * prev_;
+    samlist_member * next_;
+
+    /* no copy */
+    samlist_member (samlist_member const &);
+    samlist_member & operator = (samlist_member const &);
+
+  public:
+    samlist_member ()
+      : prev_ (this)
+      , next_ (this)
+    {}
+
+    class constructor_args
+    {
+      friend class samlist_member;
+
+      template<typename MemberType>
+	friend class samlist;
+
+      samlist_member * before;
+
+    public:
+      constructor_args () /* list anchor */
+	: before (NULL)
+      {}
+
+      constructor_args (samlist_member * b) /* list anchor */
+	: before (b)
+      {}
+    };
+
+    samlist_member (constructor_args & ca)
+    {
+      if (!ca.before) /* default constructor */
+	{
+	  /* anchor */
+	  prev_ = this;
+	  next_ = this;
+	  return;
+	}
+      /* instantly insert into list */
+      prev_ = ca.before->prev_;
+      next_ = ca.before;
+      prev_->next_ = this;
+      next_->prev_ = this;
+    }
+
+    ~samlist_member ()
+    {
+      /* remove from list */
+      samlist_member * next = next_;
+      samlist_member * prev = prev_;
+      next->prev_ = prev;
+      prev->next_ = next;
+      prev_ = NULL;
+      next_ = NULL;
+    }
+
+    samlist_member * next ()
+    {
+      return next_;
+    }
+
+    samlist_member * prev ()
+    {
+      return next_;
+    }
+  };
+
+  /* Double-linked list with SinglyAllocatedMember storage, that is,
+     double-linkage pointers are allocated alongside the member data. */
+  template<typename MemberType>
+    class samlist
+    {
+    public:
+      struct member_type
+	: public samlist_member
+	, public MemberType
+      {
+	struct constructor_args
+	  : public samlist_member::constructor_args
+	  , public MemberType::constructor_args
+	{
+	  constructor_args (samlist_member::constructor_args lca,
+			    typename MemberType::constructor_args mca)
+	    : samlist_member::constructor_args (lca)
+	    , MemberType::constructor_args (mca)
+	  {}
+	};
+
+	member_type (constructor_args & ca)
+	  : samlist_member (ca)
+	  , MemberType (ca)
+	{}
+
+        void * operator new (size_t classSize, constructor_args & ca)
+	{
+	  return MemberType::operator new (classSize, ca);
+	}
+      };
+
+      class iterator
+      {
+	samlist_member * current_;
+
+	iterator ();
+
+	friend class samlist;
+	iterator (samlist_member * current)
+	  : current_ (current)
+	{}
+
+      public:
+	iterator (iterator const & rhs)
+	  : current_ (rhs.current_)
+	{}
+
+	iterator & operator = (iterator const & rhs)
+	{
+	  current_ = rhs.current_;
+	  return *this;
+	}
+
+	iterator & operator ++ ()
+	{
+	  current_ = current_->next ();
+	  return *this;
+	}
+
+	iterator operator ++ (int)
+	{
+	  iterator ret (*this);
+	  current_ = current_->next ();
+	  return ret;
+	}
+
+	iterator & operator -- ()
+	{
+	  current_ = current_->prev ();
+	  return *this;
+	}
+
+	iterator operator -- (int)
+	{
+	  iterator ret (*this);
+	  current_ = current_->prev ();
+	  return ret;
+	}
+
+	bool operator == (iterator const & rhs) const
+	{
+	  return current_ == rhs.current_;
+	}
+
+	bool operator != (iterator const & rhs) const
+	{
+	  return !(*this == rhs);
+	}
+
+	member_type const & operator *  () const { return *ptr (); }
+	member_type       & operator *  ()       { return *ptr (); }
+	member_type const * operator -> () const { return  ptr (); }
+	member_type       * operator -> ()       { return  ptr (); }
+
+	operator member_type const * () const { return ptr (); }
+	operator member_type       * ()       { return ptr (); }
+
+	member_type const * ptr () const { return static_cast<member_type *>(current_); }
+	member_type       * ptr ()       { return static_cast<member_type *>(current_); }
+
+	void remove ()
+	{
+	  member_type * old = ptr ();
+	  ++ *this;
+	  delete old;
+	}
+      };
+
+    private:
+      samlist_member * anchor_;
+
+      /* no copy */
+      samlist (samlist const &);
+      samlist & operator = (samlist const &);
+
+    public:
+      iterator  begin () { return iterator (anchor_->next ()); }
+      iterator  end   () { return iterator (anchor_         ); }
+      iterator rbegin () { return iterator (anchor_->prev ()); }
+      iterator rend   () { return iterator (anchor_         ); }
+
+      samlist ()
+	: anchor_ (new samlist_member ())
+      {}
+
+      ~samlist ()
+      {
+	for (iterator it = begin (); it != end (); it.remove ());
+	delete anchor_;
+      }
+
+      void swap (samlist & that)
+      {
+	samlist_member * old = anchor_;
+	anchor_ = that.anchor_;
+	that.anchor_ = old;
+      }
+
+      member_type * append (typename MemberType::constructor_args ca)
+      {
+	typename member_type::constructor_args mca (samlist_member::constructor_args (anchor_), ca);
+	return new (mca) member_type (mca);
+      }
+
+      template<typename FirstType>
+	member_type * appendv (FirstType & firstArg, va_list const & moreArgs)
+	{
+	  typename member_type::constructor_args ca (firstArg, moreArgs);
+	  return append (ca);
+	}
+
+      template<typename FirstType>
+	member_type * appendv(FirstType & firstArg, ...)
+	{
+	  va_list moreArgs;
+	  va_start (moreArgs, firstArg);
+	  member_type * ret = appendv (firstArg, moreArgs);
+	  va_end (moreArgs);
+	  return ret;
+	}
+    };
+
+}
+
+#endif /* __cplusplus */
diff --git a/winsup/cygwin/vstring.h b/winsup/cygwin/vstring.h
new file mode 100644
index 0000000..0658699
--- /dev/null
+++ b/winsup/cygwin/vstring.h
@@ -0,0 +1,503 @@
+/* vstring.h: cygwin::vstring, cygwin::vbuffe
+
+   Copyright 2016 Red Hat, Inc.
+
+This file is part of Cygwin.
+
+This software is a copyrighted work licensed under the terms of the
+Cygwin license.  Please consult the file "CYGWIN_LICENSE" for
+details. */
+
+#include <assert.h>
+#include "cygheap_malloc.h"
+
+#ifdef __cplusplus
+
+namespace cygwin {
+
+  struct constructible {
+    struct constructor_args {};
+    constructible (constructor_args &) {}
+  };
+
+  namespace string {
+
+    template<typename CharType>
+      struct char_traits
+      {
+	typedef CharType char_type;
+	typedef size_t size_type;
+
+	/* see strlen */
+	static size_type length (char_type const *);
+
+	/* see stpcpy */
+	static char_type * copy (char_type *, char_type const *);
+
+	/* see stpncpy */
+	static char_type * copy (char_type *, char_type const *, size_type);
+      };
+
+    template<>
+      struct char_traits<char>
+      {
+	typedef char char_type;
+	typedef size_t size_type;
+
+	static size_type length(char_type const * s)
+	{
+	  return strlen (s);
+	}
+	static char_type * copy(char_type * d, char_type const * s)
+	{
+	  return stpcpy (d, s);
+	}
+	static char_type * copy(char_type * d, char_type const * s, size_type n)
+	{
+	  return stpncpy (d, s, n);
+	}
+      };
+
+    template<>
+      struct char_traits<wchar_t>
+      {
+	typedef wchar_t char_type;
+	typedef size_t size_type;
+
+	static size_type length(char_type const *s)
+	{
+	  return wcslen (s);
+	}
+	static char_type * copy(char_type *d, char_type const *s)
+	{
+	  return wcpcpy (d, s);
+	}
+	static char_type * copy(char_type *d, char_type const *s, size_type n)
+	{
+	  return wcpncpy (d, s, n);
+	}
+      };
+
+    namespace storage {
+
+      template<typename CharTraits, typename AllocatorType>
+	class varying_length /* must be last data member wherever used */
+	{
+	protected:
+	  typedef CharTraits char_traits;
+	  typedef AllocatorType allocator_type;
+	  typedef typename CharTraits::char_type char_type;
+	  typedef typename CharTraits::size_type size_type;
+
+	private:
+	  union {
+	    struct {
+	      size_type bufferlength_; /* initialized by operator new */
+	      size_type stringlength_; /* abused to check for operator new */
+	      /* must be last data member even in derived class */
+	      char_type buffer_[1]; /* we always have space for trailing zero */
+	    };
+	    double alignment_;
+	  };
+
+	  /* variable length disallows placement new */
+	  void * operator new (size_t classSize, void * p);
+
+	protected:
+	  varying_length ()
+	    : stringlength_ ()
+	    , buffer_ ()
+	  {}
+
+	  struct constructor_args
+	  {
+	    size_type bufferlength;
+
+	    constructor_args ()
+	      : bufferlength (0)
+	    {}
+
+	    constructor_args (size_type l)
+	      : bufferlength (l)
+	    {}
+
+	    void set (size_type l)
+	    {
+	      bufferlength = l;
+	    }
+	  };
+
+	  varying_length (constructor_args & ca)
+	  {
+	    /* check if our operator new was used as expected */
+	    assert (bufferlength_ == ca.bufferlength);
+	    assert (stringlength_ == (size_type) this);
+	    stringlength_ = (size_type) 0;
+	    buffer_[0] = (char_type) 0;
+	  }
+
+	  size_type   bufferlength () const { return bufferlength_; }
+	  size_type   stringlength () const { return stringlength_; }
+	  char_type const * buffer () const { return buffer_; }
+	  char_type       * buffer ()       { return buffer_; }
+
+	  void stringlength (size_type stringLength)
+	  {
+	    assert (stringLength <= bufferlength_);
+	    stringlength_ = stringLength;
+	  }
+
+	  void * operator new (size_t classSize, constructor_args & ca)
+	  {
+	    void *ret = allocator_type::alloc (classSize + ca.bufferlength);
+	    /* we are last data member, aligned to the end of real struct:
+	       find self to initialize bufferlength_ */
+	    union { void *v; char *b; varying_length *p; } self;
+	    self.v = ret;
+	    self.b += classSize - sizeof (varying_length);
+	    /* tell constructor about our allocation */
+	    self.p->bufferlength_ = ca.bufferlength;
+	    self.p->stringlength_ = (size_type) self.p;
+	    return ret;
+	  }
+
+	  void operator delete (void * p)
+	  {
+	    allocator_type::free (p);
+	  }
+	};
+
+    }
+
+    namespace access {
+
+      template<typename AllocatorType>
+	class readonly
+	  : protected AllocatorType /* hide AllocatorType::buffer () */
+	{
+	public:
+	  typedef AllocatorType storage_type;
+	  using typename storage_type::char_type;
+	  using typename storage_type::size_type;
+
+	  using typename AllocatorType::constructor_args;
+
+	  readonly (constructor_args & c)
+	    : AllocatorType (c)
+	  {}
+
+	  char_type const * string () const { return AllocatorType::buffer (); }
+	  size_type   stringlength () const { return AllocatorType::stringlength (); }
+	};
+
+      template<typename AllocatorType>
+	class writeable
+	  : public readonly<AllocatorType>
+	{
+	public:
+	  using typename readonly<AllocatorType>::storage_type;
+	  using typename readonly<AllocatorType>::char_type;
+	  using typename readonly<AllocatorType>::size_type;
+
+	  using typename readonly<AllocatorType>::constructor_args;
+
+	  writeable (constructor_args & c)
+	    : readonly<AllocatorType> (c)
+	  {}
+
+	  char_type *     buffer () { return storage_type::buffer (); }
+	  size_type bufferlength () { return storage_type::bufferlength (); }
+	};
+
+    }
+
+    /* Extend some ConstructibleBaseType by one string with length unknown before
+       construction.
+       The ConstructibleBaseType is required to provide the contructor_args type
+       to support polymorphic constructor arguments;
+       Provides 
+       The new operator needs to get the (varying) constructor
+       arguments for determining required memory size, while the constructor
+       itself assumes enough memory being allocated to copy the varying
+       string arguments.  */
+    template<typename AccessType, typename ConstructibleBaseType>
+      class varying_extension
+	: public ConstructibleBaseType
+	, public AccessType /* must be last data member */
+      {
+      public:
+	using typename AccessType::storage_type;
+
+	using typename AccessType::char_traits;
+	using typename AccessType::char_type;
+	using typename AccessType::size_type;
+
+	struct constructor_args
+	  : public ConstructibleBaseType::constructor_args
+	  , public AccessType::constructor_args
+	{
+	  char_type const * part0;
+	  va_list           parts;
+
+	  constructor_args ()
+	    : ConstructibleBaseType::constructor_args ()
+	    , AccessType::constructor_args ()
+	    , part0 (NULL)
+	    , parts ()
+	  {}
+
+	  constructor_args (char_type const * p0, va_list const & pn)
+	    : ConstructibleBaseType::constructor_args ()
+	    , AccessType::constructor_args ()
+	    , part0 (p0)
+	    , parts ()
+	  {
+	    if (part0)
+	      va_copy (parts, pn);
+	  }
+
+	  constructor_args (char_type const * p0, ...)
+	    : ConstructibleBaseType::constructor_args ()
+	    , AccessType::constructor_args ()
+	    , part0 (p0)
+	    , parts ()
+	  {
+	    if (part0)
+	      va_start (parts, p0);
+	  }
+
+	  void set (char_type const * p0, va_list const & pn)
+	  {
+	    if (part0)
+	      va_end (parts);
+	    part0 = p0;
+	    if (part0)
+	      va_copy (parts, pn);
+	  }
+
+	  void set (char_type const * p0, ...)
+	  {
+	    if (part0)
+	      va_end (parts);
+	    part0 = p0;
+	    if (part0)
+	      va_start (parts, p0);
+	  }
+
+	  ~constructor_args ()
+	  {
+	    if (part0)
+	      va_end (parts);
+	  }
+	};
+
+
+	void operator delete (void * p)
+	{
+	  AccessType::operator delete (p);
+	}
+
+	void * operator new (size_t classSize)
+	{
+	  constructor_args ca;
+	  return AccessType::operator new (classSize, ca);
+	}
+
+	void * operator new (size_t classSize, constructor_args & ca)
+	{
+	  char_type const * part0 = ca.part0;
+	  va_list parts;
+	  va_copy (parts, ca.parts);
+	  ca.bufferlength = 0;
+	  while (part0)
+	    {
+	      int part0len = va_arg (parts, int);
+	      if (part0len < 0)
+		part0len = char_traits::length (part0);
+	      ca.bufferlength += part0len;
+	      part0 = va_arg (parts, const char_type *);
+	    }
+	  va_end (parts);
+
+	  /* allocate member- and string-buffer at once */
+	  return AccessType::operator new (classSize, ca);
+	}
+
+	varying_extension (constructor_args & ca)
+	  : ConstructibleBaseType (ca)
+	  , AccessType (ca)
+	{
+	  char_type * dest = storage_type::buffer ();
+	  size_type bufferLength = 0;
+	  char_type const * part0 = ca.part0;
+	  va_list parts;
+	  va_copy (parts, ca.parts);
+	  while (part0)
+	    {
+	      int part0len = va_arg (parts, int);
+	      if (part0len < 0)
+		{
+		  char_type * old = dest;
+		  dest = char_traits::copy (old, part0);
+		  part0len = dest - old;
+		}
+	      else
+		dest = char_traits::copy (dest, part0, part0len);
+	      bufferLength += part0len;
+	      part0 = va_arg (parts, const char_type *);
+	    }
+	  va_end (parts);
+	  *dest = (char_type)0;
+	  storage_type::stringlength (dest - storage_type::buffer ());
+	  assert (bufferLength == storage_type::bufferlength ());
+	}
+
+	template<typename FinalType>
+	  static FinalType * create (typename FinalType::constructor_args & ca)
+	  {
+	    return new (ca) FinalType (ca);
+	  }
+
+	template<typename FinalType>
+	  static FinalType * create (typename FinalType::constructor_args & c, const char_type * part0, va_list const & parts)
+	  {
+	    c.set (part0, parts);
+	    return create (c);
+	  }
+
+	template<typename FinalType>
+	  static FinalType * create (typename FinalType::constructor_args & c, const char_type * part0, ...)
+	  {
+	    va_list parts;
+	    va_start (parts, part0);
+	    c.set (part0, parts);
+	    va_end (parts);
+	    return create (c);
+	  }
+
+	template<typename FinalType>
+	  static FinalType * create (const char_type * part0, va_list const & parts)
+	  {
+	    typename FinalType::constructor_args c;
+	    c.set (part0, parts);
+	    return create (c);
+	  }
+
+	template<typename FinalType>
+	  static FinalType * create (const char_type * part0, ...)
+	  {
+	    typename FinalType::constructor_args c;
+	    va_list parts;
+	    va_start (parts, part0);
+	    c.set (part0, parts);
+	    va_end (parts);
+	    return create (c);
+	  }
+      };
+  }
+
+  template<typename CharType, typename ConstructibleBaseType, typename Allocator>
+    struct basic_vstring_extension
+      : public string::varying_extension
+	< string::access::readonly
+	  < string::storage::varying_length
+	    < string::char_traits<CharType>
+	    , Allocator
+	  > >
+	, ConstructibleBaseType
+	>
+    {
+      using typename string::varying_extension
+	< string::access::readonly
+	  < string::storage::varying_length
+	    < string::char_traits<CharType>
+	    , Allocator
+	  > >
+	, ConstructibleBaseType
+	>::constructor_args;
+      basic_vstring_extension (constructor_args & ca)
+	: string::varying_extension
+	  < string::access::readonly
+	    < string::storage::varying_length
+	      < string::char_traits<CharType>
+	      , Allocator
+	    > >
+	  , ConstructibleBaseType
+	  > (ca)
+      {}
+    };
+
+  template<typename CharType, typename ConstructibleBaseType, typename Allocator>
+    struct basic_vbuffer_extension
+      : public string::varying_extension
+	< string::access::writeable
+	  < string::storage::varying_length
+	    < string::char_traits<CharType>
+	    , Allocator
+	  > >
+	, ConstructibleBaseType
+	>
+    {
+      using typename string::varying_extension
+	< string::access::writeable
+	  < string::storage::varying_length
+	    < string::char_traits<CharType>
+	    , Allocator
+	  > >
+	, ConstructibleBaseType
+	>::constructor_args;
+
+      basic_vbuffer_extension (constructor_args & ca)
+	: string::varying_extension
+	  < string::access::writeable
+	    < string::storage::varying_length
+	      < string::char_traits<CharType>
+	      , Allocator
+	    > >
+	  , ConstructibleBaseType
+	  > (ca)
+      {}
+    };
+
+  template<typename ConstructibleBaseType, typename Allocator>
+    struct vstring_extension
+      : public basic_vstring_extension<char, ConstructibleBaseType, Allocator>
+    {
+      using typename basic_vstring_extension<char, ConstructibleBaseType, Allocator>::constructor_args;
+      vstring_extension (constructor_args & ca)
+	: basic_vstring_extension<char, ConstructibleBaseType, Allocator> (ca)
+      {}
+    };
+
+  template<typename ConstructibleBaseType, typename Allocator>
+    struct wvstring_extension
+      : public basic_vstring_extension<wchar_t, ConstructibleBaseType, Allocator>
+    {
+      using typename basic_vstring_extension<wchar_t, ConstructibleBaseType, Allocator>::constructor_args;
+    };
+
+  template<typename ConstructibleBaseType, typename Allocator>
+    struct vbuffer_extension
+      : public basic_vbuffer_extension<char, ConstructibleBaseType, Allocator>
+    {
+      using typename basic_vbuffer_extension<char, ConstructibleBaseType, Allocator>::constructor_args;
+      vbuffer_extension (constructor_args & ca)
+	: basic_vbuffer_extension<char, ConstructibleBaseType, Allocator> (ca)
+      {}
+    };
+
+  template<typename ConstructibleBaseType, typename Allocator>
+    struct wvbuffer_extension
+      : public basic_vbuffer_extension<wchar_t, ConstructibleBaseType, Allocator>
+    {
+      using typename basic_vbuffer_extension<wchar_t, ConstructibleBaseType, Allocator>::constructor_args;
+    };
+
+  typedef vstring_extension<constructible, allocator<HEAP_STR> > vstring;
+  typedef vbuffer_extension<constructible, allocator<HEAP_STR> > vbuffer;
+
+  typedef wvstring_extension<constructible, allocator<HEAP_STR> > wvstring;
+  typedef wvbuffer_extension<constructible, allocator<HEAP_STR> > wvbuffer;
+}
+
+#endif /* __cplusplus */
-- 
2.8.3


[-- Attachment #3: 0002-dlopen-search-main-executable-s-directory.patch --]
[-- Type: text/x-patch, Size: 1347 bytes --]

From 6e2739deccb06a4d739d2a41fa32355f21c151a1 Mon Sep 17 00:00:00 2001
From: Michael Haubenwallner <michael.haubenwallner@ssi-schaefer.com>
Date: Tue, 19 Jul 2016 14:56:48 +0200
Subject: [PATCH 2/3] dlopen: search main executable's directory

---
 winsup/cygwin/dlfcn.cc | 13 +++++++++++++
 1 file changed, 13 insertions(+)

diff --git a/winsup/cygwin/dlfcn.cc b/winsup/cygwin/dlfcn.cc
index e2093ab..4ea9984 100644
--- a/winsup/cygwin/dlfcn.cc
+++ b/winsup/cygwin/dlfcn.cc
@@ -80,6 +80,19 @@ get_full_path_of_dll (const char* str, path_conv &real_filename)
 	 not use the LD_LIBRARY_PATH environment variable. */
       finder.add_envsearchpath ("LD_LIBRARY_PATH");
 
+      /* Unless we have a special cygwin loader, there is no such thing like
+	 DT_RUNPATH on Windows we can use to search for dlls, except for the
+	 directory of the main executable. */
+      tmp_pathbuf tp;
+      PWCHAR exewname = tp.w_get ();
+      GetModuleFileNameW (NULL, exewname, NT_MAX_PATH);
+      char * exedir = tp.c_get ();
+      *exedir = '\0';
+      cygwin_conv_path (CCP_WIN_W_TO_POSIX, exewname, exedir, NT_MAX_PATH);
+      char * lastsep = strrchr (exedir, '/');
+      if (lastsep)
+	finder.add_searchdir (exedir, lastsep - exedir);
+
       /* Finally we better have some fallback. */
       finder.add_searchdir ("/usr/lib", -1);
     }
-- 
2.8.3


[-- Attachment #4: 0003-dlopen-search-PATH-environment-variable.patch --]
[-- Type: text/x-patch, Size: 851 bytes --]

From efe31198bf5747253a9c0d588c354a6b86234525 Mon Sep 17 00:00:00 2001
From: Michael Haubenwallner <michael.haubenwallner@ssi-schaefer.com>
Date: Tue, 19 Jul 2016 14:57:10 +0200
Subject: [PATCH 3/3] dlopen: search PATH environment variable

---
 winsup/cygwin/dlfcn.cc | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/winsup/cygwin/dlfcn.cc b/winsup/cygwin/dlfcn.cc
index 4ea9984..302869d 100644
--- a/winsup/cygwin/dlfcn.cc
+++ b/winsup/cygwin/dlfcn.cc
@@ -93,6 +93,9 @@ get_full_path_of_dll (const char* str, path_conv &real_filename)
       if (lastsep)
 	finder.add_searchdir (exedir, lastsep - exedir);
 
+      /* Windows uses the PATH environment variable to search for dlls. */
+      finder.add_envsearchpath ("PATH");
+
       /* Finally we better have some fallback. */
       finder.add_searchdir ("/usr/lib", -1);
     }
-- 
2.8.3


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: About the dll search algorithm of dlopen
  2016-08-19 16:39         ` Michael Haubenwallner
@ 2016-08-20 19:32           ` Corinna Vinschen
  2016-08-22 12:15             ` Michael Haubenwallner
  0 siblings, 1 reply; 32+ messages in thread
From: Corinna Vinschen @ 2016-08-20 19:32 UTC (permalink / raw)
  To: cygwin-developers

[-- Attachment #1: Type: text/plain, Size: 4703 bytes --]

Hi Michael,

On Aug 19 18:39, Michael Haubenwallner wrote:
> (and now with patches attached)
> 
> On 08/19/2016 06:37 PM, Michael Haubenwallner wrote:
> > On 06/01/2016 10:17 PM, Corinna Vinschen wrote:
> >> On Jun  1 16:26, Michael Haubenwallner wrote:
> >>> On 06/01/2016 01:09 PM, Corinna Vinschen wrote:
> >>>> On Jun  1 08:40, Michael Haubenwallner wrote:
> >>>>> Hi,
> >>>>>
> >>>>> two issues with dlopen here (I'm about to prepare patches):
> >>>>>
> >>>>> *) The algorithm to combine dll file name variants with the search path
> >>>>>    entries needs to be reordered, as in:
> >>>>>    - for each dll file name variant:
> >>>>>    -   for each search path:
> >>>>>    + for each search path entry:
> >>>>>    +   for each dll file name variant:
> >>>>>          check if useable
> > 
> >>>> However, if you can speed up the search process ignore the
> >>>> question...
> > 
> > Not sure if there's a speedup actually, ...
> > 
> >>>
> >>> This also depends on whether find_exec really is necessary here.
> >>
> >> Not as such necessary, it's just the function used to search in a
> >> search path.  If you want to change that you have to rewrite the
> >> same logic again, just reversed.
> >>
> >> One way around YA code duplication could be some kind of path iterator
> >> class which could be used from find_exec as well as from
> >> get_full_path_of_dll.
> > 
> > ... but:
> > 0001.patch is a draft for some new cygwin::pathfinder class, with
> > 0002.patch adding the executable's directory as searchpath, and
> > 0003.patch to search the PATH environment as well.
> > 
> > Thoughts?

Ok, that might be disappointing now because you already put so much work
into it, but I actually expected some more discussion first.  I have two
problem with this.

I'm not a big fan of templates.  In fact, to me they are confusing and
hard to debug.  We only use templates in a single instance in thread.cc
to implement linked lists of pthread objects and I would be very
grateful if we don't add to them.  Somebody has to maintain the code,
and ultimately that would be me.  We also don't really need namespaces
since we don't include any external C++ libs, so we don't have name
collisions.  There are also a few style-issues in the code, e.g,
CamelBack.  Cygwin usually doesn't use it (exceptions confirm the rule)
but it's usually a good way to differ Windows functions from Cygwin
functions.

What I had in mind was a *simple* class which gets told if it searches
for libs or executables and then checks the different paths accordingly,
kind of a copy of find_exec as a class, just additionally handling the
prefix issue for DLLs.

> >>>>> *) The directory of the current main executable should be searched
> >>>>>    after LD_LIBRARY_PATH and before /usr/bin:/usr/lib.
> >>>>>    And PATH should be searched before /usr/bin:/usr/lib as well.
> >>>>
> >>>> Checking the executable path and $PATH are Windows concepts.  dlopen
> >>>> doesn't do that on POSIX systems and we're not doing that either.
> >>>
> >>> Agreed, but POSIX also does have the concept of embedded RUNPATH,
> >>> which is completely missing in Cygwin as far as I can see.
> >>
> >> RPATH and RUNPATH are ELF dynamic loader features, not supported by
> >> PE/COFF.
> > 
> > In any case, to me it does feel quite important to have the (almost) same
> > dll search algorithm with dlopen() as with CreateProcess().

Last but not least I'm not yet convinced if it's *really* a good idea to
prepend the executable path to the DLL search path unconditionally.  Be
it as it is in terms of DT_RUNPATH, why is the application dir a good
choice at all, unless we're talking Windows semantics?  Which we don't.
Also, if loading from the applications dir from dlopen is important for
you, you can emulate it by adding the application dir to LD_LIBRAYR_PATH.

I checked for the usage of DT_RUNPATH/DT_RPATH on Fedora 23 and only a
limited number of packages use it (texlive, samba, python, man-db,
swipl, and a few more).  Some of them, like texlive, even use it wrongly,
with RPATH pointing to a non-existing build dir.  There are also a few
stray "/usr/lib64" settings, but all in all it's not used to point to
the dir the application is installed to, but rather to some package specific
subdir, e.g.  /usr/lib64/samba, /usr/lib64/swipl-7.2.3/lib/x86_64-linux,
etc.

IMHO this means just adding the applications bin dir is most of the time
an unused or even wrong workaround.


Corinna

-- 
Corinna Vinschen                  Please, send mails regarding Cygwin to
Cygwin Maintainer                 cygwin AT cygwin DOT com
Red Hat

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: About the dll search algorithm of dlopen
  2016-08-20 19:32           ` Corinna Vinschen
@ 2016-08-22 12:15             ` Michael Haubenwallner
  2016-08-22 12:48               ` Peter Rosin
                                 ` (2 more replies)
  0 siblings, 3 replies; 32+ messages in thread
From: Michael Haubenwallner @ 2016-08-22 12:15 UTC (permalink / raw)
  To: cygwin-developers

Hi Corinna,

On 08/20/2016 09:32 PM, Corinna Vinschen wrote:
>>>>
>>>> One way around YA code duplication could be some kind of path iterator
>>>> class which could be used from find_exec as well as from
>>>> get_full_path_of_dll.

>>> 0001.patch is a draft for some new cygwin::pathfinder class, with
>>> 0002.patch adding the executable's directory as searchpath, and
>>> 0003.patch to search the PATH environment as well.
>>>
>>> Thoughts?
> 
> Ok, that might be disappointing now because you already put so much work
> into it, but I actually expected some more discussion first.  I have two
> problem with this.
> 
> I'm not a big fan of templates.

Never mind, it's been some template exercise to me anyway.

> What I had in mind was a *simple* class which gets told if it searches
> for libs or executables and then checks the different paths accordingly,
> kind of a copy of find_exec as a class, just additionally handling the
> prefix issue for DLLs.

What I'm more interested in for such a class is the actual API for use
by dlopen() and exec(), and the final list of files searched for - with
these use cases coming to my mind:

Libraries/dlls with final search path "/lib:/morelibs":
  L1) dlopen("libN.so")
  L2) dlopen("libN.dll")
  L3) dlopen("cygN.dll")
  L4) dlopen("N.so")
  L5) dlopen("N.dll")
Executables with final search path "/bin:/moreexes"
  X1) exec("X")
  X2) exec("X.exe")
  X3) exec("X.com")

Instead of API calls similar to:
  L1) find(dll, "N", ["/lib", "/morelibs"])
  L2) find(dll, "N", ["/lib", "/morelibs"])
  L3) find(dll, "N", ["/lib", "/morelibs"])
  L4) find(dll, "N", ["/lib", "/morelibs"])
  L5) find(dll, "N", ["/lib", "/morelibs"])
  X1) find(exe, "X", ["/bin", "/moreexes"])
  X2) find(exe, "X", ["/bin", "/moreexes"])
  X3) find(exe, "X", ["/bin", "/moreexes"])
it feels necessary to support more explicit naming, as in:
  L1) find(["libN.so", "cygN.dll", "libN.dll"], ["/lib/../bin","/lib","/morelibs"])
  L2) find([           "cygN.dll", "libN.dll"], ["/lib/../bin","/lib","/morelibs"])
  L3) find([           "cygN.dll", "libN.dll"], ["/lib/../bin","/lib","/morelibs"])
  L4) find(["N.so", "N.dll"                  ], ["/lib/../bin","/lib","/morelibs"])
  L5) find([        "N.dll"                  ], ["/lib/../bin","/lib","/morelibs"])
  X1) find(["X", "X.exe","X.com"], ["/bin","/moreexes"])
  X2) find(["X", "X.exe"        ], ["/bin","/moreexes"])
  X3) find(["X", "X.com"        ], ["/bin","/moreexes"])

Where the find method does not need to actually know whether it searches
for a dll or an exe, but dlopen() and exec() instead define the file
names to search for. This is what the patch draft does in dlopen.

>>>>>>> *) The directory of the current main executable should be searched
>>>>>>>    after LD_LIBRARY_PATH and before /usr/bin:/usr/lib.
>>>>>>>    And PATH should be searched before /usr/bin:/usr/lib as well.
>>>>>>
>>>>>> Checking the executable path and $PATH are Windows concepts.  dlopen
>>>>>> doesn't do that on POSIX systems and we're not doing that either.
>>>>>
>>>>> Agreed, but POSIX also does have the concept of embedded RUNPATH,
>>>>> which is completely missing in Cygwin as far as I can see.
>>>>
>>>> RPATH and RUNPATH are ELF dynamic loader features, not supported by
>>>> PE/COFF.
>>>
>>> In any case, to me it does feel quite important to have the (almost) same
>>> dll search algorithm with dlopen() as with CreateProcess().
> 
> Last but not least I'm not yet convinced if it's *really* a good idea to
> prepend the executable path to the DLL search path unconditionally.  Be
> it as it is in terms of DT_RUNPATH, why is the application dir a good
> choice at all, unless we're talking Windows semantics?  Which we don't.
> Also, if loading from the applications dir from dlopen is important for
> you, you can emulate it by adding the application dir to LD_LIBRAYR_PATH.

As long as there is lack of a Cygwin specific dll loader to find the
dlls to load during process startup, we're bound to Windows semantics.

For dlopen, it is more important to find the same dll file as would be
found when the exe was linked against that dll file, rather than using
the Linux-known algorithm and environment variables - and differ from
process startup: Both really should result in the same algorithm here,
even if that means some difference compared to Linux.

As far as I understand, lack of DT_RUNPATH (besides /etc/ld.so.conf)
support during process start was the main reason for the dlls to install
into /lib/../bin instead of /lib at all, to be found at process start
because of residing in the application's bin dir:
Why should that be different for dlopen?

> I checked for the usage of DT_RUNPATH/DT_RPATH on Fedora 23 and only a
> limited number of packages use it (texlive, samba, python, man-db,
> swipl, and a few more).  Some of them, like texlive, even use it wrongly,
> with RPATH pointing to a non-existing build dir.  There are also a few
> stray "/usr/lib64" settings, but all in all it's not used to point to
> the dir the application is installed to, but rather to some package specific
> subdir, e.g.  /usr/lib64/samba, /usr/lib64/swipl-7.2.3/lib/x86_64-linux,
> etc.

On Linux, the binaries installed in /usr usually rely on the Linux
loader to be configured via /etc/ld.so.conf to find their runtime
libs in /usr/lib.

Please remember: This whole thing is not a problem with packages
installed to /usr, but with packages installed to /somewhere/else
that provide runtime libraries that are also available in /usr.

Using LD_LIBRARY_PATH pointing to /somewhere/else/lib may break the
binaries found in /usr/bin - and agreed, searching PATH doesn't make
it better, as PATH is the "LD_LIBRARY_PATH" for Windows.

> IMHO this means just adding the applications bin dir is most of the time
> an unused or even wrong workaround.

Although GetModuleHandle may reduce that pressure for dlopen - as long as
the applications bin dir is searched at process start, it really should
be searched by dlopen too, even if for /usr/bin/* this might indeed become
redundant, as we always add /usr/bin in dlopen - which really mimics
the /etc/ld.so.conf content actually, although that one is unavailable
to process startup.

Thanks!
/haubi/

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: About the dll search algorithm of dlopen
  2016-08-22 12:15             ` Michael Haubenwallner
@ 2016-08-22 12:48               ` Peter Rosin
  2016-08-22 13:01                 ` Michael Haubenwallner
  2016-08-22 16:02               ` About the dll search algorithm of dlopen (patch-r2) Michael Haubenwallner
  2016-08-22 18:49               ` About the dll search algorithm of dlopen Corinna Vinschen
  2 siblings, 1 reply; 32+ messages in thread
From: Peter Rosin @ 2016-08-22 12:48 UTC (permalink / raw)
  To: cygwin-developers



On 2016-08-22 14:15, Michael Haubenwallner wrote:
> Hi Corinna,
> 
> On 08/20/2016 09:32 PM, Corinna Vinschen wrote:
>>>>>
>>>>> One way around YA code duplication could be some kind of path iterator
>>>>> class which could be used from find_exec as well as from
>>>>> get_full_path_of_dll.
> 
>>>> 0001.patch is a draft for some new cygwin::pathfinder class, with
>>>> 0002.patch adding the executable's directory as searchpath, and
>>>> 0003.patch to search the PATH environment as well.
>>>>
>>>> Thoughts?
>>
>> Ok, that might be disappointing now because you already put so much work
>> into it, but I actually expected some more discussion first.  I have two
>> problem with this.
>>
>> I'm not a big fan of templates.
> 
> Never mind, it's been some template exercise to me anyway.
> 
>> What I had in mind was a *simple* class which gets told if it searches
>> for libs or executables and then checks the different paths accordingly,
>> kind of a copy of find_exec as a class, just additionally handling the
>> prefix issue for DLLs.
> 
> What I'm more interested in for such a class is the actual API for use
> by dlopen() and exec(), and the final list of files searched for - with
> these use cases coming to my mind:
> 
> Libraries/dlls with final search path "/lib:/morelibs":
>   L1) dlopen("libN.so")
>   L2) dlopen("libN.dll")
>   L3) dlopen("cygN.dll")
>   L4) dlopen("N.so")
>   L5) dlopen("N.dll")
> Executables with final search path "/bin:/moreexes"
>   X1) exec("X")
>   X2) exec("X.exe")
>   X3) exec("X.com")
> 
> Instead of API calls similar to:
>   L1) find(dll, "N", ["/lib", "/morelibs"])
>   L2) find(dll, "N", ["/lib", "/morelibs"])
>   L3) find(dll, "N", ["/lib", "/morelibs"])
>   L4) find(dll, "N", ["/lib", "/morelibs"])
>   L5) find(dll, "N", ["/lib", "/morelibs"])
>   X1) find(exe, "X", ["/bin", "/moreexes"])
>   X2) find(exe, "X", ["/bin", "/moreexes"])
>   X3) find(exe, "X", ["/bin", "/moreexes"])
> it feels necessary to support more explicit naming, as in:
>   L1) find(["libN.so", "cygN.dll", "libN.dll"], ["/lib/../bin","/lib","/morelibs"])
>   L2) find([           "cygN.dll", "libN.dll"], ["/lib/../bin","/lib","/morelibs"])
>   L3) find([           "cygN.dll", "libN.dll"], ["/lib/../bin","/lib","/morelibs"])
>   L4) find(["N.so", "N.dll"                  ], ["/lib/../bin","/lib","/morelibs"])
>   L5) find([        "N.dll"                  ], ["/lib/../bin","/lib","/morelibs"])
>   X1) find(["X", "X.exe","X.com"], ["/bin","/moreexes"])
>   X2) find(["X", "X.exe"        ], ["/bin","/moreexes"])
>   X3) find(["X", "X.com"        ], ["/bin","/moreexes"])
> 
> Where the find method does not need to actually know whether it searches
> for a dll or an exe, but dlopen() and exec() instead define the file
> names to search for. This is what the patch draft does in dlopen.
> 
>>>>>>>> *) The directory of the current main executable should be searched
>>>>>>>>    after LD_LIBRARY_PATH and before /usr/bin:/usr/lib.
>>>>>>>>    And PATH should be searched before /usr/bin:/usr/lib as well.
>>>>>>>
>>>>>>> Checking the executable path and $PATH are Windows concepts.  dlopen
>>>>>>> doesn't do that on POSIX systems and we're not doing that either.
>>>>>>
>>>>>> Agreed, but POSIX also does have the concept of embedded RUNPATH,
>>>>>> which is completely missing in Cygwin as far as I can see.
>>>>>
>>>>> RPATH and RUNPATH are ELF dynamic loader features, not supported by
>>>>> PE/COFF.
>>>>
>>>> In any case, to me it does feel quite important to have the (almost) same
>>>> dll search algorithm with dlopen() as with CreateProcess().
>>
>> Last but not least I'm not yet convinced if it's *really* a good idea to
>> prepend the executable path to the DLL search path unconditionally.  Be
>> it as it is in terms of DT_RUNPATH, why is the application dir a good
>> choice at all, unless we're talking Windows semantics?  Which we don't.
>> Also, if loading from the applications dir from dlopen is important for
>> you, you can emulate it by adding the application dir to LD_LIBRAYR_PATH.
> 
> As long as there is lack of a Cygwin specific dll loader to find the
> dlls to load during process startup, we're bound to Windows semantics.
> 
> For dlopen, it is more important to find the same dll file as would be
> found when the exe was linked against that dll file, rather than using
> the Linux-known algorithm and environment variables - and differ from
> process startup: Both really should result in the same algorithm here,
> even if that means some difference compared to Linux.
> 
> As far as I understand, lack of DT_RUNPATH (besides /etc/ld.so.conf)
> support during process start was the main reason for the dlls to install
> into /lib/../bin instead of /lib at all, to be found at process start
> because of residing in the application's bin dir:
> Why should that be different for dlopen?
> 
>> I checked for the usage of DT_RUNPATH/DT_RPATH on Fedora 23 and only a
>> limited number of packages use it (texlive, samba, python, man-db,
>> swipl, and a few more).  Some of them, like texlive, even use it wrongly,
>> with RPATH pointing to a non-existing build dir.  There are also a few
>> stray "/usr/lib64" settings, but all in all it's not used to point to
>> the dir the application is installed to, but rather to some package specific
>> subdir, e.g.  /usr/lib64/samba, /usr/lib64/swipl-7.2.3/lib/x86_64-linux,
>> etc.
> 
> On Linux, the binaries installed in /usr usually rely on the Linux
> loader to be configured via /etc/ld.so.conf to find their runtime
> libs in /usr/lib.
> 
> Please remember: This whole thing is not a problem with packages
> installed to /usr, but with packages installed to /somewhere/else
> that provide runtime libraries that are also available in /usr.
> 
> Using LD_LIBRARY_PATH pointing to /somewhere/else/lib may break the
> binaries found in /usr/bin - and agreed, searching PATH doesn't make
> it better, as PATH is the "LD_LIBRARY_PATH" for Windows.
> 
>> IMHO this means just adding the applications bin dir is most of the time
>> an unused or even wrong workaround.
> 
> Although GetModuleHandle may reduce that pressure for dlopen - as long as
> the applications bin dir is searched at process start, it really should
> be searched by dlopen too, even if for /usr/bin/* this might indeed become
> redundant, as we always add /usr/bin in dlopen - which really mimics
> the /etc/ld.so.conf content actually, although that one is unavailable
> to process startup.

Just mentioning that changed dlopen semantics will in all likelihood affect
libtool. Do we have any libtool hackers around (besides me, I do not have
time) since Chuck disappeared?

Cheers,
Peter

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: About the dll search algorithm of dlopen
  2016-08-22 12:48               ` Peter Rosin
@ 2016-08-22 13:01                 ` Michael Haubenwallner
  0 siblings, 0 replies; 32+ messages in thread
From: Michael Haubenwallner @ 2016-08-22 13:01 UTC (permalink / raw)
  To: cygwin-developers

Hi Peter,

On 08/22/2016 02:48 PM, Peter Rosin wrote:
> 
> Just mentioning that changed dlopen semantics will in all likelihood affect
> libtool. Do we have any libtool hackers around (besides me, I do not have
> time) since Chuck disappeared?

/me has some (non-cygwin) libtool-patches@ pending for review&commit already...

/haubi/

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: About the dll search algorithm of dlopen (patch-r2)
  2016-08-22 12:15             ` Michael Haubenwallner
  2016-08-22 12:48               ` Peter Rosin
@ 2016-08-22 16:02               ` Michael Haubenwallner
  2016-08-22 18:37                 ` Corinna Vinschen
  2016-08-22 18:49               ` About the dll search algorithm of dlopen Corinna Vinschen
  2 siblings, 1 reply; 32+ messages in thread
From: Michael Haubenwallner @ 2016-08-22 16:02 UTC (permalink / raw)
  To: cygwin-developers

[-- Attachment #1: Type: text/plain, Size: 6598 bytes --]

On 08/22/2016 02:15 PM, Michael Haubenwallner wrote:
> On 08/20/2016 09:32 PM, Corinna Vinschen wrote:
>>>>>
>>>>> One way around YA code duplication could be some kind of path iterator
>>>>> class which could be used from find_exec as well as from
>>>>> get_full_path_of_dll.
> 
>>>> 0001.patch is a draft for some new cygwin::pathfinder class, with
>>>> 0002.patch adding the executable's directory as searchpath, and
>>>> 0003.patch to search the PATH environment as well.
>>>>
>>>> Thoughts?
>>
>> Ok, that might be disappointing now because you already put so much work
>> into it, but I actually expected some more discussion first.  I have two
>> problem with this.
>>
>> I'm not a big fan of templates.
> 
> Never mind, it's been some template exercise to me anyway.
> 
>> What I had in mind was a *simple* class which gets told if it searches
>> for libs or executables and then checks the different paths accordingly,
>> kind of a copy of find_exec as a class, just additionally handling the
>> prefix issue for DLLs.

What about this one: 've dropped any templates now,
while keeping the basenames logic inside dlopen.

/haubi/

> 
> What I'm more interested in for such a class is the actual API for use
> by dlopen() and exec(), and the final list of files searched for - with
> these use cases coming to my mind:
> 
> Libraries/dlls with final search path "/lib:/morelibs":
>   L1) dlopen("libN.so")
>   L2) dlopen("libN.dll")
>   L3) dlopen("cygN.dll")
>   L4) dlopen("N.so")
>   L5) dlopen("N.dll")
> Executables with final search path "/bin:/moreexes"
>   X1) exec("X")
>   X2) exec("X.exe")
>   X3) exec("X.com")
> 
> Instead of API calls similar to:
>   L1) find(dll, "N", ["/lib", "/morelibs"])
>   L2) find(dll, "N", ["/lib", "/morelibs"])
>   L3) find(dll, "N", ["/lib", "/morelibs"])
>   L4) find(dll, "N", ["/lib", "/morelibs"])
>   L5) find(dll, "N", ["/lib", "/morelibs"])
>   X1) find(exe, "X", ["/bin", "/moreexes"])
>   X2) find(exe, "X", ["/bin", "/moreexes"])
>   X3) find(exe, "X", ["/bin", "/moreexes"])
> it feels necessary to support more explicit naming, as in:
>   L1) find(["libN.so", "cygN.dll", "libN.dll"], ["/lib/../bin","/lib","/morelibs"])
>   L2) find([           "cygN.dll", "libN.dll"], ["/lib/../bin","/lib","/morelibs"])
>   L3) find([           "cygN.dll", "libN.dll"], ["/lib/../bin","/lib","/morelibs"])
>   L4) find(["N.so", "N.dll"                  ], ["/lib/../bin","/lib","/morelibs"])
>   L5) find([        "N.dll"                  ], ["/lib/../bin","/lib","/morelibs"])
>   X1) find(["X", "X.exe","X.com"], ["/bin","/moreexes"])
>   X2) find(["X", "X.exe"        ], ["/bin","/moreexes"])
>   X3) find(["X", "X.com"        ], ["/bin","/moreexes"])
> 
> Where the find method does not need to actually know whether it searches
> for a dll or an exe, but dlopen() and exec() instead define the file
> names to search for. This is what the patch draft does in dlopen.
> 
>>>>>>>> *) The directory of the current main executable should be searched
>>>>>>>>    after LD_LIBRARY_PATH and before /usr/bin:/usr/lib.
>>>>>>>>    And PATH should be searched before /usr/bin:/usr/lib as well.
>>>>>>>
>>>>>>> Checking the executable path and $PATH are Windows concepts.  dlopen
>>>>>>> doesn't do that on POSIX systems and we're not doing that either.
>>>>>>
>>>>>> Agreed, but POSIX also does have the concept of embedded RUNPATH,
>>>>>> which is completely missing in Cygwin as far as I can see.
>>>>>
>>>>> RPATH and RUNPATH are ELF dynamic loader features, not supported by
>>>>> PE/COFF.
>>>>
>>>> In any case, to me it does feel quite important to have the (almost) same
>>>> dll search algorithm with dlopen() as with CreateProcess().
>>
>> Last but not least I'm not yet convinced if it's *really* a good idea to
>> prepend the executable path to the DLL search path unconditionally.  Be
>> it as it is in terms of DT_RUNPATH, why is the application dir a good
>> choice at all, unless we're talking Windows semantics?  Which we don't.
>> Also, if loading from the applications dir from dlopen is important for
>> you, you can emulate it by adding the application dir to LD_LIBRAYR_PATH.
> 
> As long as there is lack of a Cygwin specific dll loader to find the
> dlls to load during process startup, we're bound to Windows semantics.
> 
> For dlopen, it is more important to find the same dll file as would be
> found when the exe was linked against that dll file, rather than using
> the Linux-known algorithm and environment variables - and differ from
> process startup: Both really should result in the same algorithm here,
> even if that means some difference compared to Linux.
> 
> As far as I understand, lack of DT_RUNPATH (besides /etc/ld.so.conf)
> support during process start was the main reason for the dlls to install
> into /lib/../bin instead of /lib at all, to be found at process start
> because of residing in the application's bin dir:
> Why should that be different for dlopen?
> 
>> I checked for the usage of DT_RUNPATH/DT_RPATH on Fedora 23 and only a
>> limited number of packages use it (texlive, samba, python, man-db,
>> swipl, and a few more).  Some of them, like texlive, even use it wrongly,
>> with RPATH pointing to a non-existing build dir.  There are also a few
>> stray "/usr/lib64" settings, but all in all it's not used to point to
>> the dir the application is installed to, but rather to some package specific
>> subdir, e.g.  /usr/lib64/samba, /usr/lib64/swipl-7.2.3/lib/x86_64-linux,
>> etc.
> 
> On Linux, the binaries installed in /usr usually rely on the Linux
> loader to be configured via /etc/ld.so.conf to find their runtime
> libs in /usr/lib.
> 
> Please remember: This whole thing is not a problem with packages
> installed to /usr, but with packages installed to /somewhere/else
> that provide runtime libraries that are also available in /usr.
> 
> Using LD_LIBRARY_PATH pointing to /somewhere/else/lib may break the
> binaries found in /usr/bin - and agreed, searching PATH doesn't make
> it better, as PATH is the "LD_LIBRARY_PATH" for Windows.
> 
>> IMHO this means just adding the applications bin dir is most of the time
>> an unused or even wrong workaround.
> 
> Although GetModuleHandle may reduce that pressure for dlopen - as long as
> the applications bin dir is searched at process start, it really should
> be searched by dlopen too, even if for /usr/bin/* this might indeed become
> redundant, as we always add /usr/bin in dlopen - which really mimics
> the /etc/ld.so.conf content actually, although that one is unavailable
> to process startup.
> 
> Thanks!
> /haubi/
> 


[-- Attachment #2: 0001-dlopen-search-each-name-within-one-single-search-dir.patch --]
[-- Type: text/x-patch, Size: 13852 bytes --]

From ff75702593aed447f695efa37fc3312495681f8f Mon Sep 17 00:00:00 2001
From: Michael Haubenwallner <michael.haubenwallner@ssi-schaefer.com>
Date: Tue, 31 May 2016 13:09:11 +0000
Subject: [PATCH 1/3] dlopen: search each name within one single search dir

Search x/bin:x/lib if searching in x/lib.

Introduces and uses new pathfinder class, which
introduces and uses new vstrlist class.
---
 winsup/cygwin/dlfcn.cc     |  71 ++++++-------
 winsup/cygwin/pathfinder.h | 112 ++++++++++++++++++++
 winsup/cygwin/vstrlist.h   | 253 +++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 396 insertions(+), 40 deletions(-)
 create mode 100644 winsup/cygwin/pathfinder.h
 create mode 100644 winsup/cygwin/vstrlist.h

diff --git a/winsup/cygwin/dlfcn.cc b/winsup/cygwin/dlfcn.cc
index 255a6d5..da20a58 100644
--- a/winsup/cygwin/dlfcn.cc
+++ b/winsup/cygwin/dlfcn.cc
@@ -20,6 +20,7 @@ details. */
 #include "cygtls.h"
 #include "tls_pbuf.h"
 #include "ntdll.h"
+#include "pathfinder.h"
 
 static void
 set_dl_error (const char *str)
@@ -28,29 +29,6 @@ set_dl_error (const char *str)
   _my_tls.locals.dl_error = 1;
 }
 
-/* Look for an executable file given the name and the environment
-   variable to use for searching (eg., PATH); returns the full
-   pathname (static buffer) if found or NULL if not. */
-inline const char *
-check_path_access (const char *mywinenv, const char *name, path_conv& buf)
-{
-  return find_exec (name, buf, mywinenv, FE_NNF | FE_DLL);
-}
-
-/* Search LD_LIBRARY_PATH for dll, if it exists.  Search /usr/bin and /usr/lib
-   by default.  Return valid full path in path_conv real_filename. */
-static inline bool
-gfpod_helper (const char *name, path_conv &real_filename)
-{
-  if (strchr (name, '/'))
-    real_filename.check (name, PC_SYM_FOLLOW | PC_NULLEMPTY);
-  else if (!check_path_access ("LD_LIBRARY_PATH", name, real_filename))
-    check_path_access ("/usr/bin:/usr/lib", name, real_filename);
-  if (!real_filename.exists ())
-    real_filename.error = ENOENT;
-  return !real_filename.error;
-}
-
 static bool
 get_full_path_of_dll (const char* str, path_conv &real_filename)
 {
@@ -63,38 +41,50 @@ get_full_path_of_dll (const char* str, path_conv &real_filename)
       return false;		/* Yes.  Let caller deal with it. */
     }
 
-  tmp_pathbuf tp;
-  char *name = tp.c_get ();
+  pathfinder::basenamelist basenames;
 
-  strcpy (name, str);	/* Put it somewhere where we can manipulate it. */
+  const char *basename = strrchr (str, '/');
+  basename = basename ? basename + 1 : str;
 
-  char *basename = strrchr (name, '/');
-  basename = basename ? basename + 1 : name;
-  char *suffix = strrchr (name, '.');
-  if (suffix && suffix < basename)
-    suffix = NULL;
+  int baselen = str + len - basename;
+  const char *suffix = strrchr (basename, '.');
 
+  char const * ext = "";
+  int extlen = 0;
   /* Is suffix ".so"? */
   if (suffix && !strcmp (suffix, ".so"))
     {
       /* Does the file exist? */
-      if (gfpod_helper (name, real_filename))
-	return true;
+      basenames.appendv (basename, baselen, NULL);
       /* No, replace ".so" with ".dll". */
-      strcpy (suffix, ".dll");
+      baselen -= 3;
+      ext = ".dll";
+      extlen = 4;
     }
   /* Does the filename start with "lib"? */
   if (!strncmp (basename, "lib", 3))
     {
       /* Yes, replace "lib" with "cyg". */
-      strncpy (basename, "cyg", 3);
-      /* Does the file exist? */
-      if (gfpod_helper (name, real_filename))
-	return true;
+      basenames.appendv ("cyg", 3, basename+3, baselen-3, ext, extlen, NULL);
       /* No, revert back to "lib". */
-      strncpy (basename, "lib", 3);
     }
-  if (gfpod_helper (name, real_filename))
+  basenames.appendv (basename, baselen, ext, extlen, NULL);
+
+  pathfinder finder (basenames);
+
+  if (basename > str)
+    finder.add_searchdir (str, basename - 1 - str);
+  else
+    {
+      /* NOTE: The Windows loader (for linked dlls) does
+	 not use the LD_LIBRARY_PATH environment variable. */
+      finder.add_envsearchpath ("LD_LIBRARY_PATH");
+
+      /* Finally we better have some fallback. */
+      finder.add_searchdir ("/usr/lib", -1);
+    }
+
+  if (finder.check_path_access (real_filename))
     return true;
 
   /* If nothing worked, create a relative path from the original incoming
@@ -113,6 +103,7 @@ dlopen (const char *name, int flags)
 {
   void *ret = NULL;
 
+  debug_printf ("flags %d for %s", flags, name);
   if (name == NULL)
     {
       ret = (void *) GetModuleHandle (NULL); /* handle for the current module */
diff --git a/winsup/cygwin/pathfinder.h b/winsup/cygwin/pathfinder.h
new file mode 100644
index 0000000..3453692
--- /dev/null
+++ b/winsup/cygwin/pathfinder.h
@@ -0,0 +1,112 @@
+/* pathfinder.h: find one of multiple file names in path
+
+   Copyright 2016 Red Hat, Inc.
+
+This file is part of Cygwin.
+
+This software is a copyrighted work licensed under the terms of the
+Cygwin license.  Please consult the file "CYGWIN_LICENSE" for
+details. */
+
+#include "vstrlist.h"
+
+#ifdef __cplusplus
+
+class pathfinder
+{
+public:
+  typedef vstrlist basenamelist;
+
+private:
+  pathfinder ();
+  pathfinder (pathfinder const &);
+  pathfinder & operator = (pathfinder const &);
+
+  basenamelist basenames_;
+  size_t basenames_maxlen_;
+
+  typedef vstrlist searchbufferlist;
+  searchbufferlist searchbuffers_;
+
+public:
+  ~pathfinder () {}
+
+  /* We need the basenames to search for first, to allow for optimized
+     memory allocation of each searchpath + basename combination.
+     The incoming list of basenames is emptied (ownership take over). */
+  pathfinder (basenamelist & basenames)
+    : basenames_ ()
+    , basenames_maxlen_ ()
+    , searchbuffers_()
+  {
+    basenames.swap(basenames_);
+
+    for (basenamelist::iterator basename = basenames_.begin ();
+	 basename != basenames_.end ();
+	 ++ basename)
+      {
+        if (basenames_maxlen_ < basename->stringlength ())
+	  basenames_maxlen_ = basename->stringlength ();
+      }
+  }
+
+  void add_searchdir (const char *dir, int dirlen)
+  {
+      if (dirlen < 0)
+        dirlen = strlen (dir);
+
+      if (!dirlen)
+        return;
+
+      /* Search "x/bin:x/lib" for "x/lib" */
+      if (dirlen >=4 && !strncmp (dir + dirlen - 4, "/lib", 4))
+	/* prealloc buffer in searchdir for any basename we will search for */
+	searchbuffers_.appendv (dir, dirlen - 4, "/bin", 4, "/", 1 + basenames_maxlen_, NULL);
+
+      /* prealloc buffer in searchdir for any basename we will search for */
+      searchbuffers_.appendv (dir, dirlen, "/", 1 + basenames_maxlen_, NULL);
+  }
+
+  void add_searchpath (const char *path)
+  {
+    while (path && *path)
+      {
+        const char *next = strchr (path, ':');
+        add_searchdir (path, next ? next - path : -1);
+	path = next ? next + 1 : next;
+      }
+  }
+
+  void add_envsearchpath (const char *envpath)
+    {
+      add_searchpath (getenv (envpath));
+    }
+
+  /* Within each searchdir registered, try each registered basename to
+     find as executable.  Returns found dir/basename in real_filename.
+     Returns true when found. */
+  bool check_path_access (path_conv& real_filename)
+  {
+    for (searchbufferlist::iterator dir = searchbuffers_.begin ();
+	 dir != searchbuffers_.end ();
+	 ++dir)
+      for (basenamelist::iterator name = basenames_.begin ();
+	   name != basenames_.end ();
+	   ++name)
+	{
+	  /* complete the filename path to search for */
+	  memcpy (dir->buffer () + dir->stringlength (), name->string (), name->stringlength () + 1);
+	  debug_printf ("trying %s", dir->buffer ());
+	  real_filename.check (dir->string (), PC_SYM_FOLLOW | PC_POSIX);
+	  if (real_filename.exists () && !real_filename.isdir ())
+	    {
+	      debug_printf (" found %s", dir->buffer ());
+	      return true;
+	    }
+	}
+    real_filename.error = ENOENT;
+    return !real_filename.error;
+  }
+};
+
+#endif /* __cplusplus */
diff --git a/winsup/cygwin/vstrlist.h b/winsup/cygwin/vstrlist.h
new file mode 100644
index 0000000..ecbdc64
--- /dev/null
+++ b/winsup/cygwin/vstrlist.h
@@ -0,0 +1,253 @@
+/* vstrlist.h: vstrlist
+
+   Copyright 2016 Red Hat, Inc.
+
+This file is part of Cygwin.
+
+This software is a copyrighted work licensed under the terms of the
+Cygwin license.  Please consult the file "CYGWIN_LICENSE" for
+details. */
+
+#ifdef __cplusplus
+
+class vstrlist
+{
+public:
+  class member
+  {
+    friend class vstrlist;
+    friend class iterator;
+
+    member * prev_;
+    member * next_;
+    size_t   bufferlength_;
+    size_t   stringlength_;
+    char     buffer_[1]; /* we always have space for a trailing zero */
+
+    /* no copy */
+    member (member const &);
+    member & operator = (member const &);
+
+    /* anchor */
+    void * operator new (size_t class_size)
+    {
+      return cmalloc_abort (HEAP_STR, class_size);
+    }
+
+    /* anchor */
+    member ()
+      : prev_ (this)
+      , next_ (this)
+      , bufferlength_ (0)
+      , stringlength_ (0)
+      , buffer_ ()
+    {}
+
+    /* entry: determine memory size from args */
+    void * operator new (size_t class_size, char const * part0, va_list parts)
+    {
+      char const * part = part0;
+      va_list partsdup;
+      va_copy (partsdup, parts);
+      size_t bufferlength = 0;
+      while (part)
+	{
+	  int partlen = va_arg (partsdup, int);
+	  if (partlen < 0)
+	    partlen = strlen (part);
+	  bufferlength += partlen;
+	  part = va_arg (partsdup, char const *);
+	}
+      va_end (partsdup);
+
+      return cmalloc_abort (HEAP_STR, class_size + bufferlength);
+    }
+
+    /* entry: instantly insert into list */
+    member (member * before, char const * part0, va_list parts)
+      : prev_ (before->prev_)
+      , next_ (before)
+      , bufferlength_ (0)
+      , stringlength_ ()
+      , buffer_ ()
+    {
+      prev_->next_ = this;
+      next_->prev_ = this;
+
+      char * dest = buffer_;
+      char const * part = part0;
+      va_list partsdup;
+      va_copy (partsdup, parts);
+      while (part)
+	{
+	  int partlen = va_arg (partsdup, int);
+	  if (partlen < 0)
+	    {
+	      char * old = dest;
+	      dest = stpcpy (old, part);
+	      partlen = dest - old;
+	    }
+	  else
+	    dest = stpncpy (dest, part, partlen);
+	  bufferlength_ += partlen;
+	  part = va_arg (partsdup, const char *);
+	}
+      va_end (partsdup);
+      *dest = (char)0;
+      stringlength_ = dest - buffer_;
+    }
+
+    void operator delete (void * p)
+    {
+      cfree (p);
+    }
+
+    /* remove entry from list */
+    ~member ()
+    {
+      member * next = next_;
+      member * prev = prev_;
+      next->prev_ = prev;
+      prev->next_ = next;
+      prev_ = NULL;
+      next_ = NULL;
+    }
+
+  public:
+    member const * next () const { return next_; }
+    member       * next ()       { return next_; }
+    member const * prev () const { return next_; }
+    member       * prev ()       { return next_; }
+
+    /* always is a readonly string */
+    char const * string () const { return buffer_; }
+    size_t stringlength () const { return stringlength_; }
+
+    /* give write access to the buffer when writeable */
+    char       * buffer ()       { return buffer_; }
+    size_t bufferlength ()       { return bufferlength_; }
+  };
+
+  class iterator
+  {
+    friend class vstrlist;
+
+    member * current_;
+
+    iterator ();
+
+    iterator (member * current)
+      : current_ (current)
+    {}
+
+  public:
+    iterator (iterator const & rhs)
+      : current_ (rhs.current_)
+    {}
+
+    iterator & operator = (iterator const & rhs)
+    {
+      current_ = rhs.current_;
+      return *this;
+    }
+
+    iterator & operator ++ ()
+    {
+      current_ = current_->next ();
+      return *this;
+    }
+
+    iterator operator ++ (int)
+    {
+      iterator ret (*this);
+      current_ = current_->next ();
+      return ret;
+    }
+
+    iterator & operator -- ()
+    {
+      current_ = current_->prev ();
+      return *this;
+    }
+
+    iterator operator -- (int)
+    {
+      iterator ret (*this);
+      current_ = current_->prev ();
+      return ret;
+    }
+
+    bool operator == (iterator const & rhs) const
+    {
+      return current_ == rhs.current_;
+    }
+
+    bool operator != (iterator const & rhs) const
+    {
+      return current_ != rhs.current_;
+    }
+
+    member const & operator *  () const { return *current_; }
+    member       & operator *  ()       { return *current_; }
+    member const * operator -> () const { return  current_; }
+    member       * operator -> ()       { return  current_; }
+
+    void remove ()
+    {
+      member * old = current_;
+      ++ *this;
+      delete old;
+    }
+  };
+
+private:
+  member * anchor_;
+
+  /* no copy */
+  vstrlist (vstrlist const &);
+  vstrlist & operator = (vstrlist const &);
+
+public:
+  iterator  begin () { return iterator (anchor_->next ()); }
+  iterator  end   () { return iterator (anchor_         ); }
+  iterator rbegin () { return iterator (anchor_->prev ()); }
+  iterator rend   () { return iterator (anchor_         ); }
+
+  vstrlist ()
+    : anchor_ (new member ())
+  {}
+
+  ~vstrlist ()
+  {
+    for (iterator it = begin (); it != end (); it.remove ());
+    delete anchor_;
+  }
+
+  void swap (vstrlist & that)
+  {
+    member * old = anchor_;
+    anchor_ = that.anchor_;
+    that.anchor_ = old;
+  }
+
+  member * appendv (char const * part0, va_list parts)
+  {
+    return new (part0, parts) member (anchor_, part0, parts);
+  }
+
+  member * appendv (char const * part0, ...)
+  {
+    va_list parts;
+    va_start (parts, part0);
+    member * ret = appendv (part0, parts);
+    va_end (parts);
+    return ret;
+  }
+
+  member * append (char const * part)
+  {
+    return appendv (part, strlen(part));
+  }
+};
+
+#endif /* __cplusplus */
-- 
2.8.3


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: About the dll search algorithm of dlopen (patch-r2)
  2016-08-22 16:02               ` About the dll search algorithm of dlopen (patch-r2) Michael Haubenwallner
@ 2016-08-22 18:37                 ` Corinna Vinschen
  2016-08-23  8:16                   ` Corinna Vinschen
  2016-08-25 17:48                   ` About the dll search algorithm of dlopen (patch-r3) Michael Haubenwallner
  0 siblings, 2 replies; 32+ messages in thread
From: Corinna Vinschen @ 2016-08-22 18:37 UTC (permalink / raw)
  To: cygwin-developers

[-- Attachment #1: Type: text/plain, Size: 4041 bytes --]

Hi Michael,

On Aug 22 18:02, Michael Haubenwallner wrote:
> What about this one: 've dropped any templates now,
> while keeping the basenames logic inside dlopen.

I was going to reply to your other mail but I'll deferred that for now
and looked into your code instead.  I think we can basically go along
with this.  A few points, though.

> +      /* Finally we better have some fallback. */
> +      finder.add_searchdir ("/usr/lib", -1);

You're not adding /usr/bin here because your code contains automatic
code for lib -> bin duplication... see (*) below.

> @@ -113,6 +103,7 @@ dlopen (const char *name, int flags)
>  {
>    void *ret = NULL;
>  
> +  debug_printf ("flags %d for %s", flags, name);

Stray debug_printf?  Otherwise, if you have a reason to see the
flags, please provide an extra patch for this.

> diff --git a/winsup/cygwin/pathfinder.h b/winsup/cygwin/pathfinder.h
> new file mode 100644
> index 0000000..3453692
> --- /dev/null
> +++ b/winsup/cygwin/pathfinder.h
> @@ -0,0 +1,112 @@
> +/* pathfinder.h: find one of multiple file names in path
> +
> +   Copyright 2016 Red Hat, Inc.

Minor point:  Please drop the "Copyright ..., Red Hat" line.  With the
new LGPL copyright regime we removed them.  They are not required and
keeping them in shape is awkward.

> +      /* Search "x/bin:x/lib" for "x/lib" */
> +      if (dirlen >=4 && !strncmp (dir + dirlen - 4, "/lib", 4))
> +	/* prealloc buffer in searchdir for any basename we will search for */
> +	searchbuffers_.appendv (dir, dirlen - 4, "/bin", 4, "/", 1 + basenames_maxlen_, NULL);
> +
> +      /* prealloc buffer in searchdir for any basename we will search for */
> +      searchbuffers_.appendv (dir, dirlen, "/", 1 + basenames_maxlen_, NULL);
> +  }

(*) Yuk!  Do we really, *really* want that?  The redirection from
    /usr/lib to /usr/bin is only done for system libs, and only because
    otherwise we had trouble starting Cygwin from CMD or the Explorer
    GUI "Run..." box.  We can't change this without breaking everything
    since we *do* depend on the Windows loader yet.
    
    However, as long as this is restricted to /usr/lib, /usr/bin, it's a
    closed system under control of "the distro".  If you extend this to
    *any* external path ending in "lib", isn't it inherently dangerous
    to automate this under the hood, without the application's consent?
    Or, FWIW, the user's consent in case of LD_LIBRARY_PATH?

> diff --git a/winsup/cygwin/vstrlist.h b/winsup/cygwin/vstrlist.h
> new file mode 100644
> index 0000000..ecbdc64
> --- /dev/null
> +++ b/winsup/cygwin/vstrlist.h
> [...]
> +    void * operator new (size_t class_size, char const * part0, va_list parts)
> +    {
> +      char const * part = part0;
> +      va_list partsdup;
> +      va_copy (partsdup, parts);
> +      size_t bufferlength = 0;
> +      while (part)
> +	{
> +	  int partlen = va_arg (partsdup, int);
> +	  if (partlen < 0)
> +	    partlen = strlen (part);
> +	  bufferlength += partlen;
> +	  part = va_arg (partsdup, char const *);
> +	}
> +      va_end (partsdup);
> +
> +      return cmalloc_abort (HEAP_STR, class_size + bufferlength);
> +    }

Uhm... I don't think this is correct.  Using cmalloc and friends has a
specific purpose.  Only internal datastructures which are inherited by
child processes via fork/exec are supposed to be allocated on the
cygheap.  The cygheap really shouldn't be used freely for other
purposes.  You could leak it to child processes if another thread forks
during your thread's call to dlopen.  Also, especially on 32 bit the
pressure on the cygheap is not insignificant.  Please use malloc/free
instead.

Other than that your patch looks pretty well to me.  Except, uhm...
maybe you want to comment the new classes with a usage description?


Thanks,
Corinna

-- 
Corinna Vinschen                  Please, send mails regarding Cygwin to
Cygwin Maintainer                 cygwin AT cygwin DOT com
Red Hat

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: About the dll search algorithm of dlopen
  2016-08-22 12:15             ` Michael Haubenwallner
  2016-08-22 12:48               ` Peter Rosin
  2016-08-22 16:02               ` About the dll search algorithm of dlopen (patch-r2) Michael Haubenwallner
@ 2016-08-22 18:49               ` Corinna Vinschen
  2016-08-23  8:55                 ` Michael Haubenwallner
  2 siblings, 1 reply; 32+ messages in thread
From: Corinna Vinschen @ 2016-08-22 18:49 UTC (permalink / raw)
  To: cygwin-developers

[-- Attachment #1: Type: text/plain, Size: 1174 bytes --]

Hi Michael,

On Aug 22 14:15, Michael Haubenwallner wrote:
> For dlopen, it is more important to find the same dll file as would be
> found when the exe was linked against that dll file, rather than using
> the Linux-known algorithm and environment variables - and differ from
> process startup: Both really should result in the same algorithm here,
> even if that means some difference compared to Linux.

IMHO there is no good reason to have a DLL in a 3rd party subdir which
is available as system DLL.  DT_RPATH/DT_RUNPATH semantics are not
created to overload system DLLs, rather to add a safe search path.  Why
would you want to install a system DLL under the same name and thus the
same version in a non- system dir and expect the application to load
that?  Ultimately this is bound to becoming outdated, fail if the
dependencies of the system DLL change, etc.

Even *if* we add the application dir adding it to dlopen should be
configurable, not statically, invisible under the hood.


Corinna

-- 
Corinna Vinschen                  Please, send mails regarding Cygwin to
Cygwin Maintainer                 cygwin AT cygwin DOT com
Red Hat

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: About the dll search algorithm of dlopen (patch-r2)
  2016-08-22 18:37                 ` Corinna Vinschen
@ 2016-08-23  8:16                   ` Corinna Vinschen
  2016-08-25 17:48                   ` About the dll search algorithm of dlopen (patch-r3) Michael Haubenwallner
  1 sibling, 0 replies; 32+ messages in thread
From: Corinna Vinschen @ 2016-08-23  8:16 UTC (permalink / raw)
  To: cygwin-developers

[-- Attachment #1: Type: text/plain, Size: 2462 bytes --]

Hi Michael,

On Aug 22 20:37, Corinna Vinschen wrote:
> On Aug 22 18:02, Michael Haubenwallner wrote:
> > diff --git a/winsup/cygwin/vstrlist.h b/winsup/cygwin/vstrlist.h
> > new file mode 100644
> > index 0000000..ecbdc64
> > --- /dev/null
> > +++ b/winsup/cygwin/vstrlist.h
> > [...]
> > +    void * operator new (size_t class_size, char const * part0, va_list parts)
> > +    {
> > +      char const * part = part0;
> > +      va_list partsdup;
> > +      va_copy (partsdup, parts);
> > +      size_t bufferlength = 0;
> > +      while (part)
> > +	{
> > +	  int partlen = va_arg (partsdup, int);
> > +	  if (partlen < 0)
> > +	    partlen = strlen (part);
> > +	  bufferlength += partlen;
> > +	  part = va_arg (partsdup, char const *);
> > +	}
> > +      va_end (partsdup);
> > +
> > +      return cmalloc_abort (HEAP_STR, class_size + bufferlength);
> > +    }
> 
> Uhm... I don't think this is correct.  Using cmalloc and friends has a
> specific purpose.  Only internal datastructures which are inherited by
> child processes via fork/exec are supposed to be allocated on the
> cygheap.  The cygheap really shouldn't be used freely for other
> purposes.  You could leak it to child processes if another thread forks
> during your thread's call to dlopen.  Also, especially on 32 bit the
> pressure on the cygheap is not insignificant.  Please use malloc/free
> instead.

After a night's sleep something else occured to me.  The old, existing
code didn't have to use {c}malloc/{c}free at all (ignoring called
functions).  By using the pathfinder method in its current state
you now allocate/free multiple blocks, one for each basename and one for
each search dir.  And at the end of the function you free the space
again.

Malloc/free is a costly operation which also requires locking and using
pathfinder as is adds that multiple times for each call of dlopen.
Every time we can get rid of malloc/free is a win.  So, here's a
question: What if you only use a single tmp_pathbuf for the basenames
and a single tmp_pathbuf of 64K for the paths.  That should be
sufficient for this functionality and the necessary buffers are already
allocated.  Alternatively you could use alloca for the basenames, they
are light on space anyway.


Thanks,
Corinna

-- 
Corinna Vinschen                  Please, send mails regarding Cygwin to
Cygwin Maintainer                 cygwin AT cygwin DOT com
Red Hat

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: About the dll search algorithm of dlopen
  2016-08-22 18:49               ` About the dll search algorithm of dlopen Corinna Vinschen
@ 2016-08-23  8:55                 ` Michael Haubenwallner
  0 siblings, 0 replies; 32+ messages in thread
From: Michael Haubenwallner @ 2016-08-23  8:55 UTC (permalink / raw)
  To: cygwin-developers

Hi Corinna,

On 08/22/2016 08:48 PM, Corinna Vinschen wrote:
> On Aug 22 14:15, Michael Haubenwallner wrote:
>> For dlopen, it is more important to find the same dll file as would be
>> found when the exe was linked against that dll file, rather than using
>> the Linux-known algorithm and environment variables - and differ from
>> process startup: Both really should result in the same algorithm here,
>> even if that means some difference compared to Linux.
> 
> IMHO there is no good reason to have a DLL in a 3rd party subdir which
> is available as system DLL.  DT_RPATH/DT_RUNPATH semantics are not
> created to overload system DLLs, rather to add a safe search path.  Why
> would you want to install a system DLL under the same name and thus the
> same version in a non- system dir and expect the application to load
> that?  Ultimately this is bound to becoming outdated, fail if the
> dependencies of the system DLL change, etc.

This highly depends on the definition of which DLLs are part of the
"system", and which DLLs are part of the "application".

Just installing a DLL into /usr/bin does not mean it is a "system DLL"
which an application safely can rely on.

Instead, an application expected to be "stable" has to ship dependencies
along the application itself, as it cannot expect the "system" to provide
the identical (versions of) dependencies the application was regression
tested against.

Agreed, this asks for outdated dependencies. But this is the reason why
the update mechanism of the application also is able to update its own
dependencies - within the application directory.

This allows for always running a regression tested application, which
is expected to not break just because some random DLL is installed or
updated within /usr/bin.

For the definition of "system": As the application is required to run
on lots of different operating systems and architectures, a sensible
definition of "system" that allows for most portability with least
porting effort (code duplication) is to stick with "libc", along the
lines of "POSIX" as close as possible.

> Even *if* we add the application dir adding it to dlopen should be
> configurable, not statically, invisible under the hood.

Independent of application stability and portability: Is there a reason
for dlopen to potentially find different DLLs than process startup does?

Thanks!
/haubi/

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: About the dll search algorithm of dlopen (patch-r3)
  2016-08-22 18:37                 ` Corinna Vinschen
  2016-08-23  8:16                   ` Corinna Vinschen
@ 2016-08-25 17:48                   ` Michael Haubenwallner
  2016-08-26 10:59                     ` Corinna Vinschen
  2016-08-26 12:04                     ` Corinna Vinschen
  1 sibling, 2 replies; 32+ messages in thread
From: Michael Haubenwallner @ 2016-08-25 17:48 UTC (permalink / raw)
  To: cygwin-developers

[-- Attachment #1: Type: text/plain, Size: 4686 bytes --]

Hi Corinna,

On 08/22/2016 08:37 PM, Corinna Vinschen wrote:
> Hi Michael,
> 
> On Aug 22 18:02, Michael Haubenwallner wrote:
>> What about this one: 've dropped any templates now,
>> while keeping the basenames logic inside dlopen.
> 
> I was going to reply to your other mail but I'll deferred that for now
> and looked into your code instead.  I think we can basically go along
> with this.  A few points, though.

besides addressing your concerns I've also split the patch one more time:
1) switch dlopen to pathfinder, not changing search behaviour yet
2) fix pathfinder to search basenames within single dir

> 
>> +      /* Finally we better have some fallback. */
>> +      finder.add_searchdir ("/usr/lib", -1);
> 
> You're not adding /usr/bin here because your code contains automatic
> code for lib -> bin duplication... see (*) below.
> 
>> @@ -113,6 +103,7 @@ dlopen (const char *name, int flags)
>>  {
>>    void *ret = NULL;
>>  
>> +  debug_printf ("flags %d for %s", flags, name);
> 
> Stray debug_printf?  Otherwise, if you have a reason to see the
> flags, please provide an extra patch for this.

've been more after the name passed to dlopen, actually.

> 
>> diff --git a/winsup/cygwin/pathfinder.h b/winsup/cygwin/pathfinder.h
>> new file mode 100644
>> index 0000000..3453692
>> --- /dev/null
>> +++ b/winsup/cygwin/pathfinder.h
>> @@ -0,0 +1,112 @@
>> +/* pathfinder.h: find one of multiple file names in path
>> +
>> +   Copyright 2016 Red Hat, Inc.
> 
> Minor point:  Please drop the "Copyright ..., Red Hat" line.  With the
> new LGPL copyright regime we removed them.  They are not required and
> keeping them in shape is awkward.

ok.

> 
>> +      /* Search "x/bin:x/lib" for "x/lib" */
>> +      if (dirlen >=4 && !strncmp (dir + dirlen - 4, "/lib", 4))
>> +	/* prealloc buffer in searchdir for any basename we will search for */
>> +	searchbuffers_.appendv (dir, dirlen - 4, "/bin", 4, "/", 1 + basenames_maxlen_, NULL);
>> +
>> +      /* prealloc buffer in searchdir for any basename we will search for */
>> +      searchbuffers_.appendv (dir, dirlen, "/", 1 + basenames_maxlen_, NULL);
>> +  }
> 
> (*) Yuk!  Do we really, *really* want that?  The redirection from
>     /usr/lib to /usr/bin is only done for system libs, and only because
>     otherwise we had trouble starting Cygwin from CMD or the Explorer
>     GUI "Run..." box.  We can't change this without breaking everything
>     since we *do* depend on the Windows loader yet.
>     
>     However, as long as this is restricted to /usr/lib, /usr/bin, it's a
>     closed system under control of "the distro".  If you extend this to
>     *any* external path ending in "lib", isn't it inherently dangerous
>     to automate this under the hood, without the application's consent?
>     Or, FWIW, the user's consent in case of LD_LIBRARY_PATH?

've split into add_lib_searchdir (), used for "/usr/lib" only.

Should be a separate patch anyway if really necessary.

> 
>> diff --git a/winsup/cygwin/vstrlist.h b/winsup/cygwin/vstrlist.h
>> new file mode 100644
>> index 0000000..ecbdc64
>> --- /dev/null
>> +++ b/winsup/cygwin/vstrlist.h
>> [...]
>> +    void * operator new (size_t class_size, char const * part0, va_list parts)
>> +    {
>> +      char const * part = part0;
>> +      va_list partsdup;
>> +      va_copy (partsdup, parts);
>> +      size_t bufferlength = 0;
>> +      while (part)
>> +	{
>> +	  int partlen = va_arg (partsdup, int);
>> +	  if (partlen < 0)
>> +	    partlen = strlen (part);
>> +	  bufferlength += partlen;
>> +	  part = va_arg (partsdup, char const *);
>> +	}
>> +      va_end (partsdup);
>> +
>> +      return cmalloc_abort (HEAP_STR, class_size + bufferlength);
>> +    }
> 
> Uhm... I don't think this is correct.  Using cmalloc and friends has a
> specific purpose.  Only internal datastructures which are inherited by
> child processes via fork/exec are supposed to be allocated on the
> cygheap.  The cygheap really shouldn't be used freely for other
> purposes.  You could leak it to child processes if another thread forks
> during your thread's call to dlopen.  Also, especially on 32 bit the
> pressure on the cygheap is not insignificant.  Please use malloc/free
> instead.

Accepted - wasn't aware of cygheap's dedication to fork only.

Using tmp_pathbuf now, wrapped behind some trivial allocator - which
might fit better somewhere else than to dlfcn.cc?

BTW: Is it really intended for tmp_pathbuf to have a single active
instance (per thread) at a time?

> 
> Other than that your patch looks pretty well to me.  Except, uhm...
> maybe you want to comment the new classes with a usage description?

Done (tried at least).

Thanks!
/haubi/

[-- Attachment #2: 0001-dlopen-switch-to-new-pathfinder-class.patch --]
[-- Type: text/x-patch, Size: 20065 bytes --]

From e79e096ad8e2b85f6fb098b4f224db94eb6e406f Mon Sep 17 00:00:00 2001
From: Michael Haubenwallner <michael.haubenwallner@ssi-schaefer.com>
Date: Tue, 31 May 2016 13:09:11 +0000
Subject: [PATCH 1/4] dlopen: switch to new pathfinder class

Instead of find_exec, without changing behaviour use new pathfinder
class with new allocator_interface around tmp_pathbuf and new vstrlist
class.
* pathfinder.h (pathfinder): New file.
* vstrlist.h (allocator_interface, allocated_type, vstrlist): New file.
* dlfcn.cc (get_full_path_of_dll): Switch to new pathfinder class, using
(tmp_pathbuf_allocator): New class.
---
 winsup/cygwin/dlfcn.cc     | 139 ++++++++++++-----
 winsup/cygwin/pathfinder.h | 141 +++++++++++++++++
 winsup/cygwin/vstrlist.h   | 371 +++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 611 insertions(+), 40 deletions(-)
 create mode 100644 winsup/cygwin/pathfinder.h
 create mode 100644 winsup/cygwin/vstrlist.h

diff --git a/winsup/cygwin/dlfcn.cc b/winsup/cygwin/dlfcn.cc
index 255a6d5..5528d35 100644
--- a/winsup/cygwin/dlfcn.cc
+++ b/winsup/cygwin/dlfcn.cc
@@ -20,6 +20,74 @@ details. */
 #include "cygtls.h"
 #include "tls_pbuf.h"
 #include "ntdll.h"
+#include "pathfinder.h"
+
+/* Dumb allocator using memory from tmp_pathbuf.w_get ().
+
+   Does not reuse free'd memory areas.  Instead, memory
+   is released when the tmp_pathbuf goes out of scope.
+
+   ATTENTION: Requesting memory from an instance of tmp_pathbuf breaks
+   when another instance on a newer stack frame has provided memory. */
+class tmp_pathbuf_allocator
+  : public allocator_interface
+{
+  tmp_pathbuf & tp_;
+  union
+    {
+      PWCHAR wideptr;
+      void * voidptr;
+      char * byteptr;
+    }    freemem_;
+  size_t freesize_;
+
+  /* allocator_interface */
+  virtual void * alloc (size_t need)
+  {
+    if (need == 0)
+      need = 1; /* GNU-ish */
+    size_t chunksize = NT_MAX_PATH * sizeof (WCHAR);
+    if (need > chunksize)
+      api_fatal ("temporary buffer too small for %d bytes", need);
+    if (need > freesize_)
+      {
+	/* skip remaining, use next chunk */
+	freemem_.wideptr = tp_.w_get ();
+	freesize_ = chunksize;
+      }
+
+    void * ret = freemem_.voidptr;
+
+    /* adjust remaining, aligned at 8 byte boundary */
+    need = need + 7 - (need - 1) % 8;
+    freemem_.byteptr += need;
+    if (need > freesize_)
+      freesize_ = 0;
+    else
+      freesize_ -= need;
+
+    return ret;
+  }
+
+  /* allocator_interface */
+  virtual void free (void *)
+  {
+    /* no-op: released by tmp_pathbuf at end of scope */
+  }
+
+  tmp_pathbuf_allocator ();
+  tmp_pathbuf_allocator (tmp_pathbuf_allocator const &);
+  tmp_pathbuf_allocator & operator = (tmp_pathbuf_allocator const &);
+
+public:
+  /* use tmp_pathbuf of current stack frame */
+  tmp_pathbuf_allocator (tmp_pathbuf & tp)
+    : allocator_interface ()
+    , tp_ (tp)
+    , freemem_ ()
+    , freesize_ (0)
+  {}
+};
 
 static void
 set_dl_error (const char *str)
@@ -28,29 +96,6 @@ set_dl_error (const char *str)
   _my_tls.locals.dl_error = 1;
 }
 
-/* Look for an executable file given the name and the environment
-   variable to use for searching (eg., PATH); returns the full
-   pathname (static buffer) if found or NULL if not. */
-inline const char *
-check_path_access (const char *mywinenv, const char *name, path_conv& buf)
-{
-  return find_exec (name, buf, mywinenv, FE_NNF | FE_DLL);
-}
-
-/* Search LD_LIBRARY_PATH for dll, if it exists.  Search /usr/bin and /usr/lib
-   by default.  Return valid full path in path_conv real_filename. */
-static inline bool
-gfpod_helper (const char *name, path_conv &real_filename)
-{
-  if (strchr (name, '/'))
-    real_filename.check (name, PC_SYM_FOLLOW | PC_NULLEMPTY);
-  else if (!check_path_access ("LD_LIBRARY_PATH", name, real_filename))
-    check_path_access ("/usr/bin:/usr/lib", name, real_filename);
-  if (!real_filename.exists ())
-    real_filename.error = ENOENT;
-  return !real_filename.error;
-}
-
 static bool
 get_full_path_of_dll (const char* str, path_conv &real_filename)
 {
@@ -63,38 +108,52 @@ get_full_path_of_dll (const char* str, path_conv &real_filename)
       return false;		/* Yes.  Let caller deal with it. */
     }
 
-  tmp_pathbuf tp;
-  char *name = tp.c_get ();
+  tmp_pathbuf tp; /* single one per stack frame */
+  tmp_pathbuf_allocator allocator (tp);
+  pathfinder::basenamelist basenames (allocator);
 
-  strcpy (name, str);	/* Put it somewhere where we can manipulate it. */
+  const char *basename = strrchr (str, '/');
+  basename = basename ? basename + 1 : str;
 
-  char *basename = strrchr (name, '/');
-  basename = basename ? basename + 1 : name;
-  char *suffix = strrchr (name, '.');
-  if (suffix && suffix < basename)
-    suffix = NULL;
+  int baselen = str + len - basename;
+  const char *suffix = strrchr (basename, '.');
 
+  char const * ext = "";
+  int extlen = 0;
   /* Is suffix ".so"? */
   if (suffix && !strcmp (suffix, ".so"))
     {
       /* Does the file exist? */
-      if (gfpod_helper (name, real_filename))
-	return true;
+      basenames.appendv (basename, baselen, NULL);
       /* No, replace ".so" with ".dll". */
-      strcpy (suffix, ".dll");
+      baselen -= 3;
+      ext = ".dll";
+      extlen = 4;
     }
   /* Does the filename start with "lib"? */
   if (!strncmp (basename, "lib", 3))
     {
       /* Yes, replace "lib" with "cyg". */
-      strncpy (basename, "cyg", 3);
-      /* Does the file exist? */
-      if (gfpod_helper (name, real_filename))
-	return true;
+      basenames.appendv ("cyg", 3, basename+3, baselen-3, ext, extlen, NULL);
       /* No, revert back to "lib". */
-      strncpy (basename, "lib", 3);
     }
-  if (gfpod_helper (name, real_filename))
+  basenames.appendv (basename, baselen, ext, extlen, NULL);
+
+  pathfinder finder (allocator, basenames);
+
+  if (basename > str)
+    finder.add_searchdir (str, basename - 1 - str);
+  else
+    {
+      /* NOTE: The Windows loader (for linked dlls) does
+	 not use the LD_LIBRARY_PATH environment variable. */
+      finder.add_envsearchpath ("LD_LIBRARY_PATH");
+
+      /* Finally we better have some fallback. */
+      finder.add_lib_searchdir ("/usr/lib", -1);
+    }
+
+  if (finder.check_path_access (real_filename))
     return true;
 
   /* If nothing worked, create a relative path from the original incoming
diff --git a/winsup/cygwin/pathfinder.h b/winsup/cygwin/pathfinder.h
new file mode 100644
index 0000000..a9c74b3
--- /dev/null
+++ b/winsup/cygwin/pathfinder.h
@@ -0,0 +1,141 @@
+/* pathfinder.h: find one of multiple file names in path list
+
+This file is part of Cygwin.
+
+This software is a copyrighted work licensed under the terms of the
+Cygwin license.  Please consult the file "CYGWIN_LICENSE" for
+details. */
+
+#include "vstrlist.h"
+
+#ifdef __cplusplus
+
+/* Search a list of directory names for first occurrence of a file,
+   which's file name matches one out of a list of file names.  */
+class pathfinder
+{
+public:
+  typedef vstrlist searchdirlist;
+  typedef vstrlist basenamelist;
+
+private:
+  pathfinder ();
+  pathfinder (pathfinder const &);
+  pathfinder & operator = (pathfinder const &);
+
+  basenamelist basenames_;
+  size_t       basenames_maxlen_;
+
+  /* Add to searchdirs_ with extra buffer for any basename we may search for.
+     This is an optimization for the loops in check_path_access method. */
+  searchdirlist searchdirs_;
+
+public:
+  ~pathfinder () {}
+
+  /* We need the basenames to search for first, to allow for optimized
+     memory allocation of each searchpath + longest basename combination.
+     The incoming list of basenames is emptied (ownership take over). */
+  pathfinder (allocator_interface & a, basenamelist & basenames)
+    : basenames_ (a)
+    , basenames_maxlen_ ()
+    , searchdirs_(a)
+  {
+    basenames_.swap(basenames);
+
+    for (basenamelist::iterator basename = basenames_.begin ();
+	 basename != basenames_.end ();
+	 ++ basename)
+      {
+	if (basenames_maxlen_ < basename->stringlength ())
+	  basenames_maxlen_ = basename->stringlength ();
+      }
+  }
+
+  void add_searchdir (const char *dir, int dirlen)
+  {
+      if (dirlen < 0)
+	dirlen = strlen (dir);
+
+      if (!dirlen)
+	return;
+
+      searchdirs_.appendv (dir, dirlen, "/", 1 + basenames_maxlen_, NULL);
+  }
+
+  void add_lib_searchdir (const char *dir, int dirlen)
+  {
+      if (dirlen < 0)
+	dirlen = strlen (dir);
+
+      if (!dirlen)
+	return;
+
+      /* Search "x/bin:x/lib" for "x/lib": As we don't have DT_RUNPATH,
+	 we have the real dll in /bin for library installed in /lib. */
+      if (dirlen >=4 && !strncmp (dir + dirlen - 4, "/lib", 4))
+	searchdirs_.appendv (dir, dirlen - 4, "/bin", 4, "/", 1 + basenames_maxlen_, NULL);
+
+      searchdirs_.appendv (dir, dirlen, "/", 1 + basenames_maxlen_, NULL);
+  }
+
+  void add_searchpath (const char *path)
+  {
+    while (path && *path)
+      {
+	const char *next = strchr (path, ':');
+	add_searchdir (path, next ? next - path : -1);
+	path = next ? next + 1 : next;
+      }
+  }
+
+  void add_lib_searchpath (const char *path)
+  {
+    while (path && *path)
+      {
+	const char *next = strchr (path, ':');
+	add_lib_searchdir (path, next ? next - path : -1);
+	path = next ? next + 1 : next;
+      }
+  }
+
+  void add_envsearchpath (const char *envpath)
+    {
+      add_searchpath (getenv (envpath));
+    }
+
+  void add_lib_envsearchpath (const char *envpath)
+    {
+      add_lib_searchpath (getenv (envpath));
+    }
+
+  /* Within each searchdir registered, try each registered basename to
+     find as executable.  Returns found dir/basename in real_filename.
+     Returns true when found. */
+  bool check_path_access (path_conv& real_filename)
+  {
+    for (basenamelist::iterator name = basenames_.begin ();
+	 name != basenames_.end ();
+	 ++name)
+      for (searchdirlist::buffer_iterator dir(searchdirs_.begin ());
+	   dir != searchdirs_.end ();
+	   ++dir)
+	{
+	  /* Complete the filename path to search for.
+	     We have allocated enough memory above.  */
+	  memcpy (dir->buffer () + dir->stringlength (),
+		  name->string (), name->stringlength () + 1);
+	  real_filename.check (dir->string (), PC_SYM_FOLLOW | PC_POSIX);
+	  if (real_filename.exists () && !real_filename.isdir ())
+	    {
+	      debug_printf ("found %s", dir->buffer ());
+	      return true;
+	    }
+	  debug_printf ("no %s", dir->buffer ());
+	}
+    real_filename.error = ENOENT;
+    return !real_filename.error;
+  }
+};
+
+#endif /* __cplusplus */
diff --git a/winsup/cygwin/vstrlist.h b/winsup/cygwin/vstrlist.h
new file mode 100644
index 0000000..241d8aa
--- /dev/null
+++ b/winsup/cygwin/vstrlist.h
@@ -0,0 +1,371 @@
+/* vstrlist.h: class vstrlist
+
+This file is part of Cygwin.
+
+This software is a copyrighted work licensed under the terms of the
+Cygwin license.  Please consult the file "CYGWIN_LICENSE" for
+details. */
+
+#ifdef __cplusplus
+
+struct allocator_interface
+{
+  virtual void * alloc (size_t) = 0;
+  virtual void free (void *) = 0;
+};
+
+
+/* The allocated_type makes sure to use the free () method of the
+   same allocator_interface than the alloc () method was used of.
+
+   Stores the allocator_interface address before the real object,
+   to hide it from (construction & destruction of) real object.  */
+class allocated_type
+{
+  union allocator_store
+  {
+    allocator_interface * allocator_;
+    char alignment_[8];
+
+    union pointer
+    {
+      void            * vptr;
+      allocator_store * real;
+    };
+  };
+
+public:
+  void * operator new (size_t class_size, allocator_interface & allocator)
+  {
+    allocator_store::pointer astore;
+    astore.vptr = allocator.alloc (sizeof (allocator_store) + class_size);
+    astore.real->allocator_ = &allocator;
+    ++ astore.real;
+    return astore.vptr;
+  }
+
+  void operator delete (void * p)
+  {
+    allocator_store::pointer astore;
+    astore.vptr = p;
+    -- astore.real;
+    astore.real->allocator_->free (astore.vptr);
+  }
+};
+
+
+/* Double linked list of char arrays, each being a string buffer,
+   which's final buffer size and initial string content is defined
+   by a NULL terminated variable argument list of STRING+LEN pairs,
+   where each STRING (up to LEN) is concatenated for the initial
+   string buffer content, and each LEN is added to the final size
+   of the allocated string buffer.
+   If LEN is -1, strlen(STRING) is used for LEN.
+
+   Needs:
+     An implementation of the allocator_interface.
+
+   Provides:
+     iterator:
+       short name for the string_iterator
+     string_iterator:
+       provides readonly access via member methods:
+	 string (): readonly string buffer
+	 stringlength (): length (readonly) of initial string
+     buffer_iterator:
+       extends string_iterator
+       provides writeable access via member methods:
+	 buffer (): writeable string buffer
+	 bufferlength (): length (readonly) of allocated buffer
+
+   Usage sample:
+     char * text = "snipe";
+     vstrlist l;
+     l.appendv (text, 4, text+3, 2, "", 2, NULL);
+     buffer_iterator it (l.begin ());
+     strcpy (it->buffer () + it->stringlength (), "ts");
+     printf ("Sample result is: '%s'", it->string ());
+   Sample result is: 'snippets' */
+class vstrlist
+{
+public:
+  class member
+    : public allocated_type
+  {
+    friend class vstrlist;
+    friend class string_iterator;
+
+    member * prev_;
+    member * next_;
+    size_t   bufferlength_;
+    size_t   stringlength_;
+    char     buffer_[1]; /* we always have space for the trailing zero */
+
+    /* no copy, just swap */
+    member (member const &);
+    member & operator = (member const &);
+
+    /* anchor */
+    void * operator new (size_t class_size, allocator_interface & allocator)
+    {
+      return allocated_type::operator new (class_size, allocator);
+    }
+
+    /* anchor */
+    member ()
+      : allocated_type ()
+      , prev_ (this)
+      , next_ (this)
+      , bufferlength_ (0)
+      , stringlength_ (0)
+      , buffer_ ()
+    {}
+
+    /* entry: determine memory size from args */
+    void * operator new (size_t class_size, allocator_interface & allocator,
+			 char const * part0, va_list parts)
+    {
+      char const * part = part0;
+      va_list partsdup;
+      va_copy (partsdup, parts);
+      while (part)
+	{
+	  int partlen = va_arg (partsdup, int);
+	  if (partlen < 0)
+	    partlen = strlen (part);
+	  class_size += partlen;
+	  part = va_arg (partsdup, char const *);
+	}
+      va_end (partsdup);
+
+      return allocated_type::operator new (class_size, allocator);
+    }
+
+    /* entry: instantly insert into list */
+    member (member * before, char const * part0, va_list parts)
+      : allocated_type ()
+      , prev_ (NULL)
+      , next_ (NULL)
+      , bufferlength_ (0)
+      , stringlength_ ()
+      , buffer_ ()
+    {
+      prev_ = before->prev_;
+      next_ = before;
+
+      prev_->next_ = this;
+      next_->prev_ = this;
+
+      char * dest = buffer_;
+      char const * part = part0;
+      va_list partsdup;
+      va_copy (partsdup, parts);
+      while (part)
+	{
+	  int partlen = va_arg (partsdup, int);
+	  if (partlen < 0)
+	    {
+	      char * old = dest;
+	      dest = stpcpy (old, part);
+	      partlen = dest - old;
+	    }
+	  else
+	    dest = stpncpy (dest, part, partlen);
+	  bufferlength_ += partlen;
+	  part = va_arg (partsdup, const char *);
+	}
+      va_end (partsdup);
+      *dest = (char)0;
+      stringlength_ = dest - buffer_;
+    }
+
+    /* remove entry from list */
+    ~member ()
+    {
+      member * next = next_;
+      member * prev = prev_;
+      if (next)
+	next->prev_ = prev;
+      if (prev)
+	prev->next_ = next;
+      prev_ = NULL;
+      next_ = NULL;
+    }
+
+  public:
+    member const * next () const { return next_; }
+    member       * next ()       { return next_; }
+    member const * prev () const { return next_; }
+    member       * prev ()       { return next_; }
+
+    /* readonly access */
+    char const * string () const { return buffer_; }
+    size_t stringlength () const { return stringlength_; }
+
+    /* writeable access */
+    char       * buffer ()       { return buffer_; }
+    size_t bufferlength ()       { return bufferlength_; }
+  };
+
+  /* readonly access */
+  class string_iterator
+  {
+    friend class vstrlist;
+    friend class buffer_iterator;
+
+    member * current_;
+
+    string_iterator ();
+
+    string_iterator (member * current)
+      : current_ (current)
+    {}
+
+  public:
+    string_iterator (string_iterator const & rhs)
+      : current_ (rhs.current_)
+    {}
+
+    string_iterator & operator = (string_iterator const & rhs)
+    {
+      current_ = rhs.current_;
+      return *this;
+    }
+
+    string_iterator & operator ++ ()
+    {
+      current_ = current_->next ();
+      return *this;
+    }
+
+    string_iterator operator ++ (int)
+    {
+      string_iterator ret (*this);
+      current_ = current_->next ();
+      return ret;
+    }
+
+    string_iterator & operator -- ()
+    {
+      current_ = current_->prev ();
+      return *this;
+    }
+
+    string_iterator operator -- (int)
+    {
+      string_iterator ret (*this);
+      current_ = current_->prev ();
+      return ret;
+    }
+
+    bool operator == (string_iterator const & rhs) const
+    {
+      return current_ == rhs.current_;
+    }
+
+    bool operator != (string_iterator const & rhs) const
+    {
+      return current_ != rhs.current_;
+    }
+
+    /* readonly member access */
+    member const & operator *  () const { return *current_; }
+    member const * operator -> () const { return  current_; }
+
+    void remove ()
+    {
+      member * old = current_;
+      ++ *this;
+      delete old;
+    }
+  };
+
+  /* writeable access */
+  class buffer_iterator
+    : public string_iterator
+  {
+  public:
+    explicit /* can be used with vstrlist.begin () */
+    buffer_iterator (string_iterator const & begin)
+      : string_iterator (begin)
+    {}
+
+    buffer_iterator (buffer_iterator const & rhs)
+      : string_iterator (rhs)
+    {}
+
+    buffer_iterator & operator = (buffer_iterator const & rhs)
+    {
+      string_iterator::operator = (rhs);
+      return *this;
+    }
+
+    /* writeable member access */
+    member & operator *  () const { return *current_; }
+    member * operator -> () const { return  current_; }
+  };
+
+private:
+  allocator_interface & allocator_;
+  member              * anchor_;
+
+  /* not without an allocator */
+  vstrlist ();
+
+  /* no copy, just swap () */
+  vstrlist (vstrlist const &);
+  vstrlist & operator = (vstrlist const &);
+
+public:
+  /* iterator is the string_iterator */
+  typedef class string_iterator iterator;
+
+  iterator  begin () { return iterator (anchor_->next ()); }
+  iterator  end   () { return iterator (anchor_         ); }
+  iterator rbegin () { return iterator (anchor_->prev ()); }
+  iterator rend   () { return iterator (anchor_         ); }
+
+  vstrlist (allocator_interface & a)
+    : allocator_ (a)
+    , anchor_ (NULL) /* exception safety */
+  {
+    anchor_ = new (allocator_) member ();
+  }
+
+  ~vstrlist ()
+  {
+    if (anchor_ != NULL)
+      {
+	for (iterator it = begin (); it != end (); it.remove ());
+	delete anchor_;
+      }
+  }
+
+  void swap (vstrlist & that)
+  {
+    allocator_interface & a = allocator_;
+    member              * m = anchor_;
+    allocator_ = that.allocator_;
+    anchor_    = that.anchor_;
+    that.allocator_ = a;
+    that.anchor_    = m;
+  }
+
+  string_iterator appendv (char const * part0, va_list parts)
+  {
+    member * ret = new (allocator_, part0, parts)
+		       member (anchor_, part0, parts);
+    return string_iterator (ret);
+  }
+
+  string_iterator appendv (char const * part0, ...)
+  {
+    va_list parts;
+    va_start (parts, part0);
+    string_iterator ret = appendv (part0, parts);
+    va_end (parts);
+    return ret;
+  }
+};
+
+#endif /* __cplusplus */
-- 
2.8.3


[-- Attachment #3: 0002-dlopen-pathfinder-try-each-basename-per-dir.patch --]
[-- Type: text/x-patch, Size: 1363 bytes --]

From 33d5cd5cd1dc188662d6cd18b3e8f4d902559bcf Mon Sep 17 00:00:00 2001
From: Michael Haubenwallner <michael.haubenwallner@ssi-schaefer.com>
Date: Thu, 25 Aug 2016 14:14:55 +0200
Subject: [PATCH 2/4] dlopen (pathfinder): try each basename per dir

Rather than searching within all search dirs for one basename,
search for all basenames within one search dir.

pathfinder.h (check_path_access): Interchange dir- and basename-loops.
---
 winsup/cygwin/pathfinder.h | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/winsup/cygwin/pathfinder.h b/winsup/cygwin/pathfinder.h
index a9c74b3..e81d00e 100644
--- a/winsup/cygwin/pathfinder.h
+++ b/winsup/cygwin/pathfinder.h
@@ -114,12 +114,12 @@ public:
      Returns true when found. */
   bool check_path_access (path_conv& real_filename)
   {
-    for (basenamelist::iterator name = basenames_.begin ();
-	 name != basenames_.end ();
-	 ++name)
-      for (searchdirlist::buffer_iterator dir(searchdirs_.begin ());
-	   dir != searchdirs_.end ();
-	   ++dir)
+    for (searchdirlist::buffer_iterator dir(searchdirs_.begin ());
+	 dir != searchdirs_.end ();
+	 ++dir)
+      for (basenamelist::iterator name = basenames_.begin ();
+	   name != basenames_.end ();
+	   ++name)
 	{
 	  /* Complete the filename path to search for.
 	     We have allocated enough memory above.  */
-- 
2.8.3


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: About the dll search algorithm of dlopen (patch-r3)
  2016-08-25 17:48                   ` About the dll search algorithm of dlopen (patch-r3) Michael Haubenwallner
@ 2016-08-26 10:59                     ` Corinna Vinschen
  2016-08-26 11:18                       ` Corinna Vinschen
  2016-08-26 14:08                       ` Michael Haubenwallner
  2016-08-26 12:04                     ` Corinna Vinschen
  1 sibling, 2 replies; 32+ messages in thread
From: Corinna Vinschen @ 2016-08-26 10:59 UTC (permalink / raw)
  To: cygwin-developers

[-- Attachment #1: Type: text/plain, Size: 1768 bytes --]

Hi Michael,

On Aug 25 19:48, Michael Haubenwallner wrote:
> Using tmp_pathbuf now, wrapped behind some trivial allocator - which
> might fit better somewhere else than to dlfcn.cc?
> 
> BTW: Is it really intended for tmp_pathbuf to have a single active
> instance (per thread) at a time?

Well, yes.  tmp_pathbuf is meant to be initialized on function entry
(more or less, depends).  It's supposed to exist only once per frame.
When the frame goes out of scope, the tmp_pathbuf usage counter is
restored to the values of the parent frame.

> +   ATTENTION: Requesting memory from an instance of tmp_pathbuf breaks
> +   when another instance on a newer stack frame has provided memory. */

I don't understand this comment, though.  tmp_pathbuf can be used
multiple times in the same thread stackm see other Cygwin functions.
What you can't do is to call a function, instantiate tmp_pathbuf,
allocate memory in the called function, and then use it in the caller.
Well, you *can* do that, but if you do this more than once, the same
memory region is reused.
That's the whole idea of tmp_pathbuf.
Temporary per-thread memory for the current frame and it's child frames
is allocated.  If a deeper frame using tmp_pathbuf goes out of scope,
the memory is not free'd, but recycled next time temporary memory is
needed.  The buffers are either 32K or 64K to matches the maximum long
path length.  They are now used for any purpose where larger temporary
per-thread memory is needed, but providing temporary long path buffers
without killing the stack was their original purposes.


Corinna

-- 
Corinna Vinschen                  Please, send mails regarding Cygwin to
Cygwin Maintainer                 cygwin AT cygwin DOT com
Red Hat

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: About the dll search algorithm of dlopen (patch-r3)
  2016-08-26 10:59                     ` Corinna Vinschen
@ 2016-08-26 11:18                       ` Corinna Vinschen
  2016-08-26 11:49                         ` Corinna Vinschen
  2016-09-01  8:58                         ` Michael Haubenwallner
  2016-08-26 14:08                       ` Michael Haubenwallner
  1 sibling, 2 replies; 32+ messages in thread
From: Corinna Vinschen @ 2016-08-26 11:18 UTC (permalink / raw)
  To: cygwin-developers

[-- Attachment #1: Type: text/plain, Size: 2517 bytes --]

On Aug 26 12:59, Corinna Vinschen wrote:
> Hi Michael,
> 
> On Aug 25 19:48, Michael Haubenwallner wrote:
> > Using tmp_pathbuf now, wrapped behind some trivial allocator - which
> > might fit better somewhere else than to dlfcn.cc?
> > 
> > BTW: Is it really intended for tmp_pathbuf to have a single active
> > instance (per thread) at a time?
> 
> Well, yes.  tmp_pathbuf is meant to be initialized on function entry
> (more or less, depends).  It's supposed to exist only once per frame.
> When the frame goes out of scope, the tmp_pathbuf usage counter is
> restored to the values of the parent frame.
> 
> > +   ATTENTION: Requesting memory from an instance of tmp_pathbuf breaks
> > +   when another instance on a newer stack frame has provided memory. */
> 
> I don't understand this comment, though.  tmp_pathbuf can be used
> multiple times in the same thread stackm see other Cygwin functions.
> What you can't do is to call a function, instantiate tmp_pathbuf,
> allocate memory in the called function, and then use it in the caller.
> Well, you *can* do that, but if you do this more than once, the same
> memory region is reused.
> That's the whole idea of tmp_pathbuf.
> Temporary per-thread memory for the current frame and it's child frames
> is allocated.  If a deeper frame using tmp_pathbuf goes out of scope,
> the memory is not free'd, but recycled next time temporary memory is
> needed.  The buffers are either 32K or 64K to matches the maximum long
> path length.  They are now used for any purpose where larger temporary
> per-thread memory is needed, but providing temporary long path buffers
> without killing the stack was their original purposes.

So I also don't quite understand splitting off tmp_pathbuf_allocator.

Why didn't you just include a tmp_pathbuf member into class pathfinder,
kind of like this:

  /* pathfinder.h */
  class pathfinder
  {
    tmp_pathbuf tp;
    [...]
  };

  /* dlfcn.cc */
  get_full_path_of_dll()
  {
    pathfinder finder ();	/* Has a tmp_pathbuf member */

    [...]
    finder.add_basename (basename);
    [...]
    finder.add_searchdir (dir, ...);


    return true;		/* finder goes out of scope, so
				   its tmp_pathbuf goes out of scope
				   so tmp_pathbuf::~tmp_pathbuf is
				   called and all is well. */
  }

?


Corinna

-- 
Corinna Vinschen                  Please, send mails regarding Cygwin to
Cygwin Maintainer                 cygwin AT cygwin DOT com
Red Hat

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: About the dll search algorithm of dlopen (patch-r3)
  2016-08-26 11:18                       ` Corinna Vinschen
@ 2016-08-26 11:49                         ` Corinna Vinschen
  2016-08-29 16:50                           ` Michael Haubenwallner
  2016-09-01  8:58                         ` Michael Haubenwallner
  1 sibling, 1 reply; 32+ messages in thread
From: Corinna Vinschen @ 2016-08-26 11:49 UTC (permalink / raw)
  To: cygwin-developers

[-- Attachment #1: Type: text/plain, Size: 921 bytes --]

On Aug 26 13:18, Corinna Vinschen wrote:
> Why didn't you just include a tmp_pathbuf member into class pathfinder,
> kind of like this:
> 
>   /* pathfinder.h */
>   class pathfinder
>   {
>     tmp_pathbuf tp;
>     [...]
>   };
> 
>   /* dlfcn.cc */
>   get_full_path_of_dll()
>   {
>     pathfinder finder ();	/* Has a tmp_pathbuf member */
> 
>     [...]
>     finder.add_basename (basename);
>     [...]
>     finder.add_searchdir (dir, ...);
> 
> 
>     return true;		/* finder goes out of scope, so
> 				   its tmp_pathbuf goes out of scope
> 				   so tmp_pathbuf::~tmp_pathbuf is
> 				   called and all is well. */
>   }
> 
> ?

Apart from these minor bits and pieces, I really like this new
pathfinder class, btw.


Corinna

-- 
Corinna Vinschen                  Please, send mails regarding Cygwin to
Cygwin Maintainer                 cygwin AT cygwin DOT com
Red Hat

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: About the dll search algorithm of dlopen (patch-r3)
  2016-08-25 17:48                   ` About the dll search algorithm of dlopen (patch-r3) Michael Haubenwallner
  2016-08-26 10:59                     ` Corinna Vinschen
@ 2016-08-26 12:04                     ` Corinna Vinschen
  2016-08-29  9:24                       ` Michael Haubenwallner
  1 sibling, 1 reply; 32+ messages in thread
From: Corinna Vinschen @ 2016-08-26 12:04 UTC (permalink / raw)
  To: cygwin-developers

[-- Attachment #1: Type: text/plain, Size: 1508 bytes --]

On Aug 25 19:48, Michael Haubenwallner wrote:
> On 08/22/2016 08:37 PM, Corinna Vinschen wrote:
> > (*) Yuk!  Do we really, *really* want that?  The redirection from
> >     /usr/lib to /usr/bin is only done for system libs, and only because
> >     otherwise we had trouble starting Cygwin from CMD or the Explorer
> >     GUI "Run..." box.  We can't change this without breaking everything
> >     since we *do* depend on the Windows loader yet.
> >     
> >     However, as long as this is restricted to /usr/lib, /usr/bin, it's a
> >     closed system under control of "the distro".  If you extend this to
> >     *any* external path ending in "lib", isn't it inherently dangerous
> >     to automate this under the hood, without the application's consent?
> >     Or, FWIW, the user's consent in case of LD_LIBRARY_PATH?
> 
> 've split into add_lib_searchdir (), used for "/usr/lib" only.

Btw., I just noticed something interesting, independently of your patch.
Consider the file /usr/bin/cygz.dll:

- dlopen (libz.so)            success

- dlopen (/usr/bin/libz.so)   success

- dlopen (/usr/lib/libz.so)   fails

That's pretty clear when looking through the code, but... wouldn't
it make sense to allow that?  If a path is given, and the path points
to /usr/lib, search the file in /usr/bin as well?


Thanks,
Corinna

-- 
Corinna Vinschen                  Please, send mails regarding Cygwin to
Cygwin Maintainer                 cygwin AT cygwin DOT com
Red Hat

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: About the dll search algorithm of dlopen (patch-r3)
  2016-08-26 10:59                     ` Corinna Vinschen
  2016-08-26 11:18                       ` Corinna Vinschen
@ 2016-08-26 14:08                       ` Michael Haubenwallner
  2016-08-30 13:26                         ` Corinna Vinschen
  1 sibling, 1 reply; 32+ messages in thread
From: Michael Haubenwallner @ 2016-08-26 14:08 UTC (permalink / raw)
  To: cygwin-developers

On 08/26/2016 12:59 PM, Corinna Vinschen wrote:
> Hi Michael,
> 
> On Aug 25 19:48, Michael Haubenwallner wrote:
>> Using tmp_pathbuf now, wrapped behind some trivial allocator - which
>> might fit better somewhere else than to dlfcn.cc?
>>
>> BTW: Is it really intended for tmp_pathbuf to have a single active
>> instance (per thread) at a time?
> 
> Well, yes.  tmp_pathbuf is meant to be initialized on function entry
> (more or less, depends).  It's supposed to exist only once per frame.
> When the frame goes out of scope, the tmp_pathbuf usage counter is
> restored to the values of the parent frame.
> 
>> +   ATTENTION: Requesting memory from an instance of tmp_pathbuf breaks
>> +   when another instance on a newer stack frame has provided memory. */
> 
> I don't understand this comment, though.

Problem is that while the second tmp_pathbuf is constructed,
the first tmp_pathbuf must not be asked for another buffer,
because destructing the second tmp_pathbuf will reset the
tls.counter to what it was before constructing the second,
causing the first tmp_pathbuf to return buffers *again* which
it may have returned already while the second one was alive.

I've had something like this scope flow breaking, where pathfinder
used tmp_pathbuf tpF as its own instance, while the local stack
used tmp_pathbuf tpL:

{
  pathfinder finder (w_buf_old=0);   // tls.w_cnt is 0
  finder.add_some_dirs(...);         // tls.w_cnt is 1 now (by tpF)
  {
    tmp_pathbuf tpL (w_buf_old=1);   // tls.w_cnt is 1 still
    finder.add_some_dirs(...);       // tls.w_cnt is 2 now (by tpF)
    PWCHAR exewname = tpL.w_get ();  // tls.w_cnt is 3 now (by tpL)
    GetModuleFileNameW ( exewname );
    finder.add_dir (from exewname);  // tls.w_cnt is 4 now (by tpF)
  } // destruct tpL (w_buf_old==1)   // tls.w_cnt is 1 now (restored by ~tpL)
  finder.add_some_dirs(...);         // tls.w_cnt is 2 now (tpF already returned that above)
  // here the memory provided by tpF since first time tls.w_cnt=2
  // is overwritten due to tpF returning the same buffers again!
}

/haubi/

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: About the dll search algorithm of dlopen (patch-r3)
  2016-08-26 12:04                     ` Corinna Vinschen
@ 2016-08-29  9:24                       ` Michael Haubenwallner
  2016-08-30 13:35                         ` Corinna Vinschen
  0 siblings, 1 reply; 32+ messages in thread
From: Michael Haubenwallner @ 2016-08-29  9:24 UTC (permalink / raw)
  To: cygwin-developers

Hi Corinna,

On 08/26/2016 02:04 PM, Corinna Vinschen wrote:
> On Aug 25 19:48, Michael Haubenwallner wrote:
>> On 08/22/2016 08:37 PM, Corinna Vinschen wrote:
>>> (*) Yuk!  Do we really, *really* want that?  The redirection from
>>>     /usr/lib to /usr/bin is only done for system libs, and only because
>>>     otherwise we had trouble starting Cygwin from CMD or the Explorer
>>>     GUI "Run..." box.  We can't change this without breaking everything
>>>     since we *do* depend on the Windows loader yet.
>>>     
>>>     However, as long as this is restricted to /usr/lib, /usr/bin, it's a
>>>     closed system under control of "the distro".  If you extend this to
>>>     *any* external path ending in "lib", isn't it inherently dangerous
>>>     to automate this under the hood, without the application's consent?
>>>     Or, FWIW, the user's consent in case of LD_LIBRARY_PATH?
>>
>> 've split into add_lib_searchdir (), used for "/usr/lib" only.
> 
> Btw., I just noticed something interesting, independently of your patch.
> Consider the file /usr/bin/cygz.dll:
> 
> - dlopen (libz.so)            success
> 
> - dlopen (/usr/bin/libz.so)   success
> 
> - dlopen (/usr/lib/libz.so)   fails
> 
> That's pretty clear when looking through the code, but... wouldn't
> it make sense to allow that?  If a path is given, and the path points
> to /usr/lib, search the file in /usr/bin as well?

Easy enough - but this should apply to any prefix IMO: While the
application specific prefix often isn't /usr - but something like
/usr/local or /opt/application, application specific libs may be
built & installed with libtool or something similar as well - at
least some tool that knows about installing the real dll into
<app-prefix>/bin (because of the missing Cygwin loader).

But agreed, it makes sense doing /lib->/bin for the explicit path and
the /usr/lib default only and not for the environment-provided paths.

/haubi/

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: About the dll search algorithm of dlopen (patch-r3)
  2016-08-26 11:49                         ` Corinna Vinschen
@ 2016-08-29 16:50                           ` Michael Haubenwallner
  2016-08-30 14:51                             ` Corinna Vinschen
  0 siblings, 1 reply; 32+ messages in thread
From: Michael Haubenwallner @ 2016-08-29 16:50 UTC (permalink / raw)
  To: cygwin-developers

[-- Attachment #1: Type: text/plain, Size: 1825 bytes --]

On 08/26/2016 01:49 PM, Corinna Vinschen wrote:
> On Aug 26 13:18, Corinna Vinschen wrote:
> 
> Apart from these minor bits and pieces, I really like this new
> pathfinder class, btw.

Thank you! :)

While at it: Combining dlopen thoughts with the "forkables" background
around replacing dlls-in-use leads me to this thought for redundant
calls of dlopen:

As we've already agreed that GetModuleHandleEx might make sense,
I'm thinking whether pathfinder could be useful even for calling
GetModuleHandleEx before actually testing for the existing file.

First, we need to agree on which search order is correct for a
specific (redundant) dlopen usage. IMO, this should be for:

dlopen ("libz.so"):
  1) GetModuleHandleEx ("libz.so")
  2) GetModuleHandleEx ("cygz.dll")
  3) GetModuleHandleEx ("libz.dll")
  4) path_conv.exists-based only, as with current pathfinder

dlopen ("/usr/lib/libz.so"):
  1) GetModuleHandleEx ("\winpath\of\usr\bin\libz.so")
  2) GetModuleHandleEx ("\winpath\of\usr\bin\cygz.dll")
  3) GetModuleHandleEx ("\winpath\of\usr\bin\libz.dll")

  4) GetModuleHandleEx ("\winpath\of\usr\lib\libz.so")
  5) GetModuleHandleEx ("\winpath\of\usr\lib\cygz.dll")
  6) GetModuleHandleEx ("\winpath\of\usr\lib\libz.dll")

  7) path_conv.exists ("\winpath\of\usr\bin\libz.so")
  8) path_conv.exists ("\winpath\of\usr\bin\cygz.dll")
  9) path_conv.exists ("\winpath\of\usr\bin\libz.dll")

 10) path_conv.exists ("\winpath\of\usr\lib\libz.so")
 11) path_conv.exists ("\winpath\of\usr\lib\cygz.dll")
 12) path_conv.exists ("\winpath\of\usr\lib\libz.dll")

Thing is that with the latter (predefined path), in a first iteration
we need the same filenames to test for already-loaded as in a second
iteration where we test for real files.

Attaching a patch draft for a pathfinder criterion interface...

Thoughts?

/haubi/

[-- Attachment #2: pathfinder-criteria.patch --]
[-- Type: text/x-patch, Size: 5342 bytes --]

diff --git a/winsup/cygwin/dlfcn.cc b/winsup/cygwin/dlfcn.cc
index 46348bb..c2cd740 100644
--- a/winsup/cygwin/dlfcn.cc
+++ b/winsup/cygwin/dlfcn.cc
@@ -168,7 +168,9 @@ get_full_path_of_dll (const char* str, path_conv &real_filename)
       finder.add_lib_searchdir ("/usr/lib", -1);
     }
 
-  if (finder.check_path_access (real_filename))
+  if (finder.find (pathfinder::
+		   exists_and_is_file (real_filename,
+				       PC_SYM_FOLLOW | PC_POSIX)))
     return true;
 
   /* If nothing worked, create a relative path from the original incoming
diff --git a/winsup/cygwin/pathfinder.h b/winsup/cygwin/pathfinder.h
index 11bb5fd..c7e79eb 100644
--- a/winsup/cygwin/pathfinder.h
+++ b/winsup/cygwin/pathfinder.h
@@ -109,32 +109,130 @@ public:
       add_lib_searchpath (getenv (envpath));
     }
 
-  /* Within each searchdir registered, try each registered basename to
-     find as executable.  Returns found dir/basename in real_filename.
-     Returns true when found. */
-  bool check_path_access (path_conv& real_filename)
+
+  /* pathfinder::criterion_interface
+     Overload this test method when you need separate dir and basename.  */
+  struct criterion_interface
+  {
+    virtual char const * name () const { return NULL; }
+
+    virtual bool test (searchdirlist::iterator dir,
+		       basenamelist::iterator name) const = 0;
+  };
+
+
+  /* pathfinder::simple_criterion_interface
+     Overload this test method when you need a single filename.  */
+  class simple_criterion_interface
+    : public criterion_interface
+  {
+    virtual bool test (searchdirlist::iterator dir,
+		       basenamelist::iterator name) const
+    {
+      /* Complete the filename path to search for within dir,
+	 We have allocated enough memory above.  */
+      searchdirlist::buffer_iterator dirbuf (dir);
+      memcpy (dirbuf->buffer () + dirbuf->stringlength (),
+	      name->string (), name->stringlength () + 1);
+      bool ret = test (dirbuf->string ());
+      /* reset original dir */
+      dirbuf->buffer ()[dirbuf->stringlength ()] = '\0';
+      return ret;
+    }
+
+  public:
+    virtual bool test (const char * filename) const = 0;
+  };
+
+
+  /* pathfinder::path_conv_criterion_interface
+     Overload this test method when you need a path_conv. */
+  class path_conv_criterion_interface
+    : public simple_criterion_interface
+  {
+    path_conv mypc_;
+    path_conv & pc_;
+    unsigned opt_;
+
+    /* simple_criterion_interface */
+    virtual bool test (const char * filename) const
+    {
+      pc_.check (filename, opt_);
+      return test (pc_);
+    }
+
+  public:
+    path_conv_criterion_interface (unsigned opt = PC_SYM_FOLLOW)
+      : mypc_ ()
+      , pc_ (mypc_)
+      , opt_ (opt)
+    {}
+
+    path_conv_criterion_interface (path_conv & ret, unsigned opt = PC_SYM_FOLLOW)
+      : mypc_ ()
+      , pc_ (ret)
+      , opt_ (opt)
+    {}
+
+    virtual bool test (path_conv & pc) const = 0;
+  };
+
+
+  /* pathfinder::exists_and_is_file
+     Test if path_conv argument does exist and is not a directory. */
+  struct exists_and_is_file
+    : public path_conv_criterion_interface
+  {
+    virtual char const * name () const { return "exists and is file"; }
+
+    exists_and_is_file (path_conv & pc, unsigned opt = PC_SYM_FOLLOW)
+      : path_conv_criterion_interface (pc, opt)
+    {}
+
+    /* path_conv_criterion_interface */
+    virtual bool test (path_conv & pc) const
+    {
+      if (pc.exists () && !pc.isdir ())
+	return true;
+
+      pc.error = ENOENT;
+      return false;
+    }
+  };
+
+
+  /* Find the single dir + basename that matches criterion.
+
+     Calls criterion.test method for each registered dir + basename
+     until returning true:
+       Returns true with found_dir + found_basename set.
+     If criterion.test method never returns true:
+       Returns false, not modifying found_dir nor found_basename.  */
+  bool find (criterion_interface const & criterion,
+	     searchdirlist::member const ** found_dir = NULL,
+	     basenamelist::member const ** found_basename = NULL)
   {
-    for (searchdirlist::buffer_iterator dir(searchdirs_.begin ());
+    char const * critname = criterion.name ();
+    for (searchdirlist::iterator dir(searchdirs_.begin ());
 	 dir != searchdirs_.end ();
 	 ++dir)
       for (basenamelist::iterator name = basenames_.begin ();
 	   name != basenames_.end ();
 	   ++name)
-	{
-	  /* Complete the filename path to search for.
-	     We have allocated enough memory above.  */
-	  memcpy (dir->buffer () + dir->stringlength (),
-		  name->string (), name->stringlength () + 1);
-	  real_filename.check (dir->string (), PC_SYM_FOLLOW | PC_POSIX);
-	  if (real_filename.exists () && !real_filename.isdir ())
-	    {
-	      debug_printf ("found %s", dir->buffer ());
-	      return true;
-	    }
-	  debug_printf ("no %s", dir->buffer ());
-	}
-    real_filename.error = ENOENT;
-    return !real_filename.error;
+	if (criterion.test (dir, name))
+	  {
+	    debug_printf ("(%s) take %s%s", critname,
+			  dir->string(), name->string ());
+	    if (found_dir)
+	      *found_dir = dir.operator -> ();
+	    if (found_basename)
+	      *found_basename = name.operator -> ();
+	    return true;
+	  }
+	else
+	  debug_printf ("(%s) skip %s%s", critname,
+			dir->string(), name->string ());
+    return false;
   }
 };
 

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: About the dll search algorithm of dlopen (patch-r3)
  2016-08-26 14:08                       ` Michael Haubenwallner
@ 2016-08-30 13:26                         ` Corinna Vinschen
  0 siblings, 0 replies; 32+ messages in thread
From: Corinna Vinschen @ 2016-08-30 13:26 UTC (permalink / raw)
  To: cygwin-developers

[-- Attachment #1: Type: text/plain, Size: 1720 bytes --]

On Aug 26 16:08, Michael Haubenwallner wrote:
> On 08/26/2016 12:59 PM, Corinna Vinschen wrote:
> > Hi Michael,
> > 
> > On Aug 25 19:48, Michael Haubenwallner wrote:
> >> Using tmp_pathbuf now, wrapped behind some trivial allocator - which
> >> might fit better somewhere else than to dlfcn.cc?
> >>
> >> BTW: Is it really intended for tmp_pathbuf to have a single active
> >> instance (per thread) at a time?
> > 
> > Well, yes.  tmp_pathbuf is meant to be initialized on function entry
> > (more or less, depends).  It's supposed to exist only once per frame.
> > When the frame goes out of scope, the tmp_pathbuf usage counter is
> > restored to the values of the parent frame.
> > 
> >> +   ATTENTION: Requesting memory from an instance of tmp_pathbuf breaks
> >> +   when another instance on a newer stack frame has provided memory. */
> > 
> > I don't understand this comment, though.
> 
> Problem is that while the second tmp_pathbuf is constructed,
> the first tmp_pathbuf must not be asked for another buffer,
> because destructing the second tmp_pathbuf will reset the
> tls.counter to what it was before constructing the second,
> causing the first tmp_pathbuf to return buffers *again* which
> it may have returned already while the second one was alive.
> 
> I've had something like this scope flow breaking, where pathfinder
> used tmp_pathbuf tpF as its own instance, while the local stack
> used tmp_pathbuf tpL:

Yeah, ok, that's not what tmp_pathbuf was designed for, it was strictly
a per-frame thingy.


Corinna

-- 
Corinna Vinschen                  Please, send mails regarding Cygwin to
Cygwin Maintainer                 cygwin AT cygwin DOT com
Red Hat

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: About the dll search algorithm of dlopen (patch-r3)
  2016-08-29  9:24                       ` Michael Haubenwallner
@ 2016-08-30 13:35                         ` Corinna Vinschen
  2016-08-30 16:41                           ` Michael Haubenwallner
  0 siblings, 1 reply; 32+ messages in thread
From: Corinna Vinschen @ 2016-08-30 13:35 UTC (permalink / raw)
  To: cygwin-developers

[-- Attachment #1: Type: text/plain, Size: 2609 bytes --]

On Aug 29 11:24, Michael Haubenwallner wrote:
> Hi Corinna,
> 
> On 08/26/2016 02:04 PM, Corinna Vinschen wrote:
> > On Aug 25 19:48, Michael Haubenwallner wrote:
> >> On 08/22/2016 08:37 PM, Corinna Vinschen wrote:
> >>> (*) Yuk!  Do we really, *really* want that?  The redirection from
> >>>     /usr/lib to /usr/bin is only done for system libs, and only because
> >>>     otherwise we had trouble starting Cygwin from CMD or the Explorer
> >>>     GUI "Run..." box.  We can't change this without breaking everything
> >>>     since we *do* depend on the Windows loader yet.
> >>>     
> >>>     However, as long as this is restricted to /usr/lib, /usr/bin, it's a
> >>>     closed system under control of "the distro".  If you extend this to
> >>>     *any* external path ending in "lib", isn't it inherently dangerous
> >>>     to automate this under the hood, without the application's consent?
> >>>     Or, FWIW, the user's consent in case of LD_LIBRARY_PATH?
> >>
> >> 've split into add_lib_searchdir (), used for "/usr/lib" only.
> > 
> > Btw., I just noticed something interesting, independently of your patch.
> > Consider the file /usr/bin/cygz.dll:
> > 
> > - dlopen (libz.so)            success
> > 
> > - dlopen (/usr/bin/libz.so)   success
> > 
> > - dlopen (/usr/lib/libz.so)   fails
> > 
> > That's pretty clear when looking through the code, but... wouldn't
> > it make sense to allow that?  If a path is given, and the path points
> > to /usr/lib, search the file in /usr/bin as well?
> 
> Easy enough - but this should apply to any prefix IMO: While the
> application specific prefix often isn't /usr - but something like
> /usr/local or /opt/application, application specific libs may be
> built & installed with libtool or something similar as well - at
> least some tool that knows about installing the real dll into
> <app-prefix>/bin (because of the missing Cygwin loader).

You have a point there.

> But agreed, it makes sense doing /lib->/bin for the explicit path and
> the /usr/lib default only and not for the environment-provided paths.

It feels certainly more safe to restrict this to the system path for
now.  But... yeah, you have a point.

Not well thought out, just an idea kicking around:

Apart from the obvious system path handling, what if other lib->bin
transitions only take place if the calling application is installed
in that very bin dir...?


Corinna

-- 
Corinna Vinschen                  Please, send mails regarding Cygwin to
Cygwin Maintainer                 cygwin AT cygwin DOT com
Red Hat

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: About the dll search algorithm of dlopen (patch-r3)
  2016-08-29 16:50                           ` Michael Haubenwallner
@ 2016-08-30 14:51                             ` Corinna Vinschen
  2016-08-30 15:57                               ` Michael Haubenwallner
  0 siblings, 1 reply; 32+ messages in thread
From: Corinna Vinschen @ 2016-08-30 14:51 UTC (permalink / raw)
  To: cygwin-developers

[-- Attachment #1: Type: text/plain, Size: 1200 bytes --]

On Aug 29 18:50, Michael Haubenwallner wrote:
> On 08/26/2016 01:49 PM, Corinna Vinschen wrote:
> > On Aug 26 13:18, Corinna Vinschen wrote:
> > 
> > Apart from these minor bits and pieces, I really like this new
> > pathfinder class, btw.
> 
> Thank you! :)
> 
> While at it: Combining dlopen thoughts with the "forkables" background
> around replacing dlls-in-use leads me to this thought for redundant
> calls of dlopen:
> 
> As we've already agreed that GetModuleHandleEx might make sense,

Erm... where exactly did we do that?  I recall I mentioned using
LoadLibraryEx with LOAD_LIBRARY_SEARCH_USER_DIRS and the AddDllDirectory
functions, but I don't think we talked about GetModuleHandleEx.

The question is if it really makes sense to add calls to
GetModuleHandleEx.  Apart from slowing down the dlopen call, it seems
unnecessary.  Assuming you call LoadLibrary with the same paths and same
extensions in the same order, wouldn't a second call to dlopen have the
exact same result (loading the same file)?


Corinna

-- 
Corinna Vinschen                  Please, send mails regarding Cygwin to
Cygwin Maintainer                 cygwin AT cygwin DOT com
Red Hat

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: About the dll search algorithm of dlopen (patch-r3)
  2016-08-30 14:51                             ` Corinna Vinschen
@ 2016-08-30 15:57                               ` Michael Haubenwallner
  2016-08-31 18:48                                 ` Corinna Vinschen
  0 siblings, 1 reply; 32+ messages in thread
From: Michael Haubenwallner @ 2016-08-30 15:57 UTC (permalink / raw)
  To: cygwin-developers

Hi Corinna,

On 08/30/2016 04:51 PM, Corinna Vinschen wrote:
> On Aug 29 18:50, Michael Haubenwallner wrote:
>> On 08/26/2016 01:49 PM, Corinna Vinschen wrote:
>>> On Aug 26 13:18, Corinna Vinschen wrote:
>>>
>>> Apart from these minor bits and pieces, I really like this new
>>> pathfinder class, btw.
>>
>> Thank you! :)
>>
>> While at it: Combining dlopen thoughts with the "forkables" background
>> around replacing dlls-in-use leads me to this thought for redundant
>> calls of dlopen:
>>
>> As we've already agreed that GetModuleHandleEx might make sense,
> 
> Erm... where exactly did we do that?  I recall I mentioned using
> LoadLibraryEx with LOAD_LIBRARY_SEARCH_USER_DIRS and the AddDllDirectory
> functions, but I don't think we talked about GetModuleHandleEx.

Indeed not for dlopen calls including a path - but for without a path:
https://sourceware.org/ml/cygwin-developers/2016-06/msg00003.html

> The question is if it really makes sense to add calls to
> GetModuleHandleEx.  Apart from slowing down the dlopen call, it seems
> unnecessary.  Assuming you call LoadLibrary with the same paths and same
> extensions in the same order, wouldn't a second call to dlopen have the
> exact same result (loading the same file)?

Sure - as long as an already loaded dll is not replaced by an updated one.
Before I'll gonna try: Do you know by chance what LoadLibrary does here?

Artifical corner case LoadLibrary will never handle:
* 1st dlopen("libN.so") loads "libN.dll"
* same program installs newer package N, now providing "cygN.dll"
* 2nd dlopen("libN.so") loads "cygX.dll": expected is old "libN.dll" handle
This is true for both dlopen("libN.so") and dlopen("/path/lib/libN.so").

However, for perfomance reasons I can imagine some shared flag for both
dlopen and fork (others?): "support updating running binaries (NTFS only)",
to avoid the extra GetModuleHandleEx calls in dlopen by default.

/haubi/

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: About the dll search algorithm of dlopen (patch-r3)
  2016-08-30 13:35                         ` Corinna Vinschen
@ 2016-08-30 16:41                           ` Michael Haubenwallner
  2016-08-31 18:43                             ` Corinna Vinschen
  0 siblings, 1 reply; 32+ messages in thread
From: Michael Haubenwallner @ 2016-08-30 16:41 UTC (permalink / raw)
  To: cygwin-developers

Hi Corinna,

On 08/30/2016 03:35 PM, Corinna Vinschen wrote:
> On Aug 29 11:24, Michael Haubenwallner wrote:
>> On 08/26/2016 02:04 PM, Corinna Vinschen wrote:
>>> On Aug 25 19:48, Michael Haubenwallner wrote:
>>>> On 08/22/2016 08:37 PM, Corinna Vinschen wrote:
>>>>> (*) Yuk!  Do we really, *really* want that?  The redirection from
>>>>>     /usr/lib to /usr/bin is only done for system libs, and only because
>>>>>     otherwise we had trouble starting Cygwin from CMD or the Explorer
>>>>>     GUI "Run..." box.  We can't change this without breaking everything
>>>>>     since we *do* depend on the Windows loader yet.
>>>>>     
>>>>>     However, as long as this is restricted to /usr/lib, /usr/bin, it's a
>>>>>     closed system under control of "the distro".  If you extend this to
>>>>>     *any* external path ending in "lib", isn't it inherently dangerous
>>>>>     to automate this under the hood, without the application's consent?
>>>>>     Or, FWIW, the user's consent in case of LD_LIBRARY_PATH?
>>>>
>>>> 've split into add_lib_searchdir (), used for "/usr/lib" only.
>>>
>>> Btw., I just noticed something interesting, independently of your patch.
>>> Consider the file /usr/bin/cygz.dll:
>>>
>>> - dlopen (libz.so)            success

This one succeeds because of /usr/bin being the fallback path, but ...

>>>
>>> - dlopen (/usr/bin/libz.so)   success
>>>
>>> - dlopen (/usr/lib/libz.so)   fails
>>>
>>> That's pretty clear when looking through the code, but... wouldn't
>>> it make sense to allow that?  If a path is given, and the path points
>>> to /usr/lib, search the file in /usr/bin as well?
>>
>> Easy enough - but this should apply to any prefix IMO: While the
>> application specific prefix often isn't /usr - but something like
>> /usr/local or /opt/application, application specific libs may be
>> built & installed with libtool or something similar as well - at
>> least some tool that knows about installing the real dll into
>> <app-prefix>/bin (because of the missing Cygwin loader).
> 
> You have a point there.

Thanks!

... I forgot about dlopen("libAPP.so") (without path): This I expect
to find <app-prefix>/bin/cygAPP.dll - which is the application dir.

>> But agreed, it makes sense doing /lib->/bin for the explicit path and
>> the /usr/lib default only and not for the environment-provided paths.
> 
> It feels certainly more safe to restrict this to the system path for
> now.  But... yeah, you have a point.
> 
> Not well thought out, just an idea kicking around:
> 
> Apart from the obvious system path handling, what if other lib->bin
> transitions only take place if the calling application is installed
> in that very bin dir...?

Interesting idea - might work indeed! Even for prefix=/usr, to
have consistent behaviour across different application prefixes.

For safety regarding the application dir: If one can write to the
application dir, couldn't one put a malicious kernel32.dll there
as well, and/or an empty application.exe.local for dll redirection?

/haubi/

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: About the dll search algorithm of dlopen (patch-r3)
  2016-08-30 16:41                           ` Michael Haubenwallner
@ 2016-08-31 18:43                             ` Corinna Vinschen
  0 siblings, 0 replies; 32+ messages in thread
From: Corinna Vinschen @ 2016-08-31 18:43 UTC (permalink / raw)
  To: cygwin-developers

[-- Attachment #1: Type: text/plain, Size: 1187 bytes --]

Hi Michael,

On Aug 30 18:41, Michael Haubenwallner wrote:
> On 08/30/2016 03:35 PM, Corinna Vinschen wrote:
> > Not well thought out, just an idea kicking around:
> > 
> > Apart from the obvious system path handling, what if other lib->bin
> > transitions only take place if the calling application is installed
> > in that very bin dir...?
> 
> Interesting idea - might work indeed! Even for prefix=/usr, to
> have consistent behaviour across different application prefixes.

Actually, no.  This test is not ok for the system DLL path, because
system DLLs are expected to exist for all applications, even those
not installed in a a system path itself.

> For safety regarding the application dir: If one can write to the
> application dir, couldn't one put a malicious kernel32.dll there
> as well, and/or an empty application.exe.local for dll redirection?

I'm not overly fluent with the .local stuff, but the kernel32.dll thingy
should work as desired since kernel32.dll is one of the KnownDLLs.


Corinna

-- 
Corinna Vinschen                  Please, send mails regarding Cygwin to
Cygwin Maintainer                 cygwin AT cygwin DOT com
Red Hat

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: About the dll search algorithm of dlopen (patch-r3)
  2016-08-30 15:57                               ` Michael Haubenwallner
@ 2016-08-31 18:48                                 ` Corinna Vinschen
  0 siblings, 0 replies; 32+ messages in thread
From: Corinna Vinschen @ 2016-08-31 18:48 UTC (permalink / raw)
  To: cygwin-developers

[-- Attachment #1: Type: text/plain, Size: 1921 bytes --]

On Aug 30 17:56, Michael Haubenwallner wrote:
> Hi Corinna,
> 
> On 08/30/2016 04:51 PM, Corinna Vinschen wrote:
> > On Aug 29 18:50, Michael Haubenwallner wrote:
> >> On 08/26/2016 01:49 PM, Corinna Vinschen wrote:
> >>> On Aug 26 13:18, Corinna Vinschen wrote:
> >>>
> >>> Apart from these minor bits and pieces, I really like this new
> >>> pathfinder class, btw.
> >>
> >> Thank you! :)
> >>
> >> While at it: Combining dlopen thoughts with the "forkables" background
> >> around replacing dlls-in-use leads me to this thought for redundant
> >> calls of dlopen:
> >>
> >> As we've already agreed that GetModuleHandleEx might make sense,
> > 
> > Erm... where exactly did we do that?  I recall I mentioned using
> > LoadLibraryEx with LOAD_LIBRARY_SEARCH_USER_DIRS and the AddDllDirectory
> > functions, but I don't think we talked about GetModuleHandleEx.
> 
> Indeed not for dlopen calls including a path - but for without a path:
> https://sourceware.org/ml/cygwin-developers/2016-06/msg00003.html
> 
> > The question is if it really makes sense to add calls to
> > GetModuleHandleEx.  Apart from slowing down the dlopen call, it seems
> > unnecessary.  Assuming you call LoadLibrary with the same paths and same
> > extensions in the same order, wouldn't a second call to dlopen have the
> > exact same result (loading the same file)?
> 
> Sure - as long as an already loaded dll is not replaced by an updated one.
> Before I'll gonna try: Do you know by chance what LoadLibrary does here?

No, I don't know exactly, but in theory it should try to match
the incoming filename against the filename stored in the loader
list attached to the PEB.  This would even work if the DLL has been
renamed in the meantime.


Corinna

-- 
Corinna Vinschen                  Please, send mails regarding Cygwin to
Cygwin Maintainer                 cygwin AT cygwin DOT com
Red Hat

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: About the dll search algorithm of dlopen (patch-r3)
  2016-08-26 11:18                       ` Corinna Vinschen
  2016-08-26 11:49                         ` Corinna Vinschen
@ 2016-09-01  8:58                         ` Michael Haubenwallner
  1 sibling, 0 replies; 32+ messages in thread
From: Michael Haubenwallner @ 2016-09-01  8:58 UTC (permalink / raw)
  To: cygwin-developers

Hi Corinna,

On 08/26/2016 01:18 PM, Corinna Vinschen wrote:
> On Aug 26 12:59, Corinna Vinschen wrote:
>> On Aug 25 19:48, Michael Haubenwallner wrote:
>>> Using tmp_pathbuf now, wrapped behind some trivial allocator - which
>>> might fit better somewhere else than to dlfcn.cc?
>>>
>>> BTW: Is it really intended for tmp_pathbuf to have a single active
>>> instance (per thread) at a time?
>>
>> Well, yes.  tmp_pathbuf is meant to be initialized on function entry
>> (more or less, depends).  It's supposed to exist only once per frame.
>> When the frame goes out of scope, the tmp_pathbuf usage counter is
>> restored to the values of the parent frame.
>>
>>> +   ATTENTION: Requesting memory from an instance of tmp_pathbuf breaks
>>> +   when another instance on a newer stack frame has provided memory. */
>>
>> I don't understand this comment, though.  tmp_pathbuf can be used
>> multiple times in the same thread stackm see other Cygwin functions.
>> What you can't do is to call a function, instantiate tmp_pathbuf,
>> allocate memory in the called function, and then use it in the caller.
>> Well, you *can* do that, but if you do this more than once, the same
>> memory region is reused.
>> That's the whole idea of tmp_pathbuf.
>> Temporary per-thread memory for the current frame and it's child frames
>> is allocated.  If a deeper frame using tmp_pathbuf goes out of scope,
>> the memory is not free'd, but recycled next time temporary memory is
>> needed.  The buffers are either 32K or 64K to matches the maximum long
>> path length.  They are now used for any purpose where larger temporary
>> per-thread memory is needed, but providing temporary long path buffers
>> without killing the stack was their original purposes.
> 
> So I also don't quite understand splitting off tmp_pathbuf_allocator.

Sorry, 've expected to have provided enough reasoning here:
https://cygwin.com/ml/cygwin-developers/2016-08/msg00016.html

Actually it is vstrlist doing the alloc+free, and I do expect vstrlist
to be useful even in scenarios where it does not work to ignore it's
free calls, but requires the allocator to fully maintain the alloc'ed
regions to allow for free'd regions to become reused.
And as I've stumbled over 3 different memory providers already (malloc,
cmalloc, tmp_pathbuf), I don't think vstrlist should know anything
about tmp_pathbuf at all - otherwise it should be named tmp_vstrlist.

Same stands for pathfinder, which needs the allocator interface reference
only for construction of the embedded vstrlist that holds the searchdirs.

> Why didn't you just include a tmp_pathbuf member into class pathfinder,
> kind of like this:
> 
>   /* pathfinder.h */
>   class pathfinder
>   {
>     tmp_pathbuf tp;
>     [...]
>   };
> 
>   /* dlfcn.cc */
>   get_full_path_of_dll()
>   {
>     pathfinder finder ();	/* Has a tmp_pathbuf member */
> 
>     [...]
>     finder.add_basename (basename);
>     [...]
>     finder.add_searchdir (dir, ...);
> 
> 
>     return true;		/* finder goes out of scope, so
> 				   its tmp_pathbuf goes out of scope
> 				   so tmp_pathbuf::~tmp_pathbuf is
> 				   called and all is well. */
>   }

Well, as one stackframe better uses one _single_ instance of tmp_pathbuf,
I prefer to be explicit about this and have pathfinder+vstrlist take a
tmp_pathbuf _reference_:
A member instance would be hidden to some degree, and thus will ask for
trouble in the long run, as it would not be obvious to developers in the
future that this stackframe already _does_ have that single instance and
one should not add (another) one.

In hope to provide graspable thoughts,
/haubi/

^ permalink raw reply	[flat|nested] 32+ messages in thread

end of thread, other threads:[~2016-09-01  8:58 UTC | newest]

Thread overview: 32+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-06-01  6:39 About the dll search algorithm of dlopen Michael Haubenwallner
2016-06-01 11:09 ` Corinna Vinschen
2016-06-01 14:25   ` Michael Haubenwallner
2016-06-01 20:18     ` Corinna Vinschen
2016-06-01 23:29       ` Yaakov Selkowitz
2016-08-19 16:37       ` Michael Haubenwallner
2016-08-19 16:39         ` Michael Haubenwallner
2016-08-20 19:32           ` Corinna Vinschen
2016-08-22 12:15             ` Michael Haubenwallner
2016-08-22 12:48               ` Peter Rosin
2016-08-22 13:01                 ` Michael Haubenwallner
2016-08-22 16:02               ` About the dll search algorithm of dlopen (patch-r2) Michael Haubenwallner
2016-08-22 18:37                 ` Corinna Vinschen
2016-08-23  8:16                   ` Corinna Vinschen
2016-08-25 17:48                   ` About the dll search algorithm of dlopen (patch-r3) Michael Haubenwallner
2016-08-26 10:59                     ` Corinna Vinschen
2016-08-26 11:18                       ` Corinna Vinschen
2016-08-26 11:49                         ` Corinna Vinschen
2016-08-29 16:50                           ` Michael Haubenwallner
2016-08-30 14:51                             ` Corinna Vinschen
2016-08-30 15:57                               ` Michael Haubenwallner
2016-08-31 18:48                                 ` Corinna Vinschen
2016-09-01  8:58                         ` Michael Haubenwallner
2016-08-26 14:08                       ` Michael Haubenwallner
2016-08-30 13:26                         ` Corinna Vinschen
2016-08-26 12:04                     ` Corinna Vinschen
2016-08-29  9:24                       ` Michael Haubenwallner
2016-08-30 13:35                         ` Corinna Vinschen
2016-08-30 16:41                           ` Michael Haubenwallner
2016-08-31 18:43                             ` Corinna Vinschen
2016-08-22 18:49               ` About the dll search algorithm of dlopen Corinna Vinschen
2016-08-23  8:55                 ` Michael Haubenwallner

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).