From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (qmail 27017 invoked by alias); 22 Aug 2016 12:15:26 -0000
Mailing-List: contact cygwin-developers-help@cygwin.com; run by ezmlm
Precedence: bulk
List-Id:
List-Subscribe:
List-Archive:
List-Post:
List-Help: ,
Sender: cygwin-developers-owner@cygwin.com
Mail-Followup-To: cygwin-developers@cygwin.com
Received: (qmail 27005 invoked by uid 89); 22 Aug 2016 12:15:24 -0000
Authentication-Results: sourceware.org; auth=none
X-Virus-Found: No
X-Spam-SWARE-Status: No, score=-1.7 required=5.0 tests=AWL,BAYES_00,RCVD_IN_DNSWL_NONE,SPF_PASS autolearn=ham version=3.3.2 spammy=residing, H*r:UNKNOWN
X-HELO: smtp.salomon.at
Received: from smtp.salomon.at (HELO smtp.salomon.at) (193.186.16.13)
 by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP;
 Mon, 22 Aug 2016 12:15:14 +0000
Received: from samail03.wamas.com ([172.28.33.235] helo=mailhost.salomon.at)
 by smtp.salomon.at with esmtps (UNKNOWN:DHE-RSA-AES256-SHA:256) (Exim 4.80.1)
 (envelope-from ) id 1bbo8E-0004VA-DD
 for cygwin-developers@cygwin.com; Mon, 22 Aug 2016 14:15:11 +0200
Received: from [172.28.41.34] by mailhost.salomon.at with esmtp (Exim 4.77)
 (envelope-from ) id 1bbo8E-000085-8A
 for cygwin-developers@cygwin.com; Mon, 22 Aug 2016 14:15:10 +0200
Subject: Re: About the dll search algorithm of dlopen
References: <574E835E.7090109@ssi-schaefer.com>
 <20160601110947.GE11431@calimero.vinschen.de>
 <574EF07B.1060806@ssi-schaefer.com>
 <20160601201748.GI11431@calimero.vinschen.de>
 <57B735D5.4070401@ssi-schaefer.com>
 <57B7362D.8060707@ssi-schaefer.com>
 <20160820193243.vbmpfjc5mjdhrndh@calimero.vinschen.de>
To: cygwin-developers@cygwin.com
From: Michael Haubenwallner
Message-ID: <4d4481ce-c163-0ec9-29f5-59bbe13260fa@ssi-schaefer.com>
Date: Mon, 22 Aug 2016 12:15:00 -0000
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Thunderbird/45.1.1
MIME-Version: 1.0
In-Reply-To: <20160820193243.vbmpfjc5mjdhrndh@calimero.vinschen.de>
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit
X-SW-Source: 2016-08/txt/msg00003.txt.bz2

Hi Corinna,

On 08/20/2016 09:32 PM, Corinna Vinschen wrote:
>>>>
>>>> One way around YA code duplication could be some kind of path iterator
>>>> class which could be used from find_exec as well as from
>>>> get_full_path_of_dll.
>>> 0001.patch is a draft for some new cygwin::pathfinder class, with
>>> 0002.patch adding the executable's directory as searchpath, and
>>> 0003.patch to search the PATH environment as well.
>>>
>>> Thoughts?
>
> Ok, that might be disappointing now because you already put so much work
> into it, but I actually expected some more discussion first.  I have two
> problem with this.
>
> I'm not a big fan of templates.

Never mind, it's been some template exercise to me anyway.

> What I had in mind was a *simple* class which gets told if it searches
> for libs or executables and then checks the different paths accordingly,
> kind of a copy of find_exec as a class, just additionally handling the
> prefix issue for DLLs.

What I'm more interested in for such a class is the actual API for use by
dlopen() and exec(), and the final list of files searched for - with these
use cases coming to my mind:

Libraries/dlls with final search path "/lib:/morelibs":
 L1) dlopen("libN.so")
 L2) dlopen("libN.dll")
 L3) dlopen("cygN.dll")
 L4) dlopen("N.so")
 L5) dlopen("N.dll")

Executables with final search path "/bin:/moreexes"
 X1) exec("X")
 X2) exec("X.exe")
 X3) exec("X.com")

Instead of API calls similar to:
 L1) find(dll, "N", ["/lib", "/morelibs"])
 L2) find(dll, "N", ["/lib", "/morelibs"])
 L3) find(dll, "N", ["/lib", "/morelibs"])
 L4) find(dll, "N", ["/lib", "/morelibs"])
 L5) find(dll, "N", ["/lib", "/morelibs"])
 X1) find(exe, "X", ["/bin", "/moreexes"])
 X2) find(exe, "X", ["/bin", "/moreexes"])
 X3) find(exe, "X", ["/bin", "/moreexes"])

it feels necessary to support more explicit naming, as in:
 L1) find(["libN.so", "cygN.dll", "libN.dll"], ["/lib/../bin","/lib","/morelibs"])
 L2) find([           "cygN.dll", "libN.dll"], ["/lib/../bin","/lib","/morelibs"])
 L3) find([           "cygN.dll", "libN.dll"], ["/lib/../bin","/lib","/morelibs"])
 L4) find(["N.so",    "N.dll"               ], ["/lib/../bin","/lib","/morelibs"])
 L5) find([           "N.dll"               ], ["/lib/../bin","/lib","/morelibs"])
 X1) find(["X", "X.exe","X.com"], ["/bin","/moreexes"])
 X2) find(["X", "X.exe"        ], ["/bin","/moreexes"])
 X3) find(["X", "X.com"        ], ["/bin","/moreexes"])

Where the find method does not need to actually know whether it searches
for a dll or an exe, but dlopen() and exec() instead define the file names
to search for.

This is what the patch draft does in dlopen.

>>>>>>> *) The directory of the current main executable should be searched
>>>>>>>    after LD_LIBRARY_PATH and before /usr/bin:/usr/lib.
>>>>>>>    And PATH should be searched before /usr/bin:/usr/lib as well.
>>>>>>
>>>>>> Checking the executable path and $PATH are Windows concepts.  dlopen
>>>>>> doesn't do that on POSIX systems and we're not doing that either.
>>>>>
>>>>> Agreed, but POSIX also does have the concept of embedded RUNPATH,
>>>>> which is completely missing in Cygwin as far as I can see.
>>>>
>>>> RPATH and RUNPATH are ELF dynamic loader features, not supported by
>>>> PE/COFF.
>>>
>>> In any case, to me it does feel quite important to have the (almost) same
>>> dll search algorithm with dlopen() as with CreateProcess().
>
> Last but not least I'm not yet convinced if it's *really* a good idea to
> prepend the executable path to the DLL search path unconditionally.  Be
> it as it is in terms of DT_RUNPATH, why is the application dir a good
> choice at all, unless we're talking Windows semantics?  Which we don't.
> Also, if loading from the applications dir from dlopen is important for
> you, you can emulate it by adding the application dir to LD_LIBRARY_PATH.

As long as there is no Cygwin-specific dll loader to find the dlls to load
during process startup, we're bound to Windows semantics.

For dlopen, it is more important to find the same dll file as would be
found if the exe were linked against that dll, rather than to use the
Linux-known algorithm and environment variables and thereby differ from
process startup: both really should use the same algorithm here, even if
that means some difference compared to Linux.

As far as I understand, the lack of DT_RUNPATH support (besides
/etc/ld.so.conf) during process start was the main reason for installing
the dlls into /lib/../bin instead of /lib in the first place, so that they
are found at process start because they reside in the application's bin
dir: Why should that be different for dlopen?

> I checked for the usage of DT_RUNPATH/DT_RPATH on Fedora 23 and only a
> limited number of packages use it (texlive, samba, python, man-db,
> swipl, and a few more).  Some of them, like texlive, even use it wrongly,
> with RPATH pointing to a non-existing build dir.  There are also a few
> stray "/usr/lib64" settings, but all in all it's not used to point to
> the dir the application is installed to, but rather to some package specific
> subdir, e.g. /usr/lib64/samba, /usr/lib64/swipl-7.2.3/lib/x86_64-linux,
> etc.

On Linux, the binaries installed in /usr usually rely on the Linux loader
being configured via /etc/ld.so.conf to find their runtime libs in /usr/lib.

Please remember: This whole thing is not a problem with packages installed
to /usr, but with packages installed to /somewhere/else that provide runtime
libraries that are also available in /usr.

Using LD_LIBRARY_PATH pointing to /somewhere/else/lib may break the binaries
found in /usr/bin - and agreed, searching PATH doesn't make it better, as
PATH is the "LD_LIBRARY_PATH" for Windows.

> IMHO this means just adding the applications bin dir is most of the time
> an unused or even wrong workaround.

GetModuleHandle may reduce that pressure for dlopen. Still, as long as the
application's bin dir is searched at process start, it really should be
searched by dlopen too, even if for /usr/bin/* this might indeed become
redundant, as we always add /usr/bin in dlopen - which actually mimics the
/etc/ld.so.conf content, although that one is unavailable to process
startup.

Thanks!
/haubi/
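
For illustration only, the explicit-naming API from the L1..L5 table above
could be sketched roughly as follows. This is not the code from the patch
draft: the helper names dll_candidates and dll_search_dirs are hypothetical,
the "dirs first, then names" loop order is an assumption (the message does
not state it), and inserting "<dir>/../bin" before every directory named
"lib" is generalized from the single "/lib" example.

```python
import os
import posixpath


def dll_candidates(name):
    """File names to try for dlopen(name), following the L1..L5 table."""
    stem, ext = posixpath.splitext(name)
    # A requested ".so" name is tried first, exactly as given.
    names = [name] if ext == ".so" else []
    if stem.startswith(("lib", "cyg")):
        base = stem[3:]
        names += ["cyg" + base + ".dll", "lib" + base + ".dll"]
    else:
        names.append(stem + ".dll")
    return names


def dll_search_dirs(dirs):
    """Assumption: put "<dir>/../bin" in front of each dir named "lib"."""
    out = []
    for d in dirs:
        if posixpath.basename(d.rstrip("/")) == "lib":
            out.append(d.rstrip("/") + "/../bin")
        out.append(d)
    return out


def find(names, dirs, exists=os.path.exists):
    """Return the first existing dir/name, or None.

    find() does not know whether it looks for a dll or an exe;
    the caller supplies the names.  Loop order is an assumption.
    """
    for d in dirs:
        for n in names:
            path = posixpath.join(d, n)
            if exists(path):
                return path
    return None


# Usage with a pretend file system instead of real files.
present = {"/lib/../bin/cygN.dll", "/lib/libN.dll"}
print(find(dll_candidates("libN.so"),
           dll_search_dirs(["/lib", "/morelibs"]),
           exists=present.__contains__))
```

Passing the exists predicate in keeps the sketch testable without touching
the disk; an exec() caller would build its own name list (X1..X3) and reuse
the same find().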