From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 8213 invoked by alias); 26 Oct 2014 19:44:05 -0000 Mailing-List: contact libc-alpha-help@sourceware.org; run by ezmlm Precedence: bulk List-Id: List-Subscribe: List-Archive: List-Post: List-Help: , Sender: libc-alpha-owner@sourceware.org Received: (qmail 8200 invoked by uid 89); 26 Oct 2014 19:44:03 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-2.6 required=5.0 tests=BAYES_00,FREEMAIL_FROM,RCVD_IN_DNSWL_LOW,SPF_PASS autolearn=ham version=3.3.2 X-HELO: mail-ig0-f178.google.com MIME-Version: 1.0 X-Received: by 10.50.142.104 with SMTP id rv8mr17350349igb.23.1414352638330; Sun, 26 Oct 2014 12:43:58 -0700 (PDT) In-Reply-To: References: <1407860209-21797-1-git-send-email-bpshacklett@gmail.com> Date: Sun, 26 Oct 2014 19:44:00 -0000 Message-ID: Subject: Re: [PATCH][BZ 17251] Calculate RPATH $ORIGIN from absolute path From: Brennan Shacklett To: libc-alpha@sourceware.org Content-Type: text/plain; charset=UTF-8 X-SW-Source: 2014-10/txt/msg00594.txt.bz2 ping On Mon, Oct 6, 2014 at 9:18 AM, Brennan Shacklett wrote: > ping > > On Wed, Oct 1, 2014 at 12:28 AM, Brennan Shacklett > wrote: >> ping >> >> On Tue, Sep 23, 2014 at 8:09 PM, Brennan Shacklett >> wrote: >>> On Tue, Aug 12, 2014 at 9:16 AM, Brennan Shacklett >>> wrote: >>>> Bug 17251 deals with (what I believe is) the incorrect calculation of $ORIGIN >>>> for shared libraries. Currently $ORIGIN is calculated correctly for executables >>>> on linux, because readlink is called on /proc/self/exe, which means the >>>> resulting path is absolute and has no symlinks. >>>> Shared libraries with relative paths on the other hand are based >>>> off of appending the name / path of the library to the current working >>>> directory, which means if the library is a symlink, it is not followed, which >>>> breaks RPATH $ORIGIN in the following scenario: >>>> >>>> libone.so and libtwo.so are both in ~/lib >>>> libone.so needs libtwo.so, so libone.so has an RPATH of $ORIGIN. >>>> If I run ldd on ~/lib/libone.so, libtwo.so is found and all is good. >>>> If I create a symlink named ~/libone.so to ~/lib/libone.so, and run ldd on it >>>> libtwo is not found, because $ORIGIN for the library is calculated as ~ >>>> instead of ~/lib. >>>> >>>> If I was to repeat the above test but instead of libone.so use an executable, >>>> everything would work as expected, which is why I think the shared library >>>> behavior is a bug. >>>> >>>> I ran into this in the real world when attempting to dynamically load a .so >>>> for python SWIG bindings, because the .so which python was loading was >>>> symlinked to the actual library directory where all of the .so's dependencies >>>> were located. >>>> >>>> The attached patch fixes this behavior by having realname evaluated with >>>> __realpath in _dl_map_object_from_fd, before _dl_new_object is called (which is >>>> where l_origin is assigned). This also somewhat simplifies the code in >>>> elf/dl-object.c, because realname is guaranteed to be an absolute path >>>> generated by realpath, so I was able to remove the code dealing with relative >>>> paths. >>>> I used the implementation of dl_realpath from sysdeps/tile/dl-runtime.c as the >>>> generic implementation, and extended it with lstat and readlink in the linux >>>> version. This means the bug is still present on other systems than linux (the >>>> generic implementation only returns an absolute path, it doesn't do anything >>>> with symlinks), but if there is a way to get the generic version to follow >>>> symlinks on all systems please let me know. >>>> >>>> I tested and ran the elf test suite on Gentoo Linux x86-64. If this change is >>>> wanted, I will happily write a test to go along with it. >>>> I don't have any sort of copyright attribution set up with the FSF, let me know >>>> if it is necessary for this change. >>>> >>>> Thanks, >>>> Brennan Shacklett >>>> >>>> 2014-08-12 Brennan Shacklett >>>> >>>> [BZ #17251] >>>> * elf/Makefile (rtld-routines): Add dl-realpath. >>>> * elf/dl-load.c (_dl_map_object_from_fd): >>>> Calculate realname with __realpath. >>>> * elf/dl-object.c (_dl_new_object): Remove relative path handling. >>>> * elf/dl-realpath.c: New file. Generic implementation of __realpath. >>>> * elf/tile/dl-runtime.c (_dl_after_load): Remove dl_realpath and >>>> change call to __realpath. >>>> * sysdeps/unix/sysv/linux/Makefile (sysdep-rtld-routines): >>>> Add dl-lxstat64. >>>> * sysdeps/unix/sysv/linux/dl-lxstat64.c: New file. >>>> * sysdeps/unix/sysv/linux/dl-realpath.c: New file. Handle symlinks >>>> unlike generic implementation. >>>> --- >>>> elf/Makefile | 3 +- >>>> elf/dl-load.c | 18 +++++ >>>> elf/dl-object.c | 54 ++------------- >>>> elf/dl-realpath.c | 85 ++++++++++++++++++++++++ >>>> sysdeps/tile/dl-runtime.c | 66 +------------------ >>>> sysdeps/unix/sysv/linux/Makefile | 2 +- >>>> sysdeps/unix/sysv/linux/dl-lxstat64.c | 1 + >>>> sysdeps/unix/sysv/linux/dl-realpath.c | 121 ++++++++++++++++++++++++++++++++++ >>>> 8 files changed, 236 insertions(+), 114 deletions(-) >>>> create mode 100644 elf/dl-realpath.c >>>> create mode 100644 sysdeps/unix/sysv/linux/dl-lxstat64.c >>>> create mode 100644 sysdeps/unix/sysv/linux/dl-realpath.c >>>> >>>> diff --git a/elf/Makefile b/elf/Makefile >>>> index 25012cc..c130f21 100644 >>>> --- a/elf/Makefile >>>> +++ b/elf/Makefile >>>> @@ -43,7 +43,8 @@ shared-only-routines += dl-caller >>>> >>>> # ld.so uses those routines, plus some special stuff for being the program >>>> # interpreter and operating independent of libc. >>>> -rtld-routines := rtld $(dl-routines) dl-sysdep dl-environ dl-minimal >>>> +rtld-routines := rtld $(dl-routines) dl-sysdep dl-environ dl-minimal \ >>>> + dl-realpath >>>> all-rtld-routines = $(rtld-routines) $(sysdep-rtld-routines) >>>> >>>> CFLAGS-dl-runtime.c = -fexceptions -fasynchronous-unwind-tables >>>> diff --git a/elf/dl-load.c b/elf/dl-load.c >>>> index 016a99c..7353837 100644 >>>> --- a/elf/dl-load.c >>>> +++ b/elf/dl-load.c >>>> @@ -891,6 +891,7 @@ _dl_map_object_from_fd (const char *name, int fd, struct filebuf *fbp, >>>> int errval = 0; >>>> struct r_debug *r = _dl_debug_initialize (0, nsid); >>>> bool make_consistent = false; >>>> + char *absolutename = NULL; >>>> >>>> /* Get file information. */ >>>> if (__glibc_unlikely (__fxstat64 (_STAT_VER, fd, &st) < 0)) >>>> @@ -902,6 +903,23 @@ _dl_map_object_from_fd (const char *name, int fd, struct filebuf *fbp, >>>> lose (errval, fd, name, realname, l, errstring, >>>> make_consistent ? r : NULL, nsid); >>>> } >>>> + /* Find absolute pathname for object */ >>>> + absolutename = (char *) malloc (PATH_MAX); >>>> + if (!absolutename) >>>> + { >>>> + errstring = N_("cannot allocate memory for absolute path"); >>>> + goto call_lose_errno; >>>> + } >>>> + >>>> + if (!__realpath (realname, absolutename)) >>>> + { >>>> + free (absolutename); >>>> + errstring = N_("cannot find absolute path of shared object"); >>>> + goto call_lose_errno; >>>> + } >>>> + >>>> + free (realname); >>>> + realname = absolutename; >>>> >>>> /* Look again to see if the real name matched another already loaded. */ >>>> for (l = GL(dl_ns)[nsid]._ns_loaded; l; l = l->l_next) >>>> diff --git a/elf/dl-object.c b/elf/dl-object.c >>>> index afd80a6..d93935b 100644 >>>> --- a/elf/dl-object.c >>>> +++ b/elf/dl-object.c >>>> @@ -158,55 +158,15 @@ _dl_new_object (char *realname, const char *libname, int type, >>>> char *origin; >>>> char *cp; >>>> >>>> - if (realname[0] == '/') >>>> + /* It is an absolute path, calculated by realpath. Use it. >>>> + * But we have to make a copy since we strip out the trailing slash. */ >>>> + assert (realname[0] == '/'); >>>> + cp = origin = (char *) malloc (realname_len); >>>> + if (origin == NULL) >>>> { >>>> - /* It is an absolute path. Use it. But we have to make a >>>> - copy since we strip out the trailing slash. */ >>>> - cp = origin = (char *) malloc (realname_len); >>>> - if (origin == NULL) >>>> - { >>>> - origin = (char *) -1; >>>> - goto out; >>>> - } >>>> + origin = (char *) -1; >>>> + goto out; >>>> } >>>> - else >>>> - { >>>> - size_t len = realname_len; >>>> - char *result = NULL; >>>> - >>>> - /* Get the current directory name. */ >>>> - origin = NULL; >>>> - do >>>> - { >>>> - char *new_origin; >>>> - >>>> - len += 128; >>>> - new_origin = (char *) realloc (origin, len); >>>> - if (new_origin == NULL) >>>> - /* We exit the loop. Note that result == NULL. */ >>>> - break; >>>> - origin = new_origin; >>>> - } >>>> - while ((result = __getcwd (origin, len - realname_len)) == NULL >>>> - && errno == ERANGE); >>>> - >>>> - if (result == NULL) >>>> - { >>>> - /* We were not able to determine the current directory. >>>> - Note that free(origin) is OK if origin == NULL. */ >>>> - free (origin); >>>> - origin = (char *) -1; >>>> - goto out; >>>> - } >>>> - >>>> - /* Find the end of the path and see whether we have to add a >>>> - slash. We could use rawmemchr but this need not be >>>> - fast. */ >>>> - cp = (strchr) (origin, '\0'); >>>> - if (cp[-1] != '/') >>>> - *cp++ = '/'; >>>> - } >>>> - >>>> /* Add the real file name. */ >>>> cp = __mempcpy (cp, realname, realname_len); >>>> >>>> diff --git a/elf/dl-realpath.c b/elf/dl-realpath.c >>>> new file mode 100644 >>>> index 0000000..2549bb3 >>>> --- /dev/null >>>> +++ b/elf/dl-realpath.c >>>> @@ -0,0 +1,85 @@ >>>> +/* Dynamic loader version of realpath. >>>> + Copyright (C) 2014 Free Software Foundation, Inc. >>>> + This file is part of the GNU C Library. >>>> + >>>> + The GNU C Library is free software; you can redistribute it and/or >>>> + modify it under the terms of the GNU Lesser General Public >>>> + License as published by the Free Software Foundation; either >>>> + version 2.1 of the License, or (at your option) any later version. >>>> + >>>> + The GNU C Library is distributed in the hope that it will be useful, >>>> + but WITHOUT ANY WARRANTY; without even the implied warranty of >>>> + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU >>>> + Lesser General Public License for more details. >>>> + >>>> + You should have received a copy of the GNU Lesser General Public >>>> + License along with the GNU C Library; if not, see >>>> + . */ >>>> + >>>> +#include >>>> +#include >>>> +#include >>>> +#include >>>> + >>>> +/* Simplified implementation of realpath(): no dynamic memory use, no lstat(), >>>> + no set_errno(), no valid "rpath" on error, etc. This simplifies cases >>>> + involving relative paths, specifically where $ORIGIN needs to be >>>> + calculated. For this relatively rare case, one could also imagine using >>>> + link_map.l_origin to avoid the getcwd() here, but the simpler code >>>> + here seems like a better solution. */ >>>> +char * internal_function >>>> +__realpath (const char *name, char *rpath) >>>> +{ >>>> + char *dest; >>>> + const char *start, *end; >>>> + >>>> + if (name[0] != '/') >>>> + { >>>> + if (!__getcwd (rpath, PATH_MAX)) >>>> + return NULL; >>>> + dest = __rawmemchr (rpath, '\0'); >>>> + } >>>> + else >>>> + { >>>> + rpath[0] = '/'; >>>> + dest = rpath + 1; >>>> + } >>>> + >>>> + for (start = end = name; *start; start = end) >>>> + { >>>> + /* Skip sequence of multiple path-separators. */ >>>> + while (*start == '/') >>>> + ++start; >>>> + >>>> + /* Find end of path component. */ >>>> + for (end = start; *end && *end != '/'; ++end) >>>> + /* Nothing. */; >>>> + >>>> + if (end - start == 0) >>>> + break; >>>> + else if (end - start == 1 && start[0] == '.') >>>> + /* nothing */; >>>> + else if (end - start == 2 && start[0] == '.' && start[1] == '.') >>>> + { >>>> + /* Back up to previous component, ignore if at root already. */ >>>> + if (dest > rpath + 1) >>>> + while ((--dest)[-1] != '/'); >>>> + } >>>> + else >>>> + { >>>> + if (dest[-1] != '/') >>>> + *dest++ = '/'; >>>> + >>>> + if (dest + (end - start) >= rpath + PATH_MAX) >>>> + return NULL; >>>> + >>>> + dest = __mempcpy (dest, start, end - start); >>>> + *dest = '\0'; >>>> + } >>>> + } >>>> + if (dest > rpath + 1 && dest[-1] == '/') >>>> + --dest; >>>> + *dest = '\0'; >>>> + >>>> + return rpath; >>>> +} >>>> diff --git a/sysdeps/tile/dl-runtime.c b/sysdeps/tile/dl-runtime.c >>>> index bcc00bc..45251d0 100644 >>>> --- a/sysdeps/tile/dl-runtime.c >>>> +++ b/sysdeps/tile/dl-runtime.c >>>> @@ -29,70 +29,6 @@ >>>> #include >>>> #include >>>> >>>> -/* Like realpath(), but simplified: no dynamic memory use, no lstat(), >>>> - no set_errno(), no valid "rpath" on error, etc. This handles some >>>> - simple cases where the simulator might not have a valid entry for >>>> - a loaded Elf object, in particular dlopen() with a relative path. >>>> - For this relatively rare case, one could also imagine using >>>> - link_map.l_origin to avoid the getcwd() here, but the simpler code >>>> - here seems like a better solution. */ >>>> -static char * >>>> -dl_realpath (const char *name, char *rpath) >>>> -{ >>>> - char *dest; >>>> - const char *start, *end; >>>> - >>>> - if (name[0] != '/') >>>> - { >>>> - if (!__getcwd (rpath, PATH_MAX)) >>>> - return NULL; >>>> - dest = __rawmemchr (rpath, '\0'); >>>> - } >>>> - else >>>> - { >>>> - rpath[0] = '/'; >>>> - dest = rpath + 1; >>>> - } >>>> - >>>> - for (start = end = name; *start; start = end) >>>> - { >>>> - /* Skip sequence of multiple path-separators. */ >>>> - while (*start == '/') >>>> - ++start; >>>> - >>>> - /* Find end of path component. */ >>>> - for (end = start; *end && *end != '/'; ++end) >>>> - /* Nothing. */; >>>> - >>>> - if (end - start == 0) >>>> - break; >>>> - else if (end - start == 1 && start[0] == '.') >>>> - /* nothing */; >>>> - else if (end - start == 2 && start[0] == '.' && start[1] == '.') >>>> - { >>>> - /* Back up to previous component, ignore if at root already. */ >>>> - if (dest > rpath + 1) >>>> - while ((--dest)[-1] != '/'); >>>> - } >>>> - else >>>> - { >>>> - if (dest[-1] != '/') >>>> - *dest++ = '/'; >>>> - >>>> - if (dest + (end - start) >= rpath + PATH_MAX) >>>> - return NULL; >>>> - >>>> - dest = __mempcpy (dest, start, end - start); >>>> - *dest = '\0'; >>>> - } >>>> - } >>>> - if (dest > rpath + 1 && dest[-1] == '/') >>>> - --dest; >>>> - *dest = '\0'; >>>> - >>>> - return rpath; >>>> -} >>>> - >>>> /* Support notifying the simulator about new objects. */ >>>> void internal_function >>>> _dl_after_load (struct link_map *l) >>>> @@ -117,7 +53,7 @@ _dl_after_load (struct link_map *l) >>>> DLPUTC (':'); >>>> >>>> /* Write the library path, including the terminating '\0'. */ >>>> - path = dl_realpath (l->l_name, pathbuf) ?: l->l_name; >>>> + path = __realpath (l->l_name, pathbuf) ?: l->l_name; >>>> for (size_t i = 0;; i++) >>>> { >>>> DLPUTC (path[i]); >>>> diff --git a/sysdeps/unix/sysv/linux/Makefile b/sysdeps/unix/sysv/linux/Makefile >>>> index 9ad6d22..4525429 100644 >>>> --- a/sysdeps/unix/sysv/linux/Makefile >>>> +++ b/sysdeps/unix/sysv/linux/Makefile >>>> @@ -176,7 +176,7 @@ endif >>>> >>>> ifeq ($(subdir),elf) >>>> sysdep-rtld-routines += dl-brk dl-sbrk dl-getcwd dl-openat64 dl-opendir \ >>>> - dl-fxstatat64 >>>> + dl-fxstatat64 dl-lxstat64 >>>> >>>> CPPFLAGS-lddlibc4 += -DNOT_IN_libc >>>> >>>> diff --git a/sysdeps/unix/sysv/linux/dl-lxstat64.c b/sysdeps/unix/sysv/linux/dl-lxstat64.c >>>> new file mode 100644 >>>> index 0000000..63e6800 >>>> --- /dev/null >>>> +++ b/sysdeps/unix/sysv/linux/dl-lxstat64.c >>>> @@ -0,0 +1 @@ >>>> +#include >>>> diff --git a/sysdeps/unix/sysv/linux/dl-realpath.c b/sysdeps/unix/sysv/linux/dl-realpath.c >>>> new file mode 100644 >>>> index 0000000..91652dc >>>> --- /dev/null >>>> +++ b/sysdeps/unix/sysv/linux/dl-realpath.c >>>> @@ -0,0 +1,121 @@ >>>> +/* Dynamic loader version of realpath for linux. >>>> + Copyright (C) 2014 Free Software Foundation, Inc. >>>> + This file is part of the GNU C Library. >>>> + >>>> + The GNU C Library is free software; you can redistribute it and/or >>>> + modify it under the terms of the GNU Lesser General Public >>>> + License as published by the Free Software Foundation; either >>>> + version 2.1 of the License, or (at your option) any later version. >>>> + >>>> + The GNU C Library is distributed in the hope that it will be useful, >>>> + but WITHOUT ANY WARRANTY; without even the implied warranty of >>>> + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU >>>> + Lesser General Public License for more details. >>>> + >>>> + You should have received a copy of the GNU Lesser General Public >>>> + License along with the GNU C Library; if not, see >>>> + . */ >>>> + >>>> +#include >>>> +#include >>>> +#include >>>> +#include >>>> +#include >>>> +#include >>>> +#include >>>> + >>>> +/* Simplified version of realpath which extends elf/dl-realpath.c to use >>>> + * linux syscalls for handling symlinks */ >>>> +char * internal_function >>>> +__realpath (const char *name, char *rpath) >>>> +{ >>>> + char *dest; >>>> + char extra_buf[PATH_MAX]; >>>> + const char *start, *end; >>>> + int num_links = 0; >>>> + >>>> + if (name[0] != '/') >>>> + { >>>> + if (!__getcwd (rpath, PATH_MAX)) >>>> + return NULL; >>>> + dest = __rawmemchr (rpath, '\0'); >>>> + } >>>> + else >>>> + { >>>> + rpath[0] = '/'; >>>> + dest = rpath + 1; >>>> + } >>>> + >>>> + for (start = end = name; *start; start = end) >>>> + { >>>> + struct stat64 st; >>>> + int n; >>>> + /* Skip sequence of multiple path-separators. */ >>>> + while (*start == '/') >>>> + ++start; >>>> + >>>> + /* Find end of path component. */ >>>> + for (end = start; *end && *end != '/'; ++end) >>>> + /* Nothing. */; >>>> + >>>> + if (end - start == 0) >>>> + break; >>>> + else if (end - start == 1 && start[0] == '.') >>>> + /* nothing */; >>>> + else if (end - start == 2 && start[0] == '.' && start[1] == '.') >>>> + { >>>> + /* Back up to previous component, ignore if at root already. */ >>>> + if (dest > rpath + 1) >>>> + while ((--dest)[-1] != '/'); >>>> + } >>>> + else >>>> + { >>>> + if (dest[-1] != '/') >>>> + *dest++ = '/'; >>>> + >>>> + if (dest + (end - start) >= rpath + PATH_MAX) >>>> + return NULL; >>>> + >>>> + dest = __mempcpy (dest, start, end - start); >>>> + *dest = '\0'; >>>> + if (__lxstat64 (_STAT_VER, rpath, &st) < 0) >>>> + return NULL; >>>> + >>>> + if (S_ISLNK (st.st_mode)) >>>> + { >>>> + char buf[PATH_MAX]; >>>> + size_t len; >>>> + >>>> + INTERNAL_SYSCALL_DECL (err); >>>> + >>>> + n = INTERNAL_SYSCALL (readlink, err, 3, rpath, buf, >>>> + PATH_MAX - 1); >>>> + >>>> + if (n <= 0 || buf[0] == '[') >>>> + return NULL; >>>> + buf[n] = '\0'; >>>> + >>>> + len = strlen (end); >>>> + if ((long int) (n + len) >= PATH_MAX) >>>> + return NULL; >>>> + >>>> + /* Careful here, end may be a pointer into extra_buf... */ >>>> + memmove (&extra_buf[n], end, len + 1); >>>> + name = end = memcpy (extra_buf, buf, n); >>>> + >>>> + if (buf[0] == '/') >>>> + dest = rpath + 1; /* Absolute symlink */ >>>> + else >>>> + /* Back up to previous component, ignore if at root already: */ >>>> + if (dest > rpath + 1) >>>> + while ((--dest)[-1] != '/'); >>>> + } >>>> + } >>>> + } >>>> + if (dest > rpath + 1 && dest[-1] == '/') >>>> + --dest; >>>> + *dest = '\0'; >>>> + >>>> + return rpath; >>>> + >>>> +} >>>> -- >>>> 2.0.4 >>>> >>> >>> ping.