* inode-based dlopen caching
From: Soni L. @ 2021-06-05 13:59 UTC
To: libc-help

Currently dlopen caching is based on filenames; it'd be nice if it were
based on inodes, to better support "re"loading (that is, loading a new
module with the same name, because unloading modules that use threads is
never a good idea). This is good for stuff that deals with plugins.

* Re: inode-based dlopen caching
From: Adhemerval Zanella @ 2021-06-07 21:53 UTC
To: Soni L., Libc-help

On 05/06/2021 10:59, Soni L. via Libc-help wrote:
> Currently dlopen caching is based on filenames; it'd be nice if it were
> based on inodes, to better support "re"loading (that is, loading a new
> module with the same name, because unloading modules that use threads is
> never a good idea). This is good for stuff that deals with plugins.

What do you mean by 'caching' in this scenario? glibc does not maintain
a cache of loaded libraries; unlike other implementations, it does try
to unload the library on dlclose (there are cases where this is not
readily possible due to dependency chains).

And I am not seeing how inode-based dlopen really helps here: if the
inode changes, it means the file was potentially changed (so you will
need to properly dlclose it). I think using filenames is in fact the
proper way here, since Linux does the heavy lifting of the inode cache
and provides fast file access and mmap support for shared libraries.

* Re: inode-based dlopen caching
From: Soni L. @ 2021-06-07 22:50 UTC
To: Adhemerval Zanella, Libc-help

On 2021-06-07 6:53 p.m., Adhemerval Zanella wrote:
> What do you mean by 'caching' in this scenario? glibc does not maintain
> a cache of loaded libraries; unlike other implementations, it does try
> to unload the library on dlclose (there are cases where this is not
> readily possible due to dependency chains).
>
> And I am not seeing how inode-based dlopen really helps here: if the
> inode changes, it means the file was potentially changed (so you will
> need to properly dlclose it). I think using filenames is in fact the
> proper way here, since Linux does the heavy lifting of the inode cache
> and provides fast file access and mmap support for shared libraries.

You can't unload a dlopened library that uses threads, at least not
safely. So for all intents and purposes you can't unload it. Instead you
need to tell it you're done with it, but not unload it, and load the new
one. But that's the problem: the filename-based stuff means you can't
load the new one.

* Re: inode-based dlopen caching
From: Adhemerval Zanella @ 2021-06-08 13:14 UTC
To: Soni L., Libc-help

On 07/06/2021 19:50, Soni L. wrote:
> You can't unload a dlopened library that uses threads, at least not
> safely. So for all intents and purposes you can't unload it. Instead you
> need to tell it you're done with it, but not unload it, and load the new
> one. But that's the problem: the filename-based stuff means you can't
> load the new one.

Sure you can unload a dlopened library; the API makes the program
responsible for synchronizing access (since dlsym/dlvsym returns a
function pointer).

If I understood correctly, what you are suggesting is making dlclose a
no-op, so that a later dlopen will also be a no-op if it is essentially
the same shared object (what happens if the shared library is updated
and the inode stays the same?). It is a design choice to actually unload
the shared object on dlclose, and changing it because it might cause
concurrency issues in programs that do not synchronize their access is a
really bad motivation. There are multiple better ways to handle it,
either by wrapping it with a more user-friendly API or by using a
high-level language.

If the motivation is to avoid the potential synchronization issues that
libc itself needs to handle (such as TLS and other shared resources),
that could be a better motivation. But even then it is a trade-off:
allocated resources are kept even when the caller states it does not
need them anymore. As far as I know, this is the design musl libc has
chosen. We could do it, but we have been fixing and improving the
dynamic loader over time, which makes this approach complex as well, and
without large benefits.

Also, we still need to do some filename caching to handle things such as
RUNPATH/RPATH and SONAME, so an implementation that also takes the inode
into consideration might add even more complexity and have more corner
cases.

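Editorial sketch: one way to read "wrapping with a more user-friendly
API" is a small guard that counts calls in flight and defers the real
close until the last one returns. The code below is a toy model in
Python, not glibc code; `GuardedLibrary`, `call`, and `close_fn` are
hypothetical names, and `close_fn` merely stands in for dlclose.

```python
import threading


class GuardedLibrary:
    """Defers the real close until no caller is inside the library."""

    def __init__(self, close_fn):
        self._close_fn = close_fn      # stand-in for dlclose (assumption)
        self._lock = threading.Lock()
        self._active = 0               # calls currently in flight
        self._close_requested = False
        self.closed = False

    def call(self, fn, *args):
        """Run fn(*args) while holding a reference on the library."""
        with self._lock:
            if self._close_requested:
                raise RuntimeError("library is closing")
            self._active += 1
        try:
            return fn(*args)
        finally:
            with self._lock:
                self._active -= 1
                self._maybe_close()

    def close(self):
        """Request a close; it happens once no call is in flight."""
        with self._lock:
            self._close_requested = True
            self._maybe_close()

    def _maybe_close(self):
        # Caller must hold self._lock.
        if self._close_requested and self._active == 0 and not self.closed:
            self.closed = True
            self._close_fn()
```

With this shape, a close requested while another call is still running
only takes effect after that call returns, which is the synchronization
the dlopen API leaves to the program.
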
* Re: inode-based dlopen caching
From: Soni L. @ 2021-06-08 16:26 UTC
To: Adhemerval Zanella, Libc-help

On 2021-06-08 10:14 a.m., Adhemerval Zanella wrote:
> Sure you can unload a dlopened library; the API makes the program
> responsible for synchronizing access (since dlsym/dlvsym returns a
> function pointer).
>
> [snip]

If the shared library creates its own threads, those won't be killed
when the shared library is closed. If the shared library is then
unloaded, those threads will be running code from the void. That's a
problem.

Being able to update the library without getting rid of the original
(i.e. by unlinking the original file first) allows the program to
gracefully update loaded plugins without a full restart.

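Editorial sketch: the behaviour being asked for can be modelled at
application level with a table keyed by (st_dev, st_ino) instead of by
filename. The Python below is only a toy model under that assumption; it
does not call dlopen, and `InodeKeyedRegistry`, `replace_file`, and the
`plugin.so` name are hypothetical. It assumes a POSIX filesystem.

```python
import os
import tempfile


class InodeKeyedRegistry:
    """Toy model of a loader cache keyed by (st_dev, st_ino), not by name."""

    def __init__(self):
        self._loaded = {}  # (st_dev, st_ino) -> record for that file

    def load(self, path):
        st = os.stat(path)
        key = (st.st_dev, st.st_ino)
        if key not in self._loaded:
            # Keep the file open so the inode cannot be freed and reused
            # while the record exists.
            self._loaded[key] = {"path": path, "file": open(path, "rb")}
        return key

    def loaded_count(self):
        return len(self._loaded)

    def close_all(self):
        for record in self._loaded.values():
            record["file"].close()
        self._loaded.clear()


def replace_file(path, data):
    """Install new contents under the same name via write-then-rename."""
    fd, tmp_path = tempfile.mkstemp(dir=os.path.dirname(path))
    with os.fdopen(fd, "wb") as tmp:
        tmp.write(data)
    os.replace(tmp_path, path)


def demo():
    registry = InodeKeyedRegistry()
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "plugin.so")  # hypothetical plugin name
        with open(path, "wb") as f:
            f.write(b"version 1")
        first = registry.load(path)
        again = registry.load(path)      # same inode: cache hit
        replace_file(path, b"version 2")
        second = registry.load(path)     # same name, new inode: new entry
        count = registry.loaded_count()
        registry.close_all()
    return first, again, second, count
```

Calling `demo()` loads the same name twice, replaces the file, and loads
it again; the old and the new file end up as two separate entries even
though the path never changed.
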
* Re: inode-based dlopen caching
From: Adhemerval Zanella @ 2021-06-08 16:51 UTC
To: Soni L., Libc-help

On 08/06/2021 13:26, Soni L. wrote:
> If the shared library creates its own threads, those won't be killed
> when the shared library is closed. If the shared library is then
> unloaded, those threads will be running code from the void. That's a
> problem.
>
> Being able to update the library without getting rid of the original
> (i.e. by unlinking the original file first) allows the program to
> gracefully update loaded plugins without a full restart.

Again, this is a library issue that should be dealt with by the API the
library provides (such as a cleanup handler that synchronizes with or
cancels the threads' execution).

In the scenario you are describing, you will end up with the library
loaded in two different mappings with potentially two different versions
of the code (since you are updating the library and dlclose might be a
no-op). I really don't see this as an improvement; it is rather a
potential trigger for subtle issues, especially if the threads try to
synchronize on process-shared resources.

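Editorial sketch: the "cleanup handler" suggestion amounts to the plugin
exposing a shutdown entry point that stops and joins its own threads
before the host unloads it. The Python below only models that ordering;
`Plugin`, `shutdown`, and `unload` are hypothetical names, and in a real
C plugin the equivalent would be an exported function that the host
calls before dlclose.

```python
import threading


class Plugin:
    """Stand-in for a plugin that runs its own worker thread."""

    def __init__(self):
        self._stop = threading.Event()
        self.ticks = 0
        self._worker = threading.Thread(target=self._run)

    def start(self):
        self._worker.start()

    def _run(self):
        # Loop until the host asks for shutdown.
        while not self._stop.wait(timeout=0.01):
            self.ticks += 1

    def shutdown(self):
        """Cleanup handler: ask the worker to stop, then wait for it."""
        self._stop.set()
        self._worker.join()

    def is_running(self):
        return self._worker.is_alive()


def unload(plugin):
    """Host side: only drop the plugin once its threads are gone."""
    plugin.shutdown()
    # In C, this is the point where calling dlclose would be safe,
    # because no thread is left executing the plugin's code.
    return not plugin.is_running()
```

The point of the ordering is that the join happens inside the plugin's
own code, so nothing is left running when the host proceeds to unload.
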
* Re: inode-based dlopen caching
From: Florian Weimer @ 2021-06-08 16:56 UTC
To: Soni L. via Libc-help

* Soni L. via Libc-help:

> Currently dlopen caching is based on filenames; it'd be nice if it were
> based on inodes, to better support "re"loading (that is, loading a new
> module with the same name, because unloading modules that use threads is
> never a good idea). This is good for stuff that deals with plugins.

It's an interesting idea. We'd probably also want a flag that hides the
symbols from general binding and makes them available for direct dlsym
lookups using the handle returned by dlopen (otherwise the old
definitions stick around).

The tricky question is what to do about dependencies. A behavioral
change for just one level is not too hard, but everything that goes
further is difficult.

Vivek Das Mohapatra's RTLD_SHARED patches may help with isolating
dependencies.

Thanks,
Florian

* Re: inode-based dlopen caching
From: Adhemerval Zanella @ 2021-06-08 17:19 UTC
To: Florian Weimer, Libc-help

On 08/06/2021 13:56, Florian Weimer via Libc-help wrote:
> The tricky question is what to do about dependencies. A behavioral
> change for just one level is not too hard, but everything that goes
> further is difficult.
>
> Vivek Das Mohapatra's RTLD_SHARED patches may help with isolating
> dependencies.

RTLD_SHARED together with -Wl,-z,unique might help with the
dependencies, but it would require the caller to properly set up the
linker flag on the specific shared libraries.

But my main reservation about reloading the new module is that, although
symbol resolution happens in a different namespace, it still shares the
same process resources. It would require a lot of careful code within
the library so that it can run with multiple instances; we see the
potential issues with our static dlopen support.

* Re: inode-based dlopen caching
From: Florian Weimer @ 2021-06-08 17:20 UTC
To: Adhemerval Zanella; +Cc: Libc-help

* Adhemerval Zanella:

> But my main reservation about reloading the new module is that, although
> symbol resolution happens in a different namespace, it still shares the
> same process resources. It would require a lot of careful code within
> the library so that it can run with multiple instances; we see the
> potential issues with our static dlopen support.

It would work quite well for a lot of stateless JNI wrappers or Python
extensions, though. That alone looks like a relevant use case to me.

Thanks,
Florian

* Re: inode-based dlopen caching
From: Soni L. @ 2021-06-08 18:10 UTC
To: Florian Weimer, Adhemerval Zanella; +Cc: Libc-help

On 2021-06-08 2:20 p.m., Florian Weimer via Libc-help wrote:
> It would work quite well for a lot of stateless JNI wrappers or Python
> extensions, though. That alone looks like a relevant use case to me.

The motivating use case is HexChat plugins, especially when they're
written in Rust. Rust is supposed to provide memory safety, but some
edge cases with dlopen break that, like truncating the shared library,
or closing a shared library that has spawned threads.

* Re: inode-based dlopen caching
From: Florian Weimer @ 2021-06-08 18:17 UTC
To: Soni L.; +Cc: Adhemerval Zanella, Libc-help

* Soni L.:

> The motivating use case is HexChat plugins, especially when they're
> written in Rust. Rust is supposed to provide memory safety, but some
> edge cases with dlopen break that, like truncating the shared library,
> or closing a shared library that has spawned threads.

Truncating the shared object will always cause problems until the kernel
implements MAP_COPY, or we stop mapping code in the dynamic loader.

Another option (implemented by GHC and others) is to have a custom
loader. Except for initial-exec memory and symbol interposition, there
is nothing magic at all about dlopen. Applications certainly can
implement their own object code loading mechanisms.

Thanks,
Florian

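Editorial sketch: the truncation hazard comes from changing a file in
place while it is mapped. Replacing the file by writing a new one and
renaming it over the old name leaves the old inode, and any mapping of
it, untouched. The Python below illustrates only that contrast on a
POSIX system; the file name is hypothetical and no shared object is
actually loaded.

```python
import mmap
import os
import tempfile


def demo_mapping_survives_rename():
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "plugin.so")  # hypothetical file name
        with open(path, "wb") as f:
            f.write(b"old code")

        # Map the current file, roughly as a loader maps a shared object.
        with open(path, "rb") as f:
            mapping = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)

        # Install the update as a new file, then rename it into place.
        new_path = path + ".new"
        with open(new_path, "wb") as f:
            f.write(b"new code!")
        os.replace(new_path, path)

        seen_by_mapping = mapping[:]   # still backed by the old inode
        with open(path, "rb") as f:
            seen_by_path = f.read()    # the name now refers to the new file
        mapping.close()
    return seen_by_mapping, seen_by_path
```

Truncating or overwriting `plugin.so` in place would instead change the
pages under the existing mapping, which is the case the message above
warns about.
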
* Re: inode-based dlopen caching
From: Adhemerval Zanella @ 2021-06-08 19:25 UTC
To: Florian Weimer, Soni L.; +Cc: Libc-help

On 08/06/2021 15:17, Florian Weimer wrote:
> Truncating the shared object will always cause problems until the kernel
> implements MAP_COPY, or we stop mapping code in the dynamic loader.

Which, as I recall from some discussion, Linus said is not going to
happen (unless he has changed his mind; the discussion is some years old
already).

> Another option (implemented by GHC and others) is to have a custom
> loader. Except for initial-exec memory and symbol interposition, there
> is nothing magic at all about dlopen. Applications certainly can
> implement their own object code loading mechanisms.

I think the main problem is implementing all the ELF idiosyncrasies
correctly, assuming that ELF is used in the first place.

Thread overview: 12 messages (newest: 2021-06-08 19:25 UTC)

2021-06-05 13:59 inode-based dlopen caching Soni L.
2021-06-07 21:53 ` Adhemerval Zanella
2021-06-07 22:50   ` Soni L.
2021-06-08 13:14     ` Adhemerval Zanella
2021-06-08 16:26       ` Soni L.
2021-06-08 16:51         ` Adhemerval Zanella
2021-06-08 16:56 ` Florian Weimer
2021-06-08 17:19   ` Adhemerval Zanella
2021-06-08 17:20     ` Florian Weimer
2021-06-08 18:10       ` Soni L.
2021-06-08 18:17         ` Florian Weimer
2021-06-08 19:25           ` Adhemerval Zanella