* Why does elf/dl-load.c have local_strdup? @ 2014-10-20 20:25 Carlos O'Donell 2014-10-20 20:41 ` Roland McGrath 0 siblings, 1 reply; 14+ messages in thread From: Carlos O'Donell @ 2014-10-20 20:25 UTC (permalink / raw) To: GNU C Library What's the purpose of local_strdup in elf/dl-load.c? Is there any point in that sequence when we can't just call strdup? Cheers, Carlos. ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: Why does elf/dl-load.c have local_strdup? 2014-10-20 20:25 Why does elf/dl-load.c have local_strdup? Carlos O'Donell @ 2014-10-20 20:41 ` Roland McGrath 2014-10-20 21:19 ` Carlos O'Donell 0 siblings, 1 reply; 14+ messages in thread From: Roland McGrath @ 2014-10-20 20:41 UTC (permalink / raw) To: Carlos O'Donell; +Cc: GNU C Library I don't see any reason for it. When it was added, it was replacing open-coded malloc+memcpy sequences. But I don't know why the normal implementation could not have been used. Name space issues might mean we should only use __strdup here, but that could have been done at the time local_strdup was added too. ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: Why does elf/dl-load.c have local_strdup? 2014-10-20 20:41 ` Roland McGrath @ 2014-10-20 21:19 ` Carlos O'Donell 2014-10-21 22:47 ` Roland McGrath 0 siblings, 1 reply; 14+ messages in thread From: Carlos O'Donell @ 2014-10-20 21:19 UTC (permalink / raw) To: Roland McGrath; +Cc: GNU C Library On 10/20/2014 04:41 PM, Roland McGrath wrote: > I don't see any reason for it. When it was added, it was replacing > open-coded malloc+memcpy sequences. But I don't know why the normal > implementation could not have been used. Name space issues might mean we > should only use __strdup here, but that could have been done at the time > local_strdup was added too. Thanks, I had not considered namespace issues. I'm fixing this partly because I've just found that ld.so.cache management has no ref counting on the mmap, and recursive dlopen leads to segfaults if you have a malloc hook that calls dlopen. The deepest dlopen, as it completes, unmaps the cache, and the next deepest dlopen segfaults trying to reference data in the cache it expects to be valid. I figure that malloc hooks should be allowed to call dlopen, and we should either ref-count ld.so.cache mmap, or better yet, just change the contract of _dl_load_cache_lookup to return a copy of the caches string. In 99% of the caches the hot path is that the cache has the file, and you're going to strdup anyway, so why not just do it while you know the cache to be valid and avoid this entire issue? Well, one issue is that any strdup in _dl_load_cache_lookup will trigger the problem (call malloc, which calls a malloc hook, which calls dlopen, which later unmaps the cache you're trying to copy). So you'd need an intermediate alloca, followed by a strdup. Is `alloca + strdup` worse or better than `ref-counted ld.so.cache mapping?` Cheers, Carlos. ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: Why does elf/dl-load.c have local_strdup? 2014-10-20 21:19 ` Carlos O'Donell @ 2014-10-21 22:47 ` Roland McGrath 2014-10-23 2:29 ` Should glibc be fully reentrant? What do we allow interposed symbols to do? Carlos O'Donell 0 siblings, 1 reply; 14+ messages in thread From: Roland McGrath @ 2014-10-21 22:47 UTC (permalink / raw) To: Carlos O'Donell; +Cc: GNU C Library > I'm fixing this partly because I've just found that ld.so.cache > management has no ref counting on the mmap, and recursive dlopen leads to > segfaults if you have a malloc hook that calls dlopen. The deepest > dlopen, as it completes, unmaps the cache, and the next deepest dlopen > segfaults trying to reference data in the cache it expects to be valid. Understood. > I figure that malloc hooks should be allowed to call dlopen, and we should > either ref-count ld.so.cache mmap, or better yet, just change the contract > of _dl_load_cache_lookup to return a copy of the caches string. That sounds harmless on its face, but it is clearly the edge of a slippery slope. What freedom of action should malloc hooks really expect? Library code calls malloc from all sorts of deep internal places. It's not reasonable to reorganize all our code paths so that they can be reentered safely from malloc. You've found one case and proposed one workaround, for a single specific direct path of reentry. There are lots of other paths, and a huge combinatoric explosion of indirect paths of reentry. Should a malloc hook be able to call printf? What about dlopen->malloc->printf? What about printf->malloc->dlopen? Why should we support dlopen->malloc->dlopen any more than we support dlopen->signal handler->dlopen? > In 99% of the caches the hot path is that the cache has the file, and > you're going to strdup anyway, so why not just do it while you know the > cache to be valid and avoid this entire issue? Well, one issue is that any > strdup in _dl_load_cache_lookup will trigger the problem (call malloc, > which calls a malloc hook, which calls dlopen, which later unmaps the > cache you're trying to copy). So you'd need an intermediate alloca, > followed by a strdup. What you mean is an intermediate strdupa to copy the value before doing a strdup. When talking about cost, it's important to be precise. It's the CPU cost of alloca (constant) plus the cost of strlen+memcpy (roughly linear in the length of the file name), plus the other kinds of cost of deepening the stack by a variable amount. > Is `alloca + strdup` worse or better than `ref-counted ld.so.cache mapping?` The somewhat open-ended deepening of the stack concerns me more than the CPU cost of strdupa. (This is all in preparation for open, mmap, reloc processing with COW faults, etc., so the actual CPU cost of strdupa will pale in comparison.) But the slippery slope concerns me most of all. ^ permalink raw reply [flat|nested] 14+ messages in thread
* Should glibc be fully reentrant? What do we allow interposed symbols to do? 2014-10-21 22:47 ` Roland McGrath @ 2014-10-23 2:29 ` Carlos O'Donell 2014-10-23 8:23 ` Will Newton 2014-10-24 9:31 ` Torvald Riegel 0 siblings, 2 replies; 14+ messages in thread From: Carlos O'Donell @ 2014-10-23 2:29 UTC (permalink / raw) To: Roland McGrath; +Cc: GNU C Library On 10/21/2014 06:47 PM, Roland McGrath wrote: > But the slippery slope concerns me most of all. Any function the user interposes acts as a synchronous interrupt on the runtime. It is my opinion that users expect to be able to call any routine in the runtime without caution unless we tell them otherwise. Given that dlopen locks are recursive, as are stdio locks, I propose we fully support this notion that users already believe exists. The alternative is that we don't support it and treat interposed functions as if they were in a signal handler context, only being allowed to call async-signal-safe functions, and we might as well remove the recursive support from the locks such that users get useful backtraces from deadlocks. It is my opinion that such a direction would not help our users and would not help the project. The similar situations we need to clarify are LD_AUDIT modules, and IFUNC resolvers, but let us proceed orderly one topic at a time. In summary ========== Allow interposed functions to call back into the runtime, and fix any places where this breaks. Cheers, Carlos. ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: Should glibc be fully reentrant? What do we allow interposed symbols to do? 2014-10-23 2:29 ` Should glibc be fully reentrant? What do we allow interposed symbols to do? Carlos O'Donell @ 2014-10-23 8:23 ` Will Newton 2014-10-23 14:40 ` Carlos O'Donell 2014-10-24 9:31 ` Torvald Riegel 1 sibling, 1 reply; 14+ messages in thread From: Will Newton @ 2014-10-23 8:23 UTC (permalink / raw) To: Carlos O'Donell; +Cc: Roland McGrath, GNU C Library On 23 October 2014 03:29, Carlos O'Donell <carlos@redhat.com> wrote: > On 10/21/2014 06:47 PM, Roland McGrath wrote: >> But the slippery slope concerns me most of all. > > Any function the user interposes acts as a synchronous interrupt on the > runtime. > > It is my opinion that users expect to be able to call any routine in the > runtime without caution unless we tell them otherwise. > > Given that dlopen locks are recursive, as are stdio locks, I propose we > fully support this notion that users already believe exists. > > The alternative is that we don't support it and treat interposed functions > as if they were in a signal handler context, only being allowed to call > async-signal-safe functions, and we might as well remove the recursive > support from the locks such that users get useful backtraces from deadlocks. > It is my opinion that such a direction would not help our users and would > not help the project. > > The similar situations we need to clarify are LD_AUDIT modules, and > IFUNC resolvers, but let us proceed orderly one topic at a time. > > In summary > ========== > > Allow interposed functions to call back into the runtime, and fix any > places where this breaks. Do you have a plan to fix dlsym similarly? ISTR that pretty reliably trips on the same issue when used in a malloc hook. -- Will Newton Toolchain Working Group, Linaro ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: Should glibc be fully reentrant? What do we allow interposed symbols to do? 2014-10-23 8:23 ` Will Newton @ 2014-10-23 14:40 ` Carlos O'Donell 2014-10-23 16:40 ` Will Newton 2014-10-28 12:24 ` Will Newton 0 siblings, 2 replies; 14+ messages in thread From: Carlos O'Donell @ 2014-10-23 14:40 UTC (permalink / raw) To: Will Newton; +Cc: Roland McGrath, GNU C Library On 10/23/2014 04:23 AM, Will Newton wrote: > On 23 October 2014 03:29, Carlos O'Donell <carlos@redhat.com> wrote: >> On 10/21/2014 06:47 PM, Roland McGrath wrote: >>> But the slippery slope concerns me most of all. >> >> Any function the user interposes acts as a synchronous interrupt on the >> runtime. >> >> It is my opinion that users expect to be able to call any routine in the >> runtime without caution unless we tell them otherwise. >> >> Given that dlopen locks are recursive, as are stdio locks, I propose we >> fully support this notion that users already believe exists. >> >> The alternative is that we don't support it and treat interposed functions >> as if they were in a signal handler context, only being allowed to call >> async-signal-safe functions, and we might as well remove the recursive >> support from the locks such that users get useful backtraces from deadlocks. >> It is my opinion that such a direction would not help our users and would >> not help the project. >> >> The similar situations we need to clarify are LD_AUDIT modules, and >> IFUNC resolvers, but let us proceed orderly one topic at a time. >> >> In summary >> ========== >> >> Allow interposed functions to call back into the runtime, and fix any >> places where this breaks. > > Do you have a plan to fix dlsym similarly? ISTR that pretty reliably > trips on the same issue when used in a malloc hook. If you can write a test case I will look at it. I have a reproducer for dlopen. Cheers, Carlos. ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: Should glibc be fully reentrant? What do we allow interposed symbols to do? 2014-10-23 14:40 ` Carlos O'Donell @ 2014-10-23 16:40 ` Will Newton 2014-10-23 17:10 ` Carlos O'Donell 2014-10-28 12:24 ` Will Newton 1 sibling, 1 reply; 14+ messages in thread From: Will Newton @ 2014-10-23 16:40 UTC (permalink / raw) To: Carlos O'Donell; +Cc: Roland McGrath, GNU C Library On 23 October 2014 15:40, Carlos O'Donell <carlos@redhat.com> wrote: > On 10/23/2014 04:23 AM, Will Newton wrote: >> On 23 October 2014 03:29, Carlos O'Donell <carlos@redhat.com> wrote: >>> On 10/21/2014 06:47 PM, Roland McGrath wrote: >>>> But the slippery slope concerns me most of all. >>> >>> Any function the user interposes acts as a synchronous interrupt on the >>> runtime. >>> >>> It is my opinion that users expect to be able to call any routine in the >>> runtime without caution unless we tell them otherwise. >>> >>> Given that dlopen locks are recursive, as are stdio locks, I propose we >>> fully support this notion that users already believe exists. >>> >>> The alternative is that we don't support it and treat interposed functions >>> as if they were in a signal handler context, only being allowed to call >>> async-signal-safe functions, and we might as well remove the recursive >>> support from the locks such that users get useful backtraces from deadlocks. >>> It is my opinion that such a direction would not help our users and would >>> not help the project. >>> >>> The similar situations we need to clarify are LD_AUDIT modules, and >>> IFUNC resolvers, but let us proceed orderly one topic at a time. >>> >>> In summary >>> ========== >>> >>> Allow interposed functions to call back into the runtime, and fix any >>> places where this breaks. >> >> Do you have a plan to fix dlsym similarly? ISTR that pretty reliably >> trips on the same issue when used in a malloc hook. > > If you can write a test case I will look at it. I'll see what I can do. Are you looking for a glibc testcase or just a reproducer? -- Will Newton Toolchain Working Group, Linaro ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: Should glibc be fully reentrant? What do we allow interposed symbols to do? 2014-10-23 16:40 ` Will Newton @ 2014-10-23 17:10 ` Carlos O'Donell 0 siblings, 0 replies; 14+ messages in thread From: Carlos O'Donell @ 2014-10-23 17:10 UTC (permalink / raw) To: Will Newton; +Cc: Roland McGrath, GNU C Library On 10/23/2014 12:40 PM, Will Newton wrote: > On 23 October 2014 15:40, Carlos O'Donell <carlos@redhat.com> wrote: >> On 10/23/2014 04:23 AM, Will Newton wrote: >>> On 23 October 2014 03:29, Carlos O'Donell <carlos@redhat.com> wrote: >>>> On 10/21/2014 06:47 PM, Roland McGrath wrote: >>>>> But the slippery slope concerns me most of all. >>>> >>>> Any function the user interposes acts as a synchronous interrupt on the >>>> runtime. >>>> >>>> It is my opinion that users expect to be able to call any routine in the >>>> runtime without caution unless we tell them otherwise. >>>> >>>> Given that dlopen locks are recursive, as are stdio locks, I propose we >>>> fully support this notion that users already believe exists. >>>> >>>> The alternative is that we don't support it and treat interposed functions >>>> as if they were in a signal handler context, only being allowed to call >>>> async-signal-safe functions, and we might as well remove the recursive >>>> support from the locks such that users get useful backtraces from deadlocks. >>>> It is my opinion that such a direction would not help our users and would >>>> not help the project. >>>> >>>> The similar situations we need to clarify are LD_AUDIT modules, and >>>> IFUNC resolvers, but let us proceed orderly one topic at a time. >>>> >>>> In summary >>>> ========== >>>> >>>> Allow interposed functions to call back into the runtime, and fix any >>>> places where this breaks. >>> >>> Do you have a plan to fix dlsym similarly? ISTR that pretty reliably >>> trips on the same issue when used in a malloc hook. >> >> If you can write a test case I will look at it. > > I'll see what I can do. Are you looking for a glibc testcase or just a > reproducer? Just a reproducer. Please try to make it minimal. For example my reproducer here was pretty small, install malloc hook, dlopen something, have malloc hook dlopen something else, boom. Cheers, Carlos. ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: Should glibc be fully reentrant? What do we allow interposed symbols to do? 2014-10-23 14:40 ` Carlos O'Donell 2014-10-23 16:40 ` Will Newton @ 2014-10-28 12:24 ` Will Newton 2014-10-28 13:32 ` Carlos O'Donell 1 sibling, 1 reply; 14+ messages in thread From: Will Newton @ 2014-10-28 12:24 UTC (permalink / raw) To: Carlos O'Donell; +Cc: Roland McGrath, GNU C Library On 23 October 2014 15:40, Carlos O'Donell <carlos@redhat.com> wrote: Hi Carlos, > On 10/23/2014 04:23 AM, Will Newton wrote: >> On 23 October 2014 03:29, Carlos O'Donell <carlos@redhat.com> wrote: >>> On 10/21/2014 06:47 PM, Roland McGrath wrote: >>>> But the slippery slope concerns me most of all. >>> >>> Any function the user interposes acts as a synchronous interrupt on the >>> runtime. >>> >>> It is my opinion that users expect to be able to call any routine in the >>> runtime without caution unless we tell them otherwise. >>> >>> Given that dlopen locks are recursive, as are stdio locks, I propose we >>> fully support this notion that users already believe exists. >>> >>> The alternative is that we don't support it and treat interposed functions >>> as if they were in a signal handler context, only being allowed to call >>> async-signal-safe functions, and we might as well remove the recursive >>> support from the locks such that users get useful backtraces from deadlocks. >>> It is my opinion that such a direction would not help our users and would >>> not help the project. >>> >>> The similar situations we need to clarify are LD_AUDIT modules, and >>> IFUNC resolvers, but let us proceed orderly one topic at a time. >>> >>> In summary >>> ========== >>> >>> Allow interposed functions to call back into the runtime, and fix any >>> places where this breaks. >> >> Do you have a plan to fix dlsym similarly? ISTR that pretty reliably >> trips on the same issue when used in a malloc hook. > > If you can write a test case I will look at it. > > I have a reproducer for dlopen. I had mis-remembered the details of the issue. What I did find was that using LD_PRELOAD to override malloc and calling dlsym can be a problem. This was caused by the following code in dlfcn/dlerror.c: /* We don't use the static buffer and so we have a key. Use it to get the thread-specific buffer. */ result = __libc_getspecific (key); if (result == NULL) { result = (struct dl_action_result *) calloc (1, sizeof (*result)); if (result == NULL) /* We are out of memory. Since this is no really critical situation we carry on by using the global variable. This might lead to conflicts between the threads but they soon all will have memory problems. */ result = &last_result; else /* Set the tsd. */ __libc_setspecific (key, result); The calloc call can cause a loop and overflow the stack. If you use a malloc hook and enable/disable the hook correctly then everything is OK I think. -- Will Newton Toolchain Working Group, Linaro ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: Should glibc be fully reentrant? What do we allow interposed symbols to do? 2014-10-28 12:24 ` Will Newton @ 2014-10-28 13:32 ` Carlos O'Donell 0 siblings, 0 replies; 14+ messages in thread From: Carlos O'Donell @ 2014-10-28 13:32 UTC (permalink / raw) To: Will Newton; +Cc: Roland McGrath, GNU C Library On 10/28/2014 08:24 AM, Will Newton wrote: >> If you can write a test case I will look at it. > > I had mis-remembered the details of the issue. What I did find was > that using LD_PRELOAD to override malloc and calling dlsym can be a > problem. This was caused by the following code in dlfcn/dlerror.c: Thanks for looking! > /* We don't use the static buffer and so we have a key. Use it > to get the thread-specific buffer. */ > result = __libc_getspecific (key); > if (result == NULL) > { > result = (struct dl_action_result *) calloc (1, sizeof (*result)); > if (result == NULL) > /* We are out of memory. Since this is no really critical > situation we carry on by using the global variable. > This might lead to conflicts between the threads but > they soon all will have memory problems. */ > result = &last_result; > else > /* Set the tsd. */ > __libc_setspecific (key, result); > > The calloc call can cause a loop and overflow the stack. If you use a > malloc hook and enable/disable the hook correctly then everything is > OK I think. Right, this is a developer problem. Infinite loops are your own fault. You need to detect them and terminate the recursion with memory from a static buffer or the original malloc. There isn't any other solution I can suggest. Does that make sense? Cheers, Carlos. ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: Should glibc be fully reentrant? What do we allow interposed symbols to do? 2014-10-23 2:29 ` Should glibc be fully reentrant? What do we allow interposed symbols to do? Carlos O'Donell 2014-10-23 8:23 ` Will Newton @ 2014-10-24 9:31 ` Torvald Riegel 2014-10-24 15:25 ` Carlos O'Donell 1 sibling, 1 reply; 14+ messages in thread From: Torvald Riegel @ 2014-10-24 9:31 UTC (permalink / raw) To: Carlos O'Donell; +Cc: Roland McGrath, GNU C Library On Wed, 2014-10-22 at 22:29 -0400, Carlos O'Donell wrote: > On 10/21/2014 06:47 PM, Roland McGrath wrote: > > But the slippery slope concerns me most of all. > > Any function the user interposes acts as a synchronous interrupt on the > runtime. Note that is simpler because we have no parallelism in glibc right now. And we may never have a need for this because we don't do big chunks of work, but it's worth considering, I think. Once you go parallel, "synchronous interrupt" constricts the implementation to a larger degree than when this applies just to a single thread that doesn't change. > It is my opinion that users expect to be able to call any routine in the > runtime without caution unless we tell them otherwise. > > Given that dlopen locks are recursive, as are stdio locks, I propose we > fully support this notion that users already believe exists. Well, what this tells us is that these two things do not disallow reentrancy on all the resources they protect with locks. But that doesn't quite tell us that they can deal with reentrancy in all situations, right? > The alternative is that we don't support it and treat interposed functions > as if they were in a signal handler context, only being allowed to call > async-signal-safe functions, and we might as well remove the recursive > support from the locks such that users get useful backtraces from deadlocks. > It is my opinion that such a direction would not help our users and would > not help the project. > > The similar situations we need to clarify are LD_AUDIT modules, and > IFUNC resolvers, but let us proceed orderly one topic at a time. > > In summary > ========== > > Allow interposed functions to call back into the runtime, and fix any > places where this breaks. As worded, I do not agree. IMHO, this should be opt-in. It may very well be that we'd want an opt-in for large sets of functionality, but I'd be concerned about promising support for this *in general*. ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: Should glibc be fully reentrant? What do we allow interposed symbols to do? 2014-10-24 9:31 ` Torvald Riegel @ 2014-10-24 15:25 ` Carlos O'Donell 2014-10-28 17:07 ` Torvald Riegel 0 siblings, 1 reply; 14+ messages in thread From: Carlos O'Donell @ 2014-10-24 15:25 UTC (permalink / raw) To: Torvald Riegel; +Cc: Roland McGrath, GNU C Library On 10/24/2014 05:31 AM, Torvald Riegel wrote: >> Allow interposed functions to call back into the runtime, and fix any >> places where this breaks. > > As worded, I do not agree. IMHO, this should be opt-in. It may very > well be that we'd want an opt-in for large sets of functionality, but > I'd be concerned about promising support for this *in general*. I concede. You make a good point. How do we opt-in? Add a new safety annotation? Re-entrant? Cheers, Carlos. ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: Should glibc be fully reentrant? What do we allow interposed symbols to do? 2014-10-24 15:25 ` Carlos O'Donell @ 2014-10-28 17:07 ` Torvald Riegel 0 siblings, 0 replies; 14+ messages in thread From: Torvald Riegel @ 2014-10-28 17:07 UTC (permalink / raw) To: Carlos O'Donell; +Cc: Roland McGrath, GNU C Library On Fri, 2014-10-24 at 11:25 -0400, Carlos O'Donell wrote: > On 10/24/2014 05:31 AM, Torvald Riegel wrote: > >> Allow interposed functions to call back into the runtime, and fix any > >> places where this breaks. > > > > As worded, I do not agree. IMHO, this should be opt-in. It may very > > well be that we'd want an opt-in for large sets of functionality, but > > I'd be concerned about promising support for this *in general*. > > I concede. You make a good point. > > How do we opt-in? Add a new safety annotation? Re-entrant? I don't know. An annotation would be part of it I guess, but I'm not sure whether we'd put it on the interposed function, or on the function that you are allowed to call from the hooks. ^ permalink raw reply [flat|nested] 14+ messages in thread
end of thread, other threads:[~2014-10-28 17:07 UTC | newest] Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2014-10-20 20:25 Why does elf/dl-load.c have local_strdup? Carlos O'Donell 2014-10-20 20:41 ` Roland McGrath 2014-10-20 21:19 ` Carlos O'Donell 2014-10-21 22:47 ` Roland McGrath 2014-10-23 2:29 ` Should glibc be fully reentrant? What do we allow interposed symbols to do? Carlos O'Donell 2014-10-23 8:23 ` Will Newton 2014-10-23 14:40 ` Carlos O'Donell 2014-10-23 16:40 ` Will Newton 2014-10-23 17:10 ` Carlos O'Donell 2014-10-28 12:24 ` Will Newton 2014-10-28 13:32 ` Carlos O'Donell 2014-10-24 9:31 ` Torvald Riegel 2014-10-24 15:25 ` Carlos O'Donell 2014-10-28 17:07 ` Torvald Riegel
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).