* nptl/tst-stack4 failure
From: Florian Weimer @ 2016-11-16 15:39 UTC
To: GNU C Library
The nptl/tst-stack4 test fails reliably on several architectures for me
(ppc, ppc64, aarch64). I'm trying to figure out what's going on.
Florian
* Re: nptl/tst-stack4 failure
From: Siddhesh Poyarekar @ 2016-11-16 15:52 UTC
To: Florian Weimer, GNU C Library
On Wednesday 16 November 2016 09:09 PM, Florian Weimer wrote:
> The nptl/tst-stack4 test fails reliably on several architectures for me
> (ppc, ppc64, aarch64). I'm trying to figure out what's going on.
I saw this yesterday and bisected it to this commit:
=================
commit 17af5da98cd2c9ec958421ae2108f877e0945451
Author: Alexandre Oliva <aoliva@redhat.com>
Date: Wed Sep 21 22:01:16 2016 -0300
[PR19826] fix non-LE TLS in static programs
An earlier fix for TLS dropped early initialization of DTV entries for
modules using static TLS, leaving it for __tls_get_addr to set them
up. That worked on platforms that require the GD access model to be
relaxed to LE in the main executable, but it caused a regression on
platforms that allow GD in the main executable, particularly in
statically-linked programs: they use a custom __tls_get_addr that does
not update the DTV, which fails when the DTV early initialization is
not performed.
In static programs, __libc_setup_tls performs the DTV initialization
for the main thread, but the DTV of other threads is set up in
_dl_allocate_tls_init, so that's the fix that matters.
Restoring the initialization in the remaining functions modified by
this patch was just for uniformity. It's not clear that it is ever
needed: even on platforms that allow GD in the main executable, the
dynamically-linked version of __tls_get_addr would set up the DTV
entries, even for static TLS modules, while updating the DTV counter.
=================
I had planned to look at it later in the week but if you have the time
then please feel free to pick it up.
Siddhesh
* Re: nptl/tst-stack4 failure
From: Florian Weimer @ 2016-11-16 16:48 UTC
To: Siddhesh Poyarekar, GNU C Library
On 11/16/2016 04:52 PM, Siddhesh Poyarekar wrote:
> On Wednesday 16 November 2016 09:09 PM, Florian Weimer wrote:
>> The nptl/tst-stack4 test fails reliably on several architectures for me
>> (ppc, ppc64, aarch64). I'm trying to figure out what's going on.
>
> I saw this yesterday and bisected it to this commit:
>
> =================
> commit 17af5da98cd2c9ec958421ae2108f877e0945451
> Author: Alexandre Oliva <aoliva@redhat.com>
> Date: Wed Sep 21 22:01:16 2016 -0300
>
> [PR19826] fix non-LE TLS in static programs
Thanks, I was approaching this commit as well, I think.
> I had planned to look at it later in the week but if you have the time
> then please feel free to pick it up.
I'm not sure what's going on there. I don't know what the expectations
for this code are. Clearly, __tls_get_addr has to be async-signal-safe,
but it calls update_get_addr and tls_get_addr_tail, and the latter
acquires dl_load_lock.
There is an acquire load on dl_tls_max_dtv_idx in dl-tls.c, but it's
never written in an atomic fashion. It seems that nptl/allocatestack.c
calls into the elf/dl-tls.c code without acquiring any rtld locks.
Considering that we reuse modid slots after a dlclose, this can't be
right, I think.
Florian
* Re: nptl/tst-stack4 failure
From: Szabolcs Nagy @ 2016-11-16 16:58 UTC
To: Florian Weimer, Siddhesh Poyarekar, GNU C Library; +Cc: nd
On 16/11/16 16:48, Florian Weimer wrote:
> On 11/16/2016 04:52 PM, Siddhesh Poyarekar wrote:
>> On Wednesday 16 November 2016 09:09 PM, Florian Weimer wrote:
>>> The nptl/tst-stack4 test fails reliably on several architectures for me
>>> (ppc, ppc64, aarch64). I'm trying to figure out what's going on.
>>
>> I saw this yesterday and bisected it to this commit:
>>
>> =================
>> commit 17af5da98cd2c9ec958421ae2108f877e0945451
>> Author: Alexandre Oliva <aoliva@redhat.com>
>> Date: Wed Sep 21 22:01:16 2016 -0300
>>
>> [PR19826] fix non-LE TLS in static programs
>
> Thanks, I was approaching this commit as well, I think.
>
>> I had planned to look at it later in the week but if you have the time
>> then please feel free to pick it up.
>
> I'm not sure what's going on there. I don't know what the expectations for this code are. Clearly,
> __tls_get_addr has to be async-signal-safe, but it calls update_get_addr and tls_get_addr_tail, and the latter
> acquires dl_load_lock.
>
> There is an acquire load on dl_tls_max_dtv_idx in dl-tls.c, but it's never written in an atomic fashion. It
> seems that nptl/allocatestack.c calls into the elf/dl-tls.c code without acquiring any rtld locks. Considering
> that we reuse modid slots after a dlclose, this can't be right, I think.
>
nptl/tst-stack4 has been failing for me for a long time (but not reliably).
I have a patch for it, but haven't finished the concurrency notes:
https://sourceware.org/bugzilla/show_bug.cgi?id=19329
* Re: nptl/tst-stack4 failure
From: Torvald Riegel @ 2016-11-16 18:05 UTC
To: Florian Weimer; +Cc: Siddhesh Poyarekar, GNU C Library
On Wed, 2016-11-16 at 17:48 +0100, Florian Weimer wrote:
> There is an acquire load on dl_tls_max_dtv_idx in dl-tls.c, but it's
> never written in an atomic fashion.
Agreed that this is a red flag. (That doesn't mean that the acquire
load is wrong, just that there might be something else missing that runs
the matching atomic release MO store or similar.)
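As an illustration of the pairing described above, here is a minimal sketch in C11 atomics. It is not glibc code: `slot_data`, `max_idx`, `publish_slot`, `read_latest` and `run_demo` are hypothetical names, with `max_idx` only playing the role that dl_tls_max_dtv_idx plays in the discussion. The point is that an acquire load only gives ordering guarantees if the writer publishes the index with a matching release store after filling in the data the index refers to.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; not glibc's real data structures. */
#define MAX_SLOTS 8
static int slot_data[MAX_SLOTS];   /* plain, non-atomic payload */
static _Atomic size_t max_idx;     /* plays the role of dl_tls_max_dtv_idx */

/* Writer: fill in the slot first, then publish the index with a
   release store so the payload write is ordered before it.  */
static void publish_slot(size_t idx, int value)
{
  slot_data[idx] = value;
  atomic_store_explicit(&max_idx, idx, memory_order_release);
}

/* Reader: the acquire load pairs with the release store above, so a
   reader that observes the new index also observes the payload.  */
static int read_latest(void)
{
  size_t idx = atomic_load_explicit(&max_idx, memory_order_acquire);
  assert(idx < MAX_SLOTS);
  return slot_data[idx];
}

static void *writer(void *arg)
{
  (void) arg;
  publish_slot(1, 42);
  return NULL;
}

/* Starts one writer thread, waits until index 1 is visible, and
   returns the payload the reader sees (or -1 if the thread could
   not be created).  */
int run_demo(void)
{
  pthread_t t;
  if (pthread_create(&t, NULL, writer, NULL) != 0)
    return -1;
  while (atomic_load_explicit(&max_idx, memory_order_acquire) != 1)
    ;  /* spin until the writer has published index 1 */
  int value = read_latest();
  pthread_join(t, NULL);
  return value;
}
```

If the store in `publish_slot` were a plain assignment instead, the acquire load in the reader would have nothing to synchronize with, which is the kind of mismatch flagged as a red flag here. Calling `run_demo()` from a `main` function and building with thread support is enough to try the sketch.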
* Re: nptl/tst-stack4 failure
From: Florian Weimer @ 2016-11-16 18:36 UTC
To: Szabolcs Nagy, Siddhesh Poyarekar, GNU C Library; +Cc: nd
On 11/16/2016 05:57 PM, Szabolcs Nagy wrote:
> nptl/tst-stack4 has been failing for me for a long time (but not reliably).
> I have a patch for it, but haven't finished the concurrency notes:
>
> https://sourceware.org/bugzilla/show_bug.cgi?id=19329
Right, the patch slipped my mind. I think your original failure had a
different cause, but it looks like you are going to fix everything.
Thanks. :)
Florian