public inbox for libc-alpha@sourceware.org
* nptl/tst-stack4 failure
@ 2016-11-16 15:39 Florian Weimer
  2016-11-16 15:52 ` Siddhesh Poyarekar
  0 siblings, 1 reply; 6+ messages in thread
From: Florian Weimer @ 2016-11-16 15:39 UTC (permalink / raw)
  To: GNU C Library

The nptl/tst-stack4 test fails reliably on several architectures for me 
(ppc, ppc64, aarch64).  I'm trying to figure out what's going on.

Florian

* Re: nptl/tst-stack4 failure
  2016-11-16 15:39 nptl/tst-stack4 failure Florian Weimer
@ 2016-11-16 15:52 ` Siddhesh Poyarekar
  2016-11-16 16:48   ` Florian Weimer
  0 siblings, 1 reply; 6+ messages in thread
From: Siddhesh Poyarekar @ 2016-11-16 15:52 UTC (permalink / raw)
  To: Florian Weimer, GNU C Library

On Wednesday 16 November 2016 09:09 PM, Florian Weimer wrote:
> The nptl/tst-stack4 test fails reliably on several architectures for me
> (ppc, ppc64, aarch64).  I'm trying to figure out what's going on.

I saw this yesterday and bisected it to this commit:

=================
commit 17af5da98cd2c9ec958421ae2108f877e0945451
Author: Alexandre Oliva <aoliva@redhat.com>
Date: Wed Sep 21 22:01:16 2016 -0300

[PR19826] fix non-LE TLS in static programs

An earlier fix for TLS dropped early initialization of DTV entries for
modules using static TLS, leaving it for __tls_get_addr to set them
up. That worked on platforms that require the GD access model to be
relaxed to LE in the main executable, but it caused a regression on
platforms that allow GD in the main executable, particularly in
statically-linked programs: they use a custom __tls_get_addr that does
not update the DTV, which fails when the DTV early initialization is
not performed.

In static programs, __libc_setup_tls performs the DTV initialization
for the main thread, but the DTV of other threads is set up in
_dl_allocate_tls_init, so that's the fix that matters.

Restoring the initialization in the remaining functions modified by
this patch was just for uniformity. It's not clear that it is ever
needed: even on platforms that allow GD in the main executable, the
dynamically-linked version of __tls_get_addr would set up the DTV
entries, even for static TLS modules, while updating the DTV counter.
=================

I had planned to look at it later in the week but if you have the time
then please feel free to pick it up.

Siddhesh

* Re: nptl/tst-stack4 failure
  2016-11-16 15:52 ` Siddhesh Poyarekar
@ 2016-11-16 16:48   ` Florian Weimer
  2016-11-16 16:58     ` Szabolcs Nagy
  2016-11-16 18:05     ` Torvald Riegel
  0 siblings, 2 replies; 6+ messages in thread
From: Florian Weimer @ 2016-11-16 16:48 UTC (permalink / raw)
  To: Siddhesh Poyarekar, GNU C Library

On 11/16/2016 04:52 PM, Siddhesh Poyarekar wrote:
> On Wednesday 16 November 2016 09:09 PM, Florian Weimer wrote:
>> The nptl/tst-stack4 test fails reliably on several architectures for me
>> (ppc, ppc64, aarch64).  I'm trying to figure out what's going on.
>
> I saw this yesterday and bisected it to this commit:
>
> =================
> commit 17af5da98cd2c9ec958421ae2108f877e0945451
> Author: Alexandre Oliva <aoliva@redhat.com>
> Date: Wed Sep 21 22:01:16 2016 -0300
>
> [PR19826] fix non-LE TLS in static programs

Thanks, I was approaching this commit as well, I think.

> I had planned to look at it later in the week but if you have the time
> then please feel free to pick it up.

I'm not sure what's going on there.  I don't know what the expectations 
for this code are.  Clearly, __tls_get_addr has to be async-signal-safe, 
but it calls update_get_addr and tls_get_addr_tail, and the latter 
acquires dl_load_lock.

There is an acquire load on dl_tls_max_dtv_idx in dl-tls.c, but it's 
never written in an atomic fashion.  It seems that nptl/allocatestack.c 
calls into the elf/dl-tls.c code without acquiring any rtld locks. 
Considering that we reuse modid slots after a dlclose, this can't be 
right, I think.

Florian

* Re: nptl/tst-stack4 failure
  2016-11-16 16:48   ` Florian Weimer
@ 2016-11-16 16:58     ` Szabolcs Nagy
  2016-11-16 18:36       ` Florian Weimer
  2016-11-16 18:05     ` Torvald Riegel
  1 sibling, 1 reply; 6+ messages in thread
From: Szabolcs Nagy @ 2016-11-16 16:58 UTC (permalink / raw)
  To: Florian Weimer, Siddhesh Poyarekar, GNU C Library; +Cc: nd

On 16/11/16 16:48, Florian Weimer wrote:
> On 11/16/2016 04:52 PM, Siddhesh Poyarekar wrote:
>> On Wednesday 16 November 2016 09:09 PM, Florian Weimer wrote:
>>> The nptl/tst-stack4 test fails reliably on several architectures for me
>>> (ppc, ppc64, aarch64).  I'm trying to figure out what's going on.
>>
>> I saw this yesterday and bisected it to this commit:
>>
>> =================
>> commit 17af5da98cd2c9ec958421ae2108f877e0945451
>> Author: Alexandre Oliva <aoliva@redhat.com>
>> Date: Wed Sep 21 22:01:16 2016 -0300
>>
>> [PR19826] fix non-LE TLS in static programs
> 
> Thanks, I was approaching this commit as well, I think.
> 
>> I had planned to look at it later in the week but if you have the time
>> then please feel free to pick it up.
> 
> I'm not sure what's going on there.  I don't know what the expectations for this code are.  Clearly,
> __tls_get_addr has to be async-signal-safe, but it calls update_get_addr and tls_get_addr_tail, and the latter
> acquires dl_load_lock.
> 
> There is an acquire load on dl_tls_max_dtv_idx in dl-tls.c, but it's never written in an atomic fashion.  It
> seems that nptl/allocatestack.c calls into the elf/dl-tls.c code without acquiring any rtld locks. Considering
> that we reuse modid slots after a dlclose, this can't be right, I think.
> 


nptl/tst-stack4 has been failing for me for a long time (but not reliably).
I have a patch for it, but haven't finished the concurrency notes:

https://sourceware.org/bugzilla/show_bug.cgi?id=19329

* Re: nptl/tst-stack4 failure
  2016-11-16 16:48   ` Florian Weimer
  2016-11-16 16:58     ` Szabolcs Nagy
@ 2016-11-16 18:05     ` Torvald Riegel
  1 sibling, 0 replies; 6+ messages in thread
From: Torvald Riegel @ 2016-11-16 18:05 UTC (permalink / raw)
  To: Florian Weimer; +Cc: Siddhesh Poyarekar, GNU C Library

On Wed, 2016-11-16 at 17:48 +0100, Florian Weimer wrote:
> There is an acquire load on dl_tls_max_dtv_idx in dl-tls.c, but it's 
> never written in an atomic fashion.

Agreed that this is a red flag.  (That doesn't mean that the acquire
load is wrong, just that there might be something else missing that runs
the matching atomic release MO store or similar.)

* Re: nptl/tst-stack4 failure
  2016-11-16 16:58     ` Szabolcs Nagy
@ 2016-11-16 18:36       ` Florian Weimer
  0 siblings, 0 replies; 6+ messages in thread
From: Florian Weimer @ 2016-11-16 18:36 UTC (permalink / raw)
  To: Szabolcs Nagy, Siddhesh Poyarekar, GNU C Library; +Cc: nd

On 11/16/2016 05:57 PM, Szabolcs Nagy wrote:

> nptl/tst-stack4 has been failing for me for a long time (but not reliably).
> I have a patch for it, but haven't finished the concurrency notes:
>
> https://sourceware.org/bugzilla/show_bug.cgi?id=19329

Right, the patch slipped my mind.  I think your original failure had a 
different cause, but it looks like you are going to fix everything. 
Thanks. :)

Florian
