public inbox for gcc-patches@gcc.gnu.org
* Re: RFC: C++ PATCH to support dynamic initialization and destruction of C++11 and OpenMP TLS variables
@ 2012-10-09 14:08 Dominique Dhumieres
  2012-10-09 14:43 ` Jack Howarth
                   ` (2 more replies)
  0 siblings, 3 replies; 20+ messages in thread
From: Dominique Dhumieres @ 2012-10-09 14:08 UTC (permalink / raw)
  To: gcc-patches; +Cc: jason

On x86_64-apple-darwin10, the following tests:

g++.dg/gomp/tls-5.C
g++.dg/tls/thread_local-cse.C
g++.dg/tls/thread_local-order*.C
g++.dg/tls/thread_local*g.C

fail with

sorry, unimplemented: dynamic initialization of non-function-local thread_local variables not supported on this target

In addition, I see

FAIL: g++.dg/tls/thread_local3.C -std=gnu++11 execution test
FAIL: g++.dg/tls/thread_local4.C -std=gnu++11 execution test
FAIL: g++.dg/tls/thread_local7.C scan-assembler-not \\.data

and

FAIL: g++.dg/tls/static-1.C *

For the latter, the error is

Undefined symbols:
  "TLS init function for A::i", referenced from:
      test()    in ccNTapVf.o
      test()    in ccNTapVf.o
      __ZTHN1A1iE$non_lazy_ptr in ccNTapVf.o

TIA

Dominique

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: RFC: C++ PATCH to support dynamic initialization and destruction of C++11 and OpenMP TLS variables
  2012-10-09 14:08 RFC: C++ PATCH to support dynamic initialization and destruction of C++11 and OpenMP TLS variables Dominique Dhumieres
@ 2012-10-09 14:43 ` Jack Howarth
  2012-10-09 15:28   ` Jason Merrill
  2012-10-10 15:01 ` Rainer Orth
  2012-10-15 20:25 ` Richard Sandiford
  2 siblings, 1 reply; 20+ messages in thread
From: Jack Howarth @ 2012-10-09 14:43 UTC (permalink / raw)
  To: Dominique Dhumieres; +Cc: gcc-patches, jason

On Tue, Oct 09, 2012 at 04:07:51PM +0200, Dominique Dhumieres wrote:
> On x86_64-apple-darwin10 The following tests:
> 
> g++.dg/gomp/tls-5.C
> g++.dg/tls/thread_local-cse.C
> g++.dg/tls/thread_local-order*.C
> g++.dg/tls/thread_local*g.C
> 
> fail with
> 
> sorry, unimplemented: dynamic initialization of non-function-local thread_local variables not supported on this target
> 
> In addition, I see
> 
> FAIL: g++.dg/tls/thread_local3.C -std=gnu++11 execution test
> FAIL: g++.dg/tls/thread_local4.C -std=gnu++11 execution test
> FAIL: g++.dg/tls/thread_local7.C scan-assembler-not \\.data
> 
> and
> 
> FAIL: g++.dg/tls/static-1.C *
> 
> for the latter, the error is
> 
> Undefined symbols:
>   "TLS init function for A::i", referenced from:
>       test()    in ccNTapVf.o
>       test()    in ccNTapVf.o
>       __ZTHN1A1iE$non_lazy_ptr in ccNTapVf.o
> 
> TIA
> 
> Dominique

 This patch was probably never tested on any target that uses emutls for TLS support.
I assume there is a way to switch the standard Linux build from native TLS to emutls so this
regression can be tested on targets other than Darwin. Perhaps with gcc_cv_use_emutls?
             Jack


* Re: RFC: C++ PATCH to support dynamic initialization and destruction of C++11 and OpenMP TLS variables
  2012-10-09 14:43 ` Jack Howarth
@ 2012-10-09 15:28   ` Jason Merrill
  2012-10-09 16:28     ` Dominique Dhumieres
                       ` (2 more replies)
  0 siblings, 3 replies; 20+ messages in thread
From: Jason Merrill @ 2012-10-09 15:28 UTC (permalink / raw)
  To: Jack Howarth; +Cc: Dominique Dhumieres, gcc-patches

On 10/09/2012 10:43 AM, Jack Howarth wrote:
> On Tue, Oct 09, 2012 at 04:07:51PM +0200, Dominique Dhumieres wrote:
>> On x86_64-apple-darwin10 The following tests:
>>
>> g++.dg/gomp/tls-5.C
>> g++.dg/tls/thread_local-cse.C
>> g++.dg/tls/thread_local-order*.C
>> g++.dg/tls/thread_local*g.C
>>
>> fail with
>>
>> sorry, unimplemented: dynamic initialization of non-function-local thread_local variables not supported on this target

These don't work because of the lack of alias support; that's why I put 
dg-require-alias in the tests.  Do I need a different magic incantation?

>> In addition, I see
>>
>> FAIL: g++.dg/tls/thread_local3.C -std=gnu++11 execution test
>> FAIL: g++.dg/tls/thread_local4.C -std=gnu++11 execution test

These ought to work.  Can you debug the problem?

>> FAIL: g++.dg/tls/thread_local7.C scan-assembler-not \\.data
>> FAIL: g++.dg/tls/static-1.C *

I'll take a look at these.

Jason


* Re: RFC: C++ PATCH to support dynamic initialization and destruction of C++11 and OpenMP TLS variables
  2012-10-09 15:28   ` Jason Merrill
@ 2012-10-09 16:28     ` Dominique Dhumieres
  2012-10-09 20:43     ` Dominique Dhumieres
  2012-10-11 15:23     ` Jason Merrill
  2 siblings, 0 replies; 20+ messages in thread
From: Dominique Dhumieres @ 2012-10-09 16:28 UTC (permalink / raw)
  To: jason, howarth; +Cc: gcc-patches, dominiq

> These don't work because of the lack of alias support; that's why I put
> dg-require-alias in the tests.  Do I need a different magic incantation?

I don't understand the alias/weak machinery, but judging from PR52945, if you need
a weak alias, then you also have to use

/* { dg-require-weak "" } */
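
For illustration, a test that needs both alias and weak-symbol support could carry both directives in its header. The file below is a hypothetical sketch, not one of the tests in g++.dg/tls; the names `A`, `a`, `init_count` and `use_a` are made up, and the exact set of directives the real tests need is an assumption.

```cpp
// Hypothetical test file, not taken from the GCC testsuite.
// { dg-do compile }
// { dg-options "-std=c++11" }
// { dg-require-alias "" }
// { dg-require-weak "" }

// Counts constructor runs so a caller can observe the initialization.
int init_count = 0;

struct A {
  int v;
  A() : v(42) { ++init_count; }  // non-constexpr constructor with a side effect
};

// Namespace-scope thread_local with a dynamic initializer: the case that
// produces the "sorry, unimplemented" message on targets without support.
thread_local A a;

// First use in a thread triggers the initialization of `a` for that thread.
int use_a() { return a.v; }
```

With these directives the harness should report the test as unsupported, rather than failed, on targets that lack either feature.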

Dominique


* Re: RFC: C++ PATCH to support dynamic initialization and destruction of C++11 and OpenMP TLS variables
  2012-10-09 15:28   ` Jason Merrill
  2012-10-09 16:28     ` Dominique Dhumieres
@ 2012-10-09 20:43     ` Dominique Dhumieres
  2012-10-10  1:16       ` Jason Merrill
  2012-10-11 15:23     ` Jason Merrill
  2 siblings, 1 reply; 20+ messages in thread
From: Dominique Dhumieres @ 2012-10-09 20:43 UTC (permalink / raw)
  To: jason, howarth; +Cc: gcc-patches, dominiq

> >> FAIL: g++.dg/tls/thread_local3.C -std=gnu++11 execution test
> >> FAIL: g++.dg/tls/thread_local4.C -std=gnu++11 execution test
>
> These ought to work.  Can you debug the problem?

Backtrace for thread_local4.C

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x1503 of process 36991]
0x0000000000000000 in ?? ()
(gdb) bt
#0  0x0000000000000000 in ?? ()
#1  0x0000000100005424 in __mini_vector<std::pair<__gnu_cxx::bitmap_allocator<wchar_t>::_Alloc_block*, __gnu_cxx::bitmap_allocator<wchar_t>::_Alloc_block*> >::insert (this=<value optimized out>, __pos=<value optimized out>, __x=<value optimized out>)
    at /opt/gcc/p_build/x86_64-apple-darwin10.8.0/libstdc++-v3/include/ext/bitmap_allocator.h:158
#2  0x0000000000000001 in ?? ()
#3  0x0000000100381000 in ?? ()
#4  0x0000000100000cea in f () at /opt/gcc/work/gcc/testsuite/g++.dg/tls/thread_local4.C:23
#5  0x0000000100380ed0 in ?? ()
#6  0x00007fff8297e39c in _pthread_exit () from /usr/lib/libSystem.B.dylib
Backtrace stopped: previous frame inner to this frame (corrupt stack?)

valgrind gives

--36994:0:schedule VG_(sema_down): read returned -4
==36994== Thread 2:
==36994== Invalid read of size 4
==36994==    at 0x100021400: (anonymous namespace)::list::run() (in /opt/gcc/gcc4.8p-192219/lib/libstdc++.6.dylib)
==36994==    by 0xFF: ???
==36994==    by 0x100: ???
==36994==    by 0x10084DD9F: ???
==36994==    by 0x10018C6E7: _pthread_tsd_cleanup (in /usr/lib/libSystem.B.dylib)
==36994==  Address 0x1003cd2e0 is 16 bytes inside a block of size 536 free'd
==36994==    at 0x10001252D: free (vg_replace_malloc.c:430)
==36994==    by 0x1003B5CB2: emutls_destroy (in /opt/gcc/gcc4.8p-192219/lib/libgcc_s.1.dylib)
==36994==    by 0xFF: ???
==36994==    by 0x10084DD9F: ???
==36994==    by 0x10084DFFF: ???
==36994==    by 0x10018C6E7: _pthread_tsd_cleanup (in /usr/lib/libSystem.B.dylib)
==36994== 
==36994== Invalid read of size 8
==36994==    at 0x10002141D: (anonymous namespace)::list::run() (in /opt/gcc/gcc4.8p-192219/lib/libstdc++.6.dylib)
==36994==    by 0xFF: ???
==36994==    by 0x100: ???
==36994==    by 0x10084DD9F: ???
==36994==    by 0x10018C6E7: _pthread_tsd_cleanup (in /usr/lib/libSystem.B.dylib)
==36994==  Address 0x1003cd2e8 is 24 bytes inside a block of size 536 free'd
==36994==    at 0x10001252D: free (vg_replace_malloc.c:430)
==36994==    by 0x1003B5CB2: emutls_destroy (in /opt/gcc/gcc4.8p-192219/lib/libgcc_s.1.dylib)
==36994==    by 0xFF: ???
==36994==    by 0x10084DD9F: ???
==36994==    by 0x10084DFFF: ???
==36994==    by 0x10018C6E7: _pthread_tsd_cleanup (in /usr/lib/libSystem.B.dylib)
==36994== 
==36994== Invalid read of size 8
==36994==    at 0x100021421: (anonymous namespace)::list::run() (in /opt/gcc/gcc4.8p-192219/lib/libstdc++.6.dylib)
==36994==    by 0xFF: ???
==36994==    by 0x100: ???
==36994==    by 0x10084DD9F: ???
==36994==    by 0x10018C6E7: _pthread_tsd_cleanup (in /usr/lib/libSystem.B.dylib)
==36994==  Address 0x1003cd2f0 is 32 bytes inside a block of size 536 free'd
==36994==    at 0x10001252D: free (vg_replace_malloc.c:430)
==36994==    by 0x1003B5CB2: emutls_destroy (in /opt/gcc/gcc4.8p-192219/lib/libgcc_s.1.dylib)
==36994==    by 0xFF: ???
==36994==    by 0x10084DD9F: ???
==36994==    by 0x10084DFFF: ???
==36994==    by 0x10018C6E7: _pthread_tsd_cleanup (in /usr/lib/libSystem.B.dylib)
==36994== 
==36994== Invalid read of size 8
==36994==    at 0x100021429: (anonymous namespace)::list::run() (in /opt/gcc/gcc4.8p-192219/lib/libstdc++.6.dylib)
==36994==    by 0xFF: ???
==36994==    by 0x100: ???
==36994==    by 0x10084DD9F: ???
==36994==    by 0x10018C6E7: _pthread_tsd_cleanup (in /usr/lib/libSystem.B.dylib)
==36994==  Address 0x1003cd2d8 is 8 bytes inside a block of size 536 free'd
==36994==    at 0x10001252D: free (vg_replace_malloc.c:430)
==36994==    by 0x1003B5CB2: emutls_destroy (in /opt/gcc/gcc4.8p-192219/lib/libgcc_s.1.dylib)
==36994==    by 0xFF: ???
==36994==    by 0x10084DD9F: ???
==36994==    by 0x10084DFFF: ???
==36994==    by 0x10018C6E7: _pthread_tsd_cleanup (in /usr/lib/libSystem.B.dylib)
==36994== 
--36994:0:schedule VG_(sema_down): read returned -4
==36994== Invalid read of size 4
==36994==    at 0x100021400: (anonymous namespace)::list::run() (in /opt/gcc/gcc4.8p-192219/lib/libstdc++.6.dylib)
==36994==    by 0xFF: ???
==36994==    by 0x100: ???
==36994==    by 0x1008CED9F: ???
==36994==    by 0x10018C6E7: _pthread_tsd_cleanup (in /usr/lib/libSystem.B.dylib)
==36994==  Address 0x1003cd800 is 16 bytes inside a block of size 536 free'd
==36994==    at 0x10001252D: free (vg_replace_malloc.c:430)
==36994==    by 0x1003B5CB2: emutls_destroy (in /opt/gcc/gcc4.8p-192219/lib/libgcc_s.1.dylib)
==36994==    by 0xFF: ???
==36994==    by 0x1008CED9F: ???
==36994==    by 0x1008CEFFF: ???
==36994==    by 0x10018C6E7: _pthread_tsd_cleanup (in /usr/lib/libSystem.B.dylib)
==36994== 
==36994== Invalid read of size 8
==36994==    at 0x10002141D: (anonymous namespace)::list::run() (in /opt/gcc/gcc4.8p-192219/lib/libstdc++.6.dylib)
==36994==    by 0xFF: ???
==36994==    by 0x100: ???
==36994==    by 0x1008CED9F: ???
==36994==    by 0x10018C6E7: _pthread_tsd_cleanup (in /usr/lib/libSystem.B.dylib)
==36994==  Address 0x1003cd808 is 24 bytes inside a block of size 536 free'd
==36994==    at 0x10001252D: free (vg_replace_malloc.c:430)
==36994==    by 0x1003B5CB2: emutls_destroy (in /opt/gcc/gcc4.8p-192219/lib/libgcc_s.1.dylib)
==36994==    by 0xFF: ???
==36994==    by 0x1008CED9F: ???
==36994==    by 0x1008CEFFF: ???
==36994==    by 0x10018C6E7: _pthread_tsd_cleanup (in /usr/lib/libSystem.B.dylib)
==36994== 
==36994== Invalid read of size 8
==36994==    at 0x100021421: (anonymous namespace)::list::run() (in /opt/gcc/gcc4.8p-192219/lib/libstdc++.6.dylib)
==36994==    by 0xFF: ???
==36994==    by 0x100: ???
==36994==    by 0x1008CED9F: ???
==36994==    by 0x10018C6E7: _pthread_tsd_cleanup (in /usr/lib/libSystem.B.dylib)
==36994==  Address 0x1003cd810 is 32 bytes inside a block of size 536 free'd
==36994==    at 0x10001252D: free (vg_replace_malloc.c:430)
==36994==    by 0x1003B5CB2: emutls_destroy (in /opt/gcc/gcc4.8p-192219/lib/libgcc_s.1.dylib)
==36994==    by 0xFF: ???
==36994==    by 0x1008CED9F: ???
==36994==    by 0x1008CEFFF: ???
==36994==    by 0x10018C6E7: _pthread_tsd_cleanup (in /usr/lib/libSystem.B.dylib)
==36994== 
==36994== Invalid read of size 8
==36994==    at 0x100021429: (anonymous namespace)::list::run() (in /opt/gcc/gcc4.8p-192219/lib/libstdc++.6.dylib)
==36994==    by 0xFF: ???
==36994==    by 0x100: ???
==36994==    by 0x1008CED9F: ???
==36994==    by 0x10018C6E7: _pthread_tsd_cleanup (in /usr/lib/libSystem.B.dylib)
==36994==  Address 0x1003cd7f8 is 8 bytes inside a block of size 536 free'd
==36994==    at 0x10001252D: free (vg_replace_malloc.c:430)
==36994==    by 0x1003B5CB2: emutls_destroy (in /opt/gcc/gcc4.8p-192219/lib/libgcc_s.1.dylib)
==36994==    by 0xFF: ???
==36994==    by 0x1008CED9F: ???
==36994==    by 0x1008CEFFF: ???
==36994==    by 0x10018C6E7: _pthread_tsd_cleanup (in /usr/lib/libSystem.B.dylib)
==36994== 
==36994== 
==36994== HEAP SUMMARY:
==36994==     in use at exit: 976 bytes in 4 blocks
==36994==   total heap usage: 14 allocs, 10 frees, 2,674 bytes allocated
==36994== 
==36994== LEAK SUMMARY:
==36994==    definitely lost: 0 bytes in 0 blocks
==36994==    indirectly lost: 0 bytes in 0 blocks
==36994==      possibly lost: 536 bytes in 1 blocks
==36994==    still reachable: 352 bytes in 2 blocks
==36994==         suppressed: 88 bytes in 1 blocks
==36994== Rerun with --leak-check=full to see details of leaked memory
==36994== 
==36994== For counts of detected and suppressed errors, rerun with: -v
==36994== ERROR SUMMARY: 8 errors from 8 contexts (suppressed: 0 from 0)

Dominique


* Re: RFC: C++ PATCH to support dynamic initialization and destruction of C++11 and OpenMP TLS variables
  2012-10-09 20:43     ` Dominique Dhumieres
@ 2012-10-10  1:16       ` Jason Merrill
  2012-10-10 13:27         ` Jack Howarth
  0 siblings, 1 reply; 20+ messages in thread
From: Jason Merrill @ 2012-10-10  1:16 UTC (permalink / raw)
  To: Dominique Dhumieres; +Cc: howarth, gcc-patches

On 10/09/2012 04:36 PM, Dominique Dhumieres wrote:
> ==36994==  Address 0x1003cd2e0 is 16 bytes inside a block of size 536 free'd
> ==36994==    at 0x10001252D: free (vg_replace_malloc.c:430)
> ==36994==    by 0x1003B5CB2: emutls_destroy (in /opt/gcc/gcc4.8p-192219/lib/libgcc_s.1.dylib)

Aha.  So the problem is that we're destroying the TLS data from one 
pthread key destructor, and then a later destructor tries to use it. 
Hmm, that's awkward.  And surprising, since we do TLS lookup before 
creating the key for the atexit list, so the emutls_destroy destructor 
should have been registered sooner, and so run later.  Does the Darwin 
pthread_tsd_cleanup not run destructors in reverse order of registration?
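
The ordering question can be probed directly. POSIX leaves the order of key destructor calls unspecified when a thread exits with several non-null values, so reverse registration order is not something portable code can count on. A minimal sketch that records the order observed on a given platform follows; all names in it (`observe_destructor_order`, `key_first`, and so on) are hypothetical and do not come from libstdc++ or libgcc.

```cpp
#include <pthread.h>
#include <vector>

// Order in which the key destructors ran on the exiting thread.
// Only the worker thread appends; the caller reads after pthread_join.
static std::vector<int> g_order;

static void dtor_first(void*)  { g_order.push_back(1); }
static void dtor_second(void*) { g_order.push_back(2); }

static pthread_key_t key_first, key_second;

static void* worker(void*) {
  // A destructor only runs for a key whose value is non-null at thread exit.
  static int dummy;
  pthread_setspecific(key_first, &dummy);
  pthread_setspecific(key_second, &dummy);
  return nullptr;
}

// Creates two keys in a known order, runs one thread to completion, and
// returns the order in which the two destructors were called.
std::vector<int> observe_destructor_order() {
  g_order.clear();
  pthread_key_create(&key_first, dtor_first);    // registered first
  pthread_key_create(&key_second, dtor_second);  // registered second
  pthread_t t;
  pthread_create(&t, nullptr, worker, nullptr);
  pthread_join(t, nullptr);  // destructors have finished once join returns
  pthread_key_delete(key_first);
  pthread_key_delete(key_second);
  return g_order;
}
```

If the result is `{1, 2}` on a platform, destructors there run in registration order, which would match the use-after-free that valgrind reports above.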

Jason


* Re: RFC: C++ PATCH to support dynamic initialization and destruction of C++11 and OpenMP TLS variables
  2012-10-10  1:16       ` Jason Merrill
@ 2012-10-10 13:27         ` Jack Howarth
  2012-10-10 14:54           ` Rainer Orth
  0 siblings, 1 reply; 20+ messages in thread
From: Jack Howarth @ 2012-10-10 13:27 UTC (permalink / raw)
  To: Jason Merrill; +Cc: Dominique Dhumieres, gcc-patches, ro

On Tue, Oct 09, 2012 at 09:13:06PM -0400, Jason Merrill wrote:
> On 10/09/2012 04:36 PM, Dominique Dhumieres wrote:
>> ==36994==  Address 0x1003cd2e0 is 16 bytes inside a block of size 536 free'd
>> ==36994==    at 0x10001252D: free (vg_replace_malloc.c:430)
>> ==36994==    by 0x1003B5CB2: emutls_destroy (in /opt/gcc/gcc4.8p-192219/lib/libgcc_s.1.dylib)
>
> Aha.  So the problem is that we're destroying the TLS data from one  
> pthread key destructor, and then a later destructor tries to use it.  
> Hmm, that's awkward.  And surprising, since we do TLS lookup before  
> creating the key for the atexit list, so the emutls_destroy destructor  
> should have been registered sooner, and so run later.  Does the Darwin  
> pthread_tsd_cleanup not run destructors in reverse order of registration?

Jason,
   Have you tried a gcc trunk build on Linux configured to use emutls instead
of native TLS to confirm that this issue is really Darwin-specific? These failures might
also appear on sparc-sun-solaris2.9, but we don't have recent gcc trunk test results
for that target. Perhaps Rainer can try a build of current gcc trunk and see whether it
affects sparc-sun-solaris2.9's use of emutls as well.
           Jack

>
> Jason


* Re: RFC: C++ PATCH to support dynamic initialization and destruction of C++11 and OpenMP TLS variables
  2012-10-10 13:27         ` Jack Howarth
@ 2012-10-10 14:54           ` Rainer Orth
  2012-10-10 20:20             ` Jack Howarth
  0 siblings, 1 reply; 20+ messages in thread
From: Rainer Orth @ 2012-10-10 14:54 UTC (permalink / raw)
  To: Jack Howarth; +Cc: Jason Merrill, Dominique Dhumieres, gcc-patches

Jack Howarth <howarth@bromo.med.uc.edu> writes:

>    Have you tried a gcc trunk build on linux configured to use emutls instead
> of tls to confirm that this issue is really darwin-specific? These failures might
> also appear on sparc-sun-solaris2.9 but we don't have recent gcc trunk testresults
> for that. Perhaps Rainer can try a build of current gcc trunk and see if it
> impacts sparc-sun-solaris2.9's use of emutls as well.
>            Jack

There's no reason to test on an emutls-only target, just configure with
--disable-tls on any target.  I probably won't be able to test Solaris 9
(SPARC or x86) before the weekend.

	Rainer

-- 
-----------------------------------------------------------------------------
Rainer Orth, Center for Biotechnology, Bielefeld University


* Re: RFC: C++ PATCH to support dynamic initialization and destruction of C++11 and OpenMP TLS variables
  2012-10-09 14:08 RFC: C++ PATCH to support dynamic initialization and destruction of C++11 and OpenMP TLS variables Dominique Dhumieres
  2012-10-09 14:43 ` Jack Howarth
@ 2012-10-10 15:01 ` Rainer Orth
  2012-10-15 20:25 ` Richard Sandiford
  2 siblings, 0 replies; 20+ messages in thread
From: Rainer Orth @ 2012-10-10 15:01 UTC (permalink / raw)
  To: Dominique Dhumieres; +Cc: gcc-patches, jason

dominiq@lps.ens.fr (Dominique Dhumieres) writes:

> On x86_64-apple-darwin10 The following tests:
>
> g++.dg/gomp/tls-5.C
> g++.dg/tls/thread_local-cse.C
> g++.dg/tls/thread_local-order*.C
> g++.dg/tls/thread_local*g.C
>
> fail with
>
> sorry, unimplemented: dynamic initialization of non-function-local thread_local variables not supported on this target

On i386-pc-solaris2.10 (with native TLS), I see

FAIL: g++.dg/tls/thread_local-order1.C execution test
FAIL: g++.dg/tls/thread_local3g.C -std=gnu++11 execution test
FAIL: g++.dg/tls/thread_local4g.C -std=gnu++11 execution test
FAIL: g++.dg/tls/thread_local5g.C -std=gnu++11 execution test
FAIL: g++.dg/tls/thread_local6g.C execution test

thread_local-order1.exe aborts like this, with i = 0, c = 2:

Program received signal SIGABRT, Aborted.
[Switching to Thread 1 (LWP 1)]
0xfed5c5f5 in _lwp_kill () from /lib/libc.so.1
(gdb) where
#0  0xfed5c5f5 in _lwp_kill () from /lib/libc.so.1
#1  0xfed57485 in thr_kill () from /lib/libc.so.1
#2  0xfed0366f in raise () from /lib/libc.so.1
#3  0xfece2971 in abort () from /lib/libc.so.1
#4  0x08050f5d in A::~A (this=0x8061340 <a0>, __in_chrg=<optimized out>)
    at /vol/gcc/src/hg/trunk/solaris/gcc/testsuite/g++.dg/tls/thread_local-order1.C:14
#5  0x08050e6a in __static_initialization_and_destruction_0 (__initialize_p=0, 
    __priority=65535)
    at /vol/gcc/src/hg/trunk/solaris/gcc/testsuite/g++.dg/tls/thread_local-order1.C:17
#6  0x08050f01 in _GLOBAL__sub_D_c ()
    at /vol/gcc/src/hg/trunk/solaris/gcc/testsuite/g++.dg/tls/thread_local-order1.C:25
#7  0x08050d50 in __do_global_dtors_aux ()
#8  0x08050ff5 in _fini ()
#9  0xfece30bd in _exithandle () from /lib/libc.so.1
#10 0xfecd5662 in exit () from /lib/libc.so.1
#11 0x08050bda in _start ()

I cannot run the others under gdb 4.5, though:

Starting program: /var/gcc/gcc-4.8.0-20121010/10-gcc-gas/gcc/testsuite/g++/thread_local4g.exe 
[Thread debugging using libthread_db enabled]
[New Thread 1 (LWP 1)]
[New LWP    2        ]
[LWP    2         exited]
[New Thread 2        ]
thread_to_lwp: td_ta_map_id2thr Debugger service failed

	Rainer

-- 
-----------------------------------------------------------------------------
Rainer Orth, Center for Biotechnology, Bielefeld University


* Re: RFC: C++ PATCH to support dynamic initialization and destruction of C++11 and OpenMP TLS variables
  2012-10-10 14:54           ` Rainer Orth
@ 2012-10-10 20:20             ` Jack Howarth
  2012-10-10 20:25               ` Jack Howarth
  0 siblings, 1 reply; 20+ messages in thread
From: Jack Howarth @ 2012-10-10 20:20 UTC (permalink / raw)
  To: Rainer Orth; +Cc: Jason Merrill, Dominique Dhumieres, gcc-patches

On Wed, Oct 10, 2012 at 04:50:22PM +0200, Rainer Orth wrote:
> Jack Howarth <howarth@bromo.med.uc.edu> writes:
> 
> >    Have you tried a gcc trunk build on linux configured to use emutls instead
> > of tls to confirm that this issue is really darwin-specific? These failures might
> > also appear on sparc-sun-solaris2.9 but we don't have recent gcc trunk testresults
> > for that. Perhaps Rainer can try a build of current gcc trunk and see if it
> > impacts sparc-sun-solaris2.9's use of emutls as well.
> >            Jack
> 
> There's no reason to test on a emutls-only target, just configure with
> --disable-tls on any target.  I probably won't be able to test Solaris 9
> (SPARC or x86) before the weekend.
> 
> 	Rainer

A quick build of gcc trunk (r192324) on Fedora 15 x86_64 using --disable-tls shows...

Native configuration is x86_64-unknown-linux-gnu

		=== gcc tests ===


Running target unix

		=== gcc Summary ===

# of expected passes		499
# of unsupported tests		5
/home/howarth/work-gcc/gcc/xgcc  version 4.8.0 20121010 (experimental) (GCC) 

		=== g++ tests ===


Running target unix
FAIL: g++.dg/tls/thread_local3.C -std=gnu++11 execution test
FAIL: g++.dg/tls/thread_local3g.C -std=gnu++11 execution test
FAIL: g++.dg/tls/thread_local4.C -std=gnu++11 execution test
FAIL: g++.dg/tls/thread_local4g.C -std=gnu++11 execution test
FAIL: g++.dg/tls/thread_local5.C -std=gnu++11 execution test
FAIL: g++.dg/tls/thread_local5g.C -std=gnu++11 execution test
FAIL: g++.dg/tls/thread_local7.C scan-assembler-not \\\\.data
FAIL: g++.dg/tls/thread_local7g.C scan-assembler-not \\\\.data

		=== g++ Summary ===

# of expected passes		105
# of unexpected failures	8
# of expected failures		2
# of unsupported tests		6
/home/howarth/work-gcc/gcc/testsuite/g++/../../g++  version 4.8.0 20121010 (experimental) (GCC) 


Compiler version: 4.8.0 20121010 (experimental) (GCC) 
Platform: x86_64-unknown-linux-gnu
configure flags: --disable-ppl --disable-cloog --prefix=/home/howarth/dist-gcc --enable-languages=c,c++ --disable-multilib --disable-lto --disable-tls --disable-bootstrap

The first failure backtraces as...

$ gdb ./thread_local3.exe
GNU gdb (GDB) Fedora (7.3.1-48.fc15)
Copyright (C) 2011 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from /home/howarth/work-gcc/gcc/testsuite/g++/thread_local3.exe...(no debugging symbols found)...done.
(gdb) r
Starting program: /home/howarth/work-gcc/gcc/testsuite/g++/thread_local3.exe 
[Thread debugging using libthread_db enabled]
[New Thread 0x7ffff7aaa700 (LWP 27463)]

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7ffff7aaa700 (LWP 27463)]
0x0000000000000000 in ?? ()
(gdb) bt
#0  0x0000000000000000 in ?? ()
#1  0x00007ffff7d50f44 in (anonymous namespace)::list::run() () at ../../../../gcc/libstdc++-v3/libsupc++/atexit_thread.cc:71
#2  0x00000035e9e07933 in __nptl_deallocate_tsd () at pthread_create.c:156
#3  0x00000035e9e07b4f in start_thread (arg=0x7ffff7aaa700) at pthread_create.c:312
#4  0x00000035e96e0e6d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:115
(gdb) 



> 
> -- 
> -----------------------------------------------------------------------------
> Rainer Orth, Center for Biotechnology, Bielefeld University


* Re: RFC: C++ PATCH to support dynamic initialization and destruction of C++11 and OpenMP TLS variables
  2012-10-10 20:20             ` Jack Howarth
@ 2012-10-10 20:25               ` Jack Howarth
  0 siblings, 0 replies; 20+ messages in thread
From: Jack Howarth @ 2012-10-10 20:25 UTC (permalink / raw)
  To: Rainer Orth; +Cc: Jason Merrill, Dominique Dhumieres, gcc-patches

On Wed, Oct 10, 2012 at 04:16:03PM -0400, Jack Howarth wrote:
> On Wed, Oct 10, 2012 at 04:50:22PM +0200, Rainer Orth wrote:
> > Jack Howarth <howarth@bromo.med.uc.edu> writes:
> > 
> > >    Have you tried a gcc trunk build on linux configured to use emutls instead
> > > of tls to confirm that this issue is really darwin-specific? These failures might
> > > also appear on sparc-sun-solaris2.9 but we don't have recent gcc trunk testresults
> > > for that. Perhaps Rainer can try a build of current gcc trunk and see if it
> > > impacts sparc-sun-solaris2.9's use of emutls as well.
> > >            Jack
> > 
> > There's no reason to test on a emutls-only target, just configure with
> > --disable-tls on any target.  I probably won't be able to test Solaris 9
> > (SPARC or x86) before the weekend.
> > 
> > 	Rainer
> 
> A quick build of gcc trunk (r192324) on Fedora 15 x86_64 using --disable-tls shows...
> 
> Native configuration is x86_64-unknown-linux-gnu
> 
> 		=== gcc tests ===
> 
> 
> Running target unix
> 
> 		=== gcc Summary ===
> 
> # of expected passes		499
> # of unsupported tests		5
> /home/howarth/work-gcc/gcc/xgcc  version 4.8.0 20121010 (experimental) (GCC) 
> 
> 		=== g++ tests ===
> 
> 
> Running target unix
> FAIL: g++.dg/tls/thread_local3.C -std=gnu++11 execution test
> FAIL: g++.dg/tls/thread_local3g.C -std=gnu++11 execution test
> FAIL: g++.dg/tls/thread_local4.C -std=gnu++11 execution test
> FAIL: g++.dg/tls/thread_local4g.C -std=gnu++11 execution test
> FAIL: g++.dg/tls/thread_local5.C -std=gnu++11 execution test
> FAIL: g++.dg/tls/thread_local5g.C -std=gnu++11 execution test
> FAIL: g++.dg/tls/thread_local7.C scan-assembler-not \\\\.data
> FAIL: g++.dg/tls/thread_local7g.C scan-assembler-not \\\\.data
> 
> 		=== g++ Summary ===
> 
> # of expected passes		105
> # of unexpected failures	8
> # of expected failures		2
> # of unsupported tests		6
> /home/howarth/work-gcc/gcc/testsuite/g++/../../g++  version 4.8.0 20121010 (experimental) (GCC) 
> 
> 
> Compiler version: 4.8.0 20121010 (experimental) (GCC) 
> Platform: x86_64-unknown-linux-gnu
> configure flags: --disable-ppl --disable-cloog --prefix=/home/howarth/dist-gcc --enable-languages=c,c++ --disable-multilib --disable-lto --disable-tls --disable-bootstrap
> 
> The first failure backtraces as...
> 
> $ gdb ./thread_local3.exe
> GNU gdb (GDB) Fedora (7.3.1-48.fc15)
> Copyright (C) 2011 Free Software Foundation, Inc.
> License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
> This is free software: you are free to change and redistribute it.
> There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
> and "show warranty" for details.
> This GDB was configured as "x86_64-redhat-linux-gnu".
> For bug reporting instructions, please see:
> <http://www.gnu.org/software/gdb/bugs/>...
> Reading symbols from /home/howarth/work-gcc/gcc/testsuite/g++/thread_local3.exe...(no debugging symbols found)...done.
> (gdb) r
> Starting program: /home/howarth/work-gcc/gcc/testsuite/g++/thread_local3.exe 
> [Thread debugging using libthread_db enabled]
> [New Thread 0x7ffff7aaa700 (LWP 27463)]
> 
> Program received signal SIGSEGV, Segmentation fault.
> [Switching to Thread 0x7ffff7aaa700 (LWP 27463)]
> 0x0000000000000000 in ?? ()
> (gdb) bt
> #0  0x0000000000000000 in ?? ()
> #1  0x00007ffff7d50f44 in (anonymous namespace)::list::run() () at ../../../../gcc/libstdc++-v3/libsupc++/atexit_thread.cc:71
> #2  0x00000035e9e07933 in __nptl_deallocate_tsd () at pthread_create.c:156
> #3  0x00000035e9e07b4f in start_thread (arg=0x7ffff7aaa700) at pthread_create.c:312
> #4  0x00000035e96e0e6d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:115
> (gdb) 
> 
> 

Valgrind shows...

$ valgrind --leak-check=full ./thread_local3.exe
==27634== Memcheck, a memory error detector
==27634== Copyright (C) 2002-2010, and GNU GPL'd, by Julian Seward et al.
==27634== Using Valgrind-3.6.1 and LibVEX; rerun with -h for copyright info
==27634== Command: ./thread_local3.exe
==27634== 
==27634== Thread 2:
==27634== Invalid read of size 4
==27634==    at 0x4C69F20: (anonymous namespace)::list::run() (atexit_thread.cc:70)
==27634==    by 0x35E9E07932: __nptl_deallocate_tsd (pthread_create.c:156)
==27634==    by 0x35E9E07B4E: start_thread (pthread_create.c:312)
==27634==    by 0x35E96E0E6C: clone (clone.S:115)
==27634==  Address 0x595f390 is 16 bytes inside a block of size 536 free'd
==27634==    at 0x4A055FE: free (vg_replace_malloc.c:366)
==27634==    by 0x4F5512A: emutls_destroy (emutls.c:75)
==27634==    by 0x35E9E07932: __nptl_deallocate_tsd (pthread_create.c:156)
==27634==    by 0x35E9E07B4E: start_thread (pthread_create.c:312)
==27634==    by 0x35E96E0E6C: clone (clone.S:115)
==27634== 
==27634== Invalid read of size 8
==27634==    at 0x4C69F3D: (anonymous namespace)::list::run() (atexit_thread.cc:71)
==27634==    by 0x35E9E07932: __nptl_deallocate_tsd (pthread_create.c:156)
==27634==    by 0x35E9E07B4E: start_thread (pthread_create.c:312)
==27634==    by 0x35E96E0E6C: clone (clone.S:115)
==27634==  Address 0x595f398 is 24 bytes inside a block of size 536 free'd
==27634==    at 0x4A055FE: free (vg_replace_malloc.c:366)
==27634==    by 0x4F5512A: emutls_destroy (emutls.c:75)
==27634==    by 0x35E9E07932: __nptl_deallocate_tsd (pthread_create.c:156)
==27634==    by 0x35E9E07B4E: start_thread (pthread_create.c:312)
==27634==    by 0x35E96E0E6C: clone (clone.S:115)
==27634== 
==27634== Invalid read of size 8
==27634==    at 0x4C69F41: (anonymous namespace)::list::run() (atexit_thread.cc:71)
==27634==    by 0x35E9E07932: __nptl_deallocate_tsd (pthread_create.c:156)
==27634==    by 0x35E9E07B4E: start_thread (pthread_create.c:312)
==27634==    by 0x35E96E0E6C: clone (clone.S:115)
==27634==  Address 0x595f3a0 is 32 bytes inside a block of size 536 free'd
==27634==    at 0x4A055FE: free (vg_replace_malloc.c:366)
==27634==    by 0x4F5512A: emutls_destroy (emutls.c:75)
==27634==    by 0x35E9E07932: __nptl_deallocate_tsd (pthread_create.c:156)
==27634==    by 0x35E9E07B4E: start_thread (pthread_create.c:312)
==27634==    by 0x35E96E0E6C: clone (clone.S:115)
==27634== 
==27634== Invalid read of size 8
==27634==    at 0x4C69F49: (anonymous namespace)::list::run() (atexit_thread.cc:72)
==27634==    by 0x35E9E07932: __nptl_deallocate_tsd (pthread_create.c:156)
==27634==    by 0x35E9E07B4E: start_thread (pthread_create.c:312)
==27634==    by 0x35E96E0E6C: clone (clone.S:115)
==27634==  Address 0x595f388 is 8 bytes inside a block of size 536 free'd
==27634==    at 0x4A055FE: free (vg_replace_malloc.c:366)
==27634==    by 0x4F5512A: emutls_destroy (emutls.c:75)
==27634==    by 0x35E9E07932: __nptl_deallocate_tsd (pthread_create.c:156)
==27634==    by 0x35E9E07B4E: start_thread (pthread_create.c:312)
==27634==    by 0x35E96E0E6C: clone (clone.S:115)
==27634== 
==27634== 
==27634== HEAP SUMMARY:
==27634==     in use at exit: 824 bytes in 2 blocks
==27634==   total heap usage: 13 allocs, 11 frees, 2,794 bytes allocated
==27634== 
==27634== Thread 1:
==27634== 536 bytes in 1 blocks are possibly lost in loss record 2 of 2
==27634==    at 0x4A0649D: malloc (vg_replace_malloc.c:236)
==27634==    by 0x4F45A1B: emutls_alloc (emutls.c:102)
==27634==    by 0x4F551D3: __emutls_get_address (emutls.c:183)
==27634==    by 0x4C69F7F: (anonymous namespace)::run_current() (atexit_thread.cc:91)
==27634==    by 0x35E96388C0: __run_exit_handlers (exit.c:78)
==27634==    by 0x35E9638944: exit (exit.c:100)
==27634==    by 0x35E9621363: (below main) (libc-start.c:258)
==27634== 
==27634== LEAK SUMMARY:
==27634==    definitely lost: 0 bytes in 0 blocks
==27634==    indirectly lost: 0 bytes in 0 blocks
==27634==      possibly lost: 536 bytes in 1 blocks
==27634==    still reachable: 288 bytes in 1 blocks
==27634==         suppressed: 0 bytes in 0 blocks
==27634== Reachable blocks (those to which a pointer was found) are not shown.
==27634== To see them, rerun with: --leak-check=full --show-reachable=yes
==27634== 
==27634== For counts of detected and suppressed errors, rerun with: -v
==27634== ERROR SUMMARY: 9 errors from 5 contexts (suppressed: 6 from 6)


* Re: RFC: C++ PATCH to support dynamic initialization and destruction of C++11 and OpenMP TLS variables
  2012-10-09 15:28   ` Jason Merrill
  2012-10-09 16:28     ` Dominique Dhumieres
  2012-10-09 20:43     ` Dominique Dhumieres
@ 2012-10-11 15:23     ` Jason Merrill
  2 siblings, 0 replies; 20+ messages in thread
From: Jason Merrill @ 2012-10-11 15:23 UTC (permalink / raw)
  To: Jack Howarth; +Cc: Dominique Dhumieres, gcc-patches

[-- Attachment #1: Type: text/plain, Size: 347 bytes --]

On 10/09/2012 11:27 AM, Jason Merrill wrote:
>>> FAIL: g++.dg/tls/thread_local7.C scan-assembler-not \\.data

I've changed this test to require native TLS.

>>> FAIL: g++.dg/tls/static-1.C *

And I've fixed the compiler to not mess with thread_local wrappers on 
this test, since it uses __thread.

Tested x86_64-pc-linux-gnu, applying to trunk.


[-- Attachment #2: static-tls.patch --]
[-- Type: text/x-patch, Size: 2166 bytes --]

commit 3c317cba2f2b6100987ee720f7d7da76f5f43c19
Author: Jason Merrill <jason@redhat.com>
Date:   Tue Oct 9 12:55:52 2012 -0400

    	* decl.c (grokdeclarator): Set DECL_GNU_TLS_P for static data
    	members, too.

diff --git a/gcc/cp/decl.c b/gcc/cp/decl.c
index 0b936ea..e78c664 100644
--- a/gcc/cp/decl.c
+++ b/gcc/cp/decl.c
@@ -10446,7 +10446,11 @@ grokdeclarator (const cp_declarator *declarator,
 		DECL_EXTERNAL (decl) = 1;
 
 		if (thread_p)
-		  DECL_TLS_MODEL (decl) = decl_default_tls_model (decl);
+		  {
+		    DECL_TLS_MODEL (decl) = decl_default_tls_model (decl);
+		    if (declspecs->gnu_thread_keyword_p)
+		      DECL_GNU_TLS_P (decl) = true;
+		  }
 
 		if (constexpr_p && !initialized)
 		  {
diff --git a/gcc/testsuite/g++.dg/gomp/tls-5.C b/gcc/testsuite/g++.dg/gomp/tls-5.C
index 74e4faa..f1dcdae 100644
--- a/gcc/testsuite/g++.dg/gomp/tls-5.C
+++ b/gcc/testsuite/g++.dg/gomp/tls-5.C
@@ -1,6 +1,7 @@
 // The reference temp should be TLS, not normal data.
 // { dg-require-effective-target c++11 }
-// { dg-final { scan-assembler-not "\\.data" } }
+// { dg-require-alias }
+// { dg-final { scan-assembler-not "\\.data" { target tls_native } } }
 
 extern int&& ir;
 #pragma omp threadprivate (ir)
diff --git a/gcc/testsuite/g++.dg/tls/static2.C b/gcc/testsuite/g++.dg/tls/static2.C
new file mode 100644
index 0000000..ab688dd
--- /dev/null
+++ b/gcc/testsuite/g++.dg/tls/static2.C
@@ -0,0 +1,18 @@
+// { dg-final { scan-assembler-not "_ZTHN1A1iE" } }
+// { dg-final { scan-assembler-not "_ZTWN1A1iE" } }
+// { dg-require-effective-target tls }
+
+struct A
+{
+  static __thread int i;
+};
+
+int
+test ()
+{
+  if (A::i != 8)
+    return 1;
+
+  A::i = 17;
+  return 0;
+}
diff --git a/gcc/testsuite/g++.dg/tls/thread_local7.C b/gcc/testsuite/g++.dg/tls/thread_local7.C
index 77a1c05..f453b96 100644
--- a/gcc/testsuite/g++.dg/tls/thread_local7.C
+++ b/gcc/testsuite/g++.dg/tls/thread_local7.C
@@ -2,7 +2,7 @@
 // { dg-require-effective-target tls }
 
 // The reference temp should be TLS, not normal data.
-// { dg-final { scan-assembler-not "\\.data" } }
+// { dg-final { scan-assembler-not "\\.data" { target tls_native } } }
 
 void f()
 {


* Re: RFC: C++ PATCH to support dynamic initialization and destruction of C++11 and OpenMP TLS variables
  2012-10-09 14:08 RFC: C++ PATCH to support dynamic initialization and destruction of C++11 and OpenMP TLS variables Dominique Dhumieres
  2012-10-09 14:43 ` Jack Howarth
  2012-10-10 15:01 ` Rainer Orth
@ 2012-10-15 20:25 ` Richard Sandiford
  2 siblings, 0 replies; 20+ messages in thread
From: Richard Sandiford @ 2012-10-15 20:25 UTC (permalink / raw)
  To: Dominique Dhumieres; +Cc: gcc-patches, jason

dominiq@lps.ens.fr (Dominique Dhumieres) writes:
> On x86_64-apple-darwin10 The following tests:
>
> g++.dg/gomp/tls-5.C
> g++.dg/tls/thread_local-cse.C
> g++.dg/tls/thread_local-order*.C
> g++.dg/tls/thread_local*g.C
>
> fail with
>
> sorry, unimplemented: dynamic initialization of non-function-local thread_local variables not supported on this target

May not be related, but I was seeing g++.dg/tls/thread_local-cse.C
fail on mipsisa64-elf too.  It had the right conditions, but the dg-do
line needs to come first.

g++.dg/tls/thread_local-wrap4.C was also failing because it requires -fPIC.

I committed the following as (hopefully) obvious after testing
on mipsisa64-elf.

Richard


gcc/testsuite/
	* g++.dg/tls/thread_local-cse.C: Move dg-do line.
	* g++.dg/tls/thread_local-wrap4.C: Require fpic.

Index: gcc/testsuite/g++.dg/tls/thread_local-cse.C
===================================================================
--- gcc/testsuite/g++.dg/tls/thread_local-cse.C	2012-10-10 20:53:22.000000000 +0100
+++ gcc/testsuite/g++.dg/tls/thread_local-cse.C	2012-10-15 20:28:38.147650178 +0100
@@ -1,11 +1,11 @@
 // Test for CSE of the wrapper function: we should only call it once
 // for the two references to ir.
+// { dg-do run }
 // { dg-options "-std=c++11 -O -fno-inline -save-temps" }
 // { dg-require-effective-target tls_runtime }
 // { dg-require-alias }
 // { dg-final { scan-assembler-times "call *_ZTW2ir" 1 { xfail *-*-* } } }
 // { dg-final cleanup-saved-temps }
-// { dg-do run }
 
 // XFAILed until the back end supports a way to mark a function as cseable
 // though not pure.
Index: gcc/testsuite/g++.dg/tls/thread_local-wrap4.C
===================================================================
--- gcc/testsuite/g++.dg/tls/thread_local-wrap4.C	2012-10-14 14:02:01.000000000 +0100
+++ gcc/testsuite/g++.dg/tls/thread_local-wrap4.C	2012-10-15 20:28:38.147650178 +0100
@@ -2,6 +2,7 @@
 // copy per shared object.
 
 // { dg-require-effective-target tls }
+// { dg-require-effective-target fpic }
 // { dg-options "-std=c++11 -fPIC" }
 // { dg-final { scan-assembler-not "_ZTW1i@PLT" { target i?86-*-* x86_64-*-* } } }
 


* Re: RFC: C++ PATCH to support dynamic initialization and destruction of C++11 and OpenMP TLS variables
  2012-10-05 17:38   ` Jason Merrill
@ 2012-10-15 17:49     ` Jason Merrill
  0 siblings, 0 replies; 20+ messages in thread
From: Jason Merrill @ 2012-10-15 17:49 UTC (permalink / raw)
  To: Richard Guenther; +Cc: gcc-patches List, Jakub Jelinek, Richard Henderson

On 10/05/2012 10:38 AM, Jason Merrill wrote:
> On 10/05/2012 04:29 AM, Richard Guenther wrote:
>> Or if we have the extra indirection via a reference anyway, we could
>> have a pointer TLS variable (NULL initialized) that on the first access
>> will trap where in a trap handler we could then perform initialization
>> and setup of that pointer.
>
> Interesting idea.  But I don't think there's any way to determine from a
> SEGV handler which null pointer needs to be initialized, and in any case
> users might want to have their own SEGV handlers.

But then, with support from the kernel and dynamic loader we aren't 
limited to normal signal handlers; it ought to be possible to mark a 
page of thread_local variables as trapping so that the first reference 
invokes a designated initialization function.  I think that such a 
scheme ought to be link-compatible with this one so long as the 
initialization function is the same; references would just be direct 
rather than through the wrapper function.

It sounds like recent versions of MacOS X support something like this, 
though clang doesn't take advantage of it yet.

Jason


* Re: RFC: C++ PATCH to support dynamic initialization and destruction of C++11 and OpenMP TLS variables
  2012-10-05  8:30 ` Richard Guenther
  2012-10-05  8:41   ` Jakub Jelinek
@ 2012-10-05 17:38   ` Jason Merrill
  2012-10-15 17:49     ` Jason Merrill
  1 sibling, 1 reply; 20+ messages in thread
From: Jason Merrill @ 2012-10-05 17:38 UTC (permalink / raw)
  To: Richard Guenther; +Cc: gcc-patches List

On 10/05/2012 04:29 AM, Richard Guenther wrote:
> Or if we have the extra indirection via a reference anyway, we could
> have a pointer TLS variable (NULL initialized) that on the first access
> will trap where in a trap handler we could then perform initialization
> and setup of that pointer.

Interesting idea.  But I don't think there's any way to determine from a 
SEGV handler which null pointer needs to be initialized, and in any case 
users might want to have their own SEGV handlers.

Jason


* Re: RFC: C++ PATCH to support dynamic initialization and destruction of C++11 and OpenMP TLS variables
  2012-10-05  8:41   ` Jakub Jelinek
@ 2012-10-05 17:28     ` Jason Merrill
  0 siblings, 0 replies; 20+ messages in thread
From: Jason Merrill @ 2012-10-05 17:28 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: Richard Guenther, gcc-patches List

On 10/05/2012 04:41 AM, Jakub Jelinek wrote:
> Unfortunately, that penalty is not only for thread_local vars with
> ctors/dtors.  There is some penalty even for using
> extern thread_local int i;
> int foo (void)
> {
>    return i;
> }
> (as compared to extern __thread int i;), because we have to at least check
> whether the weak TLS initializer for i is NULL or not.  Other TU could have
> thread_local int i = dynamic_initialization ();

Right.  This is why my patch continues to treat thread_local and 
__thread differently; people that don't want this overhead for exported 
TLS variables with static initialization can continue to use __thread.

Jason


* Re: RFC: C++ PATCH to support dynamic initialization and destruction of C++11 and OpenMP TLS variables
  2012-10-05  8:30 ` Richard Guenther
@ 2012-10-05  8:41   ` Jakub Jelinek
  2012-10-05 17:28     ` Jason Merrill
  2012-10-05 17:38   ` Jason Merrill
  1 sibling, 1 reply; 20+ messages in thread
From: Jakub Jelinek @ 2012-10-05  8:41 UTC (permalink / raw)
  To: Richard Guenther; +Cc: Jason Merrill, gcc-patches List

On Fri, Oct 05, 2012 at 10:29:54AM +0200, Richard Guenther wrote:
> I wonder if an implementation is conforming that performs non-local TLS
> variable inits at thread creation time instead (probably also would require
> glibc support)?

I think it is conforming, but not really doable, because of dlopen.
If you have multiple threads running already and dlopen a library that has
thread_local with ctors, you can construct them in the current thread, but
can't construct them in other threads (even just sending some magic syscall,
interrupting those threads, isn't going to work, because then you'd be in
async-signal context where the number of things you can portably do is
limited, while the constructors can expect to be able to execute arbitrary
code).  This is similar to the reason why we want to make libraries
which have thread_local with dtors non-dlcloseable.  We can destruct in the
current thread at dlclose time, but can't destruct in other threads (and the
C++ standard isn't aware of dlclose, therefore it can't easily say anything
allowing the dtors not to be run in that case).  thread_local dtors for
other threads (from what Jason said, haven't checked the standard) don't
need to run for other threads at exit time (which is a situation very
similar to dlclose).

> Should we document our choice of implementation somewhere in the
> manual?  And thus implicitly provide the hint that using non-local TLS
> variables with dynamic initialization comes with an abstraction penalty
> that we might not be easily able to remove?

Unfortunately, that penalty is not only for thread_local vars with
ctors/dtors.  There is some penalty even for using
extern thread_local int i;
int foo (void)
{
  return i;
}
(as compared to extern __thread int i;), because we have to at least check
whether the weak TLS initializer for i is NULL or not.  Other TU could have
thread_local int i = dynamic_initialization ();

	Jakub


* Re: RFC: C++ PATCH to support dynamic initialization and destruction of C++11 and OpenMP TLS variables
  2012-10-04 17:39 Jason Merrill
  2012-10-05  8:30 ` Richard Guenther
@ 2012-10-05  8:31 ` Jakub Jelinek
  1 sibling, 0 replies; 20+ messages in thread
From: Jakub Jelinek @ 2012-10-05  8:31 UTC (permalink / raw)
  To: Jason Merrill; +Cc: gcc-patches List

On Thu, Oct 04, 2012 at 01:38:38PM -0400, Jason Merrill wrote:
> commit 18c01be0ec8b7a3cda6a16e86356e8e434c12f89
> Author: Jason Merrill <jason@redhat.com>
> Date:   Thu Sep 20 16:00:08 2012 -0400
> 
>     	Support C++11 thread_local destructors.
>     libstdc++-v3/
>     	* libsupc++/cxxabi.h: Declare __cxa_thread_atexit.
>     	* libsupc++/atexit_thread.cc: New.
>     	* libsupc++/Makefile.am (nested_exception.lo): Add it.
>     	* config/abi/pre/gnu.ver: Add __cxa_thread_atexit.

If we want to add it to glibc, then it shouldn't be exported
from libstdc++ (or perhaps only as a fallback implementation;
the question is what to do if non-Linux targets add __cxa_thread_atexit
to their C libraries).  If this is a fallback implementation, then it
needs to have some destructor that will call pthread_key_delete
(well, its __gthread_* wrapper), so that if libstdc++.so is dlclosed,
things don't crash (well, they likely will anyway if the C++ shared
library using thread_local destructors is dlclosed with multiple running
threads, but in the libstdc++ case it is at least avoidable).

	Jakub


* Re: RFC: C++ PATCH to support dynamic initialization and destruction of C++11 and OpenMP TLS variables
  2012-10-04 17:39 Jason Merrill
@ 2012-10-05  8:30 ` Richard Guenther
  2012-10-05  8:41   ` Jakub Jelinek
  2012-10-05 17:38   ` Jason Merrill
  2012-10-05  8:31 ` Jakub Jelinek
  1 sibling, 2 replies; 20+ messages in thread
From: Richard Guenther @ 2012-10-05  8:30 UTC (permalink / raw)
  To: Jason Merrill; +Cc: gcc-patches List

On Thu, Oct 4, 2012 at 7:38 PM, Jason Merrill <jason@redhat.com> wrote:
> Both C++11 and OpenMP specify that thread_local/threadprivate variables can
> have dynamic initialization and destruction semantics.  This sequence of
> patches implements that.
>
> The first patch adds the C++11 thread_local keyword and implements the C++11
> parsing rules for thread_local, which differ somewhat from __thread.  It
> also allows dynamic initialization of function-local thread_local variables.
>
> The second patch adds a __cxa_thread_atexit interface for registering
> cleanups to be run when a thread exits.  The current implementation is not
> fully conformant; on exit, TLS destructors are supposed to run before any
> destructors for non-TLS variables, but I think that will require glibc
> support for __cxa_thread_atexit.
>
> The third patch implements dynamic initialization of non-function-local TLS
> variables using the init-on-first-use idiom: uses of such a variable are
> replaced with calls to a wrapper function, so that
>
> int f();
> thread_local int i = f();
>
> implies
>
> void i_init() {
>   static bool initialized;
>   if (!initialized)
>     {
>       initialized = true;
>       i = f();
>     }
> }
> inline int& i_wrapper() {
>   i_init();
>   return i;
> }
>
> Note that if we see
>
> extern thread_local int i;
>
> in a header, we don't know whether it has a dynamic initializer in its
> defining translation unit, so we need to make conservative assumptions.  For
> a type that doesn't always get dynamic initialization, we make i_init a
> weakref and only call it if it exists.  In the same translation unit as the
> definition, we optimize appropriately.
>
> The wrapper function is the function I'm talking about in
>   http://gcc.gnu.org/ml/gcc/2012-10/msg00024.html
>
> Any comments before I check this in?

I wonder if an implementation is conforming that performs non-local TLS
variable inits at thread creation time instead (probably also would require
glibc support)?

Or if we have the extra indirection via a reference anyway, we could
have a pointer TLS variable (NULL initialized) that on the first access
will trap where in a trap handler we could then perform initialization
and setup of that pointer.

Should we document our choice of implementation somewhere in the
manual?  And thus implicitly provide the hint that using non-local TLS
variables with dynamic initialization comes with an abstraction penalty
that we might not be easily able to remove?

Thanks,
Richard.

> Jason


* RFC: C++ PATCH to support dynamic initialization and destruction of C++11 and OpenMP TLS variables
@ 2012-10-04 17:39 Jason Merrill
  2012-10-05  8:30 ` Richard Guenther
  2012-10-05  8:31 ` Jakub Jelinek
  0 siblings, 2 replies; 20+ messages in thread
From: Jason Merrill @ 2012-10-04 17:39 UTC (permalink / raw)
  To: gcc-patches List

[-- Attachment #1: Type: text/plain, Size: 1707 bytes --]

Both C++11 and OpenMP specify that thread_local/threadprivate variables 
can have dynamic initialization and destruction semantics.  This 
sequence of patches implements that.

The first patch adds the C++11 thread_local keyword and implements the 
C++11 parsing rules for thread_local, which differ somewhat from 
__thread.  It also allows dynamic initialization of function-local 
thread_local variables.

The second patch adds a __cxa_thread_atexit interface for registering 
cleanups to be run when a thread exits.  The current implementation is 
not fully conformant; on exit, TLS destructors are supposed to run 
before any destructors for non-TLS variables, but I think that will 
require glibc support for __cxa_thread_atexit.
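
For illustration only, here is a minimal sketch of how a per-thread
cleanup list could be built on POSIX thread-specific data.  It is not
the code in atexit_thread.cc; the names sketch::thread_atexit, elt,
note and worker are hypothetical.  Like the fallback described above,
it says nothing about ordering relative to destructors of non-TLS
variables.

```cpp
#include <pthread.h>
#include <cassert>

namespace sketch {  // hypothetical namespace, not part of libsupc++

// One registered cleanup: a destructor and the object it applies to.
struct elt {
  void (*dtor)(void*);
  void* obj;
  elt* next;
};

pthread_key_t key;
pthread_once_t once = PTHREAD_ONCE_INIT;

// Key destructor: the pthread library calls this at thread exit with
// the thread's list head.  Cleanups run in reverse registration order.
void run(void* p) {
  elt* e = static_cast<elt*>(p);
  while (e) {
    elt* next = e->next;
    e->dtor(e->obj);
    delete e;
    e = next;
  }
}

void key_init() { pthread_key_create(&key, run); }

// Register a cleanup for the calling thread (assumed interface).
int thread_atexit(void (*dtor)(void*), void* obj) {
  pthread_once(&once, key_init);
  elt* head = static_cast<elt*>(pthread_getspecific(key));
  elt* e = new elt{dtor, obj, head};  // push at the front: LIFO order
  return pthread_setspecific(key, e);
}

}  // namespace sketch

// Demonstration helpers: record the order in which cleanups ran.
int order[2];
int n = 0;
void note(void* p) { order[n++] = *static_cast<int*>(p); }

void* worker(void*) {
  static int a = 1, b = 2;
  sketch::thread_atexit(note, &a);
  sketch::thread_atexit(note, &b);
  return nullptr;
}
```

Creating a thread that runs worker and joining it should leave the
cleanup for b recorded before the one for a.  Older glibc versions need
-pthread at link time.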

The third patch implements dynamic initialization of non-function-local 
TLS variables using the init-on-first-use idiom: uses of such a variable 
are replaced with calls to a wrapper function, so that

int f();
thread_local int i = f();

implies

void i_init() {
   static bool initialized;
   if (!initialized)
     {
       initialized = true;
       i = f();
     }
}
inline int& i_wrapper() {
   i_init();
   return i;
}
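
The idiom above can be written out as a compilable sketch.  The names
i_storage, i_initialized and f_calls are hypothetical; the real
compiler-generated functions have mangled names.  This sketch assumes
the guard flag is itself per-thread, so that each thread runs the
initializer once for its own copy.

```cpp
#include <cassert>

static int f_calls = 0;  // counts calls to f(), for demonstration only
int f() { ++f_calls; return 42; }

// Stand-in for "thread_local int i = f();": the storage is statically
// initialized and the dynamic part is deferred to i_init().
static thread_local int i_storage = 0;
static thread_local bool i_initialized = false;  // per-thread guard (assumption)

void i_init() {
  if (!i_initialized) {
    i_initialized = true;
    i_storage = f();
  }
}

// Every use of i in user code is treated as a call to this wrapper.
inline int& i_wrapper() {
  i_init();
  return i_storage;
}
```

The first i_wrapper() call on a thread runs f(); later calls on that
thread return the same object without calling f() again.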

Note that if we see

extern thread_local int i;

in a header, we don't know whether it has a dynamic initializer in its 
defining translation unit, so we need to make conservative assumptions.
For a type that doesn't always get dynamic initialization, we make 
i_init a weakref and only call it if it exists.  In the same translation 
unit as the definition, we optimize appropriately.
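
A rough sketch of the "only call it if it exists" check follows.  It
uses the GCC/Clang weak attribute on an ELF target rather than the
weakref mechanism of the patch, and hypothetical_i_init and i_storage
are made-up names.  No translation unit defines the initializer here,
so the symbol resolves to a null address and the wrapper skips the call.

```cpp
#include <cassert>

// Initializer that some other translation unit may or may not define.
// Declared weak (a GCC/Clang extension) so that an undefined symbol
// resolves to null at link time instead of causing a link error.
extern "C" void hypothetical_i_init() __attribute__((weak));

// Statically initialized thread-local storage for the variable.
static thread_local int i_storage = 0;

// Conservative wrapper for "extern thread_local int i;": test the weak
// symbol first and call the initializer only when one was linked in.
inline int& i_wrapper() {
  if (hypothetical_i_init)  // null when nobody provides an initializer
    hypothetical_i_init();
  return i_storage;
}
```

This null test is the residual cost that remains even for variables
with purely static initialization.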

The wrapper function is the function I'm talking about in
   http://gcc.gnu.org/ml/gcc/2012-10/msg00024.html

Any comments before I check this in?

Jason

[-- Attachment #2: thread_local.patch --]
[-- Type: text/x-patch, Size: 77374 bytes --]

commit 972a5ac7c33d93b27d86d010af7c154da1281e6b
Author: Jason Merrill <jason@redhat.com>
Date:   Wed Sep 19 11:23:24 2012 -0400

    	Partial implementation of C++11 thread_local.
    c-family/
    	* c-common.c (c_common_reswords): Add thread_local.
    cp/
    	* decl.c (cp_finish_decl): Remove errors about non-trivial
    	initialization and destruction of TLS variables.
    	(register_dtor_fn): Add sorry about TLS variables.
    	(expand_static_init): Add sorry about non-local TLS variables,
    	or error with __thread.
    	Don't emit thread-safety guards for local TLS variables.
    	(grokdeclarator): thread_local in a function implies static.
    	* decl.h: Adjust prototype.
    	* decl2.c (get_guard): Copy DECL_TLS_MODEL.
    	* parser.c (cp_parser_set_storage_class, cp_parser_set_decl_spec_type)
    	(set_and_check_decl_spec_loc): Take the token rather than the location.
    	Distinguish between __thread and thread_local.
    	(cp_parser_set_storage_class): Don't complain about thread_local before
    	extern/static.
    	(token_is__thread): New.
    	* call.c (make_temporary_var_for_ref_to_temp): Handle TLS.
    	* cp-tree.h (DECL_GNU_TLS_P): New.
    	(cp_decl_specifier_seq): Add gnu_thread_keyword_p.

diff --git a/gcc/c-family/c-common.c b/gcc/c-family/c-common.c
index 6de2f1c..fcc9132 100644
--- a/gcc/c-family/c-common.c
+++ b/gcc/c-family/c-common.c
@@ -542,6 +542,7 @@ const struct c_common_resword c_common_reswords[] =
   { "switch",		RID_SWITCH,	0 },
   { "template",		RID_TEMPLATE,	D_CXXONLY | D_CXXWARN },
   { "this",		RID_THIS,	D_CXXONLY | D_CXXWARN },
+  { "thread_local",	RID_THREAD,	D_CXXONLY | D_CXX0X | D_CXXWARN },
   { "throw",		RID_THROW,	D_CXX_OBJC | D_CXXWARN },
   { "true",		RID_TRUE,	D_CXXONLY | D_CXXWARN },
   { "try",		RID_TRY,	D_CXX_OBJC | D_CXXWARN },
diff --git a/gcc/cp/call.c b/gcc/cp/call.c
index d0492d8..5cc9bad 100644
--- a/gcc/cp/call.c
+++ b/gcc/cp/call.c
@@ -8718,9 +8718,9 @@ perform_direct_initialization_if_possible (tree type,
 
   The next several functions are involved in this lifetime extension.  */
 
-/* DECL is a VAR_DECL whose type is a REFERENCE_TYPE.  The reference
-   is being bound to a temporary.  Create and return a new VAR_DECL
-   with the indicated TYPE; this variable will store the value to
+/* DECL is a VAR_DECL or FIELD_DECL whose type is a REFERENCE_TYPE.  The
+   reference is being bound to a temporary.  Create and return a new
+   VAR_DECL with the indicated TYPE; this variable will store the value to
    which the reference is bound.  */
 
 tree
@@ -8732,13 +8732,15 @@ make_temporary_var_for_ref_to_temp (tree decl, tree type)
   var = create_temporary_var (type);
 
   /* Register the variable.  */
-  if (TREE_STATIC (decl))
+  if (TREE_CODE (decl) == VAR_DECL
+      && (TREE_STATIC (decl) || DECL_THREAD_LOCAL_P (decl)))
     {
       /* Namespace-scope or local static; give it a mangled name.  */
       /* FIXME share comdat with decl?  */
       tree name;
 
-      TREE_STATIC (var) = 1;
+      TREE_STATIC (var) = TREE_STATIC (decl);
+      DECL_TLS_MODEL (var) = DECL_TLS_MODEL (decl);
       name = mangle_ref_init_variable (decl);
       DECL_NAME (var) = name;
       SET_DECL_ASSEMBLER_NAME (var, name);
diff --git a/gcc/cp/cp-tree.h b/gcc/cp/cp-tree.h
index bfe7ad7..12ad4ed 100644
--- a/gcc/cp/cp-tree.h
+++ b/gcc/cp/cp-tree.h
@@ -56,6 +56,7 @@ c-common.h, not after.
       AGGR_INIT_VIA_CTOR_P (in AGGR_INIT_EXPR)
       PTRMEM_OK_P (in ADDR_EXPR, OFFSET_REF, SCOPE_REF)
       PAREN_STRING_LITERAL (in STRING_CST)
+      DECL_GNU_TLS_P (in VAR_DECL)
       KOENIG_LOOKUP_P (in CALL_EXPR)
       STATEMENT_LIST_NO_SCOPE (in STATEMENT_LIST).
       EXPR_STMT_STMT_EXPR_RESULT (in EXPR_STMT)
@@ -2422,6 +2423,11 @@ struct GTY((variable_size)) lang_decl {
   (DECL_NAME (NODE) \
    && !strcmp (IDENTIFIER_POINTER (DECL_NAME (NODE)), "__PRETTY_FUNCTION__"))
 
+/* Nonzero if the thread-local variable was declared with __thread
+   as opposed to thread_local.  */
+#define DECL_GNU_TLS_P(NODE) \
+  (TREE_LANG_FLAG_0 (VAR_DECL_CHECK (NODE)))
+
 /* The _TYPE context in which this _DECL appears.  This field holds the
    class where a virtual function instance is actually defined.  */
 #define DECL_CLASS_CONTEXT(NODE) \
@@ -4722,6 +4728,8 @@ typedef struct cp_decl_specifier_seq {
   BOOL_BITFIELD explicit_int128_p : 1;
   /* True iff "char" was explicitly provided.  */
   BOOL_BITFIELD explicit_char_p : 1;
+  /* True iff ds_thread is set for __thread, not thread_local.  */
+  BOOL_BITFIELD gnu_thread_keyword_p : 1;
 } cp_decl_specifier_seq;
 
 /* The various kinds of declarators.  */
diff --git a/gcc/cp/decl.c b/gcc/cp/decl.c
index d0933ef..980aec2 100644
--- a/gcc/cp/decl.c
+++ b/gcc/cp/decl.c
@@ -6194,13 +6194,6 @@ cp_finish_decl (tree decl, tree init, bool init_const_expr_p,
 
   if (TREE_CODE (decl) == VAR_DECL)
     {
-      /* Only variables with trivial initialization and destruction can
-	 have thread-local storage.  */
-      if (DECL_THREAD_LOCAL_P (decl)
-	  && (type_has_nontrivial_default_init (TREE_TYPE (decl))
-	      || TYPE_HAS_NONTRIVIAL_DESTRUCTOR (TREE_TYPE (decl))))
-	error ("%qD cannot be thread-local because it has non-trivial "
-	       "type %qT", decl, TREE_TYPE (decl));
       /* If this is a local variable that will need a mangled name,
 	 register it now.  We must do this before processing the
 	 initializer for the variable, since the initialization might
@@ -6246,13 +6239,6 @@ cp_finish_decl (tree decl, tree init, bool init_const_expr_p,
 	    }
 	  cleanups = make_tree_vector ();
 	  init = check_initializer (decl, init, flags, &cleanups);
-	  /* Thread-local storage cannot be dynamically initialized.  */
-	  if (DECL_THREAD_LOCAL_P (decl) && init)
-	    {
-	      error ("%qD is thread-local and so cannot be dynamically "
-		     "initialized", decl);
-	      init = NULL_TREE;
-	    }
 
 	  /* Check that the initializer for a static data member was a
 	     constant.  Although we check in the parser that the
@@ -6701,6 +6687,12 @@ register_dtor_fn (tree decl)
       end_cleanup_fn ();
     }
 
+  if (DECL_THREAD_LOCAL_P (decl))
+    /* We don't have a thread-local atexit yet.  FIXME write one using
+       pthread_key_create and friends.  */
+    sorry ("thread-local variable %q#D with non-trivial "
+	   "destructor", decl);
+
   /* Call atexit with the cleanup function.  */
   mark_used (cleanup);
   cleanup = build_address (cleanup);
@@ -6764,6 +6756,36 @@ expand_static_init (tree decl, tree init)
       && TYPE_HAS_TRIVIAL_DESTRUCTOR (TREE_TYPE (decl)))
     return;
 
+  if (DECL_THREAD_LOCAL_P (decl) && DECL_GNU_TLS_P (decl)
+      && !DECL_FUNCTION_SCOPE_P (decl))
+    {
+      if (init)
+	error ("non-local variable %qD declared %<__thread%> "
+	       "needs dynamic initialization", decl);
+      else
+	error ("non-local variable %qD declared %<__thread%> "
+	       "has a non-trivial destructor", decl);
+      static bool informed;
+      if (!informed)
+	{
+	  inform (DECL_SOURCE_LOCATION (decl),
+		  "C++11 %<thread_local%> allows dynamic initialization "
+		  "and destruction");
+	  informed = true;
+	}
+      return;
+    }
+
+  if (DECL_THREAD_LOCAL_P (decl) && !DECL_FUNCTION_SCOPE_P (decl))
+    {
+      /* We haven't implemented dynamic initialization of non-local
+	 thread-local storage yet.  FIXME transform to singleton
+	 function.  */
+      sorry ("thread-local variable %qD with dynamic initialization outside "
+	     "function scope", decl);
+      return;
+    }
+
   if (DECL_FUNCTION_SCOPE_P (decl))
     {
       /* Emit code to perform this initialization but once.  */
@@ -6771,6 +6793,9 @@ expand_static_init (tree decl, tree init)
       tree then_clause = NULL_TREE, inner_then_clause = NULL_TREE;
       tree guard, guard_addr;
       tree flag, begin;
+      /* We don't need thread-safety code for thread-local vars.  */
+      bool thread_guard = (flag_threadsafe_statics
+			   && !DECL_THREAD_LOCAL_P (decl));
 
       /* Emit code to perform this initialization but once.  This code
 	 looks like:
@@ -6809,7 +6834,7 @@ expand_static_init (tree decl, tree init)
       /* This optimization isn't safe on targets with relaxed memory
 	 consistency.  On such targets we force synchronization in
 	 __cxa_guard_acquire.  */
-      if (!targetm.relaxed_ordering || !flag_threadsafe_statics)
+      if (!targetm.relaxed_ordering || !thread_guard)
 	{
 	  /* Begin the conditional initialization.  */
 	  if_stmt = begin_if_stmt ();
@@ -6817,7 +6842,7 @@ expand_static_init (tree decl, tree init)
 	  then_clause = begin_compound_stmt (BCS_NO_SCOPE);
 	}
 
-      if (flag_threadsafe_statics)
+      if (thread_guard)
 	{
 	  tree vfntype = NULL_TREE;
 	  tree acquire_name, release_name, abort_name;
@@ -6875,14 +6900,14 @@ expand_static_init (tree decl, tree init)
 
       finish_expr_stmt (init);
 
-      if (flag_threadsafe_statics)
+      if (thread_guard)
 	{
 	  finish_compound_stmt (inner_then_clause);
 	  finish_then_clause (inner_if_stmt);
 	  finish_if_stmt (inner_if_stmt);
 	}
 
-      if (!targetm.relaxed_ordering || !flag_threadsafe_statics)
+      if (!targetm.relaxed_ordering || !thread_guard)
 	{
 	  finish_compound_stmt (then_clause);
 	  finish_then_clause (if_stmt);
@@ -7699,7 +7724,11 @@ grokvardecl (tree type,
     }
 
   if (decl_spec_seq_has_spec_p (declspecs, ds_thread))
-    DECL_TLS_MODEL (decl) = decl_default_tls_model (decl);
+    {
+      DECL_TLS_MODEL (decl) = decl_default_tls_model (decl);
+      if (declspecs->gnu_thread_keyword_p)
+	DECL_GNU_TLS_P (decl) = true;
+    }
 
   /* If the type of the decl has no linkage, make sure that we'll
      notice that in mark_used.  */
@@ -8386,7 +8415,7 @@ check_var_type (tree identifier, tree type)
 
 tree
 grokdeclarator (const cp_declarator *declarator,
-		const cp_decl_specifier_seq *declspecs,
+		cp_decl_specifier_seq *declspecs,
 		enum decl_context decl_context,
 		int initialized,
 		tree* attrlist)
@@ -9100,9 +9129,15 @@ grokdeclarator (const cp_declarator *declarator,
 	   && storage_class != sc_extern
 	   && storage_class != sc_static)
     {
-      error ("function-scope %qs implicitly auto and declared %<__thread%>",
-	     name);
-      thread_p = false;
+      if (declspecs->gnu_thread_keyword_p)
+	pedwarn (input_location, 0, "function-scope %qs implicitly auto and "
+		 "declared %<__thread%>", name);
+
+      /* When thread_local is applied to a variable of block scope the
+	 storage-class-specifier static is implied if it does not appear
+	 explicitly.  */
+      storage_class = declspecs->storage_class = sc_static;
+      staticp = 1;
     }
 
   if (storage_class && friendp)
@@ -10337,7 +10372,14 @@ grokdeclarator (const cp_declarator *declarator,
 	else if (storage_class == sc_register)
 	  error ("storage class %<register%> invalid for function %qs", name);
 	else if (thread_p)
-	  error ("storage class %<__thread%> invalid for function %qs", name);
+	  {
+	    if (declspecs->gnu_thread_keyword_p)
+	      error ("storage class %<__thread%> invalid for function %qs",
+		     name);
+	    else
+	      error ("storage class %<thread_local%> invalid for function %qs",
+		     name);
+	  }
 
         if (virt_specifiers)
           error ("virt-specifiers in %qs not allowed outside a class definition", name);
diff --git a/gcc/cp/decl.h b/gcc/cp/decl.h
index a8a2b78..193df27 100644
--- a/gcc/cp/decl.h
+++ b/gcc/cp/decl.h
@@ -34,7 +34,7 @@ enum decl_context
 
 /* We need this in here to get the decl_context definition.  */
 extern tree grokdeclarator (const cp_declarator *,
-			    const cp_decl_specifier_seq *,
+			    cp_decl_specifier_seq *,
 			    enum decl_context, int, tree*);
 
 /* States indicating how grokdeclarator() should handle declspecs marked
diff --git a/gcc/cp/decl2.c b/gcc/cp/decl2.c
index 87e38d3..a240ff4 100644
--- a/gcc/cp/decl2.c
+++ b/gcc/cp/decl2.c
@@ -2696,6 +2696,7 @@ get_guard (tree decl)
       TREE_STATIC (guard) = TREE_STATIC (decl);
       DECL_COMMON (guard) = DECL_COMMON (decl);
       DECL_COMDAT (guard) = DECL_COMDAT (decl);
+      DECL_TLS_MODEL (guard) = DECL_TLS_MODEL (decl);
       if (DECL_ONE_ONLY (decl))
 	make_decl_one_only (guard, cxx_comdat_group (guard));
       if (TREE_PUBLIC (decl))
diff --git a/gcc/cp/parser.c b/gcc/cp/parser.c
index 155b51a..e3fb9ac 100644
--- a/gcc/cp/parser.c
+++ b/gcc/cp/parser.c
@@ -2215,12 +2215,12 @@ static tree cp_parser_trait_expr
 static bool cp_parser_declares_only_class_p
   (cp_parser *);
 static void cp_parser_set_storage_class
-  (cp_parser *, cp_decl_specifier_seq *, enum rid, location_t);
+  (cp_parser *, cp_decl_specifier_seq *, enum rid, cp_token *);
 static void cp_parser_set_decl_spec_type
-  (cp_decl_specifier_seq *, tree, location_t, bool);
+  (cp_decl_specifier_seq *, tree, cp_token *, bool);
 static void set_and_check_decl_spec_loc
   (cp_decl_specifier_seq *decl_specs,
-   cp_decl_spec ds, source_location location);
+   cp_decl_spec ds, cp_token *);
 static bool cp_parser_friend_p
   (const cp_decl_specifier_seq *);
 static void cp_parser_required_error
@@ -10672,7 +10672,7 @@ cp_parser_decl_specifier_seq (cp_parser* parser,
 
               /* Set the storage class anyway.  */
               cp_parser_set_storage_class (parser, decl_specs, RID_AUTO,
-					   token->location);
+					   token);
             }
           else
 	    /* C++0x auto type-specifier.  */
@@ -10686,7 +10686,7 @@ cp_parser_decl_specifier_seq (cp_parser* parser,
 	  /* Consume the token.  */
 	  cp_lexer_consume_token (parser->lexer);
           cp_parser_set_storage_class (parser, decl_specs, token->keyword,
-				       token->location);
+				       token);
 	  break;
 	case RID_THREAD:
 	  /* Consume the token.  */
@@ -10706,7 +10706,7 @@ cp_parser_decl_specifier_seq (cp_parser* parser,
 	error ("decl-specifier invalid in condition");
 
       if (ds != ds_last)
-	set_and_check_decl_spec_loc (decl_specs, ds, token->location);
+	set_and_check_decl_spec_loc (decl_specs, ds, token);
 
       /* Constructors are a special case.  The `S' in `S()' is not a
 	 decl-specifier; it is the beginning of the declarator.  */
@@ -10855,7 +10855,7 @@ cp_parser_function_specifier_opt (cp_parser* parser,
   switch (token->keyword)
     {
     case RID_INLINE:
-      set_and_check_decl_spec_loc (decl_specs, ds_inline, token->location);
+      set_and_check_decl_spec_loc (decl_specs, ds_inline, token);
       break;
 
     case RID_VIRTUAL:
@@ -10864,11 +10864,11 @@ cp_parser_function_specifier_opt (cp_parser* parser,
 	 A member function template shall not be virtual.  */
       if (PROCESSING_REAL_TEMPLATE_DECL_P ())
 	error_at (token->location, "templates may not be %<virtual%>");
-      set_and_check_decl_spec_loc (decl_specs, ds_virtual, token->location);
+      set_and_check_decl_spec_loc (decl_specs, ds_virtual, token);
       break;
 
     case RID_EXPLICIT:
-      set_and_check_decl_spec_loc (decl_specs, ds_explicit, token->location);
+      set_and_check_decl_spec_loc (decl_specs, ds_explicit, token);
       break;
 
     default:
@@ -13372,7 +13372,7 @@ cp_parser_type_specifier (cp_parser* parser,
 	  if (decl_specs)
 	    cp_parser_set_decl_spec_type (decl_specs,
 					  type_spec,
-					  token->location,
+					  token,
 					  /*type_definition_p=*/true);
 	  return type_spec;
 	}
@@ -13401,7 +13401,7 @@ cp_parser_type_specifier (cp_parser* parser,
 	  if (decl_specs)
 	    cp_parser_set_decl_spec_type (decl_specs,
 					  type_spec,
-					  token->location,
+					  token,
 					  /*type_definition_p=*/true);
 	  return type_spec;
 	}
@@ -13423,7 +13423,7 @@ cp_parser_type_specifier (cp_parser* parser,
       if (decl_specs)
 	cp_parser_set_decl_spec_type (decl_specs,
 				      type_spec,
-				      token->location,
+				      token,
 				      /*type_definition_p=*/false);
       return type_spec;
 
@@ -13459,7 +13459,7 @@ cp_parser_type_specifier (cp_parser* parser,
     {
       if (decl_specs)
 	{
-	  set_and_check_decl_spec_loc (decl_specs, ds, token->location);
+	  set_and_check_decl_spec_loc (decl_specs, ds, token);
 	  decl_specs->any_specifiers_p = true;
 	}
       return cp_lexer_consume_token (parser->lexer)->u.value;
@@ -13550,7 +13550,7 @@ cp_parser_simple_type_specifier (cp_parser* parser,
       type = boolean_type_node;
       break;
     case RID_SHORT:
-      set_and_check_decl_spec_loc (decl_specs, ds_short, token->location);
+      set_and_check_decl_spec_loc (decl_specs, ds_short, token);
       type = short_integer_type_node;
       break;
     case RID_INT:
@@ -13567,15 +13567,15 @@ cp_parser_simple_type_specifier (cp_parser* parser,
       break;
     case RID_LONG:
       if (decl_specs)
-	set_and_check_decl_spec_loc (decl_specs, ds_long, token->location);
+	set_and_check_decl_spec_loc (decl_specs, ds_long, token);
       type = long_integer_type_node;
       break;
     case RID_SIGNED:
-      set_and_check_decl_spec_loc (decl_specs, ds_signed, token->location);
+      set_and_check_decl_spec_loc (decl_specs, ds_signed, token);
       type = integer_type_node;
       break;
     case RID_UNSIGNED:
-      set_and_check_decl_spec_loc (decl_specs, ds_unsigned, token->location);
+      set_and_check_decl_spec_loc (decl_specs, ds_unsigned, token);
       type = unsigned_type_node;
       break;
     case RID_FLOAT:
@@ -13613,7 +13613,7 @@ cp_parser_simple_type_specifier (cp_parser* parser,
 
       if (decl_specs)
 	cp_parser_set_decl_spec_type (decl_specs, type,
-				      token->location,
+				      token,
 				      /*type_definition_p=*/false);
 
       return type;
@@ -13622,7 +13622,7 @@ cp_parser_simple_type_specifier (cp_parser* parser,
       type = cp_parser_trait_expr (parser, RID_UNDERLYING_TYPE);
       if (decl_specs)
 	cp_parser_set_decl_spec_type (decl_specs, type,
-				      token->location,
+				      token,
 				      /*type_definition_p=*/false);
 
       return type;
@@ -13632,7 +13632,7 @@ cp_parser_simple_type_specifier (cp_parser* parser,
       type = cp_parser_trait_expr (parser, token->keyword);
       if (decl_specs)
        cp_parser_set_decl_spec_type (decl_specs, type,
-                                     token->location,
+                                     token,
                                      /*type_definition_p=*/false);
       return type;
     default:
@@ -13647,7 +13647,7 @@ cp_parser_simple_type_specifier (cp_parser* parser,
       type = token->u.value;
       if (decl_specs)
 	cp_parser_set_decl_spec_type (decl_specs, type,
-				      token->location,
+				      token,
 				      /*type_definition_p=*/false);
       cp_lexer_consume_token (parser->lexer);
       return type;
@@ -13664,7 +13664,7 @@ cp_parser_simple_type_specifier (cp_parser* parser,
 	      && token->keyword != RID_LONG))
 	cp_parser_set_decl_spec_type (decl_specs,
 				      type,
-				      token->location,
+				      token,
 				      /*type_definition_p=*/false);
       if (decl_specs)
 	decl_specs->any_specifiers_p = true;
@@ -13741,7 +13741,7 @@ cp_parser_simple_type_specifier (cp_parser* parser,
 	type = NULL_TREE;
       if (type && decl_specs)
 	cp_parser_set_decl_spec_type (decl_specs, type,
-				      token->location,
+				      token,
 				      /*type_definition_p=*/false);
     }
 
@@ -15092,21 +15092,24 @@ static tree
 cp_parser_alias_declaration (cp_parser* parser)
 {
   tree id, type, decl, pushed_scope = NULL_TREE, attributes;
-  location_t id_location, using_location, attrs_location = 0;
+  location_t id_location;
   cp_declarator *declarator;
   cp_decl_specifier_seq decl_specs;
   bool member_p;
   const char *saved_message = NULL;
 
   /* Look for the `using' keyword.  */
-  using_location = cp_lexer_peek_token (parser->lexer)->location;
-  cp_parser_require_keyword (parser, RID_USING, RT_USING);
+  cp_token *using_token
+    = cp_parser_require_keyword (parser, RID_USING, RT_USING);
+  if (using_token == NULL)
+    return error_mark_node;
+
   id_location = cp_lexer_peek_token (parser->lexer)->location;
   id = cp_parser_identifier (parser);
   if (id == error_mark_node)
     return error_mark_node;
 
-  attrs_location = cp_lexer_peek_token (parser->lexer)->location;
+  cp_token *attrs_token = cp_lexer_peek_token (parser->lexer);
   attributes = cp_parser_attributes_opt (parser);
   if (attributes == error_mark_node)
     return error_mark_node;
@@ -15163,14 +15166,14 @@ cp_parser_alias_declaration (cp_parser* parser)
       decl_specs.attributes = attributes;
       set_and_check_decl_spec_loc (&decl_specs,
 				   ds_attribute,
-				   attrs_location);
+				   attrs_token);
     }
   set_and_check_decl_spec_loc (&decl_specs,
 			       ds_typedef,
-			       using_location);
+			       using_token);
   set_and_check_decl_spec_loc (&decl_specs,
 			       ds_alias,
-			       using_location);
+			       using_token);
 
   declarator = make_id_declarator (NULL_TREE, id, sfk_none);
   declarator->id_loc = id_location;
@@ -22054,13 +22057,13 @@ static void
 cp_parser_set_storage_class (cp_parser *parser,
 			     cp_decl_specifier_seq *decl_specs,
 			     enum rid keyword,
-			     location_t location)
+			     cp_token *token)
 {
   cp_storage_class storage_class;
 
   if (parser->in_unbraced_linkage_specification_p)
     {
-      error_at (location, "invalid use of %qD in linkage specification",
+      error_at (token->location, "invalid use of %qD in linkage specification",
 		ridpointers[keyword]);
       return;
     }
@@ -22071,11 +22074,11 @@ cp_parser_set_storage_class (cp_parser *parser,
     }
 
   if ((keyword == RID_EXTERN || keyword == RID_STATIC)
-      && decl_spec_seq_has_spec_p (decl_specs, ds_thread))
+      && decl_spec_seq_has_spec_p (decl_specs, ds_thread)
+      && decl_specs->gnu_thread_keyword_p)
     {
-      error_at (decl_specs->locations[ds_thread],
+      pedwarn (decl_specs->locations[ds_thread], 0,
 		"%<__thread%> before %qD", ridpointers[keyword]);
-      decl_specs->locations[ds_thread] = 0;
     }
 
   switch (keyword)
@@ -22099,7 +22102,7 @@ cp_parser_set_storage_class (cp_parser *parser,
       gcc_unreachable ();
     }
   decl_specs->storage_class = storage_class;
-  set_and_check_decl_spec_loc (decl_specs, ds_storage_class, location);
+  set_and_check_decl_spec_loc (decl_specs, ds_storage_class, token);
 
   /* A storage class specifier cannot be applied alongside a typedef 
      specifier. If there is a typedef specifier present then set 
@@ -22115,7 +22118,7 @@ cp_parser_set_storage_class (cp_parser *parser,
 static void
 cp_parser_set_decl_spec_type (cp_decl_specifier_seq *decl_specs,
 			      tree type_spec,
-			      location_t location,
+			      cp_token *token,
 			      bool type_definition_p)
 {
   decl_specs->any_specifiers_p = true;
@@ -22140,12 +22143,12 @@ cp_parser_set_decl_spec_type (cp_decl_specifier_seq *decl_specs,
       decl_specs->redefined_builtin_type = type_spec;
       set_and_check_decl_spec_loc (decl_specs,
 				   ds_redefined_builtin_type_spec,
-				   location);
+				   token);
       if (!decl_specs->type)
 	{
 	  decl_specs->type = type_spec;
 	  decl_specs->type_definition_p = false;
-	  set_and_check_decl_spec_loc (decl_specs,ds_type_spec, location);
+	  set_and_check_decl_spec_loc (decl_specs,ds_type_spec, token);
 	}
     }
   else if (decl_specs->type)
@@ -22155,10 +22158,19 @@ cp_parser_set_decl_spec_type (cp_decl_specifier_seq *decl_specs,
       decl_specs->type = type_spec;
       decl_specs->type_definition_p = type_definition_p;
       decl_specs->redefined_builtin_type = NULL_TREE;
-      set_and_check_decl_spec_loc (decl_specs, ds_type_spec, location);
+      set_and_check_decl_spec_loc (decl_specs, ds_type_spec, token);
     }
 }
 
+/* True iff TOKEN is the GNU keyword __thread.  */
+
+static bool
+token_is__thread (cp_token *token)
+{
+  gcc_assert (token->keyword == RID_THREAD);
+  return !strcmp (IDENTIFIER_POINTER (token->u.value), "__thread");
+}
+
 /* Set the location for a declarator specifier and check if it is
    duplicated.
 
@@ -22173,15 +22185,21 @@ cp_parser_set_decl_spec_type (cp_decl_specifier_seq *decl_specs,
 
 static void
 set_and_check_decl_spec_loc (cp_decl_specifier_seq *decl_specs,
-			     cp_decl_spec ds, source_location location)
+			     cp_decl_spec ds, cp_token *token)
 {
   gcc_assert (ds < ds_last);
 
   if (decl_specs == NULL)
     return;
 
+  source_location location = token->location;
+
   if (decl_specs->locations[ds] == 0)
-    decl_specs->locations[ds] = location;
+    {
+      decl_specs->locations[ds] = location;
+      if (ds == ds_thread)
+	decl_specs->gnu_thread_keyword_p = token_is__thread (token);
+    }
   else
     {
       if (ds == ds_long)
@@ -22197,6 +22215,15 @@ set_and_check_decl_spec_loc (cp_decl_specifier_seq *decl_specs,
 			     "ISO C++ 1998 does not support %<long long%>");
 	    }
 	}
+      else if (ds == ds_thread)
+	{
+	  bool gnu = token_is__thread (token);
+	  if (gnu != decl_specs->gnu_thread_keyword_p)
+	    error_at (location,
+		      "both %<__thread%> and %<thread_local%> specified");
+	  else
+	    error_at (location, "duplicate %qD", token->u.value);
+	}
       else
 	{
 	  static const char *const decl_spec_names[] = {
@@ -22214,8 +22241,7 @@ set_and_check_decl_spec_loc (cp_decl_specifier_seq *decl_specs,
 	    "typedef",
 	    "using",
             "constexpr",
-	    "__complex",
-	    "__thread"
+	    "__complex"
 	  };
 	  error_at (location,
 		    "duplicate %qs", decl_spec_names[ds]);
@@ -24056,7 +24082,7 @@ cp_parser_objc_class_ivars (cp_parser* parser)
 	  declspecs.storage_class = sc_none;
 	}
 
-      /* __thread.  */
+      /* thread_local.  */
       if (decl_spec_seq_has_spec_p (&declspecs, ds_thread))
 	{
 	  cp_parser_error (parser, "invalid type for instance variable");
@@ -24635,7 +24661,7 @@ cp_parser_objc_struct_declaration (cp_parser *parser)
       declspecs.storage_class = sc_none;
     }
   
-  /* __thread.  */
+  /* thread_local.  */
   if (decl_spec_seq_has_spec_p (&declspecs, ds_thread))
     {
       cp_parser_error (parser, "invalid type for property");
diff --git a/gcc/testsuite/g++.dg/tls/init-2.C b/gcc/testsuite/g++.dg/tls/init-2.C
index c9f646d..327c309 100644
--- a/gcc/testsuite/g++.dg/tls/init-2.C
+++ b/gcc/testsuite/g++.dg/tls/init-2.C
@@ -2,13 +2,13 @@
 /* { dg-require-effective-target tls } */
 
 extern __thread int i;
-__thread int *p = &i;	/* { dg-error "dynamically initialized" } */
+__thread int *p = &i;	/* { dg-error "dynamic initialization" } */
 
 extern int f();
-__thread int j = f();	/* { dg-error "dynamically initialized" } */
+__thread int j = f();	/* { dg-error "dynamic initialization" } */
 
 struct S
 {
   S();
 };
-__thread S s;		/* { dg-error "" } two errors here */
+__thread S s;		/* { dg-error "dynamic initialization" } */
diff --git a/gcc/testsuite/g++.dg/tls/thread_local1.C b/gcc/testsuite/g++.dg/tls/thread_local1.C
new file mode 100644
index 0000000..e7734a0
--- /dev/null
+++ b/gcc/testsuite/g++.dg/tls/thread_local1.C
@@ -0,0 +1,21 @@
+// { dg-options "-std=c++11" }
+// { dg-require-effective-target tls }
+
+// The variable should have a guard.
+// { dg-final { scan-assembler "_ZGVZ1fvE1a" } }
+// But since it's thread local we don't need to guard against
+// simultaneous execution.
+// { dg-final { scan-assembler-not "cxa_guard" } }
+// The guard should be TLS, not local common.
+// { dg-final { scan-assembler-not "\.comm" } }
+
+struct A
+{
+  A();
+};
+
+A &f()
+{
+  thread_local A a;
+  return a;
+}
diff --git a/gcc/testsuite/g++.dg/tls/thread_local2.C b/gcc/testsuite/g++.dg/tls/thread_local2.C
new file mode 100644
index 0000000..4cbef15
--- /dev/null
+++ b/gcc/testsuite/g++.dg/tls/thread_local2.C
@@ -0,0 +1,27 @@
+// { dg-do run }
+// { dg-options "-std=c++11" }
+// { dg-require-effective-target tls_runtime }
+
+extern "C" void abort();
+
+struct A
+{
+  A();
+  int i;
+};
+
+A &f()
+{
+  thread_local A a;
+  return a;
+}
+
+int j;
+A::A(): i(j) { }
+
+int main()
+{
+  j = 42;
+  if (f().i != 42)
+    abort ();
+}
diff --git a/gcc/testsuite/g++.dg/tls/thread_local7.C b/gcc/testsuite/g++.dg/tls/thread_local7.C
new file mode 100644
index 0000000..77a1c05
--- /dev/null
+++ b/gcc/testsuite/g++.dg/tls/thread_local7.C
@@ -0,0 +1,10 @@
+// { dg-options "-std=c++11" }
+// { dg-require-effective-target tls }
+
+// The reference temp should be TLS, not normal data.
+// { dg-final { scan-assembler-not "\\.data" } }
+
+void f()
+{
+  thread_local int&& ir = 42;
+}
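For readers following the grokdeclarator change above: the new comment says that thread_local at block scope implies static when static is not written. A minimal sketch of what that means for user code follows. The function names bump and bump_explicit are made up for illustration and do not appear in the patch.

```cpp
// Counts the calls made by the current thread.  `calls` is declared
// thread_local at block scope, so `static` is implied: the variable
// keeps its value between calls instead of being re-created each time.
int bump()
{
  thread_local int calls = 0;
  return ++calls;
}

// The same behaviour with the storage class spelled out explicitly.
// This is a separate variable from the one in bump().
int bump_explicit()
{
  static thread_local int calls = 0;
  return ++calls;
}
```

Each thread that calls bump() gets its own counter starting at zero; within one thread the two spellings behave the same way.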

commit 18c01be0ec8b7a3cda6a16e86356e8e434c12f89
Author: Jason Merrill <jason@redhat.com>
Date:   Thu Sep 20 16:00:08 2012 -0400

    	Support C++11 thread_local destructors.
    gcc/cp/
    	* decl.c (get_thread_atexit_node): New.
    	(register_dtor_fn): Use it for TLS.
    libstdc++-v3/
    	* libsupc++/cxxabi.h: Declare __cxa_thread_atexit.
    	* libsupc++/atexit_thread.cc: New.
    	* libsupc++/Makefile.am (sources): Add it.
    	(atexit_thread.lo, atexit_thread.o): New rules.
    	* config/abi/pre/gnu.ver: Add __cxa_thread_atexit.

diff --git a/gcc/cp/decl.c b/gcc/cp/decl.c
index 980aec2..0d04dc0 100644
--- a/gcc/cp/decl.c
+++ b/gcc/cp/decl.c
@@ -6542,6 +6542,24 @@ get_atexit_node (void)
   return atexit_node;
 }
 
+/* Like get_atexit_node, but for thread-local cleanups.  */
+
+static tree
+get_thread_atexit_node (void)
+{
+  /* The declaration for `__cxa_thread_atexit' is:
+
+     int __cxa_thread_atexit (void (*)(void *), void *, void *)  */
+  tree fn_type = build_function_type_list (integer_type_node,
+					   get_atexit_fn_ptr_type (),
+					   ptr_type_node, ptr_type_node,
+					   NULL_TREE);
+
+  /* Now, build the function declaration.  */
+  tree atexit_fndecl = build_library_fn_ptr ("__cxa_thread_atexit", fn_type);
+  return decay_conversion (atexit_fndecl, tf_warning_or_error);
+}
+
 /* Returns the __dso_handle VAR_DECL.  */
 
 static tree
@@ -6633,23 +6651,27 @@ tree
 register_dtor_fn (tree decl)
 {
   tree cleanup;
+  tree addr;
   tree compound_stmt;
   tree fcall;
   tree type;
-  bool use_dtor;
-  tree arg0, arg1 = NULL_TREE, arg2 = NULL_TREE;
+  bool ob_parm, dso_parm, use_dtor;
+  tree arg0, arg1, arg2;
+  tree atex_node;
 
   type = TREE_TYPE (decl);
   if (TYPE_HAS_TRIVIAL_DESTRUCTOR (type))
     return void_zero_node;
 
-  /* If we're using "__cxa_atexit" (or "__aeabi_atexit"), and DECL is
-     a class object, we can just pass the destructor to
-     "__cxa_atexit"; we don't have to build a temporary function to do
-     the cleanup.  */
-  use_dtor = (flag_use_cxa_atexit 
-	      && !targetm.cxx.use_atexit_for_cxa_atexit ()
-	      && CLASS_TYPE_P (type));
+  /* If we're using "__cxa_atexit" (or "__cxa_thread_atexit" or
+     "__aeabi_atexit"), and DECL is a class object, we can just pass the
+     destructor to "__cxa_atexit"; we don't have to build a temporary
+     function to do the cleanup.  */
+  ob_parm = (DECL_THREAD_LOCAL_P (decl)
+	     || (flag_use_cxa_atexit
+		 && !targetm.cxx.use_atexit_for_cxa_atexit ()));
+  dso_parm = ob_parm;
+  use_dtor = ob_parm && CLASS_TYPE_P (type);
   if (use_dtor)
     {
       int idx;
@@ -6687,44 +6709,48 @@ register_dtor_fn (tree decl)
       end_cleanup_fn ();
     }
 
-  if (DECL_THREAD_LOCAL_P (decl))
-    /* We don't have a thread-local atexit yet.  FIXME write one using
-       pthread_key_create and friends.  */
-    sorry ("thread-local variable %q#D with non-trivial "
-	   "destructor", decl);
-
   /* Call atexit with the cleanup function.  */
   mark_used (cleanup);
   cleanup = build_address (cleanup);
-  if (flag_use_cxa_atexit && !targetm.cxx.use_atexit_for_cxa_atexit ())
+
+  if (DECL_THREAD_LOCAL_P (decl))
+    atex_node = get_thread_atexit_node ();
+  else
+    atex_node = get_atexit_node ();
+
+  if (use_dtor)
     {
-      tree addr;
+      /* We must convert CLEANUP to the type that "__cxa_atexit"
+	 expects.  */
+      cleanup = build_nop (get_atexit_fn_ptr_type (), cleanup);
+      /* "__cxa_atexit" will pass the address of DECL to the
+	 cleanup function.  */
+      mark_used (decl);
+      addr = build_address (decl);
+      /* The declared type of the parameter to "__cxa_atexit" is
+	 "void *".  For plain "T*", we could just let the
+	 machinery in cp_build_function_call convert it -- but if the
+	 type is "cv-qualified T *", then we need to convert it
+	 before passing it in, to avoid spurious errors.  */
+      addr = build_nop (ptr_type_node, addr);
+    }
+  else if (ob_parm)
+    /* Since the cleanup functions we build ignore the address
+       they're given, there's no reason to pass the actual address
+       in, and, in general, it's cheaper to pass NULL than any
+       other value.  */
+    addr = null_pointer_node;
+
+  if (dso_parm)
+    arg2 = cp_build_addr_expr (get_dso_handle_node (),
+			       tf_warning_or_error);
+  else
+    arg2 = NULL_TREE;
 
-      if (use_dtor)
-	{
-	  /* We must convert CLEANUP to the type that "__cxa_atexit"
-	     expects.  */
-	  cleanup = build_nop (get_atexit_fn_ptr_type (), cleanup);
-	  /* "__cxa_atexit" will pass the address of DECL to the
-	     cleanup function.  */
-	  mark_used (decl);
-	  addr = build_address (decl);
-	  /* The declared type of the parameter to "__cxa_atexit" is
-	     "void *".  For plain "T*", we could just let the
-	     machinery in cp_build_function_call convert it -- but if the
-	     type is "cv-qualified T *", then we need to convert it
-	     before passing it in, to avoid spurious errors.  */
-	  addr = build_nop (ptr_type_node, addr);
-	}
-      else
-	/* Since the cleanup functions we build ignore the address
-	   they're given, there's no reason to pass the actual address
-	   in, and, in general, it's cheaper to pass NULL than any
-	   other value.  */
-	addr = null_pointer_node;
-      arg2 = cp_build_addr_expr (get_dso_handle_node (),
-				 tf_warning_or_error);
-      if (targetm.cxx.use_aeabi_atexit ())
+  if (ob_parm)
+    {
+      if (!DECL_THREAD_LOCAL_P (decl)
+	  && targetm.cxx.use_aeabi_atexit ())
 	{
 	  arg1 = cleanup;
 	  arg0 = addr;
@@ -6736,8 +6762,11 @@ register_dtor_fn (tree decl)
 	}
     }
   else
-    arg0 = cleanup;
-  return cp_build_function_call_nary (get_atexit_node (), tf_warning_or_error,
+    {
+      arg0 = cleanup;
+      arg1 = NULL_TREE;
+    }
+  return cp_build_function_call_nary (atex_node, tf_warning_or_error,
 				      arg0, arg1, arg2, NULL_TREE);
 }
 
diff --git a/gcc/testsuite/g++.dg/tls/thread_local3.C b/gcc/testsuite/g++.dg/tls/thread_local3.C
new file mode 100644
index 0000000..461f126
--- /dev/null
+++ b/gcc/testsuite/g++.dg/tls/thread_local3.C
@@ -0,0 +1,37 @@
+// { dg-do run }
+// { dg-require-effective-target c++11 }
+// { dg-require-effective-target tls_runtime }
+// { dg-require-effective-target pthread }
+// { dg-options -pthread }
+
+int c;
+int d;
+struct A
+{
+  A() { ++c; }
+  ~A() { ++d; }
+};
+
+void f()
+{
+  thread_local A a;
+}
+
+void *thread_main(void *)
+{
+  f(); f(); f(); return 0;
+}
+
+#include <pthread.h>
+
+int main()
+{
+  pthread_t thread;
+  pthread_create (&thread, 0, thread_main, 0);
+  pthread_join(thread, 0);
+  pthread_create (&thread, 0, thread_main, 0);
+  pthread_join(thread, 0);
+
+  if (c != 2 || d != 2)
+    __builtin_abort();
+}
diff --git a/gcc/testsuite/g++.dg/tls/thread_local4.C b/gcc/testsuite/g++.dg/tls/thread_local4.C
new file mode 100644
index 0000000..53b1f05
--- /dev/null
+++ b/gcc/testsuite/g++.dg/tls/thread_local4.C
@@ -0,0 +1,47 @@
+// Test for cleanups with pthread_cancel.
+
+// { dg-do run }
+// { dg-require-effective-target c++11 }
+// { dg-require-effective-target tls_runtime }
+// { dg-require-effective-target pthread }
+// { dg-options -pthread }
+
+#include <pthread.h>
+#include <unistd.h>
+
+int c;
+int d;
+struct A
+{
+  A() { ++c; }
+  ~A() { ++d; }
+};
+
+void f()
+{
+  thread_local A a;
+}
+
+void *thread_main(void *)
+{
+  f(); f(); f();
+  while (true)
+    {
+      pthread_testcancel();
+      sleep (1);
+    }
+}
+
+int main()
+{
+  pthread_t thread;
+  pthread_create (&thread, 0, thread_main, 0);
+  pthread_cancel(thread);
+  pthread_join(thread, 0);
+  pthread_create (&thread, 0, thread_main, 0);
+  pthread_cancel(thread);
+  pthread_join(thread, 0);
+
+  if (c != 2 || d != 2)
+    __builtin_abort();
+}
diff --git a/gcc/testsuite/g++.dg/tls/thread_local5.C b/gcc/testsuite/g++.dg/tls/thread_local5.C
new file mode 100644
index 0000000..7ce02f6
--- /dev/null
+++ b/gcc/testsuite/g++.dg/tls/thread_local5.C
@@ -0,0 +1,47 @@
+// Test for cleanups in the main thread, too.
+
+// { dg-do run }
+// { dg-require-effective-target c++11 }
+// { dg-require-effective-target tls_runtime }
+// { dg-require-effective-target pthread }
+// { dg-options -pthread }
+
+#include <pthread.h>
+#include <unistd.h>
+
+int c;
+int d;
+struct A
+{
+  A() { ++c; }
+  ~A() {
+    if (++d == 3)
+      _exit (0);
+  }
+};
+
+void f()
+{
+  thread_local A a;
+}
+
+void *thread_main(void *)
+{
+  f(); f(); f(); return 0;
+}
+
+int main()
+{
+  pthread_t thread;
+  thread_main(0);
+  pthread_create (&thread, 0, thread_main, 0);
+  pthread_join(thread, 0);
+  pthread_create (&thread, 0, thread_main, 0);
+  pthread_join(thread, 0);
+
+  // The dtor for a in the main thread is run after main exits, so we
+  // return 1 now and override the return value with _exit above.
+  if (c != 3 || d != 2)
+    __builtin_abort();
+  return 1;
+}
diff --git a/gcc/testsuite/g++.dg/tls/thread_local6.C b/gcc/testsuite/g++.dg/tls/thread_local6.C
new file mode 100644
index 0000000..118969a
--- /dev/null
+++ b/gcc/testsuite/g++.dg/tls/thread_local6.C
@@ -0,0 +1,33 @@
+// Test for cleanups in the main thread without -pthread.
+
+// { dg-do run }
+// { dg-options "-std=c++11" }
+// { dg-require-effective-target tls_runtime }
+
+extern "C" void _exit (int);
+
+int c;
+struct A
+{
+  A() { ++c; }
+  ~A() { if (c == 1) _exit(0); }
+};
+
+void f()
+{
+  thread_local A a;
+}
+
+void *thread_main(void *)
+{
+  f(); f(); f(); return 0;
+}
+
+int main()
+{
+  thread_main(0);
+
+  // The dtor for a in the main thread is run after main exits, so we
+  // return 1 now and override the return value with _exit above.
+  return 1;
+}
diff --git a/libstdc++-v3/config/abi/pre/gnu.ver b/libstdc++-v3/config/abi/pre/gnu.ver
index 396feec..e23fdfb 100644
--- a/libstdc++-v3/config/abi/pre/gnu.ver
+++ b/libstdc++-v3/config/abi/pre/gnu.ver
@@ -1531,6 +1531,10 @@ CXXABI_1.3.6 {
 
 } CXXABI_1.3.5;
 
+CXXABI_1.3.7 {
+    __cxa_thread_atexit;
+} CXXABI_1.3.6;
+
 
 # Symbols in the support library (libsupc++) supporting transactional memory.
 CXXABI_TM_1 {
diff --git a/libstdc++-v3/libsupc++/Makefile.am b/libstdc++-v3/libsupc++/Makefile.am
index 69cbf5c..a019bd8 100644
--- a/libstdc++-v3/libsupc++/Makefile.am
+++ b/libstdc++-v3/libsupc++/Makefile.am
@@ -48,6 +48,7 @@ endif
 sources = \
 	array_type_info.cc \
 	atexit_arm.cc \
+	atexit_thread.cc \
 	bad_alloc.cc \
 	bad_cast.cc \
 	bad_typeid.cc \
@@ -123,6 +124,11 @@ guard.lo: guard.cc
 guard.o: guard.cc
 	$(CXXCOMPILE) -std=gnu++0x -c $<
 
+atexit_thread.lo: atexit_thread.cc
+	$(LTCXXCOMPILE) -std=gnu++0x -c $<
+atexit_thread.o: atexit_thread.cc
+	$(CXXCOMPILE) -std=gnu++0x -c $<
+
 nested_exception.lo: nested_exception.cc
 	$(LTCXXCOMPILE) -std=gnu++0x -c $<
 nested_exception.o: nested_exception.cc
diff --git a/libstdc++-v3/libsupc++/Makefile.in b/libstdc++-v3/libsupc++/Makefile.in
index b2af9ba..e745179 100644
--- a/libstdc++-v3/libsupc++/Makefile.in
+++ b/libstdc++-v3/libsupc++/Makefile.in
@@ -90,7 +90,7 @@ am__installdirs = "$(DESTDIR)$(toolexeclibdir)" "$(DESTDIR)$(bitsdir)" \
 	"$(DESTDIR)$(stddir)"
 LTLIBRARIES = $(noinst_LTLIBRARIES) $(toolexeclib_LTLIBRARIES)
 libsupc___la_LIBADD =
-am__objects_1 = array_type_info.lo atexit_arm.lo bad_alloc.lo \
+am__objects_1 = array_type_info.lo atexit_arm.lo atexit_thread.lo bad_alloc.lo \
 	bad_cast.lo bad_typeid.lo class_type_info.lo del_op.lo \
 	del_opnt.lo del_opv.lo del_opvnt.lo dyncast.lo eh_alloc.lo \
 	eh_arm.lo eh_aux_runtime.lo eh_call.lo eh_catch.lo \
@@ -362,6 +362,7 @@ headers = $(std_HEADERS) $(bits_HEADERS)
 sources = \
 	array_type_info.cc \
 	atexit_arm.cc \
+	atexit_thread.cc \
 	bad_alloc.cc \
 	bad_cast.cc \
 	bad_typeid.cc \
@@ -800,6 +801,11 @@ guard.lo: guard.cc
 guard.o: guard.cc
 	$(CXXCOMPILE) -std=gnu++0x -c $<
 
+atexit_thread.lo: atexit_thread.cc
+	$(LTCXXCOMPILE) -std=gnu++0x -c $<
+atexit_thread.o: atexit_thread.cc
+	$(CXXCOMPILE) -std=gnu++0x -c $<
+
 nested_exception.lo: nested_exception.cc
 	$(LTCXXCOMPILE) -std=gnu++0x -c $<
 nested_exception.o: nested_exception.cc
diff --git a/libstdc++-v3/libsupc++/atexit_thread.cc b/libstdc++-v3/libsupc++/atexit_thread.cc
new file mode 100644
index 0000000..517b4ef
--- /dev/null
+++ b/libstdc++-v3/libsupc++/atexit_thread.cc
@@ -0,0 +1,134 @@
+// Copyright (C) 2012 Free Software Foundation, Inc.
+//
+// This file is part of GCC.
+//
+// GCC is free software; you can redistribute it and/or modify
+// it under the terms of the GNU General Public License as published by
+// the Free Software Foundation; either version 3, or (at your option)
+// any later version.
+
+// GCC is distributed in the hope that it will be useful,
+// but WITHOUT ANY WARRANTY; without even the implied warranty of
+// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+// GNU General Public License for more details.
+
+// Under Section 7 of GPL version 3, you are granted additional
+// permissions described in the GCC Runtime Library Exception, version
+// 3.1, as published by the Free Software Foundation.
+
+// You should have received a copy of the GNU General Public License and
+// a copy of the GCC Runtime Library Exception along with this program;
+// see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
+// <http://www.gnu.org/licenses/>.
+
+#include <cxxabi.h>
+#include <new>
+#include "bits/gthr.h"
+
+namespace {
+  // Data structure for the list of destructors: Singly-linked list
+  // of arrays.
+  class list
+  {
+    struct elt
+    {
+      void *object;
+      void (*destructor)(void *);
+    };
+
+    static const int max_nelts = 32;
+
+    list *next;
+    int nelts;
+    elt array[max_nelts];
+
+    elt *allocate_elt();
+  public:
+    void run();
+    static void run(void *p);
+    int add_elt(void (*)(void *), void *);
+  };
+
+  // Return the address of an open slot.
+  list::elt *
+  list::allocate_elt()
+  {
+    if (nelts < max_nelts)
+      return &array[nelts++];
+    if (!next)
+      next = new (std::nothrow) list();
+    if (!next)
+      return 0;
+    return next->allocate_elt();
+  }
+
+  // Run all the cleanups in the list.
+  void
+  list::run()
+  {
+    for (int i = nelts - 1; i >= 0; --i)
+      array[i].destructor (array[i].object);
+    if (next)
+      next->run();
+  }
+
+  // Static version to use as a callback to __gthread_key_create.
+  void
+  list::run(void *p)
+  {
+    static_cast<list *>(p)->run();
+  }
+
+  // The list of cleanups is per-thread.
+  thread_local list first;
+
+  // The pthread data structures for actually running the destructors at
+  // thread exit are shared.  The constructor of the thread-local sentinel
+  // object in add_elt performs the initialization.
+  __gthread_key_t key;
+  __gthread_once_t once = __GTHREAD_ONCE_INIT;
+  void run_current (void *) { first.run(); }
+  void key_init() {
+    __gthread_key_create (&key, list::run);
+    // Also make sure the destructors are run by std::exit.
+    // FIXME TLS cleanups should run before static cleanups and atexit
+    // cleanups.
+    abi::__cxa_atexit (run_current, NULL, NULL);
+  }
+  struct sentinel
+  {
+    sentinel()
+    {
+      if (__gthread_active_p ())
+	{
+	  __gthread_once (&once, key_init);
+	  __gthread_setspecific (key, &first);
+	}
+      else
+	abi::__cxa_atexit (run_current, NULL, NULL);
+    }
+  };
+
+  // Actually insert an element.
+  int
+  list::add_elt(void (*dtor)(void *), void *obj)
+  {
+    thread_local sentinel s;
+    elt *e = allocate_elt ();
+    if (!e)
+      return -1;
+    e->object = obj;
+    e->destructor = dtor;
+    return 0;
+  }
+}
+
+namespace __cxxabiv1
+{
+  extern "C" int
+  __cxa_thread_atexit (void (*dtor)(void *), void *obj, void *dso_handle)
+    _GLIBCXX_NOTHROW
+  {
+    return first.add_elt (dtor, obj);
+  }
+}
diff --git a/libstdc++-v3/libsupc++/cxxabi.h b/libstdc++-v3/libsupc++/cxxabi.h
index b924fc1..582c435 100644
--- a/libstdc++-v3/libsupc++/cxxabi.h
+++ b/libstdc++-v3/libsupc++/cxxabi.h
@@ -134,6 +134,10 @@ namespace __cxxabiv1
   int
   __cxa_finalize(void*);
 
+  // TLS destruction.
+  int
+  __cxa_thread_atexit(void (*)(void*), void*, void *) _GLIBCXX_NOTHROW;
+
   // Pure virtual functions.
   void
   __cxa_pure_virtual(void) __attribute__ ((__noreturn__));
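
For readers following the libsupc++ part of the patch above: the runtime keeps, per thread, a list of (destructor, object) pairs registered through __cxa_thread_atexit, and runs them at thread exit or from std::exit. The block below is only a minimal single-threaded sketch of that bookkeeping, not the code from the patch. It assumes a std::vector in place of the patch's chunked array, and the names cleanup_sketch, cleanup, cleanup_list, record and run_order are hypothetical. It runs cleanups strictly last-registered-first, which is a simplification.

```cpp
#include <cassert>
#include <vector>

namespace cleanup_sketch
{
  // Hypothetical stand-in for one pending cleanup: a destructor
  // callback and the object to pass to it.
  struct cleanup
  {
    void (*destructor) (void *);
    void *object;
  };

  // Hypothetical stand-in for the per-thread list of cleanups.
  struct cleanup_list
  {
    std::vector<cleanup> elts;

    // Register a cleanup.  Returns 0 on success, matching the
    // convention used by add_elt in the patch.
    int add_elt (void (*dtor) (void *), void *obj)
    {
      cleanup c = { dtor, obj };
      elts.push_back (c);
      return 0;
    }

    // Run every registered cleanup, most recently registered first.
    void run ()
    {
      while (!elts.empty ())
        {
          cleanup c = elts.back ();
          elts.pop_back ();
          c.destructor (c.object);
        }
    }
  };

  // Records the order in which the fake destructors ran.
  std::vector<int> order;
  void record (void *p) { order.push_back (*static_cast<int *> (p)); }

  // Register three objects and report the order of their cleanups.
  std::vector<int> run_order ()
  {
    order.clear ();
    cleanup_list l;
    int a = 1, b = 2, c = 3;
    int rc = l.add_elt (record, &a);
    rc += l.add_elt (record, &b);
    rc += l.add_elt (record, &c);
    assert (rc == 0);
    (void) rc;
    l.run ();
    return order;
  }
}
```

Calling cleanup_sketch::run_order () registers three objects and returns the order in which their callbacks fired. The real implementation additionally hooks the list into __gthread_key_create and __cxa_atexit, which this sketch leaves out.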

commit c20f7291032d68f0afe249f1d1a67707acab720d
Author: Jason Merrill <jason@redhat.com>
Date:   Thu Sep 27 06:15:22 2012 -0400

    	Allow dynamic initialization of thread_locals.
    gcc/cp/
    	* decl.c: Define tls_aggregates.
    	(expand_static_init): Remove sorry.  Add to tls_aggregates.
    	* cp-tree.h: Declare tls_aggregates.
    	* call.c (set_up_extended_ref_temp): Add to tls_aggregates.
    	* decl2.c (var_needs_tls_wrapper): New.
    	(var_defined_without_dynamic_init): New.
    	(get_tls_init_fn, get_tls_wrapper_fn): New.
    	(generate_tls_wrapper, handle_tls_init): New.
    	(cp_write_global_declarations): Call handle_tls_init and
    	generate_tls_wrapper.
    	* mangle.c (write_guarded_var_name): Split out from...
    	(mangle_guard_variable): ...here.
    	(mangle_tls_init_fn, mangle_tls_wrapper_fn): Use it.
    	(decl_tls_wrapper_p): New.
    	* semantics.c (finish_id_expression): Replace use of thread_local
    	variable with a call to its wrapper.
    libiberty/
    	* cp-demangle.c (d_special_name, d_dump): Handle TH and TW.
    	(d_make_comp, d_print_comp): Likewise.
    include/
    	* demangle.h (enum demangle_component_type): Add
    	DEMANGLE_COMPONENT_TLS_INIT and DEMANGLE_COMPONENT_TLS_WRAPPER.

diff --git a/gcc/cp/call.c b/gcc/cp/call.c
index 5cc9bad..943d0c0 100644
--- a/gcc/cp/call.c
+++ b/gcc/cp/call.c
@@ -8859,8 +8859,14 @@ set_up_extended_ref_temp (tree decl, tree expr, VEC(tree,gc) **cleanups,
     {
       rest_of_decl_compilation (var, /*toplev=*/1, at_eof);
       if (TYPE_HAS_NONTRIVIAL_DESTRUCTOR (type))
-	static_aggregates = tree_cons (NULL_TREE, var,
-				       static_aggregates);
+	{
+	  if (DECL_THREAD_LOCAL_P (var))
+	    tls_aggregates = tree_cons (NULL_TREE, var,
+					tls_aggregates);
+	  else
+	    static_aggregates = tree_cons (NULL_TREE, var,
+					   static_aggregates);
+	}
     }
 
   *initp = init;
diff --git a/gcc/cp/cp-tree.h b/gcc/cp/cp-tree.h
index 12ad4ed..97b6d51 100644
--- a/gcc/cp/cp-tree.h
+++ b/gcc/cp/cp-tree.h
@@ -4378,6 +4378,8 @@ extern int at_eof;
    in the TREE_VALUE slot and the initializer is stored in the
    TREE_PURPOSE slot.  */
 extern GTY(()) tree static_aggregates;
+/* Likewise, for thread local storage.  */
+extern GTY(()) tree tls_aggregates;
 
 enum overload_flags { NO_SPECIAL = 0, DTOR_FLAG, TYPENAME_FLAG };
 
@@ -5179,6 +5181,7 @@ extern tree cp_build_parm_decl			(tree, tree);
 extern tree get_guard				(tree);
 extern tree get_guard_cond			(tree);
 extern tree set_guard				(tree);
+extern tree get_tls_wrapper_fn			(tree);
 extern void mark_needed				(tree);
 extern bool decl_needed_p			(tree);
 extern void note_vague_linkage_fn		(tree);
@@ -5973,6 +5976,9 @@ extern tree mangle_ctor_vtbl_for_type		(tree, tree);
 extern tree mangle_thunk			(tree, int, tree, tree);
 extern tree mangle_conv_op_name_for_type	(tree);
 extern tree mangle_guard_variable		(tree);
+extern tree mangle_tls_init_fn			(tree);
+extern tree mangle_tls_wrapper_fn		(tree);
+extern bool decl_tls_wrapper_p			(tree);
 extern tree mangle_ref_init_variable		(tree);
 
 /* in dump.c */
diff --git a/gcc/cp/decl.c b/gcc/cp/decl.c
index 0d04dc0..246ba9a 100644
--- a/gcc/cp/decl.c
+++ b/gcc/cp/decl.c
@@ -169,6 +169,9 @@ tree global_scope_name;
    in the TREE_PURPOSE slot.  */
 tree static_aggregates;
 
+/* Like static_aggregates, but for thread_local variables.  */
+tree tls_aggregates;
+
 /* -- end of C++ */
 
 /* A node for the integer constant 2.  */
@@ -6805,16 +6808,6 @@ expand_static_init (tree decl, tree init)
       return;
     }
 
-  if (DECL_THREAD_LOCAL_P (decl) && !DECL_FUNCTION_SCOPE_P (decl))
-    {
-      /* We haven't implemented dynamic initialization of non-local
-	 thread-local storage yet.  FIXME transform to singleton
-	 function.  */
-      sorry ("thread-local variable %qD with dynamic initialization outside "
-	     "function scope", decl);
-      return;
-    }
-
   if (DECL_FUNCTION_SCOPE_P (decl))
     {
       /* Emit code to perform this initialization but once.  */
@@ -6943,6 +6936,8 @@ expand_static_init (tree decl, tree init)
 	  finish_if_stmt (if_stmt);
 	}
     }
+  else if (DECL_THREAD_LOCAL_P (decl))
+    tls_aggregates = tree_cons (init, decl, tls_aggregates);
   else
     static_aggregates = tree_cons (init, decl, static_aggregates);
 }
diff --git a/gcc/cp/decl2.c b/gcc/cp/decl2.c
index a240ff4..869c616 100644
--- a/gcc/cp/decl2.c
+++ b/gcc/cp/decl2.c
@@ -2781,6 +2781,187 @@ set_guard (tree guard)
 			       tf_warning_or_error);
 }
 
+/* Returns true iff we can tell that VAR does not have a dynamic
+   initializer.  */
+
+static bool
+var_defined_without_dynamic_init (tree var)
+{
+  /* If it's defined in another TU, we can't tell.  */
+  if (DECL_EXTERNAL (var))
+    return false;
+  /* If it has a non-trivial destructor, registering the destructor
+     counts as dynamic initialization.  */
+  if (TYPE_HAS_NONTRIVIAL_DESTRUCTOR (TREE_TYPE (var)))
+    return false;
+  /* If it's in this TU, its initializer has been processed.  */
+  gcc_assert (DECL_INITIALIZED_P (var));
+  /* If it has no initializer or a constant one, it's not dynamic.  */
+  return (!DECL_NONTRIVIALLY_INITIALIZED_P (var)
+	  || DECL_INITIALIZED_BY_CONSTANT_EXPRESSION_P (var));
+}
+
+/* Returns true iff VAR is a variable that needs uses to be
+   wrapped for possible dynamic initialization.  */
+
+static bool
+var_needs_tls_wrapper (tree var)
+{
+  return (DECL_THREAD_LOCAL_P (var)
+	  && !DECL_GNU_TLS_P (var)
+	  && !DECL_FUNCTION_SCOPE_P (var)
+	  && !var_defined_without_dynamic_init (var));
+}
+
+/* Get a FUNCTION_DECL for the init function for the thread_local
+   variable VAR.  The init function will be an alias to the function
+   that initializes all the non-local TLS variables in the translation
+   unit.  The init function is only used by the wrapper function.  */
+
+static tree
+get_tls_init_fn (tree var)
+{
+  /* Only C++11 TLS vars need this init fn.  */
+  if (!var_needs_tls_wrapper (var))
+    return NULL_TREE;
+
+  tree sname = mangle_tls_init_fn (var);
+  tree fn = IDENTIFIER_GLOBAL_VALUE (sname);
+  if (!fn)
+    {
+      fn = build_lang_decl (FUNCTION_DECL, sname,
+			    build_function_type (void_type_node,
+						 void_list_node));
+      SET_DECL_LANGUAGE (fn, lang_c);
+      TREE_PUBLIC (fn) = TREE_PUBLIC (var);
+      DECL_ARTIFICIAL (fn) = true;
+      DECL_COMDAT (fn) = DECL_COMDAT (var);
+      DECL_EXTERNAL (fn) = true;
+      if (DECL_ONE_ONLY (var))
+	make_decl_one_only (fn, cxx_comdat_group (fn));
+      if (TREE_PUBLIC (var))
+	{
+	  tree obtype = strip_array_types (non_reference (TREE_TYPE (var)));
+	  /* If the variable might have static initialization, make the
+	     init function a weak reference.  */
+	  if ((!TYPE_NEEDS_CONSTRUCTING (obtype)
+	       || TYPE_HAS_CONSTEXPR_CTOR (obtype))
+	      && TARGET_SUPPORTS_WEAK)
+	    declare_weak (fn);
+	  else
+	    DECL_WEAK (fn) = DECL_WEAK (var);
+	}
+      DECL_VISIBILITY (fn) = DECL_VISIBILITY (var);
+      DECL_VISIBILITY_SPECIFIED (fn) = DECL_VISIBILITY_SPECIFIED (var);
+      DECL_DLLIMPORT_P (fn) = DECL_DLLIMPORT_P (var);
+      DECL_IGNORED_P (fn) = 1;
+      mark_used (fn);
+
+      DECL_BEFRIENDING_CLASSES (fn) = var;
+
+      SET_IDENTIFIER_GLOBAL_VALUE (sname, fn);
+    }
+  return fn;
+}
+
+/* Get a FUNCTION_DECL for the init wrapper function for the thread_local
+   variable VAR.  The wrapper function calls the init function (if any) for
+   VAR and then returns a reference to VAR.  The wrapper function is used
+   in place of VAR everywhere VAR is mentioned.  */
+
+tree
+get_tls_wrapper_fn (tree var)
+{
+  /* Only C++11 TLS vars need this wrapper fn.  */
+  if (!var_needs_tls_wrapper (var))
+    return NULL_TREE;
+
+  tree sname = mangle_tls_wrapper_fn (var);
+  tree fn = IDENTIFIER_GLOBAL_VALUE (sname);
+  if (!fn)
+    {
+      /* A named rvalue reference is an lvalue, so the wrapper should
+	 always return an lvalue reference.  */
+      tree type = non_reference (TREE_TYPE (var));
+      type = build_reference_type (type);
+      tree fntype = build_function_type (type, void_list_node);
+      fn = build_lang_decl (FUNCTION_DECL, sname, fntype);
+      SET_DECL_LANGUAGE (fn, lang_c);
+      TREE_PUBLIC (fn) = TREE_PUBLIC (var);
+      DECL_ARTIFICIAL (fn) = true;
+      DECL_IGNORED_P (fn) = 1;
+      /* The wrapper is inline and emitted everywhere var is used.  */
+      DECL_DECLARED_INLINE_P (fn) = true;
+      if (TREE_PUBLIC (var))
+	{
+	  comdat_linkage (fn);
+#ifdef HAVE_GAS_HIDDEN
+	  /* Make the wrapper bind locally; there's no reason to share
+	     the wrapper between multiple shared objects.  */
+	  DECL_VISIBILITY (fn) = VISIBILITY_INTERNAL;
+	  DECL_VISIBILITY_SPECIFIED (fn) = true;
+#endif
+	}
+      if (!TREE_PUBLIC (fn))
+	DECL_INTERFACE_KNOWN (fn) = true;
+      mark_used (fn);
+      note_vague_linkage_fn (fn);
+
+#if 0
+      /* We want CSE to commonize calls to the wrapper, but marking it as
+	 pure is unsafe since it has side-effects.  I guess we need a new
+	 ECF flag even weaker than ECF_PURE.  FIXME!  */
+      DECL_PURE_P (fn) = true;
+#endif
+
+      DECL_BEFRIENDING_CLASSES (fn) = var;
+
+      SET_IDENTIFIER_GLOBAL_VALUE (sname, fn);
+    }
+  return fn;
+}
+
+/* At EOF, generate the definition for the TLS wrapper function FN:
+
+   T& var_wrapper() {
+     if (init_fn) init_fn();
+     return var;
+   }  */
+
+static void
+generate_tls_wrapper (tree fn)
+{
+  tree var = DECL_BEFRIENDING_CLASSES (fn);
+
+  start_preparsed_function (fn, NULL_TREE, SF_DEFAULT | SF_PRE_PARSED);
+  tree body = begin_function_body ();
+  /* Only call the init fn if there might be one.  */
+  if (tree init_fn = get_tls_init_fn (var))
+    {
+      tree if_stmt = NULL_TREE;
+      /* If init_fn is a weakref, make sure it exists before calling.  */
+      if (lookup_attribute ("weak", DECL_ATTRIBUTES (init_fn)))
+	{
+	  if_stmt = begin_if_stmt ();
+	  tree addr = cp_build_addr_expr (init_fn, tf_warning_or_error);
+	  tree cond = cp_build_binary_op (DECL_SOURCE_LOCATION (var),
+					  NE_EXPR, addr, nullptr_node,
+					  tf_warning_or_error);
+	  finish_if_stmt_cond (cond, if_stmt);
+	}
+      finish_expr_stmt (build_cxx_call
+			(init_fn, 0, NULL, tf_warning_or_error));
+      if (if_stmt)
+	{
+	  finish_then_clause (if_stmt);
+	  finish_if_stmt (if_stmt);
+	}
+    }
+  finish_return_stmt (convert_from_reference (var));
+  finish_function_body (body);
+  expand_or_defer_fn (finish_function (0));
+}
+
 /* Start the process of running a particular set of global constructors
    or destructors.  Subroutine of do_[cd]tors.  */
 
@@ -3668,6 +3849,75 @@ clear_decl_external (struct cgraph_node *node, void * /*data*/)
   return false;
 }
 
+/* Build up the function to run dynamic initializers for thread_local
+   variables in this translation unit and alias the init functions for the
+   individual variables to it.  */
+
+static void
+handle_tls_init (void)
+{
+  tree vars = prune_vars_needing_no_initialization (&tls_aggregates);
+  if (vars == NULL_TREE)
+    return;
+
+  location_t loc = DECL_SOURCE_LOCATION (TREE_VALUE (vars));
+
+  #ifndef ASM_OUTPUT_DEF
+  /* This currently requires alias support.  FIXME other targets could use
+     small thunks instead of aliases.  */
+  input_location = loc;
+  sorry ("dynamic initialization of non-function-local thread_local "
+	 "variables not supported on this target");
+  return;
+  #endif
+
+  write_out_vars (vars);
+
+  tree guard = build_decl (loc, VAR_DECL, get_identifier ("__tls_guard"),
+			   boolean_type_node);
+  TREE_PUBLIC (guard) = false;
+  TREE_STATIC (guard) = true;
+  DECL_ARTIFICIAL (guard) = true;
+  DECL_IGNORED_P (guard) = true;
+  TREE_USED (guard) = true;
+  DECL_TLS_MODEL (guard) = decl_default_tls_model (guard);
+  pushdecl_top_level_and_finish (guard, NULL_TREE);
+
+  tree fn = build_lang_decl (FUNCTION_DECL,
+			     get_identifier ("__tls_init"),
+			     build_function_type (void_type_node,
+						  void_list_node));
+  SET_DECL_LANGUAGE (fn, lang_c);
+  TREE_PUBLIC (fn) = false;
+  DECL_ARTIFICIAL (fn) = true;
+  mark_used (fn);
+  start_preparsed_function (fn, NULL_TREE, SF_PRE_PARSED);
+  tree body = begin_function_body ();
+  tree if_stmt = begin_if_stmt ();
+  tree cond = cp_build_unary_op (TRUTH_NOT_EXPR, guard, false,
+				 tf_warning_or_error);
+  finish_if_stmt_cond (cond, if_stmt);
+  finish_expr_stmt (cp_build_modify_expr (guard, NOP_EXPR, boolean_true_node,
+					  tf_warning_or_error));
+  for (; vars; vars = TREE_CHAIN (vars))
+    {
+      tree var = TREE_VALUE (vars);
+      tree init = TREE_PURPOSE (vars);
+      one_static_initialization_or_destruction (var, init, true);
+
+      tree single_init_fn = get_tls_init_fn (var);
+      cgraph_node *alias
+	= cgraph_same_body_alias (cgraph_get_create_node (fn),
+				  single_init_fn, fn);
+      gcc_assert (alias != NULL);
+    }
+
+  finish_then_clause (if_stmt);
+  finish_if_stmt (if_stmt);
+  finish_function_body (body);
+  expand_or_defer_fn (finish_function (0));
+}
+
 /* This routine is called at the end of compilation.
    Its job is to create all the code needed to initialize and
    destroy the global aggregates.  We do the destruction
@@ -3845,6 +4095,9 @@ cp_write_global_declarations (void)
 	  /* ??? was:  locus.line++; */
 	}
 
+      /* Now do the same for thread_local variables.  */
+      handle_tls_init ();
+
       /* Go through the set of inline functions whose bodies have not
 	 been emitted yet.  If out-of-line copies of these functions
 	 are required, emit them.  */
@@ -3869,6 +4122,9 @@ cp_write_global_declarations (void)
 	      reconsider = true;
 	    }
 
+	  if (!DECL_INITIAL (decl) && decl_tls_wrapper_p (decl))
+	    generate_tls_wrapper (decl);
+
 	  if (!DECL_SAVED_TREE (decl))
 	    continue;
 
diff --git a/gcc/cp/mangle.c b/gcc/cp/mangle.c
index 13c658b..163c18e 100644
--- a/gcc/cp/mangle.c
+++ b/gcc/cp/mangle.c
@@ -3678,23 +3678,70 @@ mangle_conv_op_name_for_type (const tree type)
   return identifier;
 }
 
-/* Return an identifier for the name of an initialization guard
-   variable for indicated VARIABLE.  */
+/* Write out the appropriate string for this variable when generating
+   another mangled name based on this one.  */
 
-tree
-mangle_guard_variable (const tree variable)
+static void
+write_guarded_var_name (const tree variable)
 {
-  start_mangling (variable);
-  write_string ("_ZGV");
   if (strncmp (IDENTIFIER_POINTER (DECL_NAME (variable)), "_ZGR", 4) == 0)
     /* The name of a guard variable for a reference temporary should refer
        to the reference, not the temporary.  */
     write_string (IDENTIFIER_POINTER (DECL_NAME (variable)) + 4);
   else
     write_name (variable, /*ignore_local_scope=*/0);
+}
+
+/* Return an identifier for the name of an initialization guard
+   variable for indicated VARIABLE.  */
+
+tree
+mangle_guard_variable (const tree variable)
+{
+  start_mangling (variable);
+  write_string ("_ZGV");
+  write_guarded_var_name (variable);
+  return finish_mangling_get_identifier (/*warn=*/false);
+}
+
+/* Return an identifier for the name of a thread_local initialization
+   function for VARIABLE.  */
+
+tree
+mangle_tls_init_fn (const tree variable)
+{
+  start_mangling (variable);
+  write_string ("_ZTH");
+  write_guarded_var_name (variable);
+  return finish_mangling_get_identifier (/*warn=*/false);
+}
+
+/* Return an identifier for the name of a thread_local wrapper
+   function for VARIABLE.  */
+
+#define TLS_WRAPPER_PREFIX "_ZTW"
+
+tree
+mangle_tls_wrapper_fn (const tree variable)
+{
+  start_mangling (variable);
+  write_string (TLS_WRAPPER_PREFIX);
+  write_guarded_var_name (variable);
   return finish_mangling_get_identifier (/*warn=*/false);
 }
 
+/* Return true iff FN is a thread_local wrapper function.  */
+
+bool
+decl_tls_wrapper_p (const tree fn)
+{
+  if (TREE_CODE (fn) != FUNCTION_DECL)
+    return false;
+  tree name = DECL_NAME (fn);
+  return strncmp (IDENTIFIER_POINTER (name), TLS_WRAPPER_PREFIX,
+		  strlen (TLS_WRAPPER_PREFIX)) == 0;
+}
+
 /* Return an identifier for the name of a temporary variable used to
    initialize a static reference.  This isn't part of the ABI, but we might
    as well call them something readable.  */
diff --git a/gcc/cp/semantics.c b/gcc/cp/semantics.c
index 68cbb4b..570ffeb 100644
--- a/gcc/cp/semantics.c
+++ b/gcc/cp/semantics.c
@@ -3254,7 +3254,17 @@ finish_id_expression (tree id_expression,
 	  *non_integral_constant_expression_p = true;
 	}
 
-      if (scope)
+      tree wrap;
+      if (TREE_CODE (decl) == VAR_DECL
+	  && !cp_unevaluated_operand
+	  && DECL_THREAD_LOCAL_P (decl)
+	  && (wrap = get_tls_wrapper_fn (decl)))
+	{
+	  /* Replace an evaluated use of the thread_local variable with
+	     a call to its wrapper.  */
+	  decl = build_cxx_call (wrap, 0, NULL, tf_warning_or_error);
+	}
+      else if (scope)
 	{
 	  decl = (adjust_result_of_qualified_name_lookup
 		  (decl, scope, current_nonlambda_class_type()));
diff --git a/gcc/testsuite/g++.dg/gomp/tls-5.C b/gcc/testsuite/g++.dg/gomp/tls-5.C
new file mode 100644
index 0000000..74e4faa
--- /dev/null
+++ b/gcc/testsuite/g++.dg/gomp/tls-5.C
@@ -0,0 +1,12 @@
+// The reference temp should be TLS, not normal data.
+// { dg-require-effective-target c++11 }
+// { dg-final { scan-assembler-not "\\.data" } }
+
+extern int&& ir;
+#pragma omp threadprivate (ir)
+int&& ir = 42;
+
+void f()
+{
+  ir = 24;
+}
diff --git a/gcc/testsuite/g++.dg/gomp/tls-wrap1.C b/gcc/testsuite/g++.dg/gomp/tls-wrap1.C
new file mode 100644
index 0000000..91c9f86
--- /dev/null
+++ b/gcc/testsuite/g++.dg/gomp/tls-wrap1.C
@@ -0,0 +1,13 @@
+// If we can see the definition at the use site, we don't need to bother
+// with a wrapper.
+
+// { dg-require-effective-target tls }
+// { dg-final { scan-assembler-not "_ZTW1i" } }
+
+int i = 42;
+#pragma omp threadprivate (i)
+
+int main()
+{
+  return i - 42;
+}
diff --git a/gcc/testsuite/g++.dg/gomp/tls-wrap2.C b/gcc/testsuite/g++.dg/gomp/tls-wrap2.C
new file mode 100644
index 0000000..7aa1371
--- /dev/null
+++ b/gcc/testsuite/g++.dg/gomp/tls-wrap2.C
@@ -0,0 +1,16 @@
+// If we can't see the definition at the use site, but it's in this translation
+// unit, we build a wrapper but don't bother with an init function.
+
+// { dg-require-effective-target tls }
+// { dg-final { scan-assembler "_ZTW1i" } }
+// { dg-final { scan-assembler-not "_ZTH1i" } }
+
+extern int i;
+#pragma omp threadprivate (i)
+
+int main()
+{
+  return i - 42;
+}
+
+int i = 42;
diff --git a/gcc/testsuite/g++.dg/gomp/tls-wrap3.C b/gcc/testsuite/g++.dg/gomp/tls-wrap3.C
new file mode 100644
index 0000000..2504d99
--- /dev/null
+++ b/gcc/testsuite/g++.dg/gomp/tls-wrap3.C
@@ -0,0 +1,14 @@
+// If we can't see the definition at all, we need to assume there might be
+// an init function.
+
+// { dg-require-effective-target tls }
+// { dg-final { scan-assembler "_ZTW1i" } }
+// { dg-final { scan-assembler "_ZTH1i" } }
+
+extern int i;
+#pragma omp threadprivate (i)
+
+int main()
+{
+  return i - 42;
+}
diff --git a/gcc/testsuite/g++.dg/gomp/tls-wrap4.C b/gcc/testsuite/g++.dg/gomp/tls-wrap4.C
new file mode 100644
index 0000000..1301148
--- /dev/null
+++ b/gcc/testsuite/g++.dg/gomp/tls-wrap4.C
@@ -0,0 +1,13 @@
+// We don't need to call the wrapper through the PLT; we can use a separate
+// copy per shared object.
+
+// { dg-require-effective-target tls }
+// { dg-options "-std=c++11 -fPIC" }
+// { dg-final { scan-assembler-not "_ZTW1i@PLT" { target i?86-*-* x86_64-*-* } } }
+
+extern thread_local int i;
+
+int main()
+{
+  return i - 42;
+}
diff --git a/gcc/testsuite/g++.dg/gomp/tls-wrapper-cse.C b/gcc/testsuite/g++.dg/gomp/tls-wrapper-cse.C
new file mode 100644
index 0000000..af2de2f
--- /dev/null
+++ b/gcc/testsuite/g++.dg/gomp/tls-wrapper-cse.C
@@ -0,0 +1,18 @@
+// Test for CSE of the wrapper function: we should only call it once
+// for the two references to ir.
+// { dg-options "-fopenmp -O -fno-inline" }
+// { dg-require-effective-target tls }
+// { dg-final { scan-assembler-times "call *_ZTW2ir" 1 { xfail *-*-* } } }
+
+// XFAILed until the back end supports a way to mark a function as cseable
+// though not pure.
+
+int f() { return 42; }
+
+int ir = f();
+#pragma omp threadprivate (ir)
+
+int main()
+{
+  return ir + ir - 84;
+}
diff --git a/gcc/testsuite/g++.dg/tls/thread_local-cse.C b/gcc/testsuite/g++.dg/tls/thread_local-cse.C
new file mode 100644
index 0000000..47c6aed
--- /dev/null
+++ b/gcc/testsuite/g++.dg/tls/thread_local-cse.C
@@ -0,0 +1,20 @@
+// Test for CSE of the wrapper function: we should only call it once
+// for the two references to ir.
+// { dg-options "-std=c++11 -O -fno-inline -save-temps" }
+// { dg-require-effective-target tls_runtime }
+// { dg-require-alias }
+// { dg-final { scan-assembler-times "call *_ZTW2ir" 1 { xfail *-*-* } } }
+// { dg-final cleanup-saved-temps }
+// { dg-do run }
+
+// XFAILed until the back end supports a way to mark a function as cseable
+// though not pure.
+
+int f() { return 42; }
+
+thread_local int ir = f();
+
+int main()
+{
+  return ir + ir - 84;
+}
diff --git a/gcc/testsuite/g++.dg/tls/thread_local-order1.C b/gcc/testsuite/g++.dg/tls/thread_local-order1.C
new file mode 100644
index 0000000..6557e93
--- /dev/null
+++ b/gcc/testsuite/g++.dg/tls/thread_local-order1.C
@@ -0,0 +1,25 @@
+// { dg-do run }
+// { dg-options "-std=c++11" }
+// { dg-require-effective-target tls_runtime }
+// { dg-require-alias }
+
+extern "C" void abort();
+extern "C" int printf (const char *, ...);
+#define printf(...)
+
+int c;
+struct A {
+  int i;
+  A(int i): i(i) { printf ("A(%d)\n", i); if (i != c++) abort (); }
+  ~A() { printf("~A(%d)\n", i); if (i != --c) abort(); }
+};
+
+A a0(0);
+thread_local A a1(1);
+thread_local A a2(2);
+A* ap = &a1;
+
+int main()
+{
+  if (c != 3) abort();
+}
diff --git a/gcc/testsuite/g++.dg/tls/thread_local-order2.C b/gcc/testsuite/g++.dg/tls/thread_local-order2.C
new file mode 100644
index 0000000..eb9c769
--- /dev/null
+++ b/gcc/testsuite/g++.dg/tls/thread_local-order2.C
@@ -0,0 +1,28 @@
+// The standard says that a1 should be destroyed before a0 even though
+// that isn't reverse order of construction.  We need to move
+// __cxa_thread_atexit into glibc to get this right.
+
+// { dg-do run { xfail *-*-* } }
+// { dg-options "-std=c++11" }
+// { dg-require-effective-target tls_runtime }
+// { dg-require-alias }
+
+extern "C" void abort();
+extern "C" int printf (const char *, ...);
+#define printf(...)
+
+int c;
+struct A {
+  int i;
+  A(int i): i(i) { printf ("A(%d)\n", i); ++c; }
+  ~A() { printf("~A(%d)\n", i); if (i != --c) abort(); }
+};
+
+thread_local A a1(1);
+A* ap = &a1;
+A a0(0);
+
+int main()
+{
+  if (c != 2) abort();
+}
diff --git a/gcc/testsuite/g++.dg/tls/thread_local-wrap1.C b/gcc/testsuite/g++.dg/tls/thread_local-wrap1.C
new file mode 100644
index 0000000..56177da
--- /dev/null
+++ b/gcc/testsuite/g++.dg/tls/thread_local-wrap1.C
@@ -0,0 +1,13 @@
+// If we can see the definition at the use site, we don't need to bother
+// with a wrapper.
+
+// { dg-require-effective-target tls }
+// { dg-options "-std=c++11" }
+// { dg-final { scan-assembler-not "_ZTW1i" } }
+
+thread_local int i = 42;
+
+int main()
+{
+  return i - 42;
+}
diff --git a/gcc/testsuite/g++.dg/tls/thread_local-wrap2.C b/gcc/testsuite/g++.dg/tls/thread_local-wrap2.C
new file mode 100644
index 0000000..1e8078f
--- /dev/null
+++ b/gcc/testsuite/g++.dg/tls/thread_local-wrap2.C
@@ -0,0 +1,16 @@
+// If we can't see the definition at the use site, but it's in this translation
+// unit, we build a wrapper but don't bother with an init function.
+
+// { dg-require-effective-target tls }
+// { dg-options "-std=c++11" }
+// { dg-final { scan-assembler "_ZTW1i" } }
+// { dg-final { scan-assembler-not "_ZTH1i" } }
+
+extern thread_local int i;
+
+int main()
+{
+  return i - 42;
+}
+
+thread_local int i = 42;
diff --git a/gcc/testsuite/g++.dg/tls/thread_local-wrap3.C b/gcc/testsuite/g++.dg/tls/thread_local-wrap3.C
new file mode 100644
index 0000000..19e6ab8
--- /dev/null
+++ b/gcc/testsuite/g++.dg/tls/thread_local-wrap3.C
@@ -0,0 +1,14 @@
+// If we can't see the definition at all, we need to assume there might be
+// an init function.
+
+// { dg-require-effective-target tls }
+// { dg-options "-std=c++11" }
+// { dg-final { scan-assembler "_ZTW1i" } }
+// { dg-final { scan-assembler "_ZTH1i" } }
+
+extern thread_local int i;
+
+int main()
+{
+  return i - 42;
+}
diff --git a/gcc/testsuite/g++.dg/tls/thread_local-wrap4.C b/gcc/testsuite/g++.dg/tls/thread_local-wrap4.C
new file mode 100644
index 0000000..1301148
--- /dev/null
+++ b/gcc/testsuite/g++.dg/tls/thread_local-wrap4.C
@@ -0,0 +1,13 @@
+// We don't need to call the wrapper through the PLT; we can use a separate
+// copy per shared object.
+
+// { dg-require-effective-target tls }
+// { dg-options "-std=c++11 -fPIC" }
+// { dg-final { scan-assembler-not "_ZTW1i@PLT" { target i?86-*-* x86_64-*-* } } }
+
+extern thread_local int i;
+
+int main()
+{
+  return i - 42;
+}
diff --git a/gcc/testsuite/g++.dg/tls/thread_local2g.C b/gcc/testsuite/g++.dg/tls/thread_local2g.C
new file mode 100644
index 0000000..36451d2
--- /dev/null
+++ b/gcc/testsuite/g++.dg/tls/thread_local2g.C
@@ -0,0 +1,29 @@
+// { dg-do run }
+// { dg-options "-std=c++11" }
+// { dg-require-effective-target tls_runtime }
+// { dg-require-alias }
+
+extern "C" void abort();
+
+struct A
+{
+  A();
+  int i;
+};
+
+thread_local A a;
+
+A &f()
+{
+  return a;
+}
+
+int j;
+A::A(): i(j) { }
+
+int main()
+{
+  j = 42;
+  if (f().i != 42)
+    abort ();
+}
diff --git a/gcc/testsuite/g++.dg/tls/thread_local3g.C b/gcc/testsuite/g++.dg/tls/thread_local3g.C
new file mode 100644
index 0000000..d5e83e8
--- /dev/null
+++ b/gcc/testsuite/g++.dg/tls/thread_local3g.C
@@ -0,0 +1,35 @@
+// { dg-do run }
+// { dg-require-effective-target c++11 }
+// { dg-require-effective-target tls_runtime }
+// { dg-require-effective-target pthread }
+// { dg-require-alias }
+// { dg-options -pthread }
+
+int c;
+int d;
+struct A
+{
+  A() { ++c; }
+  ~A() { ++d; }
+};
+
+thread_local A a;
+
+void *thread_main(void *)
+{
+  A* ap = &a;
+}
+
+#include <pthread.h>
+
+int main()
+{
+  pthread_t thread;
+  pthread_create (&thread, 0, thread_main, 0);
+  pthread_join(thread, 0);
+  pthread_create (&thread, 0, thread_main, 0);
+  pthread_join(thread, 0);
+
+  if (c != 2 || d != 2)
+    __builtin_abort();
+}
diff --git a/gcc/testsuite/g++.dg/tls/thread_local4g.C b/gcc/testsuite/g++.dg/tls/thread_local4g.C
new file mode 100644
index 0000000..574d267
--- /dev/null
+++ b/gcc/testsuite/g++.dg/tls/thread_local4g.C
@@ -0,0 +1,45 @@
+// Test for cleanups with pthread_cancel.
+
+// { dg-do run }
+// { dg-require-effective-target c++11 }
+// { dg-require-effective-target tls_runtime }
+// { dg-require-effective-target pthread }
+// { dg-require-alias }
+// { dg-options -pthread }
+
+#include <pthread.h>
+#include <unistd.h>
+
+int c;
+int d;
+struct A
+{
+  A() { ++c; }
+  ~A() { ++d; }
+};
+
+thread_local A a;
+
+void *thread_main(void *)
+{
+  A *ap = &a;
+  while (true)
+    {
+      pthread_testcancel();
+      sleep (1);
+    }
+}
+
+int main()
+{
+  pthread_t thread;
+  pthread_create (&thread, 0, thread_main, 0);
+  pthread_cancel(thread);
+  pthread_join(thread, 0);
+  pthread_create (&thread, 0, thread_main, 0);
+  pthread_cancel(thread);
+  pthread_join(thread, 0);
+
+   if (c != 2 || d != 2)
+     __builtin_abort();
+}
diff --git a/gcc/testsuite/g++.dg/tls/thread_local5g.C b/gcc/testsuite/g++.dg/tls/thread_local5g.C
new file mode 100644
index 0000000..badab4f
--- /dev/null
+++ b/gcc/testsuite/g++.dg/tls/thread_local5g.C
@@ -0,0 +1,45 @@
+// Test for cleanups in the main thread, too.
+
+// { dg-do run }
+// { dg-require-effective-target c++11 }
+// { dg-require-effective-target tls_runtime }
+// { dg-require-effective-target pthread }
+// { dg-require-alias }
+// { dg-options -pthread }
+
+#include <pthread.h>
+#include <unistd.h>
+
+int c;
+int d;
+struct A
+{
+  A() { ++c; }
+  ~A() {
+    if (++d == 3)
+      _exit (0);
+  }
+};
+
+thread_local A a;
+
+void *thread_main(void *)
+{
+  A* ap = &a;
+}
+
+int main()
+{
+  pthread_t thread;
+  thread_main(0);
+  pthread_create (&thread, 0, thread_main, 0);
+  pthread_join(thread, 0);
+  pthread_create (&thread, 0, thread_main, 0);
+  pthread_join(thread, 0);
+
+  // The dtor for a in the main thread is run after main exits, so we
+  // return 1 now and override the return value with _exit above.
+  if (c != 3 || d != 2)
+    __builtin_abort();
+  return 1;
+}
diff --git a/gcc/testsuite/g++.dg/tls/thread_local6g.C b/gcc/testsuite/g++.dg/tls/thread_local6g.C
new file mode 100644
index 0000000..ff8d608
--- /dev/null
+++ b/gcc/testsuite/g++.dg/tls/thread_local6g.C
@@ -0,0 +1,31 @@
+// Test for cleanups in the main thread without -pthread.
+
+// { dg-do run }
+// { dg-options "-std=c++11" }
+// { dg-require-effective-target tls_runtime }
+// { dg-require-alias }
+
+extern "C" void _exit (int);
+
+int c;
+struct A
+{
+  A() { ++c; }
+  ~A() { if (c == 1) _exit(0); }
+};
+
+thread_local A a;
+
+void *thread_main(void *)
+{
+  A* ap = &a;
+}
+
+int main()
+{
+  thread_main(0);
+
+  // The dtor for a in the main thread is run after main exits, so we
+  // return 1 now and override the return value with _exit above.
+  return 1;
+}
diff --git a/gcc/testsuite/g++.dg/tls/thread_local7g.C b/gcc/testsuite/g++.dg/tls/thread_local7g.C
new file mode 100644
index 0000000..6960598
--- /dev/null
+++ b/gcc/testsuite/g++.dg/tls/thread_local7g.C
@@ -0,0 +1,13 @@
+// { dg-options "-std=c++11" }
+// { dg-require-effective-target tls }
+// { dg-require-alias }
+
+// The reference temp should be TLS, not normal data.
+// { dg-final { scan-assembler-not "\\.data" } }
+
+thread_local int&& ir = 42;
+
+void f()
+{
+  ir = 24;
+}
diff --git a/include/demangle.h b/include/demangle.h
index 34b3ed3..5da79d8 100644
--- a/include/demangle.h
+++ b/include/demangle.h
@@ -272,6 +272,9 @@ enum demangle_component_type
   /* A guard variable.  This has one subtree, the name for which this
      is a guard variable.  */
   DEMANGLE_COMPONENT_GUARD,
+  /* The init and wrapper functions for C++11 thread_local variables.  */
+  DEMANGLE_COMPONENT_TLS_INIT,
+  DEMANGLE_COMPONENT_TLS_WRAPPER,
   /* A reference temporary.  This has one subtree, the name for which
      this is a temporary.  */
   DEMANGLE_COMPONENT_REFTEMP,
diff --git a/libgomp/testsuite/libgomp.c++/tls-init1.C b/libgomp/testsuite/libgomp.c++/tls-init1.C
new file mode 100644
index 0000000..4cbaccb
--- /dev/null
+++ b/libgomp/testsuite/libgomp.c++/tls-init1.C
@@ -0,0 +1,26 @@
+extern "C" void abort();
+
+struct A
+{
+  A();
+  int i;
+};
+
+extern A a;
+#pragma omp threadprivate (a)
+A a;
+
+A &f()
+{
+  return a;
+}
+
+int j;
+A::A(): i(j) { }
+
+int main()
+{
+  j = 42;
+  if (f().i != 42)
+    abort ();
+}
diff --git a/libiberty/cp-demangle.c b/libiberty/cp-demangle.c
index 258aaa7..32df38c 100644
--- a/libiberty/cp-demangle.c
+++ b/libiberty/cp-demangle.c
@@ -696,6 +696,12 @@ d_dump (struct demangle_component *dc, int indent)
     case DEMANGLE_COMPONENT_PACK_EXPANSION:
       printf ("pack expansion\n");
       break;
+    case DEMANGLE_COMPONENT_TLS_INIT:
+      printf ("tls init function\n");
+      break;
+    case DEMANGLE_COMPONENT_TLS_WRAPPER:
+      printf ("tls wrapper function\n");
+      break;
     }
 
   d_dump (d_left (dc), indent + 2);
@@ -832,6 +838,8 @@ d_make_comp (struct d_info *di, enum demangle_component_type type,
     case DEMANGLE_COMPONENT_COVARIANT_THUNK:
     case DEMANGLE_COMPONENT_JAVA_CLASS:
     case DEMANGLE_COMPONENT_GUARD:
+    case DEMANGLE_COMPONENT_TLS_INIT:
+    case DEMANGLE_COMPONENT_TLS_WRAPPER:
     case DEMANGLE_COMPONENT_REFTEMP:
     case DEMANGLE_COMPONENT_HIDDEN_ALIAS:
     case DEMANGLE_COMPONENT_TRANSACTION_CLONE:
@@ -1867,6 +1875,14 @@ d_special_name (struct d_info *di)
 	  return d_make_comp (di, DEMANGLE_COMPONENT_JAVA_CLASS,
 			      cplus_demangle_type (di), NULL);
 
+	case 'H':
+	  return d_make_comp (di, DEMANGLE_COMPONENT_TLS_INIT,
+			      d_name (di), NULL);
+
+	case 'W':
+	  return d_make_comp (di, DEMANGLE_COMPONENT_TLS_WRAPPER,
+			      d_name (di), NULL);
+
 	default:
 	  return NULL;
 	}
@@ -4072,6 +4088,16 @@ d_print_comp (struct d_print_info *dpi, int options,
       d_print_comp (dpi, options, d_left (dc));
       return;
 
+    case DEMANGLE_COMPONENT_TLS_INIT:
+      d_append_string (dpi, "TLS init function for ");
+      d_print_comp (dpi, options, d_left (dc));
+      return;
+
+    case DEMANGLE_COMPONENT_TLS_WRAPPER:
+      d_append_string (dpi, "TLS wrapper function for ");
+      d_print_comp (dpi, options, d_left (dc));
+      return;
+
     case DEMANGLE_COMPONENT_REFTEMP:
       d_append_string (dpi, "reference temporary #");
       d_print_comp (dpi, options, d_right (dc));

Thread overview: 20+ messages, end of thread ~2012-10-15 19:58 UTC
2012-10-09 14:08 RFC: C++ PATCH to support dynamic initialization and destruction of C++11 and OpenMP TLS variables Dominique Dhumieres
2012-10-09 14:43 ` Jack Howarth
2012-10-09 15:28   ` Jason Merrill
2012-10-09 16:28     ` Dominique Dhumieres
2012-10-09 20:43     ` Dominique Dhumieres
2012-10-10  1:16       ` Jason Merrill
2012-10-10 13:27         ` Jack Howarth
2012-10-10 14:54           ` Rainer Orth
2012-10-10 20:20             ` Jack Howarth
2012-10-10 20:25               ` Jack Howarth
2012-10-11 15:23     ` Jason Merrill
2012-10-10 15:01 ` Rainer Orth
2012-10-15 20:25 ` Richard Sandiford
  -- strict thread matches above, loose matches on Subject: below --
2012-10-04 17:39 Jason Merrill
2012-10-05  8:30 ` Richard Guenther
2012-10-05  8:41   ` Jakub Jelinek
2012-10-05 17:28     ` Jason Merrill
2012-10-05 17:38   ` Jason Merrill
2012-10-15 17:49     ` Jason Merrill
2012-10-05  8:31 ` Jakub Jelinek