On Fri, Jan 18, 2013 at 05:29:02PM -0500, Jason Merrill wrote: > I've been thinking about how to deal with the problems with calling > initialization functions in another translation unit; namely, that it's > significant overhead even for cases where it isn't useful, and that it > requires alias and weak reference support. This patch introduces a new > flag -fno-extern-tls-init that allows the user to disable support for > calling a lazy initialization function in a different translation unit > to avoid the overhead. It also disables it by default for targets > without support for aliases or weak references. > > Tested x86_64-pc-linux-gnu. Jack, does this fix the thread_local tests > on Darwin? Jason, The proposed patch eliminates all of the previous failures from PR54908 at both -m32/-m64 on x86_64-apple-darwin12 but seems to introduce a new set of failures... FAIL: g++.dg/tls/thread_local-wrap3.C scan-assembler _ZTH1i FAIL: g++.dg/gomp/tls-wrap3.C -std=c++98 scan-assembler _ZTH1i FAIL: g++.dg/gomp/tls-wrap3.C -std=c++11 scan-assembler _ZTH1i at both -m32 and -m64. The thread_local-wrap3.s generated with... /sw/src/fink.build/gcc48-4.8.0-1000/darwin_objdir/gcc/testsuite/g++/../../xg++ -B/sw/src/fink.build/gcc48-4.8.0-1000/darwin_objdir/gcc/testsuite/g++/../../ /sw/src/fink.build/gcc48-4.8.0-1000/gcc-4.8-20130118/gcc/testsuite/g++.dg/tls/thread_local-wrap3.C -fno-diagnostics-show-caret -nostdinc++ -I/sw/src/fink.build/gcc48-4.8.0-1000/darwin_objdir/x86_64-apple-darwin12.2.0/libstdc++-v3/include/x86_64-apple-darwin12.2.0 -I/sw/src/fink.build/gcc48-4.8.0-1000/darwin_objdir/x86_64-apple-darwin12.2.0/libstdc++-v3/include -I/sw/src/fink.build/gcc48-4.8.0-1000/gcc-4.8-20130118/libstdc++-v3/libsupc++ -I/sw/src/fink.build/gcc48-4.8.0-1000/gcc-4.8-20130118/libstdc++-v3/include/backward -I/sw/src/fink.build/gcc48-4.8.0-1000/gcc-4.8-20130118/libstdc++-v3/testsuite/util -fmessage-length=0 -std=c++11 -S -m64 -o thread_local-wrap3.s is attached. Recompiling the testcase with -fextern-tls-init doesn't produce the missing _ZTH1i. Jack ps I'll post full regression results tomorrow. > commit b3cf0f73d853bb9f0d5bfde22995eafc244210d3 > Author: Jason Merrill > Date: Thu Jan 17 06:15:22 2013 -0500 > > PR target/54908 > c-family/ > * c.opt (-fextern-tls-init): New. > * c-opts.c (c_common_post_options): Handle it. > cp/ > * decl2.c (get_local_tls_init_fn): New. > (get_tls_init_fn): Handle flag_extern_tls_init. Don't bother > with aliases for internal variables. Don't use weakrefs if > the variable needs destruction. > (generate_tls_wrapper): Mark the wrapper as const if no > initialization is needed. > (handle_tls_init): Don't require aliases. > > diff --git a/gcc/c-family/c-opts.c b/gcc/c-family/c-opts.c > index 3fabb36..1a922a8 100644 > --- a/gcc/c-family/c-opts.c > +++ b/gcc/c-family/c-opts.c > @@ -901,6 +901,20 @@ c_common_post_options (const char **pfilename) > else if (warn_narrowing == -1) > warn_narrowing = 0; > > + if (flag_extern_tls_init) > + { > +#if !defined (ASM_OUTPUT_DEF) || !SUPPORTS_WEAK > + /* Lazy TLS initialization for a variable in another TU requires > + alias and weak reference support. */ > + if (flag_extern_tls_init > 0) > + sorry ("external TLS initialization functions not supported " > + "on this target"); > + flag_extern_tls_init = 0; > +#else > + flag_extern_tls_init = 1; > +#endif > + } > + > if (flag_preprocess_only) > { > /* Open the output now. We must do so even if flag_no_output is > diff --git a/gcc/c-family/c.opt b/gcc/c-family/c.opt > index 187f3be..10ae84d 100644 > --- a/gcc/c-family/c.opt > +++ b/gcc/c-family/c.opt > @@ -913,6 +913,9 @@ finput-charset= > C ObjC C++ ObjC++ Joined RejectNegative > -finput-charset= Specify the default character set for source files > > +fextern-tls-init > +C++ ObjC++ Var(flag_extern_tls_init) Init(-1) > +Support dynamic initialization of thread-local variables in a different translation unit > > fexternal-templates > C++ ObjC++ Ignore Warn(switch %qs is no longer supported) > diff --git a/gcc/cp/decl2.c b/gcc/cp/decl2.c > index 619d30d..4496395 100644 > --- a/gcc/cp/decl2.c > +++ b/gcc/cp/decl2.c > @@ -2812,6 +2812,28 @@ var_needs_tls_wrapper (tree var) > && !var_defined_without_dynamic_init (var)); > } > > +/* Get the FUNCTION_DECL for the shared TLS init function for this > + translation unit. */ > + > +static tree > +get_local_tls_init_fn (void) > +{ > + tree sname = get_identifier ("__tls_init"); > + tree fn = IDENTIFIER_GLOBAL_VALUE (sname); > + if (!fn) > + { > + fn = build_lang_decl (FUNCTION_DECL, sname, > + build_function_type (void_type_node, > + void_list_node)); > + SET_DECL_LANGUAGE (fn, lang_c); > + TREE_PUBLIC (fn) = false; > + DECL_ARTIFICIAL (fn) = true; > + mark_used (fn); > + SET_IDENTIFIER_GLOBAL_VALUE (sname, fn); > + } > + return fn; > +} > + > /* Get a FUNCTION_DECL for the init function for the thread_local > variable VAR. The init function will be an alias to the function > that initializes all the non-local TLS variables in the translation > @@ -2824,6 +2846,18 @@ get_tls_init_fn (tree var) > if (!var_needs_tls_wrapper (var)) > return NULL_TREE; > > + /* If -fno-extern-tls-init, assume that we don't need to call > + a tls init function for a variable defined in another TU. */ > + if (!flag_extern_tls_init && DECL_EXTERNAL (var)) > + return NULL_TREE; > + > +#ifdef ASM_OUTPUT_DEF > + /* If the variable is internal, or if we can't generate aliases, > + call the local init function directly. */ > + if (!TREE_PUBLIC (var)) > +#endif > + return get_local_tls_init_fn (); > + > tree sname = mangle_tls_init_fn (var); > tree fn = IDENTIFIER_GLOBAL_VALUE (sname); > if (!fn) > @@ -2841,11 +2875,12 @@ get_tls_init_fn (tree var) > if (TREE_PUBLIC (var)) > { > tree obtype = strip_array_types (non_reference (TREE_TYPE (var))); > - /* If the variable might have static initialization, make the > - init function a weak reference. */ > + /* If the variable is defined somewhere else and might have static > + initialization, make the init function a weak reference. */ > if ((!TYPE_NEEDS_CONSTRUCTING (obtype) > || TYPE_HAS_CONSTEXPR_CTOR (obtype)) > - && TARGET_SUPPORTS_WEAK) > + && TYPE_HAS_TRIVIAL_DESTRUCTOR (obtype) > + && DECL_EXTERNAL (var)) > declare_weak (fn); > else > DECL_WEAK (fn) = DECL_WEAK (var); > @@ -2956,6 +2991,9 @@ generate_tls_wrapper (tree fn) > finish_if_stmt (if_stmt); > } > } > + else > + /* If there's no initialization, the wrapper is a constant function. */ > + TREE_READONLY (fn) = true; > finish_return_stmt (convert_from_reference (var)); > finish_function_body (body); > expand_or_defer_fn (finish_function (0)); > @@ -3861,15 +3899,6 @@ handle_tls_init (void) > > location_t loc = DECL_SOURCE_LOCATION (TREE_VALUE (vars)); > > - #ifndef ASM_OUTPUT_DEF > - /* This currently requires alias support. FIXME other targets could use > - small thunks instead of aliases. */ > - input_location = loc; > - sorry ("dynamic initialization of non-function-local thread_local " > - "variables not supported on this target"); > - return; > - #endif > - > write_out_vars (vars); > > tree guard = build_decl (loc, VAR_DECL, get_identifier ("__tls_guard"), > @@ -3882,14 +3911,7 @@ handle_tls_init (void) > DECL_TLS_MODEL (guard) = decl_default_tls_model (guard); > pushdecl_top_level_and_finish (guard, NULL_TREE); > > - tree fn = build_lang_decl (FUNCTION_DECL, > - get_identifier ("__tls_init"), > - build_function_type (void_type_node, > - void_list_node)); > - SET_DECL_LANGUAGE (fn, lang_c); > - TREE_PUBLIC (fn) = false; > - DECL_ARTIFICIAL (fn) = true; > - mark_used (fn); > + tree fn = get_local_tls_init_fn (); > start_preparsed_function (fn, NULL_TREE, SF_PRE_PARSED); > tree body = begin_function_body (); > tree if_stmt = begin_if_stmt (); > @@ -3904,11 +3926,17 @@ handle_tls_init (void) > tree init = TREE_PURPOSE (vars); > one_static_initialization_or_destruction (var, init, true); > > - tree single_init_fn = get_tls_init_fn (var); > - cgraph_node *alias > - = cgraph_same_body_alias (cgraph_get_create_node (fn), > - single_init_fn, fn); > - gcc_assert (alias != NULL); > +#ifdef ASM_OUTPUT_DEF > + /* Output init aliases even with -fno-extern-tls-init. */ > + if (TREE_PUBLIC (var)) > + { > + tree single_init_fn = get_tls_init_fn (var); > + cgraph_node *alias > + = cgraph_same_body_alias (cgraph_get_create_node (fn), > + single_init_fn, fn); > + gcc_assert (alias != NULL); > + } > +#endif > } > > finish_then_clause (if_stmt); > diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi > index 9ef6c93..48ee779 100644 > --- a/gcc/doc/invoke.texi > +++ b/gcc/doc/invoke.texi > @@ -2029,6 +2029,29 @@ exceptions in violation of the exception specifications; the compiler > still optimizes based on the specifications, so throwing an > unexpected exception results in undefined behavior at run time. > > +@item -fextern-tls-init > +@itemx -fno-extern-tls-init > +@opindex fextern-tls-init > +@opindex fno-extern-tls-init > +The C++11 and OpenMP standards allow @samp{thread_local} and > +@samp{threadprivate} variables to have dynamic (runtime) > +initialization. To support this, any use of such a variable goes > +through a wrapper function that performs any necessary initialization. > +When the use and definition of the variable are in the same > +translation unit, this overhead can be optimized away, but when the > +use is in a different translation unit there is significant overhead > +even if the variable doesn't actually need dynamic initialization. If > +the programmer can be sure that no use of the variable in a > +non-defining TU needs to trigger dynamic initialization (either > +because the variable is statically initialized, or a use of the > +variable in the defining TU will be executed before any uses in > +another TU), they can avoid this overhead with the > +@option{-fno-extern-tls-init} option. > + > +On targets that support symbol aliases, the default is > +@option{-fextern-tls-init}. On targets that do not support symbol > +aliases, the default is @option{-fno-extern-tls-init}. > + > @item -ffor-scope > @itemx -fno-for-scope > @opindex ffor-scope > diff --git a/libgomp/testsuite/libgomp.c++/pr24455.C b/libgomp/testsuite/libgomp.c++/pr24455.C > index 3185ca5..8256b66 100644 > --- a/libgomp/testsuite/libgomp.c++/pr24455.C > +++ b/libgomp/testsuite/libgomp.c++/pr24455.C > @@ -1,8 +1,7 @@ > // { dg-do run } > // { dg-additional-sources pr24455-1.C } > // { dg-require-effective-target tls_runtime } > -// { dg-options "-Wl,-G" { target powerpc-ibm-aix* } } > -// { dg-options "-Wl,-undefined,dynamic_lookup" { target *-*-darwin* } } > +// { dg-options "-fno-extern-tls-init" } > > extern "C" void abort (void); >