From: Jason Merrill <jason@redhat.com>
To: Iain Sandoe <iain@sandoe.co.uk>
Cc: GCC Patches <gcc-patches@gcc.gnu.org>
Subject: Re: [PATCH] c++, TLS: Support cross-tu static initialization for targets without alias support [PR106435].
Date: Wed, 4 Jan 2023 10:03:01 -0500 [thread overview]
Message-ID: <2de9a8cc-70f7-4417-1df8-f3cf4f25a7f6@redhat.com> (raw)
In-Reply-To: <5C16E9B3-E14E-4B65-A8DE-5CF670F6961B@sandoe.co.uk>
On 1/3/23 18:17, Iain Sandoe wrote:
>
>
>> On 3 Jan 2023, at 22:22, Jason Merrill <jason@redhat.com> wrote:
>>
>> On 12/7/22 10:39, Iain Sandoe wrote:
>>> This has been tested on x86_64 and arm64 Darwin and on x86_64 linux gnu.
>>> The basic patch is live in the homebrew macOS support and so has had quite
>>> wide coverage on non-trivial codebases.
>>> OK for master?
>>> Iain
>>> Since this actually fixes wrong code, I wonder if we should also consider
>>> back-porting.
>>> --- >8 ---
>>> The description below relates to the code path when TARGET_SUPPORTS_ALIASES is
>>> false; current operation is maintained for targets with alias support and any
>>> new support code should be DCEd in that case.
>>> --
>>> Currently, cross-tu static initialisation is not supported for targets without
>>> alias support.
>>> The patch adds support by building a shim function in place of the alias for
>>> these targets; the shim simply calls the generic initialiser. Although this is
>>> slightly less efficient than the alias, in practice (for targets that allow
>>> sibcalls) the penalty is a single jump when code is optimised.
>>> From the perspective of a TU referencing an extern TLS variable, there is no
>>> way to determine if it requires a guarded dynamic init. So, in the referencing
>>> TU, we build a weak reference to the potential init and check at runtime if the
>>> init is present before calling it. This strategy is fine for targets that have
>>> ELF semantics, but fails at link time for Mach-O (which does not permit the
>>> reference to be undefined in the static link).
>>> The actual initialiser call is contained in a wrapper function, and to resolve
>>> the Mach-O linker issue, in the TU that is referencing the var, we now generate
>>> both the wrapper _and_ a weak definition of a dummy init function. In the case
>>> that there _is_ a dynamic init (in a different TU), that version will be non-weak
>>> and will be override the weak dummy one.
>>
>> IIUC, this isn't reliable in general; in specific, I believe that the glibc dynamic loader no longer prefers strong definitions to weak ones.
>
> Neither does Darwin’s dynamic loader, this implemenation works there because the static linker _will_ override the weak def with a strong one. IIUC, binutils ld does this too.
>
> If we need this to work between DSOs then that potentially presents a problem (for Darwin the DSO is identified so that the symbol will be found in the library that resolved it in the static link, [but that can be defeated by forcing “flat linking”]), I am not sure if glibc dynamic loader would do something similar (although this code path is not taken on ELF targets since they have the symbol aliases).
>
>> Perhaps on targets that don't allow weakrefs to be unbound,
>
> Darwin would allow it if we were able to tell the static linker that the symbol is permitted to be undefined - but since we don’t know the symbol’s name outside the FE, that is not going to fly.
Can you elaborate on this?
>> we should unconditionally emit the init function where the variable is defined, even if it does nothing, and unconditionally call it from the wrapper?
>
> OK. that seems a safer option .. I will have to look at it when I have a chance.
>
> thanks
> Iain
>>
>>> In the case that we have a trivial
>>> static init (so no init in any other TU) the weak-defined dummy init will be
>>> called (a single return insn for optimised code). We mitigate the call to
>>> the dummy init by reworking the wrapper code-gen path to remove the test for
>>> the weak reference function (as it will always be true) since the static linker
>>> will now determine the function to be called.
>>> Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
>>> PR c++/106435
>>> gcc/c-family/ChangeLog:
>>> * c-opts.cc (c_common_post_options): Allow fextern-tls-init for targets
>>> without alias support.
>>> gcc/cp/ChangeLog:
>>> * decl2.cc (get_tls_init_fn): Allow targets without alias support.
>>> (handle_tls_init): Emit aliases for single init functions where the
>>> target supporst this, otherwise emit a stub function that calls the
>>> main tls init function. (generate_tls_dummy_init): New.
>>> gcc/testsuite/ChangeLog:
>>> * g++.dg/cpp0x/pr106435-b.cc: New file.
>>> * g++.dg/cpp0x/pr106435.C: New test.
>>> * g++.dg/cpp0x/pr106435.h: New file.
>>> ---
>>> gcc/c-family/c-opts.cc | 2 +-
>>> gcc/cp/decl2.cc | 80 ++++++++++++++++++++----
>>> gcc/testsuite/g++.dg/cpp0x/pr106435-b.cc | 22 +++++++
>>> gcc/testsuite/g++.dg/cpp0x/pr106435.C | 24 +++++++
>>> gcc/testsuite/g++.dg/cpp0x/pr106435.h | 27 ++++++++
>>> 5 files changed, 142 insertions(+), 13 deletions(-)
>>> create mode 100644 gcc/testsuite/g++.dg/cpp0x/pr106435-b.cc
>>> create mode 100644 gcc/testsuite/g++.dg/cpp0x/pr106435.C
>>> create mode 100644 gcc/testsuite/g++.dg/cpp0x/pr106435.h
>>> diff --git a/gcc/c-family/c-opts.cc b/gcc/c-family/c-opts.cc
>>> index 70745aa4e7c..064645f980d 100644
>>> --- a/gcc/c-family/c-opts.cc
>>> +++ b/gcc/c-family/c-opts.cc
>>> @@ -1070,7 +1070,7 @@ c_common_post_options (const char **pfilename)
>>> if (flag_extern_tls_init)
>>> {
>>> - if (!TARGET_SUPPORTS_ALIASES || !SUPPORTS_WEAK)
>>> + if (!SUPPORTS_WEAK)
>>> {
>>> /* Lazy TLS initialization for a variable in another TU requires
>>> alias and weak reference support. */
>>> diff --git a/gcc/cp/decl2.cc b/gcc/cp/decl2.cc
>>> index f95529a5c9a..c6550c0c2fc 100644
>>> --- a/gcc/cp/decl2.cc
>>> +++ b/gcc/cp/decl2.cc
>>> @@ -3672,9 +3672,8 @@ get_tls_init_fn (tree var)
>>> if (!flag_extern_tls_init && DECL_EXTERNAL (var))
>>> return NULL_TREE;
>>> - /* If the variable is internal, or if we can't generate aliases,
>>> - call the local init function directly. */
>>> - if (!TREE_PUBLIC (var) || !TARGET_SUPPORTS_ALIASES)
>>> + /* If the variable is internal call the local init function directly. */
>>> + if (!TREE_PUBLIC (var))
>>> return get_local_tls_init_fn (DECL_SOURCE_LOCATION (var));
>>> tree sname = mangle_tls_init_fn (var);
>>> @@ -3811,8 +3810,12 @@ generate_tls_wrapper (tree fn)
>>> if (tree init_fn = get_tls_init_fn (var))
>>> {
>>> tree if_stmt = NULL_TREE;
>>> - /* If init_fn is a weakref, make sure it exists before calling. */
>>> - if (lookup_attribute ("weak", DECL_ATTRIBUTES (init_fn)))
>>> +
>>> + /* If init_fn is a weakref, make sure it exists before calling.
>>> + If the target does not support aliases, then we will have generated
>>> + a dummy weak function, so there is no need to test its existence. */
>>> + if (TARGET_SUPPORTS_ALIASES &&
>>> + lookup_attribute ("weak", DECL_ATTRIBUTES (init_fn)))
>>> {
>>> if_stmt = begin_if_stmt ();
>>> tree addr = cp_build_addr_expr (init_fn, tf_warning_or_error);
>>> @@ -3837,6 +3840,25 @@ generate_tls_wrapper (tree fn)
>>> expand_or_defer_fn (finish_function (/*inline_p=*/false));
>>> }
>>> +/* A dummy init function to act as a weak placeholder for a (possibly non-
>>> + existent) dynamic init. */
>>> +static void
>>> +generate_tls_dummy_init (tree fn)
>>> +{
>>> + tree var = DECL_BEFRIENDING_CLASSES (fn);
>>> + tree init_fn = get_tls_init_fn (var);
>>> + /* If have no init fn, or it is non-weak, then we do not need to make a
>>> + dummy. */
>>> + if (!init_fn || !lookup_attribute ("weak", DECL_ATTRIBUTES (init_fn)))
>>> + return;
>>> + start_preparsed_function (init_fn, NULL_TREE, SF_DEFAULT | SF_PRE_PARSED);
>>> + tree body = begin_function_body ();
>>> + declare_weak (init_fn);
>>> + finish_return_stmt (NULL_TREE);
>>> + finish_function_body (body);
>>> + expand_or_defer_fn (finish_function (/*inline_p=*/false));
>>> +}
>>> +
>>> /* Start a global constructor or destructor function. */
>>> static tree
>>> @@ -4626,22 +4648,24 @@ handle_tls_init (void)
>>> finish_expr_stmt (cp_build_modify_expr (loc, guard, NOP_EXPR,
>>> boolean_true_node,
>>> tf_warning_or_error));
>>> + auto_vec<tree> direct_calls;
>>> for (; vars; vars = TREE_CHAIN (vars))
>>> {
>>> tree var = TREE_VALUE (vars);
>>> tree init = TREE_PURPOSE (vars);
>>> one_static_initialization_or_destruction (/*initp=*/true, var, init);
>>> - /* Output init aliases even with -fno-extern-tls-init. */
>>> - if (TARGET_SUPPORTS_ALIASES && TREE_PUBLIC (var))
>>> + /* Output inits even with -fno-extern-tls-init.
>>> + We save the list here and output either an alias or a stub function
>>> + below. */
>>> + if (TREE_PUBLIC (var))
>>> {
>>> tree single_init_fn = get_tls_init_fn (var);
>>> if (single_init_fn == NULL_TREE)
>>> continue;
>>> - cgraph_node *alias
>>> - = cgraph_node::get_create (fn)->create_same_body_alias
>>> - (single_init_fn, fn);
>>> - gcc_assert (alias != NULL);
>>> + if (single_init_fn == fn)
>>> + continue;
>>> + direct_calls.safe_push (single_init_fn);
>>> }
>>> }
>>> @@ -4649,6 +4673,30 @@ handle_tls_init (void)
>>> finish_if_stmt (if_stmt);
>>> finish_function_body (body);
>>> expand_or_defer_fn (finish_function (/*inline_p=*/false));
>>> +
>>> + /* For each TLS var that we have an init function, we either emit an alias
>>> + between that and the tls_init, or a stub function that just calls the
>>> + tls_init. */
>>> + while (!direct_calls.is_empty())
>>> + {
>>> + tree single_init_fn = direct_calls.pop ();
>>> + if (TARGET_SUPPORTS_ALIASES)
>>> + {
>>> + cgraph_node *alias
>>> + = cgraph_node::get_create (fn)->create_same_body_alias
>>> + (single_init_fn, fn);
>>> + gcc_assert (alias != NULL);
>>> + }
>>> + else
>>> + {
>>> + start_preparsed_function (single_init_fn, NULL_TREE, SF_PRE_PARSED);
>>> + tree body = begin_function_body ();
>>> + tree r = build_call_expr (fn, 0);
>>> + finish_expr_stmt (r);
>>> + finish_function_body (body);
>>> + expand_or_defer_fn (finish_function (/*inline_p=*/false));
>>> + }
>>> + }
>>> }
>>> /* We're at the end of compilation, so generate any mangling aliases that
>>> @@ -5058,7 +5106,15 @@ c_parse_final_cleanups (void)
>>> }
>>> if (!DECL_INITIAL (decl) && decl_tls_wrapper_p (decl))
>>> - generate_tls_wrapper (decl);
>>> + {
>>> + generate_tls_wrapper (decl);
>>> + if (!TARGET_SUPPORTS_ALIASES)
>>> + /* The wrapper might have a weak reference to an init, we provide
>>> + a dummy function to satisfy that here. The linker/dynamic
>>> + loader will override this with the actual init, if one is
>>> + required. */
>>> + generate_tls_dummy_init (decl);
>>> + }
>>> if (!DECL_SAVED_TREE (decl))
>>> continue;
>>> diff --git a/gcc/testsuite/g++.dg/cpp0x/pr106435-b.cc b/gcc/testsuite/g++.dg/cpp0x/pr106435-b.cc
>>> new file mode 100644
>>> index 00000000000..bd586286647
>>> --- /dev/null
>>> +++ b/gcc/testsuite/g++.dg/cpp0x/pr106435-b.cc
>>> @@ -0,0 +1,22 @@
>>> +// PR c++/106435
>>> +#include "pr106435.h"
>>> +
>>> +//#include <iostream>
>>> +
>>> +Foo::Foo() {
>>> + ++num_calls;
>>> +// std::cout << "Foo::Foo(this=" << this << ")\n";
>>> +}
>>> +
>>> +int Foo::func() {
>>> +// std::cout << "Foo::func(this=" << this << ")\n";
>>> + return num_calls;
>>> +}
>>> +
>>> +thread_local Foo Bar::foo;
>>> +thread_local Foo Bar::baz;
>>> +
>>> +thread_local Foo F = Foo{5};
>>> +
>>> +thread_local void* tl = (void*)&F;
>>> +thread_local void* tl1 = nullptr;
>>> diff --git a/gcc/testsuite/g++.dg/cpp0x/pr106435.C b/gcc/testsuite/g++.dg/cpp0x/pr106435.C
>>> new file mode 100644
>>> index 00000000000..9071e032697
>>> --- /dev/null
>>> +++ b/gcc/testsuite/g++.dg/cpp0x/pr106435.C
>>> @@ -0,0 +1,24 @@
>>> +// PR c++/106435
>>> +// { dg-do run { target c++11 } }
>>> +// { dg-additional-sources "pr106435-b.cc" }
>>> +
>>> +#include "pr106435.h"
>>> +
>>> +int num_calls = 0;
>>> +
>>> +thread_local Foo Bar::bat;
>>> +
>>> +int main() {
>>> + int v = Bar::foo.func();
>>> + if (v != 2)
>>> + __builtin_abort ();
>>> + v = Bar::bat.func();
>>> + if (v != 3)
>>> + __builtin_abort ();
>>> + if (F.getX() != 5)
>>> + __builtin_abort();
>>> + if (gt_tl () != (void*)&F)
>>> + __builtin_abort();
>>> + if (gt_tl1 ())
>>> + __builtin_abort();
>>> +}
>>> diff --git a/gcc/testsuite/g++.dg/cpp0x/pr106435.h b/gcc/testsuite/g++.dg/cpp0x/pr106435.h
>>> new file mode 100644
>>> index 00000000000..06d3e60484f
>>> --- /dev/null
>>> +++ b/gcc/testsuite/g++.dg/cpp0x/pr106435.h
>>> @@ -0,0 +1,27 @@
>>> +// PR c++/106435
>>> +#pragma once
>>> +
>>> +extern int num_calls;
>>> +struct Foo {
>>> + Foo();
>>> + Foo(int _x) : x(_x) {}
>>> + int func();
>>> + int getX () { return x; }
>>> + int x;
>>> +};
>>> +
>>> +struct Bar {
>>> + thread_local static Foo foo;
>>> + thread_local static Foo baz;
>>> + thread_local static Foo bat;
>>> +};
>>> +
>>> +extern thread_local void* tl;
>>> +
>>> +inline void* gt_tl () {return tl;}
>>> +
>>> +extern thread_local Foo F;
>>> +
>>> +extern thread_local void* tl1;
>>> +
>>> +inline void* gt_tl1 () {return tl1;}
>
next prev parent reply other threads:[~2023-01-04 15:03 UTC|newest]
Thread overview: 7+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-12-07 15:39 Iain Sandoe
2022-12-30 9:07 ` [ping^1] " Iain Sandoe
2023-01-03 22:22 ` Jason Merrill
2023-01-03 23:17 ` Iain Sandoe
2023-01-04 15:03 ` Jason Merrill [this message]
2023-01-04 15:30 ` Iain Sandoe
2023-01-04 15:49 ` Jason Merrill
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=2de9a8cc-70f7-4417-1df8-f3cf4f25a7f6@redhat.com \
--to=jason@redhat.com \
--cc=gcc-patches@gcc.gnu.org \
--cc=iain@sandoe.co.uk \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).